  1. Mar 2025
    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The manuscript focuses on the olfactory system of Pieris brassicae larvae and the importance of olfactory information in their interactions with the host plant Brassica oleracea and the major parasitic wasp Cotesia glomerata. The authors used CRISPR/Cas9 to knockout odorant receptor co-receptors (Orco), and conducted a comparative study on the behavior and olfactory system of the mutant and wild-type larvae. The study found that Orco-expressing olfactory sensory neurons in antennae and maxillary palps of Orco knockout (KO) larvae disappeared, and the number of glomeruli in the brain decreased, which impairs the olfactory detection and primary processing in the brain. Orco KO caterpillars show weight loss and loss of preference for optimal food plants; KO larvae also lost weight when attacked by parasitoids with the ovipositor removed, and mortality increased when attacked by untreated parasitoids. On this basis, the authors further studied the responses of caterpillars to volatiles from plants attacked by the larvae of the same species and volatiles from plants on which the caterpillars were themselves attacked by parasitic wasps. Lack of OR-mediated olfactory inputs prevents caterpillars from finding suitable food sources and from choosing spaces free of enemies.

      Strengths:

      The findings help to understand the important role of olfaction in caterpillar feeding and predator avoidance, highlighting the importance of odorant receptor genes in shaping ecological interactions.

      Weaknesses:

      There are the following major concerns:

      (1) Possible non-targeted effects of Orco knockout using CRISPR/Cas9 should be analyzed and evaluated in Materials and Methods and Results.

      Thank you for your suggestion. In the Materials and Methods, we mention how we selected the target region and evaluated potential off-target sites by Exonerate and CHOPCHOP. Neither of these methods found potential off-target sites with a more-than-17-nt alignment identity. Therefore, we assumed no off-target effect in our Orco KO. Furthermore, we did not find any developmental differences between WT and KO caterpillars when these were reared on leaf discs in Petri dishes (Fig S4). We will further highlight this information on the off-target evaluation in the Results section of our revised manuscript.
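
      As a rough illustration of the criterion described above, the following Python sketch scans a sequence on both strands for ungapped windows that share more than 17 identical positions with a 20-nt guide. It is only a toy under stated assumptions: the guide and test sequences are invented for illustration, the function names are hypothetical, and the naive scan is a stand-in for, not a reproduction of, the Exonerate and CHOPCHOP searches mentioned in the response.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def off_target_hits(guide, genome, min_identity=18):
    """List (start, strand, identity) for windows with >= min_identity matches.

    min_identity=18 encodes "more than 17 nt"; the alignment is ungapped.
    """
    hits = []
    for strand, query in (("+", guide), ("-", reverse_complement(guide))):
        for start in range(len(genome) - len(query) + 1):
            window = genome[start:start + len(query)]
            identity = sum(a == b for a, b in zip(query, window))
            if identity >= min_identity:
                hits.append((start, strand, identity))
    return hits


# Hypothetical 20-nt guide and a made-up sequence holding a one-mismatch copy.
GUIDE = "ACGTTGCATCGATCGGATCA"
NEAR_COPY = "ACGTTGCATCAATCGGATCA"  # differs from GUIDE at one position
print(off_target_hits(GUIDE, "TTTTT" + NEAR_COPY + "CCCCC"))
```

      In a real genome scan the intended target site would itself be reported by such a search and would have to be excluded before concluding that no off-target candidates exist.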

      (2) Figure 1E: Only one olfactory receptor neuron was marked in WT. There are at least three olfactory sensilla at the top of the maxillary palp. Therefore, to explain the loss of Orco-expressing neurons in the mutant (Figure 1F), a more rigorous explanation of the photo is required.

      Thank you for pointing this out. The figure shows only a qualitative comparison between WT and KO and we did not aim to determine the total number of Orco positive neurons in the maxillary palps or antennae of WT and KO caterpillars, but please see our previous work for the neuron numbers in the caterpillar antennae (Wang et al., 2023). We did indeed find more than one neuron in the maxillary palps, but as these were in very different image planes it was not possible to visualize them together. However, we will add a few sentences in the Results and Discussion section to explain the results of the maxillary palp Orco staining.

      (3) In Figure 1G, H, the four glomeruli are circled by dotted lines: their corresponding relationship between the two figures needs to be further clarified.

      Thank you for pointing this out. The four glomeruli in Figure 1G and 1H do not strictly correspond to one another. We circled these glomeruli to highlight them, as they are the best visualized and most clearly shown in this view. In this study, we only counted the number of glomeruli in WT and KO brains; we did not determine which glomeruli are missing in the KO caterpillar brain. We will further explain this in the figure legend.

      (4) Line 130: The main topic of this study is the olfactory system of larvae, but the experimental results in this part all concern antennal electrophysiological responses, mating frequency, and egg production of female and male adults of the wild type and Orco KO mutant. The authors may therefore consider moving this part to the supplementary files. It would be better to include some data about the olfactory responses of larvae.

      Thank you for your suggestion. We do agree with your suggestion, and we will consider moving this part to the supplementary information. Regarding larval olfactory response, we unfortunately failed to record any spikes using single sensillum recordings due to the difficult nature of the preparation; however, we do believe that this would be an interesting avenue for further research.

      (5) Line 166: The sentences in the text are about the choice test between "healthy plant vs. infested plant", while in Fig 3C, it is "infested plant vs. no plant". The content in the text does not match the figure.

      Thank you for pointing this out. The sentence is “We compared the behaviors of both WT and Orco KO caterpillars in response to clean air, a healthy plant and a caterpillar-infested plant”. We tested these three stimuli in two comparisons: healthy plant vs no plant, infested plant vs no plant. The two comparisons are shown in Figure 3C separately. We will aim to describe this more clearly in the revised version of the manuscript.

      (6) Lines 174-178: Figure 3A showed that the body weight of Orco KO larvae in the absence of parasitic wasps also decreased compared with that of WT. Therefore, in the experiments of Figure 3A and E, the difference in the body weight of Orco KO larvae in the presence or absence of parasitic wasps without ovipositors should also be compared. The current data cannot determine whether the reduced weight of the KO mutant is due to the Orco knockout or the presence of parasitic wasps.

      Thank you for pointing this out. We did not make a comparison between the data of Figures 3A and 3E since the two experiments were not conducted at the same time due to the limited space in our BioSafety Ⅲ greenhouse. We do agree that the weight decrease in Figure 3E is partly due to the reduced caterpillar growth shown in Figure 3A. However, we are confident that the additional decrease in caterpillar weight shown in Figure 3E is mainly driven by the presence of disarmed parasitoids. To be specific, the average weight in Figure 3A is 0.4544 g for WT and 0.4230 g for KO, so KO weight is 93.1% of WT weight. In Figure 3E, the average weight is 0.4273 g for WT and 0.3637 g for KO, so KO weight is 85.1% of WT weight. We will discuss this interaction between caterpillar growth and the effect of the parasitoid attacks more extensively in the revised version of the manuscript.
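
      The percentages quoted in this response can be checked with a few lines of Python. The sketch below uses only the four mean weights given above; the function and variable names are hypothetical, and because the two experiments were not run concurrently the resulting gap is indicative only.

```python
def relative_weight(ko_mean_g, wt_mean_g):
    """Return the KO mean weight as a percentage of the WT mean weight."""
    return 100.0 * ko_mean_g / wt_mean_g


# Mean weights (g) quoted in the response above.
without_wasps = relative_weight(0.4230, 0.4544)  # Figure 3A, no parasitoids
with_wasps = relative_weight(0.3637, 0.4273)     # Figure 3E, disarmed parasitoids

# Gap, in percentage points, between the two experiments.
extra_deficit = without_wasps - with_wasps

print(f"{without_wasps:.1f}% vs {with_wasps:.1f}% (gap {extra_deficit:.1f} points)")
```

      The gap of roughly 8 percentage points is the "additional decrease" that the authors attribute mainly to the presence of disarmed parasitoids.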

      (7) Lines 179-181: Figure 3F shows that the survival rate of larvae of Orco KO mutant decreased in the presence of parasitic wasps, and the difference in survival rate of larvae of WT and Orco KO mutant in the absence of parasitic wasps should also be compared. The current data cannot determine whether the reduced survival of the KO mutant is due to the Orco knockout or the presence of parasitic wasps.

      We are happy that you highlight this point. When conducting these experiments, we selected groups of caterpillars and carefully placed them on a leaf with minimal disturbance, which minimized injury and mortality. We did test the survival of caterpillars in the absence of parasitoid wasps in the experiment presented in Figure 3A, although this was missing from the manuscript. There is no significant difference in the survival rate of caterpillars between the two genotypes in the absence of wasps (average mortality WT = 8.8 %, average mortality KO = 2.9 %; P = 0.088, Wilcoxon test), so the decreased survival rate is most likely due to the attack of the wasps. We will add this information to the revised version of the manuscript.

      (8) In Figure 4B, why do the compounds tested have no volatiles derived from plants? Cruciferous plants have the well-known mustard bomb. In the behavioral experiments, the larvae responses to ITC compounds were not included, which is suggested to be explained in the discussion section.

      Thank you for the suggestion. We assume you mean Figure 4D/4E instead of Figure 4B. In Figure 4B, many of the identified chemical compounds are essentially plant volatiles, especially those from caterpillar frass and caterpillar spit. In Figure 4D/4E, most of the tested chemicals are derived from plants. We did include several ITCs in the butterfly EAG tests shown in Figure 2A/B; however, because the butterfly antennae did not respond strongly to ITCs, we did not include ITCs in the subsequent larval behavioural tests. Instead, the tested chemicals in Figure 4D/4E either elicit high EAG responses of butterflies or have been identified as significant by VIP scores in the chemical analyses. We will add this explanation to the revised version of our manuscript.

      (9) The custom-made setup and the relevant behavioral experiments in Figure 4C need to be described in detail (Line 545).

      We will add more detailed descriptions for the setup and method in the Materials and Methods.

      (10) Materials and Methods Line 448: 10 μL paraffin oil should be used for negative control.

      Thank you for pointing this out. We used both clean filter paper and clean filter paper with 10 μL paraffin oil as negative controls, but we did not find a significant difference between the two controls. Therefore, in the EAG results of Figure 2A/2B, we presented paraffin oil as one of the tested chemicals. We will re-run our statistical tests with paraffin oil as negative control, although we do not expect any major differences to the previous tests.

      Reviewer #2 (Public review):

      Summary:

      This manuscript investigated the effect of olfactory cues on caterpillar performance and parasitoid avoidance in Pieris brassicae. The authors knocked out Orco to produce caterpillars with significantly reduced olfactory perception. These caterpillars showed reduced performance and increased susceptibility to a parasitoid wasp.

      Strengths:

      This is an impressive piece of work and a well-written manuscript. The authors have used multiple techniques to investigate not only the effect of the loss of olfactory cues on host-parasitoid interactions, but also the mechanisms underlying this.

      Weaknesses:

      (1) I do have one major query regarding this manuscript - I agree that the results of the caterpillar choice tests in a y-maze give weight to the idea that olfactory cues may help them avoid areas with higher numbers of parasitoids. However, the experiments with parasitoids were carried out on a single plant. Given that caterpillars in these experiments were very limited in their potential movement and source of food - how likely is it that avoidance played a role in the results seen from these experiments, as opposed to simply the slower growth of the KO caterpillars extending their period of susceptibility? While the two mechanisms may well both take place in nature - only one suggests a direct role of olfaction in enemy avoidance at this life stage, while the other is an indirect effect, hence the distinction is important.

      We do agree with your comment that both mechanisms may be at work in nature, and we do address this in the Discussion section. In our study, we did find that wildtype caterpillars were more efficient in locating their food source and did grow faster on full plants than knockout caterpillars. This faster growth will enable wildtype caterpillars to more quickly outgrow the life-stages most vulnerable to the parasitoids (L1 and L2). The olfactory system therefore supports the escape from parasitoids indirectly by enhancing feeding efficiency directly.

      In addition, we show in our Y-tube experiments that WT caterpillars were able to avoid plants where conspecifics are under attack by parasitoids (Figure 3D). Therefore, we speculate that WT caterpillars make use of volatiles from the plant or from conspecifics via their spit or faeces to avoid plants or leaves potentially attracting natural enemies. Knockout caterpillars are unable to use these volatile danger cues and therefore do not avoid plants or leaves that are most attractive to their natural enemies, making KO caterpillars more susceptible and leading to more natural enemy harassment. Through this, olfaction also directly impacts the ability of a caterpillar to find an enemy-free feeding site.

      We think that olfaction supports the enemy avoidance of caterpillars via both these mechanisms, although at different time scales. Unfortunately, our analysis was not detailed enough to discern the relative importance of the two mechanisms we found. However, we feel that this would be an interesting avenue for further research. Moreover, we will sharpen our discussion on the potential importance of the two different mechanisms in the revised version of this manuscript.

      (2) My other issue was that determining the sample sizes from the text was sometimes a bit confusing. (This was much clearer from the figures).

      We will revise how sample sizes are reported in the text to make them clearer.

      (3) I also couldn't find the test statistics for any of the statistical methods in the main text, or in the supplementary materials.

      Thank you for pointing this out. We will provide more detailed test statistics in the main text and in the supplementary materials of the revised version of the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the editor and reviewers for their supportive comments about our modeling approach and conclusions, and for raising several valid concerns; we address them briefly below. In addition, a detailed, point-by-point response to the reviewers’ comments is below, along with additions and edits we have made to the revised manuscript.

      Concerns about model’s biological realism and impact on interpretations

      The goal of this paper was to use an interpretable and modular model to investigate the impact of varying sensorimotor delays. Aspects of the model (e.g. layered architecture, modularity) are inspired by biology; at the same time, necessary abstractions and simplifications (e.g. using an optimal controller) are made for interpretability and generalizability, and they reflect common approaches from past work. The hypothesized effects of certain simplifying assumptions are discussed in detail in Section 3.5. Furthermore, the modularity of our model allows us to readily incorporate additional biological realism (e.g. biomechanics, connectomics, and neural dynamics) in future work. In the revision, we have added citations and edits to the text to clarify these points.

      Concerns that the model is overly complex

      To investigate the impact of sensorimotor delays on locomotion, we built a closed-loop model that recapitulates the complex joint trajectories of fly walking. We agree that locomotion models face a tradeoff between simplicity/interpretability and realism. We therefore developed a model that was as simple and interpretable as possible, while still reasonably recapitulating joint trajectories and generalizing to novel simulation scenarios. Along these lines, we also did not select a model that primarily recreates empirical data, as this would hinder generalizability and add unnecessary complexity to the model. We do not think these design choices are significant weaknesses of this model; in fact, few comparable models account for all joints involved in locomotion, and fewer explicitly compare model kinematics with kinematics from data. We have added citations and edits to the text to clarify these points in the revision.

      Concerns about the validity of the Kinematic Similarity (KS) metric to evaluate walking

      We chose to incorporate only the first two PCA modes in the KS metric because the kernel density estimator performs poorly for high-dimensional data. Our primary use of this metric was to indicate whether the simulated fly continues walking in the presence of perturbations. For technical reasons, it is not feasible to perform equivalent experiments on real walking flies, which is one of the reasons we explore this phenomenon with the model. We note the dramatic shift from walking to nonwalking as delay increases (Figure 5). To be thorough, in the revision, we have investigated the effect of incorporating additional PCA modes, and whether this affects the interpretation of our results. We have additionally added to the discussion and presentation of the KS metric to clarify its purpose in this study. We agree with the reviewers that the KS metric is too coarse to reflect fine details of joint kinematics; indeed, in the unperturbed case, we evaluate our model’s performance using other metrics based on comparisons with empirical data (Figures 2, 7, 8).

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, the authors present a novel, multi-layer computational model of motor control to produce realistic walking behaviour of a Drosophila model in the presence of external perturbations and under sensory and motor delays. The novelty of their model of motor control is that it is modular, with divisions inspired by the fly nervous system, with one component based on deep learning while the rest are based on control theory. They show that their model can produce realistic walking trajectories. Given the mostly reasonable assumptions of their model, they convincingly show that the sensory and motor delays present in the fly nervous system are the maximum allowable for robustness to unexpected perturbations.

      Their fly model outputs torque at each joint in the leg, and their dynamics model translates these into movements, resulting in time-series trajectories of joint angles. Inspired by the anatomy of the fly nervous system, their fly model is a modular architecture that separates motor control at three levels of abstraction:

      (1) oscillator-based model of coupling of phase angles between legs,

      (2) generation of future joint-angle trajectories based on the current state and inputs for each leg (the trajectory generator), and

      (3) closed-loop control of the joint-angles using torques applied at every joint in the model (control and dynamics).

      These three levels of abstraction ensure coordination between the legs, future predictions of desired joint angles, and corrections to deviations from desired joint-angle trajectories. The parameters of the model are tuned in the absence of external perturbations using experimental data of joint angles of a tethered fly. A notable disconnect from reality is that the dynamics model used does not model the movement of the body and ground contacts as is the case in natural walking, nor the movement of a ball for a tethered fly, but instead something like legs moving in the air for a tethered fly.
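
      The first of the three levels listed above, phase coupling between legs, can be pictured with a generic coupled-phase-oscillator sketch. The Python code below is an assumption-laden illustration and not the authors' implementation: the Kuramoto-style coupling rule, the idealized alternating-tripod offsets, the 10 Hz stepping frequency, the gain, and all function names are hypothetical choices made only to show how such coupling restores a target phase relationship between six legs.

```python
import math


def couple_leg_phases(phases, offsets, freq_hz=10.0, gain=2.0, dt=0.001, steps=5000):
    """Integrate one phase oscillator per leg with forward Euler.

    Every leg advances at a common stepping frequency and is pulled toward
    the target phase difference offsets[j] - offsets[i] with each other leg j.
    """
    phases = list(phases)
    omega = 2.0 * math.pi * freq_hz
    n = len(phases)
    for _ in range(steps):
        rates = []
        for i in range(n):
            pull = sum(
                math.sin(phases[j] - phases[i] - (offsets[j] - offsets[i]))
                for j in range(n) if j != i
            )
            rates.append(omega + gain * pull)
        phases = [p + dt * r for p, r in zip(phases, rates)]
    return phases


def phase_error(phases, offsets):
    """Largest deviation (radians) from the target offsets, relative to leg 0."""
    errors = []
    for p, o in zip(phases, offsets):
        diff = (p - phases[0]) - (o - offsets[0])
        errors.append(abs(math.atan2(math.sin(diff), math.cos(diff))))
    return max(errors)


# Idealized tripod gait (assumed): alternating legs are half a cycle apart.
TRIPOD = [0.0, math.pi, 0.0, math.pi, 0.0, math.pi]
# Start from the target pattern with arbitrary small phase errors added.
start = [o + e for o, e in zip(TRIPOD, [0.0, 0.4, -0.3, 0.5, -0.2, 0.1])]
final = couple_leg_phases(start, TRIPOD)
print(phase_error(start, TRIPOD), phase_error(final, TRIPOD))
```

      In this toy the coupling only coordinates timing between legs; producing joint-angle trajectories and applying torques would be the job of the other two levels.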

      In order to validate the realism of the generated simulated walking trajectories, the authors compare various attributes of simulated to real tethered fly trajectories and show qualitative and quantitative similarities, including using a novel metric coined as Kinematic Similarity (KS). The KS score of a trajectory is a measure of the likelihood that the trajectory belongs to the distribution of real trajectories estimated from the experimental data. While such a metric is a useful tool to validate the quality of simulated data, there is some room for improvement in the actual computation of this score. For instance, the KS score is computed for any given time-window of walking simulation using a fraction of information from the joint-angle trajectories. It is unclear if the remaining information in joint-angle trajectories that are not used in the computation of the KS score can be ignored in the context of validating the realism of simulated walking trajectories.

      The authors validate simulated walking trajectories generated by the trained model under a range of sensorimotor delays and external perturbations. The trained model is shown to generate realistic joint-angle trajectories in the presence of external perturbations as long as the sensorimotor delays are constrained within a certain range. This range of sensorimotor delays is shown to be comparable to experimental measurements of sensorimotor delays, leading to the conclusion that the fly nervous system is just fast enough to be robust to perturbations.

      Strengths:

      This work presents a novel framework to simulate Drosophila walking in the presence of external perturbations and sensorimotor delay. Although the model makes some simplifying assumptions, it has sufficient complexity to generate new, testable hypotheses regarding motor control in Drosophila. The authors provide evidence for realistic simulated walking trajectories by comparing simulated trajectories generated by their trained model with experimental data using a novel metric proposed by the authors. The model proposes a crucial role in future predictions to ensure robust walking trajectories against external perturbations and motor delay. Realistic simulations under a range of prediction intervals, perturbations, and motor delays generating realistic walking trajectories support this claim. The modular architecture of the framework provides opportunities to make testable predictions regarding motor control in Drosophila. The work can be of interest to the Drosophila community interested in digitally simulating realistic models of Drosophila locomotion behaviors, as well as to experimentalists in generating testable hypotheses for novel discoveries regarding neural control of locomotion in Drosophila. Moreover, the work can be of broad interest to neuroethologists, serving as a benchmark in modelling animal locomotion in general.

      We thank the reviewer for their positive comments.

      Weaknesses:

      As the authors acknowledge in their work, the control and dynamics model makes some simplifying assumptions about Drosophila physics/physiology in the context of walking. For instance, the model does not incorporate ground contact forces and inertial effects of the fly's body. It is not clear how these simplifying assumptions would affect some of the quantitative results derived by the authors. The range of tolerable values of sensorimotor delays that generate realistic walking trajectories is shown to be comparable with sensorimotor delays inferred from physiological measurements. It is unclear if this comparison is meaningful in the context of the model's simplifying assumptions.

      We now discuss how some of these assumptions affect the quantitative results in the section “Towards biomechanical and neural realism”. We reproduce the relevant sentences below:

      “The inclusion of explicit leg-ground contact interactions would also make it harder for the model to recover when perturbed, because perturbations during walking often occur upon contact with the ground (e.g. the ground is slippery or bumpy).”

      “We anticipate that the increased sensory resolution from more detailed proprioceptor models and the stability from mechanical compliance of limbs in a more detailed biomechanical model would make the system easier to control and increase the allowable range of delay parameters. Conversely, we expect that modeling the nonlinearity and noise inherent to biological sensors and actuators may decrease the allowable range of delay parameters.”

      The authors propose a novel metric coined as Kinematic Similarity (KS) to distinguish realistic walking trajectories from unrealistic walking trajectories. Defining such an objective metric to evaluate the model's predictions is a useful exercise, and could potentially be applied to benchmark other computational animal models that are proposed in the future. However, the KS score proposed in this work is calculated using only the first two PCA modes that cumulatively account for less than 50% of the variance in the joint angles. It is not obvious that the information in the remaining PCA modes may not change the log-likelihood that occurs in the real walking data.

      The primary reason we designed the KS metric was to determine whether the simulated fly continues walking in the presence of perturbations. We initially limited the analysis of the KS to the first 2 principal components. For completeness, we now investigate the additional principal components in Appendix 9 and the effect of evaluating KS with different numbers of components in Appendix 10. 

      Overall, the results look similar when including additional components for impulse perturbations. For stochastic perturbations, the range of similar walking decreases as we increase the number of components used to evaluate walking kinematics. Comparing this with Appendix 9, which shows that higher components represent higher frequencies of the walking cycle, we conclude that at the edge of stability for delays (where the sum of sensory and actuation delays is about 40 ms), flies can continue walking but with impaired higher frequencies (relative to no perturbations) during and after perturbation.

      We added the following text in the methods:

      “We chose 2 dimensions for PCA for two key reasons. First, these 2 dimensions alone accounted for a large portion of the variance in the data (52.7% total, with 42.1% for first component and 10.6% for second component). There was a big drop in variance explained from the first to the second component, but no sudden drop in the next 10 components (see Appendix 9). Second, the KDE procedure only works effectively in low-dimensional spaces, and the minimal number of dimensions needed to obtain circular dynamics for walking is 2. We investigate the effect of varying the number of dimensions of PCA in Appendix 10.”

      (Note that we have corrected the percentage of variance accounted for by the principal components, as these numbers were from an older analysis prior to the first draft.)
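
      To make the idea behind a density-based similarity score concrete, the Python sketch below evaluates a two-dimensional Gaussian kernel density estimate built from reference points and averages its log-likelihood along a trajectory. It is a minimal illustration under stated assumptions and not the authors' exact KS definition: the points are assumed to be already projected onto two principal components, the ring-shaped reference data, the bandwidth of 0.2, and the function names are all hypothetical.

```python
import math


def log_kde(point, samples, bandwidth):
    """Log of a 2D isotropic Gaussian kernel density estimate at `point`."""
    log_norm = -math.log(2.0 * math.pi * bandwidth ** 2) - math.log(len(samples))
    exponents = [
        -((point[0] - sx) ** 2 + (point[1] - sy) ** 2) / (2.0 * bandwidth ** 2)
        for sx, sy in samples
    ]
    peak = max(exponents)  # log-sum-exp trick to avoid underflow
    return log_norm + peak + math.log(sum(math.exp(e - peak) for e in exponents))


def similarity_score(trajectory, samples, bandwidth=0.2):
    """Mean log-likelihood of a trajectory under the reference density."""
    return sum(log_kde(p, samples, bandwidth) for p in trajectory) / len(trajectory)


def ring(n, radius=1.0, phase=0.0):
    """Points on a circle, a stand-in for cyclic walking in a 2D PC space."""
    return [
        (radius * math.cos(phase + 2.0 * math.pi * k / n),
         radius * math.sin(phase + 2.0 * math.pi * k / n))
        for k in range(n)
    ]


reference = ring(200)             # made-up "real walking" data
walking = ring(50, phase=0.01)    # trajectory that keeps cycling
stalled = [(0.0, 0.0)] * 50       # trajectory that has collapsed to a point
print(similarity_score(walking, reference), similarity_score(stalled, reference))
```

      A trajectory that keeps tracing the cycle scores higher than one that has stopped moving, which is the coarse walking versus nonwalking distinction the response describes. Each added dimension makes the kernel estimate sparser, which is consistent with the authors' stated reason for keeping the projection low-dimensional.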

      We also reference Appendix 10 in the results:

      “We observed that robust walking was not contingent on the specific values of motor and sensory delay, but rather the sum of these two values (Fig. 5E). Furthermore, as delay increases, higher frequencies of walking are impacted first before walking collapses entirely (Appendix 10).”
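
      One way to see why only the sum of the two delays might matter is a single feedback loop in which the sensory and motor delays act in series. The Python sketch below is a toy scalar example and not the authors' model: the integrator plant, the proportional gain of 0.3, the delays measured in time steps, and the function names are all hypothetical. In this toy, swapping the two delays leaves the response unchanged, and the loop goes unstable once their sum becomes too large.

```python
def simulate(sensory_delay, motor_delay, gain=0.3, steps=400):
    """Toy delayed feedback loop recovering from a unit perturbation.

    The controller sees the state `sensory_delay` steps late, and its
    command reaches the plant `motor_delay` steps after being issued.
    """
    x = [1.0]       # state deviation after a perturbation at t = 0
    commands = []   # motor commands issued by the controller
    for t in range(steps):
        seen = t - sensory_delay
        measurement = x[seen] if seen >= 0 else 0.0
        commands.append(-gain * measurement)
        acted = t - motor_delay
        applied = commands[acted] if acted >= 0 else 0.0
        x.append(x[t] + applied)
    return x


def residual(trajectory, window=50):
    """Largest deviation over the last `window` steps."""
    return max(abs(v) for v in trajectory[-window:])


same_sum = simulate(1, 3) == simulate(3, 1)   # total delay of 4 either way
recovered = residual(simulate(2, 2))          # total delay 4
diverged = residual(simulate(3, 3))           # total delay 6
print(same_sum, recovered, diverged)
```

      The full model contains additional layers such as trajectory prediction, so this sketch only motivates why a sum-of-delays dependence is plausible; it does not reproduce Fig. 5E.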

      Reviewer #2 (Public Review):

      Summary:

      In this study, Karashchuk et al. develop a hierarchical control system to control the legs of a dynamic model of the fly. They intend to demonstrate that temporal delays in sensorimotor processing can destabilize walking and that the fly's nervous system may be operating with as long of delays as could possibly be corrected for.

      Strengths:

      Overall, the approach the authors take is impressive. Their model is trained using a huge dataset of animal data, which is a strength. Their model was not trained to reproduce animal responses to perturbations, but it successfully rejects small perturbations and continues to operate stably. Their results are consistent with the literature, that sensorimotor delays destabilize movements.

      Weaknesses:

      The model is sophisticated and interesting, but the reviewer has great concerns regarding this manuscript's contributions, as laid out in the abstract:

      (1) Much simpler models can be used to show that delays in sensorimotor systems destabilize behavior (e.g., Bingham, Choi, and Ting 2011; Ashtiani, Sarvestani, and Badri-Sproewitz 2021), so why create this extremely complex system to test this idea? The complexity of the system obscures the results and leaves the reviewer wondering if the instability is due to the many, many moving parts within the model. The reviewer understands (and appreciates) that the authors tested the impact of the delay in a controlled way, which supports their conclusion. However, the reviewer thinks the authors did not use the most parsimonious model possible, and as such, leave many possible sources for other causes of instability.

      We thank the reviewer for this observation — we agree that we did not make the goal of the work quite clear. The goal of this paper was to build an interpretable and generalizable model of fly walking, which was then used to investigate varying sensorimotor delays in the context of locomotion. To this end, we used a modular model to recreate walking kinematics, and then investigated the effect of delays on locomotion. Locomotion in itself is a complex phenomenon — thus, we have chosen a model that is complex enough to reasonably recapitulate joint trajectories, while remaining interpretable.

      We have clarified this in the text near the end of the introduction:

      “Here, we develop a new, interpretable, and generalizable model of fly walking, which we use to investigate the impact of varying sensorimotor delays in Drosophila locomotion.”

      We also emphasize the investigation of sensorimotor delays in the context of locomotion in the beginning of the “Effect of sensory and motor delays on walking” section:

      “... we used our model to investigate how changing sensory and motor delays affects locomotor robustness.”

      We also remark that while they are very relevant papers for our work, neither of the prior papers focus on locomotion: the first involves a 2D balance model of a biped, and the second involves drop landings of quadrupeds.

      Lastly, we note that the investigation of delay is not the only use for this model. In the future, this model can also be used to study other aspects of locomotion such as the role of proprioceptive feedback (see “Role of proprioceptive feedback in fly walking” section). The layered framework of the model can also be extended to other animals and locomotor strategies (see “Layered model produces robust walking and facilitates local control” section).

      (2) In a related way, the reviewer is not sure that the elements the authors introduced reflect the structure or function of the fly's nervous system. For example, optimal control is an active field of research and is behind the success of many-legged robots, but the reviewer is not sure what evidence exists that suggests the fly ventral nerve cord functions as an optimal controller. If this were bolstered with additional references, the reviewer would be less concerned.

      We thank the reviewer for the comment — we have now further clarified how our model elements reflect the fly’s nervous system. The elements we introduce are plausible but only loosely analogous to the fly’s nervous system. While we draw parallels from these elements to anatomy (e.g. in Fig 1A-B, and in the first paragraph of the Results section), we do not mean to suggest that these functional elements directly correspond to specific structures in the fly’s nervous system. A substantial portion of the suggested future work (see “Towards biomechanical and neural realism”) aims to bridge the gap between these functional elements and fly physiology, which is beyond the scope of this work. 

      We have added clarifying text to the Results section:

      “While the model is inspired by neuroanatomy, its components do not strictly correspond to components of the nervous system --- the construction of a neuroanatomically accurate model is deferred to future work (see Discussion).”

      In the specific case of optimal control: it is a theoretical model that predicts various aspects of motor control in humans, and there is evidence that it is implemented by the human nervous system (Todorov and Jordan, 2002; Scott, 2004; Berret et al., 2011). Based on this, we assume that optimal control is also a reasonable model for motor control as implemented by the fly nervous system. Fly movement makes use of proprioceptive feedback signals (Mendes et al., 2013; Pratt et al., 2024; Berendes et al., 2016), and optimal control is a plausible mechanism that incorporates feedback signals into movement.

      We have added the following clarifying text in the Results section: 

      “The optimal controller layer maintains walking kinematics in the presence of sensorimotor delays and helps compensate for external perturbations. This design was inspired by optimal control-based models of movements in humans (Todorov and Jordan, 2002; Scott, 2004; Berret et al., 2011)”

      (3) "The model generates realistic simulated walking that matches real fly walking kinematics...". The reviewer appreciates the difficulty in conducting this type of work, but the reviewer cannot conclude that the kinematics "match real fly walking kinematics". The range of motion of several joints is 30% too small compared to the animal (Figure 2B) and the reviewer finds the video comparisons unpersuasive. The reviewer would understand if there were additional constraints, e.g., the authors had designed a robot that physically could not complete the prescribed motions. However the reviewer cannot think of a reason why this simulation could not replicate the animal kinematics with arbitrary precision, if that is the goal.

      We agree with the reviewer that the model-generated kinematics are not perfectly indistinguishable from real walking kinematics, and now clarify this in the text. We also agree with the reviewer that one could build a model that precisely replicates real kinematics, but as they intuit, that was not our goal. Our goal was to build a model that both replicates animal kinematics, and is interpretable and generalizable (which allows us to investigate what happens when perturbations and varying sensorimotor delays are introduced). There is a trade-off between realism and generalizability — a simulation that fully recreates empirical data would require a model that is completely fit to data, which is likely to be more complex (in terms of parameters required) and less generalizable to novel scenarios. We have made design choices that result in a model that balances these trade-offs. We do not consider this to be a weakness of the model; in fact, few comparable models account for all joints involved in locomotion, and fewer explicitly compare model kinematics with kinematics from data.

      We have tempered the language in the abstract:

      “The model generates realistic simulated walking that resembles real fly walking kinematics”

      The tempered statement, we believe, is a fair characterization of the walking — it resembles but does not perfectly match real kinematics.

      We have also introduced clarifying text in the introduction:

      “Overall, existing walking models focus on either kinematic or physiological accuracy, but few achieve both, and none consider the effect of varying sensorimotor delays. Here, we develop a new, interpretable, and generalizable model of fly walking, which we use to investigate the impact of varying sensorimotor delays in Drosophila locomotion.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Potential typo on page 5:

      2.1.2 Joint kinematics trajectory generator

      Paragraph 4, last line: Original text - ".....it also estimates the current phase". Suggested correction - "...it also estimates the current phase velocity"

      Done

      Potential typo on page 8:

      2.3 Model maintains walking under unpredictable external perturbations.

      Paragraph 3, line 2: Original text - "...brief, unexpected force (e.g. legs slipping on an unstable surface)".

      Consider replacing force with motion, or providing an example of a force as opposed to displacement (slipping).

      Done

      Potential typo on page 8:

      2.3 Model maintains walking under unpredictable external perturbations.

      Paragraph 3, line 4: Original text - "The magnitude of this velocity is drawn from a normal distribution...".

      Is this really magnitude? If so, please discuss how the sign (+/-) is assigned to velocity, and how the normal distribution is centred so as to sample only positive values representing magnitude.

      Indeed the magnitude of the velocity is drawn from a normal distribution. A positive or negative sign is then assigned with equal odds. We have added text to clarify this:

      “The sign of the velocity was drawn separately so that there is equal likelihood for negative or positive perturbation velocities.”
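The sampling scheme described in this response can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name, the zero-mean normal, the use of an absolute value to obtain a non-negative magnitude, and the value of `sigma` are all assumptions made for the example.

```python
import numpy as np

def sample_perturbation_velocity(rng, sigma, size):
    """Draw perturbation velocities as (random sign) * (non-negative magnitude)."""
    # Magnitude: absolute value of a zero-mean normal draw.
    # (Assumption: the source does not state how non-negativity is enforced.)
    magnitude = np.abs(rng.normal(loc=0.0, scale=sigma, size=size))
    # Sign: drawn separately, -1 or +1 with equal probability.
    sign = rng.choice(np.array([-1.0, 1.0]), size=size)
    return sign * magnitude

rng = np.random.default_rng(0)
velocities = sample_perturbation_velocity(rng, sigma=2.0, size=10_000)
```

With this construction, positive and negative perturbation velocities are equally likely, while the magnitude distribution is controlled by a single scale parameter.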

      Page 8:

      2.3 Model maintains walking under unpredictable external perturbations.

      In Paragraph 5: Why is the data reduced to only 2 dimensions? Could higher order PCA modes (cumulatively accounting for more than 50% variance in the data) not have distinguishing information between realistic and unrealistic walking trajectories?

      We provide a longer response for this in the public review above.

      Page 11:

      Why wouldn't a system trained in the presence of external perturbations perform better? What is the motivation to remove external perturbations during training?

      We agree that a system trained in the presence of external perturbations would probably perform better — however, we do not have data that contains walking with external perturbations. Nothing was removed — all the data used in this study involve a fly walking without perturbations.

      We have added a clarification:

      “our model maintains realistic walking in the presence of external dynamic perturbations, despite being trained only on data of walking without perturbations (no perturbation data was available).”

      Page 16:

      4.1 Tracking joint angles of D. melanogaster walking in 3D.

      Paragraph 1: Readers who wish to collect similar data might benefit from specifying the exposure time, animal size in pixels (or camera sensor format and field of view), in addition to the frame rate. Alternatively, consider mentioning the camera and lens part numbers provided by the manufacturer.

      This is a good point. We have updated the text to include these specifications:

      “We obtained fruit fly D. melanogaster walking kinematics data following the procedure previously described in (Karashchuk et al, 2021). Briefly, a fly was tethered to a tungsten wire and positioned on a frictionless spherical treadmill ball suspended on compressed air. Six cameras (Basler acA800-510um with Computar zoom lens MLM3X-MP) captured the movement of all of the fly's legs at 300 Hz. The fly size in pixels ranges from about 300x300 up to 700x500 pixels across the 6 cameras. Using Anipose, we tracked 30 keypoints on the fly, which are the following 5 points on each of the 6 legs: body-coxa, coxa-femur, femur-tibia, and tibia-tarsus joints, as well as the tip of the tarsus.”

      Potential typos on page 18:

      4.3.3 Training procedure

      Paragraph 2, line 1: Original text - "..(, p)"

      Do the authors mean "...(, )"

      Paragraph 2, line 2: Original text - "... (,, v, p)" Do the authors mean "... (,, v, )"?

      Paragraph 3, line 3: Original text - "... (,, v, p)" Do the authors mean "... (,, v, )"?

      Thank you for pointing out this issue. We have now changed the phase symbol from p to \phi, consistent with the rest of the text.

      Paragraph 3, line 3: Original text - "...()"

      Do the authors mean "(d)"? If not, please discuss the difference between and d.

      Thank you for pointing this out. \hat \theta and \theta_d were used interchangeably, which was confusing. We have standardized our reference to the desired trajectory as \theta_d throughout the text.

      Page 19:

      Typo after eqn. (6):

      Original text: "where x := q - q, ... A and B are Jacobians with respect to...."

      Correction: "where x := q - q, ... Ac and Bc are Jacobians with respect to...."

      Similar corrections in eqn. 7 and eqn. 8: A and B should be replaced with Ac and Bc.

      Done

      Page 19, eqn. (10b):

      Should the last term be qd(t+T) as opposed to qd(t+1)?

      No: in fact (10a) contains the typo: it should be y(t+1) as opposed to y(t+T). This has been fixed.

      Page 19

      The authors' detailed description of the initial steps leading up to the dynamics model, involving the construction of the ODE, linearizing the system about the fixed point makes the text broadly accessible to the general reader. Similarly, adding some more description of the predictive model (eqn. 11 - 15) could improve the text's accessibility and the reader's appreciation for the model. This is especially relevant since the effects of sensorimotor delay and external perturbations, which are incorporated in the control and dynamics model, form a major contribution to this work. What do the matrices F, G, L, H, and K look like for the Drosophila model? Are there any differences between the model in Stenberg et al. (referenced in the paper) and the authors' model for predictive control? Are there any differences in the assumptions made in Stenberg et al. compared to the model presented in this work? The readers would likely also benefit from a figure showing the information flow in the model, and describing all the variables used in the predictive control model in eqn. 11 through eqn. 15 (analogous to Figure 1 in Stenberg et al. (2022)). Such a detailed description of the control and dynamics model would help the reader easily appreciate the assumptions made in modelling the effects of sensorimotor delay and external perturbations.

      Done

      Page 20:

      Eqn. 12: Should z(t+1) be z(t+T) instead?

      Similar comment for eqn. 14

      No: we made a mistake in (10a); there should be no (t+T) terms; all terms should be (t+1) terms to reflect a standard discrete-time difference equation.

      Eqn. 13: r(t) can be defined explicitly

      Done

      4.5 Generate joint trajectories of the complete model with perturbations

      Paragraph 2, line 2: Please read the previous comment

      \hat \theta and \theta_d were previously used interchangeably, which was confusing. We have standardized our reference to the desired trajectory as \theta_d throughout the text.

      Original text - "Every 8 timesteps, we set :=...."

      Does this mean \theta_d is set to \theta? If so, the motivation for this is not clear.

      We mean that \theta_d is set to be equal to \theta. We have replaced “:=” with “=” for clarity.

      General comments for the authors:

      Could the authors discuss the assumptions regarding Drosophila physiology implied in the control model?

      The control model is primarily included as a plausible functional element of the fly’s nervous system, and as such implies minimal assumptions on physiology itself. The main assumption, which is evident from the description of the model components, is that the fly uses proprioceptive feedback information to inform future movements.

      We have added clarifying text to the Results section:

      “While the model is inspired by neuroanatomy, its components do not strictly correspond to components of the nervous system --- the construction of a neuroanatomically accurate model is deferred to future work (see Discussion).”

      The authors acknowledge the absence of ground contact forces in the model. It is probably worth discussing how this simplification may affect inferences regarding the acceptable range of sensorimotor delay in generating realistic walking trajectories.

      We agree, and discuss how some of these assumptions affect the quantitative results in the section “Towards biomechanical and neural realism”. We replicate the relevant sentences below:

      “The inclusion of explicit leg-ground contact interactions would also make it harder for the model to recover when perturbed, because perturbations during walking often occur upon contact with the ground (e.g. the ground is slippery or bumpy).”

      The effects of other simplifications are also mentioned in the same section.

      Can the authors provide an insight into why the use of a second derivative of joint angles as the output of the trajectory generator () leads to more realistic trajectories (4.3.1 Model formulation, paragraph 1)?

      Does the use of a second-order derivative of joint angles lead to drift error because of integration?

      Could the distribution of θd produced be out of the domain due to drift errors? Could this affect the performance of the neural network model approximating the trajectory generator?

      We are not sure why the second derivative works better than the first derivative. It is possible that modeling the system as a second order differential equation gives the network more ability to produce complex dynamics. 

      As can be seen in the example time series in Figures 2 and 3 and supplemental videos, there is no drift error from integration, so it is unlikely to affect the performance of the neural network.
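To make the integration question concrete, the sketch below shows one way joint angles can be recovered from predicted accelerations by integrating twice. It is only an illustration under stated assumptions: the semi-implicit Euler scheme, the function names, and the toy harmonic oscillator standing in for the trained network are hypothetical; only the 300 Hz sampling rate comes from the text.

```python
import numpy as np

def integrate_accelerations(accel_fn, theta0, omega0, dt, n_steps):
    """Integrate angular accelerations twice to obtain a joint-angle trajectory."""
    theta, omega = float(theta0), float(omega0)
    trajectory = np.empty(n_steps)
    for i in range(n_steps):
        # Semi-implicit Euler: update the velocity first, then the angle.
        omega += accel_fn(theta, omega) * dt
        theta += omega * dt
        trajectory[i] = theta
    return trajectory

def toy_accel(theta, omega, freq_hz=10.0):
    # Stand-in for the network output: a harmonic oscillator at 10 Hz.
    w = 2.0 * np.pi * freq_hz
    return -(w ** 2) * theta

# 10 seconds of simulated time at 300 Hz (the camera frame rate in the text).
traj = integrate_accelerations(toy_accel, theta0=1.0, omega0=0.0,
                               dt=1.0 / 300.0, n_steps=3000)
```

In this toy setting the oscillation amplitude stays bounded over many cycles, which is the kind of behavior one would check for when asking whether integration introduces drift.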

      What does the model's failure (quantified by a low KS score) look like in the context of fly dynamics? What do the joint angles look like for low values of KS score? Does the fly fall down, for example?

      Since the model primarily considers kinematics, a low KS score means that kinematics are unrealistic, e.g. the legs attain unnatural angles or configurations. Examples of this can be seen in videos 4-7 (linked from Appendix 1 of the paper), as well as in the bottom row of Fig. 5, panel A. Here, at 40ms of motor delay, L2 femur rotation is seen to attain values that far exceed the normal ranges. 

      We have added a small clarification in the caption of Fig.5 panel A:

      “low KS indicates that the perturbed walking deviates from data and results in unnatural angles (as seen at 40ms motor delay)”

      We remark that since our simulations do not incorporate contact forces (as the reviewer remarks above, we simulate something like legs moving in the air for a tethered fly), the fly cannot “fall down” per se. However, if forces were incorporated then yes, these unrealistic kinematics would correspond to a fly that falls down or is no longer walking.

      Reviewer #2 (Recommendations For The Authors):

      L49: "Computational models of locomotion do not typically include delay as a tunable parameter, and most existing models of walking cannot sustain locomotion in the presence of delays and external perturbations". This remark confuses the reviewer.

      (1) If models do not "typically" include delay as a tunable parameter, this suggests that atypical models do. Which models do? Please provide references.

      Our initial phrasing was confusing. We meant to say that most models do not include delay, and some models do include delay as a fixed value (rather than a tunable value). We clarify in the updated text, which is replicated below:

      “Computational models of locomotion typically have not included delays as a tunable parameter, although some models have included them as fixed values (Geyer and Herr, 2010; Geijtenbeek et al., 2013).”

      (2) Has the statement that most existing models cannot sustain locomotion with delays been tested? If so, provide references. If not, please remove this statement or temper the language.

      Since most models don’t include delays, they cannot be run in scenarios with delays. We clarify in the updated text, which is replicated below:

      “Computational models of locomotion have not typically included delays. Some have included delay as a fixed value rather than a tunable parameter (Geyer and Herr, 2010; Geijtenbeek et al., 2013). However, in general, the impact of sensorimotor delays on locomotor control and robustness remains an underexplored topic in computational neuroscience.”

      L57: "two of six legs lift off the ground at a time" - Two legs are off the ground at any time, but they do not "lift off" simultaneously in the fruit fly. To lift off simultaneously, contralateral leg pairs would need to be 33% out of phase with one another, but they are almost always 50% out of phase.

      Thank you for pointing out this oversight. We have updated the text accordingly:

      “Flies walk rhythmically with a continuum of stepping patterns that range from tetrapod (where two of six legs are off the ground at a time) to tripod (where three of six legs are off the ground at a time)”

      L88: "a new model of fly walking" - The intention of the authors is to produce a model from which to learn about walking in the fly, is that correct? The reviewer has read the paper several times now and wants to be sure that this is the authors' goal, not to engineer a control system for an animation or a robot.

      Indeed, this is our goal. We were previously unclear about this, and have made text edits to clarify this — we provide a longer response for this in the public review above (see (1)).

      L126: "These desired phases are synchronized across pairs of legs to maintain a tripod coordination pattern, even when subject to unpredictable perturbations." - Does the animal maintain tripod coordination even when perturbed? In the reviewer's experience, flies vary their interleg coordination all the time. The reviewer would also expect that if perturbed strongly (as the supplemental videos show), the animal would adapt its interleg coordination in response. The reviewer finds this assumption to be a weak point in the paper for the use of this disturbance in exploring animal locomotion.

      We do not know exactly how flies may react to our mechanical perturbations. However, we may hypothesize based on past papers. 

      Couzin-Fuchs et al (2015) apply a mechanical perturbation to walking cockroaches. They find that the tripod is temporarily broken immediately after the perturbation but the cockroach recovers to a full tripod within one step cycle.

      DeAngelis et al (2019) apply optogenetic perturbations to fly moonwalker neurons that drive backward walking. Flies slow down following perturbation, but then recover after 200ms (about 2-3 steps) to their original speed (on average). 

      Thus, we think it is reasonable to model a fly’s internal phase coupling to maintain tripod and for its intended speed to remain the same even after a perturbation. 

      We do agree with the reviewer that it is plausible a fly might also slow down or even stop after a perturbation and we do not model such cases. We have added some text to the discussion on future work:

      “Future work may also model how higher-level planning of fly behavior interacts with the lower-level coordination of joint angles and legs. Walking flies continuously change their direction and speed as they navigate the environment (Katsov et al, 2017; Iwasaki et al 2024). Past work shows that flies tend to recover and walk at similar speeds following perturbations (DeAngelis et al, 2019), but individual flies might still change walking speed, phase coupling, or even transition to other behaviors, such as grooming. Modeling these higher-level changes in behavior would involve combining our sensorimotor model with models for navigation (Fisher 2022) or behavioral transitions (Berman et al, 2016).”

      L136: "...to output joint torques to the physical model of each leg" - Is this the ultimate output of the nervous system? Muscles are certainly not idealized torque generators. There are dynamics related to activation and mechanics. The reviewer is skeptical that this is a model of neural control in the animal, because the computation of the nervous system would be tuned to account for all these additional dynamics.

      We agree with the reviewer that joint torques are not the ultimate output of the nervous system. We use a torque controller because it is parsimonious, and serves our purpose of creating an interpretable and modular locomotion model.

      We also agree that muscles are an important consideration — we make mention of them later on in the paper under the section “Toward biomechanical and neural realism”, where we state “Another step toward biological realism is the incorporation of explicit dynamical models of proprioceptors, muscles, tendons, and other biomechanical aspects of the exoskeleton.”

      Our goal is not to directly model neural control of the animal. We have introduced text clarifications to emphasize this — we provide a longer response for this in the public review above (see (2)).

      L143: "To train the network from data, we used joint kinematics of flies walking on a spherical treadmill..." This is an impressive approach, but then the reviewer is confused about why the kinematics of the model are so different from those of the animal. The animal takes longer strides at a lower frequency than the model. If the model were trained with data, why aren't they identical? This kind of mismatch makes the reviewer think the approach in this paper is too complicated to address the main problem.

      The design of our trajectory generator model is one of the simplest for reproducing the output of a dynamical system. It consists of a multilayer perceptron model that models the phase velocity and joint angle accelerations at each timestep. All of its inputs are observable and interpretable: the current joint angles, joint angle derivatives, desired walking speed, and phase angle. 
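The input and output structure described above can be sketched as a small multilayer perceptron. This is a hedged illustration only: the layer sizes, the `tanh` activation, the random untrained weights, the number of joint angles, and all function names are assumptions for the example and are not taken from the paper.

```python
import numpy as np

N_JOINTS = 5  # hypothetical number of joint angles tracked for one leg

def init_mlp(rng, sizes):
    """Create random (untrained) weights and zero biases for each layer."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def trajectory_generator(params, angles, angle_derivs, speed, phase):
    """Map the current state to (phase velocity, joint-angle accelerations)."""
    # Inputs: joint angles, their derivatives, desired walking speed, phase angle.
    x = np.concatenate([angles, angle_derivs, [speed, phase]])
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)  # hidden layers
    W, b = params[-1]
    out = x @ W + b             # linear output layer
    return out[0], out[1:]      # phase velocity, accelerations

rng = np.random.default_rng(0)
params = init_mlp(rng, [2 * N_JOINTS + 2, 32, 32, 1 + N_JOINTS])
phase_vel, accel = trajectory_generator(
    params, np.zeros(N_JOINTS), np.zeros(N_JOINTS), speed=10.0, phase=0.0)
```

Every input here is an observable quantity, which is what makes a model of this shape straightforward to interpret and to couple to a downstream controller.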

      We chose this model for ease of interpretability, integration with the optimal controller, and to allow for generalization across perturbations. Given all of these constraints, this is the best model of desired kinematics we could obtain. We note that the simulated kinematics do match real fly kinematics qualitatively (Figure 2A and supplemental videos) and are close quantitatively (Figure 2B and C). We speculate that matching the animals’ strides at all walking frequencies may require explicitly modeling differences across individual flies. We leave the design and training of more accurate (but more complex) walking models for future work.

      We add some further discussion about fitting kinematics in the discussion:

      “Although we believe our model matches the fly walking sufficiently for this investigation, we do note that our model still underfits the joint angle oscillations in the walking cycle of the fly (see Figure 2 and Appendix 3). More precise fitting of the joint angle kinematics may come from increasing the complexity of the neural network architecture, improving the training procedure based on advances in imitation learning (Hussein et al., 2018), or explicitly accounting for individual differences in kinematics across flies (Deangelis et al., 2019; Pratt et al., 2024).”

      Figure 2: The reviewer thinks the violin plots in Figure 2C are misleading. Joint angles could be greater or less than 0, correct? If so, why not keep the sign (pos/neg) in the data? Taking the absolute value of the errors and "folding over" the distribution results in some strange statistics. Furthermore, the absolute value would shroud any systematic bias in the model, e.g., joint angles are always too small. The reviewer suggests the authors plot the un-rectified data and simply include 2 dashed lines, one at 5.56 degrees and one at -5.56 degrees.

      These violin plots are averages of errors over all phases within each speed. We chose to do this to summarize the errors across all phase angle plots, which are shown in detail in Appendix 3 and 4.

      For the reviewer, we have added a plot of the raw errors across all phase angle plots in Appendix 5, E.
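The distinction between rectified and raw errors raised in this comment can be illustrated with synthetic numbers. The sketch below uses made-up errors with a deliberate negative bias; none of the values come from the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic joint-angle errors (degrees) with a systematic bias of -3 degrees.
signed_errors = rng.normal(loc=-3.0, scale=2.0, size=5_000)

# The signed mean exposes the direction of the bias.
mean_signed = signed_errors.mean()

# The mean absolute error summarizes magnitude only; the sign is lost.
mean_abs = np.abs(signed_errors).mean()
```

A plot of the raw errors conveys both quantities, whereas a plot of absolute errors conveys only the second.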

      L156: Should "\phi\dot" be "\phi"?

      We originally had a typo: we said “phase” when we meant “phase velocity”. This has been fixed. \phi\dot is correct.

      L160: "This control is possible because the controller operates at a higher temporal frequency than the trajectory generator...". This statement concerns the reviewer. To the reviewer, this sounds like the higher-level control system communicates with the "muscles" at a higher frequency than the low-level control system, which conflicts with the hierarchical timescales at which the nervous system operates. Or do the authors mean that the optimal controller can perform many iterations in between updates from the trajectory generator level? If so, please clarify.

      We mean that the optimal controller can perform many iterations in between updates from the trajectory generator level. The text has been clarified:

      “This control is possible because the controller operates at a higher temporal frequency than the trajectory generator in the model. The controller can perform many iterations (and reject disturbances) in between updates to and from the trajectory generator.”
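The two-rate arrangement described in this clarification can be sketched as a loop in which the target is refreshed only every few controller steps. This is a toy illustration: the proportional correction is a stand-in for the optimal controller, the unit target increments and gain are arbitrary, and the function name is hypothetical. The update interval of 8 timesteps mirrors the interval mentioned elsewhere in this response.

```python
def run_two_rate_loop(n_steps, update_every=8, gain=0.5):
    """Fast inner control loop with a slower outer target update."""
    target, state = 0.0, 0.0
    n_updates = 0
    errors = []
    for t in range(n_steps):
        # Slow layer: refresh the desired value every `update_every` steps.
        if t % update_every == 0:
            target += 1.0
            n_updates += 1
        # Fast layer: one correction step toward the current target
        # (a proportional update standing in for the optimal controller).
        state += gain * (target - state)
        errors.append(abs(target - state))
    return n_updates, errors

n_updates, errors = run_two_rate_loop(n_steps=80)
```

Because the fast layer iterates several times between target updates, the tracking error shrinks substantially before the next update arrives.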

      L225: "We considered two types of perturbations: impulse and persistent stochastic". Are these realistic perturbations? Realistic perturbations such as a single leg slipping, or the body movement being altered would produce highly correlated joint velocities.

      These perturbations are not fully realistic. Nonetheless, the subsequent text in the paper describes how they are analogous to real perturbations, and we restrict our simulations to ranges that would be biologically plausible (see Appendix 7). We agree that realistic perturbations would produce highly correlated joint accelerations and velocities, whereas our perturbations produce random joint accelerations.

      L265: "...but they are difficult to manipulate experimentally..." This is true, but it can and has been done. The authors should cite:

      Bässler, U. (1993). The femur-tibia control system of stick insects-A model system for the study of the neural basis of joint control. Brain Research Reviews, 18(2), 207-226. 

      Thank you for the suggestion, we have incorporated it into the text at the end of the referenced sentence.

      L274: "...since the controller can effectively compensate for large delays by using predictions of joint angles in the future". But can the nervous system do this? Or, is there a reason to think that the nervous system can? The reviewer thinks the authors need stronger justification from the literature for their optimal control layer.

      To clarify, this sentence describes a feature of the model’s behavior when no external perturbations are present. This is not directly relevant to the nervous system, since organisms do not typically exist in an environment free of perturbations — we are not suggesting that the nervous system does this.

      In response to the question of whether the nervous system can compensate for delays using predictions: we know that delays are present in the nervous system, perturbations exist in the environment, and that flies manage to walk in spite of them. Thus, some type of compensation must exist to offset the effects of delays (the reviewer themself has provided some excellent citations that study the effects of delays). In our model, we use prediction as the compensation mechanism — this is one of our central hypotheses. We further discuss this in the section “Predictive control is critical for responding to perturbations due to motor delay”.

      L319: "The formulation of a modular, multi-layered model for locomotor control makes new experimentally-testable hypotheses about fly motor control...". What testable hypotheses are these? The authors should explicitly state them. They are not clear to the reviewer, especially given the nonphysiological nature of the control system and the mechanics.

      A number of testable hypotheses are mentioned throughout the Discussion section:

      “Our model predicts that at the same perturbation magnitude, walking robustness decreases as delays increase. This could be experimentally tested by altering conduction velocities in the fly, for example by increasing or decreasing the ambient temperature (Banerjee et al, 2021).  If a warmer ambient temperature decreases delays in the fly, but fly walking robustness remains the same in response to a fixed perturbation, this would indicate a stronger role for central control in walking than our modeling results suggest.”

      “In our model, robust locomotion was constrained by the cumulative sensorimotor delay. This result could be experimentally validated by comparing how animals with different ratios of sensory to motor delays respond to perturbations. Alternatively, it may be possible to manipulate sensory vs. motor delays in a single animal, perhaps by altering the development of specific neurons or ensheathing glia (Kottmeier et al., 2020). If sensory and motor delays have significantly different effects on walking quality, then additional compensatory mechanisms for delays could play a larger role than we expect, such as prediction through sensory integration, mechanical feedback, or compensation through central control.”

      “we hypothesize that removing proprioceptive feedback would impair an insect's ability to sustain locomotion following external perturbations.”

      “We propose that fly motor circuits may encode predictions of future joint positions, so the fly may generate motor commands that account for motor neuron and muscle delays.”

      L323: "...and biomechanical interactions between the limb and the environment". In the reviewer's experience, the primary determinant of delay tolerance is the mechanical parameters of the limb: inertia, damping, and parallel elasticity. For example, in Ashtiani et al. 2021, equation 5 shows exactly how this comes about: the delay changes the roots and poles of the control system. This is why the reviewer is confused by the complexity of the model in this submission; a simpler model would explain why delays cannot be tolerated in certain circumstances.

      We were previously unclear about the goal of the model, and have made text edits to clarify this — we provide a longer response for this in the public review above (see (1)).

      L362: Another highly relevant reference here would be Sutton et al. 2023.

      Done

      L366: Szczecinski et al. 2018 is hardly a "model"; it is mostly a description of experimental data. How about Goldsmith, Szczecinski, and Quinn 2020 in B&B? Their model of fly walking has pattern-generating elements that are coordinated through sensory feedback. In their model, motor activation is also altered by sensory feedback. The reviewer thinks the statement "Models of fly walking have ignored the role of feedback" is inaccurate and their description of these references should be refined.

      Thank you for the suggestion; we have tempered the language and revised this section to include more references, including the suggested one — text is replicated below. 

      “Many models of fly walking ignore the role of feedback, relying instead on central pattern generators (Lobato-Rios et al., 2022; Szczecinski et al., 2018; Aminzare et al., 2018) or metachronal waves (Deangelis et al., 2019) to model kinematics. Some models incorporate proprioceptive feedback, primarily as a mechanism that alters timing of movements in inter-leg coordination (Goldsmith et al., 2020; Wang-Chen et al., 2023).”

      We remark that Szczecinski et al does include a model that replicates data without using sensory feedback, so we think it is fair to include.  

      L371: "...highly dependent on proprioceptive feedback for leg coordination during walking." What about Berendes et al. 2016, which showed that eliminating CS feedback from one leg greatly diminished its ability to coordinate with the other legs? This suggests that even flies depend on sensory feedback for proper coordination, at least in some sense.

      Interesting suggestion – we have integrated it into the text a little further down, where it better fits:

      “Silencing mechanosensory chordotonal neurons alters step kinematics in walking Drosophila (Mendes et al., 2013; Pratt et al., 2024). Additionally, removing proprioceptive signals via amputation interferes with inter-leg coordination in flies at low walking speeds (Berendes et al., 2016)”

      L426: "The layered model approach also has potential applications for bio-mimetic robotic locomotion.". How fast can this model be computed? Can it run faster than real-time? This would be an important prerequisite for use as a robot control system.

      The model should be able to be run quite fast, as it involves only

      (1) Addition, subtraction, matrix multiplication, and sinusoidal computation on scalars (for the phase coordinator and optimal controller)

      (2) Neural network inference with a relatively small network (for the trajectory generator)

      Whether this can run in real time depends on the hardware capabilities of the specific robot and the frequency requirements; it is possible to run this on a desktop or smaller embedded device.

      We do note that the model needs to first be set up and trained before it can be run, which takes some time (see panel D of Figure 1).

      L432: "...which is a popular technique in robotics.". Please cite references supporting this statement.

      We have added citations: the text and relevant citations are reproduced below:

      “... which is a popular technique in robotics (Hua et al., 2021; Johns, 2021)”

      Hua J, Zeng L, Li G, Ju Z. Learning for a robot: Deep reinforcement learning, imitation learning, transfer learning. Sensors. 2021; 21(4):1278

      Johns E. Coarse-to-fine imitation learning: Robot manipulation from a single demonstration. In: 2021 IEEE international conference on robotics and automation (ICRA) IEEE; 2021. p. 4613–4619

      L509: "We find that the phase offset across legs is not modulated across walking speeds in our dataset". This is a surprising result to the reviewer. Looking at Figure 6C, the reviewer understands that there are no drastic changes in coordination with speed, but there are certainly some changes, e.g., L1-R3, L3-R1. In the reviewer's experience, even very small changes in interleg phasing can change the visual classification of walking from "tripod" to "tetrapod" or "metachronal". Furthermore, several leg pairs do not reside exactly at 0 or \pi radians apart, e.g., L1-L3, L2-L3, R1-R3, R2-R3. In conclusion, the reviewer thinks that setting the interleg coordination to tripod in all cases is a large assumption that requires stronger justification (or, should be eliminated altogether).

      We made a simplifying assumption of a tripod coordination across all speeds. The change in relative phase coordination across speeds is indeed relatively small and additionally we see little change in our results across forward speeds (see Figures 4B, 5C and 5D). 

      We have added text to clarify this assumption and what could be changed for future studies in the methods:

      “We estimate $\bar \phi_{ij}$ from the walking data by taking the circular mean over phase differences of pairs of legs during walking bouts. We find that the phase offset across legs is not strongly modulated across walking speeds in our dataset (see Appendix 2) so we model $\bar \phi_{ij}$ as a single constant independent of speed. In future studies, this could be a function of forward and rotation speeds to account for fine phase modulation differences.”
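
      As a rough illustration of the circular mean mentioned in this passage, the sketch below averages phase differences by summing unit vectors on the circle. It is not the authors' code: the function names `circular_mean` and `mean_phase_offset` are hypothetical, and the inputs are assumed to be per-sample phases (in radians) of two legs during walking bouts.

```python
import cmath
import math

def circular_mean(angles):
    """Circular mean of angles in radians, returned in [-pi, pi]."""
    # Map each angle to a unit vector, add them, and take the angle of the sum.
    resultant = sum(cmath.exp(1j * a) for a in angles)
    return cmath.phase(resultant)

def mean_phase_offset(phase_i, phase_j):
    """Circular mean of the sample-wise phase difference between two legs."""
    diffs = [a - b for a, b in zip(phase_i, phase_j)]
    return circular_mean(diffs)
```

      Averaging on the circle avoids the wrap-around problem of an arithmetic mean: two offsets just either side of $\pi$ average to roughly $\pi$ rather than to 0.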

      L581: "of dimension...". Should the asterisk be replaced by \times? The asterisk makes the reviewer think of convolution. This change should be made throughout this paragraph.

      Good point, done.

      Figure 6: Rotational velocities in all 3 sections are reported in mm/s, but these units do not make sense. Rotational velocities must be reported in rad/s or deg/s.

      The rotation velocity in mm/s corresponded to the tangential velocity of the ball the fly walked on. We agree that this does not easily generalize across setups, so we have updated the figure to report rotation velocities in rad/s.
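
      For readers who want to convert between the two units, one common conversion is sketched below. It assumes the tangential speed is measured at the ball's equator and that the ball radius is known; the response does not state the radius or the exact conversion used, so the function name and the example values are hypothetical.

```python
import math

def tangential_to_angular(v_mm_per_s, ball_radius_mm):
    """Convert a tangential surface speed (mm/s) to an angular velocity (rad/s)."""
    if ball_radius_mm <= 0:
        raise ValueError("ball radius must be positive")
    # For a point on the equator, v = omega * r, so omega = v / r.
    return v_mm_per_s / ball_radius_mm

# Hypothetical values: 9 mm/s on a ball of radius 4.5 mm.
omega = tangential_to_angular(9.0, 4.5)   # rad/s
omega_deg = math.degrees(omega)           # deg/s
```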

      L619: The reviewer is unconvinced by using only 2 principal components of the data to compare the model and animal kinematics. The authors state on line 626 that the 2 principal components do not capture 56.9% of the variation in the data, which seems like a lot to the reviewer. This is even more extreme considering that the model has 20 joints, and the authors are reducing this to 2 variables; the reviewer can't see how any of the original waveforms, aside from the most fundamental frequencies, could possibly be represented in the PCA dataset. If the walking fly models looked similar to each other, the reviewer could accept that this method works. But the fact that this method says the kinematics are similar, but the motion is clearly different, leads the reviewer to suspect this method was used so the authors could state that the data was a good match.

      Our primary use of the KS metric was to indicate whether the simulated fly continues walking in the presence of perturbations, hence we limited the analysis of the KS to the first 2 principal components. 

      For completeness, we investigate the principal components in Appendix 9 and the effect of evaluating KS with different numbers of components in Appendix 10. 

      The results look similar across components for impulse perturbations. For stochastic perturbations, the range of similar walking decreases as we increase the number of components used to evaluate walking kinematics. Comparing this with Appendix 9, which shows that higher components represent higher frequencies of the walking cycle, we conclude that at the edge of stability for delays (where the sum of sensory and actuation delays is about 40 ms), flies can continue walking but with impaired higher frequencies (relative to no perturbations) during and after perturbation.

      We add text in the methods:

      “We chose 2 dimensions for PCA for two key reasons. First, these 2 dimensions alone accounted for a large portion of the variance in the data (52.7% total, with 42.1% for the first component and 10.6% for the second component). There was a big drop in variance explained from the first to the second component, but no sudden drop in the next 10 components (see Appendix 9). Second, the KDE procedure only works effectively in low-dimensional spaces, and the minimal number of dimensions needed to obtain circular dynamics for walking is 2. We investigate the effect of varying the number of dimensions of PCA in Appendix 10.”

      (Note that we have corrected the percentage of variance accounted for by the principal components, as these numbers were from an older analysis prior to the first draft.)

      We also reference Appendix 10 in the results:

      “We observed that robust walking was not contingent on the specific values of motor and sensory delay, but rather the sum of these two values (Fig. 5E). Furthermore, as delay increases, higher frequencies of walking are impacted first before walking collapses entirely (Appendix 10).”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, the authors study the effects of synaptic activity on the process of eye-specific segregation, focusing on the role of caspase 3, classically associated with apoptosis. The method for synaptic silencing is elegant and requires intrauterine injection of a tetanus toxin light chain into the eye. The authors report that this silencing leads to increased caspase 3 in the contralateral eye (Figure 1) and demonstrate evidence of punctate caspase 3 that does not overlap neuronal markers like map2. However, the quantifications showing increased caspase 3 in the silenced eye (done at P5) are complicated by overlap with the signal from entire dying cells in the thalamus. The authors also show that global caspase 3 deficiency impairs the process of eye-specific segregation and circuit refinement (Figures 3-4).

      The reviewer states: “this silencing leads to increased caspase 3 in the contralateral eye”. We observed increased caspase-3 activity, not protein levels, in the contralateral dLGN, not eye.

      The reviewer states: “and demonstrate evidence of punctate caspase 3 that does not overlap neuronal markers like map2”. We do not believe that this statement is accurate, as we show that the punctate active caspase-3 signals overlap with the dendritic marker MAP2 (Figure S4A).

      The reviewer also states: “, the quantifications showing increased caspase 3 [activity] in the silenced [dLGN] (done at P5) are complicated by overlap with the signal from entire dying cells in the thalamus”. We do not believe that this statement is accurate. The apoptotic neurons we observed are relay neurons (confirmed by their morphology and positive staining of NeuN – Figure S4B-C) located in the dLGN (the dLGN is clearly labeled by expression of fluorescent proteins in RGCs, and only caspase-3 activity in the dLGN area is analyzed), not “cells” of unknown lineage (as suggested by the reviewer) in the general “thalamus” area (as suggested by the reviewer). If the dying cells were non-neuronal cells, that would indeed confound our quantification and conclusions, but that is not the case.

      We argue that whole-cell caspase-3 activation in dLGN relay neurons is a bona fide response to synaptic silencing by TeTxLC and therefore should be included in the quantification. We have two sets of controls: one is between the strongly inactivated dLGN and the weakly inactivated dLGN in the same TeTxLC-injected animal; and the second is between the dLGN of TeTxLC-injected animals and mock-injected animals. In both controls, only the dLGNs receiving strong synapse inactivation have more apoptotic dLGN relay neurons, demonstrating that these cells occur because of synapse inactivation. It is also unlikely that our perturbation is causing cell death through a non-synaptic mechanism. Since mock injections do not cause apoptosis in dLGN neurons, this phenomenon is not related to surgical damage. TeTxLC is injected into the eyes and only expressed in presynaptic RGCs, not in postsynaptic relay neurons, so this phenomenon is also unlikely to be caused by TeTxLC-related toxicity. Furthermore, if apoptosis of dLGN relay neurons is not related to synapse inactivation, then when TeTxLC is injected into both eyes, one would expect to see either the same amount or more apoptotic relay neurons, but we instead observed a reduction in dLGN neuron apoptosis, suggesting that synapse-related mechanisms are responsible. Considering the above, occasional whole-cell caspase-3 activation in relay neurons in TeTxLC-inactivated dLGN is causally linked to synapse inactivation and should be included in the quantification.

      We also revised the manuscript to better explain the possible mechanistic connection between localized caspase-3 activity and whole-cell caspase-3 activity. We propose that whole-cell caspase-3 activation occurs because of uncontrolled accumulation of localized caspase-3 activation. Please see lines 127-140 and lines 403-413 for details.

      Additionally, we would like to clarify that we are not claiming that synapse inactivation leads to only localized caspase-3 activation or only whole-cell caspase-3 activation, as is suggested by the editors and reviewers in the eLife assessment. We have clearly stated in the manuscript that both types of signals were observed. However, we reasoned that, because whole-cell caspase-3 activation in unperturbed dLGNs – which undergo normal synapse elimination – is infrequently observed, whole-cell caspase-3 activation may not be a significant driver of synapse elimination during normal development. In this revision, we included a new experiment to corroborate this hypothesis. If whole-cell caspase-3 activation in dLGN relay neurons is a prevalent phenomenon during normal development, such caspase-3 activity would lead to significant death of dLGN relay neurons during normal development. Consequently, if we block caspase-3 activation by deleting caspase-3, the number of relay neurons in the dLGN should increase. However, in support of our hypothesis, we observed comparable numbers of relay neurons in Casp3<sup>+/+</sup> and Casp3<sup>-/-</sup> mice. Please see Figure S7 for details.

      The authors also report that "synapse weakening-induced caspase-3 activation determines the specificity of synapse elimination mediated by microglia but not astrocytes" (abstract). They report that microglia engulf fewer RGC axon terminals in caspase 3 deficient animals (Figure 5), and that this preferentially occurs in silenced terminals, but this preferential effect is lost in caspase 3 knockouts. Based on this, the authors conclude that caspase 3 directs microglia to eliminate weaker synapses. However, a much simpler and critical experiment that the authors did not perform is to eliminate microglia and show that the caspase 3 dependent effects go away. Without this experiment, there is no reason to assume that microglia are directing synaptic elimination.

      The reviewer states: “microglia engulf fewer RGC axon terminals in caspase 3 deficient animals (Figure 5), and that this preferentially occurs in silenced terminals, but this preferential effect is lost in caspase 3 knockouts”. We are not sure what the reviewer means by “this preferentially occurs in silenced terminals”. Our results show that microglia preferentially engulf silenced terminals, and such preference is lost in caspase-3 deficient mice (Figure 6).

      We do not understand the experiment where the reviewer suggested to: “eliminate microglia and show that the caspase 3 dependent effects go away”. To quantify caspase-3 dependent engulfment of synaptic material by microglia or preferential engulfment of silenced terminals by microglia, microglia must be present in the tissue sample. If we eliminate microglia, neither of these measurements can be made. What could be measured if microglia are eliminated is the refinement of the retinogeniculate pathway. This experiment would test whether microglia are required for caspase-3 dependent phenotypes. This is not a claim made in the manuscript. Instead, we claimed caspase-3 is required for microglia to engulf weak synapses, as supported by the evidence presented in Figure 6.

      We did not claim that “microglia are directing synaptic elimination”. Our claim is that synapse inactivation induces caspase-3 activity, and caspase-3 activation in turn leads to engulfment of weak synapses by microglia. Based on this model, it is the neuronal activity that fundamentally directs synapse elimination. Synapse engulfment by microglia is only a readout we used to measure the outcome of activity-dependent synapse elimination. We have revised all sections in the manuscript that are related to synapse engulfment by microglia to emphasize the logic of this model.

      We have also revised the abstract and title of the paper to better align it with our main claims, removed the reference to astrocytes, and clarified that microglia engulfment measurements are used as readouts of synapse elimination.

      Finally, the authors also report that caspase 3 deficiency alters synapse loss in 6-month-old female APP/PS1 mice, but this is not really related to the rest of the paper.

      We respectfully disagree that Figure 7 is not related to the rest of the paper. Many genes involved in postnatal synapse elimination, such as C1q and C3, have been implicated in neurodegeneration. It is therefore natural and important to ask whether the function of caspase-3 in regulating synaptic homeostasis extends to neurodegenerative diseases in adult animals. The answer to this question may have broad therapeutic impacts.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript by Yu et al. demonstrates that activation of caspase-3 is essential for synapse elimination by microglia, but not by astrocytes. This study also reveals that caspase 3 activation-mediated synapse elimination is required for retinogeniculate circuit refinement and eye-specific territory segregation in dLGN in an activity-dependent manner. Inhibition of synaptic activity increases caspase-3 activation and microglial phagocytosis, while caspase-3 deficiency blocks microglia-mediated synapse elimination and circuit refinement in the dLGN. The authors further demonstrate that caspase-3 activation mediates synapse loss in AD: loss of caspase-3 prevented synapse loss in AD mice. Overall, this study reveals that caspase-3 activation is an important mechanism underlying the selectivity of microglia-mediated synapse elimination during brain development and in neurodegenerative diseases.

      Strengths:

      A previous study (Gyorffy B. et al., PNAS 2018) has shown that caspase-3 signal correlates with C1q tagging of synapses (mostly using in vitro approaches), which suggests that caspase-3 would be an underlying mechanism of microglial selection of synapses for removal. The current study provides direct in vivo evidence demonstrating that caspase-3 activation is essential for microglial elimination of synapses in both brain development and neurodegeneration.

      The paper is well-organized and easy to read. The schematic drawings are helpful for understanding the experimental designs and purposes.

      Weaknesses:

      It seems that astrocytes contain large amounts of engulfed materials from ipsilateral and contralateral axon terminals (Figure S11B) and that caspase-3 deficiency also decreased the volume of engulfed materials by astrocytes (Figures S11C, D). So the possibility that astrocyte-mediated synapse elimination contributes to circuit refinement in dLGN cannot be excluded.

      We would like to clarify that we do not claim that astrocytes are unimportant for synapse elimination or circuit refinement. We acknowledge that the claim made in the original submitted manuscript that caspase-3 does not regulate synapse elimination by astrocytes lacks strong supporting evidence. We have removed this claim and revised the section related to synapse engulfment by astrocytes to provide a more rigorous interpretation of our data. We also removed the section in discussion regarding distinct substrate preferences of microglia and astrocytes.

      Does blocking single or dual inactivation of synapse activity (using TeTxLC) increase microglial or astrocytic engulfment of synaptic materials (of one or both sides) in dLGN?

      We assume that by “blocking single or dual inactivation of synapse activity”, the reviewer refers to inactivating retinogeniculate synapses from one or both eyes.

      We showed that inactivating retinogeniculate synapses from one eye (single inactivation) increases engulfment of inactive synapses by microglia (Figure 6). We did not measure synapse engulfment by microglia while inactivating retinogeniculate synapses from both eyes (dual inactivation). However, based on the total active caspase-3 signal (Figure 2) in the dual inactivation scenario, we do not expect to see an increase in engulfment of synaptic material by microglia.

      We did not measure astrocyte-mediated engulfment with single or dual inactivation, as we did not see a robust caspase-3 dependent phenotype in synapse engulfment by astrocytes.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the Authors):

      (1) Figure 1 - It is not clear from this figure whether the authors are measuring caspase 3 in dendritic compartments or in dying relay neurons in the thalamus. The authors state that "either" whole cell death (1B) or smaller punctate signals (1F) were observed. When quantifying "photons" in Figure 1E, it appears most of the signal captured will be of dying relay neurons. What determined which signal was observed, and what is being quantified in Figure 1E? This also applies to the quantifications being reported in Figure 2.

      The quantification includes both types of signals: it is the sum of all active caspase-3 signal within the dLGN boundary. We note that there is a significant amount of punctate signal in the TeTxLC-inactivated dLGN. Unfortunately, due to file compression, these signals are not clearly visible in the submitted manuscript file. We have provided high-resolution figures in this revision.

      As argued above in the response to the public review, apoptotic relay neurons in TeTxLC-inactivated dLGN (not the general thalamus area) occur as a direct consequence of synapse inactivation. Therefore, active caspase-3 signals in these relay neurons should be included in the quantification.

      We believe it is the extent of synapse inactivation (i.e., the number of synapses that are inactivated) that determines whether dLGN relay neuron apoptosis occurs or not. Such apoptosis is expected considering the nature of the apoptosis signaling cascade. In the intrinsic apoptosis pathway, release of cytochrome-c from mitochondria induces cleavage of the initiator caspase, caspase-9, and caspase-9 in turn cleaves the executioner caspases, caspase-3/7, which causes apoptosis. Caspase-3 can cleave upstream factors in the apoptosis pathway, leading to explosive amplification of caspase-3 activity (McComb et al., DOI: 10.1126/sciadv.aau9433). When a relay neuron receives a few inactivated synapses, caspase-3 activation in the postsynaptic dendrite can remain local (as we observed in Figure 1), constrained by mechanisms such as proteasomal degradation of cleaved caspase-3 (Erturk et al., DOI: 10.1523/JNEUROSCI.3121-13.2014). However, when a relay neuron receives many inactivated synapses, the cumulative caspase-3 activity induced in the dendrite can overwhelm negative regulation and lead to significantly higher levels of caspase-3 activity in entire dendrites (Figure S4B) through positive feedback amplification, eventually leading to caspase-3 activation in entire relay neurons. Please see lines 127-140 and lines 403-413 for our discussion in the main text.

      (2) Figure 5 - Figures 5c-d and Fig 6 are confounded by pseudoreplication, whereby performing statistics on 50-60 microglia inflates statistical significance. Could the authors show all these data per mouse?

      If we understand the reviewer correctly, the reviewer is suggesting that reporting measurements from multiple microglia in one animal constitutes pseudo-replication. This is correct in a strict sense, as microglia in the same animal are more likely to be similar than microglia from different animals. In the revised version, we have plotted the data by animal in Figure S11 and S13. The observations remain valid. However, we would like to point out that averaging measurements from all microglia in each animal and reporting by mouse is very conservative, as measurements from microglia in the same animal still vary greatly due to cell-to-cell differences.
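
      The per-animal summary discussed here can be sketched as follows: measurements from individual microglia are grouped by animal and averaged, so that each mouse contributes one value to the statistics. This is only an illustration; the function name `per_animal_means` and the (animal, value) input layout are assumptions, not the authors' analysis code.

```python
from collections import defaultdict
from statistics import mean

def per_animal_means(measurements):
    """Average per-cell values within each animal.

    measurements: iterable of (animal_id, value) pairs, one pair per microglion.
    Returns a dict mapping animal_id to the mean value for that animal.
    """
    by_animal = defaultdict(list)
    for animal_id, value in measurements:
        by_animal[animal_id].append(value)
    # One number per animal, so the sample size is the number of mice, not cells.
    return {animal: mean(values) for animal, values in by_animal.items()}
```

      Group comparisons would then be run on these per-animal means, which removes the inflation of sample size that comes from treating cells within one mouse as independent.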

      (3) Although the authors are not the only ones to use this strategy, it is worth noting that performing all microglial experiments in Cx3cr1 heterozygotes could lead to alterations in microglial function that may not be reflective of their homeostatic roles.

      We acknowledge that Cx3cr1 heterozygosity could cause alterations in microglial physiology.

      While Cx3cr1 heterozygosity may impact microglia physiology, we note that the engulfment assay in Figure 5 is comparing microglia in Cx3cr1<sup>+/-</sup>; Casp3<sup>+/-</sup> and Cx3cr1<sup>+/-</sup>; Casp3<sup>-/-</sup> animals. Therefore, the impact of Cx3cr1 heterozygosity is controlled for in our experiment, and the observed difference in engulfed synaptic material in microglia is an effect specific to caspase-3 deficiency. However, we acknowledge that this difference could be quantitatively affected by Cx3cr1 heterozygosity.

      It is important to note that we did not perform all microglia engulfment analyses using Cx3cr1<sup>+/-</sup> mice. We have edited the manuscript to make this clearer. In the activity-dependent microglia engulfment analysis performed in Figure 6, we used Casp3<sup>+/+</sup> and Casp3<sup>-/-</sup> animals and detected microglia with anti-Iba1 immunostaining. Therefore, the impact of Cx3cr1 heterozygosity is not a problem for this experiment.

      Minor:

      (1) Figures are presented out of order, which makes the manuscript difficult to follow.

      We have revised text regarding the segregation analysis to align with the order of figures.

      (2) Figure S3 is very confusing- the terms "left" and "right" are used in three or four partly overlapping contexts (which eye, which injection, which panel or subpanel of the figure is being referred to). Would this not be more appropriately analyzed with a repeated measures ANOVA (multiple comparisons not necessary) rather than multiple separate T-tests?

      We have revised Figure S3 and S5 with better annotation and legends.

      Yes, it is possible to use a repeated-measures two-way ANOVA. The analysis reports a significant effect of genotype, with a dF of 1, SoS and MoS of 0.0001081, F(1,13) = 7.595, and p = 0.0164. We used multiple separate t-tests because we wanted to show how genotype effects change with increasing thresholds, whereas two-way ANOVA only provides one overall p-value.

      (3) Could the authors clarify why the percentage overlap (in the controls) is so different between Figure 3C and Figure S3C, and why different thresholds are applied?

      This difference is primarily due to a difference in age. The data in Figure 3 and Figure S5 were acquired at P10, while the data in Figure S3 were acquired at P8. While the segregation process is largely complete by P8, segregation continues from P8 to P10. Therefore, overlap measured at P10 will be lower than that measured at P8. If we compare overlap at the same threshold (e.g., 10%) and at the same age in Figure 3 and S5, the overlap is very similar.

      The choice of threshold is related to the methods of labeling. In Figure 3, RGC terminals are labeled with Alexa Fluor-conjugated cholera toxin subunit-beta (CTB). In Figure S3 and S5, RGC axons are labeled by expression of fluorescent proteins. Labeling with CTB only labels membrane surfaces but yields stronger and slightly different signals at fine scales than labeling with fluorescent proteins, which are cell fillers. For Figure S3 and S5 (which use fluorescent protein labeling), higher thresholds such as those used in Figure 3 (which uses CTB labeling) can be applied and the same trend still holds, but the data will be noisier. Regardless of the small difference in thresholds used, the important observation is that the defects in TeTxLC-injected or caspase-3 deficient animals are clear across multiple thresholds.

      (4) Many describe the eye-specific segregation process as being complete "between P8-10". Other studies have quantified ESS at P10 (Stevens 2007). The authors state they did all quantifications at P8 (l. 82) and refer to Figure 3, but Figure 3 shows images from P10, whereas Figure S3 shows data from P8.

      We did not say we performed all quantification at P8. In line 85, we said “To validate the efficacy of our synapse inactivation method, we injected AAV-hSyn-TeTxLC into the right eye of wildtype E15 embryos and analyzed the segregation of eye-specific territories at postnatal day 8 (P8), when the segregation process is largely complete”. The age of postnatal day 8 in this context is specifically referring to the experiment shown in Figure S3. For the segregation analysis in Figure 3, we specifically stated that the experiment was conducted at P10 (line 277).

      Although the experiment in Figure S3 is conducted at P8, and Figure S5 and Figure 3 show results at P10, each dataset always included appropriate age-matched controls. P8 is generally considered an age where segregation is mostly complete and sufficient for us to assess the potency of AAV-delivered TeTxLC on eye segregation. We don’t think performing the experiment shown in Figure S3 at P8 impacts the interpretation of the data.

      (5) Is Figure 6 also using Cx3cr1 GFP to label microglia? This is not clarified.

      We apologize for this oversight. In Figure 6 microglia are labeled by anti-Iba1 immunostaining. We have clarified this in figure legends and text.

      Reviewer #2 (Recommendations for the Authors):

      (1) The authors quantified the caspase-3 activity using immunostaining and confocal microscopy (Figures 1B-E). They may need to verify the result (increased level of activated caspase-3 upon synapse inactivation) using alternative methods, such as western blotting.

      Both western blot and immunostaining are based on antibody-antigen interaction. These two methods are not likely sufficiently independent. Additionally, to perform a western blot, we would need to surgically collect the TeTxLC-inactivated dLGN to avoid sample contamination from other brain regions. Such collection at the age we are interested in (P5) is very challenging. We have tested the anti-cleaved caspase-3 antibody using caspase-3 deficient mice and we can confirm it is a highly specific antibody that doesn’t generate signal in the caspase-3 deficient tissue samples.

      (2) Does caspase-3 deficiency alter the density of microglia or astrocytes in dLGN?

      No. Neither the density of microglia nor that of astrocytes changed with caspase-3 deficiency. In the case of microglia, we find that the mean density of microglia per unit area of dLGN is virtually the same in wild type and caspase-3 deficient mice (two-tailed t test P = 0.8556, 6 wild type and 5 Casp3<sup>-/-</sup> mice). Some overviews showing microglia in dLGNs of wildtype and caspase-3 deficient mice can be found in Figure S10. Similarly for astrocytes, we did not observe overt changes in astrocyte density in the dLGN linked to caspase-3 deficiency.

      (3) During dLGN eye-specific segregation in normal developing animals, did the authors observe different levels of activated caspase-3 in different regions (territories)?

      For normal developing animals, the activated caspase-3 signal is generally sparse, and it is difficult to distinguish whether the signal is related to synapse elimination. For animals receiving TeTxLC-injection, we did notice that in the dLGN contralateral to the injection, where most inactivated synapses are located, the punctate caspase-3 signal tends to concentrate on the ventral-medial side of the dLGN (Figure 1B), which is the region preferentially innervated by the contralateral eye.

      (4) Recording of NMDAR-mediated synaptic currents may not be necessary for demonstrating that caspase 3 is essential for dLGN circuit refinement. In addition, the PPR may not necessarily reflect the number of innervations that a dLGN neuron receives. Instead, showing the changes in the frequency of mEPSCs (or synapse/spine density) may be more supportive.

      Thank you for the comment. We have performed the suggested mEPSC measurements and reported the results in revised Figure 4D-F.

      (5) Why is caspase 3 activation enhanced (compared to control) only at 4 months of age, when A-beta deposition has not formed yet, but not at later time points in AD mice (Figure S17)?

      A prevailing hypothesis in the field is that the form of A-beta that is most neurotoxic is the soluble oligomeric form, not the fibril form that leads to plaque deposition. As the oligomeric form appears before plaque deposition, the enhanced caspase-3 activation we observed at 4 months may reflect an increase in oligomeric A-beta, which occurs before any visible A-beta plaque formation.

      (6) The manuscript can be made more concise, and the figures more organized.

      We removed superfluous details and corrected text-figure mismatches in the revised manuscript to improve readability.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Manuscript number: RC-2024-0284z

      Corresponding author(s): Bérénice, Benayoun A

      1. General Statements [optional]

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This paper by McGill and colleagues explores sex differences in murine macrophages from different niches. They use a combination of publicly available, and newly developed datasets, and combine these using meta-analysis approaches. They explore DEGs between sexes - both common across niches, and specific to certain niches - and use enrichment analyses to identify pathways linked to these genes. Their overall conclusions are that gene expression changes in females are more consistent across niches, than for males, and are enriched in extracellular matrix-related genes. The paper is easy to follow and very well written.

      Major Comments:

      1. I would suggest Figure 1 be moved to a supplemental figure.

      We agree that the Xist and Ddx3y expression plots serve as quality control and can be removed from the main figure. However, we believe that the separation of macrophage transcriptomes based on sex in the Multidimensional Scaling (MDS) plot is an important result. Thus, we have revised Figure 1 to only include the MDS plots and have moved the Xist/Ddx3y plots to the supplement (new Supplemental Figure S1) in line with the reviewer’s suggestion.

      Line 106 - It should be clarified why 50 DEGs was selected as the cut off for exclusion.

      We apologize that our cutoff criteria were not explained clearly enough. Because these are publicly available datasets, every lab used different numbers of biological replicates, methods, and sequencing depths, impacting the power of the assay to detect differences in gene expression robustly. Since we were interested in sex-dimorphic functions, which requires running functional enrichment analysis, we needed a minimum gene set size to run these analyses, which, in the field, is usually accepted to be 50 genes for robustness. Thus, we used a cutoff of 50 DEGs and have updated the methods to explain our reasoning: “Applying a cutoff for the number of differentially expressed genes (DEGs) helps ensure data consistency and comparability across datasets with varying methodologies and sequencing depths. This prevents datasets with excessively low DEG counts from disproportionately influencing downstream analyses. A cutoff also reduces noise from spurious findings, prioritizing datasets with robust transcriptional changes that are more likely to be biologically meaningful. The excluded microglia dataset contained only 11 DEGs (whereas all other microglia datasets had hundreds of DEGs), the pleural macrophage dataset had 37 (whereas all other lung-related macrophage datasets had above 50), and the spleen macrophage dataset had only 30.” (page 12, lines 381-388).
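      The exclusion rule described above can be sketched as a simple filter. This is a minimal illustration rather than the pipeline used in the study: the dataset labels and the two retained counts are hypothetical, only the three excluded counts (11, 37, and 30) come from the text, and treating the cutoff as "at least 50" is an assumption.

```python
# Minimal sketch of a DEG-count cutoff (not the study's actual code).
# Assumption: a dataset is retained when it has at least MIN_DEGS DEGs.
MIN_DEGS = 50

# Hypothetical labels; only the counts 11, 37 and 30 are taken from the text.
deg_counts = {
    "microglia_excluded": 11,
    "pleural_macrophage": 37,
    "spleen_macrophage": 30,
    "microglia_example_retained": 400,   # illustrative value
    "lung_example_retained": 75,         # illustrative value
}


def split_by_deg_cutoff(counts, min_degs=MIN_DEGS):
    """Return (retained, excluded) dictionaries based on the DEG count."""
    retained = {name: n for name, n in counts.items() if n >= min_degs}
    excluded = {name: n for name, n in counts.items() if n < min_degs}
    return retained, excluded


retained, excluded = split_by_deg_cutoff(deg_counts)
print("retained:", sorted(retained))
print("excluded:", sorted(excluded))
```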

      Optional - would suggest sex chromosome-linked genes are excluded and the analysis redone to see if there are other autosomal genes that are statistically shadowed by the X and Y linked genes.

      We thank the reviewer for this great suggestion and have now added this point to the discussion (page 9, lines 260-268). However, we think that genes on the X and Y chromosomes will impact the overall function of the macrophages and that they are necessary to understand how macrophages from males and females may support differences in immune function throughout life. We therefore raise the reviewer's suggestion in the discussion as a potential future direction: “We find that a majority of genes similarly differential across sexes among the macrophage niches are sex chromosome linked. X-linked genes like Tlr7, Cxcr3, and Kdm6a enhance immune responses in female macrophages, potentially increasing inflammation with age (Feng et al., 2024). Meanwhile, Y-linked genes such as Uty and Sry influence transcriptional regulation and inflammatory signaling in male macrophages, which may contribute to chronic low-grade inflammation (Lusis, 2019). These genetic differences affect macrophage activity, tissue-specific immune responses, and susceptibility to age-related diseases, highlighting the importance of sex-specific factors in immune research. Future research should also explore how non-sex chromosome-linked genes interact with these sex-specific mechanisms to further shape macrophage and immune function.” (page 9, lines 260-268).

      More metadata about the included studies should be included eg mouse ages, strains, experimental manipulations etc. I can't seem to access all of the Supplemental tables so this may already be included in Table S1.

      We agree that this information is important to take into consideration and have now included it in Supplemental Table S1A, along with the accession numbers for each dataset. All mice were between 2 and 24 weeks of age, and all were on variations of the C57BL/6 background.

      How relevant the findings in mice are for humans should be explained further in the discussion.

      We agree that our discussion needs to better explain broader implications. Our findings are relevant for human health because macrophages play key roles in immunity, inflammation, and tissue homeostasis, and their functions are known to differ between sexes. Understanding these sex-specific transcriptional differences in mice can provide insights into how male and female immune systems respond differently to infections, autoimmune diseases, and aging in humans. Since macrophage phenotypes are influenced by both systemic factors (e.g., hormones) and tissue-specific environments, studying multiple macrophage subtypes from different organs helps identify conserved and context-dependent sex differences. Indeed, our findings suggest the ECM may be a potential mechanism underlying sex-biased diseases, such as higher autoimmune prevalence in females or increased susceptibility to certain infections in males. We have added this detail to the discussion (page 10, lines 269-275).

      Minor Comments:

      1. Lines 63-66 - need references here.

      This mirrors Reviewer 2’s major point #2. We agree with the reviewer that references are needed and now cite PMID: 31541153, PMID: 29533975, PMID: 37863894, PMID: 33415105, and PMID: 37491279 (page 4, lines 68-69).

      Line 61 and 69 - repeated.

      We thank the reviewer for catching this oversight and have deleted the first instance of the sentence.

      Reviewer #1 (Significance (Required)):

      Although this study is primarily descriptive, it adds to the current knowledge about sex differences in macrophages, an important and relatively understudied area. Those interested in sex differences and in the innate immune system generally, plus those who study macrophages in any context, should be interested in this work.

      We thank the reviewer for their interest in our work and their helpful suggestions.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: The study investigates sex-specific differences in macrophage gene expression across various tissue niches by analyzing both newly generated and publicly available datasets of varying quality. The key finding is the identification of three consistently differentially expressed genes (DEGs) across all macrophage niches: the Y-chromosome-encoded genes Ddx3y and Eif2s3y, and the X-chromosome-specific gene Xist. However, the number of sex-dimorphic DEGs varied significantly between macrophage niches, with female-biased genes showing more consistency across datasets. To further explore these sex-specific differences, the authors performed an overrepresentation analysis of the DEGs across datasets. They found enriched gene sets associated with specific biological terms in female-biased macrophages from peritoneal macrophages, bone marrow-derived macrophages (BMDMs), and osteoclast progenitors (OCPs), while male-biased enrichment was observed in microglia, exudate macrophages, OCPs, and BMDMs. Notably, extracellular matrix (ECM)-related genes were specifically enriched in female peritoneal macrophages and OCPs, whereas the term "nucleic acid binding" was more prominent in male samples from microglia, BMDMs, and OCPs, driven by the Y-chromosome genes Uty and Kdm5d. A gene set enrichment analysis (GSEA) using Gene Ontology (GO) and Reactome databases further confirmed the enrichment of sex-biased pathways. Based on these findings, the authors conclude that three sex chromosome-associated genes are consistently differentially expressed across all datasets and that female-associated gene expression appears to be more stable, particularly in relation to ECM-associated processes.

      Major Comments:

      Are the key conclusions convincing?

      1. The study provides valuable insights into sex-dimorphic gene expression in macrophages across different niches. However, some conclusions appear overinterpreted due to the limited number of differentially expressed genes (DEGs) driving specific terms in the overrepresentation analysis. The reliance on only a few recurring genes (e.g., Kdm5d, Eif2s3y, Uty, and Ddx3y) raises concerns about the biological significance of some enriched terms. A clearer discussion on the limitations of such findings is necessary.

      We apologize for the confusion. Although the Venn Diagram may give the impression that our comparisons are limited to those few genes, we only highlight them with bold text because they are a good quality control mechanism for our analyses.

      Importantly, methods like gene set enrichment analysis (GSEA) use whole-transcriptome ranking, which means the results we obtain are driven by the entire transcriptome and not just a few genes (GSEA results are reported in Figure 5). We agree that further explanation of these methodologies would improve interpretation of our findings for readers unfamiliar with these analytical techniques. To address this, we have now added the following to the methods: “GSEA relies on whole-transcriptome ranking, ensuring that the results reflect global transcriptomic patterns rather than being influenced by only a few genes.” (page 13, lines 415-417).
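      To illustrate what whole-transcriptome ranking means in practice, the following is a minimal sketch of the classic weighted running-sum enrichment score. It is not the implementation used for Figure 5 (the software used is not stated here), the gene names and scores are hypothetical, and significance testing by permutation is omitted. Every ranked gene moves the running sum, either up (set members) or down (all other genes), which is why the score depends on the full ranking.

```python
# Minimal sketch of a GSEA-style running-sum enrichment score.
# Assumptions: genes are already ranked from most female-biased to most
# male-biased (or any other effect-size ordering); names are hypothetical.

def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return the signed maximum deviation of the running sum from zero."""
    in_set = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    hit_norm = sum(abs(s) ** weight for s, hit in zip(scores, in_set) if hit)
    if n_miss == 0 or hit_norm == 0:
        raise ValueError("gene set must be a non-empty proper subset with non-zero scores")

    running, best = 0.0, 0.0
    for score, hit in zip(scores, in_set):
        if hit:
            running += abs(score) ** weight / hit_norm   # step up on a set member
        else:
            running -= 1.0 / n_miss                      # step down on any other gene
        if abs(running) > abs(best):
            best = running
    return best


ranked_genes = ["A", "B", "C", "D", "E", "F"]   # hypothetical gene names
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]      # hypothetical effect sizes

es_top = enrichment_score(ranked_genes, scores, {"A", "B"})
es_bottom = enrichment_score(ranked_genes, scores, {"E", "F"})
print(es_top, es_bottom)
```

      A set concentrated at the top of the ranking yields a positive score, and a set concentrated at the bottom yields a negative one.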

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Some claims, particularly those regarding the role of macrophages in diseases such as AD, histiocytosis, and osteoporosis, lack relevant references.

      This mirrors minor point #1 from Reviewer #1. We apologize for not originally including references for this statement and have now updated the introduction and discussion with appropriate references: “Excessive macrophage activation is associated with numerous conditions, including neurodegeneration, atherosclerosis, osteoporosis, and cancer, many of which exhibit sex-biased tendencies (Chen et al., 2020; Hou et al., 2023; Li et al., 2023; Mammana et al., 2018)” (page 4, lines 67-69) and “Thus, investigating female and male-biased processes in macrophages, including the contribution of the ECM, will be an important step in developing treatments for diseases including, but not limited to, AD, histiocytosis, and osteoporosis (Chen et al., 2020; Cox et al., 2021; Hou et al., 2023; Li et al., 2023; Mammana et al., 2018)” (page 10, lines 285-288).

      Would additional experiments be essential to support the claims of the paper? While additional wet-lab experiments are not strictly necessary, a deconvolution analysis of the datasets could be highly beneficial. This would allow the identification of enriched macrophage subtypes and help assess whether differences between datasets are driven by specific macrophage populations rather than global sex differences. Since peritoneal macrophage origin is influenced by age and inflammation status, deconvolution could also clarify dataset comparability.

      The reviewer makes an interesting point. We apologize for the confusion regarding the purity and origin of these datasets. All the datasets we curated from public repositories for our analysis are from purified populations of macrophages. To clarify this, we now include a column with the purification method used for each of the datasets based on the original manuscript in revised Supplemental Table S1A.

      Since all the datasets used were derived from pure macrophage populations, deconvolution (which is used to identify cellular proportions in heterogeneous contexts) would not accomplish much: it would simply predict that all the cells in the data are macrophages. While some have argued that deconvolution may be used to identify different cell states, this is very controversial, especially since the “pure” reference and the heterogeneous query are subject to batch effects (e.g., from differences in bench processing, sex of provenance for target/query datasets, transcriptional impact of sorting methods, or differences in transcriptomic quantification methods) that overshadow most differences beyond cell types. Thus, given the known batch sensitivity of deconvolution methods and the fact that we only selected pure macrophage transcriptomic profiling datasets, using deconvolution to identify macrophage subtypes would be neither informative nor feasible. Importantly, we focused our analyses on datasets derived only from young, healthy, naïve animals (2 to 24 weeks), without any interference from age-related inflammation.

      To make this caveat clearer, we have added sentences to the results section indicating the age range of the animals (page 6, lines 100-101), and to the discussion addressing how inflammation states and age may change some of our findings (page 10, lines 295-299).

      Are the suggested experiments realistic in terms of time and resources? Performing cell-type deconvolution using established computational tools (e.g., CIBERSORT, BisqueRNA, or single-cell deconvolution methods) would be a realistic approach within a few weeks and would significantly strengthen the study. This analysis would not require additional experimental work but could refine the interpretation of the dataset. Additionally, a PCA of all datasets could help identify potential similarities among macrophages from different niches and between sexes.

      As explained in our response to point #4, because we used only datasets from purified macrophages from young animals (before any influence of age or disease), deconvolution analysis would not be meaningful, especially given batch-effect concerns. Specifically, it would require us to generate paired single-cell and bulk datasets on all macrophage subtypes in-house to remove batch-inducing experimental biases, which we believe is outside the scope of this small bioinformatics study.

      To the second point, a PCA of all the datasets together would not provide much new information beyond cell type of origin, because of batch effects that cannot be corrected and that are a known problem in transcriptomic analyses (PMID:20838408, PMID:28351613). Since the datasets come from different labs using different isolation methods, RNA capture choices, library construction kits, and sequencing platforms, the main separating effects overall will be batch/dataset, not biology (PMID:20838408, PMID:28351613). Indeed, this is what we observe (Reviewer Figure 1), with broad separation of datasets by tissue of origin, then dataset of origin. Additionally, the top 10 loadings for PC1 and PC2 are primarily associated with autosomal genes (i.e., not on the sex chromosomes; Reviewer Table 1).

      Reviewer Figure 1. (A) PCA of all samples across datasets. Read counts were processed together through the R package sva v.3.46.0 for surrogate variable estimation, and surrogate variables were removed using the removeBatchEffect function from ‘limma’ v.3.54.2. DESeq2-normalized counts were used to make the PCA. (B) Zoomed-in PCA excluding three outlier samples to enable easier visual discrimination of samples.

      Principal Component – Gene    Loading     Chromosome
      PC1 - Srcin1                  0.013601    11
      PC1 - Cacna1c                 0.013593    6
      PC1 - Pclo                    0.01357     5
      PC1 - Tro                     0.013547    X
      PC1 - Ppp4r4                  0.013541    12
      PC1 - Ppp1r1a                 0.01354     15
      PC1 - Homer2                  0.013538    7
      PC1 - Caskin1                 0.013535    17
      PC1 - Arhgef9                 0.013527    X
      PC1 - Slc4a3                  0.013499    1
      PC2 - Gm15446                 0.017978    5
      PC2 - 1810034E14Rik           0.017897    13
      PC2 - Gm19557                 0.017871    19
      PC2 - Pkd1l2                  0.017792    8
      PC2 - H60b                    0.017274    10
      PC2 - Appbp2os                0.01723     11
      PC2 - Mir7050                 0.017221    7
      PC2 - Nkapl                   0.017166    13
      PC2 - Tmem51os1               0.017083    4
      PC2 - Dpep3                   0.016962    8

      Reviewer Table 1. Top 10 loadings for principal component 1 and principal component 2 with their respective chromosomal location.

      Batch effects can only be accounted for rigorously when they are not confounded with biology. In our case, each dataset profiles only one type of macrophage, so dataset and niche are fully confounded, and the batch effect cannot be corrected in a rigorous manner to yield the desired results.
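      The confounding argument can be made concrete with a small check on the sample annotations. This is a hedged sketch with hypothetical dataset and niche labels, not an analysis we ran: when every dataset (batch) contains a single niche, no batch spans two niches, so a niche effect cannot be separated from a dataset effect.

```python
# Minimal sketch: detect whether niche is fully nested within dataset (batch).
# All dataset and niche labels below are hypothetical.

def niche_nested_in_batch(sample_annotations):
    """Return True when every batch contains samples from exactly one niche."""
    niches_per_batch = {}
    for batch, niche in sample_annotations:
        niches_per_batch.setdefault(batch, set()).add(niche)
    return all(len(niches) == 1 for niches in niches_per_batch.values())


# One niche per dataset, as in a meta-analysis of single-niche studies.
meta_analysis_design = [
    ("dataset_1", "microglia"), ("dataset_1", "microglia"),
    ("dataset_2", "peritoneal"), ("dataset_2", "peritoneal"),
    ("dataset_3", "BMDM"), ("dataset_3", "BMDM"),
]

# Several niches profiled in parallel within each dataset.
parallel_design = [
    ("dataset_A", "microglia"), ("dataset_A", "peritoneal"),
    ("dataset_B", "microglia"), ("dataset_B", "peritoneal"),
]

confounded = niche_nested_in_batch(meta_analysis_design)
separable = not niche_nested_in_batch(parallel_design)
print(confounded, separable)
```

      The second design corresponds to the kind of future work in which macrophages from several niches are profiled in parallel.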

      We have added a sentence to the discussion to highlight how future work in which macrophages from diverse niches are profiled in parallel may give greater insights into niche-specific sex-dimorphic effects (page 10, lines 295-296).

      Are the data and the methods presented in such a way that they can be reproduced? Some methodological details are missing, particularly regarding:

      The isolation of mouse peritoneal macrophages (details on injection and harvesting procedure needed). Quality control of isolated macrophages (How were contaminating cells excluded? Was additional validation performed beyond using the kit?)

      The age of mice used for bone marrow-derived macrophages (BMDMs) is not provided, which is important given that immune responses can be age-dependent.

      We appreciate the reviewer’s request for additional methodological details. We apologize for the missing details and have updated the methods accordingly (page 11, lines 320-346), as well as added this information to revised Supplemental Table S1A (e.g., age of animals and purification method as described in the original papers). For all our in-house datasets, mice were 4 months old, and the text is now updated to reflect this: “Long bones (tibia and femur) of young (4-months-old) from both sexes were collected and bone marrow was flushed into 1.5mL Eppendorf tubes via centrifugation (30 seconds, 10,000g) (Amend et al., 2016)” (page 11, lines 334-336).

      While we could not check purity post hoc for the published datasets we identified for meta-analysis, we performed a purity check on our isolated peritoneal macrophages using Cd11b-F4/80 staining by flow cytometry and have now included these data (including the gating strategy) in Supplemental Figure S4. For BMDMs, no purity check was performed, as there is extensive literature on the efficiency of this differentiation protocol, which consistently yields > 90% macrophages. This has been added to the methods: “We used a protocol that is expected to yield ~90% Cd11b+ F4/80+ cells (Mendoza et al., 2022; Toda et al., 2021)” (page 11, lines 336-337).

      Are the experiments adequately replicated and statistical analysis adequate? The statistical analysis appears generally appropriate, but there are concerns about dataset inconsistencies that should be addressed. Some datasets were not used across all analyses, which is not clearly indicated in figures or text. This should be explicitly mentioned to avoid misleading interpretations.

      We appreciate the reviewer’s careful evaluation of our statistical analysis and the concern regarding dataset inconsistencies.

      We believe that the reviewer is referring to the omission of the exudate dataset from the Venn Diagram analysis (Figure 2C), as this is the only instance in which we did not report the results from all datasets. We originally chose not to include the exudate dataset in the shared differentially expressed gene (DEG) analysis because it contained over 1,300 DEGs, whereas all other datasets had between 4 and 30 DEGs, resulting in an unreadable figure.

      However, we agree that it is important to include for the readers. While we have decided to still exclude the exudate dataset from Figure 2C for readability purposes, we now include the overlap analyses for all datasets in Supplemental Figure S2 using an UpSet plot (an alternative visualization method) showing all 6 niches, as well as a table panel that lists the shared genes across niches: “Three genes were found to be differentially expressed across all six niches: Xist, Ddx3y, and Eif2s3y (Figure 2C, Supplemental Figure 2A,B)” (page 6, lines 124-126). We thank the reviewer for drawing our attention to this and making our analysis clearer for future readers.
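      The overlap analysis behind such a plot reduces to set operations on the per-niche DEG lists. The sketch below is illustrative only: the niche labels and most gene memberships are hypothetical (including the placeholder "sixth_niche" and "GeneA"), and only the three genes shared across all six niches are taken from the text. It computes the genes common to every niche and the exclusive intersection sizes that an UpSet-style plot typically displays.

```python
# Minimal sketch of a cross-niche DEG overlap analysis (hypothetical gene lists).
from collections import Counter

core = {"Xist", "Ddx3y", "Eif2s3y"}   # shared across all six niches per the text

deg_sets = {
    "microglia": core | {"Uty", "Kdm5d"},
    "BMDM": core | {"Uty", "Kdm5d"},
    "OCP": core | {"Uty", "Kdm5d"},
    "peritoneal": set(core),
    "exudate": core | {"GeneA"},       # "GeneA" is a placeholder name
    "sixth_niche": set(core),          # placeholder label for the sixth niche
}


def shared_across_all(sets_by_niche):
    """Genes present in the DEG list of every niche."""
    return set.intersection(*sets_by_niche.values())


def exclusive_intersections(sets_by_niche):
    """Count genes by the exact combination of niches in which they appear."""
    membership = {}
    for niche, genes in sets_by_niche.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(niche)
    return Counter(frozenset(niches) for niches in membership.values())


shared = shared_across_all(deg_sets)
counts = exclusive_intersections(deg_sets)
print(sorted(shared))
```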

      Minor Comments

      1. Figures are included twice in the manuscript.

      We apologize for this, and figures are now only included once.

      The use of stereotypic colors in figures (e.g., blue for male, pink for female) could be reconsidered for better readability and to avoid reinforcing gender stereotypes.

      While we understand that this color choice might feel gender normative, we respectfully disagree with the reviewer: for expedient scientific communication, we believe it is important to choose a color palette that readers can understand immediately, without needing to consult a legend.

      Importantly, we have been using the same color palette in all publications from the lab on sex differences for consistency (Lu et al, Nat Aging 2021 PMID: 34514433; McGill et al, PLoS ONE, 2023 PMID: 38032907; Kang et al, J Neuroinflammation, 2024 PMID: 38840206; McGill et al, STAR Protocols, 2021 PMID: 34820637), which is crucial for scientific rigor and communication consistency.

      Results - Section 1

      Line 92: The word 'identified' may not be the most appropriate choice here, as it implies discovery rather than selection. Consider rephrasing to 'compiled' or 'gathered' to more accurately reflect the process of assembling the datasets. Additionally, the sentence structure could be refined for clarity, such as specifying that the datasets include both newly generated and publicly available data.

      We have changed two instances of the word “identified” to “collected” and “gathered” (page 4, line 83 and page 6, line 98). We also adjusted the sentence to say, “Although we initially collected 21 datasets, both newly generated and publicly available, for our study, only 18 datasets were retained after various quality filtering steps for downstream analysis” (page 4, lines 83-85).

      Line 95: Specify the source of exudate-derived macrophage data.

      We have updated Supplemental Table S1A to ensure that it comprehensively describes the datasets used in our analysis and double-checked that it is complete (including for the exudate data). We have updated the text to reflect this: “All accession numbers and corresponding manuscripts are found in Supplemental Table S1A” (page 6, lines 103-104).

      Figure 1/2A: The scheme overview lacks clarity-its purpose is unclear. The two identical boxes are redundant and do not provide additional insight. Consider illustrating the origins of different macrophage subtypes instead. The cutoff of >50 DEGs should be included in the schematic to improve clarity. Overrepresentation and GSEA analysis should not be illustrated multiple times across different figures-it is redundant.

      In Figure 1A, we included the identical boxes to indicate that no datasets were excluded for incorrect labeling of males/females. However, we agree that this is unnecessary and have removed the second box as suggested.

      In Figure 2A, we agree the identical boxes are unneeded as the Xist/Ddx3y quality control step was listed in Figure 1A, and we have modified the figure accordingly.

      We also agree that including the DEG cutoff and removing the GSEA mention will streamline the figures and have updated them accordingly as well.

      Line 100: The mention of R software should be moved to the Methods section instead of appearing in the Results section.

      We have now updated the text to say, “Expression levels of male-specific Ddx3y and female-specific Xist genes across all samples were examined to ensure proper sex labeling of samples (Supplemental Figure 1A-U)” (page 6, lines 111-112).

      Figure 1B-V: The current figure layout is visually cluttered. Consider plotting male and female datasets together in a single graph with different point shapes instead of separate panels for each specific niche.

      This seems to echo the above request for a global PCA in Reviewer 2’s Major Point #4, which unfortunately cannot be included due to the disproportionate impact of batch effects that has been well documented in the literature (Reviewer Figure 1; PMID:20838408, PMID:28351613). However, to make the figure clearer and less cluttered, and to address the related Major Point #1 from Reviewer 1, we have moved the Xist/Ddx3y plots to Supplemental Figure S1 and only include the Multidimensional Scaling plots in Figure 1 to showcase the sex separation in each dataset.

      Text-Figure alignment: The text describes male/female-specific gene expression levels first, while the figure starts with MDS analysis. The order should be consistent.

      We agree and have adjusted the text accordingly (lines 109-112).

      Figure 2C: Exudate data is missing-explain why.

      This point echoes major point #6. As explained above, we have clarified this and included new data panels for clarity (New Supplemental Figure S2).

      Results - Section 2

      Line 151: Use consistent terminology-either "DEGs" or "DE genes", not both.

      We replaced all instances of “DE genes” with DEGs (lines 132, 137, 141, 147, 149, 163, and 397).

      Figure 3A: The text suggests not all datasets were included in this analysis-this should be explicitly indicated in the figure.

      We apologize for the confusion. All datasets were included in this analysis; however, some niches did not have any GO terms passing the FDR

      Show the number of DEGs used for analysis.

      We apologize for the confusion. For the ORA analyses (Figures 3 and 4), we indicate the number of DEGs used for analysis in the panel header. For the GSEA analysis (Figure 5, Supplemental Figure S3), all expressed genes are ranked based on effect size without any prior filter (see response to major point #1), so DEGs are irrelevant for these analyses.
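      For readers less familiar with ORA, the underlying test can be sketched with a hypergeometric tail probability. The numbers below are hypothetical and are not taken from Figures 3 or 4, and multiple-testing correction is omitted. The example shows why an overlap of only a couple of genes can still be statistically unexpected when both the DEG list and the annotated term are small relative to the background.

```python
# Minimal sketch of an over-representation (hypergeometric) test.
# All counts are hypothetical; no multiple-testing correction is applied.
from math import comb


def ora_p_value(overlap, background, term_size, n_degs):
    """P(X >= overlap) when n_degs genes are drawn from the background
    without replacement and term_size background genes belong to the term."""
    upper = min(term_size, n_degs)
    total = comb(background, n_degs)
    tail = sum(
        comb(term_size, i) * comb(background - term_size, n_degs - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical scenario: 15,000 expressed genes, a 20-gene term,
# 60 DEGs, and 2 DEGs annotated to the term.
p = ora_p_value(overlap=2, background=15000, term_size=20, n_degs=60)
expected_overlap = 60 * 20 / 15000   # overlap expected by chance
print(f"expected overlap by chance: {expected_overlap:.2f}, p = {p:.4f}")
```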

      Figure 3B: Smaller pale dots in the bubble plot are difficult to distinguish-consider using a darker outline.

      We have now added outlines to all the bubbles in the plots to help improve visibility.

      Line 158: The term "phagocytosis" appears inconsistent with the figure, where it is labeled "phagocytosis, recognition".

      We have updated the text accordingly (page 7, line 170).

      Figure 4B, D, E: The overrepresentation analysis is based on very few genes (often only 1-2 genes per term), which may lead to overinterpretation.

      We apologize for the lack of clarity in our previous manuscript. The number of genes used for DEG analysis is given in the panel titles of Figures 3 and 4. While the overlap is small, this is unlikely to be spurious since all of the pathways we discuss show significant enrichment with FDR

      Consider explicitly naming these genes and discussing their biological role instead of assigning terms based on minimal evidence.

      We now discuss these genes in the results: “Male-biased GO terms for microglia, OCPs, and BMDMs derived from four genes: Kdm5d, Uty, Ddx3y, and Eif2s3y. All of these are Y-linked genes and play crucial roles in regulating innate and adaptive immune responses (Meester et al., 2020). Kdm5d and Uty influence adaptive immunity through chromatin remodeling and histone modification, while Ddx3y and Eif2s3y shape innate immune responses by modulating macrophage activation and cytokine production via translation initiation and RNA processing (Bloomer et al., 2013; Hamlin et al., 2024; Meester et al., 2020)” (page 8, lines 195-200).

      Figures S3G and S3H seem to be switched.

      We are puzzled by this comment, as our original manuscript did not include a Supplemental Figure S3. Out of an abundance of caution, however, we checked that Supplemental Table S3G and H were correctly labelled, and independently confirmed that they are not switched.

      Results - Section 3

      Figure 5A does not add significant new insights. Consider refining its content to highlight key findings more effectively.

      We respectfully disagree and believe that schematic overviews help readers understand what is accomplished in any specific figure and have thus decided to keep it.

      Number of genes included in the analysis is not provided-this is important to assess significance and should be stated in methods and figure legends.

      We apologize for the lack of clarity. As explained above, GSEA uses all the genes in rank order (PMID: 16199517). We now explain GSEA more explicitly in the text: “GSEA relies on whole-transcriptome ranking, ensuring that the results reflect global transcriptomic patterns rather than being influenced by only a few genes” (page 13, lines 415-417).

      Discussion

      Lines 201-203: Missing reference.

      We have now updated the text with the proper reference: “Tissue-resident macrophages are crucial to proper immune system function (Guilliams et al., 2020). While all macrophages share the responsibility of clearing cellular debris and foreign bodies, tissue-resident macrophages also have unique responsibilities that facilitate homeostasis throughout the body (Guilliams et al., 2020; Varol et al., 2015)” (page 9, lines 227-230).

      Reference 23 (1999) is outdated. Newer literature should be cited to reflect modern insights into sex differences in macrophages.

      We have replaced two outdated references with more recent ones: (i) “Sex differences have previously been reported in macrophages, with female macrophages having higher phagocytic activity than males (Scotland et al., 2011)” (page 9, lines 232-233) and (ii) “Dysfunctional OCPs are associated with development of osteoporosis, a disease that is four times more prevalent in women (Alswat, 2017)” (page 10, lines 284-285).

      Peritoneal macrophages and OCPs originate from monocytes. Would deconvolution help identify enriched subtypes and assess dataset comparability?

      As noted in Reviewer 2’s Major Points #3 and #4, deconvolution analysis is not meaningful for subtype analysis without paired isolated/bulk datasets, which are outside of the scope of this study to generate.

      The 'more consistent' pathways found for female datasets are not discussed.

      We now discuss pathways found among the female datasets: “In addition, GSEA analysis of REACTOME gene sets showed male-biased expression for cell cycle related pathways (average set size 499), and female-biased expression for G protein-coupled receptor (GPCR) signaling (average set size 122) and extracellular matrix organization (average set size 127) (Figure 5C, Supplemental Table S4S-AJ; consistent with our ECM observation, Supplemental Figure S3A). Macrophages express a wide variety of GPCRs that allow them to respond to different stimuli. The expression of specific GPCRs influences macrophage polarization toward either a pro-inflammatory or anti-inflammatory state (Wang et al., 2019). A manual review of the genes contributing to this GPCR enrichment reveals the presence of several chemokine-related genes (such as Ccl4, Ccr4, Cxcl1, and others) (Supplemental Table S4). This suggests that females may have an increased abundance of chemokine GPCRs, potentially contributing to heightened autoimmune activity, among other factors.” (page 8, lines 212-222).

      Methods - Peritoneal macrophage isolation:

      Details on injection and harvesting are missing.

      We apologize for not being clear with our details and have modified the methods to be clearer (page 11, lines 320-331).

      How was contamination from other cell types assessed? F4/80 selection may not be fully macrophage-specific, and contamination could occur due to insufficient washing or the presence of non-macrophage F4/80+ cells.

      For the peritoneal macrophage datasets we generated, the macrophages were checked for purity through flow cytometry using Cd11b and F4/80 antibodies. We considered double positive Cd11b+ F4/80+ cells to be macrophages, which represent >95% of cells using our methodology (Supplemental Figure S4), without a difference between sexes.

      For the BMDMs, we utilize a protocol that is expected to yield ~90% Cd11b+ F4/80+ cells (PMID: 35212988 and PMID: 33458708).

      Finally, we now include the purification method for all publicly available datasets according to their original manuscript in Supplemental Table S1A and explicitly discuss the information for our in-house datasets in the methods (page 11, lines 321-346).

      • Bone marrow macrophages:

      Mouse age is not provided in the results part.

      We now provide this information in the methods (page 11, line 334). All ages for all datasets are now included in Supplemental Table S1A.

      Figure Legends

      Figure 2: Peritoneal macrophages are abbreviated as PeriMac; consider using this abbreviation consistently in the text.

      We respectfully disagree with the reviewer and choose to keep Peritoneal Macrophages spelled out in the text for clarity. We use the shorthand “PeriMac” in Figure 2 and Figure 5 solely for spacing purposes, but these are explained in the figure legend.

      Reviewer #2 (Significance (Required)):

      The study's strengths include the integration of multiple datasets, the use of both overrepresentation and GSEA, and the exploration of tissue-specific macrophage niches. These findings have relevance for diverse communities, including immunologists, sex-difference researchers, and those studying macrophage-driven diseases such as osteoporosis, neurodegeneration, and chronic inflammation. The work provides a foundation for further studies on sex-specific macrophage biology and may have implications for sex-specific therapeutic strategies. However, the study has limitations. The conclusions regarding enriched pathways rely heavily on a small number of DEGs, raising concerns about overinterpretation. Additionally, dataset variability and missing data for some analyses (e.g., exudate macrophages) could affect the robustness of the results.

      Despite these limitations, the study makes a meaningful but incremental advance by highlighting stable sex-dimorphic patterns in macrophage biology. It provides insights for both fundamental and translational research, particularly for audiences focused on immune regulation, sex-specific gene expression, and tissue-specific macrophage function.

      We thank the reviewer for understanding the importance of our work.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: McGill et al. explore sex-based differences in macrophage gene expression across various tissues. Using a meta-analysis of publicly available and newly generated datasets, they identify conserved and divergent sex-dimorphic genes and pathways between tissues. Overall, the report is easy to follow and guides the reader through the analysis. The authors highlight the relevance of the report by noting sex differences in immune responses to infection, autoimmunity, and chronic diseases. The inclusion of 17 independent transcriptomic datasets provides a robust and extensive analysis of sex-based transcriptional differences. The authors explore potential biological implications of sex-based transcriptional differences using pathway analysis. Despite the overall strengths, there are some points for which further clarification and analysis would improve the manuscript. Detailed comments are listed below.

      Major comments:

      1. A comparison of the overall transcriptomic profiles of macrophages regardless of sex would be additive. Knowing the degree of similarities and differences among macrophages from different niches would help the reader determine what genetic programs vary by compartment. If macrophages are very different by niche, it is not surprising that they share few sex-dimorphic patterns.

      This mirrors Reviewer 2’s Major Point #4. While this approach may seem valuable, it would only be feasible if all datasets were generated simultaneously by the same lab using identical sequencing and library preparation protocols to avoid batch effects. In this case, biology and batch effects are confounded, making any global analysis misleading. Although the reviewer may find the limited overlap unsurprising, given that macrophages are generally considered to be the same cell type, our goal was to explore the extent of shared versus distinct features across datasets, which we believe to be an invaluable question for the field.

      Although it would not be possible to do this rigorously with the data we curated, the question of niche specific gene regulation of macrophages has been studied, showing extensive niche-specific regulation: “While the question of niche-specific gene regulation has been studied, showing extensive niche-specific regulation (Gosselin et al., 2014; Lavin et al., 2014), a comprehensive and systematic study of sex-differences across macrophage subtypes has not yet been performed” (page 4, lines 78-81).

      It is unclear what age and strain the mice were and the number of samples that were included (n) for each dataset. This information should be included in S1A. If different ages or strains were used, how might this impact findings?

      This mirrors Reviewer 1’s Major Point #4. We agree that this information is important to take into consideration and have now included it in Supplemental Table S1A, along with the accession numbers for each dataset. Because there is no aging effect (all mice are aged between 2 and 24 weeks) and all mice are on a variation of the C57BL/6 background, we don’t expect this to be a major problem impacting our findings.

      The authors used a Jaccard index to examine similarities in sex-based differences across tissue compartments. They claim that there are more similarities in females. However, the male and female graphs (Fig. 1E,D) do not look that different. Is there a better way to display this?

      We apologize for the lack of clarity. We clustered the Jaccard matrices using hierarchical clustering to determine patterns of sharing. Thus, in these figures, the samples cluster based on the degree of similarity in sex-biased genes. In the females, there is clear separation by macrophage origin (yolk sac or circulating monocytes); whereas males have some separation but also have some mixing (e.g. Peritoneal Macrophage 2 clustering with the yolk-sac derived macrophage datasets). Additionally, four microglia datasets are together in the females with only one separate, whereas in the males they are split into three. We included colored bars by the dataset names to help highlight clear separation by niche of origin.

      We have added this detail to the text to better explain the similarities: “Our results indicate that female-biased genes were more consistent among the cell types compared to male-biased genes (Figures 2D,E). In females, there is clear separation by macrophage origin (yolk sac or circulating monocytes), with all the peritoneal macrophages clustering together, followed by bone-related macrophages, then microglia and lung macrophages. In the males, the five microglia datasets are split into three groups, and Peritoneal Macrophage 2 clusters with the yolk-sac derived macrophage datasets” (page 7, lines 155-160).
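
      For readers unfamiliar with the measure, the Jaccard comparison described above can be sketched as follows. This is an illustration only, not the authors' code: the dataset names and gene lists are hypothetical toy values, and the helper names `jaccard` and `jaccard_matrix` are assumptions made for this example. A matrix like this one (or 1 minus each entry, as a distance) is the kind of input that would then be passed to a hierarchical clustering routine.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; returns 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_matrix(gene_sets):
    """Return (names, matrix) of pairwise Jaccard indices between named gene sets.

    The diagonal is fixed at 1.0 by convention (a dataset compared with itself).
    """
    names = sorted(gene_sets)
    index = {name: i for i, name in enumerate(names)}
    matrix = [[1.0] * len(names) for _ in names]
    for x, y in combinations(names, 2):
        score = jaccard(gene_sets[x], gene_sets[y])
        matrix[index[x]][index[y]] = score
        matrix[index[y]][index[x]] = score
    return names, matrix


# Hypothetical female-biased gene lists (toy data, not taken from the study).
female_biased = {
    "microglia_1": {"Xist", "Kdm6a", "Col1a1", "Ccl4"},
    "microglia_2": {"Xist", "Kdm6a", "Col1a1"},
    "peritoneal_1": {"Xist", "Ccr4"},
}
names, matrix = jaccard_matrix(female_biased)
```

      In this toy example the two microglia lists share three of four genes, so they score higher with each other than either does with the peritoneal list; clustering on such scores is what groups datasets by similarity of their sex-biased genes.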

      In the Gene Ontology analysis, it is unclear what type of GO pathways were included (biological process, cellular component, molecular function). Also, some of the GO analyses were done with very few genes (as little as 4).

      This echoes Reviewer #2’s Major Comment #1. For the overrepresentation analysis (ORA) using Gene Ontology, we use the “ALL” option to include biological process, cellular component, and molecular function terms. We used ORA to look at shared DEGs across datasets of the same niche, which is why some have very low input. For this reason, we also performed Gene Set Enrichment Analysis, which uses all genes, not just those differentially expressed at FDR 5%, to examine gene changes at a broader level. In the methods we have added this information: “The differentially expressed genes shared within each niche were divided into up and down-regulated based on the sign of the DESeq2 log2 fold change. These gene lists were used as the shared genes and all expressed genes across datasets in that specific niche were used as the universe for the clusterProfiler function ‘enrichGO’, using the “ALL” option to include biological process, cellular component, and molecular function terms” (page 13, lines 405-410) and “GSEA relies on whole-transcriptome ranking, ensuring that the results reflect global transcriptomic patterns rather than being influenced by only a few genes” (page 13, lines 415-417).
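
      To make the concern about small inputs concrete, overrepresentation analysis is commonly framed as a one-sided hypergeometric test. The sketch below is a minimal illustration under that assumption; it is not the authors' pipeline and does not reproduce clusterProfiler's multiple-testing correction. The function name `ora_pvalue` and all numbers are hypothetical.

```python
from math import comb


def ora_pvalue(universe_size, term_size, list_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe_size: number of expressed genes used as background
    term_size:     number of background genes annotated to the GO term
    list_size:     number of genes in the input (e.g. shared DEG) list
    overlap:       number of input genes annotated to the GO term
    """
    max_overlap = min(term_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: a 4-gene input list, 2 of which hit a 100-gene term
# in a 10,000-gene universe.
p_small_list = ora_pvalue(universe_size=10_000, term_size=100, list_size=4, overlap=2)
```

      With an input of only four genes, a term can look enriched on the strength of two hits, which is why a rank-based method that uses the whole transcriptome is a useful complement.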

      Is it possible to combine datasets by tissue to remove potential batch effects before downstream analyses? At the very least, PCA on combined data may help determine if some biological (e.g., age, strain) or technical (batch) differences are contributing to identifying few common sex differences.

      This mirrors Reviewer #2’s Major Point #4. Unfortunately, since every dataset only examined a single niche, biology and batches are confounded, and thus a PCA on all datasets together will be driven by technical rather than biological factors. Batch effects are a well-documented issue in genomics (PMID:20838408, PMID:28351613). Indeed, this is largely observed when we attempt this analysis, with datasets clustering by batch (Reviewer Figure 1). Due to the issue of uncorrectable batch effects, we do not believe this analysis meets the rigor required to be included in the revised manuscript and have chosen to not include it.
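
      The confounding argument can be checked mechanically from sample metadata. The sketch below is illustrative only: the dataset labels are hypothetical and the helper `niche_nested_in_batch` is an assumption made for this example. It reports whether every batch contains a single niche, in which case any niche difference could equally be a batch difference and the two cannot be separated.

```python
from collections import defaultdict


def niche_nested_in_batch(samples):
    """Return True if every batch contains samples from exactly one niche.

    `samples` is an iterable of (batch, niche) pairs. When this returns True,
    niche is fully determined by batch, so batch and niche effects are confounded.
    """
    niches_per_batch = defaultdict(set)
    for batch, niche in samples:
        niches_per_batch[batch].add(niche)
    return all(len(niches) == 1 for niches in niches_per_batch.values())


# Hypothetical metadata: each public dataset profiled one niche only.
curated = [
    ("dataset_A", "microglia"),
    ("dataset_A", "microglia"),
    ("dataset_B", "peritoneal"),
    ("dataset_C", "lung"),
]

# Hypothetical balanced design: each lab profiled both niches.
balanced = [
    ("lab_1", "microglia"),
    ("lab_1", "peritoneal"),
    ("lab_2", "microglia"),
    ("lab_2", "peritoneal"),
]

curated_is_confounded = niche_nested_in_batch(curated)
balanced_is_confounded = niche_nested_in_batch(balanced)
```

      Only a design like the second one, where niches are shared across batches, would allow a batch term to be estimated separately from niche.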

      Validation of key results would further strengthen the manuscript.

      We agree that future validation is important, but it is beyond the scope of this purely bioinformatic analysis. We have included text in the revision to highlight the importance of future validation studies: “Thus, investigating female- and male-biased processes in macrophages, including the contribution of the ECM, will be an important step in developing treatments for diseases including, but not limited to, AD, histiocytosis, and osteoporosis, and future research will be essential to validate these findings and further refine therapeutic strategies (Chen et al., 2020; Cox et al., 2021; Hou et al., 2023; Li et al., 2023; Mammana et al., 2018)” (page 10, lines 285-289).

      Further contextualization of key results would enhance the discussion. For example, ECM-related differences in female macrophages could have broader roles in wound healing, fibrosis, and migration.

      We agree with the reviewer and have added this detail to the discussion: “ECM components are emerging as key regulators of innate immune responses (García-García & Martin, 2019). Macrophages contribute to ECM remodeling by producing and degrading collagens (Sutherland et al., 2023), and ECM-related differences in female macrophages may impact wound healing, fibrosis, and migration. In lung and kidney tissues, macrophages recruit and activate fibroblasts, influencing fibrosis through direct interactions and ECM-degrading enzymes (Nikolic-Paterson et al., 2014). The balance between ECM deposition and degradation is crucial for tissue homeostasis, as excessive fibrosis leads to pathology (Nikolic-Paterson et al., 2014; Ran et al., 2025). Mechanical properties of the ECM, such as stiffness and collagen crosslinking, enhance macrophage adhesion, migration, and inflammatory activation (Hsieh et al., 2019). These ECM cues direct macrophage behavior during injury response, influencing their ability to reach inflammation sites and promote repair. Thus, female-biased expression of ECM-related genes may contribute to phenotypes such as enhanced wound healing or even fibrosis (Balakrishnan et al., 2021; Harness-Brumley et al., 2014; Rønø et al., 2013)” (page 9, lines 248-259).

      Minor comments:

      1. Line 51: In the introduction, the authors state that macrophages produce chemokines. There are other signaling molecules produced by macrophages (e.g., cytokines) that also contribute to immune responses.

      We apologize for this and have updated the text to say: “Macrophages are a key component of the mammalian immune system and are responsible for producing a diverse array of signaling molecules including (but not limited to) cytokines, chemokines, and interferons that activate the rest of the immune system to combat infection (Shapouri-Moghaddam et al., 2018)” (page 4, lines 49-52).

      Line 53: The authors state that after birth the primary source of new macrophages come from differentiation of monocytes. However, some tissue resident macrophages are self-renewing.

      We apologize for this oversight and have adjusted the text to say: “After birth, the primary source of new macrophages comes from the differentiation of monocytes, which can be recruited to tissues throughout life. However, some tissue resident macrophages can self-renew, including those from the pleural and peritoneal cavities (Röszer, 2018)” (page 4, lines 53-56).

      Line 123: "spermatogenial" should be "spermatogonial"

      We have updated the text accordingly (page 6, line 130).

      Reviewer #3 (Significance (Required)):

      Significance: • General assessment: The study provides a novel and comprehensive analysis of sex-dimorphic gene expression in macrophages, with key findings that emphasize the importance of ECM remodeling in female macrophages. The strengths include the broad dataset inclusion, rigorous quality control, and methodological rigor. However, consideration of potential confounding variables (e.g., age, strain) should be included and validation of key results would strengthen the manuscript. • Advance: This study advances knowledge by analyzing sex differences across multiple macrophage niches rather than focusing on a single tissue type. It extends findings from previous immune studies. • Audience: This report would be of interest to immunologists and researchers studying sex differences. Expertise: Immunology, sex differences in disease, macrophage biology, transcriptomics, and inflammation research.

      We thank the reviewer for their positive comments on the impact of our work and for their useful feedback.



      References

      Alswat, K. A. (2017). Gender Disparities in Osteoporosis. J Clin Med Res, 9(5), 382-387. https://doi.org/10.14740/jocmr2970w

      Amend, S. R., Valkenburg, K. C., & Pienta, K. J. (2016). Murine Hind Limb Long Bone Dissection and Bone Marrow Isolation. J Vis Exp, (110). https://doi.org/10.3791/53936

      Balakrishnan, M., Patel, P., Dunn-Valadez, S., Dao, C., Khan, V., Ali, H., El-Serag, L., Hernaez, R., Sisson, A., Thrift, A. P., Liu, Y., El-Serag, H. B., & Kanwal, F. (2021). Women Have a Lower Risk of Nonalcoholic Fatty Liver Disease but a Higher Risk of Progression vs Men: A Systematic Review and Meta-analysis. Clin Gastroenterol Hepatol, 19(1), 61-71.e15. https://doi.org/10.1016/j.cgh.2020.04.067

      Bloomer, L. D., Nelson, C. P., Eales, J., Denniff, M., Christofidou, P., Debiec, R., Moore, J., Zukowska-Szczechowska, E., Goodall, A. H., Thompson, J., Samani, N. J., Charchar, F. J., & Tomaszewski, M. (2013). Male-specific region of the Y chromosome and cardiovascular risk: phylogenetic analysis and gene expression studies. Arterioscler Thromb Vasc Biol, 33(7), 1722-1727. https://doi.org/10.1161/atvbaha.113.301608

      Chen, K., Jiao, Y., Liu, L., Huang, M., He, C., He, W., Hou, J., Yang, M., Luo, X., & Li, C. (2020). Communications Between Bone Marrow Macrophages and Bone Cells in Bone Remodeling. Front Cell Dev Biol, 8, 598263. https://doi.org/10.3389/fcell.2020.598263

      Cox, N., Pokrovskii, M., Vicario, R., & Geissmann, F. (2021). Origins, Biology, and Diseases of Tissue Macrophages. Annu Rev Immunol, 39, 313-344. https://doi.org/10.1146/annurev-immunol-093019-111748

      Gosselin, D., Link, V. M., Romanoski, C. E., Fonseca, G. J., Eichenfield, D. Z., Spann, N. J., Stender, J. D., Chun, H. B., Garner, H., Geissmann, F., & Glass, C. K. (2014). Environment drives selection and function of enhancers controlling tissue-specific macrophage identities. Cell, 159(6), 1327-1340. https://doi.org/10.1016/j.cell.2014.11.023

      Hamlin, R. E., Pienkos, S. M., Chan, L., Stabile, M. A., Pinedo, K., Rao, M., Grant, P., Bonilla, H., Holubar, M., Singh, U., Jacobson, K. B., Jagannathan, P., Maldonado, Y., Holmes, S. P., Subramanian, A., & Blish, C. A. (2024). Sex differences and immune correlates of Long Covid development, symptom persistence, and resolution. Sci Transl Med, 16(773), eadr1032. https://doi.org/10.1126/scitranslmed.adr1032

      Harness-Brumley, C. L., Elliott, A. C., Rosenbluth, D. B., Raghavan, D., & Jain, R. (2014). Gender differences in outcomes of patients with cystic fibrosis. J Womens Health (Larchmt), 23(12), 1012-1020. https://doi.org/10.1089/jwh.2014.4985

      Hou, P., Fang, J., Liu, Z., Shi, Y., Agostini, M., Bernassola, F., Bove, P., Candi, E., Rovella, V., Sica, G., Sun, Q., Wang, Y., Scimeca, M., Federici, M., Mauriello, A., & Melino, G. (2023). Macrophage polarization and metabolism in atherosclerosis. Cell Death Dis, 14(10), 691. https://doi.org/10.1038/s41419-023-06206-z

      Lavin, Y., Winter, D., Blecher-Gonen, R., David, E., Keren-Shaul, H., Merad, M., Jung, S., & Amit, I. (2014). Tissue-resident macrophage enhancer landscapes are shaped by the local microenvironment. Cell, 159(6), 1312-1326. https://doi.org/10.1016/j.cell.2014.11.018

      Li, M., Yang, Y., Xiong, L., Jiang, P., Wang, J., & Li, C. (2023). Metabolism, metabolites, and macrophages in cancer. J Hematol Oncol, 16(1), 80. https://doi.org/10.1186/s13045-023-01478-6

      Mammana, S., Fagone, P., Cavalli, E., Basile, M. S., Petralia, M. C., Nicoletti, F., Bramanti, P., & Mazzon, E. (2018). The Role of Macrophages in Neuroinflammatory and Neurodegenerative Pathways of Alzheimer's Disease, Amyotrophic Lateral Sclerosis, and Multiple Sclerosis: Pathogenetic Cellular Effectors and Potential Therapeutic Targets. Int J Mol Sci, 19(3). https://doi.org/10.3390/ijms19030831

      Meester, I., Manilla-Muñoz, E., León-Cachón, R. B. R., Paniagua-Frausto, G. A., Carrión-Alvarez, D., Ruiz-Rodríguez, C. O., Rodríguez-Rangel, X., & García-Martínez, J. M. (2020). SeXY chromosomes and the immune system: reflections after a comparative study. Biol Sex Differ, 11(1), 3. https://doi.org/10.1186/s13293-019-0278-y

      Rønø, B., Engelholm, L. H., Lund, L. R., & Hald, A. (2013). Gender affects skin wound healing in plasminogen deficient mice. PLoS One, 8(3), e59942. https://doi.org/10.1371/journal.pone.0059942

      Röszer, T. (2018). Understanding the Biology of Self-Renewing Macrophages. Cells, 7(8). https://doi.org/10.3390/cells7080103

      Scotland, R. S., Stables, M. J., Madalli, S., Watson, P., & Gilroy, D. W. (2011). Sex differences in resident immune cell phenotype underlie more efficient acute inflammatory responses in female mice. Blood, 118(22), 5918-5927. https://doi.org/10.1182/blood-2011-03-340281

      Shapouri-Moghaddam, A., Mohammadian, S., Vazini, H., Taghadosi, M., Esmaeili, S. A., Mardani, F., Seifi, B., Mohammadi, A., Afshari, J. T., & Sahebkar, A. (2018). Macrophage plasticity, polarization, and function in health and disease. J Cell Physiol, 233(9), 6425-6440. https://doi.org/10.1002/jcp.26429

      Wang, X., Iyer, A., Lyons, A. B., Körner, H., & Wei, W. (2019). Emerging Roles for G-protein Coupled Receptors in Development and Activation of Macrophages. Front Immunol, 10, 2031. https://doi.org/10.3389/fimmu.2019.02031

    1. Reviewer #1 (Public review):

      Summary:

      This study aims to provide imaging methods for users in the field of human layer-fMRI. This is an emerging field with 240 papers published so far. Contrary to what is implied in the manuscript, 3T is well represented among those papers (e.g., see the papers below that are not cited in the manuscript). Thus, the claim on the impact of developing 3T methodology for wider dissemination is not justified, specifically because some of the previous papers perform whole brain layer-fMRI (also at 3T) with more efficient and more established procedures.

      The authors implemented a sequence with lots of nice features, including their own SMS EPI, diffusion bipolar pulses, and eye-saturation bands, and they built their own reconstruction around it. This is not trivial. Only a few labs around the world have this level of engineering expertise. I applaud this technical achievement. However, I doubt that any of this is the right tool for layer-fMRI, nor does it represent an advancement for the field. In the thermal noise dominated regime of sub-millimeter fMRI (especially at 3T) it is established to use 3D readouts over 2D (SMS) readouts. While it is not trivial to implement SMS, the vendor implementations (as well as the CMRR and MGH implementations) are most widely applied across the majority of current fMRI studies already. The authors' work on this does not address any previous shortcomings in the field.

      The mechanism to use bi-polar gradients to increase the localization specificity is doubtful to me. In my understanding, killing the intra-vascular BOLD should make it less specific. Also, the empirical data do not suggest a higher localization specificity to me.

      Embedding this work in the literature of previous methods is incomplete. Recent trends of vessel signal manipulation with ABC or VAPER are not mentioned. Comparisons with VASO are outdated and incorrect.

      The reproducibility of the methods and the result is doubtful (see below).

      I don't think that this manuscript is in the top 50% of the 240 layer-fMRI papers out there.

      3T layer-fMRI papers that are not cited:

      Taso, M., Munsch, F., Zhao, L., Alsop, D.C., 2021. Regional and depth-dependence of cortical blood-flow assessed with high-resolution Arterial Spin Labeling (ASL). Journal of Cerebral Blood Flow and Metabolism. https://doi.org/10.1177/0271678X20982382

      Wu, P.Y., Chu, Y.H., Lin, J.F.L., Kuo, W.J., Lin, F.H., 2018. Feature-dependent intrinsic functional connectivity across cortical depths in the human auditory cortex. Scientific Reports 8, 1-14. https://doi.org/10.1038/s41598-018-31292-x

      Lifshits, S., Tomer, O., Shamir, I., Barazany, D., Tsarfaty, G., Rosset, S., Assaf, Y., 2018. Resolution considerations in imaging of the cortical layers. NeuroImage 164, 112-120. https://doi.org/10.1016/j.neuroimage.2017.02.086

      Puckett, A.M., Aquino, K.M., Robinson, P.A., Breakspear, M., Schira, M.M., 2016. The spatiotemporal hemodynamic response function for depth-dependent functional imaging of human cortex. NeuroImage 139, 240-248. https://doi.org/10.1016/j.neuroimage.2016.06.019

      Olman, C.A., Inati, S., Heeger, D.J., 2007. The effect of large veins on spatial localization with GE BOLD at 3 T: Displacement, not blurring. NeuroImage 34, 1126-1135. https://doi.org/10.1016/j.neuroimage.2006.08.045

      Ress, D., Glover, G.H., Liu, J., Wandell, B., 2007. Laminar profiles of functional activity in the human brain. NeuroImage 34, 74-84. https://doi.org/10.1016/j.neuroimage.2006.08.020

      Huber, L., Kronbichler, L., Stirnberg, R., Ehses, P., Stocker, T., Fernández-Cabello, S., Poser, B.A., Kronbichler, M., 2023. Evaluating the capabilities and challenges of layer-fMRI VASO at 3T. Aperture Neuro 3. https://doi.org/10.52294/001c.85117

      Scheeringa, R., Bonnefond, M., van Mourik, T., Jensen, O., Norris, D.G., Koopmans, P.J., 2022. Relating neural oscillations to laminar fMRI connectivity in visual cortex. Cerebral Cortex. https://doi.org/10.1093/cercor/bhac154

      Strengths:

      See above. The authors developed their own SMS sequence with many features. This is important to the field, and it does not leave sequence development work to a few isolated monopoly labs. This work democratises SMS.

      The questions addressed here are of high relevance to the field: getting tools with good sensitivity, user-friendly applicability, and locally specific brain activity mapping is an important topic in the field of layer-fMRI.

      Weaknesses:

      (1) I feel the authors need to justify why flow-crushing helps localization specificity. There is an entire family of recent papers that aims to achieve higher localization specificity by doing the exact opposite. Namely, MT or ABC fMRI aims to increase the localization specificity by highlighting the intravascular BOLD by means of suppressing non-flowing tissue. To name a few:

      Priovoulos, N., de Oliveira, I.A.F., Poser, B.A., Norris, D.G., van der Zwaag, W., 2023. Combining arterial blood contrast with BOLD increases fMRI intracortical contrast. Human Brain Mapping hbm.26227. https://doi.org/10.1002/hbm.26227.

      Pfaffenrot, V., Koopmans, P.J., 2022. Magnetization Transfer weighted laminar fMRI with multi-echo FLASH. NeuroImage 119725. https://doi.org/10.1016/j.neuroimage.2022.119725

      Schulz, J., Fazal, Z., Metere, R., Marques, J.P., Norris, D.G., 2020. Arterial blood contrast ( ABC ) enabled by magnetization transfer ( MT ): a novel MRI technique for enhancing the measurement of brain activation changes. bioRxiv. https://doi.org/10.1101/2020.05.20.106666

      Based on this literature, it seems that the proposed method will make the vein problem worse, not better. The authors could make it clearer how they reason that making GE-BOLD signals more extra-vascular weighted should help to reduce large vein effects.

      The empirical evidence for the claim that flow crushing helps with the localization specificity should be made clearer. The response magnitude with and without flow crushing looks pretty much identical to me (see Fig. 6d).

      It's unclear to me what to look for in Fig. 5. I cannot discern any layer patterns in these maps. It's too noisy. The two maps of TE=43ms look like identical copies of each other. Maybe an editorial error?

      The authors discuss bipolar crushing with respect to SE-BOLD where it has been previously applied. For SE-BOLD at UHF, a substantial portion of the vein signal comes from the intravascular compartment. So I agree that for SE-BOLD, it makes sense to crush the intravascular signal. For GE-BOLD however, this reasoning does not hold. For GE-BOLD (even at 3T), most of the vein signal comes from extravascular dephasing around large unspecific veins and the bipolar crushing is not expected to help with this.

      (2) The bipolar crushing is limited to one single direction of flow. This introduces a lot of artificial variance across the cortical folding pattern. This is not mentioned in the manuscript. There is an entire family of papers that perform layer-fMRI with black-blood imaging that solves this with a 3D contrast preparation (VAPER) that is applied across a longer time period, thus killing the blood signal while it flows across all directions of the vascular tree. Here, the signal crushing is happening with a 2D readout as a "snap-shot" crushing. This does not allow the blood to flow in multiple directions.

      VAPER also accounts for BOLD contaminations of larger draining veins by means of a tag-control sampling. The proposed approach here does not account for this contamination.

      Chai, Y., Li, L., Huber, L., Poser, B.A., Bandettini, P.A., 2020. Integrated VASO and perfusion contrast: A new tool for laminar functional MRI. NeuroImage 207, 116358. https://doi.org/10.1016/j.neuroimage.2019.116358

      Chai, Y., Liu, T.T., Marrett, S., Li, L., Khojandi, A., Handwerker, D.A., Alink, A., Muckli, L., Bandettini, P.A., 2021. Topographical and laminar distribution of audiovisual processing within human planum temporale. Progress in Neurobiology 102121. https://doi.org/10.1016/j.pneurobio.2021.102121

      If I would recommend anyone to perform layer-fMRI with blood crushing, it seems that VAPER is the superior approach. The authors could make it clearer why users might want to use the unidirectional crushing instead.

      (3) The comparison with VASO is misleading.

      The authors claim that previous VASO approaches were limited by TRs of 8.2s. The authors might be advised to check the literature of the last few years.

      Koiso et al. have performed whole brain layer-fMRI VASO at 0.8mm with TRs of 3.9 seconds (with reliable activation), 2.7 seconds (with an unconvincing activation pattern, though), and 2.3 seconds (without activation).

      Also, whole brain layer-fMRI BOLD at 0.5mm and 0.7mm has been previously performed by the Juelich group at TRs of 3.5s (their TR definition is 'fishy' though).

      Koiso, K., Müller, A.K., Akamatsu, K., Dresbach, S., Gulban, O.F., Goebel, R., Miyawaki, Y., Poser, B.A., Huber, L., 2023. Acquisition and processing methods of whole-brain layer-fMRI VASO and BOLD: The Kenshu dataset. Aperture Neuro 34. https://doi.org/10.1101/2022.08.19.504502

      Yun, S.D., Pais‐Roldán, P., Palomero‐Gallagher, N., Shah, N.J., 2022. Mapping of whole‐cerebrum resting‐state networks using ultra‐high resolution acquisition protocols. Human Brain Mapping. https://doi.org/10.1002/hbm.25855

      Pais-Roldan, P., Yun, S.D., Palomero-Gallagher, N., Shah, N.J., 2023. Cortical depth-dependent human fMRI of resting-state networks using EPIK. Front. Neurosci. 17, 1151544. https://doi.org/10.3389/fnins.2023.1151544

      The authors are correct that VASO is not advised as a turn-key method for lower brain areas, incl. Hippocampus and subcortex. However, the authors use this word of caution that is intended for inexperienced "users" as a statement that this cannot be performed. This statement is taken out of context. This statement is not from the academic literature. It's advice for the 40+ user base that want to perform layer-fMRI as a plug-and-play routine tool in neuroscience usage. In fact, sub-millimeter VASO is routinely being performed by MRI-physicists across all brain areas (including deep brain structures, hippocampus etc). E.g. see Koiso et al. and an overview lecture from a layer-fMRI workshop that I had recently attended: https://youtu.be/kzh-nWXd54s?si=hoIJjLLIxFUJ4g20&t=2401

      Thus, the authors could embed this phrasing into the context of their own method that they are proposing in the manuscript. E.g. the authors could state whether they think that their sequence has the potential to be disseminated across sites, considering that it requires slow offline reconstruction in MATLAB?

      Do the authors think that the results shown in Fig. 6c are suggesting turn-key acquisition of a routine mapping tool? In my humble opinion it looks like random noise, with most of the activation outside the ROI (in white matter).

      (4) The repeatability of the results is questionable.

      The authors perform experiments about the robustness of the method (line 620). The corresponding results are not suggesting any robustness to me. In fact the layer profiles in Fig. 4c vs. Fig. 4d are completely opposite. Locations of peaks turn into locations of dips and vice versa.

      The methods are not described in enough detail to reproduce these results.

      The authors mention that their image reconstruction is done "using in-house MATLAB code" (line 634). They do not post a link to GitHub, nor do they say if they share this code.

      It is not trivial to get good phase data for fMRI. The authors do not mention how they perform the respective coil-combination.<br /> No data are shared for reproduction of the analysis.

      (5) The application of NORDIC is not validated.<br /> Previous applications of NORDIC at 3T layer-fMRI have resulted in mixed success. When not adjusted for the right SNR regime it can result in artifactual reductions of beta scores, depending on the SNR across layers. The authors could validate their application of NORDIC and confirm that the average layer-profiles are unaffected by the application of NORDIC. Also, the NORDIC version should be explicitly mentioned in the manuscript.

      Akbari, A., Gati, J.S., Zeman, P., Liem, B., Menon, R.S., 2023. Layer Dependence of Monocular and Binocular Responses in Human Ocular Dominance Columns at 7T using VASO and BOLD (preprint). Neuroscience. https://doi.org/10.1101/2023.04.06.535924

      Knudsen, L., Guo, F., Huang, J., Blicher, J.U., Lund, T.E., Zhou, Y., Zhang, P., Yang, Y., 2023. The laminar pattern of proprioceptive activation in human primary motor cortex. bioRxiv. https://doi.org/10.1101/2023.10.29.564658

      Comments on revisions:

      Among all the concerns mentioned above, I think only one of the specific issues was sufficiently addressed.<br /> The authors implemented a combination of three consecutive-dimensional flow crushers. Other concerns were not sufficiently addressed to change my confidence level of the study.<br /> - While the abstract is still focusing on the utility of using 3T, they do not give credit to early 3T layer-fMRI papers leading the way to larger coverage and connectivity applications.<br /> - While the authors' choice of using a custom SMS 2D readout is justified for them, I do not think that this very method will enable widespread 3T whole brain connectivity experiments across the global 3T community. This lowers the impact of the paper.<br /> - The images in Fig. 5 are still suspiciously similar, to the level that the noise pattern outside the brain is identical across large parts of the maps with and without PR.<br /> - Maybe it's my ignorance, but I still do not understand why flow crushing focuses the local BOLD responses to small vessels.<br /> - While my impression of a misleading representation of the literature had been accompanied by explicit references, the authors claim that they cannot find them?!? Or claim that they are about something else (which they are not, in my viewpoint).<br /> Data and software are still not shared (not even example data, or nii data).

    2. Reviewer #2 (Public review):

      This study developed a setup for laminar fMRI at 3T that aimed to get the best from all worlds in terms of brain coverage, temporal resolution, sensitivity to detect functional responses and spatial specificity. They used a gradient-echo EPI readout to facilitate sensitivity, brain coverage and temporal resolution. The former was additionally boosted by NORDIC denoising and the latter two were further supported by acceleration both in-plane and across slices. The authors evaluated whether the implementation of velocity-nulling (VN) gradients could mitigate macrovascular bias, known to hamper laminar specificity of gradient-echo BOLD.

      Strengths:

      The setup includes 0.9 mm isotropic acquisitions with large coverage at a reasonable TR. These parameters are hard to optimize simultaneously, and I applaud the ambitious attempt to get "the best from all worlds" (large coverage, high spatio/temporal resolution, spatial specificity, sensitivity), which is sought after in the field. Also, in terms of the availability of the method, it is favorable that it benefits from lower field strength (additional time for VN-gradient implementation, afforded by longer gray matter T2*). Furthermore, I like that the authors took steps to improve the original manuscript by e.g., collecting more data, adjusting the VN implementation to include flow-suppression along three rather than a single dimension, and adjusting the ROI-definition procedure to avoid circularity issues.

      That being said, I still find the evidence weak in terms of this sequence achieving high spatial specificity and sensitivity. The results feel oversold and further validation is needed to make a case for the authors' conclusion that "[...] the potential impact of this development is expected to be extensive across various domains of neuroscience research". This is elaborated in the comments below:

      The authors acknowledge that the VN setup in its current form probably does not suppress the impact of most ascending veins (these are also not targeted by phase regression, as most are probably too small to produce sufficiently large phase responses). This seems to limit the theoretical support for the authors' claim of reduced inter-layer blurring (e.g. the claim that deep and superficial signals are less coupled with VN gradients than without based on Fig 6-7). This limitation notwithstanding, the method may still be helpful for limiting laminar dependencies by suppressing pial vein responses (which may carry signal from distant regions and layers that blur into superficial layers if left unsuppressed). Unfortunately, the empirical support of VN gradients suppressing superficial bias seems quite weak and is hard to evaluate. For example, the profiles in Figure 4 do not consistently show clearly less superficial bias when VN gradients are on - this might partly be due to the fact that clear bias was not always present in the profiles even without VN. I suspect this is largely explained by the selection of very small and quite unrepresentative ROIs. The corresponding activation maps appear strongly weighted towards CSF which is not always captured in the profile. I recommend sampling a much larger patch of cortex to more accurately capture the actual underlying bias. In this way, all non-VN profiles should have clear bias which should be clearly suppressed for VN if the method is effective. The authors do evaluate the effect of VN/phase regression based on a large activated region in visual cortex (Fig 5) - why not show laminar profiles from here, which is an obvious way to show the effect on superficial bias? I think such evaluations would be a more direct way of evaluating the method's impact on specificity, and are necessary for subsequent FC evaluations to be convincing.

      The phase regression results are described inconsistently. In the results section, the authors, in my opinion, "correctly" acknowledge that phase regression seemed to have a very minor impact. However, in the discussion section it is described as if phase regression was effective in suppressing macrovascular responses (L 553-558), which the results do not support (especially based on profiles in Fig 4). There is barely any difference with/without phase regression, which may be due to the fact that ordinary least squares regression was chosen over a Deming model, which accounts for noise on the phase regressor. Although the authors correctly mentioned in their "answers to reviewers" that the required noise-ratio between magnitude and phase data can be hard to estimate, attempts at this have been described in previous phase regression studies which showed much larger effects (see e.g. Stanley et al. 2020, Knudsen et al. 2023).
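
      To illustrate the regression point above, the following is a minimal sketch on simulated data. It is not the authors' pipeline. It compares an ordinary least squares slope with a Deming slope when the regressor (standing in for phase) is itself noisy. The function names, the simulated noise levels, and the assumed noise-variance ratio `delta` (variance of magnitude noise divided by variance of phase noise) are all hypothetical choices made for illustration.

```python
import math
import random


def _moments(x, y):
    """Return the centered second moments (sxx, syy, sxy) of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return sxx, syy, sxy


def ols_slope(x, y):
    """Ordinary least squares slope; assumes x is noise-free."""
    sxx, _, sxy = _moments(x, y)
    return sxy / sxx


def deming_slope(x, y, delta):
    """Deming slope; delta = var(noise on y) / var(noise on x)."""
    sxx, syy, sxy = _moments(x, y)
    d = syy - delta * sxx
    return (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)


def simulate(n=5000, true_slope=2.0, noise_sd=0.5, seed=0):
    """Simulate a latent signal observed with equal noise on both variables."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    phase = [p + rng.gauss(0.0, noise_sd) for p in latent]
    magnitude = [true_slope * p + rng.gauss(0.0, noise_sd) for p in latent]
    return phase, magnitude


phase, magnitude = simulate()
ols = ols_slope(phase, magnitude)
dem = deming_slope(phase, magnitude, delta=1.0)  # equal noise variances assumed
print(f"OLS slope: {ols:.2f}, Deming slope: {dem:.2f}")
```

      In this toy setting the OLS slope is biased towards zero by the noise on the regressor, so a regression-based correction would remove less signal than intended, whereas the Deming slope stays close to the simulated value. How well this transfers to real magnitude and phase time series depends on how reliably the noise ratio can be estimated.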

      I like that the authors put in additional efforts to provide analyses to validate their NORDIC implementation. However, this needs to be done on the VN setup directly, not the "regular BOLD setup" with b=0, since the ability of NORDIC to distinguish signal and noise components depends on CNR which is expected to deviate for these setups. Also, it seems z-scores and confidence intervals were computed based on GLM residuals which may lead to inflated z-values and overly narrow CI's due to reduced degrees of freedom following denoising. The denoised z-maps from Fig 3 indeed look somewhat strange, i.e. seemingly increased false positives (more salt/pepper and a bunch of white matter activation) with very weak hand knob activation. Also, something must be wrong with the CIs on the laminar profiles - they seem extremely narrow despite noise levels obviously being high for highly accelerated 3T submillimeter results extracted from a very small ROI. The authors may consider computing these statistics from variance across trials instead.

      Given that the idea of the setup is to take advantage in terms of sensitivity by using GE-BOLD contrast relative to e.g. SE-EPI or CBV-weighted setups, they need to carefully demonstrate the sensitivity of their setup, which could be limited by high acceleration factors, the VN gradients, low field strength, etc. I like that they now put more emphasis on non-masked activation maps, but further comparison could be made through tSNR maps, raw single-volume images, raw timeseries, CNR based on across-trial variance, etc.

      The major rationale for the setup is to achieve functional connectivity (FC) with brain-wide coverage at laminar resolutions, but it is framed as if this is something that has not been possible in the past with existing setups (statements such as: "Despite advancements in acquisition speed, current CBV/CBF-based fMRI techniques remain inadequate for layer-dependent resting-state fMRI" (L138-140)). To me, the functional connectivity results presented here with the VN setup are clearly less convincing than what has been shown with e.g. CBV-weighted acquisitions (e.g. Huber et al. 2021, Chai et al. 2024). The VN setup might also have advantages such as larger coverage as mentioned by the authors, but they fail to balance the comparison by highlighting where previous studies had clear edges. Thus, the impact of the results needs to be toned down and a more balanced comparison with existing laminar FC studies is warranted. For example, acknowledging that the CBV-weighted studies demonstrate much higher spatial specificity.

      Overall I would recommend a stronger emphasis on validating the claims about the sequence on task-based data for which there is a large body of literature to benchmark against (e.g. laminar fMRI studies in V1 and M1), before going to FC where the base for comparison and reference is much more limited in humans at laminar scales.

    3. Author response:

      The following is the authors’ response to the original reviews.

      General responses:

      The authors sincerely thank all the reviewers for their valuable and constructive comments. We also apologize for the long delay in providing this rebuttal due to logistical and funding challenges. In this revision, we modified the bipolar gradients from one single direction to all three directions. Additionally, in response to the concerns regarding data reliability, we conducted a thorough examination of each step in our data processing pipeline. In the original processing workflow, the projection-onto-convex-set (POCS) method was used for partial Fourier reconstruction. Upon examination, we found that applying the POCS method after parallel image reconstruction significantly altered the signal and resulted in considerable loss of functional features. Furthermore, the original scan protocol employed a TE of 46 ms, which is notably longer than the typical TE of 33 ms. A prolonged TE can increase the ratio of extravascular to intravascular contributions. Importantly, the impact of TE on the efficacy of phase regression remains unclear, introducing potential confounding effects. To address these issues, we revised the protocol by shortening the TE from 46 ms to 39 ms. This adjustment was achieved by modifying the SMS factor to 3 and the in-plane acceleration rate to 3, thereby minimizing the confounding effects associated with an extended TE.

      Following these changes, we recollected task-based fMRI data (N=4) and resting-state fMRI data (N=14) under the updated protocol. Using the revised dataset, we validated layer-specific functional connectivity (FC) through seed-based analyses. These analyses revealed distinct connectivity patterns in the superficial and deep layers of the primary motor cortex (M1), with statistically significant inter-layer differences. Furthermore, additional analyses with a seed in the primary sensory cortex (S1) corroborated the robustness and reliability of the revised methodology. We also changed the ‘directed’ functional connectivity in the title to ‘layer-specific’ functional connectivity, as drawing conclusions about directionality requires auxiliary evidence beyond the scope of this study.

      We provide detailed responses to the reviewers’ comments below.

      Reviewer #1 (Public Review):

      Summary:

      (1)   This study aims to provide imaging methods for users of the field of human layer-fMRI. This is an emerging field with 240 papers published so far. Different than implied in the manuscript, 3T is well represented among those papers. E.g. see the papers below that are not cited in the manuscript. Thus, the claim on the impact of developing 3T methodology for wider dissemination is not justified. Specifically, because some of the previous papers perform whole brain layer-fMRI (also at 3T) in more efficient, and more established procedures.

      3T layer-fMRI papers that are not cited:

      Taso, M., Munsch, F., Zhao, L., Alsop, D.C., 2021. Regional and depth-dependence of cortical blood-flow assessed with high-resolution Arterial Spin Labeling (ASL). Journal of Cerebral Blood Flow and Metabolism. https://doi.org/10.1177/0271678X20982382

      Wu, P.Y., Chu, Y.H., Lin, J.F.L., Kuo, W.J., Lin, F.H., 2018. Feature-dependent intrinsic functional connectivity across cortical depths in the human auditory cortex. Scientific Reports 8, 1-14. https://doi.org/10.1038/s41598-018-31292-x

      Lifshits, S., Tomer, O., Shamir, I., Barazany, D., Tsarfaty, G., Rosset, S., Assaf, Y., 2018. Resolution considerations in imaging of the cortical layers. NeuroImage 164, 112-120. https://doi.org/10.1016/j.neuroimage.2017.02.086

      Puckett, A.M., Aquino, K.M., Robinson, P.A., Breakspear, M., Schira, M.M., 2016. The spatiotemporal hemodynamic response function for depth-dependent functional imaging of human cortex. NeuroImage 139, 240-248. https://doi.org/10.1016/j.neuroimage.2016.06.019

      Olman, C.A., Inati, S., Heeger, D.J., 2007. The effect of large veins on spatial localization with GE BOLD at 3 T: Displacement, not blurring. NeuroImage 34, 1126-1135. https://doi.org/10.1016/j.neuroimage.2006.08.045

      Ress, D., Glover, G.H., Liu, J., Wandell, B., 2007. Laminar profiles of functional activity in the human brain. NeuroImage 34, 74-84. https://doi.org/10.1016/j.neuroimage.2006.08.020

      Huber, L., Kronbichler, L., Stirnberg, R., Ehses, P., Stocker, T., Fernández-Cabello, S., Poser, B.A., Kronbichler, M., 2023. Evaluating the capabilities and challenges of layer-fMRI VASO at 3T. Aperture Neuro 3. https://doi.org/10.52294/001c.85117

      Scheeringa, R., Bonnefond, M., van Mourik, T., Jensen, O., Norris, D.G., Koopmans, P.J., 2022. Relating neural oscillations to laminar fMRI connectivity in visual cortex. Cerebral Cortex. https://doi.org/10.1093/cercor/bhac154

      We thank the reviewer for listing eight papers related to 3T layer-fMRI. The primary goal of our work is to develop a methodology for brain-wide, layer-dependent resting-state functional connectivity at 3T. Upon review of the cited papers, we found that:

      (1) One study (Lifshits et al.) was not an fMRI study.

      (2) One study (Olman et al.) was conducted at 7T, not 3T.

      (3) Two studies (Taso et al. and Wu et al.) employed relatively large voxel sizes (1.6 × 2.3 × 5 mm³ and 1.5 mm isotropic, respectively), which limits layer specificity.

      (4) Only one of the listed studies (Huber et al., Aperture Neuro 2023) provides coverage of more than half of the brain.

      While each of these studies offers valuable insights, the VASO study by Huber et al. is the most relevant to our work, given its brain-wide coverage. However, the VASO method employs a relatively long TR (14.137 s), which may not be optimal for resting-state functional connectivity analyses.

      To address these limitations, our proposed method achieves submillimeter resolution, layer specificity, brain-wide coverage, and a significantly shorter TR (<5 s) altogether. We believe this advancement provides a meaningful contribution to the field, enabling broader applicability of layer-fMRI at 3T.

      (2) The authors implemented a sequence with lots of nice features. Including their own SMS EPI, diffusion bipolar pulses, eye-saturation bands, and they built their own reconstruction around it. This is not trivial. Only a few labs around the world have this level of engineering expertise. I applaud this technical achievement. However, I doubt that any of this is the right tool for layer-fMRI, nor does it represent an advancement for the field. In the thermal noise dominated regime of sub-millimeter fMRI (especially at 3T), it is established to use 3D readouts over 2D (SMS) readouts. While it is not trivial to implement SMS, the vendor implementations (as well as the CMRR and MGH implementations) are most widely applied across the majority of current fMRI studies already. The author's work on this does not serve any previous shortcomings in the field.

      We would like to thank the reviewer for their comments and the recognition of the technical efforts in implementing our sequence. We would like to address the points raised:

      (1) We completely agree that in-house implementation of existing techniques does not constitute an advancement for the field. We did not claim otherwise in the manuscript. Our focus was on the development of a method for brain-wide, layer-dependent resting-state functional connectivity at 3T, as mentioned in the response above.

      (2) The reviewer stated that "it is established to use 3D readouts over 2D (SMS) readouts". This is a strong claim, and we believe it requires robust evidence to support it. While it is true that 3D readouts can achieve higher tSNR in certain regions, such as the central brain, as shown in the study by Vizioli et al. (ISMRM 2020 abstract; https://cds.ismrm.org/protected/20MProceedings/PDFfiles/3825.html?utm_source=chatgpt.com ), higher tSNR does not necessarily equate to improved detection power in fMRI studies. For instance, Le Ster et al. (PLOS ONE, 2019; https://doi.org/10.1371/journal.pone.0225286 ) demonstrated that while 3D EPI had higher tSNR in the central brain, SMS EPI produced higher t-scores in activation maps.

      (3) When choosing between SMS EPI and 3D EPI, multiple factors should be taken into account, not just tSNR. For example, SMS EPI and 3D EPI differ in their sensitivity to motion and the complexity of motion correction. The choice between them depends on the specific research goals and practical constraints.

      (4) We are open to different readout strategies, provided they can be demonstrated to be suitable for the research goals. In this study, we opted for 2D SMS primarily due to logistical considerations. This choice does not preclude the potential use of 3D readouts in the future if they are deemed more appropriate for the project objectives.

      The mechanism to use bi-polar gradients to increase the localization specificity is doubtful to me. In my understanding, killing the intra-vascular BOLD should make it less specific. Also, the empirical data do not suggest a higher localization specificity to me.

      We will elaborate on the mechanism and reasoning in the later responses.

      Embedding this work in the literature of previous methods is incomplete. Recent trends of vessel signal manipulation with ABC or VAPER are not mentioned. Comparisons with VASO are outdated and incorrect.

      The reproducibility of the methods and the result is doubtful (see below).

      In this revision, we updated the scan protocol and recollected the imaging data. Detailed explanations and revised results are provided in the later responses.

      I don't think that this manuscript is in the top 50% of the 240 layer-fmri papers out there.

      We respect the reviewer’s personal opinion. However, we can only address scientific comments or critiques.

      Strengths:

      See above. The authors developed their own SMS sequence with many features. This is important to the field. And does not leave sequence development work to a few isolated monopoly labs. This work democratises SMS.

      The questions addressed here are of high relevance to the field: getting tools with good sensitivity, user-friendly applicability, and locally specific brain activity mapping is an important topic in the field of layer-fMRI.

      Weaknesses:

      (1) I feel the authors need to justify why flow-crushing helps localization specificity. There is an entire family of recent papers that aim to achieve higher localization specificity by doing the exact opposite. Namely, MT or ABC fMRI aims to increase the localization specificity by highlighting the intravascular BOLD by means of suppressing non-flowing tissue. To name a few:

      Priovoulos, N., de Oliveira, I.A.F., Poser, B.A., Norris, D.G., van der Zwaag, W., 2023. Combining arterial blood contrast with BOLD increases fMRI intracortical contrast. Human Brain Mapping hbm.26227. https://doi.org/10.1002/hbm.26227.

      Pfaffenrot, V., Koopmans, P.J., 2022. Magnetization Transfer weighted laminar fMRI with multi-echo FLASH. NeuroImage 119725. https://doi.org/10.1016/j.neuroimage.2022.119725

      Schulz, J., Fazal, Z., Metere, R., Marques, J.P., Norris, D.G., 2020. Arterial blood contrast ( ABC ) enabled by magnetization transfer ( MT ): a novel MRI technique for enhancing the measurement of brain activation changes. bioRxiv. https://doi.org/10.1101/2020.05.20.106666

      Based on this literature, it seems that the proposed method will make the vein problem worse, not better. The authors could make it clearer how they reason that making GE-BOLD signals more extra-vascular weighted should help to reduce large vein effects.

      The proposed VN fMRI method employs VN gradients to selectively suppress signals from fast-flowing blood in large vessels. Although this approach may initially appear to diverge from the principles of CBV-based techniques (Chai et al., 2020; Huber et al., 2017a; Pfaffenrot and Koopmans, 2022; Priovoulos et al., 2023), which enhance sensitivity to vascular changes in arterioles, capillaries, and venules while attenuating signals from static tissue and large veins, it aligns with the fundamental objective of all layer-specific fMRI methods. Specifically, these approaches aim to maximize spatial specificity by preserving signals proximal to neural activation sites and minimizing contributions from distal sources, irrespective of whether the signals are intra- or extra-vascular in origin. In the context of intravascular signals, CBV-based methods preferentially enhance sensitivity to functional changes in small vessels (proximal components) while demonstrating reduced sensitivity to functional changes in large vessels (distal components). For extravascular signals, functional changes are a mixture of proximal and distal influences. While tissue oxygenation near neural activation sites represents a proximal contribution, extravascular signal contamination from large pial veins reflects distal effects that are spatially remote from the site of neuronal activity. CBV-based techniques mitigate this challenge by non-selectively suppressing signals from static tissues, thereby highlighting contributions from small vessels. In contrast, the VN fMRI method employs a targeted suppression strategy, selectively attenuating signals from large vessels (distal components) while preserving those from small vessels (proximal components). Furthermore, the use of a 3T scanner and the inclusion of phase regression in the VN approach mitigate contamination from large pial veins (distal components) while preserving signals reflecting local tissue oxygenation (proximal components). By integrating these mechanisms, VN fMRI improves spatial specificity, minimizing both intravascular and extravascular contributions that are distal to neuronal activation sites. We have incorporated this explanation into the Discussion section.

      The empirical evidence for the claim that flow crushing helps with the localization specificity should be made clearer. The response magnitude with and without flow crushing looks pretty much identical to me (see Fig, 6d).

      In the new results in Figure 4, the application of VN gradients attenuated the bias towards the pial surface. Consistent with this, Figure 5 also demonstrates the suppression of macrovascular signal by VN gradients.

      It's unclear to me what to look for in Fig. 5. I cannot discern any layer patterns in these maps. It's too noisy. The two maps of TE=43ms look like identical copies from each other. Maybe an editorial error?

      In this revision, the original Figure 5 has been removed. However, we would like to clarify that the two maps with TE = 43 ms in the original Figure 5 were not identical. This can be observed in the difference map provided in the right panel of the figure.

      The authors discuss bipolar crushing with respect to SE-BOLD where it has been previously applied. For SE-BOLD at UHF, a substantial portion of the vein signal comes from the intravascular compartment. So I agree that for SE-BOLD, it makes sense to crush the intravascular signal. For GE-BOLD however, this reasoning does not hold. For GE-BOLD (even at 3T), most of the vein signal comes from extravascular dephasing around large unspecific veins, and the bipolar crushing is not expected to help with this.

      The reviewer’s statement that "most of the vein signal comes from extravascular dephasing around large unspecific veins" may hold true for 7T. However, at 3T, the susceptibility-induced Larmor frequency shift is reduced by 57%, and the extravascular contribution decreases by more than 35%, as shown by Uludağ et al. 2009 ( DOI: 10.1016/j.neuroimage.2009.05.051 ).

      Additionally, according to the biophysical models (Ogawa et al., 1993; doi: 10.1016/S0006-3495(93)81441-3 ), the extravascular contamination from the pial surface is inversely proportional to the square of the distance from the vessel. For a vessel diameter of 0.3 mm and an isotropic voxel size of 0.9 mm, the induced frequency shift is reduced by at least 36-fold at the next voxel. Notably, a vessel diameter of 0.3 mm is larger than most pial vessels. Theoretically, the extravascular effect contributes minimally to inter-layer dependency, particularly at 3T compared to 7T due to weaker susceptibility-related effects at lower field strengths. Empirically, as shown in Figure 7c, the results at M1 demonstrated that layer specificity can be achieved statistically with the application of VN gradients. We have incorporated this explanation into the Introduction and Discussion sections of the manuscript.
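
      The 36-fold figure can be checked with a short calculation. The sketch below assumes the standard infinite-cylinder picture, in which the extravascular frequency shift scales as (vessel radius / distance from the vessel centre) squared, and it assumes that "the next voxel" corresponds to a distance of one voxel width (0.9 mm). The function name is hypothetical, and orientation-dependent factors are ignored.

```python
def relative_frequency_shift(vessel_diameter_mm, distance_mm):
    """Frequency shift at a given distance, relative to the shift at the vessel wall.

    Assumes an inverse-square falloff with distance from the vessel centre.
    """
    radius_mm = vessel_diameter_mm / 2.0
    return (radius_mm / distance_mm) ** 2


# Values quoted in the response: 0.3 mm vessel diameter, 0.9 mm isotropic voxels.
ratio = relative_frequency_shift(0.3, 0.9)
fold_reduction = 1.0 / ratio
print(f"Relative shift one voxel away: {ratio:.4f} (about {fold_reduction:.0f}-fold lower)")
```

      Under these assumptions the ratio is (0.15 / 0.9)² = 1/36, which matches the reduction stated above. Smaller vessels give a steeper reduction at the same distance.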

      (2) The bipolar crushing is limited to one single direction of flow. This introduces a lot of artificial variance across the cortical folding pattern. This is not mentioned in the manuscript. There is an entire family of papers that perform layer-fmri with black-blood imaging that solves this with a 3D contrast preparation (VAPER) that is applied across a longer time period, thus killing the blood signal while it flows across all directions of the vascular tree. Here, the signal crushing is happening with a 2D readout as a "snap-shot" crushing. This does not allow the blood to flow in multiple directions.

      VAPER also accounts for BOLD contaminations of larger draining veins by means of a tag-control sampling. The proposed approach here does not account for this contamination.

      Chai, Y., Li, L., Huber, L., Poser, B.A., Bandettini, P.A., 2020. Integrated VASO and perfusion contrast: A new tool for laminar functional MRI. NeuroImage 207, 116358. https://doi.org/10.1016/j.neuroimage.2019.116358

      Chai, Y., Liu, T.T., Marrett, S., Li, L., Khojandi, A., Handwerker, D.A., Alink, A., Muckli, L., Bandettini, P.A., 2021. Topographical and laminar distribution of audiovisual processing within human planum temporale. Progress in Neurobiology 102121. https://doi.org/10.1016/j.pneurobio.2021.102121

      If I would recommend anyone to perform layer-fMRI with blood crushing, it seems that VAPER is the superior approach. The authors could make it clearer why users might want to use the unidirectional crushing instead.

      We understand the reviewer’s concern regarding the directional limitation of bipolar crushing. As noted in the responses above, we have updated the bipolar gradient to include three orthogonal directions instead of a single direction. Furthermore, flow-related signal suppression does not necessarily require a longer time period. Bipolar diffusion gradients have been effectively used to nullify signals from fast-flowing blood, as demonstrated by Boxerman et al. (1995; DOI: 10.1002/mrm.1910340103). Their study showed that vessels with flow velocities producing phase changes greater than π radians due to bipolar gradients experience significant signal attenuation. The critical velocity for such attenuation can be calculated using the formula 1/(2γGΔδ), where γ is the gyromagnetic ratio, G is the gradient strength, δ is the gradient pulse width, and Δ is the time between the two bipolar gradient pulses. In the framework of Boxerman et al. at 1.5T, the critical velocity for a b value of 10 s/mm<sup>2</sup> is ~8 mm/s, resulting in a ~30% reduction in functional signal. In our 3T study, b values of 6, 7, and 8 s/mm<sup>2</sup> correspond to critical velocities of 16.8, 15.2, and 13.9 mm/s, respectively. The flow velocities in capillaries and most venules remain well below these thresholds. Notably, in our VN fMRI sequences, bipolar gradients were applied in all three orthogonal directions, whereas in Boxerman et al.'s study, the gradients were applied only in the z-direction. Given the voxel dimensions of 3 × 3 × 7 mm<sup>3</sup> in the 1.5T study, vessels within a large voxel are likely oriented in multiple directions, meaning that only a subset of fast-flowing signals would be attenuated. Therefore, our approach is expected to induce greater signal reduction, even at the same b values as those used in Boxerman et al.'s study. We have incorporated this text into the Discussion section of the manuscript.
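
      As a rough illustration of the critical-velocity formula above, the sketch below evaluates 1/(2γGΔδ) for a given set of gradient parameters. It assumes that γ is expressed in Hz/T (that is, the gyromagnetic ratio divided by 2π), which is the convention under which this expression corresponds to a phase of π radians. The gradient amplitude and timings in the example are hypothetical and are not the values used in the study, so the output does not reproduce the velocities quoted above.

```python
GAMMA_BAR_HZ_PER_T = 42.577e6  # proton gyromagnetic ratio divided by 2*pi, in Hz/T


def critical_velocity_mm_per_s(gradient_t_per_m, pulse_width_s, pulse_separation_s):
    """Velocity at which a bipolar gradient pair imparts a phase of pi radians.

    gradient_t_per_m:   gradient strength G in T/m
    pulse_width_s:      duration of each gradient lobe (delta) in s
    pulse_separation_s: time between the two lobes (Delta) in s
    """
    v_m_per_s = 1.0 / (
        2.0 * GAMMA_BAR_HZ_PER_T * gradient_t_per_m * pulse_width_s * pulse_separation_s
    )
    return v_m_per_s * 1e3  # convert m/s to mm/s


# Hypothetical settings, for illustration only: G = 40 mT/m, delta = 5 ms, Delta = 10 ms.
example = critical_velocity_mm_per_s(0.040, 0.005, 0.010)
print(f"Critical velocity: {example:.2f} mm/s")
```

      Stronger or longer gradient lobes lower the critical velocity, so more slowly flowing blood is dephased. This is the trade-off behind choosing the b value.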

      (3) The comparison with VASO is misleading.

      The authors claim that previous VASO approaches were limited by TRs of 8.2s. The authors might be advised to check the latest literature of the last years.

      Koiso et al. performed whole brain layer-fMRI VASO at 0.8mm at 3.9 seconds (with reliable activation), 2.7 seconds (with unconvincing activation pattern, though), and 2.3 (without activation).

      Also, whole brain layer-fMRI BOLD at 0.5mm and 0.7mm has been previously performed by the Juelich group at TRs of 3.5s (their TR definition is 'fishy' though).

      Koiso, K., Müller, A.K., Akamatsu, K., Dresbach, S., Gulban, O.F., Goebel, R., Miyawaki, Y., Poser, B.A., Huber, L., 2023. Acquisition and processing methods of whole-brain layer-fMRI VASO and BOLD: The Kenshu dataset. Aperture Neuro 34. https://doi.org/10.1101/2022.08.19.504502

      Yun, S.D., Pais‐Roldán, P., Palomero‐Gallagher, N., Shah, N.J., 2022. Mapping of whole‐cerebrum resting‐state networks using ultra‐high resolution acquisition protocols. Human Brain Mapping. https://doi.org/10.1002/hbm.25855

      Pais-Roldan, P., Yun, S.D., Palomero-Gallagher, N., Shah, N.J., 2023. Cortical depth-dependent human fMRI of resting-state networks using EPIK. Front. Neurosci. 17, 1151544. https://doi.org/10.3389/fnins.2023.1151544

      We thank the reviewer for providing these references. While the protocol with a TR of 3.9 seconds in Koiso’s work demonstrated reasonable activation patterns, it was not tested for layer specificity. Given that higher acceleration factors (AF) can cause spatial blurring, a protocol should only be eligible for comparison if layer specificity is demonstrated.

      Secondly, the TRs reported in Koiso’s study pertain only to either the VASO or BOLD acquisition, not the combined CBV-based contrast. To generate CBV-based images, both VASO and BOLD data are required, effectively doubling the TR. For instance, if the protocol with a TR of 3.9 seconds is used, the effective TR becomes approximately 8 seconds. The stable protocol used by Koiso et al. to acquire whole-brain data (94.08 mm along the z-axis) required 5.2 seconds for VASO and 5.1 seconds for BOLD, resulting in an effective TR of 10.3 seconds. The spatial resolution achieved was 0.84 mm isotropic.

      Unfortunately, we could not find the Juelich paper mentioned by the reviewer.

      To have a more comprehensive comparison, we collated relevant literature on brain-wide layer-specific fMRI. We defined brain-wide acquisition as imaging protocols that cover more than half of the human brain, specifically exceeding 55 mm along the superior-inferior axis. We identified five studies and summarized their scan parameters, including effective TR, coverage, and spatial resolution, in Table 1.

      The authors are correct that VASO is not advised as a turn-key method for lower brain areas, incl. Hippocampus and subcortex. However, the authors use this word of caution that is intended for inexperienced "users" as a statement that this cannot be performed. This statement is taken out of context. This statement is not from the academic literature. It's advice for the 40+ user base that wants to perform layer-fMRI as a plug-and-play routine tool in neuroscience usage. In fact, sub-millimeter VASO is routinely being performed by MRI-physicists across all brain areas (including deep brain structures, hippocampus etc). E.g. see Koiso et al. and an overview lecture from a layer-fMRI workshop that I had recently attended: https://youtu.be/kzh-nWXd54s?si=hoIJjLLIxFUJ4g20&t=2401

      In this revision, we decided to focus on cortico-cortical functional connectivity and have removed the LGN-related content. Consequently, the text mentioned by the reviewer was also removed. Nevertheless, we apologize if our original description gave the impression that functional mapping of deep brain regions using VASO is not feasible. The word of caution we used is based on the layer-fMRI blog ( https://layerfmri.com/2021/02/22/vaso_ve/ ) and reflects the challenges associated with this technique, as outlined by experts like Dr. Huber and Dr. Stirnberg.

      According to the information provided, including the video, functional mapping of the hippocampus and amygdala using VASO is indeed possible but remains technically challenging. The short arterial arrival times in these deep brain regions can complicate the acquisition, requiring RF inversion pulses to cover a wider area at the base of the brain. For example, as of 2023, four or more research groups were attempting to implement layer-fMRI VASO in the hippocampus. One such study at 3T required multiple inversion times to account for inflow effects, highlighting the technical complexity of these applications. This is the context in which we used the word of caution. We are not sure whether recent advancements like MAGEC VASO have improved its applicability. As of 2024, we have not identified any published VASO studies specifically targeting deep brain structures such as the hippocampus or amygdala. Therefore, it is difficult to conclude that “sub-millimeter VASO is routinely being performed by MRI physicists on deep brain structures such as the hippocampus.”

      Thus, the authors could embed this phrasing into the context of their own method that they are proposing in the manuscript. E.g. the authors could state whether they think that their sequence has the potential to be disseminated across sites, considering that it requires slow offline reconstruction in Matlab?

      We are enthusiastic about sharing our imaging sequence, provided its usefulness is conclusively established. However, it's important to note that without an online reconstruction capability, such as the ICE, the practical utility of the sequence may be limited. Unfortunately, we currently don’t have the manpower to implement the online reconstruction. Nevertheless, we are more than willing to share the offline reconstruction codes upon request.

      Do the authors think that the results shown in Fig. 6c are suggesting turn-key acquisition of a routine mapping tool? In my humble opinion, it looks like random noise, with most of the activation outside the ROI (in white matter).

      As we mentioned in the ‘general response’ at the beginning of the rebuttal, the POCS method for partial Fourier reconstruction caused a loss of functional features, potentially accounting for the activation in white matter. In this revision, we have modified the pulse sequence, scan protocol, and processing pipelines.

      According to the results in Figure 4, stable activation in M1 was observed at the single-subject level across most scan protocols. Yet, the layer-dependent activation profiles in M1 were spatially unstable, irrespective of the application of VN gradients. This spatial instability is not entirely unexpected, as T2*-based contrast is inherently sensitive to various factors that perturb the magnetic field, such as eye movements, respiration, and macrovascular signal fluctuations. Furthermore, ICA-based artifact removal was intentionally omitted in Figure 4 to ensure fair comparisons between protocols, leaving residual artifacts unaddressed. Inconsistency in performing the button-pressing task across sessions may also have contributed to the observed variability. These results suggest that submillimeter-resolution fMRI may not yet be suitable for reliable individual-level layer-dependent functional mapping, unless group-level statistics are incorporated to enhance robustness. We have incorporated this text into the Limitation section of the manuscript.

      (4) The repeatability of the results is questionable.

      The authors perform experiments about the robustness of the method (line 620). The corresponding results are not suggesting any robustness to me. In fact, the layer profiles in Fig. 4c vs. Fig 4d are completely opposite. The location of peaks turns into locations of dips and vice versa.

      The methods are not described in enough detail to reproduce these results.

      The authors mention that their image reconstruction is done "using in-house MATLAB code" (line 634). They do not post a link to github, nor do they say if they share this code.

      We thank the reviewer for the comments regarding reproducibility and data sharing. In response, we have revised the Methods section and elaborated on the technical details to improve clarity and reproducibility.

      Regarding code sharing, we acknowledge that the current in-house MATLAB reconstruction code requires further refinement to improve its readability and usability. Due to limited manpower, we have not yet been able to complete this task. However, we are committed to making the code publicly available and will upload it to GitHub as soon as the necessary resources are available.

      For data sharing, we face logistical challenges due to the large size of the dataset, which spans tens of terabytes. Platforms like OpenNeuro, for example, typically support datasets up to 10TB, making it difficult to share the data in its entirety. Despite this limitation, we are more than willing to share offline reconstruction codes and raw data upon request to facilitate reproducibility.

      Regarding data robustness, we kindly refer the reviewer to our response to the previous comment, where we addressed these concerns in greater detail.

      It is not trivial to get good phase data for fMRI. The authors do not mention how they perform the respective coil-combination.

      No data are shared for reproduction of the analysis.

      Obtaining phase data is relatively straightforward when the images are retrieved directly from raw data. For coil combination, we employed the adaptive coil combination approach described by Walsh et al. (DOI: 10.1002/(sici)1522-2594(200005)43:5<682::aid-mrm10>3.0.co;2-g). The MATLAB code for this implementation was developed by Dr. Diego Hernando and is publicly available at https://github.com/welton0411/matlab .

      (5) The application of NORDIC is not validated.

      Previous applications of NORDIC at 3T layer-fMRI have resulted in mixed success. When not adjusted for the right SNR regime it can result in artifactual reductions of beta scores, depending on the SNR across layers. The authors could validate their application of NORDIC and confirm that the average layer-profiles are unaffected by the application of NORDIC. Also, the NORDIC version should be explicitly mentioned in the manuscript.

      Akbari, A., Gati, J.S., Zeman, P., Liem, B., Menon, R.S., 2023. Layer Dependence of Monocular and Binocular Responses in Human Ocular Dominance Columns at 7T using VASO and BOLD (preprint). Neuroscience. https://doi.org/10.1101/2023.04.06.535924

      Knudsen, L., Guo, F., Huang, J., Blicher, J.U., Lund, T.E., Zhou, Y., Zhang, P., Yang, Y., 2023. The laminar pattern of proprioceptive activation in human primary motor cortex. bioRxiv. https://doi.org/10.1101/2023.10.29.564658

      We appreciate the reviewer’s suggestion. To validate the application of NORDIC denoising in our study, we compared the BOLD activation maps before and after denoising in the visual and motor cortices, as well as the depth-dependent activation profiles in M1. These results are presented in Figure 3. The activation patterns in the denoised maps were consistent with those in the non-denoised maps but exhibited higher statistical significance. Notably, BOLD activation within M1 was only observed after NORDIC denoising, underscoring the necessity of this approach. Figure 3c shows the depth-dependent activation profiles in M1, highlighted by the green contours in Figure 3b. Both denoised and non-denoised profiles followed similar trends; however, as expected, the non-denoised profile exhibited larger confidence intervals compared to the NORDIC-denoised profile. These results confirm that NORDIC denoising enhances sensitivity without introducing distortions in the functional signal. The corresponding text has been incorporated into the Results section.

      Regarding the implementation details of NORDIC denoising, the reconstructed images were denoised using a g-factor map (function name: NIFTI_NORDIC). The g-factor map was estimated from the image time series, and the input images were complex-valued. The width of the smoothing filter for the phase was set to 10, while all other hyperparameters were retained at their default values. This information has been integrated into the Methods section for clarity and reproducibility.

      Reviewer #2 (Public Review):

      This study developed a setup for laminar fMRI at 3T that aimed to get the best from all worlds in terms of brain coverage, temporal resolution, sensitivity to detect functional responses, and spatial specificity. They used a gradient-echo EPI readout to facilitate sensitivity, brain coverage and temporal resolution. The former was additionally boosted by NORDIC denoising and the latter two were further supported by parallel-imaging acceleration both in-plane and across slices. The authors evaluated whether the implementation of velocity-nulling (VN) gradients could mitigate macrovascular bias, known to hamper the laminar specificity of gradient-echo BOLD.

      The setup allows for 0.9 mm isotropic acquisitions with large coverage at a reasonable TR (at least for block designs) and the fMRI results presented here were acquired within practical scan-times of 12-18 minutes. Also, in terms of the availability of the method, it is favorable that it benefits from lower field strength (additional time for VN-gradient implementation, afforded by longer gray matter T2*).

      The well-known double peak feature in M1 during finger tapping was used as a test-bed to evaluate the spatial specificity. They were indeed able to demonstrate two distinct peaks in group-level laminar profiles extracted from M1 during finger tapping, which was largely free from superficial bias. This is rather intriguing as, even at 7T, clear peaks are usually only seen with spatially specific non-BOLD sequences. This is in line with their simple simulations, which nicely illustrated that, in theory, intravascular macrovascular signals should be suppressible with only minimal suppression of microvasculature when small b-values of the VN gradients are employed. However, the authors do not state how ROIs were defined making the validity of this finding unclear; were they defined from independent criteria or were they selected based on the region mostly expressing the double peak, which would clearly be circular? In any case, results are based on a very small sub-region of M1 in a single slice - it would be useful to see the generalizability of superficial-bias-free BOLD responses across a larger portion of M1.

      We appreciate and understand the reviewer’s concerns. Given the small size of the hand knob region within M1 and its intersubject variability in location, defining this region automatically remains challenging. However, we applied specific criteria to minimize bias during the delineation of M1: 1) the hand knob region was required to be anatomically located in the precentral sulcus or gyrus; 2) it needed to exhibit consistent BOLD activation across the majority of testing conditions; and 3) the region was expected to show BOLD activation in the deep cortical layers under the condition of b = 0 and TE = 30 ms. Once the boundaries across cortical depth were defined, the gray matter boundaries of the hand knob region were delineated based on the T1-weighted anatomical image and the cortical ribbon mask, without reference to the BOLD activation map, to minimize potential bias in manual delineation. Based on the new criteria, the resulting depth-dependent profiles, as shown in Figure 4, are no longer superficial-bias-free.

      As repeatedly mentioned by the authors, a laminar fMRI setup must demonstrate adequate functional sensitivity to detect (in this case) BOLD responses. The sensitivity evaluation is unfortunately quite weak. It is mainly based on the argument that significant activation was found in a challenging sub-cortical region (LGN). However, it was a single participant, the activation map was not very convincing, and the demonstration of significant activation after considerable voxel-averaging is inadequate evidence to claim sufficient BOLD sensitivity. How well sensitivity is retained in the presence of VN gradients, high acceleration factors, etc., is therefore unclear. The ability of the setup to obtain meaningful functional connectivity results is reassuring, yet, more elaborate comparison with e.g., the conventional BOLD setup (no VN gradients) is warranted, for example by comparison of tSNR, quantification and comparison of CNR, illustration of unmasked-full-slice activation maps to compare noise-levels, comparison of the across-trial variance in each subject, etc. Furthermore, as NORDIC appears to be a cornerstone to enable submillimeter resolution in this setup at 3T, it is critical to evaluate its impact on the data through comparison with non-denoised data, which is currently lacking.

      We appreciate the reviewer’s comments and acknowledge that the LGN results from a single participant were not sufficiently convincing. In this revision, we have removed the LGN-related results and focused on cortico-cortical FC. To evaluate data quality, we opted to present BOLD activation maps rather than tSNR, as high tSNR does not necessarily translate to high functional significance. In Figure 3, we illustrate the effect of NORDIC denoising, including activation maps and depth-dependent profiles. Figure 4 presents activation maps acquired under different TE and b values, demonstrating that VN gradients effectively reduce the bias toward the pial surface without altering the overall activation patterns. The results in Figure 4 and Figure 5 provide evidence that VN gradients retain sensitivity while reducing superficial bias. The ability of the setup to obtain meaningful FC results was validated through seed-based analyses, identifying distinct connectivity patterns in the superficial and deep layers of the primary motor cortex (M1), with significant inter-layer differences (see Figure 7). Further analyses with a seed in the primary sensory cortex (S1) demonstrated the reliability of the method (see Figure 8). For further details on the results, including the impact of VN gradients and NORDIC denoising, please refer to Figures 3 to 8 in the Results section.

      Additionally, we acknowledge the limitations of our current protocol for submillimeter-resolution fMRI at the individual level. We found that robust layer-dependent functional mapping often requires group-level statistics to enhance reliability. This issue has been discussed in detail in the Limitations section.

      The proposed setup might potentially be valuable to the field, which is continuously searching for techniques to achieve laminar specificity in gradient echo EPI acquisitions. Nonetheless, the above considerations need to be tackled to make a convincing case.

      Reviewer #3 (Public Review):

      Summary:

      The authors are looking for a spatially specific functional brain response to visualise non-invasively with 3T (clinical field strength) MRI. They propose a velocity-nulled weighting to remove the signal from draining veins in a submillimeter multiband acquisition.

      Strengths:

      - This manuscript addresses a real need in the cognitive neuroscience community interested in imaging responses in cortical layers in-vivo in humans.

      - An additional benefit is the proposed implementation at 3T, a widely available field strength.

      Weaknesses:

      - Although the VASO acquisition is discussed in the introduction section, the VN-sequence seems closer to diffusion-weighted functional MRI. The authors should make it more clear to the reader what the differences are, and how results are expected to differ. Generally, it is not so clear why the introduction is so focused on the VASO acquisition (which, curiously, lacks a reference to Lu et al 2013). There are many more alternatives to BOLD-weighted imaging for fMRI. CBF-weighted ASL and GRASE have been around for a while, ABC and double-SE have been proposed more recently.

      The major distinction between diffusion-weighted fMRI (DW-fMRI) and our methodology lies in the b-value employed. DW-fMRI typically measures cellular swelling using b-values greater than 1000 s/mm<sup>2</sup> (e.g., 1800 s/mm<sup>2</sup>). In contrast, our VN-fMRI approach measures hemodynamic responses by employing smaller b-values specifically designed to suppress signals from fast-flowing draining veins rather than detecting microstructural changes.

      Regarding other functional contrasts, we agree that more layer-dependent fMRI approaches should be mentioned. In this revision, we have expanded the Introduction section to include discussions of the double spin-echo approach and CBV-based methods, such as MT-weighted fMRI, VAPER, ABC, and CBF-based method ASL. Additionally, the reference to Lu et al. (2013) has been cited in the revised manuscript. The corresponding text has been incorporated into the Introduction section to provide a more comprehensive overview of alternative functional imaging techniques.

      - The comparison in Figure 2 for different b-values shows % signal changes. However, as the baseline signal changes dramatically with added diffusion weighting, this is rather uninformative. A plot of t-values against cortical depth would be much more insightful.

      - Surprisingly, the %-signal change for a b-value of 0 is not significantly different from 0 in the gray matter. This raises some doubts about the task or ROI definition. A finger-tapping task should reliably engage the primary motor cortex, even at 3T, and even in a single participant.

      - The BOLD weighted images in Figure 3 show a very clear double-peak pattern. This contradicts the results in Figure 2 and is unexpected given the existing literature on BOLD responses as a function of cortical depth.

      - Given that data from Figures 2, 3, and 4 are derived from a single participant each, order and attention effects might have dramatically affected the observed patterns. Especially for Figure 4, neither BOLD nor VN profiles are really different from 0, and without statistical values or inter-subject averaging, these cannot be used to draw conclusions.

      We appreciate the reviewer’s suggestions. In this revision, we have made significant updates to the participant recruitment, scan protocol, data processing, and M1 delineation. Please refer to the "General Responses" at the beginning of the rebuttal and the first response to Reviewer #2 for more details.

      Previously, the variation in depth-dependent profiles was calculated across upscaled voxels within a specific layer. However, due to the small size of the hand knob region, the number of within-layer voxels was limited, resulting in inaccurate estimations of signal variation. In the revised manuscript, the signal was averaged within each layer before performing the GLM analysis, and signal variation was calculated using the temporal residuals. The technical details of these changes are described in the "Materials and Methods" section. Furthermore, while the initial submission used percentage signal change for the profiles of M1, the dramatic baseline fluctuations observed previously are no longer an issue after the modifications. For this reason, we retained the use of percentage signal change to present the depth-dependent profiles. After these adjustments, the profiles exhibited a bias toward the pial surface, particularly in the absence of VN gradients.
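
      The revised procedure described above (average the signal within each layer, fit the GLM to the averaged time series, and derive the variation from the temporal residuals) can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual pipeline: the design matrix contains only one task regressor plus a constant, the regressor is assumed to be zero at rest so that the constant term serves as the baseline, and the function name and synthetic data are hypothetical.

```python
import numpy as np


def layer_percent_signal_change(layer_voxels, task_regressor):
    """Fit a GLM to the layer-averaged time series.

    layer_voxels   : (n_voxels, n_timepoints) array for one cortical layer
    task_regressor : (n_timepoints,) predicted task response, zero at rest
    Returns (percent signal change, its standard error).
    """
    # Average across voxels first, so the fit sees one time series per layer.
    y = np.asarray(layer_voxels, dtype=float).mean(axis=0)
    x = np.asarray(task_regressor, dtype=float)
    X = np.column_stack([x, np.ones_like(x)])  # task regressor + constant

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Variation is estimated from the temporal residuals of the fit.
    resid = y - X @ beta
    dof = y.size - X.shape[1]
    sigma2 = resid @ resid / dof
    se_task = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[0, 0])

    baseline = beta[1]
    return 100.0 * beta[0] / baseline, 100.0 * se_task / baseline


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    boxcar = np.tile(np.r_[np.zeros(10), np.ones(10)], 6)  # hypothetical block design
    voxels = 100.0 + 2.0 * boxcar + rng.normal(0.0, 1.0, size=(5, boxcar.size))
    print(layer_percent_signal_change(voxels, boxcar))
```

      Because the standard error comes from the residuals of a single averaged time series, it no longer depends on how many upscaled voxels happen to fall within a layer.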

      - In Figure 5, a phase regression is added to the data presented in Figure 4. However, for a phase regression to work, there has to be a (macrovascular) response to start with. As none of the responses in Figure 4 are significant for the single participant dataset, phase regression should probably not have been undertaken. In this case, the functional 'responses' appear to increase with phase regression, which is counterintuitive and deserves an explanation.

      We agree with the reviewer’s argument. In the revised results, the issues mentioned by the reviewer are largely diminished. The updated analyses demonstrate that phase regression effectively reduces superficial bias, as shown in Figures 4 and 5.

      - Consistency of responses is indeed expected to increase with the removal of the more variable vascular component. However, the microvascular component is always expected to be smaller than the combination of microvascular + macrovascular responses. Note that the use of %signal changes may obscure this effect somewhat because of the modified baseline. Another expected feature of BOLD profiles containing both micro- and macrovasculature is the draining towards the cortical surface. In the profiles shown in Figure 7, this is completely absent. In the group data, no significant responses to the task are shown anywhere in the cortical ribbon.

      We agree with the reviewer’s comments. In the revised manuscript, the results have been substantially updated to address the concerns raised. The original Figure 7 is no longer relevant and has been removed.

      - Although I'd like to applaud the authors for their ambition with the connectivity analysis, I feel that acquisitions that are so SNR starved as to fail to show a significant response to a motor task should not be used for brain wide directed connectivity analysis.

      We appreciate the reviewer’s comments and share the concern about SNR limitations. In the updated results presented in Figure 5, the activation patterns in the visual cortex were consistent across TEs and b values. At the motor cortex, stable activation in M1 was observed at the single-subject level across most scan protocols. However, the layer-dependent activation profiles in M1 exhibited spatial instability, irrespective of the application of VN gradients. This spatial instability is not entirely unexpected, as T2*-based contrast is inherently sensitive to factors that perturb the magnetic field, such as eye movements, respiration, and macrovascular signal fluctuations. Additionally, ICA-based artifact removal was intentionally omitted in Figure 4 to ensure fair comparisons across protocols, leaving some residual artifacts unaddressed. Variability in task performance during button-pressing sessions may have further contributed to the observed inconsistencies.

      Although these findings suggest that submillimeter-resolution fMRI may not yet be reliable for individual-level layer-dependent functional mapping, the group-level FC analyses can still yield robust results. In Figure 7, group-level statistics revealed distinct functional connectivity (FC) patterns associated with superficial and deep layers in M1. These FC maps exhibited significant differences between layers, demonstrating that VN fMRI enhances inter-layer independence. Additional FC analyses with a seed placed in S1 further validated these findings (see Figure 8).

      The claim of specificity is supported by the observation of the double-peak pattern in the motor cortex, previously shown in multiple non-BOLD studies. However, this same pattern is shown in some of the BOLD weighted data, which seems to suggest that the double-peak pattern is not solely due to the added velocity nulling gradients. In addition, the well-known draining towards the cortical surface is not replicated for the BOLD-weighted data in Figures 3, 4, or 7. This puts some doubt about the data actually having the SNR to draw conclusions about the observed patterns.

      We appreciate the reviewer’s comments. In the updated results, the efficacy of the VN gradients is evident near the pial surface, as shown in Figures 4 and 5. In Figure 4, comparing the second and third columns (b = 0 and b = 6 s/mm<sup>2</sup>, respectively, at TE = 38 ms), the percentage signal change in the superficial layers is generally lower with b = 6 s/mm<sup>2</sup> than with b = 0. This indicates that VN gradient-induced signal suppression is more pronounced in the superficial layers. Additionally, in Figure 5, the VN gradients effectively suppressed macrovascular signals as highlighted by the blue circles. These observations support the role of VN gradients in enhancing specificity by reducing superficial bias and macrovascular contamination. Furthermore, a bias toward the cortical surface was observed in the updated BOLD results in Figure 4.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      (1) L141: "depth dependent" is slightly misleading here. It could be misunderstood to suggest that the authors are assessing how spatial specificity varies as a function of depth. Rather, they are assessing spatial specificity based on depth-dependent responses (double peak feature). Perhaps "layer-dependent spatial specificity" could be substituted with laminar specificity?

      We thank the reviewer for the suggestion. The term “depth dependent” has been replaced by “layer dependent” in the revised manuscript.

      (2) L146-149: these do not validate spatial specificity.

      The original text is removed.

      (3) L180: Maybe helpful to describe what the b-value is to assist unfamiliar readers.

      We have clarified the b-value as “the strength of the bipolar diffusion gradients” where it is first mentioned in the manuscript.

      (4) Figure 1B: I think it would be appropriate with a sentence of how the authors define micro/macrovasculature. Figure 1B seems to suggest that large ascending veins are considered microvascular which I believe is a bit unconventional. Nevertheless, as long as it is clearly stated, it should be fine.

      In our context, macrovasculature refers to vessels that are distal to neural activation sites and contribute to extravascular contamination. These vessels are typically larger in size (e.g., > 0.1 mm in diameter) and exhibit faster flow rates (e.g., > 10 mm/s).

      (5) I think the authors could be more upfront with the point about non-suppressed extravascular effects from macrovasculature, which was briefly mentioned in the discussion. It could already be highlighted in the introduction or theory section.

      We thank the reviewer for the suggestions. We have expanded the discussion of extravascular effects from macrovasculature in both the Introduction (5th paragraph) and Discussion (3rd paragraph) sections.

      (6) The phase regression figure feels a bit misplaced to me. If the authors agree: rather than showing the TE-dependency of the effect of phase regression, it may be more relevant for the present study to compare the conventional setup with phase regression, with the VN setup without phase regression. I.e., to show how the proposed setup compares to existing 3T laminar fMRI studies.

      In this revision, both the TE-dependent and VN-dependent effects of phase regression were investigated. The results in Figure 4 and Figure 5 demonstrated that phase regression effectively suppresses macrovascular contributions primarily near the gray matter/CSF boundary, irrespective of TE or the presence of VN gradients.

      (7) L520: It might be beneficial to also cite the large body of other laminar studies showing the double peak feature to underscore that it is highly robust, which increases its relevance as a test-bed to assess spatial specificity.

      We agree. Additional studies have been cited (Chai et al., 2020; Huber et al., 2017a; Knudsen et al., 2023; Priovoulos et al., 2023).

      (8) L557: The argument that only one participant was assessed to reduce inter-subject variability is hard to buy. If significant variability exists across subjects, this would be highly relevant to the authors and something they would want to capture.

      We thank the reviewer for the suggestions. In this revision, we have increased the number of participants to 4 for protocol development and 14 for resting-state functional connectivity analysis, allowing us to better assess and account for inter-subject variability.

      (9) L637: add download link and version number.

      The download link has been added as requested. The version number is not applicable.

      (10) L638: How was the phase data coil-combined?

      The reconstructed multi-channel data, which were of complex values, were combined using the adaptive combination method (Walsh et al.; DOI: 10.1002/(sici)1522-2594(200005)43:5<682::aid-mrm10>3.0.co;2-g). The MATLAB code for this implementation was developed by Dr. Diego Hernando and is publicly available at https://github.com/welton0411/matlab . The phase data were then extracted using the MATLAB function ‘angle’.

      (11) L639: Why was the smoothing filter parameter changed (other parameters were default)?

      The smoothing filter parameter was set based on the suggestion provided in the help comments of the NIFTI_NORDIC function:

      function  NIFTI_NORDIC(fn_magn_in,fn_phase_in,fn_out,ARG)

      % fMRI

      %

      %  ARG.phase_filter_width=10;

      In other words, we simply followed the recommendation outlined in the NIFTI_NORDIC function’s documentation.

      (12) I assume the phase data was motion corrected after transforming to real and imaginary components and using parameters estimated from magnitude data? Maybe add a few sentences about this.

      Prior to phase regression, the time series of real and imaginary components were subjected to motion correction, followed by phase unwrapping. The phase regression was incorporated early in the data processing pipeline to minimize the discrepancy in data processing between magnitude and phase images (Stanley et al., 2021).
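      The conversion implied by this step can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, and the example operates on a single sample rather than an image volume. Spatial interpolation is usually applied to the real and imaginary components because these vary smoothly, whereas the phase jumps at wrap boundaries; the phase recovered afterwards is still wrapped, which is why unwrapping follows.

```python
import math


def to_real_imag(magnitude, phase):
    """Convert one magnitude/phase sample (phase in radians) to real and imaginary parts."""
    return magnitude * math.cos(phase), magnitude * math.sin(phase)


def to_magnitude_phase(real, imag):
    """Recover magnitude and wrapped phase (principal value) from real and imaginary parts."""
    return math.hypot(real, imag), math.atan2(imag, real)


# Round trip for a single hypothetical voxel sample.
re, im = to_real_imag(2.0, 0.75)
mag, ph = to_magnitude_phase(re, im)
```

      In an actual pipeline, the motion-correction transform would be applied to the real and imaginary volumes between these two calls.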

      (13) Was phase regression applied with e.g., a deming model, which accounts for noise on both the x and y variable? In my experience, this makes a huge difference compared with regular OLS.

      We appreciate the reviewer’s insightful comment. We are aware that noise is present in both the magnitude and phase data, so linear Deming regression would be a good fit for phase regression (Stanley et al., 2021). To perform Deming regression, however, the ratio of magnitude error variance to phase error variance must be predefined. In our initial tests, we found that the regression results were sensitive to this ratio. To avoid potential confounding, we opted to use OLS regression for the current analysis. However, we agree that a Deming model could enhance the efficacy of phase regression if the ratio could be determined objectively and properly.
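      The dependence on the variance ratio can be seen in the closed-form slope estimators. The sketch below is illustrative only and is not the code used in the study: it assumes the phase time series is the regressor `x` and the magnitude time series is the response `y`, and `delta` denotes the assumed ratio of the error variance in `y` to the error variance in `x`. The function names and the sample values are hypothetical.

```python
import math


def _moments(x, y):
    """Sample variances of x and y and their covariance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return sxx, syy, sxy


def ols_slope(x, y):
    """Ordinary least squares slope: assumes all noise is in y."""
    sxx, _, sxy = _moments(x, y)
    return sxy / sxx


def deming_slope(x, y, delta):
    """Deming slope for a given delta = var(error in y) / var(error in x).

    Requires a nonzero covariance between x and y.
    """
    sxx, syy, sxy = _moments(x, y)
    d = syy - delta * sxx
    return (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)


# Hypothetical noisy samples, for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 1.9, 4.2, 5.8, 8.1]
slopes = {delta: deming_slope(x, y, delta) for delta in (0.01, 1.0, 100.0)}
```

      As `delta` grows, the Deming slope approaches the OLS slope; as it shrinks, the slope approaches that of the inverse regression. The fitted slope therefore changes with the assumed ratio, which is the sensitivity described above.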

      (14) Figure 2: What is error bar reflecting? I don't think the across-voxel error, as also used in Figure 4, is super meaningful as it assumes the same response of all voxels within a layer (might be alright for such a small ROI). Would it be better to e.g. estimate single-trial response magnitude (percent signal change) and assess variability across? Also, it is not obvious to me why b=30 was chosen. The authors argue that larger values may kill signal, but based on this Figure in isolation, b=48 did not have smaller response magnitudes (larger if anything).

      We agree with the reviewer’s opinion on the across-voxel error. In the revised manuscript, the signal was averaged within each layer before performing the GLM analysis, and signal variation was calculated using the temporal residuals. The technical details of these changes are described in the "Materials and Methods" section.

      Additionally, the bipolar diffusion gradients were modified from a single direction to three orthogonal directions. As a result, the questions and results related to b=30 or b=48 are no longer applicable.

      (15) Figure 5: would be informative to quantify the effect of phase regression over a large ROI and evaluate reduction in macrovascular influence from superficial bias in laminar profiles.

      We appreciate the reviewer’s suggestion. In the revised manuscript, the reduction in macrovascular influence from superficial bias across a large ROI is displayed in Figure 5. Additionally, the impact on laminar profiles is demonstrated in Figure 4.

      (16) L406-408: What kind of robustness?

      We acknowledge that describing the protocol as “robust” was an overstatement. The updated results indicate that the current protocol for submillimeter fMRI may not yet be suitable for reliable individual-level layer-dependent functional mapping. However, group-level functional connectivity (FC) analyses demonstrated clear layer-specific distinctions with VN fMRI, which were not evident in conventional fMRI. These findings highlight the enhanced layer specificity achievable with VN fMRI.

      (17) Figure 8: I think C) needs pointers to superficial, middle, and deep layers? Why is it not in the same format as in Figure 9C? The discussion of the FC results could benefit from more references supporting that these observations are in line with the literature.

      In the revised results, the layer pooling shown in Figure 9c has been removed, making the question regarding format alignment no longer applicable. Additionally, references supporting the FC results have been added to the revised Discussion section (7th paragraph).

      (18) L456-457: But correlation coefficients may also be biased by different CNR across layers.

      That is correct. In the updated FC results in Figures 7 to 9, we used group-level statistics rather than correlation coefficients.

      Reviewer #3 (Recommendations For The Authors):

      The results in Figure 2-6 should be repeated over, or averaged over, a (small) group of participants. N=6 is usual in this field. I would seriously reconsider the multiband acceleration - the acquisition seemingly cannot support the SNR hit.

      A few more specific points are given below:

      (1) Abstract: The sentence about LGN in the abstract came for me out of the blue - why would LGN be important here, it's not even a motor network node? Perhaps the aims of the study should be made more clear - if it's about networks as suggested earlier then a network analysis result would be expected too. Expanding the directed FC findings would improve the logical flow of the abstract. Given the many concerns, removing the connectivity analysis altogether would also be an option.

      We thank the reviewer for the suggestions. The LGN-related results indeed diluted the focus of this study and have been completely removed in this revision.

      (2) Line 105: in addition to the VASO method, ..

      The corresponding text has been revised, and as a result, the reviewer’s suggestion is no longer applicable.

      (3) If out of the set MB 4 / 5 / 6 MB4 was best, why did the authors not continue with a comparison including MB3 and MB2? It seems to me unlikely that the MB4 acquisition is actually optimal.

      We appreciate the reviewer’s suggestions. In this revision, we decreased the MB factor to 3, as it allowed us to increase the in-plane acceleration rate to 3, thereby shortening the TE. The resulting sensitivity for both individual and group-level results is detailed in earlier responses, such as the response to Q16 for Reviewer #2.

      (4) The formatting of the references is occasionally flawed, including first names and/or initials. Please consider using a reliable reference manager.

      We used Zotero as our reference manager in this revision to ensure consistency and accuracy. The references have been formatted according to the APA style.

      (5) In the caption of Figure 5, corrected and uncorrected p values are identical. What multiple comparisons correction was made here? A multiple comparisions over voxels (as is standard) would usually lead to a cut-off ~z=3.2. That would remove most of the 'responses' shown in figure 5.

      We appreciate the reviewer’s comment. The original results presented in Figure 5 have been removed in the revised manuscript, making this comment no longer applicable.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors address an important issue in Babesia research by repurposing cipargamin (CIP) as a potential therapeutic against selective Babesia spp. In this study, CIP demonstrated potent in vitro inhibition of B. bovis and B. gibsoni with IC<sub>50</sub> values of 20.2 ± 1.4 nM and 69.4 ± 2.2 nM, respectively, and the in vivo efficacy against Babesia spp. using mouse model. The authors identified two key resistance mutations in the BgATP4 gene (BgATP4<sup>L921I</sup> and BgATP4<sup>L921V</sup>) and explored their implications through phenotypic characterization of the parasite using cell biological experiments, complemented by in silico analysis. Overall, the findings are promising and could significantly advance Babesia treatment strategies.

      Strengths:

      In this manuscript, the authors effectively repurpose cipargamin (CIP) as a potential treatment for Babesia spp. They provide compelling in vitro and in vivo data showing strong efficacy. Key resistance mutations in the BgATP4 gene are identified and analyzed through both phenotypic and in silico methods, offering valuable insights for advancing treatment strategies.

      Thank you for your insightful comments and for taking the time to review our manuscript.

      Weaknesses:

      The manuscript explores important aspects of drug repurposing and rational drug design using cipargamin (CIP) against Babesia. However, several weaknesses should be addressed. The study lacks novelty as similar research on cipargamin has been conducted, and the experimental design could be improved. The rationale for choosing CIP over other ATP4-targeting compounds is not well-explained. Validation of mutations relies heavily on in silico predictions without sufficient experimental support. The Ion Transport Assay has limitations and would benefit from additional assays like Radiolabeled Ion Flux and Electrophysiological Assays. Also, the study lacks appropriate control drugs and detailed functional characterization. Further clarity on mutation percentages, additional safety testing, and exploration of cross-resistance would strengthen the findings.

      We appreciate your feedback and the chance to improve our paper. Below, we describe how we addressed each comment in turn. We hope these revisions address your concerns.

      Comment 1: It is commendable to explore drug repurposing, drug deprescribing, drug repositioning, and rational drug design, especially using established ATP4 inhibitors that are well-studied in Plasmodium and other protozoan parasites. While the study provides some interesting findings, it appears to lack novelty, as similar investigations of cipargamin on other protozoan parasites have been conducted. The study does not introduce new concepts, and the experimental design could benefit from refinement to strengthen the results. Additionally, the rationale for choosing CIP over other MMV compounds targeting ATP4 is not clearly articulated. Clarifying the specific advantages CIP may offer against Babesia would be beneficial. Finally, the validation of the identified mutations might be strengthened by additional experimental support, as reliance on in silico predictions alone may not fully address the functional impact, particularly given the potential ambiguity of the mutations (BgATP4 L to V and I).

      Thank you for your thoughtful feedback. We have addressed the concerns as follows: (1) Introduction of new concepts and experimental design: While our study primarily builds on existing frameworks, it provides novel insights into the interaction of CIP with Babesia parasites, which we believe contribute to the field. Regarding the experimental design, we acknowledge its limitations and have revised the manuscript to include additional experiments to strengthen the robustness of our findings. Specifically, we have added experiments on the detection of BgATP4-associated ATPase activity (Figure 3H), the evaluation of cross-resistance to antibabesial agents (Figures 5A and 5B), and the efficacy of CIP plus TQ combination in eliminating B. microti infection with no recrudescence in SCID mice (Figure 5C).

      (2) Rationale for choosing CIP over other MMV compounds targeting ATP4: We appreciate this point and have expanded the introduction section to articulate our rationale for selecting CIP (Lines 94-97). Specifically, CIP was chosen due to its previously demonstrated efficacy against Plasmodium and other protozoan parasites.

      (3) Validation of identified mutations: We agree that additional experimental data would strengthen the validation of the identified mutations. In response, we have determined the ratio of wild-type to mutant parasites by Illumina NovaSeq6000 sequencing to validate the BgATP4 C-to-G and C-to-A mutations (Figure 2D).

      Comment 2: Conducting an Ion Transport Assay is useful but has limitations. Non-specific binding or transport by other cellular components can lead to inaccurate results, causing false positives or negatives and making data interpretation difficult. Indirect measurements, like changes in fluorescence or electrical potential, can introduce artifacts. To improve accuracy, consider additional assays such as

      a. Radiolabeled Ion Flux Assay: tracks the movement of Na<sup>+</sup> using radiolabeled ions, providing direct evidence of ion transport.

      b. Electrophysiological Assay: measures ionic currents in real-time with patch-clamp techniques, offering detailed information about ATP4 activity.

      Thank you for highlighting the limitations of the ion transport assay and suggesting alternative approaches to improve accuracy. However, these assays require specialized equipment and expertise not currently available in our laboratory. We have acknowledged these limitations and included the alternative methods among the study's future directions. Thank you for your suggestions, which will undoubtedly enhance the rigor and depth of our research.

      Comment 3: In-silico predictions can provide plausible outcomes, but it is essential to evaluate how the recombinant purified protein and ligand interact and function at physiological levels. This aspect is currently missing and should be included. For example, incorporating immunoprecipitation and ATPase activity assays with both wild-type and mutant proteins, as well as detailed kinetic studies with Cipargamin, would be recommended to validate the findings of the study.

      Thank you for your insightful suggestions regarding the validation of in-silico predictions. We recognize the importance of evaluating the interaction and function of recombinant purified proteins and ligands at physiological levels to strengthen the study's findings. (1) Incorporating experimental validation:

      a. Immunoprecipitation assays: We agree that immunoprecipitation could provide valuable evidence of protein-ligand interactions. While this was not included in the current study due to limitations in sample availability, we plan to incorporate this assay in follow-up experiments.

      b. ATPase activity assays: Assessing ATPase activity in both wild-type and mutant proteins is a crucial step in validating the functional impact of the identified mutations. We included the results in the revised manuscript (Figure 3H).

      (2) Detailed kinetic studies with cipargamin: We appreciate the recommendation to conduct detailed kinetic analyses. These studies would provide deeper insights into the binding affinity and inhibition dynamics of cipargamin. We have included the results of these experiments in the current study (Figure 3I).

      Comment 4: The study lacks specific suitable control drugs tested both in vitro and in vivo. For accurate drug assessment, especially when evaluating drugs based on a specific phenotype, such as enlarged parasites, it is important to use ATP4 gene-specific inhibitors. Including similar classes of drugs, such as Aminopyrazoles, Dihydroisoquinolines, Pyrazoleamides, Pantothenamides, Imidazolopiperazines (e.g., GNF179), and Bicyclic Azetidine Compounds, would provide more comprehensive validation.

      Thank you for emphasizing the importance of including suitable control drugs. We acknowledge the absence of specific control drugs in the previous version of the manuscript. To date, no drug targeting ATP4 proteins in Babesia has been definitively identified. The suggested drugs could potentially disrupt the parasite's ability to regulate sodium levels by inhibiting PfATP4, a protein essential for its survival. This highlights PfATP4 as an attractive target for antimalarial drug development. However, further studies are required to evaluate whether these drugs exhibit similar activity against ATP4 homologs in Babesia.

      Comment 5: Functional characterization of CIP through microscopic examination and quantification for assessing parasite size enlargement is not entirely reliable. A Flow Cytometry-Based Assay is recommended instead (along with suitable control antiparasitic drugs). To effectively monitor Cipargamin's action, conducting time-course experiments with 6-hour intervals is advisable rather than relying solely on endpoint measurements. Additionally, for accurate assessment of parasite morphology, obtaining representative qualitative images using Scanning Electron Microscopy (SEM) or Transmission Electron Microscopy (TEM) for treated versus untreated samples is recommended for precise measurements.

      Thank you for your constructive feedback regarding the methods for functional characterization of CIP and the evaluation of parasite morphology.

      (1) Flow Cytometry-Based Assay: We agree that a flow cytometry-based assay would enhance the accuracy of detecting changes in parasite size and morphology. We will implement this method in future studies as our laboratory currently does not have the capability to conduct such experiments.

      (2) Microscopy for Morphology Assessment: We acknowledge the importance of obtaining high-resolution, representative images of treated and untreated samples. Utilizing Scanning Electron Microscopy (SEM) or Transmission Electron Microscopy (TEM) for qualitative analysis will significantly improve the precision of our morphological assessments. However, both methods have limitations.

      a. SEM: This technique can only scan the erythrocytes' surface; it cannot scan the parasite itself because it is inside the erythrocytes.

      b. TEM: Since the parasite is fixed, observations from various angles may reveal longitudinal or cross-sectional portions, making it impossible to measure the parasite's dimensions precisely. We therefore used TEM to observe alterations in the parasite's internal structure before and after treatment, as shown in Figure 3C.

      Comment 6: A notable contradiction observed is that mutant cells displayed reduced efficacy and affinity but more pronounced phenotypic effects. The BgATP4<sup>L921I</sup> mutation shows a 2x lower susceptibility (IC<sub>50</sub> of 887.9 ± 61.97 nM) and a predicted binding affinity of -6.26 kcal/mol with CIP. However, the phenotype exhibits significantly lower Na<sup>+</sup> concentration in BgATP4<sup>L921I</sup> (P = 0.0087) (Figure 3E).

      The seemingly contradictory observation of reduced CIP binding and efficacy in the BgATP4<sup>L921I</sup> mutant together with a significant decrease in intracellular Na<sup>+</sup> concentration may be explained by factors other than the direct CIP interaction. We consider that CIP binds less effectively to its target in the BgATP4<sup>L921I</sup> mutant, but the observed phenotype may be attributed to the functional consequences of the mutation. The BgATP4<sup>L921I</sup> mutation probably directly impacts the function of BgATP4's ion transport mechanism, which likely disrupts Na<sup>+</sup> homeostasis independently. Thus, we hypothesize that the dysregulated Na<sup>+</sup> homeostasis is driven by the mutation itself rather than the already weakened inhibitory effect of CIP.

      Comment 7: The manuscript does not clarify the percentage of mutations, and the number of sequence iterations performed on the ATP4 gene. It is also unclear whether clonal selection was carried out on the resistant population. If mutations are not present in 100% of the resistant parasites, please indicate the ratio of wild-type to mutant parasites and represent this information in the figure, along with the chromatograms.

      Thank you for your valuable comments. We appreciate your detailed observations and the opportunity to clarify these points. During the long-term culture process, subculturing was performed every three days. Although clonal selection was not conducted, mutant strains were effectively selected during this process. Using the Illumina NovaSeq6000 sequencing platform, high-throughput next-generation sequencing was performed to determine the ratio of wild-type to mutant parasites. Results showed that for BgATP4<sup>L921V</sup>, 99.97% of 7,960 reads were G, and for BgATP4<sup>L921I</sup>, 99.92% of 7,862 reads were A. To enhance clarity, we have included a new figure (Figure 2D) illustrating the sequencing results. We believe this addition will help provide a clearer understanding for the readers.
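      A read-level ratio of this kind is the share of reads supporting each base at the mutated position. The sketch below shows the arithmetic only; the function name is hypothetical, and the counts are made-up illustrative values, not the sequencing data from this study.

```python
def base_fractions(read_counts):
    """Return the fraction of reads supporting each base at one position."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads cover this position")
    return {base: count / total for base, count in read_counts.items()}


# Hypothetical counts at a single position (not data from the study).
example_counts = {"C": 5, "G": 995}
fractions = base_fractions(example_counts)
```

      Here `fractions["G"]` is the mutant share and `fractions["C"]` the wild-type share for the made-up counts.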

      Comment 8: While the compound's toxicity data is well-established, it is advisable to include additional testing in epithelial cells and liver-specific cell lines (e.g., HeLa, HCT, HepG2) if feasible for the authors. This would provide a more comprehensive assessment of the compound's safety profile.

      Thank you for your thoughtful suggestion. We included toxicity testing in human foreskin fibroblasts (HFF) as supplemental toxicity data to provide a more comprehensive evaluation of the compound's safety profile (Figure supplement 1B).

      Comment 9: In the in vivo efficacy study, recrudescent parasites emerged after 8 days of treatment. Did these parasites harbor the same mutation in the ATP4 gene? The authors did not investigate this aspect, which is crucial for understanding the basis of recrudescence.

      Thank you for raising this important point. We acknowledge that understanding the genetic basis of recrudescence is critical for elucidating mechanisms of resistance and treatment failure. Although our current study did not include an analysis of the BrATP4 gene in recrudescent parasites due to limitations in sample availability, we evaluated CIP efficacy in SCID mice and performed sequencing analysis of the BmATP4 gene in recrudescent samples. However, no mutations were identified (Lines 211-212). We believe that if a relapse occurs after the 7-day treatment, it is unlikely that the parasites would easily acquire mutations.

      Comment 10: The authors should explain their choice of BALB/c mice for evaluating CIP efficacy, as these mice clear the infection and may not fully represent the compound's effectiveness. Investigating CIP efficacy in SCID mice would be valuable, as they provide a more reliable model and eliminate the influence of the immune system. The rationale for not using SCID mice should be clarified.

      We appreciate the reviewer's suggestion regarding the use of SCID mice to evaluate the efficacy of CIP. In response to your suggestion, we have now included an experiment using SCID mice to evaluate the efficacy of CIP and to eliminate the confounding influence of the immune system. We further investigated the potential of combined administration of CIP plus TQ to eliminate parasites, as we are concerned that the long-term use of CIP as a monotherapy may be limited due to its potential for developing resistance. The results are shown in Figure 5C.

      Comment 11: Do the in vitro-resistant parasites show any potential for cross-resistance with commonly used antiparasitic drugs? Have the authors considered this possibility, and what are their expectations regarding cross-resistance?

      Thank you for your insightful question regarding the potential for cross-resistance between in vitro-resistant parasites and commonly used antiparasitic drugs. In response to your suggestion, we have now included experiments to assess whether B. gibsoni parasites that are resistant to CIP exhibit any cross-resistance to other commonly used antiparasitic drugs, such as atovaquone (ATO) and tafenoquine (TQ). The IC<sub>50</sub> values for both ATO and TQ in the resistant strains showed only slight changes compared to the wild-type strain, with less than a onefold difference (Figure 5A, 5B). This minimal variation suggests that the resistant strain has a mild alteration in susceptibility to ATO and TQ, but not enough to indicate strong resistance or significant cross-resistance. This suggests that CIP could be used in combination with TQ to treat babesiosis.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors have tried to repurpose cipargamin (CIP), a known drug against plasmodium and toxoplasma against babesia. They proved the efficacy of CIP on babesia in the nanomolar range. In silico analyses revealed the drug resistance mechanism through a single amino acid mutation at amino acid position 921 on the ATP4 gene of Babesia. Overall, the conclusions drawn by the authors are well justified by their data. I believe this study opens up a novel therapeutic strategy against babesiosis.

      Strengths:

      The authors have carried out a comprehensive study. All the experiments performed were carried out methodically and logically.

      Thank you for your comments and for taking the time to review our manuscript.

      Weaknesses:

      The introduction section needs to be more informative. The authors are investigating the binding of CIP to the ATP4 gene, but they did not give any information about the gene or how the ATP4 inhibitors work in general. The resolution of the figures is not good and the font size is too small to read properly. I also have several minor concerns which have been addressed in the "Recommendations for the authors" section.

      We thank the reviewer for their valuable comments. In response, we have revised the introduction to include a more detailed explanation of the ATP4 gene, its biological significance, and the mechanism of ATP4 inhibitors to provide better context for the study (Lines 86-93). Additionally, we have reformatted the figures to enhance resolution and increased the font size to improve readability. We also appreciate the reviewer's careful assessment of the manuscript and have addressed all minor concerns outlined in the "Recommendations for the Authors" section. A detailed, point-by-point response to each concern is provided in the response letter, and the corresponding revisions have been incorporated into the manuscript.

      Reviewer #3 (Public review):

      Summary:

      The authors aim to establish that cipargamin can be used for the treatment of infection caused by Babesia organisms.

      Strengths:

      The study provides strong evidence that cipargamin is effective against various Babesia species. In vitro, growth assays were used to establish that cipargamin is effective against Babesia bovis and Babesia gibsoni. Infection of mice with Babesia microti demonstrated that cipargamin is as effective as the combination of atovaquone plus azithromycin. Cipargamin protected mice from lethal infection with Babesia rodhaini. Mutations that confer resistance to cipargamin were identified in the gene encoding ATP4, a P-type Na<sup>+</sup> ATPase that was found in other apicomplexan parasites, thereby validating ATP4 as the target of cipargamin.

      We thank the reviewer for taking the time to review our manuscript.

      Weaknesses:

      Cipargamin was tested in vivo at a single dose administered daily for 7 days. Despite the prospect of using cipargamin for the treatment of human babesiosis, there was no attempt to identify the lowest dose of cipargamin that protects mice from Babesia microti infection. Exposure to cipargamin can induce resistance, indicating that cipargamin should not be used alone but in combination with other drugs. There was no attempt at testing cipargamin in combination with other drugs, particularly atovaquone, in the mouse model of Babesia microti infection. Given the difficulty in treating immunocompromised patients infected with Babesia microti, it would have been informative to test cipargamin in a mouse model of severe immunosuppression (SCID or rag-deficient mice).

      We thank the reviewer for raising these important comments. We address each concern as follows:

      (1) Identifying the lowest protective dose of CIP:

      Although our current study was designed to assess the efficacy of CIP at a single therapeutic dose over a 7-day period, we acknowledge that identifying the lowest effective dose would provide valuable information for optimizing treatment regimens. We plan to address this in future studies by conducting a dose-response experiment to identify the minimal protective dose of CIP.

      (2) Testing CIP in combination with other drugs:

      In the current study, we have tested the efficacy of tafenoquine (TQ) combined with CIP, as well as CIP or TQ administered individually, in a mouse model of B. microti infection. Our results demonstrated that, compared with monotherapy, the combination of CIP and TQ completely eliminated the parasites within 90 days of observation (Figure 5C).

      (3) Testing in an immunocompromised mouse model:

      We agree with the reviewer that evaluating CIP in immunocompromised models is critical for understanding its potential in treating immunocompromised patients. To address this, we have conducted experiments using SCID mice infected with B. microti. Our results indicated that the combination therapy of CIP plus TQ was effective in eliminating parasites in the severely immunocompromised model (Figure 5D).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Comment 1: Table: Include the in-silico binding energies for each mutation and ligand.

      We have added binding energies for each mutation and ligand in Table supplement 3.

      Comment 2: Did the authors investigate the potential of combination therapies involving CIP?

      We have tested the efficacy of TQ combined with CIP in a mouse model of B. microti infection.

      Comment 3: Does this mutation affect the transmission of the parasite?

      Based on our observations, the growth and generation rates of the mutant strain are comparable to those of the wild-type strain. These findings suggest that the mutation does not significantly affect the spread or transmission of the parasite. We have included this observation in the revised manuscript (Lines 243-244).

      Comment 4: 60: Use abbreviations CLN for clindamycin and QUI for quinine.

      We have revised them accordingly (Lines 59-60).

      Comment 5: 86: The hypothesis is not strong or convincing; it needs to be modified to be more specific and convincing.

      We have revised the hypothesis to reflect the rationale behind the study better and to support our claim more strongly (Lines 94-97).

      Comment 6: 93: Change to: "In vitro efficacy of CIP against B. bovis and B. gibsoni.".

      We have changed the suggested content in the manuscript (Line 104).

      Comment 7: 96: Define CC<sub>50</sub>.

      We have added the definition of CC<sub>50</sub> (Line 106).

      Comment 8: 102: Change to: "...Balb/c mice increased dramatically in the...".

      We have changed the word following your recommendation (Line 114).

      Comment 9: 108: "...significant decrease at 12 DPI...".

      We have revised it according to your suggestion (Line 120).

      Comment 10: 110: "This indicates that the administration...".

      We have revised it according to your suggestion (Line 122).

      Comment 11: Figure 1:

      (1) Panels A and B should clearly indicate parasite species within the graph for better self-explanation.

      We have indicated parasite species within the graph.

      (2) For panels C, D, and E, if mice were eliminated or euthanized in the study, include a symbol in the graph to indicate this.

      For panels C and D, no mice were eliminated during the study; therefore, no symbol was added to these graphs. Panel F already provides information about the number of eliminated mice, which corresponds to the data in Panel E.

      (3) In panels C, D, and E, use a continuation arrow for drug treatment rather than a straight line, to cover the duration of the treatment.

      We have updated the figures to use continuation arrows instead of straight lines to represent the duration of drug treatment.

      Comment 12: Figure 2: The color combination for the WT and mutant curves is hard to read; consider using regular, less fluorescent, and more distinguishable colors.

      We have adjusted the color scheme to use more distinguishable and less fluorescent colors, ensuring better readability and clarity. The revised figure with the updated color scheme has been included in the updated manuscript, and we hope this resolves the readability concern.

      Comment 13: Figure 3:

      (1) Panel A: Represent a single infected iRBC rather than a field for better visualization.

      We have updated Panel A to display a single infected iRBC instead of a field.

      (2) Panels E and F: Change the color patterns, as the current colors, especially the green variants (WT and mutant L921V), are difficult to read.

      To improve readability, we have updated the color patterns for these panels by selecting more distinguishable colors with higher contrast (Figure 3F, 3G).

      Comment 14: Figure 4: Panels B, C, and D: The text is too small to read; increase the font size or change the resolution.

      We have increased the font size and replaced the panels with high-resolution versions (Figure 4B, 4C, 4D).

      Reviewer #2 (Recommendations for the authors):

      Comment 1: In the last paragraph of the introduction, the authors mentioned determining the activity of CIP in vitro in B. bovis and B. gibsoni while in vivo in B. microti and B. rodhaini. It is not explained why they are testing the in vitro and in vivo effects on different Babesia species. Could you please add some logic there? Also, why did they mention measuring the inhibitory activity of CIP by monitoring the Na<sup>+</sup> and H<sup>+</sup> balance? This part needs to be rewritten with more information. The ATP4 gene is not properly introduced in the manuscript.

      We thank the reviewer for raising these important points. Below, we address each aspect of the comment in detail:

      (1) Rationale for testing different Babesia spp. in vitro and in vivo:

      B. bovis and B. gibsoni are well-established Babesia models for in vitro culture systems, allowing evaluation of CIP's inhibitory activity under controlled laboratory conditions. B. microti and B. rodhaini, on the other hand, are commonly used rodent models for the in vivo studies of babesiosis, enabling the assessment of drug efficacy in a mammalian host system. This multi-species approach provides a comprehensive evaluation of CIP's efficacy across Babesia spp. with different biological characteristics.

      (2) Measuring CIP's inhibitory activity via Na<sup>+</sup> and H<sup>+</sup> balance:

      We acknowledge that this section of the introduction requires more context. The revised manuscript now includes additional information explaining that the ATP4 gene, which encodes a Na<sup>+</sup>/H<sup>+</sup> transporter, is the proposed target of CIP (Lines 86-93). CIP disrupts the ion homeostasis maintained by ATP4, leading to an imbalance in Na<sup>+</sup> and H<sup>+</sup> concentrations. Monitoring these ionic changes provides a mechanistic understanding of CIP's mode of action and its impact on parasite viability. This rationale has been expanded in the introduction to clarify its significance.

      Comment 2: The figure fonts are too small. The resolution for the images is also poor.

      We have increased the font size in all figures to improve readability. Additionally, we have replaced the figures with high-resolution versions to ensure clarity and visual quality.

      Comment 3: Figures 1A and 1B: one of the error bars merged to the X-axis legend. Please modify these panels. Which curve was used to determine the IC<sub>50</sub> values (although it's mentioned in the methods section, would it be better to have the information in the figure legends as well)?

      We thank the reviewer for their comments regarding Figures 1A and 1B.

      (1) Error bars overlapping the X-axis legend:

      The error bars in the figures were automatically generated using GraphPad Prism 9 based on the data and are determined by the values themselves. Unfortunately, this overlap cannot be avoided without altering the data representation.

      (2) IC<sub>50</sub> curve information:

      To clarify the determination of IC<sub>50</sub> values, we have already included gray dashed lines in the graphs to indicate where the IC<sub>50</sub> values were derived from the curves. This visual representation provides clear information about the IC<sub>50</sub> points.

      Comment 4: Supplementary Figure 1: what are MDCK cells? What is CC<sub>50</sub>? Please mention their full forms in the text and figure legends (they should be described here because the methods section comes later). What is meant by a predicted selectivity index? There should be an explanation of why and how they did it. Which curve was used to determine the IC<sub>50</sub> values?

      We thank the reviewer for pointing out the need to clarify terms and provide additional context in the supplementary figure and text. We have updated the figure legend and text to include the full forms of MDCK (Madin-Darby canine kidney) cells and CC<sub>50</sub> (50% cytotoxic concentration), ensuring clarity for readers encountering these terms for the first time. In the text, we have now included a brief explanation of the selectivity index as a measure of a drug's safety and specificity (Lines 108-110). The selectivity index is calculated as the ratio of the 50% cytotoxic concentration (CC<sub>50</sub>) to the half maximal inhibitory concentration (IC<sub>50</sub>) (Lines 333-335). We have also included gray dashed lines in the graphs to indicate where the IC<sub>50</sub> values were derived from the curves (Figure supplement 1).
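
      As an illustration of the selectivity index mentioned in this response, the following is a minimal sketch that assumes the conventional definition SI = CC<sub>50</sub> / IC<sub>50</sub>, with both concentrations expressed in the same unit. The function name and the example values are hypothetical placeholders and are not taken from the manuscript.

```python
def selectivity_index(cc50: float, ic50: float) -> float:
    """Return the selectivity index, assumed here to be CC50 / IC50.

    cc50: 50% cytotoxic concentration measured on host cells.
    ic50: half maximal inhibitory concentration measured on the parasite.
    Both values must be positive and in the same unit (for example, uM).
    """
    if cc50 <= 0 or ic50 <= 0:
        raise ValueError("CC50 and IC50 must both be positive")
    return cc50 / ic50


# Hypothetical values for illustration only (not data from the study).
example_si = selectivity_index(cc50=100.0, ic50=2.0)
print(f"Selectivity index: {example_si:.1f}")
```

      Under this definition, a larger value means the compound inhibits the parasite at concentrations well below those that harm host cells.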

      Comment 5: Figures 1C-F: It feels unnecessary to write down n=6 for each panel and each group. Since "n" is equal for all, it would be nice to just mention it in the figure legend only.

      We appreciate the reviewer's suggestion regarding the notation of "n=6" in Figures 1C-F. To improve clarity and reduce redundancy, we have removed the "n=6" notation from the individual panels and included it in the figure legend instead.

      Comment 6: Figure 2A: was never mentioned in the text.

      We have described the sequencing results for the wild-type B. gibsoni ATP4 gene with a reference to Figure 2A in the revised manuscript (Lines 134-135).

      Comment 7: Figure 2D: some of the error bars merged to the X-axis legend. Please modify. Again, which curve was used to determine the IC<sub>50</sub> values? Can the authors explain why the pH declined after 4 minutes?

      We thank the reviewer for this insightful question.

      (1) Error bars overlapping the X-axis legend:

      The error bars in Figure 2E were automatically generated using GraphPad Prism 9 and are determined by the underlying data values. Unfortunately, this overlap cannot be avoided without altering the data representation.

      (2) IC<sub>50</sub> curve information:

      Since Figure 2E contains three separate curves, adding dashed lines to indicate the IC<sub>50</sub> for each curve would make the figure overly cluttered and reduce readability. To address this, we have clearly indicated the IC<sub>50</sub> values in Figures 1A and 1B and described the methodology for determining IC<sub>50</sub> values in the Methods section. We believe this approach provides sufficient clarity without compromising the visual experience of Figure 2E.

      (3) The pH decline observed after 4 minutes (Figure 3E) may be attributed to the following factors:

      a. Ion transport dynamics:

      The initial rise in pH likely reflects the rapid inhibition of Na<sup>+</sup>/H<sup>+</sup> exchange mediated by CIP, which temporarily alkalinizes the intracellular environment. However, after this initial phase, compensatory mechanisms, such as proton influx or metabolic acid production, may lead to a subsequent decline in pH.

      b. Drug kinetics and target interaction:

      The decline could also result from the time-dependent effects of CIP on ATP4-mediated ion transport. As the drug action stabilizes, the parasite may partially restore ionic balance, leading to a decrease in intracellular pH.

      Comment 8: Supplementary Figure 2: It's difficult to distinguish between red and pink colors, so it would be wise to use two contrasting colors to distinguish between Pf and Tg CIP resistant sites.

      We have updated the figure to enhance clarity. Purple squares and arrows now represent sites linked to P. falciparum CIP resistance, replacing the previous red squares. Similarly, gray squares and arrows have replaced the green squares to denote sites associated with T. gondii (Figure supplement 2).

      Comment 9: Line 65: Is it possible to add a reference here?

      We have added a reference in line 65.

      Comment 10: Line 69: Please spell the full form of G6PD as it was never mentioned before.

      We have added the full form of G6PD in lines 69-70.

      Comment 11: Line 103: mention what DPI is (irrespective of the methods section which comes later).

      We have spelled out DPI (days postinfection) in line 115.

      Comment 12: Line 120: It's not explained why B. gibsoni ATP4 gene was investigated? There should be more explanation and references to previous work.

      We thank the reviewer for pointing out the need to provide more context for investigating the B. gibsoni ATP4 gene. To address this, we have added more information to the introduction, explaining that the ATP4 gene, which encodes a Na<sup>+</sup>/H<sup>+</sup> transporter, is the proposed target of CIP (Lines 86-93).

      Comment 13: Line 203-219: line spacing seems different from the rest of the manuscript.

      We have corrected the line spacing (Lines 262-278).

      Reviewer #3 (Recommendations for the authors):

      Comment 1: Lines 66-68: The report by Marcos et al. 2022 did not demonstrate that tafenoquine was effective in curing relapsing babesiosis. In the discussion of that article, the authors state that "it is impossible to conclude that the drug tafenoquine provided any clinical benefit." The first demonstration of tafenoquine efficacy against relapsing babesiosis was reported by Rogers et al. 2023 and confirmed by Krause et al. 2024. Please rephrase the statement and use relevant citations.

      We thank the reviewer for pointing out this issue and we have rephrased the statement and used relevant citations (Lines 66-68).

      Comment 2: Line 103: mean parasitemia at 10 DPI is reported to be 35.88% but Figure 1C appears to indicate otherwise.

      We apologize for the error. The correct mean parasitemia at 10 DPI is 38.55%; line 115 of the revised manuscript has been updated to reflect the data shown in Figure 1C.

      Comment 3: Line 116: parasitemia is said to recur on day 14 post-infection but Figure 1E indicates that recurrence was already noted on day 12 post-infection.

      We thank the reviewer for pointing out this inconsistency. We have corrected the relapse day to reflect that recurrence was noted on day 12 post-infection, as shown in Figure 1E. This correction has been made in the revised manuscript (Line 128).

      Comment 4: Line 120: Replace "wells" with "strains". Also, start the paragraph with one brief sentence to state how resistant parasites were generated.

      We have replaced "wells" with "strains" and added one brief sentence to explain how resistant parasites were generated (Lines 132-134).

      Comment 5: Line 169: is Ji et al, 2022b truly the appropriate reference to support a statement on tafenoquine?

      We thank the reviewer for highlighting this point. We have added another reference to support the statement on tafenoquine: the IC<sub>50</sub> value of TQ was 20.0 ± 2.4 μM against B. gibsoni (Ji et al., 2022b) and 31 μM against B. bovis (Carvalho et al., 2020) (Lines 223-225).

      Comment 6: Lines 184-185: given that exposure to CIP induces mutations in the ATP4 gene and therefore resistance to CIP, what is the prospect of using CIP for the treatment of babesiosis? Can the authors speculate on whether CIP should not be used alone but rather in combination with other drugs currently used for the treatment of human babesiosis?

      We thank the reviewer for raising this important question. Given that exposure to CIP induces mutations in the ATP4 gene, leading to resistance, we acknowledge that the long-term use of CIP as a monotherapy may be limited due to the potential for resistance development. To address this concern, we investigated the combination therapy of TQ and CIP to achieve the complete elimination of B. microti in infected mice (a model for human babesiosis). The results of this study are presented in Figure 5C.

      Comment 7: Lines 258-259: it is stated that drug treatment was initiated on day 4 post-infection when mean parasitemia was 1% and that drug treatment was continued for 7 days. This is not the case for B. rodhaini infection. As reported in Figure 1E, treatment was initiated on day 2 post-infection.

      We apologize for the oversight and any confusion caused. We have corrected the statement to reflect that drug treatment for B. rodhaini-infected mice was initiated at 2 DPI, as reported in Figure 1E (Lines 347-349).

      Comment 8: Lines 282-285: RBCs are said to be exposed to CIP for 3 days but parasite size is said to be measured on day 4. Which is correct?

      We thank the reviewer for pointing out this discrepancy. To clarify, the infected erythrocytes were exposed to CIP for three consecutive days (72 hours). Blood smears were then prepared at the 73<sup>rd</sup> hour, corresponding to the fourth day.

      Comment 9: Lines 35-37: this sentence can be omitted from the abstract as it does not summarize additional insight or additional data.

      We have omitted this sentence from the abstract.

      Comment 10: Line 55: replace Drews et al. 2023 with Gray and Ogden 2021 (doi: 10.3390/pathogens10111430). This excellent article directly supports the statement made by the authors.

      We appreciate the reviewer's suggestion and have replaced the reference with Gray and Ogden, 2021 (doi: 10.3390/pathogens10111430) (Line 54).

      Comment 11: Line 55: modify the start of sentence to read "The disease is known as babesiosis ...".

      We have modified the sentence (Line 54).

      Comment 12: Line 56: rephrase to read ".... but chronic infections can be asymptomatic".

      We have modified the sentence (Line 55).

      Comment 13: Line 57: rephrase to read "The fatality rate ranges from 1% among all cases to 3% among hospitalized cases but has been as high as 20% in immunocompromised patients."

      We have rephrased the sentence (Lines 55-57).

      Comment 14: Line 61: replace Holbrook et al. 2023 with Krause et al. 2021 (doi: 10.1093/cid/ciaa1216).

      We have replaced Holbrook et al. 2023 with Krause et al. 2021 (doi: 10.1093/cid/ciaa1216) (Line 60).

      Comment 15: Line 62: rephrase to read "... cytochrome b, which is targeted by atovaquone, were identified in patients with relapsing babesiosis." Here, also cite Lemieux et al., 2016; Simon et al., 2017; Rosenblatt et al, 2021, Marcos et al., 2022; Rogers et al., 2023; Krause et al., 2024.

      We have rephrased the sentence and cited the suggested references (Lines 61-64).

      Comment 16: Line 65: rephrase "Despite its efficacy, this combination can elicit adverse drug reactions (Vannier and Krause, 2012)."

      We have rephrased the sentence (Lines 65-66).

      Comment 17: Lines 75-77: rephrase to read "... of the drug indicated that CIP taken orally had good absorption, a long half-life, and ...".

      We have rephrased the sentence (Lines 76-77).

      Comment 18: Line 79: remove "the".

      We have removed "the" (Lines 79-80).

      Comment 19: Lines 83-85: rephrase to read "Mice infected with T. gondii that were treated with CIP on the day of infection and the following day had 90% fewer parasites 5 days post-infection (Zhou et al., 2014).".

      We have rephrased the sentence (Lines 83-85).

      Comment 20: Line 90: shorten the sentence to end as follows "... of CIP on Babesia parasites.".

      We have shortened the sentence as you suggested (Line 100).

      Comment 21: Line 96: spell out CC<sub>50</sub>.

      We have spelled out the full form of CC<sub>50</sub> (Line 106).

      Comment 22: Line 104: remove "of body weight".

      We have removed "of body weight" (Line 116).

      Comment 23: Line 108: delete "from 8 DPI to 24 DPI, with statistically significant decreases".

      We have deleted "from 8 DPI to 24 DPI, with statistically significant decreases" (Line 120).

      Comment 24: Line 111: start a new paragraph with the sentence "BALB/c mice infected ...".

      We have started a new paragraph with the sentence "BALB/c mice infected ..." (Line 124).

      Comment 25: Line 123: replace "showed" with "occurred".

      We have replaced "showed" with "occurred" (Line 138).

      Comment 26: Line 127: rephrase to read "... sensitivity of the resistant parasite lines ...".

      We have rephrased the sentence (Line 144).

      Comment 27: Lines 137-140: rephrase to read ".... lines were lower when compared with ...".

      We have rephrased the sentence (Line 158).

      Comment 28: Line 149: replace "BgATP4" with "B. gibsoni ATP4".

      We have replaced "BgATP4" with "B. gibsoni ATP4" (Line 183).

      Comment 29: Line 154: spell out "pLDDT" prior to pLDDT.

      We have provided the full form of pLDDT in the revised manuscript (Line 188).

      Comment 30: Lines 165-166: rephrase to read "CIP is a novel compound that inhibits Plasmodium development by targeting ATP4 and has been ...".

      We have rephrased the sentence (Lines 219-220).

      Comment 31: Lines 171-172: rephrase to read "...AZI, the combination recommended by the CDC in the United States.".

      We have rephrased the sentence (Lines 226-227).

      Comment 32: Line 173: rephrase to read "... B. rodhaini infection, with survival up to 67%.".

      We have rephrased the sentence (Line 228).

      Comment 33: Lines 175-178: rephrase to read "In a previous study, a P. falciparum Dd2 strain that acquired resistance to CIP carried the G358S mutation in the ...".

      We have rephrased the sentence (Lines 230-231).

      Comment 34: Lines 179-180: rephrase to read "ATP4 is found in the parasite plasma membrane and is specific to the subclass of apicomplexan parasites.".

      We have rephrased the sentence (Lines 232-233).

      Comment 35: Lines 182-184: rephrase to read "In another study of Toxoplasma gondii, a cell line that carried the mutation G419S in the TgATP4 gene was 34 times ...".

      We have rephrased the sentence (Lines 235-237).

      Comment 36: Lines 201-202: delete the last sentence of this paragraph.

      We have deleted the last sentence of the paragraph (Line 261).

      Comment 37: Line 228: rephrase to read "... that CIP had a weaker binding to BgATP4<sup>L921I</sup> than to BgATP4<sup>L921V</sup>.".

      We have rephrased the sentence (Lines 294-295).

      Comment 38: Lines 261-262: please state that drugs were prepared in sesame oil. Add "20 mg/kg" in front of AZI.

      We have stated that drugs were prepared in sesame oil and added "20 mg/kg" in front of AZI (Lines 350-352).

      Comment 39: Line 265: replace "care" with "treatments".

      We have replaced "care" with "treatments" (Line 355).

      Comment 40: Line 267: replace "observe" with "assess".

      We have replaced "observe" with "assess" (Line 357).

      Comment 41: Lines 269-271: please provide the absolute numbers of B. gibsoni infected RBCs and the absolute numbers of uninfected RBCs that were added to the culture medium.

      We thank the reviewer for this suggestion. In the revised manuscript, we have included the absolute numbers of B. gibsoni-infected RBCs and uninfected RBCs added to the culture medium. Specifically, the culture medium contained 10 μL (5×10<sup>6</sup>) B. gibsoni iRBCs mixed with 40 μL (4×10<sup>8</sup>) uninfected RBCs (Lines 360-361).

      Comment 42: Line 279: replace "confirmed" with "identified".

      We have replaced "confirmed" with "identified" (Line 370).

      Comment 43: Figure Supplement 2: the squares are not readily visible. Could the entire column corresponding to the mutation position be highlighted?

      We thank the reviewer for this suggestion. To improve visibility, we have changed the color of the squares and added arrows to make the mutation sites as prominent as possible. Unfortunately, due to software limitations, we were unable to highlight the entire column corresponding to the mutation position.

      Comment 44: Figure Supplement 4: for the parasite that carries a mutation in BgATP4, please delete the arrows that are next to BgATP4. These arrows send the message that the mutated ATP4 has an active role in pumping Na<sup>+</sup> and H<sup>+</sup> back into their compartment, which is not the case.

      We thank the reviewer for their observation. The dotted arrows next to BgATP4 are intended to indicate the recovery of H<sup>+</sup> and Na<sup>+</sup> balance facilitated by the mutated ATP4, which reduces susceptibility to ATP4 inhibitors. To avoid potential confusion, we have revised the figure legend to clearly explain the role of the arrows, ensuring the intended message is accurately conveyed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Comment 1- I would like the authors to discuss and justify their use of high-dose (1.3%) isoflurane. A recent consensus paper on rat fMRI (Grandjean et al., "A Consensus Protocol for Functional Connectivity Analysis in the Rat Brain.") found that medetomidine combined with low dose isoflurane provided optimal control of physiology and fMRI signal. To overcome any doubts about the effects of the high-dose anaesthetic I'd encourage the authors to show the results of their functional connectivity specificity using the same or similar image processing protocol as described in that consensus paper. This is especially true since the fMRI ICs in Figure 2A appear fairly restricted.

      We thank the reviewer for their insightful comments. We agree that the combination of medetomidine and isoflurane, as recommended by Grandjean et al. in their consensus paper, provides superior physiological stability and fMRI signal quality, and should indeed be considered the preferred protocol for future studies. In fact, we have adopted this combination in our subsequent research [1]. However, the data in the present study were acquired prior to the publication of the consensus recommendations and have been previously published [2, 3]. While isoflurane is not the ideal anesthetic for functional connectivity studies, we have demonstrated in earlier work [4] that using isoflurane at 1.3% maintains stable physiological parameters and avoids burst suppression, a key issue with higher isoflurane doses.

      Regarding preprocessing, we acknowledge the importance of standardized approaches as outlined in the consensus paper. However, to maintain methodological consistency with our prior work, we retained the original preprocessing pipeline for this study. This decision ensures comparability with our previous analyses. To address the reviewer’s concerns and encourage further verification, we have uploaded the full dataset to a public repository (as suggested in Comment 4). This will enable other researchers to reanalyze the data using updated preprocessing pipelines or explore additional analyses.

      We have updated the manuscript discussion (page 19) to clearly acknowledge these points:

      “One limitation of our study is that our experimental protocols predate the recently published consensus recommendations for rat fMRI [42], particularly concerning anesthesia and preprocessing pipelines. The use of isoflurane anesthesia, although common at the time of data acquisition, introduces a potential confound due to its known effects on neuronal activity. However, we previously demonstrated that isoflurane at 1.3% maintains stable physiological parameters and avoids burst suppression [43], a concern at higher doses. Furthermore, other studies have reported that low-dose isoflurane remains feasible for resting-state functional connectivity studies [44]. While isoflurane, as a GABA-A agonist, could theoretically interact with the mechanisms of MDMA in the brain, we found no evidence in the literature suggesting significant cross-talk between these substances. Future studies employing medetomidine-based protocols may help minimize this potential confound.

      Regarding data preprocessing, we chose to retain the same pipeline used in our prior publications [13, 14] to maintain methodological consistency. While we recognize the advantages of adopting standardized preprocessing as outlined in the consensus guidelines, this approach ensures comparability with our previous analyses. To facilitate further investigation, we have made the full dataset publicly available (see Data Availability Statement), enabling reanalysis with updated pipelines or additional explorations of this dataset.”

      Comment 2 - I'd also be interested to read more about why the cerebellum was chosen as a reference region, given that serotonin is highly expressed in the cerebellum, and what effects the choice of reference region has on their quantification.

      We examined this question ourselves in a paper dedicated to determining the most suitable reference region for [11C]DASB. While the reviewer is correct that serotonin is also present in the cerebellum, we found the lowest binding for this tracer in the cerebellar gray matter and therefore recommend this region as a valid reference area (“Displaceable binding of (11)C-DASB was found in all brain regions of both rats and mice, with the highest binding being in the thalamus and the lowest in the cerebellum. In rats, displaceable binding was largely reduced in the cerebellar cortex”; please refer to [5]).

      We amended the Materials and Methods section to specify that this previous publication showed the cerebellar gray matter to be appropriate as a reference region (page 6):

      “Binding potentials were calculated frame-wise for all dynamic PET scans using the DVR-1 (equation 1) to generate regional BPND values with the cerebellar gray matter as a reference region, which our earlier studies have demonstrated to be the most appropriate for this tracer in rats [5, 6]:”

      Comment 3 - The PET ICs appear less bilateral than the fMRI ICs. Is that simply a thresholding artefact or is it a real signal?

      We thank the reviewer for this observation. The reduced bilaterality of PET ICs compared to fMRI ICs is likely due to the inherently limited temporal resolution of PET, which provides significantly fewer frames (100 frames compared to 3000 frames for fMRI). This lower temporal resolution reduces the signal-to-noise ratio (SNR) available to the ICA, which can affect the stability and symmetry of the ICs, particularly at higher IC numbers. While thresholding may also play a minor role, we believe the primary factor is the poorer SNR of the PET data. We have clarified this point in the discussion section (page 17) as follows:

      “In our analysis, PET ICs appeared less bilateral than fMRI ICs. This is likely due to the lower temporal resolution of PET (100 frames) compared to fMRI (3000 frames), resulting in reduced signal-to-noise ratio (SNR) and potentially affecting the stability and symmetry of the independent components.”

      Comment 4 - "The data will be made available upon reasonable request" is not sufficient - please deposit the data in an open repository and link to its location.

      We agree with the reviewer's request and have uploaded the data to a Dryad repository. We have amended our Data Availability Statement accordingly.

      Comment 5 (recommendation) - Please add the age and sex of the rats in lines 92-97.

      Amended.

      Comment 6 (recommendation) - There are multiple typos throughout the manuscript - for example, "z-vlaue" on line 164, "negligable" on line 194, etc.. Sometimes the 11 in 11C is superscripted, sometimes it isn't. This paper would benefit from a careful proofread.

      Thank you for pointing this out. We sent the manuscript for language and grammar editing to AJE (see certificate).

      Reviewer 2:

      Comment 1 - While the study protocol is referenced in the paper, it would be useful to at least report whether the study uses bolus, constant infusion, or a combination of the two and the duration of the frames chosen for reconstruction. Minimal details on anesthesia should also be reported, clarifying whether an interaction between the pharmacological agent for anesthesia and MDMA can be expected (whole-brain or in specific regions).

      We fully agree that this would improve the readability of our manuscript and have added the information to the materials and methods and discussion accordingly (please refer to pages 4 and 5).

      Comment 2 - Some terminology is used in a bit unclear way. E.g. "seed-based" usually refers to seed-to-voxel and not ROI-to-ROI analysis, or e.g. it is a bit confusing to have IC1 called SERT network when in fact all ICs derived from DASB data are SERT networks. Perhaps a different wording could be used (IC1 = SERT xxxxx network; IC2= SERT salience network).

      Following the reviewer's suggestion, we propose renaming IC1 and IC2 according to their anatomical and functional characteristics (page 13):

      “IC1 = SERT Salience Network: This name highlights the involvement of regions typically associated with the salience network (e.g., CPu, Cg, NAc, Amyg, Ins, mPFC), which play key roles in emotional and cognitive processing.”

      “IC2 = SERT Subcortical Network: This name reflects the involvement of subcortical regions that play a role in arousal, stress response, and autonomic regulation, functions that are heavily modulated by serotonin in areas such as the hypothalamus, PAG, and thalamus.”

      Comment 3 - The limited sample size for the rats undergoing pharmacological stimulation might make the study (potentially) not particularly powerful. This may not be a problem if the MDMA effect observed is particularly consistent across rats. Information on inter-individual variability of FC, MC, and BPND could be provided in this regard.

      We thank the reviewer for raising this point. To address the concern about limited sample size and inter-individual variability, we have added this information to Figures 5 B and D. Regarding BPND variability, the dotted lines in Figure 3 indicate the standard deviation of the regional BPNDs; however, this was not clearly stated in the original figure description. We have now amended the figure legend to explicitly clarify this point.

      Comment 4 (recommendation) - "Our research employs a novel approach named "molecular connectivity" (MC), which merges the strengths of various imaging methods to offer a comprehensive view of how molecules interact within the brain and affect its function." I'd recommend rephrasing to "...how molecules interact across different areas within the brain...". Molecular connectivity is a potentially ambiguous term (used to study interactions across different molecules (in the same compartment/environment) vs. to study interactions across the same molecules in different areas). I'd add a couple of references to help the reader disambiguate too (e.g. https://pubmed.ncbi.nlm.nih.gov/30544240/ , https://pubmed.ncbi.nlm.nih.gov/36621368/)

      We appreciate the reviewer’s suggestion and agree that the term "Molecular Connectivity" could be ambiguous. To clarify, we rephrased the description to emphasize that our approach specifically examines interactions of the same molecule (i.e., serotonin transporter) across different brain regions, rather than interactions between different molecules within the same environment. We propose the following revised text (page 2):

      “Our research employs a novel approach termed molecular connectivity (MC), which combines the strengths of various imaging methods to provide a comprehensive view of how specific molecules, such as the serotonin transporter, interact across different brain regions and influence brain function.”

      Additionally, we will incorporate the suggested references to help the reader further contextualize the use of this term.

      Comment 5 - In the methods, it is not clear if for MC the authors also compute ROI-to-ROI correlations or only ICA.

      Thank you for highlighting this point. To clarify, our MC analysis includes both ROI-to-ROI correlations and ICA. Specifically, as described at the end of the “Molecular Connectivity Analysis” subchapter, we compute ROI-to-ROI correlations using the following steps: 1. The first 20 minutes of each scan are discarded to account for perfusion effects. 2. A detrending approach is applied to the remaining 60 minutes of BP<sub>ND</sub> time courses. 3. ROI-to-ROI correlations are then calculated and organized into subject-level correlation matrices, which are subsequently z-transformed to generate mean correlation matrices across subjects.

      We revised the methods section to explicitly state that both ROI-to-ROI correlations and ICA are integral components of the MC analysis to ensure this point is clear to readers (page 6).

      “The BP<sub>ND</sub> time courses were then used to calculate MC as described above for fMRI: ROI-to-ROI subject-level correlation matrices between all regional time courses were generated and z-transformed correlation coefficients were used to calculate mean correlation matrices.”
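
      The three steps described above can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the response does not specify the detrending method, so a linear least-squares detrend is assumed here, and the function names, the 1-minute frame duration, and the clipping of correlations before the Fisher z-transform are all hypothetical choices made for the example.

```python
import math


def detrend_linear(y):
    """Subtract the least-squares straight line from one time course.

    Assumption: the source only says "a detrending approach"; linear
    detrending is used here purely for illustration.
    """
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    return [v - (y_mean + slope * (t - t_mean)) for t, v in enumerate(y)]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def subject_z_matrix(roi_series, frame_min=1.0, discard_min=20.0):
    """Fisher z-transformed ROI-to-ROI correlation matrix for one subject.

    roi_series: list of per-ROI BP_ND time courses of equal length.
    frame_min:  frame duration in minutes (hypothetical default).
    discard_min: initial period dropped to account for perfusion effects.
    """
    skip = int(round(discard_min / frame_min))
    cleaned = [detrend_linear(ts[skip:]) for ts in roi_series]
    n = len(cleaned)
    # The diagonal is left at 0.0: self-correlations are not of interest.
    z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(cleaned[i], cleaned[j])
            # Clip so that atanh stays finite for (near-)perfect correlations.
            r = max(-0.999999, min(0.999999, r))
            z[i][j] = z[j][i] = math.atanh(r)
    return z


def group_mean_matrix(z_matrices):
    """Element-wise mean of subject-level z matrices."""
    k = len(z_matrices)
    n = len(z_matrices[0])
    return [[sum(m[i][j] for m in z_matrices) / k for j in range(n)]
            for i in range(n)]
```

      In use, `subject_z_matrix` would be called once per animal on its regional BP<sub>ND</sub> time courses, and the resulting matrices passed to `group_mean_matrix`. Averaging in z-space rather than on raw correlation coefficients avoids the bias that arises from the bounded range of r.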

      Comment 7 - In the discussion, it could be useful to relate IC1 and IC2 to well-established neuroanatomical/molecular knowledge of the serotoninergic system. Did the authors expect the IC1 and IC2 anatomical distributions? is there a plausible biological reason as to why the time courses of BPnd variations would be somehow different between IC1 and IC2?

      We appreciate the reviewer’s insightful comment and agree on the importance of relating IC1 and IC2 to well-established neuroanatomical and molecular knowledge of the serotonergic system.

      In our discussion, we noted that IC2 primarily encompasses subcortical structures such as the brainstem, midbrain, and thalamus. These regions are consistent with areas housing dense serotonergic projections originating from the raphe nuclei, the primary source of serotonin release. In contrast, IC1 involves limbic and cortical regions - including the striatum, amygdala, cingulate, insular, and prefrontal cortices - which are key targets of the serotonergic pathways. This anatomical distinction aligns with the hierarchical organization of the serotonergic system, where the brainstem nuclei exert both local and distal serotonergic modulation.

      The observed differences in the temporal dynamics of the binding potential (BP<sub>ND</sub>) variations between IC1 and IC2 likely reflect the distinct functional roles of these regions within the serotonergic network. The more immediate changes in IC2 could be attributed to the direct effect of MDMA on the raphe nuclei, leading to rapid serotonin release in subcortical structures. In contrast, the delayed changes in IC1 may reflect downstream modulation in cortical and limbic regions involved in processing more complex emotional and cognitive functions.

      That said, while these interpretations are plausible based on current neuroanatomical and functional knowledge, the exact biological mechanisms underlying the differential time courses remain unclear. As discussed in the manuscript, future studies incorporating direct, simultaneous measurements of serotonin levels and imaging data will be essential to fully elucidate the temporal and spatial dynamics of serotonin transmission in these regions. We have revised the discussion section (page 17) to better highlight this limitation as an important area for further investigation:

      “Our results demonstrate that compared with FC, MDMA induces more pronounced changes in MCs, particularly in regions associated with the SERT subcortical network. The distinct temporal dynamics of BPnd variations between these components may reflect the hierarchical organization of the serotonergic system. Specifically, the raphe nuclei, as the primary source of serotonin, are likely to exert more immediate modulation on posterior subcortical structures (IC2), whereas downstream effects on limbic and cortical regions (IC1) may occur more gradually. While these findings align with current neuroanatomical and molecular knowledge, the precise biological mechanisms driving these temporal differences remain unclear. Future investigations are warranted to elucidate these mechanisms. Future studies combining direct measurements of serotonin levels with neuroimaging data will be critical to fully understanding these components’ distinct roles and temporal profiles in regulating serotonergic function.”

      Comment 8 - In the discussion (physiological basis), could the authors detail the expected "time scale" in changes in SERT expression? How quickly can SERT expression change, especially under resting-state conditions? Is it reasonable to consider tracer fluctuations under rest conditions as biologically meaningful?

      SERT regulation can occur over different time scales depending on the mechanism involved [7].

      Acute, rapid changes (milliseconds to seconds): Protein-protein interactions with key regulatory proteins (e.g., syntaxin1A, neuronal nitric oxide synthase) can lead to rapid modulation of SERT surface expression [8-11]. These interactions often involve changes in transporter trafficking or conformational states and can occur within milliseconds to seconds. For example, syntaxin1A directly interacts with the N-terminus of SERT, influencing its availability on the plasma membrane within short timescales.

      Intermediate time scales (seconds to minutes): Posttranslational modifications, such as phosphorylation by kinases (e.g., protein kinase C) or dephosphorylation by phosphatases, are known to influence SERT function and surface expression [12-14]. These processes are typically initiated in response to cellular signaling and occur over seconds to minutes, affecting the SERT trafficking dynamics and serotonin uptake capacity [15, 16].

      Longer-term changes (minutes to hours): Longer-term regulation involves processes like endocytosis, recycling, or degradation of SERT. These pathways typically take minutes to hours and are often part of more sustained cellular responses to changes in neuronal activity or serotonin levels. Such changes are slower but contribute to the overall cellular homeostasis of SERT under prolonged stimulation.

      Under resting-state conditions, where neurons are not subjected to rapid or dramatic fluctuations in neurotransmitter release or signaling, SERT expression and activity are generally stable but still subject to subtle fluctuations due to ongoing basal regulatory processes. Basal phosphorylation or low-level protein-protein interactions can still dynamically modulate SERT trafficking and function, albeit at a lower intensity than under stimulated conditions. These fluctuations, although smaller in magnitude, may reflect fine-tuning of serotonin homeostasis and can occur on shorter timescales (seconds to minutes).

      Biological Relevance of Tracer Fluctuations at Rest:

      It is reasonable to consider that tracer fluctuations under resting conditions could reflect biologically meaningful variations in SERT expression and function. Even subtle shifts in SERT surface availability or activity can impact serotonin clearance and signaling, given the fine balance required to maintain serotonergic tone. These fluctuations may reflect intrinsic neuronal variability or ongoing homeostatic adjustments to maintain optimal neurotransmitter levels or serve as early indicators of adaptive responses to environmental or physiological changes before more overt modifications in transporter expression or activity become apparent.

      In summary, while SERT expression can change rapidly in response to signaling events (milliseconds to minutes), even under resting-state conditions, subtle regulatory fluctuations can be biologically meaningful. These fluctuations likely reflect ongoing regulatory adjustments essential for maintaining serotonergic balance and should not be disregarded as noise, particularly in experimental measurements using tracers.

      We added the following paragraph to the discussion (page 16):

      In addition, SERT regulation occurs over multiple time scales, ranging from milliseconds to hours, depending on the mechanism involved [31]. Rapid changes in SERT surface expression can be mediated by protein-protein interactions or posttranslational modifications [32, 33], such as phosphorylation, which occur on a timescale of milliseconds to minutes. These processes dynamically modulate surface availability and function, allowing fine-tuned regulation of serotonin uptake even under resting-state conditions. Additionally, while slower processes involving endocytosis, recycling, and degradation typically occur over minutes to hours, subtle fluctuations in SERT trafficking and activity can still occur under basal conditions. These minor yet biologically relevant changes likely reflect ongoing homeostatic regulation essential for maintaining serotonergic balance. Therefore, tracer fluctuations observed during resting-state measurements should not be dismissed, as they may represent meaningful variations in SERT regulation that contribute to the fine control of serotonin clearance.

      Comment 9 - In the discussion, the SERT network results should be commented on more extensively, as there is now only a generic reference to MC changes being stronger than FC ones, without spatial reference to the SERT network (while only negative salience network results are referenced explicitly instead, making the paragraph a bit confusing).

      We expanded the discussion to include a more thorough treatment of this network. The revised paragraph (page 17) directly addresses the spatial aspects of the SERT network, highlighting the specific regions involved in serotonergic connectivity and contrasting the molecular and functional connectivity changes induced by MDMA.

      Comment 10 - Figure 3; I'd switch left and right charts in the bottom panel (last row only), to keep the SERT network always on the left of the Figure.

      We agree with the suggestion and changed the figure accordingly.

      Comment 11 - Figure 4: I'd add FC decreases to the figure, to allow the reader to compare BPnd, MC, and FC changes more easily and I'd add a horizontal line at the equivalent of e.g. Z-1.96 (or similar) so that it is clear which measures/regions display significant changes.

      We prefer to keep the figure focused on the two analyses of PET alterations, since we want to emphasize their complementarity in the context of PET specifically. However, we added lines indicating significance, in line with the reviewer’s suggestion.

      Comment 12 - In Figure 5D, the y-axis mentioned FC but I suppose it should mention MC.

      We amended the figure accordingly, together with the changes to the names of the networks implemented across the manuscript.

      (1) Marciano, S., et al., Combining CRISPR-Cas9 and brain imaging to study the link from genes to molecules to networks. Proc Natl Acad Sci U S A, 2022. 119(40): p. e2122552119.

      (2) Ionescu, T.M., et al., Striatal and prefrontal D2R and SERT distributions contrastingly correlate with default-mode connectivity. Neuroimage, 2021. 243: p. 118501.

      (3) Ionescu, T.M., et al., Neurovascular Uncoupling: Multimodal Imaging Delineates the Acute Effects of 3,4-Methylenedioxymethamphetamine. J Nucl Med, 2023. 64(3): p. 466-471.

      (4) Ionescu, T.M., et al., Elucidating the complementarity of resting-state networks derived from dynamic [(18)F]FDG and hemodynamic fluctuations using simultaneous small-animal PET/MRI. Neuroimage, 2021. 236: p. 118045.

      (5) Walker, M., et al., In Vivo Evaluation of 11C-DASB for Quantitative SERT Imaging in Rats and Mice. J Nucl Med, 2016. 57(1): p. 115-21.

      (6) Walker, M., et al., Imaging SERT Availability in a Rat Model of L-DOPA-Induced Dyskinesia. Mol Imaging Biol, 2020. 22(3): p. 634-642.

      (7) Lau, T. and P. Schloss, Differential regulation of serotonin transporter cell surface expression. Wiley Interdisciplinary Reviews: Membrane Transport and Signaling, 2012. 1(3): p. 259-268.

      (8) Haase, J., et al., Regulation of the serotonin transporter by interacting proteins. Biochem Soc Trans, 2001. 29(Pt 6): p. 722-8.

      (9) Quick, M.W., Regulating the conducting states of a mammalian serotonin transporter. Neuron, 2003. 40(3): p. 537-49.

      (10) Ciccone, M.A., et al., Calcium/calmodulin-dependent kinase II regulates the interaction between the serotonin transporter and syntaxin 1A. Neuropharmacology, 2008. 55(5): p. 763-70.

      (11) Chanrion, B., et al., Physical interaction between the serotonin transporter and neuronal nitric oxide synthase underlies reciprocal modulation of their activity. Proc Natl Acad Sci U S A, 2007. 104(19): p. 8119-24.

      (12) Qian, Y., et al., Protein kinase C activation regulates human serotonin transporters in HEK-293 cells via altered cell surface expression. J Neurosci, 1997. 17(1): p. 45-57.

      (13) Ramamoorthy, S., et al., Phosphorylation and regulation of antidepressant-sensitive serotonin transporters. J Biol Chem, 1998. 273(4): p. 2458-66.

      (14) Jayanthi, L.D., et al., Evidence for biphasic effects of protein kinase C on serotonin transporter function, endocytosis, and phosphorylation. Mol Pharmacol, 2005. 67(6): p. 2077-87.

      (15) Steiner, J.A., A.M. Carneiro, and R.D. Blakely, Going with the flow: trafficking-dependent and -independent regulation of serotonin transport. Traffic, 2008. 9(9): p. 1393-402.

      (16) Lau, T., et al., Monitoring mouse serotonin transporter internalization in stem cell-derived serotonergic neurons by confocal laser scanning microscopy. Neurochem Int, 2009. 54(3-4): p. 271-6.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Alternate explanations for major conclusions.

      The major conclusions are (a) surface motility of W3110 requires pili which is not novel, (b) pili synthesis and pili-dependent surface motility require putrescine — 1 mM is optimal, and 4 mM is inhibitory, and (c) the existence of a putrescine homeostatic network that maintains intracellular putrescine that involves compensatory mechanisms for low putrescine, including diversion of energy generation toward putrescine synthesis.

      Conclusion a: Reviewer 3 suggests that the mutant may have lost surface motility because of outer surface structures that actually mediate motility but are co-regulated with or depend on pili synthesis. The reviewer explicitly suggests flagella as the alternate appendage, although flagella and pili are reciprocally regulated. Most experiments were performed in a ΔfliC background, which lacks the major flagella subunit, in order to prevent the generation of fast-moving flagella-dependent variants. Furthermore, no other surface structure that could mediate surface motility is apparent in the electron microscope images. This observation does not definitively rule out this possibility, especially because of the large transcriptomic changes with low putrescine. Our explanation is the simplest.

      Conclusion b, first comment: Reviewer 1 states that “it is not possible to conclude that the effects of gene deletions to biosynthetic, transport or catabolic genes on pili-dependent surface motility are due to changes in putrescine levels unless one takes it on faith that there must be changes to putrescine levels.” The comment ignores both the nutritional supplementation and the transcript changes that strongly suggest compensatory mechanisms for low putrescine. Why compensate if the putrescine concentration does not change? The reviewer then implicitly acknowledges changes in putrescine content: “it is important to know how much putrescine must be depleted in order to exert a physiological effect”.

      Conclusion b, second comment: Reviewer 1 proposes that agmatine accumulation can account for some of the observed properties, but which property is not specified. With respect to motility, agmatine accumulation cannot account for motility defects because motility is impaired in (a) a speA mutant which cannot make agmatine and (b) a speC speF double mutant which should not accumulate agmatine. With respect to the transcriptomic results, even if high agmatine is the reason for some transcript changes, the results still suggest a putrescine homeostasis network.

      Conclusion c: the reviewers made no comments on the RNAseq analysis or the interpretation of the existence of a homeostatic network.

      Additional experiments proposed.

      Complementation. Reviewers 1 and 3 suggested complementation experiments, but the latter states that nutritional supplementation strengthens our arguments. The most relevant complementation is with speB.  We tried complementation and found that our control plasmid inhibited motility by increasing the lag time before movement commenced. A plasmid with speB did stimulate motility relative to the control plasmid, but movement with the speB plasmid took 4 days, while wild-type movement took 1.5 days. We think that interpretation of this result is ambiguous. We did not systematically search for plasmids that had no effect on motility.

      The purpose of complementation is to determine whether a second-site mutation is the actual cause of the motility defect. In this case, the artifact is that an alteration in polyamine metabolism is not the cause of the defect. However, external putrescine reverses the effects on motility and pili synthesis in the speB mutant. This result is inconsistent with a second-site mutation. Still, we agree that complementation is important, and because of our difficulties, we tested numerous mutants with defects in polyamine metabolism. The results present an interpretable and coherent pattern. For example, if putrescine is not the regulator, then mutants in putrescine transport and catabolism should have had no effect. Every single mutant is consistent with a role in movement and pili synthesis. The simplest explanation is that putrescine affects movement and pili synthesis.

      Phase variation. Reviewer 2 noted that we did not discuss phase variation. The comment came from the observation that the speB mutant had fewer fimB transcripts which could explain the loss of motility. The reviewer also suggested a simple experiment, which we performed and found that putrescine does not control phase variation. We present those results in the supplemental material. Our discussion of this topic includes a major qualification.

      Testing of additional strains. Published results from another lab showed that surface motility of MG1655 requires spermidine instead of putrescine (PMID 19493013 and 21266585). MG1655 and the W3110 that we used in our study are both E. coli K-12 derivatives and belong to phylogenetic group A. Any number of changes in enzymes that affect intracellular putrescine concentration could result in different responses to putrescine. We are currently studying pili synthesis and motility in other strains. While that study is incomplete, loss of speB in a strain of phylogenetic group D does not eliminate surface motility. This work was intended as our initial analysis and the focus was on a single strain.

      Measuring intracellular polyamines. We felt that we had provided sufficient evidence to conclude that putrescine controls pili synthesis and that putrescine concentrations are lower in the speB mutant: the nutritional supplementation; the lower levels of transcripts for putrescine catabolic enzymes, which require putrescine for their expression and therefore strongly suggest lower putrescine in a mutant lacking a putrescine biosynthesis gene; and a transcriptomic analysis showing that the speB mutant had transcript changes that compensate for low putrescine. We understand the importance of measuring intracellular polyamines. We are currently examining the quantitative relationship between intracellular polyamines and pili synthesis in multiple strains which respond differently to loss of speB.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The authors should measure putrescine, agmatine, cadaverine, and spermidine levels in their gene deletion strains.

      Polyamine concentration measurements will be part of a separate study on polyamine control of pili synthesis of a uropathogenic strain. A comparison is essential, and the results from W3110 will be part of that study.

      Reviewer #2 (Recommendations for the authors):

      (1) Line 28. Your statements about urinary tract infections are pure speculation. They are fine for the discussion, but should not be in the abstract.

      The abstract from line 27 on has been reworked. The comment of the reviewer is fair.

      (2) Line 65. Do we need this discussion about the various strains? If you keep it, you should point out that they were all W3110 strains. But you could just say that you confirmed that your background strain can do PDSM (since you are also not showing any data for the other isolates). Discussing the various strains implies that you are not confident in your strain and raises the question of why you didn't use a sequenced wt MG1655, or something like that.

      This section has been reworked. Our strain of W3110 has an insertion in fimB which is relevant for movement but does not affect our results. The insertion limits our conclusions about phase variation. We want to point out that strain-to-strain variations are large. We also sequenced our strain of W3110.

      (3) Related. You occasionally use "W3110-LR" to designate the wild type. You use this or not, but be consistent throughout the text.

      Fixed

      (4) Line 99. Does eLife allow "data not shown"?  

      (5) Line 119. As you note, the phenotype of the puuA patA double mutant is exactly the opposite of what one would expect. Although you provide additional evidence that high levels also inhibit motility, complementing the double mutant would provide confidence that the strain is correct.

      We rapidly ran into issues with complementation which are discussed in public responses to reviewer comments.

      (6) Figure 6C. Either you need to quantify these data or you need a better picture.

      The files were corrupted. The experiment was repeated several times, but we lost the other data.

      (7) Figure 7. Label panels A and B to indicate that these strains are speB. Also, you need to switch panels C and D to match the order of discussion in the manuscript.

      Done

      (8) Line 134. Is there a statistically significant difference in the ELISA between 1 and 4 mM? You need to say one way or the other.

      The difference is not statistically significant, and this has been added to the paper.

      (9) Figure 10C. You need to quantify these data.

      Quantification added as an extra panel.

      (10) Line 164. You include H-NS in the group of "positive effectors that control fim operon expression" and you reference Ecocyc, rather than any primary reference. Nowhere in the manuscript do you mention phase variation. In the speB mutant, you see decreased fimB, increased fimE, and decreased hns expression. My interpretation of the literature suggests that this would drive the fim switch to the off-state. This could certainly explain some of the results. It is also easily measurable with PCR. This might require testing cells scraped directly from the plates.

      The experiments were performed. There is no need to scrape cells from plates because the fimB result from RNAseq was from a liquid culture, and the prediction would be that the phase-locking should be evident in these cells.

      (11) Figure 10. Likewise, do you know that your hns mutant is not locked in the off-state? Granted, the original hns mutants (pilG) showed increased rates of switching, but growth conditions might matter.

      We also tested phase variation in the hns mutant, and the mutant was not phase-locked. This result is shown. In addition to growth conditions, the strain probably matters.

      (12) Line 342. You describe the total genome sequencing of W3110, yet this is not mentioned anywhere else in the manuscript.

      It is now

      Minor points:

      (13) Line 192. "One of the most differentially expressed genes...".

      (14) Line 202. "...implicates extracellular putrescine in putrescine homeostasis."

      (15) Line 209. "...potential pili regulators...".

      (16) You are using a variety of fonts on the figures. Pick one.

      (17) Figure 9A. It took me a few minutes to figure out the labeling for this figure and I was more confused after reading the legend. It would be simpler to independently label red triangles, blue triangles, red circles, and blue circles.

      (18) Figure 9B and 10. The reader can likely figure out what W3110_1.0_3 means, but more straightforward labeling would be better, or you need to define these labels.

      All points were addressed and fixed.

      Reviewer #3 (Recommendations for the authors):

      Other comments:

      (1) Please go through the figures and the reference to figures in the text, as they often do not refer to the right panel (ex: figures 2 and 7 for instance). In the text, please homogenize the reference to figures (Figure 2C vs Figure 3). To help compare motility experiments between figures, please use the same scale in all figures.

      This has been fixed.

      (2) Lines 65-70: I am not sure I get the reason behind choosing the W3110 strain from your lab stock. In what background were the initial mutants constructed (from l.64-65)? Were the nine strains tested, all variations of W3110? If so, is the phenotype described in the manuscript robust in all strains?

      We have provided more explanation. W3110 was the most stable: insertions that allowed flagella synthesis in the presence of glucose were frequent. We deleted the major flagella subunit for most experiments. Before introduction of the fliC deletion, we needed to perform experiments 10 times so that fast-moving variants, which had mutationally altered flagella synthesis, did not complicate results.

      (3) Line 82-84: As stated in the public review, I think more controls are needed before making this conclusion, especially as type I fimbriae are usually involved in sessile phenotypes.

      Response provided in the public response.

      (4) In Figure 3: Changing the order of the image to follow the text would make the figure easier to follow.

      Fixed as requested

      (5) Lines 100-101: simultaneous - the results presented here do not support this conclusion. In Figure 4b, the addition of putrescine to speB mutants is actually not different from WT. From the results, it seems like one of biosynthesis or transport is needed, but it's not clear if both are needed simultaneously. For this, a mutant with no biosynthesis and no transport is needed and/or completely non-motile mutants would be needed to compare.

      We disagree. If there are two pathways of putrescine synthesis and both are needed, then our conclusion follows.

      (6) Lines 104-105: '... because E. coli secretes putrescine.' - not sure why this statement is there, as most transporters tested after are importers of putrescine? It is also not clear to me if putrescine is supplemented in the media in these experiments. If not, is there putrescine in the GT media?

      Good points, and this section has been reworded to clarify these issues. Some of the material was moved to the discussion.

      (7) Line 109: 'We note that potE and plaP are more highly expressed than potE and puuP...' - first potE should be potF?

      This has been corrected.

      (8) Figure 8: What is the difference between the TEM images in Figure 1 and here? The WT in Figure 1 does show pili without the supplementation unless I'm missing something here. Please specify.

      The reviewer means Figure 2 and not Figure 1. Figure 2 shows a wild-type strain which has both putrescine anabolic pathways while Figure 8 is the ΔspeB strain which lacks one pathway.

      (9) Lines 160-162: Transcripts for the putrescine-responsive puuAP and puuDRCBE operons, which specify genes of the major putrescine catabolic pathway, were reduced from 1.6- to 14-fold (FDR ≤ 0.02) in the speB mutant (Supplemental Table 1), which implies lower intracellular putrescine. I might not get exactly the point here. If the catabolic pathways are repressed in the speB mutant, then there will be less degradation which means more putrescine!?

      Expression of these genes is a function of intracellular putrescine: higher expression means more putrescine. Any discussion of steady-state putrescine must include the anabolic pathways: the catabolic pathways do not determine the intracellular putrescine concentration; they are a reflection of it.

      (10) Lines 162-163: Deletion of speB reduced transcripts for genes of the fimA operon and fimE, but not of fimB. It seems that the results suggest the opposite a reduction of fimB but not fimE!?

      The reviewer is correct, and it is our mistake. The text now states what is in the figure.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      This manuscript presents an interesting exploration of the potential activation mechanisms of DLK following axonal injury. While the experiments are beautifully conducted and the data are solid, I feel that there is insufficient evidence to fully support the conclusions made by the authors.

      In this manuscript, the authors exclusively use the puc-lacZ reporter to determine the activation of DLK. This reporter has been shown to be induced when DLK is activated.

      However, there is insufficient evidence to confirm that the absence of reporter activation necessarily indicates that DLK is inactive. As with many MAP kinase pathways, the DLK pathway can be locally or globally activated in neurons, and the level of DLK activation may depend on the strength of the stimulation. This reporter might only reflect strong DLK activation and may not be turned on if DLK is weakly activated. The results presented in this manuscript support this interpretation. Strong stimulation, such as axotomy of all synaptic branches, caused robust DLK activation, as indicated by puc-lacZ expression. In contrast, weak stimulation, such as axotomy of some synaptic branches, resulted in weaker DLK activation, which did not induce the puc-lacZ reporter. This suggests that the strength of DLK activation depends on the severity of the injury rather than the presence of intact synapses. Given that this is a central conclusion of the study, it may be worthwhile to confirm this further. Alternatively, the authors may consider refining their conclusion to better align with the evidence presented.

      In Figure 1E we have replotted the puc-lacZ data to show comparisons between different injuries that leave different numbers of spared (or lost) boutons and branches.  We observed no differences between injuries that remove only a small fraction of boutons (injury location (a)) and injuries that remove nearly all of them (injury locations (b) and (c)) and uninjured neurons (Figure 1E). These observations argue against the interpretation that the strength of DLK activation (at least within the cell body) depends on the severity of injury. Rather, puc-lacZ induction appears to be bimodal. It is either induced (in various injuries that remove all synaptic boutons), or not induced, including in injuries that spared only a small fraction of the total boutons. We therefore think that the presence of a remaining synaptic connection rather than the extent of the injury per se is a major determinant of whether the cell body component of Wnd signaling can be activated. 

      The reviewer (and others) fairly point out that our current study focuses on puc-lacZ as a reporter of Wnd signaling in the cell body. We consider this to be a downstream integration of events in axons that are more challenging to detect. It is striking that this integration appears strongly sensitized to the presence of spared synaptic boutons. Examination of Wnd’s activation in axons and synapses is a goal for our future work.

      As noted by the authors, DLK has been implicated in both axon regeneration and degeneration. Following axotomy, DLK activation can lead to the degeneration of distal axons, where synapses are located. This raises an important question: how is DLK activated in distal axons? The authors might consider discussing the significance of this "synapse connection-dependent" DLK activation in the broader context of DLK function and activation mechanisms.

      While it has been noted that inhibition of DLK can mildly delay Wallerian degeneration (Miller et al., 2009), this does not appear to be the case for retinal ganglion cell axons following optic nerve crush (Fernandes et al., 2014). It is also not the case for Drosophila motoneurons and NMJ terminals following peripheral nerve injury (Xiong et al., 2012; Xiong and Collins, 2012). Instead, overexpression of Wnd or activation of Wnd by a conditioning injury leads to the opposite phenotype: an increase in resiliency to Wallerian degeneration for axons that have been previously injured (Xiong et al., 2012; Xiong and Collins, 2012). The downstream outcome of Wnd activation is highly dependent on the context; it may be an integration of the outcomes of local Wnd/DLK activation in axons with downstream consequences of nuclear/cell body signaling. The current study suggests some rules for the cell body signaling; however, how Wnd is regulated at synapses and why it promotes degeneration in some circumstances but not others are important future questions.

      Following the reviewer’s suggestion, it is interesting to consider DLK’s potential contributions to the loss of NMJ synapses in a mouse model of ALS (Le Pichon et al., 2017; Wlaschin et al., 2023). Our findings suggest that the synaptic terminal is an important locus of DLK regulation, while dysfunction of NMJ terminals is an important feature of the ‘dying back’ hypothesis of disease etiology (Dadon-Nachum et al., 2011; Verma et al., 2022). We propose that the regulation of DLK at synaptic terminals is an important area for future study, and may reveal how DLK might be modulated to curtail disease progression. Of note, DLK inhibitors are in clinical trials (Katz et al., 2022; Le et al., 2023; Siu et al., 2018), but at least some have been paused due to safety concerns (Katz et al., 2022). Further understanding of the mechanisms that regulate DLK is needed to determine whether and how DLK and its downstream signaling can be tuned for therapeutic benefit.

      Reviewer #2 (Public review):

      Summary:

      The authors study a panel of sparsely labeled neuronal lines in Drosophila that each form multiple synapses. Critically, each axonal branch can be injured without affecting the others, allowing the authors to differentiate between injuries that affect all axonal branches versus those that do not, creating spared branches. Axonal injuries are known to cause Wnd (mammalian DLK)-dependent retrograde signals to the cell body, culminating in a transcriptional response. This work identifies a fascinating new phenomenon that this injury response is not all-or-none. If even a single branch remains uninjured, the injury signal is not activated in the cell body. The authors rule out that this could be due to changes in the abundance of Wnd (perhaps if incrementally activated at each injured branch) by Hiw, Wnd's known negative regulator. Thus there is both a yet-undiscovered mechanism to regulate Wnd signaling, and more broadly a mechanism by which the neuron can integrate the degree of injury it has sustained. It will now be important to tease apart the mechanism(s) of this fascinating phenomenon. But even absent a clear mechanism, this is a new biology that will inform the interpretation of injury signaling studies across species.

      Strengths:

      (1) A conceptually beautiful series of experiments that reveal a fascinating new phenomenon is described, with clear implications (as the authors discuss in their Discussion) for injury signaling in mammals.

      (2) Suggests a new mode of Wnd regulation, independent of Hiw.

      Weaknesses:

      (1) The use of a somatic transcriptional reporter for Wnd activity is powerful; however, the reporter indicates whether the transcriptional response was activated, not whether the injury signal was received. It remains possible that Wnd is still activated in the case of a spared branch, but that this activation is either local within the axons (impossible to determine in the absence of a local reporter) or that the retrograde signal was indeed generated but it was somehow insufficient to activate transcription when it entered the cell body. This is more of a mechanistic detail and should not detract from the overall importance of the study.

      We agree. The puc-lacZ reporter tells us about signaling in the cell body, but whether and how Wnd is regulated in axons and synaptic branches, which we think occurs upstream of the cell body response, remains to be addressed in future studies.

      (2) That the protective effect of a spared branch is independent of Hiw, the known negative regulator of Wnd, is fascinating. But this leaves open a key question: what is the signal?

      This is indeed an important future question, and would still be a question even if Hiw were part of the protective mechanism by the spared synaptic branch. Our current hypothesis (outlined in Figure 4) is that regulation of Wnd is tied to the retrograde trafficking of a signaling organelle in axons. The Hiw-independent regulation complements other observations in the literature that multiple pathways regulate Wnd/DLK (Collins et al., 2006; Feoktistov and Herman, 2016; Klinedinst et al., 2013; Li et al., 2017; Russo and DiAntonio, 2019; Valakh et al., 2013). It is logical for this critical stress response pathway to have multiple modes of regulation that may act in parallel to tune and restrain its activation. 

      Reviewer #3 (Public review):

      Summary:

      This manuscript seeks to understand how nerve injury-induced signaling to the nucleus is influenced, and it establishes a new location where these principles can be studied. By identifying and mapping specific bifurcated neuronal innervations in the Drosophila larvae, and using laser axotomy to localize the injury, the authors find that sparing a branch of a complex muscular innervation is enough to impair Wallenda-puc (analogous to DLK-JNK-cJun) signaling that is known to promote regeneration. It is only when all connections to the target are disconnected that cJun-transcriptional activation occurs.

      Overall, this is a thorough and well-performed investigation of the mechanism of spared-branch influence on axon injury signaling. The findings on control of wnd are important because this is a very widely used injury signaling pathway across species and injury models. The authors present detailed and carefully executed experiments to support their conclusions. Their effort to identify the control mechanism is admirable and will be of aid to the field as they continue to try to understand how to promote better regeneration of axons.

      Strengths:

      The paper does a very comprehensive job of investigating this phenomenon at multiple locations and through both pinpoint laser injury as well as larger crush models. They identify a non-hiw based restraint mechanism of the wnd-puc signaling axis that presumably originates from the spared terminal. They also present a large list of tests they performed to identify the actual restraint mechanism from the spared branch, which has ruled out many of the most likely explanations. This is an extremely important set of information to report, to guide future investigators in this and other model organisms on mechanisms by which regeneration signaling is controlled (or not).

      Weaknesses:

      The weakest data presented by this manuscript is the study of the actual amounts of Wallenda protein in the axon. The authors argue that increased Wnd protein is being anterogradely delivered from the soma, but no support for this is given. Whether this change is due to transcription/translation, protein stability, transport, or other means is not investigated in this work. However, because this point is not central to the arguments in the paper, it is only a minor critique.

      We agree and are glad that the reviewer considers this a minor critique; this is an area for future study. In Supplemental Figure 1 we present differences in the levels of an ectopically expressed GFP-Wnd-kinase-dead transgene, which is strikingly increased in axons that have received a full but not partial axotomy. We suspect this accumulation occurs downstream of the cell body response because of the timing. We observed the accumulations after 24 hours (Figure S1F) but not at early (1-4 hour) time points following axotomy (data not shown). Further study of the local regulation of Wnd protein and its kinase activity in axons is an important future direction.

      As far as the scope of impact: because the conclusions of the paper are focused on a single (albeit well-validated) reporter in different types of motor neurons, it is hard to determine whether the mechanism of spared branch inhibition of regeneration requires wnd-puc (DLK/cJun) signaling in all contexts (for example, sensory axons or interneurons). Is the nerve-muscle connection the rule or the exception in terms of regeneration program activation?

      DLK signaling is strongly activated in DRG sensory neurons following peripheral nerve injury (Shin et al., 2012), despite the fact that sensory neurons have bifurcated axons and their projections in the dorsal spinal cord are not directly damaged by injuries to the peripheral nerve. Therefore, it is unlikely that protection by a spared synapse is a universal rule for all neuron types. However, the molecular mechanisms that underlie this regulation may indeed be shared across different types of neurons but utilized in different ways. For instance, nerve growth factor withdrawal can lead to activation of DLK (Ghosh et al., 2011); however, neurotrophins and their receptors are regulated and implemented differently in different cell types. We suspect that the restraint of Wnd signaling by the spared synaptic branch shares a common underlying mechanism with the restraint of DLK signaling by neurotrophin signaling. Further elucidation of the molecular mechanism is an important next step towards addressing this question.

      Because changes in puc-lacZ intensity are the major readout, it would be helpful to better explain the significance of the amount of puc-lacZ in the nucleus with respect to the activation of regeneration. Is it known that scaling up the amount of puc-lacZ transcription scales functional responses (regeneration or others)? The alternative would be that only a small amount of puc-lacZ is sufficient to efficiently induce relevant pathways (threshold response).

      While induction of puc-lacZ expression correlates with Wnd-mediated phenotypes, including sprouting of injured axons (Xiong et al., 2010), protection from Wallerian degeneration (Xiong et al., 2012; Xiong and Collins, 2012) and synaptic overgrowth (Collins et al., 2006), we have not observed any correlation between the degree of puc-lacZ induction (e.g., modest, medium or high) and the phenotypic outcomes (sprouting, overgrowth, etc.). Rather, there appears to be a striking all-or-none difference in whether puc-lacZ is induced or not induced. There may indeed be a threshold that can be restrained through multiple mechanisms. We posit in Figure 4 that restraint may take place in the cell body, where it can be influenced by the spared bifurcation.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      This is a beautiful study. Naturally, you're searching now for the underlying mechanism.

      A few questions:

      (1) At present you can not determine if the Wnd signal is never initiated (when a spared branch is present) or if it gets to the cell body but is incapable of activating the puckered reporter. Is there any optical reporter (JNK activation?) that could differentiate this?

      The reviewer is correct that a tool to detect local activity of JNK kinase in axons would be ideal for probing the mechanisms that underlie our observations. A FRET reporter for JNK kinase activity has been developed and utilized in cultured cells (Fosbrink et al., 2010). It would be interesting to implement this reporter in Drosophila; it would need to be sensitive enough to visualize JNK activity in single Drosophila axons. We have previously noted Wnd-dependent phosphorylated JNK in the cell body of injured motoneurons following nerve crush (Xiong et al., 2010). However, anti-pJNK antibodies detect what appears to be a constitutive signal in uninjured axons that does not appear to be influenced by activation or inhibition of Wnd (Xiong et al., 2010).

      (2) What happens when you injure the axon in a dSarm KO? This is more of a curiosity, not a necessity, but is it the axon dying or the detection of the injury itself?

      We have tested whether overexpression of Nmnat or the WldS transgene, which inhibit Wallerian degeneration of injured axons, affect the induction of puc-lacZ following nerve injury. This manipulation has no effect on puc-lacZ expression in uninjured animals, and also has no effect on the induction of puc-lacZ following peripheral nerve crush (TJ Waller, personal communication).

      (3) Are Wnd rescue experiments possible in this context? Would be an interesting place to do Wnd structure-function and compare it to the synaptic work.

      This is not possible with current reagents. Expression of wild type wnd cDNA under the Gal4/UAS promoter leads to strong induction of puc-lacZ in uninjured animals, even when weak Gal4 driver lines are used (Xiong et al., 2012, 2010). Similar observations of constitutively active signaling have been made in expression studies of DLK in mammalian cells (Hao et al., 2016; Huntwork-Rodriguez et al., 2013; Nihalani et al., 2000; and data not shown). These and other observations suggest that the levels of Wnd/DLK protein are tightly controlled by posttranscriptional mechanisms. Delineation of sequences within Wnd/DLK that are required for its regulation would be helpful for addressing this question.

      This will be required reading in my lab.

      That is an honor. We look forward to help from the field to understand how and why this pathway is restrained at synapses. Your students may bring new ideas to the table.

      Reviewer #3 (Recommendations for the authors):

      Piezo is spelled incorrectly in the supplemental table in multiple places.

      Thank you for pointing this out! We have made the correction.

      References cited (in rebuttal)

      Collins CA, Wairkar YP, Johnson SL, DiAntonio A. 2006. Highwire restrains synaptic growth by attenuating a MAP kinase signal. Neuron 51:57–69.

      Dadon-Nachum M, Melamed E, Offen D. 2011. The “dying-back” phenomenon of motor neurons in ALS. J Mol Neurosci 43:470–477.

      Feoktistov AI, Herman TG. 2016. Wallenda/DLK protein levels are temporally downregulated by Tramtrack69 to allow R7 growth cones to become stationary boutons. Development 143:2983–2993.

      Fernandes KA, Harder JM, John SW, Shrager P, Libby RT. 2014. DLK-dependent signaling is important for somal but not axonal degeneration of retinal ganglion cells following axonal injury. Neurobiol Dis 69:108–116.

      Ghosh AS, Wang B, Pozniak CD, Chen M, Watts RJ, Lewcock JW. 2011. DLK induces developmental neuronal degeneration via selective regulation of proapoptotic JNK activity. J Cell Biol 194:751–764.

      Hao Y, Frey E, Yoon C, Wong H, Nestorovski D, Holzman LB, Giger RJ, DiAntonio A, Collins C. 2016. An evolutionarily conserved mechanism for cAMP elicited axonal regeneration involves direct activation of the dual leucine zipper kinase DLK. Elife 5. doi:10.7554/eLife.14048

      Huntwork-Rodriguez S, Wang B, Watkins T, Ghosh AS, Pozniak CD, Bustos D, Newton K, Kirkpatrick DS, Lewcock JW. 2013. JNK-mediated phosphorylation of DLK suppresses its ubiquitination to promote neuronal apoptosis. J Cell Biol 202:747–763.

      Katz JS, Rothstein JD, Cudkowicz ME, Genge A, Oskarsson B, Hains AB, Chen C, Galanter J, Burgess BL, Cho W, Kerchner GA, Yeh FL, Ghosh AS, Cheeti S, Brooks L, Honigberg L, Couch JA, Rothenberg ME, Brunstein F, Sharma KR, van den Berg L, Berry JD, Glass JD. 2022. A Phase 1 study of GDC-0134, a dual leucine zipper kinase inhibitor, in ALS. Ann Clin Transl Neurol 9:50–66.

      Klinedinst S, Wang X, Xiong X, Haenfler JM, Collins CA. 2013. Independent pathways downstream of the Wnd/DLK MAPKKK regulate synaptic structure, axonal transport, and injury signaling. J Neurosci 33:12764–12778.

      Le K, Soth MJ, Cross JB, Liu G, Ray WJ, Ma J, Goodwani SG, Acton PJ, Buggia-Prevot V, Akkermans O, Barker J, Conner ML, Jiang Y, Liu Z, McEwan P, Warner-Schmidt J, Xu A, Zebisch M, Heijnen CJ, Abrahams B, Jones P. 2023. Discovery of IACS-52825, a potent and selective DLK inhibitor for treatment of chemotherapy-induced peripheral neuropathy. J Med Chem 66:9954–9971.

      Le Pichon CE, Meilandt WJ, Dominguez S, Solanoy H, Lin H, Ngu H, Gogineni A, Sengupta Ghosh A, Jiang Z, Lee S-H, Maloney J, Gandham VD, Pozniak CD, Wang B, Lee S, Siu M, Patel S, Modrusan Z, Liu X, Rudhard Y, Baca M, Gustafson A, Kaminker J, Carano RAD, Huang EJ, Foreman O, Weimer R, Scearce-Levie K, Lewcock JW. 2017. Loss of dual leucine zipper kinase signaling is protective in animal models of neurodegenerative disease. Sci Transl Med 9. doi:10.1126/scitranslmed.aag0394

      Li J, Zhang YV, Asghari Adib E, Stanchev DT, Xiong X, Klinedinst S, Soppina P, Jahn TR, Hume RI, Rasse TM, Collins CA. 2017. Restraint of presynaptic protein levels by Wnd/DLK signaling mediates synaptic defects associated with the kinesin-3 motor Unc-104. Elife 6. doi:10.7554/eLife.24271

      Miller BR, Press C, Daniels RW, Sasaki Y, Milbrandt J, DiAntonio A. 2009. A dual leucine kinase-dependent axon self-destruction program promotes Wallerian degeneration. Nat Neurosci 12:387–389.

      Nihalani D, Merritt S, Holzman LB. 2000. Identification of structural and functional domains in mixed lineage kinase dual leucine zipper-bearing kinase required for complex formation and stress-activated protein kinase activation. J Biol Chem 275:7273–7279.

      Russo A, DiAntonio A. 2019. Wnd/DLK is a critical target of FMRP responsible for neurodevelopmental and behavior defects in the Drosophila model of fragile X syndrome. Cell Rep 28:2581–2593.e5.

      Shin JE, Cho Y, Beirowski B, Milbrandt J, Cavalli V, DiAntonio A. 2012. Dual leucine zipper kinase is required for retrograde injury signaling and axonal regeneration. Neuron 74:1015–1022.

      Siu M, Sengupta Ghosh A, Lewcock JW. 2018. Dual Leucine Zipper Kinase Inhibitors for the Treatment of Neurodegeneration. J Med Chem 61:8078–8087.

      Valakh V, Walker LJ, Skeath JB, DiAntonio A. 2013. Loss of the spectraplakin short stop activates the DLK injury response pathway in Drosophila. J Neurosci 33:17863–17873.

      Verma S, Khurana S, Vats A, Sahu B, Ganguly NK, Chakraborti P, Gourie-Devi M, Taneja V. 2022. Neuromuscular junction dysfunction in amyotrophic lateral sclerosis. Mol Neurobiol 59:1502–1527.

      Wlaschin JJ, Donahue C, Gluski J, Osborne JF, Ramos LM, Silberberg H, Le Pichon CE. 2023. Promoting regeneration while blocking cell death preserves motor neuron function in a model of ALS. Brain 146:2016–2028.

      Xiong X, Collins CA. 2012. A conditioning lesion protects axons from degeneration via the Wallenda/DLK MAP kinase signaling cascade. J Neurosci 32:610–615.

      Xiong X, Hao Y, Sun K, Li J, Li X, Mishra B, Soppina P, Wu C, Hume RI, Collins CA. 2012. The Highwire ubiquitin ligase promotes axonal degeneration by tuning levels of Nmnat protein. PLoS Biol 10:e1001440.

      Xiong X, Wang X, Ewanek R, Bhat P, Diantonio A, Collins CA. 2010. Protein turnover of the Wallenda/DLK kinase regulates a retrograde response to axonal injury. J Cell Biol 191:211–223.

    1. Costa, E.; Ferezin, N. B. ESG (Environmental, Social and Corporate Governance) e a comunicação: o tripé da sustentabilidade aplicado às organizações globalizadas. Revista Alterjor, v. 24, n. 2, 79-95, 2021.

      Dourado, I. P.; Marques, A. O tripé da sustentabilidade brasileira: desafios históricos na luta ambiental, compromissos políticos e coletivos na educação ambiental. Rev. Gesto e Debate, v. 7, n. 1, 2023.

      Fonseca, S. A.; Martins, P. S. Gestão ambiental: uma súplica do planeta, um desafio para políticas públicas, incubadoras e pequenas empresas. Produção, v. 20, n. 4, out./dez., p. 538-548, 2010.

      Lima, L. A. de O. et al. Sustainable Management Practices: Green Marketing as A Source for Organizational Competitive Advantage. Revista de Gestão Social e Ambiental, São Paulo (SP), v. 18, n. 4, 2024. DOI: 10.24857/rgsa.v18n4-087.

      Lima, L. A. de O. et al. The Influence of Green Marketing on Consumer Purchase Intention: a Systematic Review. Revista de Gestão Social e Ambiental, São Paulo (SP), v. 18, n. 3, p. e05249, 2024. DOI: 10.24857/rgsa.v18n3-084.

      Machado, P. K. O.; Checon, B. Q. Análise do cumprimento de critérios de governança corporativa por empresas ditas como Ambiental, Social e de Governança. FGV RIC Revista de Iniciação Científica, v. 4, n. 1, 2023.

      Mecca, M. S. et al. Sustentabilidade e ESG (Environmental, social and governance): estudo das operações turísticas de uma pousada na serra gaúcha. Tur., Visão e Ação, v25, n3, p425-444, Set./Dez. 2023.

      Mendes, L. S. Saber Ambiental: Sustentabilidade, Racionalidade, Complexidade, Poder. Revista Tocantinense de Geografia, [S. l.], v. 11, n. 23, p. 234–240, 2022.

      Santos, E. H.; Silva, M. A. Sustentabilidade empresarial: um novo modelo de negócio. Revista Ciência Contemporânea, jun./dez., v.2, n.1, p. 75-94, 2017.

      Silva, H. M. M. A sustentabilidade como vantagem competitiva: um olhar sobre o tripé da sustentabilidade. Revista Multidisciplinar de Educação e Meio Ambiente, v. 2, n. 3, 2021.

      There appears to be a good number of sources used, all of which are relevant to the article. There seems to be a good mix of references and original research.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Manuscript number: RC-2024-02713

      Corresponding author(s): Igor Kramnik

      1. General Statements [optional]

      Dear Editors,

      We are grateful for the reviewers’ constructive comments and criticisms and have thoroughly addressed all major and minor comments in the revised manuscript.

      Summary of new data.

      We have performed the following additional experiments to support our concept:

      1. The kinetics of ROS production in B6 and B6.Sst1S macrophages after TNF stimulation (Fig.3I and J, Suppl.Fig.3G);
      2. Time course of stress kinase activation (Fig.3K) that clearly demonstrated the persistent stress kinase (phospho-ASK1 and phospho-cJUN) activation exclusively in the B6.Sst1S macrophages;
      3. New Fig.4 C – E panels include comparisons of the B6 and B6.Sst1S macrophage responses to TNF and effects of IFNAR1 blockade in both backgrounds.
      4. We performed new experiments demonstrating that the synthesis of lipid peroxidation products (LPO) occurs in TNF-stimulated macrophages earlier than the IFNβ super-induction (Suppl.Fig.4A and B).
      5. We demonstrated that the IFNAR1 blockade 12, 24 and 32 h after TNF stimulation still reduced the accumulation of LPO product (4-HNE) in TNF-stimulated B6.Sst1S BMDMs (Suppl.Fig.4 E – G).
      6. We added comparison of cMyc expression between the wild type B6 and B6.Sst1S BMDMs during TNF stimulation for 6 – 24 h (Fig.5I–J).
      7. New data comparing 4-HNE levels in Mtb-infected B6 wild type and B6.Sst1S macrophages and quantification of replicating Mtb were added (Fig.6B, Suppl.Fig.7C and D).
      8. In vivo data described in Fig.7 were thoroughly revised and new data were included. We demonstrated increased 4-HNE loads in multibacillary lesions (Fig.7A, Suppl.Fig.9A) and the 4-HNE accumulation in CD11b+ myeloid cells (Fig.7B and Suppl.Fig.9B). We demonstrated that the Ifnb-expressing cells are activated iNOS+ macrophages (Fig.7D and Suppl.Fig.13A). Using new fluorescent multiplex IHC, we have shown that stress markers phospho-cJun and Chac1 in TB lesions are expressed by Ifnb- and iNOS-expressing macrophages (Fig.7E and Suppl.Fig.13D – F).
      9. We performed an additional experiment to demonstrate that naïve (non-BCG vaccinated) lymphocytes did not improve Mtb control by Mtb-infected macrophages, in agreement with previously published data (Suppl.Fig.7H).

      Summary of updates

      Following the reviewers' requests, we updated figures to include isotype control antibodies, effects of inhibitors on non-stimulated cells, positive and negative controls for labile iron pool, additional images of 4-HNE and live/dead cell staining.

      Isotype controls for IFNAR1 blockade were included in Fig.3M, Fig.4C–E, Fig.6L–M, Suppl.Fig.4F–G and 7I.

      Positive and negative controls for labile iron pool measurements were added to Fig.3E, Fig.5D, Suppl.Fig.3B

      Cell death staining images were added to Suppl.Fig.3H.

      Co-staining of 4-HNE with tubulin was added to Suppl.Fig.3A.

      High magnification images for Figure 7 were added in Suppl.Fig.8 to demonstrate paucibacillary and multibacillary image classification.

      Single-channel color images for individual markers were provided in Fig.7E and Suppl.Fig.13B–F.

      Inhibitor effects on non-stimulated cells were included in Fig.5 D – H, Suppl.Fig.6A and B.

      Titration of CSF1R inhibitors for non-toxic concentration determination is included in Suppl.Fig.6D.

      In addition, we updated the figure legends in the revised manuscript to include more details about the experiments. We also clarified our conclusions in the Discussion.

      Responses to every major and minor comment of the reviewers are provided below.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      The study by Yabaji et al. examines macrophage phenotypes in B6.Sst1S mice, a mouse strain with increased susceptibility to M. tuberculosis infection that develops necrotic lung lesions. Extending previous work, the authors specifically focus on delineating the molecular mechanisms driving aberrant oxidative stress in TNF-activated B6.Sst1S macrophages that has been associated with impaired control of M. tuberculosis. The authors use scRNAseq of bone marrow-derived macrophages to further characterize distinctions between B6.Sst1S and control macrophages and ascribe distinct trajectories upon TNF stimulation. Combined with results using inhibitory antibodies and small molecule inhibitors in in vitro experimentation, the authors propose that TNF-induced protracted c-Myc expression in B6.Sst1S macrophages disables the cellular defense against oxidative stress, which promotes intracellular accumulation of lipid peroxidation products, fueled at least in part by overexpression of type I IFNs by these cells. Using lung tissue sections from M. tuberculosis-infected B6.Sst1S mice, the authors suggest that the presence of a greater number of cells with lipid peroxidation products in lung lesions with high counts of stained M. tuberculosis are indicative of progressive loss of host control due to the TNF-induced dysregulation of macrophage responses to oxidative stress. In patients with active tuberculosis disease, the authors suggest that peripheral blood gene expression indicative of increased Myc activity was associated with treatment failure.

      Major comments

      The authors describe differences in protein expression, phosphorylation or binding when referring to Fig 2A-C, 2G, 3D, 5B, 5C. However, such differences are not easily apparent or very subtle and, in some cases, confounded by differences in resting cells (e.g. pASK1 Fig 3L; c-Myc Fig 5B) as well as analyses across separate gels/blots (e.g. Fig 3K, Fig 5B). Quantitative analyses across different independent experiments with adequate statistical analyses are required to strengthen the associated conclusions.

      Author: We updated our Western blots as follows. Densitometry of normalized bands is included above each lane (Fig.2A – C; Fig.3C – D and 3K; Fig.4A – B; Fig.5B,C,I,J). New data in Fig.3K were added to highlight differences between B6 and B6.Sst1S at individual timepoints after TNF stimulation. In Fig.5I we added new data comparing Myc levels in B6 and B6.Sst1S with and without JNK inhibitor and updated the results accordingly. New Fig.3K clearly demonstrates the persistent activation of p-cJun and p-Ask1 at 24 and 36 h of TNF stimulation. In Fig.5B we clearly demonstrate that Myc levels were higher in B6.Sst1S after 12 h of TNF stimulation. At 6 h, however, basal Myc levels are consistently higher in B6.Sst1S, and the induction by TNF is similar (1.6-fold) in both backgrounds. We noted this in the text.

      A representative experiment is shown in individual panels and the corresponding figure legend contains information on number of biological repeats. Each Western blot was repeated 2 – 4 times.

      The representative images of fluorescence microscopy in Fig 3H, 4H, 5H, S3C, S3I, S5A, S6A seem to suggest that under some conditions the fluorescence signal is located just around the nucleus rather than absent or diminished from the cytoplasm. It is unclear whether this reflects selective translocation of targets across the cell, morphological changes of macrophages in culture in response to the various treatments, or variations in focal point at which images were acquired. Control images (e.g. cellular actin, DIC) should be included for clarification. If cell morphology changes depending on treatments, how was this accounted for in the quantitative analyses? In addition, negative controls validating specificity of fluorescence signals would be warranted.

      Author: Our conclusion of higher LPO production is based on several parameters: 4-HNE staining, measurements of MDA in cell lysates and oxidized lipids using BODIPY C11. Taken together they demonstrate a significant and reproducible increase in LPO accumulation in TNF-stimulated B6.Sst1S macrophages. This excludes an imaging artefact related to the unequal 4-HNE distribution noted by the reviewer. In fact, we also noted that the 4-HNE was spread within the cell body of B6.Sst1S macrophages and confirmed it using co-staining with tubulin, as suggested by the reviewer (new Suppl.Fig.3A). Since low molecular weight LPO products, such as MDA and 4-HNE, traverse cell membranes, it is unlikely that they will be strictly localized to a specific membrane-bound compartment. However, we agree that at lower concentrations, there might be some restricted localization, explaining a visible perinuclear ring of 4-HNE staining in B6 macrophages. This phenomenon may be explained simply by thicker cytoplasm surrounding the nucleus in activated macrophages spread on an adherent plastic surface, or by proximity to specific organelles involved in generation or clearance of LPO products, and definitely warrants further investigation.

      We also included images of non-stimulated cells in Fig.3H, Suppl.Fig.3A and 3E. We used multiple fields for imaging and quantified fluorescence signals (Suppl. Fig.3D and 3F, Suppl.Fig.4G, Suppl.Fig.6A and B).

      We used negative controls without primary antibodies for the initial staining optimization, but did not include them in every experiment.

      To interpret the evaluation on the hierarchy of molecular mechanisms in B6.Sst1S macrophages, comparative analyses with B6 control cells should be included (e.g. Fig 4C-I, Fig 5, Fig 6B, E-M, S6C, S6E-F). This will provide weight to the conclusions that the dysregulated processes are specifically associated with the susceptibility of B6.Sst1S macrophages.

      Author: Understanding the sst1-mediated effects on macrophage activation is the focus of our previously published study (Bhattacharya et al., JCI, 2021) and of this manuscript. The data comparing B6 and B6.Sst1S macrophages are presented in Fig.1, Fig.2, Fig.3, Fig.4, Fig.5A – C, I and J, Fig.6A – C, 6J and the corresponding supplemental figures 1, 2, 3, 4A and B, Suppl.Fig.5, Suppl.Fig.6C, Suppl.Fig.7A-D,7F.

      Once we identified the aberrantly activated pathways in the B6.Sst1S, we used specific inhibitors to correct the aberrant response in B6.Sst1S.

      All experiments using inhibitory antibodies require comparison to the effect of a matched isotype control in the same experiment (e.g. Fig 3J, 4F, G, I; 6L, 6M, S3G, S6F).

      Author: Isotype controls for IFNAR1 blockade were included in Fig.3M, Fig.4C-E, Fig.6L-M, Suppl.Fig.4F-G and Suppl.Fig.7I.

      Experiments using inhibitors require inclusion of an inhibitor-only control to assess inhibitor effects on unstimulated cells (e.g. Fig 4I, 5D-I)

      Author: Inhibitor effects on non-stimulated cells were included in Fig.5 D – H, Suppl.Fig.6A and B.

      Fig 3K and Fig 5J appear to contain the same images for p-c-Jun and b-tubulin blots.

      Author: Fig.3K and 5J partially overlapped but had a different focus. Fig.3K has been updated to reflect the time course of stress kinase activation. Fig.5J has been updated (currently Fig.5I and J) to display B6 and B6.Sst1S macrophage data, including cMyc and p-cJun levels.

      Data of TNF-treated cells in Fig 3I appear to be replotted in Fig 3J.

      Author: Currently these data are presented in Fig.3L and 3M and have been updated to include a comparison of B6 and B6.Sst1S cells (Fig.3L) and the effects of inhibitors (Fig.3M).

      Rev.1: It is stated that lungs from 2 mice with paucibacillary and 2 mice with multibacillary lesions were analysed. There is contradicting information on whether these tissues were collected at the same time post infection (week 14?) or whether the paucibacillary lesions were in lungs collected at earlier time points post infection (see Fig S8A). If the former, how do the authors conclude that multibacillary lesions are a progression from paucibacillary lesions and indicative of loss of M. tuberculosis control, especially if only one lesion type is observed in an individual host? If the latter, comparison between lesions will likely be dominated by temporal differences in the immune response to infection. In either case, it is relevant to consider density, location, and cellular composition of lesions (see also comments on GeoMx spatial profiling). Is the macrophage number/density per tissue area comparable between paucibacillary and multibacillary lesions?

      Author: We did not collect lungs at the same time point. As described in greater detail in our preprints (Yabaji et al., https://doi.org/10.1101/2025.02.28.640830 and https://doi.org/10.1101/2023.10.17.562695), pulmonary TB lesions in our model of slow TB progression are heterogeneous between animals at the same timepoint, as observed in human TB patients and other chronic TB animal models. Therefore, we perform analyses of individual TB lesions that are classified by a certified veterinary pathologist in a blinded manner based on their morphology (H&E) and acid-fast staining of the bacteria, as depicted in Suppl.Fig.8. Currently it is impossible to monitor the progression of individual lesions in mice. However, in mice TB is a progressive disease, and no healing or recovery from the disease has been observed in our studies or reported in the literature. Therefore, we assumed that paucibacillary lesions preceded the multibacillary ones, and not vice versa, thus reflecting disease progression. In our opinion, this conclusion most likely reflects the natural course of the disease. However, we edited the text: instead of disease progression, we now refer to paucibacillary and multibacillary lesions.

      Rev1: Does 4HNE staining align with macrophages and if so, is it elevated compared to control mice and driven by TNF in the susceptible vs more resistant mice?

      Author: We performed additional staining and analyses to demonstrate 4-HNE accumulation in CD11b+ myeloid cells of macrophage morphology. Non-necrotic lesions contain a negligible proportion of neutrophils (Fig.7B, Suppl.Fig.9B). B6 mice do not develop advanced multibacillary TB lesions containing 4-HNE+ cells. Also, 4-HNE staining was localized to TB lesions and was not found in uninvolved lung areas of the infected mice, as shown in Suppl.Fig.9A (left panel).

      It is well established that TNF plays a central role in the formation and maintenance of TB granulomas in humans and in all animal models. Therefore, TNF neutralization would lead to rapid TB progression, rapid Mtb growth and lesion destruction in both the B6 and B6.Sst1S genetic backgrounds.

      Pathway analysis of the spatial transcriptomic data (Suppl.Fig.11) identified TNF signaling via NF-kB among the dominant pathways upregulated in multibacillary lesions, suggesting that the 4-HNE accumulation paralleled increased TNF signaling. In addition, other cytokines in vivo, including IFN-I, could activate macrophages, stimulate the production of reactive oxygen and nitrogen species, and lead to the accumulation of LPO products, as shown in this manuscript.

      Rev.1: It would be relevant to state how many independent lesions per host were sampled in both the multiplex IHC as well as the GeoMx data. Can the authors show the selected regions of interest in the tissue overview and in the analyses to appreciate within-host and across-host heterogeneity of lesions. The nature of the spatial transcriptomics platform used is such that the data are derived from tissue areas that contain more than just Iba1+ macrophages. At later stages of infection, the cellular composition of such macrophage-rich areas will be different when compared to lesions earlier in the infection process. Hence, gene expression profiles and differences between tissue regions cannot be attributed to macrophages in this tissue region but are more likely a reflection of a mix of cellular composition and per-cell gene expression.

      Author: We used Iba1 staining to identify macrophages in TB lesions and programmed the GeoMx instrument to collect spatial transcriptomics probes from Iba1+ cells within ROIs. Also, we selected regions of interest (ROI) avoiding necrotic areas (depicted in Suppl.Fig.10). We agree that the Iba1+ macrophage population is heterogeneous: some Iba1+ cells are activated iNOS+ macrophages, others are iNOS-negative (Fig.7C and D, and Suppl.Fig.13A). Multibacillary lesions contain larger areas occupied by activated (iNOS+) macrophages (Fig.7D, Suppl.Fig.13B and 13F). Although the GeoMx spatial transcriptomic platform does not provide single-cell resolution, it allowed us to compare populations of Iba1+ cells in paucibacillary and multibacillary TB lesions and to identify a shift in their overall activation pattern.

      It is stated that loss of control of M. tuberculosis in multibacillary lesions was associated with "downregulation of IFNg-inducible genes". If the authors base this on the tissue expression of individual genes, this requires further investigation to support such conclusion (also see comment on GeoMx above). Furthermore, how might this conclusion be compatible with significantly elevated iNOS+ cells (Fig 7D) in multibacillary lesions?

      Author: We demonstrated that Ciita gene expression is specifically induced by IFN-gamma and is suppressed by IFN-I (Fig.6M). The expression of Ciita in paucibacillary lesions suggests the presence of IFN-gamma-activated cells, and its disappearance in multibacillary lesions is consistent with massive activation of the IFN-I pathway (Fig.7C).

      Rev1. It is appreciated that the human blood signature analyses contain Myc-signatures but the association with treatment failure is not very strong based on the data in Fig 13B and C (Suppl.Fig.15B and C now). The authors indicate that they have no information on disease severity, but it should perhaps not be assumed that treatment failure is indicative of poor host control of the infection. Perhaps independent analyses in a separate cohort/data set can add strength and provide additional insights (e.g. PMID: 35841871; PMID: 32451443, PMID: 17205474, PMID: 22872737). In addition, the human data analyses could be strengthened by extension to additional signatures such as IFN, TNF, oxidative stress. Details of the human study design are not very clear and are lacking patient demographics, site of disease, time of blood collection relative to treatment onset, approving ethics committees.

      Author: The X axis of Suppl.Fig.15A represents pre-defined molecular signature gene sets (MSigDB) in the Gene Set Enrichment Analysis (GSEA) database (https://www.gsea-msigdb.org/gsea/msigdb). The Y axis shows the area under the curve (AUC) score for each gene set. The Myc upregulated gene set myc_up was identified among the top gene sets associated with treatment failure using the unbiased ssGSEA algorithm. The upregulation of the Myc pathway in the blood transcriptome associated with TB treatment failure most likely reflects a greater proportion of immature cells in peripheral blood, possibly due to increased myelopoiesis.

      Pathway analysis of the differentially expressed genes revealed that treatment failures were associated with the following pathways relevant to this study: NF-kB Signaling, Flt3 Signaling in Hematopoietic Progenitor Cells (indicative of common myeloid progenitor cell proliferation), SAPK/JNK Signaling and Senescence (indicative of oxidative stress). The upregulation of these pathways in human patients with poor TB treatment outcomes correlates with our findings in TB-susceptible mice. The detailed analysis of differentially regulated pathways in human TB patients is beyond the scope of this study and is presented in another manuscript entitled “Tuberculosis risk signatures and differential gene expression predict individuals who fail treatment” by Arthur VanValkenburg et al., submitted for publication.

      Blood for PBMC gene expression profiling of TB patients was collected prior to TB treatment or within the first week of treatment commencement. Suppl.Fig.15A shows a boxplot of bootstrapped ssGSEA enrichment AUC scores from several oncogene signatures ranked from lowest to highest AUC score, with the myc_up and myc_dn gene sets highlighted in red.
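
      As a rough illustration of what a bootstrapped AUC score for a single gene set means, the following minimal sketch computes the AUC from per-sample enrichment scores and resamples patients with replacement. It is not the pipeline used in the study: the function names, the input scores and the number of bootstrap rounds are all hypothetical.

```python
import random


def auc(scores_fail, scores_cure):
    """AUC as the probability that a treatment-failure sample has a higher
    enrichment score than a cured sample (ties count as 0.5)."""
    wins = 0.0
    for f in scores_fail:
        for c in scores_cure:
            if f > c:
                wins += 1.0
            elif f == c:
                wins += 0.5
    return wins / (len(scores_fail) * len(scores_cure))


def bootstrap_auc(scores_fail, scores_cure, n_boot=1000, seed=0):
    """Resample each outcome group with replacement and recompute the AUC.
    The returned list is the kind of distribution a boxplot would summarize."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_boot):
        fail = [rng.choice(scores_fail) for _ in scores_fail]
        cure = [rng.choice(scores_cure) for _ in scores_cure]
        results.append(auc(fail, cure))
    return results


# Hypothetical per-sample enrichment scores for one gene set (e.g. myc_up).
failures = [0.42, 0.55, 0.61, 0.38]
cures = [0.30, 0.44, 0.35, 0.29, 0.41]
distribution = bootstrap_auc(failures, cures, n_boot=200)
print(min(distribution), max(distribution))
```

      An AUC near 0.5 would indicate that the gene set does not separate the outcome groups, while the spread of the bootstrap distribution gives a sense of how stable the separation is for a cohort of this size.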

      We agree with the reviewer that not every gene in the myc_up gene set correlates with the treatment outcome. But the association of the gene set is statistically significant, as presented in Suppl.Fig.15B – C.

      We updated the details of the study, including study sites and the ethics committee approval statement and references describing these cohorts.

      Other comments

      It is excellent that the authors provide individual data points. Choosing a colour other than black would increase clarity when black bars are used.

      Author: We followed this useful suggestion and selected consistent color codes for B6 and B6.Sst1S groups to enhance clarity throughout the revised manuscript.

      Error bars are inconsistently depicted as either bi-directional or just unidirectional.

      Author: We used bi-directional error bars in the revised manuscript.

      Fig 1E, G, H- please include a scale to clarify what the heat map is representing.

      Author: We have included the expression key in Fig.1E,G and H and Suppl.Fig.1C and D in the revised version.

      Fig 2K, Fig S10A gene information cannot be deciphered.

      Author: We increased the font in previous Fig.2K and moved to supplement to keep larger fonts (current Suppl.Fig.2G).

      Fig S4A,B please add error bars.

      Author: These data are presented as Suppl.Fig.5 in the revised version. We performed one experiment to test the hypothesis. Because the data indicated no clear increase in transposon small RNAs in the sst1S macrophages, we did not pursue this hypothesis further, and therefore error bars were not included. However, we decided to include these negative data because they reject a very attractive and plausible hypothesis.

      Please use gene names as per convention (e.g. Ifnb1) to distinguish gene expression from protein expression in figures and text.

      Author: We addressed the comment in the revised manuscript.

      Fig S8B. Contrary to the description of results, there seems to be minimal overlap between the signal for YFP and the Ifnb1 probe. Is the Ifnb1 reporter mouse a legacy reporter? If so, it is worth stating this and including such considerations in the data interpretation.

      Author: The YFP reporter expresses YFP protein under the control of the Ifnb1 promoter. The YFP protein accumulates within the cells, while Ifnb protein is rapidly secreted and does not accumulate in the producing cells in appreciable amounts. So YFP is not a lineage tracing reporter, but its accumulation marks Ifnb1 promoter activity in cells, although the YFP protein half-life is longer than that of the Ifnb1 mRNA, which is rapidly degraded (Witt et al., BioRxiv, 2024; doi:10.1101/2024.08.28.61018). Therefore, there is no precise spatiotemporal coincidence of these readouts.

      Please clarify what is meant by "normal interstitium" ? If the tissue is from uninfected mice, please state clearly.

      Author: In this context we refer to the uninvolved lung areas of the infected lungs. In every sample we compare uninvolved lung areas and TB lesions of the same animal. Also, we performed staining of lungs of non-infected mice as additional controls.

      Rev1: If macrophage cultures underwent media changes every 48h, how was loss of liberated Mtb taken into account especially if differences in cell density/survival were noted? The assessment of M. tuberculosis load by qPCR is not well described. In particular, the method of normalization applied within the experiments (not within the qPCR) here remains unclear, even with reference to the authors' prior publication.

      Author: Our lab has many years of experience working with macrophage monolayers infected with virulent Mtb and uses optimized protocols to avoid cell losses and related artifacts. Recently we published a detailed protocol for this methodology in STAR Protocols (Yabaji et al., 2022; PMID 35310069). In brief, it includes preparation of single cell suspensions of Mtb by filtration to remove clumps, use of low multiplicity of infection, preparation of healthy confluent monolayers and use of nutrient rich culture medium and medium change every 2 days. We also rigorously control for cell loss using whole well imaging and quantification of cell numbers and live/dead staining.

      Please add citation for the limma package.

      Author: The reference has been added (Ritchie et al., NAR 2015; PMID 25605792).

      The description of methodology relating to the "oncogene signatures" is unclear.

      Author: This signature was described in Bild et al., Nature, 2006 and in McQuerry JA, et al., 2019, “Pathway activity profiling of growth factor receptor network and stemness pathways differentiates metaplastic breast cancer histological subtypes”, BMC Cancer 19: 881, and is cited in the Methods section “Oncogene signatures”.

      Please clearly state time points post infection for mouse analyses.

      Author: We collected lung samples from Mtb infected mice 12 – 20 weeks post infection. The lesions were heterogeneous and were individually classified using criteria described above.

      Reference is made to "a list of genes unique to type I [interferon] genes [....]" (p29). Can the authors indicate the source of the information used for compiling this list?

      Author: The lists were compiled from Reactome, EMBL's European Bioinformatics Institute and GSEA databases. The links for all datasets are provided in Suppl.Table 8 “Expression of IFN pathway genes in Iba1+ cells from pauci- and multi-bacillary lesions of Mtb infected B6.Sst1S mouse lungs” in the “Pool IFN I & II gene sets” worksheet.

      The discussion at present is very long, contains repetition of results and meanders on occasion.

      Author: Thank you for this suggestion. We critically revised the text for brevity and clarity.

      Reviewer #1 (Significance (Required)):

      Strengths and limitations

      Strengths: multi-pronged analysis approaches for delineating molecular mechanisms of macrophage responses that might underpin susceptibility to M. tuberculosis infection; integration of mouse tissues and human blood samples

      Weaknesses: not all conclusions supported by data presented; some concerns related to experimental design and controls; links between findings in human cohort and the mechanistic insights gained in mouse macrophage model uncertain

      Author: The revised manuscript addresses every major and minor comment of the reviewers, including isotype controls and naïve T cells, to provide additional support for our conclusions. Our study revealed causal links between Myc hyperactivity, the deficiency of antioxidant defense and type I interferon pathway hyperactivity. We have shown that Myc hyperactivity in TNF-stimulated macrophages compromises antioxidant defense, leading to autocatalytic lipid peroxidation and interferon-beta superinduction that in turn amplifies lipid peroxidation, thus forming a vicious cycle of destructive chronic inflammation. This mechanism offers a plausible explanation for the association of Myc hyperactivity with poorer treatment outcomes in TB patients and provides a novel target for host-directed TB therapy.

      Advance

      The study has the potential to advance molecular understanding of the TNF-driven state of oxidative stress previously observed in B6.Sst1S macrophages and possible implications for host control of M. tuberculosis in vivo.

      Audience

      Experts seeking understanding of host factors mediating M. tuberculosis control, or failure thereof, with appreciation for the utility of the featured mouse model in assessing TB diseases progression and severe manifestation. Interest is likely extended to audience more broadly interested in TNF-driven macrophage (dys)function in infectious, inflammatory, and autoimmune pathologies.

      Reviewer expertise

      In preparing this review, I am drawing on my expertise in assessing macrophage responses and host defense mechanisms in bacterial infections (incl. virulent M. tuberculosis) through in vitro and in vivo studies. This includes but is not limited to macrophage infection and stimulation assays, microscopy, intra-macrophage replication of M. tuberculosis, analyses of lung tissues using multi-plex IHC and spatial transcriptomics (e.g. GeoMx). I am familiar with the interpretation of RNAseq analyses in human and mouse cells/tissues, but can provide only limited assessment of appropriateness of algorithms and analysis frameworks.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Yabaji et al. investigated the effects of BMDMs stimulated with TNF from both WT and B6.Sst1S mice, which have previously been identified to contain the sst1 locus conferring susceptibility to Mycobacterium tuberculosis. They identified that B6.Sst1S macrophages show a superinduction of IFNß, which might be caused by increased c-Myc expression, expanding on the mechanistic insights made by the same group (Bhattacharya et al. 2021). Furthermore, prolonged TNF stimulation led to oxidative stress, which WT BMDMs could compensate for by the activation of the antioxidant defense via NRF2. On the other hand, B6.Sst1S BMDMs lack the expression of SP110 and SP140, co-activators of NRF2, and were therefore subjected to maintained oxidative stress. Yabaji et al. could link those findings to in vivo studies by correlating the presence of stressed and aberrantly activated macrophages within granulomas to the failure of Mtb control, as well as the progression towards necrosis. As Mtb progression and necrosis of granulomas are not yet well understood, findings that might help provide novel therapy options for TB are crucial. Overall, the manuscript has interesting findings with regard to macrophage responses in Mycobacterium tuberculosis infection.

      However, in its current form there are several shortcomings, both with respect to the precision of the experiments and conclusions drawn. In particular a) important controls are often missing, e.g. T-cells from non-immune mice in Fig. 6J, in F, effectivity of BCG in B6 mice in 6N; b) single experiments are shown throughout the manuscript, in particular western blots and histology without proper quantification and statistics, this is absolutely not acceptable; c) very few repetitions are shown in in vitro experiments, where there is no evidence for limitation in resources (usually not more than 3), it is not clear what "independent experiment" means - i.e. the robustness of the findings is questionable; d) data are often normalized multiple times, e.g. in the case of qPCR, and the methods of normalization are not clear (what house-keeping gene exactly?);

      Moreover, experiments regarding IFN I signaling (e.g. short term TNF treatment of BMDMs to analyze LPO, making sure that the reporter mouse for IFNß works in vivo) and c-Myc (e.g. the increase after M-CSF addition might impact on other analysis as well and the experiments should be adjusted to control for this effect; MYC expression in the human samples) should be carefully repeated and evaluated to draw correct conclusions.

      In addition, we would like to strongly encourage the authors to more precisely outline the experimental set-ups and figure legends, so that the reader can easily understand and follow them. In other words: The legends are - in part very - incomplete. In addition, the authors should be mindful of gene names vs. protein names and italicize where appropriate.

      Author: We appreciate the very thorough evaluation of our manuscript by this reviewer. Their insightful comments helped us improve the manuscript. As outlined in the point-by-point responses below, we added important controls, including isotype control antibodies in IFNAR blocking experiments and non-vaccinated T cells in T cell – macrophage interaction experiments, and updated figure legends to indicate the number of repeated experiments where a representative experiment is shown, the numbers of mouse lungs and individual lesions, and the methods of data normalization where these were missing. We also explained our in vitro experimental design and how we analyzed and excluded effects of media change and fresh CSF1 addition by using a rest period before TNF stimulation and Mtb infection. The data shown in Suppl. Fig. 6C (previously Suppl. Fig. 5B) demonstrate that Myc levels induced by CSF1 return to the basal level at 12 h after media change. Our detailed in vitro protocol that contains these details has been published (Yabaji et al., STAR Protocols, 2022). We added new data demonstrating ROS and LPO production at 6 h of TNF stimulation, while the Ifnb1 mRNA super-induction occurred at 16 – 18 h, and edited the text to highlight these dynamics. The upregulation of the Myc pathway in human samples does not necessarily mean the upregulation of Myc itself; it could be due to the dysregulation of downstream pathways. The upregulation of the Myc pathway in the blood transcriptome associated with TB treatment failure most likely reflects a greater proportion of immature cells in peripheral blood, possibly due to increased myelopoiesis. A detailed analysis of these cell populations in human patients is suggested by our findings but is beyond the scope of this study.

      The reviewer’s comments also suggested that a summary of our findings was necessary. The main focus of our study was to untangle the connections between oxidative stress and Ifnb1 superinduction. It revealed that Myc hyperactivity caused a partial deficiency of antioxidant defense, leading to type I interferon pathway hyperactivity that in turn amplifies lipid peroxidation, thus establishing a vicious cycle driving inflammatory tissue damage.

      Our laboratory has worked on mechanisms of TB granuloma necrosis for more than two decades using genetic, molecular and immunological analyses in vitro and in vivo. This work provided a mechanistic basis for independent studies in other laboratories that used our mouse model and further expanded our findings, thus supporting the reproducibility and robustness of our results and our lab’s expertise.

      Specific comments to the experiments and data:

      • Fig. 1E: Evaluation of differences in up- and downregulation between B6 and B6.Sst1S cells should highlight where these cells are within the heatmap, as it is only labelled with the clusters, or it should be depicted differently (in particular for cluster 1 and 2). Furthermore, a more simple labelling of the pathways would increase the readability of the data.

      Author: For our scRNAseq data presentation, we used formats accepted by the computational community. To clarify Fig.1E, we added labels above the B6- and B6.Sst1S-specific clusters.

      • Fig. 2D, E: The staining legend is missing. For the quantification it is not clear what % total means. Is this based on the intensity or area? What do the dots represent in the bar chart? Is one data point pooled from several pictures? If not, the experiments need to be repeated, as three pictures might not be representative for evaluation.

      • Fig. 2E: Statistics comparing B6/B6.Sst1S with TNF (different) is required: Absence of induction is not a proof for a difference!

      Author: We included staining with NRF2-specific antibodies and performed area quantification per field using ImageJ to calculate the NRF2 total signal intensity per field. Each dot in the graph represents the average intensity of 3 fields in a representative experiment. The experiment was repeated 3 times. We included pairwise comparison of TNF-stimulated B6 and B6.Sst1S macrophages and updated the figure legend.

      • Fig. 3E: Positive and negative control need to be depicted in the figure (see legend).

      Author: We have added the positive and negative controls for the determination of labile iron pool to the data in Fig. 3E and related Suppl. Fig. 3B and to Fig. 5D that also demonstrates labile iron determination.

      • Fig. 3I: A quantification by flow cytometry or total cell counts are important, as 6% cell death in cell culture is a very modest observation. Otherwise, confocal images of the quantification would be a good addition to judge the specificity of the viability staining.

      Author: To validate the specificity of the viability staining method, we have provided fluorescent images as Suppl.Fig.3H. The main point of this experiment was to demonstrate a modest, but reproducible, increase in cell death in the sst1-mutant macrophages that suggested an IFN-dependent oxidative damage. In our study, we did not focus on mechanisms of cell death, but on a state of chronic oxidative stress in the sst1 mutant live cells during TNF stimulation.

      • Fig. 3I, J: What does one dot represent?

      Author: We performed this assay in 96-well format, and each dot represents the % cell death in an individual well.

      • Fig. 3K,L: For the B6 BMDMs it seems that p-cJun is highly increased at 12h in (L), while it is not in (K). On the other hand, for the B6.Sst1S BMDMs it peaks at 24h in (K), while in (L) it seems to peak at 12h. According to the data in (L) it seems that p-cJun is rather earlier and stronger activated in B6 BMDMs and has a weakened but prolonged activation in the B6.Sst1S BMDMs, which would not fit with your statement in the text that B6.Sst1S BMDMs show an upregulation. !These experiments need repetitions and quantification and statistics!

      Fig. 3L: ASK1 seems to be higher at 12h for the B6 BMDMs and similar for both lines at 24h, which is not fitting to the statement in the text. ("Also, the ASK1 - JNK - cJun stress kinase axis was upregulated in B6.Sst1S macrophages, as compared to B6, after 12 - 36 h of TNF stimulation")

      Author: These experiments were repeated, and new data were added to highlight differences in ASK1 and c-Jun phosphorylation between B6 and B6.Sst1S at individual timepoints after TNF stimulation (presented in new Fig.3K). They demonstrate that after TNF stimulation the activation of the stress kinases ASK1 and c-Jun initially increased in both genetic backgrounds. However, their upregulation was maintained exclusively in the sst1-susceptible macrophages from 24 to 36 h of TNF stimulation, while in the resistant macrophages their upregulation was transient. Thus, during prolonged TNF stimulation, B6.Sst1S macrophages experience stress that cannot be resolved, as evidenced by this kinetic analysis. The quantification of band intensity was added to the Western blot images above individual lanes.

      Reviewer 2 pointed to missing isotype control antibodies in Fig.3 and Fig.4:

      • Figure 3J: the isotype control for the IFNAR antibody is missing

      • Figure 4E: It seems the isotype control itself has already an effect in the reduction of IFNb.

      • Fig. 4H: It seems that the Isotype control antibody had an effect to increase 4-HNE (compared to TNF stimulated only).

      Author: We always include isotype control antibodies in our experiments because antibodies are known to modulate macrophage activation via binding to Fc receptors. To address the reviewer’s comments, we updated all panels that present the effects of IFNAR1 blockade with isotype-matched non-specific control antibodies in the revised manuscript. Specifically, we included isotype controls in Fig. 3M (previously Fig.3J), Fig.4I, Suppl.Fig.4E – G, Fig.6L-M, and Suppl.Fig.7I (previously Suppl.Fig.6F).

      • Fig.4A - C: "IFNAR1 blockade, however, did not increase either the NRF2 and FTL protein levels, or the Fth, Ftl and Gpx1 mRNA levels above those treated with isotype control antibodies"

      Maybe not above the isotype but it is higher than the TNF alone stimulation at least for NRF2 at 8h and for Ftl at both time points. Why does the isotype already cause stimulation/induction of the cells? !These experiments need repetitions and quantification and statistics!

      Author: To determine the specific effects of IFNAR blockade, we compared the effects of a non-specific isotype control and IFNAR1-specific antibodies. In our experiments, the isotype control antibody modestly increased the Nrf2 and Ftl protein levels and the Fth and Ftl mRNA levels, but its effects were similar to those of the IFNAR-specific antibody. The non-IFN-specific effects of antibodies, although of potential biological significance, are modest in our model, and their analysis is beyond the scope of this study.

      • Fig.4H Was the AB added also at 12h post stimulation? Figure legend should be adjusted.

      Author: The IFNAR1 blocking antibodies and isotype control antibodies were added at 2 h after TNF stimulation in Fig.4H and 4I, as described in the corresponding figure legend. The data demonstrating the effects of IFNAR blockade after 12, 24, and 33 h of TNF stimulation are presented in Suppl.Fig.4E - G.

      • Figure 4I: How was the data measured here, i.e. what is depicted? The isotype control is missing. It seems a two-way ANOVA was used, yet it is stated differently. The figure legend should be revised, as Dunnett's multiple comparison would only check for significances compared to the control.

      Author: The microscopy images and bar graphs were updated to include isotype control and presented in Suppl. Fig.4E - G of the revised version. We also revised the statistical analysis to include correction for multiple comparisons.

      Figure 4C and subsequent: How exactly was the experiment done (house-keeping gene)?

      Author: We included the details in the figure legends of the revised version. We quantified gene expression by the ΔΔCt method using β-actin (for Fig. 4C-E) and 18S (for Fig. 4F and G) as internal controls.
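      For readers unfamiliar with the calculation, the ΔΔCt normalization mentioned above can be sketched as follows. This is a minimal illustration of the standard 2^-ΔΔCt formula, assuming roughly 100% amplification efficiency; the function name and the Ct values are hypothetical and are not taken from the manuscript.

```python
def ddct_fold_change(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method (assumes ~100% PCR efficiency)."""
    # Step 1: normalize the target gene to the internal control (e.g. beta-actin or 18S).
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Step 2: compare the treated sample with the time-matched untreated control.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    # Step 3: each cycle of difference corresponds to a two-fold change.
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
fold = ddct_fold_change(ct_target_treated=22.0, ct_ref_treated=16.0,
                        ct_target_control=28.0, ct_ref_control=16.0)
print(fold)
```

      Because the result depends exponentially on the Ct of the untreated control, a shift of a single cycle in that control changes the reported fold induction two-fold.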

      • Figure 4D,E: Information on cells used is missing. Why the change in stimulation time? Did it not work after 12h? Then the experiments in A-C should be repeated for 16h.

      Author: The updated Fig. 4D and E present a comparison of B6 and B6.Sst1S BMDMs, clearly demonstrating a significant difference between these macrophages in Ifnb1 mRNA expression 16 h after TNF stimulation, in agreement with our previous publication (Bhattacharya et al., 2021). There we studied the time course of responses of B6 and B6.Sst1S macrophages to TNF at 2 h intervals and demonstrated the divergence between their activation trajectories starting at 12 h of TNF stimulation. Therefore, to reveal the underlying mechanisms, we focus our analyses on this critical timepoint, i.e. as close to the divergence as possible. However, the difference between the strains in Ifnb1 mRNA expression achieved significance only by 16 h of TNF stimulation. That is why we used this timepoint for the Ifnb1 and Rsad2 analyses. It clearly shows that the superinduction was not driven by the positive feedback via IFNAR, as has been shown previously by the Ivashkiv lab for B6 wild type macrophages (PMID 21220349).

      • Figure 4E: It would be helpful to see if these transcripts are actually translated into protein levels, e.g. perform an ELISA. The authors state that IFNAR blockade does not alter the expression, but your statistics say otherwise.

      -The data for Ifnb expression (or better protein level) should be provided for B6 BMDMs as well.

      Author: We have previously reported the differences in Ifnb protein secretion (He et al., Plos Pathogens, 2013 and Bhattacharya et al., JCI 2021). We use mRNA quantification by qRT-PCR as a more sensitive and direct measurement of the sst1-mediated phenotype. The revised Fig.4D and E include responses of B6 in addition to the B6.Sst1S to demonstrate that the IFNAR blockade does not reduce the Ifnb1 mRNA levels in TNF-stimulated B6.Sst1S mutant to the B6 wild type levels. A slight reduction can be explained by a known positive feedback loop in the IFN-I pathway (see above). In this experiment we emphasized that the effect of the sst1 locus is substantially greater, as compared to the effect of the IFNAR blockade (Fig.4D), and updated the text accordingly.

      • Fig. 4F: To what does the fold induction refer to? If it is again to unstimulated cells, then why is the induction now so much higher than in (E) where it was only 50x (now to 100x).

      • Figure 4G: Again to what is the fold induction referring to? It seems your Fer-1 treatment only contains 2 data points. This needs to be fixed.

      Author: Yes, the fold induction was calculated by normalizing mRNA levels to the untreated control incubated for the same time. Regarding the variation in Ifnb1 mRNA levels: a two-fold variation is not unusual in these experiments, so the Ifnb1 mRNA superinduction may range from 50- to 200-fold at this timepoint (16 h). The graph in Fig.4G was modified to make all datapoints more visible.

      • "These data suggest that type I IFN signaling does not initiate LPO in our model but maintains and amplifies it during prolonged TNF stimulation that, eventually, may lead to cell death". Data for a short term TNF stimulation are not shown, however, so it might impact also on the initiation of LPO.

      • The overall conclusion drawn from Fig. 3 and 4 is not really clear with regard that IFN does not initiate LPO. Where is that shown? Data on earlier stimulation time points should be added to make this clear.

      Author: We demonstrated ROS production (new Suppl.Fig.3G) and the rate of LPO biosynthesis (new Suppl.Fig.4E-F) at 6 h post TNF stimulation, while the Ifnb1 superinduction occurs between 12-18 h post TNF stimulation. This temporal separation supports our conclusion that IFN-β superinduction does not initiate LPO. We clarified it in the text:

      “Thus, Ifnb1 super-induction and IFN-I pathway hyperactivity in B6.Sst1S macrophages follow the initial LPO production, and maintain and amplify it during prolonged TNF stimulation”. (Previously: These data suggest that type I IFN signaling does not initiate LPO in our model). We also edited the conclusion in this section to explain the hierarchy of the sst1-regulated AOD and IFN-I pathways better:

      “Taken together, the above experiments allowed us to reject the hypothesis that IFN-I hyperactivity caused the sst1-dependent AOD dysregulation. In contrast, they established that the hyperactivity of the IFN-I pathway in TNF-stimulated B6.Sst1S macrophages was itself driven by the initial dysregulation of AOD and iron-mediated lipid peroxidation. During prolonged TNF stimulation, however, the IFN-I pathway was upregulated, possibly via ROS/LPO-dependent JNK activation, and acted as a potent amplifier of lipid peroxidation”.

      We believe that these additional data and explanation strengthen our conclusions drawn from Figures 3 and 4.

      • "A select set of mouse LTR-containing endogenous retroviruses (ERV's) (Jayewickreme et al, 2021), and non-retroviral LINE L1 elements were expressed at a basal level before and after TNF stimulation, but their levels in the B6.Sst1S BMDMs were similar to or lower than those seen in B6". This sentence should be revised as the differences between B6 and B6.Sst1S BMDMs seem small and are not there after 48h anymore. Are these mild changes really caused by the mutation or could they result from different housing conditions and/or slowly diverging genetically lines. How many mice were used for the analysis? Is there already heterogeneity between mice from the same line?

      Author: We agree with the reviewer that the data presented in Suppl.Fig.4 (Suppl.Fig.5 in the revised version) indicated no increase in single- and double-stranded transposon RNAs in the B6.Sst1S macrophages. The purpose of this experiment was to test the hypothesis that increased transposon expression might be responsible for triggering the superinduction of the type I interferon response in TNF-stimulated B6.Sst1S macrophages. In collaboration with a transposon expert, Dr. Nelson Lau (co-author of this manuscript), we demonstrated that transposon expression was not increased above the B6 level and, thus, rejected this attractive hypothesis. We explained the purpose of this experiment in the text, described our findings as “the levels in the B6.Sst1S BMDMs were similar to or lower than those seen in B6”, and concluded that “the above analyses allowed us to exclude the overexpression of persistent viral or transposon RNAs as a primary mechanism of the IFN-I pathway hyperactivity” in the sst1-mutant macrophages.

      • Fig. 5A: Indeed, it even seems that Myc is upregulated for the mutant BMDMs. Yet, there are only 2 data points for B6 12h. !These experiments need repetitions and quantification and statistics!

      Author: We observed these differences in c-Myc mRNA levels by independent methods: RNAseq and qRT-PCR. The qRT-PCR experiments were repeated 3 times. A representative experiment in Fig.5A shows 3 data points for each condition. We reformatted the panel to make all data points clearly visible.

      • Fig. 5B: Why would the protein level decrease in the controls over 6h of additional cultivation? Is this caused by fresh M-CSF? In this case maybe cells should be left to settle for one day before stimulating them to properly compare c-Myc induction. Comment on two c-Myc bands is needed. At 12h only the upper one seems increased for TNF stimulated mutant BMDMs compared to B6 BMDMs.

      Author: We agree with the reviewer’s point that cells need to be rested after media change that contains fresh CSF-1. Indeed, in Suppl.Fig.6C, we show that after media change containing 10% L929 supernatant (a source of CSF1) there is an increase in c-Myc protein levels that takes approximately 12 hours to return to baseline.

      Our protocol includes a resting period of 18 – 24 h after medium change before TNF stimulation. We updated Methods to highlight this detail. Thus, the increase in c-Myc levels we observe at 12 h of TNF stimulation (Fig.5B) is induced by TNF, not by the addition of growth factors, as further discussed in the text.

      The two c-Myc bands observed in Fig.5B,I and J, are similar to patterns reported in previous studies that used the same commercial antibodies (PMIDs: 24395249, 24137534, 25351955). Whether they correspond to different c-Myc isoforms or post-translational modifications is unknown.

      • Fig. 5A,B: It seems that not all the RNA is translated into protein, as c-Myc at 12h in the mutant BMDMs seems to be lower than at 6h, while the gene expression implicates it vice versa.

      Author: In addition to Fig.5B, the time course of Myc protein expression up to 24 h is presented in new panels Fig.5I – J. It demonstrates the gradual decrease of Myc protein levels. The observed dissociation between the mRNA and protein levels in the sst1-mutant BMDMs at 12 and 24 h is most likely due to translation inhibition resulting from the development of the integrated stress response, ISR (as shown in our previous publication by Bhattacharya et al., JCI, 2021). Translation of Myc is known to be particularly sensitive to the ISR (PMID 18551192, PMID 25079319, PMID 28490664). The IFN-driven ISR may therefore serve as a backup mechanism for Myc downregulation. We are planning to investigate these regulatory mechanisms in greater detail in the future.

      • Fig. 5J: Indeed, the inhibitor seems to cause the downregulation of the proteins. Explanation?

      Author: This experiment was repeated twice and the average normalized densitometry values are presented in the updated Fig.5J. The main question addressed in this experiment was whether hyperactivity of JNK in TNF-stimulated sst1 mutant macrophages contributed to Myc upregulation, as had been previously shown in cancer. Comparing the effects of JNK inhibition on phospho-cJun and c-Myc protein levels in TNF-stimulated B6.Sst1S macrophages (updated Fig.5J), we rejected the hypothesis that JNK activity might have a major role in c-Myc upregulation in sst1 mutant macrophages.

      • "TNF stimulation tended to reduce the LPO accumulation in the B6 macrophages and to increase it in the B6.Sst1S ones" However, this is not apparent in Sup. Fig. 6B. Here it seems that there might be a significant increase.

      Author: Suppl.Fig.6B (currently Suppl.Fig.7B) shows the 4-HNE accumulation at day 3 post infection. The data obtained after 5 days of Mtb infection are shown in Fig.6A. We clarified this in the text: “By day 5 post infection, TNF stimulation induced significant LPO accumulation only in the B6.Sst1S macrophages (Fig.6A)”.

      • Fig. 6B: Mtb and 4-HNE should be shown in two different channels in order to really assign each staining correctly.

      What time point is this? Are the mycobacteria cleared at MOI1, since it looks that there are fewer than that? How does this look like for the B6 BMDMs? Are there even less mycobacteria?

      Author: We included B6 infection data in the updated Fig.6B and added Suppl.Fig.7C and 7D, which address this reviewer’s comment. The data represent day 5 of Mtb infection, as indicated in the updated Fig.6B and Suppl.Fig.7C and 7D legends. New Suppl.Fig.7D shows quantification of replicating Mtb using an Mtb replication reporter strain expressing a single-strand DNA binding protein GFP fusion, as described in Methods. We observed fewer Mtb and a lower percentage of replicating Mtb in B6 macrophages, but we did not observe complete Mtb elimination in either background.

      We used red fluorescence for both Mtb::mCherry and 4-HNE staining to clearly visualize the SSB-GFP puncta in replicating Mtb DNA. In the revised manuscript, we have included the relevant channels in Suppl. Fig.7C and D to demonstrate clearly distinct patterns of Mtb::mCherry and 4-HNE signals. We did not aim to quantify the 4-HNE signal intensity in this experiment. For the 4-HNE quantification we use Mtb that expressed no reporter proteins (Fig.6A-B and Suppl.Fig.7A-B).

      • Fig 6E: In the context of survival a viability staining needs to be included, as well as the data from day 0. Then it needs to be analyzed whether cell numbers remain the same from D0 or if there is a change.

      Author: We updated the Fig.6 legend to indicate that the cell number percentages were calculated based on the number of cells at Day 0 (immediately after Mtb infection). We routinely use fixable cell death staining to enumerate cell death and to exclude artifacts due to cell loss. A brief protocol containing this information is included in the Methods section. The detailed protocol, including normalization using a BCG spike, has been published (Yabaji et al., STAR Protocols, 2022). Here we did not present the dead cell percentage, as it remained low and we did not observe damage to macrophage monolayers. The fold change of Mtb was calculated after normalization using the Mtb load at Day 0 after infection and washes.
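      The two Day 0 normalizations described above amount to simple ratios, sketched below. This is an illustrative sketch only: the function names and counts are hypothetical, and the BCG spike normalization from the published protocol is not reproduced here.

```python
def percent_of_day0(cell_count, day0_cell_count):
    """Cell number as a percentage of the count immediately after infection (Day 0)."""
    return 100.0 * cell_count / day0_cell_count


def mtb_fold_change(mtb_load, day0_mtb_load):
    """Mtb load relative to the load measured at Day 0 after infection and washes."""
    return mtb_load / day0_mtb_load


# Hypothetical numbers, for illustration only.
cells_pct = percent_of_day0(cell_count=8000, day0_cell_count=10000)
mtb_fold = mtb_fold_change(mtb_load=400000.0, day0_mtb_load=100000.0)
print(cells_pct, mtb_fold)
```

      Expressing both readouts relative to Day 0 separates a change in bacterial load from a change in the number of surviving host cells.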

      "The 3D imaging demonstrated that YFP-positive cells were restricted to the lesions, but did not strictly co-localize with intracellular Mtb, i.e. the Ifnb promoter activity was triggered by inflammatory stimuli, but not by the direct recognition of intracellular bacteria. We validated the IFNb reporter findings using in situ hybridization with the Ifnb probe, as well as anti-GFP antibody staining (Suppl.Fig.8B - E)." The colocalization is not present within the tissue sections. It seems that the reporter line does not show the same staining pattern in vivo as the IFNß probe or the anti GFP antibody staining. The reporter line has to be tested for the specificity of the staining. Furthermore, to state that it was restricted to the lesions, an uninvolved tissue area needs to be depicted.

      Author: The Ifnb-secreting cells are notoriously difficult to detect in vivo using direct staining of the protein. Therefore, reporter expression or lineage tracing is used as a surrogate. The Ifnb reporter used in our study was developed by the Locksley laboratory (Scheu et al., PNAS, 2008, PMID: 19088190) and has been validated in many independent studies. The reporter mice express the YFP protein under the control of the Ifnb1 promoter. The YFP protein accumulates within the cells, while Ifnb protein is rapidly secreted and does not accumulate in the producing cells in appreciable amounts. Also, the kinetics of YFP protein degradation is much slower than that of the endogenous Ifnb1 mRNA that was detected using in situ hybridization. Thus, there is no precise spatiotemporal coincidence of these readouts in Ifnb-expressing cells in vivo. However, this methodology more closely reflects the Ifnb-expressing cells in vivo than a Cre-lox mediated lineage tracing approach. In the revised manuscript we demonstrate that the YFP and mRNA signals partially overlap (Suppl.Fig.12B). In Suppl.Fig.12B we also included a new panel showing no YFP expression in the uninvolved area of the reporter mice infected with Mtb. The YFP expression by activated macrophages is demonstrated by co-staining with Iba1- and iNOS-specific antibodies (new Fig.7D and Suppl.Fig.13A). Our specificity control also included TB lesions in mice that do not carry the YFP reporter and did not express the YFP signal, as reported elsewhere (Yabaji et al., BioRxiv, https://doi.org/10.1101/2023.10.17.562695).

      • Are paucibacillary and multibacillary lesions different within the same animal or does one animal have one lesion phenotype? If that is the case, what is causing the differences between mice? Bacterial counts for the mice are required.

      Author: The heterogeneity of pulmonary TB lesions has been widely acknowledged in the clinic and highlighted in recent experimental studies. In our model of chronic pulmonary TB (described in detail in Yabaji et al., https://doi.org/10.1101/2025.02.28.640830 and https://doi.org/10.1101/2023.10.17.562695) the development of pulmonary TB lesions is not synchronized, i.e. the lesions are heterogeneous between the animals and within individual animals at the same timepoint. Therefore, we performed a lesion stratification in which individual lesions were classified by a certified veterinary pathologist in a blinded manner based on their morphology (H&E) and acid fast staining of the bacteria, as depicted in Suppl.Fig.8.

      • "Among the IFN-inducible genes upregulated in paucibacillary lesions were Ifi44l, a recently described negative regulator of IFN-I that enhances control of Mtb in human macrophages (DeDiego et al, 2019; Jiang et al, 2021) and Ciita, a regulator of MHC class II inducible by IFNy, but not IFN-I (Suppl.Table 8 and Suppl.Fig.10 D-E)." Why is Sup. Fig. 10 D, E referred to? The figure legend is also not clear, e.g. what means "upregulated in a subset of IFN-inducible genes"? Input for the hallmarks needs to be defined.

      Author: These data are now presented in Suppl.Fig.11 and, following the reviewer’s comment, we moved the reference to panels 11D – E up to the previous paragraph in the main text, where it naturally belongs. We also edited the figure legend to refer to the list of IFN-inducible genes compiled from the literature that is discussed in the text. We appreciate the reviewer’s suggestion, which helped us improve the clarity of the text. The inputs for the Hallmark pathway analysis are presented in Suppl.Tables 7 and 8, as described in the text.

      • Fig. 7C: Single channel pictures are required as it is hard to see the differences in staining with so many markers. Why is there no iNOS expression in the bottom row? What does the rectangle indicate on the bottom right? As black is chosen for DAPI, it is not visible at all. In case the signal is needed a visible a color should be chosen.

      Author: We thoroughly revised this figure to address the reviewer’s concern about the lack of clarity. We provide individual channels for each marker in Fig.7D – E and Suppl.Fig.13F. We show DAPI in grayscale in these images to better visualize the other markers.

      • "In the advanced lesions these markers were primarily expressed by activated macrophages (Iba1+) expressing iNOS and/or Ifny (YFP+)(Fig.7D)" Iba1 is needed in the quantification. Based on the images, iNOS seems to be highly produced in Iba1 negative cells. Which cells do produce it then? Flow cytometry data for this quantification are required. This would allow you to specifically check which cells express the markers and allow for a more precise analysis of double positive cells.

      Author: Currently these data demonstrating the co-localization of stress markers phospho-c-Jun and Chac1 with YFP are presented in Fig.7E (images) and Suppl.Fig.13D (quantification). The co-localization of stress markers phospho-cJun and Chac1 with iNOS is presented in Suppl.Fig.13F (images) and Suppl.Fig.13E (quantification). We agree that some iNOS+ cells are Iba1-negative (Fig.7D). We manually quantified percentages of Iba1+iNOS+ double positive cells and demonstrated that they represent the majority of the iNOS+ population(Suppl.Fig.13A). Regarding the required FACS analysis, we focus on spatial approaches because of the heterogeneity of the lesions that would be lost if lungs are dissociated for FACS. We are working on spatial transcriptomics at a single cell resolution that preserves spatial organization of TB lesions to address the reviewer’s comment and will present our results in the future.

      • Results part 6: In general, can you please state for each experiment at what time point mice were analyzed? You should include an additional macrophage staining (e.g. MerTK, F4/80), as alveolar macrophages are not staining well for Iba1 and you might therefore miss them in your IF microscopy. It would be very nice if you could perform flow cytometry to really check on the macrophages during infection and distinguish subsets (e.g. alveolar macrophages, interstitial macrophages, monocytes).

      Author: We have included the details of time post infection in the figure legends for Fig.7 and Suppl.Figures 8, 9, 12B, 13, and 14A of the revised manuscript. We have performed staining with CD11b, CD206 and CD163 to differentiate the recruited and lung resident macrophages and determined that in chronic pulmonary TB lesions in our model the vast majority of macrophages are recruited CD11b+ macrophages, not resident (CD206+ and CD163+) macrophages. These data are presented in another manuscript (Yabaji et al., BioRxiv https://doi.org/10.1101/2023.10.17.562695).

      • Spatial sequencing: The manuscript would highly profit from more data on that. It would be very interesting to check for the DEGs and show differential spatial distribution. Expression of marker genes should be inferred to further define macrophage subsets (e.g. alveolar macrophages, interstitial macrophages, recruited macrophages) and see if these subsets behave differently within the same lesion but also between the lesions. Additional bioinformatic approaches might allow you to investigate cell-cell interactions. There is a lot of potential with such a dataset, especially from TB lesions, that would elevate your findings and prove interesting to the TB field.

      • "Thus, progression from the Mtb-controlling paucibacillary to non-controlling multibacillary TB lesions in the lungs of TB susceptible mice was mechanistically linked with a pathological state of macrophage activation characterized by escalating stress (as evidenced by the upregulation phospho-cJUN, PKR and Chac1), the upregulation of IFNβ and the IFN-I pathway hyperactivity, with a concurrent reduction of IFNγ responses." To really show the upregulation within macrophages and their activation, a more detailed IF microscopy with the inclusion of additional macrophage markers needs to be provided. Flow cytometry would enable analysis for the differences between alveolar and interstitial macrophages, as well as for monocytes. As however, it seems that the majority of iNOS, as well as the stress associated markers are not produced by Iba1+ cells. Analyzing granulocytes and T lymphocytes should be considered.

      Author: We appreciate the reviewer’s suggestion. Indeed, our model provides an excellent opportunity to investigate macrophage heterogeneity and cell interactions within chronic TB lesions. We are working on spatial transcriptomics at a single cell resolution that would address the reviewer’s comment and will present our results in the future.

      In agreement with classical literature the overwhelming majority of myeloid cells in chronic pulmonary TB lesions is represented by macrophages. Neutrophils are detected at the necrotic stage, but our study is focused on pre-necrotic stages to reveal the earlier mechanisms pre-disposing to the necrotization. We never observed neutrophils or T cells expressing iNOS in our studies.

      • It's mentioned in the method section that controls in the IF staining were only fixed for 10min, while the infected cells were fixed for 30min. Consistency is important as the PFA fixation might impact on the fluorescence signal. Therefore, controls should be repeated with the same fixation time.

      Author: We have carefully considered the impact of fixation time on fluorescence and have separately analyzed the non-infected and infected samples to address this concern.

      For the non-infected samples, we examined the effect of TNF in both B6 and B6.Sst1S backgrounds, ensuring that a consistent fixation protocol (10 min) was applied across all experiments without Mtb infection.

      For the Mtb infection experiments, we employed an optimized fixation protocol (30 min) to ensure that Mtb was killed before handling the plates, which is critical for preserving the integrity of the samples. In this context, we compared B6 and B6.Sst1S samples to evaluate the effects of fixation and Mtb infection on lipid peroxidation (LPO) induction.

      We believe this approach balances the need for experimental consistency with the specific requirements for handling infected cells, and we have revised the manuscript to reflect this clarification.

      • Reactive oxygen species levels should be determined in B6 and B6.Sst1S BMDMs (stimulated and unstimulated), as they are very important for oxidative stress.

      Author: We have conducted experiments to measure ROS production in both B6 and B6.Sst1S BMDMs and demonstrated higher levels of ROS in the susceptible BMDMs after prolonged TNF stimulation (new Fig.3I – J and Suppl. Fig. 3G). Additionally, we have previously published a comparison of ROS production between B6 and B6.Sst1S by FACS (PMID: 33301427), which also supports the findings presented here.

      • Sup. Fig 2C: The inclusion of an unstimulated control would be advisable in order to evaluate if there are already difference in the beginning.

      Author: We have included the untreated control in Suppl. Fig. 2C (currently Suppl. Fig. 2D) in the revised manuscript.

      • Sup. Fig. 3F: Why is the fold change now lower than in Fig. 4D (fold change of around 28 compared to 120 in 4D)?

      Author: The data in Fig.4D (Fig.4E in the revised manuscript) and Suppl.Fig.3F (currently Suppl.Fig.4C) represent separate experiments. Such variation between experiments is commonly observed in qRT-PCR, which is affected by slight variations in expression in the unstimulated controls used for normalization and by the kinetics of the response. This 2 – 4 fold difference between the same treatments in separate experiments, as compared to the 30 – 100 fold and higher induction by TNF, does not affect the data interpretation.

      • Sup. Fig. 5C, D: The data seems very interesting as you even observe an increase in gene expression. Data for the B6 mice should be evaluated for increase to a similar level as the TNF treated mutants. Data on the viability of the cells are necessary, as they no longer receive M-CSF and might be dying at this point already.

      Author: To ensure that the observed effects were not confounded by cytotoxicity, we determined non-toxic concentrations of the CSF1R inhibitors during 48h of incubation and used them in our experiments that lasted for 24h. To address this valid comment, we have included cell viability data in the revised manuscript to confirm that the treatments did not result in cell death (Suppl. Fig. 6D). This experiment rejected our hypothesis that CSF1 driven Myc expression could be involved in the Ifnb superinduction. Other effects of CSF1R inhibitors on type I IFN pathway are intriguing but are beyond the scope of this study.

      • Sup. Fig 12: the phospho-c-Jun picture for (P) is not the same as in the merged one with Iba1. Double positive cells are mentioned to be analyzed, but from the staining it appears that P-c-Jun is expressed by other cells. You do not indicate how many replicates were counted and if the P and M lesions were evaluated within the same animal. What does the error bar indicate? It seems unlikely from the plots that the double positive cells are significant. Please provide the p values and statistical analysis.

      Author: We thank the reviewer for bringing this inadvertent field replacement in the single phospho-cJun channel to our attention. However, the quantification of Iba1+phospho-cJun+ double positive cells in Suppl.Fig.12 and our conclusions were not affected. In the revised manuscript, images and quantification of phospho-cJun and Iba1 co-expression are shown in new Suppl.Fig.13B and C, respectively. We have also updated the figure legends to denote the number of lesions analyzed and the statistical tests. Specifically, lesions from 6–8 mice per group (paucibacillary and multibacillary) were evaluated. Each dot in the panels of Suppl.Fig.13 represents an individual lesion.

      • Sup. Fig. 13D (suppl.Fig.15D now): What about the expression of MYC itself? Other parts of the signaling pathway should be analyzed(e.g. IFNb, JNK)?

      Author: The difference in MYC mRNA expression tended to be higher in TB patients with poor outcomes, but it was not statistically significant after correction for multiple testing. The upregulation of Myc pathway in the blood transcriptome associated with TB treatment failure most likely reflects greater proportion of immature cells in peripheral blood, possibly due to increased myelopoiesis. Pathway analysis of the differentially expressed genes revealed that treatment failures were associated with the following pathways relevant to this study: NF-kB Signaling, Flt3 Signaling in Hematopoietic Progenitor Cells (indicative of common myeloid progenitor cell proliferation), SAPK/JNK Signaling and Senescence (possibly indicative of oxidative stress). The upregulation of these pathways in human patients with poor TB treatment outcomes correlates with our findings in TB susceptible mice.

      • In the mfIHC, the usage of anti-mouse antibodies is mentioned. Pictures of sections incubated with the secondary antibody alone are required to exclude the possibility that the staining is not specific. Especially, as this data is essential to the manuscript and mouse-anti-mouse antibodies are notorious for background noise.

      Author: We are well aware of the technical difficulties associated with using mouse on mouse staining. In those cases, we use rabbit anti-mouse isotype specific antibodies specifically developed to avoid non-specific background (Abcam cat#ab133469). Each antibody panel for fluorescent multiplexed IHC is carefully optimized prior to studies. We did not use any primary mouse antibodies in the final version of the manuscript and, hence, removed this mention from the Methods.

      • In order to tie the story together, it would be interesting to treat infected mice with an IFNAR antibody, as well as perform this experiment with a Myc antibody. According to your data, you might expect the survival of the mice to be increased or bacterial loads to be affected.

      Author: In collaboration with the Vance laboratory, we tested effects of type I IFN pathway inhibition in B6.Sst1S mice on TB susceptibility: either type I receptor knockout or blocking antibodies increased their resistance to virulent Mtb (published in Ji et al., 2019; PMID 31611644). Unfortunately, blocking Myc using neutralizing antibodies in vivo is not currently achievable. Specifically blocking Myc using small molecule inhibitors in vivo is notoriously difficult, as recognized in oncology literature. We consider using small molecule inhibitors of either Myc translation or specific pathways downstream of Myc in the future.

      • It is surprising that you not even once cite or mention your previous study on bioRxiv considering the similarity of the results and topic (https://doi.org/10.1101/2020.12.14.422743). Is not even your Figure 1I and Figure 2 J, K the same as in that study depicted in Figure 4?

      Author: The reviewer refers to the first version of this manuscript, which was uploaded to BioRxiv but has never been published. We continued this work and greatly expanded our original observations, as presented in the current manuscript. We do not consider the previous version an independent manuscript and, therefore, do not cite it.

      • Please revise spelling of the manuscript and pay attention to write gene names in italics

      Author: Thank you, we corrected the gene and protein names according to current nomenclature.

      Minor points: - Fig. 1: Please provide some DEGs that explain why you used this resolution for the clustering of the scRNAseq data and that these clusters are truly distinct from each other.

      Author: Differential gene expression in clusters is presented in Suppl.Fig.1C (interferon response) and Suppl.Fig.1D (stress markers and interferon response previously established in our studies).

      • Fig. 1F: What do the two lines represent (magenta, green)?

      Author: The lines indicate pseudotime trajectories of B6 (magenta) and B6.Sst1S (green) BMDMs.

      • Fig. 1F, G: Why was cluster 6 excluded?

      Author: This cluster was not different between B6 and B6.Sst1S, so it was not useful for drawing the strain-specific trajectories.

      • Fig. 1E, G, H: The intensity scales are missing. They are vital to understand the data.

      Author: We have included the scale in revised manuscript (Fig.1E,G,H and Suppl.Fig.1C-D).

      • Fig. 2G-I: please revise order, as you first refer to Fig. 2H and I

      Author: We revised the order of the panels accordingly.

      • Fig. 5: You say the data represents three samples but at least in D and E you have more. Please revise. Why do you only include at (G) the inhibitor only control?

      Author: We added the inhibitor only controls to Fig. 5D - H. We also indicated the number of replicates in the updated Fig.5 legend.

      • Figure 7A, Sup. Fig. 8: Are these maximum intensity projection? Or is one z-level from the 3D stack depicted?

      Author: Fig. 7A shows 3D images with all the stacks combined.

      • Fig. 7B: What do the white boxes indicate?

      Author: We have removed this panel in the revised version and replaced it with better images.

      • Sup. Fig. 1A: The legend for the staining is missing

      Author: The Suppl. Fig.1A shows the relative proportions of either naïve (R and S) or TNF-stimulated (RT and ST) B6 or B6.Sst1S macrophages within individual single cell clusters depicted in Fig.1B. The color code is shown next to the graph on the right.

      • Sup. Fig. 1B: The feature plots are not clear: The legend for the expression levels is missing. What does the heading means?

      Author: We updated the headings, as in Fig.1C. The dots represent individual cells expressing Sp110 mRNA (upper panels) and Sp140 mRNA (lower panels).

      • Sup. Fig. 3C: The scale bar is barely visible.

      Author: We resized the scale bar to make it visible; the panel is now presented in Suppl. Fig.3E (previously Suppl. Fig.3C).

      • Sup. Fig. 3D: There is no figure legend or the legend to C-E is wrong.

      • Sup. Fig. 3F, G: You do not state to what the data is relative to.

      Author: We identified an error in the Suppl.Fig.3 legend referring to specific panels. The Suppl.Fig.3 legend has been updated accordingly. New panels were added and the previous Suppl.Fig.3F-G panels are now Suppl.Fig.4C-D.

      • Sup. Fig. 3H: It seems you used a two-way ANOVA, yet state it differently. Please revise the figure legend, as Dunnett's multiple comparison would only check for significances compared to the control.

      Author: Following the reviewer’s comment, we repeated statistical analysis to include correction for multiple comparisons and revised the figure and legend accordingly.
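
      The response above does not name the correction procedure that was applied. Purely as an illustration of what a correction for multiple comparisons does, the sketch below implements the Holm step-down adjustment in Python. The choice of Holm and the function name `holm_adjust` are assumptions made for this example; they are not taken from the manuscript.

```python
def holm_adjust(pvalues):
    """Return Holm step-down adjusted p-values in the original order.

    Illustrative only: the manuscript does not state which correction
    was used, so Holm is an assumption made for this sketch.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, etc.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    print(holm_adjust([0.01, 0.04, 0.03]))
```

      Unlike Dunnett's test, which compares each group only against a single control, this adjustment applies to any family of pairwise p-values.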

      • Sup. Fig. 4A, B: It is not clear what the lines depict as the legend is not explained. Names that are not required should be changed to make it clear what is depicted (e.g. "TE@" what does this refer to?)

      Author: This previous Sup. Fig 4 is now Sup. Fig. 5. The “TE@” is a leftover label from the bioinformatics pipeline, referring to “Transposable Element”. We apologize for this confusion and have removed these extraneous labels. We have also added transposon names of the LTR (MMLV30 and RTLV4) and L1Md to Suppl.Fig.5A and 5B legend, respectively.

      • Sup. 4B: What does the y-scale on the right refer to?

      Author: We apologize for the missing label for the y-scale on the right, which represents the mRNA expression level of the SetDB1 gene. Because SetDB1 has a much lower steady state level than the LINE L1Md, we plotted two Y-scales to allow both the gene and the transposon to be visualized on this graph.

      • Sup. 4C: Interpretation of the data is highly hindered by the fact that the scales differ between the B6 and B6.Sst1. The scales are barely visible.

      Author: We apologize for the missing labels for the y-scales of these coverage plots, which were originally meant to show only a qualitative picture of the small RNA sequencing that was already quantitated by the total amounts in Sup. 4B. We have added the auto-scaled Y-scales to Sup. 4C and improved the presentation of this figure.

      • Sup. Fig. 5A, B: Is the legend correct? Did you add the antibody for 2 days or is the quantification from day 3?

      Author: We recognize that the reviewer refers to Suppl.Fig.6A-B (Suppl.Fig.7A-B in the revised manuscript). We did not add antibodies to live cells. The figure legend describes staining with 4-HNE-specific antibodies 3 days post Mtb infection.

      • Sup. Fig. 8A: Are the "early" and "intermediate" lesions from the same time points? What are the definitions for these stages?

      Author: We discussed our lesion classification according to histopathology and bacterial loads above. Of note, in the revised manuscript we simplified our classification to denote paucibacillary and multibacillary lesions only. We agree with the reviewers that the designation of lesions as early, intermediate and advanced was based on our assumptions regarding the time course of their progression from low to high bacterial loads.

      • Sup. Fig. 8E: You should state that the bottom picture is an enlargement of an area in the top one. Scale bars are missing.

      Author: We replaced this panel with clearer images in Suppl.Fig.12B.

      • Sup. Fig. 11A: The IF staining is only visible for Iba and iNOS. Please provide single channels in order to make the other staining visible.

      Author: Suppl.Fig.11A (now Suppl.Fig.13B) shows the low-magnification images of TB lesions. In the Fig. 7 and Suppl. Fig. 13F of the revised manuscript we provided images for individual markers.

      • Sup. Fig. 13A (Suppl.Fig.15A now): Your axis label is not clear. What do the numbers behind the genes indicate? Why did you choose oncogene signatures and not inflammatory markers to check for a correlation with disease outcome?

      Author: The X axis of Suppl.Fig.15A represents pre-defined molecular signature gene sets (MSigDB) in the Gene Set Enrichment Analysis (GSEA) database (https://www.gsea-msigdb.org/gsea/msigdb). The Y axis shows the area under curve (AUC) score for each gene set.
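
      For readers unfamiliar with AUC scores, the following Python sketch shows one common way to compute an AUC from per-sample gene set scores: the fraction of (failure, cure) sample pairs in which the failure sample scores higher, counting ties as one half. How the AUC in Suppl.Fig.15A was actually computed is not stated in this response, so the function `auc_score`, its inputs and the example numbers are hypothetical.

```python
def auc_score(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison.

    scores_positive: gene set scores for one outcome group (e.g. failure).
    scores_negative: gene set scores for the other group (e.g. cure).
    Both inputs are hypothetical; this is not the authors' pipeline.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0   # positive sample ranked above negative sample
            elif p == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(scores_positive) * len(scores_negative))


if __name__ == "__main__":
    # Made-up enrichment scores for illustration only.
    print(auc_score([0.9, 0.8, 0.4], [0.5, 0.3]))
```

      An AUC of 0.5 corresponds to no discrimination between the outcome groups, and values approaching 1.0 indicate that the gene set score separates them well.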

      • Sup. 13D (Suppl.Fig.15D now): Maybe you could reorder the patients, so that the impression is clearer, as right now only the top genes seem to show a diverging gene signature, while the rest gives the impression of an equal distribution.

      Author: The Myc upregulated gene set myc_up was identified among top gene sets associated with treatment failure using unbiased ssGSEA algorithm. We agree with the reviewer that not every gene in the myc_up gene set correlates with the treatment outcome. But the association of the gene set is statistically significant, as presented in Suppl.Fig.15B – C.

      • The scale bars for many microscopy pictures are missing.

      Author: We have included clearly visible scale bars to all the microscopy images in the revised version.

      • The black bar plots should be changed (e.g. in color), since the single data points cannot be seen otherwise.
      • It would be advisable that a consistent color scheme would be used throughout the manuscript to make it easier to identify similar conditions, as otherwise many different colours are not required and lead right now rather to confusion (e.g. sometimes a black bar refers to BMDMs with and sometimes without TNF stimulation, or B6 BMDMs). Furthermore, plot sizes and fonts should be consistent within the manuscript (including the supplemental data)

      Author: We followed this useful suggestion and selected consistent color codes for B6 and B6.Sst1S groups to enhance clarity throughout the revised manuscript.

      Within the methods section: - At which concentration did you use the IFNAR antibody and the isotype?

      Author: We updated the Methods section to include the respective concentrations in the revised manuscript.

      • Were mice maintained under SPF conditions? At what age were they used?

      Author: Yes, the mice are specific pathogen free. We used 10 - 14 week old mice for Mtb infection.

      • The BMDM cultivation is not clear. According to your cited paper you use LCCM but can you provide how much M-CSF it contains? How do you make sure that amounts are the same between experiments and do not vary? You do not mention how you actually obtain this conditioned medium. Is there the possibility of contamination or transferred fibroblasts that would impact on the data analysis? Is LCCM also added during stimulation and inhibitor treatment?

      Author: We obtain LCCM by collecting the supernatant from L929 cells that have formed a confluent monolayer, according to well-established protocols for LCCM collection. The supernatants are filtered through 0.22 micron filters to exclude contamination with L929 cells and bacteria. The medium is prepared in 500 ml batches that are sufficient for multiple experiments. Each batch of L929-conditioned medium is tested for biological activity using serial dilutions.

      • How was the BCG infection performed? How much bacteria did you use? Which BCG strain was used?

      Author: We infected mice with M. bovis BCG Pasteur subcutaneously in the hock using 10^6 CFU per mouse.

      • At what density did you seed the BMDMs for stimulation and inhibitor experiments?

      Author: In 96 well plates, we seed 12,000 cells per well and allow the cells to grow for 4 days to reach confluency (approximately 50,000 cells per well). For a 6-well plate, we seed 2.5 × 10^5 cells per well and culture them for 4 days to reach confluency. For a 24-well plate, we seed 50,000 cells per well and keep the cells in media for 4 days before starting any treatments. This ensures that the cells are in a proliferative or near-confluent state before beginning the stimulation or inhibitor treatments. Our detailed protocol is published in STAR Protocols (Yabaji et al., 2022; PMID 35310069).

      • What machine did you use to perform the bulk RNA sequencing? How many replicates did you include for the sequencing?

      Author: For bulk sequencing we used 3 RNA samples for each condition. The samples were sequenced at Boston University Microarray & Sequencing Resource service using Illumina NextSeq™ 2000 instrument.

      • How many replicates were used for the scRNA sequencing? Why is your threshold for the exclusion of mitochondrial DNA so high? A typical threshold of less than 5% has been reported to work well with mouse tissue.

      Author: We used one sample per condition. For the mitochondrial cutoff, we usually base it off of the total distribution. There is no "universal" threshold that can be applied to all datasets. Thresholds must be determined empirically.
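
      To illustrate what an empirically determined cutoff can look like, the Python sketch below derives an upper bound on the percentage of mitochondrial reads from the distribution itself, as the median plus a multiple of the median absolute deviation (MAD). This is an assumption made for illustration and not the procedure used in the manuscript; the function names and the default of three MADs are hypothetical.

```python
from statistics import median


def empirical_mito_cutoff(percent_mito, n_mads=3.0):
    """Upper cutoff for percent mitochondrial reads: median + n_mads * MAD.

    percent_mito: per-cell percentages of mitochondrial reads.
    n_mads: how many MADs above the median a cell may lie (assumed value).
    """
    med = median(percent_mito)
    mad = median([abs(x - med) for x in percent_mito])
    return med + n_mads * mad


def cells_passing_filter(percent_mito, n_mads=3.0):
    """Return the indices of cells at or below the empirical cutoff."""
    cutoff = empirical_mito_cutoff(percent_mito, n_mads)
    return [i for i, x in enumerate(percent_mito) if x <= cutoff]


if __name__ == "__main__":
    # Made-up percentages; the last cell is an obvious outlier.
    values = [2.0, 3.0, 4.0, 5.0, 6.0, 30.0]
    print(empirical_mito_cutoff(values))
    print(cells_passing_filter(values))
```

      Because the cutoff follows the observed distribution, a dataset with a generally higher mitochondrial fraction yields a higher threshold than a fixed 5% rule would allow.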

      • You do not mention how many PCAs were considered for the scRNA sequencing analysis.

      Author: We considered 50 principal components (PCs); this information was added to the Methods.

      • You should name all the package versions you used for the scRNA sequencing (e.g. for the slingshot, VAM package)

      Author: The following package versions were used: Seurat v4.0.4, VAM v1.0.0, Slingshot v2.3.0, SingleCellTK v2.4.1, Celda v1.10.0. We added this information to the Methods.

      • You mention two batches for the human samples. Can you specify what the two batches are?

      Author: Human blood samples were collected at five sites, as described in the updated Methods section. The samples were processed in two separate RNAseq batches, which required batch correction.

      • At which temperature was the IF staining performed?

      Author: We performed the IF at 4 °C. We included the details in the revised version.

      Reviewer #2 (Significance (Required)):

      Overall, the manuscript has interesting findings with regard to macrophage responses in Mycobacterium tuberculosis infection. However, in its current form there are several shortcomings, both with respect to the precision of the experiments and conclusions drawn.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary The authors use a mouse model designed to be more susceptible to M.tb (addition of sst1 locus) which has granulomatous lesions more similar to human granulomas, making this mouse highly relevant for M.tb pathogenesis studies. Using WT B6 macrophages or sst1B6 macrophages, the authors seek to understand how the sst1 locus affects macrophage response to prolonged TNFa exposure, which can occur during a pro-inflammatory response in the lungs. Single cell RNA-seq revealed clusters of mutant macrophages with upregulated genes associated with oxidative stress responses and IFN-I signaling pathways when treated with TNF compared to WT macs. The authors go on to show that mutant macrophages have decreased NRF2, decreased antioxidant defense genes and less Sp110 and Sp140. Mutant macrophages are also more susceptible to lipid peroxidation and iron-mediated oxidative stress. The IFN-I pathway hyperactivity is caused by the dysregulation of iron storage and antioxidant defense. These mutant macrophages are more susceptible to M.tb infection, showing they are less able to control bacterial growth even in the presence of T cells from BCG vaccinated mice. The transcription factor Myc is more highly expressed in mutant macs during TNF treatment and inhibition of Myc led to better control of M.tb growth. Myc is also more abundant in PBMCs from M.tb infected humans with poor outcomes, suggesting that Myc should be further investigated as a target for host-directed therapies for tuberculosis.

      Major Comments Isotypes for IF imaging and confocal IF imaging are not listed, or not performed. It is a concern that the microscopy images throughout the manuscript do not have isotype controls for the primary antibodies.

      Fig 4 (and later) the anti-IFNAR Ab is used along with the Isotype antibody, Fig 4I does not show the isotype. Use of the isotype antibody is also missing in later figures as well as Fig 3J. Why was this left off as the proper control for the Ab?

      Author: We addressed the comment in revised manuscript as described above in summary and responses to reviewers 1 and 2. Isotype controls for IFNAR1 blockade were included in Fig.3M (previously 3J), Fig. 4I, Suppl.Fig.4G (previously Fig.4I), and updated Fig.4C -E, Fig.6L-M, Suppl.Fig.4F -G, 7I.

      Conclusions drawn by the authors from some of the WB data are worded strongly, yet by eye the blots don't look as dramatically different as suggested. It would be very helpful to quantify the density of bands when making conclusions. (for example, Fig 4A).

      Author: We added the densitometry of Western blot values after normalization above each lane in Fig.2A – C, Fig.3C – D and 3K; Fig.4A – B, Fig5B,C,I,J.

      Fig 5A is not described clearly. If the gene expression is normalized to untreated B6 macs, then the level of untreated B6 macs should be 1. In the graph the blue bars are slightly below 1, which would not suggest that levels "initially increased and subsequently downregulated" as stated in the text. It seems like the text describes the protein expression but not the RNA expression. Please check this section and more clearly describe the results.

      Author: We appreciate the reviewer’s comment and modified the text to specify the mRNA and protein expression data, as follows:

      “We observed that Myc was regulated in an sst1-dependent manner: in TNF-stimulated B6 wild type BMDMs, c-Myc mRNA was downregulated, while in the susceptible macrophages c-Myc mRNA was upregulated (Fig.5A). The c-Myc protein levels were also higher in the B6.Sst1S cells in unstimulated BMDMs and 6 – 12 h of TNF stimulation (Fig.5B)”.

      Also, why look at RNA through 24h but protein only through 12h? If c-myc transcripts continue to increase through 24h, it would be interesting to see if protein levels also increase at this later time point.

      Author: The time-course of Myc expression up to 24 h is presented in new panels Fig. 5I-5J.

      It demonstrates the decrease of Myc protein levels at 24 h. In the wild type B6 BMDMs the levels of Myc protein significantly decreased in parallel with the mRNA suppression presented in Fig.5A. In contrast, we observed the dissociation of the mRNA and protein levels in the sst1-mutant BMDMs at 12 and 24 h, most likely because the mutant macrophages develop an integrated stress response (as shown in our previous publication by Bhattacharya et al., JCI, 2021) that is known to inhibit Myc mRNA translation.

      Fig 5J the bands look smaller after D-JNK1 treatment at 6 and 12h though in the text it says no change. Quantifying the bands here would be helpful to see if there really is no difference.

      Author: This experiment was repeated twice, and the average normalized densitometry values are presented in the updated Fig.5J. The main question addressed in this experiment was whether the hyperactivity of JNK in TNF-stimulated sst1 mutant macrophages contributed to Myc upregulation, as was previously shown in cancer. Comparing effects of JNK inhibition on phospho-cJun and c-Myc protein levels in TNF stimulated B6.Sst1S macrophages (updated Fig.5J), we concluded that JNK did not have a major role in c-Myc upregulation in this context.

      Section 4, third paragraph, the conclusion that JNK activation in mutant macs drives pathways downstream of Myc are not supported here. Are there data or other literature from the lab that supports this claim?

      Author: This statement was based on evidence from available literature where JNK was shown to activate oncogenes, including Myc. In addition, inhibition of Myc in our model upregulated ferritin (Fig.5C), reduced the labile iron pool, prevented the LPO accumulation (Fig.5D - G) and inhibited stress markers (Fig.5H). However, we do not have direct experimental evidence in our model that Myc inhibition reduces ASK1 and JNK activities. Hence, we removed this statement from the text and plan to investigate this in the future.

      Fig 6N Please provide further rationale for the BCG in vivo experiment. It is unclear what the hypothesis was for this experiment.

      Author: In the current version, BCG vaccination data are presented in Suppl.Fig.14B. We demonstrate that stressed BMDMs do not respond to activation by BCG-specific T cells (Fig.6J) and their unresponsiveness is mediated by type I interferon (Fig.6L and 6M). The observed accumulation of the stressed macrophages in pulmonary TB lesions of the sst1-susceptible mice (Fig.7E, Suppl.Fig.13 and 14A) and the upregulation of the type I interferon pathway (Fig.1E, 1G, 7C, Suppl.Fig.1C and 11) suggested that the effect of further boosting T lymphocytes using BCG in Mtb-infected mice would be neutralized due to the macrophage unresponsiveness. This experiment provides a novel insight explaining why BCG vaccine may not be efficient against pulmonary TB in susceptible hosts.

      The in vitro work is all concerning treatment with TNFa and how this exposure modifies the responses in B6 vs sst1B6 macrophages; however, this is not explored in the in vivo studies. Are there differences in TNFa levels in the pauci- vs multi-bacillary lesions that lead to (or correlate with) the accumulation of peroxidation products in the intralesional macrophages? How do the experiments with TNFa in vitro relate back to how the macrophages are responding in vivo during infection?

      Author: Our investigation of mechanisms of necrosis of TB granulomas stems from and is supported by in vivo studies, as summarized below.

      This work started with the characterization of necrotic TB granulomas in C3HeB/FeJ mice in vivo, followed by a classical forward genetic analysis of susceptibility to virulent Mtb in vivo.

      That led to the discovery of the sst1 locus and demonstration that it plays a dominant role in the formation of necrotic TB granulomas in mouse lungs in vivo. Using genetic and immunological approaches we demonstrated that the sst1 susceptibility allele controls macrophage function in vivo (Yan, et al., J.Immunol. 2007) and an aberrant macrophage activation by TNF and increased production of Ifn-b in vitro (He et al. PLoS Pathogens, 2013). In collaboration with the Vance lab we demonstrated that the type I IFN receptor inactivation reduced the susceptibility to intracellular bacteria of the sst1-susceptible mice in vivo (Ji et al., Nature Microbiology, 2019). Next, we demonstrated that the Ifnb1 mRNA superinduction results from combined effects of TNF and JNK leading to integrated stress response in vitro (Bhattacharya, JCI, 2021). Thus, our previous work started with extensive characterization of the in vivo phenotype that led to the identification of the underlying macrophage deficiency that allowed for the detailed characterization of the macrophage phenotype in vitro presented in this manuscript. In a separate study, the Sher lab confirmed our conclusions and their in vivo relevance using Bach1 knockout in the sst1-susceptible B6.Sst1S background, where boosting antioxidant defense by Bach1 inactivation resulted in decreased type I interferon pathway activity and reduced granuloma necrosis. We have chosen TNF stimulation for our in vitro studies because this cytokine is most relevant for the formation and maintenance of the integrity of TB granulomas in vivo as shown in mice, non-human primates and humans. Here we demonstrate that although TNF is necessary for host resistance to virulent Mtb, its activity is insufficient for full protection of the susceptible hosts, because of altered macrophage responsiveness to TNF. Thus, our exploration of the necrosis of TB granulomas encompasses both in vitro and extensive in vivo studies.

      Minor comments Introduction, while well written, is longer than necessary. Consider shortening this section. Throughout figures, many graphs show a fold induction/accumulation/etc, but it is rarely specified what the internal control is for each graph. This needs to be added. Paragraph one, authors use the phrase "the entire IFN pathway was dramatically upregulated..." seems to be an exaggeration. How do you know the "entire" IFN pathway was upregulated in a dramatic fashion?

      Author: 1) We shortened the introduction and discussion; 2) verified that the figure legends specify the internal controls that were used to calculate fold induction; 3) removed the word “entire” to avoid overinterpretation.

      Figures 1E, G and H and supp fig 1C, the heat maps are missing an expression key Section 2 second paragraph refers to figs 2D, E as cytoplasmic in the text, but figure legend and y-axis of 2E show total protein.

      Author: The expression keys were added to Fig.1E,G,H, Fig.7C, Suppl.Fig.1C and 1D and Suppl.Fig.11A of the revised manuscript.

      Section 3 end of paragraph 1 refers to Fig 3h. Does this also refer to Supp Fig 3E?

      Author: Yes, Fig.3H shows microscopy of 4-HNE and Suppl.Fig.3H shows quantification of the image analysis. In the revised manuscript these data are presented in Fig.3H and Suppl.Fig.3F. The text was modified to reflect this change.

      Supplemental Fig 3 legend for C-E seems to incorrectly also reference F and G.

      Author: We corrected this error in the figure legend. New panels were added to Suppl.Fig.3 and previous Suppl.Fig.3F and G were moved to Suppl.Fig.4 panels C and D of the revised version.

      Fig 3K, the p-cJun was inhibited with the JNK inhibitor, however it’s unclear why this was done or the conclusion drawn from this experiment. Use of the JNK inhibitor is not discussed in the text.

      Author: The JNK inhibitor was used to confirm that c-Jun phosphorylation in our studies is mediated by JNK and to compare effects of JNK inhibition on phospho-cJun and Myc expression. This experiment demonstrated that the JNK inhibitor effectively inhibited c-Jun phosphorylation but not Myc upregulation, as shown in Fig.5I-J of the revised manuscript.

      Fig 4 I and Supp Fig 3 H seem to have been swapped? The graph in Fig 4I matches the images in Supp Fig 3I. Please check.

      Author: We reorganized the panels to provide microscopy images and the corresponding quantification together in the revised panels Fig. 4H and Fig. 4I, as well as in Suppl. Fig. 4F and Suppl. Fig. 4G.

      Fig 6, it is unclear what % cell number means. Also for bacterial growth, the data are fold change compared to what internal control?

      Author: We updated the Fig.6 legend to indicate that the cell number percentages were calculated based on the number of cells at Day 0 (immediately after Mtb infection). We routinely use fixable cell death staining to enumerate cell death. A brief protocol containing this information is included in the Methods section. The detailed protocol, including normalization using a BCG spike, has been published (Yabaji et al., STAR Protocols, 2022). Here we did not present the dead cell percentage, as it remained low and we did not observe damage to macrophage monolayers. This allows us to exclude artifacts due to cell loss. The fold change of Mtb was calculated after normalization using the Mtb load at Day 0 after infection and washes.
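
      The two Day 0 normalizations described above amount to simple ratios, which the following Python sketch makes explicit. The function names and example values are hypothetical, and the sketch leaves out the BCG spike normalization step of the published protocol.

```python
def percent_of_day0(cell_counts, day0_count):
    """Express cell counts as a percentage of the Day 0 cell count."""
    if day0_count <= 0:
        raise ValueError("Day 0 cell count must be positive")
    return [100.0 * count / day0_count for count in cell_counts]


def fold_change_from_day0(bacterial_loads, day0_load):
    """Express bacterial loads as fold change relative to the Day 0 load."""
    if day0_load <= 0:
        raise ValueError("Day 0 bacterial load must be positive")
    return [load / day0_load for load in bacterial_loads]


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(percent_of_day0([50000, 45000], 50000))
    print(fold_change_from_day0([1000.0, 4000.0], 1000.0))
```

      With these definitions, a value of 100% or a fold change of 1.0 means no change relative to the time point immediately after infection.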

      Fig 7B needs an expression key

      Author: The expression key was added to Fig.7C (previously Fig. 7B).

      Supp Fig 7 and Supp Fig 8A, what do the arrows indicate?

      Author: In Suppl.Fig.8 (previously Suppl.Fig.7) the arrows indicate acid fast bacilli (Mtb).

      In Fig.7A and Suppl.Fig.9A, arrows indicate Mtb expressing the fluorescent reporter mCherry. The corresponding figure legends were updated in the revised version.

      Supp Fig 9A, two ROI appear to be outlined in white, not just 1 as the legend says.

      Author: We updated the figure legend.

      Methods: Certain items are listed in the Reagents section that are not used in the manuscript, such as necrostatin-1 or Z-VAD-FMK. Please carefully check the methods to ensure extra items or missing items do not occur.

      Author: These experiments were performed, but not included in the final manuscript. Hence, we removed the “necrostatin-1 or Z-VAD-FMK” from the reagents section in methods of revised version.

      Western blot, method of visualizing/imaging bands is not provided, method of quantifying density is not provided, though this was done for fig 5C and should be performed for the other WBs.

      Author: We used GE ImageQuant LAS4000 Multi-Mode Imager to acquire the Western blot images and the densitometric analyses were performed by area quantification using ImageJ. We included this information in the method section. We added the densitometry of Western blot values after normalization above each lane in Fig.2A – C, Fig.3C – D and 3K; Fig.4A – B, Fig5B,C,I,J.
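
      As a sketch of what normalized densitometry values typically represent, the Python example below divides each target band by a loading-control band from the same lane and then expresses every lane relative to a reference lane. The response does not state which loading control or reference lane was used, so both, along with the function name `normalize_densitometry` and the example intensities, are assumptions made for illustration.

```python
def normalize_densitometry(target_bands, loading_bands, reference_lane=0):
    """Normalize band intensities to a loading control and a reference lane.

    target_bands: area-based intensities of the protein of interest per lane.
    loading_bands: intensities of an assumed loading control per lane.
    reference_lane: index of the lane whose normalized value is set to 1.0.
    """
    if len(target_bands) != len(loading_bands):
        raise ValueError("each lane needs a target and a loading-control value")
    # Step 1: correct each lane for the amount of protein loaded.
    ratios = [t / l for t, l in zip(target_bands, loading_bands)]
    # Step 2: express every lane relative to the reference lane.
    reference = ratios[reference_lane]
    return [r / reference for r in ratios]


if __name__ == "__main__":
    # Made-up intensities, e.g. exported from ImageJ area quantification.
    print(normalize_densitometry([100.0, 300.0, 50.0], [200.0, 200.0, 100.0]))
```

      In this made-up example the third lane has half the target signal of the first but also half the loading control, so its normalized value equals that of the reference lane.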

      Reviewer #3 (Significance (Required)):

      The work of Yabaji et al is of high significance to the field of macrophage biology and M.tb pathogenesis in macrophages. This work builds from previously published work (Bhattacharya 2021) in which the authors first identified the aberrant response induced by TNF in sst1 mutant macrophages. Better understanding how macrophages with the sst1 locus respond not only to bacterial infection but stimulation with relevant ligands such as TNF will aid the field in identifying biomarkers for TB, biomarkers that can suggest a poor outcome vs. "cure" in response to antibiotic treatment or design of host-directed therapies. This work will be of interest to those who study macrophage biology and who study M.tb pathogenesis and tuberculosis in particular. This study expands the knowledge already gained on the sst1 locus to further determine how early macrophage responses are shaped that can ultimately determine disease progression. Strengths of the study include the methodologies, employing both bulk and single cell-RNA seq to answer specific questions. Data are analyzed using automated methods (such as HALO) to eliminate bias. The experiments are well planned and designed to determine the mechanisms behind the increased iron-related oxidative stress found in the mutant macrophages following TNF treatment. Also, in vivo studies were performed to validate some of the in vitro work. Examining pauci-bacillary lesions vs multi-bacillary lesions and spatial transcriptomics is a significant strength of this work. The inclusion of human data is another strength of the study, showing increased Myc in humans with poor response to antibiotics for TB. Limitations include the fact that the work is all done with BMDMs. Use of alveolar macrophages from the mice would be a more relevant cell type for M.tb studies. AMs are less inflammatory, therefore treatment with TNF of AMs could result in different results compared to BMDMs.
Reviewer's field of expertise: macrophage activation, M.tb pathogenesis in human and mouse models, cell signaling Limitations: not qualified to evaluate single cell or bulk RNA-seq technical analysis/methodology or spatial transcriptomics analysis.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Yabaji et al. investigated the effects of BMDMs stimulated with TNF from both WT and B6.Sst1S mice, which have previously been identified to contain the sst1 locus conferring susceptibility to Mycobacterium tuberculosis. They identified that B6.Sst1S macrophages show a superinduction of IFNß, which might be caused by increased c-Myc expression, expanding on the mechanistic insights made by the same group (Bhattacharya et al. 2021). Furthermore, prolonged TNF stimulation led to oxidative stress, which WT BMDMs could compensate for by the activation of the antioxidant defense via NRF2. On the other hand, B6.Sst1S BMDMs lack the expression of SP110 and SP140, co-activators of NRF2, and were therefore subjected to maintained oxidative stress. Yabaji et al. could link those findings to in vivo studies by correlating the presence of stressed and aberrantly activated macrophages within granulomas to the failure of Mtb control, as well as the progression towards necrosis. As the knowledge regarding Mtb progression and necrosis of granulomas is not yet well understood, findings that might help provide novel therapy options for TB are crucial.

      Overall, the manuscript has interesting findings with regard to macrophage responses in Mycobacterium tuberculosis infection.

      However, in its current form there are several shortcomings, both with respect to the precision of the experiments and conclusions drawn.

      In particular a) important controls are often missing, e.g. T-cells from non-immune mice in Fig. 6J, in F, effectivity of BCG in B6 mice in 6N; b) single experiments are shown throughout the manuscript, in particular western blots and histology without proper quantification and statistics, this is absolutely not acceptable; c) very few repetitions are shown in in vitro experiments, where there is no evidence for limitation in resources (usually not more than 3), it is not clear what "independent experiment" means - i.e. the robustness of the findings is questionable; d) data are often normalized multiple times, e.g. in the case of qPCR, and the methods of normalization are not clear (what house-keeping gene exactly?);

      Moreover, experiments regarding IFN I signaling (e.g. short term TNF treatment of BMDMs to analyze LPO, making sure that the reporter mouse for IFNβ works in vivo) and c-Myc (e.g. the increase after M-CSF addition might impact on other analysis as well and the experiments should be adjusted to control for this effect; MYC expression in the human samples) should be carefully repeated and evaluated to draw correct conclusions.

      In addition, we would like to strongly encourage the authors to more precisely outline the experimental set-ups and figure legends, so that the reader can easily understand and follow them. In other words: The legends are - in part very - incomplete. In addition, the authors should be mindful of gene names vs. protein names and italicize where appropriate.

      Finally, it is necessary that the connection to several overlapping preprints by the same author group is outlined, e.g. to https://www.biorxiv.org/content/10.1101/2020.12.14.422743v1.full.

      Specific comments to the experiments and data:

      • Fig. 1E: Evaluation of differences in up- and downregulation between B6 and B6.Sst1S cells should highlight where these cells are within the heatmap, as it is only labelled with the clusters, or it should be depicted differently (in particular for cluster 1 and 2). Furthermore, a simpler labelling of the pathways would increase the readability of the data.
      • Fig. 2D, E: The staining legend is missing. For the quantification it is not clear what % total means. Is this based on the intensity or area? What do the dots represent in the bar chart? Is one data point pooled from several pictures? If not, the experiments need to be repeated, as three pictures might not be representative for evaluation.
      • Fig. 2E: Statistics comparing B6 and B6.Sst1S with TNF (different) are required: absence of induction is not proof of a difference!
      • Fig. 3E: Positive and negative control need to be depicted in the figure (see legend).
      • Fig. 3I: A quantification by flow cytometry or total cell counts is important, as 6% cell death in cell culture is a very modest observation. Otherwise, confocal images of the quantification would be a good addition to judge the specificity of the viability staining.
      • Fig. 3I, J: What does one dot represent?
      • Fig. 3K,L: For the B6 BMDMs it seems that p-cJun is highly increased at 12h in (L), while it is not in (K). On the other hand, for the B6.Sst1S BMDMs it peaks at 24h in (K), while in (L) it seems to peak at 12h. According to the data in (L) it seems that p-cJun is activated earlier and more strongly in B6 BMDMs and has a weakened but prolonged activation in the B6.Sst1S BMDMs, which would not fit with your statement in the text that B6.Sst1S BMDMs show an upregulation. !These experiments need repetitions and quantification and statistics!
      • Figure 3J: the isotype control for the IFNAR antibody is missing
      • Fig. 3L: ASK1 seems to be higher at 12h for the B6 BMDMs and similar for both lines at 24h, which is not fitting to the statement in the text. ("Also, the ASK1 - JNK - cJun stress kinase axis was upregulated in B6.Sst1S macrophages, as compared to B6, after 12 - 36 h of TNF stimulation")
      • Fig.4A - C: "IFNAR1 blockade, however, did not increase either the NRF2 and FTL protein levels, or the Fth, Ftl and Gpx1 mRNA levels above those treated with isotype control antibodies" Maybe not above the isotype but it is higher than the TNF alone stimulation at least for NRF2 at 8h and for Ftl at both time points. Why does the isotype already cause stimulation/induction of the cells? !These experiments need repetitions and quantification and statistics!
      • Figure 4C and subsequent: How exactly was the experiment done (house-keeping gene)?
      • Figure 4D,E: Information on cells used is missing. Why the change in stimulation time? Did it not work after 12h? Then the experiments in A-C should be repeated for 16h.
      • Figure 4E: It seems the isotype control itself has already an effect in the reduction of IFNb.
      • Figure 4E: It would be helpful to see if these transcripts are actually translated into protein levels, e.g. perform an ELISA. The authors state that IFNAR blockade does not alter the expression, but your statistics say otherwise.
      • Fig. 4F: To what does the fold induction refer? If it is again to unstimulated cells, then why is the induction now so much higher than in (E), where it was only 50x (now 100x)?
      • Figure 4G: Again, to what does the fold induction refer? It seems your Fer-1 treatment only contains 2 data points. This needs to be fixed.
      • Fig. 4H: It seems that the Isotype control antibody had an effect to increase 4-HNE (compared to TNF stimulated only). Was the AB added also at 12h post stimulation? Figure legend should be adjusted.
      • Figure 4I: How was the data measured here, i.e. what is depicted? The isotype control is missing. It seems a two-way ANOVA was used, yet it is stated differently. The figure legend should be revised, as Dunnett's multiple comparison would only check for significances compared to the control.
      • "These data suggest that type I IFN signaling does not initiate LPO in our model but maintains and amplifies it during prolonged TNF stimulation that, eventually, may lead to cell death". Data for a short term TNF stimulation are not shown, however, so it might impact also on the initiation of LPO.
      • The data for Ifnb expression (or better protein level) should be provided for B6 BMDMs as well.
      • "A select set of mouse LTR-containing endogenous retroviruses (ERV's) (Jayewickreme et al, 2021), and non-retroviral LINE L1 elements were expressed at a basal level before and after TNF stimulation, but their levels in the B6.Sst1S BMDMs were similar to or lower than those seen in B6". This sentence should be revised as the differences between B6 and B6.Sst1S BMDMs seem small and are not there after 48h anymore. Are these mild changes really caused by the mutation or could they result from different housing conditions and/or slowly genetically diverging lines? How many mice were used for the analysis? Is there already heterogeneity between mice from the same line?
      • The overall conclusion drawn from Fig. 3 and 4 is not really clear with regard that IFN does not initiate LPO. Where is that shown? Data on earlier stimulation time points should be added to make this clear.
      • Fig. 5A: Indeed, it even seems that Myc is upregulated for the mutant BMDMs. Yet, there are only 2 data points for B6 12h. !These experiments need repetitions and quantification and statistics!
      • Fig. 5B: Why would the protein level decrease in the controls over 6h of additional cultivation? Is this caused by fresh M-CSF? In this case maybe cells should be left to settle for one day before stimulating them to properly compare c-Myc induction. Comment on two c-Myc bands is needed. At 12h only the upper one seems increased for TNF stimulated mutant BMDMs compared to B6 BMDMs
      • Fig. 5A,B: It seems that not all the RNA is translated into protein, as c-Myc at 12h in the mutant BMDMs seems to be lower than at 6h, while the gene expression implicates it vice versa.
      • Fig. 5J: Indeed the inhibitor seems to cause the downregulation of the proteins. Explanation?
      • "TNF stimulation tended to reduce the LPO accumulation in the B6 macrophages and to increase it in the B6.Sst1S ones" However, this is not apparent in Sup. Fig. 6B. Here it seems that there might be a significant increase.
      • Fig. 6B: Mtb and 4-HNE should be shown in two different channels in order to really assign each staining correctly. What time point is this? Are the mycobacteria cleared at MOI1, since it looks that there are fewer than that? How does this look like for the B6 BMDMs? Are there even less mycobacteria?
      • Fig 6E: In the context of survival a viability staining needs to be included, as well as the data from day 0. Then it needs to be analyzed whether cell numbers remain the same from D0 or if there is a change.
      • "The 3D imaging demonstrated that YFP-positive cells were restricted to the lesions, but did not strictly co-localize with intracellular Mtb, i.e. the Ifnb promoter activity was triggered by inflammatory stimuli, but not by the direct recognition of intracellular bacteria. We validated the IFNb reporter findings using in situ hybridization with the Ifnb probe, as well as anti-GFP antibody staining (Suppl.Fig.8B - E)." The colocalization is not present within the tissue sections. It seems that the reporter line does not show the same staining pattern in vivo as the IFNβ probe or the anti-GFP antibody staining. The reporter line has to be tested for the specificity of the staining. Furthermore, to state that it was restricted to the lesions, an uninvolved tissue area needs to be depicted.
      • Are paucibacillary and multibacillary lesions different within the same animal or does one animal have one lesion phenotype? If that is the case, what is causing the differences between mice? Bacterial counts for the mice are required.
      • "Among the IFN-inducible genes upregulated in paucibacillary lesions were Ifi44l, a recently described negative regulator of IFN-I that enhances control of Mtb in human macrophages (DeDiego et al, 2019; Jiang et al, 2021) and Ciita, a regulator of MHC class II inducible by IFNy, but not IFN-I (Suppl.Table 8 and Suppl.Fig.10 D-E)." Why is Sup. Fig. 10 D, E referred to? The figure legend is also not clear, e.g. what means "upregulated in a subset of IFN-inducible genes"? Input for the hallmarks needs to be defined.
      • Fig. 7C: Single channel pictures are required as it is hard to see the differences in staining with so many markers. Why is there no iNOS expression in the bottom row? What does the rectangle indicate on the bottom right? As black is chosen for DAPI, it is not visible at all. If the signal is needed, a visible color should be chosen.
      • "In the advanced lesions these markers were primarily expressed by activated macrophages (Iba1+) expressing iNOS and/or Ifny (YFP+)(Fig.7D)" Iba1 is needed in the quantification. Based on the images, iNOS seems to be highly produced in Iba1 negative cells. Which cells do produce it then? Flow cytometry data for this quantification are required. This would allow you to specifically check which cells express the markers and allow for a more precise analysis of double positive cells.
      • Results part 6: In general, can you please state for each experiment at what time point mice were analyzed? You should include an additional macrophage staining (e.g. MerTK, F4/80), as alveolar macrophages are not staining well for Iba1 and you might therefore miss them in your IF microscopy. It would be very nice if you could perform flow cytometry to really check on the macrophages during infection and distinguish subsets (e.g. alveolar macrophages, interstitial macrophages, monocytes)
      • Spatial sequencing: The manuscript would highly profit from more data on that. It would be very interesting to check for the DEGs and show differential spatial distribution. Expression of marker genes should be inferred to further define macrophage subsets (e.g. alveolar macrophages, interstitial macrophages, recruited macrophages) and see if these subsets behave differently within the same lesion but also between the lesions. Additional bioinformatic approaches might allow you to investigate cell-cell interactions. There is a lot of potential with such a dataset, especially from TB lesions, that would elevate your findings and prove interesting to the TB field.
      • "Thus, progression from the Mtb-controlling paucibacillary to non-controlling multibacillary TB lesions in the lungs of TB susceptible mice was mechanistically linked with a pathological state of macrophage activation characterized by escalating stress (as evidenced by the upregulation phospho-cJUN, PKR and Chac1), the upregulation of IFNβ and the IFN-I pathway hyperactivity, with a concurrent reduction of IFNγ responses." To really show the upregulation within macrophages and their activation, a more detailed IF microscopy with the inclusion of additional macrophage markers needs to be provided. Flow cytometry would enable analysis for the differences between alveolar and interstitial macrophages, as well as for monocytes. However, as it seems that the majority of iNOS, as well as the stress-associated markers, are not produced by Iba1+ cells, analyzing granulocytes and T lymphocytes should be considered.
      • It's mentioned in the method section that controls in the IF staining were only fixed for 10min, while the infected cells were fixed for 30min. Consistency is important as the PFA fixation might impact on the fluorescence signal. Therefore, controls should be repeated with the same fixation time.
      • Reactive oxygen species levels should be determined in B6 and B6.Sst1S BMDMs (stimulated and unstimulated), as they are very important for oxidative stress.
      • Sup. Fig 2C: The inclusion of an unstimulated control would be advisable in order to evaluate if there are already differences in the beginning.
      • Sup. Fig. 3F: Why is the fold change now lower than in Fig. 4D (fold change of around 28 compared to 120 in 4D)?
      • Sup. Fig. 5C, D: The data seems very interesting as you even observe an increase in gene expression. Data for the B6 mice should be evaluated for increase to a similar level as the TNF treated mutants. Data on the viability of the cells are necessary, as they no longer receive M-CSF and might be dying at this point already.
      • Sup. Fig 12: the P-c-Jun picture for (P) is not the same as in the merged one with Iba1. Double positive cells are mentioned to be analyzed, but from the staining it appears that P-c-Jun is expressed by other cells. You do not indicate how many replicates were counted and if the P and M lesions were evaluated within the same animal. What does the error bar indicate? It seems unlikely from the plots that the double positive cells are significant. Please provide the p values and statistical analysis.
      • Sup. Fig. 13D: What about the expression of MYC itself? Other parts of the signaling pathway should be analyzed(e.g. IFNb, JNK)?
      • In the mfIHC, the usage of anti-mouse antibodies is mentioned. Pictures of sections incubated with the secondary antibody alone are required to exclude the possibility that the staining is not specific, especially as these data are essential to the manuscript and mouse-anti-mouse antibodies are notorious for background noise.
      • In order to tie the story together, it would be interesting to treat infected mice with an IFNAR antibody, as well as perform this experiment with a Myc antibody. According to your data, you might expect the survival of the mice to be increased or bacterial loads to be affected.
      • It is surprising that you not even once cite or mention your previous study on bioRxiv considering the similarity of the results and topic (https://doi.org/10.1101/2020.12.14.422743). Is not even your Figure 1I and Figure 2 J, K the same as in that study depicted in Figure 4?
      • Please revise spelling of the manuscript and pay attention to write gene names in italics

      Minor points:

      • Fig. 1: Please provide some DEGs that explain why you used this resolution for the clustering of the scRNAseq data and that these clusters are truly distinct from each other.
      • Fig. 1F: What do the two lines represent (magenta, green)?
      • Fig. 1F, G: Why was cluster 6 excluded?
      • Fig. 1E, G, H: The intensity scales are missing. They are vital to understand the data.
      • Fig. 2G-I: please revise order, as you first refer to Fig. 2H and I
      • Fig. 5: You say the data represents three samples but at least in D and E you have more. Please revise. Why do you only include at (G) the inhibitor only control?
      • Figure 7A, Sup. Fig. 8: Are these maximum intensity projections? Or is one z-level from the 3D stack depicted?
      • Fig. 7B: What do the white boxes indicate?
      • Sup. Fig. 1A: The legend for the staining is missing
      • Sup. Fig. 1B: The feature plots are not clear: The legend for the expression levels is missing. What does the heading mean?
      • Sup. Fig. 3C: The scale bar is barely visible.
      • Sup. Fig. 3D: There is no figure legend or the legend to C-E is wrong.
      • Sup. Fig. 3F, G: You do not state to what the data is relative to.
      • Sup. Fig. 3H: It seems you used a two-way ANOVA, yet state it differently. Please revise the figure legend, as Dunnett's multiple comparison would only check for significances compared to the control.
      • Sup. Fig. 4A, B: It is not clear what the lines depict as the legend is not explained. Names that are not required should be changed to make it clear what is depicted (e.g. "TE@" what does this refer to?)
      • Sup. 4B: What does the y-scale on the right refer to?
      • Sup. 4C: Interpretation of the data is highly hindered by the fact that the scales differ between the B6 and B6.Sst1. The scales are barely visible.
      • Sup. Fig. 5A, B: Is the legend correct? Did you add the antibody for 2 days or is the quantification from day 3?
      • Sup. Fig. 8A: Are the "early" and "intermediate" lesions from the same time points? What are the definitions for these stages?
      • Sup. Fig. 8E: You should state that the bottom picture is an enlargement of an area in the top one. Scale bars are missing.
      • Sup. Fig. 11A: The IF staining is only visible for Iba and iNOS. Please provide single channels in order to make the other staining visible.
      • Sup. Fig. 13A: Your axis label is not clear. What do the numbers behind the genes indicate? Why did you choose oncogene signatures and not inflammatory markers to check for a correlation with disease outcome?
      • Sup. 13D: Maybe you could reorder the patients, so that the impression is clearer, as right now only the top genes seem to show a diverging gene signature, while the rest gives the impression of an equal distribution.

      • The scale bars for many microscopy pictures are missing.

      • The black bar plots should be changed (e.g. in color), since the single data points cannot be seen otherwise.
      • A consistent color scheme should be used throughout the manuscript to make it easier to identify similar conditions; the many different colours are not required and currently cause confusion (e.g. sometimes a black bar refers to BMDMs with and sometimes without TNF stimulation, or B6 BMDMs). Furthermore, plot sizes and fonts should be consistent within the manuscript (including the supplemental data).

      Within the methods section:

      • At which concentration did you use the IFNAR antibody and the isotype?
      • Were mice maintained under SPF conditions? At what age were they used?
      • The BMDM cultivation is not clear. According to your cited paper you use LCCM but can you provide how much M-CSF it contains? How do you make sure that amounts are the same between experiments and do not vary? You do not mention how you actually obtain this conditioned medium. Is there the possibility of contamination or transferred fibroblasts that would impact on the data analysis? Is LCCM also added during stimulation and inhibitor treatment?
      • How was the BCG infection performed? How much bacteria did you use? Which BCG strain was used?
      • At what density did you seed the BMDMs for stimulation and inhibitor experiments?
      • What machine did you use to perform the bulk RNA sequencing? How many replicates did you include for the sequencing?
      • How many replicates were used for the scRNA sequencing? Why is your threshold for the exclusion of mitochondrial DNA so high? A typical threshold of less than 5% has been reported to work well with mouse tissue.
      • You do not mention how many PCAs were considered for the scRNA sequencing analysis.
      • You should name all the package versions you used for the scRNA sequencing (e.g. for the slingshot, VAM package)
      • You mention two batches for the human samples. Can you specify what the two batches are?
      • At which temperature was the IF staining performed?

      Significance

      Overall, the manuscript has interesting findings with regard to macrophage responses in Mycobacterium tuberculosis infection.

      However, in its current form there are several shortcomings, both with respect to the precision of the experiments and conclusions drawn.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We thank the editor for handling our manuscript and the reviewers for their constructive critiques. We are deeply convinced that the reviewers’ suggestions have substantially raised the quality and possible impact of our manuscript. We would also like to thank the reviewers for their judgements that the subject of our manuscript is biologically and clinically significant and of high importance, and that our manuscript might help to increase focus and visibility for affected individuals.

      New text passages in the manuscript are colored in red. Below is a point-by-point response to the reviewers’ comments.

      2. Point-by-point description of the revisions

      Response to reviewer 1 comments

      Major comments


      Point 1-1

      The authors performed qRT-PCR validation for markers of differentiation and hypoxia, with a major absence of VEGF and HIF1a. The paper would be strengthened by mention of these factors, especially by qRT-PCR or Western blot.

      We thank the reviewer for the suggestion to include the bona fide hypoxia markers Vegfa and Hif1-alpha. We followed the suggestion and performed qRT-PCR on Vegfa transcripts at each tested condition (Figs. 1A,2A,3A,4A,5A,5D,5I,5N). As Hif1α is regulated at the protein level rather than at the transcript level, we followed the advice to perform Western blots. We analyzed Hif1α protein levels in proliferating cells and quantified them by normalization to actin (Figs. 1B,C and 5 B,C).

      Point 1-2

      Please provide justification of selection 0.5% as their hypoxic condition or perhaps repeat experiments in a less extreme environment to see if their conclusions still hold true.

      We admit that our approach to use 0.5% hypoxia was a drastic challenge for the cells. It should be noted, however, that physiologic oxygen levels during pregnancy at times drop to lower than 1% (Hansen et al, 2020; Ng et al, 2017). In the first place, we had used oxygen levels lower than this, because we had wanted to ensure that we can detect responses by bulk RNA-seq with a limited number of samples. As we had many conditions to compare, we did not want to use more than 3-4 samples per condition. The fact that the cells showed normal proliferation underscores the fact that 0.5% O2 per se was not so low that it would be overly stressful to the cells.

      Nevertheless, we are very grateful to the reviewer for the suggestion to include a milder hypoxic condition. We chose 2% O2, because this equals the physiological oxygen concentration shortly before the onset of cranial neural crest cell (CNCC) differentiation. We could recapitulate the phenomenon of impaired differentiation to chondrocytes, osteoblasts and smooth muscle cells at these mild hypoxic conditions, as shown by qRT-PCR and immunofluorescence of typical markers (Figs. 5D-R). Moreover, the differentiation-specific induction of the two central hypoxia-attenuated risk genes associated with orofacial clefts that we had identified by our bioinformatic analyses at 0.5% O2 (Boc and Cdo1), was still observable at 2% O2 (Figs. EV6C,D). Interestingly, in some rare cases, the attenuation of induction was lost or not as drastic as in 0.5% O2.

      We are convinced that the experiments at 2% O2 strongly increased the relevance of our manuscript, because we thus detected that oxygen levels prevailing shortly before the onset of CNCC differentiation still can influence their differentiation. This leads to the conclusion that only slight decreases of intra-uterine oxygen levels indeed might interfere with correct differentiation of CNCC.

      Point 1-3

      Standard immunohistochemistry or histology of differentiated cells would strengthen the authors' claims of reduced differentiation under hypoxic conditions, e.g., Alcian blue, alk-phos or Alizarin red, and smooth muscle actin or other indicator.

      We are grateful to the reviewer for the suggestion to include stainings of cells, as these stainings visualized the drastic effects of hypoxia on the cells. We performed immunofluorescent stainings against at least one marker protein for each differentiation paradigm. At 0.5% O2, the protein signals were nearly completely absent in each case and cell morphology was disrupted (Figs. 2E,F, 3E, 4E). At 2% O2, we detected some more protein deposition than at 0.5%. Importantly, cells had retained their normal shape at mild hypoxia (Figs. 5H,M,R, EV5A).

      Point 1-4

      The authors identify a few genes that appear down-regulated in all three differentiation conditions. If it is within the scope of the study, it would strengthen the claim of these genes' function to show the effect of knock-down or knock-out for validation.

      We thank the reviewer for the suggestion of gene knock-down or knock-out in order to prove functional relevance of our findings. As this would have been too much effort and beyond the scope of our study, we instead followed the suggestion of reviewer 2 (cf. points 2-6, and 2-8) that pointed in the same direction: we mined publicly available sequence data on orofacial development for gene expression or marks of active enhancers. We found robust expression of the two central hypoxia-attenuated OFC risk genes Boc and Cdo1 during human craniofacial development (Fig. 7A) and we identified enhancers that are active in embryonic craniofacial mouse tissue (Fig. 7B). Moreover, we detected expression of both genes during murine craniofacial development in undifferentiated mesenchymal cells, osteoblasts, chondrocytes and smooth muscle cells with the help of a single cell RNA-seq dataset (Figs. 7C-E, EV6B).

      Thus, we found evidence for the in vivo relevance of Boc and Cdo1 and could rule out a possible important role of Actg2, the third gene we had identified. We therefore are grateful for the suggestion to circumvent gene knockouts by reviewer 2, as we think these data strongly emphasized the importance of our findings.

      Point 1-5

      Another major critique lies in the initial claim that proliferation of O9-1 cells is not significantly impacted by hypoxia. In figures 1E-H, photograms of the cells cultured 24 -72 hours and quantifications of live vs dead cells are shown as evidence for this argument. However, the increased density of cells in normoxic conditions may be a confounding variable in this assay. It would be interesting for the researchers to assess the percent of dead vs alive cells between normoxic and hypoxic conditions when the plates reach equivalent densities.

      We apologize for the use of image sections from photographs with different cell densities. Of course, as demonstrated by our quantification, cell densities between 0.5% and 21% O2 in total were equal (cf. Figs. 1D,E). We therefore replaced the formerly used sections with new image sections with equal cell numbers.

      We thank the reviewer for the suggestion to examine if cell numbers influence cell death rates. We followed this advice by several approaches: first, we seeded cells at different densities, incubated them for 72 h (the same time span where a minimal difference had been detected) and performed live/dead stainings (Fig. EV1B). The seeding density did not affect percentages of dead cells and the values were in the same range as in our initial experiment (Fig. 1J). Moreover, we performed TUNEL stainings of apoptotic cells at different time points to have an additional readout of cell death (Figs. 1K,L). As expected, the percentages of TUNEL-positive cells were identical between hypoxic and normoxic cells at all analyzed time points.

      We therefore concluded that hypoxia does not influence the rate of cell death of proliferating CNCC and accordingly specified our wording in the results section.

      Point 1-6

      At end of Fig 1 section authors attempt to tie phenotypes observed in a cell line in vitro to the complex biological processes. They are not comparable and in vivo models would be better suited for these types of comparisons.

      We apologize for the overconfident wording in our manuscript. Of course, our in vitro experiments cannot fully simulate the complex developmental processes taking place in vivo. We therefore changed the text to a more careful formulation. Moreover, we kept the wording in the discussion section that we cannot exclude that in the in vivo situation proliferation of CNCC is also affected by low oxygen levels because nutrients might not be available in such excess as they are in cell culture.


      Point 1-7

      Fig 2: if qRT-PCR did not show statistically different results between experimental and control groups why move on to bulk RNA seq?

      We apologize that the sentence about statistical significance was misleading. What we wanted to express is that there was only a little difference (if any at all) between differentiated cells at 0.5% O2 and proliferating cells at 0.5% O2 or 21% O2. For the sake of clarity and readability, we deleted this misleading sentence.

      Point 1-8

      Fig 5: hypoxia this intense is going to affect broad range of biological processes and genes. Finding a few genes that are affected in extreme hypoxia that are also risk genes is highly unlikely. How can the authors be assured that these overlaps are actually significant and not just by chance?

      We thank the reviewer for the suggestion to test for statistical significance. We tested significance of the overlap of respective gene sets (nsOFC vs. hyp-a; OFC vs. hyp-a) by Fisher’s exact test. We included Venn diagrams depicting the overlap and present the exact p-values (Figs. EV5C,D). In each case where overlap of genes occurred, p-values indicated significance.
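
      For readers unfamiliar with this kind of overlap test, the following is a minimal sketch of a one-sided Fisher's exact test for the overlap of two gene sets, computed as the upper tail of the hypergeometric distribution. It is not the authors' code: the function name, the choice of a one-sided test, and all numbers in the usage example (background size, set sizes, overlap) are hypothetical placeholders, since the response does not state which background or test variant was used.

```python
from math import comb


def overlap_p_value(universe: int, set_a: int, set_b: int, overlap: int) -> float:
    """One-sided Fisher's exact test for gene-set overlap.

    Returns P(X >= overlap), where X is the number of genes shared by two
    sets of sizes set_a and set_b drawn from a background of `universe`
    genes (hypergeometric upper tail).
    """
    if max(set_a, set_b) > universe or not 0 <= overlap <= min(set_a, set_b):
        raise ValueError("inconsistent set sizes")
    # Number of ways to choose set_b genes from the background.
    total = comb(universe, set_b)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, min(set_a, set_b) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only (not taken from the manuscript):
    # 20000 background genes, 300 risk genes, 1500 hypoxia-attenuated genes, 40 shared.
    print(overlap_p_value(20000, 300, 1500, 40))
```

      The result depends strongly on the chosen background (for example, all annotated genes versus only genes expressed in the cell type), so the background should be reported alongside the p-value.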

      Point 1-9

      Would appreciate discussion on how examination of neural crest is relevant for OFC, as most animal models of OFC demonstrate the pathogenesis in embryonic epithelium or periderm, not in the neural crest. Defects in neural crest are associated with other congenital craniofacial anomalies such as craniosynostosis or complex (Tessier) clefts, not the typical orofacial cleft. Please revise rationale of study, interpretation of data and Discussion to specifically state how neural crest cells are involved in the pathogenesis of orofacial cleft.

      We apologize for not pointing out enough the role of epithelial cells in the emergence of orofacial clefts. We revised our introduction, results and discussion sections in this regard and emphasized the role of epithelial cells. Importantly, we addressed the possible influence of the results gained in CNCC on epithelial cells by analyzing scRNA-seq data with the algorithm CellChat, as suggested by reviewer 2 (cf. point 2-8). We detected several cell communication pathways from CNCC to epithelial cells which contain components that are misexpressed upon hypoxia in our dataset (Figs. 7F-I). Therefore, during hypoxia, these pathways might influence epithelial cells and therefore indirectly cause orofacial clefts. We outlined this possible interplay in the discussion and briefly mentioned it in the abstract.

      We have not discussed more strongly the role of CNCC in the emergence of OFC in the revised manuscript, because we did not want to put even more emphasis on this matter. Numerous studies have proven the contribution of cranial neural crest tissue to the emergence of orofacial clefts. This fact is also pointed out in several review articles about orofacial clefts. In most cases, this knowledge was achieved by mouse models, because tissue-specific conditional knockouts are feasible (in contrast to genetic studies on patients), usually via deletion with the Wnt1-Cre driver. Funato et al. give an excellent (but quite old) overview of mouse models in which the neural crest-specific knockout of a gene leads to emergence of OFC and list 17 genes for which this is the case (Funato et al, 2015). Moreover, several recent studies also report on the emergence of orofacial clefts upon neural crest-specific deletion (Forman et al, 2024; Li et al, 2025). These include genes responsible for DNA methylation (Ulschmid et al, 2024), and a study on subunits of chromatin remodeling complexes that are necessary for correct transcription of their target genes, which was conducted by our group (Gehlen-Breitbach et al, 2023).

      Minor comments

      Point 1-10

      The author should replace "Final proof" in the introduction with "further evidence supporting."

      We apologize for the incorrect wording. Of course, it is highly questionable if there is such a thing as final proof in life sciences. We re-phrased the text according to the reviewer’s suggestion.

      Point 1-11

      Authors are inconsistent when referring to Figures- sometimes they capitalize (i.e. 1J) and other times they leave lower case (i.e. 1i). Needs to be consistent throughout. Figures are not numbered.

      We apologize for the inconsistency. We corrected the references to figures. Moreover, we apologize for the missing figure numbers. We also corrected this and included figure numbers.

      Point 1-12

      In figures authors would sometimes list 21% O2 first then 0.5% O2 or vice versa. (i.e. Fig on page 21 panels I, J, K). Needs to be consistent.

      We again apologize for being inconsistent. We corrected the inconsistency in Fig. 1D. Now, 21% O2 is presented before/above 0.5% O2.

      Point 1-13

      Figures on pages 28, 29, 30 panel J and page 31 panel F: there is no legend on what the scale/measurement is for the difference in expression level other than it ranges from -1 to +3.

      We thank the reviewer for the hint. We are aware that the heatmaps we used do not allow relative expression levels of different genes to be inferred. Had we plotted the absolute expression strength of single genes, many of the gene-specific differences in expression between conditions would have been hard to detect, because the presentation would have been dominated by the differences in expression levels between genes. We therefore plotted gene-wise scaled expression.

      We included an explanation of the procedure in the materials and methods section.
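
      To illustrate why gene-wise scaling makes condition-dependent differences visible regardless of absolute expression level, here is a minimal sketch that assumes scaling means a per-gene z-score across samples. The exact procedure used for the heatmaps is described in the materials and methods. The gene names and values below are hypothetical.

```python
from statistics import mean, stdev


def scale_gene_wise(expression):
    """Scale each gene to mean 0 and standard deviation 1 across samples.

    expression: dict mapping gene name -> list of expression values,
                one value per sample (same sample order for every gene).
    Genes with no variation across samples are returned as all zeros.
    """
    scaled = {}
    for gene, values in expression.items():
        mu = mean(values)
        sd = stdev(values)
        if sd == 0:
            scaled[gene] = [0.0 for _ in values]
        else:
            scaled[gene] = [(v - mu) / sd for v in values]
    return scaled


if __name__ == "__main__":
    # Hypothetical values: two genes with very different absolute levels
    # but the same relative pattern across three samples.
    example = {
        "low_gene": [1.0, 2.0, 3.0],
        "high_gene": [1000.0, 2000.0, 3000.0],
    }
    for gene, row in scale_gene_wise(example).items():
        print(gene, row)
```

      After scaling, both hypothetical genes show the same pattern. This is why a single color scale can be shared across genes, and also why such a heatmap does not allow comparing expression strength between genes.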

      Point 1-14

      Will the authors please comment on the one normoxic sample in Figure 1I that did not cluster with the others? Did this meet the standards to merit exclusion as an outlier?

      We regret that the default scale of our plot of the principal component analysis is a bit misleading. This is the case because the x-axis accounts for 80.3% of the variance and the y-axis accounts for only 6.1%. Therefore, the sample that might appear to be an outlier actually met our standards. Nevertheless, we decided to keep the default scaling as is, in order not to embellish the graph (Fig. 1M).
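
      The effect of the default scaling can be estimated with a short calculation. This is a sketch under the assumption of a roughly square panel in which each axis is stretched to its own data range. Because the spread of samples along a principal component is proportional to the square root of its explained variance, the ratio below approximates how much vertical distances are visually exaggerated relative to horizontal ones. The function name is hypothetical.

```python
from math import sqrt


def equal_scale_aspect(var_x_percent, var_y_percent):
    """Width-to-height ratio a PCA plot would need for equal axis units.

    The standard deviation of the scores along a component scales with the
    square root of the variance it explains, so the ratio of the two spreads
    is sqrt(var_x / var_y).
    """
    return sqrt(var_x_percent / var_y_percent)


if __name__ == "__main__":
    # Explained variance values quoted in the response above.
    ratio = equal_scale_aspect(80.3, 6.1)
    print(f"PC1 spread is about {ratio:.1f} times the PC2 spread")
```

      Under these assumptions, a separation that looks large along the y-axis corresponds to a much smaller distance than the same visual separation along the x-axis.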

      Point 1-15

      The authors refer to DEG as deregulated genes; while not strictly incorrect, the more standard usage is "differentially expressed genes." Please address.

      We apologize for the incorrect explanation of the acronym. Of course, this was corrected in the revised manuscript.

      Significance

      This work on neural crest cells and hypoxia is biologically and clinically significant.

      We are deeply grateful to the reviewer for considering our manuscript significant for both biologists and clinicians. We are convinced that the additional data we gathered in the course of the revision has significantly increased the importance of our work. Therefore, we once again express our gratitude to the reviewer for the valuable suggestions.

      Response to reviewer 2 comments

      Major comments


      Point 2-1

      The conclusions drawn from the experimental data are carefully formulated for the most part. One of the main concerns is that the cells were subjected to extreme hypoxic conditions, while it may be more biologically relevant to include a condition representing more mild hypoxia (e.g. 10%).

      Please refer to the response to point 1-2.

      Point 2-2

      One of the opening claims regarding severe hypoxia only mildly affecting cell proliferation is not shown clearly, since no mitotic markers have been analyzed (i.e. KI67 or PCNA staining or a simple EdU incorporation assay). Thus, the claim that they assessed cell proliferation is not very convincing, even though cell death was analyzed.

      We appreciate the reviewer’s suggestion to include a more thorough analysis of proliferation rates. We followed the advice and performed immunofluorescent stainings against Ki67 (accounting for cells in a proliferative state) and phospho-histone H3 (pHH3, accounting for cells undergoing mitosis). We performed this assay at different time points of culture in order to address the question of whether cell density might influence proliferation rates (Figs. 1F-H). No difference between 21% and 0.5% O2 was detected for either Ki67 or pHH3.

      We are convinced that these analyses strengthened our initial findings and provide strong evidence that hypoxia does not influence proliferation rates of CNCC.

      Point 2-3

      Additionally, cellular morphology of the cells could be assessed (brightfield images), since previous studies observed that hypoxia can be an inducive factor in cranial neural crest and driving EMT (Scully et al. 2016; Barriga et al. 2013).


      We thank the reviewer for the hint and followed the advice. We analyzed cellular morphology by the parameters cell length, total number of pseudopodia, number of filopodia and number of lobopodia (Figs. EV1C-F). As outlined in the results section, we did not detect a difference in these parameters between 21% and 0.5% O2.

      We included the second reference mentioned by the reviewer (Barriga et al, 2013) additionally to Scully et al. 2016 that had already been cited.

      Point 2-4

      Furthermore, in the RNA seq analysis of chondrogenic fate biased cells the authors draw a conclusion based on the proximity of the samples on the PCA plot, which is not very convincing. More careful analysis of the bulk RNA seq data sets they have generated for key marker genes will be more convincing (for example, a heatmap with selected genes would be a helpful representation).

      We apologize for the rash and inaccurate conclusion based on proximity on PCA plots. We are grateful to the reviewer for the suggestion to include heatmaps with selected marker genes. Following this advice, we generated heatmaps on our bulk RNA-seq data with the GO terms specific for each differentiation paradigm (Figs. EV2F, EV3F, EV4F).

      We are convinced that these maps are perfect additions to the heatmaps of the 200 top differentially-expressed genes that already had been included in the manuscript (Figs. 2K, 3J, 4J) and helped to strengthen our findings. For chondrocytes and smooth muscle cells, the new, GO-specific heatmaps perfectly recapitulated the phenomenon of hypoxia-attenuated induction. Interestingly, for osteoblasts, about half of the induced genes were hypoxia-attenuated, while the other half was induced stronger than under normoxia. This pointed to gene-specific mechanisms of hypoxia-dependent attenuation of transcription. Moreover, it shed light on a hypoxia-evoked complete dysregulation of transcriptional induction in osteoblasts, as nearly none of the genes was induced similar to normoxia.



      Point 2-5

      As mentioned above, a straight-forward and not time consuming experiment (given that it was assessed for a maximum of 72 hrs) would be to repeat the culture of NCCs and stain for mitotic markers, and quantify the number of positively stained cells over total cell numbers. Furthermore, it is not that demanding to add an experimental condition of less severe hypoxia in this assay.

      We thank the reviewer for the suggestion and followed the advice (cf. point 2-2). The conducted experiments strengthened our results, because the initially detected slight tendency to lower cell numbers at 0.5% O2 could thus be ruled out: We did not detect any difference for Ki67 and pHH3 between 0.5% and 21% O2 at any analyzed time point (Figs. 1F-H). Moreover, percentages of dead or apoptotic cells at 0.5% O2 did not vary from 21% (Figs. 1I-L, EV1B). As we could not detect any difference in proliferation between 21% and 0.5% O2, we skipped the analysis of proliferating cells at 2% O2.

      Point 2-6

      Without underestimating how time consuming this would be, a major lack of experimental validation of the key genes they identify as important across all conditions may be the limitation of the study (this would be the difference between correlation and a probable underlying mechanism). This can be circumvented by more extensive reference to in situ data sets from mouse or existing data sets of single cell and spatial transcriptomics. A suggested targeted knock-down (for example with siRNA, shRNA or CRISPR) to validate a few of the key genes revealed as important could take a few months, with an estimated cost up to 5,000 euros per targeted gene and replicate.

      We thank the reviewer for the notion that targeted knockdowns are beyond the scope of our manuscript. We are deeply grateful for the reviewer’s constructive criticism and for the suggestion to analyze publicly available data sets in order to gather data depicting in vivo relevance of our identified central hypoxia-attenuated OFC risk genes Boc, Cdo1 and Actg2 (cf. point 1-4). We detected robust expression of Boc and Cdo1 during human craniofacial development (Fig. 7A) and we identified enhancers that are active in embryonic craniofacial mouse tissue (Fig. 7B). Moreover, we detected expression of both genes during murine craniofacial development in undifferentiated mesenchymal cells, osteoblasts, chondrocytes and smooth muscle cells by reanalysis of a scRNA-seq dataset (Figs. 7C-E, EV6B). This data comprised scRNA-seq of mouse embryonic maxillary prominence at stages E11.5 and E14.5 (Sun et al, 2023).

      Thus, we found evidence for the in vivo relevance of Boc and Cdo1 and could rule out a possible important role of Actg2, the third gene we had identified. We therefore are deeply grateful for the suggestion, as we think these data strongly emphasize the importance of our findings.

      Point 2-7

      On methods, replicates and statistics: The experimental methods and approach are described efficiently and seem reproducible. All biological and technical replicates are of a minimum of N=3 from independent experiments and statistical tests have been run in all cases.


      We thank the reviewer for the appreciation of our methodology, descriptions and statistical analyses.

      Minor points

      Point 2-8

      One of the key implications of NCCs in palate formation is interaction with orofacial epithelial cells, which the authors also mention. It may be interesting to check if any signaling pathways involved in this crosstalk are affected under hypoxic conditions in their existing data sets of bulk RNA SEQ. This can be done by using available algorithms such as CellChat (Jin et al. 2021; Jin, Plikus, and Nie 2023), which has been reported to work also in bulk RNA seq data analysis (according to GitHub). The authors could mine the literature for existing RNA sequencing data that include osteoblasts, chondrocytes and epithelial cells (Ozekin, O'Rourke, and Bates 2023; Piña et al. 2023).

      We are very grateful to the reviewer for this suggestion. Moreover, we would like to thank the reviewer for mentioning exemplary references. We followed the advice using the methodology outlined in the results and materials and methods sections: we applied the CellChat algorithm on a scRNA-seq dataset (Pina et al, 2023; Sun et al., 2023) to identify pathways containing components that are hypoxia-attenuated (and associated with a risk for OFC) in our bulk RNA-seq dataset (Figs. 7F-I). We did not use the datasets the reviewer had suggested, because the data were not available to us or the file format was not well-suited for the analysis with CellChat. Importantly, the dataset from Sun et al. has the following advantages over the suggested references: the complete maxillary prominence was used (instead of palatal shelves only), and different time points were included. Thus, we were able to follow the expression of genes of interest at different developmental stages before the onset of differentiation and after (Figs. 7C-E and EV6B). By our approach, we identified several OFC-related pathways that contain hypoxia-attenuated components such as BMP and FGF signaling and deposition of collagen and fibronectin (Figs. 7F-I). Importantly, the named pathways (and others) send outgoing communication patterns to epithelial cells. Therefore, hypoxia-attenuated gene induction in CNCC could influence epithelial cells via these pathways.

      We believe that the use of the CellChat algorithm has brought a deeper understanding of how hypoxia can have indirect consequences on the important topic of epithelial cells and thus could also evoke OFC. We therefore once again would like to express our gratitude to the reviewer.

      Point 2-9

      Additionally, another process that may be affected is EMT (epithelial-to-mesenchymal-transition) and is possible to assess by re-analysis of bulk RNA-seq data while focusing on key genes implicated in this process (i.e. E-cadherin, vimentin, EpCAM, Snail, Twist, PRRX1).

      We thank the reviewer for the advice. We followed the advice and analyzed cellular morphology by the parameters cell length, total number of pseudopodia, number of filopodia and number of lobopodia (Figs. EV1C-F) (cf. point 2-3). As we did not detect any differences between 21% and 0.5% O2, and because the cells we used for our analyses represent mesenchymal cells, i.e. cells that had already undergone EMT, we did not re-analyze our dataset with the focus on EMT.

      Point 2-10

      Lastly, when the authors report on the significantly up- or down-regulated genes, it may be interesting to categorize them by ligands, receptors, intracellular molecules and transcription factors (and use separate plots to visualize them). While a big focus of the manuscript are down-regulated genes, less emphasis was given in upregulated genes (other than the response to hypoxia gene module).

      We thank the reviewer for the advice. Following this advice, we categorized genes according to Panther protein classes "intercellular signal molecule" (PC00207), "transmembrane signal receptor" (PC00197) and "gene-specific transcriptional regulator" (PC00264) and depicted the results with violin plots (Fig. EV5B). We could not analyze intracellular molecules, because this protein class does not exist in the Panther database. We had not focused on the genes with stronger induction in hypoxic condition, because the number of genes was low in each differentiation paradigm (7 in chondrocytes, less than 30 in osteoblasts, none in smooth muscle cells) and the transcriptional changes were mostly not as drastic as for the attenuated genes. In order to achieve a broader overview of deregulated processes, we now included GO term analyses of genes downregulated during the differentiation regimes both at 21% and 0.5% O2 (Figs. EV2D,E, EV3D,E, EV4D,E).

      Point 2-11

      The authors are referencing extensively and accurately existing studies in the field and the manuscript is exceptionally well-written, with only a few points of limited clarity or increased complexity. Such an example is when the authors refer to OFC risk genes, because it is not clearly stated how the referenced studies reached their conclusions (for example, are they mouse studies, do they involve mutants, are any of these studies based on GWAS on human cohorts). This matter would significantly improve the flow of the text and highlight the importance of the study and their findings.

      We would like to thank the reviewer very much for the appreciation of our scientific writing. We apologize for not explaining exactly how our OFC risk gene lists had been curated. We included this information for both non-syndromic and other OFC risk genes at the respective sites in the results section. Moreover, we included the Human Phenotype Ontology terms that had been used in the search in the materials and methods section.

      We thank the reviewer for this suggestion, as we agree that this information significantly highlights the importance of our findings.

      Point 2-12

      The figures could be redesigned to be more intuitive to interpret. For example, using violin plots and heatmaps, as discussed, and including references or re-analysis/re-use of existing spatial transcriptomics and in situs for marker genes.

      In all cases where there is a comparison of gene expression levels, violin plots would be a better representation of up- and down-regulated genes (i.e. selected genes from Fig1K, comparison of gene expression between normoxic and hypoxic NCCs, Fig 2G when analyzing chondrogenesis and the respective analysis for osteoblasts and smooth muscle cells, as well as when comparing the three fate-biasing conditions to identify common genes that are misregulated).

      We thank the reviewer for the advice and for the appreciation of the usage of heatmaps (Figs. 2K, 3J, 4J, 6F). Unfortunately, as the number of biological replicates is only three to four, the visualization of gene expression data from our bulk RNA-seq data with violin plots was not intuitive. We therefore retained the heatmaps rather than choosing bar graphs, because they are much clearer when presenting expression data of several to many genes. We included violin plots whenever possible due to high numbers of data points (Figs. EV1C, EV1D, EV1E, EV1F, EV5B). Moreover, we added additional heatmaps to depict transcriptional changes of genes associated with GO terms with the various differentiation regimes (Figs. EV2F, EV3F, EV4F). Unfortunately, we did not detect the three central hypoxia-attenuated genes in spatial transcriptomics data on craniofacial development. But we used scRNA-seq data of different stages of orofacial mouse tissue where we could identify expression of Boc and Cdo1 (cf. points 1-4 and 2-6). These data helped, together with other in vivo data to gain evidence for the in vivo function of Boc and Cdo1 during CNCC differentiation and helped to dismiss Actg2 as another central player.

      Significance

      Several pieces of evidence have pointed to hypoxia as an environmental factor contributing to congenital orofacial clefts, ranging from studies in mouse to observations in human. The authors are doing an excellent job in putting this information together and the question they are trying to answer is of high importance, given the prevalence of such congenital syndromes.

      We are deeply grateful to the reviewer for the appreciation of our work and for classifying our research topic as highly important.

      In terms of the methods and model employed, there are some limitations, related to the choice of a mouse cell line over one from human, the severe hypoxia induced (over a more mild), and the conditions of directed differentiation not allowing for simultaneous examination of more complex lineage transitions. The methods as a whole are not that up-to-date, given the single cell and multiplexed transcriptomic advances the last couple of decades, advanced bioinformatics that could be used in combination with in vitro lineage tracing methods.

      We thank the reviewer for the honest evaluation of our methods, especially for the constructive suggestions that were given to address our hypotheses with more up-to-date methods and at milder hypoxic conditions. As outlined above, we followed the advice and re-analyzed existing scRNA-seq datasets (cf. points 2-6 and 2-8) and checked our central hypotheses at milder hypoxic conditions (cf. response to point 1-3).

      We are deeply convinced that both significantly increased the biological relevance of our results, because we thus (1) gathered evidence for the in vivo function of Boc and Cdo1 and (2) were able to show that the phenomenon of hypoxia-attenuated gene induction still holds true at biologically relevant hypoxic conditions.

      The audience this work will reach are neural crest experts, developmental biologists, and potentially clinical doctors. The general public outreach of such a paper is also diverse, as more focus and visibility is required for the individuals affected by those syndromes and their families.

      We thank the reviewer for the judgement that our manuscript will not only reach neural crest experts, but also developmental biologists in general and potentially also clinicians. We are very much pleased that the reviewer shares our opinion that affected individuals should be more in the focus of public attention. We like to express our gratitude for the judgement that our manuscript might help to increase focus and visibility for them.

      References


      Barriga EH, Maxwell PH, Reyes AE, Mayor R (2013) The hypoxia factor Hif-1α controls neural crest chemotaxis and epithelial to mesenchymal transition. The Journal of cell biology 201: 759-776, 10.1083/jcb.201212100.

      Forman TE, Sajek MP, Larson ED, Mukherjee N, Fantauzzo KA (2024) PDGFRα signaling regulates Srsf3 transcript binding to affect PI3K signaling and endosomal trafficking. Elife 13, 10.7554/eLife.98531.

      Funato N, Nakamura M, Yanagisawa H (2015) Molecular basis of cleft palates in mice. World journal of biological chemistry 6: 121-138, 10.4331/wjbc.v6.i3.121.

      Gehlen-Breitbach S, Schmid T, Fröb F, Rodrian G, Weider M, Wegner M, Gölz L (2023) The Tip60/Ep400 chromatin remodeling complex impacts basic cellular functions in cranial neural crest-derived tissue during early orofacial development. International Journal of Oral Science 15: 16, 10.1038/s41368-023-00222-7.

      Hansen JM, Jones DP, Harris C (2020) The Redox Theory of Development. Antioxid Redox Signal 32: 715-740, 10.1089/ars.2019.7976.

      Li D, Tian Y, Vona B, Yu X, Lin J, Ma L, Lou S, Li X, Zhu G, Wang Y et al (2025) A TAF11 variant contributes to non-syndromic cleft lip only through modulating neural crest cell migration. Hum Mol Genet 34: 392-401, 10.1093/hmg/ddae188.

      Ng KYB, Mingels R, Morgan H, Macklon N, Cheong Y (2017) In vivo oxygen, temperature and pH dynamics in the female reproductive tract and their importance in human conception: a systematic review. Human Reproduction Update 24: 15-34, 10.1093/humupd/dmx028.

      Pina JO, Raju R, Roth DM, Winchester EW, Chattaraj P, Kidwai F, Faucz FR, Iben J, Mitra A, Campbell K et al (2023) Multimodal spatiotemporal transcriptomic resolution of embryonic palate osteogenesis. Nature communications 14: 5687, 10.1038/s41467-023-41349-9.

      Sun J, Lin Y, Ha N, Zhang J, Wang W, Wang X, Bian Q (2023) Single-cell RNA-Seq reveals transcriptional regulatory networks directing the development of mouse maxillary prominence. J Genet Genomics 50: 676-687, 10.1016/j.jgg.2023.02.008.

      Ulschmid CM, Sun MR, Jabbarpour CR, Steward AC, Rivera-González KS, Cao J, Martin AA, Barnes M, Wicklund L, Madrid A et al (2024) Disruption of DNA methylation-mediated cranial neural crest proliferation and differentiation causes orofacial clefts in mice. Proc Natl Acad Sci U S A 121: e2317668121, 10.1073/pnas.2317668121.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1:

      Major Comments:

      1. The data in the paper strongly suggests that the new copper shuttles are selective for copper and have faster binding kinetics (Fig 1) than the previous one. However, the data regarding the copper shuttling from the copper(Aβ) peptides is not very convincing. It appears to be due to the Cu effect alone (Fig.3), as the reduction in viability with Cu(II)+ AscH- is almost the same as the Cu(II)(Aβ)+AscH-. To convincingly show that the peptide shuttle can strip copper from (Aβ) peptides, the authors need to show that the copper is bound to the (Aβ) peptide before it is used in the experiment.

      Rightfully so, the effect of the toxicity of Cu(II)+ AscH- is similar to that of Cu(II)(Aβ16)+AscH-. This is because Aβ16 is not toxic to the cells, so there is no compounded effect of Cu and Aβ16 as seen for Cu(II)(Aβ40). The toxicity of Cu(II)+ AscH- is similar to that of Cu(II)(Aβ)+AscH- because Cu(II) will be bound to a weaker ligand in the medium, and such loosely bound Cu is also able to produce ROS with AscH- at rates similar to those of Cu-Aβ.

      Data from our lab and others have shown that in HEPES solution at pH 7.4, Aβ forms a complex with Cu. The present work is also in line with Cu binding to Aβ: in Figure 1C (GSH), the rate of Cu withdrawal by the shuttle can only be explained by Cu bound to Aβ, as Cu in the buffer binds to the shuttle much faster. Also, the AscH- consumption rates measured in Fig S5D-E are consistent with Cu bound to Aβ; unbound Cu has a much faster rate of AscH- consumption (Santoro et al. 2018, doi.org/10.1039/C8CC06040A).

      The concentrations of Aβ and Cu used in our experimental condition were determined with a UV-Vis spectrophotometer.

      Minor comments:

      1. The paper does not cite Figure 1A and some supplementary figures, especially Supp. Fig. 1-2. All the figures and supplementary figures should be cited.

      This has been rectified for all the concerned figures.

      The data presentation in Figures 3B and S8 is confusing. "-" signs indicate no addition or the blank box means no addition. Also, the AKH-αR5W4 has no "-" sign in the first bar. For clarity, please indicate what the -, +, or no sign means in the figure legends. Also, what does "Batch A" refer to in Figure 3B?

      The figures have been modified as suggested by the reviewer.

      Page 7, correct (Error! Referencesource not found.Figure 1C).

      This has been rectified.

      The Giantin staining in Figure 2B is making it hard to visualize ATP7A trafficking. If the Giantin image overlay is removed, it may be easier to see the movement of ATP7A from the perinuclear region to the vesicles.

      The images have been modified to better appreciate the ATP7A change in distribution upon the increase in intracellular Cu level. We have reduced the number of conditions for which images are provided and provided individual staining for clarity. Zoomed images are also provided. The remainder of the conditions are in Figure S7B.

      In the introduction, the authors mention, "These molecules have, however, a major pitfall as is seen for Elesclemol, a candidate for Menkes disease treatments 32. The authors cite reference " Tsvetkov, P. et al. Copper induces cell death by targeting lipoylated TCA cycle proteins." The paper showing elesclomol as a candidate for Menkes disease treatments is Guthrie L et al., Elesclomol alleviates Menkes pathology and mortality by escorting Cu to cuproenzymes in mice. Science. 2020.

      We thank the reviewer for pointing this out, which was apparently not clearly explained. Our intention here was to show that a major pitfall of shuttles like Elesclomol, as seen in the study by Tsvetkov, P. et al. Science (2022), is cuprotoxicity. The sentence has been clarified and the work of Guthrie L et al is cited for Elesclomol as a candidate for Menkes disease.

      Reviewer #2 :

      Major issues:

      1. This reviewer is not convinced that the authors' experimental system is well suited for studies of glia activation and protective effects. With the exception of a couple of panels it is very hard to see differences. The authors should significantly improve the quality of images in Figure 5 to make this set of data convincing.

      We thank the reviewer for his/her detailed evaluation and for bringing to light the quality of the image in Figure 5. We have therefore improved the quality of the images by improving the signal to noise ratio to better show the differences between conditions.

      Similarly, the quality of giantin staining is low and needs to be improved and more experimental details are needed (see details below).

      As stated in our answer to reviewer 1, the images have been modified to better appreciate ATP7A redistribution upon increase of intracellular Cu levels. We have reduced the number of conditions for which images are provided and provided individual staining for clarity. Zoomed images are also provided. The remainder of the conditions are in Figure S7B.

      Given that shuttles are found within vesicles, the authors should discuss the mechanism through which Cu is released into the cytosol to trigger ATP7B trafficking.

      The mechanism of Cu escape from endosomes remains poorly understood. However, supported by our recent observations that Cu quickly (within 10 min) dissociates from the Cu-shuttle AKH-αR5W4NBD in endosomes (Okafor et al., 2024, doi.org/10.3389/fmolb.2024.1355963), we discuss the potential involvement of CTR1/2 and DMT1 (page 16).

      There are numerous small writing issues that make paper difficult to read. The authors are encouraged to carefully edit their manuscript.

      We thank the reviewer for pointing this out and several errors have been corrected whereas various sentences have been clarified.

      Minor issues

      * „A solution of monomerized Aβ complex in 10% DMEM (diluted with DMEM salt solution) was prepared in microcentrifuge tubes" - here and further the description of media composition is confusing. What is the remaining 90%?

      This has been rectified. The composition of the salt solution that makes up the 90% has been provided (page 4).

      * „Afterwards, AscH- was added to the tubes and vortexed, the mixture was then added to PC12 cells" - concentration of ascorbate is mentioned only once (later in the figure legend) where it can be barely found, also without explaining the choice of concentration. Additionally, ascorbate's product code is not listed. Please, correct.

      These points have been rectified.

      * Description of the cell (PC12 line) handling conditions is absent (growth medium, passage number used etc) and should be included.

      This information is now provided.

      * ATP7A delocalization assay. Details for the secondary antibodies are absent (full name (e.g. AlexaFluor 488), manufacturer, code) and should be added.

      Missing information has been added.

      * page 6: „Next, we investigated the capacity of the shuttles to withdraw Cu(II) from cell culture media, DMEM 10% and DMEM/F12 1:1 (D/F)." Here and further explanation is needed why the mixture of DMEM/F12 is needed (F12 is also not listed in the materials list).

      DMEM/F12 is a media that is commercially available used for some cell types, and it has been added to the materials list (page 4).

      * Page 7. Legend to the figure 1B: „Conditions: Cu(II)=AKH-αR5W4NBD=DapHH-αR5W4NBD=HDapH-αR5W4NBD= 5 μM, DMEM 10%, D/F 100%, 25°C, n=3." - „DMEM/F12" ratio equals to „100%" is confusing, please clarify

      This has been clarified.

      * Page 8-9. Legend to the Figure 2A. „Similar observations were obtained with 5 different cell cultures." Same remark goes to the legend to supplementary figure 7 ("Similar observations were obtained with at least 3 different cell cultures"). Do the authors mean independent experiments or different cell lines? Please clarify. If different cell lines, consider including these data into the supplement.

      Indeed, we meant independent experiments. This has been clarified.

      * Page 8-9, figure 2B. Giantin is a cis-golgi marker, which should localize perinuclearly. In the cells shown the signal is diffuse and appears non-specific. Please improve the quality.

      We have reduced the number of conditions for which images are provided and now show individual stainings for clarity. Zoomed images are also provided, allowing visualization of the typical cis-Golgi distribution of Giantin.

      * Page 8-9, figure 2B. ATP7A is shown in green. The authors did not specify the secondary antibody has been used for it. If the secondary antibody used for labeling of ATP7A has green fluorescence then how does one distinguish between the transporter signal and signal of the green fluorescent shuttle? Please provide more details.

      We thank the reviewer for raising this point, as we failed to mention this technical issue in the original manuscript. The Cu-shuttles labeled with NBD indeed emit in the green channel, but they are not fixable under our conditions and are washed out during the ICC procedure. Accordingly, they do not generate any background signal and do not interfere with the ICC, as shown by the controls and test conditions (Figure S7B and Figure 2B). This is now mentioned (page 11).

      * Page 9 and Figure 2B. Why did authors use Cu(II)EDTA for the experiment? What was the concentration? Please, add this information as well as Cu(II)GTSM treatment conditions to the experiment description in materials and methods.

      EDTA is a strong chelator of Cu(II); however, due to its negative charge, it cannot cross the plasma membrane and therefore cannot import Cu. It is therefore used as a negative control, to rule out the possibility that Cu crosses the plasma membrane non-specifically or through a channel.

      * Figure 2 and supplementary figure 7. It would be beneficial to have higher magnification images. Please, add them, if possible.

      These higher magnification images have been provided.

      * Page 11. „In conclusion, the novel Cu(II)-selective peptide shuttles .... capable of instantly preventing ... toxicity on PC12 cells, whereas ... instantly rescue Cu(II)Aβ1-42 toxicity". Authors should be more careful with terminology. According to the materials and methods, the survival assay was carried out after 24h of cells' treatment with the reagents. Effect visible after 24h and „instant rescue" is not the same, Please clarify or modify the wording

      In principle, the peptides cannot reverse the production of ROS, however they prevent ROS production. Therefore, for the peptides to have an effect, they have to instantly halt ROS production. This is justified by the novel shuttles being more effective than AKH-αR5W4NBD in preventing toxicity, given we modified just the Cu binding sequence. We have however restricted the use of the term instantly to ROS production.

      * Page 13, figure 5, panels C and D. In both quantitations Cu(II) was used as one of the control conditions. Why in panel D the percentage of activated microglial cells (second graphs from right) is several fold higher (appr. 150% vs >500%)?

      This variability was observed throughout our set of experiments and could be linked to the quality of the hippocampal slices used. Slight variations in the age of the animals or in the traces of metals in the media are likely explanations. However, the different groups that are compared represent experiments performed simultaneously.

      * Supplementary Figure S3B. The lowest solid line does not correspond to any color in the legend (please, check and correct). However, by the method of exclusion, one may conclude that it refers to Cu(II)+HDapH-shuttle. What could be a potential explanation for stronger quenching of this shuttle by binding Cu(II) directly from the spiked media comparing to when it is pre-complexed with copper (also supported by the panel D)?

      The stronger quenching of this shuttle when it binds Cu(II) directly from the spiked media, compared with when it is pre-complexed with copper, is not significant.

      * In discussion the authors mention that the designed shuttles are prone to degradation in 48 hours. In the viability assays, they treat cells for 24 hours, in the fluorescent and confocal microscopy experiments for one hour or less. What is the lifetime of these shuttle peptides in the cells?

      The lifetime of the shuttle peptides in the cells is currently unknown. However, after 24h incubation of PC12 cells with AKH-αR5W4NBD, DapHH-αR5W4NBD and HDapH-αR5W4NBD, the Cu shuttles lose their punctate distribution and appear diffuse inside the cells. We have recently shown that AKH-αR5W4NBD cycles through different endosomal compartments and eventually reaches the lysosomes, where it could be degraded (Okafor et al., 2024, https://doi.org/10.3389/fmolb.2024.1355963). Therefore, the diffuse distribution of the fluorescence signal could suggest degradation of the Cu-shuttles.

      * From the microscopy observations, the mechanism of entry of apo-shuttles (with no Cu(II) in the complex) and in complex with Cu(II) looks quite different. Namely, in figure S7 the fluorescent signal is very strong in the plasma membrane with significantly less vesicular pattern when compared to figure 2A. It is especially apparent for DapHH shuttle at 15 minutes of incubation. Can authors hypothesize/discuss the reason for these differences?

      The difference in the shuttles’ signal in the presence or absence of bound Cu is due to fluorescence quenching by the bound Cu, which was at the heart of the design of these shuttles. Hence, a strong signal is seen at the plasma membrane in the absence of Cu, as these CPP-based shuttles interact strongly with the plasma membrane. In the presence of Cu, however, they become less visible due to quenching by Cu. Interestingly, when Cu dissociates from the shuttle inside the cells (likely in acidic endosomes), this quenching is relieved and the fluorescence reappears. This is now better explained (page 10).

      * Please, show the figures in the supplementary file in the same order as you refer to them.

      This has been rectified.

      * Introduction. Description of the shuttle peptides: „(3) a cell penetrating peptide (CPP), αR5W4, with sequence RRWWRRRWWR, for cell entry35" - one R in the middle is extra.

      This has been rectified.

      *Kd units are missing (pages 2, 3 and 15) and should be added.

      This has been added.

      * Figure 1A is either not referred at all or mislabeled.

      * Page 7, Figure 1B: x axis on the second panel (+Mn+) misses a label.

      * Page 8. „Upon addition of DapHH-αR5W4NBD or HDapH-αR5W4NBD, an immediate slow-down in ROS production was observed (Figure 1D and S1E), ..." - mislabeled supplementary figure, please, correct.

      * Page 11. „...but not in the presence of AKH-αR5W4NBD which required pre-incubation to prevent toxicity (Figure 3AFigure)." Please, correct the reference to the figure.

      * Page 11. „This is in line with the faster retrieval ... previously demonstrated in vitro (Figure 1)" - please, specify the panel.

      * Supplementary materials and methods, subsection „Retrieval of Cu by peptide shuttles from Aβ", page 2: „The same was done for 10 μM Cu(II)...to give the estimated 100% saturated emission level." - check the spelling of the shuttle species.

      * Supplementary Figure S4. By the behavior of AKH-shuttle in the presence of copper and other metals, it looks that panels are shuffled, i.e. panel C looks corresponding to the panel B with DMEM/F12 conditions, which is also supported by the values in the Table S1. Please, check and correct, if needed.

      * Supplementary figure S9, panel A. Apparently, mislabeled images with Abeta1-42 and Cu(II)Abeta1-42. Please, correct.

      We apologize for the different issues in referencing figures. This has been rectified.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Minor Concerns

      I think that authors can add some concepts of general interest on AD, as follows

      Evidence shows that AD top-line disease-modifying drugs employing monoclonal antibodies (donanemab, lecanemab, and aducanumab) that tag Aβ, based on the 'Amyloid cascade hypothesis', are able to rid the brain of Aβ plaques, but the drug benefits consist in a reduction of 35% of cognitive decline. The remaining disease burden (more than 65%) has no disease-modifying therapeutic options at the moment. Furthermore, monoclonal antibodies against Aβ have serious side effects (ARIA). On this basis, it could be suggested that removing Aβ plaques might not be sufficient to slow 100% of the clinical decline in AD. This is why the Cu(II) shuttle invention presented by the candidate may represent a valid and concrete means to fight AD, since meta-analyses also demonstrate that Cu, and more specifically non-Cp Cu, is increased in AD (PMID: 34219710). The authors can add some of these clinical considerations in the Discussion.

      There is only a very brief description of the evidence for the involvement of copper in Alzheimer's, especially from a clinical point of view, that is, the evidence from clinical studies carried out on AD patients. This would have highlighted the unmet medical need to which these new compounds (the Cu shuttles) can provide an answer. At least for a subpopulation of Alzheimer's patients, and we know that there are different subtypes of Alzheimer's disease (for example 10.1016/j.neurobiolaging.2004.04.001, but authors can find others), these Cu(II) selective shuttles could provide beneficial effects. The literature reports a percentage of AD patients with increased levels of Cu (papers on this topic can be easily retrieved), who may primarily benefit from these compounds. The current study is timely, since AD patients with high Cu can be easily identified: they are characterized by a different biochemical, cognitive, and genetic profile, as per recent findings (PMID: 37047347). This information can improve the quality of the manuscript by describing the unmet clinical need that this study can address.

      We thank the reviewer for his very positive evaluation and for his suggestion that gives more perspective to our work. Accordingly, we have added these parts to the introduction and discussion sections.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The questions after reading this manuscript are what novel insights have been gained that significantly go beyond what was already known about the interaction of these receptors and, more importantly, what are the physiological implications of these findings? The proposed significance of the results in the last paragraph of the Discussion section is speculative since none of the receptor interactions have been investigated in TNBC cell lines. Moreover, no physiological experiments were conducted using the PRLR and GH knockout T47D cells to provide biological relevance for the receptor heteromers. The proposed role of JAK2 in the cell surface distribution and association of both receptors as stated in the title was only derived from the analysis of box 1 domain receptor mutants. A knockout of JAK2 was not conducted to assess heteromers formation.

      We thank the reviewer for these comments. The novel insight is that two different cytokine receptors can interact in an asymmetric, ligand-dependent manner, such that one receptor regulates the other receptor’s surface availability, mediated by JAK2. To our knowledge this has not been reported before. Beyond our observations, there is the question if this could be a much more common regulatory mechanism and if it has therapeutic relevance. However, answering these questions is beyond the scope of this work.

      Along the same line, the question regarding the biological relevance of our receptor heteromers and JAK2’s role in cell surface distribution is undoubtedly very important. Studying GHR-PRLR cell surface distributions in JAK2 knockout cells and certain TNBC cell lines as proposed by the reviewer could perhaps be insightful. However, most TNBCs down-regulate PRLR [1], so we would first have to identify TNBC cell lines that actually express PRLR at sufficiently high levels. Moreover, knocking out JAK2 is known to significantly reduce GHR surface availability [2,3], such that the proposed experiment would probably provide only limited insights.

      Unfortunately, our team is currently not in a position to perform any experiments (due to lack of funding and shortage of personnel). However, to address the reviewer’s comment as much as possible, we have revised the respective paragraph of the discussion section to emphasize the speculative nature of our statement and have added another paragraph discussing shortcomings and future experiments (see revised manuscript, pages 23-24).

      (1) López-Ozuna, V., Hachim, I., Hachim, M. et al. Prolactin Pro-Differentiation Pathway in Triple Negative Breast Cancer: Impact on Prognosis and Potential Therapy. Sci Rep 6, 30934 (2016). https://www.nature.com/articles/srep30934

      (2) He, K., Wang, X., Jiang, J., Guan, R., Bernstein, K.E., Sayeski, P.P., Frank, S.J. Janus kinase 2 determinants for growth hormone receptor association, surface assembly, and signaling. Mol Endocrinol. 2003;17(11):2211-27. doi: 10.1210/me.2003-0256. PMID: 12920237.

      (3) He, K., Loesch, K., Cowan, J.W., Li, X., Deng, L., Wang, X., Jiang, J., Frank, S.J. Janus Kinase 2 Enhances the Stability of the Mature Growth Hormone Receptor, Endocrinology, Volume 146, Issue 11, 2005, Pages 4755–4765, https://doi.org/10.1210/en.2005-0514

      (2) Except for some investigation of γ2A-JAK2 cells, most of the experiments in this study were conducted on a single breast cancer cell line. In terms of rigor and reproducibility, this is somewhat borderline. The CRISPR/Cas9 mutant T47D cells were not used for rescue experiments with the corresponding full-length receptors and the box1 mutants. A missed opportunity is the lack of an investigation correlating the number of receptors with physiological changes upon ligand stimulation (e.g., cellular clustering, proliferation, downstream signaling strength).

      We appreciate the reviewer’s comments. While we are confident in the reproducibility of our findings, including those obtained in the T47D cell line, we acknowledge that testing in additional cell lines would have strengthened the generalizability of our results. We also recognize that performing a rescue experiment using our T47D hPRLR or hGHR KO cells would have been valuable. Furthermore, examining physiological changes, such as proliferation rates and downstream signaling responses, would have provided additional insights. Unfortunately, these experiments were not conducted at the time, and we currently lack the resources to carry them out.

      (3) An obvious shortcoming of the study that was not discussed seems to be that the main methodology used in this study (super-resolution microscopy) does not distinguish the presence of various isoforms of the PRLR on the cell surface. Is it possible that the ligand stimulation changes the ratio between different isoforms? Which isoforms besides the long form may be involved in heteromers formation, presumably all that can bind JAK2?

      This is a very good point. We fully agree with the reviewer that a discussion of the results in the light of different PRLR isoforms is appropriate. We have added information on PRLR isoforms to the Introduction (see revised manuscript, page 2) and Discussion sections (see revised manuscript, pages 23-24).

      (4) Changes in the ligand-inducible activation of JAK2 and STAT5 were not investigated in the T47D knockout models for the PRL and GHR. It is also a missed opportunity to use super-resolution microscopy as a validation tool for the knockouts on the single cell level and how it might affect the distribution of the corresponding other receptor that is still expressed.

      We thank the reviewer for his comment. We fully agree that such additional experiments could be very valuable. We are sorry but, as already mentioned above, this is not something we are able to address at this stage due to lack of personnel and funding. However, we do hope to address these and other proposed experiments in the future.

      (5) Why does the binding of PRL not cause a similar decrease (internalization and downregulation) of the PRLR, and instead, an increase in cell surface localization? This seems to be contrary to previous observations in MCF-7 cells (J Biol Chem. 2005 October 7; 280(40): 33909-33916).

      It has been recently reported for GHR that not only JAK2 but also LYN binds to the box1-box2 region, creating competition that results in divergent signaling cascades and affects GHR nanoclustering [1]. So, it is reasonable to assume that similar mechanisms may be at work that regulate PRLR cell surface availability. Differences in cells’ expression of such kinases could perhaps play a role in the perceived inconsistency. Also, Lu et al. [2] studied the downregulation of the long PRLR isoform in response to PRL. All other PRLR isoforms were not detectable in MCF-7 cells. So, differences between MCF-7 and T47D may lead to this perceived contradiction.

      At this stage, we can only speculate about the actual reasons for these seemingly contradictory results. However, for full transparency, we are now mentioning this apparent contradiction in the Discussion section (see page 23) and have added the references below.

      (1) Chhabra, Y., Seiffert, P., Gormal, R.S., et al. Tyrosine kinases compete for growth hormone receptor binding and regulate receptor mobility and degradation. Cell Rep. 2023;42(5):112490. doi: 10.1016/j.celrep.2023.112490. PMID: 37163374.

      https://www.cell.com/cell-reports/pdf/S2211-1247(23)00501-6.pdf

      (2) Lu, J.C., Piazza, T.M., Schuler, L.A. Proteasomes mediate prolactin-induced receptor down-regulation and fragment generation in breast cancer cells. J Biol Chem. 2005 Oct 7;280(40):33909-16. doi: 10.1074/jbc.M508118200. PMID: 16103113; PMCID: PMC1976473.

      (6) Some figures and illustrations are of poor quality and were put together without paying attention to detail. For example, in Fig 5A, the GHR was cut off, possibly to omit other nonspecific bands, the WB images look 'washed out'. 5B, 5D: the labels are not in one line over the bars, and what is the point of showing all individual data points when the bar graphs with all annotations and SD lines are disappearing? As done for the y2A cells, the illustrations in 5B-5E should indicate what cell lines were used. No loading controls in Fig 5F, is there any protein in the first lane? No loading controls in Fig 6B and 6H.

      We thank the reviewer for pointing this out. We have amended Fig. 5A to now show larger crops of the two GHR and PRLR Western Blot images and thus a greater range of proteins present in the extracts. Please note that the bands in the WBs other than what is identified as GHR and PRLR are non-specific and reflect roughly equivalent loading of protein in each lane.

      We also made some changes to Figures 5B-5E.

      (7) The proximity ligation method was not described in the M&M section of the manuscript.

      We thank the reviewer for pointing this out. We have added a description of the PL method to the Methods section.

      Reviewer #1 (Recommendations for the Authors):

      A final suggestion for future investigations: Instead of focusing on the heteromer formation of the GHR/PRLR which both signal all through the same downstream effectors (JAK2, STAT5), it would have been more cancer-relevant, and perhaps even more interesting, to look for heteromers between the PRLR and receptors of the IL-6 family since it had been shown that PRL can stimulate STAT3, which is a unique feature of cancer cells. If that is the case, this would require a different modality of the interaction between different JAK kinases.

      We highly appreciate the reviewer’s recommendation and hope to follow up on it in the near future.

      Reviewer #2 (Public Review):

      (1) I could not fully evaluate some of the data, mainly because several details on acquisition and analysis are lacking. It would be useful to know what the background signal was in dSTORM and how the authors distinguished the specific signal from unspecific background fluorescence, which can be quite prominent in these experiments. Typically, one would evaluate the signal coming from antibodies randomly bound to a substrate around the cells to determine the switching properties of the dyes in their buffer and the average number of localisations representing one antibody. This would help evaluate if GHR or PRLR appeared as monomers or multimers in the plasma membrane before stimulation, which is currently a matter of debate. It would also provide better support for the model proposed in Figure 8.

      We are grateful for the reviewer’s comment. In our experience, the background signal is more relevant in dSTORM when imaging proteins that are located at deeper depths (> 3 μm) above the coverslip surface. In our experiments, cells are attached to the coverslip surface and the proteins being imaged are on the cell membrane. In addition, we employed dSTORM’s TIRF (total internal reflection fluorescence) microscopy mode to image membrane receptor proteins. TIRFM exploits the unique properties of an induced evanescent field in a limited specimen region immediately adjacent to the interface between two media having different refractive indices. It thereby dramatically reduces background by rejecting fluorescence from out-of-focus areas in the detection path and illuminating only the area right near the surface.

      Having said that, a few other sources such as auto-fluorescence, scattering, and non-bleached fluorescent molecules close to and distant from the focal plane can contribute to the background signal. We tried to reduce auto-fluorescence by ensuring that cells are grown in phenol-red-free media, imaging is performed in STORM buffer which reduces autofluorescence, and our immunostaining protocol includes a quenching step aside from using blocking buffer with different serum, in addition to BSA. Moreover, we employed extensive washing steps following antibody incubations to eliminate non-specifically bound antibodies. Ensuring that the TIRF illumination field is uniform helps reduce scatter. Additionally, an extended bleach step prior to the acquisition of frames to determine localizations helped further reduce the probability of non-bleached fluorescent molecules.

      In short, due to the experimental design we do not expect much background. However, in the future, we will address this concern and estimate background in a subtype dependent manner. To this end we will distinguish two types of background noise: (A) background with a small change between subsequent frames, which mainly consists of auto-fluorescence and non-bleached out-of-focus fluorescent molecules; and (B) background that changes every imaging frame, which is mainly from non-bleached fluorescent molecules near the focal plane. For type (A) background, temporal filters must be used for background estimation [1]; for type (B) background, low-pass filters (e.g., wavelet transform) should be used for background estimation [2].

      (1) Hoogendoorn, Crosby, Leyton-Puig, Breedijk, Jalink, Gadella, and Postma (2014). The fidelity of stochastic single-molecule super-resolution reconstructions critically depends upon robust background estimation. Scientific reports, 4, 3854. https://doi.org/10.1038/srep03854

      (2) Patel, Williamson, Owen, and Cohen (2021). Blinking statistics and molecular counting in direct stochastic reconstruction microscopy (dSTORM). Bioinformatics, Volume 37, Issue 17, September 2021, Pages 2730–2737, https://doi.org/10.1093/bioinformatics/btab136
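
      As an illustration only, the temporal filtering mentioned above for type (A) background can be sketched as a sliding temporal median applied to the intensity trace of a single pixel. This is a minimal sketch under stated assumptions, not the analysis used in this work: the function names, the window length, and the example trace are all hypothetical, and a real implementation would operate on the full image stack.

```python
from statistics import median


def temporal_median_background(trace, window=101):
    """Estimate slowly varying background for one pixel's intensity trace.

    Each frame's background is the median of the surrounding `window` frames
    (truncated at the ends of the trace). Brief blinking events occupy only a
    few frames, so they barely influence the median.
    """
    half = window // 2
    n = len(trace)
    return [
        median(trace[max(0, t - half):min(n, t + half + 1)])
        for t in range(n)
    ]


def subtract_background(trace, window=101):
    """Subtract the temporal-median background and clip negatives to zero."""
    background = temporal_median_background(trace, window)
    return [max(0.0, value - b) for value, b in zip(trace, background)]


# Hypothetical trace: constant background of 100 counts, one blink at frame 10.
trace = [100.0] * 21
trace[10] += 50.0
corrected = subtract_background(trace, window=5)
```

      In this toy trace the constant offset is removed while the single-frame blink is retained. The window must be much longer than a typical blinking event for the median to track only the background.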

      (2) Since many of the findings in this work come from the evaluation of localisation clusters, an image showing actual localisations would help support the main conclusions. I believe that the dSTORM images in Figures 1 and 2 are density maps, although this was not explicitly stated. Alexa 568 and Alexa 647 typically give a very different number of localisations, and this is also dependent on the concentration of BME. Did the authors take that into account when interpreting the results and creating the model in Figures 2 and 8?

      I believe that including this information is important as findings in this paper heavily rely on the number of localisations detected under different conditions.

      Including information on proximity labelling and CRISPR/Cas9 in the methods section would help with the reproducibility of these findings by other groups.

      Figures 1 and 2 show Gaussian interpolations of actual localizations, not density maps. Imaging captured the fluorophores’ blinking events, and localizations were counted as true localizations when at least 5 consecutive blinking events had been observed. Nikon software was used for Gaussian fitting. In other words, we show reconstructed images based on identifying true localizations using Gaussian fitting and strict parameters to identify true fluorophore blinking. This allowed us to identify true localizations with high confidence and generate a high-resolution image for membrane receptors.

      Indeed, Alexa 568 and 647 give different numbers of localizations. This depends on the intrinsic photo-physics of the fluorophores. Specifically, each fluorophore has a different duty cycle, switching cycle, and survival fraction. However, we note that we focused on capturing the relative changes in receptor numbers over time, before and after stimulation by ligands, not the absolute numbers of surface GHR and PRLR. We are not comparing the absolute numbers of localizations or drawing comparisons for localization numbers between 568 and 647. For all these different conditions/times, the photo-physics for a particular fluorophore remains the same. This allows us to make relative comparisons.
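
      The reasoning behind these relative comparisons can be sketched as follows, under the simplifying assumption that the number of detected localizations scales with the true receptor number by a constant, fluorophore-specific factor. All numbers below are hypothetical and chosen only to illustrate that this factor cancels when each time point is divided by the basal value of the same channel; they are not measured data.

```python
def relative_to_basal(counts):
    """Express each localization count as a fold change over the first (basal) value."""
    basal = counts[0]
    return [count / basal for count in counts]


# Hypothetical true receptor numbers at basal and two later time points.
true_counts = [100, 150, 80]

# Hypothetical detection factors standing in for dye-specific photo-physics.
factor_568 = 0.4
factor_647 = 0.9

observed_568 = [n * factor_568 for n in true_counts]
observed_647 = [n * factor_647 for n in true_counts]

fold_568 = relative_to_basal(observed_568)
fold_647 = relative_to_basal(observed_647)
```

      Under this assumption both channels yield the same fold changes (1.0, 1.5, 0.8) even though their absolute localization counts differ, which is why comparisons are made within a channel and against basal controls.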

      As far as the effect of BME is concerned, the concentration of mercaptoethanol needs to be carefully optimized, as too high a concentration can potentially quench the fluorescence or affect the overall stability of the sample. However, we are using an optimized concentration which has been previously validated across multiple STORM experiments. This makes the concerns relating to the concentration of BME irrelevant to the current experimental design. Besides, the concentration of BME is maintained across all experimental conditions.

      We have added information regarding PL and CRISPR/Cas9 for generating hGHR KO and hPRLR KO cells in two new subsections to the Methods section.

      Reviewer #2 (Recommendations for the authors):

      In the methods please include:

      (1) A section with details on proximity ligation assays.

      We have added a description of the PL method to the Methods section.

      (2) A section on CRISPR/Cas9 technology.

      We have added two new sections on “Generating hGHR knockout and hPRLR knockout T47D cells” and “Design of sgRNAs for hGHR or hPRLR knockout” to the Methods section.

      (3) List the precise composition of the buffer or cite the paper that you followed.

      We used the buffer recipe described in this protocol [1] and have added the components with concentrations as well as the following reference to the manuscript.

      (1) Beggs, R.R., Dean, W.F., Mattheyses, A.L. (2020). dSTORM Imaging and Analysis of Desmosome Architecture. In: Turksen, K. (eds) Permeability Barrier. Methods in Molecular Biology, vol 2367. Humana, New York, NY. https://doi.org/10.1007/7651_2020_325

      (4) Exposure time used for image acquisition to put 40 000 frames in the context of total imaging time and clarify why you decided to take 40 000 images per channel.

      Our Nikon Ti2 N-STORM microscope is equipped with an iXon DU-897 Ultra EMCCD camera from Andor (Oxford Instruments). According to the camera’s manufacturer, this camera platform uses a back-illuminated 512 x 512 frame transfer sensor and overclocks readout to 17 MHz, pushing speed performance to 56 fps (in full frame mode). We note that we always tried to acquire STORM images at the maximal frame rate. As for the exposure time, according to the manufacturer it can be as short as 17.8 ms. We would like to emphasize that we did not specify/alter the exposure time.

      See also: https://andor.oxinst.com/assets/uploads/products/andor/documents/andor-ixon-ultra-emccd-specifications.pdf

      The decision to take 40,000 frames per channel was based on our intention to identify the true population of the molecules of interest that are localized and accurately represented in the final reconstruction image. The total number of frames depends on the sample complexity, density of sample labeling and desired resolution. We tested a range of frames between 20,000 and 60,000 and found that, for our experimental design and output requirements, 40,000 frames provided the best balance between achieving maximal resolution and obtaining enough localizations to make consistent and accurate localization estimates across different stimulation conditions compared to basal controls.
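
      To put these frame numbers in the context of total imaging time, a back-of-the-envelope calculation can be sketched as below. It assumes that acquisition ran at the camera's maximal full-frame rate of 56 fps throughout; this is an assumption, and the actual rate during a given experiment may have been lower. The function name is hypothetical.

```python
def acquisition_time_s(n_frames, frames_per_second):
    """Total acquisition time in seconds for a given frame count and frame rate."""
    return n_frames / frames_per_second


FRAME_RATE_FPS = 56.0  # maximal full-frame rate quoted by the manufacturer

# Time per frame at this rate, in milliseconds.
frame_period_ms = 1000.0 / FRAME_RATE_FPS

for n_frames in (20_000, 40_000, 60_000):
    seconds = acquisition_time_s(n_frames, FRAME_RATE_FPS)
    print(f"{n_frames} frames: {seconds:.0f} s ({seconds / 60:.1f} min) per channel")
```

      At 56 fps, 40,000 frames correspond to roughly 12 minutes of acquisition per channel, and the frame period of about 17.9 ms is consistent with the minimal exposure time quoted above.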

      (5) The lasers used to switch Alexa 568 and Alexa 647. Were you alternating between the lasers for switching and imaging of dyes? Intermittent and continuous illumination will produce very different unspecific background fluorescence.

      Yes, we used an alternating approach for the lasers exciting Alexa 647 and Alexa 568, for both switching and imaging of the dyes.

      (6) A paragraph with a detailed description of methods used to differentiate the background fluorescence from the signal.

      We have addressed the background fluorescence under Point 1 (Public Review). We have added a paragraph in the Methods section on this issue.

      (7) Minor corrections to the text:

      It appears as though there is a large difference in the expression level of GHR and PRLR in basal conditions in Figure 1. This can be due to the switching properties of the dyes, which is related to the amount of BME in the buffer, or it can be because there is indeed more PRL. Would the authors be able to comment on this?

      We thank the reviewer for this suggestion. According to expression data available online, there is indeed more PRLR than GHR in T47D cells. According to CellMiner [1], T47D cells have an RNA-Seq gene expression level log2(FPKM + 1) of 6.814 for PRLR, and 3.587 for GHR, strongly suggesting that there is more PRLR than GHR in basal conditions, matching the reviewer’s interpretation of our images in Fig. 1 (basal). However, we would advise against using STORM images for direct comparisons of receptor expression. First, with TIRF images, we are only looking at the membrane fraction (~150 nm close to the coverslip membrane interface) that is attached to the coverslip. Secondly, as discussed above, our data represent relative cell surface receptor levels that allow for comparison of different conditions (basal vs. stimulation) and do not represent absolute quantifications. Everything is relative and in comparison to controls.

      Also, BME is not going to change the level of expression. The differences in growth factor expression as estimated by relative comparison can be attributed to the actual changes in growth factors and is not an artifact of the amount of BME in the buffer or the properties of dyes. These factors are maintained across all experimental conditions and do not influence the final outcome.

      (1) https://discover.nci.nih.gov/cellminer/
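      To make the CellMiner comparison concrete, the log2(FPKM + 1) values quoted above can be converted back to a linear scale. The sketch below is only an illustration: the function and variable names are our own, and the result describes mRNA abundance, not cell surface protein levels.

```python
def fpkm_from_log2(value: float) -> float:
    """Invert the log2(FPKM + 1) transform to recover FPKM."""
    return 2.0 ** value - 1.0


# Values quoted in the response for T47D cells (CellMiner RNA-Seq).
prlr_fpkm = fpkm_from_log2(6.814)
ghr_fpkm = fpkm_from_log2(3.587)

# Linear-scale ratio of PRLR to GHR transcript abundance.
fold_difference = prlr_fpkm / ghr_fpkm

print(f"PRLR FPKM: {prlr_fpkm:.1f}")
print(f"GHR FPKM:  {ghr_fpkm:.1f}")
print(f"PRLR / GHR: {fold_difference:.1f}")
```

      On these numbers the transcript-level difference is roughly tenfold, which is consistent with the direction of the difference seen in the basal images but says nothing about absolute receptor counts at the membrane.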

      (8) I would encourage the authors to use unspecific binding to characterize the signal coming from single antibodies bound to the substrate. This would provide a mean number of localizations that a single antibody generates. With this information, one can evaluate how many receptors there are per cluster, which would strengthen the findings and potentially provide additional support for the model presented in Figure 8. It would also explain why the distributions of localisations per cluster in Fig. 3B look very different for hGHR and hPRLR. As the authors point out in the discussion, the results on predimerization of these receptors in basal conditions are conflicting and therefore it is important to shed more light on this topic.

      We thank the reviewer for this suggestion. While we are unable to perform this experiment at this stage, we will keep it in mind for future experiments.

      (9) Minor corrections to the figures:

      Figure 1:

      In the legend, please say what representation was used. Are these density maps or another representation? Please provide examples of actual localisations (either as dots or crosses representing the peaks of the Gaussians). Most findings of this work rely on the characterisation of the clusters of localisations and therefore it is of essence to show what the clusters look like. This could potentially go to the supplemental info to minimise additional work. It's very hard to see the puncta in this figure.

      If the authors created zoomed regions in each of the images (as in Figure 3), it would be much easier to evaluate the expression level and the extent of colocalisation. Halfway through GHR 3 min green pixels become grey, but this may be the issue with the document that was created. Please check. Either increase the font on the scale bars in this figure or delete it.

      As described above, Figure 1 does not show density maps. Imaging captured the fluorophores’ blinking events, and localizations were counted as true localizations only when at least 5 consecutive blinking events had been observed. Nikon software was used for Gaussian fitting and smoothing.

      We have generated zoomed regions. In our files (original as well as pdf) we do not see pixels become grey. We increased the font size above one of the scale bars and removed all others.
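      The acceptance rule described for Figure 1 (at least 5 consecutive blinking events) can be sketched as a simple run-length filter. This is a minimal illustration under our own assumptions: each candidate spot is represented by the list of frame indices in which it was detected, and the names `longest_consecutive_run`, `is_true_localization` and the example spots are hypothetical. It is not a description of how the Nikon software implements the criterion.

```python
def longest_consecutive_run(frames):
    """Length of the longest run of consecutive frame indices."""
    ordered = sorted(set(frames))
    if not ordered:
        return 0
    best = run = 1
    for prev, cur in zip(ordered, ordered[1:]):
        # Extend the run if the frames are adjacent, otherwise restart it.
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best


def is_true_localization(frames, min_run=5):
    """Accept a spot only if it blinked in at least `min_run` consecutive frames."""
    return longest_consecutive_run(frames) >= min_run


# Hypothetical candidate spots and the frames in which each was detected.
candidates = {
    "spot_a": [10, 11, 12, 13, 14, 30],  # contains a run of 5 consecutive frames
    "spot_b": [3, 4, 5, 6, 20, 21],      # longest run is only 4 frames
}

accepted = {name for name, frames in candidates.items()
            if is_true_localization(frames)}
print(accepted)
```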

      Figure 3:

      In A, the GHR clusters are colour coded but PRLR are not. Are both DBSCAN images? Explain the meaning of colour coding or show it as black and white. Was brightness also increased in the PRLR image? The font on the scale bars is too small. In B, right panels, the font on the axes is too small. In the figure legend explain the meaning of 33.3 and 16.7.

      In our document, both GHR and PRLR are color coded, but the hGHR clusters are certainly bigger and therefore appear brighter than the hPRLR clusters. Both are DBSCAN images. The color coding serves only to distinguish different clusters (there is no other meaning). We have kept the color coding but have added a sentence to the caption addressing this. Brightness was increased equally in both images of Panel B. 33.3 and 16.7 are the median cluster sizes. We have added a sentence to the caption explaining this. We have increased the font on the axes in B (right panels).
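      For readers unfamiliar with how a median cluster size arises from DBSCAN output, the following sketch clusters a handful of 2D localizations and reports the median number of localizations per cluster. It is an illustration only: the coordinates, `eps` and `min_pts` are hypothetical, "cluster size" is taken here to mean localizations per cluster (an assumption), and the plain-Python DBSCAN below is a stand-in for, not a reproduction of, the analysis software used for the figure.

```python
from collections import Counter
from statistics import median


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical localizations (nm): two clusters and one isolated point.
localizations = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2),
                 (100, 100), (101, 100), (100, 101),
                 (500, 500)]

labels = dbscan(localizations, eps=10, min_pts=3)
sizes = Counter(label for label in labels if label >= 0)
median_cluster_size = median(sizes.values())
print(labels, dict(sizes), median_cluster_size)
```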

      Figure 4:

      I struggled to see any colocalization in the 2nd and the 3rd image. Please show zoomed-in sections. In the panels B and C, the data are presented as fractions. Is this per cell? My interpretation is that ~80% of PRL clusters also contain GHR.

      Is this in agreement with Figures 1 and 2? In Figure 1, PRL 3 min, Merge, colocalization seems much smaller. Could the authors give the total numbers of GHR and PRLR from which the fractions were calculated at least in basal conditions?

      We have provided zoom-in views. As for panels B and C, fractions are number of clusters containing both receptors divided by the total number of clusters. We used the same strategy that we had used for calculating the localization changes: We randomly selected 4 ROIs (regions of interest) per cell to calculate fractions and then calculated the average of three different cells from independently repeated experiments. We did not calculate total numbers of GHR/PRLR. The numbers are fractions of cluster numbers.

      Moreover, the reviewer interprets results in panels B and C that ~80% of PRLR clusters also contain GHR. We assume the reviewer refers to Basal state. Now, the reviewer’s interpretation is not correct for the following reason: ~80% of clusters have both receptors. How many of the remaining (~20%) clusters have only PRLR or only GHR is not revealed in the panels. Only if 100% of clusters have PRLR, we can conclude that 80% of PRLR clusters also contain GHR.

      Also, while Figures 1 and 2 show localization based on dSTORM images, Figure 3 indicates and quantifies co-localization based on proximity ligation assays following DBSCAN analysis using Clus-DoC. We do not think that the results are directly comparable.
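      The distinction drawn above for panels B and C, between the fraction of all clusters that contain both receptors and the fraction of PRLR clusters that also contain GHR, can be illustrated with a small worked example. The data are invented for illustration and the function names are our own; the averaging step mirrors the description of 4 ROIs per cell averaged over three cells.

```python
from statistics import mean


def colocalized_fraction(clusters):
    """Fraction of all clusters containing both receptors.

    `clusters` is a list of (has_ghr, has_prlr) booleans.
    """
    both = sum(1 for has_ghr, has_prlr in clusters if has_ghr and has_prlr)
    return both / len(clusters)


def fraction_of_prlr_with_ghr(clusters):
    """Fraction of PRLR-containing clusters that also contain GHR."""
    prlr_clusters = [c for c in clusters if c[1]]
    return sum(1 for has_ghr, _ in prlr_clusters if has_ghr) / len(prlr_clusters)


def average_over_cells(roi_fractions_per_cell):
    """Average ROI fractions within each cell, then average across cells."""
    return mean(mean(rois) for rois in roi_fractions_per_cell)


# Two hypothetical ROIs, each with 80% of clusters containing both receptors.
rest_is_prlr_only = [(True, True)] * 8 + [(False, True)] * 2
rest_is_ghr_only = [(True, True)] * 8 + [(True, False)] * 2

for roi in (rest_is_prlr_only, rest_is_ghr_only):
    print(colocalized_fraction(roi), fraction_of_prlr_with_ghr(roi))

# Hypothetical fractions: 4 ROIs in each of 3 cells.
overall = average_over_cells([[0.8, 0.8, 0.7, 0.9],
                              [0.8, 0.8, 0.8, 0.8],
                              [0.75, 0.85, 0.8, 0.8]])
print(overall)
```

      Both ROIs give the same colocalized fraction, yet the share of PRLR clusters that contain GHR differs between them. The two quantities coincide only when every cluster contains PRLR, which is the point made in the response.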

      Reviewer #3 (Public Review):

      (1) The manuscript suffers from a lack of detail, which in places makes it difficult to evaluate the data and would make it very difficult for the results to be replicated by others. In addition, the manuscript would very much benefit from a full discussion of the limitations of the study. For example, the manuscript is written as if there is only one form of the PRLR while the anti-PRLR antibody used for dSTORM would also recognize the intermediate form and short forms 1a and 1b on the T47D cells. Given the very different roles of these other PRLR forms in breast cancer (Dufau, Vonderhaar, Clevenger, Walker and other labs), this limitation should at the very least be discussed. Similarly, the manuscript is written as if Jak2 essentially only signals through STAT5 but Jak2 is involved in multiple other signaling pathways from the multiple PRLRs, including the long form. Also, while there are papers suggesting that PRL can be protective in breast cancer, the majority of publications in this area find that PRL promotes breast cancer. How then would the authors interpret the effect of PRL on GHR in light of all those non-protective results? [Check papers by Hallgeir Rui]

      We thank the reviewer for such thoughtful comments. We have added a paragraph in the Discussion section on the limitations of our study, including sole focus on T47D and γ2A-JAK2 cells and lack of PRLR isoform-specific data. Also, we are now mentioning that these isoforms play different roles in breast cancer, citing papers by Dufau, Vonderhaar, Clevenger, and Walker labs.

      We did not mean to imply that JAK2 signals only via STAT5 or by only binding the long form. We have made this point clear in the Introduction as well as in our revised Discussion section. Moreover, we have added information and references on JAK2 signaling and PRLR isoform specific signaling.

      In our Discussion section we also mention the findings that PRL promotes breast cancer. We would like to point out that it is conceivable that PRL is protective in BC by reducing surface hGHR availability but that this effect may depend on JAK2 levels as well as on expression levels of other kinases that competitively bind Box1 and/or Box2 [1]. Besides, could it not be that PRL’s effect is BC stage dependent? In any case, we have emphasized the speculative nature of our statement.

      (1) Chhabra, Y., Seiffert, P., Gormal, R.S., et al. Tyrosine kinases compete for growth hormone receptor binding and regulate receptor mobility and degradation. Cell Rep. 2023;42(5):112490. doi: 10.1016/j.celrep.2023.112490. PMID: 37163374.

      Reviewer #3 (Recommendations for the authors):

      Points for improvement of the manuscript:

      (1) Method details -

      a) "we utilized CRISPR/Cas9 to generate hPRLR knockout T47D cells ......" Exactly how? Nothing is said under methods. Can we be sure that you knocked out the whole gene?

      We have addressed this point by adding two new sections on “Generating hGHR knockout and hPRLR knockout T47D cells” and “Design of sgRNAs for hGHR or hPRLR knockout” to the Methods section.

      b) Some of the Western blots are missing mol wt markers. How specific are the various antibodies used for Westerns? For example, the previous publications that are quoted as providing characterization of the antibodies also seem to use just band cutouts and do not show the full molecular weight range of whole cell extracts blotted. Anti-PRLR antibodies are notoriously bad and so this is important.

      There is an antibody referred to in Figure 5 that is not listed under "antibodies" in the methods.

      We have modified Figure 5a, showing the entire gel as well as molecular weight markers. As for specificity of our antibodies, we used monoclonal antibodies Anti-GHR-ext-mAB 74.3 and Anti-PRLR-ext-mAB 1.48, which have been previously tested and used. In addition, we did our own control experiments to ensure specificity. We have added some of our many control results as Supplementary Figures S2 and S3.

      We thank the reviewer for noticing the missing antibody in the Methods section. We have now added information about this antibody.

      c) There is no description of the proximity ligation assay.

      We have addressed this by adding a paragraph on PLA in the Methods section.

      d) What is the level of expression of GHR, PRLR, and Jak2 in the gamma2A-JAK2 cells compared to the T47D cells? Artifacts of overexpression are always a worry.

      The γ2A-JAK2 cell series over-express the receptors. For that reason, we did not rely solely on the observations in γ2A-JAK2 cell lines but also performed the experiment in T47D cells.

      e) There are no concentrations given for components of the dSTORM imaging buffer. On line 380, I think the authors mean alternating lasers not alternatively.

      Thank you. Indeed, we meant alternating lasers. We are referring to [1] (the protocol we followed) for information on the imaging buffer.

      (1) Beggs, R.R., Dean, W.F., Mattheyses, A.L. (2020). dSTORM Imaging and Analysis of Desmosome Architecture. In: Turksen, K. (eds) Permeability Barrier. Methods in Molecular Biology, vol 2367. Humana, New York, NY. https://doi.org/10.1007/7651_2020_325

      f) In general, a read-through to determine whether there is enough detail for others to replicate is required. 4% PFA in what? Do you mean PBS or should it be Dulbecco's PBS etc., etc.?

      We prepared a 4% PFA in PBS solution. We mean Dulbecco's PBS.

      (2) There are no controls shown or described for the dSTORM. For example, non-specific primary antibody and second antibodies alone for non-specific sticking. Do the second antibodies cross-react with the other primary antibody? Is there only one band when blotting whole cell extracts with the GHR antibody so we can be sure of specificity?

      We used monoclonal antibodies Anti-GHR-ext-mAB 74.3 and Anti-PRLR-ext-mAB 1.48 (but also tested several other antibodies). While these antibodies have been previously tested and used, we performed additional control experiments to ensure specificity of our primary antibodies and absence of non-specific binding of our secondary antibodies. We have added some of our many control results as Supplementary Figures S2 and S3.

      (3) Writing/figures-

      a) As discussed in the public review regarding different forms of the PRLR and the presence of other Jak2-dependent signaling

      We have added paragraphs on PRLR isoforms and other JAK2-dependent signaling pathways to the Introduction. Also, we have added a paragraph on PRLR isoforms (in the context of our findings) to the Discussion section.

      b) What are the units for figure 3c and d?

      The figures show numbers of localizations (obtained from fluorophore blinking events). In the figure caption to 3C and 3D, we have specified the unit (i.e. counts).

      c) The wheat germ agglutinin stains more than the plasma membrane and so this sentence needs some adjustment.

      We thank the reviewer for this comment. We have rephrased this sentence (see caption to Fig. 4).

      d) It might be better not to use the term "downregulation" since this is usually associated with expression and not internalization.

      While we understand the reviewer’s discomfort with the use of the word “downregulation”, we still think that it best describes the observed effect. Moreover, we would like to note that in the field of receptorology “downregulation” is a specific term for trafficking of cell surface receptors in response to ligands. That said, to address the reviewer’s comment, we are now using the terms “cell surface downregulation” or “downregulation of cell surface [..] receptor” throughout the manuscript in order to explicitly distinguish it from gene downregulation.

      e) Line 420 talks about "previous work", a term that usually indicates work from the same lab. My apologies if I am wrong, but the reference doesn't seem to be associated with the authors.

      At the end of the sentence containing the phrase “previous work”, we are referring to reference [57], which has Dr. Stuart Frank as senior and corresponding author. Dr. Frank is also a co-corresponding author on this manuscript. While in our opinion, “previous work” does not imply some sort of ownership, we are happy to confirm that one of us was responsible for the work we are referencing.

      Reviewing Editor's recommendations:

      The reviewers have all provided a very constructive assessment of the work and offered many useful suggestions to improve the manuscript. I'd advise thinking carefully about how many of these can be reasonably addressed. Most will not require further experiments. I consider it essential to improve the methods to ensure others could repeat the work. This includes adding methods for the PLA and including detail about the controls for the dSTORM. The reviewers have offered suggestions about types of controls to include if these have not already been done.

      We thank the editor for their recommendations. We have revised the methods section, which now includes a paragraph on PLA as well as on CRISPR/Cas9-based generation of mutant cell lines. We have also added information on the dSTORM buffer to the manuscript. Data of controls indicating antibody specificity (using confocal microscopy) have been added to the manuscript’s supplementary material (see Fig. S2 and S3).

      I agree with the reviewers that the different isoforms of the prolactin receptor need to be considered. I think this could be done as an acknowledgment and point of discussion.

      We have revised the Discussion section and have added a paragraph on the different PRLR isoforms, among others.

      For Figure 2E, make it clear in the figure (or at least in legend) that the middle line is the basal condition.

      We thank the editor for their comment. We have made changes to Fig 2E and have added a sentence to the legend making it clear that the middle depicts the basal condition.

      My biggest concern overall was the fact that this is all largely conducted in a single cell line. This was echoed by at least one of the reviewers. I wonder if you have replicated this in other breast cancer cell lines or mammary epithelial cells? I don't think this is necessary for the current manuscript but would increase confidence if available.

      We thank the editor for their comment and fully agree with their assessment. Unfortunately, we have not replicated these experiments in other BC cell lines or in mammary epithelial cells, but we would certainly want to do so in the near future.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this work, the authors investigate the functional difference between the most commonly expressed form of PTH, and a novel point mutation in PTH identified in a patient with chronic hypocalcemia and hyperphosphatemia. The value of this mutant form of PTH as a potential anabolic agent for bone is investigated alongside PTH(1-84), which is a current anabolic therapy. The authors have achieved the aims of the study.

      Strengths:

      The work is novel, as it describes the function of a novel, naturally occurring, variant of PTH in terms of its ability to dimerise, to lead to cAMP activation, to increase serum calcium, and its pharmacological action compared to normal PTH.

      Recommendations for the authors:

      (1) In your response to the reviewers you included a figure. You said it was for the reviewers only. We are *not* including it here. Is that correct or should it be in the Public Reviews?

      We apologize for any confusion and appreciate your thorough review. The phrase “data only for reviewers” was intended to indicate that the content was included in the revision in response to the reviewers’ comments, not in the main text (article). However, we acknowledge that this phrasing may be inappropriate. We agree to include the figure from the previous author response in the Public Reviews. Accordingly, we propose to revise the previous author response as follows:

      - Remove "(data only for reviewers)".

      - Correct the typo from "perosteal" to "periosteal".

      - “Thank you for your comment. First, we ensured that the bones sampled during the experiment showed no defects, and we carefully separated the femur bones from the mice to preserve their integrity. In the 3-point bending test, PTH treatment significantly increased the maximum load of the femur bone compared to the OVX-control group. Additionally, the maximum load in the PTH treatment group was significantly greater than that observed in the PTH dimer group. Furthermore, structural factors influencing bone strength, such as the periosteal perimeter and the endocortical bone perimeter, were also increased in the PTH treatment group compared to the PTH dimer group.”

      (2) Do you mean to always have R<sup>0</sup> (have a superscript) and RG (never have a superscript) or should they be shown in the same way throughout your paper?

      Thank you for your thorough review. Based on previous studies that addressed the conformation of PTH1R, R<sup>0</sup> is typically shown with a superscript, while RG is not (Hoare et al., 2001; Dean et al., 2006; Okazaki et al., 2008). We have followed this notation and will ensure consistency throughout our paper.

      Hoare, S. R., Gardella, T. J., & Usdin, T. B. (2001). Evaluating the signal transduction mechanism of the parathyroid hormone 1 receptor: effect of receptor-G-protein interaction on the ligand binding mechanism and receptor conformation. Journal of Biological Chemistry, 276(11), 7741-7753.

      Dean, T., Linglart, A., Mahon, M. J., Bastepe, M., Jüppner, H., Potts Jr, J. T., & Gardella, T. J. (2006). Mechanisms of ligand binding to the parathyroid hormone (PTH)/PTH-related protein receptor: selectivity of a modified PTH (1–15) radioligand for GαS-coupled receptor conformations. Molecular endocrinology, 20(4), 931-943.

      Okazaki, M., Ferrandon, S., Vilardaga, J. P., Bouxsein, M. L., Potts Jr, J. T., & Gardella, T. J. (2008). Prolonged signaling at the parathyroid hormone receptor by peptide ligands targeted to a specific receptor conformation. Proceedings of the National Academy of Sciences, 105(43), 16525-16530.

      (3) The following grammatical and fact changes and word changes are requested.

      We appreciate the thoughtful review and thank you for pointing out the grammatical, factual, and word changes required. We have carefully reviewed and addressed each of these corrections to ensure the paper's accuracy and readability.

      We appreciate the reviewers' detailed and constructive reviews. We have addressed all the comments to improve the quality of our paper.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      We thank the reviewers for their comments and have included substantial new data to strengthen the work by specifically addressing questions regarding the molecular mechanisms driving the proteomic and phenotypic changes observed in these disease models. We have generated a new ganglioside disease model (GM1 gangliosidosis) and demonstrated that the lysosomal exocytosis mechanism identified for GM2 gangliosidosis is a conserved mechanism that alters the PM proteome (see new Figure 5).

      We have also carried out substantial additional experimental work to address the question of whether specific protein-lipid interactions drive some of these changes. We have preliminary data supporting this (included below) but we are not confident that these data are robust enough for inclusion in this manuscript. This work required substantial in vitro experiments including the expression and purification of several proteins for use in liposome binding assays. Although these data are promising, they have been challenging to reproduce and we would prefer to develop this work further for inclusion in a subsequent paper.

      Although not requested by any reviewers we have also included substantial additional multielectrode array (MEA) data in Figure 4 to further support the phenotypic changes to electrical signalling seen in the Tay Sachs disease model.

      We would like to note that even without these new data the reviewers highlighted that the “high-quality data presented significantly advance the field” and that the work “exposes key conceptual novelties” using “new insight” and “new tools” that shed “light on the complex pathophysiology that links lipid accumulation to neuronal dysfunction”, highlighting “an underappreciated dimension of these diseases” and allowing them to be “understood better thanks to this study”. More generally, the reviewers state that the work is of interest to both “clinicians and basic researchers” and is relevant to “broader fields in cellular and neurodegenerative biology”.

      Point-by-point description of the revisions


      Reviewer 1

      Confirmation of Neuronal Differentiation: To confirm neuronal differentiation in their i3N cell model, the authors show qPCR results indicating the expression of mature neuronal markers and the downregulation of stem cell markers by day 14. However, single-cell RNA sequencing (scRNA-seq) could provide a more detailed evaluation of the differentiation process, addressing the fine-grained cell-type composition within the cell population. Depending on the results, the authors might more precisely interpret functional data and assess the possible influence of increased GM2 levels on cell fate decisions.

      The accumulation of GM2 may not be identical across all neurons, and so it is possible that, although the neuronal population as a whole displays mature differentiation, individual cells may respond differently to the amount of lipid debris. However, there are several technical reasons why obtaining samples for scRNA-seq is extremely challenging. By 14 dpi the separation of individual neurons from each other is very difficult as they form a densely grown, highly attached and interconnected network. Furthermore, the individual neurons have a highly polarized differentiated morphology with long, delicate axonal and dendritic projections that are readily cleaved and lysed in the process of harvesting and dissociation to obtain single-cell suspensions for FACS sorting. In neurons, mRNAs are also abundantly localised along the length of their neuritic projections [1], so these damaged preparations would provide unreliable data. Alternatively, sufficiently isolated individual neurons show poor survival and do not mature. If these technical difficulties could be overcome, in order to monitor altered differentiation it would be necessary to determine which timepoint was most relevant to capture differences between day 0, when the cells are stem cells, and day 28, when the cultures are synchronously firing glutamatergic neurons. For this analysis to be robust it would require sample preparation and analysis of multiple stages of the differentiation process. For all the reasons above we cannot address this reviewer’s request.

      Mechanistic Links Between Lipid Accumulation and Proteomic Changes: The authors report specific proteome changes upon HEXA/B KO. What are the mechanistic links between lipid accumulation and proteomic changes? Is the overall degradative performance of lysosomes compromised? The authors note that certain proteins, such as TSPANs, can bind directly to GSL headgroups. Clarifying whether the observed proteomic changes result from specific, direct lipid-protein interactions versus indirect effects could strengthen the argument for targeted lipid-mediated proteomic shifts.

      In response to these questions, we have carried out substantial additional experimental work testing the lipid interactions of some of the proteins that are most altered in their abundance at the PM. We focussed on the top non-lysosomal proteins as we are proposing that the lysosomal ones are primarily changed due to lysosomal exocytosis, suggesting the non-lysosomal are the best candidates for direct GSL-binding. To robustly identify specific lipid-protein interactions is highly challenging but something we have demonstrated previously [2].

      In vitro lipid-binding assays require expression and purification of the proteins of interest to then be used in liposome pulldown experiments using liposomes of defined composition. As we are most interested in the specificity of the headgroup interaction we focussed on producing the extracellular portions of these proteins that would be predicted to bind these headgroups (again this is a strategy we have successfully used previously [2]). We expressed and purified the extracellular domains of three top non-lysosomal hits: CNTNAP4, CNTN5 and NTRK2 (Fig. R1A, provided in attached response document). These purified proteins were used in liposome-binding assays using liposomes composed of different sphingolipids and gangliosides (Fig. R1B). These data demonstrate that the GPI-anchored protein CNTN5 and its potential binding partner CNTNAP4 bind promiscuously to different headgroups. This may be consistent with their being incorporated into GSL-rich membrane microdomains via the GPI-anchor. Interestingly, in this assay NTRK2 demonstrates specific and substantial binding to GM2, with some weaker binding to GD3.

      These data support that the increased abundance of NTRK2 at the PM could be driven by direct interactions with the same lipid that is accumulating at the PM. As exciting and compelling as these data are, we have subsequently been unable to repeat this observation for NTRK2. We are unsure why and have tried several different strategies to test this interaction, but at this stage with only an N=1 for this observation we do not feel confident to include these data in the manuscript.

      We intend to pursue this further using a range of alternative techniques and protein constructs but this will take substantial additional time and effort that we feel go beyond the scope of this current manuscript.

      Additionally, does this phenomenon extend to other sphingolipidoses (e.g., Gaucher disease)? Comparing the proteomes of i3N cells across different sphingolipidoses could reveal whether the accumulation of distinct GSLs produces unique or shared proteomic profiles, highlighting similarities or specificities across lysosomal storage disorders.

      We agree with the reviewer that this is an interesting and important question and had intended to do this as follow-up work in a future publication. However, in the interests of addressing this point here, we are including additional data we have generated from a new i3N model of GM1 gangliosidosis. As for the GM2 gangliosidosis models, we used CRISPRi to knockdown GLB1 and have confirmed this KD by q-PCR. We have also profiled the GSL composition and quantified the increased GM1 abundance. We have followed this up with both whole-cell and PM proteomics. We have presented comparative proteomics of the two models and demonstrated that they both result in significant accumulation of lysosomal proteins both in cells and at the PM. This shared proteomic profile is consistent with lysosomal exocytosis being a conserved mechanism driving altered PM composition in these diseases. We have included this work as an additional results section and an additional figure (Figure 5) as well as expanding the discussion. For this analysis we collected mass spec data at 28 dpi based on our observations in the paper that electrical signalling was synchronised at this point (Fig 4). In the text we discuss additional changes in these new WCP data such as the appearance of other trafficking molecules such as Arl8a that further support a lysosomal exocytosis mechanism.

      In terms of the unique proteomic profiles of these diseases, the read depth of the PMP data in this case was not sufficient to confidently identify differences between the two gangliosidosis models and therefore we intend to pursue this work with additional LSDs in future studies to be included in a follow-up paper.

      In terms of mechanistic links between lipid accumulation and proteome changes, we feel these new data provide substantial additional support that the appearance of lysosomal proteins at the PM is driven by lysosomal exocytosis and have preliminary data supporting that some non-lysosomal protein changes may be driven by altered protein-lipid interactions.

      Impact of Increased PM GM2 Levels on Endocytic Pathways: Along similar lines, the authors show differences in the PM proteome and in the representation of specific PM lipid domain-associated proteins. As some of these proteins are turned over by mechanisms involving lipid domain-dependent endocytosis, the authors might want to examine the effect of increased PM GM2 levels on various endocytic pathways.

      We thank the reviewer for this suggestion and have attempted assays monitoring endocytosis using several approaches including the uptake of fluorescently labelled bovine serum albumin (DQ-BSA) [3–5]. These endocytosis assays are well established in standard cell lines such as HeLa cells. Despite several attempts by us to get this working in neurons using multiple alternative readouts (microscopy and plate-based fluorescence) we have been unable to measure changes in endocytosis. Exploration of alternative methods to probe Clathrin-independent/dynamin-independent endocytosis (CLIC/GEEC) suggests these pathways are difficult to observe by fluorescence microscopy as there is minimal concentration of cargo proteins during the formation of carriers before endocytosis [6]. As an alternative strategy to probe changes in lipid-domain dependent endocytosis we have analysed the proteomics data for changes in galectins but no changes were identified in the data. We also explored available tools for modulating lysosomal exocytosis and monitoring lysosomal movement including activating TRPML1 to trigger exocytosis and activating ABCA3 to drive more lipid accumulation [7–10]. Similarly to the endocytosis assays above, these were not translatable to neurons in our hands due to a range of challenges including increased toxicity of these drugs on this cell type. We have made a substantial effort to try and address these questions and have conferred with colleagues who have also reported difficulties in establishing these assays in neurons. We are keen to continue to pursue this question but due to the technical challenges we feel this work lies beyond the scope of the current manuscript.

      Multifaceted Nature of Gangliosidoses as PM Disorders: The manuscript presents an important perspective by reframing gangliosidoses as multifaceted PM disorders that disrupt neuronal function and membrane composition. By further elaborating on the connection between membrane lipid alterations, neuronal excitability, and synaptic composition, and by exploring the interplay with lysosomal dysfunction, the authors could provide a richer understanding of gangliosidoses and GSL function in general.

      We appreciate that the reviewer agrees with us that reframing gangliosidoses as more complex multifaceted diseases is important. We are not sure if there is a request here for more elaboration in the text but based on the new data included in the paper, we have expanded some of the discussion around these points. We are very enthusiastic to continue to probe the connections and interplay as described by the reviewer and this is the focus of our ongoing studies.

      Reviewer 2

      1. T-tests and one-way ANOVAs were used, but it is not clear if datasets were tested for normality and equal standard deviations. Please add these details. If data are not normal or standard deviations are unequal, other tests will have to be used.

      All datasets were checked for normality and for equality of standard deviations. For figure 1F, where the data were not normally distributed, a Kruskal-Wallis test was used in place of a one-way ANOVA. All significantly different results are now labelled on graphs and the relevant tests described in the figure legends. This has also all been updated in the Supplementary data.
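
      For readers unfamiliar with the rank-based alternative to a one-way ANOVA, the following is a minimal sketch of how the Kruskal-Wallis H statistic is computed. It is illustrative only: the function names and the example values are hypothetical, no tie correction is applied, and it is not the software used for the analyses in the manuscript.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (without tie correction) for several groups."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        # Sum of the pooled ranks that belong to this group.
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical measurements for three conditions (not data from the study).
h_stat = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
```

      Because the statistic depends only on ranks, it does not require the normality assumption that a one-way ANOVA makes.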

      1. It needs to be clearly explained how many data points were used for statistical analyses and what the data points were. E.g., N=3 independent experiments on 3 different days, each done in n=3 different wells, total n=9. Each well can be considered a biological replicate, but it's of lesser value than the "big Ns" done on different days. The authors can choose different ways of defining their N/n numbers, but it has to be transparent. The bar graphs would ideally display the data points.

      All figure legends now clearly explain N and n numbers used in experiments. Individual data points are displayed on qPCR graphs where N and n are mixed, with shapes denoting the biological repeat (N). In addition to clarification in figure legends, N and n numbers are described in the methods sections where appropriate.

      For completeness we also include here details of these N/n numbers.

      • For the q-PCR experiments, technical triplicates (repeats on the same day, n=3) were carried out for 3 separate biological replicates on different days (N=3). We have changed how these data are plotted to clarify this.
      • For the activity assays, N=3 biological replicates were carried out on cell lysates from cultures grown on different days.
      • For the microscopy analysis, coverslips from N=3 biological replicates on different days were used. n=2 coverslips per N were used to generate 15 images per N.
      • For the glycan analysis, N=3 independent cell pellets were prepared on different days.
      • For the proteomics experiments, these were done as N=3 independent cell cultures grown and prepared on different days. Specifically, one of each cell line SCRM, HEXA-1, HEXA-2, HEXB-1 and HEXB-2 were grown and harvested or biotinylated at a time (for WCP or PMP), with repeats on different days. These N=3 were then combined for the ΔHEX-A/B lines to provide N=12 biological repeats for disease cell lines to be compared to N=3 biological repeats for “SCRM” control cell lines.
      • For calcium imaging, n=4 wells for each of SCRM, ΔHEXA-1 and ΔHEXB-1 were averaged and the mean from each was used to provide n=3 data points across two biological repeats of this experiment, N=2.
      • For the MEA data, we now include substantially more data than in the original manuscript (see comments at the top of this document). This is now N=3 biological replicates across n=52 wells over a time period from 38-45 dpi.
      • The N/n values and statistical tests have also all been updated in the Supplementary data.
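
      The distinction between biological replicates (N) and technical replicates (n) can be illustrated with a short sketch. It assumes one common convention, in which technical replicates are averaged within each biological replicate so that only N independent values enter a statistical test. The function name and the numbers are hypothetical and are not taken from the datasets described above.

```python
from statistics import mean


def collapse_technical_replicates(measurements):
    """Average technical replicates (n) within each biological replicate (N).

    `measurements` maps a biological-replicate label to the list of its
    technical-replicate values; one mean per biological replicate is returned.
    """
    return {bio_rep: mean(values) for bio_rep, values in measurements.items()}


# Hypothetical relative-expression values: N=3 days, n=3 technical repeats each.
qpcr = {
    "day1": [0.21, 0.19, 0.20],
    "day2": [0.25, 0.24, 0.26],
    "day3": [0.18, 0.17, 0.19],
}
per_day = collapse_technical_replicates(qpcr)
total_measurements = sum(len(values) for values in qpcr.values())
print(len(per_day), "independent data points from", total_measurements, "measurements")
```

      Under this convention the nine measurements contribute three independent data points, which is why reporting N and n separately matters.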

      1. There should be a comment on how statistical power was calculated upfront and if not: how N/n numbers were chosen ("based on similar expts in the past").

      N/n numbers, as detailed above, were chosen based on previous experiments by ourselves and others, as well as recommended practice [2,11–15]. Typically, these papers do not describe the statistical power upfront. We have added statements to this effect and relevant references to the methods section of the manuscript.

      1. "This suggests that some of the proteins that are accumulating in these diseases are specifically products of lipid accumulation rather than a product of general lysosomal dysfunction. In further support of this, several lysosomal proteins including V-type ATPases (ATP6 family), mannose-6-phosphate receptor (M6PR) and biogenesis of lysosomal organelle complex subunits (BLOC1) are quantified in the WCP but are not increased in abundance." This part is confusing. It seems like the authors observe an accumulation of endolysosomes in general (page 6), but then only certain endolysosomal proteins accumulate - and the authors speculate that this is due to decreased degradation or enhanced translation (mRNA levels are unaffected). This question should be addressed better, ideally experimentally: are endolysosomes accumulating in general or not? And what defines the endolysosomal proteins that accumulate vs. those that don't? How is that regulated?

      Recently published work has identified that late endosomes/lysosomes do not possess one composition; they are dynamically remodelled and there is substantial heterogeneity in the composition of different lysosomes [16,17]. While some components, such as LAMP1 and Cathepsin D, are common across all lysosomal compartments there is considerable heterogeneity in the composition of these organelles. These studies also demonstrate that in disease-relevant conditions or upon drug treatment, lysosomes change their protein composition. For example, in a LIPL-4 KO mouse model they observe an increased abundance of Ragulator complex components, similarly to the increase in LAMTOR3 seen in our new 28 dpi WCP data for GM1 and GM2 gangliosidoses. Interestingly, in this study they demonstrate that lysosomal lipolysis leads to bigger changes in lysosomal protein composition than other pro-longevity mechanisms [17]. Another recent paper looking at a different lysosomal storage disease in microglia with accumulating GSLs and cholesterol has also identified abundance changes in a subset of lysosomal proteins including several we observe here including TTYH3, NPC1, PSAP and TSPAN7 [18]. Beyond proteomic analyses, the experimental tools for identifying these different populations are currently very limited, but these published studies support that it is possible to have accumulation of what we define as lysosomes by IF (using LAMP1 or lysotracker) but for the proteomic analysis to identify increased abundance of only a subset of lysosomal proteins.

      These papers do not identify or speculate on how these differences are regulated. Analysis of the changes in our WCP as well as the new data for GM1 gangliosidoses support that the proteins that are most changed in response to GSL accumulation are membrane proteins involved in lipid and cholesterol binding and transport (New Fig 2D and 5E and see response below). This specific enrichment suggests that the changes are directly linked to the lipid changes, thus our suggestion that these accumulate due to a need for the cell to process these lipids but also that they may get “trapped” in the membrane whorls such that they are not efficiently degraded.

      We have included the references above and a more detailed description of lysosomal heterogeneity into the main text to help address the reviewer’s questions.

      1. Fig. 1D: The GO terms are confusing. Why are there more proteins in the category lysosomal membrane than lysosome as a whole? Other categories seem to be overlapping as well.

      We apologise for the confusion; this graph does not display protein counts but rather the adjusted P values for the enrichment of each term. To make this clearer, the DAVID analysis graphs are now presented in a new format. In this new graph we present the false discovery rate (FDR, an adjusted P value), which is a measure of the significance of whether that GO term is specifically enriched in the dataset. We have also expanded the GO term analysis to include molecular function and biological process descriptors in addition to the cellular component originally described. For full clarity, to the right of each term we include the number of significant hits that have this term, that being the number of proteins that contribute to this GO term enrichment.
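
      To illustrate why an adjusted P value is a different quantity from a protein count, the sketch below shows one widely used multiple-testing adjustment, the Benjamini-Hochberg procedure. This is an assumption made for illustration: the exact correction behind the values reported by DAVID is not restated here, and the function name and input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw enrichment p-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
```

      A term supported by only a few proteins can therefore still have a smaller adjusted P value than a broader term with more members, since the value reflects enrichment relative to background and not the number of hits.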

      1. Fig. 2C/3A: It'd be good to also show the hits that don't match the expectation/pathways of interest.

      We provide a full list in the Supplementary Information of all hits that are considered significant allowing the reader to access this information without having to download the datasets from PRIDE. We did not label all hits in these panels to avoid cluttering the image. In the main text we have focused on those that clearly fall within related categories or pathways as we feel that several “hits” in the same area represents a more compelling and confident assessment of the data. Several of the additional hits not mentioned in the main text do still match the expectations/pathways. For example, one of the top hits not labelled in the WCP is GPR155 (a cholesterol binding protein at the lysosomal membrane) and one of the top unlabelled hits in the PMP data is OPCML (a GPI-anchored protein that clusters in GSL-rich microdomains). There are some, such as KITLG (up in the PMP data), that we don’t currently have a hypothesis for why/how they change, but we are reluctant to describe and speculate upon additional isolated/orphan hits in the main text when these have not been further validated.

      1. Fig. 3: It is not intuitive that synaptic proteins in particular would accumulate at the plasma membrane due to the lipid storage defect. Are they mis-trafficked or are they at synaptic membranes? That could, e.g, be addressed by isolating synaptosomes. And why this selectivity for synaptic proteins? Neurons should have more plasma membrane that is not synaptic. And, e.g, the release of lysosomal material should not happen at synapses (and lysosomes should not deliver synaptic proteins to the PM, unless there is a failure to degrade them).

      We agree that synapses represent a relatively small proportion of the entire PM of neurons, but synapses are particularly enriched with glycosphingolipids where they affect synaptogenesis and synaptic transmission [19–22]. For these reasons we think that some synaptic proteins are particularly sensitive to these lipid changes as they are localised in GSL-rich membrane microdomains. We have now clarified this point in the text. We have also further clarified that we were not proposing that lysosomal proteins are present at the synapses. We observed that lysosomal proteins are enriched at the PM and this may be more generally across the whole PM, while the changes to synaptic proteins may or may not be localised at the synapse. We apologise for the confusion and have modified the text at the end of the PM proteomics results section to make this clearer.

      To try and address experimentally the question of whether these proteins are at synapses, we have attempted synaptosome enrichment. However, lysosomal compartments co-sedimented with synaptosomes during the preparation – LAMP1 staining was enriched in the synaptosome preparations of all samples including SCRM controls. Therefore, we cannot distinguish these compartments which is particularly problematic in this disease model.

      (7. Continued) Or is there an effect on synaptic vesicles? Are there more? Do they deliver their cargo more readily? Or is there a failure to do endocytosis of synaptic proteins, and that's why they accumulate? What is the connection between SVs and endolysosomes? More clarity would be good here.

      We do think that there is an effect on synaptic vesicles particularly as the SV proteins SYT1 and SV2b are significantly increased in abundance at the PM suggesting they are not being internalized normally. Furthermore, the new WCP data going out to 28 dpi for both GM1 and GM2 gangliosidoses have identified a significant increase in Arl8a which plays a shared role in lysosomal and SV anterograde trafficking [23,24]. Whilst previously thought of as discrete pathways, evidence now suggests that endolysosomal and SV recycling pathways form a continuum with several shared proteins involved in the fusion, trafficking and sorting in both pathways [25]. Arl8a provides a good example of an adaptor protein that functions in both pathways and also when overexpressed results in enhanced neurotransmission consistent with our studies [26]. We have adjusted the discussion text to include a description of the links between SVs and endolysosomal trafficking and the potential shared role Arl8a may be playing in both pathways.

      Regarding the question of whether there are more SVs or not, this is hard to determine directly as they are particularly small (~50 nm) and difficult to visualise or specifically stain for using microscopy. Not all SV-associated proteins are increased in the PMP data, for example SNAP25 and several other synaptotagmins are not changed in the 28 dpi data for both gangliosidosis models. We hope in the future to address SV changes more directly with higher resolution imaging such as electron microscopy or cryo-tomography but cannot currently confidently answer these specific questions.

      1. Fig. 4: The assumption that there is more synaptic activity because there are more synaptic proteins at the membrane seems to be plausible, but also speculative at this point.

      We have modified the text at the end of this results section to highlight that this is a speculative link.

      1. The possible contribution of glial cells should at least be discussed.

      We mention potential deleterious effects on bystander cells including other neurons, astrocytes and microglia in the second last paragraph of the discussion. In response to this request we have expanded and modified this text.

      Minor: there are some typos etc.

      Although no specific examples were listed, we have endeavoured to find and correct typos. We have also checked for English (not American) spelling throughout.

      Reviewer 3

      1. Results section, 1st paragraph- to develop disease models- -- Please add cellular models as we already have KO mouse models.

      This has been added to the text.

      1. It was not clear what was the percentage of mutation success with their CRISPR technique.

      The CRISPR method employed here was CRISPRi, so there is no mutation of the genome. Instead, inactive/dead-Cas9 is targeted to the promoter/early exon of the HEXA or HEXB gene to inhibit mRNA production. We have included qPCR data to demonstrate the extent of the KD for two different guides to each of these genes in Fig 1.

      1. Will the anti-GM2 antibody be available for other researchers? The researcher details needs to be clarified.

      The anti-GM2 antibody is not commercially available and was generated by one of the co-authors. We invite scientists with an interest in this antibody to contact the corresponding author for details.

      1. Hex activity assay was shown in 1C, but it was not clear that it is MUG or MUGS.

      We apologise for this and have relabelled these activity assay graphs and expanded the legend text to clarify how these two substrates were used to distinguish the two different KD lines. We also corrected a small mistake in the methods section.

      1. Is there a significance in Figure 2 B, 4A, 4B,4C and 4E?

      Based on additional requests from reviewer 2 we have added significance indicators and details of significance tests for several panels in Figures 1-5 including 2B and 4B. For 4A we do not state a significant difference, we use these data to select a timepoint (28 dpi) where all cell lines have synchronous (correlated) signal. The data in Figure 4C and D have been substantially updated and expanded. Analysis of the data in 4C is plotted in 4D where we show significance. For 4E we are stating that the applied stimulation (white triangles) stimulates the HEXA cells every time but the SCRM do not respond to each stimulation. It is not clear how we would quantify this difference and there is no precedent for doing this in the MEA literature or by the Axion company who provided the instrument. We have also included additional references for best practice when analysing MEA data.

      REFERENCES

      1. Mofatteh M. mRNA localization and local translation in neurons. AIMS Neurosci. 2020;7: 299–310. doi:10.3934/Neuroscience.2020016
      2. McKie SJ, Nicholson AS, Smith E, Fawke S, Caroe ER, Williamson JC, et al. Altered plasma membrane abundance of the sulfatide-binding protein NF155 links glycosphingolipid imbalances to demyelination. Proc Natl Acad Sci U S A. 2023;120: e2218823120. doi:10.1073/pnas.2218823120
      3. Marwaha R, Sharma M. DQ-Red BSA Trafficking Assay in Cultured Cells to Assess Cargo Delivery to Lysosomes. Bio Protoc. 2017;7: e2571. doi:10.21769/BioProtoc.2571
      4. gustavo.parfitt. Lysosome proteolysis analysis with DQ-BSA. 2022 [cited 13 Feb 2025]. Available: https://www.protocols.io/view/lysosome-proteolysis-analysis-with-dq-bsa-cgjxtupn
      5. Fernandez-Mosquera L, Yambire KF, Couto R, Pereyra L, Pabis K, Ponsford AH, et al. Mitochondrial respiratory chain deficiency inhibits lysosomal hydrolysis. Autophagy. 2019;15: 1572–1591. doi:10.1080/15548627.2019.1586256
      6. Rennick JJ, Johnston APR, Parton RG. Key principles and methods for studying the endocytosis of biological and nanoparticle therapeutics. Nat Nanotechnol. 2021;16: 266–276. doi:10.1038/s41565-021-00858-8
      7. Pastore N, Annunziata F, Colonna R, Maffia V, Giuliano T, Custode BM, et al. Increased expression or activation of TRPML1 reduces hepatic storage of toxic Z alpha-1 antitrypsin. Molecular Therapy. 2023;31: 2651–2661. doi:10.1016/j.ymthe.2023.06.018
      8. Zhang H, Wang Y, Wang R, Zhang X, Chen H. TRPML1 agonist ML-SA5 mitigates uranium-induced nephrotoxicity via promoting lysosomal exocytosis. Biomedicine & Pharmacotherapy. 2024;181: 117728. doi:10.1016/j.biopha.2024.117728
      9. Shen D, Wang X, Li X, Zhang X, Yao Z, Dibble S, et al. Lipid Storage Disorders Block Lysosomal Trafficking By Inhibiting TRP Channel and Calcium Release. Nat Commun. 2012;3: 731. doi:10.1038/ncomms1735
      10. Wünkhaus D, Tang R, Nyame K, Laqtom NN, Schweizer M, Scotto Rosato A, et al. TRPML1 activation ameliorates lysosomal phenotypes in CLN3 deficient retinal pigment epithelial cells. Sci Rep. 2024;14: 17469. doi:10.1038/s41598-024-67479-8
      11. Zlamalova E, Rodger C, Greco F, Cheers SR, Kleniuk J, Nadadhur AG, et al. Atlastin-1 regulates endosomal tubulation and lysosomal proteolysis in human cortical neurons. Neurobiol Dis. 2024;199: 106556. doi:10.1016/j.nbd.2024.106556
      12. Anderson GSF, Ballester-Beltran J, Giotopoulos G, Guerrero JA, Surget S, Williamson JC, et al. Unbiased cell surface proteomics identifies SEMA4A as an effective immunotherapy target for myeloma. Blood. 2022;139: 2471–2482. doi:10.1182/blood.2021015161
      13. Mossink B, Verboven AHA, Hugte EJH van, Gunnewiek TMK, Parodi G, Linda K, et al. Human neuronal networks on micro-electrode arrays are a highly robust tool to study disease-specific genotype-phenotype correlations in vitro. Stem Cell Reports. 2021;16: 2182–2196. doi:10.1016/j.stemcr.2021.07.001
      14. McCready FP, Gordillo-Sampedro S, Pradeepan K, Martinez-Trujillo J, Ellis J. Multielectrode Arrays for Functional Phenotyping of Neurons from Induced Pluripotent Stem Cell Models of Neurodevelopmental Disorders. Biology. 2022;11: 316. doi:10.3390/biology11020316
      15. Weaver S, Dube S, Mir A, Qin J, Sun G, Ramakrishnan R, et al. Taking qPCR to a higher level: Analysis of CNV reveals the power of high throughput qPCR to enhance quantitative resolution. Methods. 2010;50: 271–276. doi:10.1016/j.ymeth.2010.01.003
      16. Bond C, Hugelier S, Xing J, Sorokina EM, Lakadamyali M. Heterogeneity of late endosome/lysosomes shown by multiplexed DNA-PAINT imaging. J Cell Biol. 2025;224: e202403116. doi:10.1083/jcb.202403116
      17. Yu Y, Gao SM, Guan Y, Hu P-W, Zhang Q, Liu J, et al. Organelle proteomic profiling reveals lysosomal heterogeneity in association with longevity. Elife. 2024;13: e85214. doi:10.7554/eLife.85214
      18. Yasa S, Butz ES, Colombo A, Chandrachud U, Montore L, Tschirner S, et al. Loss of CLN3 in microglia leads to impaired lipid metabolism and myelin turnover. Commun Biol. 2024;7: 1373. doi:10.1038/s42003-024-07057-w
      19. Sipione S, Monyror J, Galleguillos D, Steinberg N, Kadam V. Gangliosides in the Brain: Physiology, Pathophysiology and Therapeutic Applications. Front Neurosci. 2020;14: 572965. doi:10.3389/fnins.2020.572965
      20. Svennerholm L. Gangliosides and Synaptic Transmission. In: Svennerholm L, Mandel P, Dreyfus H, Urban P-F, editors. Structure and Function of Gangliosides. Boston, MA: Springer US; 1980. pp. 533–544. doi:10.1007/978-1-4684-7844-0_46
      21. Palmano K, Rowan A, Guillermo R, Guan J, McJarrow P. The role of gangliosides in neurodevelopment. Nutrients. 2015;7: 3891–3913. doi:10.3390/nu7053891
      22. Hering H, Lin C-C, Sheng M. Lipid rafts in the maintenance of synapses, dendritic spines, and surface AMPA receptor stability. J Neurosci. 2003;23: 3262–3271. doi:10.1523/JNEUROSCI.23-08-03262.2003
      23. Rizalar FS, Lucht MT, Petzoldt A, Kong S, Sun J, Vines JH, et al. Phosphatidylinositol 3,5-bisphosphate facilitates axonal vesicle transport and presynapse assembly. Science. 2023;382: 223–230. doi:10.1126/science.adg1075
      24. Klassen MP, Wu YE, Maeder CI, Nakae I, Cueva JG, Lehrman EK, et al. An Arf-like small G protein, ARL-8, promotes the axonal transport of presynaptic cargoes by suppressing vesicle aggregation. Neuron. 2010;66: 710–723. doi:10.1016/j.neuron.2010.04.033
      25. Ivanova D, Cousin MA. Synaptic Vesicle Recycling and the Endolysosomal System: A Reappraisal of Form and Function. Front Synaptic Neurosci. 2022;14: 826098. doi:10.3389/fnsyn.2022.826098
      26. Vukoja A, Rey U, Petzoldt AG, Ott C, Vollweiter D, Quentin C, et al. Presynaptic Biogenesis Requires Axonal Transport of Lysosome-Related Vesicles. Neuron. 2018;99: 1216-1232.e7. doi:10.1016/j.neuron.2018.08.004
      27. Saheki Y, De Camilli P. Synaptic Vesicle Endocytosis. Cold Spring Harb Perspect Biol. 2012;4: a005645. doi:10.1101/cshperspect.a005645
    1. Author response:

      Reviewer 1:

      A primary limitation of this study, acknowledged by the authors, is its reliance on self-reports of participants’ emotional states. Although considerable effort was made to minimize expectation effects, further research is needed to confirm that the observed behavioral changes reflect genuine alterations in emotional states.

      Thank you very much for raising this point. We fully agree that self-reported emotional states are inherently subjective and that the ramifications of this need to be clarified in the manuscript. However, we would suggest that the focus on self-report may be a strength rather than a limitation. First, the regularities and rules underlying and determining emotional self-report are of primary importance and interest in their own right, and the work presented here does, we believe, shed light on a rich structure present in multivariate timeseries of subjective self-reports and their response to external inputs. Second, there is no clear definition of what a “genuine emotion state” might be, particularly if there is a discrepancy with self-reported emotions.

      Additionally, the generalizability of the findings to long-term remediation strategies remains an open question.

      Yes, we agree that what we have described is limited to a short-term intervention and change. Whether these changes bear on longer-term changes remains to be assessed. Furthermore, the mechanisms or processes that would support such maintenance are of substantial interest, and will be the focus of future work.

      Second, the statistical analysis, particularly the computational approach, sometimes lacks sufficient detail and refinement. While I will not elaborate on specific points here, one notable issue is the interpretation of the intrinsic matrix (A). The model-free analysis reveals correlations between emotions at a given time or within an emotional state across time points. However, it does not provide evidence to support lagged interactions across states that would justify non-diagonal elements in A. The other result concerning the dynamics matrix only highlights a trend in the dominant eigenvalue, which is difficult to interpret in isolation. The absence of a statistically significant group x intervention interaction furthermore makes this finding less compelling. This weakens the study’s conclusions about the importance of intrinsic dynamics, as claimed in the title.

      We appreciate the reviewer’s detailed feedback on the statistical analysis and interpretation of the intrinsic dynamics matrix. It is true that the model-free analysis as presented focuses on within-state correlations and that we have not provided such model-free evidence for lagged interactions across states. We do note that the model comparison suggested that the intervention caused changes in the full A matrix. This would be unlikely if there had not been meaningful cross-emotion lagged effects. Similarly, inference of the A matrix could have revealed a diagonal matrix, and we preferred not to impose such an assumption a priori, as it is very restrictive. Nevertheless, in the absence of a statistically significant group x intervention interaction, the findings regarding the A matrix are less compelling than those related to the control analyses. While this is likely due to a lack of statistical power, these are important points which we will consider in more detail in the revision.

      Finally, to avoid potential misunderstandings of their work, the authors should be more careful about their use of terms pertaining to the control theory and take the time to properly define them. For example, the “controllability” of emotional states can either denote that those states are more changeable (control theory definition), or, conversely, more tightly regulated (common interpretation, as used in the abstract). This is true for numerous terms (stability, sensitivity, Gramian, etc.) for which no clear definition nor references are provided. Readers unfamiliar with the framework of control theory will likely be at a loss without more guidance.

      Thank you for this point. We recognize the potential for misunderstanding due to the dual usage of terms such as “controllability” and will improve the clarity to avoid any misunderstanding.
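
      As an illustration of the control-theory sense of the term, the sketch below computes a finite-horizon controllability Gramian for a discrete-time linear system x[t+1] = A x[t] + B u[t]. This is the standard textbook construction and is offered only as an assumption-laden example: the matrices are hypothetical, the helper names are invented for this sketch, and it is not the fitted model from the manuscript.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*x)]


def controllability_gramian(a, b, horizon):
    """Finite-horizon Gramian W = sum over k < horizon of A^k B B^T (A^T)^k."""
    n = len(a)
    gramian = [[0.0] * n for _ in range(n)]
    term = matmul(b, transpose(b))  # k = 0 term: B B^T
    a_t = transpose(a)
    for _ in range(horizon):
        gramian = [[g + t for g, t in zip(g_row, t_row)]
                   for g_row, t_row in zip(gramian, term)]
        term = matmul(matmul(a, term), a_t)  # next term: A (current term) A^T
    return gramian


# Hypothetical two-state system in which the input reaches only the first state.
A = [[0.5, 0.0], [0.0, 0.5]]
B = [[1.0], [0.0]]
W = controllability_gramian(A, B, horizon=2)
```

      In this sense a larger Gramian entry means a state is moved more easily by external inputs, which is the “more changeable” reading and the opposite of the everyday reading of being “in control”.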

      Reviewer 2:

      Acquiring data online inevitably gives rise to selection and self-selection effects. This needs to be acknowledged clearly. Exacerbating this, participant remuneration seems low at an amount below the minimum or living wage in Western countries (do the authors know where their participants came from?).

      Thank you for this point. We certainly agree that different experimental settings can induce different biases, and this is no different for online settings. However, online tasks such as the one used here have become accepted, and there is now a substantial literature showing that in-lab effects are often well replicated in online settings (Gillan and Rutledge, 2021). For the current study, it is not clear that an in-person setting would not induce comparably complex biases, e.g. to do with differences between experimenters. All participants were from the UK. Remuneration rates were comparable to other experimental settings, in keeping with other online studies and UK living wage recommendations, and were ultimately determined according to institutional ethical guidance.

      Another concern is that the intervention does not simply take place before the second block begins but is ongoing during the whole of the second block in that it is integrated into the phrasing of the task on each trial. It is therefore somewhat misleading to speak of a period ‘after the intervention’, and it would have been interesting to assess the effect of this by including a third group where the phrasing does not change, but the floating leaves intervention takes place.

      Thank you for this point. We acknowledge that the phrasing of the emotion question in the second block may have influenced the observed effects. Including a third group without the reminder would have provided valuable insights and is an important consideration for future studies. We will acknowledge this limitation.

      As mentioned in the Limitations section, observation noise was assumed and not estimated. While this is understandable in this case, the effect of this assumption could have been assessed by simulation with varying levels of observation (and process) noise.

      Thank you for this comment. We would like to clarify that both observation noise and process noise were estimated in the analyses. We will ensure this is emphasized better in the revised version to avoid future misunderstandings.

      Relatedly, the reliance on formal model comparison is unfortunate since the outcome of such comparisons is easily influenced by slight changes to assumptions such as noise levels. An alternative approach would have been to develop a favoured model based on its suitability to address the research question and its ability, established by simulation, to distill relevant changes of behaviour into reliable parameter estimates.

      We agree that model comparison alone is insufficient. This is why we have also included extensive simulations, including posterior predictive checks, and have followed established best-practice procedures (Wilson and Collins, 2019). We have focused on a relatively simple model space to avoid overfitting to the dataset, and hence reduce the risk of spurious findings. While we agree that outcomes will be influenced by underlying assumptions, this would persist with the suggested approach of relying on a favoured model. Simulations themselves rely on predefined structures and noise specifications, which inherently shape parameter recovery and inference. Relying only on a favoured model might risk model misspecification, whereby the model may not actually capture the data, and the parameters intended to capture the intervention effect could be confounded. We will clarify the reasoning behind our approach in the revised version.

      The statistical analyses clearly show the limitations of classical statistical testing with highly complex models of the kind the authors (commendably) use. Hunting for statistically significant interactions in a multivariate repeated-measures design relying on inputs from time series-derived point estimates is a difficult proposition. While the authors make the best of the bad situation they create by using null-hypothesis significance testing, a more promising approach would have been to estimate parameters using a sampler like Stan or PyMC and then draw conclusions based on posterior predictive simulations.

      This comment raises several interesting points. First, we agree that the value of classical tests on individual parameters within such complex situations is limited. This is why our main focus is on global measures like model comparison. Our use of the classical tests is more to support the understanding of the nature of the data, i.e. they have a more descriptive aim. We will clarify this further in the revision. Second, in terms of sampling, we would like to emphasize that the Kalman filter is both efficient and analytically tractable, making it well suited to our data and research question. It may have been possible to use sampling to obtain posterior distributions rather than point estimates. However, we did not judge this to be worth the (substantial) additional computational cost.
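
      The sense in which the Kalman filter is analytically tractable can be seen in a minimal scalar sketch. It assumes a one-dimensional linear-Gaussian model (x[t] = a * x[t-1] + process noise, y[t] = x[t] + observation noise); the function name, parameter names and values are hypothetical, and the model in the manuscript is multivariate and includes external inputs.

```python
def kalman_step(mean, var, observation, a, process_var, obs_var):
    """One predict/update cycle of a scalar Kalman filter.

    Assumed model: x_t = a * x_{t-1} + process noise, y_t = x_t + observation noise.
    Returns the posterior mean and variance of x_t given y_t.
    """
    # Predict: push the previous posterior through the dynamics.
    pred_mean = a * mean
    pred_var = a * a * var + process_var
    # Update: closed-form Gaussian conditioning on the new observation.
    gain = pred_var / (pred_var + obs_var)
    post_mean = pred_mean + gain * (observation - pred_mean)
    post_var = (1.0 - gain) * pred_var
    return post_mean, post_var


# Hypothetical prior N(0, 1), a single observation of 2.0, unit observation noise.
posterior = kalman_step(0.0, 1.0, 2.0, a=1.0, process_var=0.0, obs_var=1.0)
```

      Each step is a handful of arithmetic operations with an exact Gaussian posterior, so the state estimates themselves do not require sampling; sampling would instead be needed to obtain posterior distributions over the model parameters.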

      Reviewer 3:

      An interesting but perhaps at present slightly confusing aspect of their described results relates to the “controllability” of emotions, which they define as their susceptibility to external inputs. Readers should note this definition is (as I understand it) quite distinct from, and sometimes even orthogonal to, concepts of emotional control in the emotion literature, which refer to intentional control of emotions (by emotion regulation strategies such as distancing). The authors also use this second meaning in the discussion. Because of the centrality of control/controllability (in both meanings) to this paper, at present it is key for readers to bear these dual meanings in mind for juxtaposed results that distancing “reduces controllability” while causing “enhanced emotional control”.

      We fully agree with the reviewer’s observation that “controllability” can be interpreted in different ways. We will revise the text to ensure consistent usage and explicitly state the distinction between the control theory definition of controllability and its interpretation in the emotion regulation literature.

      As above, the authors use an active control - a relaxation intervention - which is extremely closely matched with their active intervention (and a major strength). However, there was an additional difference between the groups (as I currently understand it): “in the group allocated to the distancing intervention, the phrasing of the question about their feelings in the second video block reminded participants about the intervention, stating: ‘You observed your emotions and let them pass like the leaves floating by on the stream.’” I do wonder if the effects of distancing may also have been partially driven by some degree of reappraisal (considered a separate emotion regulation strategy), since this reminder might have evoked retrospective changes in ratings.

      We appreciate this substantial point. While our study was designed to isolate the effects of distancing, we acknowledge that elements of reappraisal may also have influenced the results. We will discuss this in the revised version. Additionally, as noted in our response to Reviewer 2, including a third group without the reminder could have provided valuable information, and we consider this to be an important direction for future research.

      Not necessarily a weakness, but an unanswered question is exactly how distancing is producing these effects. As the authors point out, there is a possibility that eye-movement avoidance of the more emotionally salient aspects of scenes could be changing participants’ exposure to the emotions somewhat. Not discussed by the authors, but possibly relevant, is the literature on differences between emotion types on oculomotor avoidance, which could have contributed to differential effects on different emotions.

      Thank you very much for these suggestions. It is very true that different emotions can elicit different patterns of oculomotor avoidance, which could have contributed to our observed effects. Research suggests that emotions such as disgust are associated with visual avoidance (Armstrong et al., 2014; Dalmaijer et al., 2021), whereas anxiety and other negative emotions exhibited increased attentional bias after fear conditioning (Kelly and Forsyth, 2009; Pischek-Simpson et al., 2009). It would be very interesting to repeat the experiment with eye-tracking to examine these possibilities. What would be particularly interesting to examine is whether a distancing intervention induces multiple, emotionally-specific behaviours, or not.

      References

      Armstrong, T., McClenahan, L., Kittle, J., and Olatunji, B. O. (2014). Don’t look now! Oculomotor avoidance as a conditioned disgust response. Emotion (Washington, D.C.), 14(1):95–104.

      Dalmaijer, E. S., Lee, A., Leiter, R., Brown, Z., and Armstrong, T. (2021). Forever yuck: Oculomotor avoidance of disgusting stimuli resists habituation. Journal of Experimental Psychology. General, 150(8):1598–1611.

      Gillan, C. M. and Rutledge, R. B. (2021). Smartphones and the Neuroscience of Mental Health. Annual Review of Neuroscience, 44:129–151.

      Kelly, M. M. and Forsyth, J. P. (2009). Associations between emotional avoidance, anxiety sensitivity, and reactions to an observational fear challenge procedure. Behaviour Research and Therapy, 47(4):331–338.

      Pischek-Simpson, L. K., Boschen, M. J., Neumann, D. L., and Waters, A. M. (2009). The development of an attentional bias for angry faces following Pavlovian fear conditioning. Behaviour Research and Therapy, 47(4):322–330.

      Wilson, R. C. and Collins, A. G. (2019). Ten simple rules for the computational modeling of behavioral data. eLife, 8:e49547.

    1. Reviewer #1 (Public review):

      Summary:

      In the manuscript entitled 'A comparative analysis of planarian regeneration specificity reveals tissue polarity contributions of the axial cWnt signalling gradient.' Cleland et al. study the robustness of regenerating a head or a tail in the proper position in two different planarian species (S. mediterranea and G. sinensis). The authors find that the expression of notum, a Wnt inhibitor that is triggered after any cut, shows different dynamics of expression in the two planarian species, being more symmetrical in the species that displays a higher number of double-headed regenerates or Janus heads (G. sinensis), which they refer to as less robust regeneration. The authors claim that the reduced robustness of G. sinensis regeneration is partially explained by this anterior-posterior symmetric expression of notum, since in S. mediterranea, which shows a 'robust regeneration', it appears asymmetric. So, the first claim of the manuscript is that the symmetry in notum expression could underlie the poor robustness of regenerating a head/tail in small bipolar regenerating planarian fragments.

      Then, they analyse the role of a proposed tail-to-head cWnt signalling gradient during the regeneration of heads and tails in the same planarian species. To do so, they develop an antibody that allows the quantification of b-catenin activity along the AP axis, together with a pharmacological approach that reduces the pre-existent cWnt gradient without affecting the wound-induced signal. Through this strategy the authors can demonstrate the slope of the b-catenin activity, which is a very nice result, and that it changes according to the size of the animal. Furthermore, they are able to demonstrate that by reducing the cWnt signalling in the pre-existent tissue, there is an increase in the number of double-headed regenerates (Janus heads) and that it depends on the body size and on the decreasing steepness of the cWnt gradient. This result relies on the G. sinensis species, since the drug is not so effective in S. mediterranea. Thus, the authors' second claim is that the slope of the cWnt gradient may contribute to head-tail regeneration specificity in planarians.

      To conclude, it is proposed that regeneration of the correct identity in each wound depends on multiple cues acting in parallel and that their species-specificity provides variations in the regenerative capability of the different planarian species.

      The study has great potential to have a high impact on the regeneration community, since the opportunity to compare mechanisms between close species provides the framework for understanding the essential mechanism of regeneration.

      Strengths:

      The project has several strengths. The authors are able to reproduce the Janus heads phenotypes described by Morgan TH by analysing different planarian species. This is of great importance in the planarian field, because with the current model species, S. mediterranea, this could not be reproduced. So, these results demonstrate that small planarian fragments do make errors during regeneration, giving rise to double-headed animals, which supports the well-known hypothesis that an anteroposterior gradient underlies anteroposterior identity during regeneration. However, and importantly, it does not occur in all planarian species. So, there are differences between planarian species in the robustness of regeneration and maybe in the mechanisms that drive this regeneration. The finding of different behaviours and gene expressions in different planarian species is very interesting and promising in the field of regeneration.

      A second strength of the study is the demonstration of the b-catenin1 slope in planarians and how it changes with the animal size, and also the establishment of a method to decrease it in the pre-existent tissue but not in the wound. This strategy allows us to examine specifically the role of the pre-existent cWnt signal, demonstrating that it does have a role in the decision of making head or tail during regeneration, which was an essential question in the field of planarians and animal regeneration.

      Weaknesses:

      (1) The finding that notum, which is the main head determinant identified in planarians, has a different dynamic in both planarian species is very suggestive. However, the different dynamics of notum expression during regeneration, which is the basis of the subsequent rationale, is not properly demonstrated, nor is its correlation with the robustness in regenerating a proper head/tail identity. Main concerns regarding this point:

      a) The authors observe that 'In regenerating S. mediterranea 2 mm trunk pieces cut from 6 mm animals, notum expression was induced predominantly at anterior-facing wounds as early as 6 h post-amputation (Figure 2A), as previously reported (Petersen and Reddien 2011)'. However, in the graphics in Figures 2B and C, the expression of notum at 6h is shown as symmetric. It definitely does not agree with the in situ, with the text, or with the published data. How was it measured? It should be corrected and explained since it is the basis of the subsequent rationale.

      b) Then, when measuring notum in G. sinensis the authors conclude: 'Strikingly and in sharp contrast to S. mediterranea, the number of notum expressing cells was nearly identical between anterior and posterior wounds without any discernible A/P asymmetry at any of the examined time points (Figures 2E-F)'. However, in the in situ results of 12 h regenerating G. sinensis, there is a clear difference in notum expression between anterior and posterior wounds. Is the image not representative? Again, how exactly were the measurements performed? Are dots or pixels quantified? It is not explained in the text. This is a crucial result that has to be consistent.

      c) A more general weakness of this part of the manuscript is that even if the authors demonstrate that in G. sinensis the expression of notum is symmetrical in contrast to S. mediterranea, this is just an observation of 1 species that has symmetrical notum and regenerates less robustly than 1 species that has asymmetrical expression and regenerates more robustly. If they for instance look at the expression of wnt1, maybe they also see differences between both species that could be linked to their different regeneration properties (related to this, see below the comment on wnt1 expression). That is to say, comparing 1 to 1 species cannot give any cause-effect evidence.

      Furthermore, the authors rely on the fact that notum inhibition rescues the wild-type phenotype to conclude that it is the symmetric expression of notum that underlies the appearance of Janus heads. This is what can be read in the results: 'Significantly, the rescue of wild-type regenerates by notum(RNAi) suggests that the symmetric G. sinensis notum expression contributes to the formation of double-heads and thus to reduced regeneration specificity'; and in the Summary: 'We found that the reduced regeneration robustness of G. sinensis was partially explained by wound site-symmetric expression of the head determinant notum, which is highly anterior-specific in S. mediterranea.' However, notum RNAi decreases notum in both wounds, so it does not produce an asymmetric expression (at least this is not shown). So, it does not link the symmetry or asymmetry of notum with the appearance of Janus heads.

      d) If the authors want to maintain the claim that the symmetry of notum is one of the reasons that explain the increase in Janus head phenotype in G. sinensis, there are several possibilities to test it. For instance:

      i) Analyse notum expression in different planarian species and relate its symmetry or asymmetry with the appearance of Janus heads. If the claim is true, the species that are more robust should show more asymmetric expression of notum. This would sustain strongly the first claim, and would really be a breakthrough in the field of regeneration.

      ii) Another possibility is a more in-depth analysis of notum expression in the species of the study. If the authors show that larger fragments show fewer Janus heads, and also that it depends on the anteroposterior level of the fragments, they could try to relate the rate of Janus heads with the degree of asymmetry in notum expression in both wounds. For instance, they could analyze notum expression in bipolar regenerating fragments along the anteroposterior axis in both species; it should be more symmetric in G. sinensis, in all fragments, according to Figure 2 L. Or they could analyze notum expression in bipolar regenerating fragments of different sizes, mainly in 1 or 2 mm fragments of big planarians, since they are the fragments analyzed that form or not the Janus heads. In G. sinensis the expression of notum should be more symmetrical than in S. mediterranea in these fragments.

      iii) The authors could design an experiment to demonstrate that the symmetry in the expression of notum affects the rate of Janus heads. The experiment that the authors show is the rescue of the Janus heads in G. sinensis after notum RNAi. However, notum RNAi suppresses notum in both wounds, thus not making them asymmetric. Furthermore, the rescue could be explained by the posteriorizing effect that notum RNAi has in planarians, as reported by several authors. A possibility could be to inhibit APC, which increases notum expression in S. mediterranea (Petersen and Reddien 2011). If APC RNAi in G. sinensis produces an increase in notum in both wounds and the rate of Janus heads is not rescued, then it would support the hypothesis that notum symmetry is the cause of the Janus heads. However, if it produces an increase of notum in an asymmetric manner, then the Janus phenotype should be rescued.

      (2) The second weakness of the study is related to the methodology used to support the second claim, that the slope of bcatenin1 activity has a role in the decision of regeneration - a head and a tail in the correct tip. The main concerns relate to the specificity of the anti-bcatenin1 antibody and to the broad effect of C59 in the secretion of all Wnts.

      a) Raising an antibody against beta-catenin1 that allows the quantification by western blot is a strength of the study, since beta-catenin1 is the key element of the cWnt pathway, and its levels are directly associated with the activation of the pathway. Since this is one of the tools that support the second claim of the study, a characterization of the antibody and additional tests to prove its specificity are required. The authors show a Western blot in which the band intensity decreases after beta-catenin1 inhibition in both species. Further analysis should be shown:

      i) Demonstration that the intensity of the band increases after APC or Axin inhibition.

      ii) Does the antibody work in immunohistochemistry? It would provide further evidence of specificity if a nuclear signal could be demonstrated.

      iii) Explanation and discussion of the protocol used to analyse the levels of b-catenin1 activity along the anteroposterior axis is required. It has been reported that beta-catenin1 is highly expressed and required in the brain in planarians, and also in the pharynx, and in the sexual organs (Hill and Petersen 2015, Sureda-Gomez et al 2016). How, then, is the anterior-to-posterior gradient of beta-catenin1 expression seen in this study in both species explained? Has the pharynx been removed before the protein extraction? What about the beta-catenin1 activity demonstrated in the brain? Why is it not reflected in the western blot analysis using the antibody? This point should be clarified.

      b) The second tool used in the second part of the manuscript is the drug C59, which inhibits Porcupine, a protein required for palmitoylation and secretion of Wnts. Because Porcupine could be required for the secretion of all Wnts, the phenotype obtained with the drug could be the sum of the inhibition of the cWnt signal (wnt1, for instance) and non-canonical Wnt signalling (such as wnt5). This is in fact the phenotype resulting after the inhibition of Wntless in planarians (Adell et al. 2009), which is also required for the secretion of Wnts. Thus, in the phenotypes resulting from C59 treatment, the analysis of the nervous system and posterior/anterior markers is required. Looking at the in vivo phenotype, it appears that the drug is in fact affecting both canonical and non-canonical pathways, since the animal with protrusions in the lateral part (Figure 4B-double head, or Supplementary Figure 3A) is very similar to the one reported after Wntless inhibition. In case the phenotypes observed also show non-canonical Wnt inhibition, this should be clearly shown and discussed.

      The above-mentioned weaknesses are the most important concerns about the present manuscript. However, there are other concerns related to a further analysis of the phenotypes and the analysis of additional Wnt elements as wnt1, which are essential to complete the study and are directly discussed with the authors.

    1. Reviewer #1 (Public review):

      Summary

      In this article, Kawanabe-Kobayashi et al. aim to examine the mechanisms by which stress can modulate pain in mice. They focus on the contribution of noradrenergic (NA) neurons of the locus coeruleus (LC). The authors use acute restraint stress as a stress paradigm and find that following one hour of restraint stress mice display mechanical hypersensitivity. They show that restraint stress causes the activation of LC NA neurons and the release of NA in the spinal cord dorsal horn (SDH). They then examine the spinal mechanisms by which LC→SDH NA produces mechanical hypersensitivity. The authors provide evidence that NA can act on alpha1ARs expressed by a class of astrocytes defined by the expression of Hes5 (Hes5+). Furthermore, they find that NA, presumably through astrocytic release of ATP following NA action on the alpha1ARs of Hes5+ astrocytes, can cause an adenosine-mediated inhibition of SDH inhibitory interneurons. They propose that this disinhibition mechanism could explain how restraint stress can cause the mechanical hypersensitivity they measured in their behavioral experiments.

      Strengths:

      (1) Significance. Stress profoundly influences pain perception; resolving the mechanisms by which stress alters nociception in rodents may explain the well-known phenomenon of stress-induced analgesia and/or facilitate the development of therapies to mitigate the negative consequences of chronic stress on chronic pain.

      (2) Novelty. The authors' findings reveal a crucial contribution of Hes5+ spinal astrocytes in the modulation of pain thresholds during stress.

      (3) Techniques. This study combines multiple approaches to dissect circuit, cellular, and molecular mechanisms including optical recordings of neural and astrocytic Ca2+ activity in behaving mice, intersectional genetic strategies, cell ablation, optogenetics, chemogenetics, CRISPR-based gene knockdown, slice electrophysiology, and behavior.

      Weaknesses:

      (1) Mouse model of stress. Although chronic stress can increase sensitivity to somatosensory stimuli and contribute to hyperalgesia and anhedonia, particularly in the context of chronic pain states, acute stress is well known to produce analgesia in humans and rodents. The experimental design used by the authors consists of a single one-hour session of restraint stress followed by 30 min to one hour of habituation and measurement of cutaneous mechanical sensitivity with von Frey filaments. This acute stress behavioral paradigm corresponds to the conditions in which the clinical phenomenon of stress-induced analgesia is observed in humans, as well as in animal models. Surprisingly, however, the authors measured that this acute stressor produced hypersensitivity rather than antinociception. This discrepancy is significant and requires further investigation.

      (2) Specifically, is the hypersensitivity to mechanical stimulation also observed in response to heat or cold on a hotplate or coldplate?

      (3) Using other stress models, such as a forced swim, do the authors also observe acute stress-induced hypersensitivity instead of stress-induced antinociception?

      (4) Measurement of stress hormones in blood would provide an objective measure of the stress of the animals.

      (5) Results:

      a) Optical recordings of Ca2+ activity in behaving rodents are particularly useful to investigate the relationship between Ca2+ dynamics and the behaviors displayed by rodents.

      b) The authors report an increase in Ca2+ events in LC NA neurons during restraint stress: Did mice display specific behaviors at the time these Ca2+ events were observed such as movements to escape or orofacial behaviors including head movements or whisking?

      c) Additionally, are similar increases in Ca2+ events in LC NA neurons observed during other stressful behavioral paradigms versus non-stressful paradigms?

      d) Neuronal ablation to reveal the function of a cell population.

      e) The proportion of LC NA neurons and LC→SDH NA neurons expressing DTR-GFP and ablated should be quantified (Figures 1G and J) to validate the methods and permit interpretation of the behavioral data (Figures 1H and K). Importantly, the nocifensive responses and behavior of these mice in other pain assays in the absence of stress (e.g., hotplate) and a few standard assays (open field, rotarod, elevated plus maze) would help determine the consequences of cell ablation on processing of nociceptive information and general behavior.

      f) Confirmation of LC NA neuron function with other methods that alter neuronal excitability or neurotransmission instead of destroying the circuit investigated, such as chemogenetics or optogenetics, would greatly strengthen the findings. Optogenetics is used in Figure 1M, N but excitation of LC→SDH NA neuron terminals is tested instead of inhibition (to mimic ablation), and in naïve mice instead of stressed mice.

      g) Alpha1ARs. The authors noted that "Adra1a mRNA is also expressed in INs in the SDH".

      h) The authors should comprehensively indicate what other cell types present in the spinal cord and neurons projecting to the spinal cord express alpha1ARs and what is the relative expression level of alpha1ARs in these different cell types.

      i) The conditional KO of alpha1ARs specifically in Hes5+ astrocytes and not in other cell types expressing alpha1ARs should be quantified and validated (Figure 2H).

      j) Depolarization of SDH inhibitory interneurons by NA (Figure 3). The authors bath-applied NA, which presumably activates all NA receptors present in the preparation.

      k) The authors' model (Figure 4H) implies that NA released by LC→SDH NA neurons leads to the inhibition of SDH inhibitory interneurons by NA. In other experiments (Figure 1L, Figure 2A), the authors used optogenetics to promote the release of endogenous NA in SDH by LC→SDH NA neurons. This approach would investigate the function of NA endogenously released by LC NA neurons at presynaptic terminals in the SDH and at physiological concentrations and would test the model more convincingly compared to the bath application of NA.

      l) As for other experiments, the proportion of Hes5+ astrocytes that express hM3Dq, and the absence of expression in other cells, should be quantified and validated to interpret behavioral data.

      m) Showing that the effect of CNO is dose-dependent would strengthen the authors' findings.

      n) The proportion of SG neurons for which CNO bath application resulted in a reduction in recorded sIPSCs is not clear.

      o) A1Rs. The specific expression of Cas9 and guide RNAs, and the specific KD of A1Rs, in inhibitory interneurons but not in other cell types expressing A1Rs should be quantified and validated.

      (6) Methods:

      It is unclear how fiber photometry is performed using an "optic cannula" during restraint stress while mice are in a 50 ml Falcon tube (as shown in Figure 1A).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This paper uses single-molecule FRET to investigate the molecular basis for the distinct activation mechanisms of two GPCRs that respond to the chemokine CXCL12: CXCR4, which couples to G-proteins, and ACKR3, which is G-protein independent and displays a higher basal activity.

      Strengths:

      It nicely combines the state-of-the-art techniques used in the studies of the structural dynamics of GPCR. The receptors are produced from eukaryotic cells, mutated, and labeled with single molecule compatible fluorescent dyes. They are reconstituted in nanodiscs, which maintain an environment as close as possible to the cell membrane, and immobilized through the nanodisc MSP protein, to avoid perturbing the receptor's structural dynamics by the use of an antibody for example.

      The smFRET data are analysed using the HMM (hidden Markov model) technique, and the number of states to be taken into account is evaluated using a Bayesian Information Criterion, which constitutes the state-of-the-art for this task.
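
      As a rough illustration of how a Bayesian Information Criterion can be used to choose the number of states, the sketch below scores candidate models from their maximized log-likelihoods. It is a generic example, not the authors' analysis pipeline: the function names are hypothetical and the log-likelihoods and parameter counts in the usage note are invented purely for demonstration.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_by_bic(candidates, n_obs):
    """Pick the candidate with the lowest BIC.

    `candidates` maps a label to (maximized log-likelihood, number of free
    parameters); both would come from fitting each model to the same data.
    """
    scores = {label: bic(ll, k, n_obs) for label, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores
```

      For example, with made-up values `{"2 states": (-1000.0, 5), "3 states": (-950.0, 11), "4 states": (-949.0, 19)}` and 1000 observations, the three-state model has the lowest score: the fourth state improves the likelihood too little to offset the penalty for its extra parameters.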

      The data show convincingly that the activation of the CXCR4 and ACKR3 by an agonist leads to a shift from an ensemble of high FRET states to an ensemble of lower FRET states, consistent with an increase in distance between the TM4 and TM6. The two receptors also appear to explore a different conformational space. A wider distribution of states is observed for ACKR3 as compared to CXCR4, and it shifts in the presence of agonists toward the active states, which correlates well with ACKR3's tendency to be constitutively active. This interpretation is confirmed by the use of the mutation of Y254 to leucine (the corresponding residue in CXCR4), which leads to a conformational distribution that resembles the one observed with CXCR4. It is correlated with a decrease in constitutive activity of ACKR3.

      Weaknesses:

      Although the data overall support the claims of the authors, there are some details in the data analysis and interpretation that should be modified, clarified, or discussed, in my opinion.

      Concerning the amplitude of the changes in FRET efficiency: the authors do not provide any structural information on the amplitude of the FRET changes that are expected. To me, it looks like a FRET change from ~0.9 to ~0.1 is very important, for a distance change that is expected to be only a few angstroms concerning the movement of the TM6. Can the authors give an explanation for that? How does this FRET change relate to those observed with other GPCRs modified at the same or equivalent positions on TM4 and TM6?

      The large FRET change in our system was initially unexpected. However, the reviewer is mistaken that the expected distance change is only a few angstroms. Crystal structures of the homologous beta2 adrenergic receptor (β<sub>2</sub>AR) in inactive and active conformations reveal that the cytoplasmic end of TM6 moves outwards by 16 angstroms during activation (Rasmussen et al., 2011, ref 47).  Consistent with this, smFRET studies of β<sub>2</sub>AR labeled in TM4 and TM6 (as here) showed that the donor-acceptor (D-A) distance was 14 angstroms longer in the active conformation (Gregorio et al., ref 38).  Surprisingly, the apparent distance change in our system (calculated for our FRET probes, A555/Cy5, using FPbase.com) is almost 30 angstroms. A possible explanation is that the fluorophore attached to TM6 interacts with lipids within the nanodisc when TM6 moves outwards, which could stretch the fluorophore linker and thereby increase the D-A distance (lipids were absent in the β<sub>2</sub>AR study). Such an interaction could also constrain the fluorophore in an unfavorable orientation for energy transfer, also leading to lower than expected FRET efficiencies and inflated distance calculations. Regardless, it is important to emphasize that none of the interpretations or conclusions of our study are based on computed D-A distances. Rather, we resolved different receptor conformations and quantified their relative populations based on the measured FRET efficiency distributions.

      Finally, we note that a recent smFRET study of the glucagon receptor (labeled in TM4 and TM6, as here) also revealed a large difference in apparent FRET efficiencies between inactive (E<sub>app</sub> = 0.83) and active (E<sub>app</sub> = 0.32) conformations (Kumar et al., ref. 39). Thus, the large change in FRET efficiency observed in our study is not unprecedented.
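
      For readers who want to see how a FRET efficiency maps onto an apparent distance, the standard Förster relation E = 1 / (1 + (r / R0)^6) can be inverted as in the sketch below. This is a generic illustration, not the calculation performed by the authors: the function name is hypothetical, the Förster radius `r0` must be supplied by the user for the dye pair in question, and the relation assumes freely rotating dyes, which is exactly the assumption that the linker and orientation effects discussed above would violate.

```python
def fret_distance(efficiency, r0):
    """Apparent donor-acceptor distance from the Forster relation.

    Inverts E = 1 / (1 + (r / r0)**6). The result has the same unit as
    `r0`. The Forster radius is an input, not a value taken from the study.
    """
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


def distance_change(e_inactive, e_active, r0):
    """Apparent distance increase when going from inactive to active."""
    return fret_distance(e_active, r0) - fret_distance(e_inactive, r0)
```

      Because of the sixth-power dependence, the mapping is steep near R0 and compressed far from it, so efficiencies close to 0 or 1 translate into distances with large uncertainty. This is one more reason to base conclusions on efficiency distributions rather than on computed distances.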

      Concerning the intermediate states: the authors observe several intermediate states.

      (1) First I am surprised, looking at the time traces, by the dwell times of the transitions between the states, which often last several seconds. Is such a long transition time compatible with what is known about the kinetic activation of these receptors?

      We too were surprised by the apparent kinetics of the receptors in our system. However, it was previously noted that purified systems, including nanodiscs, lead to slower activation times for GPCRs compared to cellular membrane systems (Lohse et al., Curr. Opin. Cell Biology, 27, 87–92, 2014). Indeed, slow transitions among different FRET states (dwell times in the seconds range) were also observed in recent smFRET studies of the mu opioid receptor (Zhao et al., 2024, ref. 41) and the glucagon receptor (Kumar et al., 2023, ref. 39). These studies are consistent with the observed time scale of the FRET transitions reported here.

      (2) Second, is it possible that these “intermediate” states correspond to differences in FRET efficiencies that arise from different photophysical states of the dyes? Alexa555 and Cy5 are cyanines, which are known to be very sensitive to their local environment. This could lead to different quantum yields and therefore different FRET efficiencies for a similar distance. In addition, the authors use statistical labeling of two cysteines, and therefore have in their experiment a mixture of receptors where the donor and acceptor are switched and can experience different environments. The authors do not speculate structurally on what these intermediate states could be, which is appreciated, but I think they should nevertheless discuss the potential issue of fluorophore photophysics effects.

      The reviewer is correct that the intermediate FRET states could, in principle, arise from a conformational change of the receptor that alters the local environment of the donor and/or acceptor fluorophores, rather than a change in donor-acceptor distance. This caveat is now included in the discussion on Pg. 10:

      “In principle, the intermediates in CXCR4 and ACKR3 could represent partial movements of TM6 from the inactive to active conformation or more subtle conformational changes altering the photophysical characteristics of the probes without drastically altering the donor-acceptor distance. Either possibility leads to detectable changes in apparent FRET efficiency and reflects discrete conformational steps on the activation pathway; however, it is not possible to resolve specific structural changes from the data.”

      Regarding the second possibility, it is true that our labeling methodology leads to a statistical mixture of labeled species (D on TM6 and A on TM4, D on TM4 and A on TM6). If the photophysical properties of the fluorophores were markedly different for the two labeling orientations, this would produce two different FRET efficiencies for a given receptor conformation. Assuming two receptor conformations, this scenario would produce four distinct FRET states: E<sub>1</sub> (inactive receptor, labeling configuration 1), E<sub>2</sub> (active receptor, labeling configuration 1), E<sub>3</sub> (inactive receptor, labeling configuration 2) and E<sub>4</sub> (active receptor, labeling configuration 2), with two cross peaks in the TDP plots, corresponding to E<sub>1</sub> ↔ E<sub>2</sub> and E<sub>3</sub> ↔ E<sub>4</sub> transitions. Notably, E<sub>2</sub> ↔ E<sub>3</sub> cross peaks would not be present, since states E<sub>2</sub> and E<sub>3</sub> exist on separate molecules. Instead, we see all states inter-connected sequentially, R ↔ R’ ↔ R* in CXCR4 and R ↔ R’ ↔ R*’ ↔ R* in ACKR3 (Fig. 2), suggesting that the resolved FRET states represent interconnected conformational states.

      We added the following text to the Results section on Pg. 6:

      “Two-dimensional transition density probability (TDP) plots revealed that the three FRET states were connected in a sequential fashion (Figs. 2A & B), indicating that the transitions occurred within the same molecules. Notably, these observations exclude the possibility that the midFRET state arises from different local fluorophore environments (hence FRET efficiencies) for the two possible labeling orientations of the introduced cysteines: assuming two receptor conformations, this model would produce four distinct FRET states, but only two cross peaks in the TDP plot.”
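
      The cross-peak argument can be illustrated with a toy calculation. The sketch below is not the authors' analysis pipeline: it assumes idealized state sequences (for example, from HMM fitting), uses hypothetical helper names (`transition_counts`, `connected_pairs`) and made-up traces, and simply lists which pairs of states are ever linked by a transition within a single molecule.

```python
from collections import Counter


def transition_counts(traces):
    """Count transitions between different states within each trace."""
    counts = Counter()
    for trace in traces:
        for current, following in zip(trace, trace[1:]):
            if current != following:
                counts[(current, following)] += 1
    return counts


def connected_pairs(traces):
    """Return the unordered state pairs linked by at least one transition."""
    return {frozenset(pair) for pair in transition_counts(traces)}


# Toy data (illustrative only, not measured traces).
# Labeling-orientation model: each molecule carries one orientation,
# so E1/E2 and E3/E4 never appear in the same trace.
orientation_model = [
    ["E1", "E1", "E2", "E1"],
    ["E3", "E4", "E4", "E3"],
]
# Sequential model: a single molecule visits R, R' and R*.
sequential_model = [["R", "R'", "R*", "R'", "R"]]

print(connected_pairs(orientation_model))
print(connected_pairs(sequential_model))
```

      In the orientation model no pair bridges the two sub-populations, whereas in the sequential model every state is reachable through R’, which is the pattern the response describes for the measured TDP plots.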

      (3) It would also have been nice to discuss whether these types of intermediate states have been observed in other studies by smFRET on GPCR labeled at similar positions.

      Intermediate states have also been reported in previous smFRET studies of other GPCRs. For example, in the glucagon receptor (also labeled in TM4 and TM6), a third FRET state (E<sub>app</sub> = 0.63) was resolved between the inactive (E<sub>app</sub> = 0.85) and active (E<sub>app</sub> = 0.32) states (Kumar et al., Ref. 39). Discrete intermediate receptor conformations were also observed in the A<sub>2A</sub>R labeled in TM4 and TM6 (Fernandes et al., Ref. 40). These examples are now cited in the Discussion.

      On line 239: the authors talk about the R↔R' transitions that are more probable. In fact it is more striking that the R'↔R* transition appears in the plot. This transition is a signature of the behavior observed in the presence of an agonist, although IT1t is supposed to be an inverse agonist. This observation is consistent with the unexpected (for an inverse agonist) shift in the FRET histogram distribution. In fact, it appears that all CXCR4 antagonists or inverse agonists have a similar (although smaller) effect than the agonist. Is this related to the fact that these (antagonist or inverse agonist) ligands lead to a conformation that is similar to that induced by the agonists, but cannot interact with the G-protein? A very interesting experiment here would be to repeat these measurements in the presence of purified G-protein. G-protein has been shown to lead to a shift of the conformational space explored by GPCR toward the active state (using smFRET on class A and class C GPCR). It would be interesting to explore its role on CXCR4 in the presence of these various ligands. Although I am aware that this experiment might go beyond the scope of this study, I think this point should be discussed nevertheless.

      We thank the reviewer for this observation and the possible explanation offered.  In response, we have added the following text to the Results section on Pg. 7:

      “The small-molecule ligand IT1t is reported to act as an inverse agonist of CXCR4 (54-56). However, the conformational distribution of CXCR4 showed little change to the overall apparent FRET profile, although R’ ↔ R* transitions appeared in the TDP plot (Figs. 3A & B, Fig. S8). This suggests that the small molecule does not suppress CXCR4 basal signaling by changing the conformational equilibrium. Instead IT1t appears to increase transition probabilities which may impair G protein coupling by CXCR4.”

      We have also added the following text to the Results on Pg. 8:

      “Despite the ability of CXCL12<sub>P2G</sub> and CXCL12<sub>LRHQ</sub> to stabilize the active R* conformation of CXCR4, both variants are known to act as antagonists (20). This suggests that the CXCL12 mutants inhibit CXCR4 coupling to G proteins not by suppressing the active receptor population but rather by increasing the dynamics of the receptor state transitions. Our results suggest that the helical movements considered classic signatures of the active state may not be sufficient for CXCR4 to engage productively with G proteins.”

      In addition, we have added the following text to the Discussion on Pg. 11:

      “The chemokine variants CXCL12<sub>P2G</sub> and CXCL12<sub>LRHQ</sub> are reported to act as antagonists of CXCR4 (19, 20), and the small molecule IT1t acts as an inverse agonist (54-56). Surprisingly, none of these ligands inhibit formation of the active R* conformation of CXCR4. In fact, the chemokine variants both stabilize and increase this state to some degree, although less effectively than CXCL12<sub>WT</sub>. Thus, the antagonism and inverse agonism of these ligands does not appear to be linked exclusively to receptor conformation, suggesting that the ligands inhibit coupling of G proteins to CXCR4 or disrupt the ligand-receptor-G protein interaction network required for signaling (Fig. S10) (21, 23).  Interestingly, these ligands also increase the probabilities of state-to-state transitions (Figs. 3B & 4B), suggesting that enhanced conformational exchange prevents the receptor from productively engaging G proteins. Similarly, ACKR3 is naturally dynamic and lacks G protein coupling, suggesting a common mechanism of G protein antagonism.”

      Finally, we also agree that experiments with G proteins could be informative. In fact, we initiated such experiments during the course of this study.  However, it soon became apparent that significant optimization would be required to identify fluorophore labeling positions that report receptor conformation without inhibiting G protein coupling. Accordingly, we decided that G protein experiments would be the subject of future studies.

      However, we added the following text to the Discussion on Pg. 12:

      “Future smFRET studies performed in the presence of G proteins should be informative in this regard”.

      The authors also mentioned in Figure 6 that the energetic landscape of the receptors is relatively flat ... I do not really agree with this statement. For me, a flat conformational landscape would be one where the receptors are able to switch very rapidly between the states (typically in the submillisecond timescale, which is the timescale of protein domain dynamics). Here, the authors observed that the transition between states is in the second timescale, which for me implies that the transition barrier between the states is relatively high to preclude the fast transitions.

      We thank the reviewer for the comment. We have modified the description of the energy landscapes of ACKR3 and CXCR4 in the discussion on Pg. 10 as follows:

      “These observations imply that ACKR3 has a relatively flat energy landscape, with similar energy minima for the different conformations, whereas the energy landscape of CXCR4 is more rugged (Fig. 6). For both receptors, the energy barriers between states are sufficiently high that transitions occur relatively slowly with seconds long dwell times (Figs. 1C and S2).”

      Reviewer #2 (Public Review):

      Summary:

      This manuscript uses single-molecule fluorescence resonance energy transfer (smFRET) to identify differences in the molecular mechanisms of CXCR4 and ACKR3, two 7-transmembrane receptors that both respond to the chemokine CXCL12 but otherwise have very different signaling profiles. CXCR4 is highly selective for CXCL12 and activates heterotrimeric G proteins. In contrast, ACKR3 is quite promiscuous and does not couple to G proteins, but like most G protein-coupled receptors (GPCRs), it is phosphorylated by GPCR kinases and recruits arrestins. By monitoring FRET between two positions on the intracellular face of the receptor (which highlights the movement of transmembrane helix 6 [TM6], a key hallmark of GPCR activation), the authors show that CXCR4 remains mostly in an inactive-like state until CXCL12 binds and stabilizes a single active-like state. ACKR3 rapidly exchanges among four different conformations even in the absence of ligands, and agonists stabilize multiple activated states.

      Strengths:

      The core method employed in this paper, smFRET, can reveal dynamic aspects of these receptors (the breadth of conformations explored and the rate of exchange among them) that are not evident from static structures or many other biophysical methods. smFRET has not been broadly employed in studies of GPCRs. Therefore, this manuscript makes important conceptual advances in our understanding of how related GPCRs can vary in their conformational dynamics.

      Weaknesses:

      (1) The cysteine mutations in ACKR3 required to site-specifically install fluorophores substantially increase its basal and ligand-induced activity. If, as the authors posit, basal activity correlates with conformational heterogeneity, the smFRET data could greatly overestimate the conformational heterogeneity of ACKR3.

      The changes in basal ACKR3 activity upon introduction of the cysteines are modest and not statistically significant, as determined by an extra-sum-of-squares F test (P = 0.14).

      (2) The probes used cannot reveal conformational changes in other positions besides TM6. GPCRs are known to exhibit loose allosteric coupling, so the conformational distribution observed at TM6 may not fully reflect the global conformational distribution of receptors. This could mask important differences that determine the ability of intracellular transducers to couple to specific receptor conformations.

      We agree that the overall conformational landscape of the receptors has not been investigated and we have added this caveat to the discussion on Pg. 12.

      “An important caveat is that our study does not report on the dynamics of the other TM helices and H8, some of which are known to participate in arrestin interactions.”

      (3) While it is clear that CXCR4 and ACKR3 have very different conformational dynamics, the data do not definitively show that this is the main or only mechanism that contributes to their functional differences. There is little discussion of alternative potential mechanisms.

      The main functional difference between CXCR4 and ACKR3 is their effector coupling: CXCR4 couples to G proteins, whereas ACKR3 only couples to arrestins (following phosphorylation of the C-terminal tail by GRKs). As currently noted in the discussion, ACKR3 has many features that may contribute to its lack of G protein coupling, including lack of a well-ordered intracellular pocket due to conformational dynamics, lack of an N-term-ECL3 disulfide, a different chemokine binding mode, and the presence of Y257. Steric interference due to different ICL loop structures may also interfere with G protein activation. No single change has been shown to confer G protein activity on ACKR3, including swapping all of the ICLs for those of a canonical chemokine receptor, suggesting that a combination of these factors is responsible. The following has been added to the discussion on Pg. 13 to clearly note that any one feature is unlikely to drive the atypical behavior of ACKR3:

      “The atypical activation of ACKR3 does not appear to be dependent on any singular receptor feature and is likely a combination of several factors.”

      (4) The extent to which conformational heterogeneity is a characteristic feature of ACKRs that contributes to their promiscuity and arrestin bias is unclear. The key residue the authors find promotes ACKR3 conformational heterogeneity is not conserved in most other ACKRs, but alternative mechanisms could generate similar heterogeneity.

      Despite the commonalities in the roles of the ACKRs, they all appear to have evolved independently. Thus, we do not believe that all features observed and described for one ACKR will explain the behavior of another. We have carefully avoided expanding our observations to other ACKRs to avoid suggesting common mechanisms.

      (5) There are no data to confirm that the two receptors retain the same functional profiles observed in cell-based systems following in vitro manipulations (purification, labeling, nanodisc reconstitution).

      We agree this is an important point. All labeled receptors responded to agonist stimulation as expected. As only properly folded receptors are able to make the extensive interactions with ligands necessary for conformational changes (for instance, CXCL12 interacts with all TMs and ECLs), this suggests that the proteins are folded correctly and functional following all manipulations.

      Reviewer #3 (Public Review):

      Summary:

      This is a well-designed and rigorous comparative study of the conformational dynamics of two chemokine receptors, the canonical CXCR4 and the atypical ACKR3, using single-molecule fluorescence spectroscopy. These receptors play a role in cell migration and may be relevant for developing drugs targeting tumor growth in cancers. The authors use single-molecule FRET to obtain distributions of a specific intramolecular distance that changes upon activation of the receptor and track differences between the two receptors in the apo state, and in response to ligands and mutations. The picture emerging is that more dynamic conformations promote more basal activity and more promiscuous coupling of the receptor to effectors.

      Strengths:

      The study is well designed to test the main hypothesis, the sample preparation and the experiments conducted are sound and the data analysis is rigorous. The technique, smFRET, allows for the detection of several substates, even those that are rarely sampled, and it can provide a "connectivity map" by looking at the transition probabilities between states. The receptors are reconstituted in nanodiscs to create a native-like environment. The examples of raw donor/acceptor intensity traces and FRET traces look convincing and the data analysis is reliable to extract the sub-states of the ensemble. The role of specific residues in creating a more flat conformational landscape in ACKR3 (e.g., Y257 and the C34-C287 bridge) is well documented in the paper.

      Weaknesses:

      The kinetics side of the analysis is mentioned, but not described and discussed. I am not sure why since the data contains that information. For instance, it is not clear if greater conformational flexibility is accompanied by faster transitions between states or not.

      The reviewer is correct that kinetic information is available, in principle, from smFRET experiments. However, a detailed kinetic analysis will require a much larger data set than we currently possess, to adequately sample all possible transitions and the dwell times of each FRET state. We intend to perform such an analysis in the future as more data becomes available. The purpose of this initial study was to explore the conformational landscapes of CXCR4 and ACKR3 and to reveal differences between them. To this end, we have documented major differences in conformational preferences and response to ligands of the two receptors that are likely relevant to their different biological behavior. Future kinetic information will add further detail, but is not expected to alter the conclusions drawn here.

      The method to choose the number of states seems reasonable, but the "similarity" of states argument (Figures S4 and S6) is not that clear.

      We thank the reviewer for noting a need for further clarification. We qualitatively compared the positions of the various FRET peaks across treatments to gain insight into the consistency of the conformations and avoid splitting real states by overfitting the data. For instance, fitting the ACKR3 treatments with three states leads to three distinct FRET populations for the R’ intermediate. Adding a fourth state results in two intermediates that are fairly well overlapping. In contrast, the two-intermediate model for CXCR4 appears to split the R* state of the CXCL12-treated sample and causes a general shift in both intermediate states to lower FRET values when CXCL12 is present. As we assume that the conformations are consistent throughout the treatments, we conclude that this represents an overfitting artifact and not a novel CXCL12-CXCR4 R*’ state. Additional sentences have been added to the supplemental figure legend to better describe the comparative analysis.

      “(Top) With the 3-state model, the R’ states for apo-CXCR4 and for CXCL12- and IT1t-bound receptor overlapped well with similar apparent FRET values across all of the tested conditions. In the case of the four-state model, the R*’ (Middle) and R’ (Bottom) states were substantially different across the ligand treatments. In particular, the R*’ state with CXCL12 treatment appears to arise from a splitting of the R* conformation, indicating that the model was overfitting the data.”

      Also, the "dynamics" explanation offered for ACKR3's failure to couple and activate G proteins is not very convincing. In other studies, it was shown that activation of GPCRs by agonists leads to an increase in local dynamics around the TM6 labelling site, but that did not prevent G protein coupling and activation.

      We agree with the reviewer that any single explanation for ACKR3 bias, including the dynamics argument presented here, is insufficient to fully characterize the ACKR3 responses. As noted by the reviewer, TM6 movement and dynamics are generally correlated with G protein coupling, whereas other dynamics studies (Wingler et al. Cell 2019) have noted that arrestin-biased ligands do not lead to the same degree of TM6 movement. We have added the following statement to the discussion on Pg. 13:

      “The atypical activation of ACKR3 does not appear to be dependent on any singular receptor feature and is likely a combination of several factors.” 

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors):

      I would like to raise a technical point about the calculation and reporting of the FRET efficiency. The authors report the FRET efficiency as E = I<sub>A</sub>/(I<sub>A</sub> + I<sub>D</sub>). There is now a strong recommendation from the FRET community (https://doi.org/10.1038/s41592-018-0085-0) to use the term “FRET efficiency” only when a proper correction procedure of all correction factors has been applied, which is not the case here (gamma factor has not been calculated). The authors should therefore use the term “Apparent FRET Efficiency” and E<sub>app</sub> throughout the manuscript.

      Also, it would be nice to indicate directly on the figures whether a ligand that is used is an agonist, antagonist, inverse agonist, etc...

      We thank the reviewer for suggesting this clarification in terminology. We now refer to apparent FRET efficiency (or E<sub>app</sub>) throughout the manuscript and in the figures. In addition, we have added ligand descriptions to the relevant figures.
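
      The distinction between the two quantities can be made concrete with a short calculation. The sketch below uses hypothetical function names and made-up intensities; `apparent_fret` implements the ratio quoted by the reviewer, while `gamma_scaled_fret` shows only the gamma scaling of the donor signal and assumes that background, spectral crosstalk and direct acceptor excitation have already been handled, which is a simplification rather than a complete correction procedure.

```python
def apparent_fret(i_donor, i_acceptor):
    """Apparent FRET efficiency, E_app = I_A / (I_A + I_D)."""
    total = i_donor + i_acceptor
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return i_acceptor / total


def gamma_scaled_fret(i_donor, i_acceptor, gamma):
    """FRET ratio with the donor signal scaled by a gamma factor.

    Assumes all other corrections were applied beforehand (an assumption
    of this sketch, not a statement about the manuscript).
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    denominator = i_acceptor + gamma * i_donor
    if denominator <= 0:
        raise ValueError("scaled total intensity must be positive")
    return i_acceptor / denominator


# Made-up intensities in arbitrary units; gamma = 0.8 is an arbitrary example.
e_app = apparent_fret(i_donor=150.0, i_acceptor=850.0)
e_gamma = gamma_scaled_fret(i_donor=150.0, i_acceptor=850.0, gamma=0.8)
print(e_app, e_gamma)
```

      With gamma equal to 1 the two functions agree, so the size of the difference between E<sub>app</sub> and a corrected efficiency depends on how far gamma departs from unity.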

      Reviewer #2 (Recommendations For The Authors):

      (1) M159(4.40)C/Q245(6.28)C ACKR3 appears to have higher constitutive activity than ACKR3 Wt (Fig. S1). While the vehicle point itself is likely not significant due to the error in the Wt, the overall trend is clear and arguably even stronger than the effect of Y257(6.40)L (Fig. S9). While this is an inherent limitation of the method used, it should be clearly acknowledged; the comment in lines 162-164 seems to skirt the issue by only saying that arrestin recruitment is retained. It would be helpful and more rigorous to report the curve fit parameters (basal, E<sub>max</sub>, EC50) for the arrestin recruitment experiments and the associated errors/significance (see https://www.graphpad.com/guides/prism/latest/statistics/stat_qa_multiple_comparisons_after_.htm for a discussion).

      The E<sub>min</sub>, E<sub>max</sub>, and EC50 for M159<sup>4.40</sup>C/Q245<sup>6.28</sup>C ACKR3 were compared against the values for WT ACKR3 from Fig. S1 and only the E<sub>max</sub> was determined to be significantly different by the extra sum of squares F test. A note has been added to the text to reflect these results on Pg. 5.

      “Only the E<sub>max</sub> for arrestin recruitment to CXCL12-stimulated ACKR3 was significantly altered by the mutations, while all other pharmacological parameters were the same as for WT receptors.”
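
      For readers unfamiliar with the extra sum-of-squares F test used for these comparisons, the statistic compares the residual sum of squares of a simpler fit (for example, a parameter shared between WT and mutant curves) with that of a more flexible fit (the parameter fitted separately). The sketch below uses a hypothetical function name and made-up numbers, not the authors' data, and computes only the F ratio; the P value would then be read from an F distribution with the stated degrees of freedom.

```python
def extra_sum_of_squares_f(ss_simple, df_simple, ss_complex, df_complex):
    """F ratio for comparing two nested least-squares fits.

    ss_* are residual sums of squares and df_* are residual degrees of
    freedom (data points minus fitted parameters). The simpler model has
    fewer parameters, hence more degrees of freedom.
    """
    if df_simple <= df_complex:
        raise ValueError("the simpler model must have more degrees of freedom")
    if ss_complex <= 0 or df_complex <= 0:
        raise ValueError("complex-model SS and df must be positive")
    numerator = (ss_simple - ss_complex) / (df_simple - df_complex)
    denominator = ss_complex / df_complex
    return numerator / denominator


# Made-up example: sharing the parameter costs 20 units of residual SS
# while freeing 2 degrees of freedom.
f_ratio = extra_sum_of_squares_f(
    ss_simple=120.0, df_simple=20, ss_complex=100.0, df_complex=18
)
print(f_ratio)  # numerator df = 2, denominator df = 18
```

      A small F ratio indicates that fitting the parameter separately for the two receptors does not improve the fit much beyond what the extra parameters would be expected to achieve by chance.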

      (2) The methods do not specify the reactive group of the dyes used for labeling (i.e., AlexaFluor 555-maleimide and Cy5-maleimide?).

      We regret the omission and have added the necessary details to the materials and methods.

      (3) Were any of the native Cys residues removed from ACKR3 and CXCR4 in the constructs used for smFRET? ACKR3 appears to have two additional Cys residues in the N-terminus besides the one involved in the second disulfide bridge, and these would presumably be solvent-exposed. If so, please specify in the Methods and clarify whether the constructs tested in functional assays included these. (Also, please specify if the human receptors were used.)

      No additional cysteine residues were mutated in either receptor. All exposed cysteines are predicted to form disulfides. The residues in the N-terminus that the reviewer alludes to, C21 and C26, form a disulfide (Gustavsson et al. Nature Communications 2017) and are thus protected from our probes. Consistent with these expectations, neither WT CXCR4 nor ACKR3 exhibited significant fluorophore labeling (now mentioned in the text on Pg. 5). The species of origin has been added to the material and methods.

      (4) There are a few instances where the data seem to slightly diverge from the proposed models that may be helpful to comment on explicitly in the text:

      - Figure 4E (ACKR3/CXCL12(P2G)): As noted in the legend, despite stabilizing R*/R*', CXCL12(P2G) reduces transitions between these states compared to Apo. This is more similar to the effects of VUF16840 (Figure 3D) than the other ACKR3 agonists. The authors note the difference between CXCL12(LRHQ) and CXCL12(P2G) (but not vs Apo) in this regard. There might be some other information here regarding the relative importance of the conformational equilibrium vs transition rates for receptor activity.

      Although the TDPs for CXCL12<sub>P2G</sub> and VUF16840 are similar, as noted by the reviewer, the overall FRET envelopes are drastically different.

      The differences in transition probabilities for R ↔ R’ and R*’ ↔ R* transitions observed in the presence of CXCL12<sub>P2G</sub> or CXCL12<sub>LRHQ</sub> relative to the apo receptor are now explicitly noted in the Results.

      - The conformational distributions of ACKR3 apo and ACKR3 Y257L CXCL12 are very similar (Figure 5A,D). However, there is a substantial difference in the basal activity of WT vs CXCL12-stimulated Y257L (Figure S9).

      The mutation Y257L appears to promote the highest and lowest FRET states at the expense of the intermediates. Although the distribution appears similar between Apo-WT and CXCL12-Y257L, the depopulation of the R’ state may lead to the observed activation in cells.

      (5) There are inconsistent statements regarding the compatibility of G protein binding to the "active-like" ACKR3 conformation observed in the authors' previous structures (Yen et al, Sci Adv 2022). In the introduction, the authors seem to be making the case that steric clashes cannot account for its lack of coupling; in the discussion, they seem to consider it a possibility.

      The introduction to previous research on the molecular mechanisms governing the lack of ACKR3-G protein coupling was not intended to be all-encompassing, but rather to highlight previous efforts to elucidate this process and justify our study of the role of dynamics. Due to the positions of the probes, we can only comment on the impact on TM6 movements and not other conformational changes. The steric clash reported in Yen et al. was in ICL2 and not directly tested here, so our observations do not preclude changes occurring in this region. We also do not claim that the active-like state resolved in our previous structures matches any specific state isolated here by smFRET.

      (6) Line 83-85: "Having excluded other mechanisms we therefore surmised that the inability of ACKR3 to activate G proteins may be due to differences in receptor dynamics."

      Line 400-402: "It is possible that the active receptor conformation clashes sterically with the G protein as suggested by docking of G proteins to structures of ACKR3."

      As mentioned above, we suspect that the inability of ACKR3 to couple to G proteins cannot be attributed to one particular feature and instead reflects a combination of several factors. Accordingly, we have not completely eliminated a contribution of steric hindrance, as we described in Yen et al. Sci Adv 2022, and instead include it as a possibility. Following the line highlighted here, we list several alternatives:

      “Alternatively, the receptor dynamics and conformational transitions revealed here may prevent formation of productive contacts between ACKR3 and G protein that are required for coupling, even though G proteins appear to constitutively associate with the receptor.”

      And, at the end of the paragraph, we have added the following sentence: 

      “The atypical activation of ACKR3 does not appear to be dependent on any singular receptor feature and is likely a combination of several factors.”

      (7) If the authors believe that the various ligands/mutations are only altering the distribution/dynamics of the same 3/4 conformations of CXCR4/ACKR3, respectively, is there a reason each FRET efficiency histogram is fit independently instead of constraining the individual components to Gaussian components with the same centroids, and/or globally fitting all datasets for the same receptor?

      We performed global analysis across all data sets for each sample and condition. Since the peak positions of the various FRET states recovered in this way were consistent across treatments (Fig. S4,S6), we did not feel it was necessary to perform a further global analysis across all samples for a given receptor.

      Reviewer #3 (Recommendations For The Authors):

      The manuscript is well-written, the arguments are easy to follow and the figures are helpful and clear. Here are a few questions/suggestions that the authors might want to address before the paper will be published:

      (1) Include a table with kinetic rates between states in SI and have a brief discussion in the main text to support the trends observed in transition probabilities.

      As noted above, determining rate constants for each of the state-to-state transitions will require a much larger set of experimental smFRET data than is currently available and will be the subject of future studies.

      (2) The argument of state similarity (Figure S4 and S6)... why are the profiles not Gaussian, like in the fits on Figures S3 and S5, respectively? I would also suggest that, once the number of states is chosen, the authors perform a global fit in which the FRET values of a given sub-state are shared across the different conditions for one receptor.

      The state distributions presented in Figs. S4 and S6 (as well as throughout the rest of the paper) are derived from HMM fitting of the time traces themselves, and are not constrained to be Gaussian, whereas the GMM analyses in Figs. S3 and S5 are Gaussian fits to the final apparent FRET efficiency histograms.

      Similar to our response to Reviewer 2 above, due to the consistency of the fitted peak positions obtained across different conditions for a given sample, we did not feel that further global analysis was necessary.

      (3) It is shown FRET changes from ~0.85 in the inactive (closed) state to ~0.25 in the active (open) state. How do these values match the expectations based on crystal structure and dye properties?

      As noted in our response to Reviewer 1, translating the apparent FRET values using the assumed Förster distances for A555/Cy5 (per FPbase) suggests a change in D-A distance of ~30 Å, whereas the expected change from structures is ~16 Å. We suspect this discrepancy is due to the lipids immediately adjacent to the fluorophores, which may lead to the probes being constrained in an extended position when TM6 moves outwards, thus also reporting the linker length in the distance change. Additionally, such interactions may constrain the donor and acceptor in unfavorable orientations for energy transfer, which would also reduce the FRET efficiency in the active state. Since the calculated D-A distance changes appear too large for GPCR activation, we have opted to not make any structural interpretations. Instead, all of our conclusions are based on resolving individual conformational states and quantifying their relative populations, which is based directly on the measured FRET efficiency distributions, not computed distances.
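
      The conversion referred to here follows from the Förster relation E = 1/(1 + (r/R<sub>0</sub>)<sup>6</sup>), which rearranges to r = R<sub>0</sub>(1/E - 1)<sup>1/6</sup>. The sketch below is a minimal illustration with a hypothetical function name; the R<sub>0</sub> value in the usage lines is a placeholder and not the Förster distance used by the authors, and the relation strictly applies to fully corrected efficiencies, so distances derived from apparent values are rough estimates at best.

```python
def distance_from_fret(efficiency, r0):
    """Donor-acceptor distance from the Forster relation.

    r = r0 * (1/E - 1) ** (1/6), returned in the same unit as r0.
    """
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    if r0 <= 0:
        raise ValueError("r0 must be positive")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# Placeholder Forster distance in angstroms (an assumption for illustration).
R0_PLACEHOLDER = 60.0

r_inactive = distance_from_fret(0.85, R0_PLACEHOLDER)  # high-FRET state
r_active = distance_from_fret(0.25, R0_PLACEHOLDER)    # low-FRET state
print(r_active - r_inactive)
```

      Because the computed distance change scales linearly with R<sub>0</sub>, any uncertainty in the assumed Förster distance, or in dye orientation, carries directly into the estimate, which is consistent with the authors' decision not to interpret the distances structurally.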

      (4) The results on the effect of CXCL12-P2G on CXCR4 are confusing...despite being an antagonist, this ligand stabilizes the "active state"...I am not sure if the explanation offered is sufficient that the opening of the intracellular cleft is not sufficient to drive the G protein coupling/activation.

      We agree that the explanation related to the opening of the intracellular cleft being insufficient to drive G protein coupling/activation is speculative and we have removed that text. We now simply propose that the CXCL12 variants inhibit coupling of G proteins to CXCR4 or disrupt interactions necessary for signaling, as stated in the following text added to the Results on Pg. 8:

      “Despite the ability of CXCL12<sub>P2G</sub> and CXCL12<sub>LRHQ</sub> to stabilize the active R* conformation of CXCR4, both variants are known to act as antagonists (20). This suggests that the CXCL12 mutants inhibit CXCR4 coupling to G proteins not by suppressing the active receptor population but rather by increasing the dynamics of the receptor state-to-state transitions. Our results suggest that the helical movements considered classic signatures of the active state may not be sufficient for CXCR4 to engage productively with G proteins.”

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      General Statements [optional]

      • We thank the reviewers for their useful suggestions regarding how to improve our manuscript.
      • Reviewer 3 declared that s/he did not find and evaluate the provided Supplementary Materials. As a result, many of her/his criticisms seem invalid: the requested data, validations etc. were already there in the Supplementary Figures and Tables.
      • To avoid confusion, we renamed the transgene that is commonly used as a readout for STAT-activated transcription from 10xStat92E-GFP to 10xStat92E DNA binding site-GFP (please see comments by Reviewer 2 that show how easily one can think that Stat92E protein levels go up because of the misleading name of this transgene).
      • One co-author, Martin Csordós, was listed among the authors by mistake. Although first considered, his contribution was not included in either the original or the current manuscript version, so we removed his name from the revised version with his permission.
      • We prefer to use colour coding for Sections 2., 3. and 4. in our responses to Reviewer comments rather than splitting the responses to queries into separate sections, because many of our answers contain a mixture of planned experiments (labeled as bold), already available data (labeled as underlined), and *explanations why we think that no additional analyses are necessary* (between asterisks). Data already provided in the original submission but missed by Reviewers have a white background in our responses.

      Reviewer comments

      Reviewer 1

      Major comments:

      R1/1. “Figure 6E seems to indicate that a subset of Su(var)2-10/PIAS isoforms may bind to ATG8 (directly or indirectly). This leads to the straightforward prediction that this subset should be differentially affected by the selective autophagy at the center of the manuscript. That could be tested to strengthen that point.”

      Response:

      The Atg8a-binding subset of Su(var)2-10/PIAS isoforms could indeed be differentially affected by selective autophagy. To test this, we will analyze in vivo Su(var)2-10 isoform abundance on western blots with an anti-Su(var)2-10 antibody in Atg8aΔ12 and Atg8aK48A/Y49A (Atg8aLDS) mutants.

      Minor comments:

      R1/2. “In Fig S1B,C the colocalization between GFP reporters for STAT92E and AP-1 activity and glia marker does not seem convincing, indicating other cell types may be expressing them as well.”

      Response:

      *The overlap between glia labelling and STAT92E and AP-1 transcriptional readout reporter expression is indeed not complete. First of all, epithelial cells in the wing display both STAT92E and AP-1 activity even in uninjured conditions when glial expression of these reporters is not yet observed. Transcriptional reporter activity outside of the wing nerve was previously indicated in figures with arrowheads; now the epithelium is labeled and the regions containing nerve glia are outlined everywhere.*

      The fiber-like reporter expression after injury in the wing nerve could correspond to either glia or axons1–3. Glia in the wing nerve have a filament-like appearance resembling axons in confocal images, even glial nuclei are flat/elongated1. Importantly, STAT92E enhancer-driven GFP also labels the nucleus in expressing cells, as opposed to glially driven mtdTomato that is membrane-tethered (and thus excluded from the nucleus: see Fig. S1B, C). Of note, TRE-GFP and Stat-GFP are not expressed in neurons because the cell bodies and nuclei of wing vein neurons are never GFP-positive, see Fig. 2C, Figs. S1, S4 in Neukomm et al.1 and Figure 1 for Reviewers. We also explain this better now in the revised manuscript (please see the legend of Fig. S1).

      Nonetheless, we plan to analyze colocalization of mtdTomato-labeled neurons and TRE-GFP and Stat-GFP around the neuronal cell bodies to unequivocally show their different identities. Additionally, we will include transverse confocal sections of the genotypes in Fig. S1B, C that may better illustrate the colocalization.

      Fig. 1 for Reviewers. Neuronal (nSyb+) and Stat92E-GFP+ cell morphology in the L1 vein at the anterior wing margin around the neuronal cell bodies which occupy a stereotypical position at the sensilla1. The location and shape of neuronal nuclei (left panel) are different from Stat-GFP+ cell nuclei (right panel, please see also Fig. S1B, C) based on the circumferential GFP signal. Therefore, cells expressing TRE-GFP and Stat-GFP in injured wing nerves are glia and not neurons.

      R1/3. “p.7 Instead of "Su(var)2-10 is mainly nuclear due to its transcriptional repressor and chromatin organizer functions" it may be better to say "...consistent with its transcriptional repressor and chromatin organizer functions"”

      Response:

      We have modified the manuscript accordingly.

      R1/4. “It is not clear whether the differences in Su(var)2-10/PIAS accumulation between Atg16 and Atg101 RNAi indicate functional differences of blocking autophagy at different stages or simply differences in RNAi efficiency (Atg16) versus the Atg101 mutant.”

      Response:

      We have added glial Atg1 (the catalytic subunit of the autophagy initiation complex that also includes Atg101) knockdown experiments that show the same lack of Su(var)2-10 accumulation in uninjured conditions as seen in the Atg101 null mutant (please see Fig. S6C). Please note that Atg16-Atg5-Atg12 dependent conjugation of LC3/Atg8a is involved in various vesicle trafficking pathways in addition to autophagy4–6, alterations of which may perturb baseline Su(var)2-10 levels in uninjured animals.

      Significance:

      R1/5. “STAT92E-dependent glial upregulation of vir-1, but not Draper, is shown, but consequences for glial functions in nerve injury are not tested.”

      Response:

      We will test antimicrobial peptide (AMP) expression in glia after nerve injury and whether this is affected by STAT92E and vir-1. Certain AMPs such as Attacin C are known to be regulated by both the Stat and NF-κB pathways7, and AMPs can be generally upregulated in response to brain injury8,9. This could serve pathogen clearance functions after defence lines such as the epithelium and blood-brain barrier are compromised. In addition, we will test the recruitment of glial processes into the antennal lobe after olfactory nerve injury in animals with glial STAT92E or vir-1 deficiency. Glial invasion is an adaptive response to axon injury and a first step towards debris clearance10.

      R1/6. “experiments indicate a role for Su(var)2-10/PIAS SUMOylation activity in its autophagic degradation, but it is not clear whether the critical substrate is Su(var)2-10/PIAS itself or another protein.”

      “binding of Su(var)2-10/PIAS to ATG8 is indicated, but no in vitro experiment performed to test whether this is direct and perhaps SUMOylation dependent.”

      Response:

      *We aimed to answer this question by using a point mutant form of Su(var)2-10: CTD2, which is unable to properly autoSUMOylate itself11, see Fig. 6D. CTD2 mutant Su(var)2-10 levels increased in S2 cells transfected with the mutant construct relative to the wild-type, similar to lysosome inhibition affecting the wild-type protein level but not the mutant variant. Importantly, wild-type Su(var)2-10 is present in CTD2 mutant Su(var)2-10-transfected cells, which can still SUMOylate other Su(var)2-10 targets. It is thus the intrinsic SUMOylation defect of the CTD2 mutant that results in its impaired degradation. It is firmly established that increased Su(var)2-10/PIAS levels repress STAT92E activity12, mammalian example: Liu et al., 199813, pointing to Su(var)2-10 as the critical substrate for autophagy during STAT92E derepression.*

      We will further address this point and investigate if Su(var)2-10 directly binds to Atg8a by in vitro SUMOylation of GST-Su(var)2-10 and subsequent GST pulldown assay with HA-Atg8a. In vitro SUMOylation reaction with purified GST-Su(var)2-10 and negative controls are available via in-house collaboration11. We will incubate the resulting proteins and non-SUMOylated counterparts with in vitro transcribed/translated HA-Atg8a, and interactions will be tested by anti-HA western blotting with quantitative fluorescent LI-COR Odyssey CLx detection.

      Reviewer 2

      Major comments:

      R2/1. “The working hypothesis is that upon injury, Su(var)2-10 is degraded by autophagy and, as a consequence, Stat92E induces vir-1 expression.

      Could the authors clarify why do Stat92E levels increase upon injury? Does Stat92E stability increase upon ATG mediated Su(var)2-10 degradation? Or does its expression/nuclear translocation change?”

      Response:

      We did not state that Stat92E levels increase during injury - we only used the 10xStat92E DNA binding site-GFP reporter (we have renamed it as such in our revised manuscript to avoid confusion) that is commonly referred to as 10xStat92E-GFP in the literature14, as a readout for Stat92E-dependent transcription.

      To address these questions, we will use an endogenous promoter-driven STAT92E::GFP::FLAG protein-protein fusion transgene (https://flybase.org/reports/FBti0147707.htm) to test if STAT92E stability/expression or translocation is altered during injury or upon disruption of selective autophagy. We have already tested this reporter and it is detected in the wing nerve nuclei after injury (Figure 2 for Reviewers, panel A).

      As the Atg8aLDS mutation specifically impairs selective autophagy, we will use this mutant and wild-type controls to assess STAT92E::GFP::FLAG abundance on western blots from fly lysates with anti-GFP antibody. To assess STAT92E::GFP::FLAG nuclear translocation as well as stability/expression, we will independently use Atg8aLDS and Su(var)2-10 RNAi in glia to perturb STAT92E-dependent transactivation and visualize glia cell membrane by membrane-tethered tdTomato, glial nuclei by DAPI/anti-Repo and STAT92E with the STAT92E::GFP::FLAG fusion transgene in dissected brains. We can also evaluate STAT92E nuclear translocation with the same genotypes in the injured wing nerve glia. Of note, studies in mammals failed to identify an obvious effect of PIAS1 on STAT1 abundance13; please see Figure 2B from this paper, reproduced as Figure 2 for Reviewers, panel B. Rather, PIAS family proteins bind tyrosine-phosphorylated STAT dimers and impair their DNA binding and thereby their transcriptional activation function15.

      Fig. 2 for Reviewers.

      A. Stat92E::GFP::FLAG expression and nuclear appearance in the wing nerve before and after injury.
      B. Increasing PIAS1 (Su(var)2-10 ortholog) levels does not affect STAT1 abundance in mammalian cells (from Proc. Natl. Acad. Sci. USA Vol. 95, pp. 10626–10631, https://doi.org/10.1073/pnas.95.18.10626).

      R2/2. “Also, since Su(var) levels increase upon ATG RNAi, independently of injury, do ATG levels increase upon injury? It does not seem to be the case from Fig 6D, but then, if the ATG levels do not increase, how to explain the injury mediated effects of Su(var)2-10?”

      Response:

      *We have not seen an effect of injury on the rate of autophagic degradation (flux) in glia using the common flux reporter GFP-mCherry-Atg8a (shown in Fig. S2D – not 6D). Also, levels of the typical autophagic cargo p62/Ref(2)P and core autophagy proteins such as Atg12, Atg5, Atg16 do not change after nervous system injury16, suggesting no change in general autophagic turnover.*

      *An increase in general autophagy would be one option to promote degradation of a given cargo. Just as for the ubiquitin-proteasome system, in selective autophagy the labelling of the cargo/substrate for degradation is a regulated process. Dynamic ubiquitylation of a cargo often promotes its autophagic degradation17. We hypothesize that SUMO may fulfil a similar role in labelling cargo for elimination and this may be promoted by injury in the case of Su(var)2-10, which warrants future studies.*

      R2/3. “Su(var)2-10 levels in control and injured wings are different between ATG18RNAi and ATG101 mutant (Fig 5). Could the authors explain the rationale for using two ATG mutants? and the meaning of this difference? Also, why comparing data using the RNAi approach and a mutation?”

      Response:

      This issue was also raised in R1/4 and we refer the Reviewer/Editor to that section for our new Atg1 knockdown data and explanations.

      *There is a consensus in the autophagy community that mutants for multiple Atg genes should always be used to ensure that it is indeed canonical autophagy that is affected (because Atg proteins can have non-autophagic roles, as is the case for Atg16 in regulation of phagosome maturation - LAP).*

      R2/4. “Fig 6 What is the relevance of the Atg8, Sumo and Su(var)2-10 colocalization at puncta, since there is a lot of colocalization outside the puncta and also lots of Su(var)2-10 or Atg8 labeling that does not colocalize?”

      Response:

      *Su(var)2-10 orthologs PIAS1-4 localize to the nuclear matrix and certain foci in the chromatin and may play roles in heterochromatin formation, DNA repair, and repression of transposable elements in addition to transcriptional repression18–20. SUMO-modified proteins accumulate in response to PIAS activity in phase-separated foci also referred to as SUMO glue21. We show colocalization of Atg8a with similar Su(var)2-10 and SUMO double positive structures in foci.*

      *We do not expect a full overlap between Su(var)2-10 and Atg8a labeling for a number of reasons. First, Su(var)2-10 has many different roles that may not be regulated by autophagy. Second, Atg8a+ autophagosomes in the cytoplasm deliver not only individual proteins such as Su(var)2-10 for degradation but also many other cellular components. Third, nuclear Atg8a is implicated in the removal of the Sequoia transcriptional repressor from autophagy genes22, which is unlikely to involve Su(var)2-10. Now we include these points in the Discussion section.*

      R2/5. “The statement made in the first sentence of the discussion is very strong: 'we have uncovered an activation mechanism for Stat92E', without sufficient supporting evidence.”

      Response:

      We have rephrased this section as follows:

      Here we have uncovered the autophagy-dependent clearance of a direct repressor of the Stat92E transcription factor. This, synergistically with injury-induced Stat92E phosphorylation, may ensure proper Stat92E-dependent responses in glia after nerve injury to promote glial reactivity.

      R2/6. “Could the authors validate (some) expression data by in situ hybridization experiments?”

      Response:

      *Our gene expression data were derived from wing nerve imaging or wing tissue. Unfortunately, in situ hybridization is not feasible in this organ because probes do not penetrate the thick chitin-based cuticle and wax cover of the wing (and the same is true for wing immunostaining).* We do provide independent evidence for vir-1 upregulation in the wing after injury via quantitative PCR (qPCR) in Fig. S5C. To corroborate reporter-based data, we will also analyze drpr by qPCR using wing material after injury at the same time points.

      R2/7. “Could the authors validate the RNAi lines molecularly (or refer to published data on these lines)?”

      Response:

      *Almost all RNAi lines have already been validated by qPCR, western blot, or immunostaining in Szabo et al., 202316 and other publications23–25. The only exception is Su(var)2-10JF03384 and we show that it is indistinguishable from the validated Su(var)2-10HMS00750 RNAi line (which causes 95% transcript reduction): it also strongly derepresses STAT activity. These reagents have also been widely used in the community (e.g. https://flybase.org/reports/FBal0242556.htm, https://flybase.org/reports/FBal0233496.htm).*

      R2/8. „Clarifying the role of Su(var)2-10 on Stat92E would benefit to the presented work. Does Atg8-Su(var)2-10 binding affect Stat92E accumulation, expression, translocation to the nucleus? Some of these experiments could be obtained in S2 cell transfection assays, if too complex in vivo.”

      Response:

      As explained in R2/1, we will use an endogenous promoter-driven STAT92E::GFP::FLAG protein-protein fusion transgene to test if STAT92E stability/expression or translocation is altered upon disruption of selective autophagy (in Atg8aLDS mutant flies).

      R2/9. „Also, what happens to the axons in the mutant conditions described in the manuscript? This would raise the impact of the work, but would require in vivo work with fly stocks containing several transgenes.”

      Response:

      We have already published in our previous paper, Szabo et al., 202316, that the mutants used in the current study display normal axon morphology. There are only two mutants that we did not test in that paper: Atg8aLDS and our new Atg8anull, and we will examine these remaining two during the revision, but we already published in the above paper that axons appear normal in Atg8aΔ4, a widely used Atg8a mutant allele.

      R2/10. „It has been published that Draper is involved in the response to injury in the adult wing nerve. See for example Neukomm et al (2014). The authors should discuss how this fits with their hypothesis and data. In this respect, Fig S4B, which should support the hypothesis, should be improved. It is rather hard to interpret it.”

      Response:

      Fig. S3 (draper protein trap-Gal4 driven GFP-RFP reporter expression) and S4B (intronic STAT92E binding site of the draper gene driven GFP-RFP reporter expression) show similar results: drpr is already expressed in wing nerve glia before injury, which is in line with Draper’s crucial role in the injury response because Draper-mediated glial signaling triggers glial reactivity. This has been added to the Discussion.

      Minor comments:

      R2/11. „Rubicon is also a negative regulator of autophagy (doi:10.1038/s41598-023-44203-6). in (Fig2 B, D) we have a higher GFP intensity in both uninjured and injured, and the difference between Injured/uninjured is less significant compared to control. It is possible that Rubicon KD causes more autophagy leading to a higher activation of Stat92E even in control. I wouldn't take the results as a proof of canonical autophagy implication and not LC3-associated phagocytosis”

      Response:

      Loss of Rubicon could indeed potentially remove more Su(var)2-10 via increased autophagy, leading to higher Stat92E activity. However, there is no statistically significant difference between injured and uninjured controls and injured and uninjured Rubicon knockdown, respectively, in Fig2 B, D (p=0.6975 and >0.9999 for each comparison). We are puzzled by the statement that the reviewer „wouldn't take the results as a proof of canonical autophagy implication and not LC3-associated phagocytosis”. We analyzed Rubicon as a factor critical for LAP and its deficiency does not prevent Stat transcriptional activity following injury, unlike the loss of Atg8a, Atg16, Atg13 and Atg5. We will further support this result with a mutant of Atg16 with part of the WD40 domain deleted, because this region is critical for LAP but not for autophagy16,26,27.

      R2/12. „The rationale for using both repoGal4 and repoGS is unclear. If, as mentioned, the goal is to avoid developmental defects, repoGS should be consistently used. Especially I don't understand how both were utilized to knock down the same genes, such as Atg16”

      Response:

      *We had to use repoGS (a drug-inducible Gal4 active in glia) because knocking down Su(var)2-10 with repoGal4 resulted in no viable adult progeny. Su(var)2-10 is an essential gene as opposed to most autophagy genes and its absence results in embryonic lethality24. Thus all Su(var)2-10 silencing experiments were done with repoGS. Similarly, Stat92E is involved in various developmental processes and its loss is embryonic lethal. repoGal4 was used for genes generally not having an adverse effect when absent during development16 in the first two figures. In Fig. 4D, we silenced Atg16 by repoGS because it is one of the controls for testing a genetic epistasis between Su(var)2-10 and Atg16. Please note that we see exactly the same phenotype in case of Atg16 knockdown when using either Gal4 version.* This has been explained in the revised methods section.

      R2/13. „In the third paragraph of the introduction, I am confused whether Stat92E regulates drpr or the reverse”

      Response:

      Upon antennal injury, Drpr receptor binding to phagocytic cargo initiates a positive feedback loop in glial cells to promote its own transcription28. Drpr receptor in the plasma membrane regulates Stat92E and AP-1 activity via signal transduction. Stat92E and AP-1, in turn, increase drpr transcription10,28–30, which results in more plasma membrane Drpr protein expression. We have explained this more clearly in the revised Introduction.

      R2/14. „I cannot find the evidence for vir-1 being expressed in glia and target of Gcm in the refences that have been cited.”

      Response:

      We apologize for not explaining this better: vir-1 is called CG5453 in Freeman et al., 200331. It is listed in Table 1 as a Gcm target since there is no detectable CG5453 expression in a Gcm null mutant, please see below. We have updated the manuscript with this gene name.

      Part of Table 1 from Freeman et al., 200331.

      R2/15. „The presence of a Stat92E binding site on the vir-1 promoter has already been described in the paper from Imler and collaborators, Nature immunology 2005. Actually, if this site is present in their transgenic line, it would help the authors strengthen the argument that Stat92E has a direct role on vir1 (for which they make a very strong statement in the discussion, with no direct evidence).”

      Response:

      *The evidence that Stat92E may have a direct role in vir-1 transcription in glia comes exactly from the same reporter transgene described by Imler and collaborators in the mentioned paper32. We received this transgenic line from the Imler group and monitored its expression after injury upon depletion of Stat92E (Fig. 3B). It thus contains the studied Stat binding site. This was referenced in the Methods and in all relevant sections of the main text, and we now explicitly state this in the revised text.*

      R2/16. „In the Fig S2D, I do not see a lot of GFP+ (Glia) cells. I see more Atg8a in injured 3 dpi regardless of colocalization with glia”

      Response:

      Fig S2D uses one of the standard assays for autophagic turnover, which we now explain in more detail in the Results section. Basically, the dual tagged GFP::mCherry::Atg8a transgene is expressed in glia, and GFP is quenched in lysosomes after delivery by autophagy while mCherry remains fluorescent. So, in addition to double positive dots (autophagosomes), there are mCherry dots lacking GFP (autolysosomes) if autophagy is functional. All of these dots are in glia but the cell boundaries are not visible.

      The images shown are single optical slices. The number of mCherry+ puncta is around 7-8 per field in both uninjured and injured (3 dpi) conditions, but puncta brightness is always variable. Since most mCherry+ puncta were rather bright in the original 3 dpi image, we changed it to a more representative image.
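
      The readout logic of the tandem-tagged reporter described in this response can be illustrated with a minimal sketch. This is not the quantification pipeline used for Fig. S2D; the `Punctum` type, the function names, the intensity units and both threshold values are hypothetical and chosen only to show how double-positive dots and mCherry-only dots would be told apart.

```python
from dataclasses import dataclass


@dataclass
class Punctum:
    """One detected dot; intensities are background-subtracted (hypothetical units)."""
    gfp: float
    mcherry: float


def classify(p: Punctum, gfp_min: float = 50.0, mcherry_min: float = 50.0) -> str:
    """Classify a dot from a GFP::mCherry::Atg8a reporter image.

    Thresholds are illustrative assumptions, not values from the manuscript.
    """
    if p.mcherry < mcherry_min:
        # No mCherry signal: not a reporter-positive structure.
        return "background"
    # GFP is quenched in the acidic lysosome, mCherry is not.
    return "autophagosome" if p.gfp >= gfp_min else "autolysosome"


def count_classes(puncta, **thresholds):
    """Count dots per class in one field of view."""
    counts = {"autophagosome": 0, "autolysosome": 0, "background": 0}
    for p in puncta:
        counts[classify(p, **thresholds)] += 1
    return counts


if __name__ == "__main__":
    # Made-up example field, for illustration only.
    field = [Punctum(120, 200), Punctum(5, 180), Punctum(3, 10)]
    print(count_classes(field))
```

      In such a scheme, the presence of mCherry-only dots alongside double-positive dots is what indicates that autophagic delivery to lysosomes is functional.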

      R2/17. „The quantification of the signals is made in a specific region of the wing, I guess throughout the nerve thickness. This could be represented more carefully in a schematic and it would also help defining colocalization in the first figure, by using a transverse section.”

      Response:

      The quantification method is described in Materials and Methods and we have added that quantification was done on single optical slices. The imaged region is depicted in Fig. S1A, where we indicated the rectangular region used in Fiji for image quantification. We will add transverse sections of wings as suggested.

      R2/18. „A number of ATG genes are considered in the manuscript, but the rationale for using them is not always clear. Showing a schematic would help clarify this.”

      Response:

      We have added a table showing the different steps of autophagy where the studied Atg genes/proteins function (now Supplementary Table 1). We also added whether the gene is considered specific for autophagy or can play a role in another process, e.g. LAP. We studied different autophagy genes in line with the assumption that disabling distinct autophagic complexes should produce the same phenotype if this process is indeed autophagy (and not LC3-associated phagocytosis for example).

      R2/19. „Fig 7 is not cited and its legend is very short.”

      Response:

      We have now cited Fig 7 and expanded its legend.

      R2/20. „Clarify the color coding in Fig S1E”

      Response:

      We added that red is injured, black is uninjured.

      R2/21. „What is the tandem tagged autophagic fly reporter in fig S2D?”

      Response:

      This is one of the most common tools to study autophagy, please see the updated explanation above at your first question regarding Fig. S2D.

      R2/22. „Add a schematic on the vir-1 isoforms.”

      Response:

      We have added a schematic showing the vir-1 isoforms in Fig. S5B.

      R2/23. „Fig S6B and Fig 5 relate on the levels of Su(var)2-10 upon Atg16 RNAi, but the scale is not the same, why?”

      Response:

      *The scales are different because these two images measure different things. Fig. 5 indeed displays quantification of Su(var)2-10 levels in brain glia. However, Fig S6B shows quantification of Stat92E-induced GFP reporter levels (as a proxy of Stat92E transcriptional activity) in the wing nerve upon Atg16 knockdown.*

      Reviewer 3

      R3/1. „The claim that the negative regulator of Stat92E signaling is removed by selective autophagy, involving selective autophagy receptors different from/in addition to Ref(2)P/p62 is not convincingly shown. This claim probably needs to be softened.”

      Response:

      *We have rephrased this sentence as follows:*

      „These data suggest that selective autophagy is involved in Stat92E-dependent transcriptional activation in glia.”

      R3/2. „The reporter that was used (10xSTAT92E-eGFP) is not a dynamic reporter of STAT92E activity. It accumulates in glia and is highly stable. The appropriate reporter to look at dynamic changes would be 10XSTAT92E-dGFP, which has a degradable (unstable) GFP that is required to see dynamic changes even in the CNS. All of the claims about STAT92E regulation use this reporter, so they are questionable.”

      Response:

      10XSTAT92E-dGFP featuring destabilized GFP could be a more appropriate tool for monitoring dynamic changes in transcription when short-term (e.g. a few hours) changes are investigated. However, we did not see any expression of 10XSTAT92E-dGFP (we tried 2 different transgenic insertions) in the wing nerve; please see Figure 3 for Reviewers. In the brain, dGFP expression with this reporter is also several times lower than stable GFP; please compare Fig. 4A and B in Doherty et al.28.

      The use of 10xSTAT92E-eGFP to follow dynamic expression changes is justified by many lines of evidence. First, there is no 10xSTAT92E-EGFP expression in uninjured wing nerves (Fig. S1D, E). Injury induces EGFP expression in the wing nerve with a sustained activation from 1 to 3 dpi (days post injury), and the EGFP expression returns to the baseline by 5 dpi (Fig. S1D, E). Second, the initial Stat-dependent upregulation of drpr and the 10XSTAT92E-dGFP signal in the brain both occur in the first 24 hours after injury and are sustained for 72 hours28, similar to our results with 10xSTAT92E-EGFP (Fig. S1D, E). These results indicate that the dynamics of 10xSTAT92E-EGFP expression allow monitoring changes in Stat-dependent transcription occurring over days.

      Figure 3 for Reviewers. Lack of 10XSTAT92E-dGFP signal in the wing nerve from two independent insertions of the same transgene at the indicated time points after wing injury.

      R3/3. „The claim that glial drpr is not upregulated by wing injury and drpr accumulation is not apparently a prerequisite for efficient debris processing within the wing is weak. First, they did not stain for Draper using antibodies, rather they used expression constructs. Dee7 is a promoter that was found to be injury activated in the CNS (were they able to replicate that result? I did not receive the supplemental data), but it might not be the crucial regulator in the periphery. The MIMIC line that was converted is better, but might not represent the full spectrum of regulatory events at the draper locus. Finally, they never actually test for endogenous RNA changes, or use the antibody on westerns. Their lack of evidence is not as compelling as it could be.”

      Response:

      The original Supplemental Material already provides answers for this and subsequent questions of Reviewer 3. We deposited the Supplemental Material to bioRxiv at the time of the first Review Commons submission and it was/is available at https://www.biorxiv.org/content/10.1101/2024.08.28.610109v2.supplementary-material.

      Figs. S3 and S4 show in the wing and the brain (using two different drpr reporters for its transcriptional regulation) that drpr expression does not change much in the wing after nerve injury, as opposed to the brain.

      *We did indeed replicate that dee7-Gal4 expression is induced in the brain after antennal injury using UAS-TransTimer (Fig. S4A). In contrast, wing cell nuclei already show expression of both fluorescent proteins in uninjured conditions, and RFP+ nucleus numbers do not change after wing injury (Fig. S4B, C). drpr-Gal4 was generated by conversion of a MiMIC gene trap element into a Gal4 that traps all transcripts. drprMI07659 is in an intron that is common in all drpr isoforms so it should capture the regulation of all transcript isoforms.*

      We will further analyze drpr expression via independent methods during the revision: qPCR amplification of a common region of drpr transcripts, and western blot with anti-Drpr antibody to compare injured and uninjured wing material. Of note, we see no upregulation of drpr 2 days after wing injury in our (unpublished) RNAseq results either.

      *Unfortunately, immunostaining of the adult wing is not feasible because antibodies do not penetrate the thick chitin-based cuticle and wax cover of the wing.*

      R3/4. „The authors claim autophagy contributes to glial reactive states in part by acting on JAK-STAT pathway via regulation of Stat92E. They did not investigate other potential STAT92E targets. Does Atg16 knockdown alter STAT92E expression? Apparently Vir1 is still upregulated in the absence of Atg16 following injury, but they don’t show STAT92E changes.”

      Response:

      We did investigate other potential STAT92E targets besides vir-1. This is referred to in the text as „immunity-related gene reporters” and it again can be found in the Supplemental Material (Supplementary Table 2). None of these genes showed glia-specific upregulation following injury.

      We will investigate STAT92E expression with the STAT92E::GFP::FLAG protein-protein fusion transgene after disrupting autophagy as also suggested by Reviewer 2. Please see our detailed answer to the first comment of Reviewer 2.

      *We do not agree with the comment that „Vir1 is still upregulated in the absence of Atg16 following injury” because Fig. 3F,G show that lack of Atg16 abolishes the upregulation of the vir-1 reporter: the change from uninjured to injured becomes statistically not significant and the mean GFP intensities are practically identical.*

      R3/5. „The authors claim Su(var)2-10 is an autophagic cargo. They should better characterize Su(var)2-10 degradation and its regulation, and image quality needs to be improved (better images, merged examples, and clearer indication of what they are highlighting). There are many arrows in figures that I don't know what they are pointing to. Much of the labeling in Fig 1 (and others) looks like axons. Could TRE-GFP be turned on in neurons? How did they discriminate?”

      Response:

      As also explained to Reviewer 1’s last comment, we will carry out experiments to address whether SUMOylated Su(var)2-10 binds Atg8a, which can provide evidence for a direct SUMO-dependent autophagic elimination of Su(var)2-10. Please see our detailed response there.

      We will further improve image quality for brain images and we already incorporated new images in Fig. S6. Merged images were missing only from Fig. 5; we have included them in the current version. Arrows and arrowheads were used as described in the Figure legends, but instead of those, we now clearly label the epithelium and outline the region of wing nerve glia in all images.

      Please see our response to the first minor comment of Reviewer 1 regarding the expression of reporters in wing tissues.

      R3/6. „The authors claim interaction of Su(var)2-10 with Atg8a in the nucleus and cytoplasm can trigger autophagic breakdown, involving Su(var)2-10 SUMOylation. The paper would benefit from showing direct SUMOylation of Su(var)2-10 after injury. Is there any way to examine this in vivo?”

      Response:

      We will test direct SUMOylation of Su(var)2-10 using a method recently described by Andreev et al., 2022 (ref. 33). FLAG-GFP-Smt3 (SUMO) is expressed under SUMO transcriptional regulation, and we will immunoprecipitate FLAG-GFP-SUMO, and GFP alone as a negative control, with GFPTrap beads from lysates of heads subjected to traumatic brain injury, which results in glial reactivity (ref. 16), and also from uninjured head lysates. We will use anti-Su(var)2-10 western blotting to visualize SUMOylated Su(var)2-10 and to determine whether its levels are modulated by brain injury.

      R3/7. „The authors state in discussion "we find that draper is highly expressed in wing nerve glia already in uninjured conditions and it is not further induced by wing transection - indicating high phagocytic capacity in wing glia ... axon debris clearance takes substantially longer in the wing nerve than in antennal lobe glomeruli, thus draper levels may not readily predict actual phagocytic activity in glia". However, they never actually assess this in their experiments. All the conclusions about Draper are made from promoter fusions of integrated reporters, which are imperfect. This conclusion cannot be made.”

      Response:

      As described in our response to R3/3, we will further test drpr expression changes after wing injury using two independent methods: qPCR and western blot.

      We deleted the parts of the Discussion that were criticized by the reviewer because they are not important for the main message of our manuscript.

      R3/8. „Both STAT92E and Jun are activated by a stress response. Could this be a stress response to disrupting autophagy that is somehow enhanced by injury?”

      Response:

      Stress responses are indeed relayed by AP-1 and Stat signaling, and impaired autophagy could be a source of stress. We would like to emphasize, though, that the main finding of our manuscript is that disrupting autophagy suppresses Stat-dependent transcription. Autophagy inhibition does not increase Stat signaling in uninjured wing nerves and while control flies upregulate Stat activity upon injury, autophagy-deficient animals fail to do so (Fig. 1). Thus, Stat signaling is not activated by loss of autophagy – it is activated by injury (that is the stress) and Stat activation requires autophagy in this setting.

      R3/9. „Minor:

      I don't think that "glially" is a word.”

      Response:

      Online dictionaries such as Wiktionary list glially as a word, and many scientific articles use it: https://doi.org/10.1016/j.conb.2022.102653, https://doi.org/10.1016/j.yexcr.2013.08.016, https://doi.org/10.1016/j.jpain.2006.04.001, to give some examples.

      We nonetheless refrain from using it in the updated text.

      References

      1. Neukomm, L.J., Burdett, T.C., Gonzalez, M.A., Züchner, S., and Freeman, M.R. (2014). Rapid in vivo forward genetic approach for identifying axon death genes in Drosophila. Proc National Acad Sci 111, 9965–9970. https://doi.org/10.1073/pnas.1406230111.
      2. Giangrande, A., Murray, M.A., and Palka, J. (1993). Development and organization of glial cells in the peripheral nervous system of Drosophila melanogaster. Development 117, 895–904. https://doi.org/10.1242/dev.117.3.895.
      3. Stork, T., Engelen, D., Krudewig, A., Silies, M., Bainton, R.J., and Klämbt, C. (2008). Organization and Function of the Blood–Brain Barrier in Drosophila. J. Neurosci. 28, 587–597. https://doi.org/10.1523/jneurosci.4367-07.2008.
      4. Figueras-Novoa, C., Timimi, L., Marcassa, E., Ulferts, R., and Beale, R. (2024). Conjugation of ATG8s to single membranes at a glance. J. Cell Sci. 137, jcs261031. https://doi.org/10.1242/jcs.261031.
      5. Galluzzi, L., and Green, D.R. (2019). Autophagy-Independent Functions of the Autophagy Machinery. Cell 177, 1682–1699. https://doi.org/10.1016/j.cell.2019.05.026.
      6. Nieto-Torres, J.L., Leidal, A.M., Debnath, J., and Hansen, M. (2021). Beyond Autophagy: The Expanding Roles of ATG8 Proteins. Trends Biochem Sci 46, 673–686. https://doi.org/10.1016/j.tibs.2021.01.004.
      7. Huang, Z., Kingsolver, M.B., Avadhanula, V., and Hardy, R.W. (2013). An Antiviral Role for Antimicrobial Peptides during the Arthropod Response to Alphavirus Replication. J. Virol. 87, 4272–4280. https://doi.org/10.1128/jvi.03360-12.
      8. Purice, M.D., Ray, A., Münzel, E.J., Pope, B.J., Park, D.J., Speese, S.D., and Logan, M.A. (2017). A novel Drosophila injury model reveals severed axons are cleared through a Draper/MMP-1 signaling cascade. Elife 6, e23611. https://doi.org/10.7554/elife.23611.
      9. Alphen, B. van, Stewart, S., Iwanaszko, M., Xu, F., Li, K., Rozenfeld, S., Ramakrishnan, A., Itoh, T.Q., Sisobhan, S., Qin, Z., et al. (2022). Glial immune-related pathways mediate effects of closed head traumatic brain injury on behavior and lethality in Drosophila. Plos Biol 20, e3001456. https://doi.org/10.1371/journal.pbio.3001456.
      10. MacDonald, J.M., Beach, M.G., Porpiglia, E., Sheehan, A.E., Watts, R.J., and Freeman, M.R. (2006). The Drosophila Cell Corpse Engulfment Receptor Draper Mediates Glial Clearance of Severed Axons. Neuron 50, 869–881. https://doi.org/10.1016/j.neuron.2006.04.028.
      11. Bence, M., Jankovics, F., Kristó, I., Gyetvai, Á., Vértessy, B.G., and Erdélyi, M. (2024). Direct interaction of Su(var)2‐10 via the SIM‐binding site of the Piwi protein is required for transposon silencing in Drosophila melanogaster. FEBS J. 291, 1759–1779. https://doi.org/10.1111/febs.17073.
      12. Betz, A., Lampen, N., Martinek, S., Young, M.W., and Darnell, J.E. (2001). A Drosophila PIAS homologue negatively regulates stat92E. Proc. Natl. Acad. Sci. 98, 9563–9568. https://doi.org/10.1073/pnas.171302098.
      13. Liu, B., Liao, J., Rao, X., Kushner, S.A., Chung, C.D., Chang, D.D., and Shuai, K. (1998). Inhibition of Stat1-mediated gene activation by PIAS1. Proc. Natl. Acad. Sci. 95, 10626–10631. https://doi.org/10.1073/pnas.95.18.10626.
      14. Bach, E.A., Ekas, L.A., Ayala-Camargo, A., Flaherty, M.S., Lee, H., Perrimon, N., and Baeg, G.-H. (2007). GFP reporters detect the activation of the Drosophila JAK/STAT pathway in vivo. Gene Expr Patterns 7, 323–331. https://doi.org/10.1016/j.modgep.2006.08.003.
      15. Hu, X., Li, J., Fu, M., Zhao, X., and Wang, W. (2021). The JAK/STAT signaling pathway: from bench to clinic. Signal Transduct. Target. Ther. 6, 402. https://doi.org/10.1038/s41392-021-00791-1.
      16. Szabó, Á., Vincze, V., Chhatre, A.S., Jipa, A., Bognár, S., Varga, K.E., Banik, P., Harmatos-Ürmösi, A., Neukomm, L.J., and Juhász, G. (2023). LC3-associated phagocytosis promotes glial degradation of axon debris after injury in Drosophila models. Nat. Commun. 14, 3077. https://doi.org/10.1038/s41467-023-38755-4.
      17. Goodall, E.A., Kraus, F., and Harper, J.W. (2022). Mechanisms underlying ubiquitin-driven selective mitochondrial and bacterial autophagy. Mol. Cell 82, 1501–1513. https://doi.org/10.1016/j.molcel.2022.03.012.
      18. Zhang, T., Yang, H., Zhou, Z., Bai, Y., Wang, J., and Wang, W. (2022). Crosstalk between SUMOylation and ubiquitylation controls DNA end resection by maintaining MRE11 homeostasis on chromatin. Nat. Commun. 13, 5133. https://doi.org/10.1038/s41467-022-32920-x.
      19. Chen, Z., Zhang, Y., Guan, Q., Zhang, H., Luo, J., Li, J., Wei, W., Xu, X., Liao, L., Wong, J., et al. (2021). Linking nuclear matrix–localized PIAS1 to chromatin SUMOylation via direct binding of histones H3 and H2A.Z. J. Biol. Chem. 297, 101200. https://doi.org/10.1016/j.jbc.2021.101200.
      20. Brown, J.R., Conn, K.L., Wasson, P., Charman, M., Tong, L., Grant, K., McFarlane, S., and Boutell, C. (2016). SUMO Ligase Protein Inhibitor of Activated STAT1 (PIAS1) Is a Constituent Promyelocytic Leukemia Nuclear Body Protein That Contributes to the Intrinsic Antiviral Immune Response to Herpes Simplex Virus 1. J. Virol. 90, 5939–5952. https://doi.org/10.1128/jvi.00426-16.
      21. Gutierrez-Morton, E., and Wang, Y. (2024). The role of SUMOylation in biomolecular condensate dynamics and protein localization. Cell Insight 3, 100199. https://doi.org/10.1016/j.cellin.2024.100199.
      22. Jacomin, A.-C., Petridi, S., Monaco, M.D., Bhujabal, Z., Jain, A., Mulakkal, N.C., Palara, A., Powell, E.L., Chung, B., Zampronio, C., et al. (2020). Regulation of Expression of Autophagy Genes by Atg8a-Interacting Partners Sequoia, YL-1, and Sir2 in Drosophila. Cell Reports 31, 107695. https://doi.org/10.1016/j.celrep.2020.107695.
      23. Maimon, I., Popliker, M., and Gilboa, L. (2014). Without children is required for Stat-mediated zfh1 transcription and for germline stem cell differentiation. Development 141, 2602–2610. https://doi.org/10.1242/dev.109611.
      24. Ninova, M., Chen, Y.-C.A., Godneeva, B., Rogers, A.K., Luo, Y., Tóth, K.F., and Aravin, A.A. (2020). Su(var)2-10 and the SUMO Pathway Link piRNA-Guided Target Recognition to Chromatin Silencing. Mol. Cell 77, 556-570.e6. https://doi.org/10.1016/j.molcel.2019.11.012.
      25. Pircs, K., Nagy, P., Varga, A., Venkei, Z., Erdi, B., Hegedus, K., and Juhasz, G. (2012). Advantages and Limitations of Different p62-Based Assays for Estimating Autophagic Activity in Drosophila. PLoS ONE 7, e44214. https://doi.org/10.1371/journal.pone.0044214.
      26. Fletcher, K., Ulferts, R., Jacquin, E., Veith, T., Gammoh, N., Arasteh, J.M., Mayer, U., Carding, S.R., Wileman, T., Beale, R., et al. (2018). The WD40 domain of ATG16L1 is required for its non‐canonical role in lipidation of LC3 at single membranes. EMBO J 37, e97840. https://doi.org/10.15252/embj.201797840.
      27. Rai, S., Arasteh, M., Jefferson, M., Pearson, T., Wang, Y., Zhang, W., Bicsak, B., Divekar, D., Powell, P.P., Nauman, R., et al. (2018). The ATG5-binding and coiled coil domains of ATG16L1 maintain autophagy and tissue homeostasis in mice independently of the WD domain required for LC3-associated phagocytosis. Autophagy 15, 1–14. https://doi.org/10.1080/15548627.2018.1534507.
      28. Doherty, J., Sheehan, A.E., Bradshaw, R., Fox, A.N., Lu, T.-Y., and Freeman, M.R. (2014). PI3K Signaling and Stat92E Converge to Modulate Glial Responsiveness to Axonal Injury. PLoS Biol 12, e1001985. https://doi.org/10.1371/journal.pbio.1001985.
      29. Logan, M.A., Hackett, R., Doherty, J., Sheehan, A., Speese, S.D., and Freeman, M.R. (2012). Negative regulation of glial engulfment activity by Draper terminates glial responses to axon injury. Nat. Neurosci. 15, 722–730. https://doi.org/10.1038/nn.3066.
      30. MacDonald, J.M., Doherty, J., Hackett, R., and Freeman, M.R. (2013). The c-Jun kinase signaling cascade promotes glial engulfment activity through activation of draper and phagocytic function. Cell Death Differ 20, 1140–1148. https://doi.org/10.1038/cdd.2013.30.
      31. Freeman, M.R., Delrow, J., Kim, J., Johnson, E., and Doe, C.Q. (2003). Unwrapping Glial Biology Gcm Target Genes Regulating Glial Development, Diversification, and Function. Neuron 38, 567–580. https://doi.org/10.1016/s0896-6273(03)00289-7.
      32. Dostert, C., Jouanguy, E., Irving, P., Troxler, L., Galiana-Arnoux, D., Hetru, C., Hoffmann, J.A., and Imler, J.-L. (2005). The Jak-STAT signaling pathway is required but not sufficient for the antiviral response of drosophila. Nat. Immunol. 6, 946–953. https://doi.org/10.1038/ni1237.
      33. Andreev, V.I., Yu, C., Wang, J., Schnabl, J., Tirian, L., Gehre, M., Handler, D., Duchek, P., Novatchkova, M., Baumgartner, L., et al. (2022). Panoramix SUMOylation on chromatin connects the piRNA pathway to the cellular heterochromatin machinery. Nat. Struct. Mol. Biol. 29, 130–142. https://doi.org/10.1038/s41594-022-00721-x.
    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Multiple compounds that inhibit ATP-sensitive potassium (KATP) channels also chaperone channels to the surface membrane. The authors used an artificial intelligence (AI)-based virtual screening (AtomNet) to identify novel compounds that exhibit chaperoning effects on trafficking-deficient disease-causing mutant channels. One compound, which they named Aekatperone, acts as a low affinity, reversible inhibitor and effective chaperone. A cryoEM structure of KATP bound to Aekatperone showed that the molecule binds at the canonical inhibitory site.

      Strengths and weaknesses:

      The details of the AI screening itself are inevitably opaque, but appear to differ from classical virtual screening in not involving any physical docking of test compounds into the target site. The authors mention criteria that were used to limit the number of compounds, so that those with high similarity to known binders and 'sequence identity' (does this mean structural identity?) were excluded. The identified molecules contain sulfonylurea-like moieties. How different are they from other sulfonylureas?

      We thank the reviewers for the questions. As part of the library preparation, molecules with greater than 0.5 Tanimoto similarity in ECFP4 space to any known binders of the target protein and its homologs within 70% sequence identity were excluded to increase the possibility of identifying novel hits. After scoring and ranking the molecules by the AtomNet® technology, a diversity clustering was performed using the Butina algorithm (Butina D. Unsupervised Data Base Clustering Based on Daylight’s Fingerprint and Tanimoto Similarity: A Fast and Automated Way To Cluster Small and Large Data Sets, J. Chem. Inf. Comput. Sci. 1999, 39, 747–750) with a Tanimoto similarity cutoff of 0.35 in ECFP4 space to minimize selection of structurally similar scaffolds for the final compound buy-list. We have revised the results and methods sections to make this clear.
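      The novelty filter described above can be illustrated with a minimal sketch. It assumes each fingerprint is represented as a set of on-bit indices; the toy fingerprints, molecule names, and the helper names `tanimoto` and `exclude_known_like` are hypothetical and are not part of the AtomNet® workflow, whose internals are not described here.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def exclude_known_like(library: dict, known_binders: list, cutoff: float = 0.5) -> dict:
    """Keep molecules whose similarity to every known binder is at most `cutoff`.

    Molecules with similarity greater than the cutoff to any known binder are dropped.
    """
    return {
        name: fp
        for name, fp in library.items()
        if all(tanimoto(fp, known) <= cutoff for known in known_binders)
    }


# Toy data (hypothetical bit sets, not real ECFP4 fingerprints).
known_binders = [frozenset({1, 2, 3, 4})]
library = {
    "mol_a": frozenset({1, 2, 3, 5}),  # similarity 3/5 to the known binder
    "mol_b": frozenset({7, 8, 9}),     # no shared bits
    "mol_c": frozenset({1, 2, 8, 9}),  # similarity 2/6 to the known binder
}
kept = exclude_known_like(library, known_binders, cutoff=0.5)
```

      In this toy case only `mol_a` exceeds the 0.5 cutoff and is removed. With real molecules, the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.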

      Sulfonylureas are defined by their core structure comprising a sulfonyl group (–S(=O)<sub>2</sub>) and a urea moiety (–NH–CO–NH–). While some compounds identified in our study contain a sulfonamide group (R-S(=O)<sub>2</sub>-NR<sub>2</sub>), they differ structurally from sulfonylureas by lacking the key urea group and by incorporating unique R-group substitutions (we have now added this to the Figure 1A legend). For example, compound C27 (Z2068224500) includes a sulfonamide group but not a urea moiety. Likewise, C45 (Aekatperone, Z1620764636) contains a sulfonamide group along with an aromatic, nitrogen-rich heterocyclic ring, but no urea group. Additionally, the R-groups in these compounds are more complex than the simple aromatic or alkyl chains typical of sulfonylureas. They include heterocyclic aromatic systems and nitrogen-rich structures, which likely influence their binding properties and lipophilicity. These structural differences suggest distinct functional and pharmacological profiles as supported by our biochemical and functional studies.

      The experimental work confirming that Aekatperone acts to traffic mutant KATP channels to the surface and acts as a low affinity, reversible, inhibitor is comprehensive and clear, with very convincing cell biological and patch-clamp data, as is the cryoEM structural analysis, for which the group are leading experts. In addition to the three positive chaperone-effective molecules, the authors identified a large number of compounds that are predicted binders but apparently have no chaperoning effect. Did any of them have inhibitory action on channels? If so, does this give clues to separating chaperoning from inhibitory effects?

      This is an interesting question. Evidence from cryo-EM, biochemical and electrophysiology studies reveals a critical role of the Kir6.2 N-terminus in K<sub>ATP</sub> channel assembly and gating, and shows that pharmacological chaperones like glibenclamide, repaglinide, carbamazepine, and now Aekatperone exert their chaperoning and inhibitory effects by stabilizing the interaction between the Kir6.2 N-terminus and the SUR1-ABC core. This stabilization, while promoting the assembly of Kir6.2 and SUR1 to “chaperone” trafficking-impaired mutant channels to the cell surface, also inhibits the channel by restricting the Kir6.2 C-terminal domain from rotating to an open state. An additional mechanism by which these compounds inhibit channel activity is by preventing SUR1-NBD dimerization, which mediates physiological activation of the channel by MgADP (see review: Driggers CM, Shyng SL. Mechanistic insights on K<sub>ATP</sub> channel regulation from cryo-EM structures. J Gen Physiol. 2023 Jan 2;155(1): e202113046, PMID: 36441147). From our compound screening, we did find some compounds that showed mild inhibition of the channel by electrophysiology but no obvious chaperone effects by western blots. It is possible that small chaperoning effects of some compounds showing mild channel inhibition were missed due to the lower sensitivity of the western blot assay compared to electrophysiology. Alternatively, these compounds could inhibit channels by preventing SUR1-NBD dimerization without stabilizing the Kir6.2 N-terminus, which is required for the chaperone effect based on our model. Unfortunately, we did not find any compounds that show chaperone effects but no channel inhibition, which is consistent with our understanding of how this type of K<sub>ATP</sub> chaperone works (i.e. by stabilizing the interaction of the Kir6.2 N-terminus with SUR1’s ABC core).

      The authors suggest that the novel compound may be a promising therapeutic for treatment of congenital hyperinsulinism due to trafficking-defective KATP mutations, because it is a low-affinity, reversible inhibitor. This is a very interesting concept, and perhaps a pulsed dosing regimen would allow trafficking without constant channel inhibition (which otherwise defeats the therapeutic purpose), although it is unclear whether the new compound will offer advantages over earlier low-affinity sulfonylurea inhibitor chaperones. These include tolbutamide, which has very similar affinity and effect to Aekatperone. As the authors point out, this (as well as other sulfonylureas) is currently out of favor because of potential adverse cardiovascular effects, but again, it is unclear why Aekatperone should not have the same concerns.

      We thank the reviewer for the comments. This is clearly an important question to address in the future. While we have not directly tested the effects of Aekatperone on cardiac functions, we did assess its inhibitory effect on cells expressing the cardiac K<sub>ATP</sub> channel isoform (SUR2A/Kir6.2). Our results indicate that Aekatperone exhibits higher sensitivity toward the pancreatic K<sub>ATP</sub> channel isoform (SUR1/Kir6.2) compared to the cardiac isoform. However, we acknowledge that Aekatperone could still have cardiotoxic effects through its potential action on other channels, such as the hERG channel.

      It is worth noting that tolbutamide, despite its known cardiotoxic effects, does not exert these effects through cardiac K<sub>ATP</sub> channel inhibition. This has been demonstrated in studies showing no inhibitory effect of tolbutamide on SUR2A/Kir6.2 channels and on channels formed by Kir6.2 and SUR1 harboring the S1238Y mutation (also shown as S1237Y in some studies using a different SUR1 isoform)--the amino acid substitution found in SUR2A at the corresponding position (Ashfield R, Gribble FM, Ashcroft SJ, Ashcroft FM. Identification of the high-affinity tolbutamide site on the SUR1 subunit of the K<sub>ATP</sub> channel. Diabetes. 1999 Jun;48(6):1341-7, PMID: 10342826). This suggests that tolbutamide’s cardiotoxic effects might involve other targets like the hERG channel. Interestingly, tolbutamide contains a hydrophobic tail and aromatic rings that align well with the structural features for hERG interaction (Garrido A, Lepailleur A, Mignani SM, Dallemagne P, Rochais C. hERG toxicity assessment: Useful guidelines for drug design. Eur J Med Chem. 2020 Jun 1;195:112290, PMID: 32283295). In contrast, high-affinity sulfonylureas such as glibenclamide and glimepiride, which have additional benzamide moieties, are associated with lower cardiovascular risks (Douros A, Yin H, Yu OHY, Filion KB, Azoulay L, Suissa S. Pharmacologic Differences of Sulfonylureas and the Risk of Adverse Cardiovascular and Hypoglycemic Events. Diabetes Care. 2017, 40:1506-1513, PMID: 28864502). Given these considerations, a comprehensive assessment of Aekatperone’s potential cardiotoxicity is crucial. Future studies involving in silico modeling, in vitro, and in vivo experiments will be essential to evaluate Aekatperone’s interaction with hERG and other off-target effects. These efforts will help clarify its safety profile. This point has now been added to the Discussion.

      Reviewer #2 (Public review):

      Summary:

      In their study 'AI-Based Discovery and CryoEM Structural Elucidation of a KATP Channel Pharmacochaperone', ElSheikh and colleagues undertake a computational screening approach to identify candidate drugs that may bind to an identified binding pocket in the SUR1 subunit of KATP channels. Other KATP channel inhibitors such as glibenclamide have been previously shown to bind in this pocket, and in addition to inhibiting KATP channel function, these inhibitors can very effectively rescue cell surface expression of trafficking-deficient KATP mutations that cause excessive insulin secretion (Congenital Hyperinsulinism). However, a challenge for their utility for treatment of hyperinsulinism has been that they are powerful inhibitors of the channels that are rescued to the cell surface. In contrast, successful therapeutic pharmacochaperones (e.g. CFTR chaperones) permit function of the channels rescued to the cell membrane. Thus, a key criterion for the authors' approach in this case was to identify relatively low-affinity compounds that target the glibenclamide binding site (and can be washed off) - these could potentially rescue KATP surface expression, but also permit KATP function.

      Strengths:

      The main findings of the manuscript include:

      (1) Computational screening of a large virtual compound library, followed by functional screening of cell surface expression, which identified several potential candidate pharmacochaperones that target the glibenclamide binding site.

      (2) Prioritization and functional characterization of Aekatperone as a low affinity KATP inhibitor which can be readily 'washed off' in patch clamp, and cell based efflux assays. Thus the drug clearly rescues cell surface expression, but can be manipulated experimentally to permit function of rescued channels.

      (3) Determination of the binding site and dynamics of this candidate drug by cryo-EM, and functional validation of several residues involved in drug sensitivity using mutagenesis and patch clamp.

      The experiments are well-conceived and executed, and the study is clearly described. The results of the experiments are very straightforward and clearly support the conclusions drawn by the authors. I found the study to provide important new information about KATP chaperone effects of certain drugs, with interesting considerations in terms of ion channel biology and human disease.

      Weaknesses:

      I don't have any major criticisms of the study as described, but I had some remaining questions that could be addressed in a revision.

      (1) The chaperones can effectively rescue KATP trafficking mutants, but clearly not as strongly as the higher affinity inhibitor glibenclamide. Is this relationship between inhibitory potency and efficacy of trafficking an intrinsic challenge of the approach? I suspect that it may be an intractable problem in the sense that the inhibitor-bound conformation that underlies the chaperone effect cannot be uncoupled from the inhibited gating state. But this might not be true (many partial agonist drugs with low efficacy can be strongly potent, for example). In this case, the approach is really to find a 'happy medium' of a drug that is a weak enough inhibitor to be washed away, but still strong enough to exert some satisfactory chaperone effect. Could some additional clarity be added in the discussion on whether the chaperone and gating effects can be 'uncoupled'?

      Thank you for the suggestion. A similar question was raised by Reviewer 1, which was addressed above (public review, point 2). We have now added more discussion to clarify this point.

      (2) Based on the western blots in Figure 2B, the rescue of cell surface expression appears to require a higher concentration of AKP compared to the concentration response of channel inhibition (~9 microM in Figure 3, perhaps even more potent in patch clamp in Figure 2C). Could the authors clarify/quantify the concentration response for trafficking rescue?

      Thank you for bringing up this observation. Indeed, the pharmacochaperone effects of Aekatperone as well as other previously published K<sub>ATP</sub> pharmacochaperones require higher concentrations compared to their inhibitory effects on surface-expressed channels. This difference likely stems from the necessity for these compounds to cross the cell membrane and interact with newly synthesized channels in the endoplasmic reticulum, where the trafficking rescue occurs. We estimate that effective pharmacochaperone activity for Aekatperone can be achieved at concentrations ranging from 50 to 100 µM in cells expressing trafficking-deficient K<sub>ATP</sub> channel mutants, higher than that required for inhibition of surface-expressed channels (~9 µM IC50). Future work could focus on medicinal chemistry modifications, for example esterification of Aekatperone (Zhou G. Exploring Ester Prodrugs: A Comprehensive Review of Approaches, Applications, and Methods. Pharmacology & Pharmacy, 2024, 15, 269-284). Once inside the cell, the esters would be cleaved by endogenous esterases to release the active compound, ensuring efficient intracellular delivery. This strategy could potentially improve membrane permeability and bioavailability of the compound, which would lower the required concentrations to achieve desired chaperoning effects.
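      To put the two concentration ranges side by side, the following back-of-the-envelope sketch estimates how strongly surface channels would be inhibited at the concentrations used for trafficking rescue. It assumes a simple Hill relationship with a Hill coefficient of 1, which is an illustrative assumption and not a value reported here; only the ~9 µM IC50 and the 50 to 100 µM range come from the response above. The function name `fraction_inhibited` is hypothetical.

```python
def fraction_inhibited(conc_um: float, ic50_um: float = 9.0, hill: float = 1.0) -> float:
    """Fraction of channel activity inhibited at a given concentration (Hill equation).

    ic50_um is the approximate IC50 quoted in the text; hill = 1.0 is an assumption.
    """
    if conc_um < 0:
        raise ValueError("concentration must be non-negative")
    return conc_um**hill / (ic50_um**hill + conc_um**hill)


# Compare the IC50 with the concentrations estimated for chaperone activity.
for conc in (9.0, 50.0, 100.0):
    print(f"{conc:6.1f} uM -> {fraction_inhibited(conc):.2f} of current inhibited")
```

      Under these assumptions, roughly 85% and 92% of surface channel activity would be inhibited at 50 and 100 µM, respectively, which is consistent with the need to wash out the compound before assessing the function of rescued channels.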

      (3) A future challenge in the application of pharmacochaperones of this type in hyperinsulinism may be the manipulation of chaperone concentration in order to permit function. In experiments it is straightforward to wash off the chaperone, but this would not be the case in an organism. I wondered if the authors had attempted to rescue channel function with diazoxide in the presence of AKP, rather than after washing off (i.e. is AKP inhibition insurmountable, or can it be overcome by sufficient diazoxide?).

      Thank you for raising this important point. We have previously shown (Martin GM et al. Pharmacological Correction of Trafficking Defects in ATP-sensitive Potassium Channels Caused by Sulfonylurea Receptor 1 Mutations. J Biol Chem. 2016, 291: 21971-21983, PMID: 27573238) that diazoxide, which stabilizes K<sub>ATP</sub> channels in an open conformation, also reduces physical association between Kir6.2 N-terminus and SUR1 as demonstrated by reduced crosslinking of engineered azido-phenylalanine (an unnatural amino acid) at Kir6.2 N-terminal amino acid 12 position to SUR1. Incubating cells with diazoxide did not rescue the trafficking mutants but actually further reduced the maturation efficiency of trafficking mutants. For this reason, we did not include diazoxide during Aekatperone incubation and instead added diazoxide after Aekatperone washout to potentiate the activity of mutant channels rescued to the cell surface. In vivo, we envision testing alternating Aekatperone and diazoxide dosing to maximize functional rescue of K<sub>ATP</sub> trafficking mutants.

      (4) Do the authors have any information about the turnover time of KATP after washoff of the chaperone (how stable are the rescued channels at the cell surface)? This is a difficult question to probe when glibenclamide is used as a chaperone, but maybe much simpler to address with a lower affinity chaperone like AKP.

      Thank you for your thoughtful comment. While we have not yet tested the duration of rescued K<sub>ATP</sub> channels at the cell surface following Aekatperone washout, we have conducted similar studies with carbamazepine (Chen PC et al. Carbamazepine as a novel small molecule corrector of trafficking-impaired ATP-sensitive potassium channels identified in congenital hyperinsulinism. J Biol Chem. 2013, 288: 20942-20954, PMID: 23744072), another compound exhibiting reversible inhibitory and chaperone effects (apparent affinity between glibenclamide and Aekatperone). Our previous findings with carbamazepine showed that in cultured cells its chaperone effects were detectable as early as 1 hour and peaked around 6 hours after treatment. Furthermore, when carbamazepine was removed following a 16-hour treatment, the rescue effect persisted for up to 6 hours post-drug removal. These results provide a potential duration of the surface expression rescue effects of reversible pharmacochaperones.

      Reviewer #1 (Recommendations for the authors):

      The paper is well-written and comprehensive with only very minor essentially copy-editing needed. That said, it would be good if the authors could answer the main points raised above:

      (1) What are the relevant Tanimoto parameters and sequence identity (does this mean structural identity?) for the identified compounds?

      As we answered above in response to the overall assessment, to facilitate the identification of novel hits, molecules with greater than 0.5 Tanimoto similarity in ECFP4 space to any known binders of the target protein and its homologs within 70% amino acid sequence identity were excluded from the commercial library. Additionally, after scoring and ranking the molecules by the AtomNet® technology, a diversity clustering was performed on the top 30,000 molecules using the Butina algorithm with a Tanimoto similarity cutoff of 0.35 in ECFP4 space to minimize selection of structurally similar scaffolds for the final compound buy-list.
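      The diversity clustering step can be sketched in a few lines. This is a simplified, pure-Python variant of Butina-style clustering and not the implementation used in the study: it assumes fingerprints are sets of on-bit indices, treats two molecules as neighbors when their Tanimoto similarity is at least the cutoff (one possible reading of the 0.35 cutoff), and re-counts unassigned neighbors each round. The function names and toy fingerprints are hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def butina_like_clusters(fps: list, sim_cutoff: float = 0.35) -> list:
    """Greedy clustering: repeatedly take the molecule with the most unassigned
    neighbors as a centroid and group it with those neighbors."""
    n = len(fps)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if tanimoto(fps[i], fps[j]) >= sim_cutoff:
                neighbors[i].add(j)
                neighbors[j].add(i)

    unassigned = set(range(n))
    clusters = []
    while unassigned:
        # Ties are broken in favor of the lowest index.
        centroid = max(sorted(unassigned), key=lambda i: len(neighbors[i] & unassigned))
        members = [centroid] + sorted(neighbors[centroid] & unassigned)
        clusters.append(members)
        unassigned -= set(members)
    return clusters


# Toy fingerprints (hypothetical bit sets, not real ECFP4 fingerprints).
fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({1, 2, 4, 6}),
    frozenset({10, 11, 12}),
    frozenset({10, 11, 13}),
    frozenset({20, 21}),
]
clusters = butina_like_clusters(fps, sim_cutoff=0.35)
representatives = [cluster[0] for cluster in clusters]  # one pick per cluster
```

      Taking one representative per cluster, as in the last line, is one simple way to avoid buying several near-identical scaffolds. The actual selection rule used for the compound buy-list is not specified here.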

      (2) Did any of the identified putative binders have inhibitory action on channels? If so, does this give clues to separating chaperoning from inhibitory effects?

      Please see response to the same question in the overall assessment above.

      (3) Acknowledge that the identified compounds contain sulfonylurea-like moieties, and address why Aekatperone should (or perhaps does not) offer any advantage over low-affinity sulfonylureas such as tolbutamide.

      Please see response to the same question in the overall assessment above.

      Reviewer #2 (Recommendations for the authors):

      Thank you for assembling the interesting study, which I felt was well designed and communicated. The diverse approaches used in the study, with consistent findings, were definitely a strength. The core findings are also well distilled in the main body of the text, and although there is quite a lot of supplementary information, I felt that it was presented appropriately and well selected in terms of what would be important for readers hoping to learn more. In addition to the questions described above, I only had a few minor editorial issues that could be fixed related to presentation.

      (1) Figure 1B. The colours and resolution of the chemical structures are difficult to see clearly and could be improved.

      We have revised the figure accordingly.

      (2) This is a minor wording point... first sentence of the discussion describes the drugs as pancreatic-selective, when it would be clearer to describe them as selective for the pancreatic isoform of KATP (Kir6.2/SUR1), or perhaps better as 'exhibiting ~4-5 fold selectivity for SUR1-containing KATP channels vs. SUR2A or SUR2B'.

      We have changed the wording as suggested.

      (3) As a curiosity (not necessary to do more experiments), but I am curious if the authors know whether there is any meaningful enhancement of trafficking of WT channels by AKP.

      All pharmacochaperones we have identified to date including Aekatperone also slightly enhance WT channel surface expression (10-20%).

      Reviewing editor recommendations:

      (1) Given the modest resolution of the EM reconstruction, it is perhaps not entirely clear how AKP was assigned to the density observed. Specifically, it would be helpful to include a comparison of an AKP-free map and the current AKP map (filtered to a similar resolution) showing slice views of densities in the region around the inferred binding site. This would be very helpful in ascertaining whether the cryoEM reconstruction is an independent validation of the computational and functional experiments or whether the density inference depends on the additional knowledge.

      We appreciate the editor’s suggestion. We have now added a Supplemental Figure (Supplementary Figure 7 in the revised manuscript) that compares our AKP-free cryoEM density deposited previously to the EMDB (EMD-26320) and the AKP-bound cryoEM density from this study, with cryoEM density (filtered to the same resolution) superimposed on the structural model.

      (2) It could help to mention in brief what is a probable mechanism of AKP inhibition - that is how after binding of AKP, channel opening is restricted. Is it similar to that of other site A ligands?

      Based on the strong Kir6.2 N-terminal cryoEM density observed in our AKP map, AKP most likely inhibits K<sub>ATP</sub> channels by trapping the Kir6.2 N-terminus in the central cavity of SUR1’s ABC core thus preventing Kir6.2-C-terminal domain from rotating to an open conformation, similar to other ligands that stabilize the Kir6.2 N-terminus-SUR1 interface by binding to site A (such as tolbutamide and AKP), site B (such as repaglinide), or both site A and site B (such as glibenclamide). We have now included this in the revised Results and Discussion sections.

      (3) In the context of the MD simulations, do other site A ligands (which from my understanding bind at a similar site) also exhibit similar flexibility as AKP? If there is information available on the flexibility of ligands of varying affinities, bound to the same site, maybe some correlative inferences can be drawn? However, in MD simulation trajectories it is not entirely uncommon for a ligand to simply get trapped in a local energy well. Since the authors have performed significant analysis of their MD results it could be worth mentioning/discussing such phenomena.

      Previously published MD data addressing ligand dynamics, such as glibenclamide in the SUR1 pocket (Walczewska-Szewc K, Nowak W. Photo-Switchable Sulfonylureas Binding to ATP-Sensitive Potassium Channel Reveal the Mechanism of Light-Controlled Insulin Release. J Phys Chem B. 2021, 125: 13111-13121, PMID: 34825567), indicate a certain degree of flexibility. Unfortunately, we cannot directly compare these results, as the simulations were performed without the KNtp domain in the SUR1 cavity, which partially contributes to ligand stabilization. This is an issue we plan to investigate in the future.

      In this study, we ran five independent MD simulations, each 500 ns long, resulting in a total of 2.5 μs of simulation time. Across all replicates, the ligand stayed in the same position, with variations mainly in the dynamics of the blurred segment. Considering the length of the simulations and the consistency across the runs, we believe this binding pose is stable and represents a global (or at least highly stable) energy minimum, consistent with the cryo-EM data.

      (4) In electrophysiological assays, 10 uM AKP seems to inhibit all currents (Figure 2), but in the Rb+ flux assay ~10 uM appears to be the IC50. The reason for this difference is not entirely clear and it would help to comment on this.

      Thank you for noticing the difference. The initial electrophysiological experiments were conducted using the very small amount of AKP provided to us from Atomwise. We estimated the concentration of the reconstituted AKP the best we could, but the concentration was likely to not be very accurate due to difficulty in handling the very small amount of the AKP powder. Subsequent Rb<sup>+</sup> efflux experiments were conducted using a different, larger batch of AKP we purchased from Enamine. We have now stated this in the Methods section.

    1. Author response:

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary:

      In this manuscript, Hammond et al. study robustness of the vertebrate segmentation clock against morphogenetic processes such as cell ingression, cell movement and cell division to ask whether the segmentation clock and morphogenesis are modular or not. The modularity of these two would be important for evolvability of the segmenting system. The authors adopt a previously proposed 3D model of the presomitic mesoderm (Uriu et al. 2021 eLife) and include new elements; different types of cell ingression, tissue compaction and cell cycles. Based on the results of numerical simulations that synchrony of the segmentation clock is robust, the authors conclude that there is a modularity in the segmentation clock and morphogenetic processes. The presented results support the conclusion. The manuscript is clearly written. I have several comments that could help the authors further strengthen their arguments.

      Major comment: 

      [Optional] In both the current model and Uriu et al. 2021, coupling delay in phase oscillator model is not considered. Given that several previous studies (e.g. Lewis 2003, Herrgen et al. 2010, Yoshioka-Kobayashi et al. 2020) suggested the presence of coupling delays in Delta-Notch signaling, could the authors analyze the effect of coupling delay on robustness of the segmentation clock against morphogenetic processes?

      We thank the reviewer for the suggestion. Owing to the computational demands of including such a delay in the model, we cannot feasibly repeat every simulation analysed here in the presence of delay, and would like to note that the increased computational demand that delays put on the simulations is also the reason why Uriu et al. 2021 did not include it, as stated in their published exchange with reviewers. However, analogous to our analysis in figure 7, we can analyse how varying the position of progenitor cell ingression affects synchrony in the presence of the coupling delay measured in zebrafish by Herrgen et al. (2010). We show this analysis in a new figure 8 (8B, specifically), on page 21, and discuss its implications in the text on pages 20-22. Our analysis reveals that the model cannot recover synchrony using the default parameters used by Uriu et al. (2021) and reveals a much stronger dependence on the rate of cell mixing (vs) than shown in the instantaneous coupling case (cf. figure 7). However, by systematically varying the value of the delay we find that a relatively minor increase in the delay is sufficient to recover synchrony using the parameter set of Uriu et al. (see figure 8C). Repeating this across the three scenarios of cell ingression we see that the combination of coupling strength and delay determines the robustness of synchrony to varying position of cell ingression. This suggests that the combination of these two parameters constrains the evolution of morphogenesis.
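
      To make concrete what a coupling delay adds to a phase oscillator description, the sketch below integrates a small, static, all-to-all coupled population in which each oscillator responds to its neighbours' phases from a time tau in the past. It is a deliberately reduced toy and not the three-dimensional model analysed in the manuscript: there is no cell movement, ingression or frequency profile, and the values of omega, kappa, tau, dt and the population size are arbitrary assumptions. The history buffer also shows why delays raise the computational cost, since past states must be stored and looked up at every step.

```python
import cmath
import math
import random
from collections import deque


def order_parameter(phases):
    """Kuramoto order parameter: 1 for identical phases, near 0 for scattered ones."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def simulate_delayed_coupling(n_cells=20, omega=1.0, kappa=0.5, tau=0.0,
                              dt=0.01, t_end=100.0, seed=0):
    """Euler integration of identical phase oscillators with delayed coupling.

    d(theta_i)/dt = omega + (kappa / N) * sum_j sin(theta_j(t - tau) - theta_i(t))
    Returns the order parameter at t_end.
    """
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_cells)]
    delay_steps = int(round(tau / dt))
    # history[0] always holds the state delay_steps steps in the past;
    # before t = 0 the history is taken to be the initial condition.
    history = deque([list(phases)] * (delay_steps + 1), maxlen=delay_steps + 1)
    for _ in range(int(round(t_end / dt))):
        delayed = history[0]
        sin_sum = sum(math.sin(p) for p in delayed)
        cos_sum = sum(math.cos(p) for p in delayed)
        updated = []
        for theta in phases:
            # sum_j sin(a_j - theta) = cos(theta) * sum sin(a_j) - sin(theta) * sum cos(a_j)
            coupling = (sin_sum * math.cos(theta) - cos_sum * math.sin(theta)) / n_cells
            updated.append(theta + dt * (omega + kappa * coupling))
        phases = updated
        history.append(list(phases))
    return order_parameter(phases)


r_instant = simulate_delayed_coupling(tau=0.0)
r_delayed = simulate_delayed_coupling(tau=0.5)
```

      Scanning tau and kappa together in such a toy gives a feel for how the two parameters interact, but any quantitative statement requires the full model.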

      Minor comments: 

      -  PSM radius and oscillation synchrony are both denoted by the same alphabet r. The authors should use different alphabets for these two to avoid confusion.

      We thank the reviewer for spotting this. This has now been changed throughout to rT, as shorthand for ‘radius of tissue’.

      -  page 5 Figure 1 caption: (x-x_a/L) should be (x-x_a)/L.

      We thank the reviewer for spotting this. This has now been corrected.

      -  Figure 3C: Description of black crosses in the panels is required in the figure legend.

      Thank you for spotting this. The legend has now been corrected.

      -  Figure 3C another comment: In this panel, synchrony r at the anterior PSM is shown. It is true that synchrony at anterior PSM is most relevant for normal segment formation. However, in this case, the mobility profile is changed, so it may be appropriate to show how synchrony at mid and posterior PSM would depend on changes in mobility profile. Is synchrony improved by cell mobility at the region where cell ingression happens?

      We thank the reviewer for the suggestion. We have now plotted the synchrony along the AP axis for varying motility profiles, and this can be seen in figure 3 supplement 1, and is briefly discussed in the text on page 11. We show that while the synchrony varies with x-position (as already expected, see figure 2), there is no trend associated with the shape of the motility profile.

      -  In page 12, the authors state that "the results for the DP and DP+LV cases are exactly equal for L = 185 um, as .... and the two ingression methods are numerically equivalent in the model". I understood that in this case two ingression methods are equivalent, but I do not understand why the results are "exactly" equal, given the presence of stochasticity in the model.

      These results can be exactly equal despite the simulations being stochastic because they were both initialised using the same ‘seed’ in the source code. However, we now see that this might be confusing to the reader, and we have re-generated this figure but this time initialising the simulations for each ingression scenario using a different seed value. This is now reflected in the text on page 12 and in figure 4.
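
      The effect of sharing a seed can be illustrated with a generic stochastic process. The function below is a hypothetical stand-in (a simple random walk), not the simulation code from the repository: two runs with the same seed draw the same random numbers and agree exactly, while a different seed gives a different trajectory.

```python
import random


def noisy_run(seed, n_steps=1000, noise=0.1):
    """Sum of Gaussian increments drawn from a generator with a fixed seed."""
    rng = random.Random(seed)
    x = 0.0
    for _ in range(n_steps):
        x += rng.gauss(0.0, noise)
    return x


same_seed_a = noisy_run(seed=1)
same_seed_b = noisy_run(seed=1)
different_seed = noisy_run(seed=2)
```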

      -  The authors analyze the effect of cell density on oscillation synchrony in Fig. 4 and they mention that higher density increases robustness of the clock by increasing the average number of interacting neighbours. I think it would be helpful to plot the average number of neighbouring cells in simulations as a function of density to quantitatively support the claim.

      We thank the reviewer for their suggestion. Distributions of neighbour numbers for exemplar simulations with varying density can now be found in figure 4 supplement 1 and are referred to in the text on page 11.

      -  The authors analyze the effect of PSM length on synchrony in Fig. 4. I think kymographs of synchrony r as shown in Fig. 2D would also be helpful to show that indeed cells get synchronized while advecting through a longer PSM.

      We thank the reviewer for their suggestion and agree that visualising the data in this way is an excellent idea. We have generated the suggested kymographs and added them to figure 4 as supplements 2 and 4, and discussed these results in the text on page 12.

      -  I understand that cells in M phase can interact with neighboring cells with the same coupling strength kappa in the model, although their clocks are arrested. If so, this aspect should be also mentioned in the main text in page 16, as this coupling can be another noise source for synchrony.

      We agree this is an important clarification. We explicitly state this, and briefly justify our choice, in the text on page 16.

      -  Figure 5-figure supplement 2: panel labels A, B, C are missing. 

      Thank you for bringing this to our attention. These have now been added.

      -  Figure 5-figure supplement 3: panel labels A, B, C are missing.

      Thank you for bringing this to our attention. These have now been added.

      Reviewer #1 (Significance):

      Synchronization of the segmentation clock has been studied by mathematical modeling, but most previous studies considered cells in a static tissue without morphogenesis. In the previous study by Uriu et al. 2021, morphogenetic processes such as cell advection due to tissue elongation, tissue shortening, and cell mobility were considered in synchronization. The current manuscript provides methodological advances in this aspect by newly including cell ingression, tissue compaction and cell cycle. In addition, the authors bring a concept of modularity and evolvability to the field of the vertebrate segmentation clock, which is new. On the other hand, the manuscript confirms that the synchronization of the segmentation clock is robust by careful simulations, but it does not propose or reveal new mechanisms for making it robust or modular. The main targets of the manuscript will be researchers working on somitogenesis and evolutionary biologists who are interested in evolution of developmental systems. The manuscript will also be of interest to broader audiences, like developmental biologists, biophysicists, and physicists and computer scientists who are working on dynamical systems.

      We thank the reviewer for their interest in our manuscript and for acknowledging us as one of the first to address the modularity and evolvability of somitogenesis. We hope that this work will encourage others to think about these concepts in this system too.  

      In the original submission, we identified a high enough coupling strength as the main mechanism underlying the identified modularity in somitogenesis. Since then, we have included an analysis of the coupling delay and find that it is the interplay between coupling strength and coupling delay that mediates the identified modularity, allowing PSM morphogenesis and the segmentation clock to evolve independently in regions of parameter space that are constrained and determined by the interplay between these two parameters. We have now added an extra figure (figure 8) where we explore this interplay and have discussed it at length in the last section of the results and in the discussion. We again thank the reviewer for encouraging us to include delays in our analysis.

      Reviewer #2 (Evidence, reproducibility and clarity):

      SUMMARY 

      The manuscript from Hammond et al., investigates the modularity of the segmentation clock and morphogenesis in early vertebrate development, focusing on how these processes might independently evolve to influence the diversity of segment numbers across vertebrates.

      Methodology: The study uses a previously published computational model, parameterized for zebrafish, to simulate and analyse the interactions between the segmentation clock and the morphogenesis of the pre-somitic mesoderm (PSM). Their model integrates cell advection, motility, compaction, cell division, and the synchronization of the embryo clock. Three alternative scenarios of PSM morphogenesis were modeled to examine how these changes affect the segmentation clock.

      Model System: The computational model system combines a representation of cell movements and the phase oscillator dynamics of the segmentation clock within a three-dimensional horseshoe-shaped domain mimicking the geometry of the vertebrate embryo PSM. The parameters used for the mathematical model are mostly estimated from previously published experimental findings.

      Key Findings and Conclusions: (1) The segmentation clock was found to be broadly robust against variations in morphogenetic processes such as cell ingression and motility; (2) Changes in the length of the PSM and the strength of phase coupling within the clock significantly influenced the system's robustness; (3) The authors conclude that the segmentation clock and PSM morphogenesis exhibited developmental modularity (i.e. relative independence), allowing these two phenomena to evolve independently, and therefore possibly contributing to the diverse segment numbers observed in vertebrates.

      MAJOR COMMENTS

      (1) The key conclusion drawn by the authors (that there is robustness, and therefore modularity, between the morphogenetic cellular processes modeled and the embryo clock synchronization) stems directly from the modeling results appropriately presented and discussed in the manuscript. The model comprises some strong assumptions, however all have been clearly explained and the parameterization choices are supported by experimental findings, providing biological meaning to the model. Estimated parameters are well explained and seem reasonable assumptions (from the embryology perspective).

      We thank the reviewer for their positive comments about our work.

      (2) This study, as is, achieves its proposed goal of evaluating the potential robustness of the embryo clock to changes in (some) morphogenetic processes. The authors do not claim that the model used is complete, and they properly identify some limitations, including the lack of cell-cell interactions. Given the recognized importance of cellular physical interactions for successful embryo development, including them in the model would be a significant addition in future studies.

      We would like to clarify that the model does include cell-cell interactions as cells interact with their neighbours’ clock phase to synchronise and to avoid occupying the same physical space. 

      (3) The authors have deposited all the code used for analysis in a public GitHub repository that is updated and available for the research community.

      We support open source coding practices.

      (4) On page 6, the authors justify their choice of clock parameters for cells ingressing the PSM: "As ingressing cells do not appear to express segmentation clock genes (Mara et al. (2007)), the position at which cells ingress into the PSM can create challenges for clock patterning, as only in the 'off' phase of the clock will ingressing cells be in-phase with their neighbours."  However, there are several lines of evidence (in chick and mouse) that some oscillatory clock genes are already being expressed as early as in the gastrulation phase (so prior to PSM ingression) (Freitas et al., 2001 [10.1242/dev.128.24.5139]; Jouve et al., 2002 [10.1242/dev.129.5.1107]; Maia-Fernandes et al., 2024 [10.1371/journal.pone.0297853]). Question: Is this also true in zebrafish? (I.e. is there any recent experimental evidence that the clock genes are not expressed at ingression, since the paper cited to support this assumption is from 2007). If they are expressed in zebrafish (as they are in mouse and chick), then the cell addition should have random clock gene periods when they enter the PSM and not start all with a constant initial phase of zero. Probably this will not impact the results since the cells will also be out of phase with their neighbours when they "ingress", however, it will model more closely the biological scenario (and avoid such criticism).

      We thank the reviewer for their comments. While it is known that in zebrafish the clock begins oscillating during epiboly and before the onset of segmentation (Riedel-Kruse et al., 2007), to our knowledge no-one has examined whether posteriorly or laterally ingressing progenitor cells express clock genes prior to their ingression into the PSM, which occurs later in development than the first oscillations which give rise to the first somites. We have not found any published evidence of her/hes gene expression in the dorsal donor tissues or lateral tissues surrounding the PSM, however we acknowledge that this has not been actively studied before and our assumption relies on an absence of evidence, rather than evidence of absence. 

      However, we agree with the reviewer that one should include such an analysis for completeness, and we have now generated additional simulations where progenitor cells ingress with a random clock phase. This data is presented in figure 2 supplement 1 and mentioned in the main text on page 9.

      MINOR COMMENTS 

      (1) The citations are appropriate and cover the major labs that have published work related to this study (although with some overrepresentation of the lab that published the model used).

      We have cited the vast literature on somitogenesis to the best of our ability and do recognise that the work of the Oates lab appears prominently, but this is probably because their experimental data were originally used to parametrise the model in Uriu et al. 2021.

      (2) The text is clear, carefully written, and both the methods and the reasoning behind them are clearly explained and supported by proper citations.

      We are very glad to see that the reviewer found that the manuscript was clearly presented.

      (3) The figures are comprehensive, properly annotated, with explanatory self-contained legends. I have no comments regarding the presentation of the results.

      Thank you.

      (4) Minor suggestions: 

      a. Page 26: In the Cell addition sub-section of the Methods section, correct all instances where the word domain is used, but subdomain should be used (for clarity and coherence with the description of the model, stated as having a single domain comprising 3 subdomains).

      We thank the reviewer for raising this, this is a good point. We have now corrected to ‘subdomain’ where appropriate.

      b. Page 32: Table 1. Parameter values used in our work, unless otherwise stated -> Suggestion: Add a column with the individual citations used for each parameter (to facilitate the confirmation of each corresponding reference).

      Thank you for the suggestion, we have now done this (see table 1 page 36).

      Reviewer #2 (Significance):

      GENERAL ASSESSMENT 

      This study uses a previously published model to simulate alternative scenarios of morphogenetic parameters to infer the potential independence (termed here modularity) between the segmentation clock and a set of morphogenetic processes, arguing that such modularity could allow the evolution of more flexible body plans, therefore partially explaining the variability in the number of segments observed in the vertebrates. This question is fundamental and relevant, yet still poorly researched. This work provides a comprehensive simulation with a model that tries to simplify the many morphogenetic processes described in the literature, reducing it to a few core fundamental processes that allow drawing the conclusions sought. It provides theoretical insight to support a conceptual advance in the field of evolutionary vertebrate embryology.

      ADVANCE

      This study builds on a model recently published by Uriu et al. (eLife, 2021) that incorporates quantitative experimental data within a modeling framework including cell and tissue-level parameters, allowing the study of multiscale phenomena active during zebrafish embryo segmentation. Uriu's publication reports many relevant and often non-intuitive insights uncovered by the model, most notably the description of phase vortices formed by the synchronizing genetic oscillators interfering with the traveling-wave front pattern.  However, this model can be further explored to ask additional questions beyond those described in the original paper. A good example is the present study, which uses this mathematical framework to investigate the potential independence between two of the modeled processes, thereby extracting extra knowledge from it. Accordingly, the present study represents a step forward in the direction of using relevant theoretical frameworks to quantitatively explore the landscape of complex molecular hypotheses in silico, and with it shed some light on fundamental open questions or inform the design of future experiments in the lab.

      The study incorporates a wide range of existing literature on the developmental biology of vertebrates. It comprehensively cites prior work, such as the foundational studies by Cooke and Zeeman on the segmentation clock and the role of FGF signaling in PSM development as discussed by Gomez et al. The literature properly covers the breadth of knowledge in this field.

      AUDIENCE

      Target audience | This study is relevant for fundamental research in developmental biology, specifically targeting researchers who focus on early embryo development and morphogenesis from both experimental and theoretical perspectives. It is also relevant for evolutionary biologists investigating the genetic factors that influence vertebrate evolution, as well as to computational biologists and bioinformatics researchers studying developmental processes and embryology.

      Developmental researchers studying the segmentation clock in other vertebrate model organisms (namely mouse and chick), will find this publication especially valuable since it provides insights that can help them formulate new hypotheses to elucidate the molecular mechanisms of the clock (for example finding a set of evolutionarily divergent genes that might interfere with PSM length). Additionally, this study provides a set of cellular parameters that have yet to be measured in mouse and chick, therefore guiding the design of future experiments to measure them, allowing the simulation of the same model with sets of parameters from different vertebrate model organisms, therefore testing the robustness of the findings reported for zebrafish.

      Reviewer #3 (Evidence, reproducibility and clarity): 

      In this manuscript, Verd and colleagues explored how various biologically relevant factors influence the robustness of clock dynamics synchronization among neighboring cells within the context of somitogenesis, adapting a mathematical model presented by Uriu et al. in 2021 in a similar context. Specifically they show that clock dynamics is robust to different biological mechanisms such as cell infusion, cellular motility, compaction-extension and cell-division. On the other hand, the length of Presomitic Mesoderm (PSM) and density of cells in it have a significant role in the robustness of clock dynamics. While the manuscript is well-written and provides clear descriptions of methods and technical details, it tends to be somewhat lengthy.

      Below are the comments I would like the authors to address:

      (1) The authors mention that "...the model is three dimensional and so can quantitatively recapture the rates of cell mixing that we observe in the PSM". I am not convinced with this justification of using a 3D model. None of the effects the authors explore in this manuscript requires a three dimensional model or full physical description of the cellular mechanics such as excluded volume interaction etc. A one-dimensional model characterized by cell position along the arclength of PSM and somatic region and segmentation clock phase θ can incorporate all the physics authors described in this manuscript, as well as being significantly cheaper computationally, allowing the authors to explore the effect of different parameters in greater detail.

      One of the main objectives of the work we present in this manuscript is to assess how the evolution of PSM morphogenesis affects, or does not affect, segment patterning. The PSM is a three-dimensional tissue with differing cell rearrangement dynamics along its anterior-posterior axis. In addition, PSM dimension, density, the rearrangement rate, and patterns of cell ingression all vary across vertebrate species, and they are functional, especially cell mixing as it promotes synchronisation and drives elongation. In order to answer questions on the modularity of somitogenesis we therefore consider it absolutely necessary to include a three-dimensional representation of the PSM that captures single cells and their movements. In addition, this will allow us, as Reviewer #2 also pointed out, to reparametrize our model using species-specific data as it becomes available. 

      While the reviewer is right in that lower dimensional representations would be computationally more efficient, and are generally more tractable, it would not be possible to represent cell mixing in one dimension, as this happens in three dimensions. One could perhaps encode the synchrony-promoting effect of cell mixing via some coupling function κ(x) that increases towards the posterior, however it is unclear what existing biological data one could use to parameterise this function or determine its form. Cell mixing can be modelled in a two-dimensional framework, however this cannot quantitatively recapture the rate of cell mixing observed in vivo, which is an advantage of this model. 

      Furthermore, it is unclear how one would simulate processes such as compaction-extension using a one-dimensional model. The two different scenarios of cell ingression which we consider can also not be replicated in a one-dimensional model, as having a population of cells re-acquiring synchrony on the dorsal surface of the tissue while new material is added to the ventral side, creating asynchrony, is qualitatively different than a one-dimensional scenario where cells are introduced continuously along the spatial axis.

      (2) I am not sure about the justification for limiting the quantification of phase synchrony in a very limited (one cell diameter wide) region at one end of the somatic part (Page 33 below Fig. 9). From my understanding of the manuscript, the segments appear in significant length anterior to this region. Wouldn't an ensemble average of multiple such one cell diameter wide regions in the somatic region be a more accurate metric for quantifying synchrony?

      Indeed, such a metric (e.g. as that used by Uriu et al. to quantify synchrony along the x-axis) would be more accurate for determining synchrony within the PSM. However, as per the clock and wavefront model of somitogenesis, only synchrony at the very anterior of the PSM (or at the wavefront, equivalently) is functional for somitogenesis and thus evolution. Therefore, we restrict our analysis to the anterior-most region of the PSM. We now further justify this in the main text on page 9.
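
      A measurement restricted to the anterior-most strip can be sketched as below. The sketch assumes that synchrony is quantified with the Kuramoto order parameter, that the anterior boundary sits at x = x_a with x increasing towards the posterior, and that the strip is one cell diameter wide; the function name, argument names and example values are hypothetical and are not taken from the published code.

```python
import cmath
import math


def anterior_synchrony(positions_x, phases, x_anterior, cell_diameter):
    """Order parameter of cells with x_anterior <= x <= x_anterior + cell_diameter."""
    selected = [
        theta
        for x, theta in zip(positions_x, phases)
        if x_anterior <= x <= x_anterior + cell_diameter
    ]
    if not selected:
        raise ValueError("no cells found in the anterior strip")
    return abs(sum(cmath.exp(1j * theta) for theta in selected)) / len(selected)


# Hypothetical cells: two in-phase cells in the strip, two elsewhere in the tissue.
xs = [0.5, 0.8, 5.0, 9.0]
thetas = [0.0, 0.0, math.pi, math.pi / 2]
r_anterior = anterior_synchrony(xs, thetas, x_anterior=0.0, cell_diameter=1.0)
```

      Averaging the same quantity over several adjacent strips would give the kind of ensemble measure the reviewer proposes, at the cost of mixing in cells that are not yet at the wavefront.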

      (3) While studying the effect of cellular ingression, the authors study three discrete modes- random, DP and DP+LV and show that in the DP+LV mode the clock synchrony becomes affected. I would like the authors to explore this in a continuous fashion from a pure DP ingression to Pure LV ingression and intermediates.

      We thank the reviewer for this suggestion; this is a very interesting question. We are currently working on a related computational and experimental project to address the question of how PSM morphogenesis can change over evolutionary time to evolve the different modes that we see across species. As part of this work, we are running precisely the simulations suggested by the reviewer to find regions of parameter space in which all the relevant morphogenetic processes can freely evolve.  While interesting, this work is however outside the scope of the current manuscript.

      (4) While studying the effect of length and density of cells in PSM on cellular synchrony, the authors restrict to 3 values of density and 6 values of PSM length keeping the other parameter constant. I would be interested to see a phase diagram similar to Fig. 7 in the two-dimensional parameter space of L and ρ0. I am curious if a scaling relation exists for the parameter values that partition the parameter space with and without synchrony.

      We thank the reviewer for their suggestion and agree that this would constitute an interesting addition to the manuscript. We have now generated these data, which are shown in figure 4 supplement 5 and mentioned on page 13. We see no clear relationship between these two variables when co-varying in the presence of random ingression. 

      (5) Both in the abstract and introduction, the authors discuss at great length the variability in the number of segments. I am curious how the number and width of the segments observed depend on different parameters related to cellular mechanics and the segmentation clock?

      We thank the reviewer for this question. It was not clear to us if this was something the reviewer wants us to address in the study’s background and introduction, or an analysis we should include in the results. Therefore, we have responded to both comprehensively below:

      The prevailing conceptual framework for understanding this is the clock and wavefront model (Cooke and Zeeman, 1976), which posits that the somite length is inversely proportional to the frequency of the clock relative to the speed of the wavefront, and that the total number of segments is the relative frequency multiplied by the total duration of somitogenesis.
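
      These two relations can be written compactly. The symbols below are introduced here purely for illustration and are not the manuscript's notation: $S$ is the somite length, $v$ the wavefront speed, $f$ the clock frequency at the wavefront, $T$ the total duration of somitogenesis and $N$ the number of segments.

      $$
      S = \frac{v}{f}, \qquad N = f\,T
      $$

      Under these assumptions, a faster clock at fixed wavefront speed gives shorter somites, and at fixed duration it gives more of them.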

      Experimentally we know that the frequency is determined in part by the coupling strength (Liao, Jorg, and Oates, 2016), and from comparative embryological studies (Gomez et al., 2008; Steventon et al., 2016) we know that changes in the elongation dynamics of the PSM correlate with changes in somite number, presumably by altering the total duration of somitogenesis (Gomez et al., 2009). These changes in elongation are thought to be driven by the changes in cell and tissue mechanics we test in our manuscript. 

      Within our model, we cannot in general predict how the number of segments responds to changes in either clock parameters or cell mechanical parameters, as we lack understanding of what causes somitogenesis to cease; this is thus not encoded in our model and segmentation can in principle proceed indefinitely. Therefore, we have not performed this analysis.

      Similarly, we have not included an analysis of somite length. This is for two reasons: 1) as per the clock and wavefront model, the frequency at the PSM anterior (which we analyse) is equivalent to this measurement, as we assume (in general) the wavefront ($x = x_{a}$) is inertial. 2) the length of the nascent somite is not thought to be of much relevance to the adult phenotype, and by extension evolution. Somites undergo cell division and growth soon after their patterning by the segmentation clock, therefore their final size does not depend strongly on the dynamics of the segmentation clock. Rather, the main function of the clock is to control their number (and polarity).

      (6) The authors assume that the phase dynamics of the chemical network may be described by an oscillator with constant frequency. For the completeness of the manuscript, the authors should discuss in detail for which chemical networks this is a good assumption.

      We thank the reviewer for their suggestion and now justify this assumption in the methods on page 31. 

      Such an assumption is appropriate for the segmentation clock, as the clock in the posterior of the PSM is thought to oscillate with a constant frequency, at least for the majority of somitogenesis, although the frequency of somite formation slows towards the end of this process in zebrafish (Giudicelli et al., 2007, PLoS Biol.). In addition, PSM cells isolated and cultured in the presence of FGF (thus replicating the signalling environment of the posterior PSM) will continue to exhibit her1 oscillations with an apparently constant frequency (Webb et al., 2016).

      We note that such formulations are widely used within the segmentation clock literature (e.g. Riedel-Kruse et al., 2007, Morelli et al., 2009).
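
      For orientation, a generic constant-frequency phase-oscillator formulation of this kind can be sketched as follows. This is an illustrative Kuramoto-type form, not necessarily the exact equation used in the manuscript; $\theta_i$ is the phase of cell $i$, $\omega_0$ the constant intrinsic frequency, $\kappa$ an assumed coupling strength and $n_i$ the number of neighbours of cell $i$. Published models may add noise, coupling delays or position-dependent frequencies.

      $$
      \frac{d\theta_i}{dt} = \omega_0 + \frac{\kappa}{n_i} \sum_{j \in \text{neighbours of } i} \sin\left(\theta_j - \theta_i\right)
      $$

      With $\kappa = 0$ every cell simply advances its phase at the constant rate $\omega_0$, which is the assumption the reviewer asks about.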

      (7) Figure 3 and the associated text show no effect of the cellular motility profile on the synchrony of the segmentation clock. This may be moved to the supplementary material considering the length of this manuscript.

      Thank you for the suggestion. However, we would argue that the lack of effect is a crucial result when discussing modularity. Reviewer #2 agrees with this assessment.

      Reviewer #3 (Significance): 

      The manuscript answers some important questions about the synchrony of the segmentation clock in vertebrates, utilizing a model published earlier. However, the presented result is incomplete in some aspects (points 2 to 5 of section A), and that could be overcome by a more detailed analysis using a simpler one-dimensional model (point 1 of section A). I believe this manuscript could be of interest to an intersecting audience of developmental biologists, systems biologists, and physicists/engineers interested in dynamical systems.

    1. Reviewer #2 (Public review):

      Summary

      The study investigated whether memory retrieval followed soon by extinction training results in a short-term memory deficit when tested - with a reinstatement test that results in recovery from extinction - soon after extinction training. Experiment 1 documents this phenomenon using a between-subjects design. Experiment 2 used a within-subject control and saw that the effect is also observed in a control condition. In addition, it also revealed that if testing is conducted 6 hours after extinction, there is no effect of retrieval prior to extinction, as there is recovery from extinction independently of retrieval prior to extinction. A third group also revealed that retrieval followed by extinction attenuates reinstatement when the test is conducted 24 hours later, consistent with previous literature. Finally, Experiment 3 used continuous theta-burst stimulation of the dorsolateral prefrontal cortex and assessed whether inhibition of that region (vs a control region) reversed the short-term effect revealed in Experiments 1 and 2. The results of control groups in Experiment 3 replicated the previous findings (short-term effect), and the experimental group revealed that these can be reversed by inhibition of the dorsolateral prefrontal cortex.

      Strengths

      The work is performed using standard procedures (fear conditioning and continuous theta-burst stimulation) and there is some justification of the sample sizes. The results replicate previous findings - some of which have been difficult to replicate and this needs to be acknowledged - and suggest that the effect can also be observed in a short-term reinstatement test.

      The study establishes links between the memory reconsolidation and retrieval-induced forgetting (or memory suppression) literatures. The explanations that have been developed for these are distinct, and the current results integrate them by revealing that DLPFC activity is involved in the retrieval-extinction short-term effect. There is thus some novelty in the present results, but numerous questions remain unaddressed.

      Weakness

      The fear acquisition data is converted to a differential fear SCR and this is what is analysed (early vs late). However, the figure shows the raw SCR values for CS+ and CS- and therefore it is unclear whether acquisition was successful (despite there being an "early" vs "late" effect - no descriptives are provided).

      In Experiment 1 (Test results) it is unclear whether the main conclusion stems from a comparison of the test data relative to the last extinction trial ("we defined the fear recovery index as the SCR difference between the first test trial and the last extinction trial for a specific CS") or the difference relative to the CS- ("differential fear recovery index between CS+ and CS-"). It would help the reader assess the data if Fig 1e presents all the indexes (both CS+ and CS-). In addition, there is one sentence which I could not understand "there is no statistical difference between the differential fear recovery indexes between CS+ in the reminder and no reminder groups (P=0.048)". The p value suggests that there is a difference, yet it is not clear what is being compared here. Critically, any index taken as a difference relative to the CS- can indicate recovery of fear to the CS+ or absence of discrimination relative to the CS-, so ideally the authors would want to directly compare responses to the CS+ in the reminder and no-reminder groups. In the absence of such comparison, little can be concluded, in particular if SCR CS- data is different between groups. The latter issue is particularly relevant in Experiment 2, in which the CS- seems to vary between groups during the test and this can obscure the interpretation of the result.

      In Experiment 1, the findings suggest that there is a benefit of retrieval followed by extinction in a short-term reinstatement test. In Experiment 2, the same effect is observed to a cue which did not undergo retrieval before extinction (CS2+), a result that is interpreted as resulting from cue-independence, rather than a failure to replicate in a within-subjects design the observations of Experiment 1 (between-subjects). Although retrieval-induced forgetting is cue-independent (the effect on items that are suppressed [Rp-] can be observed with an independent probe), it is not clear that the current findings are similar, and thus the strong parallels made may not be warranted. Here, both cues have been extinguished and have therefore been equally exposed during the critical stage.

      The findings in Experiment 2 suggest that the amnesia reported in Experiment 1 is transient, in that no effect is observed when the test is delayed by 6 hours. The phenomenon whereby reactivated memories transition to extinguished memories as a function of the amount of exposure (or number of trials) is completely different from the phenomenon observed here. In the former, the manipulation has to do with the number of trials (or total amount of time) that the cues are exposed. In the current Experiment 2, the authors did not manipulate the number of trials but instead the retention interval between extinction and test. The finding reported here is closer to a "Kamin effect", that is, the forgetting of learned information which is observed with intervals of intermediate length (Baum, 1968). Because the Kamin effect has been inferred to result from retrieval failure, it is unclear how this can be explained here. There needs to be much more clarity on the explanations to substantiate the conclusions.

      There are many results (Ryan et al., 2015) that challenge the framework that the authors base their predictions on (consolidation and reconsolidation theory), therefore these need to be acknowledged. These studies showed that memory can be expressed in the absence of the biological machinery thought to be needed for memory performance. The authors should be careful about statements such as "eliminate fear memories" for which there is little evidence.

      The parallels between the current findings and the memory suppression literature are speculated in the general discussion, and there is the conclusion that "the retrieval-extinction procedure might facilitate a spontaneous memory suppression process". Because one of the basic tenets of the memory suppression literature is that it reflects an "active suppression" process, there is no reason to believe that in the current paradigm the same phenomenon is in place, but instead it is "automatic". In other words, the conclusions make strong parallels with the memory suppression (and cognitive control) literature, yet the phenomenon that they observed is thought to be passive (or spontaneous/automatic). Ultimately, it is unclear why 10 mins between the reminder and extinction learning will "automatically" suppress fear memories. Further down in the discussion it is argued that "For example, in the well-known retrieval-induced forgetting (RIF) phenomenon, the recall of a stored memory can impair the retention of related long-term memory and this forgetting effect emerges as early as 20 minutes after the retrieval procedure, suggesting memory suppression or inhibition can occur in a more spontaneous and automatic manner". I did not follow why the time delay between manipulation and test (20 mins) would speak to whether the process is controlled or automatic. In addition, the links with the "latent cause" theoretical framework are weak if any. There is little reason to believe that one extinction trial, separated by 10 mins from the rest of extinction trials, may lead participants to learn that extinction and acquisition have been generated by the same latent cause.

      Among the many conclusions, one is that the current study uncovers the "mechanism" underlying the short-term effects of retrieval-extinction. There is little in the current report that uncovers the mechanism, even in the most psychological sense of the mechanism, so this needs to be clarified. The same applies to the use of "adaptive".

      Whilst I could access the data in the OSF site, I could not make sense of the Matlab files as there is no signposting indicating what data is being shown in the files. Thus, as it stands, there is no way of independently replicating the analyses reported.

      The supplemental material shows figures with all participants, but only some statistical analyses are provided, and sometimes these are different from those reported in the main manuscript. For example, the test data in Experiment 1 is analysed with a two-way ANOVA with main effects of group (reminder vs no-reminder) and time (last trial of extinction vs first trial of test) in the main report. The analyses with all participants in the supplementary material used a mixed two-way ANOVA with group (reminder vs no-reminder) and CS (CS+ vs CS-). This makes it difficult to assess the robustness of the results when including all participants. In addition, in the supplementary materials there are no figures and analyses for Experiment 3.

      One of the overarching conclusions is that the "mechanisms" underlying reconsolidation (long term) and memory suppression (short term) phenomena are distinct, but memory suppression phenomena can also be observed after a 7-day retention interval (Storm et al., 2012), which then questions the conclusions achieved by the current study.

      References:

      Baum, M. (1968). Reversal learning of an avoidance response and the Kamin effect. Journal of Comparative and Physiological Psychology, 66(2), 495.

      Chalkia, A., Schroyens, N., Leng, L., Vanhasbroeck, N., Zenses, A. K., Van Oudenhove, L., & Beckers, T. (2020). No persistent attenuation of fear memories in humans: A registered replication of the reactivation-extinction effect. Cortex, 129, 496-509.

      Ryan, T. J., Roy, D. S., Pignatelli, M., Arons, A., & Tonegawa, S. (2015). Engram cells retain memory under retrograde amnesia. Science, 348(6238), 1007-1013.

      Storm, B. C., Bjork, E. L., & Bjork, R. A. (2012). On the durability of retrieval-induced forgetting. Journal of Cognitive Psychology, 24(5), 617-629.

      Comments on revisions:

      The authors have revised the manuscript but most of my concerns have remained unaddressed.

      (1) There are still no descriptive statistics to substantiate learning in Experiment 1.

      (2) In the revised analyses, the authors now show that CS- changes in different groups (for example, Experiment 2) so this means that there is little to conclude from the differential scores because these depend on CS-. It is unclear whether the effects arise from CS+ performance or the differential which is subject to CS- variations.

      (3) The notion that suppression is automatic is speculative at best.

      (4) I still struggle with the parallels between these findings and the "limbo" literature. Here you manipulated the retention interval, whereas in the cited studies the number of extinction (exposure) trials was varied. These are two completely different phenomena.

      (5) My point about the data problematic for the reconsolidation (and consolidation) frameworks is that they observed memory in the absence of the brain substrates that are needed for memory to be observed. The answer did not address this. I do not understand how the latent cause model can explain this, if the only difference is the first ITI. Wouldn't participants fail to integrate extinction with acquisition with a longer ITI?

      (6) The materials in the OSF site are the same as before; they have not been updated.

      (7) Concerning supplementary materials, the robustness tests are intended to prove that you 1) can get the same results by varying the statistical models or 2) you can get the same results when you include all participants. Here the authors have done both at once, so this does not help. Also, in the rebuttal letter, they stated "Please note we did not include non-learners in these analyses", which contradicts what is stated in the figure captions "(learners + non learners)".

      (8) Finally, the literature suggesting that reconsolidation interference "eliminates" a memory is not substantiated by data nor in line with current theorising, so I invite a revision of these strong claims.

      Overall, I conclude that the revised manuscript did not address my main concerns.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their thorough evaluation of this manuscript. We are pleased that overall, they found our work and results valuable for the scientific community. Based on their feedback, we performed additional experiments and made several changes to strengthen the manuscript and expand the target audience.

      *All three reviewers pointed out that the manuscript lacked demonstration of OneSABER method applicability across sample types (i.e., its claimed versatility) and other whole-mount systems beyond the Macrostomum lignano flatworm.*

      We now include an additional results section with accompanying figures (Figs. 6 and 7) that demonstrate the application of OneSABER in whole-mount samples of another flatworm, the planarian Schmidtea mediterranea (Fig. 6), which is much larger than M. lignano, and in formalin-fixed paraffin-embedded (FFPE) mouse small intestine tissue sections (Fig. 7). We believe that these additional experiments on different sample types demonstrate the versatility of the OneSABER approach.

      Please note that two more authors, Jan Freark de Boer and Folkert Kuipers, have been added for their contribution to mouse FFPE sections.

      Furthermore, two reviewers asked for an additional main figure with a comparison of the signal strengths between the different OneSABER methods.

      We have addressed this comment by including an additional results section and its adjacent figure (Fig. 5), where we provide a comparison of fluorescent signals from the same probes and gene but different OneSABER development methods.

      Additionally, to implement the revisions, we modified Fig. 1 and Supplementary Fig. 6 and broadened Supplementary Tables S1-S2, S4-S6.

      2. Point-by-point description of the revisions

      Reviewer #1

      1) “Fig.1 seems to suggest that the protocol for in vitro swapping of 3' concatemers happens in two consecutive PCR steps. I recommend indicating in the figure that the switching can be conducted in a single in vitro reaction.”

      We have changed Fig. 1 to make this clearer.

      2) “Is it possible to multiplex the switching in one single reaction? For example, perform p27 to p28 and p29 to p30 simultaneously? This will be crucial for the split-probe methodology.”

      We did not test it. This should be possible if there is no overlap between the 3’ initiator sequences. However, it seems counterproductive as the elongation efficiencies of switching reactions from the 3’ initiator sequences to another concatemer may vary (Supplementary Fig. S6). Running independent extension/switch reactions and performing equimolar mixing of purified extended probes could be a better solution.

      3) “Did the authors encounter any switching hairpins sequence that does not work? If not, can they postulate, what are the requirements for the design of switching sequences.”

      The design criteria followed the requirements postulated in the original SABER article and its Supplementary Materials (Kishi et al. 2019). All switching hairpins that we tested for pairs of the three 3’ initiator sequences used (p27, p28 and p30) worked, but elongation efficiencies varied (see an example in Supplementary Fig. S6).

      4) “Is there cross hybridization between the switched and original hairpins? For example, can the authors show that the signals from p27 and p30 do not overlap?”

      The in situ hybridization results with swapped primary probes are shown in Fig. 6B (multiplexed HCR in S. mediterranea). All probes were originally designed using a p27 PER initiator. We swapped Smed-vit-1 with p30 and Smedwi-1 with p28. We also updated Fig. S6, by adding the second section (B) showing the in vitro results after concatemer swapping, as well as hybridization specificity of the secondary imager probes.

      5) “Can the authors quantify results from the direct, AP, TSA, and HCR? What do you mean by 'narrow anatomical structures like neural chords (syt11) or muscles (tnnt2) seem less visible'?”

      *“I agree with reviewer #2 regarding the lack of comparison to standard SABER.”*

      A comparison of fluorescent signals from the same probes/genes but different OneSABER development methods is shown in Fig. 5.

      We have rephrased the sentence for clarity, from “As a result, despite higher intracellular resolution, some narrow anatomical structures like neural chords (syt11) or muscles (tnnt2) seem less visible for the human eye after SABER HCR (Figs. 3, 4).” to “As a result, despite higher intracellular resolution, some fine anatomical structures like neural chords (syt11) or muscles (tnnt2) are less resolved by widefield fluorescence microscopy after SABER HCR FISH compared to SABER TSA FISH.”

      Reviewer #2

      1) “This work is building on standard SABER (a set of PER-extended primary probes that serve as landing pads for secondary fluorescently-labeled readout oligos) and pSABER (the readout oligo carries HRP instead of a dye for downstream TSA). The novelty of the work presented here is introducing additional variations of signal amplification, i.e. by using an hapten-labeled oligo to recruit a tertiary readout probe (antibodies conjugated with HRP or AP) or using SABER in combination with HCR. Since SABER can be seen as the underlying platform and pSABER was (arguably) also already introduced as a new platform by Attar et al. 2023, it seems difficult to introduce OneSABER as yet another new platform, of which standard SABER and pSABER are a part of. The reviewer encourages the authors to overthink the conceptual introduction, which in view of its certainly distinct novel features might allow a clearer distinction to previous work.”

      We agree with the reviewer’s comments. We have added additional information in the Introduction section to clarify the novelty and key distinct features of OneSABER that justify its separation from other SABER protocols.

      2) “Although the authors take care in tributing prior work, some of the studies are only mentioned in the results section, one of such cases is pSABER by Attar et al. 2023. The close relation between pSABER and SABER TSA (HRP on readout oligo vs. hapten on readout oligo + HRP-conjugated antibody) needs to be better positioned in the introduction, clearly framing earlier work, inspirations drawn etc.. This is in line with my previous point.”

      The pSABER preprint article by Attar et al. 2023 (now published in a peer-reviewed journal as Attar et al. 2025) is now mentioned in the Introduction, and its inspirational impact on our research is clearly stated.

      3) “Fig. 1 lists the individual modules of the OneSABER platform: i) standard SABER, ii) AP SABER, iii) SABER TSA, iv) pSABER (TSA FISH) (would recommend leaving it with original name when introducing it and include additional explanation in parentheses) and v) SABER HCR. The main figures feature only AP SABER, SABER TSA and SABER HCR, for standard SABER and pSABER one must look up the SI. Since the authors describe the limited performance of standard SABER for one of their targets of interest (syt11) and since they have tested this target for all five conditions, it would be valuable to include a comparative view of all five platform modules in a single figure for syt11 or even also piwi, which also seems to have been tested for all five. Comparing the signal strength would be useful for the community, at least of each SABER variation compared to standard SABER.”

      We agree with the reviewer’s comments. Except for pSABER, a comparison of fluorescence signals from the same probes/genes but different OneSABER development methods is shown in Fig. 5. To make the comparison as objective as possible, all FISH developments were re-done using available “far red” fluorophores, except for pSABER. Unfortunately, our directly labeled HRP oligonucleotides for pSABER lost their activity after a year of storage at +4 °C. These conjugated oligonucleotides are very expensive and, given their limited shelf life, we cannot justify ordering a new batch for this experiment. Therefore, we only have the data for pSABER syt11 with FITC green tyramide, which is not comparable to “far red” fluorophore signals. This issue is also discussed in the main text.

      In addition, we have modified Fig. 1, as suggested.

      4) “The description of how the authors designed their probes is very detailed and they also provide a nice step-by-step protocol for their individual commands using Oligominer and BLAT software. This reviewer is wondering how the authors chose their PER sequences that they appended to their mined set of homologous in situ hybridization probes (p27,p28,p30). This is a general problem of multiplexed ISH approaches with single-stranded overhang, could the authors comment on potential self-interaction of the appended sequence with the homologous part, which might limit the PER efficiency, or elaborate on their choice?”

      As we were ourselves new to SABER when we started our work, we based our selection of the p27, p28, and p30 PER sequences on their multiple co-occurrences in previous publications (Amamoto et al. 2019, doi: 10.7554/eLife.51452; Saka et al. 2019, doi: 10.1038/s41587-019-0207-y; Wang et al. 2020, doi: 10.1016/j.omtm.2020.10.003; Salinas-Saavedra et al. 2023, doi: 10.1016/j.celrep.2023.112687; and Attar et al. 2023, doi: 10.1101/2023.01.30.526264). We did not consider the potential interference between PER concatemers and homologous primary probe-binding sequences. However, as all PER concatemers were specifically designed to lack G nucleotides to keep them from self-annealing (Kishi et al. 2019, doi: 10.1038/s41592-019-0404-0), we assumed that this would also reduce potential annealing to the homologous part of the probe.

      5) “Fig.1 and l. 125 describe straightforward in vitro switching of the concatemer sequence for an existing set of primary probes as a central feature of the OneSABER platform. However, the authors to my knowledge do not show such experiments themselves and only cite the original SABER paper by Kishi et al. 2019. This reviewer would be grateful to be pointed toward where in Kishi et al. 2019 this was demonstrated, however in view of this central part of the swopping scheme in the OneSABER platform an experiment showing this swopping is missing.”

      In the article by Kishi et al. 2019, concatemer switching/swapping is termed “primer remapping”. We found this term confusing because it does not describe the essence of the reaction. The in situ hybridization results with swapped primary probes are shown in Fig. 6B (multiplexed HCR in S. mediterranea). All probes were originally designed using a p27 PER initiator. We swapped Smed-vit-1 with p30 and Smedwi-1 with p28. We also updated Fig. S6 by adding a second section (B) showing the in vitro results after concatemer swapping, as well as hybridization specificity of the secondary imager probes.

      6) “the description of Table S6 could use additional information in the legend such that the reader does not have to scroll down to Section S1 to retrieve the information (PER reaction, gel conditions, ladder is dsDNA, what are the individual bands)”

      Probably, the reviewer meant Fig. S6. We now wrote a more detailed caption for the figure and extended it with a second panel (B) to illustrate the results of 3’ concatemer swapping.

      7) “the manuscript features an extensive set of resources in main body, supplementary materials and protocols. It is important and usually not merited sufficiently making the effort to compare orthogonal approaches for a given aim. This reviewer particularly appreciates the detailed strengths & weaknesses discussion in Table S6.”

      We thank the reviewer for the appreciation of our work.

      8) “Minor comments:

      -Definitions should be consistent, in Fig. 1 all approaches are defined with FISH added, but this definition is not followed consistently in the main text.”

      These definitions are now made consistent throughout the text.

      9) “Optional:

      -The authors describe several newly developed optimization steps during sample preparation for M. lignano ISH experiments compared to established ones. If the data exists, they include a supplementary figure showing improvements of optimized protocol steps”

      As almost every step and the buffer recipes were different from the original ISH protocol by Pfister et al. (2007) because of the use of liquid-exchange columns, different probes, and development chemistry, we believe that a comparison would be excessive. We think that the key difference points are already substantially highlighted in the results section.

      Reviewer #3

      1) “Despite including a whole figure (Figure 1) featuring the operation scheme of the OneSABER platform, the figure as well as the associated text fall short with respect to clearly stating the advantage of the different aspects of the platform. Consider a clearer and more thorough explanation of the different aspects of the platform.”

      Details on the advantages and disadvantages of using different OneSABER methods in terms of their experimental application and cost efficiency are described in Supplementary Tables S4-S6 of the submitted manuscript. However, we agree that the description in Fig. 1 was too concise and also did not refer to these tables. We have expanded the description in Fig. 1.

      2) “Related to the first comment: A more detailed description of the similarities and/or differences of this platform relative to similar applications such as the study by Hall et al, 2024”

      The sole purpose of mentioning the preprint of Hall et al. 2024 (now peer-reviewed, https://doi.org/10.1016/j.celrep.2024.114892) was to acknowledge that HCR technology has previously been applied in M. lignano (although only once), while all other previously published works on M. lignano used canonical colorimetric in situ hybridization with antisense RNA probes. We have extensively mentioned the HCR approach and its working principles throughout the submitted manuscript.

      3) “The authors describe the probes used as short, synthetic DNA probes targeting short RNA transcripts. Are these probes Oligopaints (Beliveau et al, 2015)? Why is that not more clearly stated in the text?”

      Oligopaints use oligo libraries as a renewable source of FISH probes, and these libraries are amplified with fluorophore-conjugated PCR primers. We used synthetic DNA probes directly. In this sense, our probe sets are not oligopaints. However, we used the OligoMiner pipeline of Oligopaints for the design of the probes, and thus used the same tiling strategy as oligopaints. We believe that this has been explained in the manuscript. Please refer to comment 4 of Reviewer 2.

      4) “Line 105, p5: The authors state that the number of probes depends on the target RNA length and its expression strength. This data should be in the main text and described in detail since it is a major aspect of the platform design.”

      We believe that this statement is common sense, as one cannot design more than five 30-50 nt probes for a 200 nt transcript, while for a 2000 nt mRNA, the theoretical limit is ~50 probes. Similarly, weakly expressed genes (regardless of their length) would require either more probes to reach the detection threshold or stronger amplification through the choice of concatemer length and/or signal development technique. We have rephrased this sentence in the main text to reflect this.
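
      The arithmetic behind this upper bound can be sketched as follows. This is only an illustration, assuming non-overlapping probes tiled end to end with no gaps and no sequence-based filtering; the function name `max_probes` and the 40 nt probe length (the midpoint of the 30-50 nt range) are assumptions made for the example, not values taken from the manuscript.

```python
def max_probes(transcript_len_nt: int, probe_len_nt: int) -> int:
    """Upper bound on non-overlapping probes that fit on a transcript.

    Hypothetical helper: it ignores spacing between probes and any
    filtering for specificity or melting temperature, so real probe
    sets will usually be smaller than this bound.
    """
    if transcript_len_nt < 0 or probe_len_nt <= 0:
        raise ValueError("lengths must be positive")
    return transcript_len_nt // probe_len_nt


# Assumed probe length of 40 nt (midpoint of the 30-50 nt range).
short_transcript = max_probes(200, 40)
long_transcript = max_probes(2000, 40)
print(short_transcript, long_transcript)
```

      Under these assumptions the bound scales linearly with transcript length, which is why short transcripts leave little room to compensate for weak expression by adding probes.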

      5) “Figure 2 showcases one of the most compelling data supporting the versatility of the platform. Can the signals in each panel be quantified and compared to 1. Published Ab staining? Is there a clear correlation in the intensity of the signals? 2. Between Vector Blue and NBT? 3. Chemical staining and FISH signals?”

      Since M. lignano is a relatively new model, there are no published antibody stainings for M. lignano genes used in this study. Furthermore, colorimetric precipitate methods are not quantitative but rather qualitative, because their signal strength is proportional to both the target RNA level and the development time; thus, signals from weakly expressed transcripts can be “boosted” simply by longer development. Therefore, a correct quantitative comparison with colorimetric methods, as requested by the reviewer, was not possible. However, with some corrections for fluorophore differences and animal-to-animal variability, it is possible to roughly compare peak saturation intensities for FISH methods if the experiments are designed for this aim. We performed these experiments, and a comparison of fluorescent signals from the same probes/genes but different OneSABER development methods is shown in Fig. 5.

      Minor comments:

      6) “The whole mount images and signals are often diffuse, can they be visualized using a DIC where the morphology of the organism is clearer?”

      We are unsure which images appear diffuse to the reviewer. The other reviewers have not pointed out similar issues. Perhaps the issue will be resolved once full-resolution uncompressed images are uploaded.

      7) “In order to support the claim that this is a universal approach for whole-mount staining, can the authors show an example of applicability to C. elegans?”

      This is now addressed. We included two additional results sections with two accompanying figures (Figs. 6 and 7) that demonstrate OneSABER’s application in whole-mount samples of a model flatworm much larger than M. lignano, the planarian Schmidtea mediterranea (Fig. 6), as well as in formalin-fixed paraffin-embedded (FFPE) small intestine tissue sections of a mouse model (Fig. 7).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors used a subset of a very large, previously generated 16S dataset to: (1) assess age-associated features; and (2) develop a fecal microbiome clock, based on an extensive longitudinal sampling of wild baboons for which near-exact chronological age is known. They further seek to understand deviation from age-expected patterns and uncover if and why some individuals have an older or younger microbiome than expected, and the health and longevity implications of such variation. Overall, the authors compellingly achieved their goals of discovering age-associated microbiome features and developing a fecal microbiome clock. They also showed clear and exciting evidence for sex and rank-associated variation in the pace of gut microbiome aging and impacts of seasonality on microbiome age in females. These data add to a growing understanding of modifiers of the pace of age in primates, and links among different biological indicators of age, with implications for understanding and contextualizing human variation. However, in the current version, there are gaps in the analyses with respect to the social environment, and in comparisons with other biological indicators of age. Despite this, I anticipate this work will be impactful, generate new areas of inquiry, and fuel additional comparative studies.

      Thank you for the supportive comments and constructive reviews.

      Strengths:

      The major strengths of the paper are the size and sampling depth of the study population, including the ability to characterize the social and physical environments, and the application of recent and exciting methods to characterize the microbiome clock. An additional strength was the ability of the authors to compare and contrast the relative age-predictive power of the fecal microbiome clock to other biological methods of age estimation available for the study population (dental wear, blood cell parameters, methylation data). Furthermore, the writing and support materials are clear, informative and visually appealing.

      Weaknesses:

      It seems clear that more could be done in the area of drawing comparisons among the microbiome clock and other metrics of biological age, given the extensive data available for the study population. It was confusing to see this goal (i.e. "(i) to test whether microbiome age is correlated with other hallmarks of biological age in this population"), listed as a future direction, when the authors began this process here and have the data to do more; it would add to the impact of the paper to see this more extensively developed.

      Comparing the microbiome clock to other metrics of biological age in our population is a high priority (these other metrics of biological age are in Table S5 and include epigenetic age measured in blood, the non-invasive physiology and behavior clock (NPB clock), dentine exposure, body mass index, and blood cell counts (Galbany et al. 2011; Altmann et al. 2010; Jayashankar et al. 2003; Weibel et al. 2024; Anderson et al. 2021)). However, we have opted to test these relationships in a separate manuscript. We made this decision because of the complexity of the analytical task: these metrics were not necessarily collected on the same subjects, and when they were, each metric was often measured at a different age for a given animal. Further, two of the metrics (microbiome clock and NPB clock) are measured longitudinally within subjects but on different time scales (the NPB clock is measured annually while microbiome age is measured in individual samples). The other metrics are cross-sectional. Testing the correlations between them will require exploration of how subject inclusion and time scale affect the relationships between metrics.

      We now explain the complexity of this analysis in the discussion in lines 447-450. In addition, we have added the NPB clock (Weibel et al. 2024) to the text in lines 260-262 and to Table S5.

      An additional weakness of the current set of analyses is that the authors did not explore the impact of current social network connectedness on microbiome parameters, despite the landmark finding from members of this authorship studying the same population that "Social networks predict gut microbiome composition in wild baboons" published here in eLife some years ago. While a mother's social connectedness is included as a parameter of early life adversity, overall the authors focus strongly on social dominance rank, without discussion of that parameter's impact on social network size or directly assessing it.

      Thank you for raising this important point, which was not well explained in our manuscript. We find that the signatures of social group membership and social network proximity are only detectable in our population for samples collected close in time. All of the samples analyzed in Tung et al. 2015 (“Social networks predict gut microbiome composition in wild baboons”) were collected within six weeks of each other. By contrast, the data set analyzed here spans 14 years, with very few samples from close social partners collected close in time. Hence, the effects of social group membership and social proximity are weak or undetectable. We described these findings in Grieneisen et al. 2021 and Björk et al. 2022, and we now explain this logic on line 530, which states, “We did not model individual social network position because prior analyses of this data set find no evidence that close social partners have more similar gut microbiomes, probably because we lack samples from close social partners sampled close in time (Grieneisen et al. 2021; Björk et al. 2022).”

      We do find small effects of social group membership, which is included as a random effect in our models of how each microbiome feature is associated with host age (line 529) and our models predicting microbiome Δage (line 606; Table S6).

      Reviewer #2 (Public review):

      Summary:

      Dasari et al present an interesting study investigating the use of 'microbiota age' as an alternative to other measures of 'biological age'. The study provides several curious insights into biological aging. Although 'microbiota age' holds potential as a proxy of biological age, it comes with limitations considering the gut microbial community can be influenced by various non-age related factors, and various age-related stressors may not manifest in changes in the gut microbiota. The work would benefit from a more comprehensive discussion, that includes the limitations of the study and what these mean to the interpretation of the results.

      We agree and have added text to the discussion that expands on the limitations of this study and what those limitations mean for the interpretation of the results. For instance, lines 395-400 read, “Despite the relative accuracy of the baboon microbiome clock compared to similar clocks in humans, our clock has several limitations. First, the clock’s ability to predict individual age is lower than for age clocks based on patterns of DNA methylation—both for humans and baboons (Horvath 2013; Marioni et al. 2015; Chen et al. 2016; Binder et al. 2018; Anderson et al. 2021). One reason for this difference may be that gut microbiomes can be influenced by several non-age-related factors, including social group membership, seasonal changes in resource use, and fluctuations in microbial communities in the environment.”

      In addition, lines 405-411 now read, “Third, the relationships between potential socio-environmental drivers of biological aging and the resulting biological age predictions were inconsistent. For instance, some sources of early life adversity were linked to old-for-age gut microbiomes (e.g., males born into large social groups), while others were linked to young-for-age microbiomes (e.g., males who experienced maternal social isolation or early life drought), or were unrelated to gut microbiome age (e.g., males who experienced maternal loss; any source of early life adversity in females).”

      Strengths:

      The dataset this study is based on is impressive, and can reveal various insights into biological ageing and beyond. The analysis implemented is extensive and high-level.

      Weaknesses:

      The key weakness is the use of microbiota age instead of e.g., DNA-methylation-based epigenetic age as a proxy of biological ageing, for reasons stated in the summary. DNA methylation levels can be measured from faecal samples, and as such epigenetic clocks too can be non-invasive. I will provide authors a list of minor edits to improve the read, to provide more details on Methods, and to make sure study limitations are discussed comprehensively.

      Thank you for this point. In response, we have deleted the text from the discussion that stated that non-invasive sampling is an advantage of microbiome clocks. In addition, we now propose a non-invasive epigenetic clock from fecal samples as an important future direction for our population (see line 450).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Abstract - The opening 2 sentences are not especially original or reflective of the potential value/ premise of the study. Members of this team have themselves measured variation in biological age in many different ways, and the implication that measuring a microbiome clock is easy or straightforward is not compelling. This paper is very interesting and provides unique insight, but I think overall there is a missed opportunity in the abstract to emphasize this, given the innovative science presented here. Furthermore, the last 2 sentences of the abstract are especially interesting - but missing a final statement on the broader significance of research outside of baboons.

      We appreciate these comments and have revised the Abstract accordingly. The introductory sentences now read, “Mammalian gut microbiomes are highly dynamic communities that shape and are shaped by host aging, including age-related changes to host immunity, metabolism, and behavior. As such, gut microbial composition may provide valuable information on host biological age.” (lines 31-34). The last two sentences of the abstract now read, “Hence, in our host population, gut microbiome age largely reflects current, as opposed to past, social and environmental conditions, and does not predict the pace of host development or host mortality risk. We add to a growing understanding of how age is reflected in different host phenotypes and what forces modify biological age in primates.” (lines 40-43).

      If possible, it would be highly useful to present some comments on concordance in patterns at different levels. Are all ASVs assessed at both the family and genus levels? Do they follow similar patterns when assessed at different levels? What can we learn about the system by looking at different levels of taxonomic assignment?

      The section on relationships between host age and individual microbiome features is already lengthy, so we have not added an analysis of concordance between different taxonomic levels. However, we added a justification for why we tested for age signatures in different levels of taxa to line 171, which reads, “We tested these different taxonomic levels in order to learn the degree to which coarse and fine-grained taxonomic designations were associated with host age.”

      To calculate the delta age - please clarify if this was done at the level of years, as suggested in Figure 3C, or at the level of months or portion months, etc?

      Delta age is measured in years. This is now clarified in lines 294, 295, and 578.

      Spelling mistake in table S12, cell B4 (Octovber)

      Thank you. This typo has been corrected.

      Given the start intro with vertebrates, the second paragraph needs some tweaking to be appropriate. Perhaps, "At least among mammals, one valuable marker of biological aging may lie in the composition and dynamics of the mammalian gut microbiome (7-10)." Or simply remove "mammalian".

      We have updated this sentence based on your suggestions in line 54. It reads, “In mammals, one valuable marker of biological aging may lie in the composition and dynamics of the gut microbiome (Claesson et al. 2012; Heintz and Mair 2014; O’Toole and Jeffery 2015; Sadoughi et al. 2022).”

      A rewrite at the end of the introduction is needed to avoid the almost direct repetition in lines 115-118 and 129-131 (including lit cited). One potentially effective way to approach this is to keep the predictions in the earlier paragraph and then more clearly center the approach and the overarching results statement in the latter paragraph. (I.e., "we find that season and social rank have stronger effects on microbiome age than early life events. Further, microbiome age does not predict host development or mortality.").

      Thank you for pointing this out. We have re-organized the predictions in the introduction based on your suggestion. The alternative “recency effects” model now appears in the paragraph that starts in line 110. The final paragraph then centers on the overall approach and the results statement (lines 128-140).

      Be clear in each case where taxon-level trends are discussed if it's at Family, Genus, or other level. It's there most, but not all, of the time.

      We have gone through the text and clarified what taxa or microbiome feature was the subject of our analyses in any places where this was not clear.

      In the legend for Figure 2, add clarification for how values to right versus left of the centered value should be interpreted with respect to age (e.g. "values to x of the center are more abundant in older individuals").

      We now clarify in Figure 2C and 2D that “Positive values are more abundant in older hosts”.

      Figure 3 - Are Panels A, B, and C all needed - can the value for all individuals not also be overlaid in the panel showing sex differences and the same point showing individuals with "old" and "young" microbiomes be added in the same plot if it was slightly larger?

      We agree and have simplified Figure 3. We reduced the number of panels from three to two, and we added the information about how to calculate delta age to Panel A. We also moved the equation from the top of Panel C to the bottom right of Panel A.

      Reviewer #2 (Recommendations for the authors):

      Dasari et al present an interesting study investigating the use of 'microbiota age' as an alternative to other measures of 'biological age'. The study provides several curious insights which in principle warrant publication. However, I do think the manuscript should be carefully revised. Below I list some minor revisions that should be implemented. Importantly, the authors should discuss in the Discussion the pros and cons of using 'microbiota age' as a proxy of 'biological age'. Further, the authors should provide more information on Methods, to make sure the study can be replicated.

      Thank you for these important points. Based on your comments and those of the first reviewer, we have expanded our discussion of the limitations of using microbiota age as a proxy for biological age (see edits to the paragraph starting in line 395).

      We have also expanded our methods around sample collection, DNA extraction, and sequencing to describe our sampling methods, strategies to mitigate and address possible contamination, and batch effects. See lines 483-490 and our citations to the original papers where these methods are described in detail.

      (1) Lines 85-99: I think this paragraph could be revisited to make the assumptions clearer. For instance, the last sentence is currently a little confusing: are authors expecting males to exhibit old-for-age microbiomes already during the juvenile period?

      This prediction has been clarified. Line 96 now reads, “Hence, we predicted that adult male baboons would exhibit gut microbiomes that are old-for-age, compared to adult females (by contrast, we expected no sex effects on microbiome age in juvenile baboons).”

      (2) Lines 118-121: Could the authors discuss this assumption in relation to what has been observed e.g., in humans in terms of delays in gut microbiome development? Delayed/accelerated gut microbiome development has been studied before, so this assumption would be stronger if related to what we know from previous studies.

      This comment refers to the sentence which originally stated, “However, we also expected that some sources of early life adversity might be linked to young-for-age gut microbiota. For instance, maternal social isolation might delay gut microbiome development due to less frequent microbial exposures from conspecifics.” We have slightly expanded the text here (line 117) to explain our logic. We now include citations for our predictions. We did not include a detailed discussion of prior literature on microbiome development in the interest of keeping the same level of detail across all sections on our predictions.

      (3) As the authors discuss, various adversities can lead to old-for-age but also young-for-age microbiome composition. This should be discussed in the limitations.

      We agree. This is now discussed in the sentence starting at line 371, which reads, “…deviations from microbiome age predictions are explained by socio-environmental conditions experienced by individual hosts, especially recent conditions, although the effect sizes are small and are not always directionally consistent.” In addition, the text starting at line 405 now reads, “Third, the relationships between potential socio-environmental drivers of biological aging and the resulting biological age predictions were inconsistent. For instance, some sources of early life adversity were linked to old-for-age gut microbiomes (e.g., males born into large social groups), while others were linked to young-for-age microbiomes (e.g., males who experienced maternal social isolation or early life drought), or were unrelated to gut microbiome age (e.g., males who experienced maternal loss; any source of early life adversity in females).”

      (4) In various places, e.g., lines 129-131, it is a little unclear at what chronological age authors are expecting microbiota to appear young/old-for-age.

      This sentence was removed while responding to the comments from the first reviewer.

      (5) Lines 132-133: this statement could be backed by stating that this is because the gut microbiota can change rapidly e.g., when diet changes (or whatever the authors think could be behind this).

      We have added an expository sentence at line 123, including new citations. This sentence reads, “Indeed, gut microbiomes are highly dynamic and can change rapidly in response to host diet or other aspects of host physiology, behavior, or environments”.

      We now cite:

      · Hicks, A.L., et al. (2018). Gut microbiomes of wild great apes fluctuate seasonally in response to diet. Nature Communications 9, 1786.

      · Kolodny, O., et al. (2019). Coordinated change at the colony level in fruit bat fur microbiomes through time. Nature Ecology & Evolution 3, 116-124.

      · Risely, A., et al. (2021). Diurnal oscillations in gut bacterial load and composition eclipse seasonal and lifetime dynamics in wild meerkats. Nature Communications 12, 6017.

      (6) Lines 135-137: current or past season and social rank? This paragraph introduces the idea that it could be past rather than current socio-environmental factors that might predict microbiota age, so the authors should clarify this sentence.

      We have clarified the information in this sentence. Line 135 now reads, “In general, our results support the idea that a baboon’s current socio-environmental conditions, especially their current social rank and the season of sampling, have stronger effects on microbiome age than early life events—many of which occurred many years prior to sampling.”

      (7) Lines 136-137: this sentence could include some kind of a conclusion of this finding. What might this mean?

      We have added a sentence at line 138, which speculates that, “…the dynamism of the gut microbiome may often overwhelm and erase early life effects on gut microbiome age.”

      (8) Use 'microbiota' or 'microbiome' across the manuscript; currently, the terms are used interchangeably. I don't have a strong opinion on this, although typically 'microbiota' is used when data comes from 16S rRNA.

      We have updated the text to replace any instance of “microbiota” with “microbiome”. We use the term microbiome in the sense of this definition from the National Human Genome Research Institute, which defines a microbiome as “the community of microorganisms (such as fungi, bacteria and viruses) that exists in a particular environment”.

      (9) Figure 1 legend: make sure to unify formatting; e.g., present sample sizes as N= or n=, rather than both, and either include or do not include commas in 4-digit values (sample sizes).

      We have checked the formatting related to sample sizes and the use of commas in 4-digits in the main text and supplement. The formats are now consistent.

      (10) Line 166: relative abundances surely?

      Following Gloor et al. (2017), our analyses use centered log-ratio (CLR) transformations of read counts, which is the recommended approach for compositional data such as 16S rRNA amplicon read counts. CLR transformations are scale-invariant, so the same ratio is obtained in a sample with few reads versus many reads. We now cite Gloor et al. (2017) at line 169 and in the methods in line 517, which reads, “centered log ratio (CLR) transformed abundances (i.e., read counts) of each microbial phylum (n=30), family (n=290), genus (n=747), and amplicon sequence variant (ASV) detected in >25% of samples (n=358). CLR transformations are a recommended approach for addressing the compositional nature of 16S rRNA amplicon read count data (Gloor et al. 2017).”
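
      The scale invariance mentioned above can be illustrated with a minimal sketch of the CLR transformation, in which each log count is centered on the mean log count of the sample. This is not the analysis code used for the manuscript; the function name `clr` and the optional `pseudocount` argument for handling zero counts are assumptions introduced for the example.

```python
import math


def clr(counts, pseudocount=0.0):
    """Centered log-ratio transform of one sample's read counts.

    Each value is log(count) minus the mean of the log counts, which
    equals the log of the count divided by the sample's geometric mean.
    The pseudocount is a hypothetical convenience for zero counts; note
    that adding one breaks exact scale invariance.
    """
    shifted = [c + pseudocount for c in counts]
    if any(x <= 0 for x in shifted):
        raise ValueError("counts must be positive after adding the pseudocount")
    logs = [math.log(x) for x in shifted]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Same composition sequenced at a tenfold different depth.
shallow = clr([10, 20, 40])
deep = clr([100, 200, 400])
print(shallow)
print(deep)
```

      Because multiplying every count by a constant adds the same term to each log count and to their mean, the two samples yield the same CLR values up to floating-point error.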

      (11) Lines 167-172: were technical factors, e.g., read depth or sequencing batch, included as random effects?

      Thank you for catching this oversight in the text. We did model sequencing depth and batch effects. The sentence starting at line 173 now reads, “For each of these 1,440 features, we tested its association with host age by running linear mixed effects models that included linear and quadratic effects of host age and four other fixed effects: sequencing depth, the season of sample collection (wet or dry), the average maximum temperature for the month prior to sample collection, and the total rainfall in the month prior to sample collection (Grieneisen et al. 2021; Björk et al. 2022; Tung et al. 2015). Baboon identity, social group membership, hydrological year of sampling, and sequencing plate (as a batch effect) were modeled as random effects.”
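
      For readers who prefer an equation, the model structure described in this sentence can be written roughly as below. The notation is ours and is only a sketch: the symbols, the subscripts, and the assumption of independent random intercepts for each grouping factor are illustrative and are not taken from the manuscript.

```latex
% y: CLR-transformed abundance (or other feature) for one sample from
% baboon i, social group g, hydrological year h, and sequencing plate p.
\begin{aligned}
y_{ighp} ={}& \beta_0 + \beta_1\,\mathrm{age} + \beta_2\,\mathrm{age}^2
           + \beta_3\,\mathrm{depth} + \beta_4\,\mathrm{season} \\
          & + \beta_5\,\mathrm{temp}_{\max} + \beta_6\,\mathrm{rain}
           + u_i + v_g + w_h + z_p + \varepsilon_{ighp}
\end{aligned}
```

      Here the $\beta$ terms are the fixed effects listed in the revised sentence, and $u_i$, $v_g$, $w_h$, and $z_p$ stand for the random effects of baboon identity, social group, hydrological year, and sequencing plate.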

      (12) Lines 175-180: When discussing how these alpha diversity results relate to previous findings, the authors should be clear about whether they talk about weighted or non-weighted measures of alpha diversity. - also maybe this should be included in the discussion rather than the results? Please consider this when revisiting the manuscript (see how it reads after edits).

      Richness is the only unweighted metric, which we now clarify in line 181. We opted to retain the interpretation in the text in its original location to maintain the emphasis in the discussion on the microbiome clock results.

      (13) Table S1 is very hard to interpret in the provided PDF format as columns are not presented side-by-side. It is currently hard to check model output for e.g., specific families. This needs to be revisited.

      We agree. We believe that eLife’s submission portal automatically generates a PDF for any supplementary item. However, we also include the supplementary tables as an Excel workbook which has the columns presented side-by-side.

      (14) Line 184: taxa meaning what? Unclear what authors refer to with this sentence, taxa across taxonomic levels, or ASVs, or what does the 51.6% refer to?

      We have edited line 191 to clarify that this sentence refers to taxa at all taxonomic levels (phyla to ASVs).

      (15) Line 191: a punctuation mark missing after ref (81).

      We have added the missing period at the end of this sentence.

      (16) Lines 189-197: this should go into the discussion in my opinion.

      We have opted to retain this interpretation, now at line 183.

      (17) Lines 215-219: Not sure what this means; do the authors mean features were not restricted to age-associated taxa, ie also e.g., diversity and other taxa-independent patterns were included? If so, the rest of the highlighted lines should be revisited to make this clear, currently to me it is very unclear what 'These could include features that are not strongly age-correlated in isolation' means. Currently, that sounds like some features included were only age-associated in combination with other features, but unclear how this relates to taxa-dependency/taxa-independency.

      We agree this was not clear. We have revised line 224 to read, “We included all 9,575 microbiome features in our age predictions, as opposed to just those that were statistically significantly associated with age because removing these non-significant features could exclude features that contribute to age prediction via interactions with other taxa.”

      (18) Line 403-407: There is now a paper showing epigenetic clocks can be built with faecal samples, so this argument is not valid. Please revisit in light of this publication: https://onlinelibrary.wiley.com/doi/epdf/10.1111/mec.17330

      Thank you for bringing this paper to our attention. We deleted the text that describes epigenetic clocks as invasive, and we now cite this paper in line 450, which reads, “We also hope to measure epigenetic age in fecal samples, leveraging methods developed in Hanski et al. 2024.”

      (19) Line 427: a punctuation mark/semicolon missing before However.

      We have corrected this typo.

      (20) Lines 419-428: I don't quite understand this speculation. Why would the priority of access to food lead to an old-looking gut microbiome? This paragraph needs stronger arguments, currently unclear and also not super convincing.

      We agree this was confusing. We have revised this text to clarify the explanation. The text starting at line 424 now reads, “This outcome points towards a shared driver of high social status in shaping gut microbiome age in both males and females. While it is difficult to identify a plausible shared driver, one benefit shared by both high-ranking males and females is priority of access to food. This access may result in fewer foraging disruptions and a higher quality, more stable diet. At the same time, prior research in Amboseli suggests that as animals age, their diets become more canalized and less variable (Grieneisen et al. 2021). Hence aging and priority of access to food might both be associated with dietary stability and old-for-age microbiomes. However, this explanation is speculative and more work is needed to understand the relationship between rank and microbiome age.”

      (21) Line 434: remove 'be'.

      We have corrected this typo.

      (22) Line 478: add information on how samples were collected; e.g., were samples collected from the ground? How was cross-contamination with soil microbiota minimised? Were samples taken from the inner part of depositions? These factors can influence microbiota samples quite drastically so detailed info is needed. Also what does homogenisation mean in this context? How soon were samples freeze-dried after sample collection?

      We have expanded our methods with respect to sample collection. This text starts in line 483 and reads, “Samples were collected from the ground within 15 minutes of defecation. For each sample, approximately 20 g of feces was collected into a paper cup, homogenized by stirring with a wooden tongue depressor, and a 5 g aliquot of the homogenized sample was transferred to a tube containing 95% ethanol. While a small amount of soil was typically present on the outside of the fecal sample, mammalian feces contains 1000 times the number of microbial cells in a typical soil sample (Sender, Fuchs, and Milo 2016; Raynaud and Nunan 2014), which overwhelms the signal of soil bacteria in our analyses (Grieneisen et al. 2021). Samples were transported from the field in Amboseli to a lab in Nairobi, freeze-dried, and then sifted to remove plant matter prior to long term storage at -80°C.”

      (23) Line 480 onwards: were negative controls included in extraction batches? Were samples randomised into extraction batches?

      Yes, we included extraction blanks. These are now described in lines 495-500. This text reads, “We included one extraction blank per batch, which had significantly lower DNA concentrations than sample wells (t-test; t=-50, p < 2.2 × 10^-16; Grieneisen et al. 2021). We also included technical replicates, which were the same fecal sample sequenced across multiple extraction and library preparation batches. Technical replicates from different batches clustered with each other rather than with their batch, indicating that true biological differences between samples are larger than batch effects.”

      (24) Were extraction, library prep, and sequencing negative controls included? Is data available?

      We included extraction blanks (described above) and technical replicates, which were the same sample sequenced across multiple extraction and library preparation batches. Technical replicates from different batches clustered with each other rather than with their batch, indicating that true biological differences between samples are larger than batch effects.

      We have updated the data availability statement to read, “All data for these analyses are available on Dryad at https://doi.org/10.5061/dryad.b2rbnzspv. The 16S rRNA gene sequencing data are deposited on EBI-ENA (project ERP119849) and Qiita (study 12949). Code is available at the following GitHub repository: https://github.com/maunadasari/Dasari_etal-GutMicrobiomeAge”.

      (25) Line 562: how were corrected microbiome delta ages calculated? Currently, the authors state x, y and z factors were corrected for, but it is unclear how this was done.

      The paragraph starting at line 577 describes how microbiome delta age was calculated. We have made only a few changes to this text because we were not sure which aspects of these methods confused the reviewer. However, briefly, we calculated sample-specific microbiome Δage in years as the difference between a sample’s microbial age estimate, age<sub>m</sub> from the microbiome clock, and the host’s chronological age in years at the time of sample collection, age<sub>c</sub>. Higher microbiome Δages indicate old-for-age microbiomes, as age<sub>m</sub> > age<sub>c</sub>, and lower values (which are often negative) indicate a young-for-age microbiome, where age<sub>c</sub> > age<sub>m</sub> (see Figure 3).
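
      The definition above can be illustrated with a minimal sketch. This is not the code from the authors' repository: the function names and the example ages are hypothetical, and the sketch covers only the raw difference age<sub>m</sub> - age<sub>c</sub>, not any covariate correction.

```python
def microbiome_delta_age(age_m: float, age_c: float) -> float:
    """Return microbiome delta age in years.

    age_m: age predicted by the microbiome clock (years).
    age_c: host chronological age at sample collection (years).
    """
    return age_m - age_c


def describe(delta_age: float) -> str:
    """Label a sample following the sign convention described in the text."""
    if delta_age > 0:
        return "old-for-age"    # age_m > age_c
    if delta_age < 0:
        return "young-for-age"  # age_c > age_m
    return "on-time"            # hypothetical label for an exact match


# Made-up example: a 10-year-old host whose sample is predicted to be 12.5 years old.
example = microbiome_delta_age(age_m=12.5, age_c=10.0)
print(example, describe(example))
```

      A negative result, for instance a predicted age of 6 years in a 9-year-old host, would be labeled young-for-age under the same convention.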

      (26) Line 579: typo 'as'.

      We have corrected this typo.

      Works Cited

      Altmann, Jeanne, Laurence Gesquiere, Jordi Galbany, Patrick O Onyango, and Susan C Alberts. 2010. “Life History Context of Reproductive Aging in a Wild Primate Model.” Annals of the New York Academy of Sciences 1204:127–38. https://doi.org/10.1111/j.1749-6632.2010.05531.x.

      Anderson, Jordan A, Rachel A Johnston, Amanda J Lea, Fernando A Campos, Tawni N Voyles, Mercy Y Akinyi, Susan C Alberts, Elizabeth A Archie, and Jenny Tung. 2021. “High Social Status Males Experience Accelerated Epigenetic Aging in Wild Baboons.” Edited by George H Perry. eLife 10 (April):e66128. https://doi.org/10.7554/eLife.66128.

      Binder, Alexandra M., Camila Corvalan, Verónica Mericq, Ana Pereira, José Luis Santos, Steve Horvath, John Shepherd, and Karin B. Michels. 2018. “Faster Ticking Rate of the Epigenetic Clock Is Associated with Faster Pubertal Development in Girls.” Epigenetics 13 (1): 85–94. https://doi.org/10.1080/15592294.2017.1414127.

      Björk, Johannes R., Mauna R. Dasari, Kim Roche, Laura Grieneisen, Trevor J. Gould, Jean-Christophe Grenier, Vania Yotova, et al. 2022. “Synchrony and Idiosyncrasy in the Gut Microbiome of Wild Baboons.” Nature Ecology & Evolution, June, 1–10. https://doi.org/10.1038/s41559-022-01773-4.

      Chen, Brian H., Riccardo E. Marioni, Elena Colicino, Marjolein J. Peters, Cavin K. Ward-Caviness, Pei-Chien Tsai, Nicholas S. Roetker, et al. 2016. “DNA Methylation-Based Measures of Biological Age: Meta-Analysis Predicting Time to Death.” Aging (Albany NY) 8 (9): 1844–59. https://doi.org/10.18632/aging.101020.

      Claesson, Marcus J., Ian B. Jeffery, Susana Conde, Susan E. Power, Eibhlís M. O’Connor, Siobhán Cusack, Hugh M. B. Harris, et al. 2012. “Gut Microbiota Composition Correlates with Diet and Health in the Elderly.” Nature 488 (7410): 178–84. https://doi.org/10.1038/nature11319.

      Galbany, Jordi, Jeanne Altmann, Alejandro Pérez-Pérez, and Susan C. Alberts. 2011. “Age and Individual Foraging Behavior Predict Tooth Wear in Amboseli Baboons.” American Journal of Physical Anthropology 144 (1): 51–59. https://doi.org/10.1002/ajpa.21368.

      Gloor, Gregory B., Jean M. Macklaim, Vera Pawlowsky-Glahn, and Juan J. Egozcue. 2017. “Microbiome Datasets Are Compositional: And This Is Not Optional.” Frontiers in Microbiology 8. https://doi.org/10.3389/fmicb.2017.02224.

      Grieneisen, Laura E., Mauna Dasari, Trevor J. Gould, Johannes R. Björk, Jean-Christophe Grenier, Vania Yotova, David Jansen, et al. 2021. “Gut Microbiome Heritability Is Nearly Universal but Environmentally Contingent.” Science 373 (6551): 181–86. https://doi.org/10.1126/science.aba5483.

      Hanski, Eveliina, Susan Joseph, Aura Raulo, Klara M. Wanelik, Áine O’Toole, Sarah C. L. Knowles, and Tom J. Little. 2024. “Epigenetic Age Estimation of Wild Mice Using Faecal Samples.” Molecular Ecology 33 (8): e17330. https://doi.org/10.1111/mec.17330.

      Heintz, Caroline, and William Mair. 2014. “You Are What You Host: Microbiome Modulation of the Aging Process.” Cell 156 (3): 408–11. http://dx.doi.org/10.1016/j.cell.2014.01.025.

      Horvath, Steve. 2013. “DNA Methylation Age of Human Tissues and Cell Types.” Genome Biology 14 (10): R115. https://doi.org/10.1186/gb-2013-14-10-r115.

      Jayashankar, Lakshmi, Kathleen M. Brasky, John A. Ward, and Roberta Attanasio. 2003. “Lymphocyte Modulation in a Baboon Model of Immunosenescence.” Clinical and Vaccine Immunology 10 (5): 870–75. https://doi.org/10.1128/CDLI.10.5.870-875.2003.

      Marioni, Riccardo E., Sonia Shah, Allan F. McRae, Brian H. Chen, Elena Colicino, Sarah E. Harris, Jude Gibson, et al. 2015. “DNA Methylation Age of Blood Predicts All-Cause Mortality in Later Life.” Genome Biology 16 (1): 25. https://doi.org/10.1186/s13059-015-0584-6.

      O’Toole, Paul W., and Ian B. Jeffery. 2015. “Gut Microbiota and Aging.” Science 350 (6265): 1214–15. https://doi.org/10.1126/science.aac8469.

      Raynaud, Xavier, and Naoise Nunan. 2014. “Spatial Ecology of Bacteria at the Microscale in Soil.” PLOS ONE 9 (1): e87217. https://doi.org/10.1371/journal.pone.0087217.

      Sadoughi, Baptiste, Dominik Schneider, Rolf Daniel, Oliver Schülke, and Julia Ostner. 2022. “Aging Gut Microbiota of Wild Macaques Are Equally Diverse, Less Stable, but Progressively Personalized.” Microbiome 10 (1): 95. https://doi.org/10.1186/s40168-022-01283-2.

      Sender, Ron, Shai Fuchs, and Ron Milo. 2016. “Revised Estimates for the Number of Human and Bacteria Cells in the Body.” PLoS Biology 14 (8): e1002533. https://doi.org/10.1371/journal.pbio.1002533.

      Tung, J, L B Barreiro, M B Burns, J C Grenier, J Lynch, L E Grieneisen, J Altmann, S C Alberts, R Blekhman, and E A Archie. 2015. “Social Networks Predict Gut Microbiome Composition in Wild Baboons.” eLife 4. https://doi.org/10.7554/eLife.05224.

      Weibel, Chelsea J., Mauna R. Dasari, David A. Jansen, Laurence R. Gesquiere, Raphael S. Mututua, J. Kinyua Warutere, Long’ida I. Siodi, Susan C. Alberts, Jenny Tung, and Elizabeth A. Archie. 2024. “Using Non-Invasive Behavioral and Physiological Data to Measure Biological Age in Wild Baboons.” GeroScience 46 (5): 4059–74. https://doi.org/10.1007/s11357-024-01157-5.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public review):

      This paper examines the role of MLCK (myosin light chain kinase) and MLCP (myosin light chain phosphatase) in axon regeneration. Using loss-of-function approaches based on small molecule inhibitors and siRNA knockdown, the authors explore axon regeneration in cell culture and in animal models from central and peripheral nervous systems. Their evidence shows that MLCK activity facilitates axon extension/regeneration, while MLCP prevents it.

      Major concerns:

      (1) In the title, authors indicate that the observed effects from loss-of-function of MLCK/MLCP take place via F-actin redistribution in the growth cone. However, there are no experiments showing a causal effect between changes in axon growth mediated by MLCK/MLCP and F-actin redistribution.

      Thank you for your comments. We revised the title of our manuscript to “MLCK/MLCP regulates mammalian axon regeneration and redistributes the growth cone F-actin”. (line 3)

      (2) The author combines MLCK inhibitors with Bleb (Figure 6), trying to verify if both pairs of inhibitors act on the same target/pathway. MLCK may regulate axon growth independent of NMII activity. However, this has very important implications for the understanding not only of how NMII works and affects axon extension, but also of what MLCP is doing. One wonders whether MLCP actions, which are opposite to those of MLCK, are also independent of NMII activity. The authors, in the discussion section, try to find an explanation for this finding, but I consider that it fails, since the whole rationale of the manuscript still revolves around how MLCK and MLCP affect NMII phosphorylation.

      Thank you for your comments. Although both MLCK and MLCP regulate the activity of NMII, it has been reported that they also govern domain-specific spatial control of actin-based motility in the growth cone. Specifically, MLCK activity is essential for arc translocation and retrograde flow within the P domain, while MLCP appears to specifically modulate arc movement and associated myosin II contractility in the T zone and C domain (Ref). Therefore, it is proposed that the regulatory mechanisms of MLCK and MLCP are highly complex during the process of axon growth. 

      [Ref]: Xiao-Feng Zhang, Andrew W Schaefer, Dylan T Burnette, Vincent T Schoonderwoert, Paul Forscher. Rho-dependent contractile responses in the neuronal growth cone are independent of classical peripheral retrograde actin flow. Neuron. 2003 Dec 4;40(5):931-44.

      What follows is a discussion of the merits and limitations of different claims of the manuscript in light of the evidence presented.

      (1) Using western blot and immunohistochemical analyses, authors first show that MLCK expression is increased in DRG sensory neurons following peripheral axotomy, concomitant to an increase in MLC phosphorylation, suggesting a causal effect (Figure 1). The authors claim that it is common that axon growth-promoting genes are upregulated. It would have been interesting at this point to study in this scenario the regulation of MLCP.

      We thank the reviewer for the positive comment on our manuscript.

      (2) Using DRG cultures and sciatic nerve crush in the context of MLCK inhibition (ML-7) and down-regulation, authors conclude that MLCK activity is required for mammalian peripheral axon regeneration both in vitro and in vivo (Figure 2). In parallel, the authors show that these treatments affect as expected the phosphorylation levels of MLC.

      The in vitro evidence is of standard methods and convincing. However, here, as well as in all other experiments using siRNAs, no Control siRNAs were used. Authors do show that the target protein is downregulated, and they can follow transfected cells with GFP. Still, it should be noted that the standard control for these experiments has not been done.

      Thank you for your comments. We used scrambled siRNA as a control. We apologize for the oversight: although the figure legends stated that scrambled siRNA was used as a control, we failed to state this clearly in the methods section. We have revised the manuscript accordingly (line 87, line 549, line, line 562, line 568).

      (3) The authors then examined the role of the phosphatase MLCP in axon growth during regeneration. The authors first use a known MLCP blocker, phorbol 12,13-dibutyrate (PDBu), to show that is able to increase the levels of p-MLC, with a concomitant increase in the extent of axon regrowth of DRG neurons, both in permissive as well as non-permissive substrates. The authors repeat the experiments using the knockdown of MYPT1, a key component of the MLC-phosphatase, and again can observe a growth-promoting effect (Figure 3).

      The authors further show evidence for the growth-enhancing effect in vivo, in nerve crush experiments. The in vivo evidence needs more support and experimental detail (see comment 2). A key weakness of the data was mentioned previously: no control siRNA was used.

      Thank you for your comments. As mentioned above, we also used scrambled siRNA as a control in the in vivo experiments.

      (4) In the next set of experiments (presented in Figure 4), the authors extend the previous observations to primary cultures from the CNS. For that, they use cortical and hippocampal cultures, and pharmacological and genetic loss-of-function using the above-mentioned strategies. The expected results were obtained in both types of CNS neurons: inhibition or knockdown of the kinase decreases axon growth, whereas inhibition or knockdown of the phosphatase increases growth. A main weakness in this set is that drugs were used from the beginning of the experiment, and hence, they would also affect axon specification. As pointed out in Materials and Methods (lines 143-145), the authors counted as "axons" neurites longer than twice the diameter of the cell soma, and hence this would not affect the variable measured. In any case, to be sure one is only affecting axon extension in these cells, the drugs should have been used after axon specification and maturation, which occurs at least after 5 DIV.

      Thank you for your comments. We acknowledge that the early administration of drugs can lead to unintended effects on neuronal polarization and axon formation. However, in line with our previous publication, we focused exclusively on measuring the longest length of the axon. To quantify axon length, we selected neurons exhibiting an axonal process exceeding twice the diameter of their cell body and measured the longest axon from 100 neurons for each condition (Ref 1, Ref 2). Consequently, we believe that drug administration at the onset of cell culture influences axon formation; however, it does not significantly affect the drug's impact on axon length.

      [Ref 1]: Chang-Mei Liu, Rui-Ying Wang, Saijilafu, Zhong-Xian Jiao, Bo-Yin Zhang, Feng-Quan Zhou. MicroRNA-138 and SIRT1 form a mutual negative feedback loop to regulate mammalian axon regeneration. Genes Dev. 2013 Jul 1;27(13):1473-83.

      [Ref 2]: Eun-Mi Hur, Saijilafu, Byoung Dae Lee, Seong-Jin Kim, Wen-Lin Xu, Feng-Quan Zhou. GSK3 controls axon growth via CLASP-mediated regulation of growth cone microtubules. Genes Dev. 2011 Sep 15;25(18):1968-81.
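
      The selection rule described in the reply above (an axonal process longer than twice the soma diameter, longest axon measured from 100 neurons per condition) can be sketched as follows. This is a hypothetical illustration, not the authors' analysis pipeline: the function name, the tuple layout, and the example measurements are assumptions.

```python
def select_axon_lengths(neurons, n_neurons=100):
    """Return the longest-axon lengths of neurons that pass the criterion.

    neurons: iterable of (longest_neurite_um, soma_diameter_um) pairs
             (hypothetical input layout).
    n_neurons: number of qualifying neurons kept per condition
               (100 in the text above).
    """
    selected = [
        length
        for length, soma_diameter in neurons
        if length > 2 * soma_diameter  # axon must exceed twice the soma diameter
    ]
    return selected[:n_neurons]


# Made-up measurements in micrometres; n_neurons is reduced for brevity.
measurements = [(50.0, 20.0), (30.0, 20.0), (120.0, 25.0), (40.0, 20.0)]
print(select_axon_lengths(measurements, n_neurons=2))
```

      Neurons whose longest process does not exceed the threshold are excluded before any length is recorded, which is why the criterion affects which cells are counted rather than the lengths measured.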

      (5) In Figure 7, the authors suggest a local cytoskeletal action of the drug, but the evidence provided does not differentiate between a localized action of the drugs and a localized cell activity.

      We appreciate the reviewer’s insightful comments and have revised our title to “MLCK/MLCP Regulates mammalian axon regeneration and redistributes growth cone F-actin.” Furthermore, we have made corresponding revisions to the manuscript (line 31, line 73).

      References:

      (1) Eun-Mi Hur, In Hong Yang, Deok-Ho Kim, Justin Byun, Saijilafu, Wen-Lin Xu, Philip R Nicovich, Raymond Cheong, Andre Levchenko, Nitish Thakor, Feng-Quan Zhou. 2011. Engineering neuronal growth cones to promote axon regeneration over inhibitory molecules. Proc Natl Acad Sci U S A. 2011 Mar 22;108(12):5057-62. doi: 10.1073/pnas.1011258108.

      (2) Garrido-Casado M, Asensio-Juárez G, Talayero VC, Vicente-Manzanares M. 2024. Engines of change: Nonmuscle myosin II in mechanobiology. Curr Opin Cell Biol. 2024 Apr;87:102344. doi: 10.1016/j.ceb.2024.102344.

      (3) Karen A Newell-Litwa, Rick Horwitz, Marcelo L Lamers. 2015. Non-muscle myosin II in disease: mechanisms and therapeutic opportunities. Dis Model Mech. 2015 Dec;8(12):1495-515. doi: 10.1242/dmm.022103.

      Reviewer #2 (Public review):

      Summary:

      Saijilafu et al. demonstrate that MLCK/MLCP proteins promote axonal regeneration in both the central nervous system (CNS) and peripheral nervous system (PNS) using primary cultures of adult DRG neurons, hippocampal and cortical neurons, as well as in vivo experiments involving sciatic nerve injury, spinal cord injury, and optic nerve crush. The authors show that axon regrowth is possible across different contexts through genetic and pharmacological manipulation of these proteins. Additionally, they propose that MLCK/MLCP may regulate F-actin reorganization in the growth cone, which is significant as it suggests a novel strategy for promoting axonal regeneration.

      Strengths:

      This manuscript presents a wide range of experimental models to address its hypothesis and biological question. Notably, the use of multiple in vivo models significantly enhances the overall validity of the study.

      We thank the reviewer for the positive comment on our manuscript.

      Weaknesses:

      - The authors previously published that blocking myosin II activity stimulates axonal growth and that MLCK activates myosin II. The present work shows that inhibiting MLCK blocks axonal regeneration while blocking MLCP (the protein that dephosphorylates myosin II) produces the opposite effect. Although this contradiction is discussed, no new evidence has been added to the manuscript to clarify this mechanism or address the remaining questions. Critical unresolved questions include: what happens to myosin II expression when both MLCK and MLCP are inhibited? If MLCK/MLCP are acting through an independent mechanism, what would that mechanism be?

      - In the discussion, the authors mention the existence of two myosin II isoforms with opposing effects on axonal growth. Still, there is no evidence in the manuscript to support this point.

      - It is also unclear how MLCK/MLCP acts on the actin cytoskeleton. The authors suggest that proteins such as ADF/cofilin, Arp 2/3, Eps8, Profilin, Myosin II, and Myosin V could regulate changes in F-actin dynamics. However, this study provides no experimental evidence to determine which proteins may be involved in the mechanism.

      Thank you for your comments. Axon growth is an exceptionally intricate process, facilitated by the coordinated regulation of gene expression in the soma, axonal transport along the shaft, and the assembly of cytoskeletal elements and membrane proteins at the growth cone. In this paper, our results primarily demonstrate that MLCK/MLCP plays a crucial role in regulating mammalian axon regeneration and redistributing F-actin within the growth cone; however, we did not investigate which specific proteins act downstream of MLCK/MLCP during axon regeneration.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      - A title more suitable for the evidence shown can be: MLCK/MLCP regulates mammalian axon regeneration and redistributes the growth cone F-actin.

      Thank you for your comments. We revised the title of our manuscript to “MLCK/MLCP regulates mammalian axon regeneration and redistributes the growth cone F-actin” (line 3).

      - In Figure 3, it would be useful to indicate in the figure legend that the red arrow is pointing to a suture that was performed during surgery to clearly mark the injury site.

      Thank you for your comments. We revised the Figure 3 legend to indicate that the red arrow points to a suture that was performed during surgery to clearly mark the injury site (line 571-572).

      - The following is a concern raised in the previous round, and that the response by the authors was so complete and accurate that I consider it would be useful to include it in the discussion section.

      Thank you for your comments. We included those contents in the discussion section of our revised manuscript (line 348-354, line 355-359).

      The author combines MLCK inhibitors with Bleb (Figure 6), trying to verify if both pairs of inhibitors act on the same target/pathway. The rationale is wrong for at least two reasons.

      a- Because both lines of evidence point to contrasting actions of NMII on axon growth, one approach could never "rescue" the other.

      Reply by authors in R1: If MLCK regulates axon growth through the activation of Myosin, the inhibitory effect of ML-7 (an MLCK inhibitor) on axon growth might be influenced by Bleb, a NMII inhibitor. However, our findings reveal that the combination of Bleb and ML-7 does not alter the rate of axon outgrowth compared to ML-7 alone. This suggests that the roles of ML-7 and Bleb in axon growth are independent. It means MLCK may regulate axon growth independent of NMII activity.

      b- Because the approaches target different steps on NMII activation, one could never "prevent" or rescue the other. For example, for Bleb to provide a phenotype, it should find any p-MLC, because it is only that form of MLC that is capable of inhibiting its ATPase site. In light of this, it is not surprising that Bleb is unable to exert any action in a situation where there is no p-MLC (ML-7, which by inhibiting the kinase drives the levels of p-MLC to zero, Figure 4A). Hence, the results are not possible to validate in the current general interpretation of the authors. (See 'major concern').

      Reply by authors in R1: The reported mechanism of blebbistatin is not through competition with the ATP binding site of myosin. Instead, it selectively binds to the ATPase intermediate state associated with ADP and inorganic phosphate, which decelerates the phosphate release. Importantly, blebbistatin does not impede myosin's interaction with actin or the ATP-triggered disassociation of actomyosin. It rather inhibits the myosin head when it forms a product complex with a reduced affinity for actin. This indicates that blebbistatin functions by stabilizing a particular myosin intermediate state that is independent of the phosphorylation status of myosin light chain (MLC).

      [Ref] Kovács M, Tóth J et al. Mechanism of blebbistatin inhibition of myosin II. J Biol Chem. 2004 Aug 20;279(34):35557-63.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      In this manuscript, the authors highlight the importance of the Golgi apparatus during SARS-CoV-2 infection. Specifically, using different compounds able to alter Golgi structure and function, the authors show a strong reduction in SARS-CoV-2 infection rate. In particular, it is interesting to observe that treatments of 24 hrs with BFA strongly impair viral infection, highlighting the importance of Golgi function for this virus. Although the time of treatment is different, this observation is in contrast with previous studies on related coronaviruses (Ghosh et al., 2020) that did not observe any effect upon treatment with BFA. This might imply that SARS-CoV-2 relies more on conventional trafficking pathways than other coronaviruses, which, under certain conditions, favour different trafficking routes.

      We thank the reviewer for the positive comments. Indeed, our results with BFA treatment for 24 hours are inconsistent with previous studies based on the prototype coronavirus MHV (Ghosh et al., 2020). To validate this observation, we have now performed new experiments with BFA treatment for 4, 6, and 8 hours, matching the time points used in the previous study (Ghosh et al, 2020). Our new results show that BFA treatment at these early time points significantly inhibits SARS-CoV-2 assembly and secretion, as measured by immunoblotting and TCID50 assays, without reducing intracellular viral RNA levels, which serve as a marker of genome replication. This implies that Golgi function and an intact ER-to-Golgi trafficking route are required for SARS-CoV-2 assembly and secretion. These new results are now presented as new Fig. 2C-H.

      The authors additionally observed that viral infection increases TGN46 levels while decreasing GRASP55 levels. To dissect the role of TGN46 and GRASP55, the authors performed several infection studies in cells in which the levels of the two proteins were modulated either by overexpression (GRASP55) and/or siRNA-mediated knock-down (GRASP55 and TGN46). Those approaches suggest that overexpression of GRASP55, a protein essential for Golgi stack formation, decelerates viral trafficking and inhibits viral assembly, while its depletion reverses the effects. On the other hand, TGN46 knock-down impairs viral trafficking but not assembly. Overall, the study clearly shows the importance of the Golgi during SARS-CoV-2 infection and also shows that modulation of those two factors affects viral infection.

      We appreciate the reviewer's accurate summary of our work and positive comments.

      However, the claims that specifically trafficking (TGN46) and trafficking and assembly (GRASP55) are affected are not fully substantiated. Regarding GRASP55, the authors state that viral infection decreases GRASP55 levels and that this results in Golgi fragmentation. However, the decrease in GRASP55 levels is shown at 24 hrs post infection, while Golgi fragmentation occurs as early as 5 hrs. Thus there might be no direct causal link between the two effects.

      We agree with the reviewer that GRASP55 downregulation is unlikely to be the only reason for Golgi fragmentation in the infected cells. In our results, infection for 5 or 8 hours caused only mild Golgi fragmentation (Fig. S6D), while infection for 24 hours led to severe Golgi fragmentation. On the other hand, GRASP55 is likely to play a relevant role, as SARS-CoV-2-induced Golgi fragmentation can be partially rescued by exogenous GRASP55 expression (Fig S6C). We have modified the text in lines 303-305 accordingly to acknowledge the possibility that other factors also contribute to Golgi fragmentation in infected cells.

      Additionally, the authors show that overexpression of GRASP55 rescues Golgi fragmentation, as observed by imaging; however, it is not clear if only infected cells were quantified and if they had the same level of infection.

      Yes, only infected cells with either GFP or GRASP55-GFP expression were quantified. The viral infection rate was significantly lower in GRASP55-GFP expressing cells compared to GFP expressing cells (Fig 5A-B).

      The authors exclude an effect on entry based on experiments with Spike-expressing pseudovirus in 293-ACE2 cells; however, they also clearly observe a reduction of ACE2 on the membrane of GRASP55-expressing cells (Fig S6B). How can they explain this discrepancy, and how can a defect in entry be fully ruled out in these cell lines?

      We thank the reviewer for pointing this out. This discrepancy is likely due to the different systems used in the two experiments.

      In the pseudovirus entry assay, ACE2 was exogenously expressed in 293T cells and GRASP55 expression did not show any effect on the viral entry efficiency. In contrast, Huh7-ACE2 cells were selected for a high surface expression of ACE2. While GRASP55 expression reduces surface ACE2 levels as shown in our cell surface biotinylation assay, we believe that the surface ACE2 levels in GRASP55-expressing cells remain sufficient to support viral entry. To further investigate whether GRASP55 expression affects viral entry using authentic SARS-CoV-2, we performed RT-qPCR analysis of intracellular RNA level of the spike, N, and RdRp in both GFP and GRASP55-GFP expressing cells 4 hours post infection (new Fig 5D). Our results show that GRASP55 expression does not affect SARS-CoV-2 entry efficiency, even though it reduces ACE2 surface expression levels.

      It is not clear which process the authors refer to when they write about "viral trafficking". Is it virion trafficking or viral protein trafficking? The two processes are linked but are not the same. This oversimplification can be misleading. For instance, the authors show that overexpression of GRASP55 decreases Spike protein on the plasma membrane and its depletion increases S protein incorporation into pseudoviruses. However, it was shown that in infected cells S protein is mainly retained at the ERGIC by M and E (Boson et al., 2021), where viral assembly occurs. Thus an increase in S trafficking to the PM does not correlate with an increase in virion trafficking,

      We agree with the reviewer that our use of the term "viral trafficking" is imprecise, and we have changed this throughout the manuscript to be more specific. S trafficking to the PM may not necessarily equal an increase in virion trafficking, and we have rephrased these terms in our writing accordingly.

      We acknowledge that our cell surface biotinylation assay results only demonstrate that GRASP55 overexpression slows down spike protein trafficking to the PM. We have accordingly also examined viral protein and infectious particle secretion into the culture medium as a more direct readout of virion trafficking (new Fig 2E, 2H, 6K, and 7P).

      Finally, we have removed all of the data describing spike incorporation into pseudoviruses as we acknowledge that plasma membrane assembly of lentiviruses is not a good model for SARS-CoV-2 assembly.

      ...and ultimately, the data provided do not fully support the authors claim on a modulation of "virion trafficking" in response to GRASP or TGN46 changes, since no experiments clearly show a change in virions secretion.

      In response to the above comment, we provide the following clarification: Our Western blotting, TCID50 assay, and plaque assay results collectively demonstrate that SARS-CoV-2 virion secretion is reduced in GRASP55 expressing cells (new Fig 5E-M) and in TGN46-depleted cells (new Fig 7F-H, 7L-N). Conversely, viral assembly and secretion appear to be increased in GRASP55-depleted cells (new Fig 6A, 6E-I) at 24 hpi. Furthermore, within a single viral secretion cycle (10 hpi), GRASP55 depletion increased viral secretion (new Fig 6K), while TGN46 depletion reduced viral secretion (new Fig 7P). These findings strongly support the conclusion that GRASP55 and TGN46 modulate viral secretion.

      Importantly, the authors do not rule out potential effects of their perturbations on genome replication. The only experiment that they perform in this direction is presented in Fig. S7B, where the authors show a similar percentage of infected cells at an early stage upon silencing of GRASP55. The experiment suggests that productive entry is similar in these conditions, but quantification of the intracellular viral genome could exclude a change in viral replication. If no changes in viral replication are observed, the authors could verify an increase in particle secretion by collecting supernatants from the early time points and performing plaque assays and quantification of viral genomes by qRT-PCR, to prove that modulation of GRASP55 indeed promotes SARS-CoV-2 trafficking.

      We thank the reviewer for the excellent suggestions. In response, we performed RT-qPCR analysis in GRASP55-expressing and TGN46-depleted cells at 4 hpi to compare the viral genome replication process. Additionally, we performed western blotting analysis and released viral titer assay of the culture media from both GRASP55-depleted and TGN46-depleted cells at 10 hpi to investigate virion release. Our new results show that GRASP55 depletion increases viral secretion (new Fig. 6K), while TGN46 depletion reduces viral secretion (new Fig. 7P). Furthermore, GRASP55 expression and TGN46 depletion do not perturb viral genome replication (new Fig. 5D and new Fig. 7R).

      Finally, whenever a reduction of viral infection is observed upon cell perturbation, a robust analysis of cell viability should be presented to exclude pleiotropic effects, especially in the presence of multiple perturbations that might affect cell metabolism. The authors should carefully control cell viability and growth in response to depletion of TGN46 and GRASP55.

      We thank the reviewer for the excellent suggestions, which were also pointed out by reviewer #3. To address this, we performed the LDH cytotoxicity assay under SARS-CoV-2 infection conditions with TGN46 depletion and GRASP55 depletion/expression (new Fig. 5C, 6L, 7Q). Our new results show that no significant cell death was induced by TGN46 depletion, GRASP55 depletion/expression, or other perturbations.

      Minor: show data on viability of the drug and add the relative section in Material and Methods.

      We performed LDH assays of SARS-CoV-2 infected Huh7-ACE2 cells treated with 9 small molecules, and LDH release levels were similar across all treatments (new Fig. S3C). Additionally, a CellTiter Glo viability assay of 293T-ACE2 cells did not show any significant effect on cell viability with small molecule treatment (new Fig. S3F). Detailed descriptions of these assays have been included in the Materials and Methods section.

      Figure 3A: should read spike and not nucleocapsid reported for SARS-CoV-2

      Fig. 3A labeling is correct - cells were labeled with antibodies for GRASP65 (rabbit) and for nucleocapsid (mouse).

      Lack of inhibition with camostat correlates with lack of TMPRSS2 in the Huh7 cells. The sentence seems to be too general while in this case the effect is clearly cell specific. Similarly, the importance of the lysosome in viral entry is restricted to cells lacking TMPRSS2 and cannot be generalized since CQ, for example, does not work in Calu-3 cells that express TMPRSS2.

      We agree with the reviewer and have added one sentence in lines 182-184: "The relatively smaller effect of camostat mesylate observed here, compared to previous studies (Hoffmann et al, 2021), might be due to the use of different cell lines across studies." We also discussed the discrepancy of CQ treatment between our Huh7-ACE2 cells and Calu-3 cells (Hoffmann et al, 2020) in lines 466-473.

      Typo: Fig S3B - Y axis should read viral not vrial

      Thank you - we have corrected this.

      S3C: concentrations of the compound used in the assay should be reported. Was a viability assay performed also in the 293T-ACE2 cell line?

      We thank the reviewer for the suggestion. We have added the concentration information to the legend in Fig. S3E "Cell entry assay of 293T or 293T-ACE2 cells by SARS-CoV-2 Spike pseudotyped lentivirus for 24h in the presence of indicated molecules at the same concentrations as in Fig. 2A." Additionally, we performed a CellTiter Glo assay to assess the viability of 293T-ACE2 cells treated with the 9 molecules. The results demonstrate that treatment with these 9 molecules does not alter cell viability (Fig. S3F).

      Significance

      Overall, the major strength of the manuscript is that it has clarified the importance of the Golgi during SARS-CoV-2 infection. The drug screening demonstrates that for SARS-CoV-2 the conventional secretion seems to have a major role with respect to other secretory routes observed for other coronaviruses. It is also clear that the two factors identified by the authors have a role in viral infection; however, the major limitation is that the authors failed to clearly highlight which step/s of the viral life cycle are modulated upon GRASP55 and TGN46 perturbation. In particular, the claim on "trafficking" is not fully substantiated, since the only experiment in this direction is the transport of Spike protein to the plasma membrane upon GRASP55 overexpression. It is risky to conclude that the trafficking of a single protein reflects the intracellular trafficking of the virions.

      Several of the findings presented in the first part of the manuscript have been previously reported (for example the fragmentation of the Golgi upon SARS-CoV-2 infection); however, the role of GRASP55 and TGN46 in SARS-CoV-2 infection has been reported here for the first time. This manuscript can be of interest for a broad audience considering the topic (cell biology, host-pathogen interactions and molecular virology).

      My expertise resides in the field of molecular virology, especially in the context of the mechanisms of viral replication and host-pathogen interactions.

      We thank the reviewer for the overall positive comments and excellent suggestions. We hope that our new results have convincingly demonstrated that viral trafficking is regulated by GRASP55 and TGN46.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: In this study, Zhang and colleagues address the impact of SARS-CoV-2 infection on the morphology of the Golgi apparatus and convincingly demonstrate a fragmentation of this organelle in infected cells. Conversely, they show that the modulation of TGN46 or GRASP55 expression, two components of this organelle, impacts SARS-CoV-2 replication. By monitoring the relative levels of viral Spike and nucleocapsid in the cell supernatants, they conclude that GRASP55 regulates particle assembly and trafficking while TGN46 controls only secretion. The study was generally well performed, and the quality of the microscopy and western blot data is good. It was appreciated that all the phenotypes were robustly quantified. I believe that this study is potentially interesting and relevant for the SARS-CoV-2 community since it provides an extensive characterization of the interplay between SARS-CoV-2 and the Golgi apparatus.

      We thank the reviewer for the positive comments.

      However, as described below, I have some concerns regarding the interpretations of some of the key conclusions. Moreover, the fact that it was already described by several groups that the Golgi is a key machinery used for SARS-CoV-2 virion assembly (ERGIC) and secretion dampens my enthusiasm about the study, especially without clear molecular mechanisms about the interplay between SARS-CoV-2 proteins and TGN46/GRASP55.

      We rephrased some sentences following the reviewer's suggestions. Although it was believed that SARS-CoV-2 is assembled at the ERGIC, there has been significant controversy surrounding the virion secretion pathway. Our results strongly support that SARS-CoV-2 virions traffic through the Golgi apparatus and that an intact ER-to-Golgi trafficking pathway is essential for SARS-CoV-2 assembly and secretion. Manipulation of two Golgi-resident proteins, GRASP55 and TGN46, significantly regulates SARS-CoV-2 secretion. Interestingly, GRASP55 regulates both assembly and secretion of SARS-CoV-2, while TGN46 exclusively modulates viral secretion. This is consistent with their subcellular localization, as GRASP55 is localized to the medial/trans Golgi, whereas TGN46 is localized to the TGN. We hope that our new experimental results (Figs. 2C-H, 5C-D, 6J-L, and 7O-R) have addressed all concerns from the reviewer. Identification of downstream protein targets involved in TGN46/GRASP55-mediated modulation of SARS-CoV-2 trafficking will be the focus of our future studies.

      Major comments: -All the assays have been performed in liver-derived Huh7 cells overexpressing the SARS-CoV-2 receptor ACE2 (for infection) or kidney 293 cells (for pseudotyped HIV entry assays). However, no conclusion was validated in lung-derived cells (like A549-ACE2, Calu-3 or primary cells), which would be important since the respiratory tract is the main target of SARS-CoV-2.

      In our study, Huh7-ACE2 cells are sorted for the high expression of endogenous ACE2 protein, and we did not overexpress ACE2 protein. Also, the liver has been reported to be a site of SARS-CoV-2 infection in humans (Barnes, 2022). We did use A549 and Calu-3 cells in pilot experiments; A549 cells displayed infection rates that were too low for our purposes, and Calu-3 cells showed both low infection rates and relatively disorganized Golgi in the absence of viral infection. We were able to add new IF results from Calu-3 cells. Consistent with our findings in Huh7-ACE2 cells, SARS-CoV-2 infection disrupts Golgi structure and alters protein levels of TGN46 and GRASP55 in Calu-3 cells (new Fig. S5R-W). We also confirmed GRASP55 downregulation and TGN46 upregulation in VeroE6 cells (Fig. S5K-N).

      -Fig2: The impact of the drugs on replication was assessed by measuring the % of infected cells. At 24 hpi, I am unsure about what this value is supposed to measure (the whole life cycle, intracellular replication or spread?), especially since it is not indicated when the drugs were added to the cells. Was it during, before or after the infection? This information should be provided.

      Fig. 2 refers to infection, not replication. We agree that infection encompasses multiple steps of the viral cycle. In our experiments, cells were treated with the drugs immediately before viral infection. We have added the information into the Fig. 2 legend.

      If the "Golgi" drugs impact egress only (as inferred by the genetic modulation phenotypes), I would expect that at this early time point, the % of infection would not drastically change (as well as intracellular RNA) but that the extracellular infectious titers would decrease. Plaque assays (or TCID50 assays) and RT-qPCR on intracellular viral RNA should be conducted to better understand the impact of drug treatments.

      This is a great suggestion! As the reviewer expected, our new BFA time-point assay shows that at early time points, the intracellular RNA levels for S, N and RdRp are not reduced. However, the extracellular N protein (measured by WB) and viral titer (measured by TCID50 assay), which serve as readouts for virion secretion, are significantly decreased (new Fig. 2C-H).

      On page 10, it is said that the virus makes three cycles of replication within 24 hours following infection. On what data is this based? This seems a lot. If this is true (and shown in Huh7-ACE2 cells), does the assay of figure 2 measure spread in general? More importantly, despite being mentioned, the cell viability data are not provided. It is important to show them to ensure that the drugs are not toxic at the tested concentrations.

      It has been reported that a single cycle of SARS-CoV-2 infection is approximately 8 hours (Eymieux et al, 2021). Therefore, Fig. 2 represents a multicycle infection, reflecting a composite measure of viral infection and spread. Under the microscope, we did not observe dramatic cell death at the tested concentration. To further assess cytotoxicity, we performed a cell toxicity assay for the 9 small molecules that inhibit viral infection of Huh7-ACE2 cells. The results show that no or minor cell death was observed with all these compounds (Fig. S3C).
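
      The timing argument above can be checked with a short sketch. It assumes a fixed cycle length of about 8 hours, as cited in the response; the function name and the idea of counting only completed cycles are illustrative assumptions, not part of the manuscript's analysis.

```python
def completed_cycles(hours_post_infection: float, cycle_hours: float = 8.0) -> int:
    """Number of full infection cycles that fit into the elapsed time.

    cycle_hours defaults to the ~8 h single-cycle estimate quoted in the
    response; real cycle lengths vary with cell type and MOI.
    """
    if cycle_hours <= 0:
        raise ValueError("cycle_hours must be positive")
    return int(hours_post_infection // cycle_hours)


# Time points used in the response: 4 hpi, 10 hpi and 24 hpi.
for hpi in (4, 10, 24):
    print(f"{hpi} hpi: {completed_cycles(hpi)} completed cycle(s)")
```

      Under this assumption, the 4 hpi measurements fall within the first cycle, whereas the 24 hpi readout in Fig. 2 spans roughly three cycles and therefore reflects both infection and spread.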

      -I appreciated the extensive confocal microscopy analysis performed by the authors, which seems of high quality and overall, very convincing. They clearly show that SARS-CoV-2 infection induces the fragmentation of the Golgi apparatus although it was reported by others before as mentioned by the authors.

      We thank the reviewer for the positive comments. We agree that Golgi fragmentation was observed during SARS-CoV-2 infection, as we mentioned. However, our study provides a comprehensive and systematic analysis of the entire host cell endomembrane system in the response to viral infection.

      However, it was hard for me to make the functional link between these data and those related to GRASP55 and TGN46 overexpression/knockdown. First, the authors should assess the morphology of the Golgi apparatus in Huh7-ACE2 when GRASP55 is knocked down/out or when TGN46 is overexpressed. Second, in these 2 conditions that favor replication, it should be assessed whether this correlates with Golgi fragmentation. Even if this was probably shown before, it is relevant to show that these genetic modulations induce Golgi reshaping in this particular cell type by confocal microscopy (and ideally electron microscopy).

      Thank you for the suggestion. We performed IF analysis to assess Golgi morphology in Huh7-ACE2 cells under conditions of GRASP55 knockdown or TGN46 overexpression. Our results show that GRASP55 depletion disrupts Golgi structure (Fig. S7D), whereas TGN46 expression does not significantly alter the Golgi morphology (Fig. S8D).

      -The fact that GRASP55-GFP expression decreases the cell surface levels of ACE2, the receptor of Spike, in 293T cells (Fig S6) raises concern that the effect of GRASP55 is not specific to the virus and suggests that the whole secretory pathway is altered, while an impairment of virus entry should be expected in this cell line. Is there a similar trend in Huh7-ACE2?

      Reviewer 1 raised a similar question regarding viral entry efficiency. Fig. S6B, performed in Huh7-ACE2 cells, shows that GRASP55-GFP expression also decreases ACE2 surface level in these cells. To further assess whether GRASP55 expression affects viral entry, we performed RT-qPCR analysis of viral RNA at early time points of infection. We found that authentic SARS-CoV-2 entry efficiency was not altered by GRASP55 expression (new Fig. 5D). Although GRASP55 overexpression does alter the secretory pathway, we want to point out that SARS-CoV-2 infection downregulates endogenous GRASP55 expression. We have used GRASP55 overexpression as a probe to assess the effects of GRASP55 on the secretory pathway and on SARS-CoV-2 virion trafficking, but this does not actually reflect what is observed in SARS-CoV-2 infection.

      In addition to addressing the functionality of the secretory machinery in Huh7-ACE2, it would be relevant to repeat the cell surface labelling in the context of pseudotyped virus production with other viral envelopes such as VSV G protein or HIV gp41/gp120. If the phenotype is specific to Spike trafficking, the cell surface abundance of these alternative viral proteins should not be impacted by GRASP55 overexpression. Otherwise, this would indicate a general effect on the secretory pathway. Besides, since HIV Gag is directed directly to the plasma membrane during particle assembly without entering the secretory pathway, I am not convinced that upstream alteration on nucleocapsid assembly at the ERGIC should be excluded. Indeed, changes on the S/N ratios are generally mild and I feel that this cannot explain the phenotypes in the extracellular infectious titers.

      We have removed the original figure because we acknowledge that HIV Gag is directed directly to the plasma membrane, which is different from the trafficking of SARS-CoV-2 spike protein. We appreciate the reviewer's recognition of the difference in extracellular infectious titers between GFP and G55-GFP expressing cells. We hypothesize that GRASP55 expression not only reduces the number of spikes on each virion but also inhibits the secretion of SARS-CoV-2, resulting in a significantly lower extracellular infectious titer. We agree that it would be interesting to test whether GRASP55 expression affects viral production with other viral envelopes. However, this is beyond the scope of the current study and represents a promising direction for future research.

      More generally, the comparison between trafficking and assembly should be better assessed and not simply based on extracellular N and S levels. It was hard to see the differences between the two in terms of phenotypes. The authors should at least measure the intracellular infectivity upon TGN46 and GRASP55 knockdown and overexpression as well as intracellular vRNA abundance as a readout of RNA replication (which is anticipated to remain unchanged).

      We thank the reviewer for the valuable suggestions. We performed RT-qPCR analysis of Spike, N, and RdRp at early time points of infection. The new results show that neither GRASP55 expression (new Fig. 5D) nor TGN46 depletion (new Fig. 7R) affects viral RNA abundance at an early infection timepoint (4 hpi). Also, we found that GRASP55 depletion increased intracellular infectivity (new Fig. 6J) while TGN46 depletion did not affect intracellular infectivity (new Fig. 7O), suggesting that GRASP55 modulates viral assembly but TGN46 does not.

      -Finally, mechanistic insight about the viral determinants regulating the morphology of the Golgi would significantly strengthen the study.

      Fig S6 shows that S expression decreases ACE2 surface levels? If so, could some S mutants be tested? Does it correlate with Golgi fragmentation? Do other viral structural proteins contribute to Golgi morphological alterations?

      We thank the reviewer for the suggestions. These are indeed interesting experiments, but we believe that investigating viral determinants of Golgi fragmentation should be pursued by future studies.

      Along the same lines, how do GRASP55 and TGN46 regulate replication? The link with Golgi morphology is unclear. Are these proteins hijacked by SARS-CoV-2?

      Our new data in this revised manuscript more clearly define the stages in the viral infection cycle that are modulated by GRASP55 and TGN46. New Fig. 5D and Fig. 7R show that neither GRASP55 nor TGN46 affects viral entry or early viral replication. However, GRASP55 perturbation modulates viral assembly and secretion, while TGN46 perturbation affects virion secretion but not assembly. Fig. S6C shows that GRASP55 overexpression in the presence of the virus partially rescues Golgi fragmentation. The mechanisms by which GRASP55 and TGN46 are hijacked by SARS-CoV-2 will be explored in future studies.

      Page 13 mentions some relevant mutants that could be assessed in this context and provide mechanistic insights.

      It would be interesting to investigate the effects of GRASP55 mutants or specific domains on SARS-CoV-2 trafficking, which we plan to explore in future studies.

      Minor comments: -The signal of calreticulin in Fig. S1 is too low to appreciate its distribution.

      We have increased the intensity of calreticulin staining for both uninfected and infected cells in parallel in Fig. S1. Thank you.

      -Fig 4K, Q: The differences in LC3 forms levels are not convincing. These results do not allow to draw any conclusion about autophagy, especially considering that this was done at steady-state and that the autophagic flux was not measured. Indeed, a bafilomycin A treatment control would be required to measure the real induction of autophagosomes. Lysosomal degradation inhibition allows the detection of LC3 accumulation.

      We agree that additional experiments are needed to demonstrate autophagic flux alteration by SARS-CoV-2. We observed an increase in the LC3II/LC3I ratio in infected cells at steady state and did not explore this further, since this is not the main focus of this study. Therefore, we have removed the LC3 blots and quantification from Figs. 4 and S5.

      -In the GRASP55 overexpression and TGN46 knockdown studies, associated cell viability should be measured to confirm that these genetic manipulations do not induce any cytotoxicity which may impact viral replication.

      We appreciate the reviewer's suggestions. We performed the LDH cytotoxicity assay under SARS-CoV-2 infection with TGN46 depletion or GRASP55 expression. Our new results show that TGN46 depletion or GRASP55 depletion/expression did not induce significant cell death (Figs. 5C, 6L, and 7Q).

      -The authors should test the impact of GRASP55 and GRASP65 knock-out on SARS-CoV-2 replication

      Investigating the genetic GRASP55 knockout effect on SARS-CoV-2 replication would be valuable. However, ACE2 protein expression in our Huh7-ACE2 cells decreases with cell passages, making knockout construction on this background impractical due to low ACE2 levels and poor viral infection rates. We believe that both our GRASP55 overexpression and depletion assays sufficiently support its role in SARS-CoV-2 trafficking. Future studies will explore GRASP55 knockout in different cell lines.

      -The authors should provide more details about the USA-WA1/2020 isolate in the Methods section. Is it related to the "Wuhan" strain or the variant which spread globally in early 2020 (with the D614G mutation in Spike)?

      USA-WA1/2020 was isolated from an oropharyngeal swab from a patient who returned from China and developed COVID-19 on January 19, 2020, in Washington, USA. It is related to the "Wuhan" strain but does not have D614G mutation in spike. Additional details have been added to the Methods section.

      -Fig 8: The combined modulation of GRASP55 and TGN46 expressions does not really seem additive to me since a 70% decrease of either protein modulation is observed while the combined condition brings this value to 75% in TCID50 assays. This does not bring much insight to the study in my opinion. I would suggest that the authors consider removing this figure.

      We agree with the reviewer's recommendation and have removed Fig. 8.
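
      The reviewer's arithmetic on additivity can be made explicit with a minimal sketch. It assumes that the two perturbations act independently, so that the remaining titer fractions multiply (a Bliss-independence style model); this model is an assumption introduced here for illustration and is not an analysis from the manuscript. The function name is hypothetical, and the 70% and 75% values are the approximate figures quoted in the reviewer's comment.

```python
def expected_combined_reduction(reduction_a: float, reduction_b: float) -> float:
    """Expected fractional reduction if two perturbations act independently.

    Each argument is a fractional reduction between 0 and 1. Under
    independence, the remaining fractions multiply.
    """
    remaining = (1.0 - reduction_a) * (1.0 - reduction_b)
    return 1.0 - remaining


# Approximate values quoted by the reviewer for the TCID50 assays.
expected = expected_combined_reduction(0.70, 0.70)
observed = 0.75

print(f"expected under independence: {expected:.2f}")
print(f"observed in combination:     {observed:.2f}")
```

      Under this assumption, two independent 70% reductions would be expected to give a reduction of about 91%, noticeably more than the roughly 75% noted by the reviewer, which is consistent with the reviewer's reading that the combined condition adds little.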

      Reviewer #2 (Significance (Required)):

      General assessment and advance: The study was generally well performed, and the quality of the microscopy and western blot data is good. It was appreciated that all the phenotypes were quantified extensively. However, I have some concerns regarding the interpretations of some of the key conclusions. Moreover, the fact that it was already described by several groups that the Golgi is a key machinery for SARS-CoV-2 virion assembly (ERGIC) and secretion dampens my enthusiasm about the study. In addition, the antiviral activity of several tested drugs was also reported elsewhere. A clear mechanism of how SARS-CoV-2 induces a fragmentation of the Golgi would strengthen the study. Along the same lines, it is unclear how TGN46 and GRASP55 regulate the late steps of the life cycle. The link between SARS-CoV-2-induced Golgi fragmentation and TGN46/GRASP55 is unclear. In my opinion, the data did not allow a clear discrimination between virion assembly and egress. I was not convinced that it was not simply due to a general disruption of the secretory pathway (as attested by ACE2 downregulation upon GRASP55 overexpression).

      Targeted audience: This study will be of high interest for molecular virologists (not only those working on SARS-CoV-2) but could very well fit into the scope of molecular/cell biology-focused generalist journals.

      Reviewer expertise: Molecular virology, virus-host interactions (especially involving membranous organelles), SARS-CoV-2, RNA viruses

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Zhang et al. demonstrated in this study that the Golgi apparatus and many other organelles are disturbed by SARS-CoV-2 infection. They focused on the Golgi apparatus and especially on TGN46 and GRASP55, whose expression levels are affected differently by SARS-CoV-2 infection: TGN46 is upregulated while GRASP55 is downregulated. Using overexpression or depletion, the authors nicely demonstrated that modulation of both proteins either increased or decreased particle production. They demonstrated that in the absence of GRASP55, SARS-CoV-2 release into the medium is increased. On the contrary, depletion of TGN46 decreases the secretion of SARS-CoV-2 particles.

      We thank the reviewer for the accurate summary of our work.

      Major comments:

      Figure 1: The authors demonstrated that SARS-CoV-2 expression affected the morphology of multiple organelles. Although the results are clear, my concern was that the MOI=1 was really high, which indeed would affect the whole cell. To have a less drastic effect on the cell, I would suggest visualizing some organelles (Golgi, EEA1, Rab7 for example) at a lower MOI=0.1. In addition, it would be nice to verify with a live-dead assay at MOI=1 whether the cells are still alive after 24h, which would confirm that these disturbances are not caused by cells in the process of dying.

      We thank the reviewer for the excellent suggestions. Investigating how SARS-CoV-2 reshapes subcellular organelles at low MOI (e.g., 0.1) and at different time points would be interesting but is beyond the scope of our study. However, we have performed an LDH assay at MOI=1, 2 and 3 for 24 hours to assess cell death. Our results show that LDH release was similar across these conditions (Fig. S5R). We also performed RT-qPCR analysis of Spike, N, and RdRp at early time points of infection. The new results show that neither GRASP55 expression (new Fig. 5D) nor TGN46 depletion (new Fig. 7R) affects viral RNA abundance at an early infection timepoint (4 hpi).

      Figure 2: The results indicated in that panel are really nice. However, the addition of a virus with drugs could increase the proportion of cell death. For Figure 2C, I propose that the authors use an LDH assay to prove that the decrease in infection is not caused by cell death. In addition, RT-qPCR would be more appropriate to indicate the infection rate and support the microscopy data.

      We thank the reviewer for the positive feedback and suggestions. As recommended, we performed an LDH assay to assess cytotoxicity in infected cells treated with the 9 small molecules. Additionally, we performed RT-qPCR analysis for the BFA time-point treatment assay. No significant cell death was observed under these conditions (new Figs. 2D and S3C).

      Figure 3: The authors should have been consistent and add spike instead of nucleocapsid for GalT. According to the figures, Spike seemed to co-localize more with GM130 than Golgin 245. Data analysis of colocalization between Spike and GM130 should be performed to complete the observation. Are no colocalizations of Spike observed with the other Golgi markers?

      We agree with the reviewer that it would be ideal if spike and GalT were co-stained. Unfortunately, both our spike antibody and GalT antibody are from rabbit, so co-staining could not be done as it was for GM130/spike. We performed colocalization analysis between Spike and GM130, and the results show that GRASP55 expression did enhance Spike and GM130 colocalization to some extent (new Fig. S6E-F). We only co-stained spike with GM130 and Golgin-245 due to antibody availability.

      Figure 4K: While all the experiments were performed at MOI=1, why are the authors using MOI=2 for the immunoblots? Did they have a different result in protein expression for MOI=1 in Huh7 cells? If so, they should show a blot indicating this result.

      We did not perform WB to assess protein expression at MOI=1, but our cell toxicity assay showed that there is no significant difference between MOI=2 and MOI=1.

      Figure 5: Viral infection should be indicated using RT-qPCR data analysis to support the microscopy observations.

      We performed RT-qPCR analysis (new Figs. 2F, 5D, and 7R) and found that BFA treatment did not reduce viral RNA levels at all three time points. Also, GRASP55 expression and TGN46 depletion did not inhibit viral genome RNA levels within one viral infection cycle. Additionally, our new TCID50 assay results support our microscope observation (new Fig. 7O-P). Thanks for the suggestion.

      Figure 6: The authors should look at the trafficking of ACE2 and TfR in case of GRASP55 depletion like they did in case of GRASP55 overexpression. It could demonstrate if the virus is using trafficking pathways that are common to the one used by some host receptors to reach the plasma membrane.

      Thanks for the excellent suggestion. We performed a cell surface biotinylation assay of control and GRASP55-depleted cells. We found that ACE2 and TfR displayed a similar reduction on the cell surface (Fig. S7C), consistent with previous findings that GRASP55 depletion induced Golgi fragmentation and accelerated global conventional protein secretion.

      Figure 7: Viral infection assay should also be performed by RT-qPCR. Figure 7H: The immunoblot conditions were performed at MOI=3 this time. The authors should indicate why they did not keep the same MOI conditions. In that case, they should use an intracellular marker for their medium experiment to prove that they isolated proteins that are secreted and not simply released from dead cells. I would also suggest showing an LDH assay at MOI=2 and 3 to monitor cell death. Is the Golgi fragmented when GRASP55 is overexpressed in the presence of the virus? Microscopy observations should be performed to answer this question as it will support their model. The authors suggest that GRASP55 overexpression decreases spike incorporation inside the virion. Can they observe if Spike still colocalizes with GM130 when GRASP55 is overexpressed?

      We showed that TGN46 depletion inhibits viral infection by both IF and WB. We further confirmed this through TCID50 assay for both cells and media (new Fig. 7O-P), strengthening our hypothesis.

      As we described above, we performed morphological analysis at MOI=1 so that we could observe a significant number of infected cells but minimize cell toxicity. We performed immunoblotting (in Fig. 7H) at MOI=3 to get a good viral infection rate.

      As suggested, we also performed LDH assay at MOI=2 and 3 to monitor cell death (new Fig. S2O). Fig. S6C shows that GRASP55 overexpression in the presence of the virus partially rescues Golgi fragmentation. GRASP55 expression did also enhance Spike and GM130 colocalization to some extent (new Fig. S6E-F).

      Minor comments:

      Figure 1P in the text: Considering that Rab7 up-regulation is equal to "growth of late endosome" is an overstatement. Rab7 is cytosolic at its inactive state and at the endosome at its active state. The authors would have to prove this statement by monitoring an increased quantity of Rab7 at the endosomes which is not enough by just monitoring protein intensity by microscopy. As Rab7 is also localized in lysosomes, and the authors used Lamp2 as a lysosomal marker, it is strange that the area of these structures is not increased. The authors should replace the term "growth" by "an increase in the area of their vesicles".

      We did observe fewer but larger LAMP2 puncta in the infected cells. We agree with the reviewer and replaced "growth" with "an increase in the area of their vesicles". Thank you for the excellent suggestions.

      Figure 1Q-T: The observations described in the text did not match the quantification, the area of lysosomes is not significantly different from the non-infected conditions.

      In Fig. 1Q-T, we did observe fewer but larger LAMP2 puncta in the infected cells, which was consistent with our quantification, i.e., fewer puncta (Fig. 1R), but each punctum was larger (Fig. 1S), and total area was similar.
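
      The relationship described in this response can be illustrated with a minimal sketch: total punctate area is the number of puncta multiplied by the mean area per punctum, so a lower count can be offset by larger individual puncta. The numbers below are illustrative placeholders, not measurements from Fig. 1, and the function name is hypothetical.

```python
def total_puncta_area(punctum_count: int, mean_punctum_area: float) -> float:
    """Total area covered by puncta = count x mean area per punctum."""
    return punctum_count * mean_punctum_area


# Placeholder values chosen only to illustrate the arithmetic
# (arbitrary area units; not data from the manuscript).
uninfected = total_puncta_area(punctum_count=100, mean_punctum_area=0.5)
infected = total_puncta_area(punctum_count=50, mean_punctum_area=1.0)

print(f"uninfected total area: {uninfected}")
print(f"infected total area:   {infected}")
```

      With these placeholder values, halving the count while doubling the mean size leaves the total area unchanged, which is the pattern the response describes for Fig. 1R-S.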

      Figure 8: In the text, it is mentioned that there is "a dramatic reduction of spike and N in the lysate in GRASP55-expressing and TGN46 depleted cells". However, the quantification indicated that the decrease in N and S content is non-significant. Can the authors specify which samples were compared in the text (siControl versus siTGN46 or siTGN46+GFP versus siTGN46+GFP-GRASP55)?

      The decrease in N and S content is significant with the lysate sample comparison (siControl versus siTGN46; siControl+GFP versus siTGN46+GFP; siTGN46+GFP versus siTGN46+GFP-GRASP55). We have now removed this Figure following Reviewer #2's suggestion, since the results are consistent with single protein manipulation and more experiments are needed to confirm whether there is an additive effect.

      **Referee cross-commenting**

      I agree with most of the concerns of the other reviewers. I do also consider that they should have done their study on cells expressing naturally ACE2. However, at this stage, it will be a lot of work to perform all of their study in a more relevant cell type. The authors should repeat some of their key experiments in lung-derived cell types, to determine if GRASP55 and TGN46 have the same effect on SARS-CoV-2 virion secretion/production.

      We thank the reviewer for the suggestions and understanding. As we mentioned before, our study utilizes Huh7-ACE2 cells, which are sorted for the high expression of endogenous ACE2 protein, without ACE2 overexpression. We also tested A549 and Calu-3 cells. While A549 cells displayed a very low infection rate, Calu-3 cells displayed disorganized Golgi without viral infection. However, we did perform immunofluorescence assays in Calu-3 cells. Consistent with our findings in Huh7-ACE2 cells, SARS-CoV-2 infection disrupts Golgi structure and alters protein levels of TGN46 and GRASP55 in Calu-3 cells (new Fig. S5R-W). Also, others have reported that the liver can be a target for SARS-CoV-2 infection in humans. Furthermore, we confirmed GRASP55 downregulation and TGN46 upregulation in VeroE6 cells (Fig. S6K-N).

      Reviewer #3 (Significance (Required)):

      The study identified two Golgi proteins (TGN46 and GRASP55) that are involved in modulating the release of SARS-CoV-2 particles from the cells. As these proteins are also acting on general secretion of host proteins to the plasma membrane, the effect on SARS-CoV-2 release could just be indirect. However, it does not change the informative points of the study raised by Zhang et al. It highlights really well how the host trafficking pathway could be diverted for the purpose of the virus, which is to produce particles to maintain its survival.

      Strengths: The authors performed a precise and well quantified study. Observing how SARS-CoV-2 impacts host organelles morphology and uses host trafficking proteins to produce particles, brings more clarity on some unclear parts of the life cycle of the virus. In addition, it exposes new targets for therapeutic studies.

      We thank the reviewer for the positive comments.

      Weakness: The paper is mostly based on microscopy analysis and needs some other methods to support its data. The paper lacks some molecular mechanisms explaining the clear role of GRASP55 and TGN46 in particle production or assembly.

      In the revised version, we incorporated RT-qPCR, cell cytotoxicity, and BFA time-point treatment assays. Notably, we added intracellular and extracellular viral titer assays to more precisely distinguish between effects on virion assembly and virion secretion. We also confirmed the key observation that SARS-CoV-2 infection modulates GRASP55 and TGN46 expression in the Calu-3 lung cell line. Additionally, our early time-point results clearly support the role of GRASP55 and TGN46 in viral trafficking.

      • Audience: The paper will be interesting for basic research for a virology and cell biology audience.
      • Field of expertise with a few keywords: Virology and host cell trafficking.

      References

      Barnes E (2022) Infection of liver hepatocytes with SARS-CoV-2. Nat Metab 4: 301-302

      Bekier ME, 2nd, Wang L, Li J, Huang H, Tang D, Zhang X, Wang Y (2017) Knockout of the Golgi stacking proteins GRASP55 and GRASP65 impairs Golgi structure and function. Mol Biol Cell 28: 2833-2842

      Eymieux S, Rouille Y, Terrier O, Seron K, Blanchard E, Rosa-Calatrava M, Dubuisson J, Belouzard S, Roingeard P (2021) Ultrastructural modifications induced by SARS-CoV-2 in Vero cells: a kinetic analysis of viral factory formation, viral particle morphogenesis and virion release. Cell Mol Life Sci 78: 3565-3576

      Ghosh S, Dellibovi-Ragheb TA, Kerviel A, Pak E, Qiu Q, Fisher M, Takvorian PM, Bleck C, Hsu VW, Fehr AR et al (2020) beta-Coronaviruses Use Lysosomes for Egress Instead of the Biosynthetic Secretory Pathway. Cell 183: 1520-1535 e1514

      Hoffmann M, Hofmann-Winkler H, Smith JC, Kruger N, Arora P, Sorensen LK, Sogaard OS, Hasselstrom JB, Winkler M, Hempel T et al (2021) Camostat mesylate inhibits SARS-CoV-2 activation by TMPRSS2-related proteases and its metabolite GBPA exerts antiviral activity. EBioMedicine 65: 103255

      Hoffmann M, Mosbauer K, Hofmann-Winkler H, Kaul A, Kleine-Weber H, Kruger N, Gassen NC, Muller MA, Drosten C, Pohlmann S (2020) Chloroquine does not inhibit infection of human lung cells with SARS-CoV-2. Nature 585: 588-590

      Xiang Y, Wang Y (2010) GRASP55 and GRASP65 play complementary and essential roles in Golgi cisternal stacking. J Cell Biol 188: 237-251

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary of what the authors were trying to achieve:

      In this manuscript, the authors investigated the role of β-CTF on synaptic function and memory. They report that β-CTF can trigger the loss of synapses in neurons that were transiently transfected in cultured hippocampal slices and that this synapse loss occurs independently of Aβ. They confirmed previous research (Kim et al, Molecular Psychiatry, 2016) that β-CTF-induced cellular toxicity occurs through a mechanism involving a hexapeptide domain (YENPTY) in β-CTF that induces endosomal dysfunction. Although the current study also explores the role of β-CTF in synaptic and memory function in the brain using mice chronically expressing β-CTF, the studies are inconclusive because potential effects of Aβ generated by γ-secretase cleavage of β-CTF were not considered. Based on their findings, the authors suggest developing therapies to treat Alzheimer's disease by targeting β-CTF, but did not address the lack of clinical improvement in trials of several different BACE1 inhibitors, which target β-CTF by preventing its formation.

      We would like to thank the reviewer for his/her suggestions. We have addressed the specific comments in the following sections.

      Major strengths and weaknesses of the methods and results:

      The conclusions of the in vitro experiments using cultured hippocampal slices were well supported by the data, but aspects of the in vivo experiments and proteomic studies need additional clarification.

      (1) In contrast to the in vitro experiments in which a γ-secretase inhibitor was used to exclude possible effects of Aβ, this possibility was not examined in in-vivo experiments assessing synapse loss and function (Figure 3) and cognitive function (Figure 4). The absence of plaque formation (Figure 4B) is not sufficient to exclude the possibility that Aβ is involved. The potential involvement of Aβ is an important consideration given the 4-month duration of protein expression in the in vivo studies.

      We thank the reviewer for raising this question. While our current data do not exclude the potential involvement of Aβ-induced toxicity in the synaptic and cognitive dysfunction observed in mice overexpressing β-CTF, addressing this directly remains challenging. Treatment with γ-secretase inhibitors could potentially shed light on this issue. However, such treatment is known to cause brain dysfunction by itself, likely because it blocks the γ-cleavage of other essential molecules, such as Notch[1, 2]. Therefore, this approach is unlikely to provide a clear answer, which prevents us from pursuing it further experimentally in vivo. We hope the reviewer understands this limitation. We have included additional discussion (page 14 of the revised manuscript) to highlight this question.

      (2) The possibility that the results of the proteomic studies conducted in primary cultured hippocampal neurons depend in part on Aβ was also not taken into consideration.

      We thank the reviewer for raising this question. In the revised manuscript, we examined the protein levels of synaptic proteins after treatment with γ-secretase inhibitors and found that the levels of certain synaptic proteins were further reduced in neurons expressing β-CTF (Supplementary figure 5A-B). These results do not support Aβ as a major contributor to the proteomic changes induced by β-CTF.

      Likely impact of the work on the field, and the utility of the methods and data to the community:

      The authors' use of sparse expression to examine the role of β-CTF on spine loss could be a useful general tool for examining synapses in brain tissue.

      We thank the reviewer for these comments.

      Additional context that might help readers interpret or understand the significance of the work:

      The discovery of BACE1 stimulated an international effort to develop BACE1 inhibitors to treat Alzheimer's disease. BACE1 inhibitors block the formation of β-CTF which, in turn, prevents the formation of Aβ and other fragments. Unfortunately, BACE1 inhibitors not only did not improve cognition in patients with Alzheimer's disease, they appeared to worsen it, suggesting that producing β-CTF actually facilitates learning and memory. Therefore, it seems unlikely that the disruptive effects of β-CTF on endosomes play a significant role in human disease. Insights from the authors that shed further light on this issue would be welcome.

      Response: We would like to express our gratitude to the reviewer for raising this question. It remains puzzling why BACE1 inhibition has failed to yield benefits in AD patients, while amyloid clearance via Aβ antibodies is able to slow disease progression. One possible explanation is that pharmacological inhibition of BACE1 may not be as effective as its genetic removal. Indeed, genetic depletion of BACE1 leads to the clearance of existing amyloid plaques[3], whereas its pharmacological inhibition prevents the formation of new plaques but does not deplete the existing ones[4]. We think the negative results of BACE1 inhibitors in clinical trials may not be sufficient to rule out the potential contribution of β-CTF to AD pathogenesis. Given that cognitive function continues to deteriorate rapidly in plaque-free patients after 1.5 years of treatment with Aβ antibodies in phase three clinical studies[5], it is important to consider the potential role of other Aβ-related fragments in AD pathogenesis, such as β-CTF. We have included further discussion of this question in the revised manuscript (page 15 of the revised manuscript).

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors investigate the potential role of other cleavage products of amyloid precursor protein (APP) in neurodegeneration. They combine in vitro and in vivo experiments, revealing that β-CTF, a product cleaved by BACE1, promotes synaptic loss independently of Aβ. Furthermore, they suggest that β-CTF may interact with Rab5, leading to endosomal dysfunction and contributing to the loss of synaptic proteins.

      We would like to thank the reviewer for his/her suggestions. We have addressed the specific comments in the following sections.

      Weaknesses:

      Most experiments were conducted in vitro using overexpressed β-CTF. Additionally, the study does not elucidate the mechanisms by which β-CTF disrupts endosomal function and induces synaptic degeneration.

      We would like to thank the reviewer for this comment. While a significant portion of our experiments were conducted in vitro, the main findings were also confirmed in vivo (Figures 3 and 4). Repeating all the experiments in vivo would be challenging and may not be possible because of technical difficulties. Regarding the use of overexpressed β-CTF, we acknowledge that this represents a common limitation in neurodegenerative disease studies. These diseases progress slowly over decades in patients. To model this progression in cell or mouse models within a time frame feasible for research, overexpression of certain proteins is often inevitable. Since β-CTF levels are elevated in AD patients[6], its overexpression is not an irrelevant approach to investigate its potential effects.

      We did not further investigate the mechanisms by which β-CTF disrupted endosomal function because our preliminary results align with previous findings that could explain its mechanism. Kim et al. demonstrated that β-CTF recruits APPL1 (a Rab5 effector) via the YENPTY motif to Rab5 endosomes, where it stabilizes active GTP-Rab5, leading to pathologically accelerated endocytosis, endosome swelling and selectively impaired transport of Rab5 endosomes[6]. However, that paper did not show whether this Rab5 overactivation-induced endosomal dysfunction leads to any damage to synapses. In our study, we observed that co-expression of Rab5<sub>S34N</sub> with β-CTF effectively mitigated β-CTF-induced spine loss in hippocampal slice cultures (Figures 6L-M), indicating that Rab5 overactivation-induced endosomal dysfunction contributed to β-CTF-induced spine loss. We included further discussion in the revised manuscript to clarify this (page 15 of the revised manuscript).

      Reviewer #3 (Public Review):

      Summary:

      Most previous studies have focused on the contributions of Abeta and amyloid plaques in the neuronal degeneration associated with Alzheimer's disease, especially in the context of impaired synaptic transmission and plasticity which underlies the impaired cognitive functions, a hallmark in AD. But processes independent of Abeta and plaques are much less explored, and to some extent, the contributions of these processes are less well understood. Luo et al. addressed this important question with an array of approaches, and their findings generally support the contribution of beta-CTF-dependent but non-Abeta-dependent process to the impaired synaptic properties in the neurons. Interestingly, the above process appears to operate in a cell-autonomous manner. This cell-autonomous effect of beta-CTF as reported here may facilitate our understanding of some potentially important cellular processes related to neurodegeneration. Although these findings are valuable, it is key to understand the probability of this process occurring in a more natural condition, such as when this process occurs in many neurons at the same time. This will put the authors' findings into a context for a better understanding of their contribution to either physiological or pathological processes, such as Alzheimer's. The experiments and results using the cell system are quite solid, but the in vivo results are incomplete and hence less convincing (see below). The mechanistic analysis is interesting but primitive and does not add much more weight to the significance. Hence, further efforts from the authors are required to clarify and solidify their results, in order to provide a complete picture and support for the authors' conclusions.

      We would like to thank the reviewer for the suggestions. We have addressed the specific comments in the following sections.

      Strengths:

      (1) The authors have addressed an interesting and potentially important question

      (2) The analysis using the cell system is solid and provides strong support for the authors' major conclusions. This analysis has used various technical approaches to support the authors' conclusions from different aspects and most of these results are consistent with each other.

      We would like to thank the reviewer for these comments.

      Weaknesses:

      (1) The relevance of the authors' major findings to the pathology, especially the Abeta-dependent processes is less clear, and hence the importance of these findings may be limited.

      We would like to thank the reviewer for this question. Phase 3 clinical trial data from Aβ antibodies show that cognitive function continues to decline rapidly, even in plaque-free patients, after 1.5 years of treatment[5]. This suggests that plaque-independent mechanisms may drive AD progression. Therefore, it is crucial to consider the potential contributions of other Aβ species or related fragments, such as alternative forms of Aβ and β-CTF. While it is too early to predict how much β-CTF contributes to AD progression, it is notable that β-CTF induced synaptic deficits in mice, which recapitulates a key pathological feature of AD. Ultimately, the contribution of β-CTF to AD pathogenesis can only be tested through clinical studies in the future.

      (2) In vivo analysis is incomplete, with certain caveats in the experimental procedures and some of the results need to be further explored to confirm the findings.

      We would like to thank the reviewer for this suggestion. We have corrected these caveats in the revised manuscript.

      (3) The mechanistic analysis is rather primitive and does not add further significance.

      We would like to thank the reviewer for this comment. We did not delve further into the underlying mechanisms because our analysis indicates that Rab5 overactivation-induced endosomal dysfunction underlies β-CTF-induced synaptic dysfunction, which is consistent with another study and has been addressed in our study[6]. We hope the reviewer could understand that our focus in this paper is on how β-CTF triggers synaptic deficits, which is why we did not investigate the mechanisms of β-CTF-induced endosomal dysfunction further.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Suggestions for improved or additional experiments, data, or analyses:

      (1) In Figures 4H, 4J, 4K and Supplemental Figures 3C, 3E, and 3G, it was unclear whether a repeated measures 2-way ANOVA, rather than a 2-way ANOVA, followed by appropriate post-hoc analyses was used to strengthen the conclusion that there were significant effects in the behavioral tests.

      We thank the reviewer for raising this point and apologize for the lack of a clear description in the manuscript. For the figures mentioned above, we used a repeated measures 2-way ANOVA in GraphPad Prism to analyze the data. In Figure 4H, fear conditioning tests were conducted. The same cohort of mice was used in the baseline, contextual and cued tests. First, baseline freezing was tested; then these mice underwent tone and foot shock training, followed by the contextual test and the cued test. A repeated measures 2-way ANOVA is therefore more appropriate for this experiment.

      In the water T maze tests (Figure 4J and K), the same cohort of mice was trained and tested each day, so a repeated measures 2-way ANOVA is also appropriate.

      In Supplementary figure 3C, 3E and 3G, OFT was conducted. In this experiment, the locomotion of the same cohort of mice was recorded, so a repeated measures 2-way ANOVA is again appropriate.

      A clearer description of these experiments has been provided in the revised manuscript.

      (2) Including gender analyses would be helpful.

      The mice we used in this study were all males.

      Minor corrections to text and figures:

      (1) Quantitative analyses in Figures 5A-C, 5H, 6G, 6H, and Supplementary Figures 4 and 5C would be helpful.

      We have provided quantitative analyses of the results mentioned above (Figure 5D, 5J, 6K, Supplementary figure 4D, 5F) in the revised manuscript.

      (2) Percent correct (%) in Figures 4J and 4K should be labeled as 0, 50, and 100 instead of 0.0, 0.5, and 1.0.

      We would like to thank the reviewer for pointing this out. We have made corrections in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      In the study conducted by Luo et al, it was observed that the fragment of amyloid precursor protein (APP) cleaved by beta-site amyloid precursor protein cleaving enzyme 1 (BACE1), known as β-CTF, plays a crucial role in synaptic damage. The study found that increasing the expression of β-CTF in neurons could induce synapse loss both in vitro and in vivo, independent of Aβ. Mechanistically, they explored how β-CTF could interfere with the endosome system by interacting with RAB5. While this study is intriguing, there are several points that warrant further investigation:

      (1) The study involved overexpressing β-CTF in neurons. It would be valuable to know if the levels of β-CTF are similarly increased in Alzheimer's disease (AD) patients or AD mouse models.

      We would like to thank the reviewer for the suggestion. β-CTF levels have been reported to be significantly elevated in the AD cerebral cortex[6]. Most AD mouse models are human APP transgenic mouse models with elevated β-CTF levels[7].

      (2) The study noted that β-CTF in neurons is a membranal fragment, but the overexpressed β-CTF was not located in the membrane. It is important to ascertain whether the membranal β-CTF and cytoplasmic β-CTF lead to synapse loss in a similar manner.

      We apologize for not clearly explaining the localization of β-CTF in the original manuscript. β-CTF is produced from APP through β-cleavage, a process that occurs in organelles such as endo-lysosomes[8]. The overexpressed β-CTF is also primarily localized in the endo-lysosomal systems (Figure 5C and Supplementary figure 4C), similar to those generated by APP cleavage.

      (3) The study found a significant decrease in GluA1, a subunit of AMPA receptors, due to β-CTF. It would be beneficial to investigate whether there are systematic alterations in NMDA receptors, including GluN2A and GluN2B.

      We would like to express our gratitude to the reviewer for bringing up this question. The protein levels of GluN2A and GluN2B are also reduced in neurons expressing β-CTF (Figure 6E-F).

      (4) The study showed a significant decrease in the frequency of miniature excitatory postsynaptic currents (mEPSC), indicating disrupted presynaptic vesicle neurotransmitter release. It would be pertinent to test whether the expression level of the presynaptic SNARE complex, which is required for vesicle release, is altered by β-CTF.

      We would like to express our gratitude to the reviewer for bringing up this question. The protein levels of presynaptic SNARE complex components, such as VAMP2, are also reduced in neurons expressing β-CTF (Figure 6E, G).

      (5) Since AMPA receptors are glutamate receptors, it is important to determine whether the ability of glutamate release is altered by β-CTF. In vivo studies using a glutamate sensor should be conducted to examine glutamate release.

      We would like to express our gratitude to the reviewer for this suggestion. It will be interesting to use glutamate sensors to assess the ability of glutamate release in the future.

      (6) The quality of immunostaining associated with Figures 4B and 4C was noted to be suboptimal.

      We apologize for the suboptimal quality of these images. The immunostaining images in Figures 4B and 4C were captured using the stitching function of a confocal microscope to display larger areas, including the entire hemisphere and hippocampus. We have reprocessed the images to obtain higher-quality versions.

      (7) It would be insightful to investigate whether treatment with a BACE1 inhibitor in the study could reverse synaptic deficits mediated by β-CTF.

      We would like to thank the reviewer for this suggestion. In Figure 1I-M, we constructed an APP mutant (APP<sub>MV</sub>), which cannot be cleaved by BACE1 to produce β-CTF and Aβ but has no impact on β’-cleavage. When co-expressed with BACE1, APP<sub>MV</sub> failed to induce spine loss, supporting the effect of β-CTF. We think these results demonstrate that β-CTF underlies the synaptic deficits. It would be interesting to test the effects of BACE1 inhibition in the future.

      (8) Considering the potential implications for therapeutics, it is worth exploring whether extremely low levels of β-CTF have beneficial effects in regulating synaptic function or promoting synaptogenesis at a physiological level.

      We would like to thank the reviewer for raising this question. We found that when the plasmid amount was reduced to 1/8 of the original dose, β-CTF no longer induced a decrease in dendritic spine density (Supplementary figure 2E-F). The APP-Swedish mutation in familial AD has been reported to increase synapse numbers and synaptic transmission, whereas inhibition of BACE1 lowered synapse numbers and suppressed synaptic transmission in wild-type neurons, suggesting that at physiological levels, β-CTF might be synaptogenic[9].

      (9) The molecular mechanism through which β-CTF interferes with Rab5 function should be elucidated.

      We would like to thank the reviewer for raising this question. Kim et al. have elucidated the mechanism through which β-CTF interferes with Rab5 function. β-CTF recruits APPL1 (a Rab5 effector) via the YENPTY motif to Rab5 endosomes, where it stabilizes active GTP-Rab5, leading to pathologically accelerated endocytosis, endosome swelling and selectively impaired transport of Rab5 endosomes[6]. We have included additional discussion of this question in the revised manuscript (page 15 of the revised manuscript).

      (10) The study could compare the role of β-CTF and Aβ in neurodegeneration in AD mouse models.

      We would like to thank the reviewer for raising this point. While it is easier to dissect the roles of Aβ and β-CTF in vitro, some of the critical tools are not applicable in vivo, such as γ-secretase inhibitors, which lead to severe side effects because they also inhibit the cleavage of other γ-secretase substrates[1, 2]. Therefore, it will be difficult to demonstrate their different roles in vivo. There are studies showing that β-CTF accumulation precedes Aβ deposition in model mice and mediates Aβ-independent intracellular pathologies[10, 11], consistent with our results.

      (11) Based on the findings, it would be valuable to discuss possible explanations for the failure of most BACE1 inhibitors in recent clinical trials for humans.

      Response: We would like to express our gratitude to the reviewer for raising this recommendation. It is a big puzzle why BACE1 inhibition failed to provide beneficial effects in AD patients whereas clearance of amyloid by Aβ antibodies could slow AD progression. One potential answer is that pharmacological inhibition of BACE1 might not be as effective as its genetic removal. Indeed, genetic depletion of BACE1 leads to clearance of existing amyloid plaques[3], whereas pharmacological inhibition of BACE1 could not stop the growth of existing plaques, although it prevents the formation of new plaques[4]. The negative results of BACE1 inhibitors might not be sufficient to exclude the possibility that β-CTF could also contribute to AD pathogenesis. We have included additional discussion of this question in the revised manuscript (page 15 of the revised manuscript).

      Reviewer #3 (Recommendations For The Authors):

      Major:

      (1) The cell experiments were performed at DIV 9, do the authors know whether at this age, the neurons are still developing and spine density has not reached a plateau yet? If so, the observed effect may reflect the impact on development and/or maturation, rather than on the mature neurons. The authors should be more specific about this issue.

      We would like to thank the reviewer for raising this question. These slice cultures were made from 1-week-old rats, so DIV 9 corresponds to about two weeks of age. These neurons are still developing and spine density has not reached a plateau yet[12]. In addition, we also investigated the effects of β-CTF on the synapses of mature neurons in two-month-old mice (Figure 3). We therefore think the observed effect reflects the impact on both immature and mature neurons.

      (2) mEPSCs shown in Figure 3D were of small amplitudes, perhaps also indicating that these synapses are not yet mature.

      In Figure 3D, the mEPSC results were obtained from pyramidal neurons in the CA1 region of two-month-old mice. At the age of two months, neurotransmitter levels and synaptic density have reached adult levels[13].

      (3) There was no data on the spine density or mEPSCs in the mice OE b-CTF, hence it is unclear whether a primary impact of this manipulation (b-CTF effect) on the synaptic transmission still occurs in vivo.

      In Figure 3, we examined the density of dendritic spines and mEPSCs from CA1 pyramidal neurons infected with lentivirus expressing β-CTF in mice and showed that neurons expressing additional β-CTF exhibited lower spine density and fewer mEPSCs, supporting that β-CTF also damaged synaptic transmission in vivo.

      (4) OE of b-CTF should lead to the production of Abeta, although this may not lead to the formation of significant plaques. How do the authors know whether their findings on behavioral and cognitive impairments were not largely mediated by Abeta, which has been widely reported by previous studies?

      We would like to thank the reviewer for raising this question. Indeed, our in vivo data could not exclude the potential involvement of Aβ in the pathology, despite the absence of amyloid plaque formation. It will be difficult to address this question in vivo because of the severe side effects of γ-secretase inhibition.

      (5) Figure 4H, the freezing level in the cued fear conditioning was very high, likely saturated; this may mask a potential reduction in the b-CTF OE mice (there is a hint for that in the results). The authors should repeat the experiments using less strong footshock strength (hence resulting in less freezing, <70%).

      We would like to express our gratitude to the reviewer for bringing up this question. The contextual fear conditioning test assesses hippocampal function, while the cued fear conditioning test assesses amygdala function. We hope the reviewer understands that our primary goal is to assess hippocampus-related functions in this experiment and we did see a significant difference between GFP and β-CTF groups. Therefore, we think the intensity of footshock we used was suitable to serve the primary purpose of this experiment.

      (6) Why was the deficit in the Morris water maze in the b-CTF OE mice only significant in the training phase?

      We would like to thank the reviewer for raising this question and apologize for not describing the test clearly. This is a water T maze test, not a Morris water maze test.

      To make the behavioral paradigm of the water T maze test easier to understand, we have provided a more detailed description of the methods in the new version of the manuscript.

      The acquisition phase of the Water T Maze (WTM) evaluates spatial learning and memory, where mice use spatial cues in the environment to navigate to a hidden platform and escape from water, while reversal learning measures cognitive flexibility, in which mice must learn a new location of the hidden platform[14]. In the reversal learning task (Figure 4J-K), the learning curves of the two groups of mice did not show any significant differences, indicating that the expression of β-CTF impairs spatial learning and memory but not cognitive flexibility. This is consistent with a previous report using APP/PS1 mice[15].

      (7) Will the altered Rab5 in the b-CTF OE condition also affect the level of other proteins?

      We would like to express our gratitude to the reviewer for raising this interesting question. Expression of Rab5<sub>S34N</sub> in β-CTF-expressing neurons did not alter the levels of synapse-related proteins that were reduced in these neurons (Supplementary figure 5G-H), suggesting Rab5 overactivation did not contribute to these protein expression changes induced by β-CTF.

      (8) How do the authors reconcile their findings with the well-established findings that Abeta affects synaptic transmission and spine density? Do they think these two processes may occur simultaneously in the neurons, or, one process may dominate in the other?

      APP, Aβ, and presenilins have been extensively studied in mouse models, providing convincing evidence that high Aβ concentrations are toxic to synapses[16]. Moreover, addition of Aβ to murine cultured neurons or brain slices is toxic to synapses[17]. However, Aβ-induced synaptotoxicity was not observed in our study. A major difference between our study and others is that our study used an isolated expression system that applies Aβ only to individual neurons surrounded by neurons without excessive amounts of Aβ, whereas other studies generally apply Aβ to all neurons. Therefore, we predict that Aβ does not lead to synaptic deficits in individual neurons in a cell-autonomous manner, whereas β-CTF does. Aβ and β-CTF thus represent two parallel pathways of action. Additional discussion of this question has been included in the revised manuscript (page 14 of the revised manuscript).

      Minor:

      Fig 2F-G, "prevent" rather than "reverse"?

      We would like to thank the reviewer for pointing this out. We have made corrections in the revised manuscript.

      References:

      (1) GÜNER G, LICHTENTHALER S F. The substrate repertoire of γ-secretase/presenilin [J]. Seminars in cell & developmental biology, 2020, 105: 27-42.

      (2) DOODY R S, RAMAN R, FARLOW M, et al. A phase 3 trial of semagacestat for treatment of Alzheimer's disease [J]. The New England journal of medicine, 2013, 369(4): 341-50.

      (3) HU X, DAS B, HOU H, et al. BACE1 deletion in the adult mouse reverses preformed amyloid deposition and improves cognitive functions [J]. The Journal of experimental medicine, 2018, 215(3): 927-40.

      (4) PETERS F, SALIHOGLU H, RODRIGUES E, et al. BACE1 inhibition more effectively suppresses initiation than progression of β-amyloid pathology [J]. Acta neuropathologica, 2018, 135(5): 695-710.

      (5) SIMS J R, ZIMMER J A, EVANS C D, et al. Donanemab in Early Symptomatic Alzheimer Disease: The TRAILBLAZER-ALZ 2 Randomized Clinical Trial [J]. Jama, 2023, 330(6): 512-27.

      (6) KIM S, SATO Y, MOHAN P S, et al. Evidence that the rab5 effector APPL1 mediates APP-βCTF-induced dysfunction of endosomes in Down syndrome and Alzheimer's disease [J]. Molecular psychiatry, 2016, 21(5): 707-16.

      (7) MONDRAGÓN-RODRÍGUEZ S, GU N, MANSEAU F, et al. Alzheimer's Transgenic Model Is Characterized by Very Early Brain Network Alterations and β-CTF Fragment Accumulation: Reversal by β-Secretase Inhibition [J]. Frontiers in cellular neuroscience, 2018, 12: 121.

      (8) ZHANG X, SONG W. The role of APP and BACE1 trafficking in APP processing and amyloid-β generation [J]. Alzheimer's research & therapy, 2013, 5(5): 46.

      (9) ZHOU B, LU J G, SIDDU A, et al. Synaptogenic effect of APP-Swedish mutation in familial Alzheimer's disease [J]. Science translational medicine, 2022, 14(667): eabn9380.

      (10) LAURITZEN I, PARDOSSI-PIQUARD R, BAUER C, et al. The β-secretase-derived C-terminal fragment of βAPP, C99, but not Aβ, is a key contributor to early intraneuronal lesions in triple-transgenic mouse hippocampus [J]. The Journal of neuroscience : the official journal of the Society for Neuroscience, 2012, 32(46): 16243-1655a.

      (11) KAUR G, PAWLIK M, GANDY S E, et al. Lysosomal dysfunction in the brain of a mouse model with intraneuronal accumulation of carboxyl terminal fragments of the amyloid precursor protein [J]. Molecular psychiatry, 2017, 22(7): 981-9.

      (12) HARRIS K M, JENSEN F E, TSAO B. Three-dimensional structure of dendritic spines and synapses in rat hippocampus (CA1) at postnatal day 15 and adult ages: implications for the maturation of synaptic physiology and long-term potentiation [J]. The Journal of neuroscience : the official journal of the Society for Neuroscience, 1992, 12(7): 2685-705.

      (13) SEMPLE B D, BLOMGREN K, GIMLIN K, et al. Brain development in rodents and humans: Identifying benchmarks of maturation and vulnerability to injury across species [J]. Progress in neurobiology, 2013, 106-107: 1-16.

      (14) GUARIGLIA S R, CHADMAN K K. Water T-maze: a useful assay for determination of repetitive behaviors in mice [J]. Journal of neuroscience methods, 2013, 220(1): 24-9.

      (15) ZOU C, MIFFLIN L, HU Z, et al. Reduction of mNAT1/hNAT2 Contributes to Cerebral Endothelial Necroptosis and Aβ Accumulation in Alzheimer's Disease [J]. Cell reports, 2020, 33(10): 108447.

      (16) CHAPMAN P F, WHITE G L, JONES M W, et al. Impaired synaptic plasticity and learning in aged amyloid precursor protein transgenic mice [J]. Nature neuroscience, 1999, 2(3): 271-6.

      (17) WANG Z, JACKSON R J, HONG W, et al. Human Brain-Derived Aβ Oligomers Bind to Synapses and Disrupt Synaptic Activity in a Manner That Requires APP [J]. The Journal of neuroscience : the official journal of the Society for Neuroscience, 2017, 37(49): 11947-66.

    1. “[b]uilding trust and nurturing legitimacy on both sides of the police-citizen divide is not only the first pillar of this task force’s report but also the foundational principle underlying this inquiry into the nature of relations between law enforcement and the communities they serve.” Its first recommendation was that “[l]aw enforcement culture should embrace a guardian mindset to build public trust and legitimacy. Toward that end, police and sheriffs’ departments should adopt procedural justice as the guiding principle for internal and external policies and practices to guide their interactions with the citizens they serv

      If these steps are put in place and everyone adapts, accepts, and lives by them, I think it can work. Trust has to be built and there has to be a mutual level of respect for each other. There cannot be hidden agendas and there should not be a grouping of individuals, i.e. all blacks are criminals or all cops are bad.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This manuscript presents evidence of ‘vocal style’ in sperm whale vocal clans. Vocal style was defined as specific patterns in the way that rhythmic codas were produced, providing a fine-scale means of comparing coda variations. Vocal style effectively distinguished clans similar to the way in which vocal repertoires are typically employed. For non-identity codas, vocal style was found to be more similar among clans with more geographic overlap. This suggests the presence of social transmission across sympatric clans while maintaining clan vocal identity.

      Strengths:

      This is a well-executed study that contributes exciting new insights into cultural vocal learning in sperm whales. The methodology is sound and appropriate for the research question, building on previous work and ground-truthing much of their theories. The use of the Dominica dataset to validate their method lends strength to the concept of vocal style and its application more broadly to the Pacific dataset. The results are framed well in the context of previous works and clearly explain what novel insights the results provide to the current understanding of sperm whale vocal clans. The discussion does an overall great job of outlining why horizontal social learning is the best explanation for the results found.

      Weaknesses:

      The primary issues with the manuscript are in the technical nature of the writing and a lack of clarity at times with certain terminology. For example, several tree figures are presented and ‘distance’ between trees is key to the results, yet ‘distance’ is not clearly defined in a way for someone unfamiliar with Markov chains to understand. However, these are issues that can easily be dealt with through minor revisions with a view towards making the manuscript more accessible to a general audience.

      I also feel that the discussion could focus a bit more on the broader implications - specifically what the developed methods and results might imply about cultural transmission in other species. This is specifically mentioned in the abstract but not really delved into in detail during the discussion.

      We are grateful for the Reviewer’s recognition of the study’s contributions to understanding cultural vocal learning in sperm whales. In response to the concerns regarding clarity and accessibility, we have revised the manuscript to improve the definition of key concepts, such as the notion of “distance” between subcoda trees. This adjustment ensures clarity for readers unfamiliar with the technical details of Markov chains. Additionally, we have expanded the discussion to highlight broader implications of our findings, particularly their relevance to understanding cultural transmission in other species, as suggested.

      Reviewer #2 (Public review):

      Summary:

      The current article presents a new type of analytical approach to the sequential organisation of whale coda units.

      Strengths:

      The detailed description of the internal temporal structure of whale codas is something that has been thus far lacking.

      Weaknesses:

      It is unclear how the insight gained from these analyses differs from or adds to the voluminous available literature on how codas vary between whale groups and populations. It provides new details, but what new aspects have been learned, or what features of variation seem to be only revealed by this new approach? The theoretical basis and concepts of the paper are problematic and, indeed, potentially hamper the insights into whale communication that the methods could offer. Some aspects of the results are also overstated.

      We appreciate the Reviewer’s acknowledgment of the novelty in describing the internal temporal structure of whale codas. Regarding the concern about the unique contributions of this approach, we have further emphasized in the revised manuscript how our methodology reveals previously uncharacterized dimensions of coda structure. Specifically, our work highlights how non-identity codas, which have received limited attention, play a significant role in inter-clan acoustic interactions. By leveraging Variable Length Markov Chains, we provide a nuanced understanding of coda subunits that complements existing studies and demonstrates the value of this analytical approach.

      Reviewer #3 (Public review):

      Summary:

      The study presented by Leitao et al. represents an important advancement in comprehending the social learning processes of sperm whales across various communicative and socio-cultural contexts. The authors introduce the concept of “vocal style” as an addition to the previously established notion of “vocal repertoire,” thereby enhancing our understanding of sperm whale vocal identity.

      Strengths:

      A key finding of this research is the correlation between the similarity of clan vocal styles for non-ID codas and spatial overlap (while no change occurs for ID codas), suggesting that social learning plays a crucial role in shaping symbolic cultural boundaries among sperm whale populations. This work holds great appeal for researchers interested in animal cultures and communication. It is poised to attract a broad audience, including scholars studying animal communication and social learning processes across diverse species, particularly cetaceans.

      Weaknesses:

      In terms of terminology, while the authors use the term “saying” to describe whale vocalizations, it may be more conservative to employ terms like “vocalize” or “whale speech” throughout the manuscript. This approach aligns with the distinction between human speech and other forms of animal communication, as outlined in prior research (Hockett, 1960; Cheney & Seyfarth, 1998; Hauser et al., 2002; Pinker & Jackendoff, 2005; Tomasello, 2010).

      We thank the Reviewer for recognizing the importance of our findings and their appeal to broader audiences interested in animal cultures and communication. In response to the suggestion regarding terminology, we have adopted a more conservative language to align with distinctions between human and non-human communication systems. For example, terms like “vocalize” and “vocal repertoire” are used in place of anthropomorphic terms such as “saying”. This ensures consistency with established conventions while maintaining clarity for a broad readership.

      Reviewer #1 (Recommendations):

      Comment 1

      Lines 11-13: As mentioned above, the implications for comparing communication systems and cultural transmission in other species isn’t really discussed much and I think it’s a really interesting component of the study’s broader implications.

      Thank you for the comment.

      Action - We added a few more sentences to the discussion regarding this.

      Comment 2

      Figure 1: More information on the figure of these trees would help. What do the connecting lines represent? What do the plain black dots and the black dot with the white dot represent? Especially since the “distance between trees” is a key result, it’s important that someone unfamiliar with Markov chains can understand the basics of how this is calculated and what it represents. It is explained in the methods, but a brief explanation here would make the results and the figure a lot clearer since the methods are the last section of the manuscript.

      These were omitted as we believed that attempting to introduce the mathematical structure and the methodology to compare two instances, in a figure caption, would have caused more ambiguity than necessary.

      Action - Added an informal introduction to these concepts on the figure caption. Also added a pointer to the Supplementary Materials.

      Comment 3

      Table 1: A definition of dICIs should be included here.

      Added the definition of discrete ICI to the table.

      Comment 4

      Figure 2: The placement of the figures is a bit confusing because they are quite far from the text that references them.

      We thank the reviewer for pointing this out. We tried to edit the manuscript to improve this issue, but this part of the editing is more within the journal’s powers than our own.

      Action - Moved images closer to the corresponding text in the manuscript.

      Comment 5

      Line 117: Probabilistic distance needs to be briefly explained earlier when you first mention distance (see Lines 11-13 comments).

      Action - Clarifications added in the caption of Figure 1, as per the comment on Lines 11-13.

      Comment 6

      Figure 4: Is order considered in these pairwise comparisons? It looks like there are two dots for each pairwise comparison. Additionally, why is the overlap different in these two comparisons? For example, short:four-plus has an overlap of 0.6, while four-plus:short has an overlap of 0.95.

      The x-axis of the plots in Figure 4 is geographical clan overlap. This is calculated as per (Hersh et al., 2022) and is described in our Methods (see “Measuring clan overlap” section). Given two clans—for example, the Four-Plus and the Short clan—spatial overlap is calculated twice: as the proportion of the Four-Plus clan’s repertoires that were recorded within 1,000 km of at least one of the Short clan’s repertoires, and as the proportion of the Short clan’s repertoires that were recorded within 1,000 km of at least one of the Four-Plus clan’s repertoires.

      Order is important in these pairwise comparisons and generates an asymmetric matrix because the clans have different spatial extents. A clan found in only one small region might overlap completely with a clan that spans the Pacific Ocean, while the opposite is not true. For example, the Short clan spans the Pacific Ocean while the Four-Plus clan has been documented over a smaller area (but that smaller area overlaps extensively with the Short clan range). That is why the value is smaller (0.6) when considering how much of the Short clan’s range is shared with the Four-Plus clan, and larger (0.95) when considering how much of the Four-Plus clan’s range is shared with the Short clan.

      Action - We have now added a reference to that section of the Methods in our Figure 4 caption and include the clan spatial overlap matrix as a supplemental table (Table S5).
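
      The directional overlap described above can be illustrated with a minimal sketch. The coordinates, the clan names `wide_clan` and `local_clan`, and the use of a haversine great-circle distance are assumptions made for illustration only; the actual procedure is the one given in Hersh et al. (2022) and in the Methods.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used for the great-circle distance


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def directional_overlap(clan_a, clan_b, radius_km=1000.0):
    """Proportion of clan_a recording locations that lie within radius_km
    of at least one clan_b recording location. Not symmetric in its arguments."""
    near = sum(any(haversine_km(a, b) <= radius_km for b in clan_b) for a in clan_a)
    return near / len(clan_a)


# Hypothetical repertoire locations (lat, lon): one widespread clan, one local clan.
wide_clan = [(0.0, -90.0), (0.0, -120.0), (0.0, -150.0), (0.0, 180.0)]
local_clan = [(1.0, -91.0), (-1.0, -89.0)]

local_in_wide = directional_overlap(local_clan, wide_clan)  # 1.0
wide_in_local = directional_overlap(wide_clan, local_clan)  # 0.25
```

      In this toy case every location of the local clan is close to the widespread clan, while only one in four locations of the widespread clan is close to the local clan, which is the same kind of asymmetry as the 0.95 versus 0.6 values discussed above.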

      Comment 7

      Figure 4: I think the reference should be Hersh et al. [11].

      Thank you for catching this.

      Action - Reference corrected

      Comment 8

      Line 227: What aspect of your analysis looked at how often codas were produced? You mention coda frequency, but it is unclear how this was incorporated into your analysis. If this is included in the methods, the language is a bit too technical to easily parse it out.

      Indeed here we are referencing the results of the paper mentioned in the previous line. We do not look at coda production frequency.

      Action - Added citation to paper that actually performs this analysis.

      Comment 9

      Lines 253-255: I think you could dig into this a little more, as “there is currently no evidence” is not the most convincing argument that something is not a driver. Perhaps expanding on the latter sentence that clans are recognizable across ocean basins would be helpful. Does this suggest that clans with similar geographic overlap experience diverse environmental conditions across ocean basins? If so, this might better strengthen your argument against environmental drivers.

      Thank you for pointing this out. We feel that the next sentence highlights that clans are recognizable across environmental variation from one side to the other of the ocean basin, which supports the inductive reasoning that codas do not vary systematically with environment. However, we have edited these sentences for clarity.

      Comment 10

      Lines 311-314: It would also be interesting to look at vocal style across non-ID coda types. Are some more similar to each other across clans than others? Perhaps vocal style can further distinguish types of non-ID codas.

      In Supplementary Materials 3.4.2 and 3.5 we highlight our results when the codas are separated by coda type, summarized in Table S4. We do compare the vocal style across non-ID coda types across clans and within the same clan. The results, however, are aggregated to highlight the differences in style between the clans, and a coda-type-only comparison is not shown.

      Comment 11

      Lines 390-392: I’m assuming this is why pairwise comparisons were directional (i.e., there was both an A:B and a B:A comparison)? Can you speak to why A:B and B:A comparisons can have such different overlap values?

      Given two clans—for example, the Four-Plus and the Short clan—spatial overlap is calculated twice: as the proportion of the Four-Plus clan’s repertoires that were recorded within 1,000 km of at least one of the Short clan’s repertoires, and as the proportion of the Short clan’s repertoires that were recorded within 1,000 km of at least one of the Four-Plus clan’s repertoires.

      Order is important in these pairwise comparisons and generates an asymmetric matrix because the clans have different spatial extents. A clan found in only one small region might overlap completely with a clan that spans the Pacific Ocean, while the opposite is not true. For example, the Short clan spans the Pacific Ocean while the Four-Plus clan has been documented over a smaller area (but that smaller area overlaps extensively with the Short clan range). That is why the value is smaller (0.6) when considering how much of the Short clan’s range is shared with the Four-Plus clan, and larger (0.95) when considering how much of the Four-Plus clan’s range is shared with the Short clan.

      Action - We now include the clan spatial overlap matrix as a supplemental table (Table S5).

      Comment 13

      Line 56: Can you briefly explain what memory means in the context of Markov chains?

      We provide an explanation of the meaning of memory in the Methods section on “Variable length Markov Chains”. Briefly, memory here means how many past states of the Markov chain are required to predict its next transition. Standard Markov chains “look” back only one time step, while k-th order Markov chains look back k steps. In our case, there was no reason to assume that the memory required to predict different sequences of states (inter-click intervals) should be the same across all sequences, and thus we adopted the formalism of variable length Markov chains, which allows for different levels of memory across the system.
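
      To make the notion of memory concrete, the following is a minimal sketch under stated assumptions. It is a simple back-off over context counts, not the variable length Markov chain fitting procedure used in the manuscript; the integer labels stand in for discrete inter-click intervals, and the function names and the `min_count` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def fit_context_counts(sequences, max_order=2):
    """Count which symbol follows every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, symbol in enumerate(seq):
            for k in range(max_order + 1):
                if i - k < 0:
                    break  # not enough history for a context of length k
                counts[tuple(seq[i - k:i])][symbol] += 1
    return counts


def next_distribution(counts, history, max_order=2, min_count=2):
    """Predict the next symbol from the longest suffix of `history`
    that was observed at least `min_count` times."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - k:])
        observed = counts.get(context, Counter())
        total = sum(observed.values())
        if total >= min_count:
            return context, {s: c / total for s, c in observed.items()}
    return (), {}


# Toy sequences of discrete interval labels (illustrative only).
toy_sequences = [[1, 1, 2], [1, 1, 2], [2, 1, 3], [2, 1, 3]]
toy_counts = fit_context_counts(toy_sequences, max_order=2)

# Looking back one step, the symbol after 1 is ambiguous (1, 2 or 3).
one_step = next_distribution(toy_counts, [1, 1], max_order=1)
# Looking back two steps, the context (1, 1) is always followed by 2.
two_step = next_distribution(toy_counts, [1, 1], max_order=2)
```

      In the toy data a one-step memory cannot tell the continuations apart, while a two-step memory can, which is the sense in which different sequences may require different amounts of memory.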

      Comment 14

      Supplementary Figure S3: Like in the main manuscript, briefly explain or remind us what the blank nodes and the yellow nodes are.

      Action - Clarified that the orange node represents the root of the tree in the figures.

      Comment 15

      Supplementary Figure S7: Put the letters before the dataset name.

      Action - Done.

      Comment 16

      Supplementary Figure S10: Unclear what ‘inner vs outer’ means.

      One specifies comparisons across clans (outer) and the other within the same clan (inner).

      Action - Added clarification on the caption of Figure S10

      Comment 17

      Supplementary Figure S14: Include a-c labels in the figure itself.

      Action - Labels added to figure

      Comment 18

      Supplementary Figure S14: The information about the nodes is what needs to be included earlier and in the main body when discussing the trees.

      Action - Added the explanation earlier in the text and in the main body

      Reviewer #2 (Recommendations):

      Comment 19

      Line 22: “Symbolic” and “Arbitrary” are not synonyms. Please see the comment above.

      We agree. Here, we make the point that the evolution of symbolic markers of group identity can be explained from what are initially arbitrary, and meaningless, signals (see [L1, L2]). Our point is that any vocalization, any coda, could have been selected for as an identity coda, become symbolic, and evolved to play a key role in cultural group formation and in-group favoritism, because such markers enable a community of individuals to solve the problem of with whom to collaborate. The specific coda itself does not affect collaborative payoffs, but group-specific differences in behavior can. As such, the coda is arguably symbolic: it is observable and recognizable, and can serve as a means for social assortment even when the behavioural differences are not. This can explain the social segregation observed among behaviorally distinct clans of sperm whales. However, in this manuscript, we do not extend this discussion of existing literature and have attempted to concisely describe this in a couple of lines, which clearly do a disservice to the large body of literature on the evolution of symbolic markers and human ethnic groups. We have added some citations to this section so that the reader may follow up should they disagree with our brief introductory statements.

      Action - Added citations and pointers to the literature.

      Comment 20

      Line 24: The authors’ terminology around “markers”, “arbitrary”, “symbolic” is unnecessarily confusing and mystifying, giving the impression these terms are interchangeable. They are not. These terms are an integral and long-established part of key definitions in signal theory. Term use should be followed accordingly. The observation that whale vocal signals vary per population does not necessarily mean that they function as a social tag. The word “dog” varies per population but its use relates to an animal, not the population that utters the word. “Dog” is not “symbolic” of England, English-speaking populations or the English language. Furthermore, the function of whale vocal signals is extremely challenging to determine. In the best conditions, researchers can pin down the signal’s context; this is distinct from the signal’s function and further still from the signal’s meaning. How exactly the authors determine that whale vocal signals are arbitrary is, thus, perplexing given that this would require a detailed description and understanding of who is producing the song, when, towards whom, and how the receivers react, none of which the authors have and without which no claim on the signals’ function can be made. This terminological laxness, and the sensu lato in extremis applied to various terms, is unjustified, unnecessary and unhelpful.

      We use these terms as established in Hersh et al. 2022 and the works leading up to it over the last 20 years in the study of sperm whales. These are often derived from definitions in Boyd and Richerson’s work on culture in humans and animals, along with work on the evolution of symbolic markers both in theory and in humans. We agree with the reviewer that these are difficult to establish in non-humans, whales or otherwise, but feel strongly that the accumulating evidence provides strong support for the function of these signals as symbolic markers of cultural groups, and that they likely evolved from initially arbitrary calls which were a part of the vocal repertoire (similar to the process and selective environment in Efferson et al. [L1] and McElreath et al. [L2]). We feel that we do not use these terms interchangeably here, and have inherited their use from definitions from anthropology. The work presented here uses terminology built across two decades of work in cetacean, and sperm whale, culture, and we do not feel that these terms should be omitted here.

      Comment 21

      Lines 21-27: Overly broad and hazy paragraph.

      We hope the replies above and our changes satisfy this comment and clarify the text.

      Comment 22

      Figure 1 legend: What are “memory structures”? Unjustified descriptor.

      The phrase was chosen to give some intuition for the variation of context length in variable length Markov models.

      Action - Re-worded from memory structures to statistical properties

      Comment 23

      Line 30: Omit ”finite”.

      Action - Omitted.

      Comment 24

      Line 31: Please define and distinguish “rhythm” and “tempo”. Also see comment above, rhythm and tempo definitions require the use of IOIs.

      We disagree with the reviewer’s claims here. In our research specifically, and for sperm whale research generally, coda inter-click intervals (ICIs) are calculated as the time between the start of the first click and the start of the subsequent click. This makes ICIs identical to inter-onset intervals (IOIs) under all definitions we are aware of. For example, Burchardt and Knornschild [L3] define IOIs as such: “In a sequence of acoustic signals, the time span between the start of an element and the next element, comprising the element duration and the following gap duration”. We now include a sentence making this point.

      Regardless, we disagree on a more fundamental level with the statement that unless researchers quantify inter-onset intervals (IOIs), they cannot make any claims about rhythm. There are many studies that investigate rhythmic aspects of human and animal vocalizations without using IOIs [L4–L7]. If the duration of sound elements of interest is relatively constant (as is the case for sperm whale clicks), then rhythm analyses can still be meaningfully conducted on inter-call intervals (the silent intervals between calls).

      For sperm whales, coda rhythm is defined by the relative ICIs standardized by their total duration. These can be clustered into discrete, defined rhythm types based on characteristic ICI patterns. Coda tempo is relative to the total duration of the coda itself. This can also be clustered into discrete tempo types across all coda durations as well (see [L8]).

      Action - We added a sentence specifying that in this case we can use both ICIs and IOIs because of the standardized length of a single click.
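
      The definitions above can be sketched as follows, assuming click onset times in seconds and taking the total coda duration as a simple proxy for tempo. The function names and the example click times are hypothetical, and the clustering into discrete rhythm and tempo types described in [L8] is not reproduced here.

```python
def inter_click_intervals(click_times):
    """Onset-to-onset intervals between consecutive clicks, in seconds.
    Because each interval runs from one click start to the next,
    it is both an ICI and an IOI."""
    return [t2 - t1 for t1, t2 in zip(click_times, click_times[1:])]


def rhythm_and_tempo(click_times):
    """Return (rhythm, duration): the ICIs standardized by total coda
    duration, and the total duration itself as a tempo proxy."""
    if len(click_times) < 2:
        raise ValueError("a coda needs at least two clicks")
    icis = inter_click_intervals(click_times)
    duration = sum(icis)
    rhythm = [ici / duration for ici in icis]
    return rhythm, duration


# Two hypothetical codas with the same click pattern played at different speeds.
fast_rhythm, fast_duration = rhythm_and_tempo([0.0, 0.2, 0.4, 0.8])
slow_rhythm, slow_duration = rhythm_and_tempo([0.0, 0.3, 0.6, 1.2])
```

      The two codas share the same standardized rhythm but differ in duration, so rhythm and tempo are separated in exactly the sense described above.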

      Comment 25

      Line 36: Are there non-vocalized codas to require the disambiguation here?

      No; we have omitted it for clarity.

      Comment 26

      Line 44: “Higher” than which other social group class?

      Sperm whales live in a multi-level social organization. Clans are a “higher” level of social organization than the social “units” which we define in line 40. Clans are made up of all units which share a similar production repertoire of codas.

      Action - We have added ’above social units’ on line 44 to make this clear.

      Comment 27

      Line 47: The use of “symbolic” continues to be enigmatic, even if authors are taking in this classification from other researchers. In signal theory (semiotics), not all biomarkers are necessarily symbols. I advise the authors to avoid the use of the term colloquially and instead adopt the definition used in the research field within which the study falls in.

      There are ample examples of the use of “symbolic” when referring to markers of in-group membership in both human and non-human cultures. Our choice to use the term “symbolic” is based on a previous study [L9] that found quantitative evidence that sperm whale identity codas function as symbolic markers of cultural identity, at least for Pacific Ocean clans. The full reasoning behind why the authors used the term “symbolic markers” is given in that paper, but briefly, they found evidence that identity coda usage becomes more distinct as clan overlap increases, while non-identity coda usage does not change. This matches theoretical and empirical work on human symbolic markers [L1, L2, L10, L11].

      Action - We retain the use of the term here, as defined in the works cited, and based on its prior usage in the study of both human and non-human cultures.

      Comment 28

      Line 50: This statement is not technically accurate. The use of a signal as a marker by individuals can only be determined by how individuals “interpret” and react to that signal - e.g., via playback experiments - it cannot be determined by how different populations use and produce the signals.

      We respectfully disagree. While we agree that the optimal situation would be that of playback, the contextual use can provide insight into the functional use of signals, as can expected patterns of use and variation, as was tested in the papers we cite. However, this argument is neither the scope nor the synthesis of this paper. These statements are supported by existing published works, as cited, and we encourage the reviewer to take exception to those papers.

      Comment 29

      Line 69: “Meaningful speech characteristics”??? These terms do not logically or technically follow the previous statement. Why not stay faithful to the results and state that the method used seems to be valid and reliable because it confirms former studies and methods?

      Action - Reworded to better underline the agreement of the method’s results with previous studies.

      Comment 30

      Lines 72-74: This statement doesn’t seem to accurately capture/explain/resume the difference between ID and non-ID codas.

      We are not sure what the reviewer is referring to in this case. The sentence in this case was meant to explain the different relations that ID/non-ID codas have with clan sympatry.

      Comment 31

      Line 75: The information provided in the few previous sentences does not allow the reader to understand why these results support the notion that cultural transmission and social learning occurs between clans.

      We conclude our introduction with a brief summary of our overall findings; the rest of the manuscript is then used to support these statements.

      Comment 32

      Table 1: So far, the authors refer to their analyses as capturing the “rhythm” of whale clicks. Consequently, it is not readily clear at this point why the authors rely on “ICIs” (inter click intervals) instead of the “universal” measure used across taxa to capture the rhythm of signal sequences - IOIs (inter onset intervals). If ICIs are the same measure as IOIs, why not use the common term, instead of creating a new term name? Alternatively, if ICIs are not equivalent to IOIs, then arguably the analyses do not capture the “rhythm” of whale clicks, as claimed by the authors. Any rhythmic claim will need to be based on IOI measures. In animal behaviour, stereotyped is primarily used to describe pathological, dysfunctional behaviour. I suggest the use of another adjective, such as “regular”, “repetitive”, “recurring”, “predictable”. Another deviation from typical terminology: “usage frequency” -> “production rate”. Why is a clan a “higher-order” level of social organization? This requires explanation, at least a mention, of what are the “lower-order” levels. To the non-expert reader, there is a logical circularity/gap here: Clans are said to produce clan-specific codas, and then, it is said that codas are used to delineate clans. Either one deduces, or one infers, but not both. This raises the question, are clans confirmed by any other means than codas?

      We are not creating a “new term name”: inter-click interval (ICI) is the standard terminology used in odontocete (toothed whale) research. We take the reviewer’s point that some readers will not be coming to our paper with that background, however, and now explicitly point out that ICI is synonymous with IOI for sperm whales. Please see our response to your earlier comment for more on this point.

      Comment 33

      Line 92: Unclear term, “sub-sequence”. Fig. 1B doesn’t seem to readily help disambiguate the meaning of the term.

      In fact, the reference to Fig. 1B was misplaced, as the figure does not relate to this passage. A sub-sequence is simply a contiguous subset of a coda.

      Action - Removed ambiguous reference to Fig. 1B

      Comment 34

      Line 94: How does the use of “sequence” compare here with “sub-sequence” above?

      In fact, it is the same situation, although the previous comment highlighted a source of ambiguity.

      Action - Reworded the sentence to be less confusing.

      Comment 35

      Line 95: Signal sequences don’t “contain” memory, they require memory for processing.

      Action - Rephrased from “sequences contain memory” to “states depend on previous sequences of varying length”.

      Comment 36

      Lines 95-97: The analogy with human language seems forced, combinatorics in any given species are expected to entail different transitions between unit/unit-sequences.

      Thank you for the comment. Indeed, the purpose of the analogy is to illustrate how variable-length Markov chains work (which have been shown to be good at discerning even accents of the same language). We used human language as an analogy to give readers a more intuitive understanding of the results.

      Action - Revised paragraph to read: “Although we do not have direct evidence of unitary blocks in sperm whale communication, one can imagine this effect as similar to what happens with words (e.g., a word beginning with “re” can continue in more ways than one starting with “zy”).”
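
      The word analogy can be sketched as follows, assuming a tiny hypothetical word list (not sperm whale data, and not the authors' code). The helper counts which letters directly follow a given prefix; a prefix with many possible continuations is less predictive of what comes next than one with few.

```python
def continuations(words, prefix):
    """Return the set of letters that directly follow `prefix` in `words`."""
    n = len(prefix)
    return {w[n] for w in words if w.startswith(prefix) and len(w) > n}

# Hypothetical toy vocabulary, chosen only to illustrate the analogy.
vocabulary = ["read", "red", "rest", "rise", "zygote", "zygoma"]

after_re = continuations(vocabulary, "re")  # letters seen after "re"
after_zy = continuations(vocabulary, "zy")  # letters seen after "zy"
```

      In this toy list, "re" is followed by three different letters and "zy" by only one. Roughly speaking, a variable-length Markov chain keeps a longer context only where it changes the distribution of what follows.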

      Comment 37

      Line 97: Unclear which possibility is this.

      Action - Made the wording clearer.

      Comment 38

      Line 99: Invocation of memory, although common in the use of Markov chains, is inadequate here given that the research did not study how individuals perceived or processed click sequences, only how individuals produced click sequences. If the authors are referring to the cognitive load imposed by producing click sequences, terms such as “sequence planning” will be more accurate.

      Here, we use the term “fixed-memory” in relation to the definition of a variable length Markov model. We feel that, in this section of the manuscript, the context is clear that it is a mathematical definition and in no way invokes the biological idea of memory or cognition. It is rather standard to use memory to describe the order of Markov chains. Swapping words in the definition of mathematical objects when the context is clear seems to cause unnecessary ambiguity.

      Action - We clarified this in the manuscript (see comments above).

      Reviewer #3 (Recommendations):

      Comment 39

      Line 16: Add “broadly defined” as there are many other more restricted definitions (see for example Tomasello 1999; 2009). Tomasello M (1999) The cultural origins of human cognition. Harvard University Press, Cambridge. Tomasello M (2009) The question of chimpanzee culture, plus postscript (chimpanzee culture 2009). In: Laland KN, Galef BG (eds) The question of animal culture. Harvard University Press, Cambridge, pp 198-221.

      Thanks for the clarification.

      Action - We added the term “broadly” and added the last reference.

      Comment 40

      Line 22: Is all stable social learned behavior that becomes idiosyncratic and “distinguishable” considered symbolic markers? If not, consider adding “potentially.”

      No, but the evolution of cultural groups with differing behavior can reorganize the selective environment in such a way that it can favour an in-group bias that was not initially advantageous to individuals and lead to a preference towards others who share an overt symbolic marker that initially had no meaning and a random frequency in both populations. That is to say, even randomly assigned trivial groups can evolve arbitrary symbolic markers through in-group favouritism once behavioural differences exist even in the absence of any history of rivalry, conflict, or competition between groups. See for example [L1, L2].

      Comment 41

      Table 1: Identity codas are defined as a “Subset of coda types most frequently used by a sperm whale clan; canonically used to define vocal clans.” Therefore, I infer that an identity coda is not exclusively used by a specific clan and may be utilized by other clans, albeit less frequently. If this is the case, what criteria determine the frequency of usage for a coda to be categorized as an identity or non-identity coda? Do the criteria used to differentiate between ID and non-ID codas reflect the observed differences in micro changes between the two and within clans?

      The methods for this categorization are defined, discussed, and justified in previous work [L9, L12]. We feel it is outside the scope of this paper to review these details here. However, the differences between vocal styles discussed here and the frequency production repertoires which allow for the definition of identity codas are on different scales. The differences between identity and non-identity codas are not the observed differences in vocal style reported here.

      Comment 42

      Table 1: The definition of vocal style states that it “Encodes the rhythmic variations within codas.” However, if rhythm changes, does the type of coda change as well? Typically, in musical terms, the component that maintains the structure of a rhythm is “tempo,” not “rhythm.” How much microvariation is acceptable to maintain the same rhythm, and when do these variations constitute a new rhythm?

      Thank you for raising this important point about the relationship between rhythmic variations and coda categorization. In our definition, “vocal style” refers to subtle, micro-level variations in the rhythmic structure of codas that do not alter their overarching categorical identity. These microvariations are akin to “tempo” changes in musical terms, which can modify the expression of a rhythm without fundamentally altering its structure.

      The threshold at which microvariations constitute a new rhythm, and thus a new coda type, remains an open question and is a limitation of current analytical approaches. In our study, we used established classification methods to group codas into types, treating variations within these groups as part of the same rhythm. Future work could refine these thresholds to better distinguish between meaningful rhythmic variation and the emergence of new coda types.

      Comment 43

      Table 1: Change “say” to “vocalize” (similarly as used in line 273 for humpback whales “vocalizations”).

      Thanks.

      Action - Done.

      Comment 44

      Lines 33-35 and Figure 1-C: Can a lay listener discern the microvariations within each coda type by ear? Consider including sound samples of individual rhythmic microvariations for the same coda type pattern (e.g., Four plus, Palindrome, Plus One, Regular) to provide readers/listeners with an impression of their detectability. If the authors consider this too much or redundant Supplemental material, at least give a sound sample for each of the 4 subcoda modeled structure examples of 4R2 coda variations depicted in Figure 1-C so the reader can have an acoustic impression of them.

      We do not think that human listeners would be able to discern all of the variation detected here. However, this does not mean that the variation is unimportant to the whales. The ability of human observers to classify call variation aurally should not be seen as a bar for biologically important variation in non-human species, given that their hearing and vocal production systems have evolved independently. Importantly, ‘Four Plus’, ‘Palindrome’, etc. are names of clans: sympatric, but socially segregated, communities of whale families that share a distinct vocal dialect of coda types. Each clan has a distinguishable coda dialect made up of dozens of coda types (and delineated based on identity codas); these are not names of categorical coda types themselves.

      Action - We now provide audio samples of all coda types listed in Figure 1B in the paper’s Github repository.

      Comment 45

      Line 69: As stated above, it may be confusing to refer to it as “speech.” I suggest adding something like: “Our method does capture one essential characteristic of human speech: phonology.”

      Thank you for drawing our attention to this.

      Action - We removed the word “speech” from the manuscript, using “communication” and/or “vocalization” depending on the context.

      Comment 46

      Line 111-112: Consider adding a sound sample of the variation of the 4R2 coda type that can be vocalized as BCC but also as CBB as supplementary data.

      What the reviewer has correctly observed is that the traditional categorical coda type ’names’ do not capture the variation within a type by rhythm nor by tempo.

      Action - We have added samples of all coda types listed in Figure 1B in the paper’s Github repo.

      Comment 47

      Figure 3: Include a sound sample for each of the 7 coda types in Figure 1B (“specific vocal repertoires”) to illustrate the set of coda types used and their associated usage frequencies, or at least for each of the 7 coda types in Figure 3 and tables S1 and S2.

      Sperm whales in the Eastern Caribbean produce dozens of rhythm types across at least five categorical tempo types [L8, L13]. The coda types represented in Figure 1B do not demonstrate all the variability inherent in the sperm whales’ vocal dialect. Importantly, Figure 3, as well as Tables S1 and S2, refer to clan-level dialects, not specific individual coda types.

      Action - We added sound samples for each coda rhythm type listed in Figure 1B to the Github repository.

      Comment 48

      Lines 184-190: It is unclear what human analogy term is used for ID codas. This needs clarification.

      We are not drawing a human analogy for the role of ID vs non-ID codas, but only providing the example of accents as changes in vocalization (style) without a change in the actual words used (repertoire).

      Action - We tried to make it clearer in the manuscript.

      Comment 49

      Line 190: Change “whale speech” to “whale vocalizations.”

      Thanks.

      Action - Done.

      Comment 50

      Figure 4: Correct citation number Hersh “10” to Hersh “11.”

      Thanks.

      Action - Fixed the reference.

      Comment 51

      Lines 224-232: Clarify whether the reference to how spatial overlap affects the frequency of ID codas refers to shared ID codas between clans or the production frequency of each coda within the total repertoire of codas.

      The similarity between ID coda repertoires we are referring to there is based on the ID codas of both clans.

      More details on the comparison can be found in [L9].

      Action - We added a sentence explaining the comparison is made using the joint set of ID codas.

      Comment 52

      Lines 240-241: What are non-ID codas vocal cues for?

      Non-ID codas likely serve as flexible, context-dependent signals that facilitate group coordination, convey environmental or social context, and promote social learning, especially in mixed-clan or overlapping habitats. Their variability suggests multifunctional roles shaped by ecological and social pressures.

      Comment 53

      Lines 267-268: It’s unclear whether non-ID coda vocal styles are genetically inherited or not, as argued in lines 257-258.

      We did not intend to argue that non-ID coda vocal styles are genetically inherited. Instead, we aimed to present a hypothetical consideration: if non-ID coda vocal styles were genetically inherited, one would expect a direct correlation between vocal style similarity and genetic relatedness. This hypothetical framework was introduced to strengthen our argument that the observed patterns are unlikely to be explained by genetic inheritance, as such correlations have not been observed. While we acknowledge that we lack definitive proof to rule out genetic influences entirely, the evidence available strongly suggests that social learning, rather than genetic transmission, is the more plausible mechanism.

      Action - Clarified in manuscript.

      Comment 54

      Line 277: Can males mate with females from different clans?

      Yes, genetic evidence shows that males may even switch ocean basins.

      Action - We have clarified that we mean that female members of units from different clans have only rarely been observed to interact at sea.

      Comment 55

      Lines 287-292: Consider discussing the difference between controlled/voluntary and automatic/involuntary imitation and their implications for cultural selection and social learning (see Heyes 2011; 2012). Heyes, C. (2011). Automatic imitation. Psychological bulletin, 137(3), 463. Heyes, C. (2012). What’s social about social learning?. Journal of comparative psychology, 126(2), 193.

      Thank you for your insightful comment regarding this. The distinction between controlled/voluntary and automatic/involuntary imitation, as highlighted by Heyes [L14, L15], provides a potentially valuable framework for interpreting social learning mechanisms in sperm whales. Automatic imitation refers to reflexive, often unconscious mimicry driven by perceptual or motor coupling, while controlled imitation involves deliberate and goal-directed efforts to replicate behaviors. Both forms likely play complementary roles in the cultural transmission observed in sperm whales.

      This dual-process perspective highlights the potential for cultural selection to act at different levels. Automatic imitation may drive convergence in shared environments, promoting acoustic homogeneity and facilitating inter-clan communication. In contrast, controlled imitation ensures the preservation of clan-specific vocal traditions, maintaining cultural diversity. This interplay between automatic and controlled processes could reflect a balancing act between cultural assimilation and differentiation, underscoring the adaptive value of these mechanisms in dynamic social and ecological contexts.

      Action - We have incorporated a short discussion of this distinction and its implications for our findings in the Discussion. Additionally, we have cited [L14, L15] to provide theoretical grounding for this interpretation.

      Comment 56

      Methods: Consider integrating the paragraph from lines 319-321 into lines 28-35 and eliminate redundant information.

      Thanks.

      Action - We implemented the suggestion, removing the first paragraph of the Dataset description and integrating the information when we introduce the concepts of codas and clicks.

      [L1] C. Efferson, R. Lalive, and E. Fehr, Science 321, 1844 (2008).

      [L2] R. McElreath, R. Boyd, and P. Richerson, Curr. Anthropol. 44, 122 (2003).

      [L3] L. S. Burchardt and M. Knornschild, PLoS Computational Biology 16, e1007755 (2020).

      [L4] A. Ravignani and K. de Reus, Evolutionary Bioinformatics 15, 1176934318823558 (2019).

      [L5] C. T. Kello, S. D. Bella, B. Medé, and R. Balasubramaniam, Journal of the Royal Society Interface 14, 20170231 (2017).

      [L6] D. Gerhard, Canadian Acoustics 31, 22 (2003).

      [L7] N. Mathevon, C. Casey, C. Reichmuth, and I. Charrier, Current Biology 27, 2352 (2017).

      [L8] P. Sharma, S. Gero, R. Payne, D. F. Gruber, D. Rus, A. Torralba, and J. Andreas, Nature Communications 15, 3617 (2024).

      [L9] T. A. Hersh, S. Gero, L. Rendell, M. Cantor, L. Weilgart, M. Amano, S. M. Dawson, E. Slooten, C. M. Johnson, I. Kerr, et al., Proc. Natl. Acad. Sci. 119, e2201692119 (2022).

      [L10] R. Boyd and P. J. Richerson, Cult Anthropol 2, 65 (1987).

      [L11] E. Cohen, Curr. Anthropol. 53, 588 (2012).

      [L12] T. A. Hersh, S. Gero, L. Rendell, and H. Whitehead, Methods Ecol. Evol. 12, 1668 (2021), ISSN 2041-210X, 2041-210X.

      [L13] S. Gero, A. Bøttcher, H. Whitehead, and P. T. Madsen, R. Soc. Open Sci. 3, 160061 (2016).

      [L14] C. Heyes, Psychological Bulletin 137, 463 (2011).

      [L15] C. Heyes, Journal of Comparative Psychology 126, 193 (2012).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The manuscript by Cao et al. examines an important but understudied question of how chronic exposure to heat drives changes in affective and social behaviors. It has long been known that temperature can be a potent driver of behaviors and can lead to anxiety and aggression. However, the neural circuitry that mediates these changes is not known. Cao et al. take on this question by integrating optical tools of systems neuroscience to record and manipulate bulk activity in neural circuits, in combination with a creative battery of behavior assays. They demonstrate that chronic daily exposure to heat leads to changes in anxiety, locomotion, social approach, and aggression. They identify a circuit from the preoptic area (POA) to the posterior paraventricular thalamus (pPVT) in mediating these behavior changes. The POA-PVT circuit increases activity during heat exposure. Further, manipulation of this circuit can drive affective and social behavioral phenotypes even in the absence of heat exposure. Moreover, silencing this circuit during heat exposure prevents the development of negative phenotypes. Overall the manuscript makes an important contribution to the understudied area of how ambient temperature shapes motivated behaviors.

      Strengths:

      The use of state-of-the-art systems neuroscience tools (in vivo optogenetics and fiber photometry, slice electrophysiology), chronic temperature-controlled experiments, and a rigorous battery of behavioral assays to determine affective phenotypes. The optogenetic gain of function of affective phenotypes in the absence of heat, and loss of function in the presence of heat are very convincing manipulation data. Overall a significant contribution to the circuit-level instantiation of temperature-induced changes in motivated behavior, and creative experiments.

      Weaknesses:

      (1) There is no quantification of cFos/rabies overlap shown in Figure 2, and no report of whether the POA-PVT circuit has a higher percentage of Fos+ cells than the general POA population. Similarly, there is no quantification of cFos in POA recipient PVT cells for Figure 2 Supplement 2.

      Thanks for the comment. Quantification of the c-Fos signal is now provided in the main text and figures.

      (2) The authors do not address whether stimulation of POA-PVT also increases core body temperature in Figure 3 or its relevant supplements. This seems like an important phenotype to make note of and could be addressed with a thermal camera or telemetry.

      Thanks for raising this point. We did indeed monitor the core body temperature during stimulation of POA-PVT pathway, but we did not observe any significant changes. We have included this finding in the revised manuscript.

      (3) In Figure 3G: is Day 1 vs Day 22 "pre-heat" significant? The statistics are not shown, but this would be the most conclusive comparison to show that POA-PVT cells develop persistent activity after chronic heat exposure, which is one of the main claims the authors make in the text. This analysis is necessary in order to make the claim of persistent circuit activity after chronic heat exposure.

      Figure 3G does compare Day 1 pre-heat to Day 22 pre-heat, and the difference was significant. The wording has been corrected to avoid confusion. We have also modified Figures 3D to 3H in our revised manuscript to improve the clarity of these plots.

      (4) In Figure 4, the control virus (AAV1-EYFP) is a different serotype and reporter than the ChR2 virus (AAV9-ChR2-mCherry). This discrepancy could lead to somewhat different baseline behaviors.

      Thanks for bringing up this issue. We acknowledge that using AAV1-EYFP (a different serotype and reporter compared to AAV9-ChR2-mCherry) as our control virus is not ideal. However, in our own prior experiments, we observed no significant differences in baseline behaviors between animals injected with AAV1-EYFP, animals injected with AAV9-EYFP, and control mice without virus injection. We therefore believe that the baseline behaviors of the animals were unaffected.

      (5) In Figure 5G, N for the photometry data: the authors assess the maximum z-score as a measure of the strength of calcium response, however the area under the curve (AUC) is a more robust and useful readout than the maximum z score for this. Maximum z-score can simply identify brief peaks in amplitude, but the overall area under the curve seems quite similar, especially for Figure 5N.

      Thanks for the comment. We agree with the reviewer that the area under the curve (AUC) is an alternative readout for the strength of the calcium response. However, we chose the maximum z-score because we observed that POA-recipient pPVT neurons exhibited a higher calcium peak during certain behaviors after chronic heat treatment than under pre-heat conditions. We therefore used the maximum z-score to describe the changes in neuronal activity during those behaviors before and after chronic heat treatment. We also wanted to reflect that, under the same stressful situations, POA-recipient pPVT neurons become more sensitive and more easily activated after chronic heat exposure than in control mice. We consider the maximum z-score, taken at the peak associated with a particular behavior, more suitable for highlighting the findings of this study.
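
      The difference between the two readouts discussed here can be illustrated with a minimal sketch on two hypothetical z-scored traces. The values are invented for illustration and are not photometry data, and the function names `max_z` and `auc` are assumptions rather than the authors' analysis code. The two traces have the same area under the curve but different peaks, which is the situation the reviewer describes.

```python
def max_z(trace):
    """Peak value of a z-scored trace."""
    return max(trace)

def auc(trace, dt):
    """Trapezoidal area under a uniformly sampled trace (sample spacing dt)."""
    return sum((a + b) / 2 * dt for a, b in zip(trace, trace[1:]))

# Hypothetical z-scored traces sampled every 1 s (illustrative values only).
brief_peak = [0.0, 0.0, 4.0, 0.0, 0.0]  # short, high-amplitude transient
sustained = [0.0, 1.0, 2.0, 1.0, 0.0]   # lower but longer-lasting response

peak_values = (max_z(brief_peak), max_z(sustained))
areas = (auc(brief_peak, 1.0), auc(sustained, 1.0))
```

      Which readout is appropriate depends on whether the quantity of interest is peak amplitude or integrated activity over the behavioral window.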

      (6) For Fig 5V: the authors run the statistics on behavior bouts pooled from many animals, but it is better to do this analysis as an animal average, not by compiling bouts. Compiling bouts over-inflates the power and can yield significant p values that would not exist if the analysis were carried out with each animal as an n of 1.

      Thanks for the comment and suggestion. We tried both methods and the statistical results were similar. As suggested, we have updated Fig. 5V, as well as Fig. 5H and 5O, to compare animal averages in our revised manuscript.
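
      The two approaches can be contrasted with a minimal sketch, assuming bout-level measurements are grouped by animal. The animal IDs and values below are hypothetical and serve only to show how the sample size changes. Pooling treats every bout as an independent observation, whereas averaging within each animal first gives one value per animal.

```python
from statistics import mean

# Hypothetical bout-level measurements keyed by animal ID (illustrative values).
bouts = {
    "mouse_a": [1.0, 2.0, 3.0],
    "mouse_b": [4.0, 6.0],
    "mouse_c": [5.0],
}

# Pooled bouts: n equals the total number of bouts across all animals.
pooled = [v for values in bouts.values() for v in values]

# Animal averages: n equals the number of animals.
per_animal = [mean(values) for values in bouts.values()]
```

      Here the pooled analysis has n = 6 while the per-animal analysis has n = 3, and animals contributing more bouts no longer weigh more heavily in the comparison.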

      (7) In general this is an excellent analysis of circuit function but leaves out the question of whether there may be other inputs to pPVT that also mediate the same behavioral effect. Future experiments that use activity-dependent Fos-TRAP labeling in combination with rabies can identify other inputs to heat-sensitive pPVT cells, which may have convergent or divergent functions compared to the POA inputs.

      Thanks for the valuable suggestion, which would enhance the conclusion. We will consider adopting this approach in future investigations into this question.

      Reviewer #2 (Public review):

      Summary

      The study by Cao et al. highlights an interesting and important aspect of heat- and thermal biology: the effect of repetitive, long-term heat exposure and its impact on brain function.

      Even though peripheral, sensory temperature sensors and afferent neuronal pathways conveying acute temperature information to the CNS have been well established, it is largely unknown how persistent, long-term temperature stimuli interact with and shape CNS function, and how these thermally-induced CNS alterations modulate efferent pathways to change physiology and behavior. This study is therefore not only novel but, given global warming, also timely.

      The authors provide compelling evidence that neurons of the paraventricular thalamus change plastically over three weeks of episodic heat stimulation and they convincingly show that these changes affect behavioral outputs such as social interactions, and anxiety-related behaviors.

      Strengths

      (1) It is impressive that the assessed behaviors can be (i) recruited by optogenetic fiber activation and (ii) inhibited by optogenetic fiber inhibition when mice are exposed to heat. Technically, when/how long is the fiber inhibition performed? It says in the text "3 min on and 3 min off". Is this only during the 20-minute heat stimulation or also at other times?

      Thanks for pointing out the need for clarification. Optogenetic inhibition was conducted for 21 days, during the heat exposure period (90 min) for each mouse. To avoid light-induced heating, we applied a cyclical mode of 3 minutes of light on and 3 minutes of light off, only during heat exposure and not at other times. A detailed description has been added to the Methods section of our revised manuscript.

      (2) It is interesting that the frequency of activity in pPVT neurons, as assessed by fiber photometry, stays increased after long-term heat exposure (day 22) when mice are back at normal room temperature. This appears similar to a previous study that found long-term heat exposure to transform POA neurons plastically to become tonically active (https://www.biorxiv.org/content/10.1101/2024.08.06.606929v1). Interestingly, the POA neurons that become tonically active by persistent heat exposure described in the above study are largely excitatory, and thus these could drive the activity of the pPVT neurons analyzed in this study.

      Thanks for pointing out this study that suggests similar plasticity of POA neurons under long-term heat exposure serving a different purpose. We have included this information in our discussion as well.  

      (3) How can it be reconciled that the majority of the inputs from the POA are found to be largely inhibitory (Fig. 2H)? Is it possible that this result stems from the fact that non-selective POA-to-pPVT projections are labelled by the approach used in this study and not only those pathways activated by heat? These points would be nice to discuss.

      Thanks for raising these important questions. Although it is not our primary focus, we are aware of the substantial inhibitory input from the POA to the pPVT, which suggests an important function. However, we do not think that this pathway, which would exert an opposite effect on POA-recipient pPVT neurons compared to the excitatory input, contributes to the long-term effect of chronic heat exposure, because the excitability of these neurons increased rather than decreased. It is possible that this inhibitory input serves as a short-term inhibitory control for other purposes. Further work is needed to fully address this question.

      (4) It is very interesting that no LTP can be induced after chronic heat exposure (Figures K-M); the authors suggest that "the pathway in these mice were already saturated" (line 375). Could this hypothesis be tested in slices by employing a protocol to extinguish pre-existing (chronic heat exposure-induced) LTP? This would provide further strength to the findings/suggestion that an important synaptic plasticity mechanism is at play that conveys behavioral changes upon chronic heat stimulation.

      We agree with the reviewer that the results of the suggested experiment would further strengthen our hypothesis. We will try to confirm this in future studies.

      (5) It is interesting that long-term heat does not increase parameters associated with depression (Figure 1N-Q), how is it with acute heat stress, are those depression parameters increased acutely? It would be interesting to learn if "depression indicators" increase acutely but then adapt (as a consequence of heat acclimation) or if they are not changed at all and are also low during acute heat exposure.

      We did not find increased depression parameters after acute heat stress in our experiments (data not shown), which is consistent with two previous studies (Beas et al., 2018; Zhang et al., 2021). It appears that acute heat stress is more associated with anxiety-like behavior and may not be sufficient to induce depression-like phenotypes in rodents.

      Beas BS, Wright BJ, Skirzewski M, Leng Y, Hyun JH, Koita O, Ringelberg N, Kwon HB, Buonanno A, Penzo MA (2018) The locus coeruleus drives disinhibition in the midline thalamus via a dopaminergic mechanism Nat Neurosci 21:963-973.

      Zhang GW, Shen L, Tao C, Jung AH, Peng B, Li Z, Zhang LI, Whit Tao HZ (2021) Medial preoptic area antagonistically mediates stress-induced anxiety and parental behavior Nat Neurosci 24:516-528.

      Weaknesses/suggestions for improvement.

      (1) The introduction and general tenet of the study is, to us, a bit too one-sided/biased: generally, repetitive heat exposure --heat acclimation-- paradigms are known to not only be detrimental to animals and humans but also convey beneficial effects in allowing the animals and humans to gain heat tolerance (by strengthening the cardiovascular system, reducing energy metabolism and weight, etc.).

      Thanks for the suggestion. We have modified the introduction in our revised manuscript to make it more balanced.

      (2) The point is well taken that these authors here want to correlate their model (90 minutes of heat exposure per day) to heat waves. Nevertheless, and to more fully appreciate the entire biology of repetitive/chronic/persistent heat exposure (heat acclimation), it would be helpful to the general readership if the authors would also include these other aspects in their introduction (and/or discussion) and compare their 90-minute heat exposure paradigm to other heat acclimation paradigms. For example, many past studies (using mice or rats) have used more subtle temperatures but permanently (and not only for 90 minutes) stimulated them over several days and weeks (for example see PMID: 35413138). This can have several beneficial effects related to cardiovascular fitness, energy metabolism, and other aspects. In this regard: 38°C used in this study is a very high temperature for mice, in particular when they are placed there without acclimating slowly to this temperature but are directly placed there from normal ambient temperatures (22°C-24°C) which is cold/coolish for mice. Since the accuracy of temperature measurement is given as +/- 2°C, it could also be 40°C; at this temperature, 40°C, non-heat acclimated C57bl/6 mice will not survive for long.

      The authors could consider discussing that this very strong, short episodic heat-stress model used here in this study may emphasize detrimental effects of heat, while more subtle long-term persistent exposure may be able to make animals adapt to heat, become more tolerant, and perhaps even prevent the detrimental cognitive effects observed in this study (which would be interesting to assess in a follow-up study).

      Thanks for pointing out the important aspect regarding the different heat exposure paradigms and their potential impacts. We have incorporated these points into both the Introduction and Discussion sections of the revised manuscript.

      (3) Line 140: It would help to be clear in the text that the behaviors are measured 1 day after the acute heat exposure - this is mentioned in the legend to the figure, but we believe it is important to stress this point also in the text. Similarly, this is also relevant for chronic heat stimulation: it needs to be made very clear that the behavior is measured 1 day after the last heat stimulus. If the behaviors had been measured during the heat stimulus, the results would likely be very different.

      Thanks for the suggestion, and we have clarified the procedure in the revised manuscript.

      (4) Figure 2 D and Figure 2- Figure Supplement 1: since there is quite some baseline cFos activity in the pPVT region we believe it is important to include some control (room temperature) mice with anterograde labelling; in our view, it is difficult/not possible to conclude, based on Fig 2 supplement 2C, that nearly 100% of the cfos positive cells are contacted by POA fibre terminals (line 168). By eye there are several green cells that don't have any red label on (or next to) them; additionally, even if there is a little bit of red signal next to a green cell: this is not definitive proof that this is a synaptic contact. It is therefore advisable to revisit the quantification and also revisit the interpretation/wording about synaptic contacts.

      In relation to the above: Figure 2h suggests that all neurons are connected (the majority receiving inhibitory inputs), is this really the case, is there not a single neuron out of the 63 recorded pPVT neurons that does not receive direct synaptic input from the POA?

      Thanks for the comments. For Figure 2-figure supplement 1, the baseline c-Fos activity in the pPVT was indeed measured from mice kept at room temperature. This baseline activity may be attributed to the diverse functions that the pPVT serves. In the heat-exposed group, we observed significantly greater c-Fos signals than in this room-temperature group, suggesting an effect of heat exposure.

      For Figure 2-figure supplement 2, through targeted injection of AAV1-Cre into the POA, we achieved selective expression of Cre-dependent ChR2-mCherry in pPVT neurons receiving POA inputs. Following heat exposure, we observed substantial colocalization between heat-induced c-Fos expression (green signal) and ChR2-mCherry-labeled neurons (red signal) in the pPVT. This extensive overlap indicates that POA-recipient pPVT neurons are predominantly heat-responsive and likely mediate the behavioral alterations induced by chronic heat exposure. We have validated these signals and included updated quantification in our revised manuscript.

      For Fig 2H, we specifically patched those neurons that were surrounded by red fluorescence under the microscope, ensuring that the patched neurons had a high likelihood of being innervated from POA. This is why all 63 recorded pPVT neurons were found to receive direct synaptic input from the POA.

      (5) It would be nice to characterize the POA population that connects to the pPVT, it is possible/likely that not only warm-responsive POA neurons connect to that region but also others. The current POA-to-pPVT optogenetic fibre stimulations (Figure 4) are not selective for preoptic warm responsive neurons; since the POA subserves many different functions, this optogenetic strategy will likely activate other pathways. The referees acknowledge that molecular analysis of the POA population would be a major undertaking. Instead, this could be acknowledged in the discussion, for example in a section like "limitation of this study".

      Thanks for the suggestion. We have supplemented this part in our revised manuscript.

      (6) Figure 3a the strategy to express Gcamp in a Cre-dependent manner: it seems that the Gcamp8f signal would be polluted by EGFP (coming from the Cre virus injected into the POA): The excitation peak for both is close to 490nm and emission spectra/peaks of GCaMP8f (510-520 nm) and EGFP (507-510 nm) are also highly overlapping. We presume that the high background (EGFP) fluorescence signal would preclude sensitive calcium detection via Gcamp8f, how did the authors tackle this problem?

      Thank you for pointing out this issue. We acknowledge that we included AAV1-EGFP when recording the GCaMP8F signal to assist in the post hoc verification of the injection site. However, we also collected recording data from mice with AAV1-Cre without EGFP injected into the POA and Cre-dependent GCaMP8F in the pPVT, albeit in a smaller number. We did not observe any obvious differences in the change in calcium signal between these two virus strategies, suggesting that the sensitivity of the GCaMP signals was not significantly affected by the increased baseline fluorescence due to EGFP.

      (7) How did the authors perform the social interaction test (Figures 1F, G)? Was the intruder mouse male or female? If it was a male mouse would the interaction with the female mouse be a form of mating behavior? If so, the interpretation of the results (Figures 1F, G) could be "episodic heat exposure over the course of 3 weeks reduces mating behavior".

      Thanks for the comment. For this female encounter test, we strictly followed the protocol of Ago et al. (2015). During this test, both the stranger male and female mice were placed into a wire cup (made of metal wire mesh, with each hole measuring 0.5 cm [L] x 0.5 cm [W]), which prevented extensive body contact and mating behavior and allowed only innate sex-motivated movement around the cup. We have added these details to the Methods section of our revised manuscript.

      Ago Y, Hasebe S, Nishiyama S, Oka S, Onaka Y, Hashimoto H, Takuma K, Matsuda T (2015) The Female Encounter Test: A Novel Method for Evaluating Reward-Seeking Behavior or Motivation in Mice Int J Neuropsychopharmacol 18: pyv062.

      Reviewer #3 (Public review):

      In this study, Cao et al. explore the neural mechanisms by which chronic heat exposure induces negative valence and hyperarousal in mice, focusing on the role of the posterior paraventricular nucleus (pPVT) neurons that receive projections from the preoptic area (POA). The authors show that chronic heat exposure leads to heightened activity of the POA projection-receiving pPVT neurons, potentially contributing to behavioral changes such as increased anxiety level and reduced sociability, along with heightened startle responses. In addition, using electrophysiological methods, the authors suggest that increased membrane excitability of pPVT neurons may underlie these behavioral changes. The use of a variety of behavioral assays enhances the robustness of their claim. Moreover, while previous research on thermoregulation has predominantly focused on physiological responses to thermal stress, this study adds a unique and valuable perspective by exploring how thermal stress impacts affective states and behaviors, thereby broadening the field of thermoregulation. However, a few points warrant further consideration to enhance the clarity and impact of the findings.

      (1) The authors claim that behavior changes induced by chronic heat exposure are mediated by the POA-pPVT circuit. However, it remains unclear whether these changes are unique to heat exposure or if this circuit represents a more general response to chronic stress. It would be valuable to include control experiments with other forms of chronic stress, such as chronic pain, social defeat, or restraint stress, to determine if the observed changes in the POA-pPVT circuit are indeed specific to thermal stress or indicative of a more universal stress response mechanism.

      We share the reviewer's consideration and have indeed conducted experiments to explore this possibility. Our findings suggest that the POA-pPVT pathway may also mediate behavioral changes induced by other forms of chronic stress, e.g. chronic restraint stress. Nevertheless, given the well-known prominent role of POA neurons in heat perception, we do believe that the POA-pPVT pathway has a specialized role in mediating chronic heat-induced changes. The role of this pathway in other stress-related responses will need a more comprehensive study in the future.

      (2) The authors use the term "negative emotion and hyperarousal" to interpret behavioral changes induced by chronic heat (consistently throughout the manuscript, including the title and lines 33-34). However, the term "emotion" is broad and inherently difficult to quantify, as it encompasses various factors, including both valence and arousal (Tye, 2018; Barrett, L. F. 1999; Schachter, S. 1962). Therefore, the reviewer suggests the authors use a more precise term to describe these behaviors, such as valence. Additionally, in lines 117 and 137-139, replacing "emotion" with "stress responses," a term that aligns more closely with the physiological observations, would provide greater specificity and clarity in interpreting the findings.

      Thanks for the suggestion. We have modified the description of “emotion” to “emotional valence” in various places throughout the revised manuscript.

      (3) Related to the role of POA input to pPVT,

      a) The authors showed increased activity in pPVT neurons that receive projections from the POA (Figure 3), and these neurons are necessary for heat-induced behavioral changes (Figures 4N-W). However, is the POA input to the pPVT circuit truly critical? Since recipient pPVT neurons can receive inputs from various brain regions, the reviewer suggests that experiments directly inhibiting the POA-to-pPVT projection itself are needed to confirm the role of POA input. Alternatively, the authors could show that the increased activity of pPVT neurons due to chronic heat exposure is not observed when the POA is blocked. If these experiments are not feasible, the reviewer suggests that the authors consider toning down the emphasis on the role of the POA throughout the manuscript and discuss this as a limitation.

      b) In the electrophysiology experiments shown in Figures 6A-I, the authors conducted in vitro slice recordings on pPVT neurons. However, the interpretation of these results (e.g., "The increase in presynaptic excitability of the POA to pPVT excitatory pathway suggested plastic changes induced by the chronic heat treatment.", lines 349-350) appears to be an overclaim. It is difficult to conclude that the increased excitability of pPVT neurons due to heat exposure is specifically caused by inputs from the POA. To clarify this, the reviewer suggests the authors conduct experiments targeting recipient neurons in the pPVT, with anterograde labeling from the POA to validate the source of excitatory inputs.

      For point (a), we acknowledge that pPVT neurons receiving POA inputs may also receive projections from other brain regions. While these additional inputs warrant investigation, they fall beyond the scope of our current study and represent promising directions for future research. Notably, compared to other well-characterized regions such as the amygdala and ventral hippocampus, the pPVT receives particularly robust projections from hypothalamic nuclei (Beas et al., 2018). Our optogenetic inhibition of POA-recipient pPVT neurons during chronic heat exposure effectively prevented the influence of POA excitatory projections on pPVT neurons. Furthermore, selective optogenetic activation of POA excitatory terminals within the pPVT was sufficient to induce similar behavioral abnormalities in mice, strongly supporting the causal role of POA inputs in mediating chronic heat exposure-induced behavioral alterations.

      Beas BS, Wright BJ, Skirzewski M, Leng Y, Hyun JH, Koita O, Ringelberg N, Kwon HB, Buonanno A, Penzo MA (2018) The locus coeruleus drives disinhibition in the midline thalamus via a dopaminergic mechanism Nat Neurosci 21:963-973.

      Regarding point (b), we acknowledge certain limitations in our in vitro patch-clamp recordings when attributing increased pPVT neuronal excitability to enhanced presynaptic POA inputs. Nevertheless, our brain slice recordings clearly demonstrated heightened excitability of pPVT neurons following chronic heat exposure. This finding was further corroborated by our in vivo fiber photometry recordings specifically targeting POA-recipient pPVT neurons, which confirmed that the increased pPVT neuronal activity was indeed modulated by POA inputs. The causal relationship was strengthened by our observation that optogenetic activation of POA excitatory terminals within the pPVT reproduced behavioral abnormalities similar to those observed in chronic heat-exposed mice. Additionally, our inability to induce circuit-specific LTP in the POA-pPVT pathway suggests that these synapses were already potentiated and saturated, reflecting enhanced excitatory inputs from the POA to pPVT. Collectively, these findings support our conclusion that increased excitatory projections from the POA to pPVT likely represent a key mechanism underlying chronic heat exposure-induced behavioral alterations in mice.

      (4) The authors focus on the excitatory connection between the POA and pPVT (e.g., "Together, our results indicate that most of the pPVT-projecting POA neurons responded to heat treatment, which would then recruit their downstream neurons in the pPVT by exerting a net excitatory influence.", lines 169-171). However, are the POA neurons projecting to the pPVT indeed excitatory? This is surprising, considering i) the electrophysiological data shown in Figures 2E-K that inhibitory current was recorded in 52.4% of pPVT neurons by stimulation of POA terminal, and ii) POA projection neurons involved in modulating thermoregulatory responses to other brain regions are primarily GABAergic (Tan et al., 2016; Morrison and Nakamura, 2019). The reviewer suggests showing whether the heat-responsive POA neurons projecting to the pPVT are indeed excitatory (This could be achieved by retrogradely labeling POA neurons that project to the pPVT and conducting fluorescence in situ hybridization (FISH) assays against Slc32a1, Slc17a6, and Fos to label neurons activated by warmth). Alternatively, demonstrate, at least, that pPVT-projecting POA neurons are a distinct population from the GABAergic POA neurons that project to thermoregulatory regions such as DMH or rRPa. This would clarify how the POA-pPVT circuit integrates with the previously established thermoregulatory pathways.

      Thanks for the comment and suggestion. We acknowledge that there are both excitatory and inhibitory projections from the POA to the pPVT. Although it is not our primary focus, we are aware of the substantial inhibitory input from the POA to the pPVT, which suggests an important function. However, we do not think that this pathway, which would exert an opposite effect on POA-recipient pPVT neurons compared to the excitatory input, contributes to the long-term effect of chronic heat exposure, because the excitability of these neurons increased rather than decreased. It is possible that this inhibitory input serves as a short-term inhibitory control for other purposes. Further work is needed to fully address this question.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      I have a number of suggested minor edits that would improve the readability and interpretation of figures for the reader. In many figures, there are places where it is unclear what is being tested, and making minor changes would make the manuscript flow more easily for the reader:

      (1) The authors could add additional details about the behavior paradigms in the Figures, especially Figure 1. How long was the chronic heat exposure for? At what temperature? What is the length of time between the end of heat exposure and the start of behaviors? What was the schedule of testing for EPM and social behaviors? Was it all on the same day or on different days? These details will make it easier for the reader to understand the behavior tests.

      We have revised our experimental scheme, especially Figure 1, and added more detailed descriptions in the method section. The modifications have also been applied to the other figures.

      (2) In Figures 1J and 1K, it is a bit unclear what is being shown in the right panel, since there are no axes or labels to interpret what is being plotted.

      We have added body kinetics (purple dot) in the left panel of Figure 1J and 1K to align with the right panels, and we have updated our descriptions in the figure legend.

      (3) In general, Figure 1 would benefit from more headers/labels or schematics to demonstrate what is being tested (for example, it's unclear that forced swim, tail suspension, open field, aggression, sucrose preference, or acoustic startle are being studied unless the reader looks at the figure legend in depth. Simple schematics or titles for each panel would help.

      We have added the abbreviated titles for each panel of Figure 1 to help readers to better understand what was being tested.

      (4) Figure 2A would benefit from edits to the schematic so that it is clear that heat exposure is being done before the animal is sacrificed and cFos is stained.

      We have revised the text to clarify that heat exposure occurred before the animal was sacrificed and c-Fos was stained.

      (5) Figure 2D: would help if the quantification of overlap of cFos and rabies was shown in the figure in addition to reporting it in the text (84%).

      We have added quantification in Figure 2D.

      (6) The supplemental data in Figure 2 - Supplemental Figure 1 showing increased Fos in PVT and POA after heat exposure would actually help if it was in main Figure 2 so that the reader can more clearly see the rationale for choosing the POA-PVT circuit. But this is a matter of preference and up to the author where they want to show this data.

      Thanks for the suggestion. However, considering the layout and space, we prefer to retain this part in Figure 2-figure supplement 1.

      (7) Figure 3 would benefit from a behavior schematic illustrating the time course of the experiment and what the heat exposure protocol is for each day (how many minutes heat 'on' vs 'off', the temperature of heat, etc). Also, what is different about day 22 that makes it chronic heat vs day 21? Currently, it is a bit hard to understand the protocol.

      We have added the temperature and duration of chronic heat exposure to the schematic of Figure 3. “Day 22” represents the time point after chronic heat exposure. We measured the calcium activity of POA-recipient pPVT neurons on day 22 and compared it with day 1 to demonstrate the activity changes of these neurons after chronic heat exposure.

      (8) Figure 3D, it is unclear what the difference is between the Day 1 data on the left and Day 1 data on the right. Same with Figure 3H, unclear what the difference is between the left and the right.

      In Figures 3D-3H, the left and right panels show different parameters: frequency per minute (left) and amplitude (ΔF/F, right). With these panels, we aim to show the dynamic activity changes of POA-recipient pPVT neurons throughout the chronic heat exposure process. All panels from 3D to 3H have now been revised to make their meaning clearer.
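      For readers unfamiliar with the two parameters named above (event frequency per minute and ΔF/F amplitude), the following is a minimal sketch of how they can be computed from a fluorescence trace. It is not the authors' analysis pipeline: the fixed baseline, the threshold-crossing definition of an event, the function names, and the toy numbers are all assumptions made for illustration.

```python
# Illustrative sketch only; baseline choice, event definition, and all
# values below are assumptions, not taken from the manuscript.

def delta_f_over_f(trace, baseline):
    """Convert raw fluorescence values to dF/F relative to a fixed baseline."""
    return [(f - baseline) / baseline for f in trace]

def count_events(dff, threshold):
    """Count upward crossings of the threshold (one crossing = one event)."""
    events = 0
    above = False
    for value in dff:
        if value >= threshold and not above:
            events += 1
        above = value >= threshold
    return events

def events_per_minute(dff, threshold, sampling_hz):
    """Event frequency normalised to the recording duration in minutes."""
    duration_min = len(dff) / sampling_hz / 60.0
    return count_events(dff, threshold) / duration_min

# Toy trace sampled at 1 Hz (hypothetical numbers).
raw = [100, 100, 120, 130, 100, 100, 150, 100]
dff = delta_f_over_f(raw, baseline=100)
n_events = count_events(dff, threshold=0.15)
frequency = events_per_minute(dff, threshold=0.15, sampling_hz=1.0)
peak_amplitude = max(dff)
```

      In this toy trace, the two consecutive samples above threshold count as a single event, so frequency and amplitude are reported as separate quantities, as in the left and right panels.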

      (9) Figure 4A would benefit from schematics showing the stimulation protocol for chronic optogenetics (how many days? Frequency? Duration of time? Etc)

      We have added detailed schematics in our Figure 4A.

      Reviewer #2 (Recommendations for the authors)

      (1) It is interesting that social behavior appears to be reduced upon long-term heat exposure but not after acute heat exposure. Interaction of animals, such as huddling, can be used by animals as a form of behavioral thermoregulation in cold environments and heat may drive animals apart to allow for better heat dissipation. The social interaction measured here is not huddling (because, I assume, the animals are separated by a divider?) but is this form of behavior measured here related to huddling/"social thermoregulation"? This could be discussed.

      Our behavioral tests were performed at room temperature. Although huddling is a type of social behavior, we observed that the tested mouse actively circled the metal cup, suggesting that the behavior measured here is not related to huddling or social thermoregulation.

      (2) Line 113: The statement "Chronic treatment did not change body temperature" should be clarified/rephrased because 90 minutes of 38 degrees centigrade exposure to heat will increase the body temperature of mice. It would be helpful if the authors made clear that they measure body temperature before the heat stimulus (and not during the heat stimulus), which is now only obvious if one digs into the methods section.

      We have revised the text and clarified that body temperature was measured before the heat stimulus in the revised manuscript.

      (3) Figure 1J and K: for the non-experts, these graphs are difficult to interpret, some more explanation is needed (what exactly is measured ?). We believe that the term "arousal" may not be justified in this context because the authors have not measured sleep patterns (EEG and EMG) to show that the mice arouse from a sleep (or sleep-like) stage; the authors may consider changing the terminology, e.g. something along the lines of "agitation" or "activity".

      We have further explained Figure 1J and K in our revised manuscript. The acoustic startle response is a well-recognized behavioral parameter reflecting arousal levels in rodent models. The greater the agitation in response to the stimulus, the higher the arousal level in mice. We have used the term “agitation” to describe the performance of mice in the acoustic startle response test.

      Reviewer #3 (Recommendations for the authors):

      (1) The authors suggest in the introduction of the manuscript that the HPA axis and other multifaceted factors may influence emotional changes caused by heat stress (lines 63-78). However, there are no experiments or discussions on how the POA-pPVT circuit interacts with these factors. In line with the study's proposed direction in the introduction section, it would be valuable to explore, or at least discuss, whether and how the POA-pPVT circuit interacts with the HPA axis or other neural circuits known to regulate emotional and stress responses. Alternatively, the reviewer suggests revising the content of the introduction to align with the focus of the study.

      Although the POA is known to possibly interact with the HPA axis via its connection with the paraventricular nucleus of the hypothalamus, there is hardly any evidence for such an interaction involving the pPVT. Thus, we prefer not to speculate on this open question in our current manuscript.

      (2) In Figure 5, the authors report that pPVT neurons that receive projections from the POA exhibited increased responses to stressful situations following chronic heat exposure. However, considering the long pre- and post-recording time gap of approximately three weeks, the additional expression of GCaMP protein over time could potentially account for the increased signal. Therefore, the reviewer recommends including a control group without heat exposure to rule out this possibility.

      We have included Figure 3-figure supplement 1 in our manuscript to exclude the effect of expression of GCaMP protein over time on the recording of calcium signal.

      (3) Related to Figure 2, a) Please include quantification data of the overlap between retrogradely labeled and c-Fos-expressing POA neurons, which can be presented as a bar graph in Figure 2. This would be beneficial for readers to estimate how many warm-activated POA neurons connected to the pPVT are actively engaged under these conditions.

      In the revised manuscript, we have included the quantification analysis in Figure 2.

      b) The images in Figure 2 - Figure Supplement 1 seem to degrade in quality when magnified, making it difficult to discern finer details. Higher-resolution images would greatly improve the clarity and help in accurately visualizing the c-Fos expression patterns in the POA and pPVT regions.

      We have changed our images of Figure 2-figure supplement 1 to higher-resolution in the revised manuscript.

      c) The c-Fos images in Figure 2D and Figure 2 - Figure Supplement 2C appear unusual in that the c-Fos signal seems to fill the entire cell, whereas c-Fos protein is localized to the nucleus. Could the authors clarify whether this image accurately represents c-Fos staining or if there might be an issue with the staining or imaging process?

      We are confident that the green signals in both Figure 2D and Figure 2-figure supplement 2C do not occupy the whole cell body, and that they accurately reflect nuclear c-Fos staining. We have updated the magnified image in Figure 2D.

      d) In Supplemental Figure 2B, the square marking the region of interest should be clearly explained in the figure legend to ensure that readers can fully understand the context and focus of the image.

      We have further modified our figure legend in Figure 2-figure supplement 1 in our revised manuscript.

    1. REFERENCES

      Adekoya, O. B., Oliyide, J. A., Yaya, O. S., & Al-Faryan, M. A. S. (2022). Does oil connect differently with prominent assets during war? Analysis of intra-day data during the Russia-Ukraine saga. Resources Policy, 77, 102728.
      Aguinis, H., Cope, A., & Martin, U. M. (2022). On the parable of the management scholars and the Russia–Ukraine war. British Journal of Management, (in press), 33, 1668–1672.
      Ahmed, S., Hasan, M. M., & Kamal, M. R. (2022). Russia–Ukraine crisis: The effects on the European stock market. European Financial Management (in press).
      Akçalı, E., & Görmüş, E. (2021). Business people in war times, the ‘fluid capital’ and the ‘shy diaspora’: The case of Syrians in Turkey. Journal of Refugee Studies, 34(3), 2891–2911.
      Alyukov, M. (2022). Making sense of the news in an authoritarian regime: Russian television viewers' reception of the Russia-Ukraine conflict. Europe-Asia Studies, 74(3), 337–359.
      Behnassi, M., & El Haiba, M. (2022). Implications of the Russia–Ukraine war for global food security. Nature Human Behaviour, 6, 1–2.
      Boston, W. (2022, March 3). Ukraine war plunges auto makers into new supply-chain crisis. Wall Street Journal, 3. https://www.wsj.com/articles/ukraine-war-plunges-auto-makers-into-new-supply-chain-crisis-11646309152
      Boungou, W., & Yatie, A. (2022). The impact of the Ukraine-Russia war on world stock market returns. Economics Letters, 215, 1–3.
      Cai, H., Bai, W., Zheng, Y., Zhang, L., Cheung, T., Su, Z., … Xiang, Y. T. (2022). International collaboration for addressing mental health crisis among child and adolescent refugees during the Russia-Ukraine war. Asian Journal of Psychiatry, 72, 103109.
      Casson, M., & Li, Y. (2022). Complexity in international business: The implications for theory. Journal of International Business Studies (in press).
      Chapra, M. U. (2011). The global financial crisis: Some suggestions for reform of the global financial architecture in the light of Islamic finance. Thunderbird International Business Review, 53(5), 565–579.
      Cumming, D. J. (2022). Management scholarship and the Russia-Ukraine war. British Journal of Management, (in press), 33, 1663–1667.
      Curran, L., & Zignago, S. (2011). The financial crisis and trade—Key impacts, interactions, and outcomes. Thunderbird International Business Review, 53(2), 115–128.
      Dombo, E. A. (2022). War, religion, and social work. Journal of Religion & Spirituality in Social Work: Social Thought, 41, 121–122.
      Grossi, G., & Vakulenko, V. (2022). New development: Accounting for human-made disasters- comparative analysis of the support to Ukraine in times of war. Public Money & Management (in press).
      Haukkala, H. (2015). From cooperative to contested Europe? The conflict in Ukraine as a culmination of a long-term crisis in EU-Russia relations. Journal of Contemporary European Studies, 23(1), 25–40.
      Higgins-Desbiolles, F. (2022). The question of solidarity in tourism. Journal of Policy Research in Tourism, Leisure and Events, 1(1), 1–10.
      Jackson, T. (2022). Engaging with contemporary issues: Should we study war? International Journal of Cross-Cultural Management, 22(1), 3–6.
      Johannesson, J., & Clowes, D. (2022). Energy resources and markets – Perspectives on the Russia–Ukraine war. European Review, 30(1), 4–23.
      Kammer, A., Azour, J., Selassie, A. A., Goldfajn, I., & Rhee, C. (2022, March, 15). How war in Ukraine is reverberating across world's regions. IMF, 2022. In press.
      Kayed, R. N., & Hassan, M. K. (2011). The global financial crisis and Islamic finance. Thunderbird International Business Review, 53(5), 551–564.
      Lichterman, A. (2022). The peace movement and the Ukraine war: Where to now? Journal for Peace and Nuclear Disarmament, 5, 1–13.
      Lim, M., Chin, M. W. C., Ee, Y. S., Fung, C. Y., Giang, C. S., Heng, K. S., … Weissmann, M. A. (2022). What is at stake in a war? A prospective evaluation of the Ukraine and Russian conflict for business and society. Global Business and Organizational Excellence, 1–14. In press.
      Markus, S. (2017). Oligarchs and corruption in Putin's Russia: Of sand castles and geopolitical volunteering. Georgetown Journal of International Affairs, 18, 26–32.
      Markus, S. (2022). Long-term business implications of Russia's war in Ukraine. Asian Business & Management (in press).
      Mendez, A., Forcadell, F. J., & Horiachko, K. (2022). Russia–Ukraine crisis: China's belt road initiative at the crossroads. Asian Business & Management, 21(4), 488–496.
      Michailova, S. (2022). An attempt to understand the war in Ukraine–An escalation of commitment perspective. British Journal of Management, (in press), 33, 1673–1677.
      Orhan, E. (2022). The effects of the Russia-Ukraine war on global trade. Journal of International Trade, Logistics and Law, 8(1), 141–146.
      Owens, M. (2022). Exploiting bullets: International business and the dynamics of war. Critical Perspectives on International Business (in press).
      Pattit, J., & Pattit, K. (2022). Responding to crisis: World war 2, COVID-19, and the business school. Business and Society Review, 127, 319–342.
      Richard, C., Burdekin, K., & Siklos, P. (2022). Armageddon and the stock market: US, Canadian and Mexican market responses to the 1962 Cuban missile crisis. Quarterly Review of Economics Finance, 84, 112–117.
      Sahebalzamani, S., Jørgensen, E. J. B., Bertella, G., & Nilsen, E. R. (2022). A dynamic capabilities approach to business model innovation in times of crisis. Tourism Planning & Development, 1–24. In press.
      Siddi, M. (2022). The partnership that failed: EU-Russia relations and the war in Ukraine. Journal of European Integration, 44, 1–6.
      Sigurjonsson, T. O., & Mixa, M. W. (2011). Learning from the “worst behaved”: Iceland's financial crisis and the Nordic comparison. Thunderbird International Business Review, 53(2), 209–223.
      Teagarden, M. B., & Hinrichs, M. A. (2009). Learning from toys: Reflections on the 2007 recall crisis. Thunderbird International Business Review, 51(1), 5–17.
      Titov, A. (2022). The impact of the Ukraine war on Russia. Political Insight, 13(2), 32–36.
      Umar, Z., Polat, O., Choi, S. Y., & Teplova, T. (2022). The impact of the Russia-Ukraine conflict on the connectedness of financial markets. Finance Research Letters, 102976. In press.
      UNCTAD. (2022). Trade and Development Report.
      United Nations. (2022). Global impact of war in Ukraine on food, energy and finance systems (Brief No 1).
      Wise, J. (2022). Ukraine conflict: Global research community reviews links with Russia. BMJ, 376, o637. https://doi.org/10.1136/bmj.0637
      Zahra, S. A. (2022). Institutional change and international entrepreneurship after the war in Ukraine. British Journal of Management, (in press), 33, 1689–1693.
      Zhang, C., & Gao, H. (2022). Managing business-to-business disruptions: Surviving and thriving in the face of challenges. Industrial Marketing Management, 105, 72–78.

      many sources used throughout. could show not much original research.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1

      Major issue #1. Regarding the conclusions on IRE1 signaling, both yeast species have different IRE1 activities (https://elifesciences.org/articles/00048), the total deletion of IRE1 in S pombe appears to indicate that expansion of perinuclear ER is independent of IRE1, however since IRE1 signaling has exclusively a negative impact on mRNA expression, it might be relevant to identify mRNA whose expression is stabilized under those circumstances and evaluate whether those could confer a mechanism which would also yield perinuclear ER expansion (eg differential deregulation of ER stress controlled lipid biosynthesis required for lipid membrane synthesis). In S. cerevisiae, do the authors observe HAC1 mRNA splicing?

      We have not tested whether HAC1 mRNA is processed in S. cerevisiae.

      In addition, as requested by the reviewers, we reassessed our RNA-seq data and compared it with data from (Kimmig et al., 2012) (UPR activation in S. pombe), which added a new layer of data that reinforces the differences between the transcriptomic responses induced by HU and DIA and the canonical UPR. The following information is now included in the paper (page 26, highlighted in blue):

      “We further compared our transcriptomic data with those obtained by Kimmig et al. from DTT-treated S. pombe cells. When we compared the genes that were downregulated in our conditions with the ones described by Kimmig et al. (FC≤-1), we found no similarities between HU treatment (75 mM HU for 150 minutes) and UPR-induced downregulation, and only three genes (ist2, efn1 and xpa1), all of which encode transmembrane proteins, were shared with DIA treatment (3 mM DIA for 60 minutes). Additionally, ist2 and xpa1, but not efn1, are considered Ire1-dependent downregulated genes and are located in the ER. These results show that HU- or DIA-induced transcriptomic programs are different from the UPR, as they do not rely heavily on mRNA decay and favor gene overexpression. Interestingly, we found similarities between the genes shown to be upregulated more than twofold by DTT in Kimmig et al. and those upregulated in HU and DIA conditions. When the two N-Cap-inducing conditions were compared with DTT, we found eight common upregulated genes (frp1, plr1, SPCC663.08c, srx1, gst2, str3, caf5 and hsp16), most of which are involved in reduction processes and which include the chaperone Hsp16, suggesting folding stress”.
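
      The gene-list comparison described in the quoted passage amounts to taking set intersections. The following is a minimal Python sketch, assuming each condition's gene list is available as a collection of gene names. The helper name `shared_genes` and the `placeholder_*` entries are hypothetical; only the eight gene names come from the quoted text.

```python
def shared_genes(*gene_lists):
    """Return the sorted gene names present in every list (case-sensitive)."""
    if not gene_lists:
        return []
    common = set(gene_lists[0])
    for genes in gene_lists[1:]:
        common &= set(genes)
    return sorted(common)


# Eight genes reported in the text as upregulated under HU, DIA and DTT.
REPORTED_COMMON_UP = ["frp1", "plr1", "SPCC663.08c", "srx1",
                      "gst2", "str3", "caf5", "hsp16"]

# Toy inputs: the reported genes plus hypothetical placeholders, so that
# each list also carries a condition-specific entry.
hu_up = REPORTED_COMMON_UP + ["placeholder_hu_only"]
dia_up = REPORTED_COMMON_UP + ["placeholder_dia_only"]
dtt_up = REPORTED_COMMON_UP + ["placeholder_dtt_only"]

common_up = shared_genes(hu_up, dia_up, dtt_up)
```

      With real data, the inputs would be the lists of genes passing the chosen fold-change cutoff in each dataset; gene identifiers would need to be harmonized first, since the comparison is by exact name.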

      Major issue #2. The authors indicate that HU and DIA lead to thiol stress, it might be relevant to evaluate the thiol-redox status of major secretory proteins in S. pombe (or even cargo reporters if necessary) to fully document the stress impact on global protein redox status.

      We agree with the reviewer that it is important to determine the redox and functional state of the secretory pathway in our conditions to fully understand the cellular consequences of these treatments, especially in the case of HU, as it is routinely used in the clinic. In this context, we have already included new data showing that HU or DIA treatment leads to alterations in the Golgi apparatus and in the distribution of secretory proteins (Figures 3A-B). In addition, we are currently performing mass spectrometry experiments to detect protein glutathionylation in our conditions, as it has been previously shown that DIA treatment leads to glutathionylation of key ER proteins such as Bip1, Pdi or Ero1 (Lind et al., 2002; Wang & Sevier, 2016), which might be reproduced upon HU treatment. Finally, we plan to test the folding and processing of specific secretory cargoes by western blot in our experimental conditions (see below, Reviewer 2, Major issue #1).

      What happens if HU-treated yeast cells are grown in the presence of n-acetyl cysteine?

      We have tested whether the addition of this antioxidant could prevent and/or revert the N-Cap phenotype. We found that NAC in combination with HU increased N-Cap incidence (Figure 5H). As NAC is a GSH precursor and we find that GSH is required to develop the phenotype of N-Cap (Figure 5A-B, D, G), this result further supports that the HU-induced cellular damage might involve ectopic glutathionylation of proteins.

      Unfortunately, we could not test NAC in combination with DIA, as NAC seems to reduce DIA as soon as the two come into contact, as judged by the change in the characteristic orange color of DIA; the same happens when we combine GSH and DIA (Supplementary Figure 5A-B).

      In this regard, the following information has been added to the manuscript (page 30, highlighted in blue):

      “We also tested GSH addition to the medium in combination with either HU or DIA. When mixed with DIA, we noticed that the color of the culture changed after GSH addition (Figure S5A), which suggests that GSH and DIA can interact extracellularly, thus preventing us from drawing conclusions from those experiments. On the other hand, combining GSH with HU increased N-Cap incidence (Figure 5G), as expected based on our previous observations. Additionally, we checked whether the addition of the antioxidant N-acetyl cysteine (NAC), a GSH precursor, affected the N-Cap phenotype. The results were the same as with GSH addition: when combined with HU, NAC increased N-Cap incidence (Figure 5H), whereas in combination with DIA, the two compounds interacted extracellularly (Figure S5B). These data are consistent with NAC being a precursor of GSH, as increasing GSH levels raises the penetrance of the HU-induced phenotype”.

      Major issue #3. The appearance of cytosolic aggregates is intriguing, do the authors have any idea on the nature of the protein aggregates?

      DIA is a strong oxidant, and HU treatment results in the production of reactive oxygen species (ROS). Therefore, one hypothesis would be that cytoplasmic chaperone foci represent oxidized and/or misfolded soluble proteins. Indeed, in this revised version of the manuscript we have included data showing that the guk1-9-GFP and Rho1.C17R-GFP soluble reporters of misfolding accumulate, upon HU or DIA treatment, in cytoplasmic foci that colocalize with Hsp104 (Figure 4I-J, pages 23-24 and 29), which demonstrates that cytoplasmic chaperone foci contain misfolded proteins. We have also tested whether they contain Vgl1, which is one of the main components of heat shock-induced stress granules in S. pombe (Wen et al., 2010). However, we found that HU- or DIA-induced foci lacked this stress granule marker, and indeed Vgl1 did not form any foci in response to these treatments. Therefore, our aggregates differ from the canonical stress-induced granules.

      Are those resulting from proficient retrotranslocation or reflux of misfolded proteins from the ER?

      To test whether these cytosolic aggregates result from retrotranslocation from the ER, we plan to use CPY*, a misfolded mutant reporter of the vacuolar carboxypeptidase Y. This misfolded protein is imported into the ER lumen but does not reach the vacuole. Instead, it is retrotranslocated to the cytoplasm, where it is ubiquitinated and degraded by the proteasome (Mukaiyama et al., 2012). We will analyze by fluorescence microscopy the localization of CPY*-GFP and Hsp104-containing aggregates upon HU or DIA treatment, with or without proteasome inhibitors. We can also test the levels, processing and ubiquitination of CPY*-GFP by western blot, as ubiquitination of retrotranslocated proteins occurs once they are in the cytoplasm.

      Are those aggregates membrane bound or do they correspond to aggresomes as initially defined? The Walter lab has demonstrated a tight balance between ER phagy and ER membrane expansion (https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.0040423), which could also impact on the presence of protein aggregates in the cytosol.

      Our results suggest that these aggregates are not bound to ER membranes, as they do not appear in close proximity to the ER area marked by mCherry-AHDL in fluorescence microscopy images.

      To fully rule out this possibility, we have tested whether these Hsp104 aggregates colocalized with the ER transmembrane proteins Rtn1 and Yop1, and with Gma12-GFP, which marks the Golgi apparatus. In no case did the Hsp104-containing aggregates colocalize with these markers or appear surrounded by membranes. This information will be added to the final version of the manuscript.

      With respect to autophagy, we have tested whether deletion of key genes involved in autophagy affected the N-Cap phenotype. To this end, we used deletions of vac8 and atg8 in strains expressing Cut11-GFP and/or mCherry-AHDL and found that neither of them affected N-Cap formation. These data suggest that the core machinery of autophagy is not critical for HU/DIA-induced ER expansion. We plan to include these data in the final version of the manuscript along with the rest of the proposed experiments.

      To gain deeper insight and to fully rule out a possible contribution of macroautophagy to the HU- and DIA-induced phenotypes, we plan to analyze by western blot whether GFP-Atg8 is induced and cleaved upon HU or DIA treatment, which would be indicative of macroautophagy activation.

      To test whether the cytoplasmic aggregates are the result of an imbalance between ER-expansion and ER-phagy we plan to analyze the localization of GFP-Atg8 and Hsp104-RFP in the atg7Δ mutant, impaired in the core macro-autophagy machinery. In these conditions, the number or size of the cytoplasmic aggregates might be impacted.

      On the other hand, it has been recently shown that ER-selective microautophagy occurs in yeasts upon ER stress (Schäfer et al., 2020; Schuck et al., 2014). This micro-ER-phagy involves the direct uptake of ER membranes into lysosomes, is independent of the core autophagy machinery, depends on the ESCRT system and is influenced by the Nem1-Spo7 phosphatase. ESCRT functions directly in scission of the lysosomal membrane to complete the uptake of the ER membrane. Interestingly, N-Caps are fragmented in the absence of cmp7 and especially in the absence of vps4 or lem2, the nuclear adaptor of the ESCRT (Figure 3E). We had initially interpreted these results as reflecting the need to maintain nuclear membrane identity during the process of ER expansion (Kume et al., 2019); however, the appearance of fragmented ER upon HU treatment in the absence of ESCRT might also be due to an inability to complete microautophagic uptake of ER membranes. To test this hypothesis, we plan to analyze whether the fragmented ER in these conditions co-localizes with lysosome/vacuole markers.

      Major issue #4. Nucleotide depletion was previously shown to lead to HSP16 expression through activation of the spc1 MAPK pathway (https://academic.oup.com/nar/article/29/14/3030/2383924). One might think that HU (or diamide) could lead to this through a nucleotide-dependent mechanism and not necessarily through a thiol-redox protein misfolding stress. This issue has to be sorted out to ensure that the HSP effect is independent of nucleotide depletion.

      As stated in Taricani et al. (2001), hsp16 expression is strongly induced in a cdc22-M45 mutant background. We performed experiments in this mutant that were included in the original version of the manuscript and remain in the current version (Sup. Fig. 2C) and, under restrictive conditions, we do not see spontaneous N-Cap formation. If Hsp16 overexpression and nucleotide depletion were key to the mechanism triggering N-Cap appearance, we would expect this mutant to eventually form N-Caps when placed at the restrictive temperature. Furthermore, Taricani et al. show that Hsp16 expression was abolished in a Δatf1 mutant background in the presence of HU, and we found that this mutant is still able to produce N-Caps in HU; therefore, our results strongly suggest that the N-Cap phenotype is independent of the MAPK pathway and of the expression of hsp16.

      Minor issues

      1. P1 - UPR = Unfolded Protein Response: Corrected in the manuscript.
      2. P22 - HSP upregulation "might" be indicative of a folding stress: Corrected in the manuscript.
      3. The abstract does not reflect the findings presented in the manuscript. In addition, I would recommend the authors revise the storytelling in their manuscript to push forward the message on either the specific phenotype associated with perinuclear ER or on the characterization of protein misfolding stress. We have modified the abstract to better reflect our findings and will further revise our arguments in the final version of the manuscript once we have the results of the proposed experiments.

      Reviewer 2

      Major issue #1. The authors state the cytoplasmic and ER folding are both disrupted. The impact on ER protein biogenesis would be bolstered with some biochemical data focused on the folding of one or more nascent secretory proteins. Is disulfide bond formation and/or protein folding indeed disrupted?

      We have addressed the status of secretion in cells treated with HU or DIA by assessing the morphology of the Golgi apparatus and the localization of several secretory proteins by fluorescence microscopy and found that both HU and DIA treatments impact the secretion system. In addition, we plan on addressing the redox status of ER proteins (Bip1, Pdi or Ero1) by biochemical approaches. Please see the answer to major issue #2 from reviewer 1.

      We will also analyze by western blot the biogenesis and processing of the wild-type vacuolar carboxypeptidase Y (Cpy1-GFP) and/or alkaline phosphatase (Pho8-GFP), two widely used markers for testing the functionality of the ER/endomembrane system.

      Major issue #2. Increased signal of Bip1 in the expanded perinuclear ER is shown and is suggested as consistent with immobilization of BiP upon binding of misfolded proteins. The authors suggest that this increased signal must reflect Bip1 redistribution because "Bip1 levels are constant". Yet, the western image (Figure 4B) looks to show an increased level of Bip1 protein upon HU treatment. Given the abundance of Bip1 in cells, it seems possible that a two-fold increase in newly synthesized proteins in the perinuclear region may account for the increased signal. The original data cited by the authors use photobleaching (not just fluorescence intensity) to show a change in crowding / mobility, which the authors should consider to support their conclusion. Alternatively, a detected increased engagement of Bip1 with substrates (e.g. pulldown experiment) would be similarly strengthening.

      The same issue was raised by Reviewer 3, so we replaced the western blot image with a less exposed one and added a quantification showing that Bip1-GFP levels remain mostly constant between control conditions and treatments with HU and DIA.

      We have also performed the suggested photobleaching experiment to analyze potential changes in crowding and mobility in Bip1-GFP upon HU treatment. We found that Bip1-GFP signal recovers after photobleaching the perinuclear ER in HU-treated cells that had not yet expanded the ER, showing that Bip1-GFP is dynamic in these conditions. However, Bip1-GFP signal did not recover after photobleaching the whole N-Cap in cells that had fully developed the expanded perinuclear ER phenotype, whereas it did recover when only half of the N-Cap region was bleached. This suggests that Bip1-GFP is mobile within the expanded perinuclear ER but cannot freely diffuse between the cortical and the perinuclear ER once the N-Cap is formed.

      These data have been included in the revised version of the manuscript, in Figure 4B, Supplementary Figures 4A-B, and on page 22.
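
      Photobleaching results of this kind are often summarized as a mobile fraction. The sketch below uses a common textbook normalization and is not necessarily the quantification used in the manuscript; the function name, argument names and intensity values are hypothetical, and corrections for background and acquisition bleaching are deliberately left out.

```python
def mobile_fraction(pre_bleach, post_bleach, plateau):
    """Fraction of the bleached signal that recovers.

    pre_bleach  : mean intensity of the region before bleaching
    post_bleach : intensity immediately after bleaching
    plateau     : intensity once recovery has levelled off
    """
    bleached = pre_bleach - post_bleach
    if bleached <= 0:
        raise ValueError("post-bleach intensity must be below pre-bleach intensity")
    return (plateau - post_bleach) / bleached


# Hypothetical intensities (arbitrary units), for illustration only.
recovering = mobile_fraction(pre_bleach=100.0, post_bleach=20.0, plateau=80.0)
non_recovering = mobile_fraction(pre_bleach=100.0, post_bleach=20.0, plateau=20.0)
```

      Under these assumptions, a value near 0 would correspond to the lack of recovery described for a fully bleached N-Cap, whereas a value near 1 would indicate free exchange with an unbleached pool.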

      Major issue #3. It is curious that cycloheximide (CHX) has a distinct impact on HU versus DIA treatment. Blocking protein synthesis with CHX exacerbates the phenotype with DIA, but not HU. The authors use the data with CHX to argue that their drug treatments are interfering with folding during synthesis and translation into the ER. If so, what is the rationale as to why CHX treatment decreases expansion upon HU treatment? Relatedly, is protein synthesis and/or ER import impacted upon treatment with HU and/or DIA?

      As all three reviewers had comments about the CHX- and Pm-related data, we revisited those experiments and noticed a phenotype upon HU+CHX treatment that had previously gone unnoticed and that changed our understanding of the effect of these drugs on the ER. Briefly, although CHX treatment decreases the HU-induced expansion of the perinuclear ER, it still induces ER expansion, in this case in the cortical ER. This means that CHX does not suppress HU-induced ER expansion but rather shifts it to another area of the ER (the cortical ER). We do not understand why this happens; however, these results show that ER expansion is exacerbated by CHX in combination with either DIA or HU. We have included these data in Figures 3C-D and on page 21.

      We also examined the trafficking of secretory proteins that travel from the ER to the cell tips and noticed that this transit was affected by both drugs (Figures 3A-B). This suggests that, although protein synthesis continues when cells are exposed to the drugs (as shown by the higher levels of chaperones induced by both stresses; Figure 4C-E), their protein synthesis capacity is possibly impaired to a certain degree. All this information is now included in the manuscript (page 18).

      Major issue #4. While the authors suggest that there is disulfide stress in the ER / nucleus, the redox environment in these compartments is not tested directly (only cytoplasmic probes).

      Although we have only included experiments using one redox sensor in the manuscript, we had tested the oxidation of several biosensors during HU and DIA exposure, monitoring cytoplasmic, mitochondrial and glutathione-specific probes. We have tried to use ER-directed probes; however, we have not been successful owing to oversaturation of the probe in the highly oxidative environment of the ER lumen.

      Although so far we have not been able to directly test the redox status of the ER with optical probes, we plan to test the folding and redox status of several ER proteins and secretory markers by biochemical approaches, so hopefully these experiments will give us more information on this question (See answer to Reviewer 1, Main Issue #2 and Reviewer 2, Main issue #1).

      Major Issue #5. What do the authors envision is the role of the cytoplasmic chaperone foci? Do CHX / Pm treatment with HU/DIA reverse the chaperone foci?

      Pm causes premature termination of translation, leading to the release of truncated, misfolded, or incomplete polypeptides into the cytosol and the re-engagement of ribosomes in a new cycle of unproductive translation, as puromycin does not block ribosomes (Aviner, 2020; Azzam & Algranati, 1973). This likely decreases the number of peptides entering the ER that can be targeted by either HU or DIA, decreasing in turn ER expansion. Indeed, we have found that Pm treatment alone results in the formation of multiple cytoplasmic protein aggregates marked by Hsp104-GFP (Figure 4K), consistent with a continuous release of incomplete and misfolded nascent peptides to the cytoplasm. This would explain why Pm treatment suppresses N-Cap formation when cells are treated with either HU or DIA.

      To further test this idea, we analyzed the number and size of Hsp104-containing cytoplasmic aggregates in cells treated with HU or DIA and Pm, where N-Caps are suppressed. As expected, we found an increase in the accumulation of proteotoxicity in the cytoplasm in these conditions. This information has now been added to the paper (Figure 4K, pages 23-24 and 29).

      On the other hand, CHX inhibits translation elongation by stalling ribosomes on mRNAs, preventing further peptide elongation but leaving incomplete polypeptides tethered to the blocked ribosomes. This reduces overall protein load entering the ER by blocking new protein synthesis and stabilizes misfolded proteins bound to ribosomes. Accordingly, it has been shown previously that blocking translation with CHX abolishes cytoplasmic protein aggregation (Cabrera et al., 2020; Zhou et al., 2014). Similarly, we have found that Hsp104 foci are not observed when we add CHX alone or in combination with HU or DIA (Figures 4K-L). These results suggest that cytoplasmic foci that we observe upon HU or DIA treatment likely contain misfolded proteins derived from ongoing translation.

      As this question had also been raised by Reviewer 1, we further explored the nature of these cytoplasmic foci (please see the answer to Reviewer 1, Major issue #3). Briefly:

      • We tested whether they colocalize with the foci of Guk1-9-GFP and Rho1.C17R-GFP reporters of misfolding that appear upon HU or DIA treatments and, indeed, Hsp104-containing aggregates colocalize with Guk1-9-GFP and Rho1.C17R-GFP. This information has now been added to the paper (Figure 4I-J, pages 23-24 and 29).
      • We tested whether these foci were membrane bound using several ER transmembrane proteins (Tts1, Yop1, Rtn1) and the integral membrane protein Ish1, and in no case did we detect membranes surrounding the aggregates. This information will be included in the final version of the paper.
      • We plan to test whether the cytoplasmic foci represent proteins retro-translocated from the ER.
      • We will also test whether autophagy or an imbalance between ER expansion and ER-phagy might contribute to the accumulation of cytoplasmic protein foci. The new data regarding the suppression of cytoplasmic foci by CHX treatment have already been included in the current version of the manuscript in Figure 4K and in the text (page 29).

      The authors argue that cytoplasmic foci are "independent" from ER expansion and are "not a direct consequence of thiol stress" based on the observation that DTT does not reverse these foci. This seems like a strong statement based on the limited analysis of these foci.

      We agree with the reviewer. We have toned down our statements about the relationship between thiol stress, the cytoplasmic chaperone foci and their relationship with ER expansion. We have removed from the text the statement that cytoplasmic foci are independent from ER expansion and thiol stress and have further revised our claims about CHX and Pm in the main text and the discussion to address these and the other reviewers’ concerns.

      Major Issue #6. Based on the transcriptional data, the authors speculate on a potential role in iron-sulfur cluster protein biogenesis. This would seem to be rather straightforward to test.

      To address this issue, we plan to analyze the localization of proteins involved in iron-sulfur cluster assembly and/or containing iron-sulfur clusters by in vivo fluorescence microscopy, such as DNA polymerase Dna2 or Grx5, during HU or DIA treatments.

      Related to this, we have found that a subunit of the ribonucleotide reductase (RNR) aggregated in the cytoplasm upon HU exposure (Figure S2B). It is worth noting that RNR is an iron-containing protein whose maturation needs cytosolic Grxs (Cotruvo & Stubbe, 2011; Mühlenhoff et al., 2020). The catalytic site, the activity site (which governs overall RNR activity through interactions with ATP) and the specificity site (which determines substrate choice) are located in the R1 (Cdc22) subunits, which are the ones that aggregate, while the R2 subunits (Suc22) contain the di-nuclear iron center and a tyrosyl radical that can be transferred to the catalytic site during RNR activity (Aye et al., 2015). The aggregation of an RNR subunit could be related to impaired synthesis and/or maturation due to defects in iron-sulfur cluster formation, as it has recently been reported that RNR cofactor biosynthesis shares components with cytosolic iron-sulfur protein biogenesis and that the iron-sulfur cluster assembly machinery is essential for iron loading and cofactor assembly in RNR in yeast (Li et al., 2017). This information has been added to the discussion.

      Major Issue #7. The authors suggest that "pre-treatment" with DTT before HU addition suppresses formation of the N-Caps. However, these samples (Figure 2J) contain DTT coincident with the treatment as well. To say it is the effect of pre-treatment, the DTT should be added and then washed out prior to HU or DIA addition. Alternatively, the language used to describe these experiments and their outcomes could be revised.

      We modified the language used to describe the experiment in the manuscript, as suggested by the reviewer, to clarify that N-Caps never form while DTT is kept in the medium. In addition, we have also performed a true pre-treatment with DTT: adding 1 mM DTT one hour beforehand, washing out the reducing agent, and then adding HU to the medium. The result indicates that pre-treating cells with DTT significantly reduces N-Cap formation after a 4-hour incubation with HU, which suggests that triggering reducing stress “protects” cells from the oxidative damage induced by HU and DIA. This information has also been added to the manuscript (Figure 2J).

      Major Issue #8. For a manuscript with 128 references there is rather limited discussion of the data in the context of the wider literature. The discussion primarily focuses on a recap of the results. The authors do cite several prior works focused on redox-dependent nuclear expansion. However, while cited, there is no real discussion of the relationship between this work in the context of that previously published (including several known disulfide bonded proteins that are involved in nuclear/ER architecture).

      We have revised and expanded our discussion. In addition, in the final revision of our work we will increase the discussion in the context of the new results obtained.

      Minor points

      1. Figure numbering goes from figure 4 to S6 to 5. We have updated the numbering of the figures after merging several supplementary figures, so this issue is now fixed.

      It would be helpful to the reader to explain what some of the reporters are in brief. For example, Guk1-9-GFP and Rho1.C17R-GFP reporters.

      Guk1-9-GFP and Rho1.C17R-GFP are thermosensitive mutants of guanylate kinase and Rho1 GTPase, respectively, that have been previously used in S. pombe as soluble reporters of misfolding in conditions of heat stress. During mild heat shock, both mutants aggregate into reversible protein aggregate centers (Cabrera et al., 2020). This information has now been added to the manuscript.

      Supplementary Figure 3. The main text suggests panel 3A is focused on diamide treatment. The figure legend discusses this in terms of HU treatment. Which is correct?

      We thank the reviewer for pointing out this mistake. The experiment was performed in 75 mM HU, so the legend was correct. This has now been corrected in the manuscript.

      The authors use ref 110 and 111 to suggest the importance of UPR-independent signaling. However, they do not point out that this UPR-independent signaling referred to in these papers is dependent on the UPR transmembrane kinase IRE1.

      We have included pertinent clarification in the new discussion.

      Reviewer 3

      Major issue #1. It is hard to see how the claim of ER stress can be supported if BiP levels do not change (Fig. 4B). Also, this figure is overexposed. The RNA-seq data should be able to establish ER stress as well, but no rigorous analysis of ER stress markers is presented.

      Regarding the levels of Bip1, we now show in Figure 4 a less exposed image of the western blot, and a quantification of Bip1-GFP intensity from three independent experiments. We find that, in our experimental conditions, neither HU nor DIA treatments significantly altered Bip1 levels.

      With respect to the RNA-Seq, as we mentioned in the major issue 1 from reviewer 1, we reassessed our data to further clarify and add information about ER stress markers induced or repressed by HU and DIA.

      Major issue #2. The interpretation of the CHX and puromycin experiments of Figure 3A-B is hard to follow. My best guess is that the authors argue that CHX decreases misfolded protein load and that puromycin increases misfolded protein load, and that since DIA is a stronger oxidative stress than HU, CHX is only protective under HU and not DIA. However, while CHX decreases misfolded protein load, puromycin hasn't been shown directly to increase it and I don't see how this explains puromycin being protective at all.

      We have found that puromycin treatment alone results in the formation of cytoplasmic foci containing Hsp104, suggesting that puromycin indeed increases folding stress in the cytoplasm. We have now included these data in Figure 4K (please see Major Issue #5 from Reviewer 2). Pm suppresses the formation of N-Caps induced by HU or DIA; however, we have not addressed cell survival or fitness in these conditions and therefore cannot draw conclusions about whether it is protective.

      In addition, upon reevaluating our data, we realized that CHX treatment suppresses HU-induced perinuclear expansion but, rather than suppressing it, enhances ER expansion in the cortical region. These data have been added to the present version of the manuscript in Figure 3C-D (pages 20-21).

      Furthermore, puromycin causes Ca leakage from the ER (which can be recapitulated with thapsigargin and blocked with anisomycin; easy experiments), which could be responsible for the differences from CHX, and the model does not address the effects on downstream stress signaling. The authors should be much more clear regarding their argument, since this data is used to support the argument of disrupted ER proteostasis.

      Thapsigargin has been described as ineffective in yeasts, as they lack the SERCA-type Ca2+ pump that is the target of this drug (Strayle et al., 1999). However, deletion of the P5A-type ATPase Cta4, which is required for calcium transport into ER membranes (Lustoza et al., 2011), reduced but did not abolish ER expansion. We also tested the effect of anisomycin and found that, in combination with HU or DIA, it mimicked CHX behavior (ER expansion occurs in both conditions, with perinuclear ER expansion exacerbated in combination with DIA and cortical ER expansion when combined with HU). It is difficult to correlate this result with a role of Ca leakage in ER expansion, as there is no recent information regarding CHX and Ca leakage, although it has been indicated that CHX treatment does not increase cytoplasmic Ca levels (Moses & Kline, 1995). As anisomycin, like CHX, blocks protein synthesis and stabilizes polysomes, what we can conclude is that nascent peptides attached to ribosomes during protein synthesis promote ER expansion when combined with HU or DIA. This information will be added to the final version of the paper.

      Regarding the downstream effects of HU or DIA treatment on ER proteostasis, we plan to further explore the effect of these drugs on the secretory system (please see major issue #2 from Reviewer 1) and to evaluate the redox state and processing of several key ER and secretory proteins. We have also further explored the nature of the aggregates that appear in the cytoplasm in our experimental conditions, which also sheds light on the downstream effects of these drugs on cytoplasmic proteostasis (please see the answer to issue #5 from Reviewer 2).

      Major issue #3. The claim that a canonical UPR is not induced is weak. First, the transcriptional program of S. cerevisiae from Travers et al is used as the canonical UPR, and compared to HU/DIA induced stress in S. pombe. These organisms may not be similar enough to assume that they have transcriptionally identical UPRs. Second, no consideration is given to the mechanism by which the different transcripts are modulated between "canonical" and HU/DIA induced UPR. Is it solely through RIDD, or does it point to differences in sensing or signaling transduction?

      We readdressed this topic by comparing our transcriptomic data with the genes that have been described as differentially expressed during UPR activation in S. pombe. The re-analysis of our RNA-Seq data has allowed us to infer the mechanisms that modulate the ER response to HU or DIA treatment and to further separate them from the UPR. This information has been added to the paper (page 26). As an alternative approach, we will also analyse the levels of UPR targets by western blot upon HU or DIA treatment.

      Finally, the p-values used are unadjusted (e.g. by Bonferroni's method or by ANOVA or at least controlled by an FDR approach) and unmodulated (extremely important when n = 3 and variance is poorly sampled), which makes them not dependable. It looks like HSF1 targets are induced, which should be addressed.

      We thank the reviewer for pointing this out. We forgot to include this information which now appears in the M&M section as follows:

      “A gene was considered as differentially expressed when it showed an absolute value of log2FC(LFC)≥1 and an adjusted p-value […]

      In this regard, we are currently performing proteome-wide mass spectrometry experiments to detect protein glutathionylation in our conditions, as it has been previously shown that DIA treatment leads to glutathionylation of key ER proteins such as Bip1, Pdi or Ero1 (Lind et al., 2002; Wang & Sevier, 2016), which might be reproduced upon HU treatment. We also plan to test the folding and processing of specific secretory cargoes by western blot in our experimental conditions (see below, and Reviewer 2, Major issue #1).
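
      The differential-expression criterion quoted from the M&M section above can be illustrated with a short Python sketch. It assumes a Benjamini-Hochberg FDR adjustment, which is only one of the corrections the reviewer mentions and is not confirmed here as the method used in the manuscript. The function names are hypothetical, and `alpha` is a required argument because the adjusted p-value cutoff is not stated in this response.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differentially_expressed(
    genes: Sequence[str],
    log2_fold_changes: Sequence[float],
    pvalues: Sequence[float],
    alpha: float,            # adjusted p-value cutoff; assumption, supplied by the caller
    lfc_cutoff: float = 1.0, # |log2FC| >= 1, as in the quoted criterion
) -> List[str]:
    """Select genes passing both the |log2FC| and adjusted p-value cutoffs."""
    padj = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, lfc, q in zip(genes, log2_fold_changes, padj)
        if abs(lfc) >= lfc_cutoff and q <= alpha
    ]


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the study.
    hits = differentially_expressed(
        genes=["a", "b", "c", "d"],
        log2_fold_changes=[2.0, -1.5, 0.5, -0.2],
        pvalues=[0.01, 0.04, 0.03, 0.005],
        alpha=0.05,
    )
    print(hits)
```

      Keeping the fold-change filter and the multiple-testing adjustment in separate functions makes it easy to swap in a different correction if another method was used.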

      We have already tested whether mutant strains with deletions of key enzymes in both cytoplasmic and ER redox systems are able to expand the ER upon HU or DIA treatment. We have found that only pgr1Δ (glutathione reductase), gsa1Δ (glutathione synthetase) and gcs1Δ (glutamate-cysteine ligase) mutants fully suppressed N-Cap formation, which suggests that glutathione has an important role in the phenotype of ER expansion. We have now added the pgr1Δ mutant strain to the main text of the manuscript (Figure 5C, page 30).

      Major issue #5. Figure S5 presents weak ER expansion in fibrosarcoma cells in response to HU (at very low concentrations and DIA is not included). The lack of any other phenotypes being presented could suggest that such experiments were done but didn't show any effect. The authors should straightforwardly discuss whether they performed experiments looking for perinuclear ER expansion or NPC clustering, and if not, what challenges precluded such experiments. Given how important this line of experimentation is for establishing generality, much more discussion is needed here.

      We investigated the effects on the ER in mammalian cells not only of HU, but also of DIA. The results with DIA mimicked the effect of HU (an increase in ER-ID fluorescence intensity). We had excluded this information from the manuscript only because we were focusing on HU at that point, given its current clinical use. In this new version of the manuscript, we have included an extra panel in supplementary figure 5 to show the results from DIA in mammalian cells.

      Minor concerns

      1) Figure 1A should show individual data points (i.e. 3 averages of independent experiments) in the bar graph.

      Although we initially changed the graph, we believe the bar plot layout is easier to read, and the other graphs similar to 1A are all presented as bar plots. We therefore preferred to keep the figure as it was in the original version. However, we include here the graph with each of the averages of the independent experiments.

      2) It is argued that Figure 1B demonstrates that the SPB is clustered with the NPC cluster. However, a single image is not enough to support this claim, as the association could be coincidental.

      We have changed the image to show a whole population of cells, with several of them having NPC clusters, and we have indicated the position of SPB in each of them (all colocalizing with the N-Cap).

      3) Figures 1B through 1D do not indicate the HU concentration.

      We thank the reviewer for pointing out this mistake. Figures 1B and 1C represent cells exposed to 15 mM HU for 4 hours, while the graph in 1D shows the results from cells exposed to 75 mM HU over a 4-hour period. This information has now been added to the corresponding figure legend.

      4) I was confused by the photobleaching experiments of Figure S1. How do the authors know that there is complete photobleaching of the cytoplasm or nucleus in the absence of a positive control? If photobleaching is incomplete, they could be measuring motility within compartments rather than transport between compartments, and hence the conclusion that trafficking is unaffected could be wrong.

      Our control is the background of each microscopy image; we make sure that after the laser bleaches a cell, the bleached area coincides with the background noise. That way, we make sure that fluorescence from any remaining GFP is completely removed from the bleached area.

      5) On page 8, they say "exposure to DIA" when they intend HU.

      This has been corrected in the manuscript.

      6) In Figure S3A, the colocalization of INM proteins with the ER are presented. It is not clearly explained what conclusions are meant to be drawn from this figure, but it seems it would have been more useful to compare INM and Cut11, to see whether the NPCs are localizing at the INM or ONM.

      We have added an explanation in the main text to clarify the main conclusions derived from this figure. We think that NPCs localize in a section of the nucleus where the two membranes (INM and ONM) are still bound together.

      7) I had to read Figure 2C's description and caption several times to understand the experiment. A schematic would be helpful. 20 mM HU is low compared to most conditions used. Does repositioning eventually take place for 75 mM HU or 3 mM DIA treatment, or do the cells just die before they get a chance?

      20 mM HU was used in this experiment to provide a time frame suitable for analysis after HU addition, as a higher HU concentration increases the repositioning time. We found that both HU (75mM 4h) and DIA (3mM 4h)-induced ER expansions are reversible upon drug washout. If HU is kept in the media, ER expansions are eventually resolved. However, DIA is a strong oxidant and if it is kept in the media ER expansions are not resolved and cells do not survive.

      8) Figure 2D shows little oxidative consequence from 75 mM HU treatment until 40 min., the same time that phenotypes are observed (Figure 1D). Is this relationship consistent with the kinetics of other concentrations of HU, or of DIA? Seems like a pretty important mechanistic consideration that can rationalize the effects of the two oxidants.

      Thanks to this comment, we realized that the numbering underneath Figure 1D (1E in the new version of the manuscript) was wrongly annotated. The original timings shown in the figure were “random”, meaning that the time point labelled as 40 minutes did not correspond to 40 minutes after the beginning of the experiment. We have now corrected this panel: the timings are now normalized to the moment when NPCs cluster. The fact that this moment previously coincided with “40 minutes” does not mean that N-Caps appear at that time point in HU (they in fact appear after more than 2 hours of incubation).

      9) Figure S4 is missing the asterisk on the lower left cell.

      Fixed in the corresponding figure.

      10) How is roundness determined in Figure S4B?

      Roundness in Figure S4B (now S2E) is determined the same way as in Figure 1D, and as is described in the Method section (copied below). A clarification has been added to the legend to address that.

      The ‘roundness’ parameter in the ‘Shape Descriptors’ plugin of Fiji/ImageJ was used after applying a threshold to the image in order to select only the more intense regions and subtract background noise (Schindelin et al., 2012). Roundness descriptor follows the function:

      Roundness = 4 × [Area] / (π × [Major axis]²)

      where [Area] constitutes the area of an ellipse fitted to the selected region in the image and [Major axis] is the diameter of the round shape that in this case would fit the perimeter of the nucleus.
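
      As a worked illustration of the roundness descriptor, the following minimal Python sketch assumes the standard ImageJ definition, 4 × [Area] / (π × [Major axis]²). The function name and the example values are hypothetical and are not taken from the manuscript's analysis pipeline.

```python
import math


def roundness(area: float, major_axis: float) -> float:
    """Roundness as defined for ImageJ shape descriptors.

    area: area of the ellipse fitted to the thresholded region.
    major_axis: length of the major axis of that fitted ellipse.
    The result is 1.0 for a perfect circle and decreases toward 0
    as the shape becomes more elongated.
    """
    if major_axis <= 0:
        raise ValueError("major_axis must be positive")
    return 4.0 * area / (math.pi * major_axis ** 2)


if __name__ == "__main__":
    # Circle of diameter 2: area = pi, major axis = 2.
    print(roundness(math.pi, 2.0))
    # Ellipse with semi-axes 2 and 1: area = 2*pi, major axis = 4.
    print(roundness(2.0 * math.pi, 4.0))
```

      For a fitted ellipse this value equals the ratio of the minor to the major axis, so a nucleus deformed into an elongated shape scores well below 1.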

      11) What threshold is used to determine whether cells analyzed in Figures S4C have "small ER" or "large ER"?

      An ER is considered large when its area along the projection of a 3-Z section is over 4 μm² (more than twice the mean area of the ER in cells with N-Caps in milder conditions). This has now been clarified in the legend of the corresponding figure.

      12) The authors interpret Figure 4K as indicating that ER expansion is not involved in the generation of punctal misfolded protein aggregates. However, the washout occurs only after the proteins have already aggregated. The proper interpretation is that the aggregates are not reversible by resolution of the stress, and hence are not physically reliant on disulfide bonds.

      We agree with the reviewer and have modified the interpretation of the indicated figure accordingly (page 29).


      The speculation that these proteins are iron dependent is a stretch; there is no reason to believe that losses of iron metabolism are the most important stress in these cells. It seems at least as likely that oxidizing cysteine-containing proteins in the cytosol or messing with the GSH/GSSG ratio in the cytosol would make plenty of proteins misfold; oxidative stress in budding yeast does activate hsf1. However, this point could be addressed by centrifugation and mass spectrometry to identify the aggregated proteome. It is also surprising that the authors did not investigate ER protein aggregation, perhaps by looking at puncta formation of chaperones beyond BiP. By contrast, the fact that gcs1 deletion prevents ER expansion but does not prevent Hsp104 puncta does support the idea that cytoplasmic aggregation is not dependent on ER expansion.

      To address this suggestion, we plan to analyze the localization of other chaperones and components of the protein quality control such as the ER Hsp40 Scj1 or the ribosome-associated Hsp70 Sks2.

      13) Figure 4L is cited on page 28 when Figure 4K is intended.

      This has been corrected in the text, although new panels have been added and now it is 4N.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02605

      Corresponding author: Woo Jae, Kim

      1. Point-by-point description of the revisions

      Reviewer #1

      General Comment: This study investigates the role of the foraging gene in modulating interval timing behaviors in flies, with a particular focus on mating duration. Using single-cell RNA sequencing and gene knockdown experiments, the research demonstrates the crucial role of foraging gene expression in Pdfr-positive cells for achieving longer mating duration (LMD). The study further identifies key neurons in the ellipsoid body (EB) as essential when the foraging gene is overexpressed, highlighting its specific influence on LMD. The findings suggest that a small subset of EB neurons must express the foraging gene to modulate LMD effectively.

      Answer: We would like to express our gratitude to the reviewer for their insightful comments and positive feedback on our manuscript. During the revision process, we serendipitously discovered that the heart-specific expression of the foraging gene plays a crucial role in regulating LMD behavior. We have elaborated on the significance of this finding in the revised manuscript and have addressed the reviewer's comments accordingly.

      Comment 1. (optional) Integration of Neuronal Subsets into a Pathway: The knockdown experiments indicate that a small subset of neurons must express the foraging gene to influence LMD. Could these neurons be integrated into a potential signaling pathway, or being treated as separate components within the brain circuit? How might this integration provide a more cohesive understanding of their role in LMD?

      Answer: We sincerely thank the reviewer for her/his insightful comments regarding the integration of neuronal subsets into a signaling pathway and their potential role in modulating LMD behavior. During the revision process, we conducted further experiments to address this question. While we were unable to identify a specific small subset of EB neurons expressing foraging, we utilized the recently developed EB-split GAL4 driver line (SS00096), which is restricted to the EB region of the brain, to confirm that foraging expression in the EB is indeed crucial for generating LMD behavior (Fig. 4L-M). This finding underscores the importance of foraging in specific neural circuits within the EB for interval timing.

      Additionally, we discovered that foraging expression in Hand-GAL4-labeled pericardial cells (PCs) of the heart is essential for LMD behavior. These PCs are also partially labeled by fru-GAL4 and 30y-GAL4 drivers, indicating that foraging functions in both neuronal and non-neuronal tissues to regulate interval timing. Importantly, we observed that group-reared males exhibit higher calcium activity in PCs compared to socially isolated males, suggesting that social context-dependent calcium dynamics in the heart play a critical role in modulating LMD behavior.

      These findings highlight a novel integration of neuronal and cardiac mechanisms, where foraging expression in both the EB and heart coordinates calcium dynamics to regulate interval timing. This dual-tissue involvement provides a more cohesive understanding of how foraging integrates social cues with internal physiological states to modulate complex behaviors like LMD. We believe this integration of neuronal and cardiac pathways offers a comprehensive framework for understanding the gene’s pleiotropic roles in behavior. We have included these new findings in the revised manuscript to better address the reviewer’s question and to strengthen the discussion of how foraging functions across tissues to regulate interval timing behaviors.

      Comment 2. Genetic Considerations in Gal4 System Usage (Fig. 1D): In the study, the elavc155-Gal4 transgene, located on chromosome I, produces hemizygous males after crossing, while the repo-Gal4 transgene, located on chromosome III, results in heterozygous males. Is there any evidence suggesting that this genetic configuration could impact the experimental outcomes? If so, what steps could be taken to address potential issues?

      Answer: We appreciate the reviewer’s thoughtful consideration of potential genetic confounds related to the chromosomal locations of the elavc155 and repo-GAL4 transgenes. To address this concern, we conducted additional experiments using the nSyb-GAL4 driver, which is located on the third chromosome, and observed that knockdown of foraging with this driver also disrupts LMD behavior (Fig. S1G). This result aligns with our findings using elavc155 (chromosome I) and repo-GAL4 (chromosome III), indicating that the chromosomal location of the GAL4 transgene does not significantly impact the experimental outcomes.

      Furthermore, our extensive tissue-specific GAL4 screening, which included drivers on different chromosomes, consistently demonstrated that foraging knockdown effects on LMD are robust and reproducible across various genetic configurations. These results suggest that the observed behavioral deficits are due to the loss of foraging function rather than positional effects of the GAL4 transgenes. We thank the Reviewer for raising this important point and have taken care to address it thoroughly in our revised manuscript.

      Comment 3. Discrepancies in lacZ Signal Intensity (Fig. 5A): The observed discrepancies in lacZ signal intensity on the surface of the male brain have been attributed to the dissection procedure. Is it feasible to replace the current data with a new, more consistent dataset? How might improved dissection techniques mitigate these discrepancies?

      Answer: We thank the reviewer for her/his observation regarding the discrepancies in lacZ signal intensity on the surface of the male brain, which we attributed to variations in the dissection procedure. While replacing the current dataset with a new one is feasible, we have instead shifted our focus to address this concern by leveraging more reliable and validated tissue-specific GAL4 drivers combined with foraging-RNAi.

      During the revision process, we extensively examined multiple foraging-GAL4 lines and found that foraging expression in the brain is limited and often inconsistent, despite scRNA-seq data from flySCope indicating broader expression across tissues, including the brain. This discrepancy suggests that many foraging-GAL4 lines may not accurately reflect endogenous foraging expression patterns. To circumvent this issue, we utilized well-characterized tissue-GAL4 drivers to systematically identify tissues where foraging plays a critical role in modulating LMD behavior.

      Our findings revealed that foraging expression in the heart, particularly in fru-positive heart cells, is essential for LMD. This discovery aligns with previous knowledge that foraging is highly enriched in glial cells in the brain, but our new data highlight a previously unrecognized role for cardiac foraging in regulating interval timing behaviors. Furthermore, we demonstrated that calcium activity in these heart cells is dynamically regulated by social context, suggesting that these cells play a crucial role in modulating male mating investment.

      We believe this new analysis addresses the reviewer’s concerns by providing a more robust and consistent approach to studying foraging function, focusing on its role in the heart rather than relying on potentially unreliable brain expression data. We hope these findings meet the reviewer’s expectations and provide a clearer understanding of foraging’s role in mating duration.

      Comment 4. Rescue Experiment Data (Fig. S2L): Could additional data be provided to demonstrate the rescue effect using the c61-Gal4 driver, similar to what was observed with the 30y-Gal4 driver? How would such data enhance the study's conclusions regarding the specificity and robustness of the foraging gene's role in LMD?

      Answer: We appreciate the reviewer’s suggestion to provide additional rescue experiment data using the c61-GAL4 driver, similar to the results obtained with the 30y-GAL4 driver. While we do not currently have a UAS-for line to perform direct rescue experiments with c61-GAL4, we have conducted extensive follow-up experiments using both the 30y-GAL4 and c61-GAL4 drivers to further validate the role of foraging in LMD behavior. These experiments consistently demonstrated that foraging knockdown in cells targeted by these drivers disrupts LMD, reinforcing the specificity and robustness of foraging’s role in interval timing.

      Additionally, our revised manuscript includes new findings that highlight the critical role of foraging expression in fru-positive heart neurons for generating male-specific mating investment. These heart neurons exhibit dynamic calcium activity changes in response to social context, further supporting the idea that foraging modulates LMD through both neuronal and non-neuronal mechanisms. While we acknowledge that direct rescue data with c61-GAL4 would strengthen the study, we believe the combination of 30y-GAL4 and c61-GAL4 knockdown results, along with the newly identified role of heart neurons, provides compelling evidence for foraging’s role in LMD.

      In addition, we have confirmed that the 30y-GAL4 driver labels fru-positive heart cells, further supporting the critical role of foraging expression in these cells for generating male-specific mating investment. This finding aligns with our broader results, demonstrating that foraging function in fru-positive heart neurons is essential for modulating interval timing behaviors, particularly LMD. We hope these additional analyses address the reviewer’s concerns and enhance the study’s conclusions regarding the specificity and robustness of foraging function in interval timing behaviors. We have incorporated the following findings into the main text:

      “Therefore, we conclude that the knockdown and genetic rescue effects observed with the Pdfr3A-GAL4 driver (Fig. 3J and 3N) and the 30y-GAL4 driver (Fig. 4A, S2A, and S2L) are attributable to their expression in the heart. In summary, our findings demonstrate that fru-positive heart cells expressing foraging and Pdfr play a critical role in mediating LMD behavior.”


      Reviewer #2

      General Comment: The authors nicely demonstrated that the Drosophila for gene is involved in the plastic LMD behavior that serves as a model for interval timing. For is widely expressed in the body, they have tentatively localized the LMD-relevant for functioning to the ellipsoid body of the central complex.

      Answer: We sincerely thank the reviewer for their positive feedback on our manuscript and their recognition of our findings regarding the role of the foraging gene in modulating plastic LMD behavior as a model for interval timing. In addition to its function in the ellipsoid body (EB) of the central complex, we have identified a novel and critical role for foraging in fru-positive heart neurons. These neurons are essential for regulating male-specific mating investment, as demonstrated by dynamic calcium activity changes in response to social context. This discovery expands our understanding of foraging’s pleiotropic roles, highlighting its function not only in neural circuits but also in non-neuronal tissues, particularly the heart, to modulate interval timing behaviors. We believe these findings provide a more comprehensive view of how foraging integrates genetic, neural, and physiological mechanisms to regulate complex behaviors. We hope this additional insight into the role of fru-positive heart neurons further strengthens the manuscript and aligns with the reviewer’s interest in the broader implications of foraging function.


      Major concerns:

      Comment 1. Please clarify how a loss-of-function forS allele can be dominant in the presence of overactive forR allele? In the same vein, please clarify how the forR/forS transheterozygote supports your hypothesis that high levels of PKG activity disrupt SMD and low levels of it disrupt LMD?

      Answer: We thank the reviewer for her/his insightful questions regarding the dominance of the forS allele in the presence of the overactive forR allele and the implications of the forR/forS transheterozygote phenotype. As the Reviewer noted, the forR allele is associated with higher PKG activity, while the forS allele exhibits lower PKG activity. The disruption of SMD in the presence of a single forR allele can be explained by the excessive PKG activity, which may hyperactivate or desensitize neural circuits required for SMD. Conversely, the forS homozygote disrupts LMD, suggesting that a minimum threshold of PKG activity is necessary for LMD generation.

      The forR/forS transheterozygote, which disrupts both LMD and SMD, presents an intriguing case. Unlike forR/+ or forS/+ heterozygotes, which show intact behaviors due to intermediate PKG activity levels, the forR/forS combination results in conflicting PKG activity levels that likely destabilize shared pathways required for both behaviors. We propose two hypotheses to explain this phenomenon:

      1. Metabolic Disruption: The foraging gene mediates adult plasticity and gene-environment interactions, particularly under conditions of food deprivation (Kent 2009). It influences body fat, carbohydrate metabolism, and gene expression levels, leading to metabolic and behavioral gene-environment interactions (GEI). In forR/forS transheterozygotes, the metabolic changes induced by each allele may accumulate without proper regulatory mechanisms, disrupting the male’s internal metabolic state and impairing the ability to accurately measure interval timing.

      2. Neuronal Polymorphism: The foraging gene regulates neuronal excitability, synaptic transmission, and nerve connectivity (Renger 1999). The forR and forS alleles may induce distinct neuronal polymorphisms, such as altered synaptic terminal morphology, which could lead to conflicting circuit dynamics in transheterozygotes. This neuronal mismatch may explain why forR/forS flies exhibit disrupted behaviors, unlike heterozygotes with a wild-type allele.

      These findings align with prior studies showing that PKG activity must be tightly regulated within context-dependent ranges for optimal behavior. The foraging gene’s pleiotropic roles, including its influence on metabolic and neural pathways, highlight the importance of allelic balance in maintaining behavioral robustness. The forR/forS transheterozygote phenotype underscores the complexity of foraging’s role in interval timing, where extreme or mismatched PKG activity levels disrupt circuit-specific thresholds critical for distinct behaviors. We hope this explanation clarifies the dominance effects and the role of PKG activity in LMD and SMD, and we have incorporated these insights into the revised manuscript to strengthen our discussion of foraging’s pleiotropic functions.

      We provide a concise explanation of this hypothesis in the Discussion section, as outlined below:

      “The foraging gene plays a critical role in regulating interval timing behaviors, with its allelic variants, rover and sitter, exhibiting distinct effects on LMD and SMD. These differences are primarily driven by their opposing impacts on cGMP-dependent protein kinase (PKG) activity. The forR allele, associated with higher PKG activity, disrupts SMD while maintaining normal LMD (Fig. 1A), suggesting that elevated PKG levels may hyperactivate or desensitize neural circuits specific to SMD processes. Conversely, the forS allele, characterized by lower PKG activity, impairs LMD but not SMD (Fig. 1B), indicating that reduced PKG activity fails to meet the neuromodulatory thresholds required for LMD coordination. The forR/forS transheterozygotes, which disrupt both LMD and SMD (Fig. 1C), reveal a complex interaction between these alleles, likely due to conflicting PKG activity levels or metabolic and neuronal polymorphisms that destabilize shared pathways. This phenomenon underscores the foraging gene’s pleiotropic roles, where allelic balance fine-tunes PKG activity to maintain behavioral robustness, while extreme or mismatched levels disrupt circuit-specific thresholds critical for distinct memory processes [6,10].

      The foraging gene’s influence on interval timing behaviors extends beyond neural circuits to include metabolic and synaptic regulation. The intact behaviors observed in forR/+ or forS/+ heterozygotes suggest that intermediate PKG activity levels balance circuit dynamics, allowing for normal LMD and SMD. However, the dual deficits in forR/forS transheterozygotes highlight the importance of allelic balance, as conflicting PKG levels may lead to systemic disruptions in both metabolic and neural pathways. This aligns with previous studies showing that foraging mediates adult plasticity and gene-environment interactions, particularly under stress conditions, and regulates synaptic terminal morphology and neuronal excitability [29,77]. The gene’s role in integrating genetic and environmental cues further emphasizes its central role in adaptive behaviors. Collectively, these findings illustrate the complex interplay between PKG activity, neural circuits, and metabolic regulation in shaping interval timing behaviors, highlighting the foraging gene as a key modulator of behavioral plasticity in Drosophila [3,6,77].”

      Comment 2. Please consider removing lines 193-201 & Fig 3G,H, since abruptly and briefly returning to SMD could distract the reader and hinder the flow.

      Answer: We sincerely thank the reviewer for her/his suggestion to improve the flow of the manuscript. In response to the reviewer’s feedback, we have removed Figure 3G-H and the related text (lines 193-201) from the main text. While the data on SMD behavior provided additional insights into the role of foraging in gustatory modulation via sNPF-expressing peptidergic neurons, we agree that its inclusion at this point in the manuscript could distract from the primary focus on LMD behavior and interval timing.

      Comment 3. Please use more specific Gal4 drivers to identify the exact subset of the EB-RNs where for function is necessary for LMD. Please note that Taghert lab already identified Pdfr+ EB-RN subset, and in contradiction to your findings, demonstrated that Cry is expressed in these Pdfr+ EB neurons

      Answer: We thank the reviewer for their suggestion to use more specific GAL4 drivers to identify the exact subset of EB ring neurons (EB-RNs) where foraging function is necessary for LMD. In response, we utilized the EB-split-GAL4 driver SS00096, which has been previously employed to map the neuroanatomical ultrastructure of the EB (Turner-Evans 2020). Knockdown of foraging using this refined EB driver disrupted LMD behavior, confirming that foraging function in the EB is indeed crucial for interval timing.

      Regarding the reviewer’s observation about the Taghert lab’s findings on Pdfr+ EB-RNs and the expression of Cry in these neurons, we acknowledge this discrepancy. However, during the revision process, we discovered that foraging and Pdfr are co-expressed not only in EB neurons but also in fru-positive heart neurons, which play a complementary role in modulating LMD behavior. This finding suggests that the apparent contradiction may arise from the dual-tissue involvement of foraging in both EB neurons and heart cells. While foraging function in the EB is critical, its role in heart neurons may provide an additional layer of regulation for interval timing behaviors, potentially compensating for or interacting with EB-related mechanisms.

      We have incorporated these insights into the revised manuscript, emphasizing the importance of both EB and heart neurons in mediating LMD behavior. This dual-tissue perspective offers a more comprehensive understanding of foraging’s role in interval timing and addresses the potential discrepancies highlighted by the reviewer. We hope this clarification resolves the reviewer’s concerns and strengthens the manuscript’s conclusions regarding the neural and non-neural mechanisms underlying foraging function.

      Comment 4. Please clarify how do you think for and Pdfr signaling molecularly interact in these neurons? Since your work doesn't implicate the for+ AL neurons, please remove lines 260-269. Please clarify if the Pdfr+ for+ EB neurons are also fru+. The lacZ staining in Fig5A-B is atypical in having a mosaic-like pattern. Please replace the image.

      Answer: We thank the reviewer for her/his thoughtful questions regarding the molecular interaction between foraging and Pdfr signaling, as well as their observations on the atypical lacZ staining pattern. Below, we address each point in detail:

      1. Molecular Interaction Between foragingand PdfrSignaling: Our tissue-specific driver screening indicates that Pdfr and foraging do not co-express in the same neurons within the brain. Instead, we found that Pdfr and foraging are co-expressed in fru-positive heart cells, suggesting that PDF-Pdfr signaling in these cells modulates calcium activity in pericardial cells (PCs) in a social context-dependent manner. This finding aligns with our previous work showing that PDF signaling is crucial for LMD behavior (Kim 2013). We propose that PDF-Pdfr signaling operates not only through the brain’s sLNv to LNd neuronal circuit but also through a brain-to-heart signaling axis, influencing behaviors and physiological processes across multiple tissues.

      Removal of Lines 260-269: As suggested, we have removed lines 260-269, which discussed for+ AL neurons, as our findings do not implicate these neurons in LMD regulation. This revision helps streamline the manuscript and maintain focus on the relevant neural and cardiac mechanisms.

      Clarification on Pdfr+for+EB Neurons and fru Expression: While our data do not directly address whether Pdfr+ for+ EB neurons are also fru+, we have confirmed that foraging and Pdfr co-express in fru-positive heart cells. This suggests that fru may play a role in integrating foraging and Pdfr signaling in non-neuronal tissues, particularly in the heart, to regulate LMD behavior.

      Replacement of lacZ Staining Images: During the revision process, we extensively examined multiple foraging-GAL4 lines and found that foraging expression in the brain is limited and often inconsistent, despite scRNA-seq data from flySCope indicating broader expression across tissues, including the brain. This discrepancy suggests that many foraging-GAL4 lines may not accurately reflect endogenous foraging expression patterns. To circumvent this issue, we utilized well-characterized tissue-GAL4 drivers to systematically identify tissues where foraging plays a critical role in modulating LMD behavior. Our findings revealed that foraging expression in the heart, particularly in fru-positive heart cells, is essential for LMD. This discovery aligns with previous knowledge that foraging is highly enriched in glial cells in the brain, but our new data highlight a previously unrecognized role for cardiac foraging in regulating interval timing behaviors. Furthermore, we demonstrated that calcium activity in these heart cells is dynamically regulated by social context, suggesting that these cells play a crucial role in modulating male mating investment. We believe this new analysis addresses the reviewer’s concerns by providing a more robust and consistent approach to studying foraging function, focusing on its role in the heart rather than relying on potentially unreliable brain expression data. We hope these findings meet the reviewer’s expectations and provide a clearer understanding of foraging’s role in mating duration.

      We hope these revisions meet the Reviewer’s expectations and provide a clearer understanding of the interplay between foraging and Pdfr signaling in interval timing behaviors.

      Comment 5. Please consider removing lines 303-312, since this negative result may dilute your final conclusions without adding strong factual value.

      Answer: We appreciate the reviewer's suggestion regarding lines 303-312. Upon careful consideration, we believe this paragraph provides important context about the roles of dsx-positive and fru-positive cells in foraging behavior. Specifically, it highlights that the foraging function is associated with fru-positive cells rather than dsx-positive cells, which is a key distinction in our study. This information is relevant to understanding the broader implications of our findings, as it underscores the functional specificity of these genes in regulating behavior. However, to address the reviewer's concern, we have revised the paragraph to ensure it is more concise and directly tied to the study's conclusions. We have also integrated additional data from the new manuscript to further strengthen the factual value of this section. We hope this adjustment strikes the right balance between maintaining necessary context and avoiding any dilution of the final conclusions. Thank you for this thoughtful feedback.

      Minor concerns:

      Comment 6. Minor points: In the intro please mention other interval timing mechanisms and their underlying molecular mechanisms (e.g., CREB work of Crickmore lab). Please provide a better rationale for why you thought for is a good candidate for LMD? In line 124, when you start to talk about larval neurons - please specify which neurons you are referring to. In Fig 2E,G,H - 'glia' should be replaced with 'neurons'.

      Answer: We appreciate the reviewer’s insightful comments regarding our conclusion linking LMD to interval timing behavior. Current research by Crickmore et al. has shed light on how mating duration in Drosophila serves as a powerful model for exploring changes in motivation over time as behavioral goals are achieved. For instance, at approximately six minutes into mating, sperm transfer occurs, leading to a significant shift in the male's nervous system: he no longer prioritizes sustaining the mating at the expense of his own survival. This change is driven by the output of four male-specific neurons that produce the neuropeptide Corazonin (Crz). When these Crz neurons are inhibited, sperm transfer does not occur, and the male fails to downregulate his motivation, resulting in matings that can last for hours instead of the typical ~23 minutes (Thornquist 2020).

      Recent research by Crickmore et al. has received NIH R01 funding (Mechanisms of Interval Timing, 1R01GM134222-01) to explore mating duration in Drosophila as a genetic model for interval timing. Their work highlights how changes in motivation over time can influence mating behavior, particularly noting that significant behavioral shifts occur during mating, such as the transfer of sperm at approximately six minutes, which correlates with a decrease in the male's motivation to continue mating (Thornquist 2020). These findings suggest that mating duration is not only a behavioral endpoint but may also reflect underlying mechanisms related to interval timing.

      In addition to the efforts of Crickmore's group to connect mating duration with a straightforward genetic model for interval timing, we have previously published several papers demonstrating that LMD and SMD can serve as effective genetic models for interval timing within the fly research community. For instance, we have successfully connected SMD to an interval timing model in a recently published paper (Lee 2023), as detailed below:

      "We hypothesize that SMD can serve as a straightforward genetic model system through which we can investigate "interval timing," the capacity of animals to distinguish between periods ranging from minutes to hours in duration.....

      In summary, we report a novel sensory pathway that controls mating investment related to sexual experiences in Drosophila. Since both LMD and SMD behaviors are involved in controlling male investment by varying the interval of mating, these two behavioral paradigms will provide a new avenue to study how the brain computes the ‘interval timing’ that allows an animal to subjectively experience the passage of physical time (Buhusi & Meck, 2005; Merchant et al, 2012; Allman et al, 2013; Rammsayer & Troche, 2014; Golombek et al, 2014; Jazayeri & Shadlen, 2015)."

      Lee, S. G., Sun, D., Miao, H., Wu, Z., Kang, C., Saad, B., ... & Kim, W. J. (2023). Taste and pheromonal inputs govern the regulation of time investment for mating by sexual experience in male Drosophila melanogaster. PLoS Genetics, 19(5), e1010753.

      We have also successfully linked LMD behavior to an interval timing model and have published several papers on this topic recently (Huang 2024, Zhang 2024, Sun 2024).

      Sun, Y., Zhang, X., Wu, Z., Li, W., & Kim, W. J. (2024). Genetic Screening Reveals Cone Cell-Specific Factors as Common Genetic Targets Modulating Rival-Induced Prolonged Mating in male Drosophila melanogaster. G3: Genes, Genomes, Genetics, jkae255.

      Zhang, T., Zhang, X., Sun, D., & Kim, W. J. (2024). Exploring the Asymmetric Body’s Influence on Interval Timing Behaviors of Drosophila melanogaster. Behavior Genetics, 54(5), 416-425.

      Huang, Y., Kwan, A., & Kim, W. J. (2024). Y chromosome genes interplay with interval timing in regulating mating duration of male Drosophila melanogaster. Gene Reports, 36, 101999.

      Finally, in this context, we have outlined in our INTRODUCTION section below how our LMD and SMD models are related to interval timing, aiming to persuade readers of their relevance. We hope that the reviewer and readers are convinced that mating duration and its associated motivational changes such as LMD and SMD provide a compelling model for studying the genetic basis of interval timing in Drosophila.

      “The mating duration (MD) of male fruit flies, Drosophila melanogaster, serves as an excellent model for studying interval timing behaviors. In Drosophila, two notable interval timing behaviors related to mating duration have been identified: Longer-Mating-Duration (LMD), which is observed when males are in the presence of competitors and extends their mating duration [15–17] and Shorter-Mating-Duration (SMD), which is characterized by a reduction in mating time and is exhibited by sexually experienced males [18,19]. The MD of male fruit flies serves as an excellent model for studying interval timing, a process that can be modulated by internal states and environmental contexts. Previous studies by our group (Kim 2013, Kim 2012, Zhang 2024, Lee 2023, Huang 2024) and others (Thornquist 2020, Crickmore 2013, Zhang 2019, Zhang 2021) have established robust frameworks for investigating MD using advanced genetic tools, enabling the dissection of neural circuits and molecular mechanisms that govern interval timing.

      The foraging gene emerged as a strong candidate for regulating LMD due to its well-documented role in behavioral plasticity and decision-making processes (Kent 2009, Alwash 2021, Anreiter 2019). The foraging gene encodes a cGMP-dependent protein kinase (PKG), which has been implicated in modulating foraging behavior, aggression, and other context-dependent behaviors in Drosophila. Its involvement in these processes suggests a potential role in integrating environmental cues and internal states to regulate interval timing, such as LMD. Furthermore, the molecular mechanisms underlying interval timing have been explored in other contexts, such as the work of Crickmore et al., which has demonstrated the critical role of CREB (cAMP response element-binding protein) in regulating behavioral timing and plasticity. CREB-dependent signaling pathways, along with other molecular players like PKG, provide a broader framework for understanding how interval timing is orchestrated at the neural and molecular levels (Thornquist 2020, Zhang 2016, Zhang 2021, Zhang 2019, Crickmore 2013, Zhang 2023). By investigating foraging in the context of LMD, we aim to uncover how specific genetic and neural mechanisms fine-tune interval timing in response to social and environmental cues, contributing to a deeper understanding of the principles governing behavioral adaptation.”

      When describing larval neurons, we provide specific references to ensure clarity and accuracy, as outlined below:

      “Moreover, the cultured giant neural characteristics of these phenotypes are distinctly different [29].”

      We thank the reviewer for catching this error. We have corrected the incorrect label "Glia" to "Neuron" in Figures 2E, 2G, and 2H.

      Reviewer #3

      General Comment: This manuscript explores the foraging gene's role in mediating interval timing behaviors, particularly mating duration, in Drosophila melanogaster. The two distinct alleles of the foraging gene, rover and sitter, demonstrate differential impacts on mating behaviors. Rovers show deficiencies in shorter mating duration (SMD), while sitters are impaired in longer mating duration (LMD). The gene's expression in specific neuronal populations, particularly those expressing Pdfr (a critical regulator of circadian rhythms), is crucial for LMD. The study further identifies sexually dimorphic patterns of foraging gene expression, with male-biased expression possibly in the ellipsoid body (EB) being responsible for regulating LMD behavior. The findings suggest that the foraging gene operates through a complex neural circuitry that integrates genetic and environmental factors to influence mating behaviors in a time-dependent manner. Additionally, restoring foraging expression in Pdfr-positive cells rescues LMD behavior, confirming its central role in interval timing related to mating.

      Answer: We sincerely thank the reviewer for her/his thoughtful and comprehensive synthesis of our work, as well as their recognition of its key contributions. We are grateful that the reviewer highlighted the central findings of our study, including the allele-specific roles of forR (rover) and forS (sitter) in regulating distinct interval timing behaviors, specifically the deficiencies of rovers in SMD and sitters in LMD. We also appreciate the reviewer’s emphasis on the sexually dimorphic expression of the foraging gene, particularly its male-biased expression in the ellipsoid body (EB), and its critical role in Pdfr-positive neurons for mediating LMD.

      We agree with the reviewer that the interplay between genetic factors (e.g., allelic variation in foraging) and environmental cues (e.g., circadian rhythms via Pdfr pathways) underscores the complexity of interval timing regulation. The rescue of LMD behavior by restoring foraging expression in Pdfr cells further supports our hypothesis that foraging operates through specialized neural circuits to integrate temporal and environmental inputs. This finding aligns with broader studies on interval timing mechanisms, such as the work of the Crickmore lab on CREB-dependent pathways, which have demonstrated how molecular and neural mechanisms converge to regulate behavioral plasticity and timing.

      In the revised manuscript, we will expand on these points to strengthen the discussion of foraging’s pleiotropic roles in time-dependent mating strategies and its potential links to evolutionary fitness. Specifically, we will incorporate additional insights from the new manuscript, including further evidence of how foraging balances behavioral plasticity with metabolic and neural demands, and how its expression in specific neuronal populations, such as the EB, contributes to adaptive behaviors. These updates will provide a more comprehensive understanding of the gene’s role in interval timing and its broader implications for behavioral adaptation. Once again, we thank the Reviewer for their valuable feedback, which has helped us refine and enhance the presentation of our findings.

      Major concerns:

      Comment 1. The sexually dimorphic expression of the foraging gene is not convincing. Specifically, the lacZ signal in the male brain is not representative.

      Answer: We sincerely thank the reviewer for her/his insightful comment regarding the sexually dimorphic expression of the foraging gene. We agree that the lacZ signal in the male brain, as presented, may not be fully representative, and we appreciate the reviewer’s observation regarding the discrepancies in signal intensity, which we attribute to variations in dissection procedures. While replacing the current dataset with a new one is feasible, we have chosen to address this concern by shifting our focus to a more reliable and validated approach using tissue-specific GAL4 drivers combined with foraging-RNAi.

      During the revision process, we conducted an extensive examination of multiple foraging-GAL4 lines and found that foraging expression in the brain is often limited and inconsistent, despite scRNA-seq data from flySCope indicating broader expression across tissues, including the brain. This discrepancy suggests that many foraging-GAL4 lines may not accurately reflect endogenous foraging expression patterns. To overcome this limitation, we employed well-characterized tissue-specific GAL4 drivers to systematically identify tissues where foraging plays a critical role in modulating LMD behavior.

      Our findings revealed that foraging expression in the heart, particularly in fru-positive heart cells, is essential for LMD. This discovery aligns with previous knowledge that foraging is highly enriched in glial cells in the brain, but our new data highlight a previously unrecognized role for cardiac foraging in regulating interval timing behaviors. Furthermore, we demonstrated that calcium activity in these heart cells is dynamically regulated by social context, suggesting that these cells play a crucial role in modulating male mating investment.

      By focusing on the heart and leveraging more reliable genetic tools, we believe this new analysis addresses the Reviewer’s concerns and provides a more robust and consistent approach to studying foraging function. We hope these findings meet the reviewer’s expectations and offer a clearer understanding of foraging’s role in mating duration. We are grateful for the Reviewer’s constructive feedback, which has significantly strengthened our study.

      Comment 2. Key control genotypes are missing.

      Answer: We thank the Reviewer for raising this important point regarding control genotypes. We would like to clarify that all necessary control experiments have indeed been conducted, and the results are included in the manuscript. Detailed descriptions of these controls, including the specific genotypes and experimental conditions, are provided in the Methods section. For example, control experiments were performed to account for genetic background effects, GAL4 driver activity, and RNAi efficiency, ensuring the reliability and specificity of our findings. In the revised manuscript, we have further emphasized these control experiments and their outcomes to ensure transparency and reproducibility. We have also included additional details in the Results section to highlight how these controls validate our key findings. For instance, control genotypes lacking the foraging-RNAi or GAL4 drivers were used to confirm that the observed phenotypes are specifically due to the manipulation of foraging expression.

      We appreciate the Reviewer’s attention to this critical aspect of our study and hope that the additional clarification and emphasis on control experiments in the revised manuscript address their concerns. If there are specific control genotypes or experiments the reviewer would like us to include or elaborate on further, we would be happy to do so. Thank you for this valuable feedback.

      Comment 3. fru is not expressed in the EB, so the authors may need to reconcile their model in figure 5G.

      Answer: We thank the reviewer for her/his insightful comment regarding the expression of fru in the ellipsoid body (EB) and its relevance to our model in Figure 5G. We agree that fru is not expressed in the EB, and we acknowledge the need to reconcile this aspect of our model. While initial evidence suggested a potential role for the EB in regulating foraging-dependent LMD behavior, further investigation has revealed that neurons outside the EB are more likely to be involved in this process.

      During our revision, we identified fru-positive heart neurons that coexpress Pdfr and foraging, which appear to play a critical role in modulating LMD behavior. These findings suggest that the heart, rather than the EB, may be a key site for foraging function in the context of interval timing and mating duration. Specifically, we demonstrated that calcium activity in these fru+ heart cells is dynamically regulated by social context, further supporting their role in modulating male mating investment.

      In light of these new findings, we revised Figure 5G as new Figure 6H and the accompanying model to reflect the updated understanding that fru+ heart neurons, rather than EB neurons, are central to the regulation of LMD behavior. This adjustment aligns with our broader goal of accurately representing the neural and molecular mechanisms underlying foraging’s role in interval timing. We appreciate the Reviewer’s feedback, which has helped us refine our model and strengthen the manuscript. We hope these revisions address their concerns and provide a clearer and more accurate representation of our findings. Thank you for this valuable input.

      Minor concerns: Comment 4.

      Line 32, what do you mean by "overall success of the collective"

      Line 124-126: I suggest not using "sitter neurons" or "rover neurons". Line 301, typo with "male-specific".

      Answer: We thank the Reviewer for their careful reading and constructive feedback. We have addressed each of their comments as follows:

      1. Line 32: We agree with the reviewer that the phrase "overall success of the collective" was unclear and have completely revised the Abstract to remove this expression. The updated Abstract now provides a clearer and more concise summary of our findings.

      Lines 124-126: We appreciate the reviewer’s suggestion to avoid using the terms "sitter neurons" or "rover neurons," as they could be misleading. We have revised this phrasing to "neurons of sitter/rover allele" to more accurately reflect the genetic context of our study.

      Line 301: We have corrected the typo with "male-specific" to ensure accuracy and clarity in the text.

      We hope these revisions address the Reviewer’s concerns and improve the overall quality of the manuscript. Thank you for your valuable input, which has helped us refine our work.

      Strengths and limitations of the study: This study presents a significant advancement in understanding the foraging gene's role in regulating mating behaviors through interval timing, and identifies the critical role of Pdfr-expressing neurons in the ellipsoid body for LMD. However, it does not fully explain how these neurons specifically modulate timing mechanisms. The lack of in-depth mechanistic exploration of how these neurons interact with other circuits involved in memory and decision-making leaves gaps in the understanding of the exact pathways influencing interval timing. Also, the study focuses more on LMD behaviors and the neural circuits involved, leaving the mechanisms underlying SMD comparatively underexplored.

      Answer: We thank the reviewer for her/his thoughtful assessment of the strengths and limitations of our study. We agree that our work represents a significant advancement in understanding the role of the foraging gene in regulating mating behaviors through interval timing, particularly in identifying the critical role of Pdfr-expressing neurons in the ellipsoid body (EB) for long mating duration (LMD). However, we acknowledge that the initial manuscript did not fully elucidate how these neurons specifically modulate timing mechanisms or interact with other neural circuits involved in memory and decision-making.

      In response to this feedback, we have conducted additional experiments and analyses, which are now included in the revised manuscript. Specifically, we identified fru-positive heart neurons that coexpress Pdfr and foraging, and we demonstrated their essential role in LMD using calcium imaging (CaLexA). These findings provide a more comprehensive mechanistic understanding of how foraging influences interval timing through cardiac activity, which is dynamically regulated by social context. This new evidence addresses the reviewer’s concern by offering a clearer picture of the neural and molecular pathways underlying LMD.

      Regarding SMD behavior, we agree that it was comparatively underexplored in the initial manuscript. However, we have extensively studied SMD in other contexts, as highlighted in several of our previously published papers. These studies have investigated the sensory mechanisms, memory processes, peptidergic signaling, and clock gene functions associated with SMD (Zhang 2024, Zhang 2024, Sun 2024, Wong 2019, Kim 2024, Lee 2023). While the current manuscript focuses primarily on LMD, we will include a discussion of these findings to provide a more balanced perspective on the mechanisms underlying both LMD and SMD.

      We believe these revisions address the Reviewer’s concerns and significantly strengthen the manuscript by providing a more detailed mechanistic understanding of foraging’s role in interval timing and mating behaviors. We are grateful for the Reviewer’s constructive feedback, which has helped us improve the depth and clarity of our study. Thank you for your valuable input.

      Advance: This study brings a novel perspective to the foraging gene, previously known for its role in regulating food-search behavior. It demonstrates that foraging is also involved in interval timing, a cognitive process integral to mating behaviors in Drosophila. This discovery challenges the assumption that foraging is solely related to foraging strategies, revealing a broader function in time-based decision-making processes.

      Answer: We sincerely thank the reviewer for her/his insightful comments and for recognizing the novel contributions of our study. We are pleased that the reviewer highlighted how our work expands the understanding of the foraging gene, which was previously primarily associated with food-search behavior. By demonstrating its role in interval timing—a cognitive process critical to mating behaviors in Drosophila—we challenge the conventional assumption that foraging is solely related to foraging strategies. Instead, our findings reveal its broader function in time-based decision-making processes, particularly in the context of mating duration.

      This discovery not only advances our understanding of the pleiotropic roles of foraging but also opens new avenues for exploring how genetic and neural mechanisms integrate temporal and environmental cues to regulate complex behaviors. We are grateful for the reviewer’s support and acknowledgment of the significance of our findings. Thank you for this valuable feedback.

      Audience: The study offers significant value to several specialized research communities, including behavioral genetics and evolutionary biology, especially those using the Drosophila model. This could inform future research on other behaviors that depend on precise timing and decision-making.

      Answer: We sincerely thank the reviewer for her/his thoughtful comment and for recognizing the broad relevance of our study. We are pleased that the reviewer highlighted the significant value our work offers to specialized research communities, particularly in behavioral genetics and evolutionary biology, as well as to researchers using the Drosophila model. By elucidating the role of the foraging gene in interval timing and its impact on mating behaviors, our findings provide a foundation for future research on other behaviors that rely on precise timing and decision-making. This study not only advances our understanding of the genetic and neural mechanisms underlying interval timing but also opens new avenues for exploring how similar processes may operate in other species or contexts. We hope our work will inspire further investigations into the interplay between genetic variation, neural circuits, and environmental cues in shaping adaptive behaviors. Thank you for your valuable feedback and for acknowledging the potential impact of our research.

  2. Feb 2025
    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewing Editor Note:

      The two reviewers have provided thoughtful and constructive feedback that we hope will be of use to the authors to improve their manuscript.

      Reviewer #1 (Recommendations For The Authors):

      The section on "Circuit evolution by duplication and divergence" (starting on line 622) should cite:

      Chakraborty, Mukta, and Erich D. Jarvis. "Brain evolution by brain pathway duplication." Philosophical Transactions of the Royal Society B: Biological Sciences 370, no. 1684 (2015): 20150056.

      and

      Roberts, Ruairí JV, Sinziana Pop, and Lucia L. Prieto-Godino. "Evolution of central neural circuits: state of the art and perspectives." Nature Reviews Neuroscience 23, no. 12 (2022): 725-743.

      It should also reference that the concept originated from genetics:

      Ohno, Susumu. Evolution by gene duplication. Springer Science & Business Media, 1970

      These papers have now been cited: “Duplication and divergence of circuits was also proposed as a possible mechanism for the evolution of brain pathways for vocal learning in song-learning birds, spoken language in humans [@chakraborty2015brain] and other circuits [@roberts2022evolution].”

      and: Our reconstructions identified a potential case for circuit evolution by duplication and divergence [@tosches2017developmental; @roberts2022evolution], a concept that originated from genetics [@ohno1970evolution].

      The terms outgoing and incoming synapses were confusing. The more common terminology is pre and postsynaptic elements. For example, in Fig 1, the label Sensory neuron outgoing and incoming was confusing because I mistakenly thought it was referring to the neurons and I could not figure out what an outgoing sensory neuron was.

      We have now changed ‘incoming’ to ‘postsynaptic’ and ‘outgoing’ to ‘presynaptic’.

      In L-O, there should be an indicator on the figures that they refer to the locations of synaptic sites, as it does in F.

      We have now replaced the labels ‘incoming’ and ‘outgoing’ with ‘presyn’ and ‘postsyn’ for Figure 1 panels L-O to make it clear that these are synaptic sites.

      Figure 2. - last panel of muscle motor - it would be helpful to have names of muscles instead of just having 5 'muscle motor' of different colors

      Each muscle-motor module contains a large number and variety of muscles and motor neurons. Labelling them by the name of individual muscle types is therefore not practical at this resolution. The three-day-old Platynereis larva has 53 different muscle cell types. Their anatomy and classification, together with the details of motoneuron innervation, have been described in detail elsewhere (Jasek et al 2022 https://doi.org/10.7554/eLife.71231).

      Figure 3. D and E are hard to understand from the figure; The shading is the number of neurons; that scale should be shown somewhere.

      We are not sure we understand the comment. These plots are histograms that show the distribution of the number of cells across categories. The y axis is the number of neuronal or non-neuronal cell types in each bin.

      PageRank is an algorithm that Google uses. In Figure 4, it seems to be used to indicate centrality. A brief explanation in the text would be useful.

      We have now added an explanation of the centrality measures used. “PageRank is an algorithm used by Google to rank webpages and scores the number and quality of the incoming links of a node [@page1999pagerank], betweenness centrality measures the number of shortest paths that pass through a node in a graph [@freeman1977set], and authority measures the extent of inputs to a node by hubs in a network [@kleinberg1999authoritative].”
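As a rough illustration of the first of these measures, the sketch below computes PageRank by power iteration on a small directed graph. It is not the analysis code used for the connectome: the four node names, the damping factor of 0.85 and the iteration count are all assumptions chosen for the example.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank for a directed graph.

    edges maps each source node to a list of its target nodes.
    Returns a dict of node -> score; the scores sum to 1.
    """
    nodes = sorted(set(edges) | {t for ts in edges.values() for t in ts})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not edges.get(v))
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        # Every other node passes its score on in equal shares.
        for src, targets in edges.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
        rank = new
    return rank


# Hypothetical toy circuit: two sensory cells converge on one interneuron,
# which in turn drives a single motor cell.
toy_circuit = {
    "sensory1": ["inter"],
    "sensory2": ["inter"],
    "inter": ["motor"],
    "motor": [],
}
ranks = pagerank(toy_circuit)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{node}: {ranks[node]:.3f}")
```

In this toy graph the node at the end of the converging chain receives the highest score and the two input nodes the lowest, which matches the reading of PageRank as a measure of the number and quality of incoming links.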

      Figure 5. The labels on some images are not clear. They are on top of each other and elements of the figure

      We have now moved the position of the labels to minimise overlap. We have also added an interactive html file with the network shown in Figure 5 panel A to help the exploration of the network. Added: “Figure 5—source data 1. Interactive html file with the network shown in panel A.”

      There are differences in line thickness in several figures, such as Figure 9 (A and B) and Figure 12 (D and I and N) that presumably means numbers of synaptic contacts. It would be useful to know what the scale is.

      We have now added labels of line thickness to the networks in Figure 4, Figure 5 – figure supplement 2, Figure 9, Figure 12, Figure 7 – figure supplement 1, Figure 15 and Figure 16.

      Reviewer #2 (Recommendations For The Authors):

      (1) Suggestions for improved or additional experiments, data, or analyses.

      (2) Recommendations for improving the writing and presentation.

      Perhaps we require a comprehensive inventory detailing all the innovations compared to previous, more limited publications, particularly in relation to the 2017 publication and 2020 preprint.

      We have provided this detail in Supplementary table 1 that lists all cell types. We included the reference for previously published cell types in the ‘reference’ column except for those that were also described in the 2020 preprint. The current manuscript is a greatly revised and extended version of the original 2020 preprint. In addition, in the online connectome database (https://catmaid.jekelylab.ex.ac.uk), all cell types that were previously published are annotated with the notation ‘FirstAuthor_et_al_year’.

      It is a bit frustrating given the huge amount of graphs, analyses, tables, and networks that are presented in the manuscript, we do not see much of the original EM pictures except for a few examples of cell type blow-ups. It would be useful for future workers in the field to have eventually a sort of compendium of how the authors actually recognized each cell type, without having to connect to the original CATMAID annotation.

      Most neuronal cell types (with the exception of some characteristic sensory neurons such as photoreceptor cells and mechanosensory cells) were not classified based on ultrastructural features, but on features of neurite morphology, body position and synaptic connectivity. It would therefore not be possible to represent most of the cell types with a single layer of an original EM picture. However, in order to make the morphological skeleton characteristics more accessible to the reader, we have now added a comprehensive website (https://jekelylab.github.io/Platynereis_connectome/) including all cell types together with their interactive 3D rendering.

      “Interactive 3D morphological renderings of each cell type together with their main annotations can also be explored on a webpage (https://jekelylab.github.io/Platynereis_celltype_compendium.html).”

      The Platynereis 3-day larva is obviously only one transient stage in the developmental cycle of the animal, and it is a very specialized stage (called metatrochophore in annelid jargon), during which the animal does not yet feed, relying instead on its copious yolk. Moreover, it is a stage whose purpose is limited to dispersion, with no complex behavior or social interaction that later stages are going to display. While this work represents a substantial leap forward in understanding neural integration in a whole animal, it must be kept in mind that compared to an adult or growing juvenile, there are likely a considerable number of cells, cell types, and neural modules missing in this larva. This is clearly not a weakness of this study per se, but readers may find it interesting to be presented with this perspective and therefore more biological details about the Platynereis life cycle and associated behaviors.

      Obviously, understanding how the constantly developing nervous system of a worm-like Platynereis gets reshuffled in time will be a great subject to investigate. The authors mention that the 3-day larva displays more than 4000 neuronal cells not yet differentiated. Readers may be interested in their location. Are there niches of neural stem cells? A description of what may be missing from the larva in terms of cell types compared to the adult may be useful.

      We have now added further explanation into the Introduction about the early nectochaete larval stage: “The early nectochaete larva represents a transient dispersing stage in the life cycle of Platynereis. During this stage the larvae do not feed yet but rely on maternally provided yolk. Compared to the juvenile and adult stages it is expected that a considerable number of cell types will be only developing or completely missing at this stage. Three-day-old larvae do not yet have sensory palps and other sensory appendages (cirri), they do not crawl or feed and lack visceral muscles and an enteric nervous system.”

      The location of developing neurons is shown in Figure 3—figure supplement 1 panel I.

      Juvenile or adult cell types have not yet been described in any detail that is close to the level of detail we now provide for the nectochaete larva, therefore a meaningful comparison of cell-type complements across stages is not yet feasible.

      (3) Minor corrections to the text and figures.

      Figure 1: "outgoing" not "outgoung" in panels M, O, Q.

      Corrected

      Line 128: We may need a precise definition of "cable length".

      We have included a definition of cable length in the Methods section under a new subheading ‘Quantitative analysis of neuron morphologies’.
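
      For orientation, the cable length of a traced skeleton is commonly computed as the summed length of all edges between connected skeleton nodes. The sketch below assumes that common definition; the manuscript's exact definition is given in its Methods and is not reproduced here. The node coordinates, the units and the `cable_length` helper are hypothetical.

```python
import math

def cable_length(coords, parent_of):
    """Sum of Euclidean distances between each skeleton node and its parent.

    coords:    {node_id: (x, y, z)}, here assumed to be in nanometres
    parent_of: {node_id: parent_id or None}; the root node has parent None
    """
    return sum(math.dist(coords[child], coords[parent])
               for child, parent in parent_of.items()
               if parent is not None)

# Hypothetical four-node skeleton with one branch point (node 2).
coords = {1: (0, 0, 0), 2: (3, 4, 0), 3: (3, 4, 12), 4: (3, 0, 0)}
parent_of = {1: None, 2: 1, 3: 2, 4: 2}

total = cable_length(coords, parent_of)
print(f"cable length: {total} nm")
```

      Because every node except the root contributes exactly one edge, branches are counted once each and the result does not depend on the order in which nodes are listed.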

      In all Figures: information on the orientation of the worm's view is sometimes missing in figures, which could make interpretation difficult for the reader, especially for anterior views with no D/V indication. The authors should indicate the orientation for each panel or provide a general orientation in the figure if all panels are oriented the same.

      We have now added D/V or A/P indication to all figures.

      Figure 23: "right view, left side" is confusing.

      We have changed this to “Each panel shows a ventral (left panel) and a left-side view (right panel).”

      Line 406 : the first mention of the Platynereis cryptic segment, as far as I know, is Saudemont et al, 2008.

      Thank you for pointing this out. We added the citation.

      Figure 45: descending and decussating, 2nd and 3rd line of the legend.

      Corrected

      The format of data source tables is not homogenized, with some files in Excel format and others in plain comma format.

      We have homogenized the file formats of the supplements and source data. We have .csv files or .rds (R data format) files for the more complex data, such as tibble graphs that cannot be represented in a simple .csv format.

    1. Author response:

      eLife Assessment

      This study presents a valuable theoretical exploration on the electrophysiological mechanisms of ionic currents via gap junctions in hippocampal CA1 pyramidal-cell models, and their potential contribution to local field potentials (LFPs) that is different from the contribution of chemical synapses. The biophysical argument regarding electric dipoles appears solid, but the evidence can be more convincing if their predictions are tested against experiments. A shortage of model validation and strictly comparable parameters used in the comparisons between chemical vs. junctional inputs makes the modeling approach incomplete; once strengthened, the finding can be of broad interest to electrophysiologists, who often make recordings from regions of neurons interconnected with gap junctions.

      We gratefully thank the editors and the reviewers for the time and effort in rigorously assessing our manuscript, for the constructive review process, for their enthusiastic responses to our study, and for the encouraging and thoughtful comments. We especially thank you for deeming our study to be a valuable exploration on the differential contributions of active dendritic gap junctions vs. chemical synapses to local field potentials. We thank you for your appreciation of the quantitative biophysical demonstration on the differences in electric dipoles that appear in extracellular potentials with gap junctions vs. chemical synapses.

      However, we are surprised by aspects of the assessment that resulted in deeming the approach incomplete, especially given the following with specific reference to the points raised:

      (1) Testing against experiments: With specific reference to gap junctions, quantitative experimental verification becomes extremely difficult because of the well-established non-specificities associated with gap junctional modulators (Behrens et al., 2011; Rouach et al., 2003). The non-specific actions of gap junctional modulators are tabulated in Table 2 of (Szarka et al., 2021), reproduced below. In addition, genetic knockouts of gap junctional proteins are either lethal or involve functional compensation (Bedner et al., 2012; Lo, 1999), together making causal links to specific gap junctional contributions with currently available techniques infeasible.

      In addition, the complex interactions between co-existing chemical synaptic, gap junctional, and active dendritic contributions from several cell-types make the delineation of the contributions of specific components infeasible with experimental approaches. A computational approach is the only quantitative route to specifically delineate the contributions of individual components to extracellular potentials, as seen from studies that have addressed the question of active dendritic contributions to field potentials (Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Sinha & Narayanan, 2015, 2022) or spiking contributions to local field potentials (Buzsaki et al., 2012; Gold et al., 2006; Schomburg et al., 2012). The biophysically and morphologically realistic computational modeling route is therefore invaluable in assessing the impact of individual components to extracellular field potentials (Einevoll et al., 2019; Halnes et al., 2024).

      Together, we emphasize that the computational modeling route is currently the only quantitative methodology to delineate the contributions of gap junctions vs. chemical synapses to extracellular potentials.

      (2) Model validation: The model used in this study was adopted from a physiologically validated model from our laboratory (Roy & Narayanan, 2021). Please note that the original model was validated against several physiological measurements along the somatodendritic axis. We sincerely regret our oversight in not mentioning clearly that we have used an existing, thoroughly physiologically-validated model from our laboratory in this study.

      (3) Comparisons between chemical vs. junctional inputs: We had taken elaborate precautions in our experimental design to match the intracellular electrophysiological signatures with reference to synchronous as well as oscillatory inputs, irrespective of whether inputs arrived through gap junctions or chemical synapses.

      In a revised manuscript, we will address all the concerns raised by the reviewers in detail. We have provided point-by-point responses to reviewers’ helpful and constructive comments below. We thank the editors and the reviewers for this constructive review process, which we believe will help us in improving our manuscript with specific reference to emphasizing the novelty of our approach and conclusions.

      Reviewer #1 (Public review):

      This manuscript makes a significant contribution to the field by exploring the dichotomy between chemical synaptic and gap junctional contributions to extracellular potentials. While the study is comprehensive in its computational approach, adding experimental validation, network-level simulations, and expanded discussion on implications would elevate its impact further.

      We gratefully thank you for your time and effort in rigorously assessing our manuscript, for the enthusiastic response, and the encouraging and thoughtful comments on our study. In what follows, we have provided point-by-point responses to the specific comments.

      Strengths

      Novelty and Scope

      The manuscript provides a detailed investigation into the contrasting extracellular field potential (EFP) signatures arising from chemical synapses and gap junctions, an underexplored area in neuroscience. It highlights the critical role of active dendritic processes in shaping EFPs, pushing forward our understanding of how electrical and chemical synapses contribute differently to extracellular signals.

      We thank you for the positive comments on the novelty of our approach and how our study addresses an underexplored area in neuroscience. The assumptions about the passive nature of dendritic structures had indeed resulted in an underestimation of the contributions of gap junctions to extracellular potentials. Once the realities of active structures are accounted for, the contributions of gap junctions increase by several orders of magnitude compared to passive structures (Fig. 1D).

      Methodological Rigor

      The use of morphologically and biophysically realistic computational models for CA1 pyramidal neurons ensures that the findings are grounded in physiological relevance. Systematic analysis of various factors, including the presence of sodium, leak, and HCN channels, offers a clear dissection of how transmembrane currents shape EFPs.

      We thank you for your encouraging comments on the experimental design and methodological rigor of our approach.

      Biological Relevance

      The findings emphasize the importance of incorporating gap junctional inputs in analyses of extracellular signals, which have traditionally focused on chemical synapses. The observed polarity differences and spectral characteristics provide novel insights into how neural computations may differ based on the mode of synaptic input.

      We thank you for your positive comments on the biological relevance of our approach. We also gratefully thank you for emphasizing the two striking novelties unveiling the dichotomy between gap junctions and chemical synapses in their contributions to field potentials: polarity differences and spectral characteristics.

      Clarity and Depth

      The manuscript is well-structured, with a logical progression from synchronous input analyses to asynchronous and rhythmic inputs, ensuring comprehensive coverage of the topic.

      We sincerely thank you for the positive comments on the structure and comprehensive coverage of our manuscript encompassing different types of inputs that neurons typically receive.

      Weaknesses and Areas for Improvement

      Generality and Validation

      The study focuses exclusively on CA1 pyramidal neurons. Expanding the analysis to other cell types, such as interneurons or glial cells, would enhance the generalizability of the findings. Experimental validation of the computational predictions is entirely absent. Empirical data correlating the modeled EFPs with actual recordings would strengthen the claims.

      We thank you for raising this important point. The prime novelty and the principal conclusion of this study is that gap junctional contributions to extracellular field potentials are orders of magnitude higher when the active nature of cellular compartments is accounted for. The lacuna in the literature has been consequent to the assumption that cellular compartments are passive, resulting in the dogma that gap junctional contributions to field potentials are negligible. Despite knowledge about active dendritic structures for decades now, this assumption has kept studies from understanding or even exploring the contributions of gap junctions to field potentials. The rationale behind the choice of a computational approach to address the lacuna was as follows:

      (1) The complex interactions between co-existing chemical synaptic, gap junctional, and active dendritic contributions from several cell-types make the delineation of the contributions of specific components infeasible with experimental approaches. A computational approach is the only quantitative route to specifically delineate the contributions of individual components to extracellular potentials, as seen from studies that have addressed the question of active dendritic contributions to field potentials (Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Sinha & Narayanan, 2015, 2022) or spiking contributions to local field potentials (Buzsaki et al., 2012; Gold et al., 2006; Schomburg et al., 2012). The biophysically and morphologically realistic computational modeling route is therefore invaluable in assessing the impact of individual components to extracellular field potentials (Einevoll et al., 2019; Halnes et al., 2024).

      (2) With specific reference to gap junctions, quantitative experimental verification becomes extremely difficult because of the well-established non-specificities associated with gap junctional modulators (Behrens et al., 2011; Rouach et al., 2003). The non-specific actions of gap junctional modulators are tabulated in Table 2 of (Szarka et al., 2021). In addition, genetic knockouts of gap junctional proteins are either lethal or involve functional compensation (Bedner et al., 2012; Lo, 1999), together making causal links to specific gap junctional contributions with currently available techniques infeasible.

      We highlight the novelty of our approach and of the conclusions about differences in extracellular signatures associated with active-dendritic chemical synapses and gap junctions, against these experimental difficulties. We emphasize that the computational modeling route is currently the only quantitative methodology to delineate the contributions of gap junctions vs. chemical synapses to extracellular potentials. Our analyses clearly demonstrate that gap junctions do contribute to extracellular potentials if the active nature of the cellular compartments is explicitly accounted for (Fig. 1D). We also show theoretically well-grounded and mechanistically elucidated differences in polarity (Figs. 1–3) as well as in spectral signatures (Figs. 5–8) of extracellular potentials associated with gap junctional vs. chemical synaptic inputs. Together, our fundamental demonstration in this study is the critical need to account for the active nature of cellular compartments in studying gap junctional contributions to extracellular potentials, with CA1 pyramidal neuronal dendrites used as an exemplar.

      In a revised version of the manuscript, we will emphasize the motivations for the approach we took, highlighting the specific novelties both in methodological and conceptual aspects, finally emphasizing the need to account for other cell types and gap junctional contributions therein. Importantly, we will emphasize the non-specificities associated with gap-junctional blockers as the reason why experimental delineation of gap junctional vs. chemical synaptic contributions to LFP becomes tedious. We hope that these points will underscore the need for the computational approach that we took to address this important question, apart from the novelties of the manuscript.

      Role of Active Dendritic Currents

      The paper emphasizes active dendritic currents, particularly the role of HCN channels in generating outward currents under certain conditions. However, further discussion of how this mechanism integrates into broader network dynamics is warranted.

      We thank you for this constructive suggestion. We agree that it is important to consider the implications for broader network dynamics of the outward HCN currents that are observed with synchronous inputs. In a revised manuscript, we will elaborate on the implications of the outward HCN current to network dynamics in detail.

      Analysis of Plasticity

      While the manuscript mentions plasticity in the discussion, there are no simulations that account for activity-dependent changes in synaptic or gap junctional properties. Including such analyses could significantly enhance the relevance of the findings.

      We thank you for this constructive suggestion. Please note that we have presented consistent results for both fewer and more gap junctions in our analyses (Figure 1 with 217 gap junctions and Supplementary Figure 1 with 99 gap junctions). Thus, our fundamentally novel result that gap junctions onto active dendrites differentially shape LFPs holds true irrespective of the relative density of gap junctions onto the neuron. These results therefore demonstrate that the conclusions about gap junctional contributions to LFPs are invariant to plasticity in the number of gap junctions.

      We had only briefly mentioned plasticity in the Introduction to highlight the different modes of synaptic transmission and to emphasize that plasticity has been studied in both chemical synapses and gap junctions, playing a role in learning and adaptation. However, if this wording inadvertently suggests that our study includes plasticity simulations, we will remove it from the Introduction in the updated manuscript to ensure clarity.

      In the ‘Limitations of analyses and future studies’ section in Discussion, we suggested investigating the impact of plasticity mechanisms—specifically, activity-dependent plasticity of ion channels—on synaptic receptors vs. gap junctions and their effects on extracellular field potentials under various input conditions and plasticity combinations across different structures. We fully agree with the reviewer that such studies would offer valuable insights and further enhance the broader relevance of our findings. However, while our study implies this direction, it was not the primary focus of our investigation.

      In the revised manuscript, we will expand on intrinsic/synaptic plasticity and how they could contribute to LFPs (Sinha & Narayanan, 2015, 2022), while also pointing to simulations with different numbers of gap junctions in this context.

      Frequency-Dependent Effects

      The study demonstrates that gap junctional inputs suppress high-frequency EFP power due to membrane filtering. However, it could delve deeper into the implications of this for different brain rhythms, such as gamma or ripple oscillations.

      We sincerely thank you for these insightful comments, with which we fully agree. As it happens, this manuscript forms the first part of a broader study where we explore the implications of gap junctions for ripple-frequency oscillations. The ripple oscillations part of the work was presented as a poster at the Society for Neuroscience (SfN) annual meeting 2024 (Sirmaur & Narayanan, 2024). There, we simulate a neuropil made of hundreds of morphologically realistic neurons to assess the role of different synaptic inputs (excitatory, inhibitory, and gap junctional) and active dendrites in ripple-frequency oscillations. We demonstrate there that the conclusions from single-neuron simulations in this current manuscript extend to a neuropil with several neurons, each receiving excitatory, inhibitory and gap-junctional inputs, especially with reference to high-frequency oscillations. Our network-based analyses unveiled a dominant mediatory role of patterned inhibition in ripple generation, with recurrent excitations through chemical synapses and gap junctions in conjunction with return-current contributions from active dendrites playing regulatory roles in determining ripple characteristics (Sirmaur & Narayanan, 2024).

      Our principal goal in this study, therefore, was to lay the single-neuron foundation for network analyses of the impact of gap junctions on LFPs. We are preparing the network part of the study, with a strong focus on ripple-frequency oscillations, for submission for peer review separately.

      In a revised manuscript, we will mention the results from our SfN abstract with reference to network simulations and high-frequency oscillations, while also presenting discussions from other studies on the role of gap junctions in synchrony and LFP oscillations.

      Visualization

      Figures are dense and could benefit from more intuitive labeling and focused presentations. For example, isolating key differences between chemical and gap junctional inputs in distinct panels would improve clarity.

      We thank you for this constructive suggestion. In the revised manuscript, we will enhance the visualization of the figures to ensure a clearer and more intuitive distinction between chemical synapses and gap junctions.

      Contextual Relevance

      The manuscript touches on how these findings relate to known physiological roles of gap junctions (e.g., in gamma rhythms) but does not explore this in depth. Stronger integration of the results into known neural network dynamics would enhance its impact.

      We sincerely appreciate your valuable suggestion and acknowledge the importance of integrating our results into established neural network dynamics, particularly their implications for gamma rhythms. We will address this aspect more comprehensively in the revised version of our manuscript.

      Reviewer #2 (Public review):

      This computational work examines whether the inputs that neurons receive through electrical synapses (gap junctions) have different signatures in the extracellular local field potential (LFP) compared to inputs via chemical synapses. The authors present the results of a series of model simulations where either electric or chemical synapses targeting a single hippocampal pyramidal neuron are activated in various spatio-temporal patterns, and the resulting LFP in the vicinity of the cell is calculated and analyzed. The authors find several notable qualitative differences between the LFP patterns evoked by gap junctions vs. chemical synapses. For some of these findings, the authors demonstrate convincingly that the observed differences are explained by the electric vs. chemical nature of the input, and these results likely generalize to other cell types. However, in other cases, it remains plausible (or even likely) that the differences are caused, at least partly, by other factors (such as different intracellular voltage responses due to, e.g., the unequal strengths of the inputs). Furthermore, it was not immediately clear to me how the results could be applied to analyze more realistic situations where neurons receive partially synchronized excitatory and inhibitory inputs via chemical and electric synapses.

      We gratefully thank you for your time and effort in rigorously assessing our manuscript, for the enthusiastic response, and the encouraging and thoughtful comments on our study. In what follows, we have provided point-by-point responses to the specific comments.

      Strengths

      The main strength of the paper is that it draws attention to the fact that inputs to a neuron via gap junctions are expected to give rise to a different extracellular electric field compared to inputs via chemical synapses, even if the intracellular effects of the two types of input are similar. This is because, unlike chemical synaptic inputs, inputs via gap junctions are not directly associated with transmembrane currents. This is a general result that holds independent of many details such as the cell types or neurotransmitters involved.

      We gratefully thank you for the positive comments and the encouraging words about the novel contributions of our study. We are particularly thankful to you for your comment on the generality of our conclusions that hold for different cell types and neurotransmitters involved.

      Another strength of the article is that the authors attempt to provide intuitive, non-technical explanations of most of their findings, which should make the paper readable also for non-expert audiences (including experimentalists).

      We sincerely thank you for the positive comments about the readability of the paper.

      Weaknesses

      The most problematic aspect of the paper relates to the methodology for comparing the effects of electric vs. chemical synaptic inputs on the LFP. The authors seem to suggest that the primary cause of all the differences seen in the various simulation experiments is the different nature of the input, and particularly the difference between the transmembrane current evoked by chemical synapses and the gap junctional current that does not involve the extracellular space. However, this is clearly an oversimplification: since no real attempt is made to quantitatively match the two conditions that are compared (e.g., regarding the strength and temporal profile of the inputs), the differences seen can be due to factors other than the electric vs. chemical nature of synapses. In fact, if inputs were identical in all parameters other than the transmembrane vs. directly injected nature of the current, the intracellular voltage responses and, consequently, the currents through voltage-gated and leak currents would also be the same, and the LFPs would differ exactly by the contribution of the transmembrane current evoked by the chemical synapse. This is evidently not the case for any of the simulated comparisons presented, and the differences in the membrane potential response are rather striking in several cases (e.g., in the case of random inputs, there is only one action potential with gap junctions, but multiple action potentials with chemical synapses). Consequently, it remains unclear which observed differences are fundamental in the sense that they are directly related to the electric vs. chemical nature of the input, and which differences can be attributed to other factors such as differences in the strength and pattern of the inputs (and the resulting difference in the neuronal electric response).

      We thank you for raising this important point. We would like to emphasize that our experimental design and analyses quantitatively account for the spatial distribution and temporal pattern of specific kinds of inputs that arrive through gap junctions and chemical synapses. We submit that our analyses quantitatively demonstrate that the fundamental difference between the gap junctional and chemical synaptic contributions to extracellular potentials is the absence of the direct transmembrane component from gap junctional inputs. We elucidate these points below:

      (1) Spatial distribution: The inputs were distributed randomly across the basal dendrites, irrespective of whether they were through gap junctions or chemical synapses. For both chemical synapses and gap junctions, the inputs were of the same nature: excitatory.

      (2) Different numbers of inputs: We have presented consistent results for both fewer and more gap junctions or chemical synapses in our analyses (see Figure 1 with 217 gap junctions or 245 chemical synapses and Supplementary Figure 2 with 99 gap junctions or 30 chemical synapses). Our fundamentally novel result that gap junctions onto active dendrites shape LFPs holds true irrespective of the relative density of gap junctions onto the neuron.

      (3) Synchronous inputs (Figs. 1–3): For chemical synapses, the waveforms are in the shape of postsynaptic potentials. For gap junctional inputs, the waveforms are in the shape of postsynaptic potentials or dendritic spikes (to respect the active nature of inputs from the other cell). Here, the electrical response of the postsynaptic cell is identical irrespective of whether inputs arrive through gap junctions or chemical synapses: an action potential. We quantitatively matched the strengths such that the model generated a single action potential in response to synchronous inputs, irrespective of whether they arrived through chemical synaptic or gap junctional inputs. We mechanistically analyze the contributions of different cellular components and show that the direct transmembrane current in chemical synapses is the distinguishing factor that determines the dichotomy between the contributions of gap junctions vs. chemical synapses to extracellular potentials (Figs. 2–3). In a revised manuscript, we will show the intracellular responses to demonstrate that they are electrically matched.

      (4) Random inputs (Fig. 4): For random inputs, we did not account for the number of action potentials that arrived, as the only observation we made here was with reference to the biphasic nature of the extracellular potentials with gap junctional inputs in the “No Sodium” scenario. We note that in the “No Sodium” scenario, the time-domain amplitudes were comparable for the field potentials (Fig. 4B, Fig. 4D).

      (5) Rhythmic inputs (Figs. 5–8): For rhythmic inputs, please note that the intracellular and extracellular waveforms for every frequency are provided in supplementary figures S5–S11. It may be noted that the intracellular responses are comparable. In simulations for assessing spike-LFP comparison, we tuned the strengths to produce a single spike per cycle, ensuring fair comparison of LFPs with gap junctions vs. chemical synapses.

      Taken together, we demonstrate through explicit sets of simulations and analyses that the differences in LFPs were not driven by the strength or patterns of the inputs but rather by the differences in direct transmembrane currents, which are subsequently reflected in the LFPs. In a revised manuscript, we will add a section to emphasize these points apart from providing intracellular traces for cases where they are not provided.
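
      The role of the direct transmembrane component can be illustrated with a toy point-source calculation. This is only a sketch under stated assumptions and is not the morphologically realistic model used in the study: it assumes a homogeneous extracellular medium, the standard point-source approximation (potential = I / (4πσr)), two compartments, and hypothetical values for the conductivity, the input current, the geometry and the split of return currents. The return currents are deliberately taken to be identical for the two input types, so that the only difference is the inward synaptic current at the input site.

```python
import math

SIGMA = 0.3   # S/m; assumed extracellular conductivity
I_IN = 1e-9   # A; assumed input current (1 nA)

DEND = (0.0, 0.0, 0.0)         # dendritic compartment receiving the input (m)
SOMA = (0.0, 100e-6, 0.0)      # somatic compartment, 100 um away
ELECTRODE = (20e-6, 0.0, 0.0)  # recording site 20 um from the dendrite

def potential(sources, electrode):
    """Point-source potential (V): sum of I / (4*pi*sigma*r) over sources.

    Each source is (position, current). Outward transmembrane current is
    positive (a source); inward current is negative (a sink).
    """
    return sum(i / (4 * math.pi * SIGMA * math.dist(pos, electrode))
               for pos, i in sources)

# Outward return currents, assumed identical for both input types.
returns = [(DEND, 0.3 * I_IN), (SOMA, 0.7 * I_IN)]
# Direct inward synaptic current, present only for the chemical synapse.
synaptic_sink = [(DEND, -I_IN)]

# Gap junction: current enters intracellularly from the coupled cell (whose
# own membrane currents are not modelled here), so only return currents
# cross the membrane of this cell.
phi_gap = potential(returns, ELECTRODE)
# Chemical synapse: the same return currents plus the synaptic sink.
phi_chem = potential(returns + synaptic_sink, ELECTRODE)
phi_sink = potential(synaptic_sink, ELECTRODE)

print(f"gap junction input:     {phi_gap * 1e6:+.2f} uV")
print(f"chemical synapse input: {phi_chem * 1e6:+.2f} uV")
```

      Under these assumptions the potential near the input site is positive for the gap junctional input and negative for the chemical synaptic input, and the two differ exactly by the contribution of the synaptic sink.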

      Some of the explanations offered for the effects of cellular manipulations on the LFP appear to be incomplete. More specifically, the authors observed that blocking leak channels significantly changed the shape of the LFP response to synchronous synaptic inputs - but only when electric inputs were used, and when sodium channels were intact. The authors seemed to attribute this phenomenon to a direct effect of leak currents on the extracellular potential - however, this appears unlikely both because it does not explain why blocking the leak conductance had no effect in the other cases, and because the leak current is several orders of magnitude smaller than the spike-generating currents that make the largest contributions to the LFP. An indirect effect mediated by interactions of the leak current with some voltage-gated currents appears to be the most likely explanation, but identifying the exact mechanism would require further simulation experiments and/or a detailed analysis of intracellular currents and the membrane potential in time and space.

      We thank you for raising this important question. Leak channels were among the several contributors to the positive deflection observed in LFPs associated with gap junctions. This effect was present not only in gap junctional models with intact sodium conductance but also in the no-sodium model, where the amplitude of the positive deflection was reduced across other models as well (Fig. 2F, I). Furthermore, even in the absence of leak conductance, a small positive deflection was still observed (Fig. 2F), leading us to further investigate other transmembrane currents over time and across spatial locations, from the proximal to the distal dendritic ends relative to the soma (Fig. 3D). We had observed that the dominant contributor in the case of chemical synapses was the inward synaptic current (Fig. 3A), whereas for gap junctions, the primary contributors were leak conductance along with other outward currents, such as potassium and HCN currents (Fig. 3D). Together, the direct transmembrane component of chemical synapses provides a dominant contribution to extracellular potentials. This dominance translates to differences in the relative contributions of indirect currents (including leak currents) to extracellular potentials associated with chemical synaptic vs. gap junctional inputs. Our analyses of the exact ionic mechanisms (Fig. 3) demonstrate the involvement of several ion channels contributing to the indirect component in either scenario.

      In every simulation experiment in this study, inputs through electric synapses are modeled as intracellular current injections of pre-determined amplitude and time course based on the sampled dendritic voltage of potential synaptic partners. This is a major simplification that may have a significant impact on the results. First, the current through gap junctions depends on the voltage difference between the two connected cellular compartments and is thus sensitive to the membrane potential of the cell that is treated as the neuron "receiving" the input in this study (although, strictly speaking, there is no pre- or postsynaptic neuron in interactions mediated by gap junctions). This dependence on the membrane potential of the target neuron is completely missing here. A related second point is that gap junctions also change the apparent membrane resistance of the neurons they connect, effectively acting as additional shunting (or leak) conductance in the relevant compartments. This effect is completely missed by treating gap junctions as pure current sources.

      We thank you for raising this important point. We agree with the analyses presented by the reviewer on the importance of network simulations and bidirectional gap junctions that respect the voltages in both neurons. However, the complexities of LFP modeling preclude modeling of networks of morphologically realistic models with patterns of stimulations occurring across the dendritic tree. LFP modeling studies predominantly use “post-synaptic” currents to analyze the impact of different patterns of inputs arriving on to a neuron, even when chemical synaptic inputs are considered. Explicitly, individual neurons are separately simulated with different patterns of synaptic inputs, the transmembrane current at different locations is recorded, and the extracellular potential is then computed using line source approximation (Buzsaki et al., 2012; Gold et al., 2006; Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Schomburg et al., 2012; Sinha & Narayanan, 2015, 2022). Even in scenarios where a network is analyzed, a hybrid approach involving the outputs of a point-neuron-based network being coupled to an independent morphologically realistic neuronal model is employed (Hagen et al., 2016; Martinez-Canada et al., 2021; Mazzoni et al., 2015). Given the complexities associated with the computation of electrode potentials arising as a distance-weighted summation of several transmembrane currents, these simplifications become essential.
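
      The distance-weighted summation described above can be illustrated with a minimal sketch. It swaps in the simpler point-source approximation rather than the line source approximation cited above, and the function name, units, and the conductivity of 0.3 S/m are illustrative assumptions, not values taken from the manuscript. Outward transmembrane current is treated as positive.

```python
import math

def extracellular_potential(currents_nA, positions_um, electrode_um,
                            sigma_S_per_m=0.3):
    """Point-source estimate: phi = (1 / (4*pi*sigma)) * sum_n I_n / r_n.

    currents_nA   -- transmembrane current per compartment (outward > 0)
    positions_um  -- (x, y, z) of each compartment, in micrometres
    electrode_um  -- (x, y, z) of the recording site, in micrometres
    sigma_S_per_m -- extracellular conductivity (assumed value)
    Returns the potential in microvolts.
    """
    phi_V = 0.0
    for i_nA, pos in zip(currents_nA, positions_um):
        r_m = math.dist(pos, electrode_um) * 1e-6  # um -> m
        phi_V += (i_nA * 1e-9) / (4.0 * math.pi * sigma_S_per_m * r_m)
    return phi_V * 1e6  # V -> uV


# A current sink near the electrode with a distant return current gives a
# negative deflection; reversing the signs gives a positive one.
sink_near = extracellular_potential([-1.0, 1.0],
                                    [(0, 0, 0), (0, 200, 0)], (50, 0, 0))
source_near = extracellular_potential([1.0, -1.0],
                                      [(0, 0, 0), (0, 200, 0)], (50, 0, 0))
```

      The sketch assumes the electrode never coincides with a compartment, since the 1/r term diverges at zero distance.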

      Our approach models gap junctional currents in the same way that other models incorporate synaptic currents in LFP modeling (Buzsaki et al., 2012; Gold et al., 2006; Halnes et al., 2024; Ness et al., 2018; Reimann et al., 2013; Schomburg et al., 2012; Sinha & Narayanan, 2015, 2022). As gap junctions are typically implemented as resistors from the other neuronal compartment, we accounted for gap-junctional variability in our model by randomizing the scaling factors and the exact waveforms that arrive through individual gap junctions at specific locations. Thus, the inputs were not pre-determined by “pre” neurons. Instead, the recorded voltages from potential synaptic partner neurons were randomized across locations and scaled using factors at the dendrites before being injected into the target neuron (Supplementary Fig. S1). While incorporating a network of interconnected neurons is indeed important, we utilized a biophysical, morphologically realistic CA1 neuron model with different sets of input patterns to model LFPs, which were derived from the total transmembrane currents across all compartments of the multi-compartmental neuron model. Given the complexity of this approach, adding further network-level interactions or pre-post connections would have been computationally demanding.

      In a revised manuscript, we will introduce the general methodology used in LFP modeling studies to introduce synaptic currents. We will emphasize that our study extends this approach to modeling gap junctional inputs, while also highlighting randomization of locations and the scaling process in assigning gap junctional synaptic strengths.
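
      The distinction at issue in this exchange, between a scaled waveform injected as current and an ohmic gap junction whose current follows the voltage difference, can be made concrete with a short sketch. Both function names, the units, and the numerical values are hypothetical and chosen only for illustration; this is not the implementation used in the study.

```python
def injected_current(partner_waveform_mV, scale_nA_per_mV):
    """Scaled-waveform input: depends only on the partner's sampled voltage."""
    return [scale_nA_per_mV * v for v in partner_waveform_mV]


def gap_junction_current(v_partner_mV, v_target_mV, g_gap_uS):
    """Ohmic coupling: I = g * (V_partner - V_target); uS * mV gives nA."""
    return [g_gap_uS * (vp - vt) for vp, vt in zip(v_partner_mV, v_target_mV)]


# Same partner waveform, two different target states.
partner = [-65.0, 0.0]
at_rest = gap_junction_current(partner, [-65.0, -65.0], g_gap_uS=0.001)
depolarized = gap_junction_current(partner, [-65.0, 0.0], g_gap_uS=0.001)
fixed = injected_current(partner, scale_nA_per_mV=0.001)
```

      With the ohmic form, the coupling current vanishes once the target reaches the partner's voltage, whereas the injected waveform is unchanged by the target's state. This is the dependence the reviewer describes.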

      One prominent claim of the article that is emphasized even in the abstract is that HCN channels mediate an outward current in certain cases. Although this statement is technically correct, there are two reasons why I do not consider this a major finding of the paper. First, as the authors acknowledge, this is a trivial consequence of the relatively slow kinetics of HCN channels: when at least some of the channels are open, any input that is sufficiently fast and strong to take the membrane potential across the reversal potential of the channel will lead to the reversal of the polarity of the current. This effect is quite generic and well-known and is by no means specific to gap junctional inputs or even HCN channels. Second, and perhaps more importantly, the functional consequence of this reversed current through HCN channels is likely to be negligible. As clearly shown in Supplementary Figure S3, the HCN current becomes outward only for an extremely short time period during the action potential, which is also a period when several other currents are also active and likely dominant due to their much higher conductances. I also note that several of these relevant facts remain hidden in Figure 3, both because of its focus on peak values, and because of the radically different units on the vertical axes of the current plots.

      We thank you for raising this point and agree with you on every point. Please note that we do not assert that the outward HCN currents are exclusively associated with gap junctional inputs. Rather, our results show that synchronous inputs generate outward HCN currents in both chemical synapses (Fig. 3B; positive/outward HCN currents, except in the no sodium or leak model) and gap junctions (Fig. 3D; positive/outward HCN currents). We emphasized this in the case of gap junctions because, in the absence of inward synaptic currents, HCN (acting as outward currents with synchronous inputs) contributed to the positive deflection observed in the LFPs. While HCN would also contribute in the case of chemical synapses, its effect was negligible due to the presence of large inward synaptic currents. Since LFPs reflect the collective total transmembrane currents, the dominant contributors differ between these two scenarios, which we aimed to highlight. Since HCN exhibited outward currents in our synchronous input simulations, we have elaborated on this mechanism in the supplementary figure (Fig. S3). Our intention was not to emphasize this effect for only one synaptic mode but rather to highlight HCN's contribution to the positive deflection as one of the contributing factors.

      We agree that HCN currents are relatively small in magnitude; therefore, our conclusions were based on HCN being one of the several contributing factors. Leak conductance and other outward conductances, including HCN currents (Fig. 3D), collectively contribute to the positive deflections observed in the case of gap junctional synchronous inputs.
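
      The sign reversal discussed here follows directly from the ohmic form of the current. The sketch below is illustrative only: the reversal potential of -30 mV, the conductance, and the open fraction are assumed values, and the open fraction is held fixed to mimic slow HCN kinetics during a fast event. Positive values denote outward current.

```python
def hcn_current(v_mV, open_fraction, g_max_uS=0.01, e_h_mV=-30.0):
    """I_h = g_max * open_fraction * (V - E_h); positive means outward (nA)."""
    return g_max_uS * open_fraction * (v_mV - e_h_mV)


# Slow gating: the open fraction barely changes during a spike,
# so only the driving force (V - E_h) changes sign.
open_fraction = 0.3                              # assumed, fixed during the spike
i_rest = hcn_current(-65.0, open_fraction)       # below E_h: inward
i_spike_peak = hcn_current(20.0, open_fraction)  # above E_h: outward
```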

      We will account for all of these points appropriately in a revised manuscript.

      Finally, I missed an appropriate validation of the neuronal model used, and also the characterization of the effects of the in silico manipulations used on the basic behavior of the model. As far as I understand, the model in its current form has not been used in other studies. If this is the case, it would be important to demonstrate convincingly through (preferably quantitative) comparisons with experimental data using different protocols that the model captures the physiological behavior of at least the relevant compartments (in this case, the dendrites and the soma) of hippocampal pyramidal neurons sufficiently well that the results of the modeling study are relevant to the real biological system. In addition, the correct interpretation of various manipulations of the model would be strongly facilitated by investigating and discussing how the physiological properties of the model neuron are affected by these alterations.

      We thank you for raising this important point. The CA1 pyramidal neuronal model used in this study is built with ion-channel models derived from biophysical and electrophysiological recordings from these cells. As mentioned in the Methods section “Dynamics and distribution of active channels” and Supplementary Table S1, models for individual channels, their gating kinetics, and channel distributions across the somatodendritic arbor (wherever known) are all derived from their physiological equivalents. Importantly, these values were derived from previously validated models from the laboratory, which contain these very ion channel models and the exact same morphology (Roy & Narayanan, 2021). Please compare Supplementary Table S1 with the Table 1 from (Roy & Narayanan, 2021). Please note that this model was validated against several physiological measurements along the somatodendritic axis (Fig. 1 of (Roy & Narayanan, 2021)).

      In a revised manuscript, we will explicitly mention this while also mentioning the different physiological properties that were used for the validation process from (Roy & Narayanan, 2021). We sincerely regret not mentioning these details in the current version of our manuscript.

      We will fix these in a revised version of the manuscript.

      References

      Bedner, P., Steinhauser, C., & Theis, M. (2012). Functional redundancy and compensation among members of gap junction protein families? Biochim Biophys Acta, 1818(8), 1971-1984. https://doi.org/10.1016/j.bbamem.2011.10.016

      Behrens, C. J., Ul Haq, R., Liotta, A., Anderson, M. L., & Heinemann, U. (2011). Nonspecific effects of the gap junction blocker mefloquine on fast hippocampal network oscillations in the adult rat in vitro. Neuroscience, 192, 11-19. https://doi.org/10.1016/j.neuroscience.2011.07.015

      Buzsaki, G., Anastassiou, C. A., & Koch, C. (2012). The origin of extracellular fields and currents--EEG, ECoG, LFP and spikes. Nat Rev Neurosci, 13(6), 407-420. https://doi.org/10.1038/nrn3241

      Einevoll, G. T., Destexhe, A., Diesmann, M., Grun, S., Jirsa, V., de Kamps, M., Migliore, M., Ness, T. V., Plesser, H. E., & Schurmann, F. (2019). The Scientific Case for Brain Simulations. Neuron, 102(4), 735-744. https://doi.org/10.1016/j.neuron.2019.03.027

      Gold, C., Henze, D. A., Koch, C., & Buzsaki, G. (2006). On the origin of the extracellular action potential waveform: A modeling study. J Neurophysiol, 95(5), 3113-3128. https://doi.org/10.1152/jn.00979.2005

      Hagen, E., Dahmen, D., Stavrinou, M. L., Linden, H., Tetzlaff, T., van Albada, S. J., Grun, S., Diesmann, M., & Einevoll, G. T. (2016). Hybrid Scheme for Modeling Local Field Potentials from Point-Neuron Networks. Cereb Cortex, 26(12), 4461-4496. https://doi.org/10.1093/cercor/bhw237

      Halnes, G., Ness, T. V., Næss, S., Hagen, E., Pettersen, K. H., & Einevoll, G. T. (2024). Electric Brain Signals: Foundations and Applications of Biophysical Modeling. Cambridge University Press. https://doi.org/10.1017/9781009039826

      Lo, C. W. (1999). Genes, gene knockouts, and mutations in the analysis of gap junctions. Dev Genet, 24(1-2), 1-4. https://doi.org/10.1002/(SICI)1520-6408(1999)24:1/2<1::AID-DVG1>3.0.CO;2-U

      Martinez-Canada, P., Ness, T. V., Einevoll, G. T., Fellin, T., & Panzeri, S. (2021). Computation of the electroencephalogram (EEG) from network models of point neurons. PLoS Comput Biol, 17(4), e1008893. https://doi.org/10.1371/journal.pcbi.1008893

      Mazzoni, A., Linden, H., Cuntz, H., Lansner, A., Panzeri, S., & Einevoll, G. T. (2015). Computing the Local Field Potential (LFP) from Integrate-and-Fire Network Models. PLoS Comput Biol, 11(12), e1004584. https://doi.org/10.1371/journal.pcbi.1004584

      Ness, T. V., Remme, M. W. H., & Einevoll, G. T. (2018). h-Type Membrane Current Shapes the Local Field Potential from Populations of Pyramidal Neurons. J Neurosci, 38(26), 6011-6024. https://doi.org/10.1523/jneurosci.3278-17.2018

      Reimann, M. W., Anastassiou, C. A., Perin, R., Hill, S. L., Markram, H., & Koch, C. (2013). A biophysically detailed model of neocortical local field potentials predicts the critical role of active membrane currents. Neuron, 79(2), 375-390. https://doi.org/10.1016/j.neuron.2013.05.023

      Rouach, N., Segal, M., Koulakoff, A., Giaume, C., & Avignone, E. (2003). Carbenoxolone blockade of neuronal network activity in culture is not mediated by an action on gap junctions. Journal of Physiology, 553(Pt 3), 729-745. https://doi.org/10.1113/jphysiol.2003.053439

      Roy, A., & Narayanan, R. (2021). Spatial information transfer in hippocampal place cells depends on trial-to-trial variability, symmetry of place-field firing, and biophysical heterogeneities. Neural Netw, 142, 636-660. https://doi.org/10.1016/j.neunet.2021.07.026

      Schomburg, E. W., Anastassiou, C. A., Buzsaki, G., & Koch, C. (2012). The spiking component of oscillatory extracellular potentials in the rat hippocampus. J Neurosci, 32(34), 11798-11811. https://doi.org/10.1523/JNEUROSCI.0656-12.2012

      Sinha, M., & Narayanan, R. (2015). HCN channels enhance spike phase coherence and regulate the phase of spikes and LFPs in the theta-frequency range. Proc Natl Acad Sci U S A, 112(17), E2207-2216. https://doi.org/10.1073/pnas.1419017112

      Sinha, M., & Narayanan, R. (2022). Active Dendrites and Local Field Potentials: Biophysical Mechanisms and Computational Explorations. Neuroscience, 489, 111-142. https://doi.org/10.1016/j.neuroscience.2021.08.035

      Sirmaur, R., & Narayanan, R. (2024). Distinct extracellular signatures of chemical and electrical synapses impinging on active dendrites differentially contribute to ripple-frequency oscillations. Society for Neuroscience annual meeting (https://www.abstractsonline.com/pp8/?_gl=1*1bxo7m*_gcl_au*MTc5MTQ0NjE0NC4xNzI3MDcwOTMw*_ga*MTMxMTE5OTcyMy4xNzI3MDcwOTMx*_ga_T09K3Q2WDN*MTcyNzA3MDkzMS4xLjEuMTcyNzA3MDkzNy41NC4wLjA.#!/20433/presentation/13949), Chicago, USA.

      Szarka, G., Balogh, M., Tengolics, A. J., Ganczer, A., Volgyi, B., & Kovacs-Oller, T. (2021). The role of gap junctions in cell death and neuromodulation in the retina. Neural Regen Res, 16(10), 1911-1920. https://doi.org/10.4103/1673-5374.308069

    1. Author response:

      The following is the authors’ response to the current reviews.

      Response to Reviewer 2’s comments:

      I am concerned that the results in Figure 8D may not be correct, or that the authors may be mis-interpreting them. From my reading of the paper they cite (Lammers & Flamholz 2023), the equilibrium sharpness limit for the network they consider in Figure 8 should be 0.25. But both solutions shown in Figure 8D fall below this limit, which means that they have sharpness levels that could have been achieved with no energy expenditure. If this is the case, then it would imply that while both systems do dissipate energy, they are not doing so productively; meaning that the same results could be achieved while holding Phi=0.

      I acknowledge that this could be due to a difference in how they measure sharpness, but wanted to raise it here in case it is, in fact, a genuine issue with the analysis. There should be an easy fix for this: just set the sharper "desired response" curve in 8b to be such that it demands non-equilibrium sharpness levels (0.25<S<0.5).

      Thank you for raising this point regarding the interpretation of our results in Figure 8D. We agree that if the equilibrium sharpness limit for this particular network is around 0.25 (as shown by Lammers & Flamholz 2023), then achieving a sharpness below this threshold could, in principle, be accomplished without any energy expenditure. However, in our current design approach, the loss function is solely designed to enforce agreement with a target mean mRNA level at different input concentrations; it does not explicitly constrain energy dissipation, noise, or other metrics. Consequently, the DGA has no built-in incentive to minimize or optimize energy consumption, which means the resulting solutions may dissipate energy without exceeding the equilibrium sharpness limit.

      In other words, the same input–output relationship could theoretically be achieved with \Phi = 0 if an explicit constraint or regularization term penalizing energy usage had been included. As noted, adding such a term (e.g., penalizing \Phi^2) is conceptually straightforward but falls outside the scope of this study. Our primary goal is to demonstrate the flexibility of the DGA in designing a desired response, rather than to delve into energy–sharpness trade-offs or other biological considerations.
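
      A penalized loss of the kind mentioned here could take the following form. This is a minimal sketch under stated assumptions: the function name, the mean-squared-error data term, and the `penalty_weight` parameter are hypothetical and do not reproduce the loss used in the manuscript.

```python
def design_loss(predicted_mean_mrna, target_mean_mrna, phi, penalty_weight=0.0):
    """Mean-squared error to the target response plus a quadratic penalty on Phi.

    penalty_weight = 0 recovers a loss that ignores energy dissipation.
    """
    n = len(target_mean_mrna)
    mse = sum((p - t) ** 2
              for p, t in zip(predicted_mean_mrna, target_mean_mrna)) / n
    return mse + penalty_weight * phi ** 2


# Two designs with identical responses but different dissipation.
target = [1.0, 5.0, 9.0]
unpenalized = design_loss(target, target, phi=3.0)
penalized = design_loss(target, target, phi=3.0, penalty_weight=0.1)
```

      With a zero weight the two designs are indistinguishable to the optimizer, which is why dissipative solutions can appear even when an equilibrium solution would fit the target equally well.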

      While we appreciate the suggestion to set a higher target sharpness that exceeds the equilibrium limit, we believe the current example effectively demonstrates the DGA’s ability to design circuits with desired input-output relationships, which is the primary focus of this study. Researchers interested in optimizing energy efficiency, burst size, burst frequency, noise, response time, mutual information, or other system properties can easily extend our approach by incorporating additional terms into the loss function to target these specific objectives.

      We hope this explanation addresses your concern and clarifies that the manuscript provides sufficient context for readers to interpret the results in Figure 8D correctly.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      We thank Reviewer #1 for their thoughtful feedback and appreciation of the manuscript's clarity. Our primary goal is to introduce the DGA as a foundational tool for integrating stochastic simulations with gradient-based optimization. While we recognize the value of providing detailed comparisons with existing methods and a deeper analysis of the DGA’s limitations (such as rare event handling), these topics are beyond the scope of this initial work. Our focus is on presenting the core concept and demonstrating its potential, leaving more extensive evaluations for future research.

      Reviewer #2 (Public review):

      We thank Reviewer #2 for their detailed and constructive feedback. We appreciate the recognition of the DGA as a significant conceptual advancement for stochastic biochemical network analysis and design.

      Weaknesses:

      (1) Validation of DGA robustness in complex systems:

      Our primary goal is to introduce the DGA framework and demonstrate its feasibility. While validation on high-dimensional and non-steady-state systems is important, it is beyond the scope of this initial work. Future studies may improve scalability by employing techniques such as dynamically adjusting the smoothness of the DGA's approximations during simulation or using surrogate models that remain differentiable but more accurately capture discrete behaviors in critical regions, thus preserving gradient computation while improving accuracy.

      (2) Inference accuracy and optimization:

      We acknowledge that the non-convex loss landscape in the DGA can hinder parameter inference and convergence to global minima, as seen in Figure 5A. While techniques like multi-start optimization or second-order methods (e.g., L-BFGS) could improve performance, our focus here is on establishing the DGA framework. We plan to explore better optimization methods in future work to improve the accuracy of parameter inference in complex systems.

      (3) Use of simple models for demonstration:

      We selected well-understood systems to clearly illustrate the capabilities of the DGA. These examples were intended to demonstrate how the DGA can be applied, rather than to solve problems better addressed by analytical methods. Applying DGA to more complex, analytically intractable systems is an exciting avenue for future work, but introducing the method was our main objective in this study.

      Reviewer #3 (Public review):

      We thank the reviewer for their detailed and insightful feedback. We appreciate the recognition of the DGA as a significant advancement for enabling gradient-based optimization in stochastic systems.

      Weaknesses:

      (1) Application beyond steady-state analysis

      We acknowledge the limitation of focusing solely on steady-state properties. To extend the DGA for analyzing transient dynamics, time-dependent loss functions can be incorporated to capture system evolution over time. This could involve aligning simulated trajectories with experimental time-series data or using moment-matching across multiple time points. 

      (2) Numerical instability in gradient computation

      The reviewer correctly highlights that large sharpness parameters (a and b) in the sigmoid and Gaussian approximations can induce numerical instability due to vanishing or exploding gradients. To address this, adaptive tuning of a and b during optimization could balance smoothness and accuracy. Additionally, alternative smoothing functions (e.g., softmax-based reaction selection) and gradient regularization techniques (such as gradient clipping and trust-region methods) could improve stability and convergence.
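
      The instability described here can be seen in a small sketch of a sigmoid with sharpness a, assuming the sigmoid stands in for a step function. The function names and the numerical values are illustrative assumptions, and the clipping helper shows only the simplest of the remedies mentioned.

```python
import math

def smooth_step(x, a):
    """Sigmoid approximation of a step function with sharpness a (stable form)."""
    z = a * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def smooth_step_grad(x, a):
    """d/dx of the sigmoid: a * s * (1 - s); peaks at a/4 when x = 0."""
    s = smooth_step(x, a)
    return a * s * (1.0 - s)


def clip_gradient(g, max_abs):
    """Bound the gradient magnitude to max_abs."""
    return max(-max_abs, min(max_abs, g))


# A large sharpness gives a huge gradient at the threshold and none away from it.
sharp_at_threshold = smooth_step_grad(0.0, a=200.0)
sharp_away = smooth_step_grad(1.0, a=200.0)
gentle_away = smooth_step_grad(1.0, a=2.0)
clipped = clip_gradient(sharp_at_threshold, max_abs=10.0)
```

      A smaller a keeps a usable gradient away from the threshold at the cost of a less faithful approximation of the discrete step, which is the smoothness versus accuracy balance noted above.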

      Reviewer #1 (recommendations):

      We thank the reviewer for their thoughtful and constructive feedback on our manuscript. Below, we address each of the comments and suggestions raised.

      Main points:

      (1) It would have been useful to have a brief discussion, based on a concrete example, of what can be achieved with the DGA and is totally beyond the reach of the Gillespie algorithm and the numerous existing stochastic simulation methods.

      Thank you for your comment. We would like to clarify that the primary aim of this work is to introduce the DGA and demonstrate its feasibility for tasks such as parameter estimation and network design. Unlike traditional stochastic simulation methods, the DGA’s differentiable nature enables gradient-based optimization, which is not possible with the classical Gillespie algorithm or its variants.

      (2) As often with machine learning techniques, there is a sense of black box, with a lack of mathematical details of the proposed method: as opposed to the exact Gillespie algorithm, whose foundations lie on solid mathematical results (exponentially-distributed waiting times of continuous-time Markov processes), the DGA involves uncontrolled approximations, that are only briefly mentioned in the paper. For instance, it is currently simply noted that "the approximations introduced by the DGA may be pronounced in more complex settings such as the calculation of rare events", without specifying how limiting these errors are. It would be useful to include a clearer and more comprehensive discussion of the limitations of the DGA: When does it work accurately? What are the approximations/errors and can they be controlled? When is it worth paying the price for those approximations/errors, and when is it better to stick to the Gillespie algorithm? Is this notably the case for problems involving rare events? Clearly, these are difficult questions, and the answers are problem specific. However, it would be important to draw the readers' attention to these issues, especially if the DGA is presented as a potentially significant tool in computational and synthetic biology.

      We acknowledge the importance of discussing the limitations of the DGA in more detail. While we have noted that the approximations introduced by the DGA may impact its accuracy in certain scenarios, such as rare-event problems, a deeper exploration of these trade-offs is outside the scope of this work. Instead, we provide sufficient context in the manuscript to guide readers on when the DGA is appropriate.

      (3) The DGA is here introduced and discussed in the context of non-spatial problems (simple gene regulatory networks). However, numerous problems in the life sciences and computational/synthetic biology, involve stochasticity and spatial degrees of freedom (e.g. for problems involving diffusion, migration, etc). It is notoriously challenging to use the Gillespie algorithm to efficiently simulate stochastic spatial systems, especially in the context of rare events (e.g., extinction or fixation problems). It would be useful to comment on whether, and possibly how, the DGA can be used to efficiently simulate stochastic spatial systems, and if it would be better suited than the Gillespie algorithm for this purpose.

      Thank you for pointing this out. Although our current work centers on non-spatial systems, we agree that many biological contexts incorporate both stochasticity and spatial degrees of freedom. Extending the DGA to efficiently simulate such systems would indeed require substantial modifications—for instance, coupling it with reaction-diffusion frameworks or spatial master equations. We believe this is an exciting direction for future research and mention it briefly in the discussion as a potential extension.

      Minor suggestions:

      (1) After Eq.(10): it would be useful to explain and motivate the choice of the ratio JSD/H.

      Done.

      (2) On page 6, just below the caption of Fig.4: it would be useful to clarify what is actually meant by "... convergence towards the steady-state distribution of the exact Gillespie simulation, which is obtained at a simulation time of 10^4".

      Done.

      (3) At the end of Section B on page 7: please clarify what is meant here by "soft directions".

      Done.

      Reviewer #2 (recommendations):

      We thank the reviewer for their thoughtful comments and constructive feedback. Below, we address each of the comments/suggestions.

      Main points:

      (1) Enumerate the conditions under which DGA assumptions hold (and when they do not). There is currently not enough information for the interested reader to know whether DGA would work for their system of interest. Without this information, it is difficult to assess what the true scope of DGA's impact will be. One simple idea would be to test DGA performance along two axes: (i) increasing number of model states and (ii) presence/absence of non-steady state dynamics. I acknowledge that these are very open-ended directions, but looking at even a single instance of each would greatly strengthen this work. Alternatively, if this is not feasible, then the authors should provide more discussion of the attendant difficulties in the main text.

      We agree that a detailed exploration of the conditions under which the DGA assumptions hold would be a valuable addition to the field. However, this paper primarily aims to introduce the DGA methodology and demonstrate its proof-of-concept applications. A comprehensive analysis along axes such as increasing model states or non-steady-state dynamics, while important, would require significant additional simulations and is beyond the scope of this work. In Appendix A, we have discussed the trade-off between accuracy and numerical stability. Additionally, we encourage future users to tune the hyperparameters a and b for their specific systems.

      (2) Demonstrate DGA performance in a more complex biochemical system. Clearly the authors were aware that analytic solutions exist for the 2-state system in Figure 7, but this is actually also the case (I think) for the mean mRNA production rate of the non-equilibrium system in Figure 8. To really demonstrate that DGA is practically viable, I encourage the authors to seek out an interesting application that is not analytically tractable.

      We appreciate the suggestion to validate DGA on a more complex biochemical system. However, the goal of this study is not to provide an exhaustive demonstration of all possible applications but to introduce the DGA and validate it in systems where ground-truth comparisons are available. While the non-equilibrium system in Figure 8 might be analytically tractable, its complexity already provides a meaningful demonstration of DGA’s ability to optimize parameters and design systems. Extending this work to analytically intractable systems is an exciting direction for future studies, and we hope this paper will inspire others to explore these applications.

      (3) Take steps to improve the robustness of parameter optimization and error bar calculations. (3a) When the loss landscape is degenerate, shallow, or otherwise "difficult," a common solution is to perform multiple (e.g. 25-100) inference runs starting from different random positions in parameter space. Doing this, and then taking the parameter set that minimizes the loss should, in theory, lead to a more robust recovery of the optimal parameter set.

      (3b) It seems clear that the Hessian approximation is underestimating the true error in your inference results. One alternative is to use a "brute force" approach like bootstrap resampling to get a better estimate for the statistical dispersion in parameter estimates. But I recognize that this is only viable if the inference is relatively fast. Simply recovering the true minimum will, of course, also help.

      (3a) We acknowledge the challenge posed by degenerate or shallow loss landscapes during parameter optimization. While performing multiple inference runs from different initializations is a common strategy, this approach is computationally intensive. Instead, we rely on standard optimization techniques (e.g., ADAM) to find a robust local minimum. 

      (3b) Thank you for your comment. We agree that Hessian-based error bars can underestimate uncertainty, particularly in degenerate or poorly conditioned loss landscapes. While methods like bootstrap and Monte Carlo can provide more robust estimates, they can be computationally prohibitive for larger-scale simulations. A simpler reason for not using them is the high resource demand from repeated simulations, which quickly becomes infeasible for complex or high-dimensional models. We note these trade-offs between robust estimation and practicality as an important area for further exploration.
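
      As an illustration of the bootstrap alternative mentioned above, the sketch below resamples synthetic count data and re-estimates a single parameter on each resample. It assumes a hypothetical Poisson expression model whose rate is estimated by the sample mean; the data are simulated here and are not from the manuscript.

```python
import numpy as np

def estimate_rate(counts):
    # Moment-matching estimate for a hypothetical Poisson expression model:
    # the steady-state mean equals the production/degradation ratio.
    return counts.mean()

def bootstrap_se(counts, estimator, n_boot=1000, seed=0):
    # Resample the data with replacement and re-run the estimator each time.
    rng = np.random.default_rng(seed)
    n = counts.size
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(counts, size=n, replace=True)
        estimates[b] = estimator(resample)
    # Standard error and a 95% percentile interval of the estimates.
    return estimates.std(ddof=1), np.percentile(estimates, [2.5, 97.5])

rng = np.random.default_rng(1)
counts = rng.poisson(lam=10.0, size=500)  # synthetic data, not from the paper
se, ci = bootstrap_se(counts, estimate_rate)
```

      Each bootstrap replicate requires a full re-estimation, so for a simulation-based estimator the cost is roughly n_boot times that of a single fit.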

      Moderate comments:

      (1) Figure 7: is it possible to also show the inferred kon values? Specifically, it would be of interest to see how kon varies with repressor concentration.

      Thank you for the suggestion. We have updated Figure 7 to include the inferred kon values, showing their variation with the mean mRNA copy number. However, we could not plot them against repressor concentration due to the lack of available data.

      (2) Figure 8B & D: the authors claim that the sharper system dissipates more energy, but doesn't 8D show the opposite of this? More importantly, it does not look like either network drives sharpness levels that exceed the upper equilibrium limit cited in [36]. So it is not clear that it is appropriate to look at energy dissipation here. In fact, it is likely that equilibrium networks could produce the curves in 8B, and might be worth checking.

      Thank you for pointing this out. We realized that the plotted values in Figure 8D were incorrect, as we had mistakenly plotted noise instead of energy dissipation. The plot has now been corrected. 

      (3) Figure 8: I really like this idea of using DGA to "design" networks with desired input-output properties, but I wonder if you could explore a more biologically compelling use-case. Specifically, what about some kind of switch-like logic where, as the activator concentration increases, you have first 0 genes on, then 1 promoter on, then 2 promoters on. This would achieve interesting regulatory logic, and having DGA try to produce step functions would ensure that you force the networks to be maximally sharp (i.e. about double what you're currently achieving).

      Thank you for this intriguing suggestion. While the proposed switch-like logic use case is indeed compelling, implementing such a system would require significant work. This goes beyond the scope of the current study, which focuses on demonstrating the feasibility of DGA for network design with simple input-output properties.

      Minor comments:

      (1) Figure 4B & C: the bar plots do not do a good job conveying the points made by the authors. Consider alternatives, such as scatter plots or box plots that could convey inference uncertainty.

      Done.

      (2) Figure 4B: consider using a log y-axis.

      The y-axis in Figure 4B is already plotted on a log scale.

      (3) Figure 4D is mentioned prior to 4C in the text. Consider reordering.

      Done. 

      (4) Figure 5B: it is difficult to assess from this plot whether or not the landscape is truly "flat," as the authors claim. Flat relative to what? Consider alternative ways to convey your point.

      Thank you for highlighting this ambiguity. By describing the loss landscape as “flat,” we intend to convey its relative insensitivity to parameter variations in certain regions, rather than implying a completely level surface. While we believe Figure 5B still provides a useful qualitative depiction of this behavior, we acknowledge that it does not quantitatively establish “flatness.” In future work, we plan to incorporate more rigorous measures—such as gradient magnitudes or Hessian eigenvalues—to more accurately characterize and communicate the geometry of the loss landscape.

      Reviewer #3 (recommendations):

      We sincerely thank the reviewer for their thoughtful feedback and constructive suggestions, which have helped us improve the clarity and rigor of our manuscript. Below, we address each of the comments.

      (1) Precision is lacking in the introduction section. Do the authors mean the Direct SSA, sorted SSA, which is usually faster, and how about rejection sampling methods?

      Thank you for pointing this out. We have updated the introduction to explicitly mention the Direct SSA.

      (2) When mentioning PyTorch and Jax, would be good to also talk about Julia, as they have fast stochastic simulators.

      We have now mentioned Julia alongside PyTorch and Jax.

      (3) Mentioned references 22-27. Reference 26 is an odd choice; a better reference from the same author is Automatic Differentiation of Programs with Discrete Randomness, G Arya, M Schauer, F Schäfer, C Rackauckas, Advances in Neural Information Processing Systems, NeurIPS 2022.

      We have now cited the suggested reference.

      (4) Page 1, Section: 'To circumnavigate these difficulties, the DGA modifies....' Have you thought about how you would deal with the bias that will be introduced by doing this?

      Thank you for your insightful comment. We acknowledge the potential for bias due to the differentiable approximations in the DGA; however, our analysis has not revealed any systematic bias compared to the exact Gillespie algorithm. Instead, we observe irregular deviations from the exact results as the smoothness of the approximations increases.

      (5) Page 2, first sentence '... traditional Gillespie...' be more precise here - the direct algorithm.

      Thank you for your comment. We believe that the context of the paper, particularly the schematic in Figure 1, makes it clear that we are focusing on the Direct SSA. 

      (6) Page 2, second paragraph: ' In order to simulate such a system...' This doesn't fit here as this section is about tau-leaping. As this approach approximates discrete operations, it is unclear if it would work for large models, snap-shot data of larger scale and if it would be possible to extend it for time-lapse data

      Thank you for your comment. We respectfully disagree that this paragraph is misplaced. The purpose of this paragraph is to explain why the standard Gillespie algorithm does not use fixed time intervals for simulating stochastic processes. By highlighting the inefficiency of discretizing time into small intervals where reactions rarely occur, the paragraph provides necessary context for the Gillespie algorithm’s event-driven approach, which avoids this inefficiency.

      Regarding the applicability of the DGA to larger models, snapshot data, or time-lapse data, we acknowledge these are important directions and have noted them as potential extensions in the discussion section.

      (7) Page 2 Section B: 'In order to make use of modern deep-learning techniques...' It doesn't appear from the paper that any modern deep learning is used.

      Thank you for your comment. Although the DGA does not utilize deep learning architectures such as neural networks, it employs automatic differentiation techniques provided by frameworks like PyTorch and Jax. These tools allow efficient gradient computations, making the DGA compatible with modern optimization workflows.

      (8) Page 3, Fig 1(a). S matrix last row, B and C should swap places: B should be 1 and C is -1.

      Corrected the typo.

      (9) Fig1 needs a more detailed caption.

      Expanded the caption slightly for clarity.

      (10) Page 3 last paragraph: 'The hyperparameter b...' Consequences of this are relevant; for example, can we now go below zero? Also, we lose more efficient algorithms here. It would be good to discuss in more detail that this is an approximate algorithm that is good for our case study, but that more tests are needed before others use it.

      Thank you for the comment. Appendix A discusses the trade-offs related to a and b, but we agree that more detailed analysis is needed. The hyperparameters are tailored to our case study and must be tuned for specific systems.

      (11) Page 4, Section C, first paragraph, 'The goal of making...' This is snapshot data. Would the framework also translate to time-lapse data? Also, it would be better to make it clearer earlier which type of data are the target of this study.

      Thank you for your suggestion. While the current study focuses on snapshot data and steady-state properties, we believe the DGA could be extended to handle time-lapse data by incorporating multiple recorded time points into its inference objective. Specifically, one could modify the loss function to penalize discrepancies across observed transitions between these time points, effectively capturing dynamic trajectories. We consider this an exciting area for future development, but it lies beyond our present scope.

      (12) Page 4 Section C, sentence '...experimentally measured moments'. Should later be mentioned as error, as moments are imperfect

      Thank you for your comment. We agree that experimentally measured moments are inherently noisy and may not perfectly represent the true system. However, within the context of the DGA, these moments serve as target quantities, and the discrepancy between simulated and measured moments is already accounted for in the loss function. 

      (13) Page 4 Section C, last sentence '...second-order...such as ADAM'. Another formulation would be better as second order can be confusing, especially in the context of parameter estimation

      We have revised the language to avoid confusion regarding “second-order” methods.

      (14) Fig 4(a) a density plot would fit better here

      Fig. 4(a) has been updated to a scatter density plot as suggested.

      (15) Fig 4(c) It would be interesting to see a closer analysis of the trade-off between gradient and accuracy when changing the a and b parameters.

      Thank you for this suggestion. We acknowledge that an in-depth exploration of these trade-offs could provide deeper insights into the method’s performance. However, for now, we believe the current analysis suffices to highlight the utility of the DGA in the contexts examined.

      (16) Page 6 Section III, first sentence: This fits more to intro. Further the reference list is severely lacking here, with no comparison to other methods for actually fitting stochastic models.

      Thank you for the suggestion. We have added a few references there.

      (17) Page 6, Section A, sentence: '....experimental measured mean...' Why is it a good measure here (moment matching is not perfect), also do you have distribution data, would that not be better? How about accounting for measurement error?

      Thank you for the comment. While we do not have full distribution data, we acknowledge that incorporating experimental measurement error could enhance the framework. A weighted loss function could model uncertainty explicitly, but this is beyond the scope of the current study. 
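
      A weighted loss of the kind mentioned above could look like the following sketch, in which each squared residual is scaled by a measurement standard deviation. The function name and the per-moment `sigma` argument are assumptions made for illustration and are not part of the manuscript's framework.

```python
import numpy as np

def weighted_moment_loss(simulated, measured, sigma):
    # Chi-square-style loss: residuals between simulated and measured
    # moments are divided by the measurement standard deviation, so
    # noisier measurements contribute less.
    simulated = np.asarray(simulated, dtype=float)
    measured = np.asarray(measured, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    residual = (simulated - measured) / sigma
    return float(np.sum(residual**2))

# Toy values: the first moment is off by one standard deviation,
# the second matches exactly.
example = weighted_moment_loss([1.0, 2.0], [1.5, 2.0], [0.5, 1.0])
```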

      (18) Page 7, section B, first paragraph: 'Motivated by this, we defined the...' Why use Fisher Information when profile likelihood has proven to be better, especially for systems with few parameters like this?

      Thank you for the suggestion. While profile-likelihood is indeed a powerful tool for parameter uncertainty analysis, we chose Fisher Information due to its computational efficiency and compatibility with the differentiable nature of the DGA framework.

      (19) Page 7, section C, sentence '...set kR/off=1..'. In this case, we cannot infer this parameter.

      Thank you for the comment. You are correct that setting kR/off = 1 effectively normalizes the rates, making this parameter unidentifiable. In steady-state analyses, not all parameters can be independently inferred because observable quantities depend on relative—rather than absolute—rate values (as evident when setting the time derivative to zero in the master equation). To infer all parameters, one would need additional information, such as time-series data or moments at finite time.
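
      The scaling argument behind this statement can be written out in a few lines. The symbols below (a transition-rate matrix W, a vector of rate constants k, and a scale factor c) are notation introduced here for illustration and are not taken from the manuscript.

```latex
% Master equation for the probability vector P(t), with transition-rate
% matrix W built from the rate constants k = (k_1, ..., k_m):
\[
  \frac{d\mathbf{P}}{dt} = W(\mathbf{k})\,\mathbf{P}.
\]
% Setting the time derivative to zero defines the steady state P^*:
\[
  W(\mathbf{k})\,\mathbf{P}^{*} = 0 .
\]
% Each entry of W is linear in the rate constants, so rescaling every
% rate by a common factor c > 0 gives W(c k) = c W(k), and therefore
\[
  W(c\,\mathbf{k})\,\mathbf{P}^{*} = c\,W(\mathbf{k})\,\mathbf{P}^{*} = 0 .
\]
% The steady-state distribution, and hence every steady-state moment, is
% unchanged by c: only ratios of rates are identifiable, and fixing one
% rate to 1 simply sets the unit of time.
```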

      (20) Page 7 Section 2. Estimating parameters .... Sentence: '....as can be seen, there is very good agreement..' How many times does the true value fall within the CI (because corr 0.68 is not great)?

      Thank you for your comment. While a correlation coefficient of 0.68 indicates moderate agreement, the primary goal was to demonstrate the feasibility of parameter estimation using the DGA rather than achieving perfect accuracy. The coverage of the CI was not explicitly calculated, as the focus was on the overall trends and relative agreement.

      (21) Page 7 Section 2. Estimating parameters .... Sentence: 'Fig5(c) shows....' Is this when using exact simulator?

      Thank you for your question. Yes, the exact values on the x-axis of Fig. 5(c) are obtained using the exact Gillespie simulation.

      (22) Page 7 Section 3 Estimating parameters for the... Sentence: 'Fig6(a) shows...' Why are CIs not shown?

      Thank you for your comment. CIs are not shown in Fig. 6(a) because this particular case is degenerate, making the calculation and meaningful representation of CIs challenging. 

      (23) Page 10, Sentence: 'As can be seen in Fig 7(b)...' Can you show uncertainty in measured value? It would be good to see something of a comparison against an exact method, at least on simulated synthetic data

      Thank you for the comment. Fig. 7(a) already includes error bars for the experimental data, which account for measurement uncertainty. However, in Fig. 7(b), we do not include error bars for the experimental values due to limitations in the available data.

      (24) Page 12, Section B Loss function '...n=600...' This is on a lower range. Have you tested with n=1000?

      Yes, we have tested with n=1000 and observed no significant difference in the results. This indicates that n=600 is sufficient for the purposes of this study. 

      (25) Fig 8(c) Why are no CIs shown?

      Thank you for your comment. CIs were not included in Fig. 8(c) due to degeneracy, which makes meaningful confidence intervals difficult to compute.

      (26) Page 12 Conclusion, sentence: '..gradients via backpropagation...' Actually, by making the function continuous, both forward and reverse mode might be used. And in this case, forward-mode would actually be the fastest by quite a margin

      Thank you for your insightful comment. You are correct that by making the function continuous, both forward-mode and reverse-mode automatic differentiation can be used. We have now mentioned this point in the discussion.

      (27) Overall comment for the Conclusion section: It would be good to discuss how this framework compares to other model-fitting frameworks for models with stochastic dynamics. The authors mention dynamic data, and more discussion of this would be very welcome. Why use ADAM and not something established like BFGS for model fitting? It would be interesting to discuss how this can fit with other SSA algorithms (e.g. in practice sorting SSA is used when models get larger). Also, an inference comparison against exact approaches would be very nice. As it is now, the authors truly only check the accuracy of the SSA on 1 model; it would be interesting to see it for other models.

      Thank you for your detailed comments. While this study focuses on introducing the DGA and demonstrating its feasibility, we agree that comparisons with other model-fitting frameworks, testing on additional models, and integrating with other SSA variants like sorted SSA are important directions for future work. Similarly, extending the DGA to handle transient dynamics and exploring alternatives to ADAM, such as BFGS, are promising areas to investigate further.

    1. For the cases of change registration specified at points a, b, i, k, l, m and q of clause 1 of this Article, within 30 days from the date of the change

      Outside these cases, changes must still be registered, but what is the time limit?

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This study is useful as it provides further analysis of previously published data to address which specific genes are part of the masculinizing actions of E2 on female zebra finches, and where these key genes are expressed in the brain. However the data supporting the conclusion of masculinizing the song system are incomplete as the current manuscript is a re-analysis of differential gene expression modulated by E2 treatment between male/female zebra finches without manipulation of gene expression. The conclusions (and title) regarding song learning are also incompletely supported with no gene manipulation or song analysis. Importantly, the use of WGCNA for a question of sex-chromosome expression in species without dosage compensation is considered inadequate. As the experimental design did not include groups to directly test for song learning, and there was also no analysis of song performance, these data were also considered inadequate in that regard.

      We are sorry the editor felt the manuscript so incomplete and inadequate. Though the tone of this assessment seems more severe than the below reviewer comments, we are also happy to see that the editor has considered our paper further for a revised publication, based on the reviewer’s comments. We address the editor’s comments as follows:

      While we agree that manipulating some of the genes we discovered, whose expression levels are E2-sensitive in the song system, would take the study further by validating some of the hypotheses proposed in the discussion of the paper, we don’t think the outcome of gene manipulations would change the major conclusions drawn from the results of the paper. In this study we performed estrogen hormone manipulations, with causal consequences for gene expression in song nuclei and associated song behavior. In a way this is analogous to gene manipulations, but directly manipulating the action of estrogen. The categories of genes impacted, and the differences among the sex chromosomes, wouldn’t change.

      For the comment on WGCNA being inadequate for addressing questions on sex chromosome expression in species without dosage compensation, we think the evidence in our data does not bear that out. One main result of this paper is the separation of Z chromosome transcripts whose expression is most strongly regulated by chromosomal dosage (WGCNA module E) across regions from those subject to additional sources of regulation in song nuclei (other modules). It seems to us that rather than being confounded by the lack of dosage compensation, WGCNA allowed us to better resolve the effects of dosage on different genes within the sex chromosomes. We have added a new figure more directly examining sex chromosome transcript abundance within different modules. Briefly, we found that Z chromosome genes assigned to module E exhibited almost exactly the male-biased expression ratio expected from no dosage compensation, while the Z chromosome genes in song nuclei assigned to other modules were expressed below the dosage-predicted value, consistent with module E containing those genes whose expression is most strongly regulated by dose across all brain regions sampled.

      At its core, WGCNA finds sets of correlated genes. The biological reality of the zebra finch transcriptome is that Z chromosome expression is largely anti-correlated with W chromosome due to dosage. However, this dosage effect is not felt equally by all genes and WGCNA provides an unbiased computational framework which can be used to separate dose from other potential sources of gene regulation. This is why roughly ⅓ of Z chromosome genes are not assigned to module E; for example the growth hormone receptor is assigned to module G based on its correlation with genes upregulated within HVC.

      “As the experimental design did not include groups to directly test for song learning, and there was also no analysis of song performance, these data were also considered inadequate in that regard.”

      Concerning the comment on the lack of song performance analysis in the paper, all such analyses were conducted in our previous study on the same animals (Choe et al. 2021, Hormones & Behavior). The birds considered here were sacrificed at PHD30, prior to the onset of learned song behavior. However, females treated with E2 in the same way at the same time and allowed to mature into adulthood went on to develop rudimentary song. Further, induction of rudimentary song learning in females following E2 treatment has been well established since the early ‘80s. We have added the following text toward the end of the intro to make this clearer:

      “While the birds for this study were sacrificed prior to the developmental presentation of song behavior, we have previously shown that female finches treated in exactly the same way with E2 go on to produce rudimentary imitative songs as adults (Choe et al 2021), consistent with the known induction of vocal learning in females by E2 (REF).”

      Reviewer #1 (Recommendations For The Authors):

      Overall, this is a wonderfully designed and executed study that takes full advantage of new resources, such as the most complete zebra finch genome assembly yet, as well as the latest methods. I have very few suggestions as to the improvement of the manuscript. They are as follows:

      Results Section:

      In the paragraph "Identification of gene expression modules in song nuclei":

      "The E2-treated females in this study had similarly sized song system nuclei as males, indicating that E2 treatment prevented atrophy."

      Clarify if this comparison is to treated and/or untreated males.

      We thank the reviewer for their comment. The relative differences in song nuclei sizes between the E2-treated females and the other groups are more complex than our original sentence implied. We have revised the main text as follows:

      “In our previous study, we found that estradiol treatment in PHD30 females caused HVC to enlarge and Area X to appear when it normally does not develop in females, but both at sizes less than in untreated or treated males. The sizes of PHD30 female LMAN and RA were already the sizes seen in males, as the latter has not atrophied yet at this age (25).”

      In the paragraph "Sex- and micro-chromosome gene expression across the telencephalon": "These animal and chromosome specific shifts in the transcriptomes could represent the systemic effects of allelic chromosomal structural variation..."

      The authors should clarify the meaning of "allelic chromosomal structural variation" in this context, as it is an unusual phrase. Major chromosomal structural variation seems unlikely to produce these effects. Is it also possible that animal-specific modules with brain-wide higher expression could result from laboratory contamination between all samples from one animal? This is not too likely but perhaps should be acknowledged or ruled out.

      We have removed the word allelic, which was unnecessary. We can’t envision how laboratory contamination could occur such that all of one animal’s samples would be affected to produce the observed result, which is module and chromosome specific. An animal-wide effect could emerge during sacrifice, but we can think of no reason that it would affect these modules and not others. Rather, the most likely explanation is natural biological differences between animals. We have added this consideration of alternative explanations.

      In the section "Candidate gene drivers of HVC specialization in E2-treated females":

      When discussing GHR's role in cell growth and proliferation, the authors' argument could be expanded by including the documented role of GH signaling in anti-apoptotic protection of neurons from rounds of neural pruning during development as documented in the chicken, e.g. • Harvey S, Baudet M-L, Sanders EJ. 2009. Growth Hormone-induced Neuroprotection in the Neural Retina during Chick Embryogenesis. Annals of the New York Academy of Sciences, 1163: 414-416. https://doi.org/10.1111/j.1749-6632.2008.03641.x

      We thank the reviewer for sharing this publication with us. We have added the following sentence to our discussion with the above citation. “Further, our results are consistent with growth hormone’s known role in avian anti-apoptotic protection, with elevated signaling associated with the survival of chicken neurons during rounds of pruning in the developing retina.”

      The authors' argument of the relevance of the passerine GH duplication would be strengthened by citing:

      • Rasband SA, Bolton PE, Fang Q, Johnson PLF, Braun MJ. 2023. Evolution of the Growth Hormone Gene Duplication in Passerine Birds, Genome Biol Evol, 15(3) https://doi.org/10.1093/gbe/evad033. Greatly expands on the Yuri et al. paper cited by characterizing the molecular evolution of these genes across hundreds of avian species, supporting positive selection on multiple amino acid sites identified in both ancestral and duplicate (passerine) growth hormone.

      • Xie F, London SE, Southey BR et al. 2010. The zebra finch neuropeptidome: prediction, detection and expression. BMC Biol 8, 28. https://doi.org/10.1186/1741-7007-8-28 The authors report significantly different expression of the ancestral GH gene in the adult male zebra finch auditory forebrain after different song exposure experiences.

      We have amended the results section sentence and added all suggested citations. The sentence now reads: “The gene which encodes growth hormone receptor’s ligand, growth hormone, is interestingly duplicated and undergoing accelerated evolution in the genomes of songbirds (Rasband et al 2023); the GH ligand has been found to be upregulated in the zebra finch auditory forebrain following the presentation of familiar song (Xie et al 2010).”

      Figures:

      - Figure 1B. "Duration of sex typing" being a shorter bar compared to the others is not fully explained in the experimental design. Presumably at the end of this time period, the sex is non-invasively, phenotypically evident. I suggest an arrow pointing to the PHD/PHD range when sex is apparent in plumage/anatomy.

      - Figure 4. Caption appears to be truncated; "across all... genes"?

      Fixed

      - Figure 5. For 5E, 5F, 5G, 5H, consider enlarging the plots so overlapping gene symbols are readable. Alternately, smaller numbers or symbols could be used with a key in areas where overlapping symbols are hard to prevent.

      We agree that these are not the easiest to read; we originally offset the symbols in R to minimize overlaps, but it can only do so much for the more crammed panels. We have now added a supplemental .xlsx file with the underlying data from each of the 4 tests for readers that want to examine the data in more detail.

      Reviewer #2 (Recommendations For The Authors):

      Since WGCNA methods will inherently draw together sex-chromosome genes into the same module in systems without dosage compensation, I suggest the authors rerun the WGCNA using only female samples and only male samples. Then identify the composition of modules that differ between E2 and vehicle-treated females and compare these genes to males. Then from male WGCNA identify the composition of modules that differ between E2 and vehicle-treated males and compare to female modules.

      We thank the reviewer for their suggestion. However, we believe it is not as strong as the approach we used, which groups data from both sexes in the WGCNA analyses in a study that is looking for sex differences. The reviewer's proposed approach amounts to computing modules twice (once per sex), determining song system specialized modules and E2 responsive modules in both settings, then intersecting the two sets to find corresponding modules, all done to prevent the non-dose compensated sex chromosome genes from being drawn into the same module.

      While WGCNA does group the majority of sex chromosome genes into module E, it does not categorize them all this way (Fig 3). The module classification instead differentiates those sex chromosome genes whose expression is most explained by chromosome dosage / sex across regions (modE) from those whose expression is controlled by other sources of regulation; for an example of the latter, the growth hormone receptor (GHR) is one of several Z chromosome genes classified into modG as its expression better correlates with the genes specialized to HVC than it does with the majority of dosage-dependent Z chromosome genes found in modE. Further, to remove biological sex as a variable in a WGCNA analysis that is focused on sex differences seems counterintuitive.

      Instead, to quantitatively address the reviewer’s concern, we conducted additional analyses that led to a new figure, associated text, and tables that better describe sex/chromosome dosage effects on the abundance (FPKM) and expression ratios of sex chromosome transcripts by module irrespective of brain region (Fig. 5). We find that the Z chromosome genes in modE were expressed at the expected chromosome dosage in the non-vocal surrounding regions (65.06% observed vs 66.6% expected) while in other modules, other Z chromosome genes were expressed at intermediate levels between equal expression and the expected chromosomal dosage. For example, the Z chromosome content of modules D and H exhibited near equal expression between sexes. Within the song system, Z chromosome gene content of modG was highly expressed in males beyond what is expected from chromosome dosage, consistent with modG’s male-specific upregulation in song nuclei relative to surrounds in the absence of E2. These results better demonstrate that in our WGCNA on the combined dataset we are able to separate those Z chromosome genes whose expression is predominantly dosage controlled from those subject to additional regulation such as song system specialization.
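
      The expected value quoted above follows from simple copy-number arithmetic. The sketch below assumes that the reported percentage is the male share of the summed male and female expression, and that expression scales linearly with Z copy number; the symbols are introduced here for illustration.

```latex
% Males carry two Z chromosomes (ZZ) and females one (ZW). With no dosage
% compensation, expression e of a Z-linked gene scales with copy number:
\[
  e_{\text{male}} = 2e, \qquad e_{\text{female}} = e .
\]
% Male share of total expression under pure dosage:
\[
  \frac{e_{\text{male}}}{e_{\text{male}} + e_{\text{female}}}
  = \frac{2e}{2e + e} = \frac{2}{3} \approx 0.667 .
\]
% Equal expression between the sexes would instead give a share of 1/2,
% so intermediate modules fall between 0.5 and 0.667.
```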

      Fig. S3 Legend: 'Black arrow' -> 'Red arrow'

      Change made.

      Fig. S5 - What part of the figure shows the 'human convergent signature'? Also, simply listing the number of genes mapped to a chromosome is misleading to readers unfamiliar with the zebra finch genome, you should either provide the number of genes on each chromosome or present as corrected by that number.

      Fig. S5 showed the same type of analysis as Fig. 3 but with an older zebra finch genome assembly, and it did not include panel a, the enrichment for genes convergent in expression between songbird song regions and human speech brain regions. Because Fig. S5 did not add any important new information to the paper, we removed it.

      For the chromosome analyses in Fig. 3b, we provide both the total raw number of module-assigned genes broken down by chromosome (the black bar plots on the right) and a statistical fold-enrichment value of modules per chromosome. Given the number of genes per chromosome and genes per module in our data, we computed the fold-enrichment for each intersection (observed intersection size / expected intersection size). To test for the significance of these enrichments, we bootstrapped FDR-corrected p values for the enrichment of each chromosome-module pairing by randomizing the mapping of genes to modules to construct a null distribution of fold enrichments for each intersection. Our intent was not to describe the size of the chromosomes themselves, information readily available elsewhere, but to show the disproportionate chromosomal origins of the gene sets considered by this study. Performing this enrichment test using all annotated genes per chromosome would artificially increase enrichment values and make the analysis less conservative by confounding the results with the inherent enrichment for “brain function” in the assigned genes relative to all genes.
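
      A minimal sketch of this kind of enrichment test is given below. It uses toy gene labels, shuffles the gene-to-module mapping to build the null distribution, and applies the Benjamini-Hochberg procedure as the FDR correction; the choice of Benjamini-Hochberg, the function names, and the toy data are assumptions made for illustration and may differ from the analysis actually run.

```python
import numpy as np

def fold_enrichment(chrom, module, chrom_labels, module_labels):
    # Observed / expected intersection size for each chromosome-module pair,
    # where expected = (genes on chromosome) * (genes in module) / total.
    n = chrom.size
    table = np.zeros((len(chrom_labels), len(module_labels)))
    for i, c in enumerate(chrom_labels):
        for j, m in enumerate(module_labels):
            observed = np.sum((chrom == c) & (module == m))
            expected = np.sum(chrom == c) * np.sum(module == m) / n
            table[i, j] = observed / expected
    return table

def permutation_pvalues(chrom, module, chrom_labels, module_labels,
                        n_perm=500, seed=0):
    # Null distribution: shuffle which module each gene is assigned to.
    rng = np.random.default_rng(seed)
    observed = fold_enrichment(chrom, module, chrom_labels, module_labels)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        shuffled = rng.permutation(module)
        null = fold_enrichment(chrom, shuffled, chrom_labels, module_labels)
        exceed += null >= observed
    # Add-one correction keeps p-values strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)

def benjamini_hochberg(pvals):
    # Benjamini-Hochberg adjusted p-values, returned in the input shape.
    p = np.asarray(pvals, dtype=float).ravel()
    order = np.argsort(p)
    ranked = p[order] * p.size / np.arange(1, p.size + 1)
    adjusted = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty_like(p)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out.reshape(np.shape(pvals))

# Toy data (hypothetical): 100 Z-linked genes, 300 genes on chromosome 1.
chrom = np.array(["Z"] * 100 + ["1"] * 300)
module = np.array(["E"] * 70 + ["G"] * 30 + ["E"] * 60 + ["G"] * 240)
fe, p = permutation_pvalues(chrom, module, ["Z", "1"], ["E", "G"])
q = benjamini_hochberg(p)
```

      Because the expected counts are computed only from genes that received a module assignment, the background is the assigned gene set, matching the rationale given above for not using all annotated genes.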

      At several places you say "we correlated expression of each sex chromosome transcript with sexual dimorphism within each region, such that expressed W genes would be positively correlated and depleted Z chromosome genes would be anticorrelated." What was the sexual dimorphism that was being correlated with? Is this the eigengene?

      We thank you for this comment. Our language was less clear than it could be. We tested for correlations of both the eigengene and the individual gene expression profiles with the biological sex of the animals. We have changed the text to:

      “To do this, we tested for a correlation between the expression of each sex chromosome transcript to the animals’ sex within each brain region. We found that female-enriched transcripts were positively correlated with sex and male-enriched transcripts were anticorrelated (Fig. 4f,g).”
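      A minimal sketch of the per-transcript correlation described in the revised text is given below. The coding of sex (female = 1, male = 0) is our assumption, chosen so that female-enriched transcripts come out positively correlated as in the quoted passage; the expression values and variable names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Six animals from one brain region: female = 1, male = 0 (assumed coding).
sex = [1, 1, 1, 0, 0, 0]
w_gene = [5.0, 6.0, 5.5, 0.0, 0.1, 0.0]       # hypothetical W-linked transcript
z_gene = [10.0, 11.0, 9.5, 20.0, 21.0, 19.0]  # hypothetical Z-linked transcript

r_w = pearson(sex, w_gene)  # positive: expressed only in females
r_z = pearson(sex, z_gene)  # negative: lower in females (one Z copy)
```

      The same function applies unchanged to a module eigengene in place of a single transcript.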

      Fig. 4A: The 'true/false' boxes and animal A-L is confusing and unnecessary. I'd suggest just using M and F (or sex symbols) with a horizontal line below each set of 3 for respective E2 and Veh.

      Change made.

      Reviewer #3 (Recommendations For The Authors):

      General comments:

      After the initial characterization of the datasets and module identification, it is quite hard to follow the logic of the data presentation in the various other Results sections or to clearly understand how they relate to the main stated goal to identify factors related to sex differences in vocal learning. The most relevant findings relate to the presumed actions of hormone treatment and sex chromosome gene dosage in song nuclei, whereas analyses of other brain areas, other chromosomes, or speech-related genes serve more as controls and/or appear as distractions from the main theme. A suggestion to increase the clarity of the presentation and potential impact of the study is to change the order of the presentation, focusing first on the specific analyses and comparisons that most directly speak to the main goals of the study, and then secondarily and more briefly presenting the controls or less related comparisons.

      The reviewer’s suggestion for the organization of the Results section is what we had tried to do. We opened with the identification of modules, then presented the song nuclei-specific modules, followed by the E2-induced changes to those modules, and then by other specific results for the remainder of the paper, including module enrichments on specific chromosomes. The reviewer suggested that our analyses of “other brain areas” (which we assume to mean the non-vocal surround regions), other chromosomes (which we assume to mean autosomes), and speech-related genes served as controls and were a distraction in the paper. Within our analysis, however, these other brain regions are essential controls needed to assess the song-system specificity of any sex differences observed, from the very first paragraphs of the Results; the autosomes were not controls for sex chromosome results but primary results in and of themselves; and the overlap with speech-related genes was also not a control but a novel discovery. We have revised these points in the paper to make them clearer, and revised some of the section titles and transitions between sections to help increase the clarity of the main storyline of the paper.

      A related comment is that many of the inferences drawn from the WGCNA analysis were quite complex, thus independent verification of some predictions would be quite valuable. For example, consider the passage: "In non-vocal learning juvenile females, interestingly LMAN was specialized relative to the AN by the same gene modules as in males (B, F, and I) as well as an additional module G (Fig. 2b); RA was specialized by module A as in males, but not module L and by additional modules A and G. In contrast, neither juvenile female HVC nor Area X exhibited significant gene module expression specializations relative to their surrounds." Providing in situ hybridization verification of these regional gene expression predictions with a few representative genes seems quite feasible given the group's expertise and would considerably strengthen confidence in the module-based inferences.

      We performed independent in situ validation of 36 candidate genes in our first study with this dataset (Choe et al 2021). We now mention this validation in the revised paper. The reviewer’s selection of one of our sentences, though, made us realize that the grammar we used to explain the results was not as clear as it needs to be. We have thus cleaned up the grammar of our module descriptions so that the results are communicated with less complexity, the main issue noted by the reviewer.

      Because this is a re-analysis of a previously published dataset, the authors should more explicitly describe somewhere in the Discussion how the present analysis advances the understanding of sex differences in songbird neuroanatomy and behavior beyond the previous analysis.

      We have added an additional sentence into the discussion more clearly separating the results of the current study from our previous work.

      Specific comments:

      Abstract:

      There is evidence (from Frank Johnson's lab) that RA does not completely atrophy in female zebra finches, but is still present with more preserved connectivity than previously thought, possibly related to non-singing function(s). A term like 'marked reduction' of female RA may more accurately reflect the current state of knowledge.

      We have changed the text to “partial atrophy”.

      The term "driver" is undefined and unclear at this point of the paper; a clear definition for "driver" is also lacking in the Intro.

      We now define “driver” or “genetic driver” as understood to mean “a genetic locus whose expression and/or inheritance strongly regulates the trait of interest”.

      When citing the literature on studies that identified "specific genes with specialized up- or down-regulated expression in song and speech circuits relative to the surrounding motor control circuits", the authors should also cite studies from other labs (e.g. Li et al., PNAS, 2007; Lovell et al, Plos One 2008; Lovell et al, BMC Genomics 2018; Nevue et al, Sci Rep. 2020), to be accurate and fair.

      Citations added

      For clarity, the authors should explicitly formulate the hypothesis they are proposing at the end of the Summary.

      We thank the reviewer for this comment. We have replaced the final sentence of the summary with: “We present a hypothesis where reduced dosage and expression of these Z chromosome genes changes the developmental trajectory of female HVC, partially preventable by estrogen treatment, contributing to the loss of song learning behavior.”

      Introduction:

      Vocal learning is arguably the ability to imitate 'vocal' sounds, this could be clarified here.

      We have amended the sentence to “Vocal learning is the ability to imitate heard sounds using a vocal organ…”

      Given they are currently considered sister taxa, can the author briefly explain what is the basis for assuming that songbirds and parrots independently evolved vocal learning?

      Although songbirds and parrots belong to a monophyletic clade, they are not sister taxa. There are two clades separating them that are vocal non-learners. We have cited the reference that demonstrated this (e.g. Jarvis et al 2014 Science).

      Why use Taeniopygia castanotis rather than the more broadly used Taeniopygia guttata?

      Zebra finches were recently reclassified and T. castanotis is now more accurate. The Indonesian Timor zebra finch retained T. guttata while the Australian finch, used here, was classified as T. castanotis.

      The authors state: "...vocal learning is strongly sexually dimorphic in zebra finches and many other vocal learning species" and cite Nottebohm and Arnold, Science, 1978. That landmark paper only shows dimorphism in song nuclei (not learning) in two songbird species. The authors should provide citations for other species and behavior, or modify the statement.

      We have added an additional citation (Odom et al.) to this sentence which covers the phylogeny more broadly.

      The authors refer to the nucleus RA as being located in the lateral intermediate arcopallium (LAI). Other labs have described this domain as the dorsal part of the intermediate arcopallium, thus AId or AID (Mello et al., JCN, 2019; Yuan and Bottjer, J Neurophys 2019; Yuan and Bottjer, eNeuro, 2020; Nevue et al., BCM Genomics, 2020). The authors should acknowledge this discrepancy in nomenclature so that data and conclusions can be more readily compared across studies.

      We thank the reviewer and agree that this is helpful. We have added a note at the first mention of LAI.

      The authors state that data from the gynandromorph bird described by Agate et al implicates "sex chromosome gene expression within the song system" as involved in the song system sexual dimorphism. That study, however, only rules out circulating gonadal steroids, and while suggesting a cell-autonomous mechanism like sex chromosome genes, it does not necessarily exclude other brain-autonomous factors like sex differences in local production of sex steroids.

      We say that this study “implicated” sex chromosome gene expression, which is accurate per the results and discussion of that study. We are unsure what “brain autonomous factors like sex differences in local production of sex steroids” means, as “brain autonomous” and “local production” in the brain seem contradictory in this context.

      Results:

      The authors state that "the E2-treated females in this study had similarly sized song system nuclei as males, indicating that E2 treatment prevented atrophy". Can they clarify whether the VEH-treated females actually had smaller RAs than E2-treated females or VEH-treated males at this age? This is still quite early in development and it is unclear to what extent RA's marked sexual dimorphism in adults or later developmental ages has already taken place in untreated (or VEH-treated) birds. A related comment is that the authors state later on: "We interpret these findings to indicate that: LMAN and RA atrophy later in juvenile female development..." Does this mean these nuclei actually did not show the marked decreases predicted earlier in the text? Clarifying this point would be helpful.

      We thank the reviewer for pointing out this discrepancy, on which reviewer #1 also asked for clarification. RA size at this age is similar in males and females. However, in females HVC is smaller and Area X is absent, and E2 treatment partially prevents this atrophy. The text now reads:

      “In our previous study, we found that estradiol treatment in PHD30 females caused HVC to enlarge and Area X to appear when it normally does not develop in females, but both at sizes less than in untreated or treated males. The sizes of PHD30 female LMAN and RA were already the sizes seen in males, as the latter has not atrophied yet at this age (25).”

      The authors acknowledge that area X is absent in untreated and VEH-treated females. Could they please clarify how area X and the surrounding stratal tissue that excludes area X were identified for laser capture dissections in juvenile females?

      We have added the following statement to the main text portion discussing the dissections.

      “In the case of vehicle-treated females, which lack Area X, a piece of striatum from the same location where Area X is found in males was taken.”

      Some passages in Results discussing the authors' interpretation of the modules seem quite speculative and possibly belong instead in the Discussion. For example: "... that module A and G genes could be associated with the start of this atrophy; HVC and Area X are likely the first to atrophy or not develop; and lack of any gene module specialization in them at this age could mean that they would be more sensitive to estrogen prevention of vocal learning loss."

      As suggested, we have removed this text from the results; these ideas were already presented in the Discussion. We have merged the resulting small paragraph with the preceding paragraph.

      The authors state: "To assess the effects of chronic exogenous estrogen on the developing song system, we first performed a control analysis of modules in the E2-treated juvenile males." How can an assessment of estrogen effects be a "control" analysis? Does this refer to a contrast with females? Please clarify the language here.

      The reviewer is correct that E2 treatment in males should not be considered a control experiment. We removed the word “control”.

      When discussing the GO-enriched terms for module G, it is unclear how the authors reached the conclusion about "proliferative", as the enriched terms do not refer to processes more directly indicative of proliferation like "cell division" or "cell cycle regulation". Rather, these terms seem more related to differentiation and growth, which do not necessarily imply proliferation. The authors also refer to "HVC proliferation" later on in the Discussion. However, there is conclusive evidence from several labs that proliferative events associated with postnatal neuronal addition and/or replacement in song nuclei occur in the subventricular zone, not in song nuclei like HVC itself, and that the growth of song nuclei largely reflects cell survival, as well as growth in size and complexity under the regulation of sex steroids.

      We agree that “proliferative” may have been a poor word choice here. We did not mean to indicate that cell division was occurring in HVC itself. Instead, we meant to indicate that HVC is able to accommodate the newborn neurons from the SVZ. We have replaced the word “proliferative” throughout. In the instance the reviewer mentions specifically, we replaced it with “...potentially act to integrate and differentiate late born neurons.”

      With regard to module E, referring to a telencephalon-wide sexually dimorphic gene expression program seems quite a stretch, given that only a few regions were sampled and compared between sexes. These related statements should be toned down.

      We have replaced “telencephalon-wide” with “more distributed across the finch telencephalon” and other similar language in each instance.

      The following passage is very speculative and should shortened and/or moved to the Discussion: "Based on the findings in these gene sets, we hypothesize that without excess estrogen in females, HVC expansion is prevented by not specializing the growth and neuronal migration promoting genes in module G to the HVC lineage by late development. This is potentially enacted by depleting necessary gene products from the Z sex chromosome, such as GHR, which are already present in only one copy."

      We have deleted this portion of the text, as the idea is already present in the discussion.

      Figure 5: To this reviewer, the comparisons of sex differences and of female response to E2 are the most relevant and informative ones, whereas the regional differences between song nuclei and surrounds refer to different cell populations and cell types where other processes may be occurring, independently of what occurs in song nuclei. It thus seems like the intersection analysis in panel 5i may be subtracting out important "core genes" in terms of E2 effects and/or sex differences in the most relevant cell populations, i.e. in this case within song nucleus HVC.

      Song learning is a specialized behavior, and the associated vocal learning nuclei have a set of hundreds of specialized genes compared to their surrounds. Our previous findings show that E2 drives the appearance of these specializations in female zebra finches. Thus, we considered this the most interesting question to focus on, which we have further highlighted. Nevertheless, in response to the reviewer’s suggestion, we have added a .xlsx supplemental file containing the results from each of the individual tests so readers may examine any single comparison, or set of comparisons, in more detail.

      Discussion:

      It is unclear what the term "critical period" refers to in: "during the critical period of atrophy for the female vocal circuit"; please clarify.

      We agree that our language was nebulous. We have replaced it with “as several male song control nuclei begin to expand and female nuclei partially atrophy”

      In: "HVC appeared unspecialized at the level of gene module expression in control females", does "unspecialized" refer to a lack of difference in gene expression when compared to surroundings? Please clarify. The same comment applies to other uses of "unspecialized" in this paragraph.

      Yes, unspecialized means lack of difference in gene expression in the song nucleus. To clarify this point, we have reworked that and the following sentence as follows:

      “HVC appeared unspecialized compared to the surrounding nidopallium at the level of gene module expression in control females, with no significantly differentially expressed MEGs. However, in E2-treated females, HVC exhibited a subset of the observed male HVC gene expression specializations. Similarly, the vehicle-treated female striatum located where Area X would be also lacked any specialized gene module expression, but the E2-treated female Area X exhibited a subset of the male Area X specializations, consistent with the known absence of Area X in vehicle-treated females and presence in E2-treated females.”

      The authors state: "...we surprisingly found that the most specialized genes were disproportionately from the Z chromosome", when discussing module G in HVC. Why is this so surprising? In a sense, this could be taken as consistent with the findings of Friedrich et al, 2022, where sex differences in the RA transcriptome were predominantly Z related on 20 dph. Arguably 20 dph is still quite close to 30 dph in the present study, when compared to 50 dph in Friedrich et al, when autosomes predominate.

      Our bioRxiv preprint was originally posted in July 2021, prior to the publication of Friedrich et al., 2022; however, we had previously added to our discussion that several of our results are consistent with the observations of Friedrich et al.

      We have a different interpretation of the Z chromosome gene results in Friedrich et al. While the percentage of specialized genes from the Z chromosome decreased, the absolute number of specialized Z chromosome genes actually increased over this interval. In Fig. 3a from Friedrich et al. it appears that ~28% of Z chromosome genes were sexually dimorphic in their expression in RA at PHD20 but that ~39% of Z chromosome genes were similarly dimorphic at PHD50. We interpret this result as the Z chromosome genes being among the earliest genes differentially expressed between the sexes, not that their differential expression or role ever subsequently decreased. We have reworked this portion of the discussion to make our point clearer:

      “This model of sex chromosome influenced song system development is consistent with recent observations comparing male and female zebra finch transcriptomes from RA at young juvenile (PHD20) and young adult (PHD50) ages in un-manipulated birds (Friedrich et al. 2022)57. While that study proposes that the role of the sex chromosome in maintaining transcriptomic sex differences diminishes across development, as the proportion of specialized genes that originate on the sex chromosomes diminishes, this effect was driven by large increases in differentially expressed autosomal genes rather than by any reduction in sex chromosome dimorphism; the percentage of differentially expressed Z chromosome genes increased from PHD20 (28%) to PHD50 (39%) (Friedrich et al.). This leads us to conclude that sexually dimorphic Z chromosome expression at juvenile ages precedes the sexually dimorphic expression of the autosomes seen in adults. This is consistent with our hypothesis that sufficient expression of select Z chromosome gene products (GHR, etc.) is necessary for subsequent autosomal song system specializations (modG).”

      Further, when we write “When examining the module G HVC specialization induced by E2-treatment in female HVC, we surprisingly found that the most specialized genes were disproportionately from the Z chromosome” we are referring to the upregulation of module G by E2 in female HVC, not the sex difference described in RA by Friedrich et al., which only utilized untreated RA samples and thus is more likely related to our observations of module E.

      The term "sexual dimorphism" has been more traditionally used for sex differences that are very marked, like features that are highly regressed or absent in one sex, most often in females. Quantitative differences in gene expression, including dosage differences like those related to module E, are more appropriately described as sex differences rather than dimorphisms. That usage would be more consistent with most of the literature, and thus preferable.

      We did a Google search for common definitions and found rather the opposite: sexual dimorphism is used more often for differences of degree (with the zebra finch example as one of the top hits), and sex differences is often used for more absolute differences (like presence vs absence of the Y chromosome). Further, as in the reviewer’s first sentence, the definition of sexual dimorphism is a sex difference. That is, the two phrases can be interchangeable. Thus, we prefer to keep sexual dimorphism.

      Several references are incomplete or seem truncated, like 9 and 10.

      Fixed

      Table S2: Please examine and take into account the W gene curation presented in Table S3 of Friedrich et al., 2022.

      We have added additional supplementals (supplemetal_w_chrom_express.csv and supplemetal_z_chrom_express.csv) of the data provided in new Fig 5 incorporating the curation information from Table S3 from Friedrich et al.

      Data availability:

      Genes for all the main modules identified should be presented in a Supplemental Table, or through a link to a stable data repository.

      We have added an additional Supplemental Table supplemental_gene_module_assignment.csv with this information.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      The authors present valuable findings on trends in hind limb morphology throughout the evolution of titanosaurian sauropod dinosaurs, the land animals that reached the most remarkable gigantic sizes. The solid results include the use of 3D geometric morphometrics to examine the femur, tibia, and fibula to provide new information on the evolution of this clade and understand the evolutionary trends between morphology and allometry. Further justification of the ontogenetic stages of the sampled individuals would help strengthen the manuscript's conclusions, and the inclusion of additional large-body mass taxa could provide expanded insights into the proposed trends.

      Most of the analyzed specimens, especially those from the smaller taxa, are adult or subadult individuals. None exhibit features that may indicate juvenile status. However, we lack paleohistological information, which may be a stronger indicator of the ontogenetic status of an individual, and some of the operative taxonomic units used in the study are the mean shape of all the sampled specimens.

      Current information on morphological differences between adult and subadult or juvenile specimens indicates that even early juvenile specimens may share the same morphological features and overall morphology as the adult (e.g., see Curry-Rogers et al., 2016; Appendix S3). We included a comprehensive analysis of the impact of juvenile specimens, as one aspect of the intraspecific variability that may alter our results, in Appendix S3.

      Public Reviews:

      Reviewer #1:

      Weaknesses:

      Several sentences throughout the manuscript could benefit from citations. For example, the discussion of using hind limb centroid size as a proxy for body mass has no citations attributed. This should be cited or described as a new method for estimating body mass with data from extant taxa presented in support of this relationship. This particular instance is a very important point to include supporting documentation because the authors' conclusions about evolutionary trends in body size are predicated on this relationship.

      We address this issue in the text (Line 32 & 64). Centroid size seems a good indicator as it reflects the overall size of the entire hind limb, and the lengths of the femur and tibia are each well correlated with body size/mass. Also, as we use few landmarks, and only those that are purely type I or II landmarks, with curves of semilandmarks bounded by them, centroid size is not sensitive to differences in landmark number across the sample in our study (centroid size depends on the number of landmarks used in a study as well as on the physical dimensions of the specimens).
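      For readers unfamiliar with the measure, the sketch below computes centroid size under its standard definition, the square root of the summed squared distances of all landmarks from their centroid. The function name and coordinates are hypothetical and are not taken from our dataset; the example also illustrates why the number of landmarks matters, since duplicating every landmark inflates the value by a factor of sqrt(2) without any change in physical size.

```python
import math


def centroid_size(landmarks):
    """Square root of the summed squared distances of landmarks to their centroid."""
    n = len(landmarks)
    dims = len(landmarks[0])
    centroid = [sum(point[d] for point in landmarks) / n for d in range(dims)]
    return math.sqrt(
        sum((point[d] - centroid[d]) ** 2 for point in landmarks for d in range(dims))
    )


# Hypothetical 3D landmark configuration (arbitrary units).
config = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]

base = centroid_size(config)
# Doubling every coordinate doubles centroid size: it tracks physical scale.
scaled = centroid_size([tuple(2 * c for c in point) for point in config])
# Doubling the landmark count at the same physical size also changes it.
denser = centroid_size(config + config)
```

      This is why comparisons of centroid size are only meaningful among specimens digitized with the same landmark configuration.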

      We have repeated all the analyses using other proxies, such as femoral length and the body mass estimated with the Campione & Evans (2020) and Mazzetta et al. (2004) methods. The comprehensive description of the method is in Appendix S2; the alternative analyses can be accessed in Appendices S3 and S4; and the code for the alternative analyses can be accessed in the modified Appendix S5. All offer results similar to those obtained in our analyses with body size proxied by the centroid size of the hind limb landmark configuration.

      An additional area of concern is the lack of any discussion of taphonomic deformation in Section 3.3 Caveats of This Study, the results, or the methods. The authors provide a long and detailed discussion of taphonomic loss and how this study does a good job of addressing it; however, taphonomic deformation to specimens and its potential effects on the ensuing results were not addressed at all. Hedrick and Dodson (2013) highlight that, with fossils, a PCA typically includes the effects of taphonomic deformation in addition to differences in morphology, which results in morphometric graphs representing taphomorphospaces. For example, in this study, the extreme negative positioning of Dreadnoughtus on PC 2 (which the authors highlight as "remarkable") is almost certainly the result of taphonomic deformation to the distal end of the holotype femur, as noted by Ullmann and Lacovara (2016).

      We included a brief commentary in the Caveats of This Study (Line 467) and greatly expanded on this issue in Appendix S3. We followed the methodology proposed by Lefebvre et al. (2020) to discuss the effects of taphonomic deformation in the shape analyses.

      Our shape variables (PCs obtained from the shape PCA) should be viewed as taphomorphospaces, as Hedrick and Dodson, as well as the reviewer, point out for such cases.

      The analysis of the effects of taphonomy, or of errors induced by the landmark estimation method, indicates that Dreadnoughtus schrani is one of the few sampled taxa that may have a noticeable impact on our analyses due to lithostatic deformation. Other taxa like Mendozasaurus neguyelap or Ampelosaurus atacis may also induce some alterations to the PCs. In general, the PCs slightly altered by taphonomy did not exhibit phylogenetic signal and account for a small proportion of the sample variance; D. schrani is the only sauropod that may alter an entire PC, namely PC2.

      The authors investigated 17 taxa and divided them into 9 clades, with only Titanosauria and Lithostrotia including more than two taxa (and four clades are only represented by one taxon). While some of these clades represent the average of multiple individuals, the small number of plotted taxa can only weakly support trends within Titanosauria. If similar general trends could be found when the taxa are parsed into fewer, more inclusive clades, it would support and strengthen their claims. Of course, the authors can only study what is preserved in the fossil record, and titanosaurian remains are often highly fragmentary; these deficiencies should therefore not be held against the authors. They clearly put effort and thought into their choices of taxa to include in this study, but there are limitations arising from this low sample size that inherently limit the confidence that can be placed on their conclusions, and this caveat should be more clearly discussed. Specifically, the authors note that their dataset contains many lithostrotians, but they do not discuss unevenness in body size sampling. As neither their size-category boundaries nor the taxa which fall into each of them are clearly stated, the reader must parse the discussion to glean which taxa are in each size category. It should be noted that the authors include both Jainosaurus and Dreadnoughtus as 'large' taxa even though the latter is estimated to have been roughly five times the body mass of the former, making Dreadnoughtus the only taxon included in this extreme size category. The effects that this may have on body size trends are not discussed. Additionally, few taxa between the body masses of Jainosaurus and Dreadnoughtus have been included even though the hind limbs of several such macronarians have been digitized in prior studies (such as Diamantinasaurus and Giraffatitan; Klinkhamer et al. 2018). Also, several members of Colossosauria are more similar in general body size to Dreadnoughtus than Jainosaurus, but unfortunately, they do not preserve a known femur, tibia, and fibula, so the authors could not include them in this study. Exclusion of these taxa may bias inferences about body size evolution, and this is a sampling caveat that could have been discussed more clearly. Future studies including these and other taxa will be important for further evaluating the hypotheses about macronarian evolution advanced by Páramo et al. in this study.

      Sadly, we could not include some larger-sized titanosaurian sauropods. As the reviewer points out, the lack of larger sauropods among the sampled taxa may hinder our results, as the “large-bodied” category is filled with some mid-sized taxa plus Dreadnoughtus schrani, which is five times larger than some of them. We tried to include Elaltitan lilloi, digitized for this study and included in preliminary analyses, but its fragmentary status greatly increased the error of the estimation method, as only a proximal third or mid femur is preserved from this taxon. Therefore, we opted to exclude it from our database.

      Other taxa suggested by the reviewer were not readily available to the authors at the time this study was conducted, and including them now might increase the possible bias of our study. Giraffatitan brancai is a Late Jurassic brachiosaurid, which would again increase the number of early-branching titanosauriforms with large body masses, while most of the smaller taxa sampled are recovered as deeply-branching macronarians (including Diamantinasaurus matildae, had we also included it). Future analyses may include a wider sample of the mid to large-bodied titanosaurians, especially lithostrotians, as well as some colossosaurs like Patagotitan mayorum.

      Reviewer #1 (Recommendations For The Authors):

      These are all minor comments that would improve the manuscript.

      - There are a few typos throughout the manuscript such as: line 70 should be 2016 and line 242 should be forelimb.

      Corrected.

      - To me, the most interesting aspect of your study is the diversity and trends recovered in titanosaurian subclades and I would highlight this, not gigantism, in the title if you choose to revise the title.

      This has been addressed. The specificity of some of the tests and their implications for the acquisition of the spread limb posture and gigantism in early-branching taxa are nonetheless important, so we think that gigantism should remain in the title.

      - The abstract should provide more details on the results such as none of the listed trends were statistically significant.

      Many of the trends exhibit phylogenetic signal, but not the allometric components. We have briefly addressed them.

      - Several sentences in the manuscript need citations such as: line 48 the reference to other megaherbivores, line 66 the discussion of poor understanding of the relationship of wide gauge posture and gigantism, and the use of centroid size as an estimate of body mass (see Public Review).

      We changed line 66 to sharpen the focus on the current state of the art regarding the hypothesized relationship between arched limbs and increasing body size. We included a section on centroid size as a proxy (owing to the good correlation between femur and tibia length and body mass) and the caveats of using it. We also expanded the discussion of centroid size and the alternative models in Appendix S2.

      - With titanosaur evolution, you mention that they are adapting to new niches and topography (line 64). What support is there for this versus they are adapting to be more successful in their current environment?

      Noted. We have changed the phrase to refer to improved efficiency in exploiting inland environments, as they could be either opening new inland niches or adapting better to existing inland niches that were already exploited by less deeply-branching sauropods. However, testing this is beyond the scope of the current work.

      - Line 384-385: the discussion of Rapetosaurus should mention that it is a juvenile and some studies have suggested that titanosaur limbs grow allometrically.

      We have added a short sentence. Whether or not Rapetosaurus krausei exhibits allometric growth may not greatly change the discussion, other than perhaps excluding it as morphologically convergent with Lirainosaurus and Muyelensaurus. If so, it would be further evidence that small-sized titanosaurs exhibit the robust skeleton expected in giant titanosaurs.

      - I would consider addressing the question of if we are certain enough in our understanding of titanosaurian phylogeny to rule out homology, especially when you discuss the uncertainty of the placement of specific taxa. Also, Diamantinasaurus is not the only titanosaur that has been proposed as a member of both basal and more derived subclades (e.g., Dreadnoughtus).

      We tried to take a more conservative approach. We cannot fully rule out that some of the features observed in the sampled deeply-branching lithostrotians, especially saltasauroids, are present in the entire somphospondylan lineage. However, none of the less deeply-branching or early-branching titanosaurs exhibit this kind of morphology. Recent studies propose that entire groups included in this study, such as Colossosauria, may change position in the phylogeny. However, despite the debated phylogenetic position of Diamantinasaurus or Dreadnoughtus, or even the inclusion of Colossosauria within the saltasauroids and of the Ibero-Armorican lithostrotians as putative saltasaurids (Mocho et al. 2024), we did not notice any relevant differences in our conclusions about hind limb arched morphology or size when these changes were considered. The overall robustness of the distal hind limb should indeed be addressed in light of shifts in phylogenetic position, with a sample that includes interesting sauropods such as Diamantinasaurus and expands the large-sized Colossosauria or early-branching somphospondylans, as it may have profound implications for morphofunctional adaptations to specific feeding niches (see, e.g., current hypotheses about rearing in Bates et al. 2016, Ullmann et al. 2017 or Vidal et al. 2020). With our current sample and the debated titanosaurian phylogeny, we did not have enough information to conclude the presence of any plesiomorphic condition or analogous feature.

      - I understand this is not standard in the field, but your study provides the opportunity to conduct sensitivity testing of the effects of cartilage thickness and user articulation of the bones on PCA results. This would be an insightful addition to the field of GMM.

      We are currently developing such a comprehensive analysis, along with its implications for our past results. However, we feel that it is beyond the scope of the current study. We nonetheless appreciate the suggestion, as it would be a sensitivity test of the impact of several of our assumptions on the final results, which is often not considered.

      - In Figure 1, if all the limbs were arranged the same way it would be easier to interpret. Consider flipping panels B and D to match A and C.

      Accepted.

      - In Figures 2-4, the views in C should be labeled in the figure or caption. Oceanotitan is also in the PCA plot but not included in the figure caption. Also, consider changing the names to represent the paraphyletic groupings you are using instead of formal clade names. For example, change 'Titanosauria' to 'Basal Titanosaurs' to reflect that it is not including all titanosaurs in the sample.

      Changes accepted for the shape PCA results. The informal (i.e., paraphyletic) terms such as “Basal Titanosaurs” were used only in the shape analyses because, in the RMA, Titanosauria (and other more inclusive groups) were used as natural groups. Each partial RMA model is based on a sample of all the taxa that are included within that particular clade (e.g., Titanosauria includes both Dreadnoughtus and Saltasaurus; Lithostrotia excludes the former).

      - I am concerned that centroid size does not scale evenly across the wide-ranging body mass of titanosaurs. I do not know if this affects your size trends or their significance, but as I mentioned above Dreadnoughtus is much bigger than most of the taxa included and that isn't as drastically apparent in centroid size (in Figure 5) as it is when taxa are plotted by body mass.

      The main problem with the centroid size of the hind limb is the shift in the body plan of deeply-branching titanosaurs, in which the center of mass is displaced toward the anterior portion of the body, a shift that has been attributed to the large development of the forelimb region (e.g., Bates et al. 2016). However, this would only increase the effects of the phyletic body size reduction, as smaller taxa tend to have a 1:1 forelimb to hind limb ratio (e.g., our past analyses in Páramo et al. 2019) and their sacrum is not as beveled as in earlier somphospondylans (e.g., Vidal et al. 2020). The role of the low-browsing feeding habits of deeply-branching lithostrotians should be explored elsewhere, as it may be the main driving force of this effect. Our point is that the proxy used may carry a slight offset for some high-browsing, giant, early-branching titanosaurs, whose greater development of the cranial region increases their body size and mass beyond our bare-minimum estimate based on the hind limb region. Overall, however, this offset is assumed to be low. We repeated the analyses with femoral length as a proxy of body size and with a mass estimation, including the quadratic equation based on both humeral and femoral lengths, and the results remain similar. Another problem that arises with the use of centroid size is the way it is calculated, but as we used an equal number of landmarks and curve semilandmarks across specimens, all of them tied to anatomical features, it remains comparable at least within our sample (although it cannot be extrapolated to other geometric morphometric studies that do not use the same configurations).

      We nonetheless appreciate the reviewer's concerns, as this was one of our own when designing this study. In the future we will try to expand the analyses, and we advise anyone expanding on this study to do so, using total body size/volume estimations following Bates et al. (2016), which also includes tests of the effects of the different whole-body estimation models.
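
      As an illustration of why the landmark configuration matters for the centroid size discussed above, the following is a minimal sketch in Python. It assumes the standard geometric morphometric definition (the square root of the summed squared distances of all landmarks to their centroid). The function name and the cube-shaped configuration are hypothetical and are not taken from the study or its appendices.

```python
import math

def centroid_size(landmarks):
    """Centroid size: square root of the summed squared distances
    of every landmark to the centroid of the configuration."""
    k = len(landmarks)
    dims = len(landmarks[0])
    # Centroid = per-axis mean of the landmark coordinates
    centroid = [sum(p[d] for p in landmarks) / k for d in range(dims)]
    return math.sqrt(
        sum((p[d] - centroid[d]) ** 2 for p in landmarks for d in range(dims))
    )

# Hypothetical 3D configuration: the eight corners of a unit cube
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]

cs_unit = centroid_size(cube)
# Same shape, three times larger in every linear dimension
cs_scaled = centroid_size([(3 * x, 3 * y, 3 * z) for x, y, z in cube])
# Same shape and size, but sampled with twice as many points
cs_dense = centroid_size(cube + cube)

print("size ratio after scaling:", cs_scaled / cs_unit)
print("size ratio after doubling the landmarks:", cs_dense / cs_unit)
```

      Under this definition, centroid size scales linearly with the object, but it also grows with the number of points even when the object itself is unchanged, which is why values are only comparable between specimens digitized with the same landmark and semilandmark configuration.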

      Cites:

      Bates KT, Mannion PD, Falkingham PL, Brusatte SL, Hutchinson JR, Otero A, Sellers WI, Sullivan C, Stevens KA, Allen V. 2016. Temporal and phylogenetic evolution of the sauropod dinosaur body plan. Royal Society Open Science 3:150636. doi:10.1098/rsos.150636

      Mocho P, Escaso F, Marcos-Fernández F, Páramo A, Sanz JL, Vidal D, Ortega F. 2024. A Spanish saltasauroid titanosaur reveals Europe as a melting pot of endemic and immigrant sauropods in the Late Cretaceous. Commun Biol 7:1016. doi:10.1038/s42003-024-06653-0

      Páramo A, Ortega F, Sanz JL. 2019. A Niche Partitioning Scenario for the Titanosaurs of Lo Hueco (Upper Cretaceous, Spain). International Congress of Vertebrate Morphology (ICVM) - Abstract Volume, Journal of Morphology. Prague. p. S197.

      Ullmann PV, Bonnan MF, Lacovara KJ. 2017. Characterizing the Evolution of Wide-Gauge Features in Stylopodial Limb Elements of Titanosauriform Sauropods via Geometric Morphometrics. The Anatomical Record 300:1618–1635. doi:10.1002/ar.23607

      Vidal D, Mocho P, Aberasturi A, Sanz JL, Ortega F. 2020. High browsing skeletal adaptations in Spinophorosaurus reveal an evolutionary innovation in sauropod dinosaurs. Sci Rep 10:6638. doi:10.1038/s41598-020-63439-0

      Reviewer #2:

      The authors report a quantitative comparative study regarding hind limb evolution among titanosaurs. I find the conclusions and findings of the manuscript interesting and relevant. The strength of the paper would be increased if the authors were to improve their reporting of taxon sampling and their discussion of age estimation and the potential implications that uncertainty in these estimates would have for their conclusions regarding gigantism (vs. ontogenetic patterns).

      Considering the observations made by Reviewer #1, we included data on the impact of ontogenetic patterns and other intraspecific variability in Appendix S3. We considered increasing the sample, but this was not possible at the time this study was carried out.

      Reviewer #2 (Recommendations For The Authors):

      I have a few concerns/requests for the authors, that I hope can be easily resolved.

      Comments:

      - What drove taxon sampling?

      Taxon sampling was a random sampling of somphospondylan sauropods, focused on the clade Lithostrotia, carried out for the thesis project of one of the authors (APB). Logistics were also a source of bias in our sample. Based on the available titanosaurian material, we left out several macronarians that had already been sampled, because they would further reinforce an early-branching large sauropod versus deeply-branching small sauropod pattern that may alter our results.

      - Which phylogenies were used to create the supertree applied to the analyses? What references were used to time-calibrate the tips and deeper nodes? I couldn't find any reference to this. Additionally, more information regarding the R packages and analytical pipeline would be appreciated: e.g. were measurements used in the analyses log-transformed?

      A comprehensive description of the methodology is provided in Appendix S2.

      - Age estimate: can the author confirm the skeletal maturity of the sampled individuals? If this is not the case, how can the author be sure that the patterns towards gigantism are not reflecting different ontogenetic stages? I believe this should be part of both methods and discussion.

      As noted previously, we excluded small, probably juvenile specimens from our sample. We have no paleohistological samples to support the ontogenetic status of some of the specimens that were included or excluded when calculating the mean shape for the operational taxonomic units. However, we followed a set of criteria to identify relative ontogenetic status, which are described in Appendix S3.

      - The authors used the centroid size for regressions in Figure 6. Although I believe that this is a good variable, would the author be willing to use body mass and log-transformed femur length in addition to what was done? These would be very useful considering that these variables are (relatively) independent from shape/morphology.

      Accepted. We tested our hypotheses with three alternative models based on femoral length and on combined femoral and humeral lengths for body mass estimations. The methodology can be found in Appendix S2, the results in Appendix S4, and the code for the alternative methods in Appendix S5.
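
      For readers unfamiliar with the approach, the sketch below shows how a reduced major axis (RMA) regression against log-transformed femoral length could be set up in Python. It is a minimal illustration under stated assumptions, not the code provided in Appendix S5: the function name, the femoral lengths and the shape scores are all hypothetical values invented for the example.

```python
import math
import statistics

def rma_fit(x, y):
    """Reduced major axis regression of y on x.
    Returns (slope, intercept, r), where the slope is sd(y) / sd(x)
    carrying the sign of the Pearson correlation r."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    # Sample covariance, used only to obtain the sign and value of r
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    r = cov / (sx * sy)
    slope = math.copysign(sy / sx, r)
    intercept = my - slope * mx
    return slope, intercept, r

# Hypothetical femoral lengths (mm) and hypothetical shape scores (e.g. a PC axis)
femur_length_mm = [620.0, 810.0, 1040.0, 1350.0, 1910.0]
shape_score = [-0.041, -0.018, 0.004, 0.020, 0.047]

# Log-transform the size proxy before fitting, as the reviewer requested
log_length = [math.log10(v) for v in femur_length_mm]
slope, intercept, r = rma_fit(log_length, shape_score)
print(f"slope={slope:.4f} intercept={intercept:.4f} r={r:.3f}")
```

      Unlike ordinary least squares, RMA treats both variables as measured with error, which is one reason it is commonly preferred in allometric studies where neither size nor shape is a true independent variable.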

      - Data access: will .stl files of the limb elements be shared and made freely available? If so, where will the files be deposited?

      At the time of the current study, some of the sampled specimens cannot be made available (the material is under study), but the mean shapes can be generated from the landmarks, the semilandmark curves and the “atlas” mesh.

      - Additionally, outstanding references regarding limb evolution, GMM, role of ontogeny, and evolution of columnar gait are missing. The authors should reinforce the literature review with the following (alphabetical order):

      Bonnan, M. F. (2003). The evolution of manus shape in sauropod dinosaurs: implications for functional morphology, forelimb orientation, and phylogeny. Journal of Vertebrate Paleontology, 23(3), 595-613.

      Botha, J., Choiniere, J. N., & Benson, R. B. (2022). Rapid growth preceded gigantism in sauropodomorph evolution. Current Biology, 32(20), 4501-4507.

      Curry Rogers, K., Whitney, M., D'Emic, M., & Bagley, B. (2016). Precocity in a tiny titanosaur from the Cretaceous of Madagascar. Science, 352(6284), 450-453.

      Day, J. J., Upchurch, P., Norman, D. B., Gale, A. S., & Powell, H. P. (2002). Sauropod trackways, evolution, and behavior. Science, 296(5573), 1659-1659.

      Fabbri, M., Navalón, G., Benson, R. B., Pol, D., O'Connor, J., Bhullar, B. A. S., ... & Ibrahim, N. (2022). Subaqueous foraging among carnivorous dinosaurs. Nature, 603(7903), 852-857.

      Fabbri, M., Navalón, G., Mongiardino Koch, N., Hanson, M., Petermann, H., & Bhullar, B. A. (2021). A shift in ontogenetic timing produced the unique sauropod skull. Evolution, 75(4), 819-831.

      González Riga, B. J., Lamanna, M. C., Ortiz David, L. D., Calvo, J. O., & Coria, J. P. (2016). A gigantic new dinosaur from Argentina and the evolution of the sauropod hind foot. Scientific Reports, 6(1), 19165.

      Lefebvre, R., Allain, R., & Houssaye, A. (2023). What's inside a sauropod limb? First three‐dimensional investigation of the limb long bone microanatomy of a sauropod dinosaur, Nigersaurus taqueti (Neosauropoda, Rebbachisauridae), and implications for the weight‐bearing function. Palaeontology, 66(4), e12670.

      McPhee, B. W., Benson, R. B., Botha-Brink, J., Bordy, E. M., & Choiniere, J. N. (2018). A giant dinosaur from the earliest Jurassic of South Africa and the transition to quadrupedality in early sauropodomorphs. Current Biology, 28(19), 3143-3151.

      Martin Sander, P., Mateus, O., Laven, T., & Knötschke, N. (2006). Bone histology indicates insular dwarfism in a new Late Jurassic sauropod dinosaur. Nature, 441(7094), 739-741.

      Remes, K. (2008). Evolution of the pectoral girdle and forelimb in Sauropodomorpha (Dinosauria, Saurischia): osteology, myology and function (Doctoral dissertation, München, Univ., Diss., 2008).

      Sander, P. M., & Clauss, M. (2008). Sauropod gigantism. Science, 322(5899), 200-201.

      Yates, A. M., & Kitching, J. W. (2003). The earliest known sauropod dinosaur and the first steps towards sauropod locomotion. Proceedings of the Royal Society of London. Series B: Biological Sciences, 270(1525), 1753-1758.

      We appreciate this suggestion. We already used some of these articles in our study, but the selection of citations was also constrained by the manuscript space available under the editorial guidelines. We would have liked to include several of these works, but we opted to cite works that summarize some of them and to exclude others.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      In their manuscript, Gomez-Frittelli and colleagues characterize the expression of cadherin6 (and -8) in colonic IPANs of mice. Moreover, they found that these cdh6-expressing IPANs are capable of initiating colonic motor complexes in the distal colon, but not proximal and midcolon. They support their claim by morphological, electrophysiological, optogenetic, and pharmacological experiments.

      Strengths:

      The work is very impressive and involves several genetic models and state-of-the-art physiological setups including respective controls. It is a very well-written manuscript that truly contributes to our understanding of GI-motility and its anatomical and physiological basis. The authors were able to convincingly answer their research questions with a wide range of methods without overselling their results.

      We greatly appreciate the reviewer’s time, careful reading and support of our study.

      Weaknesses:

      The authors put quite some emphasis on stating that cdh6 is a synaptic protein (in the title and throughout the text), which interacts in a homophilic fashion. They deduce that cdh6 might be involved in IPAN-IPAN synapses (line 247ff.). However, Cdh6 does not only interact in synapses and is expressed by non-neuronal cells as well (see e.g., expression in the proximal tubuli of the kidney). Moreover, cdh6 does not only build homodimers, but also heterodimers with Cdh9 as well as Cdh7, -10, and -14 (see e.g., Shimoyama et al. 2000, DOI: 10.1042/02646021:3490159). It would therefore be interesting to assess the expression pattern of cdh6 proteins using immunostainings in combination with synaptic markers to substantiate the authors' claim or at least add the possibility of cell-cell-interactions other than synapses to the discussion. Additionally, an immunostaining of cdh6 would confirm if the expression of tdTomato in smooth muscle cells of the cdh6-creERT model is valid or a leaky expression (false positive).

      We agree with the reviewer that Cdh6 could be mediating some other cell-cell interaction besides synapses between IPANs, and we noted it in the discussion. Cdh6 primarily forms homodimers but, as the reviewer points out, has been known to also form heterodimers with some other cadherins. We performed RNAscope in the colonic myenteric plexus with Cdh7 and found no expression (data not shown). Cdh10 is suggested to have very low expression (Drokhlyansky et al., 2020), possibly in putative secretomotor vasodilator neurons, and Cdh14 has not been assayed in any RNAseq screens. We attempted to visualize Cdh6 protein via antibody staining (Duan et al., 2018) but our efforts did not result in sufficient signal or resolution to identify synapses in the ENS, which remain broadly challenging to assay. Similarly, neither immunostaining with Cdh6 antibody nor RNAscope was able to confirm Cdh6 expression in tdT-expressing muscle cells. We have addressed these caveats in the discussion section.

      (1) E. Drokhlyansky, C. S. Smillie, N. V. Wittenberghe, M. Ericsson, G. K. Griffin, G. Eraslan, D. Dionne, M. S. Cuoco, M. N. Goder-Reiser, T. Sharova, O. Kuksenko, A. J. Aguirre, G. M. Boland, D. Graham, O. Rozenblatt-Rosen, R. J. Xavier, A. Regev, The Human and Mouse Enteric Nervous System at Single-Cell Resolution. Cell 182, 1606-1622.e23 (2020).

      (2) X. Duan, A. Krishnaswamy, M. A. Laboulaye, J. Liu, Y.-R. Peng, M. Yamagata, K. Toma, J. R. Sanes, Cadherin Combinations Recruit Dendrites of Distinct Retinal Neurons to a Shared Interneuronal Scaffold. Neuron 99, 1145-1154.e6 (2018).

      Reviewer #2 (Public review):

      Summary:

      Intrinsic primary afferent neurons are an interesting population of enteric neurons that transduce stimuli from the mucosa, initiate reflexive neurocircuitry involved in motor and secretory functions, and modulate gut immune responses. The morphology, neurochemical coding, and electrophysiological properties of these cells have been relatively well described in a long literature dating back to the late 1800's but questions remain regarding their roles in enteric neurocircuitry, potential subsets with unique functions, and contributions to disease. Here, the authors provide RNAscope, immunolabeling, electrophysiological, and organ function data characterizing IPANs in mice and suggest that Cdh6 is an additional marker of these cells.

      Strengths:

      This paper would likely be of interest to a focused enteric neuroscience audience and increase information regarding the properties of IPANs in mice. These data are useful and suggest that prior data from studies of IPANs in other species are likely translatable to mice.

      We appreciate the reviewer’s support of our study and insightful critiques for its improvement.

      Weaknesses:

      The advance presented here beyond what is already known is minimal. Some of the core conclusions are overstated and there are multiple other major issues that limit enthusiasm. Key control experiments are lacking and data do not specifically address the properties of the proposed Cdh6+ population.

      Major weaknesses:

      (1) The novelty of this study is relatively low. The main point of novelty suggests an additional marker of IPANs (Cdh6) that would add to the known list of markers for these cells. How useful this would be is unclear. Other main findings basically confirm that IPANs in mice display the same classical characteristics that have been known for many years from studies in guinea pigs, rats, mice and humans.

      We acknowledge the existing markers for IPANs in the ENS and the existing literature characterizing these neurons. The primary intent of this study was to use these well-established characteristics of IPANs in both mice and other species to characterize Cdh6-expressing neurons in the mouse myenteric plexus and confirm their classification as IPANs.

      (2) Some of the main conclusions of this study are overstated and claims of priority are made that are not true. For example, the authors state in lines 27-28 of the abstract that their findings provide the "first demonstration of selective activation of a single neurochemical and functional class of enteric neurons". This is certainly not true since Gould et al (AJP-GIL 2019) expressed ChR2 in nitrergic enteric neurons and showed that activating those cells disrupted CMC activity. In fact, prior work by the authors themselves (Hibberd et al., Gastro 2018) showed that activating calretinin neurons with ChR2 evoked motor responses. Work by other groups has used chemogenetics and optogenetics to show the effects of activating multiple other classes of neurons in the gut.

      We thank the reviewer for bringing up this important point and apologize if our wording was not clear. Whilst single neurochemical classes of enteric neurons have been manipulated to alter gut functions, none of these instances to date represents manipulation of a single functional class of enteric neurons. In the given examples, multiple functional classes utilizing the same neurotransmitter are activated, as NOS and calretinin are each expressed to varying degrees across putative motor neurons, interneurons and IPANs. In contrast, Cdh6 is restricted to IPANs and therefore this study is the first optogenetic investigation of enteric neurons from a single putative functional class. Our abstract and discussion emphasize this point and differentiate this study from previous ones.

      (3) Critical controls are needed to support the optogenetic experiments. Control experiments are needed to show that ChR2 expression a) does not change the baseline properties of the neurons, b) that stimulation with the chosen intensity of light elicits physiologically relevant responses in those neurons, and c) that stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions focused on here.

      We completely agree that controls are essential. However, our paper is not the first to express ChR2 in enteric neurons. Authors of our paper showed in Hibberd et al. (2018) that expression of ChR2 in a heterogeneous population of myenteric neurons did not change the network properties of the myenteric plexus. This was demonstrated by the lack of change in control CMC characteristics in mice expressing ChR2 under basal conditions (without blue light exposure). Regarding point (b), that stimulation with the chosen intensity of light should be shown to elicit physiologically relevant responses in those neurons, we show the restricted expression of ChR2 in IPANs and that motor responses (to blue light) are blocked by selective nerve conduction blockade.

      Regarding point (c), that our study should demonstrate that stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions, we would not expect each region of the gut to behave comparably. The different gut regions (i.e., proximal, mid, distal) are very different anatomically, as is the anatomy of the myenteric plexus and myenteric ganglia in each region, including the density of IPANs within each ganglion, in addition to the presence of different patterns of electrical and mechanical activity [Spencer et al., 2020]. Hence, it is difficult to expect that stimulation of ChR2 should induce similar physiological responses across regions. The motor output we record in our study (CMCs) is a unified motor program that involves the temporal coordination of hundreds of thousands of enteric neurons and a complex neural circuit that we have previously characterized [Spencer et al., 2018]. However, no study until now has been able to selectively stimulate a single functional class of enteric neurons (with light) and so avoid indiscriminate activation of other classes of neurons.

      (1) T. J. Hibberd, J. Feng, J. Luo, P. Yang, V. K. Samineni, R. W. Gereau, N. Kelley, H. Hu, N. J. Spencer, Optogenetic Induction of Colonic Motility in Mice. Gastroenterology 155, 514-528.e6 (2018).

      (2) N. J. Spencer, L. Travis, L. Wiklendt, T. J. Hibberd, M. Costa, P. Dinning, H. Hu, Diversity of neurogenic smooth muscle electrical rhythmicity in mouse proximal colon. American Journal of Physiology-Gastrointestinal and Liver Physiology 318, G244–G253 (2020).

      (3) N. J. Spencer, T. J. Hibberd, L. Travis, L. Wiklendt, M. Costa, H. Hu, S. J. Brookes, D. A. Wattchow, P. G. Dinning, D. J. Keating, J. Sorensen, Identification of a Rhythmic Firing Pattern in the Enteric Nervous System That Generates Rhythmic Electrical Activity in Smooth Muscle. The Journal of Neuroscience 38, 5507–5522 (2018).

      (4) The electrophysiological characterization of mouse IPANs is useful but this is a basic characterization of any IPAN and really says nothing specifically about Cdh6+ neurons. The electrophysiological characterization was also only done in a small fraction of colonic IPANs, and it is not clear if these represent cell properties in the distal colon or proximal colon, and whether these properties might be extrapolated to IPANs in the different regions. Similarly, blocking IH with ZD7288 affects all IPANs and does not add specific information regarding the role of the proposed Cdh6+ subtype.

      Our electrophysiological characterization was targeted to a subset of Cdh6+ neurons identified by Hb9:GFP expression. As noted in our response to comment (1) above, we used these experiments to confirm the classification of Cdh6+ (Hb9:GFP+) neurons in the distal colon as IPANs. We have clarified in the results and methods that these experiments were performed in the distal colon and agree that we cannot extrapolate that these properties are also representative of IPANs in the proximal colon. We apologize that this was confusing. Finally, we agree with the reviewer that ZD7288 affects all IPANs in the ENS and have clarified this in the text.

      (5) Why SMP IPANs were not included in the analysis of Cdh6 expression is a little puzzling. IPANs are present in the SMP of the small intestine and colon, and it would be useful to know if this proposed marker is also present in these cells.

      We agree with the reviewer. In addition to characterizing Cdh6 in the myenteric plexus, it would be interesting to query if sensory neurons located within the SMP also express Cdh6. Our preliminary data (n=2) show ~6-12% tdT/Hu neurons in Cdh6-tdT ileum and colon (data not shown). We have added a sentence to the discussion.

      (6) The emphasis on IH being a rhythmicity indicator seems a bit premature. There is no evidence to suggest that IH and IT are rhythm-generating currents in the ENS.

      We agree with the reviewer that rhythm generation by IH and IT in the ENS has not been explicitly confirmed. We are confident the reviewer agrees that an absence of evidence is not evidence of absence, and the presence of IH has been well described in enteric neurons. We have modified the text in the results to indicate more clearly that IH and IT are known to participate in rhythm generation in thalamocortical circuits, though their roles in the ENS remain unknown. Our treatment of the potential role of IH or IT in rhythm generation or oscillatory firing in the ENS is confined to speculation in the discussion section of the text.

      (7) As the authors point out in the introduction and discuss later on, Type II Cadherins such as Cdh6 bind homophillically to the same cadherin at both pre- and post-synapse. The apparent enrichment of Cdh6 in IPANs would suggest extensive expression in synaptic terminals that would also suggest extensive IPAN-IPAN connections unless other subtypes of neurons express this protein. Such synaptic connections are not typical of IPANs and raise the question of whether or not IPANs actually express the functional protein and if so, what might be its role. Not having this information limits the usefulness of this as a proposed marker.

      We agree with the reviewer that the proposed IPAN-IPAN connection is novel although it has been proposed before (Kunze et al., 1993). As detailed in our response to Reviewer #1, we attempted to confirm Cdh6 protein expression, but were unsuccessful, due to insufficient signal and resolution. We therefore discuss potential IPAN interconnectivity in the discussion, in the context of contrasting literature.

      (1) W. A. A. Kunze, J. B. Furness, J. C. Bornstein, Simultaneous intracellular recordings from enteric neurons reveal that myenteric AH neurons transmit via slow excitatory postsynaptic potentials. Neuroscience 55, 685–694 (1993).

      (8) Experiments shown in Figures 6J and K use a tethered pellet to drive motor responses. By definition, these are not CMCs as stated by the authors.

      The reviewer makes a valid criticism of the terminology, since tethered pellet experiments do not record propagation. We believe the periodic bouts of propulsive force on the pellet are triggered by the same activity underlying the CMC. In our experience, these activities have similar periodicity and force, and identical pharmacological properties. Consistent with this, we also tested full colons (n = 2) set up for typical CMC recordings by multiple force transducers, finding that CMCs were abolished by ZD7288, similar to fixed pellet recordings (data not shown).

      (9) The data from the optogenetic experiments are difficult to understand. How would stimulating IPANs in the distal colon generate retrograde CMCs and stimulating IPANs in the proximal colon do nothing? Additional characterization of the Cdh6+ population of cells is needed to understand the mechanisms underlying these effects.

      We agree that the different optogenetic responses in the proximal and distal colon are challenging to interpret, but perhaps not surprising in the wider context. It is possible that the different optogenetic responses in this study reflect not only regional differences in the Cdh6+ neuronal populations, but also differences in neural circuits within these gut regions. A study some time ago by the authors showed that electrical stimulation of the proximal mouse colon was unable to evoke a retrograde (aborally) propagating CMC (Spencer and Bywater, 2002), but stimulation of the distal colon was readily able to. We concluded that at the oral lesion site there is a preferential bias of descending inhibitory nerve projections, since the ascending excitatory pathways have been cut off. In contrast, stimulation of the distal colon was readily able to activate an ascending excitatory neural pathway, and hence induce the complex CMC circuits required to generate an orally propagating CMC. Indeed, other recent studies have added to a growing body of evidence for significant differences in the behaviors and neural circuits of the two regions (Li et al., 2019, Costa et al., 2021a, Costa et al., 2021b, Nestor-Kalinoski et al., 2022). We have expanded this discussion.

      (1) N. J. Spencer, R. A. Bywater, Enteric nerve stimulation evokes a premature colonic migrating motor complex in mouse. Neurogastroenterology & Motility 14, 657–665 (2002).

      (2) Li Z, Hao MM, Van den Haute C, Baekelandt V, Boesmans W, Vanden Berghe P, Regional complexity in enteric neuron wiring reflects diversity of motility patterns in the mouse large intestine. Elife 8:e42914 (2019).

      (3) Costa M, Keightley LJ, Hibberd TJ, Wiklendt L, Dinning PG, Brookes SJ, Spencer NJ, Motor patterns in the proximal and distal mouse colon which underlie formation and propulsion of feces. Neurogastroenterology & Motility e14098 (2021a).

      (4) Costa M, Keightley LJ, Hibberd TJ, Wiklendt L, Smolilo DJ, Dinning PG, Brookes SJ, Spencer NJ, Characterization of alternating neurogenic motor patterns in mouse colon. Neurogastroenterology & Motility 33:e14047 (2021b).

      (5) Nestor-Kalinoski A, Smith-Edwards KM, Meerschaert K, Margiotta JF, Rajwa B, Davis BM, Howard MJ, Unique Neural Circuit Connectivity of Mouse Proximal, Middle, and Distal Colon Defines Regional Colonic Motor Patterns. Cellular and Molecular Gastroenterology and Hepatology 13:309-337.e303 (2022).

      Recommendations for the Authors:

      Reviewer #1 (Recommendations for the authors):

      As mentioned above, immunolocalization of cdh6 would be helpful to substantiate the claims regarding IPAN-IPAN synapses.

      As mentioned in our response to both reviewers’ public reviews, we attempted to visualize Cdh6 protein via antibody staining (Duan et al., 2018), but our efforts did not result in sufficient signal or resolution to identify Cdh6+ synapses.

      (1) X. Duan, A. Krishnaswamy, M. A. Laboulaye, J. Liu, Y.-R. Peng, M. Yamagata, K. Toma, J. R. Sanes, Cadherin Combinations Recruit Dendrites of Distinct Retinal Neurons to a Shared Interneuronal Scaffold. Neuron 99, 1145-1154.e6 (2018).

      Reviewer #2 (Recommendations for the authors):

      (1) The authors repeatedly refer to IPANs as "sensory" neurons (e.g. in title, abstract, and introduction) but there is some debate regarding whether these cells are truly "sensory" because the information they convey never reaches sensory perception. This is why they have classically been referred to as intrinsic primary afferent (IPAN) neurons. It would be more appropriate to stick with this terminology unless the authors have compelling data showing that information detected by IPANs reaches the sensory cortex.

      We thank the reviewer for their comment, but respectfully disagree. The term “sensory neuron” is well established in the ENS. The first definitive proof that “sensory neurons” exist in the ENS was published in Kunze et al., 1995. We note that this paper did not use the word “IPAN” but used the term “sensory neuron”. Furthermore, mechanosensory neurons were published in Spencer and Smith (2004).

      Regarding the reviewer’s comment that the authors would need compelling data showing that information detected by IPANs reaches the sensory cortex before the term “sensory neuron” should be valid, it is important to note that many sensory neurons do not provide direct information to the cortex.

      (1) W. A. A. Kunze, J. C. Bornstein, J. B. Furness, Identification of sensory nerve cells in a peripheral organ (the intestine) of a mammal. Neuroscience 66, 1–4 (1995).

      (2) N. J. Spencer, T. K. Smith, Mechanosensory S-neurons rather than AH-neurons appear to generate a rhythmic motor pattern in guinea-pig distal colon. The Journal of Physiology 558, 577–596 (2004).

      (2) Important information regarding the gut region shown and other details are absent from many figure legends.

      We apologize for this omission. We have updated the figure legends to include information on gut regions.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript investigates a mechanism between the histone reader protein YEATS2 and the metabolic enzyme GCDH, particularly in regulating epithelial-to-mesenchymal transition (EMT) in head and neck cancer (HNC).

      Strengths:

      Great detailing of the mechanistic aspect of the above axis is the primary strength of the manuscript.

      Weaknesses:

      Several critical points require clarification, including the rationale behind EMT marker selection, the inclusion of metastasis data, the role of key metabolic enzymes like ECHS1, and the molecular mechanisms governing p300 and YEATS2 interactions.

      We would like to sincerely thank the reviewer for the detailed, in-depth, and positive response. We are committed to implementing constructive revisions to the manuscript to address the reviewer’s concerns effectively.

      Major Comments:

      (1) The title, "Interplay of YEATS2 and GCDH mediates histone crotonylation and drives EMT in head and neck cancer," appears somewhat misleading, as it implies that YEATS2 directly drives histone crotonylation. However, YEATS2 functions as a reader of histone crotonylation rather than a writer or mediator of this modification. It cannot itself mediate the addition of crotonyl groups onto histones. Instead, the enzyme GCDH is the one responsible for generating crotonyl-CoA, which enables histone crotonylation. Therefore, while YEATS2 plays a role in recognizing crotonylation marks and may regulate gene expression through this mechanism, it does not directly catalyse or promote the crotonylation process.

      We thank the reviewer for raising this concern. As stated by the reviewer, YEATS2 functions as a reader protein, capable of recognizing histone crotonylation marks and assisting in the addition of this mark to nearby histone residues, possibly by assisting the recruitment of the writer protein for crotonylation. Our data indicates the involvement of YEATS2 in the recruitment of writer protein p300 on the promoter of the SPARC gene, making YEATS2 a regulatory factor responsible for the addition of crotonyl marks in an indirect manner. Thus, we have decided to make changes in the title by replacing the word “mediates” with “regulates”. Therefore, the updated title can be read as: “Interplay of YEATS2 and GCDH regulates histone crotonylation and drives EMT in head and neck cancer”.

      (2) The study suggests a link between YEATS2 and metastasis due to its role in EMT, but the lack of clinical or pre-clinical evidence of metastasis is concerning. Only primary tumor (PT) data is shown, but if the hypothesis is that YEATS2 promotes metastasis via EMT, then evidence from metastatic samples or in vivo models should be included to solidify this claim.

      We appreciate the reviewer’s suggestion. Here, we would like to state that the primary aim of this study was to delineate the molecular mechanisms behind the role of YEATS2 in maintaining histone crotonylation at the promoter of genes that favour EMT in head and neck cancer. We have dissected the importance of histone crotonylation in the regulation of gene expression in head and neck cancer in great detail, having investigated the upstream and downstream molecular players involved in this process that promote EMT. Moreover, with the help of multiple phenotypic assays, such as Matrigel invasion, wound healing, and 3D invasion assays, we have shown the functional importance of YEATS2 in promoting EMT in head and neck cancer cells. Since EMT is known to be a prerequisite process for cancer cells undergoing metastasis(1), the evidence of YEATS2 being associated with EMT demonstrates a potential correlation of YEATS2 with metastasis. However, as part of the revision, we will use publicly available patient data to investigate the direct association of YEATS2 with metastasis by checking the expression of YEATS2 between different grades of head and neck cancer, as an increase in tumor grade is often correlated with the incidence of metastasis(2).

      (3) There seems to be some discrepancy in the invasion data with BICR10 control cells (Figure 2C). BICR10 control cells with mock plasmids, specifically shControl and pEGFP-C3 show an unclear distinction between invasion capacities. Normally, we would expect the control cells to invade somewhat similarly, in terms of area covered, within the same time interval (24 hours here). But we clearly see more control cells invading when the invasion is done with KD and fewer control cells invading when the invasion is done with OE. Are these just plasmid-specific significant effects on normal cell invasion? This needs to be addressed.

      We thank the reviewer for the thorough evaluation of the manuscript. The figure panels in question, Figure 2B and 2C, represent two different experiments performed independently: the invasion assay performed after knockdown and after overexpression of YEATS2, respectively. We would like to clarify that both panels represent results that are distinct and independent of each other and that the method used to knock down or overexpress YEATS2 is also different. As stated in the Materials and Methods section, the knockdown is performed using lentivirus-mediated transduction of cells. The overexpression, on the other hand, is done using a standard transfection method in which the transfection reagent and the respective plasmids are mixed directly before this mix is added to the cells. The difference in the experimental conditions in these two experiments might have contributed to the differences seen in the controls, as observed previously(3). Hence, we would like to state that the results of Figure 2B and Figure 2C should be evaluated independently of each other.

      (4) In Figure 3G, the Western blot shows an unclear band for YEATS2 in shSP1 cells with YEATS2 overexpression condition. The authors need to clearly identify which band corresponds to YEATS2 in this case.

      The two bands seen in the shSP1+pEGFP-C3-YEATS2 condition correspond to the endogenous YEATS2 band (lower band, indicated by * in the shControl lane) and YEATS2-GFP band (upper band, corresponding to overexpressed YEATS2-GFP fusion protein, which has a higher molecular weight). To avoid confusion, the endogenous band will be highlighted (marked by *) in the lane representing the shSP1+pEGFP-C3-YEATS2 condition in the revised version of the manuscript.

      (5) In ChIP assays with SP1, YEATS2 and p300 which promoter regions were selected for the respective genes? Please provide data for all the different promoter regions that must have been analysed, highlighting the region where enrichment/depletion was observed. Including data from negative control regions would improve the validity of the results.

      Throughout our study, we have performed ChIP-qPCR assays to check the binding of SP1 on YEATS2 and GCDH promoter, and to check YEATS2 and p300 binding on SPARC promoter. Using transcription factor binding prediction tools and luciferase assays, we selected multiple sites on the YEATS2 and GCDH promoter to check for SP1 binding. The results corresponding to the site that showed significant enrichment were provided in the manuscript. The region of SPARC promoter in YEATS2 and p300 ChIP assay was selected on the basis of YEATS2 enrichment found in the YEATS2 ChIP-seq data. We will provide data for all the promoter regions investigated (including negative controls) in the revised version of the manuscript.

      (6) The authors establish a link between H3K27Cr marks and GCDH expression, and this is an already well-known pathway. A critical missing piece is the level of ECHS1 in patient samples. This will clearly delineate if the balance shifted towards crotonylation.

      We thank the reviewer for their valuable suggestion. To support our claim, we had checked the expression of GCDH and ECHS1 in TCGA HNC RNA-seq data (provided in Figure 4—figure supplement 1A and B) and found that GCDH expression was increased while ECHS1 expression was decreased in tumor samples compared to normal samples. We hypothesized that higher GCDH expression and decreased ECHS1 expression might lead to an increase in the levels of crotonylation in HNC. To further substantiate our claim, we will check the abundance of ECHS1 in HNC patient samples as part of the revision.

      (7) The p300 ChIP data on the SPARC promoter is confusing. The authors report reduced p300 occupancy in YEATS2-silenced cells, on SPARC promoter. However, this is paradoxical, as p300 is a writer, a histone acetyltransferase (HAT). The absence of a reader (YEATS2) shouldn't affect the writer (p300) unless a complex relationship between p300 and YEATS2 is present. The role of p300 should be further clarified in this case. Additionally, transcriptional regulation of SPARC expression in YEATS2 silenced cells could be analysed via downstream events, like Pol-II recruitment. Assays such as Pol-II ChIP-qPCR could help explain this.

      Using RNA-seq and ChIP-seq analyses, we have shown that YEATS2 affects the expression of several genes by regulating the level of histone crotonylation at gene promoters globally. The histone writer p300 is a promiscuous acyltransferase protein that has been shown to be involved in the addition of several non-acetyl marks on histone residues, including crotonylation(4). Our data provide evidence for the dependency of the writer p300 on YEATS2 in mediating histone crotonylation, as YEATS2 downregulation led to decreased occupancy of p300 on the SPARC promoter (Figure 5F). However, the exact mechanism of cooperativity between YEATS2 and p300 in maintaining histone crotonylation remains to be investigated. To address the reviewer’s concern, we will perform the following experiments to delineate the molecular mechanism by which YEATS2 associates with p300 in regulating histone crotonylation:

      (a) Co-immunoprecipitation experiments to check the physical interaction between YEATS2 and p300.

      (b) We will check H3K27cr levels on the SPARC promoter and SPARC expression in p300-depleted HNC cells.

      (c) Rescue experiments to check if the decrease in p300 occupancy on the SPARC promoter can be compensated by overexpressing YEATS2.

      (d) As suggested by the reviewer, Pol-II ChIP-qPCR at the promoter of SPARC will be performed in YEATS2-silenced cells to explain the mode of transcriptional regulation of SPARC expression by YEATS2.

      (8) The role of GCDH in producing crotonyl-CoA is already well-established in the literature. The authors' hypothesis that GCDH is essential for crotonyl-CoA production has been proven, and it's unclear why this is presented as a novel finding. It has been shown that YEATS2 KD leads to reduced H3K27cr, however, it remains unclear how the reader is affecting crotonylation levels. Are GCDH levels also reduced in the YEATS2 KD condition? Are YEATS2 levels regulating GCDH expression? One possible mechanism is YEATS2 occupancy on GCDH promoter and therefore reduced GCDH levels upon YEATS2 KD. This aspect is crucial to the study's proposed mechanism but is not addressed thoroughly.

      The source for histone crotonylation, crotonyl-CoA, can be produced by several enzymes in the cell, such as ACSS2, GCDH, ACOX3, etc(5). Since metabolic intermediates produced during several cellular pathways in the cell can act as substrates for epigenetic factors, we wanted to investigate if such an epigenetic-metabolism crosstalk existed in the context of YEATS2. As described in the manuscript, we performed GSEA using publicly available TCGA RNA-seq data and found that patients with higher YEATS2 expression also showed a high correlation with expression levels of genes involved in the lysine degradation pathway, including GCDH. Since the preferential binding of YEATS2 to H3K27cr and the role of GCDH in producing crotonyl-CoA were known(6,7), we hypothesized that higher H3K27cr in HNC could be a result of both YEATS2 and GCDH. We found that the presence of GCDH in the nucleus of HNC cells is correlated with higher H3K27cr abundance, which could be a result of excess levels of crotonyl-CoA produced via GCDH. We also found a correlation between H3K27cr levels and YEATS2 expression, which could arise due to YEATS2-mediated preferential maintenance of crotonylation. This indicates that, although it is a reader protein, YEATS2 affects the promoter H3K27cr levels, possibly by helping in the recruitment of p300 (as shown in Figure 5F). Thus, YEATS2 and GCDH are both responsible for the regulation of histone crotonylation-mediated gene expression in HNC.

      We did not find any evidence of YEATS2 regulating the expression of GCDH in HNC cells. However, we found that YEATS2 downregulation reduced the nuclear pool of GCDH in head and neck cancer cells (Figure 7F). This suggests that YEATS2 regulates histone crotonylation not only by affecting promoter H3K27cr levels (with p300), but also by affecting the nuclear localization of crotonyl-CoA-producing GCDH. Also, we observed that the expression of YEATS2 and GCDH is regulated by the same transcription factor, SP1, in HNC. We found that SP1 binds to the promoter of both genes, and its downregulation led to a decrease in their expression (Figure 3 and Figure 7).

      We would like to state that the relationship between YEATS2 and the nuclear localization of GCDH, as well as the underlying molecular mechanism, remains unexplored and presents an open question for future investigation.

      (9) The authors should provide IHC analysis of YEATS2, SPARC alongside H3K27cr and GCDH staining in normal vs. tumor tissues from HNC patients.

      We thank the reviewer for their suggestion. We are consulting our clinical collaborators to assess the feasibility of including this IHC analysis in our revision and will make every effort to incorporate it.

      Reviewer #2 (Public review):

      Summary:

      The manuscript emphasises the increased invasive potential of histone reader YEATS2 in an SP1-dependent manner. They report that YEATS2 maintains high H3K27cr levels at the promoter of EMT-promoting gene SPARC. These findings assigned a novel functional implication of histone acylation, crotonylation.

      We thank the reviewer for the constructive comments. We are committed to making beneficial changes to the manuscript in order to alleviate the reviewer’s concerns.

      Concerns:

      (1) The patient cohort is very small with just 10 patients. To establish a significant result the cohort size should be increased.

      We thank the reviewer for this suggestion. We will increase the number of patient samples to assess the levels of YEATS2 and H3K27cr in normal vs. tumor samples.

      (2) Figure 4D compares H3K27Cr levels in tumor and normal tissue samples. Figure 1G shows overexpression of YEATS2 in a tumor as compared to normal samples. The loading control is missing in both. Loading control is essential to eliminate any disparity in protein concentration that is loaded.

      In Figures 1G and 4D, we have used Ponceau S staining as a control for equal loading. Ponceau S staining is frequently used as an alternative to housekeeping proteins like GAPDH as a control for protein loading(8). It avoids the potential for variability in housekeeping gene expression. However, it may be less quantitative than using housekeeping proteins. To address the reviewer’s concern, we will probe with an antibody against a housekeeping protein as a loading control in the revised figures, provided its expression remains stable across the conditions tested.

      (3) Figure 4D only mentions 5 patient samples checked for the increased levels of crotonylation and hence forms the basis of their hypothesis (increased crotonylation in a tumor as compared to normal). The sample size should be more and patient details should be mentioned.

      A total of 9 samples were checked for H3K27cr levels (5 of them are included in Figure 4D and the rest are included in Figure 4—figure supplement 1D). However, as a part of the revision, we will check the H3K27cr levels in more patient samples.

      (4) YEATS2 maintains H3K27Cr levels at the SPARC promoter. The p300 is reported to be hyper-activated (hyperautoacetylated) in oral cancer. Probably, the activated p300 causes hyper-crotonylation, and other protein factors cause the functional translation of this modification. The authors need to clarify this with a suitable experiment.

      In our study, we have shown that p300 is dependent on YEATS2 for its recruitment on the SPARC promoter. As a part of the revision, we propose the following experiments to further substantiate the role of p300 in YEATS2-mediated gene regulation:

      (a) Co-immunoprecipitation experiments to check the physical interaction between YEATS2 and p300.

      (b) We will check H3K27cr levels on the SPARC promoter and SPARC expression in p300-depleted HNC cells.

      (c) Rescue experiments to check if the decrease in p300 occupancy on the SPARC promoter can be compensated by overexpressing YEATS2.

      (d) Pol-II ChIP-qPCR at the promoter of SPARC will be performed in YEATS2-silenced cells to explain the mode of transcriptional regulation of SPARC expression by YEATS2.

      (5) I do not entirely agree with using GAPDH as a control in the western blot experiment since GAPDH has been reported to be overexpressed in oral cancer.

      We would like to clarify that GAPDH was not used as a loading control for protein expression comparisons between normal and tumor samples. GAPDH was used as a loading control only in experiments using head and neck cancer cell lines where shRNA-mediated knockdown or overexpression was employed. These manipulations specifically target the genes of interest and are not expected to alter GAPDH expression, making it a suitable loading control in these instances.

      (6) The expression of EMT markers has been checked in shControl and shYEATS2 transfected cell lines (Figure 2A). However, their expression should first be checked directly in the patients' normal vs. tumor samples.

      We thank the reviewer for the suggestion. To address this, we will check the expression of EMT markers alongside YEATS2 expression in normal vs. tumor samples.

      (7) In Figure 3G, knockdown of SP1 led to the reduced expression of YEATS2 controlled gene Twist1. Ectopic expression of YEATS2 was able to rescue Twist1 partially. In order to establish that SP1 directly regulates YEATS2, SP1 should also be re-introduced upon the knockdown background along with YEATS2 for complete rescue of Twist1 expression.

      To address the reviewer’s concern regarding the partial rescue of Twist1 in SP1 depleted-YEATS2 overexpressed cells, we will perform the experiment as suggested by the reviewer. In brief, we will overexpress both SP1 and YEATS2 in SP1-depleted cells and then assess the expression of Twist1.

      (8) In Figure 7G, the expression of EMT genes should also be checked upon rescue of SPARC expression.

      We thank the reviewer for the suggestion. We will check the expression of EMT markers on YEATS2/ GCDH rescue and update Figure 7G in the revised version of the manuscript.

      References

      (1) T. Brabletz, R. Kalluri, M. A. Nieto and R. A. Weinberg, Nat Rev Cancer, 2018, 18, 128–134.

      (2) P. Pisani, M. Airoldi, A. Allais, P. Aluffi Valletti, M. Battista, M. Benazzo, R. Briatore, S. Cacciola, S. Cocuzza, A. Colombo, B. Conti, A. Costanzo, L. Della Vecchia, N. Denaro, C. Fantozzi, D. Galizia, M. Garzaro, I. Genta, G. A. Iasi, M. Krengli, V. Landolfo, G. V. Lanza, M. Magnano, M. Mancuso, R. Maroldi, L. Masini, M. C. Merlano, M. Piemonte, S. Pisani, A. Prina-Mello, L. Prioglio, M. G. Rugiu, F. Scasso, A. Serra, G. Valente, M. Zannetti and A. Zigliani, Acta Otorhinolaryngol Ital, 2020, 40, S1–S86.

      (3) J. Lin, P. Zhang, W. Liu, G. Liu, J. Zhang, M. Yan, Y. Duan and N. Yang, Elife, 2023, 12, RP87510.

      (4) X. Liu, W. Wei, Y. Liu, X. Yang, J. Wu, Y. Zhang, Q. Zhang, T. Shi, J. X. Du, Y. Zhao, M. Lei, J.-Q. Zhou, J. Li and J. Wong, Cell Discov, 2017, 3, 17016.

      (5) G. Jiang, C. Li, M. Lu, K. Lu and H. Li, Cell Death Dis, 2021, 12, 703.

      (6) D. Zhao, H. Guan, S. Zhao, W. Mi, H. Wen, Y. Li, Y. Zhao, C. D. Allis, X. Shi and H. Li, Cell Res, 2016, 26, 629–632.

      (7) H. Yuan, X. Wu, Q. Wu, A. Chatoff, E. Megill, J. Gao, T. Huang, T. Duan, K. Yang, C. Jin, F. Yuan, S. Wang, L. Zhao, P. O. Zinn, K. G. Abdullah, Y. Zhao, N. W. Snyder and J. N. Rich, Nature, 2023, 617, 818–826.

      (8) I. Romero-Calvo, B. Ocón, P. Martínez-Moya, M. D. Suárez, A. Zarzuelo, O. Martínez-Augustin and F. S. de Medina, Anal Biochem, 2010, 401, 318–320.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews

      Reviewer #1 (Public review):

      Overall I find the evidence very well presented and the study compelling. It offers an important new perspective on the key properties of neoblasts. I do have some comments to clarify the presentation and significance of the work.

      We thank the reviewer for the positive feedback and plan to improve the presentation of the work.

      Reviewer #2 (Public review):

      However, the absence of a cell-cell feedback mechanism during colony growth and the likelihood of the difference needs to be clarified. Is there any difference in interpreting the results if this mechanism is considered?

      We will improve the description of the model assumptions and the interpretation of the data on the basis of these assumptions.

      Although hnf-4 and foxF have been silenced together to validate the model, a deeper understanding of the tgs-1+ cell type and the non-significant reduction of tgs-1+ neoblasts in zfp-1 RNAi colonies is necessary, considering a high neural lineage frequency.

      We will improve the analysis of this result in light of the experimentally determined frequency of the tgs-1+ neoblast population.

      Recommendations for the authors

      Reviewing Editor Comments:

      After consultation, we have compiled a list of the key changes to be made to the manuscript, along with reviewer-specific recommendations to follow.

      (1) Include a section that explicitly describes the assumptions and limitations of the study, particularly with respect to the following assumptions:

      We thank the reviewers for the comment. We added a description of the model assumptions in the methods section “Assumptions underlying neoblast colony growth model”.

      a) All known types of specialized neoblasts cycle at the same rate (see points from Reviewer 1).

      We thank the reviewers for the comment. The current data used to estimate τ (Lei et al., Dev Cell, 2016) does not allow the direct estimation of individual cycling behaviors. Consequently, we assume that all specialized neoblasts cycle at the same average rate, a simplification supported by the model's accurate prediction of colony growth.

      b) The assumption that any FSTF-like gene would behave like the zfp-1, foxF, and hnf-4 genes. The manuscript does not mention that there may be fundamental differences among these different FSTFs that could be uncovered by future work. A strong addition to the paper would be to test other epithelial genes (e.g. p53, chd4, egr5) to show reproducible behavior within a single lineage.

      We thank the reviewers for the comment. Colony size reduction following inhibition of Smed-p53 and failure to produce epidermal progenitors is strongly supported by previous analysis (Wagner et al., Cell Stem Cell, 2012). We refer to this observation in the paper in the section titled: “Inhibition of zfp-1 does not induce overexpression of other lineages in homeostasis”. We added the following sentence to the discussion (Line 460-462): Interestingly, suppression of Smed-p53, a TF expressed in neoblasts and required for epidermal cell production, has resulted in a similar reduction in colony size (Wagner et al., Cell Stem Cell, 2012).

      Of note, Chd4 expression is not limited to specialized neoblasts or to a specific lineage (Scimone et al., Development, 2010), and therefore its inhibition likely has a more complex outcome than an effect on a single lineage. Furthermore, egr-5 is not expressed in neoblasts (Tu et al., eLife, 2015), making this experimental condition more challenging to examine in the context of neoblast colonies at the time points assessed in this study.

      c) The fact that the data used to feed the model relies on radiated animals which are likely to have altered cell cycle rates compared to unirradiated animals (see comment by Reviewer 1). Of note, the model predicts a steady increase in colony size, but colony size does not change between 9dpi and 12dpi.

      We thank the reviewers for the comment. The colony size in control animals increased between 9 and 12 dpi (Fig 3B), as predicted by the model. In zfp-1 (RNAi) animals, the median colony size has also increased over this period, at a slower rate, which we attribute to the increase in q. We attribute the unchanged average colony size to an increase in the frequency of cells failing to proliferate, because of selection of a fate they cannot fully differentiate into.

      d) In light of both reviewers' comments about colony expansion vs. feedback, the authors should discuss how predicted changes to division frequencies might change as homeostasis is reached, or explain how their model accounts for the predicted rate differences under homeostatic conditions in which overall neoblast numbers do not change. Can the model estimate when this transition might occur?

      We thank the reviewers for the comment. Our colony assays are constrained by the animals’ survival following sub-total irradiation (16 to 20 days). In this timeframe, the neoblast population is far smaller than in non-irradiated animals. Therefore, the animals do not reach homeostasis during the experiment, and the model does not allow us to estimate the time the system would need to return to homeostasis.

      (2) In Figure 2D, the assumption is that these adjacent smedwi-1+ cells are sisters. Previous data analyzing this relied on EdU or H3P staining to show a shared division history. When these images were collected is therefore extremely critical to include (the methods suggest 7, 9, or 12 days). The authors should justify why they believe that these adjacent cells are derived from a single neoblast that has divided only once.

      We thank the reviewers for the comment. The images were collected at 7 dpi. We modified the figure legend and the associated methods to include this information. At this early time point, smedwi-1+ cell dyads are spatially separated from other neighboring cells, suggesting that they are the product of a single cell division. Importantly, our data are in complete agreement with previous estimates of symmetric renewal division rate (Raz et al., Cell Stem Cell, 2021; Lei et al., Developmental Cell, 2016).

      (3) Clarify the wording 'pre-selected' in the abstract as described by Reviewer 1.

      We thank the reviewers for the comment, and for clarity we replaced the wording “pre-select” with “select”. 

      (4) Experimental details that are important to the interpretation should be added. For example, how is belonging to a colony defined? This is important because some of the data (e.g. Figure S1A: similar numbers of smedwi-1+ cells are observed at 2dpi and 4dpi, but 4dpi is considered a colony whereas 2dpi is not). The timing of quantification should be included in each figure (it is missing in Figure S2, and Figure 3C and 3D). How the authors distinguish biological vs technical replicates is not mentioned.

      We thank the reviewers for the comment. Subtotal irradiation may result in formation of a spatially-isolated cluster of neoblasts that is not distributed throughout the animal (Wagner et al., Science, 2011). This localized cluster of neoblasts is defined as a neoblast colony (Wagner et al., Science, 2011; Wagner et al., Cell Stem Cell, 2012). The small number of high smedwi-1+ cells observed at 4 dpi in our experiments aligns with this definition (Fig S1A). By contrast, the low smedwi-1 expression detected across the animal 2 dpi does not fit this definition and likely reflects remnants of dying neoblasts resulting from irradiation. The following text was added to the figure legend: “isolated cells expressing low levels of smedwi-1+ were scattered in the planarian parenchyma, likely reflecting remnants of dying neoblasts”.

      (5) Figure 5F appears to use SMEDWI-1 antibody (based on capital letters and increased signal in the brain). Is this the case? The methods do not mention the use of a SMEDWI-1 antibody, and the text indicates that these are progenitors, but SMEDWI-1 protein is well known to not mark neoblasts. If the antibody was used, the authors should not claim that these are neoblasts.

      We thank the reviewers for the comment. The SMEDWI-1 antibody used in the experiments described in Figure 5F indeed labels neoblasts and their progeny (Guo et al., Developmental Cell, 2006). The methods section “Immunofluorescence combined with FISH” details the labeling procedure, which combines FISH and IF using this antibody.

      All microscopy images are difficult to see. Perhaps this is because they are formatted as CMYK images. They should be converted to RGB format to make them appear less dull.

      We thank the reviewer for the comment. Improved versions of the figures have now been uploaded.

      The terminology used in Figure 5 to describe upregulation should not be "overexpression".

      We thank the reviewers for the comment. We changed the terminology to “upregulated”.

      Reviewer #1 (Recommendations for the authors):

      I think the authors should include a section that explicitly lays out the assumptions and limitations of the study. For example, I believe that determining tau requires assuming that all different types of specialized neoblasts cycle at the same rates. Also there is the assumption that any FSTF-like gene would behave like zfp1 or foxF and hnfA genes. It seems to remain possible that a future study could find that a subset of FSTFs might indeed exert "either/or" decisions in fating, just not the particular genes under investigation here.

      We thank the reviewer for the comment. We added a description of the model assumptions in the methods section.

      In the abstract, the wording "pre-selected" is somewhat puzzling to me. I would interpret a preselection as a process that defines the next specified state prior to its manifestation. Instead, and as I understand the authors argue this as well, the study provides good evidence that the determination mechanism is random in that subsequent neoblast choices do not likely depend on prior states. So I would suggest changing that wording.

      We thank the reviewer for the comment. We replaced “pre-select” with “select”.

      Is it possible to determine the uncertainty in measuring tau the cell cycle time and would this have an impact on subsequent modeling?

      We thank the reviewers for the comment. The data currently used to estimate τ (Lei et al., Dev Cell, 2016) do not allow us to directly estimate the uncertainty in its measurement.

      For lines 154-164 I would suggest doing a little more to explicitly write out the logic of determining the growth constants within the main text and not just in methods, for ease of reading.

      We thank the reviewer for the comment, and added explanations for how we determined the growth constant in the text. The text now reads (lines 160-166): “Considering an average cell cycle length of 29.7 hours, we calculated the value of q using the following approach: the probabilities of all cell division outcomes must sum to 1. Our experimental data showed that symmetric renewal (p) and asymmetric division (a) occur at equal rates (i.e., p = a). By fitting these parameters to the experimental data, we determined that the difference between the probabilities of symmetric renewal and symmetric differentiation (i.e., p - q) was = 0.345 (Fig 2E, S1D-E). Therefore, with these criteria, we estimated the probabilities of cell division outcomes in the colony as p = 0.45, a = 0.45, and q = 0.1 (Fig 2G; Methods).”
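      The arithmetic behind these estimates can be reproduced with a short sketch. It assumes only the three constraints quoted above (p + a + q = 1, p = a, and p - q = 0.345); the function name is a hypothetical one chosen for illustration and is not taken from the manuscript's methods.

```python
def division_probabilities(p_minus_q: float):
    """Solve p + a + q = 1 under the constraints p = a and p - q = p_minus_q."""
    # Substituting a = p and q = p - p_minus_q into p + a + q = 1
    # gives 3p - p_minus_q = 1, hence p = (1 + p_minus_q) / 3.
    p = (1.0 + p_minus_q) / 3.0
    a = p                # symmetric renewal and asymmetric division are equally likely
    q = p - p_minus_q    # symmetric differentiation
    return p, a, q


p, a, q = division_probabilities(0.345)
print(f"p = {p:.3f}, a = {a:.3f}, q = {q:.3f}")  # prints "p = 0.448, a = 0.448, q = 0.103"
```

      Rounding these values gives the reported p = 0.45, a = 0.45, and q = 0.1.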

      Line 192 why does post-mitotic progeny number linearly relate to neoblast number? In clones, a change in q has an exponential effect. I feel like I am missing something.

      We thank the reviewer for the comment. In colonies, 50% of cell divisions result in the production of post-mitotic progeny (asymmetric division). Therefore, the number of produced progenitors in a given cell cycle is linearly correlated with the number of neoblasts. This statement is in line with previous analysis of planarian colony size (Wagner et al., Cell Stem Cell, 2012).

      Line 103: it also seems possible, although less likely, that the specified state is not fixed within a given cell cycle, and that cells that try to switch into zeta-neoblasts mid-cell cycle arrest in proliferation, etc., just for that time.

      We thank the reviewer for the comment and agree that this is a possibility. However, our observations suggest that incorporating this factor into the model is unnecessary for accurately predicting colony size.

      In terms of the feedback mechanism proposed to operate in homeostasis, I think in the case of zfp-1 it is quite likely that loss of epidermal differentiation results in wound responses (this phenomenon has been documented in egr-5 RNAi in Tu et al 2015 I believe). This could play out differently in the clone assay because the effects of sublethal irradiation on this process would predominate in both control versus zfp1(RNAi) conditions.

      We thank the reviewer for the comment. Our RNA-seq analysis following zfp-1 inhibition did not show overexpression of injury-induced genes at an early time point (6 days; Fig. 5B-C). However, an increase in cycling cells was detected much earlier via EdU labeling (3 days; Fig. 5D). In the case of egr-5 suppression, Tu et al. analyzed injury-induced gene expression at a later stage (21 days of RNAi), where they found significant epidermal defects (see Fig. 5C in Tu et al.). We agree that sublethal irradiation effects likely predominate in colony analysis for both control and zfp-1 (RNAi) animals. In homeostasis, additional factors likely influence cell proliferation and differentiation.

      It seems likely that some of the differences noted between homeostasis versus clone growth could ultimately arise from the different growth parameters under each setting. Could the rate parameters be estimated from prior data in homeostasis as well? It seems to me that with the framework the authors use, homeostasis must involve a net zero change to neoblast abundance (also shown by Wagner 2011 by the sigmoidal curve of neoblast abundance at the endpoint of clone expansion). Therefore, in these conditions p=q by definition. Experimental evidence from Lei 2016 (Figure S7M) suggests asymmetric divisions and symmetric renewing divisions are about equally abundant (5/12, 42% symmetric renewing vs 7/12, 58% asymmetric renewing). Therefore, under homeostasis, there would be an estimated p=q=0.3 and a=0.4. Compared to clone growth conditions then, in homeostasis, it seems that the rate of symmetric renewal decreases and the rate of symmetric differentiation increases. I wonder, could this kind of difference potentially account for the differences between homeostasis versus clone expansion settings? It is also worth noting that the clone expansion context has been used as a sensitized genetic background for identifying effects of gene inhibition on neoblast self-renewal, so perhaps the reason this works is that the rates of self-renewal are relatively lower in homeostasis, so that clone expansion represents a case where there is greater demand for self-renewal.

      We thank the reviewer for the comment. We agree that under homeostatic conditions, where the population size remains stable, the average probability of symmetric renewal matches the average probability of symmetric differentiation or elimination. By contrast, during colony expansion, the probability of symmetric renewal exceeds that of symmetric differentiation or elimination. The differences in response to a lineage block between homeostasis and colony expansion can have multiple interpretations. However, data from homeostatic animals does not permit the analysis of individual neoblasts or their specific responses to a lineage block. Consequently, we cannot determine whether the proliferative response following the lineage block during homeostasis is a direct response to the lineage block or an indirect effect resulting from changes in other neoblasts. We discuss these possibilities further in lines 472 - 484.
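      To make the contrast between the two regimes concrete, the following sketch computes the expected fold change in neoblast number per cell cycle. It assumes a simple bookkeeping that is not spelled out in this response: a symmetric renewal adds one neoblast, an asymmetric division adds none, and a symmetric differentiation or elimination removes one. The homeostatic values p = q = 0.3 and a = 0.4 are the reviewer's rough estimate rather than measured parameters, and the function name is hypothetical.

```python
def expected_fold_change(p: float, a: float, q: float, cycles: int = 1) -> float:
    """Expected fold change in neoblast number after a number of cell cycles.

    p: probability of symmetric renewal (net +1 neoblast per division)
    a: probability of asymmetric division (net 0)
    q: probability of symmetric differentiation or elimination (net -1)
    """
    if abs(p + a + q - 1.0) > 1e-9:
        raise ValueError("division outcome probabilities must sum to 1")
    # Each dividing neoblast leaves 2p + a = 1 + p - q neoblasts on average.
    return (1.0 + p - q) ** cycles


# Colony expansion estimate from this study: the population grows each cycle.
expansion = expected_fold_change(p=0.45, a=0.45, q=0.10)

# Reviewer's homeostatic estimate: p = q, so the population size is stable.
homeostasis = expected_fold_change(p=0.30, a=0.40, q=0.30)

print(f"expansion: {expansion:.2f}x per cycle, homeostasis: {homeostasis:.2f}x per cycle")
```

      Under these assumptions, any condition with p = q gives a fold change of exactly 1, whatever the value of a, which is why the asymmetric division rate alone cannot distinguish the two settings.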

      In terms of the memory effect, I recall some arguments presented in the Raz 2021 study that were consistent with a slight memory for neoblast specification being retained. I believe this was a minor point from detecting a slightly higher likelihood of identifying 2-cell clones that both took on prog1+ identity compared to the population average. If this is the case, it may be worth the authors commenting on reconciling those observations with their model.

      We thank the reviewer for their comment. Raz et al. (Cell Stem Cell, 2021) reported that in the asymmetric division of a zeta-neoblast, which generates a prog-2+ cell and a neoblast, there was a slightly higher observed frequency of zfp-1 expression in the neoblast compared to the expected rate (Expected: 32%, Observed: 44%). This small increase may reflect a mild memory effect, experimental variability, or both. However, statistical analysis using Fisher's exact test yielded a non-significant p-value (p = 0.1), suggesting that this difference could be attributed to experimental variability. Other data from Raz et al., such as lineage representation in early colonies, also did not show significant memory effects, indicating that any such effects, if present, are minimal and difficult to detect. Therefore, while we do not, and cannot, rule out the presence of minor memory effects, we expect that effects of this magnitude will have minimal impact on our model.

      Reviewer #2 (Recommendations for the authors):

      Figure 2C and 2D:

      Please provide the specific time points for the data presented.

      We thank the reviewer for the comment. The information was added to the figure legend.

      Colony growth and homeostasis:

      It would be beneficial to estimate a time point at which colony growth transitions to a model with a cell-cell feedback mechanism, similar to that observed in homeostasis. This would help in understanding the dynamics and timing of these processes.

      We thank the reviewers for the comment. Our colony assays were constrained by the animals' survival following sub-total irradiation (16 to 20 days). Neoblast numbers are substantially reduced compared to unirradiated animals, preventing us from determining the time point at which homeostasis is achieved.

      Methods:

      μl should be μL  

      The text was changed accordingly.

      Line 526: H2O should be H₂O

      The text was changed accordingly.

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      The authors describe a genome-wide CRISPR screen in mouse ES cells to identify factors and genes that positively and negatively regulate FGF/ERK signaling during differentiation. Among known and potentially novel regulators, the Mediator subunit Med12 was a strong hit in the screen, and it was clearly and extensively shown that the loss of Med12 results in impaired FGF/ERK signal responsiveness, modulation of mRNA levels and disturbed cell differentiation, leading to reduced stem cell plasticity.

      This is a very concise and well written manuscript that demonstrates for the first time the important role of Med12 in ES cells and during early cell differentiation. The results support data that had been previously observed in Med12 mouse models and in addition show that Med12 cooperates with various signaling systems to control gene expression during early lineage decision.

      We thank the reviewer for their positive evaluation of our work.

      Fig. 3 Supp1A-B:

      The loci of all three independent Med12 mutant clones and the absence of Med12 should be included. Are all three Med12 loss-of-function mutants?

      In the revised version of the manuscript, we have updated the scheme in Fig. 3 Supp 1A to represent both deletions that were obtained with the CRISPR guides used. Both the more common 97 bp deletion as well as the 105 bp deletion that occurred in one clonal line result in a complete loss of the protein on the western blot (Fig. 3 Supp. 1B), suggesting that all mutant clones used for further experiments are loss-of-function mutants.

      Minor:

      Line 466: Should be Fig. 6F, not 6E.

      We have removed this figure panel and the corresponding text in response to the other reviewers' comments.

      Reviewer #1 (Significance):

      The CRISPR screen identified a list of novel, interesting factors that regulate FGF/ERK signaling in ES cells. Med12 was then analyzed in great detail on various levels and under various differentiation conditions, resulting in a complex picture of how Med12 controls stem cell plasticity. These data support results observed in mouse models and identify novel regulatory mechanisms of Med12.

      Reviewer #2 (Evidence, reproducibility and clarity):

      In the manuscript "Med12 cooperates with multiple differentiation signals to enhance embryonic stem cell plasticity" Fernkorn and Schröter report on the role of Med12 in mouse embryonic stem cells. They perform an elegant genetic screen to identify regulators of Spry4 in mouse ESCs, screening for mutations that increase and decrease Spry4-reporter expression in serum/LIF conditions. They find that Med12 deletion results in defects in the exit from naïve pluripotency and in PrE-formation upon Gata-TF overexpression. Using scRNAseq experiments they report a reduction in biological noise in Med12 KO cells differentiating towards PrE upon Gata6 OE.

      Major points:

      1) The title might not exactly reflect the scientific findings of the manuscript. There is little direct evidence for a decrease in plasticity upon Med12 depletion.

      We have changed the title to "Med12 cooperates with multiple differentiation signals to facilitate efficient lineage transitions in embryonic stem cells". In addition, we have toned down claims that Med12 regulates plasticity throughout the manuscript.

      2) Fig 1G: From the data provided it is not entirely clear how well screen results can be validated. Did some of the mutants identified in the screen also produce no detectable phenotypes? What would be the phenotype of knocking out an unrelated gene? In other words, are some of the weak phenotypes really showing Spry4 downregulation or are they within the range of biological variance?

      Fluorescence levels in Fig. 1G have been normalized to control wild-type cells (dashed red line). Absence of a detectable phenotype would have resulted in normalized fluorescence values around 1. Fluorescence values of all tested mutants were significantly different from 1, as indicated in the statistical analysis given in the figure legend. Furthermore, H2B-Venus fluorescence of cells transfected with a non-targeting control vector is shown in Fig. 1F, and is not different from that of untransfected control wild-type cells. We have now added an explicit explanation of how we normalized the data to the figure legend of Fig. 1G, and hope that this addresses the reviewer's concern.

      3) Rescue experiments by re-expressing Med12 in Med12 KO ESCs are missing. Can the differentiation and transcriptional phenotypes be rescued?

      We agree with the reviewer that a rescue experiment re-expressing Med12 would be ideal to ensure that the observed phenotypes are specifically due to loss of Med12. However, we could not identify commercially available full-length Med12 cDNA clones. Even though we managed to amplify full-length Med12 cDNA after reverse transcription, we were unable to clone it into expression vectors. These observations suggest that specific properties of the Med12 cds make the construction of expression vectors by conventional means difficult, and solving these issues is beyond the scope of this study.

      Throughout the study we used multiple independent clonal lines in multiple experimental readouts and obtained congruent results. The reduced expression of pluripotency genes for example was observed in bulk sequencing of the lines introduced in Fig. 3, and by single-cell sequencing of independently generated Med12-mutant GATA6-mCherry inducible lines (Fig. 5 Supp. 1B). We argue that this congruence makes it unlikely that the results are dominated by off-target effects.

      4) L365: The subheading "Transitions between embryonic... buffered against loss of Med12" is confusing. The data simply shows that Med12 KOs can still, albeit less efficiently generate PrE upon Gata TF OE. Is there evidence for some active buffering? I think the authors could simply report the data as is, stating that the phenotypes are not a complete block but an impairment of differentiation.

      Prompted by the reviewer's comment as well as remarks along similar lines by reviewer #4, we have completely reorganized this section and now present all the analysis pertaining to PrE differentiation in a new figure 4. In the revised text (lines 316 - 378), we refrain from any speculations about possible buffering and simply report the data as is, as suggested by the reviewer.

      5) L386: Would it not make more sense to reduce dox concentrations in control cells to equalize Gata6 OE levels between Med12 KO and controls? A shorter pulse of Gata6 does not really directly address unequal expression levels due to loss of Med12. Different pulse lengths of OE might have consequences that the authors do not control for. This also impacts scRNAseq experiments which suffer from the same, in my opinion, suboptimal experimental setup. This is a point that needs to be addressed.

      We agree with the reviewer that it would have been desirable to equalize GATA6 overexpression levels between wild-type and Med12-mutant cells while keeping induction time the same. In our experience however, reducing the dox concentration is not suitable to achieve this: Rather than reducing transgene expression levels across the board, lower dox concentrations tend to increase the variability within the population - see Fig. 2 in PMID: 16400644 for an example. Since we agree with the reviewer that the setup of the scRNAseq experiment limits our ability to draw conclusions regarding the separation of cell states, we have decided to remove these analyses in the revised manuscript. In doing so, we have reorganized the previous figures 5 and 6 into a new single figure 4. This has made the manuscript more concise and allowed us to focus on the main phenotype of the Med12 mutant cells, namely their delayed exit from pluripotency.

      6) The reduced transcript number in Med12 KOs is interesting, but how does it come about? Is there indeed less transcriptional activity, or are reduced transcript numbers a side effect of slower growth or of the different cell states between WT and Med12 mutants? Appropriate experiments to address this should be performed.

      To address this point, we have performed EU labeling experiments to compare RNA synthesis rates between wild-type and Med12-mutant cells during the exit from pluripotency. These experiments confirmed an increase in mRNA production upon differentiation for both wild-type and Med12 mutant cells, but the method was not sensitive enough to detect any differences between wild-type and Med12 mutant cells within the same condition. The EU labeling thus supports the notion that overall transcriptional rate increases during differentiation, but leaves open the possibility that reduced mRNA levels in Med12 mutant cells arise from effects other than reduced transcriptional output. These new analyses are shown in Fig. 4 Supp. 3 and described in the main text in lines 373 - 378.

      7) Is the proposed reduction of biological noise a feature of the PrE differentiation experiments or can it also be observed in epiblast differentiation?

      To address this question, we have carried out single-cell measurements of Spry4 and Nanog mRNA numbers to compare transcriptional variability between wild-type and Med12-mutant cells during epiblast differentiation (new Fig. 3 Supp. 1G, H). These measurements confirmed the differences between genotypes in mean expression levels detected by RNA sequencing. However, this analysis did not reveal strong differences in mRNA number distributions. Furthermore, as discussed in point 6 above, our interpretations of noise levels in the PrE differentiation paradigm could have been influenced by the unequal GATA6 induction times. Finally, reviewer #4 pointed out that 10x Genomics scRNAseq is not ideal to compare noise levels when total mRNA content differs between samples, as is the case in our dataset. We therefore decided to tone down our conclusions regarding altered noise levels in Med12-mutant cells.

      8) I cannot follow the authors' logic that Med12 loss results in enhanced separation between lineages. How is this experimentally supported?

      As discussed in point 6 above, this result could have been influenced by the unequal induction times between wild type and Med12-mutant cells. We have therefore decided to remove this analysis in the revised version of the manuscript.

      Minor points:

      Fig 3, Supp1 A: What exactly are the black and blue highlighted letters?

      The black and blue highlighted letters indicate whether bases are part of an intron or an exon. Exon 7 is now explicitly labelled in the figure, and the meaning of the highlighting is explained in the figure legend.

      Reviewer #2 (Significance):

      Overall, this is an interesting study. The screen has been performed to a high technical standard and differentiation defects were appropriately analyzed. The manuscript has some weaknesses in investigating the molecular mode of action of Med12 which could be improved to provide more significant insights.

      Reviewer #3 (Evidence, reproducibility and clarity):

      The authors sought to identify genes important for the transcriptional changes needed during mouse ES cell differentiation. They identified a number of genes and focussed on Med12, as it was the strongest hit from a cluster of Mediator components.

      Using knockout ES cells, differentiation assays, bulk and scRNAseq, they clearly show that Med12 is important for transgene activation and for gene activation generally during exit from self-renewal, but it is not specifically influencing differentiation efficacy per se. Rather, cells lacking Med12 display "a reduced ability to react to changing culture conditions" and, by inference, to environmental changes. They conclude that Med12 "contributes to the maintenance of cellular plasticity during differentiation and lineage transitions."

      Med12 is a structural component of the kinase module of Mediator, but it is not clear what this study tells us about Mediator function. The authors state that their results contrast with those obtained using a Cdk8 inhibitor, which resulted in increased self-renewal (lines 577-580). I'm not sure where their results show "...that loss of Med12 leads to reduced pluripotency." (lines 579-580). They do not test potency of these cells. There is reduced expression of some pluripotency-associated markers and fewer colonies formed in a plating assay, but these assays do not test cellular potency.

      We agree with the reviewer that our RNA sequencing and colony formation assays do not exhaustively test cellular potency. We have therefore changed the wording in the paragraphs that describe these assays and now talk about "reduced pluripotency gene expression" (e.g. lines 20, 228, 461, 512).

      While their phenotype certainly appears different from that reported in cells treated with Cdk8 inhibitor, it's not clear to me what to make of it, or what it might tell us about the function of the Mediator Kinase module or of Mediator. That a co-activator is important for gene expression in general, or even for gene activation upon receipt of some signal, is not really surprising.

      We believe that reporting differences in the phenotypes obtained with Cdk8 inhibition versus knock-out of Med12 is relevant, because it yields new insight into the different functions that the components of the Mediator kinase module have in pluripotent cells. We have previously discussed possible reasons for these functional differences (discussion line 519 - 528), and further expand on them in the revised manuscript.

      Minor points:

      It is surprising they don't relate their work to that of Hamilton et al (https://doi.org/10.1038/s41586-019-1732-z) who conclude that differentiation from the ES cell state towards primitive endoderm is compromised without Med24.

      Thank you for pointing out this omission. We now cite the work of Hamilton et al. in line 317 (related to new Fig. 4) and in lines 537 - 538 of the discussion.

      Stylistic point: please make the separation between paragraphs more obvious. With no indentation or extra spacing between paragraphs it looks like one solid mass of words.

      Reviewer #3 (Significance):

      There is a lot of careful work here, but I'm not getting a big conclusion here. Perhaps the authors could argue their main points somewhat more stridently and what we've learned beyond this current system.

      Prompted by the reviewer's comment, we have re-organized the functional analyses of Med12 function in the manuscript by condensing the previous figures 5 and 6 into a new single figure 4. We have removed all discussions of transcriptional noise and plasticity, and now focus more strongly on the slowed pluripotency transitions as the main phenotype of the Med12 mutant cells. These changes make the manuscript more concise, and we hope that they help to deliver a single, clear message to the reader.

      Reviewer #4 (Evidence, reproducibility and clarity):

      Fernkorn and Schröter report the results of a screen in mESCs based on modulation of the fluorescent intensity of the Spry4:H2B-Venus reporter. They identify candidate genes that both positively and negatively modulate the expression of the reporter. Amongst those are several known regulators of the FGF pathway (transcriptional activator of Spry4) that serve as a positive control for the screen. The manuscript focuses on characterisation of Med12, and the authors conclude that Med12 does not specifically affect FGF-targets. Paradoxically, the authors show that based on the expression of key naïve markers Med12 cells show delayed differentiation. Functionally, however, Med12 mutant cells at 48hrs can form fewer colonies when plated back in naïve conditions (that would normally indicate accelerated differentiation). The authors conclude that Med12 mutants have "a reduced ability to react to changing culture conditions". Next, they examine how the Med12 mutation affects embryonic/extraembryonic differentiation using an inducible Gata6 expression system. They show that transgene induction is slower and dampened in mutant cells and that overall the balance of fates is skewed towards embryonic cells. Finally, they use single cell RNA sequencing and observe differences in the number of mRNAs detected, as well as the separation between clusters in the mutant cells. They conclude that the mutants have reduced transcriptional noise levels.

      Overall, it was an interesting article exploring the molecular consequences of knocking out a subunit of the mediator complex. The characterisation focuses primarily on the description of the screen and the more functional consequences of the KO, rather than delving into the molecular aspects (e.g. whether mediator complex assembly is affected, or its binding etc). The analysis of the transcriptional noise will be of particular interest to the community, although I have some suggestions to exclude the possibility that the analysis simply reflects changes in global transcription levels. I have a small number of concerns and requests for clarification on the data but all of them should be relatively easy to address.

      Major points:

      • Med12, transcription levels and noise (Figure 6G, J-L). This is an intriguing observation. The labelling and multiplexing helped resolve many of the issues typically associated with comparing 10x datasets. I have two observations about this analysis:

      1) Clarify how the number of mRNA counts per cell is calculated (figure 6F) - the methods only describe a value normalised by the total number of counts per cell.

      The mRNA counts shown in the figure correspond to the raw number of UMIs detected per cell. We now explicitly state this in the figure legend. Please note that after re-organizing the manuscript, former Fig. 6F has become Fig. 4 Supp. 3A.

      I feel this observation is key and has repercussions for the interpretation of the data (see point below) and should be independently validated (although I recognise it's difficult!). Since the authors observed differences in a randomly integrated transgene (iGata experiments), it's possible/likely that the dysregulation of transcription output is more generic. A possible suggestion is measuring global mRNA synthesis and degradation rates, either using inhibitors or by adding modified nucleotides and measuring incorporation rate and loss through pulse/chase labelling.

      We have performed an EU labeling experiment to address this point, which is shown in Fig. 4 Supp. 3 and described in the main text in lines 373 - 378 of the revised manuscript. Please refer to our response to reviewer #2, point 6 for a short description of the results.

      2) 10x is not ideal for looking at heterogeneity/noise since it has a low capture efficiency and there are a lot of gaps/zeros in the lower expression range. Therefore, it's simply possible that mutant cells have dampened transcriptional output, meaning lowly expressed genes, which in the WT contribute to the apparent heterogeneity (because there is a higher chance of not being captured), are below the 10x detection range in the mutant. This can be seen by plotting the cumulative sum of the mean gene count across each sample - the 50% mark (= mean gene count at 50% detection) reflects a measure of the "capture efficiency" (either because of technical reasons or lower mRNA input). Generally (e.g. also seen across technical repeats), the mean coefficient of variation, entropy and other measures of population heterogeneity directly scale with this "mean gene count at 50% detection", while the cell-cell correlation inversely scales with the "mean gene count at 50% detection". If these scaling relationships are observed for the WT and mutant, then it is impossible to say from the single cell RNA-seq whether the differences in heterogeneity are due to biological or technical reasons. Unfortunately, down-sampling the reads does not generally correct or normalise for this type of technical noise since the technical errors accumulate at every step of sample prep. Of course, it's possible that the technical noise in the RNAseq obfuscates real differences in the level of noise. The failure of mutant cells to re-establish the naïve network certainly suggests there is something going on. Therefore, I suggest performing the analysis of capture efficiency vs CV² mentioned above and adjusting the discussion accordingly, and potentially performing single molecule FISH of key variable genes at the interface of the two clusters to validate the difference in heterogeneity.

      As suggested by the reviewer, we have performed single molecule FISH measurements of variable genes (Fig. 3 Supp. 1 G, H), but these did not provide independent evidence for increased noise levels in Med12 mutant cells. In light of the caveats raised by reviewer #4 when estimating noise levels from 10x scRNAseq data, and the suggestion of reviewer #3 to sharpen the focus of the manuscript, we have decided to remove any strong conclusions about different noise levels between the genotypes. Instead, we focus on the slowed pluripotency transitions as the main phenotype of the Med12 mutant cells, to make the manuscript more concise and to deliver a single, clear message.
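
      The capture-efficiency check proposed by the reviewer could be sketched roughly as follows. This is only one possible reading of "mean gene count at 50% detection" (the mean count of the gene at which the cumulative sum of sorted per-gene means reaches half of the total), and it is not taken from the manuscript; the function names and the cells-by-genes input layout are assumptions made for illustration.

```python
import numpy as np

def mean_count_at_half_cumsum(counts):
    """Assumed proxy for capture efficiency.

    counts: 2D array (cells x genes) of raw UMI counts for one sample.
    Returns the per-gene mean count at which the cumulative sum of the
    ascending-sorted per-gene means first reaches 50% of the total.
    """
    gene_means = np.sort(counts.mean(axis=0))            # ascending per-gene means
    cum_frac = np.cumsum(gene_means) / gene_means.sum()  # cumulative fraction of total
    idx = np.searchsorted(cum_frac, 0.5)                 # first position with cum_frac >= 0.5
    return float(gene_means[idx])

def mean_cv2(counts):
    """Mean squared coefficient of variation over genes with non-zero mean."""
    means = counts.mean(axis=0)
    variances = counts.var(axis=0)                       # population variance per gene
    expressed = means > 0                                # avoid division by zero
    return float(np.mean(variances[expressed] / means[expressed] ** 2))
```

      Computing both quantities per sample (wild-type, mutant, and technical repeats where available) and plotting one against the other would show whether the heterogeneity measure simply tracks the capture-efficiency proxy, which is the scaling relationship the reviewer describes.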

      • Are Oct4 levels affected? Reduction of Oct4 is sufficient to block differentiation (Radzisheuskaya et al. 2013 - PMID: 23629142).

      We thank the reviewer for this idea. We measured OCT4 expression levels in single cells via quantitative immunostaining and found that there is no difference between wild-type and Med12-mutant cells. It is therefore unlikely that lowered OCT4 levels block differentiation in the mutant. These new results are shown in Fig. 5, Supp. 1 D, E.

      • Med12 mutants showing transcriptionally delayed differentiation (related to figure 4C). Is this delay also reflected in the expression of formative genes? If I understand correctly, Figure 4C is made from a panel of naïve markers. It would be good to determine if the formative network is equally affected (and in the same direction - suggesting a delay), or if the transcriptional changes speak to a global dysregulation/dampened expression.

      Prompted by the reviewer's suggestion, we have extended our analysis of the differentiation delays to genes that are upregulated during differentiation, such as formative genes. Rather than trying to come up with a new set of formative markers to produce a variation of the original Fig. 4C (Fig. 5C in the revised manuscript), we have taken an unbiased approach and extended Fig. 5E with a panel showing the distribution of expression slopes of the 100 most upregulated genes determined as in Fig. 5D. This analysis demonstrates a lower upregulation slope in Med12-mutant cells. This result confirms that both the upregulation and the downregulation of genes are less efficient upon the loss of MED12, in line with our conclusion of delayed differentiation.
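
      As a rough illustration of what a per-gene "expression slope" comparison can look like, the sketch below fits a straight line to each gene's expression over time and returns the slopes. The response does not state how the slopes in Fig. 5D were computed, so the linear least-squares fit, the genes-by-timepoints layout, and the function name are all assumptions rather than the authors' method.

```python
import numpy as np

def expression_slopes(times, expr):
    """Least-squares slope of expression versus time for every gene.

    times: 1D array of sampling times (e.g. hours of differentiation).
    expr:  2D array (genes x timepoints), e.g. log-normalised expression.
    Returns a 1D array with one slope per gene.
    """
    times = np.asarray(times, dtype=float)
    expr = np.asarray(expr, dtype=float)
    # np.polyfit fits each column of a 2D y independently, so transpose
    # to timepoints x genes; row 0 of the result holds the slopes.
    coeffs = np.polyfit(times, expr.T, deg=1)
    return coeffs[0]
```

      Under these assumptions, one would compute the slopes for the same set of upregulated genes in each genotype and compare the two distributions; a shift towards smaller slopes in the mutant would correspond to the slower upregulation described above.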

      • Control for the re-plating experiments in 2i/LIF (Figure 4B). Replating in 2iLIF + FBS can have a large selective effect in certain mutant backgrounds (e.g. Nodal mutants) which don't accurately reflect the differentiation status. To exclude such effects, it would be good to repeat the replating assays in serum-free conditions (laminin coating can help with attachment) and include undifferentiated controls to ensure that the mutant doesn't have a clonal disadvantage.

      The reason we have included FBS in the re-plating assays is that, in our experience, Fgf4-mutant cells show strongly impaired growth in standard 2i+LIF medium. We anticipate that using laminin coating to help with attachment would not overcome this requirement. We have therefore decided against repeating the re-plating assays. Instead, we state the reason why we used FBS in the main text, and also explicitly acknowledge the reviewer's concern about the risk of selective effects of the FBS and the possible clonal disadvantages of the Med12-mutant line.

      Minor points:

      • I found figure 3D and the corresponding text and caption difficult to understand. It is unclear what a "footprint", "relative pathway activity" or "spearman correlation of footprint" mean. Were all the genes listed below Med12 knocked out and sequenced in this study? I suggest re-working and maybe simplifying the text and figure.

      We re-worked the description about the pathway analysis and stated more clearly that:

      • The footprint is a quantitative measure of the differences in gene expression change of a defined list of target genes between wild-type and perturbation.
      • Only the Med12 mutant data is new data produced in this manuscript and all examples below are from Lackner et al., 2021.

      We think that a more extensive explanation of the terms "relative pathway activity" and "spearman correlation of footprint" would disturb the flow of the manuscript too much. Therefore, we now cite the original paper right next to the sentence in which these terms are mentioned.

      In figure S1 Sup1 the authors report the dose response of targets to FGF - are those affected in the mutant?

      In this manuscript we have not tested if the dose response of FGF target genes changes upon perturbation of Med12. We argue that such an experiment would be beyond the scope of the current manuscript, since - as acknowledged by the reviewer - "Med12 does not specifically affect FGF-targets".

      • Similarly, it would be helpful to guide the reader through figure 5H-I and the corresponding text and caption since it's not immediately obvious how the analysis/graphs lead to the conclusion stated.

      As a consequence of our reorganization of the manuscript, the original figure 5H-I has been moved to Fig. 4, Supp. 1 in the revised version. The analysis strategy has been described in more detail in one of our previous publications (PMID: 26511924). In keeping with our general decision to make the manuscript more focused and concise, we have decided against further expanding on these data, but instead refer the reader to the original publication.

      • Role of Med12 in regulating FGF signalling. There are two observations that seem a bit at odds with the text description and it would be helpful to clarify: "ppERK levels were indistinguishable between wild-type and Med12-mutant lines" (line 222) - 5/6 datapoints show an increase. "[...] overall these results argue against a strong and specific role of Med12 in regulation of FGF target genes." (line 274). If I understood correctly, ~50% of genes are differentially transcribed because of Med12 KO.

      To address the reviewer's first question, we have performed a statistical test on the quantifications of the western blots. This test indicates that there is no significant change of ppERK levels upon loss of MED12, which is now stated clearly in the text (line 217).

      Second, to clarify why our data argues against a strong and specific role of Med12 in regulation of FGF target genes, we now formulate an expectation (lines 276 - 277): If MED12 specifically regulated FGF target genes, the number of differentially expressed genes would be higher in the wild-type than in the Med12-mutant upon stimulation with FGF. This however is not the case.

      • "[...] as well as transitions between different pluripotent states" (line 41) - references missing.

      We have added a reference to PMID: 28174249 (line 39).

      • Line 447: "differentiation conditions" - it's unclear what is meant by differentiation and how it relates to the diagram in figure 6A. Are those the 20hr cells? Do the -8h, -4hr and 0hr cells (if I understand the meaning of the diagram) cluster all together?

      We now specify in the text that pluripotency conditions refer to cells maintained in 2i + LIF medium, whereas differentiation refers to cells switched to N2B27 after the doxycycline pulse (lines 341 - 342).

      • The difference in dynamics of mCherry activation as a consequence of Med12 KO is not apparent from figure 5E. It might be easier to visualise this observation if the x-axis was normalised to the starting point, plotting "time from start of induction".

      We agree with the reviewer that the current alignment has not been optimized to compare GATA6 induction dynamics between wild-type and Med12-mutant cells. If we changed the alignment however, it would not be clear any longer that both genotypes were in N2B27 for the same amount of time before analyzing Epi and PrE differentiation. Since our focus is on the differentiation of the two lineages rather than GATA6-mCherry induction dynamics, we decided to keep the original alignment.

      • Figure 3H/I - what does "gene expression changes" and "fold change ratio" mean?

      In Fig. 3H, we plot the fold change of gene expression upon FGF4 stimulation in Med12-mutant versus that in wild-type cells; in Fig. 3I we plot the distribution of the ratio of these two fold changes across all genes. To make this strategy clearer, we have changed the axis label in Fig. 3H to "expression fold change upon FGF", to make it consistent with the axis label "fold-change ratio" in Fig. 3I.
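
      The two quantities described here can be written down in a few lines. The sketch below is illustrative only: the pseudocount, the argument names, and the choice to divide the mutant fold change by the wild-type one (rather than the reverse) are assumptions, since the response does not specify them.

```python
import numpy as np

def fold_change(stimulated, unstimulated, pseudocount=1.0):
    """Per-gene expression fold change upon stimulation.

    The pseudocount (an assumption of this sketch) avoids division by
    zero for genes without counts in the unstimulated condition.
    """
    stimulated = np.asarray(stimulated, dtype=float)
    unstimulated = np.asarray(unstimulated, dtype=float)
    return (stimulated + pseudocount) / (unstimulated + pseudocount)

def fold_change_ratio(fc_mutant, fc_wild_type):
    """Per-gene ratio of the mutant fold change to the wild-type fold change."""
    return np.asarray(fc_mutant, dtype=float) / np.asarray(fc_wild_type, dtype=float)
```

      With this convention, a ratio close to 1 for a gene means that it responds to FGF similarly in both genotypes, so a distribution of ratios centred on 1 would indicate no systematic change in the response.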

      • Line 579-580 - please clarify what is meant by "reduced pluripotency".

      Prompted by a similar concern raised by reviewer #3, we have changed the wording throughout this paragraph and now talk of "reduced pluripotency gene expression". See also our response to reviewer #3 above.

      • Title: "enhance ESC plasticity". Not sure "enhance" is the right word? There is no evidence that the plasticity of cells is affected.

      We have changed the title; see also our response to reviewer #2, point 1.

      Reviewer #4 (Significance):

      Overall, it was an interesting article exploring the molecular consequences of knocking out a subunit of the mediator complex. The characterisation focuses primarily on the description of the screen and the more functional consequences of the KO, rather than delving into the molecular aspects (e.g. whether mediator complex assembly is affected, or its binding, etc.). The analysis of the transcriptional noise will be of particular interest to the community, although I have some suggestions to exclude the possibility that the analysis simply reflects changes in global transcription levels. I have a small number of concerns and requests for clarification on the data but all of them should be relatively easy to address.

    1. Very few papers have a relevance score above 50. This suggests that only a small fraction of articles are extremely relevant to the exact search terms used.

      [Figure: histogram "Distribution of Relevance Scores"; x-axis: Relevance Score; y-axis: Number of Papers.]
/>relevance_score: 573","count: 0<br />relevance_score: 574","count: 0<br />relevance_score: 575","count: 0<br />relevance_score: 576","count: 0<br />relevance_score: 577","count: 0<br />relevance_score: 578","count: 0<br />relevance_score: 579","count: 0<br />relevance_score: 580","count: 0<br />relevance_score: 581","count: 0<br />relevance_score: 582","count: 0<br />relevance_score: 583","count: 0<br />relevance_score: 584","count: 0<br />relevance_score: 585","count: 0<br />relevance_score: 586","count: 0<br />relevance_score: 587","count: 0<br />relevance_score: 588","count: 0<br />relevance_score: 589","count: 0<br />relevance_score: 590","count: 0<br />relevance_score: 591","count: 0<br />relevance_score: 592","count: 0<br />relevance_score: 593","count: 0<br />relevance_score: 594","count: 0<br />relevance_score: 595","count: 0<br />relevance_score: 596","count: 0<br />relevance_score: 597","count: 0<br />relevance_score: 598","count: 0<br />relevance_score: 599","count: 0<br />relevance_score: 600","count: 0<br />relevance_score: 601","count: 0<br />relevance_score: 602","count: 0<br />relevance_score: 603","count: 0<br />relevance_score: 604","count: 0<br />relevance_score: 605","count: 0<br />relevance_score: 606","count: 0<br />relevance_score: 607","count: 0<br />relevance_score: 608","count: 0<br />relevance_score: 609","count: 0<br />relevance_score: 610","count: 0<br />relevance_score: 611","count: 0<br />relevance_score: 612","count: 0<br />relevance_score: 613","count: 0<br />relevance_score: 614","count: 0<br />relevance_score: 615","count: 0<br />relevance_score: 616","count: 0<br />relevance_score: 617","count: 0<br />relevance_score: 618","count: 0<br />relevance_score: 619","count: 0<br />relevance_score: 620","count: 0<br />relevance_score: 621","count: 0<br />relevance_score: 622","count: 0<br />relevance_score: 623","count: 0<br />relevance_score: 624","count: 0<br />relevance_score: 625","count: 0<br />relevance_score: 626","count: 0<br 
/>relevance_score: 627","count: 0<br />relevance_score: 628","count: 0<br />relevance_score: 629","count: 0<br />relevance_score: 630","count: 0<br />relevance_score: 631","count: 0<br />relevance_score: 632","count: 0<br />relevance_score: 633","count: 0<br />relevance_score: 634","count: 0<br />relevance_score: 635","count: 0<br />relevance_score: 636","count: 0<br />relevance_score: 637","count: 0<br />relevance_score: 638","count: 0<br />relevance_score: 639","count: 0<br />relevance_score: 640","count: 0<br />relevance_score: 641","count: 0<br />relevance_score: 642","count: 1<br />relevance_score: 643"],"type":"bar","textposition":"none","marker":{"autocolorscale":false,"color":"rgba(30,144,255,0.7)","line":{"width":1.8897637795275593,"color":"transparent"}},"showlegend":false,"xaxis":"x","yaxis":"y","hoverinfo":"text","frame":null},{"orientation":"v","width":[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],"base":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"x":[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,641,642,643],"y":[4,538,386,66,52,83,77,62,52,46,39,44,36,31,34,26,20,10,16,
14,6,9,6,8,6,5,5,5,3,6,3,3,3,0,0,0,3,0,4,2,1,1,0,0,0,0,1,2,2,0,0,0,0,2,2,3,1,0,1,0,0,0,0,2,2,0,1,1,1,0,0,0,1,1,1,0,1,1,0,0,1,0,0,1,0,0,1,1,0,0,1,1,1,0,1,0,0,1,0,0,1,0,1,1,0,0,0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"text":["count: 4<br />relevance_score: 0","count: 538<br />relevance_score: 1","count: 386<br />relevance_score: 2","count: 66<br />relevance_score: 3","count: 52<br />relevance_score: 4","count: 83<br />relevance_score: 5","count: 77<br />relevance_score: 6","count: 62<br />relevance_score: 7","count: 52<br />relevance_score: 8","count: 46<br />relevance_score: 9","count: 39<br />relevance_score: 10","count: 44<br />relevance_score: 11","count: 36<br />relevance_score: 12","count: 31<br />relevance_score: 13","count: 34<br />relevance_score: 14","count: 26<br />relevance_score: 15","count: 20<br />relevance_score: 16","count: 10<br />relevance_score: 17","count: 16<br />relevance_score: 18","count: 14<br />relevance_score: 19","count: 
6<br />relevance_score: 20","count: 9<br />relevance_score: 21","count: 6<br />relevance_score: 22","count: 8<br />relevance_score: 23","count: 6<br />relevance_score: 24","count: 5<br />relevance_score: 25","count: 5<br />relevance_score: 26","count: 5<br />relevance_score: 27","count: 3<br />relevance_score: 28","count: 6<br />relevance_score: 29","count: 3<br />relevance_score: 30","count: 3<br />relevance_score: 31","count: 3<br />relevance_score: 32","count: 0<br />relevance_score: 33","count: 0<br />relevance_score: 34","count: 0<br />relevance_score: 35","count: 3<br />relevance_score: 36","count: 0<br />relevance_score: 37","count: 4<br />relevance_score: 38","count: 2<br />relevance_score: 39","count: 1<br />relevance_score: 40","count: 1<br />relevance_score: 41","count: 0<br />relevance_score: 42","count: 0<br />relevance_score: 43","count: 0<br />relevance_score: 44","count: 0<br />relevance_score: 45","count: 1<br />relevance_score: 46","count: 2<br />relevance_score: 47","count: 2<br />relevance_score: 48","count: 0<br />relevance_score: 49","count: 0<br />relevance_score: 50","count: 0<br />relevance_score: 51","count: 0<br />relevance_score: 52","count: 2<br />relevance_score: 53","count: 2<br />relevance_score: 54","count: 3<br />relevance_score: 55","count: 1<br />relevance_score: 56","count: 0<br />relevance_score: 57","count: 1<br />relevance_score: 58","count: 0<br />relevance_score: 59","count: 0<br />relevance_score: 60","count: 0<br />relevance_score: 61","count: 0<br />relevance_score: 62","count: 2<br />relevance_score: 63","count: 2<br />relevance_score: 64","count: 0<br />relevance_score: 65","count: 1<br />relevance_score: 66","count: 1<br />relevance_score: 67","count: 1<br />relevance_score: 68","count: 0<br />relevance_score: 69","count: 0<br />relevance_score: 70","count: 0<br />relevance_score: 71","count: 1<br />relevance_score: 72","count: 1<br />relevance_score: 73","count: 1<br />relevance_score: 74","count: 0<br 
/>relevance_score: 75","count: 1<br />relevance_score: 76","count: 1<br />relevance_score: 77","count: 0<br />relevance_score: 78","count: 0<br />relevance_score: 79","count: 1<br />relevance_score: 80","count: 0<br />relevance_score: 81","count: 0<br />relevance_score: 82","count: 1<br />relevance_score: 83","count: 0<br />relevance_score: 84","count: 0<br />relevance_score: 85","count: 1<br />relevance_score: 86","count: 1<br />relevance_score: 87","count: 0<br />relevance_score: 88","count: 0<br />relevance_score: 89","count: 1<br />relevance_score: 90","count: 1<br />relevance_score: 91","count: 1<br />relevance_score: 92","count: 0<br />relevance_score: 93","count: 1<br />relevance_score: 94","count: 0<br />relevance_score: 95","count: 0<br />relevance_score: 96","count: 1<br />relevance_score: 97","count: 0<br />relevance_score: 98","count: 0<br />relevance_score: 99","count: 1<br />relevance_score: 100","count: 0<br />relevance_score: 101","count: 1<br />relevance_score: 102","count: 1<br />relevance_score: 103","count: 0<br />relevance_score: 104","count: 0<br />relevance_score: 105","count: 0<br />relevance_score: 106","count: 0<br />relevance_score: 107","count: 0<br />relevance_score: 108","count: 0<br />relevance_score: 109","count: 1<br />relevance_score: 110","count: 0<br />relevance_score: 111","count: 1<br />relevance_score: 112","count: 0<br />relevance_score: 113","count: 1<br />relevance_score: 114","count: 0<br />relevance_score: 115","count: 0<br />relevance_score: 116","count: 0<br />relevance_score: 117","count: 0<br />relevance_score: 118","count: 0<br />relevance_score: 119","count: 0<br />relevance_score: 120","count: 0<br />relevance_score: 121","count: 1<br />relevance_score: 122","count: 1<br />relevance_score: 123","count: 0<br />relevance_score: 124","count: 0<br />relevance_score: 125","count: 1<br />relevance_score: 126","count: 1<br />relevance_score: 127","count: 0<br />relevance_score: 128","count: 0<br />relevance_score: 
129","count: 0<br />relevance_score: 130","count: 0<br />relevance_score: 131","count: 0<br />relevance_score: 132","count: 0<br />relevance_score: 133","count: 0<br />relevance_score: 134","count: 0<br />relevance_score: 135","count: 0<br />relevance_score: 136","count: 0<br />relevance_score: 137","count: 0<br />relevance_score: 138","count: 0<br />relevance_score: 139","count: 0<br />relevance_score: 140","count: 0<br />relevance_score: 141","count: 0<br />relevance_score: 142","count: 0<br />relevance_score: 143","count: 0<br />relevance_score: 144","count: 0<br />relevance_score: 145","count: 0<br />relevance_score: 146","count: 0<br />relevance_score: 147","count: 0<br />relevance_score: 148","count: 0<br />relevance_score: 149","count: 0<br />relevance_score: 150","count: 0<br />relevance_score: 151","count: 0<br />relevance_score: 152","count: 0<br />relevance_score: 153","count: 0<br />relevance_score: 154","count: 0<br />relevance_score: 155","count: 0<br />relevance_score: 156","count: 0<br />relevance_score: 157","count: 0<br />relevance_score: 158","count: 0<br />relevance_score: 159","count: 0<br />relevance_score: 160","count: 0<br />relevance_score: 161","count: 0<br />relevance_score: 162","count: 0<br />relevance_score: 163","count: 0<br />relevance_score: 164","count: 1<br />relevance_score: 165","count: 0<br />relevance_score: 166","count: 0<br />relevance_score: 167","count: 0<br />relevance_score: 168","count: 0<br />relevance_score: 169","count: 0<br />relevance_score: 170","count: 0<br />relevance_score: 171","count: 0<br />relevance_score: 172","count: 0<br />relevance_score: 173","count: 0<br />relevance_score: 174","count: 0<br />relevance_score: 175","count: 0<br />relevance_score: 176","count: 0<br />relevance_score: 177","count: 0<br />relevance_score: 178","count: 0<br />relevance_score: 179","count: 0<br />relevance_score: 180","count: 0<br />relevance_score: 181","count: 0<br />relevance_score: 182","count: 0<br />relevance_score: 
183","count: 0<br />relevance_score: 184","count: 0<br />relevance_score: 185","count: 0<br />relevance_score: 186","count: 0<br />relevance_score: 187","count: 0<br />relevance_score: 188","count: 0<br />relevance_score: 189","count: 0<br />relevance_score: 190","count: 0<br />relevance_score: 191","count: 0<br />relevance_score: 192","count: 0<br />relevance_score: 193","count: 0<br />relevance_score: 194","count: 0<br />relevance_score: 195","count: 0<br />relevance_score: 196","count: 0<br />relevance_score: 197","count: 0<br />relevance_score: 198","count: 0<br />relevance_score: 199","count: 0<br />relevance_score: 200","count: 0<br />relevance_score: 201","count: 0<br />relevance_score: 202","count: 0<br />relevance_score: 203","count: 0<br />relevance_score: 204","count: 0<br />relevance_score: 205","count: 0<br />relevance_score: 206","count: 0<br />relevance_score: 207","count: 0<br />relevance_score: 208","count: 0<br />relevance_score: 209","count: 0<br />relevance_score: 210","count: 0<br />relevance_score: 211","count: 0<br />relevance_score: 212","count: 0<br />relevance_score: 213","count: 0<br />relevance_score: 214","count: 0<br />relevance_score: 215","count: 0<br />relevance_score: 216","count: 0<br />relevance_score: 217","count: 0<br />relevance_score: 218","count: 0<br />relevance_score: 219","count: 0<br />relevance_score: 220","count: 0<br />relevance_score: 221","count: 0<br />relevance_score: 222","count: 0<br />relevance_score: 223","count: 0<br />relevance_score: 224","count: 0<br />relevance_score: 225","count: 0<br />relevance_score: 226","count: 0<br />relevance_score: 227","count: 0<br />relevance_score: 228","count: 0<br />relevance_score: 229","count: 0<br />relevance_score: 230","count: 0<br />relevance_score: 231","count: 0<br />relevance_score: 232","count: 0<br />relevance_score: 233","count: 0<br />relevance_score: 234","count: 0<br />relevance_score: 235","count: 0<br />relevance_score: 236","count: 0<br />relevance_score: 
237","count: 0<br />relevance_score: 238","count: 0<br />relevance_score: 239","count: 0<br />relevance_score: 240","count: 0<br />relevance_score: 241","count: 0<br />relevance_score: 242","count: 0<br />relevance_score: 243","count: 0<br />relevance_score: 244","count: 1<br />relevance_score: 245","count: 0<br />relevance_score: 246","count: 0<br />relevance_score: 247","count: 0<br />relevance_score: 248","count: 0<br />relevance_score: 249","count: 0<br />relevance_score: 250","count: 0<br />relevance_score: 251","count: 0<br />relevance_score: 252","count: 0<br />relevance_score: 253","count: 0<br />relevance_score: 254","count: 0<br />relevance_score: 255","count: 0<br />relevance_score: 256","count: 0<br />relevance_score: 257","count: 0<br />relevance_score: 258","count: 0<br />relevance_score: 259","count: 0<br />relevance_score: 260","count: 0<br />relevance_score: 261","count: 0<br />relevance_score: 262","count: 0<br />relevance_score: 263","count: 0<br />relevance_score: 264","count: 0<br />relevance_score: 265","count: 0<br />relevance_score: 266","count: 0<br />relevance_score: 267","count: 0<br />relevance_score: 268","count: 0<br />relevance_score: 269","count: 0<br />relevance_score: 270","count: 0<br />relevance_score: 271","count: 0<br />relevance_score: 272","count: 0<br />relevance_score: 273","count: 0<br />relevance_score: 274","count: 0<br />relevance_score: 275","count: 0<br />relevance_score: 276","count: 0<br />relevance_score: 277","count: 0<br />relevance_score: 278","count: 0<br />relevance_score: 279","count: 0<br />relevance_score: 280","count: 0<br />relevance_score: 281","count: 0<br />relevance_score: 282","count: 0<br />relevance_score: 283","count: 0<br />relevance_score: 284","count: 0<br />relevance_score: 285","count: 0<br />relevance_score: 286","count: 0<br />relevance_score: 287","count: 0<br />relevance_score: 288","count: 0<br />relevance_score: 289","count: 0<br />relevance_score: 290","count: 0<br />relevance_score: 
291","count: 0<br />relevance_score: 292","count: 0<br />relevance_score: 293","count: 0<br />relevance_score: 294","count: 0<br />relevance_score: 295","count: 0<br />relevance_score: 296","count: 0<br />relevance_score: 297","count: 0<br />relevance_score: 298","count: 0<br />relevance_score: 299","count: 0<br />relevance_score: 300","count: 0<br />relevance_score: 301","count: 0<br />relevance_score: 302","count: 0<br />relevance_score: 303","count: 0<br />relevance_score: 304","count: 0<br />relevance_score: 305","count: 0<br />relevance_score: 306","count: 0<br />relevance_score: 307","count: 0<br />relevance_score: 308","count: 0<br />relevance_score: 309","count: 0<br />relevance_score: 310","count: 0<br />relevance_score: 311","count: 0<br />relevance_score: 312","count: 0<br />relevance_score: 313","count: 0<br />relevance_score: 314","count: 0<br />relevance_score: 315","count: 0<br />relevance_score: 316","count: 0<br />relevance_score: 317","count: 0<br />relevance_score: 318","count: 0<br />relevance_score: 319","count: 0<br />relevance_score: 320","count: 0<br />relevance_score: 321","count: 0<br />relevance_score: 322","count: 0<br />relevance_score: 323","count: 0<br />relevance_score: 324","count: 0<br />relevance_score: 325","count: 0<br />relevance_score: 326","count: 0<br />relevance_score: 327","count: 0<br />relevance_score: 328","count: 0<br />relevance_score: 329","count: 0<br />relevance_score: 330","count: 0<br />relevance_score: 331","count: 0<br />relevance_score: 332","count: 0<br />relevance_score: 333","count: 0<br />relevance_score: 334","count: 0<br />relevance_score: 335","count: 0<br />relevance_score: 336","count: 0<br />relevance_score: 337","count: 0<br />relevance_score: 338","count: 0<br />relevance_score: 339","count: 0<br />relevance_score: 340","count: 0<br />relevance_score: 341","count: 0<br />relevance_score: 342","count: 0<br />relevance_score: 343","count: 0<br />relevance_score: 344","count: 0<br />relevance_score: 
345","count: 0<br />relevance_score: 346","count: 0<br />relevance_score: 347","count: 0<br />relevance_score: 348","count: 0<br />relevance_score: 349","count: 0<br />relevance_score: 350","count: 0<br />relevance_score: 351","count: 0<br />relevance_score: 352","count: 0<br />relevance_score: 353","count: 0<br />relevance_score: 354","count: 0<br />relevance_score: 355","count: 0<br />relevance_score: 356","count: 0<br />relevance_score: 357","count: 0<br />relevance_score: 358","count: 0<br />relevance_score: 359","count: 0<br />relevance_score: 360","count: 0<br />relevance_score: 361","count: 0<br />relevance_score: 362","count: 0<br />relevance_score: 363","count: 0<br />relevance_score: 364","count: 0<br />relevance_score: 365","count: 0<br />relevance_score: 366","count: 0<br />relevance_score: 367","count: 0<br />relevance_score: 368","count: 0<br />relevance_score: 369","count: 0<br />relevance_score: 370","count: 0<br />relevance_score: 371","count: 0<br />relevance_score: 372","count: 0<br />relevance_score: 373","count: 0<br />relevance_score: 374","count: 0<br />relevance_score: 375","count: 0<br />relevance_score: 376","count: 0<br />relevance_score: 377","count: 0<br />relevance_score: 378","count: 0<br />relevance_score: 379","count: 0<br />relevance_score: 380","count: 0<br />relevance_score: 381","count: 0<br />relevance_score: 382","count: 0<br />relevance_score: 383","count: 0<br />relevance_score: 384","count: 0<br />relevance_score: 385","count: 0<br />relevance_score: 386","count: 0<br />relevance_score: 387","count: 0<br />relevance_score: 388","count: 0<br />relevance_score: 389","count: 0<br />relevance_score: 390","count: 0<br />relevance_score: 391","count: 0<br />relevance_score: 392","count: 0<br />relevance_score: 393","count: 0<br />relevance_score: 394","count: 0<br />relevance_score: 395","count: 0<br />relevance_score: 396","count: 0<br />relevance_score: 397","count: 0<br />relevance_score: 398","count: 0<br />relevance_score: 
399","count: 0<br />relevance_score: 400","count: 0<br />relevance_score: 401","count: 0<br />relevance_score: 402","count: 0<br />relevance_score: 403","count: 0<br />relevance_score: 404","count: 0<br />relevance_score: 405","count: 0<br />relevance_score: 406","count: 0<br />relevance_score: 407","count: 0<br />relevance_score: 408","count: 0<br />relevance_score: 409","count: 0<br />relevance_score: 410","count: 0<br />relevance_score: 411","count: 0<br />relevance_score: 412","count: 0<br />relevance_score: 413","count: 0<br />relevance_score: 414","count: 0<br />relevance_score: 415","count: 0<br />relevance_score: 416","count: 0<br />relevance_score: 417","count: 0<br />relevance_score: 418","count: 0<br />relevance_score: 419","count: 0<br />relevance_score: 420","count: 0<br />relevance_score: 421","count: 0<br />relevance_score: 422","count: 0<br />relevance_score: 423","count: 0<br />relevance_score: 424","count: 0<br />relevance_score: 425","count: 0<br />relevance_score: 426","count: 0<br />relevance_score: 427","count: 0<br />relevance_score: 428","count: 0<br />relevance_score: 429","count: 0<br />relevance_score: 430","count: 0<br />relevance_score: 431","count: 0<br />relevance_score: 432","count: 0<br />relevance_score: 433","count: 0<br />relevance_score: 434","count: 0<br />relevance_score: 435","count: 0<br />relevance_score: 436","count: 0<br />relevance_score: 437","count: 0<br />relevance_score: 438","count: 0<br />relevance_score: 439","count: 0<br />relevance_score: 440","count: 0<br />relevance_score: 441","count: 0<br />relevance_score: 442","count: 0<br />relevance_score: 443","count: 0<br />relevance_score: 444","count: 0<br />relevance_score: 445","count: 0<br />relevance_score: 446","count: 0<br />relevance_score: 447","count: 0<br />relevance_score: 448","count: 0<br />relevance_score: 449","count: 0<br />relevance_score: 450","count: 0<br />relevance_score: 451","count: 0<br />relevance_score: 452","count: 0<br />relevance_score: 

      Are you seeing the axis labels and legend? I thought they were there the first time I looked, but now I do not see them. Odd.

    2. Very few papers have a relevance score above 50. This suggests that only a small fraction of articles are extremely relevant to the exact search terms used.
       [Figure: Distribution of Relevance Scores. x-axis: Relevance Score; y-axis: Number of Papers.]
/>relevance_score: 627","count: 0<br />relevance_score: 628","count: 0<br />relevance_score: 629","count: 0<br />relevance_score: 630","count: 0<br />relevance_score: 631","count: 0<br />relevance_score: 632","count: 0<br />relevance_score: 633","count: 0<br />relevance_score: 634","count: 0<br />relevance_score: 635","count: 0<br />relevance_score: 636","count: 0<br />relevance_score: 637","count: 0<br />relevance_score: 638","count: 0<br />relevance_score: 639","count: 0<br />relevance_score: 640","count: 0<br />relevance_score: 641","count: 0<br />relevance_score: 642","count: 1<br />relevance_score: 643"],"type":"bar","textposition":"none","marker":{"autocolorscale":false,"color":"rgba(30,144,255,0.7)","line":{"width":1.8897637795275593,"color":"transparent"}},"showlegend":false,"xaxis":"x","yaxis":"y","hoverinfo":"text","frame":null},{"orientation":"v","width":[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],"base":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"x":[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,641,642,643],"y":[4,538,386,66,52,83,77,62,52,46,39,44,36,31,34,26,20,10,16,
14,6,9,6,8,6,5,5,5,3,6,3,3,3,0,0,0,3,0,4,2,1,1,0,0,0,0,1,2,2,0,0,0,0,2,2,3,1,0,1,0,0,0,0,2,2,0,1,1,1,0,0,0,1,1,1,0,1,1,0,0,1,0,0,1,0,0,1,1,0,0,1,1,1,0,1,0,0,1,0,0,1,0,1,1,0,0,0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"text":["count: 4<br />relevance_score: 0","count: 538<br />relevance_score: 1","count: 386<br />relevance_score: 2","count: 66<br />relevance_score: 3","count: 52<br />relevance_score: 4","count: 83<br />relevance_score: 5","count: 77<br />relevance_score: 6","count: 62<br />relevance_score: 7","count: 52<br />relevance_score: 8","count: 46<br />relevance_score: 9","count: 39<br />relevance_score: 10","count: 44<br />relevance_score: 11","count: 36<br />relevance_score: 12","count: 31<br />relevance_score: 13","count: 34<br />relevance_score: 14","count: 26<br />relevance_score: 15","count: 20<br />relevance_score: 16","count: 10<br />relevance_score: 17","count: 16<br />relevance_score: 18","count: 14<br />relevance_score: 19","count: 
6<br />relevance_score: 20","count: 9<br />relevance_score: 21","count: 6<br />relevance_score: 22","count: 8<br />relevance_score: 23","count: 6<br />relevance_score: 24","count: 5<br />relevance_score: 25","count: 5<br />relevance_score: 26","count: 5<br />relevance_score: 27","count: 3<br />relevance_score: 28","count: 6<br />relevance_score: 29","count: 3<br />relevance_score: 30","count: 3<br />relevance_score: 31","count: 3<br />relevance_score: 32","count: 0<br />relevance_score: 33","count: 0<br />relevance_score: 34","count: 0<br />relevance_score: 35","count: 3<br />relevance_score: 36","count: 0<br />relevance_score: 37","count: 4<br />relevance_score: 38","count: 2<br />relevance_score: 39","count: 1<br />relevance_score: 40","count: 1<br />relevance_score: 41","count: 0<br />relevance_score: 42","count: 0<br />relevance_score: 43","count: 0<br />relevance_score: 44","count: 0<br />relevance_score: 45","count: 1<br />relevance_score: 46","count: 2<br />relevance_score: 47","count: 2<br />relevance_score: 48","count: 0<br />relevance_score: 49","count: 0<br />relevance_score: 50","count: 0<br />relevance_score: 51","count: 0<br />relevance_score: 52","count: 2<br />relevance_score: 53","count: 2<br />relevance_score: 54","count: 3<br />relevance_score: 55","count: 1<br />relevance_score: 56","count: 0<br />relevance_score: 57","count: 1<br />relevance_score: 58","count: 0<br />relevance_score: 59","count: 0<br />relevance_score: 60","count: 0<br />relevance_score: 61","count: 0<br />relevance_score: 62","count: 2<br />relevance_score: 63","count: 2<br />relevance_score: 64","count: 0<br />relevance_score: 65","count: 1<br />relevance_score: 66","count: 1<br />relevance_score: 67","count: 1<br />relevance_score: 68","count: 0<br />relevance_score: 69","count: 0<br />relevance_score: 70","count: 0<br />relevance_score: 71","count: 1<br />relevance_score: 72","count: 1<br />relevance_score: 73","count: 1<br />relevance_score: 74","count: 0<br 
/>relevance_score: 75","count: 1<br />relevance_score: 76","count: 1<br />relevance_score: 77","count: 0<br />relevance_score: 78","count: 0<br />relevance_score: 79","count: 1<br />relevance_score: 80","count: 0<br />relevance_score: 81","count: 0<br />relevance_score: 82","count: 1<br />relevance_score: 83","count: 0<br />relevance_score: 84","count: 0<br />relevance_score: 85","count: 1<br />relevance_score: 86","count: 1<br />relevance_score: 87","count: 0<br />relevance_score: 88","count: 0<br />relevance_score: 89","count: 1<br />relevance_score: 90","count: 1<br />relevance_score: 91","count: 1<br />relevance_score: 92","count: 0<br />relevance_score: 93","count: 1<br />relevance_score: 94","count: 0<br />relevance_score: 95","count: 0<br />relevance_score: 96","count: 1<br />relevance_score: 97","count: 0<br />relevance_score: 98","count: 0<br />relevance_score: 99","count: 1<br />relevance_score: 100","count: 0<br />relevance_score: 101","count: 1<br />relevance_score: 102","count: 1<br />relevance_score: 103","count: 0<br />relevance_score: 104","count: 0<br />relevance_score: 105","count: 0<br />relevance_score: 106","count: 0<br />relevance_score: 107","count: 0<br />relevance_score: 108","count: 0<br />relevance_score: 109","count: 1<br />relevance_score: 110","count: 0<br />relevance_score: 111","count: 1<br />relevance_score: 112","count: 0<br />relevance_score: 113","count: 1<br />relevance_score: 114","count: 0<br />relevance_score: 115","count: 0<br />relevance_score: 116","count: 0<br />relevance_score: 117","count: 0<br />relevance_score: 118","count: 0<br />relevance_score: 119","count: 0<br />relevance_score: 120","count: 0<br />relevance_score: 121","count: 1<br />relevance_score: 122","count: 1<br />relevance_score: 123","count: 0<br />relevance_score: 124","count: 0<br />relevance_score: 125","count: 1<br />relevance_score: 126","count: 1<br />relevance_score: 127","count: 0<br />relevance_score: 128","count: 0<br />relevance_score: 
129","count: 0<br />relevance_score: 130","count: 0<br />relevance_score: 131","count: 0<br />relevance_score: 132","count: 0<br />relevance_score: 133","count: 0<br />relevance_score: 134","count: 0<br />relevance_score: 135","count: 0<br />relevance_score: 136","count: 0<br />relevance_score: 137","count: 0<br />relevance_score: 138","count: 0<br />relevance_score: 139","count: 0<br />relevance_score: 140","count: 0<br />relevance_score: 141","count: 0<br />relevance_score: 142","count: 0<br />relevance_score: 143","count: 0<br />relevance_score: 144","count: 0<br />relevance_score: 145","count: 0<br />relevance_score: 146","count: 0<br />relevance_score: 147","count: 0<br />relevance_score: 148","count: 0<br />relevance_score: 149","count: 0<br />relevance_score: 150","count: 0<br />relevance_score: 151","count: 0<br />relevance_score: 152","count: 0<br />relevance_score: 153","count: 0<br />relevance_score: 154","count: 0<br />relevance_score: 155","count: 0<br />relevance_score: 156","count: 0<br />relevance_score: 157","count: 0<br />relevance_score: 158","count: 0<br />relevance_score: 159","count: 0<br />relevance_score: 160","count: 0<br />relevance_score: 161","count: 0<br />relevance_score: 162","count: 0<br />relevance_score: 163","count: 0<br />relevance_score: 164","count: 1<br />relevance_score: 165","count: 0<br />relevance_score: 166","count: 0<br />relevance_score: 167","count: 0<br />relevance_score: 168","count: 0<br />relevance_score: 169","count: 0<br />relevance_score: 170","count: 0<br />relevance_score: 171","count: 0<br />relevance_score: 172","count: 0<br />relevance_score: 173","count: 0<br />relevance_score: 174","count: 0<br />relevance_score: 175","count: 0<br />relevance_score: 176","count: 0<br />relevance_score: 177","count: 0<br />relevance_score: 178","count: 0<br />relevance_score: 179","count: 0<br />relevance_score: 180","count: 0<br />relevance_score: 181","count: 0<br />relevance_score: 182","count: 0<br />relevance_score: 
183","count: 0<br />relevance_score: 184","count: 0<br />relevance_score: 185","count: 0<br />relevance_score: 186","count: 0<br />relevance_score: 187","count: 0<br />relevance_score: 188","count: 0<br />relevance_score: 189","count: 0<br />relevance_score: 190","count: 0<br />relevance_score: 191","count: 0<br />relevance_score: 192","count: 0<br />relevance_score: 193","count: 0<br />relevance_score: 194","count: 0<br />relevance_score: 195","count: 0<br />relevance_score: 196","count: 0<br />relevance_score: 197","count: 0<br />relevance_score: 198","count: 0<br />relevance_score: 199","count: 0<br />relevance_score: 200","count: 0<br />relevance_score: 201","count: 0<br />relevance_score: 202","count: 0<br />relevance_score: 203","count: 0<br />relevance_score: 204","count: 0<br />relevance_score: 205","count: 0<br />relevance_score: 206","count: 0<br />relevance_score: 207","count: 0<br />relevance_score: 208","count: 0<br />relevance_score: 209","count: 0<br />relevance_score: 210","count: 0<br />relevance_score: 211","count: 0<br />relevance_score: 212","count: 0<br />relevance_score: 213","count: 0<br />relevance_score: 214","count: 0<br />relevance_score: 215","count: 0<br />relevance_score: 216","count: 0<br />relevance_score: 217","count: 0<br />relevance_score: 218","count: 0<br />relevance_score: 219","count: 0<br />relevance_score: 220","count: 0<br />relevance_score: 221","count: 0<br />relevance_score: 222","count: 0<br />relevance_score: 223","count: 0<br />relevance_score: 224","count: 0<br />relevance_score: 225","count: 0<br />relevance_score: 226","count: 0<br />relevance_score: 227","count: 0<br />relevance_score: 228","count: 0<br />relevance_score: 229","count: 0<br />relevance_score: 230","count: 0<br />relevance_score: 231","count: 0<br />relevance_score: 232","count: 0<br />relevance_score: 233","count: 0<br />relevance_score: 234","count: 0<br />relevance_score: 235","count: 0<br />relevance_score: 236","count: 0<br />relevance_score: 
237","count: 0<br />relevance_score: 238","count: 0<br />relevance_score: 239","count: 0<br />relevance_score: 240","count: 0<br />relevance_score: 241","count: 0<br />relevance_score: 242","count: 0<br />relevance_score: 243","count: 0<br />relevance_score: 244","count: 1<br />relevance_score: 245","count: 0<br />relevance_score: 246","count: 0<br />relevance_score: 247","count: 0<br />relevance_score: 248","count: 0<br />relevance_score: 249","count: 0<br />relevance_score: 250","count: 0<br />relevance_score: 251","count: 0<br />relevance_score: 252","count: 0<br />relevance_score: 253","count: 0<br />relevance_score: 254","count: 0<br />relevance_score: 255","count: 0<br />relevance_score: 256","count: 0<br />relevance_score: 257","count: 0<br />relevance_score: 258","count: 0<br />relevance_score: 259","count: 0<br />relevance_score: 260","count: 0<br />relevance_score: 261","count: 0<br />relevance_score: 262","count: 0<br />relevance_score: 263","count: 0<br />relevance_score: 264","count: 0<br />relevance_score: 265","count: 0<br />relevance_score: 266","count: 0<br />relevance_score: 267","count: 0<br />relevance_score: 268","count: 0<br />relevance_score: 269","count: 0<br />relevance_score: 270","count: 0<br />relevance_score: 271","count: 0<br />relevance_score: 272","count: 0<br />relevance_score: 273","count: 0<br />relevance_score: 274","count: 0<br />relevance_score: 275","count: 0<br />relevance_score: 276","count: 0<br />relevance_score: 277","count: 0<br />relevance_score: 278","count: 0<br />relevance_score: 279","count: 0<br />relevance_score: 280","count: 0<br />relevance_score: 281","count: 0<br />relevance_score: 282","count: 0<br />relevance_score: 283","count: 0<br />relevance_score: 284","count: 0<br />relevance_score: 285","count: 0<br />relevance_score: 286","count: 0<br />relevance_score: 287","count: 0<br />relevance_score: 288","count: 0<br />relevance_score: 289","count: 0<br />relevance_score: 290","count: 0<br />relevance_score: 
291","count: 0<br />relevance_score: 292","count: 0<br />relevance_score: 293","count: 0<br />relevance_score: 294","count: 0<br />relevance_score: 295","count: 0<br />relevance_score: 296","count: 0<br />relevance_score: 297","count: 0<br />relevance_score: 298","count: 0<br />relevance_score: 299","count: 0<br />relevance_score: 300","count: 0<br />relevance_score: 301","count: 0<br />relevance_score: 302","count: 0<br />relevance_score: 303","count: 0<br />relevance_score: 304","count: 0<br />relevance_score: 305","count: 0<br />relevance_score: 306","count: 0<br />relevance_score: 307","count: 0<br />relevance_score: 308","count: 0<br />relevance_score: 309","count: 0<br />relevance_score: 310","count: 0<br />relevance_score: 311","count: 0<br />relevance_score: 312","count: 0<br />relevance_score: 313","count: 0<br />relevance_score: 314","count: 0<br />relevance_score: 315","count: 0<br />relevance_score: 316","count: 0<br />relevance_score: 317","count: 0<br />relevance_score: 318","count: 0<br />relevance_score: 319","count: 0<br />relevance_score: 320","count: 0<br />relevance_score: 321","count: 0<br />relevance_score: 322","count: 0<br />relevance_score: 323","count: 0<br />relevance_score: 324","count: 0<br />relevance_score: 325","count: 0<br />relevance_score: 326","count: 0<br />relevance_score: 327","count: 0<br />relevance_score: 328","count: 0<br />relevance_score: 329","count: 0<br />relevance_score: 330","count: 0<br />relevance_score: 331","count: 0<br />relevance_score: 332","count: 0<br />relevance_score: 333","count: 0<br />relevance_score: 334","count: 0<br />relevance_score: 335","count: 0<br />relevance_score: 336","count: 0<br />relevance_score: 337","count: 0<br />relevance_score: 338","count: 0<br />relevance_score: 339","count: 0<br />relevance_score: 340","count: 0<br />relevance_score: 341","count: 0<br />relevance_score: 342","count: 0<br />relevance_score: 343","count: 0<br />relevance_score: 344","count: 0<br />relevance_score: 
345","count: 0<br />relevance_score: 346","count: 0<br />relevance_score: 347","count: 0<br />relevance_score: 348","count: 0<br />relevance_score: 349","count: 0<br />relevance_score: 350","count: 0<br />relevance_score: 351","count: 0<br />relevance_score: 352","count: 0<br />relevance_score: 353","count: 0<br />relevance_score: 354","count: 0<br />relevance_score: 355","count: 0<br />relevance_score: 356","count: 0<br />relevance_score: 357","count: 0<br />relevance_score: 358","count: 0<br />relevance_score: 359","count: 0<br />relevance_score: 360","count: 0<br />relevance_score: 361","count: 0<br />relevance_score: 362","count: 0<br />relevance_score: 363","count: 0<br />relevance_score: 364","count: 0<br />relevance_score: 365","count: 0<br />relevance_score: 366","count: 0<br />relevance_score: 367","count: 0<br />relevance_score: 368","count: 0<br />relevance_score: 369","count: 0<br />relevance_score: 370","count: 0<br />relevance_score: 371","count: 0<br />relevance_score: 372","count: 0<br />relevance_score: 373","count: 0<br />relevance_score: 374","count: 0<br />relevance_score: 375","count: 0<br />relevance_score: 376","count: 0<br />relevance_score: 377","count: 0<br />relevance_score: 378","count: 0<br />relevance_score: 379","count: 0<br />relevance_score: 380","count: 0<br />relevance_score: 381","count: 0<br />relevance_score: 382","count: 0<br />relevance_score: 383","count: 0<br />relevance_score: 384","count: 0<br />relevance_score: 385","count: 0<br />relevance_score: 386","count: 0<br />relevance_score: 387","count: 0<br />relevance_score: 388","count: 0<br />relevance_score: 389","count: 0<br />relevance_score: 390","count: 0<br />relevance_score: 391","count: 0<br />relevance_score: 392","count: 0<br />relevance_score: 393","count: 0<br />relevance_score: 394","count: 0<br />relevance_score: 395","count: 0<br />relevance_score: 396","count: 0<br />relevance_score: 397","count: 0<br />relevance_score: 398","count: 0<br />relevance_score: 
399","count: 0<br />relevance_score: 400","count: 0<br />relevance_score: 401","count: 0<br />relevance_score: 402","count: 0<br />relevance_score: 403","count: 0<br />relevance_score: 404","count: 0<br />relevance_score: 405","count: 0<br />relevance_score: 406","count: 0<br />relevance_score: 407","count: 0<br />relevance_score: 408","count: 0<br />relevance_score: 409","count: 0<br />relevance_score: 410","count: 0<br />relevance_score: 411","count: 0<br />relevance_score: 412","count: 0<br />relevance_score: 413","count: 0<br />relevance_score: 414","count: 0<br />relevance_score: 415","count: 0<br />relevance_score: 416","count: 0<br />relevance_score: 417","count: 0<br />relevance_score: 418","count: 0<br />relevance_score: 419","count: 0<br />relevance_score: 420","count: 0<br />relevance_score: 421","count: 0<br />relevance_score: 422","count: 0<br />relevance_score: 423","count: 0<br />relevance_score: 424","count: 0<br />relevance_score: 425","count: 0<br />relevance_score: 426","count: 0<br />relevance_score: 427","count: 0<br />relevance_score: 428","count: 0<br />relevance_score: 429","count: 0<br />relevance_score: 430","count: 0<br />relevance_score: 431","count: 0<br />relevance_score: 432","count: 0<br />relevance_score: 433","count: 0<br />relevance_score: 434","count: 0<br />relevance_score: 435","count: 0<br />relevance_score: 436","count: 0<br />relevance_score: 437","count: 0<br />relevance_score: 438","count: 0<br />relevance_score: 439","count: 0<br />relevance_score: 440","count: 0<br />relevance_score: 441","count: 0<br />relevance_score: 442","count: 0<br />relevance_score: 443","count: 0<br />relevance_score: 444","count: 0<br />relevance_score: 445","count: 0<br />relevance_score: 446","count: 0<br />relevance_score: 447","count: 0<br />relevance_score: 448","count: 0<br />relevance_score: 449","count: 0<br />relevance_score: 450","count: 0<br />relevance_score: 451","count: 0<br />relevance_score: 452","count: 0<br />relevance_score: 
453","count: 0<br />relevance_score: 454","count: 0<br />relevance_score: 455","count: 0<br />relevance_score: 456","count: 0<br />relevance_score: 457","count: 0<br />relevance_score: 458","count: 0<br />relevance_score: 459","count: 0<br />relevance_score: 460","count: 0<br />relevance_score: 461","count: 0<br />relevance_score: 462","count: 0<br />relevance_score: 463","count: 0<br />relevance_score: 464","count: 0<br />relevance_score: 465","count: 0<br />relevance_score: 466","count: 0<br />relevance_score: 467","count: 0<br />relevance_score: 468","count: 0<br />relevance_score: 469","count: 0<br />relevance_score: 470","count: 0<br />relevance_score: 471","count: 0<br />relevance_score: 472","count: 0<br />relevance_score: 473","count: 0<br />relevance_score: 474","count: 0<br />relevance_score: 475","count: 0<br />relevance_score: 476","count: 0<br />relevance_score: 477","count: 0<br />relevance_score: 478","count: 0<br />relevance_score: 479","count: 0<br />relevance_score: 480","count: 0<br />relevance_score: 481","count: 0<br />relevance_score: 482","count: 0<br />relevance_score: 483","count: 0<br />relevance_score: 484","count: 0<br />relevance_score: 485","count: 0<br />relevance_score: 486","count: 0<br />relevance_score: 487","count: 0<br />relevance_score: 488","count: 0<br />relevance_score: 489","count: 0<br />relevance_score: 490","count: 0<br />relevance_score: 491","count: 0<br />relevance_score: 492","count: 0<br />relevance_score: 493","count: 0<br />relevance_score: 494","count: 0<br />relevance_score: 495","count: 0<br />relevance_score: 496","count: 0<br />relevance_score: 497","count: 0<br />relevance_score: 498","count: 0<br />relevance_score: 499","count: 0<br />relevance_score: 500","count: 0<br />relevance_score: 501","count: 0<br />relevance_score: 502","count: 0<br />relevance_score: 503","count: 0<br />relevance_score: 504","count: 0<br />relevance_score: 505","count: 0<br />relevance_score: 506","count: 0<br />relevance_score: 
      [Figure: "Distribution of Relevance Scores" (bar chart; x-axis: Relevance Score; y-axis: Number of Papers)]

      How does one interpret a relevance score? Is it relevant to the search or standardized somehow?

    1. Author response:

      The following is the authors’ response to the current reviews.

      We thank the Reviewers for highlighting the strengths of our work and for their suggestions for future directions.

      We agree with the Reviewers that RPS26 depletion may impact not only RAN translation initiation and codon selection (as shown in the experiments in Figure 4G), but also other mechanisms, such as the speed of PIC scanning, as we stated in the Discussion. However, we did provide data showing that the mRNA level of exogenous FMR1-GFP does not change upon RPS26 depletion (Figure 3B&C); hence, the observed effect most likely stems from translational regulation. In addition, the experiment with ASO-ACG treatment (Figure 4G) suggests that near-cognate start codon selection or the speed of PIC scanning may be part of the regulation of RAN translation that is sensitive to RPS26 depletion. Furthermore, our latest unpublished results (Niewiadomska D. et al., in revision) indicate that FMRpolyG fused with GFP is fairly stable, particularly when derived from long repeats (>90xCGG), suggesting that protein stability is not at play in RPS26-dependent regulation.

      We would like to stress that, in order to avoid bias in the interpretation of results and to mimic the natural situation, the majority of experiments concerning FMRpolyG levels were performed in cell models with stable expression of ACG-initiated FMRpolyG. Currently, we do not possess a cell model with stable expression of AUG-initiated FMRpolyG, and experiments based on a transient transfection system would not necessarily be comparable to the results obtained in a stable expression system. However, we believe that the experiment presented in Figure 2B serves as a good control for the overall translation level upon RPS26 depletion, indicating that RPS26 insufficiency does not affect global translation and that the observed regulation is specific to some mRNAs, including the one encoding the FMRpolyG frame. We also show that the level of ca. 80% of identified canonical proteins, including FMRP, did not change upon RPS26 silencing (SILAC-MS, Figure 4A). Indeed, we did not explore the ribosome composition upon RPS26 and TSR2 depletion, although the pool of functional ribosomes in the cell is most likely sufficient to support the basal translation level (SUnSET assays, Figure 2B & 5C). However, we cannot exclude the possibility that for some mRNAs, including the one encoding FMRpolyG, the observed effect is partially caused by a lower number of fully active ribosomes, especially in transient transfection experiments, where transgene expression is hundreds of times higher than that of an average native mRNA.

      Finally, we agree with the Reviewer that an in vitro translation assay would provide evidence of a direct effect of RPS26 on the FMRpolyG level; however, we did not manage to overcome the technical difficulties in obtaining cellular lysate devoid of RPS26 from vendor companies.


      The following is the authors’ response to the original reviews.

      General Comments

      We thank the Reviewers for their critical comments and experimental suggestions. We took most of this advice into account in the revised version of the manuscript, which allowed for a more balanced interpretation of the results presented and further supported the major statement of the manuscript: that insufficiency of RPS26 and RPS25 plays a role in modulating the efficiency of noncanonical RAN translation from FMR1 mRNA, which results in the production of the toxic polyglycine protein (FMRpolyG). Firstly, by performing new experiments, we showed that silencing of RPS26 and its chaperone protein TSR2, which regulates loading/exchange of RPS26 in the maturing small ribosomal subunit, did not elicit global translation inhibition. Secondly, we demonstrated that, in contrast to RPS26 and RPS25 depletion, silencing of the RPS6 protein, a core component of the 40S subunit, did not affect FMRpolyG production, further supporting the specific effect of RPS26 and RPS25 on the regulation of RAN translation of mutant FMR1 mRNA. We also observed that depletion of RPS26, RPS25 and RPS6 had a significant negative effect on cell proliferation, which is in line with previously published results indicating that insufficiencies of ribosomal proteins negatively affect cell growth. Moreover, we showed that FMRpolyG production is significantly affected by RPS26 depletion when initiated at ACG, but not at other near-cognate start codons. Importantly, translation of FMRP initiated at the canonical AUG codon of the same mRNA upstream of the CGGexp was not affected by RPS26 silencing, similarly to the vast majority of the human proteome. This implies that the effect of RPS26 insufficiency on RAN translation of FMR1 mRNA is likely to depend on start codon selection/fidelity. In essence, we provide several lines of evidence indicating that the cellular amount of the 40S ribosomal proteins RPS26 and RPS25 is an important factor in the regulation of CGG-related RAN translation. Finally, we also decided to tone down our claims. Now, we state that RPS26/25/TSR2 insufficiency or depletion affects RAN translation, rather than that the composition of the 40S ribosomal subunit per se influences RAN translation. We have addressed all specific concerns below and made changes to the new version of the manuscript.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Tutak et al use a combination of pulldowns, analyzed by mass spectrometry, reporter assays, and fluorescence experiments to decipher the mechanism of protein translation in fragile X-related diseases. The topic is interesting and important.

      Although a role for Rps26-deficient ribosomes in toxic protein translation is plausible based on already available data, the authors' data are not carefully controlled and thus do not support the conclusions of the paper.

      We sincerely appreciate your rigorous, insightful, and constructive feedback throughout the revision process. We believe your guidance has been instrumental in significantly enhancing the quality of our research. Below, we have addressed your comments point-by-point.

      Strengths:

      The topic is interesting and important.

      Weaknesses:

      In particular, there is very little data to support the notion that Rps26-deficient ribosomes are even produced under the circumstances. And no data that indicate that they are involved in the RAN translation. Essential controls (for ribosome numbers) are lacking, no information is presented on the viability of the cells (Rps26 is an essential protein), and the differences in protein levels could well arise from block in protein synthesis, and cell division coupled to differential stability of the proteins.

      We agree that the data presented in the first version of the manuscript did not directly address the following: ribosome content, global translation rate and cell viability upon RPS26 depletion. Therefore, we addressed some of these issues in the revised version of the manuscript. In particular, we showed that RPS26 and TSR2 knockdown did not inhibit global translation (new Figure 2B & 4C); hence, we concluded that the changes in FMRpolyG level did not arise from a general translational shutdown. On the other hand, RPS26, RPS25 and RPS6 depletion negatively affected cell proliferation (new Figure 2A, 5D, 6C), which is in line with a number of previously published studies (e.g. Cheng et al, 2019; Havkin-Solomon et al, 2023). However, the extent of the proliferation abnormalities is limited. We agree that the observed effects on RAN translation from mutant FMR1 mRNA may stem from a combination of altered protein synthesis and the condition of the cells, but also from cis-acting factors of the mRNA sequence/structure. In new experiments, we showed that single-nucleotide substitution of ACG with other near-cognate start codons changes the sensitivity of RAN translation to RPS26 insufficiency (new Figure 4F). Also, the inhibitory effect of an antisense oligonucleotide binding to the region of the 5’UTR containing the ACG initiation codon (ASO_ACG) differs between cells with different amounts of RPS26 (new Figure 4G).

      We also agree that our data only partially support the role of RPS26-deficient ribosomes in RAN translation. Therefore, we have toned down our claims. Now, we state that RPS26/25/TSR2 insufficiency or depletion affects RAN translation. We also changed the title of the manuscript to: “Insufficiency of 40S ribosomal proteins, RPS26 and RPS25, negatively affects biosynthesis of polyglycine-containing proteins in fragile-X associated conditions” (previously: “Ribosomal composition affects the noncanonical translation and toxicity of polyglycine-containing proteins in fragile X-associated conditions”).

      Specific points:

      (1) Analysis of the mass spec data in Supplemental Table S3 indicates that for many of the proteins that are differentially enriched in one sample, a single peptide is identified. So the difference is between 1 peptide and 0. I don't understand how one can do a statistical analysis on that, or how it would give out anything of significance. I certainly do not think it is significant. This is exacerbated by the fact that the contaminants in the assay (keratins) are many, many-fold more abundant, and so are proteins that are known to be mitochondrial or nuclear, and therefore likely not actual targets (e.g. MCCC1, PC, NPM1; this includes many proteins "of significance" in Table S1, including Rrp1B, NAF1, Top1, TCEPB, DHX16, etc...).

      The data in Table S6/Figure 3A suffer from the same problem.

      I am not convinced that the mass spec data is reliable.

      We thank the Reviewer for the comment concerning the MS data; however, we believe that it may stem from a misunderstanding of the data presented in Tables S3 and S6. Both tables represent the output of the MaxQuant analysis (so-called ProteinGroup) of MS .raw files, without any filtering. As stated in the Materials & Methods, we applied the default parameters suggested by the MaxQuant developers to analyze the MS data; these include identification of proteins based on at least 1 unique peptide, and thus some proteins with only 1 unique peptide are shown in Tables S1 and S3. The Reviewer is also right that common contaminants, such as keratins, are included in this output table. However, these identifications are denoted as “CON_” and are filtered out during statistical analysis in the Perseus software. During the statistical analysis, we first filtered out irrelevant protein group identifications, such as contaminants or those only identified by site modifications.

      We have changed the names of the Supplementary Table files, giving more detailed descriptions. We hope this will help to avoid misunderstanding among a broader audience. Secondly, when comparing the data presented in Table S3 with the volcano plot presented in Figure 1B, one can notice that the majority of identified proteins are indeed not statistically significant (grey points) and thus were not selected for further stratification. The lack of significance of these proteins may be partially due to poor MS identification; however, they are not included in the following parts of the manuscript. Further, we selected only eight proteins (out of over 150) for stratification by orthogonal techniques; thus, we argue that this step validates the biological relevance of the chosen candidate RAN translation modifiers. One should also keep in mind that pull-down samples analyzed by MS often yield lower intensities and identification rates compared to whole-cell analysis, as a result of lower protein input or the stringent washes used during sample preparation.

      Regarding the data presented in Table S6 (SILAC data), we argue that these data are of very good quality. More than 2,000 proteins were identified in a 125 min gradient, with over 80% of proteins identified with at least 2 unique peptides. Each of three biological replicates was analyzed three times (technical replicates), giving a total of 9 high-resolution MS runs. Together, we strongly believe that these data are of high confidence.

      (2) The mass-spec data however claims to identify Rps26 as a factor binding the toxic RNA specifically. The rest of the paper seeks to develop a story of how Rps26-deficient ribosomes play a role in the translation of this RNA. I do not consider that this makes sense.

      Indeed, we identified RPS26 as a protein that co-precipitated with FMR1 RNA containing expanded CGG repeats (Supplementary Figure 1G) and found that depletion of RPS26 hindered RAN translation of FMRpolyG, suggesting that RPS26 positively affects RAN translation. However, we did not state that RPS26 directly interacts with the toxic RNA. In order to confirm the specificity of RAN translation regulation by RPS26 insufficiency, we tested whether depletion of another 40S ribosomal protein, RPS6, affects FMRpolyG synthesis. Our experiments showed no significant effect on RAN translation efficiency after RPS6 silencing (new Figure 5C). Importantly, we showed that RPS26 depletion did not inhibit global translation (new Figure 2B). In addition, mutagenesis of the near-cognate start codon (new Figure 4F) and ASO_ACG treatment (new Figure 4G) provided evidence that modulation of FMRpolyG biosynthesis by the RPS26 level may depend on start codon selection. In essence, our data suggest that RPS26 depletion specifically affects synthesis of FMRpolyG, but not of FMRP derived from the same FMR1 mRNA with CGGexp. However, we do not claim that the observed effect is the consequence of a direct interaction between RPS26 and the 5’UTR of FMR1 mRNA. Downregulation of FMRpolyG biosynthesis could be an outcome of altered ribosomal assembly, decreased efficiency and fidelity of PIC scanning/initiation, impeded elongation, or a combination of all these processes. In the manuscript, we presented the results of experiments that tested many of these possibilities.

      (3) Rps26 is an essential gene, I am sure the same is true for DHX15. What happens to cell viability? Protein synthesis? The yeast experiments were carefully carried out under experiments where Rps26 was reduced, not fully depleted to give small growth defects.

      We agree with the Reviewer that RPS26 and DHX15 are essential proteins, similarly to all RNA binding proteins, and that caution should be taken during experimental design. To address this, we titrated different concentrations of siRPS26 and found that administration of 5 nM siRPS26, which only partially silenced RPS26, decreased FMRpolyG by around 50% (new Figure 1D). This impact was even greater with 15 nM siRPS26, as we observed an approximately 80% decrease in FMRpolyG.

      Havkin-Solomon et al. (2023) showed that the proliferation rate is decreased in cells with a mutated C-terminus of RPS26, which is required for contacting mRNA. In accordance with this study, we showed that cells with knocked-down RPS26 proliferate less efficiently (new Figure 2A), but depletion of RPS26 did not impact global translation (new Figure 2B). In addition, our SILAC-MS data indicate that ~80% of proteins with a determined expression level were not affected by RPS26 insufficiency, and ~20% of the proteins turned out to be sensitive to the RPS26 decrease. However, these data do not take protein stability into account.

      (4) Knockdown efficiency for all tested genes must be shown to evaluate knockdown efficiency.

      The current version of the manuscript contains representative western blots with validation of knock-down efficiency (for example in Figure 3B, C, E, Figure 6A) and we included knock-down validations where applicable (Figures 1D, 2B, 4G and 5C).

      (5) The data in Figure 1E have just one mock control, but two cell types (control si and Rps26 depletion).

      The mock control corresponds to cells treated with the lipofectamine reagent and was included in the study to determine the “background” signal from cells treated with the delivery agent and the reagents used to measure apoptosis. These cells expressed neither FMRpolyG nor siRNAs. Luminescence signals were normalized to the values obtained from the mock control. We added more details describing this assay to the Figure 1 legend.

      (6) The authors' data indicate that the effects are not specific to Rps26 but indeed also observed upon Rps25 knockdown. This suggests strongly that the effects are from reduced ribosome content or blocked protein synthesis. Additional controls should deplete a core RP to ascertain this conclusion.

      We agree that the observed effects may stem from reduced ribosome content; however, we argue that this is not the only possible explanation. Previously, it was shown that RPS25 regulates G4C2-related RAN translation, but knockout of RPS25 does not affect global translation (Yamada S, 2019, Nat. Neuroscience). Similarly, we showed that KD of RPS26 or TSR2 did not significantly reduce the global translation rate (SUnSET assay; new Figure 2B and 5C, respectively).

      Moreover, in the new version of the manuscript, we included a control experiment in which we silenced a core ribosomal protein (RPS6) and found that RPS6 depletion did not affect RAN translation from mutant FMR1 mRNA (new Figure 5C), thus strengthening our conclusion about the specific regulation of RAN translation by the levels of RPS26 and RPS25.

      Finally, our observation aligns well with current knowledge about how deficiency of different ribosomal proteins alters the translation of some classes of mRNAs (Luan Y, 2022, Nucleic Acids Res; Cheng Z, 2019, Mol Cell). It was shown that depletion of RPS26 affects the translation rate of different mRNAs than depletion of other proteins of the small ribosomal subunit.

      (7) Supplemental Figure S3 demonstrates that the depletion of S26 does not affect the selection of the start codon context. Any other claim must be deleted. All the 5'-UTR logos are essentially identical, indicating that "picking" happens by abundance (background).

      Supplementary Figure 3D presents results indicating that the mutation at the -4 position (from G to A) did not affect RAN translation regardless of RPS26 presence or depletion. However, this result does not imply that RPS26 does not affect the selection of the start codon in a sequence or RNA structure context. We verified this particular -4 position because it was previously suggested to be an important RPS26-sensitive site in yeast (Ferretti M, 2017, Nat Struct Mol Biol). We agree with the Reviewer that none of the 5’UTR logos presented in our paper showed statistical significance for any tested position in human mRNAs. On the other hand, we observed that the regulation sensitive to RPS26 level depends on the start codon used for RAN translation, in particular ACG initiation (new Figure 4F&G). RPS26 depletion affected ACG-initiated but not GTG- or CTG-initiated RAN translation.

      In the previous version of the manuscript, we wrote that we did not identify any specific motifs or enrichment within the analyzed transcripts in comparison to the background. On the other hand, we found that the GC-content among the analyzed transcripts is higher within 5’UTRs and in close proximity to the ATG in coding sequences (Figure 4D), which suggests the importance of stable RNA structures in this region. In addition, we showed that mRNAs encoding proteins responding to RPS26 depletion have shorter than average 5’UTRs (new Figure 4E).

      (8) Mechanism is lacking entirely. There are many ways in which ribosomes could have mRNA-specific effects. The authors tried to find an effect from the Kozak sequence, unsuccessfully (however, they also did not do the experiment correctly, as they failed to recognize that the Kozak sequence differs between yeast, where it is A-rich, and mammalian cells, where it is GGCGCC). Collisions could be another mechanism.

      Indeed, collisions as well as other mechanisms, such as skewed start codon fidelity, may have an effect on the efficiency of FMRpolyG biosynthesis. In the current version of the manuscript, we show that the regulation sensitive to the amount of RPS26 seems to depend on start codon selection (new Figure 4F&G).

      Reviewer #2 (Public Review):

      Summary:

      Translation of CGG repeats leads to the accumulation of poly G, which is associated with neurological disorders. This is a valuable paper in which the authors sought out proteins that modulate RAN translation. They determined which proteins in Hela cells bound to CGG repeats and affected levels of polyG encoded in the 5'UTR of the FMR1 mRNA. They then showed that siRNA depletion of ribosomal protein RPS26 results in less production of FMR1polyG than in control. There are data supporting the claim that RPS26 depletion modulates RAN translation in this RNA, although for some results, the Western results are not strong. The data to support increased aggregation by polyG expression upon S26 KD are incomplete.

      We thank the Reviewer for critical comments and suggestions. We sincerely appreciate your rigorous, insightful, and constructive feedback throughout the revision process.

      Below each specific point, we addressed the mentioned issues.

      Strengths:

      The authors have proteomics data that show the enrichment of a set of proteins on FMR1 RNA but not a related RNA.

      We thank the Reviewer for appreciating the MS-screening results, which identified proteins enriched on FMR1 RNA with expanded CGG repeats.

      Weaknesses:

      - It is insinuated that RPS26 binds the RNA to enhance CGG-containing protein expression. However, RPS26 reduction was also shown previously to affect ribosome levels, and reduced ribosome levels can result in ribosomes translating very different RNA pools.

      In the previous version of the manuscript, we did not state that RPS26 binds directly to RNA with expanded CGG repeats, and we did not show an experiment indicating a direct interaction between the studied RNA and RPS26. What we showed is that RPS26 was enriched in the FMR1 RNA MS samples; however, we did not verify whether this interaction is direct or indirect. We also tried to test the hypothesis that a lack of RPS26 in the PIC complex may affect the efficiency of RAN translation initiation via the specific Kozak context previously described in yeast (Ferretti M, 2017, Nat Struct Mol Biol). As we described, this hypothesis was not confirmed. However, we showed that other features of 5’UTR sequences (e.g. higher GC-content or a shorter leader sequence) are potentially important for translation efficiency in cells with depleted RPS26.

      Indeed, RPS26 is involved in 40S maturation steps (Plassart L, 2021, eLife), and its insufficiency, mutations, or blocking of its inclusion into the 40S ribosome may result in incomplete 40S maturation, which subsequently might negatively affect translation per se. However, we did not observe global translation inhibition after depletion of RPS26 or of TSR2, the chaperone involved in incorporation/exchange of RPS26 into the small ribosomal subunit (new Figure 2B and 5C). In addition, our SILAC-MS data indicate that the majority of studied proteins (including FMRP, the main product of the FMR1 gene) were not affected by RPS26 depletion, which can be carefully extrapolated to global translation. In the revised manuscript, we also showed that relatively low silencing of RPS26 also decreased FMRpolyG production in model cells (new Figure 1D).

      We agree that reduced ribosome levels can result in different translation efficiencies for different RNA pools. We emphasize this point in the revised manuscript. However, we also showed that introducing different near-cognate start codons specific to RAN translation into the same mRNA (single/two-nucleotide substitutions), or targeting this codon with antisense oligonucleotides, resulted in altered sensitivity of FMR1 mRNA translation to RPS26 depletion (new Figure 4F).

      - A significant claim is that RPS26 KD alleviates the effects of FMRpolyG expression, but those data aren't presented well.

      We thank the Reviewer for this comment. In the new version of the manuscript, we have added new microscopic images and improved the explanation of Figure 1E. We have also completed the interpretation of Figure 1F in the main text, the figure image and the figure legend, and we hope that these changes will improve understanding of our data.

      Recommendations For The Authors:

      - A significant claim is that RPS26 KD alleviates the effects of FMR polyG expression, but those data aren't presented well:

      Figure 1D (supporting data in S2) and 2D - the authors need to show representative images of a control that has aggregation and indicate aggregates being counted on an image. The legend states that there are no aggregates, but the quantification of aggregates/nucleus is ~1, suggesting there are at least 1 per cell. It is preferred to show at least a representative of what is quantified in the main figure instead of a bar graph.

      Representative images of control and siRPS26-treated cells are now shown in the revised version of Figure 1E. Additionally, we completed the figure legend concerning this part and extended the description of the experiment in the Materials & Methods section.

      Figure 1E - it is unclear what luminescence signal is being measured. Is this a dye for an apoptotic marker? More information is needed in the legend.

      This information was added to the legend of modified Figure 1F (previously 1E) as suggested.

      - Some of the Western blots are not very convincing. Better evidence for the changes in bar graphs would improve how convincing the data are:

      Fig 2B. The western for FMR95G in the first model is not very convincing. The difference by eye for the second siRNA seems to give a larger effect than the first for 95G construct but they appear almost the same on the graph. More supporting information for the quantification is needed.

      We provided a better explanation of WB quantification in the M&M section of the manuscript. Also, we provided an additional blot demonstrating an independent biological replicate of the mentioned experiment in the supplementary materials (Supplementary Figure S2E).

      Figure 4A, the blots for RPS26 and FMR95G are not convincing. They are quite smeary compared to all of the others shown for these proteins in other figures. Could a different replicate be shown?

      We provided an additional blot demonstrating the effect of TSR2 depletion on transiently expressed FMRpolyG in the COS7 cell line (Supplementary Figure S4A).

      Figure 5A and 5B blots are not ideal. Could a different replicate be shown? Or show multiple replicates in the supplemental figure?

      We provided additional blots from the same experiment, although the data are not statistically significant, most likely due to the low quality of the normalization factor, Vinculin (Supplementary Figure S5A). Nevertheless, the level of FMRpolyG is decreased by ~70% after RPS25 silencing in SH-SY5Y cells.

      Figure 2C. Please use the same y axes for all four Westerns in B and C. One would like to compare 95 and 15 repeats, but it is difficult when the y axes are different.

      Thank you for this comment. The y axis was adjusted as suggested by the Reviewer.

      Figure 3D-The text suggests a significant difference between positive and negative responders that is not clear in the figure.

      In the main body of the manuscript we state that: “We did not observe any significant differences in the frequency of individual nucleotide positions in the 20-nucleotide vicinity of the start codon relative to the expected distribution in the BG”, which is in line with the graph shown in Figure 4D (previously 3D).

      Reviewer #3 (Public Review):

      Tutak et al provide interesting data showing that RPS26 and relevant proteins such as TSR2 and RPS25 affect RAN translation from CGG repeat RNA in fragile X-associated conditions. They identified RPS26 as a potential regulator of RAN translation by RNAtagging system and mass spectrometry-based screening for proteins binding to CGG repeat RNA and confirmed its regulatory effects on RAN translation by siRNA-based knockdown experiments in multiple cellular disease models and patient-derived fibroblasts. Quantitative mass spectrometry analysis found that the expressions of some ribosomal proteins are sensitive to RPS26 depletion while approximately 80% of proteins including FMRP were not influenced. Since the roles of ribosomal proteins in RAN translation regulation have not been fully examined, this study provides novel insights into this research field. However, some data presented in this manuscript are limited and preliminary, and their conclusions are not fully supported.

      (1) While the authors emphasized the importance of ribosomal composition for RAN translation regulation in the title and the article body, the association between RAN translation and ribosomal composition is apparently not evaluated in this work. They found that specific ribosomal proteins (RPS26 and RPS25) can have regulatory effects on RAN translation (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B), and that the expression levels of some ribosomal proteins can be changed by RPS26 knockdown (Figure 3B, however, the change of the ribosome compositions involved in the actual translation has not been elucidated). Therefore, their conclusive statement, that is, "ribosome composition affects RAN translation" is not fully supported by the presented data and is misleading.

      We thank the Reviewer for critical comments and suggestions. We agree that the initial title and some statements in the text were misleading and that the presented data did not fully support the statement that ribosomal composition affects FMRpolyG synthesis. Therefore, in the revised version of the manuscript we included a control experiment indicating that depletion of another core 40S ribosomal protein (RPS6) did not impact FMRpolyG synthesis (new Figure 5C), which supports our hypothesis that RPS26 and RPS25 are specific CGG-related RAN translation modifiers. To convey the main message of our work precisely, we changed the title to indicate the specific effect of RPS26 and RPS25 insufficiency on RAN translation of FMRpolyG. Proposed title: “Insufficiency of 40S ribosomal proteins, RPS26 and RPS25 negatively affects biosynthesis of polyglycine-containing proteins in fragile-X associated conditions”. We also changed all statements regarding “ribosomal composition” in the main text of the new version of the manuscript.

      (2) The study provides insufficient data on the mechanisms of how RPS26 regulates RAN translation. Although authors speculate that RPS26 may affect initiation codon fidelity and regulate RAN translation in a CGG repeat sequence-independent manner (Page 9 and Page 11), what they really have shown is just identification of this protein by the screening for proteins binding to CGG repeat RNA (Figure 1A, 1B), and effects of this protein on CGG repeat-RAN translation. It is essential to clarify whether the regulatory effect of RPS26 on RAN translation is dependent on CGG repeat sequence or near-cognate initiation codons like ACG and GUG in the 5' upstream sequence of the repeat. It would be better to validate the effects of RPS26 on translation from control constructs, such as one composed of the 5' upstream sequence of FMR1 with no CGG repeat, and one with an ATG substitution in the 5' upstream sequence of FMR1 instead of near-cognate initiation codons.

      We agree that the data presented in the manuscript imply that insufficiency of RPS26 plays a pivotal role in the regulation of CGG-related RAN translation, and in the revised version of the manuscript we included a series of experiments indicating that ACG codon selection seems to be an important part of RPS26 level-dependent regulation of polyglycine production (new Figure 4F&G; see point 3 below for more details). Importantly, in the luciferase assay shown in Figure 4F we used the AUG-initiated firefly luciferase reporter as a normalization control.

      Moreover, to verify whether the FMRpolyG response to RPS26 deficiency depends on the type of reporter used, we repeated many experiments using FMRpolyG fused with different tags. The luciferase-based assays were in line with experiments conducted on constructs with a GFP tag (new Figure 1D), thus strengthening our previous data. Moreover, in a series of experiments, we show that FMRP synthesis, which is initiated from the ATG codon located in FMR1 exon 1, was not affected by RPS26 depletion (Figure 3E & 4C), even though its translation occurs on the same mRNA as FMRpolyG. This indicates a specific RPS26 regulation of the polyglycine frame initiated from the ACG near-cognate codon.

      (3) The regulatory effects of RPS26 and other molecules on RAN translation have all been investigated as effects on the expression levels of FMRpolyG-GFP proteins in cellular models expressing CGG repeat sequences (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B). In these cellular experiments, there are multiple confounding factors affecting the expression levels of FMRpolyG-GFP proteins other than RAN translation, including template RNA expression, template RNA distribution, and FMRpolyG-GFP protein degradation. Although authors evaluated the effect on the expression levels of template CGG repeat RNA, it would be better to confirm the effect of these regulators on RAN translation by other experiments such as in vitro translation assay that can directly evaluate RAN translation.

      We agree that multiple factors affect the final levels of FMRpolyG-GFP proteins, including the aforementioned processes. We evaluated the level of FMR1 mRNA, which was not decreased upon RPS26 depletion (Figure 3B&C); we therefore assumed that what we observed was regulation at the level of translation, especially as RPS26 is a ribosomal protein that contacts mRNA in the E-site. We believe that direct assays such as in vitro translation may be beneficial; however, depleting RPS26 from cellular lysate provided by the vendor seems technically challenging, if not completely impossible. Instead, we focused on sequence/structure-specific regulation of RAN translation, with an emphasis on start-codon selection. This yielded valuable results pointing to the role of RPS26 in start codon fidelity (Figure 4F&G). These new results showed that translation from mRNAs differing by only a single- or two-nucleotide substitution in the near-cognate start codon (ACG to GUG or ACG to CUG), although it yields exactly the same protein, is differently sensitive to RPS26 silencing (new Figure 4F). Similar differences were observed for translation efficiency from the same mRNA, targeted or not with an antisense oligonucleotide complementary to the region of the RAN translation initiation codon (new Figure 4G). These results also suggest that the stability of FMRpolyG is not affected in cells with a decreased level of RPS26.

      (4) While the authors state that RPS26 modulated the FMRpolyG-mediated toxicity, they presented limited data on apoptotic markers, not cellular viability (Figure 1E), not fully supporting this conclusion. Since previous work showed that FMRpolyG protein reduces cellular viability (Hoem G, 2019, Front Genet), additional evaluations for cellular viability would strengthen this conclusion.

      We thank the Reviewer for this suggestion. We addressed the apoptotic process in order to determine the effect of RPS26 depletion on RAN translation-related toxicity (Figure 1F). In the revised version of the manuscript, we also added an evaluation of how cell proliferation was affected by RPS26, RPS25, RPS6 and TSR2 depletion. Our data indicate that TSR2 silencing slightly impacted cellular fitness (new Figure 5D), whereas insufficiencies of RPS26, RPS25 and RPS6 had a much stronger negative effect on proliferation (new Figure 2A, 5D, 6C), which is in line with previous data (Cheng Z 2019, Mol Cell; Luan Y, 2022, Nucleic Acids Res). The difference in proliferation rate after treatment with siRPS26 makes proper interpretation of a cellular viability assessment very difficult.

      Recommendations For The Authors:

      (1) It would be nice to validate the effects of overexpression of RPS26 and other regulators on RAN translation, not limited to knockdown experiments, to support the conclusion.

      We did not perform such experiments because we believed that RPS26 overexpression would have no or only a marginal effect on translation or RAN translation. It is likely impossible to efficiently incorporate overexpressed RPS26 into 40S subunits, because the concentration of all ribosomal proteins in the cells is very high.

      (2) It would be better to explain how authors selected 8 proteins for siRNA-based validation (Figure 1C, 1D, S1D) from 32 proteins enriched in CGG repeat RNA in the first screening.

      We selected those candidates based on their functions connected to translation, structured RNA unwinding or mRNA processing. For example, we tested a few RNA helicases because of their known function in RAN translation regulation described by other researchers. This explanation was added to the revised version of the manuscript.

      (3) Original image data showing nuclear FMRpolyG-GFP aggregates should be presented in Figure 1D.

      Representative images of control and siRPS26-treated cells are now shown in the modified version of Figure 1E and described in more detail in the legend.

      (4) Image data in Figure 2A and 2D have poor signal/noise ratio and the resolution should be improved. In addition, aggregates should be clearly indicated in Figure 2D in an appropriate manner.

      The stable S-FMR95xG cellular model is characterized by very low expression of RAN-translated FMR95xG; it is therefore challenging to obtain microscopic images of better quality with higher GFP signal. In the L-99xCGG model, expression of the transgene is higher. Therefore, we provided a new image in the new version of Figure 3D (former 2D). Moreover, we showed aggregates in an image obtained using confocal microscopy (new Supplementary Figure 2D).

      (5) The detailed information on patient-derived fibroblast (age and sex of the patient, the number of CGG repeats, etc.) in Figure 2F needed to be presented.

      This information was added to the figure legend (Figure 3F; previously 2F) and in the Material and Methods section as suggested.

      (6) It would be better to normalize RNA expression levels of FMR1 and FMR1-GFP by the housekeeping gene in Figure S2C, like other RT-qPCR experimental data such as Figure 2B.

      Normalization of FMR1-GFP to GAPDH is now shown in the modified version of Figure S2C (right graph), as requested by the Reviewer.

      (7) It would be better to add information on molecular weight on all Western blotting data.

      Marks corresponding to the molecular weight ladder were added to all images.

      Full blots, including protein ladders were deposited in Zenodo repository, under doi: 10.5281/zenodo.13860370

      References

      Cheng Z, Mugler CF, Keskin A, Hodapp S, Chan LYL, Weis K, Mertins P, Regev A, Jovanovic M & Brar GA (2019) Small and Large Ribosomal Subunit Deficiencies Lead to Distinct Gene Expression Signatures that Reflect Cellular Growth Rate. Mol Cell 73: 36-47.e10

      Havkin-Solomon T, Fraticelli D, Bahat A, Hayat D, Reuven N, Shaul Y & Dikstein R (2023) Translation regulation of specific mRNAs by RPS26 C-terminal RNA-binding tail integrates energy metabolism and AMPK-mTOR signaling. Nucleic Acids Res 51: 4415–4428

      Hoem G, Larsen KB, Øvervatn A, Brech A, Lamark T, Sjøttem E & Johansen T (2019) The FMRpolyGlycine protein mediates aggregate formation and toxicity independent of the CGG mRNA hairpin in a cellular model for FXTAS. Front Genet 10: 1–18

      Luan Y, Tang N, Yang J, Liu S, Cheng C, Wang Y, Chen C, Guo YN, Wang H, Zhao W, et al (2022) Deficiency of ribosomal proteins reshapes the transcriptional and translational landscape in human cells. Nucleic Acids Res 50: 6601–6617

      Plassart L, Shayan R, Montellese C, Rinaldi D, Larburu N, Pichereaux C, Froment C, Lebaron S, O’donohue MF, Kutay U, et al (2021) The final step of 40s ribosomal subunit maturation is controlled by a dual key lock. Elife 10

    1. A single colony of NRRL B-24224 was picked and grown to saturation in a peptone yeast calcium (PYCa) medium. Genomic DNA was isolated by lysing the cultures of NRRL B-24224 in a 3110BX Mini-BeadBeater for 45 s, treating them with RNase, and performing a phenol-chloroform extraction. A barcoded library for Illumina sequencing was prepared with an NEB Ultra II FS kit and run on an Illumina MiSeq system, which produced just over 3 million 150-base single-end reads. From the same DNA sample, a second library for Oxford Nanopore sequencing was prepared with a rapid barcoding kit (SQK-RBK004) and run on a MinION sequencer with a FLO-MIN6 (R9.4) flow cell for 6 h, which produced ∼30,000 reads with an average length of ∼5.7 kb. These 2 sets of reads were assembled with Unicycler v0.4.4 (9), with the -s flag for the Illumina reads and the -l flag for the Nanopore reads and default settings. Unicycler was used to assemble the reads into a single circular bacterial contig, with 126-fold Illumina coverage and 47-fold Nanopore coverage, which was evaluated and adjusted for accuracy and completeness using Consed v29 and custom scripts as previously described (10).

      A single tiny group of NRRL B-24224 bacteria was taken and grown in a special liquid called PYCa. To study the bacteria's DNA, scientists broke open the bacteria cells to get their DNA out. They used a machine to shake the cells and a special mix of chemicals to help clean the DNA.

      Next, they made a special "barcode" for the DNA to help with reading it. They used two different types of machines to read the DNA. The first machine, called Illumina MiSeq, read over 3 million tiny pieces of DNA. The second machine, called Oxford Nanopore MinION, read about 30,000 pieces, but each piece was much longer.
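
      As a rough consistency check on the sequencing figures quoted above, genome size can be estimated as total sequenced bases divided by fold coverage. The sketch below is not part of the original methods; it uses round-number inputs taken from the approximate values in the text ("just over 3 million" reads, "∼30,000" reads, "∼5.7 kb"), so the results are only indicative. The function name is hypothetical.

```python
def implied_genome_size(n_reads: float, mean_read_len: float, fold_coverage: float) -> float:
    """Estimate genome size in bases as (total sequenced bases) / (fold coverage)."""
    return n_reads * mean_read_len / fold_coverage


# Assumed round-number inputs based on the approximate figures quoted in the text.
illumina_mb = implied_genome_size(3_000_000, 150, 126) / 1e6   # 150-base single-end reads
nanopore_mb = implied_genome_size(30_000, 5_700, 47) / 1e6     # ~5.7 kb average read length

print(f"Illumina-implied genome size: {illumina_mb:.2f} Mb")
print(f"Nanopore-implied genome size: {nanopore_mb:.2f} Mb")
```

      Under these assumptions the two libraries give similar estimates, which is what one would expect if both coverages refer to the same single circular contig.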

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      In this work, Noorman and colleagues test the predictions of the "four-stage model" of consciousness by combining psychophysics and scalp EEG in humans. The study relies on an elegant experimental design to investigate the respective impact of attentional and perceptual blindness on visual processing. 

      The study is very well summarised, the text is clear and the methods seem sound. Overall, a very solid piece of work. I haven't identified any major weaknesses. Below I raise a few questions of interpretation that may possibly be the subject of a revision of the text. 

      We thank the reviewer for their positive assessment of our work and for their extremely helpful and constructive comments that helped to significantly improve the quality of our manuscript.

      (1) The perceptual performance on Fig1D appears to show huge variation across participants, with some participants at chance levels and others with performance > 90% in the attentional blink and/or masked conditions. This seems to reveal that the procedure to match performance across participants was not very successful. Could this impact the results? The authors highlight the fact that they did not resort to postselection or exclusion of participants, but at the same time do not discuss this equally important point. 

      Performance was indeed highly variable between observers, as is commonly found in attentional-blink (AB) and masking studies. For some observers, the AB pushes performance almost to chance level, whereas for others it has almost no effect. A similar effect can be seen in masking. We did our best to match accuracy over participants, while also matching accuracy within participants as well as possible, adjusting mask contrast manually during the experimental session. Naturally, those that are strongly affected by masking need not be the same participants as those that are strongly affected by the AB, given the fact that they rely on different mechanisms (which is also one of the main points of the manuscript). To answer the research question, what mattered most was that at the group level, performance was well matched between the two key conditions. As all our statistical inferences, both for behavior and EEG decoding, rest on this group level, we do not think that variability at the individual-subject level detracts from this general approach.

      In the Results, we added that our goal was to match performance across participants:

      “Importantly, mask contrast in the masked condition was adjusted using a staircasing procedure to match performance in the AB condition, ensuring comparable perceptual performance in the masked and the AB condition across participants (see Methods for more details).”

      In the Methods, we added:

      “Second, during the experimental session, after every 32 masked trials, mask contrast could be manually updated in accordance with our goal to match accuracy over participants, while also matching accuracy within participants as well as possible.”
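
      The block-wise check described in the quoted Methods text can be sketched as follows. This is a minimal illustration only: the update in the study was performed manually, and the decision rule, the tolerance value, and all function names here are hypothetical assumptions. The sketch also assumes that a higher mask contrast lowers accuracy.

```python
from typing import List, Sequence

BLOCK_SIZE = 32  # masked trials per accuracy check, as stated in the Methods


def block_accuracies(correct: Sequence[bool], block_size: int = BLOCK_SIZE) -> List[float]:
    """Return the proportion correct for each complete block of masked trials."""
    n_blocks = len(correct) // block_size
    return [
        sum(correct[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(n_blocks)
    ]


def suggest_adjustment(masked_acc: float, target_acc: float, tolerance: float = 0.05) -> str:
    """Hypothetical rule: suggest a direction for the manual mask-contrast update."""
    if masked_acc > target_acc + tolerance:
        return "increase mask contrast"  # masking too weak relative to the AB condition
    if masked_acc < target_acc - tolerance:
        return "decrease mask contrast"  # masking too strong relative to the AB condition
    return "keep mask contrast"


# Made-up example: two blocks of 32 masked trials and an assumed AB accuracy of 0.65.
trials = [True] * 28 + [False] * 4 + [True] * 16 + [False] * 16
accs = block_accuracies(trials)
suggestions = [suggest_adjustment(a, target_acc=0.65) for a in accs]
```

      Here the target accuracy stands in for performance in the AB condition; in practice the experimenter would weigh matching at the group level against matching within each participant.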

      (2) In the analysis on collinearity and illusion-specific processing, the authors conclude that the absence of a significant effect of training set demonstrates collinearity-only processing. I don't think that this conclusion is warranted: as the illusory and nonillusory share the same shape, so more elaborate object processing could also be occurring. Please discuss. 

      We agree with this qualification of our interpretation, and included the reviewer’s account as an alternative explanation in the Discussion section:  

      “It should be noted that not all neurophysiological evidence unequivocally links processing of collinearity and of the Kanizsa illusion to lateral and feedback processing, respectively (Angelucci et al., 2002; Bair et al., 2003; Chen et al., 2014), so that overlap in decoding the illusory and non-illusory triangle may reflect other mechanisms, for example feedback processes representing the triangular shapes as well.”

      (3) Discussion, lines 426-429: It is stated that the results align with the notion that processes of perceptual segmentation and organization represent the mechanism of conscious experience. My interpretation of the results is that they show the contrary: for the same visibility level in the attentional blind or masking conditions, these processes can be implicated or not, which suggests a role during unconscious processing instead. 

      We agree with the reviewer that the interpretation of this result depends on the definition of consciousness that one adheres to. If one takes report as the leading metric for consciousness (=conscious access), one can indeed conclude that perceptual segmentation/organization can also occur unconsciously. However, if the processing that results in the qualitative nature of an image (rather than whether it is reported) is taken as leading – such as the processing that results in the formation of an illusory percept – (=phenomenal) the conclusion can be quite different. This speaks to the still ongoing debate regarding the existence of phenomenal vs access consciousness, and the literature on no-report paradigms amongst others (see last paragraph of the discussion). Because the current data do not speak directly to this debate, we decided to remove  the sentence about “conscious experience”, and edited this part of the manuscript (also addressing a comment about preserved unconscious processing during masking by Reviewer 2) by limiting the interpretation of unconscious processing to those aspects that are uncontroversial:

      “Such deep feedforward processing can be sufficient for unconscious high-level processing, as indicated by a rich literature demonstrating high-level (e.g., semantic) processing during masking (Kouider & Dehaene, 2007; Van den Bussche et al., 2009; van Gaal & Lamme, 2012). Thus, rather than enabling deep unconscious processing, preserved local recurrency during inattention may afford other processing advantages linked to its proposed role in perceptual integration (Lamme, 2020), such as integration of stimulus elements over space or time.”

      (4) The two paradigms developed here could be used jointly to highlight nonidiosyncratic NCCs, i.e. EEG markers of visibility or confidence that generalise regardless of the method used. Have the authors attempted to train the classifier on one method and apply it to another (e.g. AB to masking and vice versa)? What perceptual level is assumed to transfer? 

      To avoid issues with post-hoc selection of (visible vs. invisible) trials (discussed in the Introduction), we did not divide our trials into conscious and unconscious trials, and thus did not attempt to reveal NCCs, or NCCs generalizing across the two paradigms. Note also that this approach alone would not resolve the debate regarding the ‘true’ NCC as it hinges on the operational definition of consciousness one adheres to; also see our response to the previous point the reviewer raised. Our main analysis revealed that the illusory triangle could be decoded with above-chance accuracy during both masking and the AB over extended periods of time with similar topographies (Fig. 2B), so that significant cross-decoding would be expected over roughly the same extended period of time (except for the heightened 200-250 ms peak). However, as our focus was on differences between the two manipulations and because we did not use post-hoc sorting of trials, we did not add these analyses.

      (5) How can the results be integrated with the attentional literature showing that attentional filters can be applied early in the processing hierarchy? 

      Compared to certain manipulations of spatial attention, the AB phenomenon is generally considered to represent an instance of “late” attentional filtering. In the Discussion section we included a paragraph on classic load theory, where early and late filtering depend on perceptual and attentional load. Just preceding this paragraph, we added this:

      “Clearly, these findings do not imply that unconscious high-level (e.g., semantic) processing can only occur during inattention, nor do they necessarily generalize to other forms of inattention. Indeed, while the AB represents a prime example of late attentional filtering, other ways of inducing inattention or distraction (e.g., by manipulating spatial attention) may filter information earlier in the processing hierarchy (e.g., Luck & Hillyard, 1994 vs. Vogel et al., 1998).”

      Reviewer #2 (Public Review): 

      Summary: 

      This is a very elegant and important EEG study that unifies within a single set of behaviorally equated experimental conditions conscious access (and therefore also conscious access failures) during visual masking and attentional blink (AB) paradigms in humans. By a systematic and clever use of multivariate pattern classifiers across conditions, they could dissect, confirm, and extend a key distinction (initially framed within the GNWT framework) between 'subliminal' and 'pre-conscious' unconscious levels of processing. In particular, the authors could provide strong evidence to distinguish here within the same paradigm these two levels of unconscious processing that precede conscious access : (i) an early (< 80ms) bottom-up and local (in brain) stage of perceptual processing ('local contrast processing') that was preserved in both unconscious conditions, (ii) a later stage and more integrated processing (200-250ms) that was impaired by masking but preserved during AB. On the basis of preexisting studies and theoretical arguments, they suggest that this later stage could correspond to lateral and local recurrent feedback processes. Then, the late conscious access stage appeared as a P3b-like event. 

      Strengths: 

      The methodology and analyses are strong and valid. This work adds an important piece in the current scientific debate about levels of unconscious processing and specificities of conscious access in relation to feed-forward, lateral, and late brain-scale top-down recurrent processing. 

      Weaknesses: 

      - The authors could improve clarity of the rich set of decoding analyses across conditions. 

      - They could also enrich their Introduction and Discussion sections by taking into account the importance of conscious influences on some unconscious cognitive processes (revision of traditional concept of 'automaticity'), that may introduce some complexity in Results interpretation 

      - They should discuss the rich literature reporting high-level unconscious processing in masking paradigms (culminating in semantic processing of digits, words or even small group of words, and pictures) in the light of their proposal (deeper unconscious processing during AB than during masking). 

      We thank the reviewer for their positive assessment of our study and for their insightful comments and helpful suggestions that helped to significantly strengthen our paper. We provide a more detailed point-by-point response in the “recommendations for the authors” section below. In brief, we followed the reviewer’s suggestions and revised the Results/Discussion to include references to influences on unconscious processes and expanded our discussion of unconscious effects during masking vs. AB.  

      Reviewer #3 (Public Review): 

      Summary: 

      This work aims to investigate how perceptual and attentional processes affect conscious access in humans. By using multivariate decoding analysis of electroencephalography (EEG) data, the authors explored the neural temporal dynamics of visual processing across different levels of complexity (local contrast, collinearity, and illusory perception). This is achieved by comparing the decodability of an illusory percept in matched conditions of perceptual (i.e., degrading the strength of sensory input using visual masking) and attentional impairment (i.e., impairing top-down attention using attentional blink, AB). The decoding results reveal three distinct temporal responses associated with the three levels of visual processing. Interestingly, the early stage of local contrast processing remains unaffected by both masking and AB. However, the later stage of collinearity and illusory percept processing are impaired by the perceptual manipulation but remain unaffected by the attentional manipulation. These findings contribute to the understanding of the unique neural dynamics of perceptual and attentional functions and how they interact with the different stages of conscious access.

      Strengths: 

      The study investigates perceptual and attentional impairments across multiple levels of visual processing in a single experiment. Local contrast, collinearity, and illusory perception were manipulated using different configurations of the same visual stimuli. This clever design allows for the investigation of different levels of visual processing under similar low-level conditions. 

      Moreover, behavioural performance was matched between perceptual and attentional manipulations. One of the main problems when comparing perceptual and attentional manipulations on conscious access is that they tend to impact performance at different levels, with perceptual manipulations like masking producing larger effects. The study utilizes a staircasing procedure to find the optimal contrast of the mask stimuli to produce a performance impairment to the illusory perception comparable to the attentional condition, both in terms of perceptual performance (i.e., indicating whether the target contained the Kanizsa illusion) and metacognition (i.e., confidence in the response). 

      The results show a clear dissociation between the three levels of visual processing in terms of temporal dynamics. Local contrast was represented at an early stage (~80 ms), while collinearity and illusory perception were associated with later stages (~200-250 ms). Furthermore, the results provide clear evidence in support of a dissociation between the effects of perceptual and attentional processes on conscious access: while the former affected both neuronal correlates of collinearity and illusory perception, the latter did not have any effect on the processing of the more complex visual features involved in the illusion perception. 

      Weaknesses: 

      The design of the study and the results presented are very similar to those in Fahrenfort et al. (2017), reducing its novelty. Similar to the current study, Fahrenfort et al. (2017) tested the idea that if both masking and AB impact perceptual integration, they should affect the neural markers of perceptual integration in a similar way. They found that behavioural performance (hit/false alarm rate) was affected by both masking and AB, even though only the latter was significant in the unmasked condition. An early classification peak was instead only affected by masking. However, a late classification peak showed a pattern similar to the behavioural results, with classification affected by both masking and AB. 

      The interpretation of the results mainly centres on the theoretical framework of the recurrent processing theory of consciousness (Lamme, 2020), which leads to the assumption that local contrast, collinearity, and the illusory perception reflect feedforward, local recurrent, and global recurrent connections, respectively. It should be mentioned, however, that this theoretical prediction is not directly tested in the study. Moreover, the evidence for the dissociation between illusion and collinearity in terms of lateral and feedback connections seems at least limited. For instance, Kok et al. (2016) found that, whereas bottom-up stimulation activated all cortical layers, feedback activity induced by illusory figures led to a selective activation of the deep layers. Lee & Nguyen (2001), instead, found that V1 neurons respond to illusory contours of the Kanizsa figures, particularly in the superficial layers. They all mention feedback connections, but none seem to point to lateral connections.

      Moreover, the evidence in favour of primarily lateral connections driving collinearity seems mixed as well. On one hand, Liang et al. (2017) showed that feedback and lateral connections closely interact to mediate image grouping and segmentation. On the other hand, Stettler et al. (2002) showed that, whereas the intrinsic connections link similarly oriented domains in V1, V2 to V1 feedback displays no such specificity. Furthermore, the other studies mentioned in the manuscript did not investigate feedback connections but only lateral ones, making it difficult to draw any clear conclusions. 

      We thank the reviewer for their careful review and positive assessment of our study, as well as for their constructive criticism and helpful suggestions. We provide a more detailed point-by-point response in the “recommendations for the authors” section below. In brief, we addressed the reviewer’s comments and suggestions by better relating our study to Fahrenfort et al.’s (2017) paper and by highlighting the limitations inherent in linking our findings to distinct neural mechanisms (in particular, to lateral vs. feedback connections).

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors): 

      -  Methods: it states that "The distance between the three Pac-Man stimuli as well as between the three aligned two-legged white circles was 2.8 degrees of visual angle". It is unclear what this distance refers to. Is it the shortest distance between the edges of the objects? 

      It is indeed the shortest distance between the edges of the objects. This is now included in the Methods.

      -  Methods: It's unclear to me if the mask updating procedure during the experimental session was based on detection rate or on the perceptual performance index reported on Fig1D. Please clarify. 

      It was based on accuracy calculated over 32 trials. We have included this information in the Methods.

      -  Methods and Results: I did not understand why the described procedure used to ensure that confidence ratings are not contaminated by differences in perceptual performance was necessary. To me, it just seems to make the "no manipulations" and "both manipulations" less comparable to the other 2 conditions. 

      To calculate accurate estimates of metacognitive sensitivity for the two matched conditions, we wanted participants to make use of the full confidence scale (asking them to distribute their responses evenly over all ratings within a block). By mixing all conditions in the same block, we would have run the risk of participants anchoring their confidence ratings to the unmatched very easy and very difficult conditions (no and both manipulations condition). We made this point explicit in the Results section and in the Methods section:

      “To ensure that the distribution of confidence ratings in the performance-matched masked and AB condition was not influenced by participants anchoring their confidence ratings to the unmatched very easy and very difficult conditions (no and both manipulations condition, respectively), the masked and AB condition were presented in the same experimental block, while the other block type included the no and both manipulations condition.”

      “To ensure that confidence ratings for these matched conditions (masked, long lag and unmasked, short lag) were not influenced by participants anchoring their confidence ratings to the very easy and very difficult unmatched conditions (no and both manipulations, respectively), one type of block only contained the matched conditions, while the other block type contained the two remaining, unmatched conditions (masked, short lag and unmasked, long lag).”

      - Methods: what priors were used for Bayesian analyses? 

      Bayesian statistics were calculated in JASP (JASP Team, 2024) with default prior scales (Cauchy distribution, scale 0.707). This is now added to the Methods.
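
For readers unfamiliar with the default prior mentioned above, the following is a minimal sketch of how a Bayes factor with a Cauchy prior (scale 0.707) on effect size can be computed for a one-sample or paired t-test (often called the JZS Bayes factor). It is an illustration only and not JASP's code: the function name `jzs_bf10`, its arguments, and the numerical integration scheme are assumptions made for this example, and the Bayesian rm ANOVAs reported in the manuscript rely on a different model-comparison procedure.

```python
import math


def jzs_bf10(t, n, r=0.707, steps=100_000):
    """Approximate BF10 for a one-sample or paired t-test.

    t: observed t statistic, n: number of participants,
    r: scale of the Cauchy prior on effect size (0.707 is a common default).
    """
    df = n - 1

    def integrand(g):
        # Likelihood of t under H1 for a given g ...
        a = 1.0 + n * g * r * r
        like = a ** -0.5 * (1.0 + t * t / (a * df)) ** (-(df + 1) / 2.0)
        # ... weighted by an inverse-gamma(1/2, 1/2) density on g, which
        # corresponds to a Cauchy prior on the standardized effect size.
        prior = (2.0 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1.0 / (2.0 * g))
        return like * prior

    # Midpoint rule after substituting g = u / (1 - u), mapping (0, inf) to (0, 1).
    h = 1.0 / steps
    marginal_h1 = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        g = u / (1.0 - u)
        marginal_h1 += integrand(g) / (1.0 - u) ** 2 * h

    # Likelihood of t under the null hypothesis of no effect.
    marginal_h0 = (1.0 + t * t / df) ** (-(df + 1) / 2.0)
    return marginal_h1 / marginal_h0


# Hypothetical inputs: a t statistic of 2.5 from 30 participants.
bf10 = jzs_bf10(t=2.5, n=30)
bf01 = 1.0 / bf10  # evidence for the null relative to the alternative
```

Values of BF10 above 1 favour the alternative hypothesis and values below 1 favour the null; BF01 is simply the reciprocal.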

      - Results, line 162: It states that classifiers were applied on "raw EEG activity" but the Methods specify preprocessing steps. "Preprocessed EEG activity" seems more appropriate. 

      We changed the term to “preprocessed EEG activity” in the Methods and to “(minimally) preprocessed EEG activity (see Methods)” in the  Results, respectively.

      - Results, line 173: The effect of masking on local contrast decoding is reported as "marginal". If the alpha is set at 0.05, it seems that this effect is significant and should not be reported as marginal. 

      We changed the wording from “marginal” to “small but significant.”  

      - Fig1: The fixation cross is not displayed. 

      Because adding the fixation cross would have made the figure of the trial design look crowded and less clear, we decided to exclude it from this schematic trial representation. We are now stating this also in the legend of figure 1.  

      - Fig 3A: In the upper left panel, isn't there a missing significant effect of the "local contrast training and testing" condition in the first window? If not, this condition seems oddly underpowered compared to the other two conditions. 

      Thanks for the catch! The highlighting in bold and the significance bar were indeed lacking for this condition in the upper left panel (blue line). We corrected the figure in our revision.

      - Supplementary text and Fig S6: It is unclear to me why the two control analyses (the black lines vs. the green and purple lines) are pooled together in the same figure. They seem to test for different, non-comparable contrasts (they share neither training nor testing sets), and I find it confusing to find them on the same figure. 

      We agree that this may be confusing, and deleted the results from one control analysis from the figure (black line, i.e., training on contrast, testing on illusion), as the reviewer correctly pointed out that it displayed a non-comparable analysis. Given that this control analysis did not reveal any significant decoding, we now report its results only in the Supplementary text.  

      - Fig S6: I think the title of the legend should say testing on the non-illusory triangle instead of testing on the illusory triangle to match the supplementary text. 

      This was a typo – thank you! Corrected.  

      Reviewer #2 (Recommendations For The Authors): 

      Issue #1: One key asymmetry between the three levels of T2 attributes (i.e.: local contrast; non-illusory triangle; illusory Kanizsa triangle) is related to the top-down conscious posture driven by the task that was exclusively focusing on the last attribute (illusory Kanizsa triangle). Therefore, any difference in EEG decoding performance across these three levels could also depend on this asymmetry. For instance, if participants were engaged to report local contrast or non-illusory triangle, one could wonder if decoding performance could differ from the one used here. This potential confound was addressed by the authors by using decoders trained in different datasets in which the main task was to report one of the two other attributes. They could then test how classifiers trained on the task-related attribute behave on the main dataset. However, this part of the study is crucial but not 100% clear, and the links with the results of these control experiments are not fully explicit. Could the authors better clarify this important point (see also Issue #1 and #3).

      The reviewer raises an important point, alluding to potential differences between decoded features regarding task relevance. There are two separate sets of analyses where task relevance may have been a factor, our main analyses comparing illusion to contrast decoding, and our comparison of collinearity vs. illusion-specific processing.  

      In our main analysis, we are indeed reporting decoding of a task-relevant feature (illusion) and of a task-irrelevant feature (local contrast, i.e., rotation of the Pac-Man inducers). Note, however, that the Pac-Man inducers were always task-relevant, as they needed to be processed to perceive illusory triangles, so that local contrast decoding was based on task-relevant stimulus elements, even though participants did not respond to local contrast differences in the main experiment. However, we also ran control analyses testing the effect of task-relevance on local contrast decoding in our independent training data set and in another (independent) study, where local contrast was, in separate experimental blocks, task-relevant or task-irrelevant. The results are reported in the Supplementary Text and in Figure S5. In brief, task-relevance did not improve early (70–95 ms) decoding of local contrast. We are thus confident that the comparison of local contrast to illusion decoding in our main analysis was not substantially affected by differences in task relevance. In our previous manuscript version, we referred to these control analyses only in the collinearity-vs-illusion section of the Results. In our revision, we added the following in the Results section comparing illusion to contrast decoding:

      “In the light of evidence showing that unconscious processing is susceptible to conscious top-down influences (Kentridge et al., 2004; Kiefer & Brendel, 2006; Naccache et al., 2002), we ran control analyses showing that early local contrast decoding was not improved by rendering contrast task-relevant (see Supplementary Information and Fig. S5), indicating that these differences between illusion and contrast decoding did not reflect differences in task-relevance.”

      In addition to our main analysis, there is the concern that our comparison of collinearity vs. illusion-specific processing may have been affected by differences in task-relevance between the stimuli inducing the non-illusory triangle (the “two-legged white circles”, collinearity-only) and the stimuli inducing the Kanizsa illusion (the Pac-Man inducers, collinearity-plus-illusion). We would like to emphasize that in our main analysis classifiers were always used to decode T2 illusion presence vs. absence (collinearity-plus-illusion), and never to decode T2 collinearity-only. To distinguish collinearity-only from collinearity-plus-illusion processing, we only varied the training data (training classifiers on collinearity-only or collinearity-plus-illusion), using the independent training data set, where collinearity-only and collinearity-plus-illusion (and rotation) were task-relevant (in separate blocks). As discussed in the Supplementary Information, for this analysis approach to be valid, collinearity-only processing should be similar for the illusory and the non-illusory triangle, and this is what control analyses demonstrated (Fig. S7). In any case, general task-relevance was equated for the collinearity-only and the collinearity-plus-illusion classifiers.

      Finally, in supplementary Figure 6 we also show that our main results reported in Figure 2 (discussed at the top of this response) were very similar when the classifiers were trained on the independent localizer dataset in which each stimulus feature could be task-relevant.  

      Together, for the reasons described above, we believe that differences in EEG decoding performance across these three stimulus levels are unlikely to depend on a “task-relevance” asymmetry.
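
The cross-decoding logic described in this response (train a classifier on one data set, then test it on T2 illusion presence vs. absence) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the analysis pipeline used in the study: the data are simulated Gaussian patterns, the nearest-centroid classifier is a stand-in chosen for brevity, and all names (`simulate_trials`, `fit_centroids`, `shared`, `unrelated`) are hypothetical.

```python
import random


def simulate_trials(n_trials, pattern, rng):
    """Simulate 'EEG' patterns: unit Gaussian noise plus a pattern on present trials."""
    X, y = [], []
    for i in range(n_trials):
        label = i % 2  # 0 = feature absent, 1 = feature present
        X.append([rng.gauss(p * label, 1.0) for p in pattern])
        y.append(label)
    return X, y


def fit_centroids(X, y):
    """Train a nearest-centroid classifier: store the mean pattern of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def accuracy(centroids, X, y):
    """Proportion of trials assigned to the class with the closest centroid."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    hits = 0
    for x, lab in zip(X, y):
        predicted = min(centroids, key=lambda c: sq_dist(centroids[c], x))
        hits += predicted == lab
    return hits / len(y)


rng = random.Random(0)
shared = [1.0] * 8           # pattern assumed common to training and test features
unrelated = [1.0, -1.0] * 4  # pattern orthogonal to the shared one

# Test set: T2 illusion present vs. absent (simulated).
X_test, y_test = simulate_trials(400, shared, rng)

# Training set whose feature evokes the same pattern as the test feature.
clf_shared = fit_centroids(*simulate_trials(200, shared, rng))
# Training set whose feature evokes an unrelated pattern.
clf_unrelated = fit_centroids(*simulate_trials(200, unrelated, rng))

cross_acc = accuracy(clf_shared, X_test, y_test)
control_acc = accuracy(clf_unrelated, X_test, y_test)
```

In this toy setting, above-chance `cross_acc` indicates that the training and test features share a neural pattern, whereas `control_acc` stays close to chance (0.5) because the trained pattern carries no information about the test labels.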

      Issue #2: Following on my previous point the authors should better mention the concept of conscious influences on unconscious processing that led to a full revision of the notion of automaticity in cognitive science [1 , 2 , 3 , 4]. For instance, the discovery that conscious endogenous temporal and spatial attention modulate unconscious subliminal processing paved the way to this revision. This concept raises the importance of Issue#1: equating performance on the main task across AB and masking is not enough to guarantee that differences of neural processing of the unattended attributes of T2 (i.e.: task-unrelated attributes) are not, in part, due to this asymmetry rather than to a systematic difference of unconscious processing strength [5 , 6-8]. Obviously, the reported differences for real-triangle decoding between AB and masking cannot be totally explained by such a factor (because this is a task-unrelated attribute for both AB and masking conditions), but still this issue should be better introduced, addressed, clarified (Issue #1 and #3) and discussed.

      We would like to refer to our response to the previous point: Control analyses for local contrast decoding showed that task relevance had no influence on our marker for feedforward processing. Most importantly, as outlined above, we did not perform real-triangle decoding – all our decoding analyses comparing collinearity-only vs. collinearity-plus-illusion were run on the task-relevant T2 illusion (decoding its presence vs. absence). The key difference was solely the training set, where the collinearity-only classifier was trained on the (task-relevant) real triangle and the collinearity-plus-illusion classifier was trained on the (task-relevant) Kanizsa triangle. Thus, overall task relevance was controlled in these analyses.

      In our revision, we are now also citing the studies proposed by the reviewer, when discussing the control analyses testing for an effect of task-relevance on local contrast decoding:

      “In the light of evidence showing that unconscious processing is susceptible to conscious top-down influences (Kentridge et al., 2004; Kiefer & Brendel, 2006; Naccache et al., 2002), we ran control analyses showing that early local contrast decoding was not improved by rendering contrast task-relevant (see Supplementary Information and Fig. S5), indicating that these differences between illusion and contrast decoding did not reflect differences in task-relevance.”

      Issue #3: In terms of clarity, I would suggest the authors to add a synthetic figure providing an overall view of all pairs of intra and cross-conditions decoding analyses and mentioning main task for training and testing sets for each analysis (see my previous and related points). Indeed, at one point, the reader can get lost and this would not only strengthen accessibility to the detailed picture of results, but also pinpoint the limits of the work (see previous point). 

      We understand the point the reviewer is raising and acknowledge that some of our analyses, in particular those using different training and testing sets, may be difficult to grasp. But given the variety of different analyses using different training and testing sets, different temporal windows, as well as different stimulus features, it was not possible to design an intuitive synthetic figure summarizing the key results. We hope that the added text in the Results and Discussion section will be sufficient to guide the reader through our set of analyses.  

      In our revision, we are now more clearly highlighting that, in addition to presenting the key results in our main text that were based on training classifiers on the T1 data, “we replicated all key findings when training the classifiers on an independent training set where individual stimuli were presented in isolation (Fig. 3A, results in the Supplementary Information and Fig. S6).” For this, we added a schematic showing the procedure of the independent training set to Figure 3, more clearly pointing the reader to the use of a separate training data set.  

      Issue #4: In the light of these findings the authors should discuss more thoroughly the question of unconscious high-level representations in masking versus AB: in particular, a longstanding issue relates to unconscious semantic processing of words, numbers or pictures. According to their findings, they tend to suggest that semantic processing should be more enabled in AB than in masking. However, a rich literature provided a substantial number of results (including results from the last author, Simon van Gaal) that tend to support the notion of unconscious semantic processing in subliminal processing (see in particular: [9 , 10 , 11 , 12 , 13]). So, and as mentioned by the authors, while there is evidence for semantic processing during AB they should better discuss how they would explain unconscious semantic subliminal processing. While a possibility could be to question the unconscious attribute of several subliminal results, the same argument also holds for AB studies. Another possible track of discussion would be to differentiate AB and subliminal perception in terms of strength and durability of the corresponding unconscious representations, but not necessarily in terms of cognitive richness. Indeed, one may discuss that semantic processing of stimuli that do not need complex spatial integration (e.g.: words or digits as compared to the illusory Kanizsa triangle tested here) can still be observed under subliminal conditions.

      We thank the reviewer for pointing us to this shortcoming of our previous Discussion. Note that our data does not directly speak to the question of high-level unconscious representations in masking vs AB, because such conclusions would hinge on the operational definition of consciousness one adheres to (also see response to Reviewer 1). Nevertheless, we do follow the reviewer’s suggestions and added the following in the Discussion (also addressing a point about other forms of attention raised by Reviewer 1):

      “Clearly, these findings do not imply that unconscious high-level (e.g., semantic) processing can only occur during inattention, nor do they necessarily generalize to other forms of inattention. Indeed, while the AB represents a prime example of late attentional filtering, other ways of inducing inattention or distraction (e.g., by manipulating spatial attention) may filter information earlier in the processing hierarchy (e.g., Luck & Hillyard, 1994 vs. Vogel et al., 1998).”

      And, in a following paragraph in the Discussion:

      “Such deep feedforward processing can be sufficient for unconscious high-level processing, as indicated by a rich literature demonstrating high-level (e.g., semantic) processing during masking (Kouider & Dehaene, 2007; Van den Bussche et al., 2009; van Gaal & Lamme, 2012). Thus, rather than enabling high-level unconscious processing, preserved local recurrency during inattention may afford other processing advantages linked to its proposed role in perceptual integration (Lamme, 2020), such as integration of stimulus elements over space or time.”

      Reviewer #3 (Recommendations For The Authors): 

      (1) The objective of Fahrenfort et al., 2017 seems very similar to that of the current study. What are the main differences between the two studies? Moreover, Fahrenfort et al., 2017 conducted similar decoding analyses to those performed in the current study.

      Which results were replicated in the current study, and which ones are novel? Highlighting these differences in the manuscript would be beneficial. 

      We now provide a more comprehensive coverage of the study by Fahrenfort et al., 2017. In the Introduction, we added a brief summary of the key findings, highlighting that this study’s findings could have reflected differences in task performance rather than differences between masking and AB:

      “For example, Fahrenfort and colleagues (2017) found that illusory surfaces could be decoded from electroencephalogram (EEG) data during the AB but not during masking. This was taken as evidence that local recurrent interactions, supporting perceptual integration, were preserved during inattention but fully abolished by masking. However, masking had a much stronger behavioral effect than the AB, effectively reducing task performance to chance level. Indeed, a control experiment using weaker masking, which resulted in behavioral performance well above chance similar to the main experiment’s AB condition, revealed some evidence for preserved local recurrent interactions also during masking. However, these conditions were tested in separate experiments with small samples, precluding a direct comparison of perceptual vs. attentional blindness at matched levels of behavioral performance. To test …”

      In the Results, we are now also highlighting this key advancement by directly referencing the previous study:

      “Thus, whereas in previous studies task performance was considerably higher during the AB than during masking (e.g., Fahrenfort et al., 2017), in the present study the masked and the AB condition were matched in both measures of conscious access.”

      When reporting the EEG decoding results in the Results section, we continuously cite the Fahrenfort et al. (2017) study to highlight similarities between the two studies’ findings. We also added a few sentences explicitly relating the key findings of the two studies:

      “This suggests that the AB allowed for greater local recurrent processing than masking, replicating the key finding by Fahrenfort and colleagues (2017). Importantly, the present result demonstrates that this effect reflects the difference between the perceptual vs. attentional manipulation rather than differences in behavior, as the masked and the AB condition were matched for perceptual performance and metacognition.”

      “This similarity between behavior and EEG decoding replicates the findings of Fahrenfort and colleagues  (2017) who also found a striking similarity between late Kanizsa decoding (at 406 ms) and behavioral Kanizsa detection. These results indicate that global recurrent processing at these later points in time reflected conscious access to the Kanizsa illusion.”  

      We also more clearly highlighted where our study goes beyond Fahrenfort et al.’s (2017), e.g., in the Results:

      “The addition of this element of collinearity to our stimuli was a key difference to the study by Fahrenfort and colleagues (2017), allowing us to compare non-illusory triangle decoding to illusory triangle decoding in order to distinguish between collinearity and illusion-specific processing.”

      And in the Discussion:

      “Furthermore, the addition of line segments forming a non-illusory triangle to the stimulus employed in the present study allowed us to distinguish between collinearity and illusion-specific processing.”

      Also, in the Discussion, we added a paragraph “summarizing which results were replicated in the current study, and which ones are novel”, as suggested by the reviewer:

      “This pattern of results is consistent with a previous study that used EEG to decode Kanizsa-like illusory surfaces during masking and the AB (Fahrenfort et al., 2017). However, the present study also revealed some effects where Fahrenfort and colleagues (2017) failed to obtain statistical significance, likely reflecting the present study’s considerably larger sample size and greater statistical power. For example, in the present study the marker for feedforward processing was weakly but significantly impaired by masking, and the marker for local recurrency was significantly impaired not only by masking but also by the AB, although to a lesser extent. Most importantly, however, we replicated the key findings that local recurrent processing was more strongly impaired by masking than by the AB, and that global recurrent processing was similarly impaired by masking and the AB and closely linked to task performance, reflecting conscious access. Crucially, having matched the key conditions behaviorally, the present finding of greater local recurrency during the AB can now unequivocally be attributed to the attentional vs. perceptual manipulation of consciousness.”

      Finally, we changed the title to “Distinct neural mechanisms underlying perceptual and attentional impairments of conscious access despite equal task performance” to highlight one of the crucial differences between the Fahrenfort et al. (2017) study and this study, namely the fact that we equalized task performance between the two critical conditions (AB and masking).

      (2) It is not clear from the text the link between the current study and the literature on the role of lateral and feedback connections in consciousness (Lamme, 2020). A better explanation is needed. 

      To our knowledge, consciousness theories such as recurrent processing theory by Lamme currently make no distinction between the role of lateral and feedback connections for consciousness. The principled distinction lies between unconscious feedforward processing and phenomenally conscious or “preconscious” local recurrent processing, where local recurrency refers to both lateral (or horizontal) and feedback connections. We added a sentence in the Discussion:

      “As current theories do not distinguish between the roles of lateral vs. feedback connections for consciousness, the present findings may enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness …”

      (3) When training on T1 and testing on T2, EEG data showed an early peak in local contrast classification at 75-95 ms over posterior electrodes. The authors stated that this modulation was only marginally affected by masking (and not at all by AB); however, the main effect of masking is significant. Why was this effect interpreted as nonrelevant? 

      Following this and Reviewer 1’s comment, we changed the wording from “marginal” to “weak but significant.” We considered this effect “weak” and of lesser relevance, because its Bayes factor indicated that the alternative hypothesis was only 1.31 times more likely than the null hypothesis of no effect, representing only “anecdotal” evidence, which is in sharp contrast to the robust effects of the consciousness manipulations on illusion decoding reported later. Furthermore, later ANOVAs comparing the effect of masking on contrast vs. illusion decoding revealed much stronger effects on illusion decoding than on contrast decoding (BFs > 3.59 × 10⁴).
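
To make the verbal labels used in this response concrete, the sketch below maps a Bayes factor onto the conventional evidence categories (anecdotal, moderate, strong, very strong, extreme) that are commonly used when reporting Bayesian statistics. The function name `describe_bf10` is hypothetical, and the cut-offs (3, 10, 30, 100) are conventions rather than strict thresholds.

```python
def describe_bf10(bf10):
    """Return a conventional verbal label for a Bayes factor BF10 (H1 relative to H0)."""
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive")
    if bf10 == 1:
        return "no evidence either way"
    # Express the evidence as a ratio above 1 in favour of the better-supported hypothesis.
    favoured = "H1" if bf10 > 1 else "H0"
    ratio = bf10 if bf10 > 1 else 1.0 / bf10
    for upper, label in [(3, "anecdotal"), (10, "moderate"), (30, "strong"), (100, "very strong")]:
        if ratio < upper:
            return f"{label} evidence for {favoured}"
    return f"extreme evidence for {favoured}"


# Values taken from this response: BF10 = 1.31 for the masking effect on
# contrast decoding, and BFs above 3.59e4 for the later ANOVAs.
weak_effect = describe_bf10(1.31)
robust_effect = describe_bf10(3.59e4)
```

A Bayes factor reported in the other direction (BF01, evidence for the null) can be passed as its reciprocal, for example `describe_bf10(1 / 8.47)`.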

      (4) The decoding analysis on the illusory percept yielded two separate peaks of decoding, one from 200 to 250 ms and another from 275 to 475 ms. The early component was localized occipitally and interpreted as local sensory processing, while the late peak was described as a marker for global recurrent processing. This latter peak was localized in the parietal cortex and associated with the P300. Can the authors show the topography of the P300 evoked response obtained from the current study as a comparison? Moreover, source reconstruction analysis would probably provide a better understanding of the cortical localization of the two peaks. 

      Figure S4 now shows the P300 from electrode Pz, demonstrating a stronger positivity between 375 and 475 ms when the illusory triangle was present than when it was absent. We did not run a source reconstruction analysis.  

      (5) The authors mention that the behavioural results closely resembled the pattern of the second decoding peak results. However, they did not show any evidence for this relationship. For instance, is there a correlation between the two measures across or within participants? Does this relationship differ between the illusion report and the confidence rating? 

      This relationship became evident from simply eyeballing the results figures: In both behavior and EEG decoding, performance dropped from the no-manipulations condition to the AB and masked conditions, while these two conditions did not differ significantly. Following a similar observation of a close similarity between behavior and the second/late illusion decoding peak in the study by Fahrenfort et al. (2017), we adopted their analysis approach and ran two additional ANOVAs, adding “measure” (behavior vs. EEG) as a factor. For this analysis, we dropped the both-manipulations condition due to scale restrictions (as noted in footnote 1: “We excluded the both-manipulations condition from this analysis due to scale restrictions: in this condition, EEG decoding at the second peak was at chance, while behavioral performance was above chance, leaving more room for behavior to drop from the masked and AB condition.”). The analysis revealed that there were no interactions with condition:

      “The pattern of behavioral results, both for perceptual performance and metacognitive sensitivity, closely resembled the second decoding peak: sensitivity in all three metrics dropped from the no-manipulations condition to the masked and AB conditions, while sensitivity did not differ significantly between these performance-matched conditions (Fig. 2C). Two additional rm ANOVAs with the factors measure (behavior, second EEG decoding peak) and condition (no-manipulations, masked, AB)¹ for perceptual performance and metacognitive sensitivity revealed no significant interaction (performance: F(2,58) = 0.27, P = 0.762, BF₀₁ = 8.47; metacognition: F(2,58) = 0.54, P = 0.586, BF₀₁ = 6.04). This similarity between behavior and EEG decoding replicates the findings of Fahrenfort and colleagues (2017) who also found a striking similarity between late Kanizsa decoding (at 406 ms) and behavioral Kanizsa detection. These results indicate that global recurrent processing at these later points in time reflected conscious access to the Kanizsa illusion.”

      (6) The marker for illusion-specific processing emerged later (200-250 ms), with the no-manipulation decoding performing better after training on the illusion than the non-illusory triangle. This difference emerged only in the AB condition, and it was fully abolished by masking. The authors confirmed that the illusion-specific processing was not affected by the AB manipulations by running a rm ANOVA which did not result in a significant interaction between condition and training set. However, unlike the other non-significant results, a Bayes Factor is missing here.

      We added Bayes factors to all (significant and non-significant) rm ANOVAs.

      (7) The same analysis yielded a second illusion decoding peak at 375-475 ms. This effect was impaired by both masking and AB, with no significant differences between the two conditions. The authors stated that this result was directly linked to behavioural performance. However, it is not clear to me what they mean (see point 5). 

      We added analyses comparing behavior and EEG decoding directly (see our response to point 5).

      (8) The introduction starts by stating that perceptual and attentional processes differently affect consciousness access. This differentiation has been studied thoroughly in the consciousness literature, with a focus on how attention differs from consciousness (e.g., Koch & Tsuchiya, TiCS, 2007; Pitts, Lutsyshyna & Hillyard, Phil. Trans. Roy. Soc. B Biol. Sci., 2018). The authors stated that "these findings confirm and enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness clearly distinguishing and specifying the neural profiles of each processing stage of the influential four-stage model of conscious experience". I found it surprising that this aspect was not discussed further. What was the state of the art before this study was conducted? What are the mentioned neural profiles? How did the current results enrich the literature on this topic? 

      We would like to point out that our study is not primarily concerned with the conceptual distinction between consciousness and attention, which has been the central focus of e.g., Koch and Tsuchiya (2007). While this literature was concerned with ways to dissociate consciousness and attention, we tacitly assumed that attention and consciousness are now generally considered as different constructs. Our study is thus not dealing with dissociations between attention and consciousness, nor with the distinction between phenomenal consciousness and conscious access, but is concerned with different ways of impairing conscious access (defined as the ability to report about a stimulus), either via perceptual or via attentional manipulations. For the state of the art before the study was conducted, we would like to refer to the motivation of our study in the Introduction, e.g., previous studies’ difficulties in unequivocally linking greater local recurrency during attentional than perceptual blindness to the consciousness manipulation, given performance confounds (we expanded this Introduction section). We also expanded a paragraph in the discussion to remind the reader of the neural profiles of the 4-stage model and to highlight the novelty of our findings related to the distinction between lateral and feedback processes:

      “As current theories do not distinguish between the roles of lateral vs. feedback connections for consciousness, the present findings may enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness (Block, 2005; Dehaene et al., 2006; Hatamimajoumerd et al., 2022; Lamme, 2010; Pitts et al., 2018; Sergent & Dehaene, 2004), clearly distinguishing the neural profiles of each processing stage of the influential four-stage model of conscious experience (Fig. 1A). Along with the distinct temporal and spatial EEG decoding patterns associated with lateral and feedback processing, our findings suggest a processing sequence from feedforward processing to local recurrent interactions encompassing lateral-to-feedback connections, ultimately leading to global recurrency and conscious report.”

      (9) When stating that this is the first study in which behavioural measures of conscious perception were matched between the attentional blink and masking, it would be beneficial to highlight the main differences between the current study and the one from Fahrenfort et al., 2017, with which the current study shares many similarities in the experimental design (see point 1). 

      We would like to refer the reviewer to our response to point 1), where we detail how we expanded the discussion of similarities and differences between our present study and Fahrenfort et al. (2017).

      (10) The discussion emphasizes how the current study "suggests a processing sequence from feedforward processing to local recurrent interactions encompassing lateral-to-feedback connections, ultimately leading to global recurrency and conscious report". For transparency, it is important, though, to highlight that one limitation of the current study is that it does not provide direct evidence for the specified types of connections (see point 6).

      We added a qualification in the Discussion section:

      “Although the present EEG decoding measures cannot provide direct evidence for feedback vs. lateral processes, based on neurophysiological evidence, …”

      Furthermore, we added this qualification in the Discussion section:

      “It should be noted that not all neurophysiological evidence unequivocally links processing of collinearity and of the Kanizsa illusion to lateral and feedback processing, respectively (Angelucci et al., 2002; Bair et al., 2003; Chen et al., 2014), so that overlap in decoding the illusory and non-illusory triangle may reflect other mechanisms, for example feedback processing as well.”

      References

      Angelucci, A., Levitt, J. B., Walton, E. J. S., Hupe, J.-M., Bullier, J., & Lund, J. S. (2002). Circuits for local and global signal integration in primary visual cortex. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 22(19), 8633–8646.

      Bair, W., Cavanaugh, J. R., & Movshon, J. A. (2003). Time course and time-distance relationships for surround suppression in macaque V1 neurons. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 23(20), 7690–7701.

      Block, N. (2005). Two neural correlates of consciousness. Trends in Cognitive Sciences, 9(2), 46–52.

      Chen, M., Yan, Y., Gong, X., Gilbert, C. D., Liang, H., & Li, W. (2014). Incremental integration of global contours through interplay between visual cortical areas. Neuron, 82(3), 682–694.

      Dehaene, S., Changeux, J.-P., Naccache, L., Sackur, J., & Sergent, C. (2006). Conscious, preconscious, and subliminal processing: a testable taxonomy. Trends in Cognitive Sciences, 10(5), 204–211.

      Hatamimajoumerd, E., Ratan Murty, N. A., Pitts, M., & Cohen, M. A. (2022). Decoding perceptual awareness across the brain with a no-report fMRI masking paradigm. Current Biology: CB. https://doi.org/10.1016/j.cub.2022.07.068

      JASP Team. (2024). JASP (Version 0.19.0) [Computer software]. https://jasp-stats.org/

      Kentridge, R. W., Heywood, C. A., & Weiskrantz, L. (2004). Spatial attention speeds discrimination without awareness in blindsight. Neuropsychologia, 42(6), 831–835.

      Kiefer, M., & Brendel, D. (2006). Attentional Modulation of Unconscious “Automatic” Processes: Evidence from Event-related Potentials in a Masked Priming Paradigm. Journal of Cognitive Neuroscience, 18(2), 184–198.

      Kouider, S., & Dehaene, S. (2007). Levels of processing during non-conscious perception: a critical review of visual masking. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 857–875.

      Lamme, V. A. F. (2010). How neuroscience will change our view on consciousness. Cognitive Neuroscience, 1(3), 204–220.

      Luck, S. J., & Hillyard, S. A. (1994). Electrophysiological correlates of feature analysis during visual search. Psychophysiology, 31(3), 291–308.

      Naccache, L., Blandin, E., & Dehaene, S. (2002). Unconscious masked priming depends on temporal attention. Psychological Science, 13(5), 416–424.

      Pitts, M. A., Lutsyshyna, L. A., & Hillyard, S. A. (2018). The relationship between attention and consciousness: an expanded taxonomy and implications for ‘no-report’ paradigms. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 373(1755), 20170348.

      Sergent, C., & Dehaene, S. (2004). Is consciousness a gradual phenomenon? Evidence for an all-or-none bifurcation during the attentional blink. Psychological Science, 15(11), 720–728.

      Van den Bussche, E., Van den Noortgate, W., & Reynvoet, B. (2009). Mechanisms of masked priming: a meta-analysis. Psychological Bulletin, 135(3), 452–477.

      van Gaal, S., & Lamme, V. A. F. (2012). Unconscious high-level information processing: implication for neurobiological theories of consciousness. The Neuroscientist: A Review Journal Bringing Neurobiology, Neurology and Psychiatry, 18(3), 287–301.

      Vogel, E. K., Luck, S. J., & Shapiro, K. L. (1998). Electrophysiological evidence for a postperceptual locus of suppression during the attentional blink. Journal of Experimental Psychology. Human Perception and Performance, 24(6), 1656–1674.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This study shows a new mechanism of GS regulation in the archaeon Methanosarcina mazei and clarifies the direct activation of GS activity by 2-oxoglutarate, thus featuring another way in which 2-oxoglutarate acts as a central status reporter of C/N sensing.

      Mass photometry and single particle cryoEM structure analysis convincingly show the direct regulation of GS activity by 2-OG promoted formation of the dodecameric structure of GS. The previously recognized small proteins GlnK1 and Sp26 seem to play a subordinate role in GS regulation, which is in good agreement with previous data. Although these data are quite clear now, there remains one major open question: how does 2-OG further increase GS activity once the full dodecameric state is achieved (at 5 mM)? This point needs to be reconsidered.

      Weaknesses:

      It is not entirely clear how very high 2-OG concentrations activate GS beyond dodecamer formation.

      The data presented in this work are in stark contrast to the previously reported structure of M. mazei GS by the Schumacher lab. This is very confusing for the scientific community and requires clarification. The discussion should consider possible reasons for the contradictory results.

      Importantly, it is puzzling how Schumacher could obtain an apo-structure of dodecameric GS. If 2-OG is necessary for dodecamer formation, this should be discussed. If GlnK1 doesn't form a complex with the dodecameric GS, how could such a complex be resolved there?

      In addition, the text is in principle clear but could be improved by professional editing. Most obviously there is insufficient comma placement.

      We thank Reviewer #1 for the professional evaluation and raising important points. We will address those comments in the updated manuscript and especially improve the discussion in respect to the two points of concern.

      (1) How can GlnA1 activity be stimulated further by increasing 2-OG after the dodecamer is already fully assembled at 5 mM 2-OG?

      We assume a two-step requirement for 2-OG: the dodecameric assembly and the priming of the active sites. The assembly step is based on cooperative effects of 2-OG and does not require the presence of 2-OG in all 2-OG-binding pockets: 2-OG binding to one binding pocket also causes a domino effect of conformational changes in the adjacent 2-OG-unbound subunit, as also described for Methanothermococcus thermolithotrophicus GS in Müller et al. 2023. Due to the introduction of these conformational changes, the dodecameric form becomes more favourable even without all 2-OG binding sites being occupied. With higher 2-OG concentrations present (> 5 mM), the activity increased further until finally all 2-OG-binding pockets were occupied, resulting in the priming of all active sites (all subunits) and thereby reaching the maximal activity.

      (2) The contradictory results with previously published data on the structure of M. mazei by Schumacher et al. 2023.

      We certainly agree that it is confusing that Schumacher et al. 2023 obtained a dodecameric structure without the addition of 2-OG, which we claim to be essential for the dodecameric form. 2-OG is a cellular metabolite that is naturally present in E. coli, the heterologous expression host both groups used. Since our main question focused on analysing the 2-OG effect on GS, we have performed thorough dialysis of the purified protein to remove all 2-OG before performing MP experiments. In the absence of 2-OG we never observed significant enzyme activity and always detected a fast disassembly after incubation on ice. We thus assume that a dodecamer without 2-OG in Schumacher et al. 2023 is an inactive oligomer of a once 2-OG-bound form, stabilized e.g. by the presence of 5 mM MgCl2.

      The GlnA1-GlnK1-structure (crystallography) by Schumacher et al. 2023 is in stark contrast to our findings that GlnK1 and GlnA1 do not interact as shown by mass photometry with purified proteins. A possible reason for this discrepancy might be that at the high protein concentrations used in the crystallization assay, complexes are formed based on hydrophobic or ionic protein interactions, which would not form under physiological concentrations.

      Reviewer #2 (Public Review):

      Summary:

      Herdering et al. introduced research on an archaeal glutamine synthetase (GS) from Methanosarcina mazei, which exhibits sensitivity to the environmental presence of 2-oxoglutarate (2-OG). While previous studies have indicated 2-OG's ability to enhance GS activity, the precise underlying mechanism remains unclear. Initially, the authors utilized biophysical characterization, primarily employing a nanomolar-scale detection method called mass photometry, to explore the molecular assembly of Methanosarcina mazei GS (M. mazei GS) in the absence or presence of 2-OG. Similar to other GS enzymes, the target M. mazei GS forms a stable dodecamer, with two hexameric rings stacked in tail-to-tail interactions. Despite approximately 40% of M. mazei GS existing as monomeric or dimeric entities in the detectable solution, the majority spontaneously assemble into a dodecameric state. Upon mixing 2-OG with M. mazei GS, the population of the dodecameric form increases proportionally with the concentration of 2-OG, indicating that 2-OG either promotes or stabilizes the assembly process. The cryo-electron microscopy (cryo-EM) structure reveals that 2-OG is positioned near the interface of two hexameric rings. At a resolution of 2.39 Å, the cryo-EM map vividly illustrates 2-OG forming hydrogen bonds with two individual GS subunits as well as with solvent water molecules. Moreover, local side-chain reorientation and conformational changes of loops in response to 2-OG further delineate the 2-OG-stabilized assembly of M. mazei GS.

      Strengths & Weaknesses:

      The investigation studies the impact of 2-oxoglutarate (2-OG) on the assembly of Methanosarcina mazei glutamine synthetase (M. mazei GS). Utilizing cutting-edge mass photometry, the authors scrutinized the population dynamics of GS assembly in response to varying concentrations of 2-OG. Notably, the findings demonstrate a promising and straightforward correlation, revealing that dodecamer formation can be stimulated by 2-OG concentrations of up to 10 mM, although GS assembly never reaches 100% dodecamerization in this study. Furthermore, catalytic activities showed a remarkable enhancement, escalating from 0.0 U/mg to 7.8 U/mg with increasing concentrations of 2-OG, peaking at 12.5 mM. However, an intriguing gap arises between the incomplete dodecameric formation observed at 10 mM 2-OG, as revealed by mass photometry, and the continued increase in activity from 5 mM to 10 mM 2-OG for M. mazei GS. This prompts questions regarding the inability of M. mazei GS to achieve complete dodecamer formation and the underlying factors that further enhance GS activity within this concentration range of 2-OG.

      Moreover, the cryo-electron microscopy (cryo-EM) analysis provides additional support for the biophysical and biochemical characterization, elucidating the precise localization of 2-OG at the interface of two GS subunits within two hexameric rings. The observed correlation between GS assembly facilitated by 2-OG and its catalytic activity is substantiated by structural reorientations at the GS-GS interface, confirming the previously reported phenomenon of "funnel activation" in GS. However, the authors did not present the cryo-EM structure of M. mazei GS in complex with ATP and glutamate in the presence of 2-OG, which could have shed light on the differences in glutamine biosynthesis between previously reported GS enzymes and the 2-OG-bound M. mazei GS.

      Furthermore, besides revealing the cryo-EM structure of 2-OG-bound GS, the study also observed the filamentous form of GS, suggesting that filament formation may be a universal stacking mechanism across archaeal and bacterial species. However, efforts to enhance resolution to investigate whether the stacked polymer is induced by 2-OG or other factors such as ions or metabolites were not undertaken by the authors, leaving room for further exploration into the mechanisms underlying filament formation in GS.

      We thank Reviewer #2 for the detailed assessment and valuable input. We will address those comments in the updated manuscript and clarify the message.

      (1) The discrepancy between the dodecamer formation (max. at 5 mM 2-OG) and the enzyme activity (max. at 12.5 mM 2-OG). We assume that there are two effects caused by 2-OG: 1. cooperativity of binding (less 2-OG needed to facilitate dodecamer formation) and 2. priming of each active site (see also Reviewer #1 R.1). We assume this is the reason why the activity of dodecameric GlnA1 can be further enhanced by increased 2-OG concentration until all catalytic sites are primed.

      (2) The lack of the structure of a 2-OG and ATP-bound GlnA1. Although we strongly agree that this would be a highly interesting structure, it seems out of the scope of a typical revision to request new cryo-EM structures. We evaluate the findings of our present study concerning the 2-OG effects as important insights into the strongly discussed field of glutamine synthetase regulation, even without the requested additional structures.

      (3) The observed GlnA1-filaments are an interesting finding. We certainly agree with the referee on that point, that the stacked polymers are potentially induced by 2-OG or ions. However, it is out of the main focus of this manuscript to further explore those filaments. Nevertheless, this observation could serve as an interesting starting point for future experiments.

      Reviewer #3 (Public Review):

      Summary:

      The current manuscript investigates the effect of 2-oxoglutarate and the GlnK1 protein as modulators of the enzymatic reactivity of glutamine synthetase. To do this, the authors rely on mass photometry, specific activity measurements, and single-particle cryo-EM data.

      From the results obtained, the authors convey that glutamine synthetase from Methanosarcina mazei exists in a non-active monomeric/dimeric form under low concentrations of 2-oxoglutarate, and its oligomerization into a dodecameric complex is triggered by higher concentration of 2-oxoglutarate, also resulting in the enhancement of the enzyme activity.

      Strengths:

      Glutamine synthetase is a crucial enzyme in all domains of life. The dodecameric fold of GS is recurrent amongst prokaryotic and archaea organisms, while the enzyme activity can be regulated in distinct ways. This is a very interesting work combining protein biochemistry with structural biology.

      The role of 2-OG is here highlighted as a crucial effector for enzyme oligomerization and full reactivity.

      Weaknesses:

      Various opportunities to enhance the current state-of-the-art were missed. In particular, omissions of the ligand-bound state of GlnK1 leave unexplained the lack of its interaction with GS (in contradiction with previous results from the authors). A finer dissection of the effect and role of 2-oxoglutarate is missing and important questions remain unanswered (e.g. are dimers relevant during early stages of the interaction or why previous GS dodecameric structures do not show 2-oxoglutarate).

      We thank Reviewer #3 for the expert evaluation and inspiring criticism.

      (1) Encouragement to examine ligand-bound states of GlnK1. We agree and plan to perform the suggested experiments exploring the conditions under which GlnA1 and GlnK1 might interact. We will perform the MP experiments in the presence of ATP. In GlnA1 activity test assays when evaluating the presence/effects of GlnK1 on GlnA1 activity, however, ATP was always present in high concentrations and still we did not observe a significant effect of GlnK1 on the GlnA1 activity.

      (2) The exact role of 2-OG could have been dissected much better. We agree on that point and will improve the clarity of the manuscript. See also Reviewer #1 R.1.

      (3) The lack of studies on dimers. This is actually an interesting point, which we did not consider during writing the manuscript. Now, re-analysing all our MP data in this respect, GlnA1 is likely a dimer as smallest species. Consequently, we will add more supplementary data which supports this observation and change the text accordingly.

      (4) Previous studies and structures did not show the 2-OG. We assume that for other structures, no additional 2-OG was added, and the groups did not specifically analyse for this metabolite either. All methanoarchaea perform methanogenesis and contain only the oxidative part of the TCA cycle, used exclusively for the generation of glutamate (anabolism), rather than a closed TCA cycle; this enables them to use the internal 2-OG concentration as an internal signal for nitrogen availability. In bacteria with a closed TCA cycle used for energy metabolism (oxidation of acetyl-CoA), e.g. E. coli, the formation of an active dodecameric GS follows another mechanism that is independent of 2-OG. In the case of the recent M. mazei GS structures published by Schumacher et al. 2023, the dodecameric structure is probably a result of the heterologous expression in and purification from E. coli (see also Reviewer #1 R.2). One example of a methanoarchaeal glutamine synthetase structure that does in fact contain 2-OG is the one reported by Müller et al. 2023.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Specific issues:

      L 141: 2-OG levels increase due to slowing GOGAT reaction (due to Gln limitation as a consequence of N-starvation).... (2-OG also increases in bacteria that lack GDH...)

      As the GS-GOGAT cycle is the major route of ammonium assimilation, consumption of 2-OG by GDH is probably only relevant under high ammonium concentrations.

      In methanoarchaea, GS is strictly regulated and its expression is strongly repressed under nitrogen sufficiency. Under N sufficiency, glutamate for anabolism is therefore mainly generated by GDH, which consumes the 2-OG delivered by the oxidative part of the TCA cycle (methanogenesis is the energy metabolism in methanoarchaea; a closed TCA cycle is not present). Thus, 2-OG increases under nitrogen limitation, when no NH3 is available for GDH.

      L148: it is not clear what is meant by: "and due to the indirect GS activity assay"

      We apologize for not being clear here. The GS activity assay used is the classical assay by Shapiro & Stadtman 1970 and is a coupled optical test assay (coupling the ATP consumption of the GS activity to the oxidation of NADH by lactate dehydrogenase). Because of the coupled test assay, measurements of low activities show a high deviation. We have now added this information to the revised MS.

      L: 177: arguing about 2-OG affinities: more precisely, the 0.75 mM 2-OG is the EC50 concentration of 2-OG for triggering dodecameric formation; it might not directly reflect the total 2-OG affinity, since the affinity may be modulated by (anti)cooperative effects, or by additional sites... as there may be different 2-OG binding sites involved... (same in line 201)

      Thank you for the valuable input. We changed KD to EC50 within the entire manuscript. Concerning possible additional 2-OG binding sites: we did not see any other 2-OG in the cryo-EM structure aside from the described one and we therefore assume that the one described in the manuscript is the main and only one. Considering the high amounts of 2-OG (12.5 mM) used in the structure, it is quite unlikely that additional 2-OG sites exist since they would have unphysiologically low affinities.
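
      The distinction between an EC50 and a binding constant can be illustrated with a small dose-response sketch. The code below is not the authors' analysis; it assumes a Hill-type curve, a fixed plateau `top`, and a simple grid search in place of a proper non-linear fit. The function names, the grid ranges, and the Hill coefficient of 2 used to generate the synthetic data points are all hypothetical; only the 0.75 mM EC50 is taken from the text above.

```python
def hill(conc_mM, ec50_mM, n, top=100.0):
    """Hill-type dose-response: % dodecamer at a given 2-OG concentration.

    `top` is the plateau (assumed known here); `n` is the Hill coefficient.
    """
    if conc_mM <= 0:
        return 0.0
    return top / (1.0 + (ec50_mM / conc_mM) ** n)


def fit_ec50(concs, responses, top=100.0):
    """Coarse least-squares grid search for EC50 and Hill coefficient.

    Illustration only: a real analysis would use a non-linear fitting
    routine and report confidence intervals.
    """
    best = None
    for i in range(-200, 201):       # EC50 from 0.01 to 100 mM, log-spaced
        ec50 = 10 ** (i / 100.0)
        for j in range(5, 61):       # Hill coefficient from 0.5 to 6.0
            n = j / 10.0
            sse = sum((hill(c, ec50, n, top) - r) ** 2
                      for c, r in zip(concs, responses))
            if best is None or sse < best[0]:
                best = (sse, ec50, n)
    return best[1], best[2]


# Synthetic, noise-free points generated from the model itself (not real data).
concs = [0.01, 0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 12.5]
responses = [hill(c, 0.75, 2.0) for c in concs]
ec50_est, n_est = fit_ec50(concs, responses)
print(f"EC50 ~ {ec50_est:.2f} mM, Hill coefficient ~ {n_est:.1f}")
```

      An EC50 obtained this way describes the concentration giving half-maximal assembly; with cooperative binding (n different from 1) it need not coincide with the dissociation constant of an individual site, which is the reviewer's point.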

      In this respect, instead of the rather poor assay shown in Figure 1D, a more detailed determination of catalytic activation by different 2-OG concentrations should be done (similar to 1A)... This would allow a direct comparison between dodecamerization and enzymatic activation.

      We agree and performed the respective experiments, which are now presented in the revised Fig. 1D.

      Discussion: the role of 2-OG as a direct activator, comparison with other prokaryotic GS: in other cases, 2-OG affects GS indirectly by being sensed by PII proteins or other 2-OG sensing mechanisms (like 2OG-NtcA-mediated repression of IF factors in cyanobacteria)

      We agree and have added that information in the discussion as suggested.

      290. Unclear: As a second step of activation, the allosteric binding of 2-OG causes a series of conformational.... where is this site located? According to the catalytic effects (compare 1A and 1D) this site should have a lower affinity …

      Thank you very much for pointing this out. Binding of 2-OG only occurs in one specific allosteric binding site. Binding, however, has two effects on GlnA1: dodecamer assembly and priming of the active site (with two specific EC50 values, which are now shown in Fig. 1A and D).

      See also public comment #1 (1).

      Reviewer #2 (Recommendations For The Authors):

      The primary concern for me is that mass photometry might lead to incorrect conclusions. The differences in the forms of GS seen in SEC and MP suggest that GS can indeed form a stable dodecamer when the concentration of GS is high enough, as shown in Figure S1B. I strongly suggest using an additional biophysical method to explore the connection between GS and 2-OG in terms of both assembly and activity, to truly understand 2-OG's role in the process of assembly and catalysis.

      We apologize if we did not present this clearly enough. The MP analysis of GlnA1 in the absence of 2-OG always showed (monomers/)dimers; dodecamers were only present in the presence of 2-OG. The SEC analysis in Fig. S1B was performed in the presence of 12.5 mM 2-OG; we realized that this information was missing from the figure legend and have now added it in the revised version. In addition, 2-OG is visible in the cryo-EM structure. Thus, we do not see the need to perform additional biophysical experiments.

      As for the other experimental findings, they appear satisfactory to me, and I have no reservations regarding the cryoEM data.

      (1) Mass photometry is a fancy technique that uses only a tiny amount of protein to study how they come together. However, the concentration of the protein used in the experiment might be lower than what's needed for them to stick together properly. So, the authors saw a lot of single proteins or pairs instead of bigger groups. They showed in Figure S1B that the M. mazei GS came out earlier than a 440-kDa reference protein, indicating it's actually a dodecamer. But when they looked at the dodecamer fraction using mass photometry, they found smaller bits, suggesting the GS was breaking apart because the concentration used was too low. To fix this, they could try using a technique called analytic ultracentrifuge (AUC) with different amounts of 2-OG to see if they can spot single proteins or pairs when they use a bit more GS. They could also try another technique called SEC-MALS to do similar tests. If they do this, they could replace Figure 1A with new data showing fully formed GS dodecamers when they use the right amount of 2-OG.

      Thank you for this input. In MP we looked at dodecamer formation after removing the 2-OG entirely and re-adding it in the respective concentration. We think that GlnA1 is much more unstable in its monomeric/dimeric fraction and that the complete and harsh removal of 2-OG results in some dysfunctional protein which does not recover the dodecameric conformation after dialysis and re-addition of 2-OG. Looking at the dodecamer-peak right after SEC however, we exclusively see dodecamers, which is now included as an additional supplementary figure (suppl. Fig. 1C). Consequently, we did not perform additional experiments.

      (2) Building on the last point, the estimated binding strength (Kd) between 2-OG and GS might be lower than it really is, because the GS often breaks apart from its dodecameric form in this experiment, even though 2-OG helps keep the pairs together, as seen with cryoEM. What if they used 5-10 times more GS in the mass photometry experiment? Would the estimated bond strength stay the same? Could they use AUC or other techniques like ITC to find out the real, not just estimated, strength of the bond?

      We agree that the term KD is not suitable. We have changed the term KD to EC50 as suggested by reviewer #1, which describes the effective concentration required for 50 % dodecamer assembly. Furthermore, we disagree that the dodecamer breaks apart when the concentrations are as low as in MP experiments. The actual reason for the breaking is rather the harsh dialysis to remove all 2-OG before MP experiments. Right after SEC, we exclusively see dodecamers in MP (suppl. Fig. S1C). See also #2 (1).

      (3) The fact that the GS hardly works without 2-OG is interesting. I tried to understand the experiment setup, but it wasn't clear as the protocol mentioned in the author's 2021 FEBS paper referred to an old paper from 1970. The "coupled optical test assay" they talked about wasn't explained well. I found other papers that used phosphometry assays to see how much ATP was used up. I suggest the authors give a better, more detailed explanation of their experiments in the methods section. Also, it's unclear why the GS activity keeps going up from 5 to 12.5 mM 2-OG, even though they said it's saturated. They suggested there might be another change happening from 5 to 12.5 mM 2-OG. If that's the case, they should try to get a cryo-EM picture of the GS with lots of 2-OG, both with and without ATP/glutamate (or the Met-Sox-P-ADP inhibitor), to see what's happening at a structural level during this change caused by 2-OG.

      We agree with the reviewer that the GS assay was not explained in detail (since it has been published and known for several years). We have now added a more detailed description of the assay to the revised MS. The assay also measures the ATP used up by GS, but couples the generation of ADP to an optical test: pyruvate kinase present in the assay uses the generated ADP to produce pyruvate from PEP. This pyruvate is finally reduced to lactate by the lactate dehydrogenase present in the assay, consuming NADH, the oxidation of which is monitored at 340 nm.
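
      How a rate of absorbance change at 340 nm translates into a specific activity can be sketched as follows. This is a generic Beer-Lambert calculation, not the authors' protocol: it assumes one NADH oxidized per ATP consumed, the commonly used NADH extinction coefficient of 6.22 mM^-1 cm^-1 at 340 nm, and a 1 cm light path. The function name and the example numbers are hypothetical.

```python
NADH_EPSILON_MM_CM = 6.22  # mM^-1 cm^-1 at 340 nm (commonly used value)


def specific_activity_u_per_mg(delta_a340_per_min, assay_volume_ml,
                               enzyme_mg, path_length_cm=1.0):
    """Convert an A340 decrease rate into U/mg (1 U = 1 umol per min).

    Assumes a 1:1:1 coupling ATP -> ADP -> pyruvate -> NADH oxidized,
    so NADH consumption equals ATP consumption by GS.
    """
    if enzyme_mg <= 0:
        raise ValueError("enzyme_mg must be positive")
    # Beer-Lambert: dA/dt = epsilon * l * dc/dt  ->  dc/dt in mM per min
    rate_mm_per_min = abs(delta_a340_per_min) / (NADH_EPSILON_MM_CM * path_length_cm)
    # mM equals umol per mL, so multiplying by the volume gives umol per min
    umol_per_min = rate_mm_per_min * assay_volume_ml
    return umol_per_min / enzyme_mg


# Made-up example: A340 falls by 0.0622 per min in a 1 mL assay with 1 ug enzyme.
print(specific_activity_u_per_mg(-0.0622, 1.0, 0.001))
```

      Because small absorbance slopes are divided by a small enzyme amount, noise in the slope is amplified at low activities, which is consistent with the high deviation the authors mention for the coupled assay.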

      Regarding the continued increase in GS activity (max. at 12.5 mM 2-OG) after dodecamer formation is complete (max. at 5 mM 2-OG): as also stated in our response to the public reviews, we assume that there are two effects caused by 2-OG: 1. cooperativity of binding (less 2-OG needed to facilitate dodecamer formation) and 2. priming of each active site.

      The suggested additional experiments with and without ATP/Glutamate: Although we strongly agree that this would be a highly interesting structure, it seems out of the scope of a typical revision to request new cryo-EM structures. We evaluate the findings of our present study concerning the 2-OG effects as important insights into the strongly discussed field of glutamine synthetase regulation, even without the requested additional structures.

      (4) Please remake Figure S2, the panels are too small to read the words. At least I have difficulty doing so.

      We assume the reviewer is pointing to Suppl. Fig S3, we now changed this figure accordingly.

      Line 153, the reference Schumacher et al. 23, should be 2023?

      Yes, thank you. We corrected that.

      Line 497. I believe it's UCSF ChimeraX, not Chimera.

      We apologize and corrected accordingly.

      Reviewer #3 (Recommendations For The Authors):

      Recent studies on the Methanothermococcus thermolithotrophicus glutamine synthetase, published by Müller et al., 2024, have identified the binding site for 2-oxoglutarate as well as the conformational changes that were induced in the protein by its presence. In the present study, the authors confirm these observations and additionally establish a link between the presence of 2-oxoglutarate and the dodecameric fold and full activation of GS.

      Curiously, here, the authors could not confirm their own findings that the dodecameric GS can directly interact with the PII-like GlnK1 protein and the small peptide sP26. However, the lack of mention of the GlnK-bound state in these studies is very alarming since it certainly is highly relevant here.

      We agree with the reviewer that we have not observed the interaction with GlnK1 and sP26 in the present study. Consequently, we speculate that yet unknown cellular factor(s) might be required for an interaction of GlnA1 with GlnK1 and sP26; these were not present in the in vitro experiments using purified proteins, but were present in the previous pull-down approaches (Ehlers et al. 2005, Gutt et al. 2021). Another reason might be that post-translational modifications that are important for the interaction occur in M. mazei but are absent from purified proteins expressed in E. coli.

      The manuscript interest could have been substantially increased if the authors had done finer biochemical and enzymatic analyses on the oligomerization process of GS, used GlnK1 bound to known effectors in their assays and would have done some more efforts to extrapolate their findings (even if a small niche) of related glutamine synthetases.

      We thank the reviewer for their valuable encouragement to explore ligand-bound-states of GlnK1. However, in this manuscript we mainly focused on 2-OG as activator of GlnA1 and decided to dedicate future experiments to the exploration of conditions that possibly favor GlnK1-binding.

      In principle, we have explored the ATP bound GlnK1 effects on GlnA1 activity in the activity assays (Fig. 2E) since ATP (3.6 mM) is present. GlnK1 however showed no effects on GlnA1 activity.

      In general, the manuscript is poorly written, with grammatically incorrect sentences that at times stand in the way of conveying the message of the manuscript.

      Particular points:

      (1) It is mentioned that 2-OG induces the active oligomeric (dodecamer, 12-mer) state of GlnA1 without detectable intermediates. However, only 62 % of the starting inactive enzyme yields active 12-mers. Note that this is contradicted in line 212.

      Thanks for pointing out this discrepancy. After removing all 2-OG as we did before MP-experiments, GlnA1 doesn’t reach full dodecamers anymore when 2-OG is re-added. This is not because the 2-OG amount is not enough to trigger full assembly, but because the protein is much more unstable in the absence of 2-OG, so we predict that some GlnA1 breaks during dialysis. See also answer reviewer #2 (1) and supplementary figure S1C.

      Is there any protein precipitation upon the addition of 2-OG? Is all protein being detected in the assay, meaning, is monomer/dimer + dodecamer yields close to 100% of the total enzyme in the assay?

      There is no protein precipitation upon the addition of 2-OG; indeed, GlnA1 is much more stable in the presence of 2-OG. In the mass photometry experiments, all particles are measured, so precipitated protein would be visible as big entities in the MP.

      Please add to Figure 1 the amount of monomer/dimer during titration. Some debate why there is no full conversion should be tentatively provided.

      We agree with the reviewer and included the amount of monomer/dimer in the figure, as well as some discussion on why it is not fully converted again. GlnA1 is unstable without 2-OG and it was dialysed against buffer without 2-OG before MP measurements. This sample mistreatment resulted in no full re-assembly after re-adding 2-OG, although full dodecamers were present before dialysis (suppl. Fig. S1C).

      (2) Figure 1B reflects an exemplary result. Here, the addition of 0.1 mM 2-OG seems to promote monomer to dimer transition. Why was this not studied in further detail? It seems highly relevant to know from which species the dodecamer is assembled.

      We thank the reviewer for their comment. However, we would like to point out that, although not shown in the figure, GlnA1 is always mainly present as dimers as the smallest entity. As suggested earlier, we have added the amount of monomers/dimers to Figure 1A, which shows low monomer counts at all 2-OG concentrations (Fig. 1A). Although not depicted in the graph, which starts at 0.01 mM 2-OG, we also see mainly dimers at 0 mM 2-OG.

      How does the y-axis compare to the number and percentage of counts assigned to the peaks? In line 713, it is written that the percentage of dodecamer considers the total number of counts, and this was plotted against the 2-OG concentration.

      We thank the reviewer for addressing this unclarity. Line 713 corresponds to Figure 1A, where we indeed plotted the percentage of dodecamer against the 2-OG-concentration. Thereby, the percentage of dodecamer corresponds to the percentage calculated from the Gaussian Fit of the MP-dodecamer-peak. In Figure 1 B, however, the y-axis displays the relative amount of counts per mass, multiple similar masses then add up to the percentage of the respective peak (Gaussian Fit above similar masses).

      (3) Lines 714 and 721 (and elsewhere): Why is only partial data used for statistical purposes?

      We in general only show one exemplary biological replicate, since the quality of the respective GlnA1 purification sometimes varied (maximum activity ranging from 5 - 10 U/mg). Therefore, we only compared activities within the same protein purification. For the EC50 calculations of all measurements, we refer to the supplement.

      (4) Lines 192-193: It is claimed that GlnK1 was previously shown to both regulate the activity of GlnA1 and form a complex with GlnA1. Please mention the ratio between GlnK1 and GlnA1 in this complex.

      We now included the requested information (GlnA1:GlnK1 1:1 (Ehlers et al. 2005); His6-GlnA1 (0.95 μM), His6-GlnK1 (0.65 μM), i.e. 2:1.4 (Gutt et al. 2021)).

      It is also known that PII proteins such as GlnK1 can bind ADP, ATP, and 2-OG. Interestingly, however, for various described PII proteins, 2-OG can only bind after the binding of ATP.

      So, the crucial question here is what is the binding state of GlnK1? 

      Were these assays performed in the absence of ATP? This is key to fully understand and connect the results to the previous observations. For example, if the GlnK1 used was bound to ADP but not to ATP, then the added 2-OG might indeed only be able to affect GlnA1 (leading to its activation/oligomerization). If this were true and according to the data reported, ADP would prevent GlnK1 from interacting with any oligomeric form of GlnA1. However, if GlnK1 bound to ATP is the form that interacts with GlnA1 (potentially validating previous results?) then, 2-OG would first bind to GlnK1 (assuming a higher affinity of 2-OG to GlnK1), eventually causing its release from GlnA1 followed by binding and activation of GlnA1.

      These experiments need to be done as they are essential to further understand the process. Given the ability of the authors to produce the protein and run such assays, it is unclear why they were not done here. As written in line 203, in this case, "under the conditions tested" is not a good enough statement, considering what is known in the field and how many more conclusions could easily be taken from such a setup.

      Thanks for the encouragement to investigate the ligand-bound states of GlnK1. We agree and plan to perform the suggested mass photometry experiments exploring the conditions under which GlnA1 and GlnK1 might interact in future work. In GlnA1 activity test assays, when evaluating the presence/effects of GlnK1 on GlnA1 activity, however, ATP was always present in high concentrations and still we did not observe a significant effect of GlnK1 on the GlnA1 activity.

      (5) Figure 2D legend claims that the graphic shows the percentage of dodecameric GlnA1 as a function of the concentration of 2-OG. This is not what the figure shows; Figure 2D shows the dodecamer/dimer (although legend claims monomer was used, in line 732) ratio as a function of 2-OG (stated in line 736!). If this is true, a ratio of 1 means 50 % of dodecamers and dimers co-exist. This appears to be the case when GlnK1 was added, while in the absence of GlnK1 higher ratios are shown for higher 2-OG concentration implying that about 3 times more dodecamers were formed than dimers. However, wouldn't a 50 % ratio be physiologically significant?

      We apologize for the partially incorrect and also misleading figure legend and corrected it. Indeed, the ratio of dodecamers and dimers is shown. Furthermore, we did not use monomeric GlnA1 (the smallest entity is mainly a dimer, see Fig 1A), however, the molarity was calculated based on the monomer-mass. Concerning the significance of the difference between the maximum ratio of GlnA1 and GlnK1: The ratio does appear higher, but this is mostly because adding large quantities of GlnK1 broadens all peaks at low molecular weight. This happens because the GlnK1 signal starts overlapping with the signal from GlnA1, leading to inflated GlnA1 dimer counts. We therefore do not think that this is biologically significant, especially as the activities do not differ under these conditions.
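
      The arithmetic behind the reviewer's reading of the dodecamer/dimer ratio can be sketched as follows. This is an illustrative calculation only, not the authors' analysis: it assumes that the plotted ratio is a ratio of particle counts and that dimers and dodecamers are the only species counted. The function names are hypothetical. The second function uses the fact that one dodecamer contains as many subunits as six dimers, so the share of protein in dodecamers is larger than the share of particles.

```python
def dodecamer_particle_fraction(ratio: float) -> float:
    """Fraction of counted particles that are dodecamers.

    `ratio` is the dodecamer/dimer count ratio (assumed to be what is plotted).
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return ratio / (1.0 + ratio)


def dodecamer_subunit_fraction(ratio: float) -> float:
    """Fraction of subunits found in dodecamers.

    A dodecamer holds 12 subunits and a dimer holds 2, so each dodecamer
    counts six times as much protein as a dimer.
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return 6.0 * ratio / (6.0 * ratio + 1.0)


for r in (1.0, 3.0):
    print(
        f"ratio {r:g}: {dodecamer_particle_fraction(r):.0%} of particles, "
        f"{dodecamer_subunit_fraction(r):.1%} of subunits in dodecamers"
    )
```

      Under these assumptions a ratio of 1 corresponds to half of the particles, and a ratio of 3 to three quarters of the particles, being dodecamers.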

      (6) Is it possible that the uncleaved GlnA1 tag is preventing interaction with GlnK1? This should be discussed.

      This is of course a very important point. We however realized that Schumacher et al. also used an N-terminal His-tag, so we assume that the N-terminal tag is not hampering the interaction.

      (7) Line 228: Please detail the reported discrepancies in rmsd between the current protein and the gram-negative enzymes.

      The differences in rmsd between our M. mazei GlnA1 structure and the structures of gram-negative enzymes are caused by a) sequence similarity: e.g. M. mazei GlnA1 and B. subtilis GlnA have a sequence identity of 58.47 %; b) ligands in the structure: the B. subtilis structure contains L-methionine-S-sulfoximine phosphate, a transition state inhibitor, while the M. mazei structure contains 2-OG; c) methodology: the structural determination methods also contribute to these differences. B. subtilis GlnA was determined using X-ray crystallography, while the M. mazei GlnA1 structure was resolved using cryo-EM, where the protein behaves differently in ice compared to a crystal.

      (8) Line 747: The figure title claims "dimeric interface" although the manuscript body only refers to "hexameric interface" or "inter-hexamer interface" (line 224). Moreover, the figure 4 legend uses terms such as vertical and horizontal dimers and this too should be uniformized within the manuscript.

      Thank you for your valuable feedback. We have updated the figure title and the figure legend as well as the main text to ensure consistency in the description.

      (9) Line 752: The description of the color scheme used here is somehow unclear.

      Thanks for pointing this out. We changed the description to make it more comprehensive.

      (10) Please label H14/H15 and H14'/H15' in Fig 4C zoom.

      We agree that this has not been very clear. We added helix labels.

      (11) In Figure 4D legend, make sure to note that the binding sites for the substrate are based on homologies with another enzyme poised with these molecules.

      The same should be clear in the text: sites are not known, they are assumed to be, based on homologies (paragraph starting at line 239).

      Concerning this comment we want to point out that we studied the exact same enzyme as the Schumacher group, except that we used 2-OG in our experiments, which they did not.

      (12) Figure 3 appears redundant in light of Figure 4. 

      (13) Line 235: When mentioning F24, please refer to Figure 5.

      Thank you, we changed that accordingly.

      (14) Please provide the distances for the bonds depicted in Figure 4B.

      Thanks for pointing this out. We added distance labels to Figure 4B; for reasons of clarity, only three H-bonds are labeled.

      (15) Line 241: D57 is likely serving to abstract a proton from ammonium, what is residue Glu307 potentially doing? The information seems missing in light of how the sentence is built.

      Thanks for pointing this out. According to previous studies, both residues are likely involved in proton abstraction: first from ammonium, and then from the formed gamma-ammonium group. Additionally, they contribute to shielding the active site from bulk solvent to prevent hydrolysis of the formed phospho-glutamate.

      (16) Why do the authors assume that increased concentrations of 2-OG are a signal for N starvation only in M. mazei and not in all prokaryotic equivalent systems (line 288)?

      In line 288, we did not claim that this is a unique signal for M. mazei. It is also the central N-starvation signal in Cyanobacteria but not directly perceived by the cyanobacterial GS through binding directly to GS.

      The authors should look into the residues that bind 2-OG and check if they are conserved in other GS. The results of this sequence analysis should be discussed in line with the variable prokaryotic glutamine synthetase types of activity modulation that were exposed in the introduction and Figure 7.

      Please refer to supplementary figure S5, where we already aligned the mentioned glutamine synthetase sequences. Since this was also already discussed in Müller et al. 2024, we did not want to repeat their observations and refer to our supplementary figure in too much detail.

      (17) Figure 5 title: Replace TS by transition state structures of homology enzymes, or alike.

      Thank you for this suggestion. We did not change the title however, since it is not a homologue but the exact same glutamine synthetase from Methanosarcina mazei.

      (18) Line 249: D170 is not shown in Figure 5A or elsewhere in Figure 5.

      Thank you for pointing this out. We added D170 to figure 5A.

      (19) Representative density for the residues binding 2-OG should be provided, maybe in a supplemental figure.

      Thank you for the suggestion. We added the densities of 2-OG-binding residues to Figure 4B.

      (20) Line 260: Please add a reference when describing the phosphoryl transfer.

      We thank the reviewer for this important point and added that accordingly.

      (21) Line 296: The binding of 2-OG indeed appears to be cooperative, such that at concentrations above its binding affinity to the protein, only dodecamers are seen (under experimental conditions). However, claiming that the oligomerization is fast is not correct when the experimental setup includes 10 minutes of incubation before measurements are done. Please correct this within the entire manuscript.

      A (fast) continuous kinetic assay could have confirmed this point and revealed the oligomerization steps and the intermediaries in the process (maybe monomer/dimers, then dimers/hexamers, and then hexamers/dodecamers). Such assays would have been highly valuable to this study.

      We thank the reviewer for this suggestion, but disagree. It is indeed a rather fast regulation (activity assays without pre-incubation take only 1 min longer to reach full activity, see the newly included suppl. Fig S6). Compared with other regulatory mechanisms such as transcriptional or translational regulation, an activation that takes only 60 s is actually quite quick.

      (22) Line 305 (and elsewhere in the manuscript): the authors state that 2-OG primes the active site for a transition state. This appears incorrect. The transition state is the highest energy state in an enzymatic reaction progressing from substrate to product. Meaning, the transition state is a state that has a more or less modified form of the original substrate bound to the active site. This is not the case.

      In line 366 an "active open state" appears much more adequate to use. 

      We agree and changed accordingly throughout the manuscript.

      (23) Line 330: Please delete "found". Eventually replace it with "confirmed": As the authors write, others have described this residue as a ligand to glutamine.

      Thanks, we changed that accordingly, although previous descriptions were just based on homologies without the experimental validation.

      (24) The discussion at various points summarizes the results again. It should be trimmed and improved.

      (25) Line 381: replace "two fast" with "fast"?

      We thank the reviewer for this suggestion, but disagree on this point. We especially wanted to highlight that there are two central nitrogen-metabolites involved in the direct regulation of GlnA1, that means TWO fast direct processes mediated by 2-OG and glutamine.

    1. Reviewer #1 (Public review):

      Wang et al., recorded concurrent EEG-fMRI in 107 participants during nocturnal NREM sleep to investigate brain activity and connectivity related to slow oscillations (SO), sleep spindles, and in particular their co-occurrence. The authors found SO-spindle coupling to be correlated with increased thalamic and hippocampal activity, and with increased functional connectivity from the hippocampus to the thalamus and from the thalamus to the neocortex, especially the medial prefrontal cortex (mPFC). They concluded the brain-wide activation pattern to resemble episodic memory processing, but to be dissociated from task-related processing and suggest that the thalamus plays a crucial role in coordinating the hippocampal-cortical dialogue during sleep.

      The paper offers an impressively large and highly valuable dataset that provides the opportunity for gaining important new insights into the network substrate involved in SOs, spindles, and their coupling. However, the paper does unfortunately not exploit the full potential of this dataset with the analyses currently provided, and the interpretation of the results is often not backed up by the results presented.

      I have the following specific comments.

      (1) The introduction is lacking sufficient review of the already existing literature on EEG-fMRI during sleep and the BOLD-correlates of slow oscillations and spindles in particular (Laufs et al., 2007; Schabus et al., 2007; Horovitz et al., 2008; Laufs, 2008; Czisch et al., 2009; Picchioni et al., 2010; Spoormaker et al., 2010; Caporro et al., 2011; Bergmann et al., 2012; Hale et al., 2016; Fogel et al., 2017; Moehlman et al., 2018; Ilhan-Bayrakci et al., 2022). The few studies mentioned are not discussed in terms of the methods used or insights gained.

      (2) The paper falls short in discussing the specific insights gained into the neurobiological substrate of the investigated slow oscillations, spindles, and their interactions. The validity of the inverse inference approach ("Open ended cognitive state decoding"), assuming certain cognitive functions to be related to these oscillations because of the brain regions/networks activated in temporal association with these events, is debatable at best. It is also unclear why eventually only episodic memory processing-like brain-wide activation is discussed further, although 16 of 50 feature terms from the NeuroSynth v3 dataset were significant (episodic memory, declarative memory, working memory, task representation, language, learning, faces, visuospatial processing, category recognition, cognitive control, reading, cued attention, inhibition, and action).

      (3) Hippocampal activation during SO-spindles is stated as a main hypothesis of the paper - for good reasons - however, other regions (e.g., several cortical as well as thalamic) would be equally expected given the known origin of both oscillations and the existing sleep-EEG-fMRI literature. However, this focus on the hippocampus contrasts with the focus on investigating the key role of the thalamus instead in the Results section.

      (4) The study included an impressive number of 107 subjects. It is surprising though that only 31 subjects had to be excluded under these difficult recording conditions, especially since no adaptation night was performed. Since only subjects were excluded who slept less than 10 min (or had excessive head movements) there are likely several datasets included with comparably short durations and only a small number of SOs and spindles and even less combined SO-spindle events. A comprehensive table should be provided (supplement) including for each subject (included and excluded) the duration of included NREM sleep, number of SOs, spindles, and SO+spindle events. Also, some descriptive statistics (mean/SD/range) would be helpful.

      (5) Was the 20-channel head coil dedicated for EEG-fMRI measurements? How were the electrode cables guided through/out of the head coil? Usually, the 64-channel head coil is used for EEG-fMRI measurements in a Siemens PRISMA 3T scanner, which has a cable duct at the back that allows to guide the cables straight out of the head coil (to minimize MR-related artifacts). The choice for the 20-channel head coil should be motivated. Photos of the recording setup would also be helpful.

      (6) Was the EEG sampling synchronized to the MR scanner (gradient system) clock (the 10 MHz signal; not referring to the volume TTL triggers here)? This is a requirement for stable gradient artifact shape over time and thus accurate gradient noise removal.

      (7) The TR is quite long and the voxel size is quite large in comparison to state-of-the-art EPI sequences. What was the rationale behind choosing a sequence with relatively low temporal and spatial resolution?

      (8) The anatomically defined ROIs are quite large. It should be elaborated on how this might reduce sensitivity to sleep rhythm-specific activity within sub-regions, especially for the thalamus, which has distinct nuclei involved in sleep functions.

      (9) The study reports SO & spindle amplitudes & densities, as well as SO+spindle coupling, to be larger during N2/3 sleep compared to N1 and REM sleep, which is trivial but can be seen as a sanity check of the data. However, the amount of SOs and spindles reported for N1 and REM sleep is concerning, as per definition there should be hardly any (if SOs or spindles occur in N1 it becomes by definition N2, and the interval between spindles has to be considerably large in REM to still be scored as such). Thus, on the one hand, the report of these comparisons takes too much space in the main manuscript as it is trivial, but on the other hand, it raises concerns about the validity of the scoring.

      (10) Why was electrode F3 used to quantify the occurrence of SOs and spindles? Why not a midline frontal electrode like Fz (or a number of frontal electrodes for SOs) and Cz (or a number of centroparietal electrodes) for spindles to be closer to their maximum topography?

      (11) Functional connectivity (hippocampus -> thalamus -> cortex (mPFC)) is reported to be increased during SO-spindle coupling and interpreted as evidence for coordination of hippocampo-neocortical communication likely by thalamic spindles. However, functional connectivity was only analysed during coupled SO+spindle events, not during isolated SOs or isolated spindles. Without the direct comparison of the connectivity patterns between these three events, it remains unclear whether this is specific for coupled SO+spindle events or rather associated with one or both of the other isolated events. The PPIs need to be conducted for those isolated events as well and compared statistically to the coupled events.

      (12) The limited temporal resolution of fMRI does indeed not allow for easily distinguishing between fMRI activation patterns related to SO-up- vs. SO-down-states. For this, one could try to extract the amplitudes of SO-up- and SO-down-states separately for each SO event and model them as two separate parametric modulators (with the risk of collinearity as they are likely correlated).

      (13) L327: "It is likely that our findings of diminished DMN activity reflect brain activity during the SO DOWN-state, as this state consistently shows higher amplitude compared to the UP-state within subjects, which is why we modelled the SO trough as its onset in the fMRI analysis." This conclusion is not justified as the fact that SO down-states are larger in amplitude does not mean their impact on the BOLD response is larger.

      (14) Line 77: "In the current study, while directly capturing hippocampal ripples with scalp EEG or fMRI is difficult, we expect to observe hippocampal activation in fMRI whenever SOs-spindles coupling is detected by EEG, if SOs-spindles-ripples triple coupling occurs during human NREM sleep". Not all SO-spindle events are associated with ripples (Staresina et al., 2015), but hippocampal activation may also be expected based on the occurrence of spindles alone (Bergmann et al., 2012).

      References:

      Bergmann TO, Molle M, Diedrichs J, Born J, Siebner HR (2012) Sleep spindle-related reactivation of category-specific cortical regions after learning face-scene associations. Neuroimage 59:2733-2742.
      Caporro M, Haneef Z, Yeh HJ, Lenartowicz A, Buttinelli C, Parvizi J, Stern JM (2011) Functional MRI of sleep spindles and K-complexes. Clin Neurophysiol.
      Czisch M, Wehrle R, Stiegler A, Peters H, Andrade K, Holsboer F, Samann PG (2009) Acoustic oddball during NREM sleep: a combined EEG/fMRI study. PLoS One 4:e6749.
      Fogel S, Albouy G, King BR, Lungu O, Vien C, Bore A, Pinsard B, Benali H, Carrier J, Doyon J (2017) Reactivation or transformation? Motor memory consolidation associated with cerebral activation time-locked to sleep spindles. PLoS One 12:e0174755.
      Hale JR, White TP, Mayhew SD, Wilson RS, Rollings DT, Khalsa S, Arvanitis TN, Bagshaw AP (2016) Altered thalamocortical and intra-thalamic functional connectivity during light sleep compared with wake. Neuroimage 125:657-667.
      Horovitz SG, Fukunaga M, de Zwart JA, van Gelderen P, Fulton SC, Balkin TJ, Duyn JH (2008) Low frequency BOLD fluctuations during resting wakefulness and light sleep: a simultaneous EEG-fMRI study. Hum Brain Mapp 29:671-682.
      Ilhan-Bayrakci M, Cabral-Calderin Y, Bergmann TO, Tuscher O, Stroh A (2022) Individual slow wave events give rise to macroscopic fMRI signatures and drive the strength of the BOLD signal in human resting-state EEG-fMRI recordings. Cereb Cortex 32:4782-4796.
      Laufs H (2008) Endogenous brain oscillations and related networks detected by surface EEG-combined fMRI. Hum Brain Mapp 29:762-769.
      Laufs H, Walker MC, Lund TE (2007) 'Brain activation and hypothalamic functional connectivity during human non-rapid eye movement sleep: an EEG/fMRI study'--its limitations and an alternative approach. Brain 130:e75; author reply e76.
      Moehlman TM, de Zwart JA, Chappel-Farley MG, Liu X, McClain IB, Chang C, Mandelkow H, Ozbay PS, Johnson NL, Bieber RE, Fernandez KA, King KA, Zalewski CK, Brewer CC, van Gelderen P, Duyn JH, Picchioni D (2018) All-Night Functional Magnetic Resonance Imaging Sleep Studies. J Neurosci Methods.
      Picchioni D, Horovitz SG, Fukunaga M, Carr WS, Meltzer JA, Balkin TJ, Duyn JH, Braun AR (2010) Infraslow EEG oscillations organize large-scale cortical-subcortical interactions during sleep: A combined EEG/fMRI study. Brain Res.
      Schabus M, Dang-Vu TT, Albouy G, Balteau E, Boly M, Carrier J, Darsaud A, Degueldre C, Desseilles M, Gais S, Phillips C, Rauchs G, Schnakers C, Sterpenich V, Vandewalle G, Luxen A, Maquet P (2007) Hemodynamic cerebral correlates of sleep spindles during human non-rapid eye movement sleep. Proc Natl Acad Sci U S A 104:13164-13169.
      Spoormaker VI, Schroter MS, Gleiser PM, Andrade KC, Dresler M, Wehrle R, Samann PG, Czisch M (2010) Development of a large-scale functional brain network during human non-rapid eye movement sleep. J Neurosci 30:11379-11387.
      Staresina BP, Bergmann TO, Bonnefond M, van der Meij R, Jensen O, Deuker L, Elger CE, Axmacher N, Fell J (2015) Hierarchical nesting of slow oscillations, spindles and ripples in the human hippocampus during sleep. Nat Neurosci 18:1679-1686.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      I would like to thank the reviewers for their comments and interest in the manuscript and the study.

      Referee #1

      1. I would assume that there are RNA-seq and/or ChIP-seq data out there produced after knockdown of one or more of these DBPs that show directional positioning.

      Response: The directional positioning of CTCF-binding sites at chromatin interaction sites was analyzed by CRISPR experiment (Guo Y et al. Cell 2015). We found that the machine learning and statistical analysis showed the same directional bias of the CTCF-binding motif sequence at chromatin interaction sites as the experimental analysis of Guo Y et al. (lines 229-245, Figure 3b, c, d and Table 1). Since CTCF is involved in different biological functions (Braccioli L et al. Essays Biochem. 2019 ResearchGate webpage), the directional bias of binding sites may be reduced in all binding sites including those at chromatin interaction sites (lines 68-73). In our study, we investigated the DNA-binding sites of proteins using the ChIP-seq data of DNA-binding proteins and DNase-seq data. We also confirmed that the DNA-binding sites of SMC3 and RAD21, which tend to be found in chromatin loops with CTCF, also showed the same directional bias as CTCF by the computational analysis.

      1. Figure 6 should be expanded to incorporate analysis of DBPs not overlapping CTCF/cohesin in chromatin interaction data that is important and potentially more interesting than the simple DBPs enrichment reported in the present form of the figure.

      Response: Following the reviewer's advice, I performed the same analysis with the DNA-binding sites that do not overlap with the DNA-binding sites of CTCF and cohesin (RAD21 and SMC3) (Fig. 6 and Supplementary Fig. 4). The result showed the same tendency in the distribution of DNA-binding sites. The height of a peak on the graph became lower for some DNA-binding proteins after removing the DNA-binding sites that overlapped with those of CTCF and cohesin. I have added the following sentence on lines 427 and 817: For the insulator-associated DBPs other than CTCF, RAD21, and SMC3, the DNA-binding sites that do not overlap with those of CTCF, RAD21, and SMC3 were used to examine their distribution around interaction sites.

      1. Critically, I would like to see use of Micro-C/Hi-C data and ChIP-seq from these factors, where insulation scores around their directionally-bound sites show some sort of an effect like that presumed by the authors - and many such datasets are publicly-available and can be put to good use here.

      Response: As suggested by the reviewer, I have added the insulator scores and boundary sites from the 4D nucleome data portal as tracks in the UCSC genome browser. The insulator scores seem to correspond to some extent to the H3K27me3 histone marks from ChIP-seq (Fig. 4a and Supplementary Fig. 3). The direction of DNA-binding sites on the genome can be shown with different colors (e.g. red and green), but the directionality of insulator-associated DNA-binding sites is their overall tendency, and it may be difficult to notice the directionality from each binding site because the directionality may be weaker than that of CTCF, RAD21, and SMC3 as shown in Table 1 and Supplementary Table 2.

      I found that the CTCF binding sites examined by a wet experiment in the previous study may not always overlap with the boundary sites of chromatin interactions from Micro-C assay (Guo Y et al. Cell 2015). The chromatin interaction data do not include all interactions due to the high sequencing cost of the assay. The number of the boundary sites may be smaller than that of CTCF binding sites acting as insulators and/or some of the CTCF binding sites may not be located in the boundary sites. It may be difficult for the boundary location algorithm to identify a short boundary location. Due to the limitations of the chromatin interaction data, I planned to search for insulator-associated DNA-binding proteins without using chromatin interaction data in this study. I have added the statistical summary of the analysis in lines 364-387 as follows: Overall, among 20,837 DNA-binding sites of the 97 insulator-associated proteins found at insulator sites identified by H3K27me3 histone modification marks (type 1 insulator sites), 1,315 (6%) overlapped with 264 of 17,126 5-kb long boundary sites, and 6,137 (29%) overlapped with 784 of 17,126 25-kb long boundary sites in HFF cells. Among 5,205 DNA-binding sites of the 97 insulator-associated DNA-binding proteins found at insulator sites identified by H3K27me3 histone modification marks and transcribed regions (type 2 insulator sites), 383 (7%) overlapped with 74 of 17,126 5-kb long boundary sites, and 1,901 (37%) overlapped with 306 of 17,126 25-kb long boundary sites. Although CTCF-binding sites separate active and repressive domains, the limited number of DNA-binding sites of insulator-associated proteins found at type 1 and 2 insulator sites overlapped boundary sites identified by chromatin interaction data.
Furthermore, by analyzing the regulatory regions of genes, the DNA-binding sites of the 97 insulator-associated DNA-binding proteins were found (1) at the type 1 insulator sites (based on H3K27me3 marks) in the regulatory regions of 3,170 genes, (2) at the type 2 insulator sites (based on H3K27me3 marks and gene expression levels) in the regulatory regions of 1,044 genes, and (3) at insulator sites as boundary sites identified by chromatin interaction data in the regulatory regions of 6,275 genes. The boundary sites showed the highest number of overlaps with the DNA-binding sites. Comparing the insulator sites identified by (1) and (3), 1,212 (38%) genes have both types of insulator sites. Comparing the insulator sites between (2) and (3), 389 (37%) genes have both types of insulator sites. From the comparison of insulator and boundary sites, we found that (1) or (2) types of insulator sites overlapped or were close to boundary sites identified by chromatin interaction data.
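
      The overlap percentages quoted in this response can be rechecked from the stated counts. The snippet below is a small consistency check, not part of the author's pipeline; the dictionary labels are shorthand introduced here, and rounding to the nearest whole percent is an assumption about how the figures were reported.

```python
# (overlapping items, total items) pairs copied from the response text above.
overlaps = {
    "type 1 binding sites at 5-kb boundaries": (1315, 20837),
    "type 1 binding sites at 25-kb boundaries": (6137, 20837),
    "type 2 binding sites at 5-kb boundaries": (383, 5205),
    "type 2 binding sites at 25-kb boundaries": (1901, 5205),
    "genes with type 1 and boundary sites": (1212, 3170),
    "genes with type 2 and boundary sites": (389, 1044),
}


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


percentages = {label: percent(*pair) for label, pair in overlaps.items()}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```

      With this rounding, the recomputed values match the percentages given in the text (6%, 29%, 7%, 37%, 38%, and 37%).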

      1. The suggested alternative transcripts function, also highlighted in the manuscript's abstract, is only supported by visual inspection of a few cases for several putative DBPs. I believe this is insufficient to support what looks like one of the major claims of the paper when reading the abstract, and a more quantitative and genome-wide analysis must be adopted, although the authors mention it as just an 'observation'.

      Response: According to the reviewer's comment, I performed the genome-wide analysis of alternative transcripts where the DNA-binding sites of insulator-associated proteins are located near splicing sites. The DNA-binding sites of insulator-associated DNA-binding proteins were found within 200 bp centered on splice sites more significantly than the other DNA-binding proteins (Fig. 4e and Table 2). I have added the following sentences on lines 397 - 404: We performed the statistical test to estimate the enrichment of insulator-associated DNA-binding sites compared to the other DNA-binding proteins, and found that the insulator-associated DNA-binding sites were significantly more abundant at splice sites than the DNA-binding sites of the other proteins (Fig 4e and Table 2; Mann‒Whitney U test, p value

      5. Figure 1 serves no purpose in my opinion and can be removed, while figures can generally be improved (e.g., the browser screenshots in Figs 4 and 5) for interpretability from readers outside the immediate research field.

      Response: I believe that the Figure 1 would help researchers in other fields who are not familiar with biological phenomena and functions to understand the study. More explanation has been included in the Figures and legends of Figs. 4 and 5 to help readers outside the immediate research field understand the figures.

      1. Similarly, the text is rather convoluted at places and should be re-approached with more clarity for less specialized readers in mind.

      Response: Reviewer #2's comments would be related to this comment. I have introduced a more detailed explanation of the method in the Results section, as shown in the responses to Reviewer #2's comments.

      Referee #2

      1. Introduction, line 95: CTCF appears two times, it seems redundant.

      Response: On lines 91-93, I deleted the latter CTCF from the sentence "and examined the directional bias of DNA-binding sites of CTCF and insulator-associated DBPs, including those of known DBPs such as RAD21 and SMC3".

      1. Introduction, lines 99-103: Please stress better the novelty of the work. What is the main focus? The new identified DPBs or their binding sites? What are the "novel structural and functional roles of DBPs" mentioned?

      Response: Although CTCF is known to be the main insulator protein in vertebrates, we found that 97 DNA-binding proteins including CTCF and cohesin are associated with insulator sites by modifying and developing a machine learning method to search for insulator-associated DNA-binding proteins. Most of the insulator-associated DNA-binding proteins showed the directional bias of DNA-binding motifs, suggesting that the directional bias is associated with the insulator.

      I have added the sentence in lines 96-99 as follows: Furthermore, statistical testing the contribution scores between the directional and non-directional DNA-binding sites of insulator-associated DBPs revealed that the directional sites contributed more significantly to the prediction of gene expression levels than the non-directional sites. I have revised the statement in lines 101-110 as follows: To validate these findings, we demonstrate that the DNA-binding sites of the identified insulator-associated DBPs are located within potential insulator sites, and some of the DNA-binding sites in the insulator site are found without the nearby DNA-binding sites of CTCF and cohesin. Homologous and heterologous insulator-insulator pairing interactions are orientation-dependent, as suggested by the insulator-pairing model based on experimental analysis in flies. Our method and analyses contribute to the identification of insulator- and chromatin-associated DNA-binding sites that influence EPIs and reveal novel functional roles and molecular mechanisms of DBPs associated with transcriptional condensation, phase separation and transcriptional regulation.

      1. Results, line 111: How do the SNPs come into the procedure? From the figures it seems the input is ChIP-seq peaks of DNBPs around the TSS.

      Response: On lines 121-124, to explain the procedure for the SNP of an eQTL, I have added the sentence in the Methods: "If a DNA-binding site was located within a 100-bp region around a single-nucleotide polymorphism (SNP) of an eQTL, we assumed that the DNA-binding proteins regulated the expression of the transcript corresponding to the eQTL".
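
      The assignment rule quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "a 100-bp region around a SNP" means a window of 50 bp on each side of the SNP, uses closed coordinates, and relies on hypothetical names (`assign_sites_to_eqtl`, `transcript_A`, `transcript_B`) and toy positions.

```python
def assign_sites_to_eqtl(binding_sites, eqtls, window=100):
    """Link DNA-binding proteins to transcripts through eQTL SNPs.

    binding_sites: list of (chrom, start, end, protein), closed coordinates.
    eqtls: list of (chrom, snp_position, transcript_id).
    A site is linked when it overlaps the window centred on the SNP
    (assumption: window // 2 bp on each side).
    """
    half = window // 2
    links = []
    for chrom, start, end, protein in binding_sites:
        for e_chrom, snp_pos, transcript in eqtls:
            if chrom != e_chrom:
                continue
            # Overlap between [start, end] and [snp_pos - half, snp_pos + half].
            if start <= snp_pos + half and end >= snp_pos - half:
                links.append((protein, transcript))
    return links


# Toy example with hypothetical positions and transcript identifiers.
toy_binding_sites = [("chr1", 1_000, 1_020, "CTCF"), ("chr1", 5_000, 5_020, "FOS")]
toy_eqtls = [("chr1", 1_040, "transcript_A"), ("chr1", 9_000, "transcript_B")]
toy_links = assign_sites_to_eqtl(toy_binding_sites, toy_eqtls)
```

      In this toy input only the CTCF site lies within the window of an eQTL SNP, so it is the only site linked to a transcript.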

      1. Again, are those SNPs coming from the different cell lines? Or are they from individuals w.r.t some reference genome? I suggest a general restructuring of this part to let the reader understand more easily. One option could be simplifying the details here or alternatively including all the necessary details.

      Response: On line 119, I have included the explanation of the eQTL dataset of GTEx v8 as follows: " The eQTL data were derived from the GTEx v8 dataset, after quality control, consisting of 838 donors and 17,382 samples from 52 tissues and two cell lines". On lines 681 and 865, I have added the filename of the eQTL data "(GTEx_Analysis_v8_eQTL.tar)".

      1. Figure 1: panel a and b are misleading. Is the matrix in panel a equivalent to the matrix in panel b? If not please clarify why. Maybe in b it is included the info about the SNPs? And if yes, again, what is then difference with a.

      Response: The reviewer appears to be referring to Figure 2, not Figure 1. If so, the matrices in panels a and b in Figure 2 are equivalent. I have indicated this in the figure: the same matrix as in panel a is shown rotated 90 degrees to the right. The green boxes in the matrix show the regions with the ChIP-seq peak of a DNA-binding protein overlapping with a SNP of an eQTL. I used eQTL data to associate a gene with a ChIP-seq peak that was more than 2 kb upstream and 1 kb downstream of a transcriptional start site of a gene. For each gene, the matrix was produced and the gene expression levels in cells were learned and predicted using the deep learning method. I have added the following sentences to explain the method in lines 133 - 139: Through the training, the tool learned to select the binding sites of DNA-binding proteins from ChIP-seq assays that were suitable for predicting gene expression levels in the cell types. The binding sites of a DNA-binding protein tend to be observed in common across multiple cell and tissue types. Therefore, ChIP-seq data and eQTL data in different cell and tissue types were used as input data for learning, and then the tool selected the data suitable for predicting gene expression levels in the cell types, even if the data were not obtained from the same cell types.

      1. Line 386-388: could the author investigate in more detail this observation? Does it mean that loops driven by other DBPs independent of the known CTCF/Cohesin? Could the author provide examples of chromatin structural data e.g. MicroC?

      Response: As suggested by the reviewer, to help readers understand the observation, I have added Supplementary Fig. S4c to show the distribution of DNA-binding sites of "CTCF, RAD21, and SMC3" and "BACH2, FOS, ATF3, NFE2, and MAFK" around chromatin interaction sites. I have modified the following sentence to indicate the figure on line 493: Although a DNA-binding-site distribution pattern around chromatin interaction sites similar to those of CTCF, RAD21, and SMC3 was observed for DBPs such as BACH2, FOS, ATF3, NFE2, and MAFK, less than 1% of the DNA-binding sites of the latter set of DBPs colocalized with CTCF, RAD21, or SMC3 in a single bin (Fig. S4c).

      In Aljahani A et al. Nature Communications 2022, we find that depletion of cohesin causes a subtle reduction in longer-range enhancer-promoter interactions and that CTCF depletion can cause rewiring of regulatory contacts. Together, our data show that loop extrusion is not essential for enhancer-promoter interactions, but contributes to their robustness and specificity and to precise regulation of gene expression. Goel VY et al. Nature Genetics 2023 mentioned in the abstract: Microcompartments frequently connect enhancers and promoters and though loss of loop extrusion and inhibition of transcription disrupts some microcompartments, most are largely unaffected. These results suggested that chromatin loops can be driven by other DBPs independent of the known CTCF/Cohesin.

      FOXA1 pioneer factor functions as an initial chromatin-binding and chromatin-remodeling factor and has been reported to form biomolecular condensates (Ji D et al. Molecular Cell 2024). CTCF has also been found to form transcriptional condensates and to undergo phase separation (Lee R et al. Nucleic Acids Research 2022). FOS was found to be an insulator-associated DNA-binding protein in this study and is potentially involved in chromatin remodeling, transcription condensation, and phase separation with the other factors such as BACH2, ATF3, NFE2 and MAFK. I have added the following sentence on line 548: FOXA1 pioneer factor functions as an initial chromatin-binding and chromatin-remodeling factor and has been reported to form biomolecular condensates.

      1. In general, how the presented results are related to some models of chromatin architecture, e.g. loop extrusion, in which it is integrated convergent CTCF binding sites?

      Response: Goel VY et al. Nature Genetics 2023 identified highly nested and focal interactions through region capture Micro-C, which resemble fine-scale compartmental interactions and are termed microcompartments. In the section titled "Most microcompartments are robust to loss of loop extrusion," the researchers noted that a small proportion of interactions between CTCF and cohesin-bound sites exhibited significant reductions in strength when cohesin was depleted. In contrast, the majority of microcompartmental interactions remained largely unchanged under cohesin depletion. Our findings indicate that most P-P and E-P interactions, aside from a few CTCF and cohesin-bound enhancers and promoters, are likely facilitated by a compartmentalization mechanism that differs from loop extrusion. We suggest that nested, multiway, and focal microcompartments correspond to small, discrete A-compartments that arise through a compartmentalization process, potentially influenced by factors upstream of RNA Pol II initiation, such as transcription factors, co-factors, or active chromatin states. It follows that if active chromatin regions at microcompartment anchors exhibit selective "stickiness" with one another, they will tend to co-segregate, leading to the development of nested, focal interactions. This microphase separation, driven by preferential interactions among active loci within a block copolymer, may account for the striking interaction patterns we observe.

      The authors of the paper proposed several mechanisms potentially involved in microcompartments. These mechanisms may be involved in looping with insulator function. Another group reported that enhancer-promoter interactions and transcription are largely maintained upon depletion of CTCF, cohesin, WAPL or YY1. Instead, cohesin depletion decreased transcription factor binding to chromatin. Thus, cohesin may allow transcription factors to find and bind their targets more efficiently (Hsieh TS et al. Nature Genetics 2022). Among the identified insulator-associated DNA-binding proteins, Maz and MyoD1 form loops without CTCF (Xiao T et al. Proc Natl Acad Sci USA 2021 ; Ortabozkoyun H et al. Nature genetics 2022 ; Wang R et al. Nature communications 2022). I have added the following sentences on lines 563-567: Another group reported that enhancer-promoter interactions and transcription are largely maintained upon depletion of CTCF, cohesin, WAPL or YY1. Instead, cohesin depletion decreased transcription factor binding to chromatin. Thus, cohesin may allow transcription factors to find and bind their targets more efficiently. I have included the following explanation on lines 574-576: Maz and MyoD1 among the identified insulator-associated DNA-binding proteins form loops without CTCF.

      As for the directionality of CTCF, if chromatin loop anchors have some structural conformation, as shown in the paper entitled "The structural basis for cohesin-CTCF-anchored loops" (Li Y et al. Nature 2020), directional DNA binding would occur similarly to CTCF binding sites. Moreover, cohesin complexes that interact with convergent CTCF sites, that is, the N-terminus of CTCF, might be protected from WAPL, but those that interact with divergent CTCF sites, that is, the C-terminus of CTCF, might not be protected from WAPL, which could release cohesin from chromatin and thus disrupt cohesin-mediated chromatin loops (Davidson IF et al. Nature Reviews Molecular Cell Biology 2021). Regarding loop extrusion, the 'loop extrusion' hypothesis is motivated by in vitro observations. The experiment in yeast, in which cohesin variants that are unable to extrude DNA loops but retain the ability to topologically entrap DNA, suggested that in vivo chromatin loops are formed independently of loop extrusion. Instead, transcription promotes loop formation and acts as an extrinsic motor that extends these loops and defines their final positions (Guerin TM et al. EMBO Journal 2024). I have added the following sentences on lines 535-539: Cohesin complexes that interact with convergent CTCF sites, that is, the N-terminus of CTCF, might be protected from WAPL, but those that interact with divergent CTCF sites, that is, the C-terminus of CTCF, might not be protected from WAPL, which could release cohesin from chromatin and thus disrupt cohesin-mediated chromatin loops. I have included the following sentences on lines 569-574: The 'loop extrusion' hypothesis is motivated by in vitro observations. The experiment in yeast, in which cohesin variants that are unable to extrude DNA loops but retain the ability to topologically entrap DNA, suggested that in vivo chromatin loops are formed independently of loop extrusion. 
Instead, transcription promotes loop formation and acts as an extrinsic motor that extends these loops and defines their final positions.

      Another model for the regulation of gene expression by insulators is the boundary-pairing (insulator-pairing) model (Bing X et al. Elife 2024; Ke W et al. Elife 2024; Fujioka M et al. PLoS Genetics 2016). Molecules bound to insulators physically pair with their partners, either head-to-head or head-to-tail, with different degrees of specificity at the termini of TADs in flies. Although the experiments do not reveal how partners find each other, the mechanism is unlikely to require loop extrusion. Homologous and heterologous insulator-insulator pairing interactions are central to the architectural functions of insulators. The manner of insulator-insulator interactions is orientation-dependent. I have summarized the model on lines 551-559: Other types of chromatin regulation are also expected to be related to the structural interactions of molecules. In the boundary-pairing (insulator-pairing) model, molecules bound to insulators physically pair with their partners, either head-to-head or head-to-tail, with different degrees of specificity at the termini of TADs in flies (Fig. 7). Although the experiments do not reveal how partners find each other, the mechanism is unlikely to require loop extrusion. Homologous and heterologous insulator-insulator pairing interactions are central to the architectural functions of insulators. The manner of insulator-insulator interactions is orientation-dependent.

      1. Do the authors think that the identified DBPs could work in that way as well?

      Response: The boundary-pairing (insulator-pairing) model would be applied to the insulator-associated DNA-binding proteins other than CTCF and cohesin that are involved in the loop extrusion mechanism (Bing X et al. Elife 2024) (Ke W et al. Elife 2024) (Fujioka M et al. PLoS Genetics 2016).

      Liquid-liquid phase separation was shown to occur through CTCF-mediated chromatin loops and to act as an insulator (Lee, R et al. Nucleic Acids Research 2022). Among the identified insulator-associated DNA-binding proteins, CEBPA has been found to form hubs that colocalize with transcriptional co-activators in a native cell context, which is associated with transcriptional condensate and phase separation (Christou-Kent M et al. Cell Reports 2023). The proposed microcompartment mechanisms are also associated with phase separation. Thus, the same or similar mechanisms are potentially associated with the insulator function of the identified DNA-binding proteins. I have included the following information on line 546: CEBPA in the identified insulator-associated DNA-binding proteins was also reported to be involved in transcriptional condensates and phase separation.

      1. Also, can the authors comment about the mechanisms those newly identified DBPs mediate contacts by active processes or equilibrium processes?

      Response: Snead WT et al. Molecular Cell 2019 mentioned that protein post-translational modifications (PTMs) facilitate the control of molecular valency and strength of protein-protein interactions. O-GlcNAcylation as a PTM inhibits CTCF binding to chromatin (Tang X et al. Nature Communications 2024). I found that the identified insulator-associated DNA-binding proteins tend to form a cluster at potential insulator sites (Supplementary Fig. 2d). These proteins may interact and actively regulate chromatin interactions, transcriptional condensation, and phase separation by PTMs. I have added the following explanation on lines 576-582: Furthermore, protein post-translational modifications (PTMs) facilitate control over the molecular valency and strength of protein-protein interactions. O-GlcNAcylation as a PTM inhibits CTCF binding to chromatin. We found that the identified insulator-associated DNA-binding proteins tend to form a cluster at potential insulator sites (Fig. 4f and Supplementary Fig. 3c). These proteins may interact and actively regulate chromatin interactions, transcriptional condensation, and phase separation through PTMs.

      1. Can the author provide some real examples along with published structural data (e.g. the mentioned micro-C data) to show the link between protein co-presence, directional bias and contact formation?

      Response: Structural molecular model of cohesin-CTCF-anchored loops has been published by Li Y et al. Nature 2020. The structural conformation of CTCF and cohesin in the loops would be the cause of the directional bias of CTCF binding sites, which I mentioned in lines 531 - 535 as follows: These results suggest that the directional bias of DNA-binding sites of insulator-associated DBPs may be involved in insulator function and chromatin regulation through structural interactions among DBPs, other proteins, DNAs, and RNAs. For example, the N-terminal amino acids of CTCF have been shown to interact with RAD21 in chromatin loops. To investigate the principles underlying the architectural functions of insulator-insulator pairing interactions, two insulators, Homie and Nhomie, flanking the Drosophila even skipped locus were analyzed. Pairing interactions between the transgene Homie and the eve locus are directional. The head-to-head pairing between the transgene and endogenous Homie matches the pattern of activation (Fujioka M et al. PLoS Genetics 2016).

      Referee #3

      1. Some of these TFs do not have specific direct binding to DNA (P300, Cohesin). Since the authors are using binding motifs in their analysis workflow, I would remove those from the analysis.

      Response: When a protein complex binds to DNA, one protein of the complex binds to the DNA directly, and the other proteins may not bind to DNA. However, the DNA motif sequence bound by the protein may be registered as the DNA-binding motif of all the proteins in the complex. The molecular structure of the complex of CTCF and Cohesin showed that both CTCF and Cohesin bind to DNA (Li Y et al. Nature 2020). I think that, as the molecular structures of more protein complexes become available, the current understanding of whether a given protein binds DNA may change. Therefore, I searched the Pfam database for 99 insulator-associated DNA-binding proteins identified in this study. I found that 97 are registered as DNA-binding proteins and/or have a known DNA-binding domain, and EP300 and SIN3A do not directly bind to DNA, which I also confirmed with a Google search. I have added the following explanation in line 249 to indicate direct and indirect DNA-binding proteins: Among 99 insulator-associated DBPs, EP300 and SIN3A do not directly interact with DNA, and thus 97 insulator-associated DBPs directly bind to DNA. I have updated the sentence in line 20 of the Abstract as follows: We discovered 97 directional and minor nondirectional motifs in human fibroblast cells that corresponded to 23 DBPs related to insulator function, CTCF, and/or other types of chromosomal transcriptional regulation reported in previous studies.

      1. I am not sure if I understood correctly, by why do the authors consider enhancers spanning 2Mb (200 bins of 10Kb around eSNPs)? This seems wrong. Enhancers are relatively small regions (100bp to 1Kb) and only a very small subset form super enhancers.

      Response: As the reviewer mentioned, I recognize enhancers are relatively small regions. In the paper, I intended to examine further upstream and downstream of promoter regions where enhancers are found. Therefore, I have modified the sentence in lines 917 - 919 of the Fig. 2 legend as follows: Enhancer-gene regulatory interaction regions consist of 200 bins of 10 kbp between -1 Mbp and 1 Mbp region from TSS, not including promoter.
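
      The binning described in the revised legend (200 bins of 10 kbp between -1 Mbp and 1 Mbp from the TSS) can be sketched as below. This is a minimal illustration under stated assumptions: the names `BIN_SIZE`, `FLANK`, `N_BINS` and `bin_index` are hypothetical, gene strand is ignored, and the exclusion of the promoter region mentioned in the legend is not implemented.

```python
BIN_SIZE = 10_000          # 10 kbp per bin
FLANK = 1_000_000          # 1 Mbp on each side of the TSS
N_BINS = 2 * FLANK // BIN_SIZE  # 200 bins in total


def bin_index(position, tss):
    """Return the 0-based bin of `position` relative to `tss`, or None if outside.

    Bin 0 starts at TSS - 1 Mbp and bin 199 ends just before TSS + 1 Mbp.
    """
    offset = position - tss
    if not -FLANK <= offset < FLANK:
        return None
    return (offset + FLANK) // BIN_SIZE


# Toy TSS coordinate (hypothetical).
toy_tss = 5_000_000
toy_bins = [
    bin_index(toy_tss - 1_000_000, toy_tss),  # first bin
    bin_index(toy_tss - 1, toy_tss),          # last bin upstream of the TSS
    bin_index(toy_tss, toy_tss),              # first bin downstream of the TSS
    bin_index(toy_tss + 999_999, toy_tss),    # last bin
    bin_index(toy_tss + 1_000_000, toy_tss),  # outside the 2-Mbp window
]
```

      Each bin covers a distal region that may contain enhancers; the bin itself is not meant to represent a single enhancer, which is consistent with the authors' clarification.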

      1. I think the H3K27me3 analysis was very good, but I would have liked to see also constitutive heterochromatin as well, so maybe repeat the analysis for H3K9me3.

      Response: Following the reviewer's advice, I have added the ChIP-seq data of H3K9me3 as a track of the UCSC Genome Browser. The distribution of H3K9me3 signal was different from that of H3K27me3 in some regions. I also found the insulator-associated DNA-binding sites close to the edges of H3K9me3 regions and took some screenshots of the UCSC Genome Browser of the regions around the sites in Supplementary Fig. 3b. I have modified the following sentence on lines 962 - 964 in the legend of Fig. 4: a Distribution of histone modification marks H3K27me3 (green color) and H3K9me3 (turquoise color) and transcript levels (pink color) in upstream and downstream regions of a potential insulator site (light orange color). I have also added the following result on lines 348 - 352: The same analysis was performed using H3K9me3 marks, instead of H3K27me3 (Fig. S3b). We found that the distribution of H3K9me3 signal was different from that of H3K27me3 in some regions, and discovered the insulator-associated DNA-binding sites close to the edges of H3K9me3 regions (Fig. S3b).

      1. I was not sure I understood the analysis in Figure 6. The binding site is with 500bp of the interaction site, but micro-C interactions are at best at 1Kb resolution. They say they chose the centre of the interaction site, but we don't know exactly where there is the actual interaction. Also, it is not clear what they measure. Is it the number of binding sites of a specific or multiple DBP insulator proteins at a specific distance from this midpoint that they recover in all chromatin loops? Maybe I am missing something. This analysis was not very clear.

      Response: The resolution of the Micro-C assay is considered to be 100 bp and above, as the human nucleosome core particle contains 145 bp (and 193 bp with linker) of DNA. However, internucleosomal DNA is cleaved by endonuclease into fragments of multiples of 10 nucleotides (Pospelov VA et al. Nucleic Acids Research 1979). Highly nested focal interactions were observed (Goel VY et al. Nature Genetics 2023). Base pair resolution was reported using Micro Capture-C (Hua P et al. Nature 2021). Sub-kilobase (20 bp resolution) chromatin topology was reported using an MNase-based chromosome conformation capture (3C) approach (Aljahani A et al. Nature Communications 2022). On the other hand, Hi-C data were analyzed at 1 kb resolution (Gu H et al. bioRxiv 2021). If the resolution of Micro-C interactions is at best at 1 kb, the binding sites of a DNA-binding protein will not show a peak around the center of the genomic locations of interaction edges. Each panel shows the number of binding sites of a specific DNA-binding protein at a specific distance from the midpoint of all chromatin interaction edges. I have modified and added the following sentences in lines 585-589: High-resolution chromatin interaction data from a Micro-C assay indicated that most of the predicted insulator-associated DBPs showed DNA-binding-site distribution peaks around chromatin interaction sites, suggesting that these DBPs are involved in chromatin interactions and that the chromatin interaction data has a high degree of resolution. Base pair resolution was reported using Micro Capture-C.
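
      The measurement described in this response (the number of binding sites of one DNA-binding protein at each distance from the midpoint of chromatin interaction edges) can be sketched as a simple distance histogram. The sketch is an assumption-laden illustration: the function name `distance_profile`, the 100-bp bin width, and the toy coordinates are hypothetical, and the 500-bp limit follows the distance mentioned in the reviewer's comment.

```python
from collections import Counter


def distance_profile(binding_sites, interaction_midpoints, max_dist=500, bin_width=100):
    """Count binding sites per distance bin around interaction midpoints.

    binding_sites: list of (chrom, site_center) for a single DNA-binding protein.
    interaction_midpoints: list of (chrom, midpoint) of interaction edges.
    Returns {bin_start: count}, where bin_start is the signed distance
    (site_center - midpoint) floored to a multiple of bin_width.
    """
    profile = Counter()
    for chrom, center in binding_sites:
        for m_chrom, midpoint in interaction_midpoints:
            if chrom != m_chrom:
                continue
            dist = center - midpoint
            if -max_dist <= dist <= max_dist:
                profile[(dist // bin_width) * bin_width] += 1
    return dict(sorted(profile.items()))


# Toy data (hypothetical): three sites near one midpoint, one far away.
toy_sites = [("chr1", 1_010), ("chr1", 970), ("chr1", 1_450), ("chr1", 3_000)]
toy_midpoints = [("chr1", 1_000)]
toy_profile = distance_profile(toy_sites, toy_midpoints)
```

      A peak of counts in the bins closest to zero is the pattern the response interprets as evidence that a protein's binding sites concentrate around interaction sites.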

      Minor comments:

      1. PIQ does not consider TF concentration. Other methods do that and show that TF concentration improves predictions (e.g., https://www.biorxiv.org/content/10.1101/2023.07.15.549134v2 or https://pubmed.ncbi.nlm.nih.gov/37486787/). The authors should discuss how that would impact their results.

      Response: The directional bias of CTCF binding sites was identified by ChIA-PET interactions of CTCF binding sites. The analysis of the contribution scores of DNA-binding sites of proteins considering the binding sites of CTCF as an insulator showed the same tendency of directional bias of CTCF binding sites. In the analysis, to remove the false-positive prediction of DNA-binding sites, I used the binding sites that overlapped with a ChIP-seq peak of the DNA-binding protein. This result suggests that the DNA-binding sites of CTCF obtained by the current analysis have sufficient quality. Therefore, if the accuracy of prediction of DNA-binding sites is improved, although the number of DNA-binding sites may be different, the overall tendency of the directionality of DNA-binding sites will not change and the results of this study will not change significantly.

      As for the first reference in the reviewer's comment, chromatin interaction data from Micro-C assay does not include all chromatin interactions in a cell or tissue, because it is expensive to cover all interactions. Therefore, it would be difficult to predict all chromatin interactions based on machine learning. As for the second reference in the reviewer's comment, pioneer factors such as FOXA are known to bind to closed chromatin regions, but transcription factors and DNA-binding proteins involved in chromatin interactions and insulators generally bind to open chromatin regions. The search for the DNA-binding motifs is not required in closed chromatin regions.

      1. DeepLIFT is a good approach to interpret complex structures of CNN, but is not truly explainable AI. I think the authors should acknowledge this.

      Response: In the DeepLIFT paper, the authors explain that DeepLIFT is a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons in the network to every feature of the input (Shrikumar A et al. ICML 2017). DeepLIFT compares the activation of each neuron to its 'reference activation' and assigns contribution scores according to the difference. DeepLIFT calculates a metric to measure the difference between an input and the reference of the input.

      Truly explainable AI would be able to find cause and reason, and to make choices and decisions like humans. DeepLIFT does not perform causal inferences. I did not use the term "Explainable AI" in our manuscript, but I briefly explained it in Discussion. I have added the following explanation in lines 615-620: AI (Artificial Intelligence) is considered as a black box, since the reason and cause of prediction are difficult to know. To solve this issue, tools and methods have been developed to know the reason and cause. These technologies are called Explainable AI. DeepLIFT is considered to be a tool for Explainable AI. However, DeepLIFT does not answer the reason and cause for a prediction. It calculates scores representing the contribution of the input data to the prediction.

      Furthermore, to improve the readability of the manuscript, I have included the following explanation in lines 159-165: we computed DeepLIFT scores of the input data (i.e., each binding site of the ChIP-seq data of DNA-binding proteins) in the deep learning analysis on gene expression levels. DeepLIFT compares the importance of each input for predicting gene expression levels to its 'reference or background level' and assigns contribution scores according to the difference. DeepLIFT calculates a metric to measure the difference between an input and the reference of the input.
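
      The idea of assigning contribution scores from the difference between an input and its reference can be illustrated with a toy linear model, for which the contribution of each input reduces to weight × (input − reference) and the scores sum exactly to the change in the output. This is a sketch under that simplifying assumption, not the DeepLIFT implementation used in the study: real DeepLIFT additionally handles nonlinear layers through dedicated rules, and all names and values below are hypothetical.

```python
def linear_model(weights, bias, inputs):
    """Toy predictor: a weighted sum of the inputs plus a bias."""
    return bias + sum(w * x for w, x in zip(weights, inputs))


def contribution_scores(weights, inputs, reference):
    """Contribution of each input relative to its reference value.

    For a purely linear model, each score is weight * (input - reference).
    """
    return [w * (x - r) for w, x, r in zip(weights, inputs, reference)]


# Hypothetical weights, inputs (e.g. binding-site signals) and reference levels.
toy_weights = [2.0, -1.0, 0.5]
toy_bias = 1.0
toy_inputs = [1.0, 3.0, 4.0]
toy_reference = [0.0, 0.0, 4.0]

toy_scores = contribution_scores(toy_weights, toy_inputs, toy_reference)
toy_delta = (linear_model(toy_weights, toy_bias, toy_inputs)
             - linear_model(toy_weights, toy_bias, toy_reference))
```

      The third input equals its reference, so it receives a score of zero; the scores describe how much each input moves the prediction away from the reference output, which is a contribution measure and not a causal explanation.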

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Authors has provided a mechanism by which how presence of truncated P53 can inactivate function of full length P53 protein. Authors proposed this happens by sequestration of full length P53 by truncated P53.

      In the study, performed experiments are well described.

      My area of expertise is molecular biology/gene expression, and I have tried to provide suggestions on my area of expertise. The study has been done mainly with overexpression system and I have included few comments which I can think can be helpful to understand effect of truncated P53 on endogenous wild type full length protein. Performing experiments on these lines will add value to the observation according to this reviewer.

      Major comments:

      1. What happens to endogenous wild type full length P53 in the context of mutant/truncated isoforms, that is not clear. Using a P53 antibody which can detect endogenous wild type P53, can authors check if endogenous full length P53 protein is also aggregated as well? It is hard to differentiate if aggregation of full length P53 happens only in overexpression scenario, where lot more both of such proteins are expressed. In normal physiological condition P53 expression is usually low, tightly controlled and its expression get induced in altered cellular condition such as during DNA damage. So, it is important to understand the physiological relevance of such aggregation, which could be possible if authors could investigate effect on endogenous full length P53 following overexpression of mutant isoforms.

      Response: Thank you very much for your insightful comments. 1) To address "what happens to endogenous wild-type full-length P53 in the context of mutant/truncated isoforms," we employed a human A549 cell line expressing endogenous wild-type p53 under DNA damage conditions such as an etoposide treatment [1]. We chose the A549 cell line because, like H1299, it is a lung cancer cell line (www.atcc.org). For comparison, we also transfected the cells with 2 μg of V5-tagged plasmids encoding FLp53 and its isoforms Δ133p53 and Δ160p53. As shown in Figure R1A, lanes 1 and 2, endogenous p53 expression remained undetectable in A549 cells despite etoposide treatment, which limits our ability to assess the effects of the isoforms on the endogenous wild-type FLp53. We could, however, detect the V5-tagged FLp53 expressed from the plasmid using anti-V5 (rabbit) as well as with anti-DO-1 (mouse) antibody (Figure R1). The latter detects both endogenous wild-type p53 and the V5-tagged FLp53 since the antibody epitope is within the N-terminus (aa 20-25). This result supports the reviewer's comment regarding the low level of expression of endogenous p53, which is insufficient for detection in our experiments. (Figure R1 is included in the file "RC-2024-02608 Figures of Response to Reviewer".)

      In summary, in line with the reviewer's comment that 'under normal physiological conditions p53 expression is usually low,' we could not detect p53 with an anti-DO-1 antibody. Thus, we proceeded with V5/FLAG-tagged p53 for detection of the effects of the isoforms on p53 stability and function. We also found that protein expression in H1299 cells was more easily detectable than in A549 cells (Compare Figures R1A and B). Thus, we decided to continue with the H1299 cells (p53-null), which would serve as a more suitable model system for this study.

      2) We agree with the reviewer that 'It is hard to differentiate if aggregation of full-length p53 happens only in overexpression scenario'. However, it is not impossible to imagine that such aggregation of FLp53 happens under conditions when p53 and its isoforms are over-expressed in the cell. Although the exact physiological context is not known and beyond the scope of the current work, our results indicate that at higher expression, p53 isoforms drive aggregation of FLp53. Given the challenges of detecting endogenous FLp53, we had to rely on the results obtained with plasmid mediated expression of p53 and its isoforms in p53-null cells.

      Can presence of mutant P53 isoforms can cause functional impairment of wild type full length endogenous P53? That could be tested as well using similar ChIP assay authors has performed, but instead of antibody against the Tagged protein if the authors could check endogenous P53 enrichment in the gene promoter such as P21 following overexpression of mutant isoforms. May be introducing a condition such as DNA damage in such experiment might help where endogenous P53 is induced and more prone to bind to P53 target such as P21.

      Response: Thank you very much for your valuable comments and suggestions. To investigate the potential functional impairment of endogenous wild-type p53 by p53 isoforms, we initially utilized A549 cells (p53 wild-type), aiming to monitor endogenous wild-type p53 expression following DNA damage. However, as mentioned and demonstrated in Figure R1, endogenous p53 expression was too low to be detected under these conditions, making the ChIP assay for analyzing endogenous p53 activity unfeasible. Thus, we decided to utilize plasmid-based expression of FLp53 and focus on the potential functional impairment induced by the isoforms.

      3. On similar lines, authors described:

      "To test this hypothesis, we escalated the ratio of FLp53 to isoforms to 1:10. As expected, the activity of all four promoters decreased significantly at this ratio (Figure 4A-D). Notably, Δ160p53 showed a more potent inhibitory effect than Δ133p53 at the 1:5 ratio on all promoters except for the p21 promoter, where their impacts were similar (Figure 4E-H). However, at the 1:10 ratio, Δ133p53 and Δ160p53 had similar effects on all transactivation except for the MDM2 promoter (Figure 4E-H)."

      Again, in such assay authors used ratio 1:5 to 1:10 full length vs mutant. How authors justify this result in context (which is more relevant context) where one allele is Wild type (functional P53) and another allele is mutated (truncated, can induce aggregation). In this case one would except 1:1 ratio of full-length vs mutant protein, unless other regulation is going which induces expression of mutant isoforms more than wild type full length protein. Probably discussing on these lines might provide more physiological relevance to the observed data.

      Response: Thank you for raising this point regarding the physiological relevance of the ratios used in our study. 1) In the revised manuscript (lines 193-195), we added the following: "The elevated Δ133p53 protein modulates p53 target genes such as miR34a and p21, facilitating cancer development2, 3. To mimic conditions where isoforms are upregulated relative to FLp53, we increased the ratios to 1:5 and 1:10." This approach aims to simulate scenarios where isoforms accumulate at higher levels than FLp53, which may be relevant in specific contexts, as also elaborated above.

      2) Regarding the issue of protein expression, where one allele is wild-type and the other is an isoform, this assumption is not valid in most contexts. First, human cells have two copies of the TP53 gene (one from each parent). Second, the TP53 gene has two distinct promoters: the proximal promoter (P1) primarily regulates FLp53 and ∆40p53, whereas the second promoter (P2) regulates ∆133p53 and ∆160p534, 5. Additionally, ∆133TP53 is a p53 target gene6, 7 and the expression of Δ133p53 and FLp53 is dynamic in response to various stimuli. Third, the expression of p53 isoforms is regulated at multiple levels, including transcriptional, post-transcriptional, translational, and post-translational processing8. Moreover, different degradation mechanisms modify the protein level of p53 isoforms and FLp538. These regulatory mechanisms respond to various stimuli, and therefore, the 1:1 ratio of FLp53 to ∆133p53 or ∆160p53 may be valid only under certain physiological conditions. In line with this, varied expression levels of FLp53 and its isoforms, including ∆133p53 and ∆160p53, have been reported in several studies3, 4, 9, 10.

      3) In our study, using the pcDNA 3.1 vector under the human cytomegalovirus (CMV) promoter, we observed moderately higher expression levels of ∆133p53 and ∆160p53 relative to FLp53 (Figure R1B). This overexpression scenario provides a model for studying conditions where isoform accumulation might surpass physiological levels, impacting FLp53 function. By employing elevated ratios of these isoforms to FLp53, we aim to investigate the potential effects of isoform accumulation on FLp53.

      4. Finally does this altered function of full length P53 (preferably endogenous one) in presence of truncated P53 has any phenotypic consequence on the cells (if authors choose a cell type which is having wild type functional P53). Doing assay such as apoptosis/cell cycle could help us to get this visualization.

      Response: Thank you for your insightful comments. In the experiment with A549 cells (p53 wild-type), endogenous p53 levels were too low to be detected, even after DNA damage induction. The evaluation of the function of endogenous p53 in the presence of isoforms is hindered, as mentioned above. In the revised manuscript, we utilized H1299 cells with overexpressed proteins for apoptosis studies using the Caspase-Glo® 3/7 assay (Figure 7). This has been shown in the Results section (lines 254-269). "The Δ133p53 and Δ160p53 proteins block pro-apoptotic function of FLp53.

      One of the physiological read-outs of FLp53 is its ability to induce apoptotic cell death11. To investigate the effects of p53 isoforms Δ133p53 and Δ160p53 on FLp53-induced apoptosis, we measured caspase-3 and -7 activities in H1299 cells expressing different p53 isoforms (Figure 7). Caspase activation is a key biochemical event in apoptosis, with the activation of effector caspases (caspase-3 and -7) ultimately leading to apoptosis12. The caspase-3 and -7 activities induced by FLp53 expression were approximately 2.5 times higher than those of the control vector (Figure 7). Co-expression of FLp53 and the isoforms Δ133p53 or Δ160p53 at a ratio of 1:5 significantly diminished the apoptotic activity of FLp53 (Figure 7). This result aligns well with our reporter gene assay, which demonstrated that elevated expression of Δ133p53 and Δ160p53 impaired the expression of apoptosis-inducing genes BAX and PUMA (Figure 4G and H). Moreover, a reduction in the apoptotic activity of FLp53 was observed irrespective of whether Δ133p53 or Δ160p53 protein was expressed with or without a FLAG tag (Figure 7). This result, therefore, also suggests that the FLAG tag does not affect the apoptotic activity or other physiological functions of FLp53 and its isoforms. Overall, the overexpression of p53 isoforms Δ133p53 and Δ160p53 significantly attenuates FLp53-induced apoptosis, independent of the protein tagging with the FLAG antibody epitope."

      **Referees cross-commenting**

      I think the comments from the other reviewers are very much reasonable and logical.

      Especially all 3 reviewers have indicated, a better way to visualize the aggregation of full-length wild type P53 by truncated P53 (such as looking at endogenous P53# by reviewer 1, having fluorescent tag #by reviewer 2 and reviewer 3 raised concern on the FLAG tag) would add more value to the observation.

      Response: Thank you for these comments. The endogenous p53 protein was undetectable in A549 cells induced by etoposide (Figure R1A). Therefore, we conducted experiments using FLAG/V5-tagged FLp53. To avoid any potential side effects of the FLAG tag on p53 aggregation, we introduced untagged p53 isoforms in the H1299 cells and performed subcellular fractionation. Our revised results, consistent with previous FLAG-tagged p53 isoforms findings, demonstrate that co-expression of untagged isoforms with FLAG-tagged FLp53 significantly induced the aggregation of FLAG-FLp53, while no aggregation was observed when FLAG-tagged FLp53 was expressed alone (Supplementary Figure 6). These results clearly indicate that the FLAG tag itself does not contribute to protein aggregation.

      Additionally, we utilized the A11 antibody to detect protein aggregation, providing additional validation (Figure R3). Given that the fluorescent proteins (~30 kDa) are substantially bigger than the tags used here (~1 kDa) and may influence oligomerization (especially GFP), stability, localization, and function of p53 and its isoforms, we avoided conducting these vital experiments with such artificial large fusions.

      Reviewer #1 (Significance (Required)):

      The work is significant, since it points out more mechanistic insight how wild type full length P53 could be inactivated in the presence of truncated isoforms, this might offer new opportunity to recover P53 function as treatment strategies against cancer.

      Response: Thank you for your insightful comments. We appreciate your recognition of the significance of our work in providing mechanistic insights into how wild-type FLp53 can be inactivated by truncated isoforms. We agree that these findings have potential for exploring new strategies to restore p53 function as a therapeutic approach against cancer.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Zhao and colleagues presents a novel and compelling study on the p53 isoforms, Δ133p53 and Δ160p53, which are associated with aggressive cancer types. The main objective of the study was to understand how these isoforms exert a dominant negative effect on full-length p53 (FLp53). The authors discovered that the Δ133p53 and Δ160p53 proteins exhibit impaired binding to p53-regulated promoters. The data suggest that the predominant mechanism driving the dominant-negative effect is the co-aggregation of FLp53 with Δ133p53 and Δ160p53.

      This study is innovative, well-executed, and supported by thorough data analysis. However, the authors should address the following points:

        • Introduction on Aggregation and Co-aggregation: Given that the focus of the study is on the aggregation and co-aggregation of the isoforms, the introduction should include a dedicated paragraph discussing this issue. There are several original research articles and reviews that could be cited to provide context.

      Response: Thank you very much for the valuable comments. We have added the following paragraph in the revised manuscript (lines 74-82): "Protein aggregation has become a central focus of modern biology research and has documented implications in various diseases, including cancer13, 14, 15. Protein aggregates can be of different types ranging from amorphous aggregates to highly structured amyloid or fibrillar aggregates, each with different physiological implications. In the case of p53, whether protein aggregation, and in particular, co-aggregation with large N-terminal deletion isoforms, plays a mechanistic role in its inactivation is yet underexplored. Interestingly, the Δ133p53β isoform has been shown to aggregate in several human cancer cell lines16. Additionally, the Δ40p53α isoform exhibits a high aggregation tendency in endometrial cancer cells17. Although no direct evidence exists for Δ160p53 yet, these findings imply that p53 isoform aggregation may play a major role in their mechanisms of actions."

      2. Antibody Use for Aggregation: To strengthen the evidence for aggregation, the authors should consider using antibodies that specifically bind to aggregates.

      Response: Thank you for your insightful suggestion. We addressed protein aggregation using the A11 antibody, which specifically recognizes amyloid-like protein aggregates. We analyzed insoluble nuclear pellet samples prepared under identical conditions as described in Figure 6B. To confirm the presence of p53 proteins, we employed the anti-p53 M19 antibody (Santa Cruz, Cat No. sc-1312) to detect bands corresponding to FLp53 and its isoforms Δ133p53 and Δ160p53. Monomeric FLp53 was not detected (Figure R3, lower panel), which may be attributed to the lower binding affinity of the anti-p53 M19 antibody to it. These samples were also immunoprecipitated using the A11 antibody (Thermo Fisher Scientific, Cat No. AHB0052) to detect aggregated proteins. Interestingly, FLp53 and its isoforms, Δ133p53 and Δ160p53, were clearly detected with the A11 antibody when co-expressed at a 1:5 ratio, suggesting that they underwent co-aggregation. However, no FLp53 aggregates were observed when it was expressed alone (Figure R2). These results support the conclusion in our manuscript that Δ133p53 and Δ160p53 drive FLp53 aggregation.

      (Figure R2 is included in the file "RC-2024-02608 Figures of Response to Reviewer.")

      3. Fluorescence Microscopy: Live-cell fluorescence microscopy could be employed to enhance visualization by labeling FLp53 and the isoforms with different fluorescent markers (e.g., EGFP and mCherry tags).

      Response: We appreciate the suggestion to use live-cell fluorescence microscopy with EGFP and mCherry tags for the visualization of FLp53 and its isoforms. While we understand the advantages of live-cell imaging with EGFP/mCherry tags, we refrained from making such fusions because GFP and comparable fluorescent protein tags are very large (~30 kDa) relative to the p53 isoform variants (~30 kDa). Other studies have shown that EGFP and mCherry fusions can alter protein oligomerization, solubility and aggregation18, 19. Moreover, most fluorescent proteins are prone to dimerization (e.g., EGFP) or form obligate tetramers (DsRed)20, 21, 22, potentially interfering with the oligomerization and aggregation properties of p53 isoforms, particularly Δ133p53 and Δ160p53.

      Instead, we utilized FLAG- or V5-tag-based immunofluorescence microscopy, a well-established and widely accepted method for visualizing p53 proteins. This method provided precise localization and reliable quantitative data, and we believe it is both appropriate and sufficient for addressing the research question.

      Reviewer #2 (Significance (Required)):

      The manuscript by Zhao and colleagues presents a novel and compelling study on the p53 isoforms, Δ133p53 and Δ160p53, which are associated with aggressive cancer types. The main objective of the study was to understand how these isoforms exert a dominant negative effect on full-length p53 (FLp53). The authors discovered that the Δ133p53 and Δ160p53 proteins exhibit impaired binding to p53-regulated promoters. The data suggest that the predominant mechanism driving the dominant-negative effect is the co-aggregation of FLp53 with Δ133p53 and Δ160p53.

      Response: We sincerely thank the reviewer for the thoughtful and positive comments on our manuscript and for highlighting the significance of our findings on the p53 isoforms, Δ133p53 and Δ160p53.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript entitled "Δ133p53 and Δ160p53 isoforms of the tumor suppressor protein p53 exert dominant-negative effect primarily by co-aggregation", the authors suggest that the Δ133p53 and Δ160p53 isoforms have high aggregation propensity and that by co-aggregating with canonical p53 (FLp53), they sequestrate it away from DNA thus exerting a dominant-negative effect over it.

      First, the authors should make it clear throughout the manuscript, including the title, that they are investigating Δ133p53α and Δ160p53α since there are 3 Δ133p53 isoforms (α, β, γ), and 3 Δ160p53 isoforms (α, β, γ).

      Response: Thank you for your suggestion. We understand the importance of clearly specifying the isoforms under study. Following your suggestion, we have added α in the title, abstract, and introduction and added the following statement in the Introduction (lines 57-59): "For convenience and simplicity, we have written Δ133p53 and Δ160p53 to represent the α isoforms (Δ133p53α and Δ160p53α) throughout this manuscript."

      One concern is that the authors only consider and explore Δ133p53α and Δ160p53α isoforms as exclusively oncogenic and FLp53 dominant-negative while not discussing evidences of different activities. Indeed, other manuscripts have also shown that Δ133p53α is non-oncogenic and non-mutagenic, do not antagonize every single FLp53 functions and are sometimes associated with good prognosis. To cite a few examples:

      • Hofstetter G. et al. D133p53 is an independent prognostic marker in p53 mutant advanced serous ovarian cancer. Br. J. Cancer 2011, 105, 1593-1599.
      • Bischof, K. et al. Influence of p53 Isoform Expression on Survival in High-Grade Serous Ovarian Cancers. Sci. Rep. 2019, 9,5244.
      • Knezović F. et al. The role of p53 isoforms' expression and p53 mutation status in renal cell cancer prognosis. Urol. Oncol. 2019, 37, 578.e1-578.e10.
      • Gong, L. et al. p53 isoform D113p53/D133p53 promotes DNA double-strand break repair to protect cell from death and senescence in response to DNA damage. Cell Res. 2015, 25, 351-369.
      • Gong, L. et al. p53 isoform D133p53 promotes efficiency of induced pluripotent stem cells and ensures genomic integrity during reprogramming. Sci. Rep. 2016, 6, 37281.
      • Horikawa, I. et al. D133p53 represses p53-inducible senescence genes and enhances the generation of human induced pluripotent stem cells. Cell Death Differ. 2017, 24, 1017-1028.
      • Gong, L. p53 coordinates with D133p53 isoform to promote cell survival under low-level oxidative stress. J. Mol. Cell Biol. 2016, 8, 88-90.

      Response: Thank you very much for your comment and for highlighting these important studies.

      We agree that Δ133p53 isoforms exhibit complex biological functions, with both oncogenic and non-oncogenic potentials. However, our mission here was primarily to reveal the molecular mechanism for the dominant-negative effects exerted by the Δ133p53α and Δ160p53α isoforms on FLp53 for which the Δ133p53α and Δ160p53α isoforms are suitable model systems. Exploring the oncogenic potential of the isoforms is beyond the scope of the current study and we have not claimed anywhere that we are reporting that. We have carefully revised the manuscript and replaced the respective terms e.g. 'pro-oncogenic activity' with 'dominant-negative effect' in relevant places (e.g. line 90). We have now also added a paragraph with suitable references that introduces the oncogenic and non-oncogenic roles of the p53 isoforms.

      After reviewing the papers you cited, we are not sure that they reflect on the oncogenic/non-oncogenic role of the Δ133p53α isoform in different cancer cases. Although our study is not about the oncogenic potential of the isoforms, we have summarized the key findings below:

      • Hofstetter et al., 2011: Demonstrated that Δ133p53α expression improved recurrence-free and overall survival in p53-mutant advanced serous ovarian cancer, suggesting a potential protective role in this context.
      • Bischof et al., 2019: Found that Δ133p53 mRNA can improve overall survival in high-grade serous ovarian cancers. However, out of 31 patients, only 5 belong to the TP53 wild-type group, while the others carry TP53 mutations.
      • Knezović et al., 2019: Reported downregulation of Δ133p53 in renal cell carcinoma tissues with wild-type p53 compared to normal adjacent tissue, indicating a potential non-oncogenic role, but not conclusively demonstrating it.
      • Gong et al., 2015: Showed that Δ133p53 antagonizes p53-mediated apoptosis and promotes DNA double-strand break repair by upregulating RAD51, LIG4, and RAD52 independently of FLp53.
      • Gong et al., 2016: Demonstrated that overexpression of Δ133p53 promotes the efficiency of cell reprogramming through its anti-apoptotic function and by promoting DNA DSB repair. The authors hypothesize that this mechanism involves increased RAD51 foci formation and decreased γH2AX foci formation and chromosome aberrations in induced pluripotent stem (iPS) cells, independent of FLp53.
      • Horikawa et al., 2017: Indicated that induced pluripotent stem cells derived from fibroblasts that overexpress Δ133p53 formed non-cancerous tumors in mice compared to induced pluripotent stem cells derived from fibroblasts with complete p53 inhibition. Thus, Δ133p53 overexpression is "non- or less oncogenic and mutagenic" compared to complete p53 inhibition, but it still compromises certain p53-mediated tumor-suppressing pathways. "Overexpressed Δ133p53 prevented FL-p53 from binding to the regulatory regions of p21WAF1 and miR-34a promoters, providing a mechanistic basis for its dominant-negative inhibition of a subset of p53 target genes."
      • Gong, 2016: Suggested that Δ133p53 promotes cell survival under low-level oxidative stress, but its role under different stress conditions remains uncertain.

      We have revised the Introduction to provide a more balanced discussion of Δ133p53's dual role (lines 62-73):

      "The Δ133p53 isoform exhibit complex biological functions, with both oncogenic and non-oncogenic potentials. Recent studies demonstrate the non-oncogenic yet context-dependent role of the Δ133p53 isoform in cancer development. Δ133p53 expression has been reported to correlate with improved survival in patients with TP53 mutations23, 24, where it promotes cell survival in a non-oncogenic manner25, 26, especially under low oxidative stress27. Alternatively, other recent evidences emphasize the notable oncogenic functions of Δ133p53 as it can inhibit p53-dependent apoptosis by directly interacting with the FLp53 4, 6. The oncogenic function of the newly identified Δ160p53 isoform is less known, although it is associated with p53 mutation-driven tumorigenesis28 and in melanoma cells' aggressiveness10. Whether or not the Δ160p53 isoform also impedes FLp53 function in a similar way as Δ133p53 is an open question. However, these p53 isoforms can certainly compromise p53-mediated tumor suppression by interfering with FLp53 binding to target genes such as p21 and miR-34a2, 29 by dominant-negative effect, the exact mechanism is not known."

      On the figures presented in this manuscript, I have three major concerns:

      1- Most results in the manuscript rely on the overexpression of the FLAG-tagged or V5-tagged isoforms. The validation of these construct entirely depends on Supplementary figure 3 which the authors claim "rules out the possibility that the FLAG epitope might contribute to this aggregation". However, I am not entirely convinced by that conclusion. Indeed, the ratio between the "regular" isoform and the aggregates is much higher in the FLAG-tagged constructs than in the V5-tagged constructs. We can visualize the aggregates easily in the FLAG-tagged experiment, but the imaging clearly had to be overexposed (given the white coloring demonstrating saturation of the main bands) to visualize them in the V5-tagged experiments. Therefore, I am not convinced that an effect of the FLAG-tag can be ruled out and more convincing data should be added.

      Response: Thank you for raising this important concern. We have carefully considered your comments and have made several revisions to clarify and strengthen our conclusions.

      First, to address the potential influence of the FLAG and V5 tags on p53 isoform aggregation, we have revised Figure 2 and removed the previous Supplementary Figure 3, where non-specific antibody bindings and higher molecular weight aggregates were not clearly interpretable. In the revised Figure 2, we have removed these potential aggregates, improving the clarity and accuracy of the data.

      To further rule out any tag-related artifacts, we conducted a co-immunoprecipitation assay with FLAG-tagged FLp53 and untagged Δ133p53 and Δ160p53 isoforms. The results (now shown in the new Supplementary Figure 3) completely agree with our previous result with FLAG-tagged and V5-tagged Δ133p53 and Δ160p53 isoforms and show interaction between the partners. This indicates that the FLAG / V5-tags do not influence / interfere with the interaction between FLp53 and the isoforms. We have still used FLAG-tagged FLp53 as the endogenous p53 was undetectable and the FLAG-tagged FLp53 did not aggregate alone.

      In the revised paper, we added the following sentences (Lines 146-152): "To rule out the possibility that the observed interactions between FLp53 and its isoforms Δ133p53 and Δ160p53 were artifacts caused by the FLAG and V5 antibody epitope tags, we co-expressed FLAG-tagged FLp53 with untagged Δ133p53 and Δ160p53. Immunoprecipitation assays demonstrated that FLAG-tagged FLp53 could indeed interact with the untagged Δ133p53 and Δ160p53 isoforms (Supplementary Figure 3, lanes 3 and 4), confirming formation of hetero-oligomers between FLp53 and its isoforms. These findings demonstrate that Δ133p53 and Δ160p53 can oligomerize with FLp53 and with each other."

      Additionally, we performed subcellular fractionation experiments to compare the aggregation and localization of FLAG-tagged FLp53 when co-expressed either with V5-tagged or untagged Δ133p53/Δ160p53. In these experiments, the untagged isoforms also induced FLp53 aggregation, mirroring our previous results with the tagged isoforms (Supplementary Figure 5). We've added this result in the revised manuscript (lines 236-245): "To exclude the possibility that FLAG or V5 tags contribute to protein aggregation, we also conducted subcellular fractionation of H1299 cells expressing FLAG-tagged FLp53 along with untagged Δ133p53 or Δ160p53 at a 1:5 ratio. The results showed (Supplementary Figure 6) a similar distribution of FLp53 across cytoplasmic, nuclear, and insoluble nuclear fractions as in the case of tagged Δ133p53 or Δ160p53 (Figure 6A to D). Notably, the aggregation of untagged Δ133p53 or Δ160p53 markedly promoted the aggregation of FLAG-tagged FLp53 (Supplementary Figure 6B and D), demonstrating that the antibody epitope tags themselves do not contribute to protein aggregation."

      We've also discussed this in the Discussion section (lines 349-356): "In our study, we primarily utilized an overexpression strategy involving FLAG/V5-tagged proteins to investigate the effects of p53 isoforms Δ133p53 and Δ160p53 on the function of FLp53. To address concerns regarding potential overexpression artifacts, we performed the co-immunoprecipitation (Supplementary Figure 6) and caspase-3 and -7 activity (Figure 7) experiments with untagged Δ133p53 and Δ160p53. In both experimental systems, the untagged proteins behaved very similarly to the FLAG/V5 antibody epitope-containing proteins (Figures 6 and 7 and Supplementary Figure 6). Hence, the C-terminal tagging of FLp53 or its isoforms does not alter the biochemical and physiological functions of these proteins."

      In summary, the revised data set and newly added experiments provide strong evidence that neither the FLAG nor the V5 tag contributes to the observed p53 isoform aggregation.

      2- The authors demonstrate that to visualize the dominant-negative effect, Δ133p53α and Δ160p53α must be "present in a higher proportion than FLp53 in the tetramer" and they need at least a transfection ratio of 1:5 since the 1:1 ratio shows no effect. However, in almost every single cell type, FLp53 is far more expressed than the isoforms which make it very unlikely to reach such stoichiometry in physiological conditions and make me wonder if this mechanism naturally occurs at endogenous level. This limitation should be at least discussed.

      Response: Thank you for your insightful comment. However, evidence suggests that the expression levels of isoforms such as Δ133p53 can be significantly elevated relative to FLp53 in certain physiological conditions3, 4, 9. For example, in some breast tumors, Δ133p53 mRNA is expressed at much higher levels than FLp53, suggesting a distinct expression profile of p53 isoforms compared to normal breast tissue4. Similarly, in non-small cell lung cancer and the A549 lung cancer cell line, the expression level of Δ133p53 transcript is significantly elevated compared to non-cancerous cells3. Moreover, in specific cholangiocarcinoma cell lines, the Δ133p53/TAp53 expression ratio has been reported to increase to as high as 3:1 (ref. 9). These observations indicate that the dominant-negative effect of isoform Δ133p53 on FLp53 can occur under certain pathological conditions where the relative amounts of FLp53 and the isoforms vary widely. Since data on the Δ160p53 isoform are scarce, we infer that the long N-terminal truncated isoforms may share a similar mechanism.

      Figure 5C: I am concerned by the subcellular location of the Δ133p53α and Δ160p53α as they are commonly considered nuclear and not cytoplasmic as shown here, particularly since they retain the 3 nuclear localization sequences like the FLp53 (Bourdon JC et al. 2005; Mondal A et al. 2018; Horikawa I et al, 2017; Joruiz S. et al, 2024). However, Δ133p53α can form cytoplasmic speckles (Horikawa I et al, 2017) when it colocalizes with autophagy markers for its degradation.

      3-The authors should discuss this issue. Could this discrepancy be due to the high overexpression level of these isoforms? A co-staining with autophagy markers (p62, LC3B) would rule out (or confirm) activation of autophagy due to the overwhelming expression of the isoform.

      Response: Thank you for your thoughtful comments. We have thoroughly reviewed all the papers you recommended (Bourdon JC et al., 2005; Mondal A et al., 2018; Horikawa I et al., 2017; Joruiz S. et al., 2024)4, 29, 30, 31. Among these, only the study by Bourdon JC et al. (2005) provided data regarding the localization of Δ133p534. Interestingly, their findings align with our observations, indicating that the protein does not exhibit predominantly nuclear localization in the Figure below. The discrepancy may be caused by a potentially confusing statement in that paper4.

      (The Figure from Bourdon JC et al. (2005) is included in the file "RC-2024-02608 Figures of Response to Reviewer.")

      The localization of p53 is governed by multiple factors, including its nuclear import and export32. The isoforms Δ133p53 and Δ160p53 contain three nuclear localization sequences (NLS)4. However, in our experiments the isoforms may have been trapped in the cytoplasm by aggregation that masks the NLS, which would prevent nuclear import.

      Further, we acknowledge that Δ133p53 co-aggregates with the autophagy substrate p62/SQSTM1 and the autophagosome component LC3B in the cytoplasm during autophagic degradation in replicative senescence33. We agree that high overexpression of these aggregation-prone proteins may induce endoplasmic reticulum (ER) stress and activate autophagy34. This could explain the cytoplasmic localization in our experiments. However, it is also critical to consider that we observed aggregates in both the cytoplasm and the nucleus (Figures 6B and E and Supplementary Figure 6B). While cytoplasmic localization may involve autophagy-related mechanisms, the nuclear aggregates likely arise from intrinsic isoform properties, such as altered protein folding, independent of autophagy. These dual localizations reflect the complex behavior of Δ133p53 and Δ160p53 isoforms under our experimental conditions.

      In the revised manuscript, we discuss this in the Discussion (lines 328-335): "Moreover, the observed cytoplasmic isoform aggregates may reflect autophagy-related degradation, as suggested by the co-localization of Δ133p53 with autophagy substrate p62/SQSTM1 and autophagosome component LC3B [33]. High overexpression of these aggregation-prone proteins could induce endoplasmic reticulum stress and activate autophagy [34]. Interestingly, we also observed nuclear aggregation of these isoforms (Figure 6B and E and Supplementary Figure 6B), suggesting that distinct mechanisms, such as intrinsic properties of the isoforms, may govern their localization and behavior within the nucleus. This dual localization underscores the complexity of Δ133p53 and Δ160p53 behavior in cellular systems."

      Minor concerns:

      • Figure 1A: the initiation of the "Δ140p53" is shown instead of "Δ40p53"

      Response: Thank you! Figure 1A has been corrected in the revised paper.

      • Figure 2A: I would like to see the images cropped a bit higher, so the cut does not happen just above the aggregate bands

      Response: Thank you for this suggestion. We have replaced the image, and the new Figure 2 appears in the revised paper.

      • Figure 3C: what ratio of FLp53/Delta isoform was used?

      Response: We have added the ratio in the figure legend of Figure 3C (lines 845-846): "Relative DNA-binding of the FLp53-FLAG protein to the p53-target gene promoters in the presence of the V5-tagged protein Δ133p53 or Δ160p53 at a 1:1 ratio."

      • Figure 3C suggests that the "dominant-negative" effect is mostly senescence-specific as it does not affect apoptosis target genes, which is consistent with Horikawa et al, 2017 and Gong et al, 2016 cited above. Furthermore, since these two references and the others from Gong et al. show that Δ133p53α increases DNA repair genes, it would be interesting to look at RAD51, RAD52 or Lig4, and maybe also induce stress.

      Response: Thank you for your thoughtful comments and suggestions. In Figure 3C, the presence of Δ133p53 or Δ160p53 only significantly reduced the binding of FLp53 to the p21 promoter. However, isoforms Δ133p53 and Δ160p53 demonstrated a significant loss of DNA-binding activity at all four promoters: p21, MDM2, and apoptosis target genes BAX and PUMA (Figure 3B). This result suggests that Δ133p53 and Δ160p53 have the potential to influence FLp53 function due to their ability to form hetero-oligomers with FLp53 or their intrinsic tendency to aggregate. To further investigate this, we increased the isoform to FLp53 ratio in Figure 4, which demonstrates that the isoforms Δ133p53 and Δ160p53 exert dominant-negative effects on the function of FLp53.

      These results demonstrate that the isoforms can compromise p53-mediated pathways, consistent with Horikawa et al. (2017), which showed that Δ133p53α overexpression is "non- or less oncogenic and mutagenic" compared to complete p53 inhibition, but still affects specific tumor-suppressing pathways. Furthermore, as noted by Gong et al. (2016), Δ133p53's anti-apoptotic function under certain conditions is independent of FLp53 and unrelated to its dominant-negative effects.

      We appreciate your suggestion to investigate DNA repair genes such as RAD51, RAD52, or Lig4, especially under stress conditions. While these targets are intriguing and relevant, we believe that our current investigation of p53 targets in this manuscript sufficiently supports our conclusions regarding the dominant-negative effect. Further exploration of additional p53 target genes, including those involved in DNA repair, will be an important focus of our future studies.

      • Figure 5A and B: directly comparing the level of FLp53 expressed in cytoplasm or nucleus to the level of Δ133p53α and Δ160p53α expressed in cytoplasm or nucleus does not mean much since these are overexpressed proteins and therefore depend on the level of expression. The authors should rather compare the ratio of cytoplasmic/nuclear FLp53 to the ratio of cytoplasmic/nuclear Δ133p53α and Δ160p53α.

      Response: Thank you very much for this valuable suggestion. In the revised paper, Figure 5B has been recreated. Changes have been made in lines 214-215: "The cytoplasm-to-nucleus ratio of Δ133p53 and Δ160p53 was approximately 1.5-fold higher than that of FLp53 (Figure 5B)."

      **Referees cross-commenting**

      I agree that the system needs to be improved to be more physiological.

      Just to be precise, the D133 and D160 isoforms are not truncated mutants; they are naturally occurring isoforms expressed in almost every normal human cell type from an internal promoter within the TP53 gene.

      Using overexpression always raises concerns, but in this case, I am even more careful because the isoforms are almost always less expressed than the FLp53, and here they have to push it 5 to 10 times more expressed than the FLp53 to see the effect, which makes me fear an artifact effect due to the overwhelming overexpression (which even seems to change the normal localization of the protein).

      To visualize the endogenous proteins, they will have to change cell line as the H1299 they used are p53 null.

      Response: Thank you for these comments. We've addressed the motivation of overexpression in the above responses. We needed to use the plasmid constructs in the p53-null cells to detect the proteins but the expression level was certainly not 'overwhelmingly high'.

      First, we tried the A549 cells (p53 wild-type) under DNA damage conditions, but the endogenous p53 protein was undetectable. Second, several studies reported increased Δ133p53 levels compared to wild-type p53 and that this has implications in tumor development [2, 3, 4, 9]. Third, the apoptosis activity of H1299 cells overexpressing p53 proteins was analyzed in the revised manuscript (Figure 7). The apoptotic activity induced by FLp53 expression was approximately 2.5 times higher than that of the control vector under identical plasmid DNA transfection conditions (Figure 7). These results rule out the possibility that the plasmid-based expression of p53 and its isoforms introduced artifacts in the results. We've discussed this in the Results section (lines 254-269).

      Reviewer #3 (Significance (Required)):

      Overall, the paper is interesting particularly considering the range of techniques used which is the main strength.

      The main limitation to me is the lack of contradictory discussion as all argumentation presents Δ133p53α and Δ160p53α exclusively as oncogenic and strictly FLp53 dominant-negative when, particularly for Δ133p53α, a quite extensive literature suggests a not so clear-cut activity.

      The aggregation mechanism is reported for the first time for Δ133p53α and Δ160p53α, although it was already published for Δ40p53α, Δ133p53β or in mutant p53.

      This manuscript would be a good basic research addition to the p53 field to provide insight in the mechanism for some activities of some p53 isoforms.

      My field of expertise is the p53 isoforms, which I have been working on for 11 years in cancer and neuro-degenerative diseases.

      Response: Thank you very much for your positive and critical comments. We've included a fair discussion on the oncogenic and non-oncogenic function of Δ133p53 in the Introduction following your suggestion (lines 62-73).

      References

      1. Pitolli C, Wang Y, Candi E, Shi Y, Melino G, Amelio I. p53-Mediated Tumor Suppression: DNA-Damage Response and Alternative Mechanisms. Cancers 11, (2019).

      2. Fujita K, et al. p53 isoforms Delta133p53 and p53beta are endogenous regulators of replicative cellular senescence. Nature cell biology 11, 1135-1142 (2009).

      3. Fragou A, et al. Increased Δ133p53 mRNA in lung carcinoma corresponds with reduction of p21 expression. Molecular medicine reports 15, 1455-1460 (2017).

      4. Bourdon JC, et al. p53 isoforms can regulate p53 transcriptional activity. Genes & development 19, 2122-2137 (2005).

      5. Ghosh A, Stewart D, Matlashewski G. Regulation of human p53 activity and cell localization by alternative splicing. Molecular and cellular biology 24, 7987-7997 (2004).

      6. Aoubala M, et al. p53 directly transactivates Δ133p53α, regulating cell fate outcome in response to DNA damage. Cell death and differentiation 18, 248-258 (2011).

      7. Marcel V, et al. p53 regulates the transcription of its Delta133p53 isoform through specific response elements contained within the TP53 P2 internal promoter. Oncogene 29, 2691-2700 (2010).

      8. Zhao L, Sanyal S. p53 Isoforms as Cancer Biomarkers and Therapeutic Targets. Cancers 14, (2022).

      9. Nutthasirikul N, Limpaiboon T, Leelayuwat C, Patrakitkomjorn S, Jearanaikoon P. Ratio disruption of the ∆133p53 and TAp53 isoform equilibrium correlates with poor clinical outcome in intrahepatic cholangiocarcinoma. International journal of oncology 42, 1181-1188 (2013).

      10. Tadijan A, et al. Altered Expression of Shorter p53 Family Isoforms Can Impact Melanoma Aggressiveness. Cancers 13, (2021).

      11. Aubrey BJ, Kelly GL, Janic A, Herold MJ, Strasser A. How does p53 induce apoptosis and how does this relate to p53-mediated tumour suppression? Cell death and differentiation 25, 104-113 (2018).

      12. Ghorbani N, Yaghubi R, Davoodi J, Pahlavan S. How does caspases regulation play role in cell decisions? apoptosis and beyond. Molecular and cellular biochemistry 479, 1599-1613 (2024).

      13. Petronilho EC, et al. Oncogenic p53 triggers amyloid aggregation of p63 and p73 liquid droplets. Communications chemistry 7, 207 (2024).

      14. Forget KJ, Tremblay G, Roucou X. p53 Aggregates penetrate cells and induce the co-aggregation of intracellular p53. PloS one 8, e69242 (2013).

      15. Farmer KM, Ghag G, Puangmalai N, Montalbano M, Bhatt N, Kayed R. P53 aggregation, interactions with tau, and impaired DNA damage response in Alzheimer's disease. Acta neuropathologica communications 8, 132 (2020).

      16. Arsic N, et al. Δ133p53β isoform pro-invasive activity is regulated through an aggregation-dependent mechanism in cancer cells. Nature communications 12, 5463 (2021).

      17. Melo Dos Santos N, et al. Loss of the p53 transactivation domain results in high amyloid aggregation of the Δ40p53 isoform in endometrial carcinoma cells. The Journal of biological chemistry 294, 9430-9439 (2019).

      18. Mestrom L, et al. Artificial Fusion of mCherry Enhances Trehalose Transferase Solubility and Stability. Applied and environmental microbiology 85, (2019).

      19. Kaba SA, Nene V, Musoke AJ, Vlak JM, van Oers MM. Fusion to green fluorescent protein improves expression levels of Theileria parva sporozoite surface antigen p67 in insect cells. Parasitology 125, 497-505 (2002).

      20. Snapp EL, et al. Formation of stacked ER cisternae by low affinity protein interactions. The Journal of cell biology 163, 257-269 (2003).

      21. Jain RK, Joyce PB, Molinete M, Halban PA, Gorr SU. Oligomerization of green fluorescent protein in the secretory pathway of endocrine cells. The Biochemical journal 360, 645-649 (2001).

      22. Campbell RE, et al. A monomeric red fluorescent protein. Proceedings of the National Academy of Sciences of the United States of America 99, 7877-7882 (2002).

      23. Hofstetter G, et al. Δ133p53 is an independent prognostic marker in p53 mutant advanced serous ovarian cancer. British journal of cancer 105, 1593-1599 (2011).

      24. Bischof K, et al. Influence of p53 Isoform Expression on Survival in High-Grade Serous Ovarian Cancers. Scientific reports 9, 5244 (2019).

      25. Gong L, et al. p53 isoform Δ113p53/Δ133p53 promotes DNA double-strand break repair to protect cell from death and senescence in response to DNA damage. Cell research 25, 351-369 (2015).

      26. Gong L, et al. p53 isoform Δ133p53 promotes efficiency of induced pluripotent stem cells and ensures genomic integrity during reprogramming. Scientific reports 6, 37281 (2016).

      27. Gong L, Pan X, Yuan ZM, Peng J, Chen J. p53 coordinates with Δ133p53 isoform to promote cell survival under low-level oxidative stress. Journal of molecular cell biology 8, 88-90 (2016).

      28. Candeias MM, Hagiwara M, Matsuda M. Cancer-specific mutations in p53 induce the translation of Δ160p53 promoting tumorigenesis. EMBO reports 17, 1542-1551 (2016).

      29. Horikawa I, et al. Δ133p53 represses p53-inducible senescence genes and enhances the generation of human induced pluripotent stem cells. Cell death and differentiation 24, 1017-1028 (2017).

      30. Mondal AM, et al. Δ133p53α, a natural p53 isoform, contributes to conditional reprogramming and long-term proliferation of primary epithelial cells. Cell death & disease 9, 750 (2018).

      31. Joruiz SM, Von Muhlinen N, Horikawa I, Gilbert MR, Harris CC. Distinct functions of wild-type and R273H mutant Δ133p53α differentially regulate glioblastoma aggressiveness and therapy-induced senescence. Cell death & disease 15, 454 (2024).

      32. O'Brate A, Giannakakou P. The importance of p53 location: nuclear or cytoplasmic zip code? Drug resistance updates : reviews and commentaries in antimicrobial and anticancer chemotherapy 6, 313-322 (2003).

      33. Horikawa I, et al. Autophagic degradation of the inhibitory p53 isoform Δ133p53α as a regulatory mechanism for p53-mediated senescence. Nature communications 5, 4706 (2014).

      34. Lee H, et al. IRE1 plays an essential role in ER stress-mediated aggregation of mutant huntingtin via the inhibition of autophagy flux. Human molecular genetics 21, 101-114 (2012).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript entitled "Δ133p53 and Δ160p53 isoforms of the tumor suppressor protein p53 exert dominant-negative effect primarily by co-aggregation", the authors suggest that the Δ133p53 and Δ160p53 isoforms have high aggregation propensity and that by co-aggregating with canonical p53 (FLp53), they sequester it away from DNA, thus exerting a dominant-negative effect over it.

      First, the authors should make it clear throughout the manuscript, including the title, that they are investigating Δ133p53α and Δ160p53α since there are 3 Δ133p53 isoforms (α, β, γ), and 3 Δ160p53 isoforms (α, β, γ).

      One concern is that the authors only consider and explore Δ133p53α and Δ160p53α isoforms as exclusively oncogenic and FLp53 dominant-negative while not discussing evidence of different activities. Indeed, other manuscripts have also shown that Δ133p53α is non-oncogenic and non-mutagenic, does not antagonize every FLp53 function, and is sometimes associated with good prognosis. To cite a few examples: Hofstetter G. et al. D133p53 is an independent prognostic marker in p53 mutant advanced serous ovarian cancer. Br. J. Cancer 2011, 105, 1593-1599. Bischof, K. et al. Influence of p53 Isoform Expression on Survival in High-Grade Serous Ovarian Cancers. Sci. Rep. 2019, 9, 5244. Knezović F. et al. The role of p53 isoforms' expression and p53 mutation status in renal cell cancer prognosis. Urol. Oncol. 2019, 37, 578.e1-578.e10. Gong, L. et al. p53 isoform D113p53/D133p53 promotes DNA double-strand break repair to protect cell from death and senescence in response to DNA damage. Cell Res. 2015, 25, 351-369. Gong, L. et al. p53 isoform D133p53 promotes efficiency of induced pluripotent stem cells and ensures genomic integrity during reprogramming. Sci. Rep. 2016, 6, 37281. Horikawa, I. et al. D133p53 represses p53-inducible senescence genes and enhances the generation of human induced pluripotent stem cells. Cell Death Differ. 2017, 24, 1017-1028. Gong, L. p53 coordinates with D133p53 isoform to promote cell survival under low-level oxidative stress. J. Mol. Cell Biol. 2016, 8, 88-90.

      On the figures presented in this manuscript, I have three major concerns:

      1. Most results in the manuscript rely on the overexpression of the FLAG-tagged or V5-tagged isoforms. The validation of these constructs entirely depends on Supplementary figure 3, which the authors claim "rule[s] out the possibility that the FLAG epitope might contribute to this aggregation". However, I am not entirely convinced by that conclusion. Indeed, the ratio between the "regular" isoform and the aggregates is much higher in the FLAG-tagged constructs than in the V5-tagged constructs. We can visualize the aggregates easily in the FLAG-tagged experiment, but the imaging clearly had to be overexposed (given the white coloring demonstrating saturation of the main bands) to visualize them in the V5-tagged experiments. Therefore, I am not convinced that an effect of the FLAG-tag can be ruled out and more convincing data should be added.
      2. The authors demonstrate that to visualize the dominant-negative effect, Δ133p53α and Δ160p53α must be "present in a higher proportion than FLp53 in the tetramer" and that they need at least a 1:5 transfection ratio, since the 1:1 ratio shows no effect. However, in almost every single cell type, FLp53 is far more expressed than the isoforms, which makes it very unlikely to reach such stoichiometry in physiological conditions and makes me wonder if this mechanism naturally occurs at endogenous level. This limitation should be at least discussed.
      3. Figure 5C: I am concerned by the subcellular location of the Δ133p53α and Δ160p53α as they are commonly considered nuclear and not cytoplasmic as shown here, particularly since they retain the 3 nuclear localization sequences like the FLp53 (Bourdon JC et al. 2005; Mondal A et al. 2018; Horikawa I et al, 2017; Joruiz S. et al, 2024). However, Δ133p53α can form cytoplasmic speckles (Horikawa I et al, 2017) when it colocalizes with autophagy markers for its degradation. The authors should discuss this issue. Could this discrepancy be due to the high overexpression level of these isoforms? A co-staining with autophagy markers (p62, LC3B) would rule out (or confirm) activation of autophagy due to the overwhelming expression of the isoform.

      Minor concerns:

      • Figure 1A: the initiation of the "Δ140p53" is shown instead of "Δ40p53"
      • Figure 2A: I would like to see the images cropped a bit higher, so the cut does not happen just above the aggregate bands
      • Figure 3C: what ratio of FLp53/Delta isoform was used?
      • Figure 3C suggests that the "dominant-negative" effect is mostly senescence-specific as it does not affect apoptosis target genes, which is consistent with Horikawa et al, 2017 and Gong et al, 2016 cited above. Furthermore, since these two references and the others from Gong et al. show that Δ133p53α increases DNA repair genes, it would be interesting to look at RAD51, RAD52 or Lig4, and maybe also induce stress.
      • Figure 5A and B: directly comparing the level of FLp53 expressed in cytoplasm or nucleus to the level of Δ133p53α and Δ160p53α expressed in cytoplasm or nucleus does not mean much since these are overexpressed proteins and therefore depend on the level of expression. The authors should rather compare the ratio of cytoplasmic/nuclear FLp53 to the ratio of cytoplasmic/nuclear Δ133p53α and Δ160p53α.

      Referees cross-commenting

      I agree that the system needs to be improved to be more physiological.

      Just to be precise, the D133 and D160 isoforms are not truncated mutants; they are naturally occurring isoforms expressed in almost every normal human cell type from an internal promoter within the TP53 gene.

      Using overexpression always raises concerns, but in this case I am even more careful because the isoforms are almost always less expressed than the FLp53, and here they have to push it 5 to 10 times more expressed than the FLp53 to see the effect, which makes me fear an artifact effect due to the overwhelming overexpression (which even seems to change the normal localization of the protein).

      To visualize the endogenous proteins, they will have to change cell line as the H1299 they used are p53 null.

      Significance

      Overall, the paper is interesting particularly considering the range of techniques used which is the main strength. The main limitation to me is the lack of contradictory discussion as all argumentation presents Δ133p53α and Δ160p53α exclusively as oncogenic and strictly FLp53 dominant-negative when, particularly for Δ133p53α, a quite extensive literature suggests a not so clear-cut activity.

      The aggregation mechanism is reported for the first time for Δ133p53α and Δ160p53α, although it was already published for Δ40p53α, Δ133p53β or in mutant p53.

      This manuscript would be a good basic research addition to the p53 field to provide insight in the mechanism for some activities of some p53 isoforms.

      My field of expertise is the p53 isoforms, which I have been working on for 11 years in cancer and neuro-degenerative diseases.

    1. ~~l:4 50 ft ravishments of the surnrnit of our isyrian daw/ :If-burledi ... s and when'the;book Waes' rnlan spun rne astern lii11,;1an PaphianJrf:8"' ' • d • h c osed h round b '_J "dism1sse ,me wit but mist , w en th a out in bru ' y rern. . e Spell a we of• ofhim, \1 ,

      When I read this it makes me think of illusions, obsession, and a lingering presence of something/someone powerful. The steps in the sentence go from being deeply captivated by someone--the story coming to an end--to it being a dream? Similar to Moby-Dick regarding the presence of the whale and Ahab's obsession with finding it. Tangled up with Ishmael and following Ahab through the journey, but when the book ended, the "spell" was over and everything was gone, but Ishmael.

    2. Still :ritten before the , author ~e~re not true, incl~~blis hed it anon,, as \by a "Cousin Cherry " (his Au li ::' thorn e th t inhg the ass ert ioY rn ohusl y.h' r!l • h' nt ov1a ' a t e b n t at it1 •s gave 11 to 1m on July 18 k ry Ann Mel .11 °ok was g

      We talked in class before, through works on Melville outside of his novels, about his relationship with Nathaniel Hawthorne and the intricate bond the two great writers shared. I am curious to explore more about the nature of this relationship and how the two motivated or influenced each other's writing.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer 1 (Public Review):

      O’Neill et al. have developed a software analysis application, miniML, that enables the quantification of electrophysiological events. They utilize a supervised deep learning-based method to optimize the software. miniML is able to quantify and standardize the analyses of miniature events, using both voltage and current clamp electrophysiology, as well as optically driven events using iGluSnFR3, in a variety of preparations, including in the cerebellum, calyx of Held, Golgi cell, human iPSC cultures, zebrafish, and Drosophila. The software appears to be flexible, in that users are able to hone and adapt the software to new preparations and events. Importantly, miniML is an open-source software free for researchers to use and enables users to adapt new features using Python.

      Overall this new software has the potential to become widely used in the field and an asset to researchers. However, the authors fail to discuss or even cite a similar analysis tool recently developed (SimplyFire), and determine how miniML performs relative to this platform. There are a handful of additional suggestions to make miniML more user-friendly, and of broad utility to a variety of researchers, as well as some suggestions to further validate and strengthen areas of the manuscript:

      (1) miniML relative to existing analysis methods: There is a major omission in this study, in that a similar open source, Python-based software package for event detection of synaptic events appears to be completely ignored. Earlier this year, another group published SimplyFire in eNeuro (Mori et al., 2024; doi: 10.1523/eneuro.0326-23.2023). Obviously, this previous study needs to be discussed and ideally compared to miniML to determine if SimplyFire is superior or similar in utility, and to underscore differences in approach and accuracy.

      We thank the reviewer for bringing this interesting publication to our attention. We have included SimplyFire in our benchmarking for comprehensive comparison with miniML. The approach taken by SimplyFire differs from miniML in a number of ways. Our results show that miniML provides higher recall and precision than SimplyFire (revised Figure 3). We appreciate that SimplyFire provides a user-interface similar to the commonly used MiniAnalysis software. In addition, the peak-finding-based approach of SimplyFire makes it relatively robust to event shape, which facilitates analysis of diverse data. However, we noted a strong threshold-dependence and long run time of SimplyFire (revised Figure 3 and Figure 3—figure supplement 1). In addition, SimplyFire is not robust against various types of noise typically encountered in electrophysiological recordings. Our extended benchmark analysis thus indicates that AI-based event detection is superior to existing algorithmic approaches, including SimplyFire.

      (2) The manuscript should comment on whether miniML works equally well to quantify current clamp events (voltage; e.g. EPSP/mEPSPs) compared to voltage clamp (currents, EPSC/mEPSCs), which the manuscript highlights. Are rise and decay time constants calculated for each event similarly?

      miniML works equally well for current- and voltage events (Figure 5, Figure 9). In general, events of opposite polarity can be analyzed by simply inverting the data. Transfer learning models may further improve the detection.

      For each detected event, independent of data/recording type, rise times are calculated as 10–90% times (baseline–peak), and decay times are calculated as time to 50% of the peak. In addition, event decay time constants are calculated from a fit to the event average. With miniML being open-source, researchers can adapt the calculations of event statistics to their needs, if desired. In the revised manuscript, we have expanded the Methods section that describes the quantification of event statistics (Methods, Quantification).
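
      The per-event statistics described above can be illustrated with a minimal sketch. This is not miniML's code: the function names (`crossing_time`, `event_kinetics`), the linear interpolation between samples, and the reading of "time to 50% of the peak" as the time from the peak until the trace falls to half amplitude are all assumptions made for illustration.

```python
def crossing_time(times, values, level, rising):
    """Linearly interpolate the first time `values` crosses `level`.

    Hypothetical helper (not part of miniML). Returns None if no crossing.
    """
    for i in range(1, len(values)):
        prev, curr = values[i - 1], values[i]
        if rising:
            crossed = prev < level <= curr
        else:
            crossed = prev > level >= curr
        if crossed:
            frac = (level - prev) / (curr - prev)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None


def event_kinetics(times, trace, baseline=0.0):
    """Return (amplitude, 10-90% rise time, half-decay time).

    Assumes a single positive-going event; negative-going events would be
    inverted beforehand.
    """
    rel = [v - baseline for v in trace]
    peak_idx = max(range(len(rel)), key=rel.__getitem__)
    amp = rel[peak_idx]

    # Rise: search only the baseline-to-peak segment.
    rise_t, rise_v = times[:peak_idx + 1], rel[:peak_idx + 1]
    t10 = crossing_time(rise_t, rise_v, 0.1 * amp, rising=True)
    t90 = crossing_time(rise_t, rise_v, 0.9 * amp, rising=True)

    # Decay: search from the peak onward for the 50% level.
    t50 = crossing_time(times[peak_idx:], rel[peak_idx:], 0.5 * amp, rising=False)

    rise = None if t10 is None or t90 is None else t90 - t10
    decay = None if t50 is None else t50 - times[peak_idx]
    return amp, rise, decay
```

      Because events of opposite polarity can be analyzed by inverting the data, the sketch only handles upward deflections; a decay time constant from a fit to the event average would need a separate fitting step that is not shown here.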

      (3) The interface and capabilities of miniML appear quite similar to Mini Analysis, the free software that many in the field currently use. While the ability and flexibility for users to adapt and adjust miniML for their own uses/needs using Python programming is a clear potential advantage, can the authors comment, or better yet, demonstrate, whether there is any advantage for researchers to use miniML over Mini Analysis or SimplyFire if they just need the standard analyses?

      Following the reviewer’s suggestion, we developed a graphical user interface (GUI) for miniML to enhance its usability (Figure 2—figure supplement 2), which is provided on the GitHub repository. Our comprehensive benchmark analysis demonstrated that miniML outperforms existing tools such as MiniAnalysis and SimplyFire. The main advantages are (i) increased reliability of results, which eliminates the need for visual inspection; (ii) fast runtime and easy automation; (iii) superior detection performance as demonstrated by higher recall in both synthetic and real data; (iv) open-source Python-based design. We believe that these advantages make miniML a valuable tool for researchers recording various types of synaptic events, offering a more efficient and reliable solution compared to existing methods.

      (4) Additional utilities for miniML: The authors show miniML can quantify miniature electrophysiological events both current and voltage clamp, as well as optical glutamate transients using iGluSnFR. As the authors mention in the discussion, the same approach could, in principle, be used to quantify evoked (EPSC/EPSP) events using electrophysiology, Ca2+ events (using GCaMP), and AP waveforms using voltage indicators like ASAP4. While I don’t think it is reasonable to ask the authors to generate any new experimental data, it would be great to see how miniML performs when analysing data from these approaches, particularly to quantify evoked synaptic events and/or Ca2+ (ideally postsynaptic Ca2+ signals from miniature events, as the Drosophila NMJ have developed nice approaches).

      In the revised manuscript, we have extended the application examples of miniML. We applied miniML to detect mEPSPs recorded with the novel voltage-sensitive indicator ASAP5 (Figure 9 and Figure 9—figure supplement 1). We performed simultaneous recordings of membrane voltage through electrophysiology and ASAP5 voltage imaging in rat cultured neurons at physiological temperature. Data were analyzed using miniML, with electrophysiology data being used as ground-truth for assessing detection performance in imaging data. Our results demonstrate that miniML robustly detects mEPSPs in current-clamp, and can localize corresponding transients in imaging data. Furthermore, we observed that miniML performs better than template matching and deconvolution on ASAP5 imaging data (Figure 9 and Figure 9—figure supplement 2).

      Reviewer 2 (Public Review):

      This paper presents miniML as a supervised method for the detection of spontaneous synaptic events. Recordings of such events are typically of low SNR, where state-of-the-art methods are prone to high false positive rates. Unlike current methods, training miniML requires neither prior knowledge of the kinetics of events nor the tuning of parameters/thresholds.

      The proposed method comprises four convolutional networks, followed by a bi-directional LSTM and a final fully connected layer which outputs a decision event/no event per time window. A sliding window is used when applying miniML to a temporal signal, followed by an additional estimation of events’ time stamps. miniML outperforms current methods for simulated events superimposed on real data (with no events) and presents compelling results for real data across experimental paradigms and species.

      Strengths:

      The authors present a pipeline for benchmarking based on simulated events superimposed on real data (with no events). Compared to five other state-of-the-art methods, miniML leads to the highest detection rates and is most robust to specific choices of threshold values for fast or slow kinetics. A major strength of miniML is the ability to use it for different datasets. For this purpose, the CNN part of the model is held fixed and the subsequent networks are trained to adapt to the new data. This Transfer Learning (TL) strategy reduces computation time significantly and more importantly, it allows for using a substantially smaller data set (compared to training a full model) which is crucial as training is supervised (i.e. uses labeled examples).

      Weaknesses:

      The authors do not indicate how the specific configuration of miniML was set, i.e. number of CNNs, units, LSTM, etc. Please provide further information regarding these design choices, whether they were based on similar models or if chosen based on performance.

      The data for the benchmark system was augmented with equal amounts of segments with/without events. Data augmentation was undoubtedly crucial for successful training.

      (1) Does a balanced dataset reflect the natural occurrence of events in real data? Could the authors provide more information regarding this matter?

      In a given recording, the event frequency determines the ratio of event-containing vs. nonevent-containing data segments. Whereas many synapses have a skew towards non-events, high event frequencies as observed, e.g., in pyramidal cells or Purkinje neurons, can shift the ratio towards event-containing data.

      For model training, we extracted data segments from mEPSC recordings in cerebellar granule cells, which have a low mEPSC frequency (about 0.2 Hz, Delvendahl et al. 2019). Unbalanced training data may complicate model training (Drummond and Holte 2003; Prati et al. 2009; Tyagi and Mittal 2020). We therefore decided to balance the training dataset for miniML by down-sampling the majority class (i.e., non-event segments), so that the final datasets for model training contained roughly equal amounts of events and non-events.
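
      The down-sampling step described above can be illustrated with a minimal sketch. It assumes segments are stored as plain lists with a binary label (1 = event, 0 = no event); the function name `balance_by_downsampling` and the toy data are hypothetical and are not taken from the miniML code base. The sketch produces exactly equal class sizes, whereas the response describes the final datasets as only roughly balanced.

```python
import random


def balance_by_downsampling(segments, labels, seed=0):
    """Return a class-balanced copy of (segments, labels).

    labels: 1 = segment contains an event, 0 = no event.
    The larger class is randomly down-sampled to the size of the smaller one.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    event_idx = [i for i, y in enumerate(labels) if y == 1]
    noevent_idx = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(event_idx), len(noevent_idx))
    # Draw n indices from each class without replacement, then shuffle.
    keep = rng.sample(event_idx, n) + rng.sample(noevent_idx, n)
    rng.shuffle(keep)
    return [segments[i] for i in keep], [labels[i] for i in keep]


# Toy data: 100 segments, only every tenth one contains an event.
segments = [[float(i)] * 4 for i in range(100)]
labels = [1 if i % 10 == 0 else 0 for i in range(100)]
bal_segments, bal_labels = balance_by_downsampling(segments, labels)
```

      With a low event frequency, as in the granule cell recordings mentioned above, most non-event segments are discarded, so the size of the balanced set is set by the number of event-containing segments.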

      (2) Please provide a more detailed description of this process as it would serve users aiming to use this method for other sub-fields.

      We thank the reviewer for raising this point. In the revised manuscript, we present a systematic analysis of the impact of imbalanced training data on model training (Figure 1—figure supplement 2). In addition, we have revised the description of model training and data augmentation in the Methods section (Methods, Training data and annotation).

      The benchmarking pipeline is indeed valuable and the results are compelling. However, the authors do not provide comparative results for miniML for real data (Figures 4-8). TL does not apply to the other methods. In my opinion, presenting the performance of other methods, trained using the smaller dataset would be convincing of the modularity and applicability of the proposed approach.

      Quantitative comparison of synaptic detection methods on real-world data is challenging because the lack of ground-truth data prevents robust, quantitative analyses. Nevertheless, we compared miniML to common template-based and finite-threshold based methods on four different types of synapses. We noted that miniML generally detects more events, whereas other methods are susceptible to false-positives (Figure 4—figure supplement 1). In addition, we analyzed the performance of miniML on voltage imaging data (Figure 9). Simultaneous recordings of electrophysiological and imaging data allowed a quantitative comparison of detection methods in this dataset. Our results demonstrate that miniML provides higher recall for optical minis recorded using ASAP5 (Figure 9 and Figure 9—figure supplement 2; F1 score, Cohen’s d 1.35 vs. template matching and 5.1 vs. deconvolution).
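
      For readers unfamiliar with the metrics quoted above, the following sketch shows one common way to compute an F1 score from detected and ground-truth event times, and Cohen's d from two sets of scores. The matching tolerance, the greedy one-to-one matching, and the pooled-standard-deviation form of Cohen's d are assumptions made for illustration; the response does not state which variants the authors used, and all numbers below are invented toy values.

```python
import math
import statistics


def count_matches(detected, truth, tolerance):
    """Greedy one-to-one matching of sorted event times within a tolerance."""
    detected, truth = sorted(detected), sorted(truth)
    i = j = tp = 0
    while i < len(detected) and j < len(truth):
        if abs(detected[i] - truth[j]) <= tolerance:
            tp += 1
            i += 1
            j += 1
        elif detected[i] < truth[j]:
            i += 1  # unmatched detection (false positive)
        else:
            j += 1  # unmatched true event (false negative)
    return tp


def f1_score(detected, truth, tolerance):
    """Harmonic mean of precision and recall for event detection."""
    tp = count_matches(detected, truth, tolerance)
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(truth) if truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


# Toy example: event times in samples, tolerance of 5 samples.
toy_f1 = f1_score([102, 251, 500, 618], [100, 250, 400, 620], tolerance=5)
toy_d = cohens_d([0.9, 0.8, 0.85], [0.6, 0.7, 0.65])
```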

      Impact:

      Accurate detection of synaptic events is crucial for the study of neural function. miniML has a great potential to become a valuable tool for this purpose as it yields highly accurate detection rates, it is robust, and is relatively easily adaptable to different experimental setups.

      Additional comments:

      Line 73: the authors describe miniML as "parameter-free". Indeed, miniML does not require the selection of pulse shape, rise/fall time, or tuning of a threshold value. Still, I would not call it "parameter-free" as there are many parameters to tune, starting with the number of CNNs, and number of units through the parameters of the NNs. A more accurate description would be that as an AI-based method, the parameters of miniML are learned via training rather than tuned by the user.

      We agree that a deep learning model is not parameter-free, and this term may be misleading. We have therefore changed this sentence in the introduction as follows: "The method is fast, robust to threshold choice, and generalizable across diverse data types [...]"

      Line 302: the authors describe miniML as "threshold-independent". The output trace of the model has an extremely high SNR so a threshold of 0.5 typically works. Since a threshold is needed to determine the time stamps of events, I think a better description would be "robust to threshold choice".

      To detect event localizations, a peak search is performed on the model output, which uses a minimum peak height parameter (or threshold). Extreme values for this parameter do indeed have a small impact on detection performance (Figure 3J). We have changed the description in the introduction and discussion according to the reviewer’s suggestion.
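
      The role of the minimum peak height can be illustrated with a small pure-Python peak search. This is a stand-in written for this example, not the authors' implementation; the function name `find_event_peaks`, the `min_distance` parameter, and the toy prediction trace are hypothetical.

```python
def find_event_peaks(prediction, min_height=0.5, min_distance=1):
    """Return indices of local maxima with value >= min_height.

    Peaks closer than `min_distance` samples to an already accepted,
    higher peak are dropped. Illustrative helper only.
    """
    candidates = [
        i for i in range(1, len(prediction) - 1)
        if prediction[i] >= min_height
        and prediction[i] > prediction[i - 1]
        and prediction[i] >= prediction[i + 1]
    ]
    accepted = []
    # Visit the highest candidates first so they win distance conflicts.
    for i in sorted(candidates, key=lambda k: prediction[k], reverse=True):
        if all(abs(i - j) >= min_distance for j in accepted):
            accepted.append(i)
    return sorted(accepted)


# Toy model output: two clear events and one low-confidence bump.
trace = [0.0, 0.1, 0.9, 0.2, 0.0, 0.3, 0.1, 0.0, 0.6, 0.98, 0.7, 0.0]
peaks_default = find_event_peaks(trace, min_height=0.5)  # [2, 9]
peaks_lenient = find_event_peaks(trace, min_height=0.2)  # [2, 5, 9]
peaks_strict = find_event_peaks(trace, min_height=0.95)  # [9]
```

      In this toy trace only extreme settings of `min_height` change the result, which is the sense in which a high-SNR model output makes detection robust to threshold choice rather than independent of it.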

      Reviewer 3 (Public Review):

      miniML is presented as a novel supervised deep learning-based method for detecting and analyzing spontaneous synaptic events. The authors demonstrate the advantages of using their methods in comparison with previous approaches. The possibility to train the architecture on different tasks using transfer learning approaches is also an added value of the work. There are some technical aspects that would be worth clarifying in the manuscript:

      (1) LSTM Layer Justification: Please provide a detailed explanation for the inclusion of the LSTM layer in the miniML architecture. What specific benefits does the LSTM layer offer in the context of synaptic event detection?

      Our model design choice was inspired by similar approaches in the literature (Donahue et al. 2017; Islam et al. 2020; Passricha and Aggarwal 2019; Tasdelen and Sen 2021; Wang et al. 2020). Convolutional and recurrent neural networks are often combined for time-series classification problems as they allow learning spatial and temporal features, respectively. Combining the strengths of both network architectures can thus help improve the classification performance. Indeed, a CNN-LSTM architecture proved to be superior in both training accuracy and detection performance (Figure 1—figure supplement 2). Further, this architecture requires fewer free parameters than comparable model designs using fully connected layers instead. The revised manuscript shows a comparison of different model architectures (Figure 1—figure supplement 2), and we added the following description to the text (Methods, Deep learning model architecture):

      "The combination of convolutional and recurrent neural network layers helps to improve the classification performance for time-series data. In particular, LSTM layers allow learning temporal features."

      (2) Temporal Resolution: Can you elaborate on the reasons behind the lower temporal resolution of the output? Understanding whether this is due to specific design choices in the model, data preprocessing, or post-processing will clarify the nature of this limitation and its impact on the analysis.

      When running inference on a continuous recording, we choose to use a sliding window approach with stride. Therefore, the model output has a lower temporal resolution than the raw data, which is determined by the stride length (i.e., how many samples to advance the sliding window). While using a stride is not required, it significantly reduces inference time (cf. Figure 2—figure supplement 1). We recommend a stride of 20 samples, which does not impact the detection of events. Any subsequent quantification of events (amplitude, area, risetimes, etc.) is performed on raw data. Based on the reviewer’s comment, we have adapted the code to resample the prediction trace to the sampling rate of the original data. This maintains temporal precision and avoids confusion.

      The Methods now include the following statement:

      "To maintain temporal precision, the prediction trace is resampled to the sampling frequency of the raw data."
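
      A minimal sketch of strided sliding-window inference followed by resampling of the prediction trace is given below. The window size, the assignment of each prediction to the start index of its window, and the use of linear interpolation are assumptions made for illustration; the response only states that a stride is used and that the prediction trace is resampled to the sampling frequency of the raw data. The function names and window scores are hypothetical.

```python
def strided_window_starts(n_samples, window_size, stride):
    """Start indices of all full windows when advancing by `stride` samples."""
    return list(range(0, n_samples - window_size + 1, stride))


def resample_linear(values, positions, n_samples):
    """Linearly interpolate `values`, located at strictly increasing sample
    indices `positions`, onto every sample 0..n_samples-1.
    Samples outside the covered range take the nearest edge value.
    """
    out = []
    j = 0
    for t in range(n_samples):
        if t <= positions[0]:
            out.append(values[0])
            continue
        if t >= positions[-1]:
            out.append(values[-1])
            continue
        while positions[j + 1] < t:
            j += 1
        x0, x1 = positions[j], positions[j + 1]
        w = (t - x0) / (x1 - x0)
        out.append(values[j] * (1 - w) + values[j + 1] * w)
    return out


# Toy recording of 200 samples, hypothetical 60-sample window, stride of 20.
starts = strided_window_starts(200, window_size=60, stride=20)
# Stand-in for per-window model outputs (one score per window).
window_scores = [0.0, 0.0, 0.2, 1.0, 0.2, 0.0, 0.0, 0.0]
prediction = resample_linear(window_scores, starts, 200)
```

      The strided output has one value per window (8 values here), while the resampled trace has one value per raw sample (200 values), so event positions found on it can be read directly in raw-data samples.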

      (3) Architecture optimization: how was the architecture CNN+LSTM optimized in terms of a number of CNN layers and size?

      We performed a Bayesian optimization over a defined range of hyperparameters in combination with empirical hyperparameter tuning. We now describe this in the Methods section as follows:

      "To optimise the model architecture, we performed a Bayesian optimisation of hyperparameters. Hyperparameter ranges were chosen for the free parameters of all layers. Optimisation was then performed with a maximum number of trials of 50. Models were evaluated using the validation dataset. Because higher number of free parameters tended to increase inference times, we then empirically tuned the chosen hyperparameter combination to achieve a trade-off between number of free parameters and accuracy."

      Recommendations For The Authors

      Reviewing Editor (Recommendations For The Authors):

      Overall suggestions to the authors:

      (1) Directly compare miniML with SimplyFire (which was not cited or discussed in the original manuscript), with both idealized and actual data. Discuss the pros/cons of each software.

      We have conducted an extensive comparison between miniML and SimplyFire using both simulated and actual experimental data. This analysis is now presented in the revised Figure 3, Figure 3—figure supplement 1, and Figure 4—figure supplement 1. In addition, we have included relevant citations for SimplyFire in our manuscript. These additions provide a more comprehensive and balanced view of the available tools in the field, positioning our work within the broader context of existing solutions.

      (2) Generate a better user interface akin to MiniAnalysis or SimplyFire.

      We thank the editor and reviewers for the suggestion to improve the user interface. We have created a user-friendly graphical user interface (GUI) for miniML that is available on our GitHub repository. This GUI is now showcased in Figure 2—figure supplement 2 of the manuscript. The new interface allows users to load and analyze data through an intuitive point-and-click system, visualize results in real-time, and adjust parameters easily without coding knowledge. We have incorporated user feedback to refine the interface and improve user experience. These improvements significantly enhance the accessibility of miniML, making it more user-friendly for researchers with varying levels of programming expertise.

      Reviewer 1 (Recommendations For The Authors):

      Related to point (1) of the Public Review, we have taken the liberty to compare electrophysiological data using miniAnalysis, SimplyFire, and miniML. In our comparison, we note the following in our experience:

      (1.1) In contrast to both SimplyFire and miniAnalysis, miniML does not currently have a user-friendly interface where the user can directly control or change the parameters of interest, nor does miniML have a user control center, so the user cannot simply type or select the mini manually. Rather, if any parameter needs to be changed, the user needs to read, understand, and change the original source code to generate the preferred change. This level of "activation energy" and required user coding expertise in computer science, which many researchers do not have, renders miniML much less accessible when directly compared to SimplyFire and miniAnalysis. Hence, unless miniML’s interface can be made more user-friendly, this is a major disadvantage, especially when compared to SimplyFire, which has many of the same features as miniML but with a much easier interface and user controls.

      As suggested by the reviewer, we have created a graphical user interface (GUI) for miniML. The GUI allows easy data loading, filtering, analysis, event inspection, and saving of results without the need for writing Python code. Figure 2—figure supplement 2 illustrates the typical workflow for event analysis with miniML using the GUI and a screenshot of the user interface. Code to use miniML via the GUI is now included in the project’s GitHub repository. The GUI provides a simple and intuitive way to analyze synaptic events, whereas running miniML as Python script allows for more customization and a high degree of automatization.

      (1.2) We compared electrophysiological miniature events between miniML, SimplyFire, and miniAnalysis. All three achieved similar mean amplitudes in "wild type" conditions, and conditions in which mini events were enhanced and diminished, so the overall means and utilities are similar, with miniML and SimplyFire being preferred given the flexibility and much faster analysis. We did note a few differences, however. SimplyFire tends to capture a higher number of mini-events than miniML, especially in conditions of diminished mini amplitude (e.g., miniML found 76 events, while SimplyFire found 587). The mean amplitudes, however, were similar. It seems that in data with low SNR, SimplyFire captures many more events as real minis that are probably noise, while miniML is more selective, which might be an advantage of miniML. That being said, we found SimplyFire to be superior in many respects, not least of which are the user interface and experience.

      We appreciate the reviewer’s thorough comparison of miniML, SimplyFire, and MiniAnalysis. While we acknowledge SimplyFire’s user-friendly interface, our study highlights several advantages of AI-based event analysis over conventional algorithmic approaches. Our updated benchmark analysis revealed better detection performance of miniML compared with SimplyFire (revised Figure 3), which had similar performance to deconvolution. As already noted by the reviewer, high false positive rates are a major issue of the SimplyFire approach. Although a minimum amplitude cutoff can partially resolve this problem, detection performance is highly sensitive to threshold setting (revised Figure 3). Another apparent disadvantage of SimplyFire is its relatively slow runtime (Figure 3—figure supplement 1). Finally, we have enhanced miniML’s accessibility by providing a graphical user interface that is easy to use and provides additional functionality.

      Some technical comments:

      (1) Improvements to the dependency versions of miniML: The versions of Python and TensorFlow used in this study and in the GitHub repository need to be clarified. We used Python version 3.8.19 to load the miniML model. However, with Python versions >=3.9, as described on the GitHub repository provided, it is difficult to install a matching h5py version. It is also inaccurate to recommend Python >=3.9, because the TensorFlow version for this framework needs to be around 2.13, whereas with Python >=3.10 only TensorFlow version 2.16 is offered for download. Therefore, the dependency versions need to be specified on GitHub so that researchers can access the model and use the entire work.

      Thank you for highlighting this issue. We have now included specific version numbers in the requirements to avoid version conflicts and to ensure proper functioning of the code.

      (2) Due to the intrinsic characteristics of the trained model, every model is only suitable for analyzing data with similar attributes. It is hard for researchers without a strong computer science background to train a new model themselves for their specific data. Therefore, it would be preferred if there were more available transfer learning models on GitHub accessible for researchers to adapt to their data.

      We would like to thank the reviewer for this feedback. Trained models (such as the default model) can often be used on different data (see, e.g., Figure 4, where data from four distinct synaptic preparations were analyzed with the base model, and Figure 5—figure supplement 1). However, changes in event waveform and/or noise characteristics may necessitate transfer learning to obtain optimal results with miniML. We have revised the description and tutorial for model training on the project’s GitHub repository to provide more guidance in this process. In addition, we now provide a tutorial on how to use existing models on out-of-sample data with distinct kinetics, using resampling. We hope these updates to the miniML GitHub repository will facilitate the use of the method.

      Following the suggestion by the reviewer, we have provided the transfer learning models used for the manuscript on the project’s GitHub repository to increase the number of available machine learning models for event detection. In addition, users of miniML are encouraged to supply their custom models. We hope that this will facilitate model exchange between laboratories in the future.

      Reviewer 3:

      I congratulate all authors for the convincing demonstration of their methodology, I do not have additional recommendations.

      We would like to thank the reviewer for the positive assessment of our manuscript.

      References

      Delvendahl, I., Kita, K., & Müller, M. (2019). Rapid and sustained homeostatic control of presynaptic exocytosis at a central synapse. Proceedings of the National Academy of Sciences, 116(47), 23783–23789. https://doi.org/10.1073/pnas.1909675116

      Donahue, J., Hendricks, L. A., Rohrbach, M., Venugopalan, S., Guadarrama, S., Saenko, K., & Darrell, T. (2017). Long-term recurrent convolutional networks for visual recognition and description. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4), 677–691. https://doi.org/10.1109/tpami.2016.2599174

      Drummond, C., & Holte, R. C. (2003). C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling. https://api.semanticscholar.org/CorpusID:204083391

      Islam, M. Z., Islam, M. M., & Asraf, A. (2020). A combined deep CNN-LSTM network for the detection of novel coronavirus (COVID-19) using x-ray images. Informatics in Medicine Unlocked, 20, 100412. https://doi.org/10.1016/j.imu.2020.100412

      Passricha, V., & Aggarwal, R. K. (2019). A hybrid of deep CNN and bidirectional LSTM for automatic speech recognition. Journal of Intelligent Systems, 29(1), 1261–1274. https://doi.org/10.1515/jisys-2018-0372

      Prati, R. C., Batista, G. E. A. P. A., & Monard, M. C. (2009). Data mining with imbalanced class distributions: Concepts and methods. Indian International Conference on Artificial Intelligence. https://api.semanticscholar.org/CorpusID:16651273

      Tasdelen, A., & Sen, B. (2021). A hybrid CNN-LSTM model for pre-miRNA classification. Scientific Reports, 11(1). https://doi.org/10.1038/s41598-021-93656-0

      Tyagi, S., & Mittal, S. (2020). Sampling approaches for imbalanced data classification problem in machine learning. In P. K. Singh, A. K. Kar, Y. Singh, M. H. Kolekar, & S. Tanwar (Eds.), Proceedings of ICRIC 2019 (pp. 209–221). Springer International Publishing.

      Wang, H., Zhao, J., Li, J., Tian, L., Tu, P., Cao, T., An, Y., Wang, K., & Li, S. (2020). Wearable sensor-based human activity recognition using hybrid deep learning techniques. Security and Communication Networks, 2020, 1–12. https://doi.org/10.1155/2020/2132138

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors studied how hippocampal connectivity gradients change across the lifespan, and how these relate to memory function and neurotransmitter distributions. They observed that older age was associated with less distinct transitions, and they observed an association between gradient de-differentiation and cognitive decline.

      This is overall an innovative and interesting study to assess gradient alterations across the lifespan and its associations to cognition.

      The paper is well-written, and the methods appear sound and thoughtful. There are several strengths, including the inclusion of two independent cohorts, the use of gradient mapping and alignment techniques, and an overall sound statistical and analysis framework. There are several areas for potential improvements in the paper, and these are listed below:

      We thank the Reviewer for their positive assessment and summary of our work. We address each of the Reviewer’s comments below, and outline the revisions we have made to the manuscript based on the Reviewer’s suggestions.

      (1) The reported D1 associations appear a bit post-hoc in the current work and I was unclear why the authors specifically focussed on dopamine here, as other transmitter systems are similarly present at the level of the hippocampus and implicated in aging.

      Other neurotransmitter systems may indeed be relevant in the context of hippocampal function in aging. In this study, however, we included a specific research question about the DA D1 receptor (D1DR) based on previous research 1) emphasizing the role of DA neuromodulation in maintaining functional network segregation in aging to support cognition (Pedersen et al., 2023), 2) reporting heterogeneous distribution of DA markers across the hippocampus, supporting efficient modulation of distinct behaviors (Dubovyk & Manahan-Vaughan, 2019; Edelmann & Lessmann, 2018; Gasbarri et al., 1994; Kempadoo et al., 2016), and 3) demonstrating the spatial distribution of D1DRs as varying across neocortex along a unimodal-transmodal gradient (Pedersen et al., 2024). To which degree this variation might be reflected in cortico-hippocampal connectivity, however, remained to be investigated. As such, one of the study’s specific aims was to evaluate the spatial distribution of D1DRs as a molecular correlate of the hippocampus’ functional organization. Importantly, we were interested in mapping associations between individual differences in the organization of connectivity and D1DRs. This was uniquely enabled by utilizing the DyNAMiC sample, as it includes structural and functional MRI data in combination with D1DR PET in the same individuals across the adult lifespan (n=180). However, after observing significant spatial correspondence between functional organization and D1DR expressed by the second hippocampal gradient (G2), we did indeed perform complementary analyses with group-averaged data of additional dopamine markers (D2DR from a subsample of our participants, as well as DAT and FDOPA from open sources) to test the generalizability of the original finding. Taken together, the original analyses based on subject-level data and complementary group-level analyses provided support for the interpretation of G2 as a dopaminergic mode.

      We have updated the manuscript to clarify the focus on the D1 receptor and the contribution of including additional DA markers.

      Updated paragraph in the Introduction, pages 5-6:

      “Dopamine (DA) is one of the most important modulators of hippocampus-dependent function(47,48), and influences the brain’s functional architecture through enhancing specificity of neuronal signaling(49). Consistently, there is a DA-dependent aspect of maintained functional network segregation in aging which supports cognition(50). Animal models suggest heterogeneous patterns of DA innervation(51,52) and postsynaptic DA receptors(53), across both transverse and longitudinal hippocampal axes, likely allowing for separation between DA modulation of distinct hippocampus-dependent behaviors(47). Moreover, the human hippocampus has been linked to distinct DA circuits on the basis of long-axis variation in functional connectivity with midbrain and striatal regions(54,55). Taken together with recent findings revealing a unimodal-transmodal organization of the most abundantly expressed DA receptor subtype, D1 (D1DR), across cortex(56), we tested the hypothesis that the organization of hippocampal-neocortical connectivity partly reflects the underlying distribution of hippocampal DA receptors, predicting predominant spatial correspondence for any hippocampal gradient conveying a unimodal-transmodal pattern across cortex.”

      Updated sections in the Results, page 13-14:

      “Our next aim was to investigate to which extent the distribution of hippocampal DA D1 receptors (D1DRs), measured by [11C]SCH23390 PET in the DyNAMiC(58) sample, may serve as a molecular correlate of the hippocampus’ functional organization.”

      “Complementary analyses were then conducted to further evaluate G2 as a dopaminergic hippocampal mode by utilizing additional DA markers at group-level.”

      Moreover, the authors may be aware that multiple PET tracers are somewhat challenged in the mesiotemporal region. Is this the case for the D1 receptor as well? The hippocampus is a small and complex structure, and PET more of a low res technique so one would want to highlight and discuss the limitations of the correlations with PET maps here and/or evaluate whether the analysis adds necessary findings to the study.

      We thank the Reviewer for raising this point. The lower resolution of PET is indeed a relevant aspect to consider when quantifying D1DR availability in the hippocampus, even though previous research indicates high test-retest reliability of [11C]SCH23390 PET measurement in this region (Kaller et al., 2017). We have now elaborated on PET limitations in the Discussion of the revised manuscript.

      In our study, we made efforts to reduce potential partial volume effects (PVE) by correcting our PET data, and tested spatial associations between our functional gradients and D1DR maps using trend-surface modelling (TSM), rather than through voxel-wise comparisons. This allowed us to evaluate the spatial correspondence between functional connectivity and D1DRs at a level of spatial trends, estimated using TSM models computed at increasing levels of complexity. The results showed consistent spatial overlap between G2 and D1DRs across these models, that is, across spatial trends described at coarser-to-finer scales. Furthermore, this was replicated across several DA markers with PET and SPECT data from independent samples.
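
      To make the notion of trend-surface models of increasing complexity concrete, the sketch below fits a polynomial trend surface to a toy 3D map by ordinary least squares. It assumes a polynomial basis in x, y, and z without cross terms and solves the normal equations directly; whether the TSM implementation used in the study has this exact form is not stated in this response, and the function names and toy data are hypothetical.

```python
def tsm_basis(coords, order):
    """Polynomial terms per voxel: x, y, z, x^2, y^2, z^2, ... up to `order`.
    Cross terms are omitted (an assumption of this sketch)."""
    return [[c ** p for p in range(1, order + 1) for c in xyz] for xyz in coords]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_tsm(coords, values, order):
    """Least-squares trend-surface coefficients: [intercept, x, y, z, x^2, ...]."""
    X = [[1.0] + row for row in tsm_basis(coords, order)]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * v for r, v in zip(X, values)) for i in range(p)]
    return solve_linear(XtX, Xty)


# Toy map on a 3x3x3 grid with a known spatial trend.
grid = [-1.0, 0.0, 1.0]
coords = [(x, y, z) for x in grid for y in grid for z in grid]
values = [2.0 + 3.0 * x - 1.0 * y + 0.5 * z ** 2 for x, y, z in coords]
coeffs = fit_tsm(coords, values, order=2)
```

      In a setting like the one described above, one such coefficient vector per participant and per modality would summarize the map's topography, and a higher `order` corresponds to a finer-scale description of the spatial trend.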

      Taken together, we agree with the Reviewer that the spatial correspondence observed between G2 and hippocampal D1DRs should be interpreted in the context of resolution-related limitations inherent to PET imaging. However, we strongly believe that our DA analyses offer valuable insight to the molecular underpinnings of hippocampal functional organization.

      Updated paragraph in the Discussion, pages 25-26:

      “We discovered that G2, specifically, manifested organizational principles shared among function, behavior, and neuromodulation. Meta-analytical decoding reproduced a unimodal-associative axis across G2 (Figure 3B), and analyses in relation to the distribution of D1DRs – which vary across cortex along a unimodal-transmodal axis(76,77) – demonstrated topographic correspondence both at the level of individual differences and across the group. It should, however, be acknowledged that PET imaging in the hippocampus is associated with resolution-related limitations, although previous research indicates high test-retest reliability of [11C]SCH23390 PET to quantify D1DR availability in this region(78). As such, mapping the distribution of hippocampal D1DRs at a fine spatial scale remains challenging, and replication of our results in terms of overlap with G2 is needed in independent samples. Here, we evaluated the observed spatial overlap between G2 topography and D1DRs across multiple TSM model orders, showing correspondence between modalities from simple to more complex parameterizations of their spatial properties. Topographic correspondence was additionally observed between G2 and other DA markers from independent datasets (Figure 3B), suggesting that G2 may constitute a mode reflecting a dopaminergic phenotype, which contributes to the currently limited understanding of its biological underpinnings.”

      From my (perhaps somewhat biased) perspective, it might be valuable to instead or in addition look at measures of hippocampal microstructure and how these relate to the functional aging effects. This could be done, if available, using data from the same subjects (eg based on quantitative MRI contrasts and/or structural MRI) and/or using contextualization findings as implemented in eg hippomaps.readthedocs.io

      We thank the Reviewer for this suggestion. We performed additional analyses investigating the spatial overlap between our connectivity gradients and estimates of hippocampal microstructure, computed as the ratio of T1- over T2-weighted (T1w/T2w) images (Glasser & Von Essen, 2011; vos de Wael et al., 2018). Analyses of spatial correspondence then followed the TSM-based method used to test the spatial overlap between functional connectivity gradients and D1DR distribution. Applying TSM to the T1w/T2w image computed for each participant yielded subject-level model parameters describing microstructure topography, which were then entered as predictors of connectivity topography in multivariate GLMs (separate models for each gradient and hemisphere, 6 models in total).

      Analyses revealed that microstructure of the right hippocampus significantly predicted gradient topography of right-hemisphere G1 (F = 1.325, p = 0.034), while no other links between connectivity gradients and microstructure emerged as significant (F = 0.930-1.184, ps = 0.706-0.079).

      These results, suggesting an association along the anteroposterior axis, deviate from previous findings linking hippocampal microstructure to G3-like, medial-lateral, connectivity organization (vos de Wael et al., 2018). As we believe that comprehensive analyses of our gradients in relation to microstructure across the lifespan would be best addressed in future work, we have not included these analyses of microstructure in the revised manuscript.

      (2) Can the authors clarify why they did not replicate based on cohorts that are more widely used in the community and open access, such as CamCAN and/or HCP-Aging? It might connect their results with other studies if an attempt was made to also show that findings persist in either of these repositories.

      We agree with the Reviewer that replication in samples such as CamCAN and/or HCP-Aging would provide valuable opportunities to connect our findings with those of other studies using those datasets. Here, we included the Betula dataset (Nilsson et al., 2004) as our replication sample, as it was immediately available to us, included a large sample of adults in a comparable age, and a word recall episodic memory task closely aligned with the one included in DyNAMiC. Importantly, leveraging the Betula dataset as our replication sample allows us to link our findings to a wide range of previous studies central to the understanding of neurocognitive aging in general, and hippocampal aging in particular (Nyberg, 2017; Nyberg et al., 2020). Betula is a large longitudinal project that has been tracking individuals since 1988, and is part of the National E-infrastructure for Aging Research (NEAR: www.near-aging.se), through which data from several Swedish studies are made available to both national and international researchers. While we acknowledge the value of extending replication efforts to datasets like CamCAN and HCP-Aging, we emphasize the significant contribution of having replicated our connectivity gradients in the Betula dataset.

      (3) The authors applied TSM and related these parameters to topographic changes in the gradients. I was wondering whether and how such an approach controls for autocorrelation present in both the PET map and gradients. Could the authors clarify?

      The Reviewer raises an important topic in spatial autocorrelation. The TSM approach used to parameterize the topography of the functional gradients and D1DR distribution, and to test the spatial correspondence between modalities, did not include any specific method to control for autocorrelation. Here, we highlight two aspects of our study in relation to this point. First, we demonstrated in the Supplementary information (S. Figure 4) that autocorrelation induced by spatial smoothing likely has limited effects on overall gradient topography and the ability of TSM parameters to capture meaningful inter-individual differences in terms of age. Second, in the case of spatial overlap effects being significantly impacted by autocorrelation, we would expect the association between right-hemisphere G2 and D1DR topography to similarly emerge for G2 in the left hemisphere. The absence of such an association may speak to a limited effect of spatial autocorrelation.

      (4) The TSM approach quantifies the gradients in terms of x/y/z direction in a cartesian coordinate system. Wouldn't a shape intrinsic coordinate system in the hippocampus also be interesting, and perhaps even be more efficient to look at here (see eg DeKraker 2022 eLife or Paquola et al 2020 eLife)?

      This is a very relevant question and we appreciate the Reviewer’s suggestion. We recognize that there may be several benefits associated with adopting a shape-intrinsic coordinate system when characterizing effects in the hippocampus, given its curved/folded anatomy. Approaches like the ones adopted in DeKraker et al., 2022 and Paquola et al., 2020 utilize geodesic coordinate frameworks to represent the hippocampus in surface space, enabling mapping of connectivity onto the hippocampal surface while respecting its inherent curvature and topology. We anticipate that quantifying gradients within such a framework would especially benefit identification of connectivity change across the hippocampal surface relative to reference points such as subfield boundaries, while minimizing effects of interindividual differences in hippocampal shape and folding. In our study, hippocampal gradients and their associated cortical patterns were computed in volumetric space, with TSM subsequently used to parameterize the change in connectivity along these gradients. This indeed yields a description of connectivity change within a coordinate system less specific to hippocampal anatomy, but may favor generalizability and integration with previous gradient findings within and beyond the hippocampus (e.g., Przeździk et al., 2019; Tian et al., 2020; Katsumi et al., 2023; Navarro-Schröder et al., 2015), as well as connections with broader neuroimaging frameworks through techniques such as meta-analytical decoding. In our view, the different coordinate frameworks offer complementary insights into hippocampal organization, and while we have opted to not undertake novel analyses to explore our gradients within a geodesic coordinate system for the purposes of this paper, we recognize the importance of such evaluation of our gradients in future analyses. We have made updates to the Discussion in the revised manuscript on this topic (pages 23-24):

      “Greater anatomical specificity, with more precise characterization of connectivity in relation to subfield boundaries while minimizing effects of inter-individual differences in hippocampal shape and folding, might be achieved by adopting techniques implementing a geodesic coordinate system to represent effects within the hippocampus(68,69).”

      Reviewer #2 (Public Review):

      Summary:

      This paper derives the first three functional gradients in the left and right hippocampus across two datasets. These gradient maps are then compared to dopamine receptor maps obtained with PET, associated with age, and linked to memory. Results reveal links between dopamine maps and gradient 2, age with gradients 1 and 2, and memory performance.

      Strengths:

      This paper investigates how hippocampal gradients relate to aging, memory, and dopamine receptors, which are interesting and important questions. A strength of the paper is that some of the findings were replicated in a separate sample.

      Weaknesses:

      The paper would benefit from added clarification on the number of models/comparisons for each test. Furthermore, it would be helpful to clarify whether or not multiple comparison correction was performed and - if so - what type or - if not - to provide a justification. The manuscript would furthermore benefit from code sharing and clarifying which results did/did not replicate.

      We thank the Reviewer for their positive assessment and suggestions regarding further clarifications. We have addressed the Reviewer’s comments in a point-by-point manner under the “Recommendations for the authors” section.

      Reviewer #3 (Public Review):

      Summary:

      In this study, the authors analyzed the complex functional organization of the hippocampus using two separate adult lifespan datasets. They investigated how individual variations in the detailed connectivity patterns within the hippocampus relate to behavioral and molecular traits. The findings confirm three overlapping hippocampal gradients and reveal that each is linked to established functional patterns in the cortex, the arrangement of dopamine receptors within the hippocampus, and differences in memory abilities among individuals. By employing multivariate data analysis techniques, they identified older adults who display a hippocampal gradient pattern resembling that of younger individuals and exhibit better memory performance compared to their age-matched peers. This underscores the behavioral importance of maintaining a specific functional organization within the hippocampus as people age.

      Strengths:

      The evidence supporting the conclusions is overall compelling, based on a unique dataset, rich set of carefully unpacked results, and an in-depth data analysis. Possible confounds are carefully considered and ruled out.

      Weaknesses:

      No major weaknesses. The transparency of the statistical analyses could be improved by explicitly (1) stating what tests and corrections (if any) were performed, and (2) justifying the elected statistical approaches. Further, some of the findings related to the DA markers are borderline statistically significant and therefore perhaps less compelling but they line up nicely with results obtained using experimental animals and I expect the small effect sizes to be largely related to the quality and specificity of the PET data rather than the derived functional connectivity gradients.

      We thank the Reviewer for the thoughtful summary and positive assessment of our work. To increase transparency of the statistical analyses, we have in the revised manuscript added information regarding statistical tests and corrections for multiple comparisons. In the Results, p-values were reported at an uncorrected statistical threshold, and we have in the revised manuscript included the corresponding p-values adjusted for multiple comparisons using the Benjamini-Hochberg method to control the false discovery rate (FDR). Finally, in the revised manuscript, we have now elaborated on the potential limitations of our PET analyses and we include the updated paragraph below.

      Addition made to the Results section, page 13:

      “Individual maps of D1DR binding potential (BP) were also submitted to TSM, yielding a set of spatial model parameters describing the topographic characteristics of hippocampal D1DR distribution for each participant. D1DR parameters were subsequently used as predictors of gradient parameters in one multivariate GLM per gradient (in total 6 GLMs, controlled for age, sex, and mean FD). Results are reported with p-values at an uncorrected statistical threshold and p-values after adjustment for multiple comparisons using the Benjamini-Hochberg method to control the false discovery rate (FDR).”

      Addition made to the Results section, page 15:

      “Effects of age on gradient topography were assessed using multivariate GLMs including age as the predictor and gradient TSM parameters as dependent variables (controlling for sex and mean frame-wise displacement; FD). One model was fitted per gradient and hemisphere, each model including all TSM parameters belonging to a gradient (in total, 6 GLMs).”

      Addition made to the Results section, page 17:

      “Models were assessed separately for left and right hemispheres, across the full sample and within age groups, yielding eight hierarchical models in total. Results are reported with p-values at an uncorrected statistical threshold and p-values after FDR adjustment.”

      Updated paragraph in the Discussion, pages 25-26:

      “We discovered that G2, specifically, manifested organizational principles shared among function, behavior, and neuromodulation. Meta-analytical decoding reproduced a unimodal-associative axis across G2 (Figure 3B), and analyses in relation to the distribution of D1DRs – which vary across cortex along a unimodal-transmodal axis(76,77) – demonstrated topographic correspondence both at the level of individual differences and across the group. It should, however, be acknowledged that PET imaging in the hippocampus is associated with resolution-related limitations, although previous research indicates high test-retest reliability of [<sup>11</sup>C]SCH23390 PET to quantify D1DR availability in this region(78). As such, mapping the distribution of hippocampal D1DRs at a fine spatial scale remains challenging, and replication of our results in terms of overlap with G2 is needed in independent samples. Here, we evaluated the observed spatial overlap between G2 topography and D1DRs across multiple TSM model orders, showing correspondence between modalities from simple to more complex parameterizations of their spatial properties. Topographic correspondence was additionally observed between G2 and other DA markers from independent datasets (Figure 3B), suggesting that G2 may constitute a mode reflecting a dopaminergic phenotype, which contributes to the currently limited understanding of its biological underpinnings.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Please see the comments in the public review.

      We thank the Reviewer for their comments and recommendations, and have addressed them in the “Public review” section.

      Reviewer #2 (Recommendations For The Authors):

      (1) All statistical analyses are based on linear regressions using trend surface modeling (TSM) parameters that parameterize gradients at the subject level. These models resulted in 9 parameters for gradient 1 and 12 parameters each for gradients 2 and 3. The text states that 'Effects of age on gradient topography was assessed using multivariate GLMs including age as the predictor and gradient TSM parameters as dependent variables (controlling for sex and mean frame-wise displacement; FD)'. Please clarify whether these GLMs were fitted separately for each TSM parameter (i.e., 9+12+12=33 models for both left and right = 66 total models) or on the overall model?

      We appreciate the Reviewer’s request for clarification on this matter. These GLMs were fitted on the overall TSM model, that is, through one GLM per gradient (3) and hemisphere (2), each one including all TSM parameters belonging to a gradient (in total, 6 GLMs).

      In the revised manuscript, we have added more details to the Results section, page 15: “Effects of age on gradient topography were assessed using multivariate GLMs including age as the predictor and gradient TSM parameters as dependent variables (controlling for sex and mean frame-wise displacement; FD). One model was fitted per gradient and hemisphere, each model including all TSM parameters belonging to a gradient (in total, 6 GLMs).”
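      As an illustration of how a per-participant set of TSM parameters might be obtained before entering the GLMs described above, the following is a minimal sketch rather than the authors' pipeline. It assumes a polynomial basis in x, y, and z without cross-terms, an assumption chosen only because it is consistent with the reported parameter counts (3 axes × order 3 = 9; 3 axes × order 4 = 12); the basis actually used by ConGrads may differ. The function names `tsm_design_matrix` and `fit_tsm` are hypothetical, and the coordinates and gradient values are synthetic.

```python
import numpy as np


def tsm_design_matrix(coords, order):
    """Polynomial basis without cross-terms: x^p, y^p, z^p for p = 1..order.

    Assumption for illustration only; not confirmed as the basis used in the study.
    """
    coords = np.asarray(coords, dtype=float)
    return np.hstack([coords ** p for p in range(1, order + 1)])


def fit_tsm(coords, gradient_values, order):
    """Least-squares fit of the trend surface.

    Returns the 3 * order trend coefficients; the intercept is estimated but dropped.
    """
    basis = tsm_design_matrix(coords, order)
    design = np.column_stack([np.ones(len(basis)), basis])
    beta, *_ = np.linalg.lstsq(
        design, np.asarray(gradient_values, dtype=float), rcond=None
    )
    return beta[1:]


# Synthetic demonstration: voxel coordinates and a noise-free cubic trend.
rng = np.random.default_rng(0)
coords = rng.uniform(-1.0, 1.0, size=(200, 3))
true_beta = rng.normal(size=9)
values = 0.5 + tsm_design_matrix(coords, 3) @ true_beta

estimated = fit_tsm(coords, values, order=3)
print(estimated.shape)  # one parameter vector per participant and gradient
```

      Under these assumptions, each participant contributes one such parameter vector per gradient and hemisphere, and the full vector would serve as the set of dependent variables in the corresponding multivariate GLM.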

      (2) Similarly, for memory it appears that multiple models were performed (left and right, young, middle-aged, old, whole groups). Please clarify whether and how multiple comparison correction was performed in this case.

      In the revised manuscript, we have now specified the number of analyses conducted in relation to memory performance. We have also clarified that p-values were reported at an uncorrected statistical threshold, and we have in the revised manuscript included the corresponding p-values adjusted for multiple comparisons using the Benjamini-Hochberg method to control the FDR.

      Updated section in the Results, page 17:

      “Models were assessed separately for left and right hemispheres, across the full sample and within age groups, yielding eight hierarchical models in total. Results are reported with p-values at an uncorrected statistical threshold and p-values after FDR adjustment.”
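      For readers unfamiliar with the adjustment named above, the Benjamini-Hochberg step-up procedure can be sketched as follows. This is a generic, standard-library-only illustration, not the code used in the study; the function name `benjamini_hochberg` and the four p-values are hypothetical and are not results from the manuscript.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical uncorrected p-values, e.g. one per fitted model.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

      Reporting both the raw and the adjusted values, as done in the revised manuscript, lets readers judge each effect at either threshold.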

      (3) Although I applaud the authors for their replication efforts, the results do not appear to replicate well. For example, memory was linked to gradient 2 in the whole group but to gradient 1 in the young group. Furthermore, dopamine was linked to gradient 2 in the right but not the left hemisphere. Although the overall group-level gradients were very stable between the two datasets, it is not clear whether the age findings replicated and the memory subgroup findings only replicated at trend level for memory and only partially replicated at the TSM parameter level.

      We thank the Reviewer for highlighting the inclusion of a replication dataset as a strength of our study, and we appreciate the recommendation to clarify to which extent results replicated. We provide a response to the Reviewer’s points below, and specify the revisions made to the manuscript in relation to this topic.

      The main aim of our study was to characterize the topographic organization of functional hippocampal-neocortical connectivity within the hippocampus across the adult lifespan, as previous studies have limited their focus to younger adults. Given the lack of previous studies for comparison, together with our identification of a novel secondary long-axis connectivity gradient (G2) taking precedence over the previously established medial-lateral G3, we included the Betula sample (Nilsson et al., 2004) for the purpose of replication. There was a high level of consistency between our main dataset and our replication dataset, with gradients 1-3 in left and right hemispheres identified in both samples.

      Further use of the replication dataset, beyond the identification of the connectivity gradients, was originally not planned. As such, not all subsequent analyses in the main dataset were conducted in the replication dataset. However, we found it critical to evaluate the observation that older individuals who maintained a youth-like gradient topography also exhibited higher levels of memory performance in an independent sample. This was possible given that the replication dataset included a comparable number of participants in similar ages and a word recall episodic memory task corresponding well to the one used in DyNAMiC. Overall, we conclude that these analyses replicated well across samples. Firstly, topography of left-hemisphere G1 informed the classification of older adults into youth-like and aged subgroups in both samples. Furthermore, in both samples, we observed that the older subgroups identified based on G1 topography also exhibited the youth-like vs. aged pattern in G2 topography. This pattern was, however, also evident in G3 only in the main sample, possibly suggesting a limited contribution of G3 topography in determining overall functional profiles in older age. In terms of the behavioral relevance of maintaining youth-like gradient topography in older age, we observed effects on word recall performance in both samples, although, as the Reviewer correctly points out, the difference between subgroups was significant only at trend level (p = 0.058) in the replication dataset. While this indeed underscores the importance of replication efforts in additional samples, we argue that the pattern observed in our replication dataset is overall consistent with, and conveys effects in the expected direction based on, the original observations in our main dataset.

      In revising the manuscript, we have performed additional analyses for replication purposes in terms of memory. Originally, we observed a significant association between G2 topography and episodic memory across the main sample. However, this effect did not remain significant after FDR adjustment for multiple comparisons. To evaluate this association further, we conducted a corresponding hierarchical multiple regression analysis in the replication dataset, which supported a role of G2 in memory (Adj. R<sup>2</sup> = 0.368, ΔR<sup>2</sup> = 0.081, F = 1.992, p = 0.028). Together, these analyses suggest that inter-individual differences in episodic memory performance may in part be explained by the spatial characteristics of G2 across the adult lifespan, although increased statistical power in relation to the large number of TSM parameters included in the hierarchical regression models may be needed to explore this association in smaller, age-stratified, groups. Relatedly, it is worth mentioning that higher levels of memory performance in older age were linked to the maintenance of youth-like G2 topography in both our main and replication datasets.
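      The hierarchical regression statistics reported above (ΔR<sup>2</sup> and its F-test) follow a standard nested-model comparison, which can be sketched as below. This is a generic illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, and the sample size and predictor counts in the example are invented, since they cannot be recovered from the values given in this response.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors (excluding intercept) and n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def delta_r2_f(r2_reduced, r2_full, n, k_reduced, k_full):
    """Return (delta R^2, F) for adding k_full - k_reduced predictors to a nested model.

    F = (delta R^2 / q) / ((1 - R^2_full) / (n - k_full - 1)), with q added predictors.
    """
    q = k_full - k_reduced
    delta = r2_full - r2_reduced
    f_stat = (delta / q) / ((1.0 - r2_full) / (n - k_full - 1))
    return delta, f_stat


# Hypothetical numbers: 3 covariates in step 1, 5 topography parameters added in step 2.
delta, f_stat = delta_r2_f(r2_reduced=0.30, r2_full=0.40, n=100, k_reduced=3, k_full=8)
print(round(delta, 3), round(f_stat, 3), round(adjusted_r2(0.40, 100, 8), 3))
```

      The formula makes explicit why a large number of TSM parameters is costly in small, age-stratified groups: each added predictor increases q and removes a residual degree of freedom, so a given ΔR<sup>2</sup> yields a smaller F.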

      In parallel, topographic parameters of G1 predicted memory performance in the younger adults, which successfully replicates TSM-based results previously reported in Przeździk et al., 2019. Although similar associations were not evident within the other age groups, a link between G1 topography and memory was demonstrated in older age based on a) the identification of individuals maintaining a youth-like G1 profile and higher levels of memory, within which b) memory performance was, as in young adults, significantly predicted by G1 topography.

      The spatial correspondence between G2 topography and distribution of hippocampal D1DRs was lateralized to the right, and as the Reviewer points out, as such did not replicate across hemispheres. To which extent replication across hemispheres should be expected in this case is, however, difficult to determine. Lateralization and/or hemispheric asymmetry is commonly observed in numerous hippocampal features, from the molecular level to its functional involvement in behavior (Nemati et al., 2023; Persson & Söderlund, 2015), including various dopaminergic markers tested in the animal literature (Afonso et al., 1993; Sadeghi et al., 2017). Yet, potential differences between hemispheres in D1DR availability and the spatial distribution of receptors along hippocampal axes remain less studied in humans. More data is therefore needed to determine the nature of this right-hemisphere lateralization.

      In sum, we argue that our results show a good level of replication across independent datasets and across analyses in our main dataset. Whereas this study did not attempt replication of all analyses conducted in the main dataset, it has through replication across independent samples provided support for its main findings – the organization of hippocampal-neocortical connectivity along three main hippocampal gradients across the adult lifespan, and the gradient topography-based identification of older individuals maintaining a youth-like hippocampal organization in older age.

      The revised manuscript includes edits made to incorporate the new analyses and clarifications of observations in relation to memory.

      In the Results, page 17:

      “Observing that the association between G2 and memory did not remain significant after FDR adjustment, we performed the same analysis in our replication dataset, which also included episodic memory testing. Consistent with the observation in our main dataset, G2 significantly predicted memory performance (Adj. R<sup>2</sup> = 0.368, ΔR<sup>2</sup> = 0.081, F = 1.992, p = 0.028) over and above covariates and topography of G1. Here, the analysis also showed that G1 topography predicted performance across the sample (Adj. R<sup>2</sup> = 0.325, ΔR<sup>2</sup> = 0.112, F = 3.431, p < 0.001).”

      In the Discussion, page 26:

      “Results linked both G1 and G2 to episodic memory, suggesting complementary contributions of these two overlapping long-axis modes. Considered together, analyses in the main and replication datasets indicated a role of G2 topography in memory across the adult lifespan, independent of age. A similar association with G1 was only evident across the entire sample in the replication dataset, whereas results in the main sample seemed to emphasize a role of youth-like G1 topography in memory performance. In line with previous research, memory was successfully predicted by G1 topography in young adults(30), and similarly predicted by G1 in older adults exhibiting a youth-like functional profile.”

      (4) Please share the data and code and add a description of data and code availability in the manuscript.

      We have now made our code available, and added a statement on data and code availability in the revised manuscript.

      On page 37: “Data from the DyNAMiC study are not publicly available. Access to the original data may be shared upon request from the Principal investigator, Dr. Alireza Salami. The Matlab, R, and FSL codes used for analyses included in this study are openly available at https://github.com/kristinnordin/hcgradients. Computation of gradients was done using the freely available toolbox ConGrads: https://github.com/koenhaak/congrads.”

      Reviewer #3 (Recommendations For The Authors):

      Please see the comments in the public review.

      We thank the Reviewer for their comments and recommendations, and have addressed them in the “Public review” section.

      References

      Afonso, D., Santana, C., & Rodriguez, M. (1993). Neonatal lateralization of behavior and brain dopaminergic asymmetry. Brain Research Bulletin, 32(1), 11–16. https://doi.org/10.1016/0361-9230(93)90312-Y

      DeKraker, J., Haast, R. A., Yousif, M. D., Karat, B., Lau, J. C., Köhler, S., & Khan, A. R. (2022). Automated hippocampal unfolding for morphometry and subfield segmentation with HippUnfold. eLife, 11, e77945. https://doi.org/10.7554/eLife.77945

      Dubovyk, V., & Manahan-Vaughan, D. (2019). Gradient of expression of dopamine D2 receptors along the dorso-ventral axis of the hippocampus. Frontiers in Synaptic Neuroscience, 11. https://doi.org/10.3389/fnsyn.2019.00028

      Edelmann, E., & Lessmann, V. (2018). Dopaminergic innervation and modulation of hippocampal networks. Cell and Tissue Research, 373(3), 711–727. https://doi.org/10.1007/s00441-018-2800-7

      Gasbarri, A., Verney, C., Innocenzi, R., Campana, E., & Pacitti, C. (1994). Mesolimbic dopaminergic neurons innervating the hippocampal formation in the rat: A combined retrograde tracing and immunohistochemical study. Brain Research, 668(1), 71–79. https://doi.org/10.1016/0006-8993(94)90512-6

      Glasser, M. F., & Essen, D. C. V. (2011). Mapping Human Cortical Areas In Vivo Based on Myelin Content as Revealed by T1- and T2-Weighted MRI. Journal of Neuroscience, 31(32), 11597–11616. https://doi.org/10.1523/JNEUROSCI.2180-11.2011

      Kaller, S., Rullmann, M., Patt, M., Becker, G.-A., Luthardt, J., Girbardt, J., Meyer, P. M., Werner, P., Barthel, H., Bresch, A., Fritz, T. H., Hesse, S., & Sabri, O. (2017). Test–retest measurements of dopamine D1-type receptors using simultaneous PET/MRI imaging. European Journal of Nuclear Medicine and Molecular Imaging, 44(6), 1025–1032. https://doi.org/10.1007/s00259-017-3645-0

      Katsumi, Y., Zhang, J., Chen, D., Kamona, N., Bunce, J. G., Hutchinson, J. B., Yarossi, M., Tunik, E., Dickerson, B. C., Quigley, K. S., & Barrett, L. F. (2023). Correspondence of functional connectivity gradients across human isocortex, cerebellum, and hippocampus. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04796-0

      Kempadoo, K. A., Mosharov, E. V., Choi, S. J., Sulzer, D., & Kandel, E. R. (2016). Dopamine release from the locus coeruleus to the dorsal hippocampus promotes spatial learning and memory. Proceedings of the National Academy of Sciences, 113(51), 14835–14840. https://doi.org/10.1073/pnas.1616515114

      Navarro Schröder, T., Haak, K. V., Zaragoza Jimenez, N. I., Beckmann, C. F., & Doeller, C. F. (2015). Functional topography of the human entorhinal cortex. eLife, 4, e06738. https://doi.org/10.7554/eLife.06738

      Nemati, S. S., Sadeghi, L., Dehghan, G., & Sheibani, N. (2023). Lateralization of the hippocampus: A review of molecular, functional, and physiological properties in health and disease. Behavioural Brain Research, 454, 114657. https://doi.org/10.1016/j.bbr.2023.114657

      Nilsson, L.-G., Adolfsson, R., Bäckman, L., Frias, C. M. de, Molander, B., & Nyberg, L. (2004). Betula: A Prospective Cohort Study on Memory, Health and Aging. Aging, Neuropsychology, and Cognition, 11(2–3), 134–148. https://doi.org/10.1080/13825580490511026

      Nyberg, L. (2017). Functional brain imaging of episodic memory decline in ageing. Journal of Internal Medicine, 281(1), 65–74. https://doi.org/10.1111/joim.12533

      Nyberg, L., Boraxbekk, C.-J., Sörman, D. E., Hansson, P., Herlitz, A., Kauppi, K., Ljungberg, J. K., Lövheim, H., Lundquist, A., Adolfsson, A. N., Oudin, A., Pudas, S., Rönnlund, M., Stiernstedt, M., Sundström, A., & Adolfsson, R. (2020). Biological and environmental predictors of heterogeneity in neurocognitive ageing: Evidence from Betula and other longitudinal studies. Ageing Research Reviews, 64, 101184. https://doi.org/10.1016/j.arr.2020.101184

      Paquola, C., Benkarim, O., DeKraker, J., Larivière, S., Frässle, S., Royer, J., Tavakol, S., Valk, S., Bernasconi, A., Bernasconi, N., Khan, A., Evans, A. C., Razi, A., Smallwood, J., & Bernhardt, B. C. (2020). Convergence of cortical types and functional motifs in the human mesiotemporal lobe. eLife, 9, e60673. https://doi.org/10.7554/eLife.60673

      Pedersen, R., Johansson, J., Nordin, K., Rieckmann, A., Wåhlin, A., Nyberg, L., Bäckman, L., & Salami, A. (2024). Dopamine D1-Receptor Organization Contributes to Functional Brain Architecture. Journal of Neuroscience, 44(11). https://doi.org/10.1523/JNEUROSCI.0621-23.2024

      Pedersen, R., Johansson, J., & Salami, A. (2023). Dopamine D1-signaling modulates maintenance of functional network segregation in aging. Aging Brain, 3, 100079. https://doi.org/10.1016/j.nbas.2023.100079

      Persson, J., & Söderlund, H. (2015). Hippocampal hemispheric and long-axis differentiation of stimulus content during episodic memory encoding and retrieval: An activation likelihood estimation meta-analysis. Hippocampus, 25(12), 1614–1631. https://doi.org/10.1002/hipo.22482

      Przeździk, I., Faber, M., Fernández, G., Beckmann, C. F., & Haak, K. V. (2019). The functional organisation of the hippocampus along its long axis is gradual and predicts recollection. Cortex, 119, 324–335. https://doi.org/10.1016/j.cortex.2019.04.015

      Sadeghi, L., Rizvanov, A. A., Salafutdinov, I. I., Dabirmanesh, B., Sayyah, M., Fathollahi, Y., & Khajeh, K. (2017). Hippocampal asymmetry: Differences in the left and right hippocampus proteome in the rat model of temporal lobe epilepsy. Journal of Proteomics, 154, 22–29. https://doi.org/10.1016/j.jprot.2016.11.023

      Tian, Y., Margulies, D. S., Breakspear, M., & Zalesky, A. (2020). Topographic organization of the human subcortex unveiled with functional connectivity gradients. Nature Neuroscience, 1–12. https://doi.org/10.1038/s41593-020-00711-6

      vos de Wael, R., Larivière, S., Caldairou, B., Hong, S.-J., Margulies, D. S., Jefferies, E., Bernasconi, A., Smallwood, J., Bernasconi, N., & Bernhardt, B. C. (2018). Anatomical and microstructural determinants of hippocampal subfield functional connectome embedding. Proceedings of the National Academy of Sciences, 115(40), 10154–10159. https://doi.org/10.1073/pnas.1803667115

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Regarding the manuscript's clarity, the sentence on page 5, "We also stained VTA sections for Tyrosine hydroxylase (TH) to estimate the rate of ChR2 colocalization with DA neurons," reads awkwardly. Removing the word "rate" could improve clarity.

      We have made the recommended clarifying edit (page 5, lines 30-31).

      Additionally, the anatomical data and findings are largely non-quantitative in nature. However, solid microscopy images are presented to support each claim. Additional quantification would strengthen the paper, specifically the quantification of projection density for each population and the proportion of each subpopulation that projects to their regions of interest.

      To rigorously quantify the projection density of each subpopulation would require a level of exhaustivity our study was not designed for. This is because during microscopy we focused efforts on imaging regions containing dense signals but did not exhaustively image regions receiving apparently weak or no input. While we considered including a semi-quantitative table of projection density, based on the data available we could not discriminate with confidence between, e.g., regions recipient of minimal input versus no input from VTA populations. Thus, while we stand by our descriptive statements we do not expand on those further.

      The authors should consider discussing the possibility that subpopulations of these cells could still be true interneurons especially if cells were looked at the single neuron level of resolution.

      We agree that some of the VTA populations we studied could include subpopulations that are bona fide interneurons. The identification of alternate markers or combinations of markers, or use of single-cell imaging approaches may indeed support this possibility in future. This is discussed in the context of currently available evidence on page 5 lines 32-34, page 11 lines 2-4, page 12 lines 2-11, and page 12 lines 15-16.

      Overall, the paper is well-written and important for the field and beyond.

      Thank you!

      Reviewer #2:

      Weaknesses:

      While the authors use several Cre driver lines to identify GABAergic projection neurons, they then use wild-type mice to show that projection neurons synapse onto neighboring cells within the VTA. This does not seem to lend evidence to the idea that previously described "interneurons" are projection neurons that collateralize within the VTA.

      We think the use of WT mice is a strength because it allows us to measure both GABA and non-GABA synapses made by VTA projections onto the same cells within VTA. However, we have also done this experiment targeting NAc-projecting VTA VGAT-Cre neurons, and VP-projecting VTA MOR-Cre neurons. Consistent with the WT dataset, we find that these defined projection neurons also make intra-VTA synapses. These data are now included as Figure 7.

      More broadly, our review of the literature finds very little evidence to support the notion of a VTA interneuron as we define it: a VTA neuron that makes only local connections. But absence of evidence need not imply evidence of absence; thus we do not claim that all VTA neurons previously presumed to be interneurons must be projection neurons. We do express confidence in our finding that VTA projection neurons (including GABA-releasing neurons) make local synapses in VTA. We argue that in the absence of compelling positive evidence for the existence of VTA interneurons, such as a selective marker, we, as a field, should not presume their existence.

      Other suggestions:

      (1) While the authors present evidence that some projection neurons also synapse locally, there is no quantification as to the proportion of each neuronal subtype that collateralizes within the VTA. This would be a useful analysis.

      We agree this would be useful information. But our experiments were not designed to answer this question. Indeed, we have not conceived of a feasible method to discriminate between collateralizing and non-collateralizing VTA projection neurons at the single-cell level, thus we do not know how we would calculate such proportions.

      (2) There is significant interest in the molecular heterogeneity and spatial topography of the VTA. Additional analyses of the spatial topography of labeled projectors would be useful. For example, knowing if Pvalb+ projection neurons are distributed throughout the VTA or located along the midline would be a useful analysis.

      Prior studies and public databases (e.g., Allen brain atlas, GENSAT) allow one to visualize the location of VTA neurons positive for Pvalb and the other markers we investigated (Olson & Nestler, 2007). However, these label the entire population of neurons and thereby include those that project to any of the various projection targets. There are also studies that have used retrograde labeling approaches to map the distribution of labeled VTA cells projecting to one or another target (Beier et al., 2015; Lammel et al., 2008; Margolis et al., 2006). For example, LHb-projecting neurons (LHb being a major target of Pvalb+ VTA neurons) were found to be enriched in medial VTA (Root et al., 2014). From this evidence we might infer that Pvalb+ VTA neurons that project to LHb are likely to be medially biased. Future studies may more carefully map the intersection of specific projection targets for each VTA subpopulation.

      Reviewer #3 (Recommendations For The Authors):

      Weaknesses:

      This study has a few modest shortcomings, of which the first is likely addressable with the authors' existing data, while the latter items will likely need to be deferred to future studies:

      (1) Some key anatomical details are difficult to discern from the images shown. In Figure 1, the low-magnification images of the VTA in the first column, while essential for seeing what overall section is being shown, are not of sufficient resolution to distinguish soma from processes. A supplemental figure with higher-resolution images could be helpful.

      We uploaded a higher resolution file for figure 1.

      Also, where are the insets shown in the second column obtained from? There is not a corresponding marked region on the low-magnification images. Is this an oversight, or are these insets obtained from other sections that are not shown?

      This was an oversight; we have added the corresponding marked region to the low-magnification images.

      Lastly, there is a supplemental figure showing the NAc injection sites corresponding to Figure 5, but not one showing VP or PFC injection sites in Figure 6. Why not?

      We added a figure with histology examples for the VP and the PFC injection sites as done for Figure 5, included as Supplemental Figure 3.

      (2) Because multiple ChR2 neurons are activated in the optogenetic experiments, it is not clear how common it is for any specific projection neuron to make local connections. Are the observed synaptic effects driven by just a few neurons making extensive local collateralizations (while other projection neurons do not), or do most VTA projection neurons have local collaterals? I realize this is a complex question that may not have an easy answer.

      This is a great question but, indeed, we don’t know the answer. As mentioned in response to Reviewer #2, we are not convinced there is a currently feasible way to discriminate between collateralizing and non-collateralizing cells at the single cell level.

      (3) There is something of a conceptual disconnect between the early and later portions of this paper. Whereas Figures 1-4 examine forebrain projections of genetic subtypes of VTA neurons, the optogenetic studies do not address genetic subtypes at all. I do realize that is outside of the scope of the authors' intent, but it does give the impression of somewhat different (but related) studies being stitched together. For example, the MOR-expressing neurons seem to project strongly to the VP, but it is not addressed whether these are also the ones making local projections. Also, after showing that PV neurons project to the LHb, the opto experiments do not examine the LHb projection target at all.

      This too was raised by Reviewer #2. While addressing this question for all the populations we investigated feels redundant, we now include optogenetic data showing that NAc-projecting VTA VGAT-Cre and VP-projecting VTA MOR-Cre neurons also make local collaterals (Figure 7). We think this allows us to connect the two approaches to a greater degree. Based on our findings using a dual virus approach to express Syn:Ruby in each population of VTA projection neuron, we think it very likely that we’d continue to find similar results using optogenetics-assisted slice electrophysiology for each population.

      Other suggestions:

      (1) I appreciated the extensive and high-quality anatomical figures shown in Figures 2-4. However, the layout was sometimes left-to-right, and sometimes right-to-left, which felt distracting. At some point, the text refers to "Fig. 3KJ", i.e. with the letters being in backward alphabetical order, and Figures 3I and 3L do not appear mentioned anywhere in the main text, leading me to wonder if that text was intended to read "Fig. 3I-L".

      Thank you for noting this. We have harmonized the layout of Figures 2-4 and adjusted the in-text Figure call-outs.

      Also, the inset in Figure 3J appears to show local collaterals of NTS neurons in the VTA, since there is no soma in that inset. This is interesting, and worth reporting, but is not explained in either the main text or Figure legend.

      We added a more complete description in the Results section (page 6, lines 25-30).

      (2) Perhaps I missed it, but I could not find any mention of the intensity of the LED light delivered during the optogenetic experiments. While acknowledging that this can be variable, do the authors have at least a rough range?

      We have added this information to the Methods (page 17, line 8).

      Editor's Note:

      Should you choose to revise your manuscript, please double check that you have fully reported all statistics including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals.

      We confirm that we have fully reported all statistics including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals.

      Note to Editor and Readers

      While reanalyzing our data for resubmission, we discovered that some of the short-latency optogenetically evoked postsynaptic currents (oPSCs) we detected were erroneously categorized. Specifically, some VTA cells that showed large outward currents (oIPSCs) when held at 0 mV also had small inward currents when held at -60 mV. These small inward currents were initially categorized as oEPSCs, suggesting these VTA cells received input from populations of VTA projection neurons that released GABA and/or glutamate. However, the kinetics of these small inward currents were slow and aligned with the within-cell kinetics of the oIPSCs, indicating that they were very likely mediated by GABA-A receptors. In one case the opposite was apparent, with a small PSC initially miscategorized as an oIPSC. These miscategorized oEPSCs and the oIPSC were presumably detected because our holding potentials were not precisely identical to the reversal potentials for GABA-A and AMPA receptors, respectively. For this reason, we removed these 14 oEPSCs and 1 oIPSC from our analyses in the revised version. The revised dataset suggests that VTA glutamate projection neurons may be less likely to collateralize widely within VTA compared to GABA projection neurons. Importantly, this correction does not affect any of our conclusions.

      Citations:

      Beier, K. T., Steinberg, E. E., DeLoach, K. E., Xie, S., Miyamichi, K., Schwarz, L., Gao, X. J., Kremer, E. J., Malenka, R. C., & Luo, L. (2015). Circuit Architecture of VTA Dopamine Neurons Revealed by Systematic Input-Output Mapping. Cell, 162(3), 622-634. https://doi.org/10.1016/j.cell.2015.07.015

      Lammel, S., Hetzel, A., Hackel, O., Jones, I., Liss, B., & Roeper, J. (2008). Unique properties of mesoprefrontal neurons within a dual mesocorticolimbic dopamine system. Neuron, 57(5), 760-773. https://doi.org/10.1016/j.neuron.2008.01.022

      Margolis, E. B., Lock, H., Chefer, V. I., Shippenberg, T. S., Hjelmstad, G. O., & Fields, H. L. (2006). Kappa opioids selectively control dopaminergic neurons projecting to the prefrontal cortex. Proc Natl Acad Sci U S A, 103(8), 2938-2942. https://doi.org/10.1073/pnas.0511159103

      Olson, V. G., & Nestler, E. J. (2007). Topographical organization of GABAergic neurons within the ventral tegmental area of the rat. Synapse, 61(2), 87-95. https://doi.org/10.1002/syn.20345

      Root, D. H., Mejias-Aponte, C. A., Zhang, S., Wang, H. L., Hoffman, A. F., Lupica, C. R., & Morales, M. (2014). Single rodent mesohabenular axons release glutamate and GABA. Nat Neurosci, 17(11), 1543-1551. https://doi.org/10.1038/nn.3823

    1. This is a briefing document on the TSI preparatory class (classe préparatoire TSI, Technologie et Sciences Industrielles), based on the transcript of the video "Les rendez-vous de la techno : Promotion de la classe préparatoire TSI".

      General overview

      The TSI preparatory class is a science prépa reserved for holders of the STI2D and STL (SPCL specialization) baccalauréats. Its purpose is to prepare students for the competitive entrance exams of engineering schools.

      Objectives of the TSI prépa

      • Help students with their choice of orientation
      • Demystify the image of the prépa
      • Prepare students for the competitive entrance exams of engineering schools
      • Build knowledge and skills
      • Help students learn about themselves through work, organization, and stress management
      • Instill working methods, autonomy, and initiative

      Immersion in the TSI prépa

      Final-year (terminale) STI2D and STL students can take part in immersion days in TSI classes to discover how the prépa works and to talk with current students. These immersion days allow them to:

      • Take part in workshops supervised by TSI teachers.
      • Discover the subjects taught in prépa.
      • Understand the classroom atmosphere and the working methods.
      • Picture themselves in the program and confirm their choice of orientation.

      Workshops offered during immersion days

      • Artificial intelligence in healthcare (disease prediction)
      • Modeling of systems found in the TSI classrooms (gate openers, blenders, etc.)
      • Deriving the equations for a falling object (a Playmobil figure) to illustrate activities in engineering sciences and physics

      Student testimonials

      • The workload is heavier than in terminale (1.5 to 2 hours of homework per evening).
      • It is important to find a balance between work, results, and rest.
      • The prépa lets students learn about themselves through work and organization.

      Access to engineering schools after a TSI prépa

      Several routes into engineering schools are open after a TSI prépa:

      • Competitive exams:
        • CCINP (Concours Commun INP): gives access to about sixty schools. In 2024, 419 places are reserved for TSI students.
        • Centrale Supelec: gives access to about thirty schools. In 2024, 318 places are offered.
        • E3A-EPI (formerly EPITA, IPSA, ESME)
        • SAEIV
      • Application file (for schools offering full-time initial training or work-study programs)

      Work-study engineering schools

      More and more engineering schools offer work-study programs (alternance), which let students attend classes while working and earning a salary. Admission to work-study programs is based on an application file.

      Colles

      Colles are an added benefit found only in prépa, not in engineering schools. They are one-hour oral examinations, with three students and one teacher, that revisit material covered previously. They consolidate skills and knowledge and help students get unstuck.

      Changing course

      At the end of the first year, students can switch to other programs (BTS, BUT, university, schools that admit by application file). The TSI prépa will have given them working methods, autonomy, and initiative that will serve them in the rest of their studies.

      Advice

      • Find out about the options and success rates of the different routes into engineering schools.
      • Be strategic in your orientation choices and have a plan B.
      • Be serious, willing, motivated, and persistent.
      • Do not hesitate to contact the schools, your homeroom teachers (professeurs principaux), and the Éducation Nationale psychologists.
      • Focus your company search and approach companies on LinkedIn.
      • Start looking for a work-study placement well in advance.
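    1. Relationship to YUV

      Yes, your understanding is accurate! YUV is essentially similar to CIELAB: both are based on the human eye's high sensitivity to luminance, and both handle luminance and chrominance information separately, in order to improve efficiency and match how the human eye perceives images.

      1. Similarities between YUV and CIELAB

      In the YUV format, the Y component represents luminance (how light or dark the image is), while the U and V components represent chrominance (the hue and saturation of the color). This closely resembles the design of CIELAB, which uses L* for lightness and a* and b* for chromaticity (color differences along the red-green and blue-yellow axes).

      Why is YUV similar to CIELAB?

      • The human eye is more sensitive to luminance: the eye is far more sensitive to changes in luminance (light and dark) than to changes in chrominance (hue and saturation). We rely on luminance information to recognize detail and contrast in an image, and are less sensitive to changes in color. Because of this, separating luminance from chrominance in video and image processing effectively reduces redundant information and improves compression efficiency.

      In YUV, the Y component represents luminance, which dominates human visual perception; the U and V components represent chrominance, and changes in them affect perception much less. In video compression and transmission, the U and V components can therefore usually be downsampled (for example, 4:2:0 sampling), which reduces the amount of data without noticeably affecting the visual quality of the image.

      • Chroma compression: because the eye is less sensitive to chrominance (such as red-green and blue-yellow changes) than to luminance, the YUV format compresses the chroma components, and in many video coding formats the chroma components have a lower resolution than the luma component. This is similar to CIELAB, where the chromaticity part (a* and b*) is separate from lightness (L*), and changes in the chromaticity part have a comparatively small effect on perception.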
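
      The luma/chroma split and the 4:2:0 downsampling described above can be sketched in a few lines. This is a minimal illustration, not a production codec path: the function names are hypothetical, the luma weights and U/V scale factors are the commonly quoted BT.601 analog YUV values, and the 2x2 averaging is just one simple way to downsample a chroma plane (real encoders may use other filters and sample positions).

```python
def rgb_to_yuv(r, g, b):
    """Convert one gamma-encoded RGB pixel (floats in 0..1) to analog YUV.

    Assumes BT.601 luma weights; U and V are scaled color differences.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def subsample_420(plane):
    """Average each 2x2 block of a chroma plane.

    `plane` is a list of rows with even width and height; the result has
    half the width and half the height, as in 4:2:0 sampling.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1]
             + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]


def sample_counts(width, height):
    """Return (samples at full resolution, samples with 4:2:0 chroma)."""
    full = 3 * width * height
    subsampled = width * height + 2 * (width // 2) * (height // 2)
    return full, subsampled
```

      For a frame with even dimensions, `sample_counts` shows that 4:2:0 keeps every Y sample but only a quarter of the U and V samples, so the total sample count is halved before any further compression.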

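      2. Differences between YUV and CIELAB

      Although YUV and CIELAB have a similar structure, they differ mainly in their design goals and areas of application:

      • CIELAB is a device-independent color space. It aims to provide a unified standard for describing how the human eye perceives color, and it can be used for color conversion between different devices. It does not depend on any particular device or display technology.

      • YUV was designed for video encoding and transmission. It specifically takes the needs of data compression into account and reduces redundant information by downsampling chroma. It is used mainly for transmitting and storing video signals, especially in efficient compression (for example, H.264 and the MPEG standards).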
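
      One way to see the difference in design goals is to compute both quantities for the same pixel. The sketch below is illustrative only: the function names are hypothetical, the input is assumed to be sRGB with channel values in 0..1, L* is computed relative to the D65 white point (white has relative luminance 1), and the luma uses BT.601 weights. Luma is a weighted sum of the gamma-encoded channel values, whereas L* first linearizes the signal and then applies the CIELAB lightness function, which was designed for perceptual uniformity.

```python
def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel value in 0..1."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def cielab_lightness(r, g, b):
    """Return CIELAB L* (0..100) for an sRGB pixel, relative to D65 white."""
    # Relative luminance Y from linear-light sRGB primaries.
    y = (0.2126 * srgb_to_linear(r)
         + 0.7152 * srgb_to_linear(g)
         + 0.0722 * srgb_to_linear(b))
    delta = 6.0 / 29.0
    if y > delta ** 3:
        f = y ** (1.0 / 3.0)
    else:
        # Linear segment near black, as defined by the CIELAB formula.
        f = y / (3.0 * delta ** 2) + 4.0 / 29.0
    return 116.0 * f - 16.0


def bt601_luma(r, g, b):
    """Return BT.601 luma (0..1) computed directly on gamma-encoded values."""
    return 0.299 * r + 0.587 * g + 0.114 * b
```

      Calling both functions on a saturated color such as pure blue `(0.0, 0.0, 1.0)` gives two noticeably different positions on their respective scales, which reflects the different purposes of the two spaces rather than an error in either.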

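      3. Summary

      • YUV and CIELAB are both based on the human eye's sensitivity to luminance, so both improve efficiency by separating luminance and chrominance information. YUV uses Y as the luminance component and U and V as the chrominance components, which makes chroma compression possible, while CIELAB uses L* for lightness and a* and b* for chromaticity, so that color is described in a way consistent with human perception.

      • The goal of YUV is efficient video compression and transmission, whereas the goal of CIELAB is device-independent color representation and cross-device color management, so that colors stay consistent between different devices (such as monitors and printers).

      In short, YUV and CIELAB share a similar design philosophy but differ in their applications and functions: YUV focuses on the transmission efficiency of video data, while CIELAB focuses on providing a unified color standard for color management and color conversion between devices.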

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This paper contains what could be described as a "classic" approach towards evaluating a novel taste stimulus in an animal model, including standard behavioral tests (some with nerve transections), taste nerve physiology, and immunocytochemistry of the tongue. The stimulus being tested is ornithine, from a class of stimuli called "kokumi", which are stimuli that enhance other canonical tastes, essentially increasing the hedonic attributes of these other stimuli; the mechanism for ornithine detection is thought to be GPRC6A receptors expressed in taste cells. The authors showed evidence for this in an earlier paper with mice; this paper evaluates ornithine taste in a rat model.

      Strengths:

      The data show the effects of ornithine on taste: in two-bottle and briefer intake tests, adding ornithine results in a higher intake of most, but not all, stimuli tested. Bilateral nerve cuts or the addition of GPRC6A antagonists decrease this effect. Small effects of ornithine are shown in whole-nerve recordings.

      Weaknesses:

      The conclusion seems to be that the authors have found evidence for ornithine acting as a taste modifier through the GPRC6A receptor expressed on the anterior tongue. It is hard to separate their conclusions from the possibility that any effects are additive rather than modulatory. Animals did prefer ornithine to water when presented by itself. Additionally, the authors refer to evidence that ornithine is activating the T1R1-T1R3 amino acid taste receptor, possibly at higher concentrations than they use for most of the study, although this seems speculative. It is striking that the largest effects on taste are found with the other amino acid (umami) stimuli, leading to the possibility that these are largely synergistic effects taking place at the tas1r receptor heterodimer.

      We would like to thank Reviewer #1 for the valuable comments. Our basis for considering ornithine as a taste modifier stems from our observation that a low concentration of ornithine (1 mM), which does not elicit a preference on its own, enhances the preference for umami substances, sucrose, and soybean oil through the activation of the GPRC6A receptor. Notably, this receptor is not typically considered a taste receptor. The reviewer suggested that the enhancement of umami taste might be due to potentiation occurring at the TAS1R receptor heterodimer. However, we propose that a different mechanism may be at play, as an antagonist of GPRC6A almost completely abolished this enhancement. In the revised manuscript, we will endeavor to provide additional information on the role of ornithine as a taste modifier acting through the GPRC6A receptor.

      Reviewer #2 (Public review):

      Summary:

      The authors used rats to determine the receptor for a food-related perception (kokumi) that has been characterized in humans. They employ a combination of behavioral, electrophysiological, and immunohistochemical results to support their conclusion that ornithine-mediated kokumi effects are mediated by the GPRC6A receptor. They complemented the rat data with some human psychophysical data. I find the results intriguing, but believe that the authors overinterpret their data.

      Strengths:

      The authors examined a new and exciting taste enhancer (ornithine). They used a variety of experimental approaches in rats to document the impact of ornithine on taste preference and peripheral taste nerve recordings. Further, they provided evidence pointing to a potential receptor for ornithine.

      Weaknesses:

      The authors have not established that the rat is an appropriate model system for studying kokumi. Their measurements do not provide insight into any of the established effects of kokumi on human flavor perception. The small study on humans is difficult to compare to the rat study because the authors made completely different types of measurements. Thus, I think that the authors need to substantially scale back the scope of their interpretations. These weaknesses diminish the likely impact of the work on the field of flavor perception.

      We would like to thank Reviewer #2 for the valuable comments and suggestions. Regarding the question of whether the rat is an appropriate model system for studying kokumi, we have chosen this species for several reasons: it is readily available as a conventional experimental model for gustatory research; the calcium-sensing receptor (CaSR), known as the kokumi receptor, is expressed in taste bud cells; and prior research has demonstrated the use of rats in kokumi studies involving gamma Glu-Val-Gly (Yamamoto and Mizuta, Chem. Senses, 2022).

      We acknowledge that fundamentally different types of measurements were conducted in the human psychophysical study and the rat study. Kokumi can indeed be assessed and expressed in humans; however, we do not currently have the means to confirm that animals experience kokumi in the same way that humans do. Therefore, human studies are necessary to evaluate kokumi, a conceptual term denoting enhanced flavor, while animal studies are needed to explore the potential underlying mechanisms of kokumi. We believe that a combination of both human and animal studies is essential, as is the case with research on sugars. While sugars are known to elicit sweetness, it is unclear whether animals perceive sweetness identically to humans, even though they exhibit a strong preference for sugars. In the revised manuscript, we will incorporate additional information to address the comments raised by the reviewer. We will also carefully review and revise our previous statements to ensure accuracy and clarity.

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors set out to investigate whether GPRC6A mediates kokumi taste initiated by the amino acid L-ornithine. They used Wistar rats, a standard laboratory strain, as the primary model and also performed an informative taste test in humans, in which miso soup was supplemented with various concentrations of L-ornithine. The findings are valuable and overall the evidence is solid. L-Ornithine should be considered to be a useful test substance in future studies of kokumi taste and the class C G protein-coupled receptor known as GPRC6A (C6A) along with its homolog, the calcium-sensing receptor (CaSR) should be considered candidate mediators of kokumi taste.

      Strengths:

      The overall experimental design is solid, based on two-bottle preference tests in rats. After determining the optimal concentration for L-Ornithine (1 mM) in the presence of MSG, it was added to various tastants, including inosine 5'-monophosphate (IMP); monosodium glutamate (MSG); mono-potassium glutamate (MPG); intralipos (a soybean oil emulsion); sucrose; sodium chloride (NaCl); citric acid; and quinine hydrochloride. Robust effects of ornithine were observed in the cases of IMP, MSG, MPG, and sucrose, and little or no effects were observed in the cases of sodium chloride, citric acid, and quinine HCl. The researchers then focused on the preference for Ornithine-containing MSG solutions. The inclusion of the C6A inhibitors Calindol (0.3 mM but not 0.06 mM) or the gallate derivative EGCG (0.1 mM but not 0.03 mM) eliminated the preference for solutions that contained Ornithine in addition to MSG. The researchers next performed transections of the chorda tympani nerves (with sham operation controls) in anesthetized rats to identify the role of the chorda tympani branches of the facial nerves (cranial nerve VII) in the preference for Ornithine-containing MSG solutions. This finding implicates the anterior half to two-thirds of the tongue in ornithine-induced kokumi taste. They then used electrical recordings from intact chorda tympani nerves in anesthetized rats to demonstrate that ornithine enhanced MSG-induced responses following the application of tastants to the anterior surface of the tongue. They went on to show that this enhanced response was insensitive to amiloride, selected to inhibit 'salt tastant' responses mediated by the epithelial Na+ channel, but eliminated by Calindol. Finally, they performed immunohistochemistry on sections of rat tongue demonstrating C6A-positive spindle-shaped cells in fungiform papillae that partially overlapped in their distribution with the IP3 type-3 receptor, used as a marker of Type-II cells, but not with (i) gustducin, the G protein partner of Tas1 receptors (T1Rs), used as a marker of a subset of Type-II cells; or (ii) 5-HT (serotonin) and Synaptosome-associated protein 25 kDa (SNAP-25), used as markers of Type-III cells.

      Weaknesses:

      The researchers undertook what turned out to be largely confirmatory studies in rats with respect to their previously published work on Ornithine and C6A in mice (Mizuta et al Nutrients 2021).

      The authors point out that animal models pose some difficulties of interpretation in studies of taste and raise the possibility in the Discussion that umami substances may enhance the taste response to ornithine (Line 271, Page 9). They miss an opportunity to outline the experimental results from the study that favor their preferred interpretation that ornithine is a taste enhancer rather than a tastant.

      At least two other receptors in addition to C6A might mediate taste responses to ornithine: (i) the CaSR, which binds and responds to multiple L-amino acids (Conigrave et al, PNAS 2000), and which has been previously reported to mediate kokumi taste (Ohsu et al., JBC 2010) as well as responses to Ornithine (Shin et al., Cell Signaling 2020); and (ii) T1R1/T1R3 heterodimers which also respond to L-amino acids and exhibit enhanced responses to IMP (Nelson et al., Nature 2001). While the experimental results as a whole favor the authors' interpretation that C6A mediates the Ornithine responses, they do not make clear either the nature of the 'receptor identification problem' in the Introduction or the way in which they approached that problem in the Results and Discussion sections. It would be helpful to show that a specific inhibitor of the CaSR failed to block the ornithine response. In addition, while they showed that C6A-positive cells were clearly distinct from gustducin-positive, and thus T1R-positive cells, they missed an opportunity to clearly differentiate C6A-expressing taste cells and CaSR-expressing taste cells in the rat tongue sections.

      It would have been helpful to include a positive control kokumi substance in the two-bottle preference experiment (e.g., one of the known gamma-glutamyl peptides such as gamma-glu-Val-Gly or glutathione), to compare the relative potencies of the control kokumi compound and Ornithine, and to compare the sensitivities of the two responses to C6A and CaSR inhibitors.

      The results demonstrate that enhancement of the chorda tympani nerve response to MSG occurs at substantially greater Ornithine concentrations (10 and 30 mM) than were required to observe differences in the two bottle preference experiments (1.0 mM; Figure 2). The discrepancy requires careful discussion and if necessary further experiments using the two-bottle preference format.

      We would like to thank Reviewer #3 for the valuable comments and helpful suggestions. We propose that ornithine has two stimulatory actions: one acting on GPRC6A, particularly at lower concentrations, and another on amino acid receptors such as T1R1/T1R3 at higher concentrations. Consequently, ornithine is not preferable at lower concentrations but becomes preferable at higher concentrations. For our study on kokumi, we used a low concentration (1 mM) of ornithine. The possibility mentioned in the Discussion that 'the umami substances may enhance the taste response to ornithine' is entirely speculative. We will reconsider including this description in the revised version. As the reviewer suggested, in addition to GPRC6A, ornithine may bind to CaSR and/or T1R1/T1R3 heterodimers. However, we believe that ornithine mainly binds to GPRC6A, as a specific inhibitor of this receptor almost completely abolished the enhanced response to umami substances, and our immunohistochemical study indicated that GPRC6A-expressing taste cells are distinct from CaSR-expressing taste cells (see Supplemental Fig. 3). We conducted essentially the same experiments using gamma-Glu-Val-Gly in Wistar rats (Yamamoto and Mizuta, Chem. Senses, 2022) and compared the results in the Discussion. The reviewer may have misunderstood the chorda tympani results: we added the same concentration (1 mM) used in the two-bottle preference test to MSG (Fig. 5-B). Fig. 5-A shows nerve responses to five concentrations of plain ornithine. In the revised manuscript, we will strive to provide more precise information reflecting the reviewer’s comments.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The behavioral effects found with the GPRC6A antagonists are not entirely convincing, as the antagonist is seemingly just mixed up in the solution with the stimuli. There are no control experiments demonstrating that the antagonists do not have a taste themselves.

      We mixed the antagonists into both liquids used in the two-bottle preference test to eliminate any potential taste effects of the antagonists themselves. In the electrophysiological experiments, the antagonist was incorporated into the solution after confirming that it did not elicit any appreciable response in the taste nerve.

      (2) The effects of ornithine found with quinine did not have a satisfying explanation - if there is some taste cell-taste cell modulation that accounts for the taste enhancement, why is the quinine less aversive? Why is it not enhanced like the other compounds?

      The effects of ornithine on quinine responses remain difficult to explain. A previous study (Tokuyama et al., Chem Pharm Bull, 2006) proposed that ornithine prevents bitter substances from binding to bitter receptors, although this hypothesis lacks definitive evidence. In the present study, our findings suggest that the binding of quinine to bitter receptors is essential, as another agonist, gallate, also enhanced the preference for quinine, but this effect was abolished by EGCG, a GPRC6A antagonist (see Supplemental Fig. 2).

      (3) Unless I am missing something, there appears to be no quantitative analysis of the immunocytochemical data, just assertions.

      We have added quantitative analyses to the revised text, with the following sentences: “Approximately 11% of GPRC6A-positive cells overlapped with IP3R3 (9 double-positive cells/80 GPRC6A-positive cells), while approximately 8.3% of IP3R3-positive cells expressed GPRC6A (9 double-positive cells/109 IP3R3-positive cells). In addition, GPRC6A-positive cells were unlikely to colocalize with α-gustducin, another marker for a subset of type II cells, in single taste cells (0 double-positive cells/93 GPRC6A-positive cells). Regarding type III cell markers, GPRC6A-positive cells were unlikely to colocalize with 5-HT in single taste cells (0 double-positive cells/75 GPRC6A-positive cells).”
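
      The percentages quoted above follow directly from the cell counts. As a minimal sketch (the helper name and dictionary labels are illustrative and not taken from the manuscript), the arithmetic can be reproduced as follows:

```python
def overlap_percent(double_positive, total):
    """Return the share of `total` cells that are double-positive, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * double_positive / total


# (double-positive cells, cells positive for the reference marker),
# using the counts quoted in the response above.
counts = {
    "GPRC6A+ cells that are also IP3R3+": (9, 80),
    "IP3R3+ cells that are also GPRC6A+": (9, 109),
    "GPRC6A+ cells that are also gustducin+": (0, 93),
    "GPRC6A+ cells that are also 5-HT+": (0, 75),
}

for label, (double_positive, total) in counts.items():
    print(f"{label}: {overlap_percent(double_positive, total):.1f}%")
```

      The two overlap figures differ because the same 9 double-positive cells are divided by different denominators (80 GPRC6A-positive cells versus 109 IP3R3-positive cells).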

      (4) The hallmarks of Kokumi taste include descriptors such as "thickness", and "mouthfeel", which sound like potential somatosensory attributes. Perhaps the authors should consider this possibility for at least some of the effects found.

      The term kokumi, a Japanese word, refers to a phenomenon in which the flavor of complexly composed foods is enhanced through certain processes, making them more delicious. To date, kokumi has been described using the representative terms thickness, mouthfulness, and continuity, originally introduced in the first paper on kokumi by Ueda et al. (1990). However, these terms are derived from Japanese and may not fully convey the nuances of the original language when translated into these simple English words. In particular, thickness is often interpreted as referring to physical properties such as viscosity or somatosensory sensations. Since kokumi inherently lacks somatosensory elements, the revised paper adopts alternative terms and explanations for the three components of kokumi to prevent misunderstanding and confusion.

      Therefore, to clarify that kokumi attributes are inherently gustatory, thickness is replaced with intensity of whole complex tastes (rich flavor with complex tastes), emphasizing the synergistic effects of a variety of tastes rather than the mere enhancement of a single flavor. Mouthfulness is clarified as not referring to mouthfeel (the tactile sensation a food gives in the mouth) but rather as spread of taste and flavor throughout the oral cavity, describing how the flavor fills the mouth. Continuity is replaced with persistence of taste (lingering flavor).

      (5) I don't think the human experiment (S1) belongs to the paper, even as a supplementary bit of data. It's only 17 subjects, they are all female, and we don't know anything about how they were selected, even though it states they are all students/staff at Kio. Were any of them lab members? Were they aware of the goals of the experiment? Could simply increasing the amount of solute in the soup make it seem thicker? This (sparse) data seems to have been shoehorned into the paper without enough detail/justification.

      Despite the reviewer’s suggestion, we would like to include the human experiment because the rationale of the present study is to confirm, through a human sensory test, that the kokumi of a complex solution (in this case, miso soup) is enhanced by the addition of ornithine. This is followed by basic animal experiments to investigate the underlying mechanisms. Therefore, this human study serves an important role.

      The total number of participants increased to 22 (19 women and three men) following an additional experiment with 5 new participants. New results have been shown in Supplemental Figure 1 with statistical analyses. The rewritten parts are as follows:

      We recruited 22 participants (19 women and three men, aged 21-28 years) from Kio University who were not affiliated with our laboratory, including students and staff members. All participants passed a screening test based on taste sensitivity. According to the responses obtained from a pre-experimental questionnaire, we confirmed that none of the participants had any sensory abnormalities, eating disorders, or mental disorders, or were taking any medications that may potentially affect their sense of taste. All participants were instructed not to eat or drink anything for 1 hour prior to the start of the experiment. We provided them with a detailed explanation of the experimental procedures, including safety measures and personal data protection, without revealing the specific goals of the study.

      (6) The introduction could be more concise - for example, when describing Kokumi stimuli such as ornithine and its possible receptors, the authors do not need to add the detail about how this stimulus was deduced from adding clams to the soup. Details like this can be reserved for the discussion.

      Thank you for this comment. We have tried to shorten the Introduction.

      (7) Line 86: awkward phrasing - this doesn't need to be a rhetorical question.

      We have deleted the sentence.

      (8) Supplementary Figure 1: The labels on the figure say "Miso soup in 1 mM Orn" when the Orn is dissolved into the soup.

      Thank you for pointing out our mistake. We have changed the label to “1 mM Orn in miso soup”.

      Reviewer #2 (Recommendations for the authors):

      Major concerns

      (1) The impact of "kokumi" taste ligands on food perception appears to be profound in humans. This observation is fascinating because it implies that molecules like ornithine impact a variety of flavor perceptions, some of which are non-gustatory in nature (e.g., spread, mouthfulness and harmony). What remains unclear is whether "kokumi" ligands produce analogous sensations in rodents. If they don't, then rodents are an inappropriate model system for studying the impact of kokumi on flavor perceptions. The authors fail to address this key issue, and uncritically assume that kokumi ligands produce sensations like thickness, mouthfulness, and continuity in rodents. For this reason, the authors' reference to GPRC6A as a kokumi receptor is inappropriate.

      Thank you very much for the valuable comments. The term kokumi refers to a phenomenon in which the flavor of complexly composed foods is enhanced through certain processes, making them more delicious. It is an important concept in the field of food science, which studies how to make prepared dishes more enjoyable. Kokumi is also considered a higher-order, profound cognitive function evaluated by humans who experience a wide variety of foods. However, it is unclear whether animals, particularly experimental animals, can perceive kokumi in the same way humans do.

      To date, kokumi has been described using the representative terms thickness, mouthfulness, and continuity, originally introduced in the first paper on kokumi by Ueda et al. (1990). However, these terms are derived from Japanese and may not fully convey the nuances of the original language when translated into these simple English words. In particular, thickness is often interpreted as referring to physical properties such as viscosity or somatosensory sensations. Since kokumi inherently lacks somatosensory elements, this revised paper adopts alternative terms and explanations for the three components of kokumi to prevent misunderstanding and confusion.

      Therefore, to clarify that kokumi attributes are inherently gustatory, thickness is replaced with intensity of whole complex tastes (rich flavor with complex tastes), emphasizing the synergistic effects of a variety of tastes rather than the mere enhancement of a single flavor. Mouthfulness is clarified as not referring to mouthfeel (the tactile sensation a food gives in the mouth) but rather as spread of taste and flavor throughout the oral cavity, describing how the flavor fills the mouth. Continuity is replaced with persistence of taste (lingering flavor).

      Rodents are thought to possess basic taste functions similar to humans, such as the expression of taste receptors, including kokumi receptors, in taste cells. Regardless of whether rodents can perceive kokumi, findings from studies on rodents may provide insights into aspects of the kokumi concept as experienced by humans.

      Indeed, the results of this study indicate that ornithine enhances umami, sweetness, fat taste, and saltiness, leading to the enhancement of complex flavors—referred to as intensity of whole taste. The activation of various taste cells, resulting in the enhancement of multiple tastes, may contribute to the sensation of flavors spreading throughout the oral cavity. Furthermore, the strong enhancement of MSG and MPG suggests that glutamate contributes to the mouthfulness and persistence of taste characteristic of kokumi.

      (2) A related concern is that the authors did not make any measurements that model kokumi sensations documented in the literature. For example, they would need to develop behavioral/electrophysiological measurements that reflect the known effects of kokumi ligands on flavor perception (i.e., increases in intensity, spread, continuity, richness, harmony, and punch). For example, ornithine is thought to produce more "punch" (i.e., a more rapid rise in intensity). This could be manifested as a more rapid rise in peripheral taste response or a more rapid fMRI response in the taste cortex. Alternatively, ornithine is thought to increase "continuity" (i.e., make the taste response more persistent). This response would presumably be manifested as a peripheral taste response that adapts more slowly or a more persistent fMRI response. As it stands, the authors have documented that ornithine increases (i) the preference of rats for some chemical stimuli, but not others; and (ii) the response of the CT nerve to some but not all taste stimuli.

      In animal experiments, it is challenging to examine each attribute of kokumi. The increase of complex tastes can be investigated through behavioral experiments and neural activity recordings. However, phenomena such as spread or harmony, which arise from profound human judgments, are difficult to validate in animal studies.

      While it was possible to examine persistence through neural responses to tastants, all stimuli were rinsed at 30 seconds after onset of stimulation, so the exact duration of persistence was not investigated. However, since the MSG response was enhanced approximately 1.5 times with the addition of ornithine, it is strongly suggested that the duration might also have been prolonged.

      Regarding punch, no differences were observed in the neural responses when ornithine was added, likely because the phasic response already had a rapid onset.

      In the context of fMRI studies, there has been a report that adding glutathione to mixtures of umami and salt solutions increases responses (Goto et al. Chem Senses, 2016). However, research specifically examining the attributes of kokumi has not yet been reported.

      (3) The quality of the SNAP-25 immunohistochemistry is poor (see Figure 7D), with lots of seemingly nonspecific staining in and outside the taste bud.

      The quality of the SNAP-25 staining is not poor. SNAP-25 is known to label not only type III cells but also the dense network of intragemmal nerve fibers (Tizzano et al., Immunohistochemical Analysis of Human Vallate Taste Buds. Chem Senses. 40:655-60, 2015). Therefore, the seemingly nonspecific staining is due to the intense SNAP-25 immunoreactivity of the nerve fibers.

      (4) The authors need to drastically scale back the scope of their conclusions. What they can say is that ornithine appears to enhance the taste responses of rats to a variety of taste stimuli and that this effect appears to be mediated by the GPRC6A receptor. They cannot use their data to address kokumi effects in humans, as they have not attempted to model any of these effects. Given the known problems with pharmacological blocking agents (e.g., nonspecificity), the authors would significantly strengthen their case if they could generate similar results in a GPRC6A knockout mouse.

      Our research approach begins with confirming in humans that the addition of ornithine to complex foods (such as miso soup) induces kokumi. Based on this confirmation, we conduct fundamental studies using animal models to investigate the peripheral taste mechanisms underlying the expression of kokumi.

      It is possible that the key to kokumi expression lies in the enhancement of desirable tastes (particularly umami) and the suppression of unpleasant tastes. Moving forward, we will deepen our fundamental research on the action of ornithine mediated through GPRC6A, including studies using knockout mice.

      (5) The introduction is too long. Much of the discussion of kokumi perception in humans should either be removed or shortened considerably.

      Following the reviewer’s suggestion, the introduction has been shortened.

      (6) I recommend that the authors break up the Methods and Results sections into different experiments. This would enable the authors to provide separate rationales for each procedure. For instance, the authors conducted a variety of different behavioral procedures (e.g., long- and short-term preference tests, and preference tests with and without GPRC6A receptor antagonists).

      Rather than following the reviewer’s suggestion, we have added subheadings to describe the purpose of each experiment. This approach would help readers better understand the experimental flow, as each experiment is relatively straightforward.

      (7) The inclusion of the human data is odd for two reasons. First, the measurements used to assess the impact of ornithine on flavor perception in humans were totally different than those used in rats. This makes it impossible to compare the human and rat datasets. Second, the human study was rather limited in scope, had small effect sizes, and had a lot of individual variation. For these reasons, the human data are not terribly helpful. I recommend that the authors remove the human data from this paper, and publish them as part of a more extensive study on humans.

      Despite the reviewer’s suggestion, we would like to include the human experiment because the rationale of the present study is to confirm, through a human sensory test, that the kokumi of a complex solution (in this case, miso soup) is enhanced by the addition of ornithine. This is followed by basic animal experiments to investigate the underlying mechanisms. Therefore, this human study serves an important role. The considerable variation in the scores suggests that evaluating the three kokumi attributes is challenging and likely influenced by differences in judgment criteria among participants.

      The total number of participants increased to 22 (19 women and three men) following an additional experiment with 5 new participants. New results have been shown in Supplemental Figure 1 with statistical analyses. The rewritten parts are as follows:

      We recruited 22 participants (19 women and three men, aged 21-28 years) from Kio University who were not affiliated with our laboratory, including students and staff members. All participants passed a screening test based on taste sensitivity. According to the responses obtained from a pre-experimental questionnaire, we confirmed that none of the participants had any sensory abnormalities, eating disorders, or mental disorders, or were taking any medications that may potentially affect their sense of taste. All participants were instructed not to eat or drink anything for 1 hour prior to the start of the experiment. We provided them with a detailed explanation of the experimental procedures, including safety measures and personal data protection, without revealing the specific goals of the study.

      (8) While the use of English is generally good, there are many instances where the English is a bit awkward. I recommend that the authors ask a native English speaker to edit the text.

      Thank you for this comment. The text has been edited by a native English speaker.

      Minor concerns

      (1) Lines 13-14: The authors state that "the concept of 'kokumi' has garnered significant attention in gustatory physiology and food science." This is an exaggeration. Kokumi has generated considerable interest in food science but has yet to generate much interest in gustatory physiology.

      We have rewritten this part: “The concept of “kokumi” has generated considerable interest in food science but kokumi has not been well studied in gustatory physiology.”

      (2) Line 20: The use of "specific taste" is unclear in this context. The authors indicate (in Figure 5A) that 1 mM ornithine generates a CT nerve response. They also reveal (in Figure 1A) that rats do not prefer 1 mM ornithine over water. The results from a preference test do not provide insight into whether a solution can be tasted; they merely demonstrate a lack of preference for that solution. Based on these data, the authors cannot infer that 1 mM ornithine cannot be tasted.

      We agree with the reviewer’s comment. Ornithine at 1 mM concentration may have a weak taste because this solution elicited a small neural response (Fig. 5-A). We have rewritten the text: “… at a concentration without preference for this solution.”

      (3) Line 44: Sensory information from foods enters the oral and the nasal cavity.

      The nasal cavity has been added.

      (5) Line 59: The terms "thickness", "mouthfulness" and "continuity" are not intuitive in English, and may reflect, at least in part, a failure in translation. The word thickness implies a tactile sensation (e.g., owing to high viscosity), but the authors use it to indicate a flavor that is more intense and onsets more quickly. The word mouthfulness is supposed to indicate that a flavor is experienced throughout the oral cavity. The problem here is that this happens with all tastants, independent of the presence of substances like ornithine. Indeed, taste buds occur in a limited portion of the oral epithelium, but we nevertheless experience tastes throughout the oral cavity, owing to a phenomenon called tactile referral (see the following reference: Todrank and Bartoshuk, 1991, "A taste illusion: taste sensation localized by touch", Physiology & Behavior 50:1027-1031). The word continuity does not imply that the taste is long-lasting or persistent.

      These three attributes were originally introduced by Ueda et al. (1990), who translated Japanese terms describing the profound characteristics of kokumi, which are deeply rooted in Japanese culinary culture. However, these simply translated terms have caused global misunderstanding and confusion, because they sound like somatosensory rather than gustatory descriptions. Therefore, to clarify that kokumi attributes are inherently gustatory, in the revised version we use the terms “intensity of whole complex tastes (rich flavor with complex tastes)” instead of thickness, “mouthfulness (spread of taste and flavor throughout the oral cavity),” and “persistence of taste (lingering flavor)” instead of continuity.

      The results of this study indicate that ornithine enhances umami, sweetness, fat taste, and saltiness, leading to the enhancement of complex flavors—referred to as intensity of whole taste. The activation of various taste cells, resulting in the enhancement of multiple tastes, may contribute to the sensation of flavors spreading throughout the oral cavity. Furthermore, the strong enhancement of MSG and MPG suggests that glutamate contributes to the mouthfulness and persistence of taste characteristic of kokumi.

      (6) Figure legends: The authors provide results of statistical comparisons in several of the figures. They need to explain what statistical procedures were performed. As it stands, it is impossible to interpret the asterisks provided.

      We have explained statistical procedures in each Figure legend.

      (7) I did not see any reference to the sources of funding or any mention of potential conflicts of interest.

      We have added the following information:

      Funding: JSPS KAKENHI Grant Numbers JP17K00935 (to TY) and JP22K11803 (to KU).

      Declaration of interests: The authors declare that they have no competing interests.

      Reviewer #3 (Recommendations for the authors):

      (1) I suggest that the authors increase their level of interest in glutathione and gamma-glutamyl peptides. This might include an appropriate gamma-glutamyl control substance in the two-bottle preference study (see Public Review). It might also include more careful attention to the work that identified glutathione as an activator of the CaSR (Wang et al., JBC 2006) and the nature of its binding site on the CaSR which overlaps with its site for L-amino acids (Broadhead et al., JBC 2011). This latter article also identified S-methyl glutathione, in which the free-SH group is blocked, as a high-potency activator of the CaSR. It would be expected to show comparable potency to gamma-glu-Val-Gly in assays of kokumi taste.

      We have appropriately referenced glutathione and gamma-Glu-Val-Gly, potent agonists of CaSR, where necessary. In our previous study (Yamamoto and Mizuta, Chem Senses, 2022), we examined the additive effects of these substances on basic taste stimuli in rodents, and the results were compared in greater detail with those obtained from the addition of ornithine in the present study. We have also discussed the potential binding of ornithine to other receptors, including CaSR and T1R1/T1R3 heterodimers.

      (2) Figures:

      - None of the figures were labelled with their Figure numbers. I have inferred the Figure numbers from the legends and their positions in the pdf.

      We are sorry for this inconvenience.

      - The labelling of Figure 1 and Figure 2 are problematic. In Figure 1 it should be made clear that the horizontal axes refer to the Ornithine concentration. In Figure 2 it should be made clear that the horizontal axes refer to the tastant concentrations (MSG, IMP, etc) and that the Ornithine concentrations were fixed at either zero or 1.0 mM.

      We are sorry for the lack of information about the horizontal axes. We have explained the horizontal axes in figure legends in Figs. 1 and 2. The labelling of both figures has also been modified to make this clear.

      - Figure 3B: 'Control' should appear at the top of this panel since the panels that follow all refer to it.

      Following the reviewer’s suggestion, we have added ‘Control’ at the top of Figure 3B.

      - Figure 5A. Provide a label for the test substance, presumably Ornithine.

      Yes, we have added ‘Ornithine’.

      - Figure 7 would be strengthened by the inclusion of immunohistochemistry analyses of the CaSR.

      We are sorry that we did not perform immunohistochemistry for the CaSR alone, because a previous study had already analyzed CaSR expression in rat taste cells in detail. We have instead analyzed the co-expression of GPRC6A and CaSR (see Supplemental Figure 3).

      (3) Other Matters:

      - Line 38: list the five basic taste modalities here.

      Yes, we have included the five basic taste modalities here.

      - Line 107: 'even if ... kokumi ... is less developed in rodents' - if there is evidence that kokumi is less developed in rodents it should be cited here.

      We cannot cite any references here because no studies have compared the perception of kokumi between humans and rodents.

      - Line 308: 'recently we conducted experiments in rats using gallate ...' - the authors appear to imply that they performed the research in Reference 43, however, I was unable to find an overlap between the two lists of authors.

      We did not perform the research described in Reference 43 (Reference 40 in the revised paper). Because Reference 43 showed that gallate is an agonist of GPRC6A, we were interested in conducting similar behavioral experiments using gallate instead of ornithine.

      The sentences have been rewritten to avoid misunderstanding.

      - Line 506: the sections are said to be 20 mm thick - should this read 20 micrometers?

      Thank you. We have changed this to 20 micrometers.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2024-02767

      Corresponding author(s): Kazuaki Maruyama

      1. General Statements

      Response to Reviewer #1:

      We sincerely appreciate your thoughtful review of our manuscript. Our primary objective is to elucidate the pathogenic mechanisms underlying congenital low-flow vascular malformations, thereby informing the development of novel therapeutic strategies. We recognize that, given the dual nature of our study encompassing both fundamental and clinical science, the presentation may have appeared somewhat convoluted. In response, we have revised the manuscript to clarify these points and have reformatted the text corresponding to your comments—originally presented as a single continuous block—into defined, numbered sections to enhance readability.

      Response to Reviewer #2:

      We are deeply grateful for the time and effort you have dedicated to reviewing our manuscript despite your busy schedule. Your comments have been particularly insightful, especially regarding the section on the preclinical mouse model. In light of your suggestions, we have conducted additional experiments and revised the manuscript accordingly. We trust that these modifications address your concerns and contribute to the overall improvement of our work.

      The revised sections have been highlighted in red in the text.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors investigate the pathogenesis of congenital vascular malformations by overexpressing the Pik3caH1047R mutation under the R26 locus in different cell populations and developmental stages using various Cre and CreERT2 lines, including endothelial-specific and different mesoderm precursor lines. The authors provide a thorough characterization of the vascular malformation phenotypes across models. Specifically, they claim that expressing Pik3caH1047R in the cardiopharyngeal mesoderm (CPM) precursors results in vascular abnormalities localized to the head and neck region of the embryo. The study also includes scRNAseq data analyses, including from previously published data and new data generated by the authors. Trajectory inference analysis of a previous scRNA-seq dataset revealed that Isl1+ mesodermal cells can differentiate into ETV2+ cells, directly giving rise to Prox1+ lymphatic endothelial cell progenitors, bypassing the venous stage. Single-cell RNA sequencing of their CPM model and other in vitro datasets show that Pik3caH1047R upregulates VEGF-A via HIF-1α-mediated hypoxia signaling, findings further corroborated in human samples. Finally, preclinical studies in adult mice confirm that pharmacological inhibition of HIF-1α and VEGF-A reduces the number and size of mutant vessels.

      Major comments

      1. While the study provides a nice characterization of Pik3caH1047R-derived vascular phenotypes induced by expressing this mutation in different cells, the main message of the study is unclear. What is the main question that the authors want to address with this manuscript?

      Response:

      Our main message is as follows:

      1. Elucidation of pathogenesis based on developmental cellular origins: This study focuses on using embryonic models to elucidate the mechanism by which the Pik3caH1047R mutation induces low-flow vascular malformations. Specifically, we demonstrate that expression of Pik3caH1047R in cells derived from the cardiopharyngeal mesoderm (CPM) induces vascular abnormalities that are confined to the head and neck region. Furthermore, vascular malformations originating from another cell type, for example Pax3+ cells, are confined to the lower body. This suggests that the embryonic origin of endothelial cells may determine the anatomical location of vascular malformations, with important implications for clinical severity and treatment strategies.

      2. Molecular signaling pathways and targeted therapeutic approaches:

      Through single-cell RNA sequencing, we have identified hypoxia signaling—particularly via HIF-1α and VEGF-A—as central to the pathogenesis of these malformations. Moreover, preclinical mouse model experiments demonstrate that pharmacological inhibition of HIF-1α and VEGF-A significantly reduces lesion formation, supporting the potential of targeting these pathways as a novel therapeutic strategy.

      In summary, our main message is that by elucidating the developmental and molecular mechanisms underlying Pik3caH1047R-driven low-flow vascular malformations, especially the pivotal role of hypoxia signaling via HIF-1α/VEGF-A, we provide a strong rationale for novel therapeutic strategies aimed at these challenging conditions.

      To further clarify these points, we have revised the manuscript by incorporating additional experiments and reorganizing the text into clearly defined sections.

      The precursor type from which these lesions appear, that venous and lymphatic malformations emerge independently, when and where this phenotype appears?

      Response:

      In Tie2-Cre; R26R-Pik3caH1047R mutant embryos, no prominent phenotype was observed at E9.5 or E11.5. Vascular (venous) malformations are evident from E12.5, whereas lymphatic malformations become prominent from E13.5. We propose that the emergence of the lymphatic phenotype after E13.5 is due to the fact that lymphatic vessels, particularly in the upper body, begin forming a luminal structure mainly from E13.5 onward (Maruyama et al, 2022). For further details, please refer to the explanation provided in Question 6.

      To address this, we have newly included Supplemental Figure 2 and revised the Results section as follows:

      Whereas clear phenotypes were evident at E12.5 and E13.5, no pronounced external abnormalities were observed at E9.5 or E11.5 (Supplemental Figure 2A–B). Similarly, histological examination revealed no significant differences in the short-axis diameter of the PECAM+ CV or in the number of Prox1+ LECs surrounding the CV between control and mutant embryos at E11.5 (Supplemental Figure 2C–F). We also assessed Tie2-Cre; R26R-Pik3caH1047R mutant embryos at E14.0 from five pregnant mice. Only two embryos were alive at this stage, and both showed severe edema and hemorrhaging, indicating they were nearly moribund. These observations suggest that the critical point for survival of these mutant embryos lies between E13.5 and E14.0 (Supplemental Figure 2G). (Page 5, lines 157–165)

      The manuscript needs some work to make the sections more cohesive and to structure better the main findings and the rationale for choosing the models. Authors should explain better when and where the pathogenic phenotypes refer to blood and/or lymphatic malformations. From the quantifications provided in Figure 1, Pik3caH1047R leads to different phenotypes in blood and lymphatic vessels. These are larger diameters with no difference in the number of blood vessels (are you quantifying all pecam1 positive? Vein, arteries, capillaries?), and an increase in the number of lymphatics vessels. Please clarify and discuss.

      Response:

      We interpreted this as a question regarding which vessels were quantified. The answer to this question is provided in Question 4.

      Which vessel types are considered for the quantifications shown in Fig. 1I, M, Q? All Pecam1+ vessels, including lymphatic, vein, capillaries and arteries or which ones? Provide clarifications.

      Response:

      Vessel types were characterized based on anatomical and histological features. For the anatomical details, we referred to The Atlas of Mouse Development by M.H. Kaufman.

      This aspect is described in the Methods section, as follows:

      Veins and arteries were classified based on anatomical criteria. Vessels demonstrating continuity with a clearly identifiable vein (e.g., the anterior cardinal vein) in serial sections were defined as veins. In contrast, the aorta and pulmonary artery, each exhibiting a distinct wall structure indicative of a direct connection to the heart, were designated as arteries. Lymphatic vessels were identified based on the combined expression of Prox1, VEGFR3, and PECAM, along with the developmental stage, morphology, and anatomical location as described in our previous studies (Maruyama et al, 2019, 2022, 2021). PECAM+ vessels that lacked a definitive wall structure, did not express lymphatic markers, or did not exhibit clearly identifiable continuity necessary for classification as veins or capillaries were collectively designated as blood vessels or vasculatures. (Page 16, lines 530-539)
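As a reading aid only, the classification criteria quoted above can be sketched as a small decision procedure. This is not the authors' analysis code. The `Vessel` fields, the `classify` function, and above all the order in which the criteria are checked are assumptions made for illustration. The sketch also leaves out developmental stage, morphology, and anatomical location, which the Methods text uses when identifying lymphatic vessels.

```python
from dataclasses import dataclass


@dataclass
class Vessel:
    """Hypothetical per-vessel annotations taken from serial sections."""
    pecam: bool                   # PECAM immunoreactivity
    prox1: bool                   # Prox1 expression
    vegfr3: bool                  # VEGFR3 expression
    continuous_with_vein: bool    # continuity with a clearly identifiable vein
    arterial_wall_to_heart: bool  # distinct wall structure with a direct connection to the heart


def classify(v: Vessel) -> str:
    """Assign a vessel label; the order of the checks is an assumption."""
    if not v.pecam:
        return "unclassified"
    if v.prox1 and v.vegfr3:
        return "lymphatic vessel"
    if v.arterial_wall_to_heart:
        return "artery"
    if v.continuous_with_vein:
        return "vein"
    # PECAM+ structures that meet none of the criteria above
    return "blood vessel"


example = Vessel(pecam=True, prox1=False, vegfr3=False,
                 continuous_with_vein=True, arterial_wall_to_heart=False)
print(classify(example))
```

The final fallback mirrors the statement that PECAM+ vessels without demonstrable continuity or lymphatic markers are grouped as blood vessels or vasculatures.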

      Regarding Figure 1I:

      In the tongue and mandible, the facial vein—which branches from the anterior cardinal vein—is dilated, and its continuity with the venous system is confirmed. In contrast, Figure 1J shows the number of PECAM+ vasculatures; however, for smaller vessels, continuity is not always demonstrable, so these are designated as vasculatures according to the criteria.

      Regarding Figures 1M and N:

      In the liver, the dilated vessels are classified as veins because they exhibit continuity with the inferior vena cava. Even in the control group, the central veins tend to have relatively large diameters. Therefore, we compared the average area and quantified the number of abnormal central veins—defined as those contiguous with a vein and exceeding a specified area.

      Regarding Figures 1Q and R:

      Cerebral vessels are classified as veins due to their continuity with the common cardinal and jugular veins. However, as these vessels extend into the periphery, this continuity becomes less distinct, and they are consequently designated as blood vessels lacking Prox1 expression.

      The authors propose that the CPM model results in localized head and neck vascular malformations. However, I am not convinced. The images supporting the neck defects are evident, but it is unclear whether there are phenotypes in the head.

      Response:

      Perhaps the discrepancy arises from a terminological issue. According to the WHO Classification of Tumours, commonly used in clinical settings, the term "Head and Neck" refers to the facial and cervical regions (including the oral cavity, larynx, pharynx, salivary glands, nasal cavity, etc.) and excludes the central nervous system. The inclusion of the brain in Figure 1O-R may have led to some confusion. We included the brain because cerebral cavernous malformations are classified as venous malformations, and thus serve as an example of common sites for venous malformations in humans. To clarify this point, we have made slight revisions to the first part of the Introduction, as follows:

      They frequently manifest in the head and neck region, here defined as the orofacial and cervical areas, excluding the brain. (Page 2, lines 52-53)

      Why are half of the experiments with the Tie2-Cre model conducted at E12.5 (e.g., validation of recombination, signaling, proliferation) and the others at E13.5? It becomes confusing for the reader why the authors start the results section with E13.5 and then study E12.5.

      Response:

      This is also related to the previous question (Question 4). We decided to include extensive anatomical information in a single figure. In Supplemental Figure 1, sagittal sections at E12.5 were used so that the pulmonary artery, aorta, and dilated common cardinal vein could be visualized within one sample. This allowed us to demonstrate that the Pik3caH1047R mutation does not affect arteries by contrasting them with the dilated veins. At E13.5, in addition to the dilation observed at E12.5, the common cardinal vein becomes markedly dilated and compresses the surrounding structures. Capturing both veins and arteries simultaneously would require multiple images, which could potentially confuse the reader. Moreover, lymphatic and other organ phenotypes (e.g., in the liver) are more prominent at E13.5. Therefore, we selectively employed both E12.5 and E13.5 stages to suit our specific objectives.

      The quantifications provided do not clarify what the "n" represents or how many embryos or litters were analyzed. 

      Response:

      Thank you for your feedback. We have now incorporated the sample size (n) directly into the graphs and figure legends.

      Blasio et al. (2018) and Hare et al. (2015) reported that Pik3caH1047R with Tie2-Cre embryos die before E10.5. How do the authors explain the increase in survival here? Were embryos at E13.5 alive? What was the Mendelian ratio observed by the authors? Please provide this information and discuss this point.

      Response:

      Two types of Tie2-Cre lines are widely used. The mouse line employed by Blasio et al. (2018) differs from that used in our study (their manuscript did not specify whether the background was B6 or a mixed strain). In contrast, although Hare et al. (2015) used the same mouse line as we did, they maintained a C57BL/6 background. We selected a mixed background of B6 and ICR, as we believe that a heterogeneous genetic background more accurately reflects the diversity of human pathology. We examined five pregnant females, which yielded approximately 30 embryos, of which only two survived until E14.0. Based on these observations, we consider E13.5 to be the appropriate survival limit (see Supplemental Figure 2G for additional details). In our breeding strategy, mice in the Tie2-Cre or Tie2-Cre; R26R-eYFP line were maintained as heterozygotes for Tie2-Cre and homozygotes for R26R-eYFP, whereas those carrying the R26R-Pik3caH1047R allele were homozygous. This approach produced control (Cre(-)) and heterozygous offspring in the expected 1:1 ratio at all examined stages: E9.5 (mutant n = 4, control n = 4 from two pregnant females), E11.5 (mutant n = 8, control n = 8 from two pregnant females), E12.5 (mutant n = 4, control n = 4 from two pregnant females), and E13.5 (mutant n = 5, control n = 5 from two pregnant females), with no deviation from the anticipated Mendelian ratio.
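      As an illustration only, the 1:1 Mendelian expectation can be checked against the genotype counts quoted in this response with a chi-square goodness-of-fit test. The sketch below is not part of the authors' analysis; it assumes the per-stage counts can be pooled, and the function name chi_square_1to1 is hypothetical.

```python
import math

# Genotype counts (mutant, control) per stage, as quoted in the response.
# Pooling the stages is an assumption made for this illustration.
counts = {
    "E9.5": (4, 4),
    "E11.5": (8, 8),
    "E12.5": (4, 4),
    "E13.5": (5, 5),
}


def chi_square_1to1(mutant, control):
    """Chi-square goodness-of-fit statistic and p-value against a 1:1 ratio (1 df)."""
    expected = (mutant + control) / 2
    stat = ((mutant - expected) ** 2 + (control - expected) ** 2) / expected
    # With one degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


total_mutant = sum(m for m, _ in counts.values())
total_control = sum(c for _, c in counts.values())
stat, p_value = chi_square_1to1(total_mutant, total_control)
print(f"mutant={total_mutant}, control={total_control}, chi2={stat:.2f}, p={p_value:.2f}")
```

      With equal pooled counts the statistic is zero, so the test gives no evidence against a 1:1 ratio. With so few embryos per stage, the test has little power to detect a modest deviation.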

      Regarding this point, we have described it in the Results section as follows:

      Whereas clear phenotypes were evident at E12.5 and E13.5, no pronounced external abnormalities were observed at E9.5 or E11.5 (Supplemental Figure 2A–B). Similarly, histological examination revealed no significant differences in the short-axis diameter of the PECAM+ CV or in the number of Prox1+ LECs surrounding the CV between control and mutant embryos at E11.5 (Supplemental Figure 2C–F). We also assessed Tie2-Cre; R26R-Pik3caH1047R mutant embryos at E14.0 from five pregnant mice. Only two embryos were alive at this stage, and both showed severe edema and hemorrhaging, indicating they were nearly moribund. These observations suggest that the critical point for survival of these mutant embryos lies between E13.5 and E14.0 (Supplemental Figure 2G). (Page 5, lines 157-165)

      Please explain the rationale for using the Cdh5-CreERT2. It is likely due to the lethality observed with Tie2Cre, but this was not mentioned.

      Response:

      Thank you very much for your comment. As mentioned above, nearly all Tie2‐Cre;Pik3caH1047R embryos fail to survive past E14.0.

      The lethality observed with Tie2‐Cre mice is described as follows:

      We also assessed Tie2-Cre; R26R-Pik3caH1047R mutant embryos at E14.0 from five pregnant mice. Only two embryos were alive at this stage, and both showed severe edema and hemorrhaging, indicating they were nearly moribund. These observations suggest that the critical point for survival of these mutant embryos lies between E13.5 and E14.0 (Supplemental Figure 2G). (Page 5, lines 161-165)

      The rationale for using CDH5-CreERT2 mice is described as follows:

      To investigate whether the resulting human disease subtype (e.g., lesions confined to the head and neck region) is determined by the specific embryonic stage at which Pik3caH1047R is expressed, we crossed tamoxifen-inducible, pan-endothelial CDH5-CreERT2 mice with R26R-Pik3caH1047R mice and analyzed the embryos at E16.5 or E17.5. (Page 5, lines 169-172)

      Why were tamoxifen injections done at various time points (E9.5, E12.5, E15.5)? Please clarify the reasoning behind administering tamoxifen at these specific times. Explaining the rationale will help the reader follow the experimental design more easily. Additionally, including an initial diagram summarizing all the strategies to guide the reader from the beginning would be helpful.

      Response:

      Martinez‐Corral et al. (Nat. Commun., 2020) focused on lymphatic malformations, arguing that the timing of tamoxifen administration during the embryonic period determines the anatomical features of these lesions. They stated, “The majority of lesions appeared as large isolated cysts that were localized mainly to the cervical, and less frequently to the sacral region of the skin (Figure 2)”. Although not stated definitively, their data suggest that early embryonic tamoxifen administration results in the formation of large‐caliber lymphatic vessels with region‐specific distribution in the cervical skin (Figure 2C, Supplemental Figure 2). This description likely reflects an intention to model human vascular malformations, implying that the anatomical characteristics of these malformations are influenced by the developmental stage at which the Pik3caH1047R somatic mutation occurs.

      Inspired by these findings, we conducted experiments to determine whether altering the timing of tamoxifen administration would yield region-specific anatomical patterns in vascular malformation development. However, our results indicate that changing the timing of tamoxifen administration does not lead to an anatomical bias similar to that observed in human vascular malformations. Instead, we propose that the embryological cellular origin plays a more significant role in the formation of these human pathologies.

      Regarding this section, we have slightly revised the introductory part of the Figure 2 explanation as follows:

      To investigate whether the resulting human disease subtype (e.g., lesions confined to the head and neck region) is determined by the specific embryonic stage at which Pik3caH1047R is expressed, we crossed tamoxifen-inducible, pan-endothelial CDH5-CreERT2 mice with R26R-Pik3caH1047R mice and analyzed the embryos at E16.5 or E17.5. (Page 5, lines 169-172)

      Additionally, we have added a schematic diagram of the tamoxifen administration schedule at the beginning of Figure 2 and Supplemental Figure 3.

      Why do you use the Isl1-Cre constitutive line (instead of the CreERT2)? The former does not allow control of the timing of recombination (targeting specifically your population of interest) and loses the ability to trace the mutant cell behaviors over time. Is the constitutive expression of Pik3caH1047R in Isl1+ cells lethal at any embryonic time, or do the animals survive into adulthood? When you later use the Isl1-CreERT2 line, why do you induce recombination specifically at E8.5? It would be helpful for the reader to have an explanation for this choice, along with a reference to your previous paper.

      Response:

      Thank you for your comments. We did attempt the same experiments using Isl1-CreERT2 under various conditions. However, administering tamoxifen earlier than E8.5 invariably caused embryonic lethality, likely due to both Pik3ca activity and tamoxifen toxicity, leaving no embryos for analysis. In our previous study, repeated attempts from E6.5 to E16.5 resulted in only two surviving embryos (Maruyama et al., eLife, 2022, Supplemental Figure 3). We also failed to recover any live embryos with tamoxifen administration at E7.5.

      Even reducing the tamoxifen dose to one-fifth did not succeed when given before E8.5. Although E8.5 administration was feasible, the observed phenotype remained mild, and no phenotype was detected at E9.5, E11.5, E12.5, or later stages. These findings align with our earlier observations that moving tamoxifen injection from E8.5 to E9.5 markedly diminishes the Isl1+ contribution to the endothelial lineage.

      Furthermore, Supplemental Figures 5 and 6 suggest that a decrease in Isl1 mRNA, which occurs as early as E8.0–E8.25, triggers the shift toward endothelial differentiation. Considering these data and the mild phenotype at E8.5, earlier administration would be ideal for impacting Isl1+ cell fate. However, technical constraints prevented us from doing so, leading us to utilize the constitutive Isl1-Cre line instead.

      This section was already included in the Discussion; however, for clarity, we have revised it as follows:

      Given that Isl1 expression disappears at a very early stage and contributes to endothelial differentiation, experiments using Isl1-Cre or Isl1-CreERT2 mice cannot clearly distinguish between LMs, VMs, and capillary malformations. In other words, Isl1+ cells likely label a common progenitor population for multiple endothelial subtypes. Consequently, the diverse vascular malformations in the head and neck (including mixed venous-lymphatic and capillary malformations, as well as the macro- and microcystic subtypes of LMs) cannot be fully accounted for by this study alone. (Page 13, lines 419-425)

      What is the purpose of using this battery of CreERT2 lines (for example, the Myf5-CreERT2)?

      Response:

      The head and neck mesoderm arises primarily from the cardiopharyngeal mesoderm and the cranial paraxial mesoderm. Myf5-CreERT2 labels the cranial paraxial mesoderm in the facial region, which gives rise to facial skeletal muscles. Stone et al. (Dev Cell, 2019) reported that a subset of this lineage contributes to head and neck lymphatic vessels, whereas our study (Maruyama et al., eLife, 2022) found no such contribution—an ongoing point of debate. Nevertheless, expressing Pik3caH1047R in this lineage did not induce any vascular malformations.

      Pax3-CreERT2 mice label Pax3⁺ paraxial mesoderm (including cranial paraxial mesoderm), which reportedly contributes to the common cardinal vein and subsequently forms trunk lymphatics (Stone & Stainier, 2019; Lupu et al, 2022). When Pik3caH1047R was expressed in Pax3⁺ cells, we observed abnormal vasculature in the lower trunk and around the vertebrae, consistent with that report.

      Synthesizing these observations with our results from Isl1-Cre, Isl1-CreERT2, and Mef2c-AHF-Cre lines, we propose that Pik3caH1047R mutations within the cardiopharyngeal mesoderm underlie the clinically significant vascular malformations seen in the head and neck region.

      We have also incorporated the following explanation into the main text.

      Regarding the Pax3-CreERT2:

      The head and neck mesoderm arises primarily from the cardiopharyngeal mesoderm and the cranial paraxial mesoderm. In Pax3-CreERT2; R26R-Pik3caH1047R embryos, Pax3+ paraxial mesoderm (including cranial paraxial mesoderm) is labeled; this lineage reportedly contributes to the common cardinal vein and subsequently forms trunk lymphatics (Lupu et al, 2022). (Page 8, lines 247-250)

      Regarding the Myf5-CreERT2:

      In Myf5-CreERT2; R26R-tdTomato mice—which label the cranial paraxial mesoderm, particularly muscle satellite cells—crossed with R26R-Pik3caH1047R, tamoxifen was administered to pregnant mice at E9.5. (Page 8, lines 255-257)

      I find the scRNAseq data in Fig S4 and S5 very interesting, although I am unsure how they fit with the rest of the story. In principle, the differentiation of a subset of Isl1+ cardiopharyngeal mesoderm (CPM) derivatives into lymphatic endothelial cells was already demonstrated in a previous publication from the group. What is the novelty and purpose here?

      Response:

      This also addresses Question 11. Our aim in using the Isl1⁺ lineage was to determine the extent of analysis possible with this experimental system. Through reanalysis, we found that the downregulation of Isl1 triggers a switch toward endothelial cell differentiation, with this cell fate decision occurring at a very early embryonic stage. Consequently, our single‐cell analysis supports the conclusion that, regardless of the Isl1-CreERT2 line used or the timing of tamoxifen administration, it is challenging to precisely recapitulate the fine clinical phenotypes observed in humans (e.g., lymphatic or venous malformations) with this experimental system. We believe that this single‐cell analysis provides a theoretical basis for the notion that our Isl1-Cre-based developmental model can only generate a mixed phenotype of vascular and lymphatic malformations.

      This section is explained in a similar manner in the revised Discussion for Question 11 as follows:

      Given that Isl1 expression disappears at a very early stage and contributes to endothelial differentiation, experiments using Isl1-Cre or Isl1-CreERT2 mice cannot clearly distinguish between LMs, VMs, and capillary malformations. In other words, Isl1+ cells likely label a common progenitor population for multiple endothelial subtypes. Consequently, the diverse vascular malformations in the head and neck (including mixed venous-lymphatic and capillary malformations, as well as the macro- and microcystic subtypes of LMs) cannot be fully accounted for by this study alone. (Page 13, lines 419-425)

      Why were the ECs in Fig. 4 not subclustered for further analysis (as in Fig. S4, S5)? This is a missed opportunity to understand the pathogenic phenotypes.

      Response:

      Thank you for your question. We performed sub-clustering analysis, particularly focusing on why no phenotype is observed in arteries, as we believed this approach could provide molecular-level insights. Accordingly, we conducted the analysis presented in Figure 1 for Reviewer 1.

      Figure legend for Figure 1 for Reviewer 1. The number of endothelial cells was insufficient, making subclustering ineffective.

      (Figure for Reviewer 1A, B) Left: UMAP plot showing color-coded clusters (0–3). Subcluster analysis of the Endothelium (Cluster 1) from Fig. 4B. Right: UMAP plot color-coded by condition. (Figure for Reviewer 1C) Heatmap showing the average gene expression of marker genes for each cluster by condition. After cluster annotation, subclusters 0, 1, 2, and 3 were defined as Vein, Capillary, Artery, and Lymphatics, respectively. (Figure for Reviewer 1D) Cell type proportions. (Figure for Reviewer 1E) Number of differentially expressed genes (DEGs) in each subcluster of the PIK3CAH1047R group relative to Control. (Figure for Reviewer 1F) Comparison of enrichment analysis between EC subclusters from scRNA-seq. The bar graph shows the top 20 significantly altered Hallmark gene sets in EC subclusters from scRNA-seq using ssGSEA (escape R package). Red bars represent significantly upregulated Hallmark gene sets in mutants (FDR [...]

      Initially, we performed sub-clustering on endothelial cells; however, this resulted in a considerably reduced number of cells per sub-cluster, especially in the control group (Figure for Reviewer 1A, B). In the control group, there were only approximately 149 endothelial cells in total, and dividing these into four clusters led to very few cells per cluster, thereby introducing statistical instability. Although arterial endothelial cells were relatively well defined by their high expression of Hey1 and Hey2 and lower levels of Nr2f2 and Aplnr, the boundaries between venous, capillary, and lymphatic endothelial cells were less distinct. In particular, defining lymphatic endothelial cells solely by Prox1 expression yielded a very small population; even after incorporating additional lymphatic markers such as Flt4 and Lyve1, it remained challenging to clearly separate the venous, capillary, and lymphatic populations (Figure for Reviewer 1C). Consequently, the proportion of lymphatic endothelial cells was markedly low, and discrepancies with the histological findings further reduced our confidence in this dataset (Figure for Reviewer 1D, E). Moreover, the number of differentially expressed genes (DEGs) increased with the number of cells, and the results of the enrichment analysis as well as the volcano plot were nearly identical to those shown in Figure 4 (Figure for Reviewer 1F, G). In other words, the subclustering process itself had limitations, resulting in the overall outcome being dominated by the most abundant venous cluster.

      It is possible that these limitations in sub-clustering are due to the relatively small number of endothelial cells. Nonetheless, a major strength of our single-cell analysis is its ability to compare various cell types derived from Isl1+ lineages, not just endothelial cells. Therefore, the relative scarcity of endothelial cells represents a limitation of this experimental system. For these reasons, we decided to omit this figure from the final version of the manuscript.

      This point is described in the Discussion section as follows:

      Additionally, we performed endothelial subclustering to explore potential differences in gene expression among arterial, venous, capillary, and lymphatic endothelium. However, in the control embryos, the number of endothelial cells was too low to yield reliable data (data not shown). (Page 13, lines 434-437)

      Hypoxia and glycolysis signatures are not specific to mutant ECs. Do the authors have an explanation for this? It is well known that PI3K overactivation increases glycolysis; please acknowledge this.

      Response:

      Thank you for your important comment. We have now added a statement to the Discussion, with relevant references, acknowledging that PI3K overactivation increases glycolysis, as follows:

      It is well known that overactivation of PI3K enhances glycolysis (Hu et al, 2016). In our study, the elevated expression of glycolytic enzymes, including Ldha, suggests a shift toward aerobic glycolysis, consistent with the Warburg effect. (Page 13, lines 447-450)

      Do you have an explanation for the expression of VEGFA by lymphatic mutant cells?

      Response:

      VEGF-A acts on VEGFR2 expressed on LECs, thereby promoting their proliferation and migration (Hong et al, 2004; Dellinger & Brekken, 2011). To clarify this point, we have revised the text accordingly and added additional references as follows:

      We focused on Vegf-a, a key regulator of EC proliferation and a downstream target of Hif-1α. Vegf-a likely drives both cell-autonomous and non-cell-autonomous effects on blood ECs, as well as LECs (Hong et al, 2004; Dellinger & Brekken, 2011). (Page 13, lines 445-447)

      Likewise, why did mesenchymal cells traced from the Isl1-Cre decrease upon expression of Pik3caH1047R?

      Response: When comparing the mesenchyme cluster with other mesoderm-derived cells, we observed a marked downregulation of signaling pathways, notably those that promote EMT, such as TGF-β, Wnt/β-catenin, and MYC target genes (Supplemental Figure 7B). Downregulation of many of these pathways is associated with decreased epithelial-to-mesenchymal transition (Xu et al, 2009; Singh et al, 2012; Larue & Bellacosa, 2005; Yu et al, 2015), which could explain the reduction in the number of mesenchymal cells. However, PI3K activation is generally considered to promote EMT, so this finding appears to be at odds with previous studies.

      On the other hand, several investigations, including those using ES cells, suggest that PI3K activation could suppress TGF-β signaling via SMAD2/3 (Yu et al, 2015), and in some undifferentiated cell contexts, it may also inhibit the Wnt/β-catenin pathway via SMAD2/3 (Singh et al, 2012). These multifaceted roles of PI3K could be particularly important during embryonic development (Larue & Bellacosa, 2005).

      Understanding how mesenchymal cell changes under PI3K activation affect endothelial cells is an important issue that requires further study. Accordingly, we have added these points to the Discussion section as follows:

      In our data, the mesenchymal cell population was decreased, and within this cluster, pathways typically promoting epithelial-mesenchymal transition (EMT) (e.g., TGF-β, Wnt, and MYC target genes) were downregulated (Supplemental Figure 7B). Although PI3K activation is generally thought to enhance EMT, several studies in undifferentiated cells have reported that PI3K can suppress these signals via SMAD2/3 (Singh et al, 2012; Yu et al, 2015). Elucidating how these changes in the mesenchyme contribute to vascular malformation pathogenesis remains an important avenue for future research. (Page 13, lines 437-444)

      Authors need to characterize the preclinical model before conducting any preclinical study. No controls are provided, including wild-type mice and phenotypes, before starting the treatment (day 4).

      Response:

      Thank you very much for your comment. We have now added new images illustrating skin under three conditions: untreated skin at Day 7, skin from Cre-negative animals that received tamoxifen, and skin from Cre-positive animals examined 4 days after tamoxifen administration. Additionally, we have included the corresponding statistical data for these skin samples (Figure 6C–E).

      Why did the authors not use their developmental model of head and neck malformation model for preclinical studies? This would be much more coherent with the first part of the manuscript. Also, how many animals were treated and quantified for the different conditions?

      Response:

      We have now indicated the number of animals (n) used under each condition directly on the graphs for clarity. As for why we did not use the Isl1-Cre model, we observed that, similar to the Tie2-Cre line, all Isl1-Cre mutant embryos died between E13.5 and E14.0 (indeed, none survived beyond E14.0; see our newly added Figure 3N). Consequently, we could not perform any postnatal treatment experiments. Moreover, as previously noted, the Isl1-CreERT2 line has an extremely narrow developmental window for vascular malformation formation, making it less suitable as a general model.

      Although we considered potential in utero or maternal interventions (e.g., direct uterine injection or placental transfer), these approaches demand extensive technical optimization and remain an area for future investigation. From a clinical standpoint, postnatal therapy meets a more immediate need: while vascular malformations are congenital, they often enlarge over time (Ryu et al, 2023), becoming more apparent and more likely to require treatment.

      In this study, because embryonic Pik3caH1047R expression was lethal before birth, we generated and treated postnatal cutaneous vascular malformations instead. Although this model does not strictly recapitulate the embryonic disease state, previous studies assessing drug efficacy have similarly employed postnatal tamoxifen-inducible mouse models (Martinez-Corral et al, 2020), lending validity to this approach. Moreover, because lesions typically become evident later in life rather than in utero, this method more closely aligns with clinical reality and may be more readily translated into practice.

      Minor Comments

      References in the introduction need to be revised. Specifically, how the authors arrived at the statistics on head and neck vascular malformations needs to be clarified. For instance, one of the cited papers refers to all types of vascular malformation, while the other focuses exclusively on lymphatic malformations with PIK3CA mutations. Moreover, in the latter, the groups are divided into orofacial and neck and body categories. How do the authors extract the head and neck information here?

      Response:

      We have clarified our definition of the “head and neck” region early in the Introduction and separated the discussion on anatomical localization from that on PIK3CA genetics. Additionally, we removed the percentage data of localization to avoid potential confusion with the genetic aspects.

      In Japan, lymphatic and other vascular malformations of the head and neck typically require complex, multidisciplinary management. Consequently, these conditions are officially designated as “intractable diseases,” and the government provides financial assistance for their treatment. Although most of the information is available only in Japanese, we refer reviewers to the following websites for details on head and neck vascular malformations:

      https://www.nanbyou.or.jp/entry/4893 https://www.nanbyou.or.jp/entry/4631 https://www.nanbyou.or.jp/entry/4758.

      (Please read with an English translator, e.g., the Google Chrome translator.)

      We are not aware of a comparable system in other countries. However, it is well recognized that vascular malformations frequently occur in the head and neck region (Nair, 2018; Alsuwailem et al, 2020; Sadick et al, 2017), as evidenced by over 250 PubMed hits when searching for “vascular malformation” and “head and neck.”

      Incorporating this comment, we have revised the early part of the Introduction as follows:

      They frequently manifest in the head and neck region—here defined as the orofacial and cervical areas, excluding the brain (Zenner et al, 2019; Lee & Chung, 2018; Nair, 2018; Alsuwailem et al, 2020). (Page 2, lines 52-53)

      Also, in line 79, I need clarification on ref 24 about fibrosis.

      Response:

      Thank you very much for pointing out the error. We have corrected the placement of the reference accordingly.

      Include references: Studies in mice have shown that p110α is essential for normal blood and lymphatic vessel development. Please clarify and correct. 

      Response:

      Thank you very much. We have now added the references (Graupera et al, 2008; Gupta et al, 2007; Stanczuk et al, 2015).

      Please define PIP2 and PIP3

      Response:

      Thank you very much for your comment. We have now added the following definitions to the Introduction:

      PIP2: Phosphatidylinositol 4,5-bisphosphate

      PIP3: Phosphatidylinositol 3,4,5-trisphosphate


      Why is Prox1 showing positivity in erythrocytes in Figure 1?

      Response:

      We used paraffin-embedded sections to preserve tissue morphology. Although we applied a reagent to suppress autofluorescence, some spillover from excitation around 488 nm was unavoidable. Moreover, in the mutant mice, blood remained within the abnormal vessels rather than being completely flushed out, which further increased the autofluorescence. Despite our efforts to mitigate this, some residual autofluorescence persisted. Consequently, we also employed DAB-based staining to confirm the specificity of Prox1 labeling in other Figures.

      Regarding Figure 1, I suggest organizing the quantifications in the same order to facilitate phenotype comparisons. For example, I, J vs. Q, R. What is the difference between M and N?

      Response:

      To facilitate the comparison between Figures 1I, J and 1Q, R, we have swapped Figures 1Q and R. Regarding Figures 1M and N, these panels represent the average cross-sectional area of an enlarged malformed vessel and the number of vessels exceeding a defined size, respectively. Although some central veins appeared slightly enlarged in the control group, the liver exhibits both a significant dilation of malformed vessels and an increased number of such vessels.

      Add the reference for the bulk RNA-seq data.

      Response:

      We have added the following reference: (Jauhiainen et al, 2023)

      Mark in the Fig. 4F that the volcano plots are from cluster one of the scRNASeq (this is explained in text and legend, but when you go to the figure, it isn't very clear).

      Response:

      We have added the label “Cluster 1: Volcano Plot (genes associated with hypoxia/glycolysis)” to Figure 4F.

      Please label Figure 6D/E with the proper labels.

      Response:

      We have provided appropriate labels for Figure 6.

      In Fig. 6, it is mentioned that vacuoles are from the tamoxifen injection, how do you know? Do you also see them if you add oil alone (without tamoxifen) or tamoxifen in a WT background?

      Response:

      In Figure 6C, we have included both the image at Day 4 and the condition of Cre(–) animals 7 days after tamoxifen injection.

      **Referees cross-commenting**

      I completely agree with referee #2 regarding the preclinical studies. Bevacizumab does not neutralize murine VEGFA. This is a major issue.

      Response:

      As noted in the Reviewer #2 section, there appears to be some effect on mouse vasculature (Lin et al, 2022). However, given the ongoing debate regarding this issue, we performed additional experiments using a neutralizing antibody against mouse VEGF-A (clone 2G11). This antibody has been shown to suppress the proliferation of mouse vascular endothelial cells in vivo (e.g., Mashima et al, 2021; Wuest & Carr, 2010). Our results demonstrate that it suppresses the proliferation of malformed vasculature (both blood and lymphatic vessels) more sharply than bevacizumab. Based on these additional experiments, we revised the figures and updated them as Figure 6.

      Reviewer #1 (Significance (Required)):

      This study addresses a timely and relevant question: the origins, onset and progression of congenital vascular malformations, a field with limited understanding. The work is novel in its approach, employing complex embryonic models that aim to mimic the disease in its native context. By focusing on the effects of Pik3caH1047R mutations in cardiopharyngeal mesoderm-derived endothelial cells, it sheds light on how these mutations drive phenotypic outcomes through specific pathways, such as HIF-1α and VEGF-A signaling, while also identifying potential therapeutic targets. A strong aspect of the study is the use of embryonic models, which enables the investigation of disease onset in a context that closely resembles the in vivo environment. This is particularly valuable for congenital disorders, where native developmental cues are an integral aspect of disease progression. The study also integrates advanced techniques, including single-cell RNA sequencing, to dissect the cellular and molecular responses induced by the Pik3caH1047R mutation. Moreover, from a translational perspective, it provides novel therapeutic strategies for these diseases. Limitations of the study are (1) a lack of clarity about the main question the authors try to address and the main conclusions derived from it; (2) the different parts of the manuscript are not well connected and the rationale is not clear; (3) the scRNAseq analysis is underdeveloped; (4) characterization of the preclinical model is not provided.

      Audience:

      The findings presented here interest specialized audiences within developmental biology, vascular biology, and congenital disease research fields, and clinicians by providing new therapies to treat vascular anomalies. Moreover, the study's integration of single-cell and in vivo models could inspire further research in other contexts where understanding clonal behavior and signaling pathways is critical.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This paper focuses on vascular malformations driven by PI3K mutation, with particular interest in the vascular defects localized at head and neck anatomical sites. The authors exploit the H1047R mutant, which has been widely shown to induce both vascular and lymphatic malformations. To limit the effect of H1047R to tissues originating from cardiopharyngeal mesoderm, Pik3caH1047R mice were crossed with mice expressing Cre under the control of the promoter of Isl1, a transcription factor that contributes to the development of cardiopharyngeal mesoderm-derived tissues. By comparing the embryo phenotype of this model with that observed by inducing the expression of Pik3caH1047R at different times of development, the authors conclude that the Isl1-Cre; Pik3caH1047R; R26R-eYFP model better recapitulates the anatomical features of human vascular malformations, in particular those localized at the head and neck. In my opinion, the newly proposed model represents significant progress in the study of human vascular malformations. Furthermore, scRNA-seq analysis has allowed the authors to propose a mechanism focused on the role of HIF and VEGFA. The authors provide partial evidence that HIF and VEGFA inhibitors halt the development of vascular malformations in VeCAdCre; Pik3caH1047R mice. This experiment is characterized by a conceptual mistake because bevacizumab does not recognize murine VEGFA (see for instance 10.1073/pnas.0611492104; 10.1167/iovs.07-1175). This error dampens my enthusiasm.

      CRITICISM

      1. Fig 1A. E13.5 corresponds to the early phase of vascular remodelling. What is the phenotype at earlier stages (e.g. E9.5 or E10.5)?

      Response:

      Thank you very much for your comment. We have created new Supplemental Figure 2, which demonstrates that no obvious phenotype is observed in mutant embryos at E9.5 and E11.5, and that the survival limit of these mutant embryos is around E13.5 to E14.0.

      In response to Reviewer 1’s question, a previous study (Hare et al, 2015) has shown that on a B6 background, this mouse model exhibits an earlier onset of phenotype, resulting in early lethality. However, we selected a mixed background of B6 and ICR, as we believe that a heterogeneous genetic background more accurately reflects the diversity of human pathology. We examined five pregnant females, which yielded approximately 30 embryos, of which only two survived until E14.0. Based on these observations, we consider E13.5 to E14.0 to be the appropriate survival limit (see Supplemental Figure 2G for additional details).

      We have described this in the Results section as follows:

      Whereas clear phenotypes were evident at E12.5 and E13.5, no pronounced external abnormalities were observed at E9.5 or E11.5 (Supplemental Figure 2A–B). Similarly, histological examination revealed no significant differences in the short-axis diameter of the PECAM+ CV or in the number of Prox1+ LECs surrounding the CV between control and mutant embryos at E11.5 (Supplemental Figure 2C–F). We also assessed Tie2-Cre; R26R-Pik3caH1047R mutant embryos at E14.0 from five pregnant mice. Only two embryos were alive at this stage, and both showed severe edema and hemorrhaging, indicating they were nearly moribund. These observations suggest that the critical point for survival of these mutant embryos lies between E13.5 and E14.0 (Supplemental Figure 2G). (Page 5, lines 157-165)

      Fig 1, 2, 3. The analysis of VEGFR2 expression is required. This request is important because of the paradigmatic and non-overlapping role of this receptor in early and late vascular development. Furthermore, these data would better clarify the mechanism suggested by the experiments reported in Fig 5 (VEGFA and HIF expression).

      Response:

      Thank you very much for your comment. For each mouse presented in Figures 1, 2, and 3, we performed VEGFR2 immunostaining on serial sections corresponding to each figure and created a new Supplemental Figure 9. VEGFR2 was broadly expressed in both vascular and lymphatic endothelial cells in control and mutant embryos.

      We have described this in the Results section as follows:

      Furthermore, to verify whether VEGF‐A can act via VEGFR2, we performed VEGFR2 immunostaining on several mouse models: Tie2‐Cre; R26R‐Pik3caH1047R embryos (E13.5, corresponding to Figure 1), CDH5‐CreERT2; R26R‐Pik3caH1047R embryos (tamoxifen administered at E9.5 and analyzed at E16.5, corresponding to Figure 2), and Isl1‐Cre; R26R‐Pik3caH1047R embryos (E11.5 and E13.5, corresponding to Figure 3). In all cases, both control and mutant embryos exhibited widespread VEGFR2 expression in blood and lymphatic vessels at early and late developmental stages (Supplemental Figure 9A-R’). These findings suggest that Pik3caH1047R may act in an autocrine manner, at least in part via the VEGF‐A/VEGFR2 axis in endothelial cells, potentially explaining the observed phenotype. (Page 11, lines 352-361)

      As done in Fig 1,2 and 3, data quantification by morphometric analysis is also required for results reported in supplemental figure 3

      Response:

      Thank you for your comment. We have now added additional statistics and graphs for clarity, which are presented as Supplemental Figure 4.

      Lines 166-174. I suppose that the reported observations were done at E16.5. What happens later? It's crucial to sustain the statement at lines 187-190

      Response:

      At E9.5 and E12.5, we reduced the tamoxifen dose to one-fifth of the standard dose. After collecting embryos from approximately 10 pregnant females, we were only able to obtain three embryos at these stages. When tamoxifen was administered at E15.5, three embryos were obtained from two litters. In most cases, miscarriages occurred by E16.5, making further observation difficult. We focused on the time point around E16.5 because it is generally believed that the basic distribution of the lymphatic system throughout the body is established around this stage (Srinivasan et al, 2007; Maruyama et al, 2022).

      A similar experiment has been reported using T-CreERT2 to induce mosaic expression of Pik3caH1047R in the mesoderm, which resulted in subcutaneous venous malformations in mice at P1–P5 (Castillo et al, 2016). However, that study did not report whether the mice survived normally after birth. In fact, regarding the survival rate, the authors stated, “Our observations on the lethality and vascular defects in MosMes-Pik3caH1047R (T-CreERT2;R26R-Pik3caH1047R) embryos are similar to the previously reported phenotypes of ubiquitous or EC-specific expression of Pik3caH1047R in the developing embryo (Hare et al, 2015),” suggesting a high mortality rate when Pik3caH1047R is expressed using Tie2-Cre. Moreover, according to Hare et al., analysis of 250 Tie2-Cre; R26R-Pik3caH1047R embryos revealed that all were lethal by E11.5. Thus, considering our results in conjunction with those from previous studies, it appears that expression of Pik3caH1047R in the mesoderm or endothelial cells during embryonic development results in the death of most embryos before birth.

      We have supplemented the Results section with the following details:

      Since the standard tamoxifen dose (125 mg/kg body weight) leads to miscarriage or embryonic death within 1–2 days, we diluted it to one-fifth of the original concentration. (Pages 5-6, lines 175-177)

      scRNAseq was performed at E13.5 (Fig 4). It's mandatory to perform the same analysis at E16.5, which corresponds to the phenotypic analysis shown in Fig 3. This experiment is required to understand how hypoxia and glycolysis genes change during the development of the vascular malformation.

      Response:

      Thank you very much for your comment. First, regarding the experiments using Isl1‐Cre, we would like to clarify that the survival aspect was not adequately addressed. Our Isl1‐Cre embryos die between E13.5 and E14.0, which makes it practically impossible to perform single‐cell analysis beyond this stage (please refer to the newly added Figure 4N). Similarly, for experiments using CDH5‐CreERT2, the limited number of embryos obtained renders further analysis extremely challenging. Additionally, we have supplemented the Results section with the following description:

      These Isl1-Cre; R26R-Pik3caH1047R mutant embryos likely died from facial hemorrhaging between E13.5 and E14.0 (Figure 3N). (Page 7, lines 236-237)

      Further analysis at later embryonic stages proved challenging. Consequently, we aimed to investigate the effects of Pik3caH1047R on endothelial cells by comparing gene expression at E10.5 with that at E13.5. We performed single‐cell RNA sequencing on E10.5 embryos from both the control (Isl1-Cre; R26R-eYFP) and mutant (Isl1-Cre; R26R-eYFP; R26R-Pik3caH1047R) embryos. Unfortunately, the quality of both datasets was insufficient for reliable analysis. In the control sample, only 40.3% of reads were assigned to cell‐associated barcodes—substantially below the ideal threshold of >70%—with an estimated 790 cells and a median of 598 genes per cell. Similarly, in the mutant sample, only 37.0% of reads were associated with cells, despite an estimated cell count of 7,326 and a median of only 526 genes per cell. These metrics indicate that both datasets were severely compromised by high levels of ambient RNA or by a significant number of cells with low RNA content, precluding robust downstream analysis. This may be due to the fact that immature cells are particularly susceptible to damage incurred during FACS sorting and transportation to the analysis facility. Moreover, the relatively low number of control endothelial cells at E13.5 led us to conclude that performing similar experiments at earlier stages would be difficult. Despite our best efforts, we acknowledge this as a limitation of the present study.
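      The quality argument above can be made concrete with a minimal sketch. It uses only the metrics quoted in this response (fraction of reads assigned to cell-associated barcodes, estimated cell count, median genes per cell) and the >70% guideline mentioned above. The class name, function name and sample labels are hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass

# Guideline quoted in the response: ideally more than 70% of reads
# should be assigned to cell-associated barcodes.
MIN_FRACTION_READS_IN_CELLS = 0.70


@dataclass
class SampleQC:
    """Summary metrics for one scRNA-seq sample (hypothetical container)."""
    name: str
    fraction_reads_in_cells: float
    estimated_cells: int
    median_genes_per_cell: int


def passes_read_fraction(sample: SampleQC,
                         threshold: float = MIN_FRACTION_READS_IN_CELLS) -> bool:
    """Return True if the sample exceeds the read-fraction guideline."""
    return sample.fraction_reads_in_cells > threshold


# Values as reported in the response for the E10.5 datasets.
samples = [
    SampleQC("control_E10.5", 0.403, 790, 598),
    SampleQC("mutant_E10.5", 0.370, 7326, 526),
]

qc_results = {s.name: passes_read_fraction(s) for s in samples}
for name, ok in qc_results.items():
    print(f"{name}: {'pass' if ok else 'fail'}")
```

      Both samples fall below the guideline, which is consistent with the decision described above not to analyze them further. A single threshold is a simplification; in practice several metrics are usually considered together.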

      Lines 326-343. In this section the authors provide pharmacological evidence that HIF and VEGFA are involved in vascular malformation caused by H1047R. However, I'm surprised by the efficacy of bevacizumab, which neutralizes human but not murine VEGFA. Genentech has developed the B20 mAb, which specifically neutralizes murine VEGFA. So the data shown require clarification by the authors, and the experiments must be done with the appropriate reagent. Furthermore, what are the pharmacokinetics of these compounds when topically applied?

      Response:

      Thank you very much for your comment. There are reports that bevacizumab exerts an in vivo inhibitory effect on neovascularization mediated by mouse Vegf-A (Lin et al, 2022). However, given the contentious nature of this issue, we conducted additional experiments. Because obtaining the B20 mAb from Genentech requires an MTA, and considering the time constraints during revision, we opted to use a neutralizing antibody against mouse VEGF-A (clone 2G11) instead. This antibody has been shown to suppress the proliferation of mouse vascular endothelial cells in vivo (Mashima et al, 2021; Wuest & Carr, 2010).

      The dosing regimen for 2G11 was determined based on previous studies (Surve et al, 2024; Churchill et al, 2022). Moreover, an example of effective local administration is provided in Nagao et al (2017). Since this product is an antibody drug, it is metabolized and does not function as a prodrug. Although the precise half-life of 2G11 is unknown, rat IgG2a antibodies generally have a circulating half-life of approximately 7–10 days in rats. When administered to mice, the half-life is often significantly reduced due to interspecies differences in neonatal Fc receptor (FcRn) binding affinity, with estimates in murine models typically around 2–4 days (Abdiche et al, 2015; Medesan et al, 1998). In our model, however, the injection is subcutaneous, almost equivalent to an intradermal injection (Figure 6B, C). Because this method is expected to provide a more sustained, slow-release effect (similar to the tuberculin reaction), the half-life should be longer than that achieved with intravenous administration. Consequently, we believe that sufficient efficacy is maintained in this model.

      Regarding LW-6:

      LW-6 is a small molecule that, due to its hydrophobic nature, is believed to freely cross cell membranes. Once inside the cell, it facilitates the degradation of HIF-1α, leading to reduced expression of its downstream targets (Lee et al, 2010). Although its half-life is estimated to be around 30 minutes, the active metabolites may exert sustained secondary effects (Lee et al, 2021). When administered intravenously, peak blood concentrations are reached within 5 minutes, making Cmax a critical parameter due to the rapid onset of action. In our experiments, we based the dosing regimen on previous studies (Lee et al, 2010; Song et al, 2016; Xu et al, 2022, 2024). While those studies administered doses comparable to or twice as high as ours via intravenous, intraperitoneal, or oral routes, our experimental design—in which a single dose was administered on Day 4 and samples were collected on Day 7—necessitated a single-dose protocol.
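      As a rough illustration of what a half-life of about 30 minutes implies, the sketch below computes the fraction of parent compound remaining over time. It assumes simple first-order (single-exponential) elimination, which is not stated in the response, and it ignores the active metabolites mentioned above; the function name is hypothetical.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of compound left after elapsed_min minutes.

    Assumes first-order elimination: the amount halves every half_life_min.
    """
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    if elapsed_min < 0:
        raise ValueError("elapsed_min must be non-negative")
    return 0.5 ** (elapsed_min / half_life_min)


# Half-life of roughly 30 minutes, as estimated in the response for LW-6.
for minutes in (30, 60, 180):
    print(f"{minutes:>3} min: {fraction_remaining(minutes, 30):.4f}")
```

      Under this assumption only a small fraction of the parent compound would remain after a few hours, which is why peak concentration and any sustained effects of metabolites matter for a single-dose protocol.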

      Regarding Rapamycin:

      Several studies have demonstrated that local administration yields anti-inflammatory effects (Takayama et al, 2014; Tyler et al, 2011). Similar outcomes have been observed in vascular malformations (Boscolo et al, 2015; Martinez-Corral et al, 2020). Although the half-life of rapamycin is estimated to be approximately 6 hours following intravenous administration, it may be even shorter (Comas et al, 2012; Popovich et al, 2014).

      In light of these comments, we have revised Figure 6. Furthermore, the Results section pertaining to Figure 6 has been updated as follows:

      Hif-1α and Vegf-A inhibitors suppress the progression of vascular malformations.

      We next examined whether administering Hif-1α and Vegf-A inhibitors could effectively treat vascular malformations. Tamoxifen was administered to 3–4-week-old CDH5-CreERT2;R26R-Pik3caH1047R mice to induce mutations in the dorsal skin. Anti-VEGF-A, a Vegf-A neutralizing antibody; LW6, a Hif-1α inhibitor; and rapamycin, an mTOR inhibitor, were topically applied, and their effects were analyzed (Figure 6A). Both anti-VEGF-A and LW6 reduced the visible swelling in the dorsal skin, whereas the difference between the drug-treated and control groups was less pronounced with rapamycin (Figure 6B). In tamoxifen-treated Cre(–) mice, inflammatory cell infiltration and fibrosis were observed from the dermis to the subcutaneous tissue; however, there were no changes in the number of PECAM⁺ vasculatures or VEGFR3⁺ lymphatic vessels, including their enlarged forms, compared to the untreated control (Figure 6C–E). In contrast, tamoxifen administration to CDH5-CreERT2;R26R-Pik3caH1047R mice resulted in an increase in these vascular structures by day 4 (Figure 6C–E). At day 7, comparing mice with or without treatment using anti-VEGF-A, LW6, or rapamycin, the number of PECAM⁺ vasculatures was reduced in the treated groups; however, in the rapamycin group, the number of enlarged PECAM⁺ vasculatures did not differ from that in the untreated group (Figure 6F–M). Similarly, for VEGFR3⁺ lymphatic vessels, both anti-VEGF-A and LW6 induced a reduction, whereas rapamycin did not produce a statistically significant decrease (Figure 6N–U). (Page 11, lines 363-381)

      **Referees cross-commenting**

      The issues raised by referee #1 related to the phenotype analysis are right. In my opinion the Isl1 model proposed here mimics human pathology well, even if the vascular damage at the head is not so evident.

      Response:

      Perhaps the discrepancy arises from a terminological issue. According to the WHO Classification of Tumours, commonly used in clinical settings, the term "Head and Neck" refers to the facial and cervical regions (including the oral cavity, larynx, pharynx, salivary glands, nasal cavity, etc.) and excludes the central nervous system. The inclusion of the brain in Figure 1O-R may have led to some confusion. We included the brain because cerebral cavernous malformations are classified as venous malformations, and thus serve as an example of common sites for venous malformations in humans.

      To clarify this point, we have made slight revisions to the first part of the Introduction, as follows:

      They frequently manifest in the head and neck region, here defined as the orofacial and cervical areas, excluding the brain. (Page 2, lines 52-53)

      Reviewer #2 (Significance (Required)):

      General assessment

      STRENGTH : a new mouse model seems to well recapitulate human vascular malformation. Possible key molecules have been identified

      WEAKNESS. The pharmacological approach to support the role of VEGFA and HIF is not appropriate

      References for the review:

      Abdiche YN, Yeung YA, Chaparro-Riggers J, Barman I, Strop P, Chin SM, Pham A, Bolton G, McDonough D, Lindquist K, et al (2015) The neonatal Fc receptor (FcRn) binds independently to both sites of the IgG homodimer with identical affinity. mAbs 7: 331–343

      Alsuwailem A, Myer CM & Chaudry G (2020) Vascular anomalies of the head and neck. Semin Pediatr Surg 29: 150968

      Boscolo E, Limaye N, Huang L, Kang K-T, Soblet J, Uebelhoer M, Mendola A, Natynki M, Seront E, Dupont S, et al (2015) Rapamycin improves TIE2-mutated venous malformation in murine model and human subjects. J Clin Investig 125: 3491–3504

      Castillo SD, Tzouanacou E, Zaw-Thin M, Berenjeno IM, Parker VER, Chivite I, Milà-Guasch M, Pearce W, Solomon I, Angulo-Urarte A, et al (2016) Somatic activating mutations in Pik3ca cause sporadic venous malformations in mice and humans. Sci Transl Med 8: 332ra43

      Churchill MJ, Bois H du, Heim TA, Mudianto T, Steele MM, Nolz JC & Lund AW (2022) Infection-induced lymphatic zippering restricts fluid transport and viral dissemination from skin. J Exp Med 219: e20211830

      Comas M, Toshkov I, Kuropatwinski KK, Chernova OB, Polinsky A, Blagosklonny MV, Gudkov AV & Antoch MP (2012) New nanoformulation of rapamycin Rapatar extends lifespan in homozygous p53−/− mice by delaying carcinogenesis. Aging (Albany NY) 4: 715–722

      Dellinger MT & Brekken RA (2011) Phosphorylation of Akt and ERK1/2 Is Required for VEGF-A/VEGFR2-Induced Proliferation and Migration of Lymphatic Endothelium. PLoS ONE 6: e28947

      Graupera M, Guillermet-Guibert J, Foukas LC, Phng L-K, Cain RJ, Salpekar A, Pearce W, Meek S, Millan J, Cutillas PR, et al (2008) Angiogenesis selectively requires the p110α isoform of PI3K to control endothelial cell migration. Nature 453: 662–666

      Gupta S, Ramjaun AR, Haiko P, Wang Y, Warne PH, Nicke B, Nye E, Stamp G, Alitalo K & Downward J (2007) Binding of Ras to Phosphoinositide 3-Kinase p110α Is Required for Ras- Driven Tumorigenesis in Mice. Cell 129: 957–968

      Hare LM, Schwarz Q, Wiszniak S, Gurung R, Montgomery KG, Mitchell CA & Phillips WA (2015) Heterozygous expression of the oncogenic Pik3ca H1047R mutation during murine development results in fatal embryonic and extraembryonic defects. Dev Biol 404: 14–26

      Hong Y, Lange‐Asschenfeldt B, Velasco P, Hirakawa S, Kunstfeld R, Brown LF, Bohlen P, Senger DR & Detmar M (2004) VEGF‐A promotes tissue repair‐associated lymphatic vessel formation via VEGFR‐2 and the α1β1 and α2β1 integrins. FASEB J 18: 1111–1113

      Hu H, Juvekar A, Lyssiotis CA, Lien EC, Albeck JG, Oh D, Varma G, Hung YP, Ullas S, Lauring J, et al (2016) Phosphoinositide 3-Kinase Regulates Glycolysis through Mobilization of Aldolase from the Actin Cytoskeleton. Cell 164: 433–446

      Jauhiainen S, Ilmonen H, Vuola P, Rasinkangas H, Pulkkinen HH, Keränen S, Kiema M, Liikkanen JJ, Laham-Karam N, Laidinen S, et al (2023) ErbB signaling is a potential therapeutic target for vascular lesions with fibrous component. eLife 12: e82543

      Larue L & Bellacosa A (2005) Epithelial–mesenchymal transition in development and cancer: role of phosphatidylinositol 3′ kinase/AKT pathways. Oncogene 24: 7443–7454

      Lee JW & Chung HY (2018) Vascular anomalies of the head and neck: current overview. Arch Craniofacial Surg 19: 243–247

      Lee K, Kang JE, Park S-K, Jin Y, Chung K-S, Kim H-M, Lee K, Kang MR, Lee MK, Song KB, et al (2010) LW6, a novel HIF-1 inhibitor, promotes proteasomal degradation of HIF-1α via upregulation of VHL in a colon cancer cell line. Biochem Pharmacol 80: 982–989

      Lee K, Lee J-Y, Lee K, Jung C-R, Kim M-J, Kim J-A, Yoo D-G, Shin E-J & Oh S-J (2021) Metabolite Profiling and Characterization of LW6, a Novel HIF-1α Inhibitor, as an Antitumor Drug Candidate in Mice. Molecules 26: 1951

      Lin Y, Dong M, Liu Z, Xu M, Huang Z, Liu H, Gao Y & Zhou W (2022) A strategy of vascular‐targeted therapy for liver fibrosis. Hepatology 76: 660–675

      Lupu I-E, Kirschnick N, Weischer S, Martinez-Corral I, Forrow A, Lahmann I, Riley PR, Zobel T, Makinen T, Kiefer F, et al (2022) Direct specification of lymphatic endothelium from non-venous angioblasts. Biorxiv: 2022.05.11.491403

      Martinez-Corral I, Zhang Y, Petkova M, Ortsäter H, Sjöberg S, Castillo SD, Brouillard P, Libbrecht L, Saur D, Graupera M, et al (2020) Blockade of VEGF-C signaling inhibits lymphatic malformations driven by oncogenic PIK3CA mutation. Nat Commun 11: 2869

      Maruyama K, Miyagawa-Tomita S, Haneda Y, Kida M, Matsuzaki F, Imanaka-Yoshida K & Kurihara H (2022) The cardiopharyngeal mesoderm contributes to lymphatic vessel development in mouse. Elife 11

      Maruyama K, Miyagawa-Tomita S, Mizukami K, Matsuzaki F & Kurihara H (2019) Isl1-expressing non-venous cell lineage contributes to cardiac lymphatic vessel development. Dev Biol 452: 134–143

      Maruyama K, Naemura K, Arima Y, Uchijima Y, Nagao H, Yoshihara K, Singh MK, Uemura A, Matsuzaki F, Yoshida Y, et al (2021) Semaphorin3E-PlexinD1 signaling in coronary artery and lymphatic vessel development with clinical implications in myocardial recovery. Iscience: 102305

      Mashima T, Wakatsuki T, Kawata N, Jang M-K, Nagamori A, Yoshida H, Nakamura K, Migita T, Seimiya H & Yamaguchi K (2021) Neutralization of the induced VEGF-A potentiates the therapeutic effect of an anti-VEGFR2 antibody on gastric cancer in vivo. Sci Rep 11: 15125

      Medesan C, Cianga P, Mummert M, Stanescu D, Ghetie V & Ward ES (1998) Comparative studies of rat IgG to further delineate the Fc : FcRn interaction site. Eur J Immunol 28: 2092–2100

      Nagao M, Hamilton JL, Kc R, Berendsen AD, Duan X, Cheong CW, Li X, Im H-J & Olsen BR (2017) Vascular Endothelial Growth Factor in Cartilage Development and Osteoarthritis. Sci Rep 7: 13027

      Nair SC (2018) Vascular Anomalies of the Head and Neck Region. J Maxillofac Oral Surg 17: 1–12

      Popovich IG, Anisimov VN, Zabezhinski MA, Semenchenko AV, Tyndyk ML, Yurova MN & Blagosklonny MV (2014) Lifespan extension and cancer prevention in HER-2/neu transgenic mice treated with low intermittent doses of rapamycin. Cancer Biol Ther 15: 586–592

      Ryu JY, Chang YJ, Lee JS, Choi KY, Yang JD, Lee S-J, Lee J, Huh S, Kim JY & Chung HY (2023) A nationwide cohort study on incidence and mortality associated with extracranial vascular malformations. Sci Rep 13: 13950

      Sadick M, Wohlgemuth WA, Huelse R, Lange B, Henzler T, Schoenberg SO & Sadick H (2017) Interdisciplinary Management of Head and Neck Vascular Anomalies: Clinical Presentation, Diagnostic Findings and Minimalinvasive Therapies. Eur J Radiol Open 4: 63–68

      Singh AM, Reynolds D, Cliff T, Ohtsuka S, Mattheyses AL, Sun Y, Menendez L, Kulik M & Dalton S (2012) Signaling Network Crosstalk in Human Pluripotent Cells: A Smad2/3-Regulated Switch that Controls the Balance between Self-Renewal and Differentiation. Cell Stem Cell 10: 312–326

      Song JG, Lee YS, Park J-A, Lee E-H, Lim S-J, Yang SJ, Zhao M, Lee K & Han H-K (2016) Discovery of LW6 as a new potent inhibitor of breast cancer resistance protein. Cancer Chemother Pharmacol 78: 735–744

      Srinivasan RS, Dillard ME, Lagutin OV, Lin F-J, Tsai S, Tsai M-J, Samokhvalov IM & Oliver G (2007) Lineage tracing demonstrates the venous origin of the mammalian lymphatic vasculature. Gene Dev 21: 2422–2432

      Stanczuk L, Martinez-Corral I, Ulvmar MH, Zhang Y, Laviña B, Fruttiger M, Adams RH, Saur D, Betsholtz C, Ortega S, et al (2015) cKit Lineage Hemogenic Endothelium-Derived Cells Contribute to Mesenteric Lymphatic Vessels. Cell Reports 10: 1708–1721

      Stone OA & Stainier DYR (2019) Paraxial Mesoderm Is the Major Source of Lymphatic Endothelium. Dev Cell 50: 247-255.e3

      Surve CR, Duran CL, Ye X, Chen X, Lin Y, Harney AS, Wang Y, Sharma VP, Stanley ER, Cox D, et al (2024) Signaling events at TMEM doorways provide potential targets for inhibiting breast cancer dissemination. bioRxiv: 2024.01.08.574676

      Takayama K, Kawakami Y, Kobayashi M, Greco N, Cummins JH, Matsushita T, Kuroda R, Kurosaka M, Fu FH & Huard J (2014) Local intra-articular injection of rapamycin delays articular cartilage degeneration in a murine model of osteoarthritis. Arthritis Res Ther 16: 482

      Tyler B, Wadsworth S, Recinos V, Mehta V, Vellimana A, Li K, Rosenblatt J, Do H, Gallia GL, Siu I-M, et al (2011) Local delivery of rapamycin: a toxicity and efficacy study in an experimental malignant glioma model in rats. Neuro-Oncol 13: 700–709

      Wuest TR & Carr DJJ (2010) VEGF-A expression by HSV-1–infected cells drives corneal lymphangiogenesis. J Exp Med 207: 101–115

      Xu H, Chen Y, Li Z, Zhang H, Liu J & Han J (2022) The hypoxia-inducible factor 1 inhibitor LW6 mediates the HIF-1α/PD-L1 axis and suppresses tumor growth of hepatocellular carcinoma in vitro and in vivo. Eur J Pharmacol 930: 175154

      Xu J, Lamouille S & Derynck R (2009) TGF-β-induced epithelial to mesenchymal transition. Cell Res 19: 156–172

      Xu Q, Liu H, Ye Y, Wuren T & Ge R (2024) Effects of different hypoxia exposure on myeloid-derived suppressor cells in mice. Exp Mol Pathol 140: 104932

      Yu JSL, Ramasamy TS, Murphy N, Holt MK, Czapiewski R, Wei S-K & Cui W (2015) PI3K/mTORC2 regulates TGF-β/Activin signalling by modulating Smad2/3 activity via linker phosphorylation. Nat Commun 6: 7212

      Zenner K, Cheng CV, Jensen DM, Timms AE, Shivaram G, Bly R, Ganti S, Whitlock KB, Dobyns WB, Perkins J, et al (2019) Genotype correlates with clinical severity in PIK3CA-associated lymphatic malformations. Jci Insight 4

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, Dong et al. study the directed cell migration of tracheal stem cells in Drosophila pupae. The migration of these cells which are found in two nearby groups of cells normally happens unidirectionally along the dorsal trunk towards the posterior. Here, the authors study how this directionality is regulated. They show that inter-organ communication between the tracheal stem cells and the nearby fat body plays a role. They provide compelling evidence that Upd2 production in the fat body and JAK/STAT activation in the tracheal stem cells play a role. Moreover, they show that JAK/STAT signalling might induce the expression of apicobasal and planar cell polarity genes in the tracheal stem cells which appear to be needed to ensure unidirectional migration. Finally, the authors suggest that trafficking and vesicular transport of Upd2 from the fat body towards the tracheal cells might be important.

      Strengths:

      The manuscript is well written. This novel work demonstrates a likely link between Upd2-JAK/STAT signalling in the fat body and tracheal stem cells and the control of unidirectional cell migration of tracheal stem cells. The authors show that hid+rpr or Upd2 RNAi expression in the fat body or Dome RNAi, Hop RNAi, or STAT92E RNAi expression in tracheal stem cells results in aberrant migration of some of the tracheal stem cells towards the anterior. Using ChIP-seq as well as analysis of GFP-protein trap lines of planar cell polarity genes in combination with RNAi experiments, the authors show that STAT92E likely regulates the transcription of planar cell polarity genes and some apicobasal cell polarity genes in tracheal stem cells which appear to be needed for unidirectional migration. Moreover, the authors hypothesise that extracellular vesicle transport of Upd2 might be involved in this Upd2-JAK/STAT signalling in the fat body and tracheal stem cells, which, if true, would be quite interesting and novel.

      Overall, the work presented here provides some novel insights into the mechanism that ensures unidirectional migration of tracheal stem cells that prevents bidirectional migration. This might have important implications for other types of directed cell migration in invertebrates or vertebrates including cancer cell migration.

      Weaknesses:

      It remains unclear to what extent Upd2-JAK/STAT signalling regulates unidirectional migration. While there seems to be a consistent phenotype upon genetic manipulation of Upd2-JAK/STAT signalling and planar cell polarity genes, as in the aberrant anterior migration of a fraction of the cells, the phenotype seems to be rather mild, with the majority of cells migrating towards the posterior.

      We agree that the phenotype is mild, as perturbing JAK/STAT signaling in the progenitors specifically affects the coordinated migration of the cells rather than altering their direction or completely blocking migration. Our data indicate that inter-organ communication ensures coordinated behavior of the progenitor cells, although the differential responses exhibited by individual cells represent an interesting unresolved issue that awaits future in-depth investigation.

      While I am not an expert on extracellular vesicle transport, the data presented here regarding Upd2 being transported in extracellular vesicles do not appear to be very convincing.

      We performed additional PLA experiments which support the interaction between Upd2 and the core components of extracellular vesicles (revised Figure 8). Furthermore, we performed electron microscopy to visualize the Lbm-containing vesicles in fat body (Figure 8-figure supplement 1D).

      These data are now provided in the revised manuscript.

      Major comments:

      (1) The graphs showing the quantification of anterior (and in some cases also posterior migration) are quite confusing. E.g. Figure 1F (and 5E and all others): These graphs are difficult to read because the quantification for the different conditions is not shown separately. E.g. what is the migration distance for Fj RNAi anterior at 3h in Fig5E? Around -205micron (green plus all the other colors) or around -70micron (just green, even though the green bar goes to -205micron). If it's -205micron, then the images in C' or D' do not seem to show this strong phenotype. If it's around -70, then the way the graph shows it is misleading, because some readers will interpret the result as -205. Moreover, it's also not clear what exactly was quantified and how it was quantified. The details are also not described in the methods. It would be useful, to mark with two arrowheads in the image (e.g. 5 A' -D') where the migration distance is measured (anterior margin and point zero).

      Overall, it would be better, if the graph showed the different conditions separately. Also, n numbers should be shown in the figure legend for all graphs.

      We apologize for the unclear presentation and insufficient description, and thank you for kindly pointing them out. We used different colors to represent different genotypes, and the columns were superimposed. In the revised figures, we now show the quantification for the different conditions separately. The anterior migration distance for Fj RNAi is around 70 µm.

      We have now provided a detailed description in the revised Methods. To measure migration distance, we took snapshots at 0 h, 1 h, 2 h and 3 h, and measured the distance from the starting point (the junction of TC and DT) to the leading edge of the progenitor clusters. Velocity was calculated as v = d/t, with d in micrometers and t in minutes. As you kindly suggested, we have indicated the anterior margin and point zero in the corresponding panels. We have added n numbers to the legends.
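      The velocity calculation described in this response can be sketched as follows. This is only an illustration of v = d/t with d in micrometers and t in minutes; the function name is hypothetical, and the example uses the approximate 70 µm anterior distance quoted above for Fj RNAi, assuming it refers to the 3 h time point.

```python
def migration_velocity(distance_um: float, elapsed_min: float) -> float:
    """Mean velocity in micrometers per minute (v = d / t)."""
    if elapsed_min <= 0:
        raise ValueError("elapsed_min must be positive")
    return distance_um / elapsed_min


# Approximate anterior migration distance quoted in the response (~70 um),
# assumed here to have been measured at the 3 h snapshot (180 min).
velocity = migration_velocity(70.0, 180.0)
print(f"{velocity:.3f} um/min")
```

      Measuring from a fixed point zero at each snapshot gives a mean velocity over the whole interval; velocities between successive snapshots would instead require the difference in distance between those snapshots.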

      (2) Figure 2-figure supplement 1: C-L and M: From these images and graph it appears that Upd2 RNAi results in no aberrant anterior migration. Why is this result different from Figures 2D-F where it does?

      The fat body-expressing lsp2-Gal4 was used in Figure 2-figure supplement 1C-L and Figure 2D-F, while trachea-specific btl-Gal4 was used in Figure 2-figure supplement 1K-L. The lsp2-Gal4-driven but not btl-Gal4-driven upd2 RNAi causes aberrant anterior migration, suggesting that fat body-derived Upd2 plays a role. We have further clarified this in the text.

      (3) Figure 5F: The data on the localisation of planar cell polarity proteins in the tracheal stem cell group is rather weak. Figure 5G and J should at least be quantified for several animals of the same age for each genotype. Is there overall more Ft-GFP in the cells on the posterior end of the cell group than on the opposite side? Or is there a more classic planar cell polarity in each cell with Ft-GFP facing to the posterior side of the cell in each cell? Maybe it would be more convincing if the authors assessed what the subcellular localisation of Ft is through the expression of Ft-GFP in clones to figure out whether it localises posteriorly or anteriorly in individual cells.

      We staged the animals, measured several animals for each genotype and provided the quantifications in the revised manuscript. The level of Ft-GFP is higher in the cells at the frontal edge. We tried to examine the expression of Ft-GFP at the single-cell level. However, this turned out to be technically difficult because the tracheal stem cells are not as regularly arranged as epithelial cells and their proximal-distal axis remains unclear. We therefore decided to measure the fluorescence signal of groups of stem cells along the DT, regardless of the polarity of individual cells.

      (4) Regarding the trafficking of Upd2 in the fat body, is it known, whether Grasp65, Lbm, Rab5, and 7 are specifically needed for extracellular vesicle trafficking rather than general intracellular trafficking? What is the evidence for this?

      In our experiments, knocking down rab5, rab7, grasp65 or lbm in the trachea using btl-Gal4 did not cause abnormalities in the disciplined migration, which excludes an intracellular contribution in the trachea (Figure 7-figure supplement 1). Perturbation of Grasp65 or Lbm in the fat body increased intracellular Upd2-containing vesicles, indicating that intracellular production is functional (Figure 6J). Grasp65 is specifically required for Upd2 production. Lbm, Rab5 and Rab7 are important for vesicle trafficking. Our conclusion does not pertain specifically to either the extracellular or the intracellular compartment.

      (5) Figure 8A-B: The data on the proximity of Rab5 and 7 to the Upd2 blobs are not very convincing.

      The confocal images indicate the proximity of Rab5 and Rab7 to the Upd2 vesicles. We interpret the proximity together with the results from Co-IP and PLA data (Figure 8E-K).

      (6) The authors should clarify whether or not their work has shown that "vesicle-mediated transport of ligands is essential for JAK/STAT signaling". In its current form, this manuscript does not appear to provide enough evidence for extracellular vesicle transport of Upd2.

      Lbm belongs to the tetraspanin protein family, whose members contain four transmembrane domains and are principal components of extracellular vesicles. We show that Lbm interacts with Upd2. JAK/STAT signaling depends on Upd2 in the fat body as well as on the vesicle trafficking machinery. Furthermore, we performed electron microscopy and show the presence of Lbm-containing vesicles in the fat body (Figure 8-figure supplement 1D).

      (7) What is the long-term effect of the various genetic manipulations on migration? The authors don't show what the phenotype at later time points would be, regarding the longer-term migration behaviour (e.g. at 10h APF when the cells should normally reach the posterior end of the pupa). And what is the overall effect of the aberrant bidirectional migration phenotype on tracheal remodelling?

      We observed that the integrity of the tracheal network, especially the dorsal trunk, was impaired, which may be due to incomplete regeneration (Figure 3-figure supplement 1E-I).

      (8) The RNAi experiments in this manuscript are generally done using a single RNAi line. To rule out off-target effects, it would be important to use two non-overlapping RNAi lines for each gene.

      We validated the phenotype using several independent RNAi alleles.

      Reviewer #2 (Public review):

      Summary:

      This work by Dong and colleagues investigates the directed migration of tracheal stem cells in Drosophila pupae, essential for tissue homeostasis. These cells, found in two nearby groups, migrate unidirectionally along the dorsal trunk towards the posterior to replenish degenerating branches that disperse the FGF mitogen. The authors show that inter-organ communication between tracheal stem cells and the neighboring fat body controls this directionality. They propose that the fat body-derived cytokine Upd2 induces JAK/STAT signaling in tracheal progenitors, maintaining their directional migration. Disruption of Upd2 production or JAK/STAT signaling results in erratic, bidirectional migration. Additionally, JAK/STAT signaling promotes the expression of planar cell polarity genes, leading to asymmetric localization of Fat in progenitor cells. The study also indicates that Upd2 transport depends on Rab5- and Rab7-mediated endocytic sorting and Lbm-dependent vesicle trafficking. This research addresses inter-organ communication and vesicular transport in the disciplined migration of tracheal progenitors.

      Strengths:

      This manuscript presents extensive and varied experimental data to show a link between Upd2-JAK/STAT signaling and tracheal progenitor cell migration. The authors provide convincing evidence that the fat body, located near the trachea, secretes vesicles containing the Upd2 cytokine. These vesicles reach tracheal progenitors and activate the JAK-STAT pathway, which is necessary for their polarized migration. Using ChIP-seq, GFP-protein trap lines of planar cell polarity genes, and RNAi experiments, the authors demonstrate that STAT92E likely regulates the transcription of planar cell polarity genes and some apicobasal cell polarity genes in tracheal stem cells, which seem to be necessary for unidirectional migration.

      Weaknesses:

      Directional migration of tracheal progenitors is only partially compromised, with some cells migrating anteriorly and others maintaining their posterior migration.

      Our results suggest that Upd2-JAK/STAT signaling is required for the consistency of disciplined migration. Although only a few tracheal progenitors display anterior migration, these cells lose their commitment to directional movement. We acknowledge that the phenotype is moderate.

      Additionally, the authors do not examine the potential phenotypic consequences of this defective migration.

      We examined the long-term effects of the aberrant migration and observed an impairment of tracheal integrity and melanized tracheal branches (Figure 3-figure supplement 1E-I).

      It is not clear whether the number of tracheal progenitors remains unchanged in the different genetic conditions. If there are more cells, this could affect their localization rather than migration and may change the proposed interpretation of the data.

      We examined the progenitor cell number in the bidirectional movement samples and the control group. The results show no significant difference in cell number between the two groups (Figure 3-figure supplement 1).

      Upd2 transport by vesicles is not convincingly shown.

      We performed additional PLA experiments to further support the interaction between Upd2 and the core components of extracellular vesicles. Furthermore, we performed electron microscopy and show the presence of Lbm-containing vesicles in the fat body (Figure 8-supplement 1D). Additional experiments, such as colocalization and Co-IP assays, and better quantification are provided in the revised manuscript (see revised Figure 8).

      Data presentation is confusing and incomplete.

      We had used different colors to represent different genotypes, and the columns were superimposed. We have changed the graphs to show the quantification for each condition separately, and revised the data presentation to avoid confusion.

      Reviewer #3 (Public review):

      Summary:

      Dong et al tackle the mechanism leading to polarized migration of tracheal progenitors during Drosophila metamorphosis. This work fits in the stem cell research field and its crucial role in growth and regeneration. While it has been previously reported by others that tracheal progenitors migrate in response to FGF and Insulin signals emanating from the fat body in order to regenerate tracheal branches, the authors identified an additional mechanism involved in the communication of the fat body and tracheal progenitors.

      Strengths:

      The data presented were obtained using a wide range of complementary techniques combining genetics, molecular biology, quantitative, and live imaging techniques. The authors provide convincing evidence that the fat body, found in close proximity to the trachea, secrete vesicles containing the Upd2 cytokine that reach tracheal progenitors leading to JAK-STAT pathway activation, which is required for their polarized migration. In addition, the authors show that genes regulating planar cell polarity are also involved in this inter-organ communication.

      Weaknesses:

      (1) Affecting this inter-organ communication leads to a quite discrete phenotype where polarized migration of tracheal progenitors is partially compromised. The study lacks data showing the consequences of this phenotype on the final trachea morphology, function, and/or regeneration capacities at later pupal and adult stages. This could potentially increase the significance of the findings.

      Following your suggestion, we examined the long-term effects of the aberrant migration and observed an impairment of tracheal integrity and melanized tracheal branches (Figure 3-figure supplement 1E-I).

      (2) The conclusions of this paper are mostly well supported by data, but some aspects of data acquisition and analysis need to be clarified and corrected, such as recurrent errors in plotting of tracheal progenitor migration distance that mislead the reader regarding the severity of the phenotype.

      We had used different colors to represent different genotypes, and the columns were superimposed. We have changed the graphs to show the quantification for each condition separately. Thank you for pointing this out.

      (3) The number of tracheal progenitors should be assessed since they seem to be found in excess in some genetic conditions that affect their behavior. A change in progenitor number could lead to crowding, thus affecting their localization rather than migration capacities, thereby changing the proposed interpretation. In addition, the authors show data suggesting a reduced progenitor migration speed when the fat body is affected, which would also be consistent with a crowding of progenitors.

      We examined cell number and cell proliferation in the bidirectional movement samples and the control group, and observed no significant difference between them (Figure 3-figure supplement 2).

      (4) The authors claim that tracheal progenitors display a polarized distribution of PCP proteins that is controlled by JAK-STAT signaling. However, this conclusion is made from a single experiment that is not quantified and for which there is no explanation of how the plot profile measurements were performed. It also seems that this experiment was done only once. Altogether, this is insufficient to support the claim. Finally, a quantification of the number of posterior edges presenting filopodia rather than the number of filopodia at the anterior and posterior leading edges would be more appropriate.

      We staged the animals, measured several animals for each genotype and provided the quantifications in the revised manuscript. The level of Ft-GFP is higher in the cells at the frontal edge. We tried to examine the expression of Ft-GFP at the single-cell level. However, this turned out to be difficult because the tracheal stem cells are not as regularly patterned as epithelial cells and their proximal-distal axis is not well defined. We therefore decided to measure the fluorescence signal of groups of stem cells along the DT, regardless of their individual polarity.

      (5) The authors demonstrate that Upd2 is transported through vesicles from the fat body to the tracheal progenitors where they propose they are internalized. Since the Upd2 receptor Dome ligand binding sites are exposed to the extracellular environment, it is difficult to envision in the proposed model how Upd2 would be released from vesicles to bind Dome extracellularly and activate the JAK-STAT pathway. Moreover, data regarding the mechanism of the vesicular transport of Upd2 are not fully convincing since the PLA experiments between Upd2 and Rab5, Rab7, and Lbm are not supported by proper positive and negative controls and co-immunoprecipitation data in the main figure do not always correlate to the raw data.

      We used molecular modeling to show that Upd2 and Lbm intermingle, and that Upd2 is not entirely encapsulated in vesicles (Figure 8-supplement 1E). We performed PLA experiments using animals not expressing upd2-Cherry as a negative control (Figure 8E-J). We have corrected the Co-IP panel and apologize for this error.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Minor comments:

      (1) Figure 1-figure supplement 1: E: How was the migration velocity assessed? By live imaging individual cells or following the cell front of the group? Over what time period? Do the data points in the graph correspond to individual cells or the cell group? It would be important to show confocal images that go along with this quantification.

      We took snapshots of pupae at 0 h, 1 h, 2 h and 3 h, and measured the distance covered by the migrating progenitor cells from the starting point (the junction of TC and DT) to the leading edge of the progenitor groups. We then calculated the migration rate as v = d (µm)/t (min). Because the progenitor cells revolve around and migrate along the DT, tracking single tracheoblasts through the intact cuticle is technically challenging. We have therefore measured the leading edge as a proxy for the whole cell group. We agree with you that time-lapse imaging is preferable for the analysis of migration.
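      The rate calculation described above can be illustrated with a minimal sketch. It assumes the leading-edge distance is recorded in micrometers at each snapshot and that the mean rate is taken over the whole interval; the function name and the example distances are hypothetical and are not taken from the manuscript.

```python
def migration_velocity(distances_um, times_h):
    """Mean migration rate in µm/min between the first and last snapshot.

    distances_um: leading-edge distance from the TC-DT junction at each snapshot (µm).
    times_h: snapshot times in hours (e.g. 0, 1, 2, 3).
    """
    if len(distances_um) != len(times_h) or len(distances_um) < 2:
        raise ValueError("need at least two matching distance/time points")
    displacement_um = distances_um[-1] - distances_um[0]
    elapsed_min = (times_h[-1] - times_h[0]) * 60.0  # hours -> minutes
    if elapsed_min <= 0:
        raise ValueError("snapshot times must increase")
    return displacement_um / elapsed_min  # v = d / t


# Hypothetical distances, for illustration only.
example_rate = migration_velocity([0.0, 30.0, 60.0, 90.0], [0, 1, 2, 3])
```

      A per-interval rate could be obtained the same way by passing two consecutive snapshots at a time.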

      (2) Figure 1-figure supplement 1: F: Why is there Gal80ts in the genotype? (and in Figure 1H). Also, what pupal age was used for this quantification?

      Expression of hid and rpr at the L3 stage impaired fat body integrity and adipocyte abundance, and caused lethality. Gal80ts was therefore used to control the expression of rpr.hid. Pupae at 0 h APF were used in the EdU experiment.

      (3) Figure 2C: what is shown in the 6 columns (why 3 each for control and rpr/hid)?

      We conducted 3 replicates of each group for control and rpr.hid.

      (4) In the methods, several Drosophila stocks are listed as 'source:" from a particular person (e.g. Dr Ma). Please list the real source of this stock, e.g. Bloomington stock number, or the lab and publication in which the stock was originally made.

      We provide the information on these stocks in the revised methods.

      (5) The SKOV3 carcinoma cell and S2 cell work is not described in the methods.

      We added a detailed description of this experiment to the "Cell culture and transfection" section of the revised Methods.

      (6) Figure 6 (F) 'Bar graph plots the abundance of Upd2-mCherry-containing vesicles in progenitors.' What does abundance mean? What was quantified, the number of vesicles, or the mean intensity? This is also not mentioned in the methods.

      We counted the number of Upd2-mCherry-containing vesicles in fat body cells and tracheal progenitors, and added a description of the measurement to the Methods.

      (7) There are a few language mistakes throughout the manuscript. E.g.

      (a) Line 117 and other places: Language: 'fat body' should be 'the fat body'.

      Thank you for pointing out these errors. We have corrected them accordingly.

      (b) Line 1276 Language mistakes: 'Video 1 3D-view of confocal image stacks of tracheal progenitors and fat body. Scale bar: 100 μm. Genotypes: UAS-mCD8-GFP/+;lsp2-Gal4,P[B123]-RFP-moe/+.' :stacks and genotypes should be singular.

      We have fixed these errors and thank you for pointing them out. We also proofread the entire manuscript to ensure accuracy.

      (8) In general, it is hard to figure out the exact genotypes used in experiments. This is mostly not written very clearly in the figure legends. E.g. Figure 2: genotype for A-C missing in figure legend (is B from control animals?)

      We added genotypes to the figure legends. For Figure 2A and C, the genotypes are lsp2-Gal4,P[B123]-RFP-moe/+ (control) and UAS-rpr-hid/+;Gal80ts/+;lsp2-Gal4,P[B123]-RFP-moe/+ (rpr.hid). Figure 2B is from control animals.

      Reviewer #2 (Recommendations for the authors):

      Major comments:

      (1) The phenotype resulting from Upd2 downregulation by RNAi is subtle and shown by unconvincing images. In addition, these phenotypes are analyzed using only one RNAi line.

      We used two independent upd2 RNAi alleles from THFC (THU1288 and THU1331) and observed a similar phenotype. For RNAi experiments, we always use multiple independent alleles.

      (2) The authors should analyze the phenotypic consequences of directional migration changes. Is there an effect on tracheal remodeling?

      We observed that the integrity of the tracheal network, especially the dorsal trunk, was impaired and that melanized tracheal branches were present, which may be due to incomplete regeneration (Figure 3-figure supplement 1E-I).

      (3) The number of tracheal progenitors should be quantified, as some genetic conditions may affect cell numbers, as is apparent in some panels.

      We examined cell number and cell proliferation and observed no significant difference between the control and bidirectional movement groups (Figure 3-figure supplement 1).

      (4) The data on PCP protein distribution are unconvincing, unquantified, and insufficient to support one of the main conclusions of the study, which is stated in the abstract: "JAK/STAT signaling promotes the expression of genes involved in planar cell polarity, leading to asymmetric localization of Fat in progenitor cells."

      We staged the animals, measured several animals for each genotype and provided the quantifications in the revised manuscript. The level of Ft-GFP is higher in the cells at the frontal edge. We tried to examine the expression of Ft-GFP at the single-cell level. However, this turned out to be difficult because the tracheal stem cells are not as regularly patterned as epithelial cells and their proximal-distal axis is not well defined. We therefore decided to measure the fluorescence signal of groups of stem cells along the DT, regardless of their individual polarity.

      Minor comments:

      (1) Language should be revised. In many places in the manuscript, starting in line 113, "fat body" should be "the fat body".

      Thank you for pointing out this error. We corrected it accordingly.

      (2) Genotypes used in experiments should be described.

      We added all the genotypes. We proofread the entire manuscript to complete the figure legends for genotypes.

      (3) Line 67, the reference to "The progenitor cells reside in Tr4 and Tr5 metameres and start to move along the tracheal branch" should include (Chen and Krasnow, Science 2014).

      We added the reference in the manuscript.

      (4) Line 1081, Figure 7 Legend. "Bar graph plots the abundance of Upd2-mCherry-containing vesicles" Abundance is the number of vesicles? The graph displays the average number of vesicles? Please explain and describe the quantification.

      The bar graph represents the number of Upd2-mCherry-containing vesicles in different conditions. We quantified the number of vesicles per area.

      (5) Figure 1 (I-J) What is shown on the panels? Progenitors marked with? This information is not present in the figure or figure legend. Same for Figure 2 (D-E).

      Figure 1I-J show the vector of migrating progenitors. We added the information in the legends. The tracheal cells were labeled by nls-mCherry in Figure 1I-J. In Figure 2D-E, the progenitors were marked with P[B123]-RFP-moe.

      (6) Figure 3 Q, Stat92E-GFP values in the graph are not well-explained. What do the numbers in the y-axis refer to?

      The y-axis represents the intensity of Stat92E-GFP normalized to control. We have changed the y-axis label to ‘normalized Stat92E-GFP intensity’ in the legends.
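      As an illustration of what normalization to control can mean in practice, the sketch below divides each measurement by the mean of the control measurements, so that the control group averages 1. This is one common convention and is an assumption here; the function name and the values are hypothetical.

```python
from statistics import mean


def normalize_to_control(values, control_values):
    """Express each intensity as a fraction of the mean control intensity."""
    reference = mean(control_values)
    if reference == 0:
        raise ValueError("mean control intensity must be non-zero")
    return [value / reference for value in values]


# Hypothetical raw intensities (arbitrary units), for illustration only.
control_raw = [100.0, 100.0]
knockdown_raw = [50.0, 100.0]
knockdown_normalized = normalize_to_control(knockdown_raw, control_raw)
```

      With this convention, a value of 0.5 on the y-axis would mean half the mean control intensity.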

      (7) In general, figures and figure legends must be revised. Sometimes stainings are not well-defined, some scale bars are missing and plots do not say what the values are.

      We apologize for the inadequate information and have revised the figures and legends accordingly.

      Reviewer #3 (Recommendations for the authors):

      Several points should be addressed by the authors in order to improve their manuscript.

      Major points:

      (1) The phenotype obtained from decreasing the inter-organ signaling is quite discrete. It is further weakened by the fact that the images chosen to illustrate the measures are not really convincing. No image at 1h APF shows any clear anterior migration. Based on the scale, most of the images at 3h APF do not show a striking difference compared to the control, and in any case, stronger phenotypes would be missed anteriorly since they would thus be out of frame. In addition, at 3h APF, progenitors migrating anteriorly from Tr5 position get mixed with those migrating posteriorly from Tr4 so it is not clear how measurements were made. Given that most phenotypes are observed upon the use of RNAis, it is possible that phenotypes are weak due to persistent gene expression. Using null clones for dome, hop, or stat in progenitors could therefore aggravate the phenotypes and support further the significance of the study. Finally, assessing the consequences of compromised fat body-tracheal communication on trachea morphology, function, and regeneration later in pupal development and on adult flies would also help strengthen the importance of the findings.

      We agree that anteriorly migrating Tr5 progenitors adjoining the Tr4 progenitors hinder measurement, and that mutants may give stronger phenotypes than RNAi lines. When assessing anterior migration, we measured only Tr4 progenitors (not Tr5). We also performed experiments using mutant alleles, which gave aberrant migration of tracheal progenitors (Figure 3-figure supplement 1A-D). We can now show that the integrity of the tracheal network, especially the dorsal trunk, was impaired, which may be due to incomplete regeneration (Figure 3-figure supplement 1E-I).

      (2) Although the authors did not observe defects in tracheal progenitor proliferation, progenitors seem to be present in excess in some key genetic background (e.g, upon expression of rpr.hid, statRNAi, Rab-RNAi or in the presence of BFA). This excess could be the result of another mechanism than proliferation (recruitment of extra progenitors since it is not clear how they originate, defect in apoptosis...) and could impact the localization of progenitors, those being pushed anteriorly as a consequence of crowding. A proper characterization of tracheal progenitor number would thus help to discriminate between defects in migration or crowding. This point could also be addressed by performing individual tracking of tracheal progenitors, to find out whether each progenitor is indeed migrating in the wrong direction or if the movement assessed by the global tracking method that is used is just a consequence of progenitor excess.

      We examined the cell number in the bidirectional movement samples and the control group. The results show no significant difference between the two groups (Figure 3-figure supplement 1). We also tried to follow every progenitor, but were unable to obtain convincing results with P[B123]-RFP-moe, as tracking single tracheoblasts through the intact cuticle is technically challenging.

      (3) Regarding the ChIP-seq experiment, an explanation of why choosing the "establishment of planar polarity" family should be provided since data indicate a quite low GeneRatio. Indeed, the "cell adhesion" family seems a more obvious candidate, which would be further supported by the fact that the JAK-STAT pathway has been shown to affect cell adhesion components such as ECadherin and FAK (Silver and Montell 2001, Mallart et al 2024). Also, have these known targets of JAK-STAT signaling been found in the ChIP-seq data? Since filopodia polarization is affected in tracheal progenitors when JAK-STAT signaling is decreased, the same question also applies to enabled, which is involved in filopodia formation and has been recently identified as a target of JAK-STAT signaling.

      As you kindly suggested, we tested a number of cell adhesion-related genes such as E-Cadherin (shg), fak, robo2 and enabled (ena). We did not observe an apparent aberrancy in the migration of tracheal progenitors (Figure 5-supplement 1J).

      (4) Data investigating PCP protein distribution is not convincing, not quantified, and not sufficient to draw one of the main conclusions of the study, which is even written in the abstract "JAK/STAT signaling promotes the expression of genes involved in planar cell polarity leading to asymmetric localization of Fat in progenitor cells."

      We have improved the quantification of Ft abundance in the progenitors at the frontal edge and in those lagging behind. The traces in the figures plot multiple replicates. The level of Ft-GFP is higher in the cells at the frontal edge.

      (5) Overall, the figures together with their caption and/or the material and methods section lack some important information for the reader to fully understand the data. In addition, some errors are found in multiple plots throughout the article and must be corrected. Here are some examples:

      Following your suggestion, we revised the legends and the Methods section to include sufficient information.

      (a) Migration distance plots from Figure 3E do not match the data presented in the source data file. It seems that, when creating the plot, instead of superimposing the bars, bars were stacked. This should be corrected for all migration distance plots from Figure 3E onward, including in supplementary figures.

      We apologize for the misleading representation. We have revised it accordingly and now show the quantification for each condition separately.

      (b) The number of analyzed flies and/or clusters of tracheal progenitors from different flies should be stated for all quantification or observations made on images. This information is lacking for all migration distance plots, for progenitor migration tracking (Figure 1 I, J), for DIPF reporter in Figure 2J, for plot profiles (Figure 5G, J), for Upd2-Rab5/Rab7/Lbm co-detections, PLA, CoIP, and lbm-pHluorin experiments. This also applies to RNA seq, ChIP seq, and surface proteomics, for which the number of pupae and number of replicates is not indicated.

      We changed the graphs to show the quantification and n number in different conditions separately.

      We also added the n number of replicates in methods.

      (c) How quantifications were performed is not sufficiently explained. For example, the reference point for migration distance measurement is not defined, and neither is whether the measures were made on fixed or live imaging samples. In fluorescence intensity measurements and Upd2 vesicle counting, information on whether measures were made on a single z slice or on a projection of several z slices should be stated together with what ROI and which FIJI tool for quantification were used. For plot profiles, the same information regarding z slices misses together with how the orientation, the thickness, and the length of the line were chosen, and again the number of times the experiment was conducted should be mentioned and error bars should appear on graphs.

      We thank this reviewer for the suggestions which help clarify the methodology of our experiments and improve presentation of our data. We have made the changes according to the suggestions and modified our methods section and the related figures to incorporate these changes.

      To measure the migration distance of tracheal progenitors, we took snapshots of living pupae at 0 h, 1 h, 2 h and 3 h APF, and measured the distance from the starting point (the junction of TC and DT) to the leading edge of the progenitor groups.

      To measure the fluorescence intensity of Stat92E-GFP and DIPF, we took z-stack confocal images of the samples and quantified the fluorescence intensity using FIJI. Specifically, intensity was quantified for regions of interest, using the Analysis and Measurement tools. To quantify Upd2-mCherry vesicles, z-stack confocal images of the fat body were taken and the cell counting function of FIJI was used to measure the vesicle number.

      To quantify the fluorescent intensity of in vivo tagged Ds, Ft and Fj proteins, a single z slice was used. The expression level of the protein was assessed as the integrated fluorescent intensity normalized to area.
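      The area-normalized measure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' FIJI workflow: the image is a plain 2D grid of pixel values from a single z slice, the region of interest is a boolean mask of the same shape, and the function name and pixel-area parameter are hypothetical.

```python
def integrated_intensity_per_area(image, roi_mask, pixel_area_um2=1.0):
    """Sum pixel values inside the ROI and divide by the ROI area.

    image: 2D list of pixel intensities from a single z slice.
    roi_mask: 2D list of booleans with the same shape; True marks ROI pixels.
    pixel_area_um2: area of one pixel in µm² (assumed calibration value).
    """
    total_intensity = 0.0
    roi_pixels = 0
    for image_row, mask_row in zip(image, roi_mask):
        for value, inside in zip(image_row, mask_row):
            if inside:
                total_intensity += value
                roi_pixels += 1
    if roi_pixels == 0:
        raise ValueError("ROI mask selects no pixels")
    return total_intensity / (roi_pixels * pixel_area_um2)


# Tiny hypothetical image: the ROI covers the top row only.
toy_image = [[1, 2], [3, 4]]
toy_mask = [[True, True], [False, False]]
toy_result = integrated_intensity_per_area(toy_image, toy_mask)
```

      Dividing by area makes regions of different size comparable, which matters when progenitor clusters differ in extent between genotypes.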

      For the measurement of Ft-GFP distribution, a single z slice of the progenitors immediately proximal to the DT was imaged. An arbitrary line was drawn along the migration direction from the starting TC-DT junction to the leading front (the length of the line corresponds to the distribution range of the tracheal stem cell clusters). The fluorescence intensity along the line was then calculated automatically with the built-in measurement function of the Zeiss confocal software.
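      A line profile of this kind can be sketched in a few lines. The example below is a generic illustration rather than the Zeiss software routine: it samples a 2D image at evenly spaced points between two (row, column) endpoints using nearest-pixel lookup, which is an assumed simplification (line thickness and interpolation are ignored), and all names are hypothetical.

```python
def line_profile(image, start, end, n_samples):
    """Sample pixel intensities along a straight line from start to end.

    image: 2D list of pixel intensities from a single z slice.
    start, end: (row, column) endpoints, e.g. TC-DT junction and leading front.
    n_samples: number of evenly spaced sample points, endpoints included.
    """
    if n_samples < 2:
        raise ValueError("need at least two sample points")
    (row0, col0), (row1, col1) = start, end
    profile = []
    for i in range(n_samples):
        fraction = i / (n_samples - 1)  # 0.0 at start, 1.0 at end
        row = round(row0 + fraction * (row1 - row0))  # nearest pixel row
        col = round(col0 + fraction * (col1 - col0))  # nearest pixel column
        profile.append(image[row][col])
    return profile


# Hypothetical one-row image whose intensity rises toward the leading front.
toy_slice = [[0, 1, 2, 3, 4]]
toy_profile = line_profile(toy_slice, (0, 0), (0, 4), 5)
```

      Comparing the mean of the samples nearest the leading front with those nearest the junction would give one simple summary of front-to-rear asymmetry.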

      Minor points:

      (1) In several instances, the authors generalize that stem cells migrate to leave their niche, but this is not the case for all stem cells.

      The phenomenon that stem cells leave their niche when they are activated is commonly observed. We interpreted the general mechanism from our system of tracheal stem cells. We fully agree with you that it may not be the case for all stem cells. We modified the text accordingly.

      (2) Line 122 -a reference paper or an image showing the expression pattern of the lsp2-Gal4 driver is missing.

      We added the reference in the manuscript.

      (3) Line 136 - The term "traces of individual progenitors" is overstated and should be reformulated as the method used does not seem to be individual cell tracking.

      We rephrased accordingly in the revised manuscript.

      (4) Line 146 - Fat body and tracheal progenitors are qualified as interdependent organs, in which aspect do tracheal progenitors affect the fat body?

      Current knowledge suggests close inter-organ crosstalk between the trachea and the fat body. The fly trachea supplies oxygen to the body and influences whole-body oxidation and metabolism. When the trachea is perturbed, the body becomes hypoxic, which causes an inflammatory response in adipose tissue, an important immune organ (Shin et al., 2024).

      (5) Line 163 - Not all the genes tested are cytokines, so the sentence should be reformulated. In addition, in supplementary Fig2-1 C-J, the KD of hh seems to abolish completely tracheal progenitor migration, which is not commented on.

      Following your suggestion, we revised the description of the genes tested. We also added comments to the revised manuscript regarding the phenotype of hh knockdown.

      (6) Line 180 - Conclusion is made on Dome expression while using a dome-Gal4 construct, which does not necessarily recapitulate the endogenous pattern of dome expression, so it should be reformulated. Ideally, dome expression should be assessed in another way. Also, it is not clear whether GFP is present only in progenitors since images are zoomed.

      We revised the statement and provided a larger view of dome>GFP that shows enriched expression in the tracheal progenitors (Figure 2-figure supplement 2E), an expression pattern that is consistent with FlyBase.

      (7) Line 199 - Is it upd-Gal4 or upd2-Gal4 that is used? Since the conclusion of the experiment is made on upd2, the use of upd-gal4 would not be relevant. If upd2-gal4 is used, it should be corrected. In general, the provenance of the Gal4 lines should be provided. In addition, a strong GFP signal in the trachea is visible on the image in Supplementary Figure 2-2F but not commented on and seems contradictory with the conclusion mentioning that fat body and gut are the main source of Upd2 production.

      We removed data obtained from the use of this irrelevant upd-Gal4 line.

      (8) Figures:

      -  Figure 1 G, H - Scale bar is missing.

      We added it accordingly.

      -  Figure 1 I, J - The information on the staining is missing.

      We added it in the revised manuscript.

      -  Figure 2A - Providing explanations of the terms "Count" and "Gene ratio" in the caption would be helpful for readers who are not used to this kind of data. In addition, the color code is confusing since the same color is used for the selected gene family and for high p-values (the same applies to other similar graphs).

      Gene ratio refers to the proportion of genes in a dataset that are associated with a particular biological process, function, or pathway. Count indicates the number of genes from the input gene list that are associated with a specific GO term. Redder colors indicate a smaller p-value, i.e., higher significance.
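
      As an illustration of these two definitions, the following minimal Python sketch computes Count and Gene ratio for each term. The gene list and the term names are hypothetical, and the denominator is assumed to be the size of the input gene list; enrichment tools may instead use only the input genes that carry at least one annotation.

```python
def enrichment_summary(input_genes, term_to_genes):
    """Return {term: (count, gene_ratio)} for each annotation term.

    count      = number of input genes annotated to the term
    gene_ratio = count / number of input genes (assumed denominator)
    """
    input_set = set(input_genes)
    summary = {}
    for term, annotated in term_to_genes.items():
        count = len(input_set & set(annotated))
        gene_ratio = count / len(input_set) if input_set else 0.0
        summary[term] = (count, gene_ratio)
    return summary


# Hypothetical input list and annotations, for illustration only.
input_genes = ["upd2", "dome", "hop", "Stat92E", "btl"]
term_to_genes = {
    "hypothetical_term_A": ["upd2", "dome", "hop", "Stat92E"],
    "hypothetical_term_B": ["btl", "geneX", "geneY"],
}

summary = enrichment_summary(input_genes, term_to_genes)
for term, (count, ratio) in summary.items():
    print(f"{term}: Count={count}, Gene ratio={ratio:.2f}")
```

      In this toy case, four of the five input genes fall under the first term, so its Count is 4 and its Gene ratio is 0.80.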

      -  Figure 2 B, C - What does the color scale represent? What do the columns in C correspond to, different time points, different replicates?

      The color scale represents the normalized expression. The columns in C correspond to different replicates of control and rpr.hid.

      -  Figure 2 F - The error bars on the 3h APF posterior bars are missing.

      We added error bars accordingly.

      -  Figure 2 G - The legend "Down-Stable-Up" is in comparison to what?

      The control group was generated from the reaction without H2O2. The comparison was relative to the control group.

      -  Figure 2 J - The specificity of the DIPF tool that has been created should be validated in other tissues displaying known JAK-STAT activity and/or in conditions of decreased JAK-STAT signaling. In addition, the added value of the tool as compared to the JAK-STAT activity reporter used later, which has been well characterized, is not obvious.

      We added the signal of DIPF in fat body and salivary gland, both of which harbor active JAK/STAT signaling (Figure 2-figure supplement 2F-H). As opposed to the well characterized Stat92E-GFP reporter that assays the downstream transcription activity, the DIPF reporter measures the upstream event of receptor dimerization.

      -  Figure 3 I-P - Reporter tool validation in Images I-L could be moved to supplementary data. In images M-P, staining of nuclei and/or membranes would be useful to assess cell integrity.

      We revised the figures accordingly.

      -  Figure 3Q and similar plots in the following figures do not explain the normalization performed and how it can be higher than 1 in control conditions.

      In these figures, we normalized the signal relative to the control groups; e.g., the value of Stat92E-GFP in the btl-GFP control group was set to 1 in the previous Figure 3Q (revised Figure 3-supplementary Figure B-J).
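
      To show how individual control values can exceed 1 under this kind of normalization, here is a minimal Python sketch. It assumes that every sample is divided by the mean of the control group, which is one common reading of "set to 1" and is not confirmed as the exact procedure used; the intensity values are hypothetical.

```python
from statistics import mean


def normalize_to_control(control, treated):
    """Divide every value by the control-group mean (assumed scheme)."""
    reference = mean(control)
    return [v / reference for v in control], [v / reference for v in treated]


# Hypothetical raw fluorescence intensities (arbitrary units).
control_raw = [90.0, 100.0, 110.0]
treated_raw = [150.0, 250.0]

control_norm, treated_norm = normalize_to_control(control_raw, treated_raw)
print(control_norm)  # individual control samples scatter around 1
print(treated_norm)
```

      Under this assumption only the control mean equals 1; single control samples lie above or below it according to their biological variation.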

      -  Figure 4C - These representations lack explanations to be fully understood by a broad audience.

      The figure shows that Stat92E binding was detected in the promoters and intronic regions (the orange peaks) of genes functioning in distal-to-proximal signaling, such as ds, fj, fz, stan, Vang and fat2. We added this information to the figure legend according to your suggestion.

      -  Figure 5 K,L - What is the x-axis missing, together with the method of tracking used?

      The x-axis shows the recording time from a t-stack series acquired at a time interval of 5 min. We revised the Methods section to provide a detailed procedure for this experiment.

      -  Figures 6 and 8- The overall figures lack a wider view of the cells/tissues/organs and/or additional staining to understand what is presented.

      We showed a fat body preparation. To resolve the vesicles, we used high magnification. We have now added wider views of the tissues under investigation (e.g. Figure 6-figure supplement 1).

      -  Figure 6 D,E - The scale bar is missing.

      We added it accordingly.

      -  Figure 8 O-S - What is the blue staining?

      The blue staining shows DAPI-stained nuclei. We have added the information in the legend.

      -  PLA experiments can give a lot of non-specific background. What kind of controls have been used in Figure 8 F-J? Negative controls should be done on cells that do not express upd2-mCherry using both antibodies to detect non-specific background, which does not usually appear completely black.

      If possible, a positive control using a known protein interacting with Rab5-GFP should be included.

      In the previous Figure 8, we used control samples lacking one of the primary antibodies. In the revised Figure 8, we performed the experiment as you suggested, with controls that do not express upd2-mCherry (Figure 8 E-J).

      -  Co-IP experiments - The raw data file for blots is quite hard to read through. Some legends are not facing the right lane and some blots presented in the main figure are difficult to track since several blots are presented in the raw data file. e.g.

      (a)  Raw blot for Figure 8 K: the band for mCherry in the IP anti-GFP blot (lane one in K) is not convincing, it is not distinguishable from other aspecific bands. On the reverse IP presented only in raw data, on the input from blot IB anti-mCherry, both lanes present exactly the same bands at 72kb when one of the lanes corresponds to extract from flies not expressing upd2-mCherry.

      Thank you for pointing out the incorrect labels. We apologize for the errors and have corrected them accordingly.

      (b)  Raw blot for Figure 8 L: on the input blot IB anti-GFP, there is a band corresponding to Rab7-GFP in the lane of the extract from flies not expressing Rab7-GFP.

      We corrected it.

      (c)  Raw data for Figure 8 M: on the last blot, legends are missing above the input Ib anti-GFP blot.

      We added the missing legends in the figure.

      Shin, M., Chang, E., Lee, D., Kim, N., Cho, B., Cha, N., Koranteng, F., Song, J.J., and Shim, J. (2024). Drosophila immune cells transport oxygen through PPO2 protein phase transition. Nature 631, 350-359.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, a chromosome-level genome of the rose-grain aphid M. dirhodum was assembled with high quality, and A-to-I RNA-editing sites were systematically identified. The authors then demonstrated that: 1) Wing dimorphism induced by crowding in M. dirhodum is regulated by 20E (ecdysone signaling pathway); 2) an A-to-I RNA editing prevents the binding of miR-3036-5p to CYP18A1 (the enzyme required for 20E degradation), thus elevating CYP18A1 expression, decreasing 20E titer, and finally regulating the wing dimorphism of offspring.

      Strengths:

      The authors present both genome and A-to-I RNA editing data. An interesting finding is that an A-to-I RNA editing site in CYP18A1 disrupts the binding site of miR-3036-5p, and that loss of miR-3036-5p regulation leads to less 20E and winged offspring.

      Weaknesses:

      How crowding represses the miR-3036-5p is still unclear.

      Reviewer #2 (Public Review):

      Summary:

      Environmental influences on development are ubiquitous, affecting many phenotypes in organisms. However molecular genetic and cellular mechanisms transducing environmental signals are still only barely understood. This study examines part of one such intracellular mechanism in a polyphenic (or dimorphic) aphid.

      Strengths:

      While other published reports have linked phenotypic plasticity to RNA editing before, this study reports such an interaction in insects. The study uses a wide array of molecular tools to identify connections upstream and downstream of the RNA editing to elucidate the regulatory mechanism, which is illuminating.

      Weaknesses:

      While this system is intriguing, this report does not foster confidence in its conclusions. Many of the analyses seem based on very small sample sizes. It is itself problematic that sample sizes are not obvious in most figures, although based on Methods section covering RNAseq, they seem to be either 3, 6 or 9, depending on whether stages were pooled, but that point is not made clear. With such small sample sizes, statistical tests of any kind are unreliable. Besides the ambiguity on sample sizes, it's unclear what error bars or whiskers show in plots throughout this study. When sample sizes are small estimates of variance are not reliable. Student's t-test is not appropriate for comparisons with such small sample sizes. Presently, it is not possible to replicate the tests shown in Figures 3, 4 and 6. (Besides the HT-seq reads, other data should also be made publicly available, following the journal's recommendations.) Regardless, effect sizes in some comparisons (Fig 3J, 4A-C, 6E, H) are clearly not large, making confidence in conclusions low. The authors should be cautious about over-interpreting these data.

      We very much appreciate the time the reviewers spent on our manuscript and their valuable suggestions and comments.

      To Reviewer #1:

      At present, research on miRNAs mainly focuses on their role in gene regulation through binding to the mRNAs of target genes; how miRNAs themselves are regulated has received less attention.

      Recent studies indicate that miRNA expression is also regulated at the transcriptional and post-transcriptional levels. Transcriptional regulation, including changes at the promoters of miRNA genes, and post-transcriptional mechanisms, such as changes in miRNA processing and stability, can both affect the final expression level of miRNAs.

      This article did not address how crowding regulates miRNA expression. This is a very interesting question, and we will address it in our future research.

      Thank you for this suggestion.

      To Reviewer #2:

      (1) “Transgenerational wing dimorphism was observed in M. dirhodum in which crowding of the parent (100 mother aphids in a 10 cm³ tube) increased the winged offspring (Fig 3E).” In this experiment, more than 250 offspring per group were used to calculate the proportions of winged and wingless individuals in the normal (277), crowding (255) and crowding+20E (272) groups.

      “The RNAi-mediated knockdown of CYP18A1 and ADAR2 can significantly increase the titer of 20E (Fig. 4E) and reduce the number of winged offspring by 29.6% and 24.4% (Fig. 4F), respectively.” In this experiment, more than 245 offspring per group were used to calculate the proportions of winged and wingless individuals in the dsEGFP (273), dsCYP18A1 (248), and dsADAR2 (250) groups.

      “miR-3036-5p agomir and antagomir treatments could affect the proportion of winged offspring under normal conditions (Fig. 6F), but have no effect on the wing dimorphism of offspring under crowded conditions (Fig. 6L).” In this experiment, more than 235 offspring were used to calculate the proportions of winged and wingless individuals in each group.

      Because the number of aphids counted in each group is sufficient, we consider our conclusion that crowding treatment, A-to-I RNA editing, and miRNAs affect the wing dimorphism of offspring in M. dirhodum to be reliable.

      (2) Quantitative PCR was used to detect changes in the expression levels of CYP18A1 and ADAR2 after treatment with crowding, 20E, dsRNA, miRNA agomir and antagomir, and the results are shown in Fig. 3J, 4A-C, 5B, 6B, H, respectively. Five biological replicates (each comprising more than 100 aphids) were used per sample, which we consider sufficient for qPCR experiments. The differences in gene expression levels among these biological replicates were relatively small.

      (3) The titer of 20E was detected after treatment with crowding, 20E, dsRNA, miRNA agomir and antagomir, and the results are shown in Fig. 3I, 4E, 6E, K, respectively. 8 biological replicates (more than 100 aphids were used for each biological replicate) were used in each sample.

      The number of biological replicates used in each analysis and the number of aphids included in each biological replicate have been added in the Materials and Methods section. Thank you very much for pointing out this important issue.

      Reviewer #1 (Recommendations For The Authors):

      Several questions:

      (1) This study was conducted on the rose-grain aphid M. dirhodum. However, pea aphid Acyrthosiphon pisum seems to be a better object in wing dimorphism and development studies. Have the authors also identified the A-to-I RNA editing on pea aphids or other aphids?

      Wheat is one of the main grain crops in China as well as in the world. Metopolophium dirhodum is one of the most important wheat aphids around China, and has posed a significant threat to grain production. The current study was conducted to determine the regulatory mechanism of wing dimorphism on M. dirhodum, which might be of great significance to better control this pest in wheat production.

      Surely the pea aphid offers more established experimental tools and genomic resources. However, with the development of high-throughput sequencing technology, the chromosome level genomes of many insect species have been assembled. That means any of various insects might be studied as a model species, and not limited to Drosophila melanogaster, Acyrthosiphon pisum, etc.

      We did not identify A-to-I RNA editing in pea aphids or other aphids. A recent study has shown that editing events are poorly conserved across different Xenopus species. Even sites that are detected in both X. laevis and X. tropicalis show largely divergent editing levels or developmental profiles. In protein-coding regions, only a small subset of sites, found mostly in the brain, are well conserved between frogs and mammals. The conservation of RNA editing in aphids is still unknown, and we will continue to pay attention to this issue in our future research.

      Reference: Nguyen TA, Heng JWJ, Ng YT, Sun R, Fisher S, Oguz G, Kaewsapsak P, Xue S, Reversade B, Ramasamy A, Eisenberg E, Tan MH. Deep transcriptome profiling reveals limited conservation of A-to-I RNA editing in Xenopus. BMC Biology. 2023, 21(1):251.

      (2) "Two miRNA-target prediction software programs, miRanda and RNAhybrid, were used to identify the miRNAs that potentially act on CYP18A1. The results showed that miR-3036-5p could bind to the sequence containing edited position (editing site 528) of CYP18A1 in M. dirhodum." Is there any other miRNA that can also act on CYP18A1, thereby regulating its expression?

      The prediction results indicate that several other miRNAs can act on CYP18A1, but none of them can bind to this editing site (editing site 528). Therefore, we did not pursue the other miRNAs.

      (3) 11678 A-to-I RNA-editing sites were systematically identified in M. dirhodum. Does that mean RNAi-mediated knockdown of ADAR2 may affect the RNA-editing and expression of a large number of genes? Please clarify.

      It is of course possible that RNAi-mediated knockdown of ADAR2 affects the RNA editing and expression of a large number of genes. A-to-I RNA editing was also observed in 5 other genes involved in the 20E biosynthesis and signaling pathway, but no evident difference in the RNA editing or expression levels of these 5 genes was identified after crowding treatment (Fig. S2, Table S5). This suggests that the A-to-I RNA editing of CYP18A1 might be crucial in 20E-mediated wing dimorphism in M. dirhodum.

      (4) It is interesting that "the transcriptional level of ADAR2 was 2.19 fold higher in the crowding+20E treatment parent than that in the normal group, but no significant difference was identified between the crowding and normal groups". ADAR2 can be induced by 20E, rather than crowding. How should the author explain? It seems that 20E induction can also cause many RNA editing events.

      20-hydroxyecdysone (20E) can affect the growth and development, molting, metamorphosis, and reproduction of insects. According to this result, 20E induction can also cause RNA editing events by regulating the expression of ADAR2, which may provide a valuable reference for future studies on 20E. We will also continue to pay attention to this issue in our future research.

      (5) Authors provided a lot of text to describe the genome assembly. I don't think it's necessary, authors can make appropriate deletions.

      Thank you for this suggestion. This is the first high-quality chromosome-level genome of M. dirhodum, which will be very helpful for the cloning, functional verification, and evolutionary analysis of genes in this important species or even other Hemiptera insects. Therefore, I think it is necessary to provide a detailed description. We will also make appropriate deletions in the “Result and Discussion” sections.

      Reviewer #2 (Recommendations For The Authors):

      Additional concerns

      - With an existing genome sequence available for the pea aphid *Acyrthosiphon pisum*, why have these authors chosen to use the rose-grain aphid for this study? It would be helpful to address any limitations in *Acyrthosiphon pisum* or advantages in *Metopolophium dirhodum* that explain that decision.

      Wheat is one of the main grain crops in China as well as in the world. Metopolophium dirhodum is one of the most important wheat aphids around China, and has posed a significant threat to grain production. The current study was conducted to determine the regulatory mechanism of wing dimorphism on M. dirhodum, which might be of great significance to better control this pest in wheat production.

      Surely the pea aphid offers more established experimental tools and genomic resources. However, with the development of high-throughput sequencing technology, the chromosome level genomes of many insect species have been assembled. That means any of various insects might be studied as a model species, and not limited to Drosophila melanogaster, Acyrthosiphon pisum, etc.

      - In Figure 5E, what anatomy is being shown in FISH? Moreover, this represents a single sample. It would be preferable to include a supplemental figure with comparable images from at least 3 additional specimens.

      It is the whole aphid body, and we have uploaded two additional FISH images to the supplementary material (Fig. S5). Thank you for this suggestion.

      - L190: Conservation alone seems inadequate to conclude that a chromosome functions as a sex chromosome. It would be fine to note the homology between Chr1 and the X of other Aphidini, but there are other explanations for that. Inference that Chr 1 is a sex chromosome might come from observations in karyotypes (by relative size comparisons or ideally from FISH) or from comparison of reads mapped to the chromosomes, suggesting Chr1 is hemizygous in males.

      Karyotype analysis was not conducted in this study, so the sex chromosome was inferred from chromosome homology between the M. dirhodum and A. pisum genomes. We have modified the description in the article accordingly. Thank you for this suggestion.

      - L205: It's unclear to me how to interpret RNA editing results, based on RNAseq data, that map to "intergenic regions", especially when this is such a large fraction (37.3%) of the total result. Does this suggest a fundamental problem with the analysis, that so much RNAseq data maps to parts of the genome that are not annotated as genes?

      Non-coding RNA regions often account for a large proportion in the genome, and this RNAseq data is mapped to non-coding RNA transcription regions (37.3%) between protein-coding genes (intergenic regions).

      - L288-290: What degrees of confidence are attached to the predictions of these miRNA targets?

      There is no clear research indicating the accuracy of miRNA target prediction software. However, by comprehensively utilizing multiple prediction tools and experimental verification, the accuracy and reliability of prediction can be significantly improved.

      Actually, the prediction of miRNA targets is only a preliminary identification step, and we have subsequently demonstrated that miR-3036-5p can act on CYP18A1 through dual-luciferase reporter assay, RNA immunoprecipitation and FISH, etc.
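
      The idea of combining several prediction tools can be sketched as a simple intersection of their candidate lists. The Python snippet below is only an illustration: it assumes each tool's output has already been parsed into (miRNA, target) pairs, it does not call miRanda or RNAhybrid, and every pair other than miR-3036-5p/CYP18A1 is hypothetical.

```python
def consensus_targets(*prediction_sets):
    """Return the (miRNA, target) pairs reported by every tool."""
    if not prediction_sets:
        return set()
    return set.intersection(*(set(p) for p in prediction_sets))


# Hypothetical parsed predictions; only the first pair comes from the text.
tool_a_pairs = {
    ("miR-3036-5p", "CYP18A1"),
    ("miR-hypothetical-1", "CYP18A1"),
}
tool_b_pairs = {
    ("miR-3036-5p", "CYP18A1"),
    ("miR-hypothetical-2", "CYP18A1"),
}

shared = consensus_targets(tool_a_pairs, tool_b_pairs)
print(sorted(shared))
```

      Pairs supported by all tools are then the candidates carried forward to experimental verification.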

      - L296-298: The mechanism proposed in this study seems to imply that miR-3036-5p should be absent (not expressed) in aphids under crowded conditions. Therefore, relative realtime PCR is not particularly useful here. Finding that the miR relative expression is reduced by 48.8% is meaningless, because in *relative* expression, zero has no special meaning. In this case, absolute quantitative PCR, measuring actual transcript numbers, would be far more informative.

      miR-3036-5p is not absent in aphids under crowded conditions. Only a significant decrease of miR-3036-5p in expression level under crowded conditions was identified compared to normal feeding conditions (Fig. 5B). So it should be reasonable to use relative quantitative methods for expression level analysis.
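
      For readers unfamiliar with relative quantification, the following Python sketch uses the widely used 2^(-ΔΔCt) calculation. This is an assumption for illustration: the exact calculation used in the study is not stated here, and the Ct values below are hypothetical. A 48.8% reduction corresponds to a fold change of about 0.512.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target in treated vs control samples (2^-ddCt).

    Each Ct of the target is first normalized to a reference gene (dCt),
    then the treated dCt is compared with the control dCt (ddCt).
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target crosses threshold one cycle later
# in the treated sample, with an unchanged reference gene.
fold_change = relative_expression(25.0, 18.0, 24.0, 18.0)
percent_reduction = (1.0 - fold_change) * 100.0
print(fold_change, percent_reduction)
```

      The method reports a ratio between conditions, so it describes how much a transcript decreases but not how many copies remain.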

      - L361: Isn't alternative mRNA splicing a more common post-transcriptional modification?

      We apologize for this; the sentence has been modified to “A-to-I RNA editing is one of the most prevalent forms of posttranscriptional modification in animals, plants, and other organisms.” Thank you for this suggestion.

      - L372: "Functional wing polymorphism is commonly observed in insects as a form of adaptation and a source of variation for natural selection (14)." The relationship between plastic phenotypic variation and natural selection is complex, and there is a large theoretical literature in evolutionary biology and evo-devo on this topic, but it is not a focus in the cited review by Zhang et al.. It would be helpful if the authors could expand on this idea with reference to some of this literature (e.g. Levins 1968; Harrison 1980; Moran 1992; Roff 1996; West-Eberhard 2003; Zera 2009).

      I have changed the citation and expanded on this idea. “Wing polymorphism is commonly observed in insects, resulting from variation in both genetic factors and environmental factors (Zera 2009).”

      - L404: Use of the word "accurate" seems inappropriate in this context. Both morphs are equally "accurate".

      This sentence has been modified to “resulting in the alteration of CYP18A1 expression and wing dimorphism of offspring regulated by miR-3036-5p”. Thank you for this suggestion.

      - L412: Reference 67 seems irrelevant to this point.

      References have been changed and added.

      67. E.J. Duncan, C.B. Cunningham, P.K. Dearden. Phenotypic plasticity: what has DNA methylation got to do with it? Insects. 13(2):110 (2022).

      68. K.J. Rangan, S.L. Reck-Peterson, RNA recoding in cephalopods tailors microtubule motor protein function. Cell 186, 2531-2543 (2023).

      - L443: Is this referring to "mixed stage" aphids?

      Yes. To make it clearer, this sentence has been modified to “Approximately 200 mg of fresh M. dirhodum with mixed stages (including first- to fourth-instar nymphs and winged and wingless adults)”.

      - L483: What mass or number of individual aphids was used? I assume multiple individuals were pooled?

      Each sample contains approximately 200 aphids.

      - L499: Why was k = 17 used? The default is k = 21.

      The selection of k is usually an odd number between 15 and 21, which ensures that the types of k-mers can cover the genome while being small enough to avoid erroneous effects. Therefore, using 17 is very reasonable.
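
      As background on k-mer analysis, the minimal Python sketch below counts canonical k-mers in a set of reads. It illustrates the general technique and is not the tool or settings used for the genome survey; the read is a toy example. One commonly cited reason for choosing an odd k is that an odd-length k-mer can never be identical to its own reverse complement.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_canonical_kmers(reads, k=17):
    """Count k-mers, merging each k-mer with its reverse complement."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[min(kmer, reverse_complement(kmer))] += 1
    return counts


# Toy read with a small k so the result is easy to inspect.
toy_counts = count_canonical_kmers(["ACGTACGTAC"], k=3)
print(toy_counts)
```

      With an even k, palindromic k-mers such as ACGT equal their own reverse complement, which complicates canonical counting; an odd k avoids this case.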

      - L574: what does it mean "multiple editing types"? What different types are possible? Are you referring to things other than A-to-I editing?

      That means besides A-to-I, this locus may also have other editing situations, such as A-to-C. If this situation occurs, it will be discarded.
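
      This filtering rule can be sketched in Python as follows. The sketch assumes that substitutions have already been called and strand-corrected, so that A-to-I editing appears as an A-to-G mismatch in the sequencing reads; the loci and the data structure are hypothetical.

```python
def filter_single_type_sites(observed, wanted=("A", "G")):
    """Keep loci whose only observed substitution is the wanted one.

    observed maps (chromosome, position) to the set of (ref, alt)
    substitutions seen at that locus.
    """
    kept = []
    for locus, substitutions in sorted(observed.items()):
        if set(substitutions) == {wanted}:
            kept.append(locus)
    return kept


# Hypothetical loci, for illustration only.
observed = {
    ("chr1", 100): {("A", "G")},              # A-to-G only: kept
    ("chr1", 200): {("A", "G"), ("A", "C")},  # multiple types: discarded
    ("chr2", 300): {("A", "C")},              # not A-to-G: discarded
}

kept_sites = filter_single_type_sites(observed)
print(kept_sites)
```

      Only the first locus passes, because the second shows more than one substitution type and the third shows no A-to-G change.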

      - L635: Which luciferase construct or plasmid has been used in this experiment? Citation to that source is necessary.

      PmirGLO vector (Promega, Leiden, Netherlands) was used in this experiment, and a reference has been added.

      B. Zhu, L. Li, R. Wei, P. Liang, X. Gao. Regulation of GSTu1-mediated insecticide resistance in Plutella xylostella by miRNA and lncRNA. PLoS Genetics. 17(10), e1009888 (2021).

      - L644: Did cDNA synthesis employ random primers or a poly-dT primer?

      This kit provides mixed primers, including random and poly-dT primers. (PrimeScript™ RT reagent Kit with gDNA Eraser (Perfect Real Time), Takara Biotechnology, Dalian, China).

      - Fig 4D: Seems like this panel should be divided to cover the two sites, as in Fig 3F. Right now the x-axis labels seem redundant.

      Done. Thank you for this suggestion.

      - Fig 7: Consider adding ADAR2 to this figure.

      Done. Thank you for this suggestion.

      - Table 1: It would be helpful to represent this data in a figure where the phylogenetic relationships among the species can be shown.

      The phylogenetic relationships among the species were shown in Fig. 1D, and the table here may present genome information in more detail.

    1. Reviewer #2 (Public review):

      Summary:

      This study by Pradhan et al. offers critical insights into the mechanisms by which antimony-resistant Leishmania donovani (LD-R) parasites alter host cell lipid metabolism to facilitate their own growth and, in the process, acquire resistance to amphotericin B therapy. The authors illustrate that LD-R parasites enhance LDL uptake via fluid-phase endocytosis, resulting in the accumulation of neutral lipids in the form of lipid droplets that surround the intracellular amastigotes within the parasitophorous vacuoles (PV) that support their development and contribute to amphotericin B treatment resistance. The evidence provided by the authors supporting the main conclusions is compelling, presenting rigorous controls and multiple complementary approaches. The work represents an important advance in understanding how intracellular parasites can modify host metabolism to support their survival and escape drug treatment.

      Strengths:

      (1) The study utilizes clinical isolates of antimony-resistant L. donovani and provides interesting mechanistic information regarding the increased LD-R isolate virulence and emerging amphotericin B resistance.

      (2) The authors have used a comprehensive experimental approach to provide a link between antimony-resistant isolates, lipid metabolism, parasite virulence, and amphotericin B resistance. They have combined the following approaches:

      (a) In vivo infection models involving BL/6 and Apoe-/- mice.

      (b) Ex-vivo infection models using primary Kupffer cells (KC) and peritoneal exudate macrophages (PEC) as physiologically relevant host cells.

      (c) Various complementary techniques to ascertain lipid metabolism including GC-MS, Raman spectroscopy, microscopy.

      (d) Applications of genetic and pharmacological tools to show the uptake and utilization of host lipids by the infected macrophage resident L. donovani amastigotes.

      (3) The outcome of this study has clear clinical significance. Additionally, the authors have supported their work by including patient data showing a clear clinical significance and correlation between serum lipid profiles and treatment outcomes.

      (4) The present study effectively connects the basic cellular biology of host-pathogen interactions with clinical observations of drug resistance.

      (5) Major findings in the study are well-supported by the data:

      (a) Intracellular LD-R parasites induce fluid-phase endocytosis of LDL independent of LDL receptor (LDLr).

      (b) Enhanced fusion of LDL-containing vesicles with parasitophorous vacuoles (PV) containing LD-R parasites both within infected KCs and PECs cells.

      (c) Intracellular cholesterol transporter NPC1-mediated cholesterol efflux from parasitophorous vacuoles is suppressed by the LD-R parasites within infected cells.

      (d) Selective exclusion of inflammatory ox-LDL through MSR1 downregulation.

      (e) Accumulation of neutral lipid droplets contributing to amphotericin B resistance.

      Weaknesses:

      The weaknesses are minor:

      (1) The authors do not show how they ascertain that they have a purified fraction of the PV post-density gradient centrifugation.

      (2) The study could have benefited from a more detailed analysis of how lipid droplets physically interfere with amphotericin B access to parasites.

      Impact and significance:

      This work makes several fundamental advances:

      (1) The authors were able to show the link between antimony resistance and enhanced parasite proliferation.

      (2) They were also able to reveal how parasites can modify host cell metabolism to support their growth while avoiding inflammation.

      (3) They were able to show a certain mechanistic basis for emerging amphotericin B resistance.

      (4) They suggest therapeutic strategies combining lipid droplet inhibitors with current drugs.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This manuscript describes the impact of modulating signaling by a key regulatory enzyme, Dual Leucine Zipper Kinase (DLK), on hippocampal neurons. The results are interesting and will be important for scientists interested in synapse formation, axon specification, and cell death. The methods and interpretation of the data are solid, but the study can be further strengthened with some additional studies and controls.

      We greatly appreciate the thorough review and thoughtful suggestions from the reviewers and editors on our original manuscript. We provide point-by-point responses below. We added new studies on P10 mice and the suggested controls, and revised the figures and text for clarity. The revised manuscript includes three new supplemental figures; major text revisions are copied under the relevant responses.

      Reviewer #1 (Public Review):

      Summary:

      In this work, Ritchie and colleagues explore functional consequences of neuronal over-expression or deletion of the MAP3K DLK that their labs and others have strongly implicated in both axon degeneration, neuronal cell death, and axon regeneration. Their recent work in eLife (Li, 2021) showed that inducible over-expression of DLK (or the related LZK) induces neuronal death in the cerebellum. Here, they extend this work to show that inducible over-expression in Vglut1+ neurons also kills excitatory neurons in hippocampal CA1, but not CA3. They complement this very interesting finding with translatomics to quantify genes whose mRNAs are differentially translated in the context of DLK over-expression or knockout, the latter manipulation having little to no effect on the phenotypes measured. The authors note that several genes and pathways are differentially regulated according to whether DLK is over-expressed or knocked out. They note DLK-dependent changes in genes related to synaptic function and the cytoskeleton and ultimately relate this in cultured neurons to findings that DLK over-expression negatively impacts synapse number and changes microtubules and neurites, though with a less obvious correlation.

      Strengths:

      This work represents a conceptual advance in defining DLK-dependent changes in translation. Moreover, the finding that DLK may differentially impact neuronal death will become the basis for future studies exploring whether DLK contributes to differential neuronal susceptibility to death, which is a broadly important topic.

      We thank the reviewer for the comments on the value of our work.

      Weaknesses:

      This seems like two works in parallel that the authors have not yet connected. First is that DLK affects the translation of an interesting set of genes, and second, that DLK(OE) kills some neurons, disrupts their synapses, and affects neurite growth in culture.

      Specific questions:

      (1) Is DLK effectively knocked out? The authors reference the floxxed allele in their 2016 work (PMID: 27511108), however, the methods of this paper say that the mouse will be characterized in a future publication. Has this ever been published? The major concern is that here the authors show that Cre-mediated deletion results in a smaller molecular weight protein and the maintenance of mRNA levels.

      We apologize for the out-of-date citation of the DLK(cKO)<sup>fl/fl</sup> mice. The DLK(cKO)<sup>fl/fl</sup> mice have been published in (Li et al., 2021; Saikia et al., 2022); excision of the flox-ed exon was verified using several Cre drivers (Pv-Cre, AAV-Cre, and VGlut1-Cre in this study). The flox-ed exon contains the initiation ATG and 148 amino acids. By western blot analysis using antibodies against C-terminal peptides of DLK on cerebellar extracts (in Li et al., 2021) and hippocampal extracts (this study), the full-length DLK protein was significantly reduced (Fig 1A-B); DLK is expressed in other hippocampal cells in addition to glutamatergic neurons, which explains the remaining full-length DLK detected.

      Our Ribo-seq of VGlut1-Cre; DLK(cKO)<sup>fl/fl</sup> detected remaining Dlk mRNAs lacking the floxed exon (Fig.S1C), which have several candidate ATGs at amino acid 223 and after (Fig.S1C1). We detected a very faint band for smaller molecular weight proteins on western blots only when the membrane was exposed 5X longer, using Pico PLUS Chemiluminescent Substrate (Thermo Scientific, 34580) and a Licor Odyssey XF Imager (revised Fig. S1B). This smaller molecular weight protein might be produced from any of the candidate ATGs, but would represent an N-terminally truncated DLK protein lacking the ATP binding site and ~1/4 of the kinase domain, i.e. not a functional kinase.

      The revised manuscript has an updated citation for DLK(cKO)<sup>fl/fl</sup>. Revised Fig.S1B includes images of a western blot probed with anti-DLK antibodies under normal vs longer exposure. New Fig.S1C1 shows the effects of the floxed exon on DLK.

      (2) Why does DLK(OE) not kill CA3 neurons? The phenomenon is clear but there is no link to gene expression changes. In fact, the highlighted transcript in this work, Stmn4, changes in a DLK-dependent manner in CA3.

      We agree that this is a very interesting question not answered by our gene expression analysis. While we verified that Stmn4 expression levels correlate with the levels of DLK, we do not think that increased Stmn4 per se in DLK(iOE) is a major factor accounting for CA1 death vs CA3 survival. Several published studies have also reported regulation of Stmn4 mRNAs in other cell types, in the contexts of cell death (Watkins et al., 2013; Le Pichon et al., 2017) and axon regeneration and cytoskeleton disruption (Asghari Adib et al., 2024; DeVault et al., 2024; Hu et al., 2019; Shin et al., 2019). As Stmns show significant redundancy in expression and function, conventional knockdown or overexpression of an individual Stmn generally does not lead to detectable effects on cellular function. As CA3 neurons are widely known for their dense connections and show resilience to NMDA-mediated neurotoxicity (Sammons et al., 2024; Vornov et al., 1991), we speculate that the differential vulnerability of CA1 and CA3 under DLK(iOE) reflects both intrinsic properties, such as gene expression, and circuit connections.

      In the revised manuscript, we have included the following statement on pg 18:

      ‘While our data does not pinpoint the molecular changes explaining why CA3 would show less vulnerability to increased DLK, we may speculate that DLK(iOE) induced signal transduction amplification may differ in CA1 vs CA3. CA1 genes appear to be more strongly regulated than CA3 genes, consistent with our observation that increased c-Jun expression in CA1 is greater than that in CA3. Other parallel molecular factors may also contribute to resilience of CA3 neurons to DLK(iOE), such as HSP70 chaperones, different JNK isoforms, and phosphatases, some of which showed differential expression in our RiboTag analysis of DLK(iOE) vs WT (shown in File S2. WT vs DLK(iOE) DEGs). Together with other genes that show dependency on DLK, the DLK and Jun regulatory network contributes to the regional differences in hippocampal neuronal vulnerability under pathological conditions.’

      Further we state in ‘Limitation of our study’ on pg 20:

      ‘Our analysis also does not directly address why CA3 neurons are less vulnerable to increased DLK expression. Future studies using cell-type specific RiboTag profiling and other methods at a refined time window will be required to address how DLK dependent signaling interacts with other networks underlying hippocampal regional neuron vulnerability to pathological insults.’

      We hope our data will stimulate continued interest in testable hypotheses for future studies.

      (3) Why are whole hippocampi analyzed to IP ribosome-associated mRNAs? The authors nicely show a differential effect of DLK on CA1 vs CA3, but then - at least according to their methods - lyse whole hippocampi to perform IP/sequencing. Their data are therefore a mix of cells where DLK does and does not change cell death. The key issue is whether DLK does/does not have an effect based on the expression changes it drives.

      At the time of planning the Ribo-Tag experiment several years ago, we focused on the hippocampal glutamatergic neurons. Due to the technical difficulty of micro-dissecting individual hippocampal regions at this early timepoint, we opted to use whole hippocampi to isolate ribosome-associated mRNAs. We agree with the reviewer that it is important to sort out DLK-dependent general gene expression changes vs those specific to a particular cell type where DLK impacts its survival. With emerging CA1, CA3 and other cell-type specific Cre drivers and advanced RNAseq technology, we hope that our work will stimulate broad interest in these questions in future studies.

      In the revised manuscript, we have included new analysis comparing our Vglut1-RiboTag profiling (P15) with CamK2-RiboTag (for CA1) and Grik4-RiboTag (for CA3) (P42) published in Traunmüller et al., 2023 (GSE209870). We find that >80% of the top ranked genes in their CamK2-RiboTag (for CA1) and Grik4-RiboTag (for CA3) were detected in our VGlut1-RiboTag (revised methods and Supplemental Excel File S3). CA1-enriched genes tended to be expressed higher in DLK(cKO), compared to control, whereas CA3-enriched genes showed less significant correlation to DLK expression levels. Additionally, many genes known to specify CA1 fate do not show significant downregulation in DLK(iOE). This analysis, along with other data in our manuscript, is consistent with the idea that DLK does not regulate neuronal fate.

      In the revised manuscript, we presented this additional analysis in Fig. S6K-L, and expanded the text description on page 9:

      ‘Additionally, we compared our Vglut1-RiboTag datasets with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We defined a list of genes enriched in CamK2-expressing CA1 neurons relative to Grik4-expressing CA3 neurons (CA1 genes), and those enriched in Grik4-expressing CA3 neurons (CA3 genes) (File S3). When compared with the entire list of Vglut1-RiboTag profiling in our control and DLK(cKO), we found CA1 genes tended to be expressed more in DLK(cKO) mice, compared to control (Fig.S6K), while CA3 genes showed a slight enrichment in control though the trend was less significant, and were less clustered towards one genotype (Fig.S6L). Moreover, many CA1 genes related to cell-type specification, such as FoxP1, Satb2, Wfs1, Gpr161, Adcy8, Ndst3, Chrna5, Ldb2, Ptpru, and Ntm, did not show significant downregulation when DLK was overexpressed. These observations imply that DLK likely specifically down-regulates CA1 genes both under normal conditions and when overexpressed, with a stronger effect on CA1 genes, compared to CA3 genes. Overall, the informatic analysis suggests that decreased expression of CA1 enriched genes may contribute to CA1 neuron vulnerability to elevated DLK, although it is also possible that the observed down-regulation of these genes is a secondary effect associated with CA1 neuron degeneration’.
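
The ">80% of the top ranked genes were detected" comparison described above amounts to a set-overlap calculation. The following is a minimal sketch of that idea only, not the analysis code used in the study: the function name `fraction_detected`, the case-insensitive matching of gene symbols, and the toy gene lists are all assumptions made for illustration.

```python
def fraction_detected(top_genes, detected_genes):
    """Return the fraction of top_genes that also appear in detected_genes.

    Gene symbols are compared case-insensitively (an assumption of this sketch).
    """
    top = {gene.upper() for gene in top_genes}
    detected = {gene.upper() for gene in detected_genes}
    if not top:
        raise ValueError("top_genes must not be empty")
    return len(top & detected) / len(top)


# Hypothetical toy lists for illustration; these are not data from the study.
reference_top = ["Wfs1", "Satb2", "Foxp1", "Ntm", "Ldb2"]
our_detected = ["wfs1", "satb2", "foxp1", "ntm", "Stmn4", "Jun"]

overlap_fraction = fraction_detected(reference_top, our_detected)
print(f"{overlap_fraction:.0%} of reference genes detected")
```

In practice the reference list would be the top-ranked genes from the published dataset and the detected list would come from the in-house profiling; how "top ranked" is defined is left to the revised Methods.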

      (4) Is the subtle decrease in synapse number (Bassoon/Homer co-loc.) in the DLK (OE) simply a function of neurons (and their synapses, presumably) having died? At the P15 time point that the authors choose because cell death is minimal, there is still a ~25% reduction in CA1 thickness (Figure 2B), which is larger than the ~15% change in synapses (Figure 5H) they describe.

      We thank the reviewer for the question. To address this, we have analyzed synapses in the CA1 region at P10 in DLK(iOE) mice when there was no detectable loss of neurons. At P10, we did not detect significant changes in Bassoon, Homer1, or colocalized puncta in CA1 (Fig.S11A-F). In P15 DLK(iOE) mice, Homer1 puncta were slightly smaller (Fig.5L) and showed a significant decrease in CA1 SR (Fig.5I).

      In the revised manuscript we have also redone our statistical analysis of synapses, using mice rather than ROIs as the unit of analysis (revised Fig. 5), as recommended by R3. We also analyzed synapses in CA3, and found no significant differences at P10 or P15 (Fig.S12). We interpret the data to mean that the effects of DLK(OE) on synapses in CA1 may represent an early step in neuronal death. We hope that future studies will shed light on this question.
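
Treating mice rather than ROIs as the statistical unit means collapsing all ROI measurements from one animal into a single value before any group comparison. The sketch below illustrates that aggregation step under stated assumptions: the function name `per_mouse_means`, the `(mouse_id, value)` input layout, and the numbers are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def per_mouse_means(roi_values):
    """Collapse (mouse_id, value) ROI measurements to one mean per mouse."""
    grouped = defaultdict(list)
    for mouse_id, value in roi_values:
        grouped[mouse_id].append(value)
    return {mouse_id: mean(values) for mouse_id, values in grouped.items()}


# Hypothetical puncta densities: three ROIs from each of two mice.
rois = [
    ("m1", 10.0), ("m1", 12.0), ("m1", 14.0),
    ("m2", 20.0), ("m2", 22.0), ("m2", 24.0),
]

means = per_mouse_means(rois)
print(means)  # one value per mouse, so N equals the number of animals
```

Any downstream test would then be run on the per-mouse values, so that the sample size reflects the number of animals rather than the number of imaged fields.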

      Reviewer #2 (Public Review):

      This manuscript describes the impact of deleting or enhancing the expression of the neuronal-specific kinase DLK in glutamatergic hippocampal neurons using clever genetic strategies, which demonstrates that DLK deletion had minimal effects while overexpression resulted in neurodegeneration in vivo. To determine the molecular mechanisms underlying this effect, ribotag mice were used to determine changes in active translation which identified Jun and STMN4 as DLK-dependent genes that may contribute to this effect. Finally, experiments in cultured neurons were conducted to better understand the in vivo effects. These experiments demonstrated that DLK overexpression resulted in morphological and synaptic abnormalities.

      Strengths:

      This study provides interesting new insights into the role of DLK in the normal function of hippocampal neurons. Specifically, the study identifies:

      (1) CA1 vs CA3 hippocampal neurons have differing sensitivity to increased DLK signaling.

      (2) DLK-dependent signaling in these neurons is similar to but distinct from the downstream factors identified in other cell types, highlighted by the identification of STMN4 as a downstream signal.

      (3) DLK overexpression in hippocampal neurons results in signaling that is similar to that induced by neuronal injury.

      The study also provides confirmatory evidence that supports previously published work through orthogonal methods, which adds additional confidence to our understanding of DLK signaling in neurons. Taken together, this is a useful addition to our understanding of DLK function.

      We thank the reviewer for careful reading and positive comments.

      Weaknesses:

      There are a few weaknesses that limit the impact of this manuscript, most of which are pointed out by the authors in the discussion. Namely:

      (1) It is difficult to distinguish whether the changes in the translatome identified by the authors are DLK-dependent transcriptional changes, DLK-dependent post-transcriptional changes or secondary gene expression changes that occur as a result of the neurodegeneration that occurs in vivo. Additional expression analysis at earlier time points could be one method to address this concern.

      We appreciate the reviewer’s comment, and have performed new analysis on c-Jun and p-c-Jun levels in CA1, CA3, and DG in P10 DLK(OE) mice. Our data suggest that in CA3 elevations in p-c-Jun and c-Jun occur separately from cell death in a DLK-dependent manner, though the high elevation of both p-c-Jun and c-Jun in CA1 correlates with cell death.

      The data are presented in revised Fig.S7A,B, and described in revised text on pg 9-10:

      ‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis.’

      Also, on pg.10:

      ‘In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and a strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).’

      (2) Related to the above, it is difficult to conclusively determine from the current data whether the changes in synaptic proteins observed in vivo are a secondary result of neuronal degeneration or a primary impact on synapse formation. The in vitro studies suggest this has the potential to be a primary effect, though the difference in experimental paradigm makes it impossible to determine whether the same mechanisms are present in vitro and in vivo.

      We appreciate the comment, which is related to R1 point 4. We have performed further analysis and revised the text on pg. 12 as follows:

      ‘To assess effects of DLK overexpression on synapses, we immunostained hippocampal sections from both P10 and P15, with age-matched littermate controls. Quantification of Bassoon and Homer1 immunostaining revealed no significant differences in CA1 SR and CA3 SR and SL in P10 mice of Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> and control (Fig.S11A-F, S12A-J). In P15, Bassoon density and size in CA1 SR were comparable in both mice (Fig 5G, H, K), while Homer1 density and size were reduced in DLK(iOE) (Fig.5G,I, L). Overall synapse number in CA1 SR was similar in DLK(iOE) and control mice (Fig.5J). Similar analysis on CA3 SR and SL detected no significant difference from control (Fig.S12M-V).’

      We interpret the data to mean that the effects of DLK(OE) on synapses in CA1 may represent an early step in neuronal death. We hope that future studies will shed light on this question.

      Additionally, to address whether the same mechanisms are present in vitro, we have performed further analysis on cultured hippocampal neurons. As described in the Methods, we made hippocampal neuron cultures from P1 pups of the following crosses:

      For control: Vglut1<sup>Cre/+</sup> X Rosa26<sup>tdT/+</sup> 

      For DLKcKO: Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>  X Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>;Rosa26<sup>tdT/+</sup> 

      For DLKiOE: H11-DLK<sup>iOE/iOE</sup> X Vglut1<sup>Cre/+</sup>;Rosa26<sup>tdT/+</sup> 

      Dissociated cells from a given litter were pooled into the same culture. Because each culture contained a different proportion of neurons with our genotype of interest, it is not straightforward to determine whether DLK was causing significant cell death.

      On pg 13, we stated our observation:

      ‘We did not notice an obvious effect of DLK(iOE) or DLK(cKO) on neuron density in cultures at DIV2. To assess neuronal type distribution in our cultures, we immunostained DIV14 neurons with antibodies for Satb2, as a CA1 marker (Nielsen et al., 2010), and Prox1, as a marker of DG neurons (Iwano et al., 2012). We did not observe significant differences in the proportion of cells labeled with each marker in DLK(cKO) or DLK(iOE) cultures (Fig.S13E). These data are consistent with the idea that DLK signaling does not have a strong role in neuron-type specification both in vivo and in vitro’.

      (3) The phenotype of DLK cKO mice is very subtle (consistent with previous reports) and while the outcome of increased DLK levels is interesting, the relevance to physiological DLK signaling is less clear. What does seem possible is that increased DLK may phenocopy other neuronal injuries but there are no real comparisons to directly address this in the manuscript. It would be helpful for the authors to provide this analysis as well as a table with all of the translational changes along with fold changes.

      Thank you for the suggestion. The fold changes of genes showing significantly altered expression in DLK(cKO) and DLK(iOE) are provided in the Excel files (Supplementary Excel File S1. WT vs DLK(cKO) DEGs and File S2. WT vs DLK(iOE) DEGs, highlighted columns B and F).

      On pg 6, we revised the text as follows to include a comparison of DLK levels under other physiological conditions and in our mice:

      ‘Several studies have reported that DLK protein levels increase under a variety of conditions, including optic nerve crush (Watkins et al., 2013), NGF withdrawal (~2 fold) (Huntwork-Rodriguez et al., 2013; Larhammar et al., 2017), and sciatic nerve injury (Larhammar et al., 2017). Induced human neurons show increased DLK abundance (~4 fold) in response to ApoE4 treatment (Huang et al., 2019). Increased expression of DLK can lead to its activation through dimerization and autophosphorylation (Nihalani et al., 2000)’.

      And,

      ‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’

      In Discussion, we state (pg. 16): ‘The levels of DLK in our DLK(iOE) mouse model appear comparable to those reported under traumatic injury and chronic stress.’

      (4) For the in vivo experiments, it is unclear whether multiple sections from each animal were quantified for each condition. More information here would be helpful and it is important that any quantification takes multiple sections from each animal into account to account for natural variability.

      We apologize that this was unclear in the original manuscript.

      In the revised methods, under Confocal imaging and quantification (pg 33), we stated: “For brain tissue, three sections per mouse were imaged with a minimum of three mice per genotype for data analysis.”

      In revised figure legends, we made it clear that multiple sections from each animal have been used for quantification in all instances, i.e. “Each dot represents averaged thickness from 3 sections per mouse, N≥4 mice/genotype per timepoint.” 

      In Fig.1F-H: “Each dot represents averaged intensity from 3 sections per mouse”

      In Fig.S3B “Data points represent individual mice, averages taken across 3 sections per mouse”

      Reviewer #3 (Public Review):

      Dr Jin and colleagues revisit DLK and its established multifactorial roles in neuronal development, axonal injury, and neurodegeneration. The ambitious aim here is to understand the DLK-dependent gene network in the brain and, to pursue this, they explore the role of DLK in hippocampal glutamatergic neurons using conditional knockout and induced overexpression mice. They produce evidence that dorsal CA1 and dentate gyrus neurons are vulnerable to elevated expression of DLK, while CA3 neurons appear unaffected. Then they identify the DLK-dependent translatome featured by conserved molecular signatures and cell-type specificity. Their evidence suggests that increased DLK signaling is associated with possible STMN4 disruptions to microtubules, among else. They also produce evidence on cultured hippocampal neurons showing that expression levels of DLK are associated with changes in neurite outgrowth, axon specification, and synapse formation. They posit that downstream translational events related to DLK signaling in hippocampal glutamatergic neurons are a generalizable paradigm for understanding neurodegenerative diseases.

      Strengths

      This is an interesting paper based on a lot of work and a high number of diverse experiments that point to the pervasive roles of DLK in the development of select glutamatergic hippocampal neurons. One should applaud the authors for their work in constructing sophisticated molecular cre-lox tools and their expert Ribotag analysis, as well as technical skill and scholarly treatment of the literature. I am somewhat more skeptical of interpretations and conclusions on spatial anatomical selectivity without stereological approaches and also going directly from (extremely complex) Ribotag profiling patterns to relevance based on immunohistochemistry and no additional interventions to manipulate (e.g. by knocking down or blocking) their top Ribotag profile hits. Also, it seems to this reviewer that major developmental claims in the paper are based on gene translational profiling dependent on DLK expression, not DLK activation, despite some evidence in the paper that there is a correlation between the two. Therefore, observed patterns and correlations may or may not be physiologically or pathologically relevant. Generalizability to neurodegenerative diseases is an overreach not justified by the scope, approach, and findings of the paper.

      We thank the reviewer for the encouraging and constructive comments on the manuscript.

      Weaknesses and Suggestions:

      The authors state that the rationale for the translatomic studies is "to gain molecular understanding of gene expression associated with DLK in glutamatergic neurons" and to characterize the "DLK-dependent molecular and cellular network". However, a problem with the experimental design is the selection of an anatomical region at a time point featured by active neurodegeneration. Therefore, it is not straightforward that the differentially expressed genes or pathways caused by DLK overexpression changes could be due to processes related to neurodegeneration. Indeed, the authors find enrichment of signals related to pathways involved in extracellular matrix organization, apoptosis, unfolded protein responses, the complement cascade, DNA damage responses, and depletion of signals related to mitochondrial electron transport, etc., all of which could be the consequence of neurodegeneration regardless of cause. A more appropriate design to discover DLK-dependent pathways might be to look at a region and/or a time point that is not confounded by neurodegeneration.

      We appreciate the reviewer’s comment. We included our thoughts in ‘Limitation of the study’ (pg 20):

      ‘Future studies using cell-type specific RiboTag profiling and other methods at a refined time window will be required to address how DLK dependent signaling interacts with other networks underlying hippocampal regional neuron vulnerability to pathological insults.’

      In a related vein, the authors ask "if the differentially expressed genes associated with DLK(iOE) might show correlation to neuronal vulnerability" and, to answer this question, they select the set of differentially expressed genes after DLK overexpression and assess their expression patterns in various regions under normal conditions. It looks to me that this selection is already confounded by neurodegeneration which could be the cause for their downregulation. Therefore, such gene profiles may not be directly linked to neuronal vulnerability. A similar issue also relates to the conclusion that "...the enrichment of DLK-dependent translation of genes in CA1 suggests that the decreased expression of these genes may contribute to CA1 neuron vulnerability to elevated DLK".

      We agree with the reviewer’s concern that it is difficult to separate neurodegenerative consequences from changes caused by DLK based solely on our translatomics studies of P15 DLK(iOE) mice. As in our responses to reviewer 1 (point 4) and reviewer 2 (point 1), we have included new analysis of P10 mice (Fig.S7A,B), when neurons did not show detectable signs of degeneration.

      We consider that several lines of evidence support the idea that some differentially expressed genes in DLK(iOE) vs control are likely specific to increased DLK signaling.

      First, the genes identified in DLK(iOE) vs control represent a small set of genes (260), which is comparable to other DLK dependent datasets (Asghari Adib et al., 2024) but shows cell-type specificity.

      Second, our analysis using rank-rank hypergeometric overlap (RRHO) detects a significant correlation between upregulated genes in DLK(iOE) and downregulated genes in DLK(cKO), and vice versa, suggesting that expression of a similar set of genes depends on DLK (Fig.3C, S6C-E). Consistently, GO term analysis using the list of genes coordinately regulated by DLK, derived from our RRHO analysis, identifies GO terms for up- and downregulated genes similar to those found using DLK(iOE)-RiboTag data alone. SynGO analysis of DLK(iOE)-regulated genes and DLK(cKO)-regulated genes also identified similar synaptic processes (Fig.3F and S6J).

      Third, we performed additional analysis comparing our Vglut1-RiboTag dataset with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We observed >80% overlap among the top ranked genes (revised Methods). We described this analysis on pg 9 and Fig. S6K-L (and Supplemental Excel File S3):

      ‘Additionally, we compared our Vglut1-RiboTag datasets with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We defined a list of genes enriched in CamK2-expressing CA1 neurons relative to Grik4-expressing CA3 neurons (CA1 genes), and those enriched in Grik4-expressing CA3 neurons (CA3 genes) (File S3). When compared with the entire list of Vglut1-RiboTag profiling in our control and DLK(cKO), we found CA1 genes tended to be expressed more in DLK(cKO) mice, compared to control (Fig.S6K), while CA3 genes showed a slight enrichment in control though the trend was less significant, and were less clustered towards one genotype (Fig.S6L). Moreover, many CA1 genes related to cell-type specification, such as FoxP1, Satb2, Wfs1, Gpr161, Adcy8, Ndst3, Chrna5, Ldb2, Ptpru, and Ntm, did not show significant downregulation when DLK was overexpressed. These observations imply that DLK likely specifically down-regulates CA1 genes both under normal conditions and when overexpressed, with a stronger effect on CA1 genes, compared to CA3 genes. Overall, the informatic analysis suggests that decreased expression of CA1 enriched genes may contribute to CA1 neuron vulnerability to elevated DLK, although it is also possible that the observed down-regulation of these genes is a secondary effect associated with CA1 neuron degeneration.’
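
For readers unfamiliar with the rank-rank hypergeometric overlap mentioned in the second point above, the core calculation for one pair of rank cutoffs is a hypergeometric tail probability on the overlap of two top-ranked gene sets. The sketch below shows that single-cell calculation only, under stated assumptions: the function names `hypergeom_tail` and `rrho_cell` are hypothetical, both ranked lists are assumed to contain the same genes, and the toy gene names are not from the study. A full RRHO analysis scans many cutoff pairs and applies multiple-testing correction, which is not shown here.

```python
from math import comb


def hypergeom_tail(overlap, n_a, n_b, n_total):
    """P(X >= overlap) for the overlap X of two random gene sets of sizes
    n_a and n_b drawn from n_total genes (hypergeometric upper tail)."""
    upper = min(n_a, n_b)
    favorable = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(n_total, n_b)


def rrho_cell(ranked_a, ranked_b, i, j):
    """Overlap and tail p-value for the top i genes of ranked_a versus the
    top j genes of ranked_b. Assumes both lists rank the same gene universe."""
    top_a = set(ranked_a[:i])
    top_b = set(ranked_b[:j])
    overlap = len(top_a & top_b)
    return overlap, hypergeom_tail(overlap, i, j, len(ranked_a))


# Hypothetical toy rankings of six genes (not data from the study).
ranked_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
ranked_b = ["g2", "g1", "g4", "g3", "g6", "g5"]

overlap, p_value = rrho_cell(ranked_a, ranked_b, 2, 2)
print(overlap, p_value)
```

To compare upregulated genes in one condition with downregulated genes in another, one of the two rankings would simply be reversed before calling `rrho_cell`.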

      To understand the role and relevance of the DLK overexpression model, there should be a discussion of to what extent it corresponds to endogenous levels of DLK expression or DLK-MAPK pathway activation under baseline or pathological conditions.

      We appreciate the suggestion, which is similar to R2 point 3. We have revised the text and discussion to include how DLK levels may be altered in other physiological conditions vs our mice.

      Pg. 6: ‘Several studies have reported that DLK protein levels increase under a variety of conditions, including optic nerve crush (Watkins et al., 2013), NGF withdrawal (~2 fold) (Huntwork-Rodriguez et al., 2013; Larhammar et al., 2017), and sciatic nerve injury (Larhammar et al., 2017). Induced human neurons show increased DLK abundance (~4 fold) in response to ApoE4 treatment (Huang et al., 2019). Increased expression of DLK can lead to its activation through dimerization and autophosphorylation (Nihalani et al., 2000)’.

      And,

      ‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’

      In Discussion (pg. 16): ‘The levels of DLK in our DLK(iOE) mouse model appear comparable to those reported under traumatic injury and chronic stress.’

      The authors posit that "dorsal CA1 neurons are vulnerable to elevated DLK expression, while neurons in CA3 appear largely resistant to DLK overexpression". This statement assumes that DLK expression levels start at a similar baseline among regions. Do the authors have any such data? Ideally, they should show whether DLK expression and p-c-Jun (as a marker of downstream DLK signaling) are the same or different across regions in both WT and overexpression mice. For example, what are the DLK/p-c-Jun expression levels in regions other than CA1 in Supplementary Figures 2-3 and how do they compare with each other? Normalization to baseline for each region does not allow such a comparison. Also, in Supplementary Figure 6, analyses and comparisons between regions are done at a time point when degeneration has already started. Ideally, these should be done at P10.

      We thank the reviewer for raising these points. In the revised manuscript we have included protein expression analysis of DLK (Fig S3), c-Jun, and p-c-Jun at P10 (Fig. S7).

      We provided a quantification of DLK immunostaining intensity in CA1 and CA3 in Fig.S3D,E and found roughly comparable levels between regions.

      Pg. 6: ‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’

      We provided our quantifications without normalization to baseline in each region for c-Jun and p-c-Jun, and revised the text accordingly:

      Pg. 9-10: ‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis’.

      Pg. 10: ‘In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).’

      Illustration of proposed selective changes in hippocampal sector volume needs to be very carefully prepared in view of the substantial claims on selective vulnerability. In 2A under P15 and especially P60, it is difficult to see the difference - this needs lower magnification and a lot of care that anteroposterior levels are identical because hippocampal sector anatomy and volumes of sectors vary from level to level. One wonders if the cortex shrinks, too. This is important.

      Thank you for raising the point. We have provided images to view the anteroposterior level in Fig.S2A-C. We have noticed that the cortex in DLK(OE) mice becomes thinner, along with expansion of the ventricles in some animals at later timepoints (Fig.S2C).

      One cannot be sure that there is selective death of hippocampal sectors with DLK overexpression versus, say, rearrangement of hippocampal architecture. One may need stereological analysis, otherwise this substantial claim appears overinterpreted.

      We appreciate the comment.

      In the revised manuscript, we included a new supplemental figure (Fig. S2) showing lower magnification images of coronal sections, and used cautionary wording, such as ‘CA3 is less vulnerable, compared to CA1’, to minimize the impression of over-interpretation. By NeuN staining, at P10, P15, and P60, we did not observe a detectable difference in overall hippocampus architecture, apart from the noted cell death in CA1 and DG and the associated thinning of each of the layers. At 46 weeks, some animals showed differences in the overall shape of dorsal hippocampus, though this appeared to reflect a disproportionately large CA3 region compared to other regions (Fig S2). Increased GFAP staining (Fig.S5A-C) was detected in CA1 but not in CA3, and microglia by IBA1 staining (Fig.S5E) also displayed less reactivity in CA3, compared to CA1. Thus, based on NeuN staining, GFAP staining, IBA1 staining and analysis of the differentially regulated genes, we infer that the effect of DLK(iOE) in CA1 differs from its effect in CA3.

      Is the GFAP excess reflective of neuroinflammation? What do microglial markers show? The presence of neuroinflammation does not bode well with apoptosis. Speaking of which, TUNEL in one cell in Supplementary Figure 4E is not strong evidence of a more widespread apoptotic event in CA1.

      We have included staining data for the microglia marker IBA1. Both GFAP and IBA1 showed evidence of reactivity particularly in the CA1 region (S5A-E), supporting the differential vulnerability in different regions, though whether cell death is primarily due to apoptosis is unclear.

      We agree that our data of sparse TUNEL staining at P15 (Fig S5F,G) do not rule out that other mechanisms of cell death may also occur. We have included this in our limitations (pg.20): “While we find evidence for apoptosis, other forms of cell death may also occur.”

      In several places in the paper (as illustrated in Figure 4B, Supplementary Figure 2B, etc.): the unit of biological observation in animal models is typically not a cell, but an organism, in which averaged measures are generated. This is a significant methodological problem because it is not easy to sample neurons without involving stereological methods. With the approach taken here, there is a risk that significance may be overblown.

      We appreciate the reviewer’s point. We used the same region for quantification of RNAscope, and quantified genotype-blind when possible. We revised the graphs to show mean values for individual mice in Fig.4B, 4C, and Fig.S3B (previously Fig.S2B).

      Other Comments and Questions:

      Supplementary Figure 9: The authors state that data points are shown for individual ROIs - ideally, they should also show averages for biological replicates. Can the authors confirm that statistical analyses are based on biological replicates (mice) and not ROIs?

      We have revised the graphs to show averages from individual mice in Fig.5B-D, Fig.5E-F (previously Fig.S9G-I), Fig.5H-J, Fig.5K-L (previously Fig.S9J-L), and Fig.S10B,C,E,F (previously Fig.S9B,C,E,F). The statistical analyses are based on biological replicates (mice).
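
      The practice discussed here, collapsing ROI- or cell-level measurements to one value per animal before any statistical test, can be sketched as follows. This is only an illustration with made-up numbers: the function name `per_animal_means` and the mouse identifiers are hypothetical and do not come from the manuscript.

```python
from collections import defaultdict
from statistics import mean

def per_animal_means(measurements):
    """Collapse (animal_id, value) pairs into one mean value per animal."""
    grouped = defaultdict(list)
    for animal_id, value in measurements:
        grouped[animal_id].append(value)
    return {animal_id: mean(values) for animal_id, values in grouped.items()}

# Hypothetical ROI-level intensities: three ROIs from each of two mice.
roi_values = [
    ("mouse_1", 1.0), ("mouse_1", 2.0), ("mouse_1", 3.0),
    ("mouse_2", 4.0), ("mouse_2", 5.0), ("mouse_2", 6.0),
]
animal_means = per_animal_means(roi_values)
# The sample size for statistics is the number of mice, not the number of ROIs.
n_biological_replicates = len(animal_means)
```

      Testing on `animal_means` rather than on `roi_values` keeps the sample size equal to the number of animals, which is the concern the reviewer raises.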

      For in vitro experiments, what is the effect of DLK overexpression on neuronal viability and density? Could these variables confound effects on synaptogenesis/synapse maturation?

      As described in the Methods, we made hippocampal neuron cultures from P1 pups of the following crosses:

      For control: Vglut1<sup>Cre/+</sup> X Rosa26<sup>tdT/+</sup> 

      For DLKcKO: Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>  X Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>;Rosa26<sup>tdT/+</sup> 

      For DLKiOE: H11-DLK<sup>iOE/iOE</sup> X Vglut1<sup>Cre/+</sup>;Rosa26<sup>tdT/+</sup> 

      Dissociated cells from a given litter were pooled into the same culture. Because there were different proportions of neurons with our genotype of interest in each culture, it is not simple to know whether DLK was causing significant cell death.

      On pg 13, we stated our observation:

      ‘We did not notice an obvious effect of DLK(iOE) or DLK(cKO) on neuron density in cultures at DIV2. To assess neuronal type distribution in our cultures, we immunostained DIV14 neurons with antibodies for Satb2, as a CA1 marker (Nielsen et al., 2010), and Prox1, as a marker of DG neurons (Iwano et al., 2012). We did not observe significant differences in the proportion of cells labeled with each marker in DLK(cKO) or DLK(iOE) cultures (Fig.S13E). These data are consistent with the idea that DLK signaling does not have a strong role in neuron-type specification both in vivo and in vitro’.

      We cannot rule out that variable factors in our cultures may confound effects on synaptogenesis/synapse maturation, and hope that future studies will provide clarity.

      Correlations between c-jun expression and phosphorylation are extremely important and need to be carefully and convincingly documented. I am a bit concerned about Supplementary Figure 6 images, especially 6B-CA1 (no difference between control and KO, too small images) and 6D (no p-c-Jun expression at all anywhere in the hippocampus at P15?).

      At P10, P15, and P60 we stained for p-c-Jun using the Rabbit monoclonal p-c-Jun (Ser73) (D47G9) antibody from Cell Signaling (cat# 3270) at a 1:200 dilution and imaged using an LSM800 confocal microscope with a 20x objective. We observed p-c-Jun to be quite low generally in control animals. We have replaced the images in Fig.S7F (previously S6D), and adjusted the brightness/contrast to enable better visualization of the low signal in Fig.S7B,D,F (previously Fig.S6B,D).

      We revised our text to present the data carefully as stated above:

      Pg. 9-10: ‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis’.

      Pg. 10: ‘In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).’

      Recommendations for the authors:

      Several major and minor reservations were raised. The major issues are the need for more information about the over-expression of DLK and a need to extrapolate to an in vivo condition with DLK. A considerable amount of useful information is presented with some very nicely done experiments but it is not yet a coherent or integrated story. The lack of impact of DLK overexpression in some neurons is perhaps the most impactful observation of the study and would be great to have more information around the differential transcriptional/signaling response in these cell types. There is also a need for more experimental details and to address several questions about the mouse genetic and translatome analysis. They are valid concerns that require attention by the authors.

      We thank the editors and reviewers for their thoughtful evaluation and suggestions.  We hope that the editors and reviewers find that the new data and text changes in our revised manuscript, along with above point-to-point response, have addressed the concerns and strengthened our findings.

      Minor points:

      (1)The authors state that deletion of DLK has no effect on CA1 at 1yr, however, the image of CA1 in Figure S1D shows substantially fewer NeuN+ neurons. Is this a representative field of view?

      We have re-examined the images and observed no effect on hippocampal morphology at 1 yr. We have now included representative images in the revised Fig S1D.

      (2) Is the DLK protein section staining in Figure 2C a real signal? The staining looks like speckles and is purely somatic. Axonal staining is widely expected based on the literature and the authors' own work. There should be a specificity control.

      To our knowledge, axonal staining of DLK reported in the literature is mostly based on cultured DRG neurons. In addition to the reported axonal localization, DLK is present in the cell soma, near the golgi (Hirai et al., 2002), and in the post-synaptic density (Pozniak et al., 2013).

      In the revised manuscript, we addressed this point by including controls with no primary antibody, and using an antibody against the closely related kinase, LZK. These additional data are shown in Fig.S3C,D (previously Fig.S2C) and support the conclusion that DLK protein staining represents real signal. At P10 and P15, DLK immunostaining around CA3 showed axonal staining of the mossy fibers, as well as staining in the soma and dendritic layers (Fig.S3C,D). A similar pattern was also seen in primary cultured neurons (Fig 6A).

      (3) The protein expression of DLK in the transgenic overexpressor (Figure S7C) looks, to the resolution of this blot, to be at least 50kD heavier than 'WT' DLK. Can the authors explain this discrepancy?

      The Cre-induced DLK(iOE) transgene has T2A and tdTomato in frame at the C-terminus of DLK. It is known that T2A ‘self-cleavage’ is often incomplete. DLK-T2A-tdTomato would be about 50 kD bigger than WT DLK. We now include the transgene design in revised Fig S1D, and also state in the figure legend of Fig.S8C (previously S7C) that ‘Larger molecular weight band of DLK in Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> would match the predicted molecular weight of DLK-T2A-tdTomato if T2A-peptide induced ‘self-cleavage’ due to ribosomal skipping is ineffective (Fig.S1D).’

      (4) Expression changes in DLK affect various aspects of neurites in CA1 cultures (Figure 6), and changes in DLK also modestly affect STMN4 (and 2, perhaps indirectly) levels (Figure S7C), but there is no indication that DLK acts via STMN4 to cause these changes. It is not clear what to make of these data. Of note, Stmn4 levels change in response to DLK in CA3, without DLK affecting cell death in this region.

      We appreciate and agree with the comment. Other studies (Asghari Adib et al., 2024; DeVault et al., 2024; Hu et al., 2019; Larhammar et al., 2017; Le Pichon et al., 2017; Shin et al., 2019; Watkins et al., 2013) reported expression changes in Stmn4 mRNAs in other cell types and cellular contexts, which appeared to depend on DLK. Hippocampal neurons express multiple Stmns (Fig.S8A). While we present our analysis on the effects of DLK dosage on Stmn4, and also Stmn2, we do not think that DLK-induced changes of Stmn4 expression per se are a major factor underlying CA1 cell death vs CA3 survival.

      In the revised manuscript, we addressed this point in ‘Limitation of our study’ (pg 20):

      ‘Additional experiments will be needed to elucidate in vivo roles of STMN4 and its interaction with other STMNs’.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      • Reviewer #1 (Evidence, reproducibility and clarity (Required)): Summary:

      In this manuscript, Hammond et al. study robustness of the vertebrate segmentation clock against morphogenetic processes such as cell ingression, cell movement and cell division to ask whether the segmentation clock and morphogenesis are modular or not. The modularity of these two would be important for evolvability of the segmenting system. The authors adopt a previously proposed 3D model of the presomitic mesoderm (Uriu et al. 2021 eLife) and include new elements; different types of cell ingression, tissue compaction and cell cycles. Based on the results of numerical simulations that synchrony of the segmentation clock is robust, the authors conclude that there is a modularity in the segmentation clock and morphogenetic processes.

      The presented results support the conclusion. The manuscript is clearly written. I have several comments that could help the authors further strengthen their arguments.

      Major comment:

      [Optional] In both the current model and Uriu et al. 2021, coupling delay in phase oscillator model is not considered. Given that several previous studies (e.g. Lewis 2003, Herrgen et al. 2010, Yoshioka-Kobayashi et al. 2020) suggested the presence of coupling delays in Delta-Notch signaling, could the authors analyze the effect of coupling delay on robustness of the segmentation clock against morphogenetic processes?

      Response: We thank the reviewer for the suggestion. Owing to the computational demands of including such a delay in the model, we cannot feasibly repeat every simulation analysed here in the presence of delay, and would like to note that the increased computational demand that delays put on the simulations is also the reason why Uriu et al. 2021 did not include it, as stated in their published exchange with reviewers. However, analogous to our analysis in figure 7, we can analyse how varying the position of progenitor cell ingression affects synchrony in the presence of the coupling delay measured in zebrafish by Herrgen et al. (2010). We show this analysis in a new figure 8 (8B, specifically), on page 21, and discuss its implications in the text on pages 20-22. Our analysis reveals that the model cannot recover synchrony using the default parameters used by Uriu et al. (2021) and reveals a much stronger dependence on the rate of cell mixing (vs) than shown in the instantaneous coupling case (cf. figure 7). However, by systematically varying the value of the delay we find that a relatively minor increase in the delay is sufficient to recover synchrony using the parameter set of Uriu et al. (see figure 8C). Repeating this across the three scenarios of cell ingression we see that the combination of coupling strength and delay determines the robustness of synchrony to varying position of cell ingression. This suggests that the combination of these two parameters constrains the evolution of morphogenesis.
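
      For readers unfamiliar with delayed phase coupling, the idea can be illustrated with a deliberately simplified sketch. It is not the model used in the manuscript or in Uriu et al. (2021): it assumes identical oscillators, all-to-all coupling, no noise, no cell movement, a constant phase history before t = 0, and dimensionless illustrative parameter values. The names `simulate` and `order_parameter` are hypothetical. Synchrony is measured by the magnitude of the Kuramoto order parameter.

```python
import cmath
import math
import random
from collections import deque

def order_parameter(phases):
    """Complex Kuramoto order parameter Z; abs(Z) is the synchrony r."""
    return sum(cmath.exp(1j * p) for p in phases) / len(phases)

def simulate(n_cells=20, omega=1.0, kappa=1.0, delay=0.0,
             dt=0.05, t_end=100.0, seed=0):
    """Euler-integrate identical, all-to-all coupled phase oscillators.

    d(theta_i)/dt = omega
                    + (kappa / N) * sum_j sin(theta_j(t - delay) - theta_i(t))
    Returns the synchrony r = |Z| at t_end.
    """
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_cells)]
    delay_steps = int(round(delay / dt))
    # Assumption: phases are held at their initial values for t < 0.
    history = deque([order_parameter(phases)] * (delay_steps + 1),
                    maxlen=delay_steps + 1)
    for _ in range(int(round(t_end / dt))):
        history.append(order_parameter(phases))  # drops the oldest entry
        z_delayed = history[0]                   # Z(t - delay)
        # Im(Z(t - delay) * exp(-i * theta_i)) equals the mean over j of
        # sin(theta_j(t - delay) - theta_i(t)).
        phases = [p + dt * (omega
                            + kappa * (z_delayed * cmath.exp(-1j * p)).imag)
                  for p in phases]
    return abs(order_parameter(phases))

r_instant = simulate(delay=0.0)  # instantaneous coupling
r_delayed = simulate(delay=2.0)  # illustrative, dimensionless delay
```

      Comparing `r_instant` with `r_delayed` over a range of `delay` and `kappa` values gives a feel for why the two parameters have to be considered together, although the quantitative behaviour of the full three-dimensional model will differ.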

      Minor comments:

      • PSM radius and oscillation synchrony are both denoted by the same alphabet r. The authors should use different alphabets for these two to avoid confusion.

      Response: We thank the reviewer for spotting this. This has now been changed throughout to rT, as shorthand for ‘radius of tissue’.

      • page 5 Figure 1 caption: (x-x_a/L) should be (x-x_a)/L.

      Response: We thank the reviewer for spotting this. This has now been corrected.

      • Figure 3C: Description of black crosses in the panels is required in the figure legend.

      Response: Thank you for spotting this. The legend has now been corrected.

      • Figure 3C another comment: In this panel, synchrony r at the anterior PSM is shown. It is true that synchrony at anterior PSM is most relevant for normal segment formation. However, in this case, the mobility profile is changed, so it may be appropriate to show how synchrony at mid and posterior PSM would depend on changes in mobility profile. Is synchrony improved by cell mobility at the region where cell ingression happens?

      Response: We thank the reviewer for the suggestion. We have now plotted the synchrony along the AP axis for varying motility profiles, and this can be seen in figure 3 supplement 1, and is briefly discussed in the text on page 11. We show that while the synchrony varies with x-position (as already expected, see figure 2), there is no trend associated with the shape of the motility profile.

      • In page 12, the authors state that "the results for the DP and DP+LV cases are exactly equal for L = 185 um, as .... and the two ingression methods are numerically equivalent in the model". I understood that in this case two ingression methods are equivalent, but I do not understand why the results are "exactly" equal, given the presence of stochasticity in the model.

      Response: These results can be exactly equal despite the simulations being stochastic because they were both initialised using the same ‘seed’ in the source code. However, we now see that this might be confusing to the reader, and we have re-generated this figure but this time initialising the simulations for each ingression scenario using a different seed value. This is now reflected in the text on page 12 and in figure 4.
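
      The point about seeds can be shown with a minimal sketch, in which a random walk stands in for a stochastic simulation run. The function `noisy_trajectory` is hypothetical and unrelated to the authors' source code.

```python
import random

def noisy_trajectory(seed, n_steps=5, noise_sd=0.1):
    """Random walk standing in for one stochastic simulation run."""
    rng = random.Random(seed)  # private generator, independent of global state
    value, trajectory = 0.0, []
    for _ in range(n_steps):
        value += rng.gauss(0.0, noise_sd)
        trajectory.append(value)
    return trajectory

# Two runs with the same seed draw the same random numbers and agree exactly.
same_seed_identical = noisy_trajectory(seed=1) == noisy_trajectory(seed=1)
# Runs with different seeds draw different numbers and diverge.
different_seed_identical = noisy_trajectory(seed=1) == noisy_trajectory(seed=2)
```

      Two numerically equivalent model configurations started from the same seed therefore produce identical output, which is why distinct seeds per scenario avoid the apparent coincidence.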

      • The authors analyze the effect of cell density on oscillation synchrony in Fig. 4 and they mention that higher density increases robustness of the clock by increasing the average number of interacting neighbours. I think it would be helpful to plot the average number of neighbouring cells in simulations as a function of density to quantitatively support the claim.

      Response: We thank the reviewer for their suggestion. Distributions of neighbour numbers for exemplar simulations with varying density can now be found in figure 4 supplementary figure 1 and are referred to in the text on page 11.
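
      The relationship between density and neighbour number can be sketched as below. The sketch assumes cells placed uniformly at random in a unit cube and a fixed interaction radius, which is a simplification of the horseshoe-shaped tissue in the model; `mean_neighbour_number` and all numerical values are hypothetical.

```python
import math
import random

def mean_neighbour_number(n_cells, radius, box=1.0, seed=0):
    """Mean number of cells within `radius` of a cell, for random positions."""
    rng = random.Random(seed)
    cells = [(rng.uniform(0, box), rng.uniform(0, box), rng.uniform(0, box))
             for _ in range(n_cells)]
    total = 0
    for i, a in enumerate(cells):
        for b in cells[i + 1:]:
            if math.dist(a, b) <= radius:
                total += 2  # the pair counts once for each of the two cells
    return total / n_cells

# Same volume and interaction radius, two different densities.
mean_low = mean_neighbour_number(n_cells=50, radius=0.2)
mean_high = mean_neighbour_number(n_cells=400, radius=0.2)
```

      With the volume and radius fixed, the mean neighbour number grows roughly in proportion to the number of cells, which is the quantity the reviewer asks to see plotted.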

      • The authors analyze the effect of PSM length on synchrony in Fig. 4. I think kymographs of synchrony r as shown in Fig. 2D would also be helpful to show that indeed cells get synchronized while advecting through a longer PSM.

      Response: We thank the reviewer for their suggestion and agree that visualising the data in this way is an excellent idea. We have generated the suggested kymographs and added them to figure 4 as supplements 2 and 4, and discussed these results in the text on page 12.

      • I understand that cells in M phase can interact with neighboring cells with the same coupling strength kappa in the model, although their clocks are arrested. If so, this aspect should be also mentioned in the main text in page 16, as this coupling can be another noise source for synchrony.

      Response: We agree this is an important clarification. We explicitly state this, and briefly justify our choice, in the text on page 16.

      • Figure 5-figure supplement 2: panel labels A, B, C are missing.

      Response: Thank you for bringing this to our attention. These have now been added.

      • Figure 5-figure supplement 3: panel labels A, B, C are missing.

      Response: Thank you for bringing this to our attention. These have now been added.

      • Reviewer #1 (Significance (Required)):

      Synchronization of the segmentation clock has been studied by mathematical modeling, but most previous studies considered cells in a static tissue without morphogenesis. In the previous study by Uriu et al. 2021, morphogenetic processes such as cell advection due to tissue elongation, tissue shortening, and cell mobility were considered in synchronization. The current manuscript provides methodological advances in this aspect by newly including cell ingression, tissue compaction and cell cycle. In addition, the authors bring a concept of modularity and evolvability to the field of the vertebrate segmentation clock, which is new. On the other hand, the manuscript confirms that the synchronization of the segmentation clock is robust by careful simulations, but it does not propose or reveal new mechanisms for making it robust or modular. The main targets of the manuscript will be researchers working on somitogenesis and evolutionary biologists who are interested in evolution of developmental systems. The manuscript will also be of interest to broader audiences, such as developmental biologists, biophysicists, and physicists and computer scientists who are working on dynamical systems.

      Response: We thank the reviewer for their interest in our manuscript and for acknowledging us as one of the first to address the modularity and evolvability of somitogenesis. We hope that this work will encourage others to think about these concepts in this system too. In the original submission, we identified a high enough coupling strength as the main mechanism underlying the identified modularity in somitogenesis. Since then, we have included an analysis of the coupling delay and find that it is the interplay between coupling strength and coupling delay that mediates the identified modularity, allowing PSM morphogenesis and the segmentation clock to evolve independently in regions of parameter space that are constrained and determined by the interplay between these two parameters. We have now added an extra figure (figure 8) where we explore this interplay and have discussed it at length in the last section of the results and in the discussion. We again thank the reviewer for encouraging us to include delays in our analysis.

      • Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      SUMMARY

      The manuscript from Hammond et al., investigates the modularity of the segmentation clock and morphogenesis in early vertebrate development, focusing on how these processes might independently evolve to influence the diversity of segment numbers across vertebrates.

      Methodology | The study uses a previously published computational model, parameterized for zebrafish, to simulate and analyse the interactions between the segmentation clock and the morphogenesis of the pre-somitic mesoderm (PSM). Their model integrates cell advection, motility, compaction, cell division, and the synchronization of the embryo clock. Three alternative scenarios of PSM morphogenesis were modeled to examine how these changes affect the segmentation clock.

      Model System | The computational model system combines a representation of cell movements and the phase oscillator dynamics of the segmentation clock within a three-dimensional horseshoe-shaped domain mimicking the geometry of the vertebrate embryo PSM. The parameters used for the mathematical model are mostly estimated from previously published experimental findings.

      Key Findings and Conclusions | (1) The segmentation clock was found to be broadly robust against variations in morphogenetic processes such as cell ingression and motility; (2) Changes in the length of the PSM and the strength of phase coupling within the clock significantly influenced the system's robustness; (3) The authors conclude that the segmentation clock and PSM morphogenesis exhibited developmental modularity (i.e. relative independence), allowing these two phenomena to evolve independently, and therefore possibly contributing to the diverse segment numbers observed in vertebrates.

      MAJOR COMMENTS

      1. The key conclusion drawn by the authors (that there is robustness, and therefore modularity, between the morphogenetic cellular processes modeled and the embryo clock synchronization) stems directly from the modeling results appropriately presented and discussed in the manuscript. The model comprises some strong assumptions, however all have been clearly explained and the parameterization choices are supported by experimental findings, providing biological meaning to the model. Estimated parameters are well explained and seem reasonable assumptions (from the embryology perspective).

      Response: We thank the reviewer for their positive comments about our work

      1. This study, as is, achieves its proposed goal of evaluating the potential robustness of the embryo clock to changes in (some) morphogenetic processes. The authors do not claim that the model used is complete, and they properly identify some limitations, including the lack of cell-cell interactions. Given the recognized importance of cellular physical interactions for successful embryo development, including them in the model would be a significant addition in future studies.

      Response: We would like to clarify that the model does include cell-cell interactions as cells interact with their neighbours’ clock phase to synchronise and to avoid occupying the same physical space.

      1. The authors have deposited all the code used for analysis in a public GitHub repository that is updated and available for the research community.

      Response: We support open source coding practices.

      1. In page 6, the authors justify their choice of clock parameters for cells ingressing the PSM: "As ingressing cells do not appear to express segmentation clock genes (Mara et al. (2007)), the position at which cells ingress into the PSM can create challenges for clock patterning, as only in the 'off' phase of the clock will ingressing cells be in-phase with their neighbours."

      However, there are several lines of evidence (in chick and mouse) that some oscillatory clock genes are already being expressed as early as in the gastrulation phase (so prior to PSM ingression) (Freitas et al, 2001 [10.1242/dev.128.24.5139]; Jouve et al, 2002 [10.1242/dev.129.5.1107]; Maia-Fernandes et al, 2024 [10.1371/journal.pone.0297853]).

      Question: Is this also true in zebrafish? (I.e. is there any recent experimental evidence that the clock genes are not expressed at ingression, since the paper cited to support this assumption is from 2007). If they are expressed in zebrafish (as they are in mouse and chick), then the cell addition should have random clock gene periods when they enter the PSM and not start all with a constant initial phase of zero. Probably this will not impact the results since the cells will also be out of phase with their neighbours when they "ingress", however, it will model more closely the biological scenario (and avoid such criticism).

      Response: We thank the reviewer for their comments. While it is known that in zebrafish the clock begins oscillating during epiboly and before the onset of segmentation (Riedel-Kruse et al., 2007), to our knowledge no-one has examined whether posteriorly or laterally ingressing progenitor cells express clock genes prior to their ingression into the PSM, which occurs later in development than the first oscillations which give rise to the first somites. We have not found any published evidence of her/hes gene expression in the dorsal donor tissues or lateral tissues surrounding the PSM, however we acknowledge that this has not been actively studied before and our assumption relies on an absence of evidence, rather than evidence of absence.

      However, we agree with the reviewer that one should include such an analysis for completeness, and we have now generated additional simulations where progenitor cells ingress with a random clock phase. This data is presented in figure 2 supplement 1 and mentioned in the main text on page 9.

      MINOR COMMENTS:

      1. The citations are appropriate and cover the major labs that have published work related to this study (although with some overrepresentation of the lab that published the model used).

      Response: We have cited the vast literature on somitogenesis to the best of our ability and do recognise that the work of the Oates lab appears prominently, but this is probably because their experimental data were originally used to parametrise the model in Uriu et al. 2021.

      The text is clear, carefully written, and both the methods and the reasoning behind them are clearly explained and supported by proper citations.

      Response: We are very glad to see that the reviewer found that the manuscript was clearly presented.

      1. The figures are comprehensive, properly annotated, with explanatory self-contained legends. I have no comments regarding the presentation of the results.

      Response: Thank you

      Minor suggestions:

      1. Page 26: In the Cell addition sub-section of the Methods section, correct all instances where the word domain is used, but subdomain should be used (for clarity and coherence with the description of the model, stated as having a single domain comprising 3 subdomains).

      Response: We thank the reviewer for raising this, this is a good point. We have now corrected to ‘subdomain’ where appropriate.

      1. Page 32: Table 1. Parameter values used in our work, unless otherwise stated -> Suggestion: Add a column with the individual citations used for each parameter (to facilitate the confirmation of each corresponding reference).

      Response: Thank you for the suggestion; we have now done this (see table 1, page 36).

      **Referee Cross-commenting**

      I carefully read the reports provided by my fellow reviewers. My cross-comments aim to enhance the collective evaluation of the manuscript by Hammond et al.

      • On reviewer #1's Comments:

      I agree with Reviewer #1's overall evaluation of the manuscript's value and relevance, and with their general comments. I particularly support the suggestion to optionally include coupling delays known to influence the clock's period, as this would improve the model's completeness and benefit the research community. I also view this as an optional but desirable addition, not mandatory.

      Response: As per reviewer #1’s suggestion, we have now included this analysis (figure 8).

      In Fig. 4, I agree that showing kymographs, similar to Fig. 2D, for each PSM length would greatly improve the visualization of the results, given the relevance of this result to the manuscript's main message.

      Response: As per reviewer #1’s suggestion, we have now included such an analysis (figure 4 supplements 2 and 4) and agree with both reviewers that they improve the communication of our results.

      The remaining minor comments are useful and relevant to improving the manuscript.

      • On reviewer #3's Comments:

      Although I agree with Reviewer #3 that the paper is somewhat lengthy, I find the detailed description of the model in its biological context necessary and welcomed by the embryology research community. Without this detail, the paper might be too 'dry' and lose part of its audience. Conversely, focusing mostly on embryology without detailing the model parameters and simulation findings would deprive it of novelty and critical insights.

      Response: We thank Reviewer #2 for this assessment, which we agree with. Nonetheless we have sought to streamline our writing throughout to increase clarity without reducing the content.

      Overall, I find Reviewer #3's suggestions scientifically interesting, particularly comments 3, 4, and 5, which express legitimate questions for future study. However, I find them tangential to the main question addressed in this manuscript, which pertains to the modularity of the segmentation clock and morphogenesis. Therefore, I do not see them as significant improvements for the authors to implement in the current study.

      Response: We thank Reviewer #2 for their comments here and refer them to our responses to Reviewer #3.

      I would like to know how the authors respond to comments 1 and 2, which I do not have the expertise to evaluate.

      Response: We have now addressed these concerns in our response to Reviewer #3. Please see below.

      I agree with comment 6 that a brief mention of the known pathways/gene networks to which the assumptions apply (in zebrafish) would be a good addition. However, I do not think a detailed discussion is needed, as specific genes/networks can be different for different organisms.

      Response: We now justify this assumption in the methods on page 32.

      I disagree with comment 7, as Fig. 3 shows that the clock is robust to changes in cell ingression regime across all cell motility profiles tested. This is an important result for the manuscript's take home message, and should remain in the main text, not as a supplementary figure.

      Response: We agree with Reviewer #2 and have included this in our response to Reviewer #3.

      Finally, regarding Reviewer #3's concern about the incompleteness of the results, I find the results robust given the formalism chosen and within the scenarios where the assumptions hold. I believe this concern applies to the formalism (which is a choice) and not to the quality or relevance of the work presented in the manuscript. Additionally, some of the model's limitations have been adequately addressed by the authors.

      Response: We thank Reviewer #2 for their comments.

      • Reviewer #2 (Significance (Required)): GENERAL ASSESSMENT

      • This study uses a previously published model to simulate alternative scenarios of morphogenetic parameters to infer the potential independence (termed here modularity) between the segmentation clock and a set of morphogenetic processes, arguing that such modularity could allow the evolution of more flexible body plans, therefore partially explaining the variability in the number of segments observed in the vertebrates. This question is fundamental and relevant, yet still poorly researched. This work provides a comprehensive simulation with a model that tries to simplify the many morphogenetic processes described in the literature, reducing them to a few core fundamental processes that allow drawing the conclusions sought. It provides theoretical insight to support a conceptual advance in the field of evolutionary vertebrate embryology.

      ADVANCE

      • This study builds on a model recently published by Uriu et al. (eLife, 2021) that incorporates quantitative experimental data within a modeling framework including cell and tissue-level parameters, allowing the study of multiscale phenomena active during zebrafish embryo segmentation. Uriu's publication reports many relevant and often non-intuitive insights uncovered by the model, most notably the description of phase vortices formed by the synchronizing genetic oscillators interfering with the traveling-wave front pattern.

      However, this model can be further explored to ask additional questions beyond those described in the original paper. A good example is the present study, which uses this mathematical framework to investigate the potential independence between two of the modeled processes, thereby extracting extra knowledge from it. Accordingly, the present study represents a step forward in the direction of using relevant theoretical frameworks to quantitatively explore the landscape of complex molecular hypotheses in silico, and with it shed some light on fundamental open questions or inform the design of future experiments in the lab.

      • The study incorporates a wide range of existing literature on the developmental biology of vertebrates. It comprehensively cites prior work, such as the foundational studies by Cooke and Zeeman on the segmentation clock and the role of FGF signaling in PSM development as discussed by Gomez et al. The literature properly covers the breadth of knowledge in this field.

      AUDIENCE

      • Target audience | This study is relevant for fundamental research in developmental biology, specifically targeting researchers who focus on early embryo development and morphogenesis from both experimental and theoretical perspectives. It is also relevant for evolutionary biologists investigating the genetic factors that influence vertebrate evolution, as well as to computational biologists and bioinformatics researchers studying developmental processes and embryology.

      Developmental researchers studying the segmentation clock in other vertebrate model organisms (namely mouse and chick), will find this publication especially valuable since it provides insights that can help them formulate new hypotheses to elucidate the molecular mechanisms of the clock (for example finding a set of evolutionarily divergent genes that might interfere with PSM length). Additionally, this study provides a set of cellular parameters that have yet to be measured in mouse and chick, therefore guiding the design of future experiments to measure them, allowing the simulation of the same model with sets of parameters from different vertebrate model organisms, therefore testing the robustness of the findings reported for zebrafish.

      MY EXPERTISE

      My areas of research (relevant for this study): Vertebrate embryo clock oscillations in Gallus gallus; Computational biology.

      I can evaluate the relevance and validity of the model, critically evaluate its outputs and parameters, and the significance of the model assumptions for drawing relevant biological insights; however, I am not an expert on this mathematical formalism.

      • Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Verd and colleagues explored how various biologically relevant factors influence the robustness of clock dynamics synchronization among neighboring cells within the context of somitogenesis, adapting a mathematical model presented by Uriu et al. in 2021 in a similar context. Specifically they show that clock dynamics is robust to different biological mechanisms such as cell infusion, cellular motility, compaction-extension and cell-division. On the other hand, the length of Presomitic Mesoderm (PSM) and density of cells in it has a significant role in the robustness of clock dynamics. While the manuscript is well-written and provides clear descriptions of methods and technical details, it tends to be somewhat lengthy. Below are the comments I would like the authors to address:

      1. The authors mention that "...the model is three dimensional and so can quantitatively recapture the rates of cell mixing that we observe in the PSM". I am not convinced by this justification for using a 3D model. None of the effects the authors explore in this manuscript requires a three-dimensional model or a full physical description of the cellular mechanics such as excluded volume interaction etc. A one-dimensional model characterized by cell position along the arclength of PSM and somatic region and segmentation clock phase θ can incorporate all the physics the authors described in this manuscript, while being significantly cheaper computationally, allowing the authors to explore the effect of different parameters in greater detail.

      Response: One of the main objectives of the work we present in this manuscript is to assess how the evolution of PSM morphogenesis affects, or does not affect, segment patterning. The PSM is a three-dimensional tissue with differing cell rearrangement dynamics along its anterior-posterior axis. In addition, PSM dimension, density, the rearrangement rate, and patterns of cell ingression all vary across vertebrate species, and they are functional, especially cell mixing as it promotes synchronisation and drives elongation. In order to answer questions on the modularity of somitogenesis we therefore consider it absolutely necessary to include a three-dimensional representation of the PSM that captures single cells and their movements. In addition, this will allow us, as Reviewer #2 also pointed out, to reparametrize our model using species-specific data as it becomes available.

      While the reviewer is right in that lower dimensional representations would be computationally more efficient, and are generally more tractable, it would not be possible to represent cell mixing in one dimension, as this happens in three dimensions. One could perhaps encode the synchrony-promoting effect of cell mixing via some coupling function κ(x) that increases towards the posterior, however it is unclear what existing biological data one could use to parameterise this function or determine its form. Cell mixing can be modelled in a two-dimensional framework, however this cannot quantitatively recapture the rate of cell mixing observed in vivo, which is an advantage of this model.

      Furthermore, it is unclear how one would simulate processes such as compaction-extension using a one-dimensional model. The two different scenarios of cell ingression which we consider can also not be replicated in a one-dimensional model, as having a population of cells re-acquiring synchrony on the dorsal surface of the tissue while new material is added to the ventral side, creating asynchrony, is qualitatively different than a one-dimensional scenario where cells are introduced continuously along the spatial axis.

      I am not sure about the justification for limiting the quantification of phase synchrony in a very limited (one cell diameter wide) region at one end of the somatic part (Page 33 below Fig. 9). From my understanding of the manuscript, the segments appear in significant length anterior to this region. Wouldn't an ensemble average of multiple such one cell diameter wide regions in the somatic region be a more accurate metric for quantifying synchrony?

      Response: Indeed, such a metric (e.g. as that used by Uriu et al. to quantify synchrony along the x-axis) would be more accurate for determining synchrony within the PSM. However, as per the clock and wavefront model of somitogenesis, only synchrony at the very anterior of the PSM (or at the wavefront, equivalently) is functional for somitogenesis and thus evolution. Therefore, we restrict our analysis to the anterior-most region of the PSM. We now further justify this in the main text on page 9.
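
      The restriction to a one-cell-diameter anterior strip can be sketched as follows. This is an illustration only: it assumes that synchrony r is the Kuramoto order parameter (a common choice for phase-oscillator models; the manuscript's exact definition is not reproduced here) and that the anterior boundary lies at x = x_a with the PSM extending towards larger x. The function names and the (x, phase) representation of a cell are hypothetical.

```python
import cmath
import math


def order_parameter(phases):
    """Kuramoto order parameter r in [0, 1] for a list of phases in radians.

    r = 1 means all phases coincide; r near 0 means they are spread out.
    """
    if not phases:
        raise ValueError("need at least one phase")
    return abs(sum(cmath.exp(1j * theta) for theta in phases) / len(phases))


def anterior_synchrony(cells, x_a, cell_diameter):
    """Synchrony of cells within one cell diameter of the anterior boundary.

    cells is an iterable of (x, phase) pairs; x_a and cell_diameter are
    assumed to share the same length unit as x.
    """
    selected = [theta for x, theta in cells if x_a <= x <= x_a + cell_diameter]
    return order_parameter(selected)


# Hypothetical cells: two in-phase cells at the anterior, one out-of-phase cell further back.
cells = [(0.0, 0.2), (0.5, 0.2), (5.0, 0.2 + math.pi)]
print(anterior_synchrony(cells, x_a=0.0, cell_diameter=1.0))
```

      Averaging this quantity over several adjacent strips, as the reviewer suggests, would only require calling `anterior_synchrony` with shifted values of `x_a`.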

      While studying the effect of cellular ingression, the authors study three discrete modes (random, DP and DP+LV) and show that in the DP+LV mode the clock synchrony becomes affected. I would like the authors to explore this in a continuous fashion from a pure DP ingression to pure LV ingression and intermediates.

      Response: We thank the reviewer for this suggestion; this is a very interesting question. We are currently working on a related computational and experimental project to address the question of how PSM morphogenesis can change over evolutionary time to evolve the different modes that we see across species. As part of this work, we are running precisely the simulations suggested by the reviewer to find regions of parameter space in which all the relevant morphogenetic processes can freely evolve. While interesting, this work is however outside the scope of the current manuscript.

      While studying the effect of length and density of cells in PSM on cellular synchrony, the authors restrict to 3 values of density and 6 values of PSM length keeping the other parameter constant. I would be interested to see a phase diagram similar to Fig. 7 in the two-dimensional parameter space of L and ρ0. I am curious if a scaling relation exists for the parameter values that partition the parameter space with and without synchrony.

      Response: We thank the reviewer for their suggestion and agree that this would constitute an interesting addition to the manuscript. We have now generated these data, which are shown in figure 4 supplement 5 and mentioned on page 13. We see no clear relationship between these two variables when co-varying in the presence of random ingression.

      Both in the abstract and introduction, the authors discuss at length the variability in the number of segments. I am curious how the number and width of the segments observed depend on different parameters related to cellular mechanics and the segmentation clock?

      Response: We thank the reviewer for this question. It was not clear to us if this was something the reviewer wants us to address in the study’s background and introduction, or an analysis we should include in the results. Therefore, we have responded to both comprehensively below:

      The prevailing conceptual framework for understanding this is the clock and wavefront model (Cooke and Zeeman, 1976), which posits that the somite length is inversely proportional to the frequency of the clock relative to the speed of the wavefront, and that the total number of segments is the relative frequency multiplied by the total duration of somitogenesis.
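
      In symbols, this relation can be written as below. The notation is introduced here for illustration only and is not taken from the manuscript: $S$ is somite length, $v$ the wavefront speed, $f$ the clock frequency (assumed constant), $\tau$ the total duration of somitogenesis, and $N$ the number of segments.

      $$S = \frac{v}{f}, \qquad N = f\,\tau$$

      For example, with purely illustrative round numbers $v = 1.5\ \mu\text{m/min}$, $f = 1/30\ \text{min}^{-1}$ and $\tau = 900\ \text{min}$, this gives $S = 45\ \mu\text{m}$ and $N = 30$.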

      Experimentally we know that the frequency is determined in part by the coupling strength (Liao, Jorg, and Oates, 2016), and from comparative embryological studies (Gomez et al., 2008; Steventon et al., 2016) we know that changes in the elongation dynamics of the PSM correlate with changes in somite number, presumably by altering the total duration of somitogenesis (Gomez et al., 2009). These changes in elongation are thought to be driven by the changes in cell and tissue mechanics we test in our manuscript.

      Within our model, we cannot in general predict how the number of segments responds to changes in either clock parameters or cell mechanical parameters, as we lack understanding of what causes somitogenesis to cease; this is thus not encoded in our model and segmentation can in principle proceed indefinitely. Therefore, we have not performed this analysis.

      Similarly, we have not included an analysis of somite length. This is for two reasons: 1) as per the clock and wavefront model, the frequency at the PSM anterior (which we analyse) is equivalent to this measurement, as we assume (in general) the wavefront ($x = x_{a}$) is inertial. 2) the length of the nascent somite is not thought to be of much relevance to the adult phenotype, and by extension evolution. Somites undergo cell division and growth soon after their patterning by the segmentation clock, therefore their final size does not majorly depend on the dynamics of the segmentation clock. Rather, the main function of the clock is to control their number (and polarity).

      The authors assume that the phase dynamics of the chemical network may be described by an oscillator with constant frequency. For the completeness of the manuscript, the author should discuss in detail, for which chemical networks this is a good assumption.

      Response: We thank the reviewer for their suggestion and now justify this assumption in the methods on page 31.

      Such an assumption is appropriate for the segmentation clock, as the clock in the posterior of the PSM is thought to oscillate with a constant frequency, at least for the majority of somitogenesis, although the frequency of somite formation slows towards the end of this process in zebrafish (Giudicelli et al., 2007, PLoS Biol.). In addition, PSM cells isolated and cultured in the presence of FGF (thus replicating the signalling environment of the posterior PSM) will continue to exhibit her1 oscillations with an apparently constant frequency (Webb et al., 2016).

      We note that such formulations are widely used within the segmentation clock literature (e.g. Riedel-Kruse et al., 2007, Morelli et al., 2009).
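
      A minimal sketch of such a formulation is given below. It is deliberately simplified and is not the model used in the manuscript: it assumes identical oscillators with a constant intrinsic frequency `omega`, all-to-all (mean-field) sine coupling of strength `kappa`, no noise, no delay and no cell movement, integrated with a forward Euler step. All names and parameter values are hypothetical and in arbitrary units.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Kuramoto order parameter r in [0, 1]."""
    return abs(sum(cmath.exp(1j * th) for th in phases) / len(phases))


def simulate(phases, omega, kappa, dt, steps):
    """Forward-Euler integration of
    d(theta_i)/dt = omega + kappa * mean_j sin(theta_j - theta_i).
    Phases are not wrapped to [0, 2*pi), so they grow steadily with time.
    """
    phases = list(phases)
    for _ in range(steps):
        z = sum(cmath.exp(1j * th) for th in phases) / len(phases)
        # Im(z * exp(-i*theta_i)) equals the mean over j of sin(theta_j - theta_i).
        phases = [
            th + dt * (omega + kappa * (z * cmath.exp(-1j * th)).imag)
            for th in phases
        ]
    return phases


rng = random.Random(0)
# Initial phases scattered over half a cycle (an arbitrary illustrative choice).
initial = [rng.uniform(0.0, math.pi) for _ in range(20)]
final = simulate(initial, omega=1.0, kappa=1.0, dt=0.01, steps=5000)
print("synchrony before:", order_parameter(initial))
print("synchrony after: ", order_parameter(final))
```

      With the coupling switched off (`kappa=0`), each phase simply advances at the constant rate `omega`, which is the assumption discussed above.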

      Figure 3 and the associated text shows no effect of the cellular motility profile in the synchrony of the segmentation clock. This may be moved to the supplementary considering the length of this manuscript.

      Response: Thank you for the suggestion. However, we would argue that the lack of effect is a crucial result when discussing modularity. Reviewer #2 agrees with this assessment.

      • Reviewer #3 (Significance (Required)):

      The manuscript answers some important questions in the synchrony of segmentation clock in the vertebrates utilizing a model published earlier. However, the presented result is incomplete in some aspects (points 2 to 5 of section A) and that could be overcome by a more detailed analysis using a simpler one-dimensional model (point 1 of section A). I believe this manuscript could be of interest to an intersecting audience of developmental biologists, systems biologists, and physicists/engineers interested in dynamical systems.

      My research interests are building physics and engineering based models of cell and tissue scale biological phenomena.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, Hammond et al. study robustness of the vertebrate segmentation clock against morphogenetic processes such as cell ingression, cell movement and cell division to ask whether the segmentation clock and morphogenesis are modular or not. The modularity of these two would be important for evolvability of the segmenting system. The authors adopt a previously proposed 3D model of the presomitic mesoderm (Uriu et al. 2021 eLife) and include new elements; different types of cell ingression, tissue compaction and cell cycles. Based on the results of numerical simulations that synchrony of the segmentation clock is robust, the authors conclude that there is a modularity in the segmentation clock and morphogenetic processes.

      The presented results support the conclusion. The manuscript is clearly written. I have several comments that could help the authors further strengthen their arguments.

      Major comment:

      [Optional] In both the current model and Uriu et al. 2021, coupling delay in phase oscillator model is not considered. Given that several previous studies (e.g. Lewis 2003, Herrgen et al. 2010, Yoshioka-Kobayashi et al. 2020) suggested the presence of coupling delays in Delta-Notch signaling, could the authors analyze the effect of coupling delay on robustness of the segmentation clock against morphogenetic processes?

      Minor comments:

      • PSM radius and oscillation synchrony are both denoted by the same alphabet r. The authors should use different alphabets for these two to avoid confusion.
      • page 5 Figure 1 caption: (x-x_a/L) should be (x-x_a)/L.
      • Figure 3C: Description of black crosses in the panels is required in the figure legend.
      • Figure 3C another comment: In this panel, synchrony r at the anterior PSM is shown. It is true that synchrony at anterior PSM is most relevant for normal segment formation. However, in this case, the mobility profile is changed, so it may be appropriate to show how synchrony at mid and posterior PSM would depend on changes in mobility profile. Is synchrony improved by cell mobility at the region where cell ingression happens?
      • In page 12, the authors state that "the results for the DP and DP+LV cases are exactly equal for L = 185 um, as .... and the two ingression methods are numerically equivalent in the model". I understood that in this case two ingression methods are equivalent, but I do not understand why the results are "exactly" equal, given the presence of stochasticity in the model.
      • The authors analyze the effect of cell density on oscillation synchrony in Fig. 4 and they mention that higher density increases robustness of the clock by increasing the average number of interacting neighbors. I think it would be helpful to plot the average number of neighboring cells in simulations as a function of density to quantitatively support the claim.
      • The authors analyze the effect of PSM length on synchrony in Fig. 4. I think kymographs of synchrony r as shown in Fig. 2D would also be helpful to show that indeed cells get synchronized while advecting through a longer PSM.
      • I understand that cells in M phase can interact with neighboring cells with the same coupling strength kappa in the model, although their clocks are arrested. If so, this aspect should be also mentioned in the main text in page 16, as this coupling can be another noise source for synchrony.
      • Figure 5-figure supplement 2: panel labels A, B, C are missing.
      • Figure 5-figure supplement 3: panel labels A, B, C are missing.

      Significance

      Synchronization of the segmentation clock has been studied by mathematical modeling, but most previous studies considered cells in a static tissue without morphogenesis. In the previous study by Uriu et al. 2021, morphogenetic processes such as cell advection due to tissue elongation, tissue shortening, and cell mobility were considered in synchronization. The current manuscript provides methodological advances in this aspect by newly including cell ingression, tissue compaction and cell cycle. In addition, the authors bring a concept of modularity and evolvability to the field of the vertebrate segmentation clock, which is new. On the other hand, the manuscript confirms that the synchronization of the segmentation clock is robust by careful simulations, but it does not propose or reveal new mechanisms for making it robust or modular. The main targets of the manuscript will be researchers working on somitogenesis and evolutionary biologists who are interested in evolution of developmental systems. The manuscript will also be of interest to broader audiences, like developmental biologists, biophysicists, and physicists and computer scientists who are working on dynamical systems.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, Huang et al used SMRT sequencing to identify methylated nucleotides (6mA, 4mC, and 5mC) in Pseudomonas syringae genome. They show that the most abundant modification is 6mA and they identify the enzymes required for this modification as when they mutate HsdMSR they observe a decrease of 6mA. Interestingly, the mutant also displays phenotypes of change in pathogenicity, biofilm formation, and translation activity due to a change in gene expression likely linked to the loss of 6mA. Overall, the paper represents an interesting set of new data that can bring forward the field of DNA modification in bacteria.

      Thank you for your valuable feedback on our paper exploring the impact of 6mA modification in P. syringae.

      Major Concerns:

      Most of the authors' data concern Psph pathovar. I am not sure that the authors' conclusions are supported by the two other pathovars they used in the initial 2 figures. If the authors want to broaden their conclusions to Pseudomonas syringae and not restrict it to Psph, the authors should have stronger methylation data using replicates. Additionally, they should discuss why Pss is so different from Pst and Psph. Could they do a blot to confirm it is really the case and not a sequencing artifact? Is the change of methylation during bacterial growth conserved between the pathovars? The authors should obtain mutants in the other pathovars to see if they have the same phenotype. The authors have a nice set of data concerning Psph but the broadening of the results to other pathovars requires further investigation.

      We appreciate the reviewer’s insightful comments. While the majority of our data focuses on the Psph, we recognize the importance of validating these findings in Pss and Pst. To this end, we have performed additional experiments using dot blot and mutant construction to enhance our conclusions in other pathovars.

      We agree that we should discuss why Pss is different from Psph and Pst. We performed a dot blot assay using genomic DNA from Pss and Pst, presented in Figure S5A. Meanwhile, we compared the 6mA modification level of Pss and Pst in different growth phases. As shown in Figure S5A, the change of methylation during bacterial growth is conserved in Pst. The change was not obvious in Pss, which might be due to the lack of a type I R-M system.

      “In accordance with previous studies showing that growth conditions affect the bacterial methylation status, we applied dot blot experiments using the same amount of DNA (1 μg) from these three P. syringae strains to detect the 6mA levels during both logarithmic and stationary phases. The results revealed that 6mA levels in the stationary phase were much higher compared to the logarithmic phase in Psph and Pst, but no significant change in Pss. Additionally, we found that during the stationary phase, 6mA methylation levels in Psph and Pst were higher than those in Pss. These findings were consistent with the MTases prediction on these three strains, since Pss does not harbor any type I R-M systems, which are important for 6mA modification in bacteria.”

      Please see Figure S5A and Lines 220-228 in the revised manuscript.

      We also tried to construct an HsdM mutant in Pst to explore whether the influence of 6mA methylation was conserved in P. syringae, but it failed after multiple attempts. We did not construct a Pss mutant because no type I R-M system was predicted, and few methylation sites were identified via SMRT-seq in this strain. Therefore, we overexpressed HsdM in Pst instead. We have performed additional experiments in WT and the HsdM overexpression strains, including dot blot and growth curve assays.

      Please see Figures S5B-C and Lines 228-232 in the revised manuscript.

      The authors should include proper statistical analysis of their data. A lot of terms are descriptive but not supported by a deeper analysis to sustain the conclusions. For example, in Figure 4E, we do not know if the overlap is significant or not. Are DEGs more overlapping to 6mA sites than non-DEGs? Here is a non-exhaustive list of terms that need to be supported by statistics: different level (L145), greater conservation (L162), significant conservation (L165), considerable similarity (L175), credible motifs (L189), Less strong (L277) and several "lower" and "higher" throughout the text.

      Thank you for the insightful feedback. We have made the following revisions in the manuscript to ensure that the terms are more precise and do not require statistical significance testing.

      (1) Statistical analysis: We have added statistical tests for the overlap between DEGs and 6mA sites in Figure 4E. We performed the Fisher test and found that the overlap was not significant (p > 0.05); neither DEGs nor non-DEGs overlapped significantly with 6mA sites. Please see Figure 4E and Lines 261-262.

      “Less strong” was used to indicate the influence of HsdM on biofilm in Figure 5D. All Figures with “*” labels were analyzed using Student's two-tailed t-tests with a significant change (p < 0.05).

      (2) Revised wording: For terms used to describe comparisons, we have revised the wording to be clearer and ensure that the terminology used did not imply the need for statistical significance testing when not required. For example:

      “Different level” has been removed. Please see Lines 143-144.

      “Greater conservation” has been revised to “more conserved functional terms”. Please see Lines 161-162.

      “Significant conservation” has been revised to “notable conservation”. Please see Line 165.

      “Credible motifs” has been revised to “identified motifs”. Please see Line 186.
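
      The kind of overlap test mentioned in point (1) can be sketched as below. This is an assumption-laden illustration: the response does not state whether the test was one- or two-sided, so a one-sided (enrichment) Fisher exact test is shown, computed from the hypergeometric distribution. The function name is hypothetical and the gene counts are invented for illustration; they are not the study's data.

```python
from math import comb


def overlap_p_value(total_genes, methylated, degs, both):
    """One-sided Fisher exact test for enrichment.

    Returns P(overlap >= both) when `degs` genes are drawn at random from
    `total_genes` genes, of which `methylated` carry a 6mA site.
    """
    upper = min(methylated, degs)
    denominator = comb(total_genes, degs)
    numerator = sum(
        comb(methylated, k) * comb(total_genes - methylated, degs - k)
        for k in range(both, upper + 1)
    )
    return numerator / denominator


# Invented counts: 5000 genes, 1200 with a 6mA site, 300 DEGs, 75 in both sets.
p = overlap_p_value(total_genes=5000, methylated=1200, degs=300, both=75)
print(f"one-sided p-value: {p:.3f}")
```

      A p-value above 0.05 from such a test would correspond to the "not significant" overlap reported above.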

      The authors performed SMRT sequencing of the delta hsdMSR showing a reduction of 6mA. Could they include a description of their results similar to Figures 1-2. How reduced is the 6mA level? Is it everywhere in the genome? Does it affect other methylation marks? This analysis would strengthen their conclusions.

      Yes, we agree. We have provided additional analysis and descriptions to strengthen the conclusions regarding these valuable comments. We determined three methylation sites in the HsdMSR mutant strain and compared the overlapped genes within these modification patterns. Specifically, we focused on the 6mA sites in Psph WT, HsdMSR mutant, and HsdM motif CAGCN<sub>(6)</sub>CTC. As expected, we found that almost all of the 6mA sites lost in the ΔhsdMSR strain were from motif CAGCN<sub>(6)</sub>CTC. We also noticed that 5mC and 4mC sites in the mutant were relatively similar to those in WT, and the slight difference might be caused by sequencing errors. Overall, we propose that HsdMSR only catalyzes the 6mA located on the motif CAGCN<sub>(6)</sub>CTC, but does not affect other 6mA sites and other modification types.

      Please see Figures S4D-E and Lines 212-218 in the revised manuscript.

      In Figure 6E to conclude that methylation is required on both strands, the authors are missing the control CAGCN6CGC construct otherwise the effect could be linked to the A on the complementary strand.

      Thank you for your valuable suggestions. We have provided the control result on the complementary strand. Please see Figure 6C. The new result evidences the conclusion that 6mA methylation regulates gene transcription based on methylation on both strands.

      Please see Figure 6C and Lines 329-330 in the revised manuscript.

      Reviewer #2 (Public Review):

      In the present manuscript, Huang et al. revealed the significant roles of the DNA methylome in regulating virulence and metabolism within Pseudomonas syringae, with a particular focus on the HsdMSR system in this model strain. The authors used SMRT-seq to profile the DNA methylation patterns (6mA, 5mC, and 4mC) in three P. syringae strains (Psph, Pss, and Psa) and displayed the conservation among them. They further identified the type I restriction-modification system (HsdMSR) in P. syringae, including its specific motif sequence. The HsdMSR participated in the process of metabolism and virulence (T3SS & Biofilm formation), as demonstrated through RNA-seq analyses. Additionally, the authors revealed the mechanisms of the transcriptional regulation by 6mA. Strictly from the point of view of the interest of the question and the work carried out, this is a worthy and timely study that uses third-generation sequencing technology to characterize the DNA methylation in P. syringae. The experimental approaches were solid, and the results obtained were interesting and provided new information on how epigenetics influences the transcription in P. syringae. The conclusions of this paper are mostly well supported by data, but some aspects of data analysis and discussion need to be clarified and extended.

      Thank you for your positive feedback and recognition of the importance of our study. We appreciate the suggestions for further clarification and extension of some aspects of data analysis and discussion. We added further analysis of the SMRT-seq result of the ΔhsdMSR and overexpressed HsdM in Pst to provide more information on conservation. We added these contents to the discussion in the revised manuscript. Please see Figure 6C and Figure S5.

      Reviewer #3 (Public Review):

      Summary:

      The article by Huang et al. presents an in-depth study on the role of DNA methylation in regulating virulence and metabolism in Pseudomonas syringae, a model phytopathogenic bacterium. This comprehensive research utilized single-molecule real-time (SMRT) sequencing to profile the DNA methylation landscape across three model pathovars of P. syringae, identifying significant epigenetic mechanisms through the Type-I restriction-modification system (HsdMSR), which includes a conserved sequence motif associated with N6-methyladenine (6mA). The study provides novel insights into the epigenetic mechanisms of P. syringae, expanding the understanding of bacterial pathogenicity and adaptation. The use of SMRT sequencing for methylome profiling, coupled with transcriptomic analysis and in vivo validation, establishes a robust evidence base for the findings.

      Strengths:

      The results are presented clearly, with well-organized figures and tables that effectively illustrate the study's findings.

      Weaknesses:

      It would be helpful to add more details, especially in the methods, which make it easy to evaluate and enhance the manuscript's reproducibility.

      Thank you for the positive evaluation of our study, as well as the constructive feedback provided. We have added more details in methods for RNA-seq analysis and Ribo-seq analysis. Please see Lines 484-515.

      “Briefly, bacteria were cultured to an OD<sub>600</sub> of 0.4, at which point chloramphenicol was added to a final concentration of 100 µg/mL for 2 minutes. Cells were then pelleted and washed with pre-chilled lysis buffer [25 mM Tris-HCl, pH 8.0; 25 mM NH4Cl; 10 mM MgOAc; 0.8% Triton X-100; 100 U/mL RNase-free DNase I; 0.3 U/mL Superase-In; 1.55 mM chloramphenicol; and 17 mM GMPPNP]. The pellet was resuspended in lysis buffer, followed by three freeze-thaw cycles using liquid nitrogen. Sodium deoxycholate was then added to a final concentration of 0.3% before centrifugation. The resulting supernatant was adjusted to 25 A260 units and mixed with 2 mL of 500 mM CaCl<sub>2</sub> and 12 µL MNase, making up a total volume of 200 µL. After the digestion, the reaction was quenched with 2.5 mL of 500 mM EGTA. Monosomes were isolated using Sephacryl S400 MicroSpin columns, and RNA was purified using the miRNeasy Mini Kit (Qiagen). rRNA was removed using the NEBNext rRNA Depletion Kit, and the final library was constructed with the NEBNext Small RNA Library Prep Kit. For each sample, ribosome footprint reads were mapped to the Psph 1448A reference genome, and the translational efficiency was calculated by dividing the normalized Ribo-seq counts by the normalized RNA counts. Two biological replicates were performed for all experiments.”
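The translational-efficiency calculation described at the end of this protocol can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the response does not state which normalization was used, so counts-per-million (CPM) is assumed here, and the gene names and read counts are invented for the example.

```python
from typing import Dict, Optional


def cpm(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts to counts per million (assumed normalization)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no mapped reads")
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def translational_efficiency(
    ribo: Dict[str, int], rna: Dict[str, int]
) -> Dict[str, Optional[float]]:
    """TE = normalized Ribo-seq counts / normalized RNA-seq counts, per gene.

    Genes with zero RNA signal get None instead of a division by zero.
    """
    ribo_cpm = cpm(ribo)
    rna_cpm = cpm(rna)
    te: Dict[str, Optional[float]] = {}
    for gene in ribo_cpm.keys() & rna_cpm.keys():
        te[gene] = ribo_cpm[gene] / rna_cpm[gene] if rna_cpm[gene] > 0 else None
    return te


# Hypothetical footprint and mRNA counts for three genes.
ribo_counts = {"geneA": 300, "geneB": 100, "geneC": 600}
rna_counts = {"geneA": 100, "geneB": 100, "geneC": 800}
te = translational_efficiency(ribo_counts, rna_counts)
```

With two biological replicates, as in the protocol, one reasonable choice would be to compute TE per replicate and then average, but that step is likewise not specified in the response.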

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      I would recommend the authors limit their manuscript to Psph pathovar and include statistical analysis supporting their conclusions.

      Thank you for your suggestion.

      Minor

      • L104: "significantly" please add a p-value and explain the analysis.

      Sorry for the confusion. We have added the p-value and explained the analysis in the method section. The p-value used for SMRT-seq was the modification quality value (QV) score, which was used to call the modified bases A (QV=50) and C (QV=100). Please see Lines 452-454.

      • Figures 1B, D, F, and Figure 2A: make the Venn diagram to scale

      Yes, we have revised.

      • L110-111: missing p-value to say that the authors observe a bigger overlap in Pst than Psph as they observed more modified sites in Pst

      Sorry for the confusion. We said it had a bigger overlap in Pst because the number 17.7 was bigger than the number of 15 in Psph. To avoid misunderstanding, we revised the wording to “more genes equipped with all three modification types were detected in Pst than Psph”. Please see Lines 110-111.

      • L112: missing description of their Pss analysis (IPD, sites...)

      We have added the information for Pss in the revised manuscript.

      “Additionally, the methylome atlas of Pss revealed a lower incidence of methylation than those of Psph and Pst, particularly in terms of 6mA modifications, which were only seen in 457 significant 6mA occurrences under the same threshold (IPD > 1.5) and a total of 2,853 and 1,438 methylation sites were detected as 5mC and 4mC, respectively”. Please see Lines 114-116.

      • L118: "modification" to "modified "

      We have revised. Please see Line 119.

      • L120: "modification sites" to "modified nucleotides"

      We have revised. Please see Line 121.

      • L142: correct the title "Methylated genes revealed highly functional conservation among three P. syringae strains" maybe to "Methylated genes are functionally conserved among ..."

      We have revised. Please see Line 142.

      • Figure 2C: not easy to read and interpret

      Sorry for the confusion. Figure 2C revealed the significantly enriched functional pathways in GO and KEGG databases among three P. syringae strains. The specific names of each pathway were listed on the left, and each column with dots indicated the number of genes within one kind of methylation in one of three P. syringae strains. The larger the size, the bigger the number.

      We have revised the legend of Figure 2C. Please see Lines 575-579.

      “The dot plot revealed the significantly enriched functional pathways in GO and KEGG databases among three P. syringae strains. The specific names of each pathway were listed on the left, and each column with dots indicated the number of genes within one kind of methylation in one of three P. syringae strains. The size of the dots indicates the number of related genes.”

      • Figure 6B-C: what is the difference between B 24h and C?

      Figure 6B revealed the expression difference between WT and mutant during 24 hours. Figure 6C only showed a time point in 24 hours. To avoid repetition, we have removed Figure 6C.

      • Figure 6C-D: if the same maybe remove Figure 2C

      We have removed Figure 6D.

      Reviewer #2 (Recommendations For The Authors):

      The manuscript could be improved by addressing the following concerns:

      (1) In line 146: How to understand the percentage conserved in "more than two of the strains"?

      Sorry for the confusion. We intended to indicate the pattern that was conserved in two strains and in three strains. We have revised it to: “Notably, about 25% to 45% of methylated genes were conserved in two and three strains”. Please see Line 145.

      (2) In line 178: Five conserved sequence motifs should be replaced by "Six conserved sequence motifs".

      We have revised. Please see Line 176.

      (3) In Figure 2B, specify the C1, C2 and C3. "m6A" should be replaced by "6mA".

      Yes, we have revised.

      (4) In Figure S2, "m6A" should be replaced by "6mA".

      Yes, we have revised.

      (5) In line 212, please add references for the previous studies showing that growth conditions affect bacterial methylation status.

      Thank you for your suggestion. We have added the relevant references (Gonzalez and Collier, 2013), (Krebes et al., 2014), (Sanchez-Romero and Casadesus, 2020).

      (6) In line 217, "illustrate" should be "illustrated".

      Yes, we have revised. Please see Line 210.

      (7) There are some genes colored in grey, revealing bigger differences between the two strains than those related to ribosomal protein, T3SS, and alginate synthesis in Fig. 4A. Do they have important functional roles as well?

      Thank you for your suggestion. A total of 116 genes showed bigger differences (|Log<sub>2</sub>FC| > 2), excluding genes related to ribosomal protein, T3SS, and alginate synthesis. Among these genes, 31 were annotated as hypothetical proteins and 4 as transcription factors with unknown functions, and the remaining genes mostly encoded metabolism-related enzymes. These enzymes might contribute to the growth defects in ΔhsdMSR. We added this information in the revised manuscript. Please see Lines 249-254.

      (8) The authors should discuss what will be the potential signals or factors that can regulate the activity of HsdMSR. In other words, what can decide the extent of methylation through activating or suppressing the expression of HsdMSR?

      Thank you for your valuable suggestion. We have added this part to the Discussion. Please see Lines 404-415.

      “Apart from the established roles of 6mA and HsdMSR in P. syringae, certain signals or factors may influence HsdMSR expression. For instance, we confirmed that the growth phase affects methylation levels in P. syringae. Previous studies have shown that increased temperatures can reduce methylation levels, as observed in PAO1 (Doberenz et al., 2017). These findings suggest that HsdMSR expression may be responsive to both intrinsic cellular states and extrinsic environmental conditions. To further explore potential upstream TFs regulating the expression of HsdMSR, we searched for TF-binding sites in the HsdMSR promoter using our own databases (Fan et al., 2020; Shao et al., 2021; Sun et al., 2024). As a result, we found three candidate TFs (PSPPH_0061, PSPPH_3268, and PSPPH_3504) that might directly bind and regulate HsdMSR expression. Future studies on these TFs and their interactions with the HsdMSR promoter would help clarify the regulatory network governing HsdMSR activity.”

      Reviewer #3 (Recommendations For The Authors):

      (1) Some figures contain dense information, which may be overwhelming for readers. Streamlining the legend for Figure 1 and resizing the Venn diagrams within it could enhance clarity and visual appeal.

      Thank you for your suggestion. We have scaled all the Venn plots in the revised version.

      (2) Incorporating a discussion about the role of the restriction-modification (RM) system in bacterial defense against phage infection into the discussion section could enrich the manuscript's context and relevance.

      Thank you for your valuable suggestion. We have added this part to the Discussion. Please see Lines 416-427.

      “RM systems are known for their intrinsic role as innate immune systems in anti-phage infection, and are present in around 90% of bacterial genomes (Oliveira et al., 2014). RM systems protect the bacterial host by recognizing and degrading foreign phage DNA via methylation-specific sites and restriction endonucleases (REases) (Loenen et al., 2014). In addition, other phage-resistance systems are similar to RM systems but carry extra genes. One is called the phage growth limitation (Pgl) system, which modifies and cleaves phage DNA. However, Pgl only modifies the phage DNA in the first infection cycle and cleaves phage DNA in subsequent infections, which warns the neighboring cells (Hampton et al., 2020; Hoskisson et al., 2015). To counteract RM and RM-like systems, phages have evolved strategies, including unusual modifications such as hydroxymethylation, glycosylation, and glucosylation. They can also encode their own MTases to protect their DNA or employ strategies to evade restriction systems and other anti-RM defenses (Iida et al., 1987; Murphy et al., 2013; Vasu and Nagaraja, 2013).”

      (3) In line 152: What is the importance of the mentioned example of Cro/CI family TF?

      Thank you for your comments. Cro and CI are important TFs present in phages. The interaction between Cro and CI affects bacterial immunity status in Enterohemorrhagic Escherichia coli (EHEC) strains (Jin et al., 2022). RM systems are known as a kind of phage-defense system, and hence we mentioned this example here. We have added this description in the revised manuscript. Please see Lines 152-154.

      Reference:

      (1) Doberenz, S., Eckweiler, D., Reichert, O., Jensen, V., Bunk, B., Sproer, C., Kordes, A., Frangipani, E., Luong, K., Korlach, J., et al. (2017). Identification of a Pseudomonas aeruginosa PAO1 DNA Methyltransferase, Its Targets, and Physiological Roles. mBio 8. 10.1128/mBio.02312-16.

      (2) Fan, L., Wang, T., Hua, C., Sun, W., Li, X., Grunwald, L., Liu, J., Wu, N., Shao, X., Yin, Y., et al. (2020). A compendium of DNA-binding specificities of transcription factors in Pseudomonas syringae. Nat Commun 11, 4947. 10.1038/s41467-020-18744-7.

      (3) Gonzalez, D., and Collier, J. (2013). DNA methylation by CcrM activates the transcription of two genes required for the division of Caulobacter crescentus. Mol Microbiol 88, 203-218. 10.1111/mmi.12180.

      (4) Hampton, H.G., Watson, B.N., and Fineran, P.C. (2020). The arms race between bacteria and their phage foes. Nature 577, 327-336.

      (5) Hoskisson, P.A., Sumby, P., and Smith, M.C. (2015). The phage growth limitation system in Streptomyces coelicolor A (3) 2 is a toxin/antitoxin system, comprising enzymes with DNA methyltransferase, protein kinase and ATPase activity. Virology 477, 100-109.

      (6) Iida, S., Streiff, M.B., Bickle, T.A., and Arber, W. (1987). Two DNA antirestriction systems of bacteriophage P1, darA, and darB: characterization of darA− phages. Virology 157, 156-166.

      (7) Jin, M., Chen, J., Zhao, X., Hu, G., Wang, H., Liu, Z., and Chen, W.-H. (2022). An engineered λ phage enables enhanced and strain-specific killing of enterohemorrhagic Escherichia coli. Microbiology Spectrum 10, e01271-01222.

      (8) Krebes, J., Morgan, R.D., Bunk, B., Sproer, C., Luong, K., Parusel, R., Anton, B.P., Konig, C., Josenhans, C., Overmann, J., et al. (2014). The complex methylome of the human gastric pathogen Helicobacter pylori. Nucleic Acids Res 42, 2415-2432. 10.1093/nar/gkt1201.

      (9) Loenen, W.A., Dryden, D.T., Raleigh, E.A., Wilson, G.G., and Murray, N.E. (2014). Highlights of the DNA cutters: a short history of the restriction enzymes. Nucleic Acids Res 42, 3-19.

      (10) Murphy, J., Mahony, J., Ainsworth, S., Nauta, A., and van Sinderen, D. (2013). Bacteriophage orphan DNA methyltransferases: insights from their bacterial origin, function, and occurrence. Appl Environ Microb 79, 7547-7555.

      (11) Oliveira, P.H., Touchon, M., and Rocha, E.P. (2014). The interplay of restriction-modification systems with mobile genetic elements and their prokaryotic hosts. Nucleic Acids Res 42, 10618-10631.

      (12) Sanchez-Romero, M.A., and Casadesus, J. (2020). The bacterial epigenome. Nature reviews. Microbiology 18, 7-20. 10.1038/s41579-019-0286-2.

      (13) Shao, X., Tan, M., Xie, Y., Yao, C., Wang, T., Huang, H., Zhang, Y., Ding, Y., Liu, J., Han, L., et al. (2021). Integrated regulatory network in Pseudomonas syringae reveals dynamics of virulence. Cell Rep 34, 108920. 10.1016/j.celrep.2021.108920.

      (14) Sun, Y., Li, J., Huang, J., Li, S., Li, Y., Lu, B., and Deng, X. (2024). Architecture of genome-wide transcriptional regulatory network reveals dynamic functions and evolutionary trajectories in Pseudomonas syringae. bioRxiv, 2024.2001. 2018.576191.

      (15) Vasu, K., and Nagaraja, V. (2013). Diverse functions of restriction-modification systems in addition to cellular defense. Microbiol Mol Biol Rev 77, 53-72. 10.1128/MMBR.00044-12.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review)

      Summary:

      The authors wanted to use AlphaFold-multimer (AFm) predictions to reduce the challenge of physics-based protein-protein docking.

      Strengths:

      They found that two features of AFm predictions are very useful. 1) pLDDT is predictive of flexible residues, which they could target for conformational sampling during docking; 2) the interface-pLDDT score is predictive of the quality of AFm predictions, which allows the authors to decide whether to do local or global docking.

      Weaknesses:

      (1) As admitted by the authors, the AFm predictions for the main dataset are undoubtedly biased because these structures were used for AFm training. Could the authors find a way to assess the extent of this bias?

      Indeed, AFm training included most of the structures in the DB5 benchmark, as many structures (either unbound or bound) were deposited before the training cut-off. One of the challenges of estimating this bias is the limited availability of new structures, both bound and unbound, deposited after the training cut-off. Estimating the extent of training bias is therefore conditional on these factors and difficult. A few studies have attempted to address this bias (Yin et al, 2022, https://doi.org/10.1002/pro.4379).

      In our study, we assess this bias by comparing the AFm structures to the bound and unbound forms and calculating their Ca RMSDs and TM-scores (new addition). We now elaborate in the Results:Dataset curation section and we have added a figure comparing the TM-scores in the supplement.

      We added a clarifying text and a note about the TM-score calculation in the manuscript as follows:

      “Since most of the benchmark targets in DB5.5 were included in AlphaFold training, there would be training bias associated with their predictions (i.e. our measured success rates are an upper bound).”

      “We also calculated the TM-scores of the AFm predicted complex structures with respect to the bound and the unbound crystal structures (Supplementary Figure S2). As TM-scores reflect a global comparison between structures and are less sensitive to local structural deviations, no strong conclusions could be derived. This is in agreement with our intuition that since both unbound and bound states of proteins will share a similar fold, and AlphaFold can predict structures with high TM-scores in most cases, gauging the conformational deviations with TM-scores would be inconclusive.”

      (2) For the CASP15 targets where this bias is absent, the presentation was very brief. In particular, it would be interesting to see how AFm helped with the docking. The authors may even want to do a direct comparison with docking results without the help of AFm.

      Unfortunately, since this was a CASP-CAPRI round, the structures of the unbound antigen and the nanobodies were unavailable. Thus we cannot perform a comparison without using AF2 at all, since we need a structure prediction tool to produce the unbound nanobody and the nanobody-antigen complex template structure to dock. This has been clarified in the main text.

      “Since the nanobody-antigen complexes were CASP targets, we did not have unbound structures, rather only the sequences of individual chains. Therefore, for each target, we employed the AlphaRED strategy as described in Fig 7.”

      Reviewer #1 (Recommendations For The Authors):

      For suggestions for major improvements, see comments under weaknesses. One additional suggestion: the authors found that pLDDT is predictive of flexible residues. Can they try to find AFm features that are predictive of the interface site? Such information may guide their docking to a local site.

      This is a great idea that we and others have been thinking about considerably. Prior work by Burke et al. (Towards a structurally resolved human protein interaction network) examines AlphaFold’s ability to predict PPIs. For high-confidence predicted models of interacting protein complexes, the authors showed that pDockQ correlated reasonably well with correct protein interactions.

      That being said, binding site identification, particularly in a partner-agnostic fashion, i.e. determining binding patches on a given protein, is an area of ongoing research. We hope a future study examines AlphaFold3 or ESM3 specifically for this task.

      “Further, we tested multiple thresholds to estimate the optimum cut-off for distinguishing near-native structures (defined as an interface-RMSD < 4 Å) from the predictions. Figure 3.B summarizes the performance with a confusion matrix for the chosen interface-pLDDT cutoff of 85. 79% of the targets are classified accurately with a precision of 75%, thereby validating the utility of interface-pLDDT as a discriminating metric to rank the docking quality of the AFm complex structure predictions. With AlphaFold3 and ESM3 being released, investigating features that could predict flexible residues or interface site would be valuable, as this information may guide local docking.”

      Minor:

      Page 3, lines 73-77, state how many targets were curated from DB5.5.

      We have now clarified this in the manuscript. All 254 targets were curated from DB5.5 at the time of this benchmark study.

      “For each protein target, we extracted the amino acid sequences from the bound structure and predicted a corresponding three-dimensional complex structure with the ColabFold implementation of the AlphaFold multimer v2.3.0 (released in March 2023) for the 254 benchmark targets from DB5.5.”

      In Figure 1, the color used for medium is too difficult to distinguish from the grey color used for rigid.

      We thank you for this suggestion. We have updated the color to olive. Further, based on Reviewer 2’s suggestions, we have moved this plot to the Supplementary.

      Reviewer #2 (Public Review):

      Summary:

      In short, this paper uses a previously published method, ReplicaDock, to improve predictions from AlphaFold-multimer. The method generated about 25% more acceptable predictions than AFm, but more important is improving an Antibody-antigen set, where more than 50% of the models become improved.

      When looking at the results in more detail, it is clear that for the models where the AFm models are good, the improvement is modest (or not at all). See, for instance, the blue dots in Figure 6. However, in the cases where AFm fails, the improvement is substantial (red dots in Figure 6), but no models reach a very high accuracy (Fnat ~0.5 compared to 0.8 for the good AFm models). So the paper could be summarized by claiming, "We apply ReplicaDock when AFm fails", instead of trying to sell the paper as an utterly novel pipeline. I must also say that I am surprised by the excellent performance of ReplicaDock - it seems to be a significant step ahead of other (not AlphaFold) docking methods, and from reading the original paper, that was unclear. Having a better benchmark of it alone (without AFm) would be very interesting.

      We thank the reviewer for highlighting the performance of ReplicaDock. ReplicaDock alone is benchmarked in the original paper (10.1371/journal.pcbi.1010124), with full details on the 2022 version of DB5.5 in the supplement. Indeed ReplicaDock2 achieves the highest reported success rates on flexible docking targets reported in the literature (until this AlphaRED paper!).

      Regarding this statement about “the paper could be summarized…” it might be helpful to give more context. ReplicaDock is a replica exchange Monte Carlo sampling approach for protein docking that incorporates flexibility in an induced-fit fashion. However, the choice of which backbone residues to move is solely dependent on contacts made during each docking trajectory. In the last section of the ReplicaDock paper, we introduced “Directed Induced-fit” where we biased the backbone sampling only towards those residues where we knew the backbone is flexible (this information is obtained because for the benchmark set, we had both unbound and bound structures and hence could cherry-pick the specific residues which are mobile). We agree with the reviewers that AlphaRED is essentially a derivative of ReplicaDock, however, the two major claims that we make in this paper are:

      (1) AlphaFold pLDDT is an effective predictor of backbone flexibility for practical use in docking.

      (2) We can automate the Directed Induced-fit approach within ReplicaDock by utilizing this pLDDT information per residue for conformational sampling in protein docking; and in doing so, create a pipeline that would allow us to go from sequence-to-structure-to-complex, specifically capturing conformational changes.
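The two claims above can be read as a simple decision rule, sketched below under stated assumptions. The interface-pLDDT cutoff of 85 is the value quoted in this response; the per-residue flexibility cutoff (`flex_cutoff=80.0`) and the function name `plan_docking` are hypothetical placeholders, not values or APIs taken from the response or from ReplicaDock.

```python
from typing import List, Tuple


def plan_docking(
    residue_plddt: List[float],
    interface_plddt: float,
    interface_cutoff: float = 85.0,  # cutoff quoted in the response
    flex_cutoff: float = 80.0,       # hypothetical per-residue cutoff
) -> Tuple[str, List[int]]:
    """Choose a docking mode and the residues to sample flexibly.

    Low interface confidence suggests the predicted binding mode is
    unreliable, so a global search is used; otherwise local refinement.
    Residues with low per-residue pLDDT are flagged as candidates for
    backbone sampling (0-based indices).
    """
    flexible = [i for i, p in enumerate(residue_plddt) if p < flex_cutoff]
    mode = "local" if interface_plddt >= interface_cutoff else "global"
    return mode, flexible


# Toy four-residue example with two low-confidence positions.
mode_low, flex_low = plan_docking([95.0, 60.0, 88.0, 72.0], interface_plddt=70.0)
mode_high, flex_high = plan_docking([95.0, 60.0, 88.0, 72.0], interface_plddt=90.0)
```

The point of the sketch is only that both decisions are driven by AlphaFold confidence values, so no manual choice of mobile residues is needed.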

      To conclude these claims, we pose the following questions in the Introduction:

      “(1) Do the residue-specific estimates from AF/AFm relate to potential metrics demonstrating conformational flexibility?

      (2) Can AF/AFm metrics deduce information about docking accuracy?

      (3) Can we create a docking pipeline for in-silico complex structure prediction incorporating AFm to convert sequence-to-structure-to-docked complexes?”

      This work requires a pipeline, the center of which lies in ReplicaDock as a docking method, but has functionalities that were absent in prior work. The goal is also to develop a one-stop shop without manual intervention (a prerequisite for biasing backbone sampling in ReplicaDock) that could be utilized by structural biologists efficiently.

      We clarify these points in the abstract and main text as follows:

      Abstract: “In this work, we combine AlphaFold as a structural template generator with a physics-based replica exchange docking algorithm to better sample conformational changes.”

      Introduction:

      “The overarching goal is to create a one-stop, fully-automated pipeline for simple, reproducible, and accurate modeling of protein complexes. We investigate the aforementioned questions and create a protocol to resolve AFm failures and capture binding-induced conformational changes. We first assess the utility of AFm confidence metrics to detect conformational flexibility and binding site confidence.”

      These results also highlight several questions I try to describe in the weakness section below. In short, they boil down to the fact that the authors must show how good/bad ReplicaDock is at all targets (not only the ones where AFm fails). In addition, I have several more technical comments.

      Strengths:

      Impressive increase in performance on AB-AG set (although a small set and no proteins).

      We thank the reviewer for their comments.

      Weaknesses:

      The presentation is a bit hard to follow. The authors mix several measures (Fnat, iRMS, RMSDbound, etc). In addition, it is not always clear what is shown. For instance, in Figure 1, is the RMSD calculated for a single chain or the entire protein? I would suggest that the author replace all these measures with two: TM-score when evaluating the quality of a single chain and DockQ when evaluating the results for docking. This would provide a clearer picture of the performance. This applies to most figures and tables.

      We apologize for the lack of clarity owing to different metrics. Irms and fnat are standard performance metrics in the docking field, but we agree that DockQ would be simpler when the details of the other metrics are not required. We have updated Figure 5 and Figure 8 to also show DockQ comparisons.

      Regarding Figure 1, as highlighted in Line 90 of the main text, “Figure 1 shows the Ca-RMSD of all protein partners of the AFm predicted complex structures with respect to the bound and the unbound.” As suggested by the reviewer in their further comments, we have moved this Figure to the Supplementary. We have also included a TM-score comparison in the Supplementary (Supplementary Figure S2) and included clarifying statements in the main text:

      “We also tested TM-scores to measure the structural deviations of the AFm predicted complex structures with respect to the bound and unbound structures (Supplementary Figure S2). However, this metric is not sensitive enough to detect the subtle, local conformational changes upon binding.”

      For instance, Figure 9 could be shown as a distribution of DockQ scores.

      We have now updated Figure 5 to include DockQ scores in Panel D. Since DockQ is a function of iRMSD, fnat and L-RMSD, it shows the cumulative improvement in performance. However, the individual metrics capture nuances that DockQ hides: for example, the protocol may improve iRMSD considerably while the fnat improvement is lacking, which helps indicate whether backbone sampling or side-chain refinement is the challenge. Therefore, we retain the iRMSD and fnat metrics in panels A-C. We have incorporated the DockQ results in the main text as follows:

      “Finally, to evaluate docking success rates, we calculate DockQ for top predictions from AFm and AlphaRED respectively (Figure 5D). AlphaRED demonstrates a success rate (DockQ>0.23) for 63% of the benchmark targets. Particularly for Ab-Ag complexes, AFm predicted acceptable or better quality docked structures in only 20% of the 67 targets. In contrast, the AlphaRED pipeline succeeds in 43% of the targets, a significant improvement.”
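The success-rate criterion quoted above can be sketched as follows. The bin edges (0.23, 0.49, 0.80) are the commonly used DockQ quality boundaries; treating a score exactly on a boundary as belonging to the higher class is an assumption of this sketch, and the scores are invented for illustration.

```python
from typing import List


def capri_class(dockq: float) -> str:
    """Map a DockQ score in [0, 1] to a CAPRI-style quality label."""
    if dockq >= 0.80:
        return "high"
    if dockq >= 0.49:
        return "medium"
    if dockq >= 0.23:
        return "acceptable"
    return "incorrect"


def success_rate(scores: List[float]) -> float:
    """Fraction of targets whose top model is acceptable quality or better."""
    if not scores:
        raise ValueError("no scores given")
    hits = sum(1 for s in scores if capri_class(s) != "incorrect")
    return hits / len(scores)


# Hypothetical DockQ scores for the top-ranked model of five targets.
example_scores = [0.05, 0.30, 0.55, 0.85, 0.10]
labels = [capri_class(s) for s in example_scores]
rate = success_rate(example_scores)
```

Reporting the per-class labels alongside the overall rate keeps the distinction between acceptable and high-quality models visible, which a single success percentage hides.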

      Further, we have reevaluated success rates in Figure 8 (previously Figure 9) and have updated the manuscript to report these updated success rates.

      “By utilizing the AlphaRED strategy, we show that failure cases in AFm predicted models are improved for all targets (lower Irms for 97 of 254 failed targets) with CAPRI acceptable-quality or better models generated for 62% of targets overall (Fig 8)”.

      The improvements on the models where AFm is good are minimal (if at all), and it is unclear how global docking would perform on these targets, nor exactly why the plDDT<0.85 cutoff was chosen.

      We agree with the reviewers that the improvement on the models with good AFm predictions is minimal. We acknowledge this in the text now as follows:

      “Most of the improvements in the success rates are for cases where AFm predictions are worse. For targets with good AFm predictions, AlphaRED refinement results in minimal improvements in docking accuracy.”

      The choice of pLDDT cutoff = 85 is elaborated in the “Interface-pLDDT correlates with DockQ and discriminates poorly docked structures” section, paragraph 3. Briefly, we tested multiple metrics and the interface pLDDT had the highest AUC, indicating that it is the best metric for this task. For interface-pLDDT we tested multiple thresholds, and the cutoff of 85 resulted in the highest percentage of true-positive and true-negative rates. This is illustrated with the confusion matrix in Figure 3.B with the precision scores. We now clarify this in the text as follows:

      “With interface-pLDDT as a discriminating metric, we tested multiple thresholds to estimate the optimum cut-off for distinguishing near-native structures (defined as an interface-RMSD < 4 Å) from the predictions. Figure 3B summarizes the performance with a confusion matrix for the chosen interface-pLDDT cutoff of 85. 79% of the targets are classified accurately with a precision of 75%, thereby validating the utility of interface-pLDDT as a discriminating metric to rank the docking quality of the AFm complex structure predictions.”
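How a single cutoff yields the accuracy and precision figures quoted above can be illustrated with a small sketch. The cutoffs (interface-pLDDT of 85, interface-RMSD below 4 Å) come from the quoted text; the data are invented, and counting a score exactly at the cutoff as a positive prediction is an assumption.

```python
from typing import Dict, List


def confusion_matrix(
    interface_plddt: List[float],
    irmsd: List[float],
    plddt_cutoff: float = 85.0,
    irmsd_cutoff: float = 4.0,
) -> Dict[str, int]:
    """Count TP/FP/TN/FN when interface-pLDDT is used to predict near-native models."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for plddt, rmsd in zip(interface_plddt, irmsd):
        predicted_good = plddt >= plddt_cutoff  # classifier decision
        near_native = rmsd < irmsd_cutoff       # ground truth
        if predicted_good and near_native:
            counts["tp"] += 1
        elif predicted_good:
            counts["fp"] += 1
        elif near_native:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def accuracy(c: Dict[str, int]) -> float:
    return (c["tp"] + c["tn"]) / sum(c.values())


def precision(c: Dict[str, int]) -> float:
    return c["tp"] / (c["tp"] + c["fp"])


# Hypothetical values for eight targets.
plddt_values = [92.0, 88.0, 86.0, 70.0, 60.0, 90.0, 80.0, 50.0]
irmsd_values = [1.2, 2.5, 6.0, 8.0, 12.0, 3.0, 2.0, 15.0]
cm = confusion_matrix(plddt_values, irmsd_values)
```

Scanning `plddt_cutoff` over a range of values and keeping the one with the best true-positive and true-negative rates would mirror the threshold search described in the response.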

      To better understand the performance of ReplicaDock, the authors should therefore (i) run global and local docking on all targets and report the results, (ii) report the results if AlphaFold (not multimer) models of the chains were used as input to ReplicaDock (I would assume it is similar). These models can be downloaded from AlphaFoldDB.

      The performance of ReplicaDock on DB5.5 is tabulated in our prior work (https://doi.org/10.1371/journal.pcbi.1010124) and we direct the reviewers there for the detailed performance and results. In our opinion, the benchmark suggested by the reviewer would be redundant and not worth the computational expense.

      The scope of this paper is to highlight a structure prediction + physics-based modeling pipeline for docking to adapt to the accuracy of up-and-coming structure prediction tools.

      Using AlphaFold monomer chains as input and benchmarking on that, albeit interesting scientifically, will not be useful for either the pipeline or biologists who would want a complex structure prediction. We thank the reviewers for their comments but want to reemphasize that the end goal of this work is to increase the accuracy of complex structure predictions and PPIs obtained from computational tools.

      Further, it would be interesting to see if ReplicaDock could be combined with AFsample (or any other model to generate structural diversity) to improve performance further.

      We would like to highlight that ReplicaDock is a stand-alone tool for protein docking and here we demonstrate the ability of adapting it with metrics derived from AlphaFold or other structure prediction tools (say ESMFold) such as pLDDT for conformational sampling and improving docking accuracy. We definitely agree that adapting it to use with tools such as AFSample will be interesting but it is out of scope of this work.

      The estimates of computing costs for the AFsample are incorrect (check what is presented in their paper). What are the computational costs for ReplicaDock global docking?

      The authors of the AFSample paper report that “AFsample requires more computational time than AF2, as it generates 240 models, and including the extra recycles, the overall timing is 1000 more costly than the baseline.” We have reported these exact numbers in our manuscript.

      The computational costs of ReplicaDock are 8-72 CPU hours on a single node with 24 processors as reported in our prior work.

      For AlphaRED, the costs are slightly higher owing to the structure prediction module in the beginning and are up to 100 CPU hrs for our largest (max Nres) target.

      It is unclear strictly what sequences were used as input to the modelling. The authors should use full-length UniProt sequences if they were not done.

      We report this in the methods section of the manuscript as well as in Figure 5. Full length complex sequences were used for the models that we extracted from DB5.5.

      “As illustrated in Fig. 5, given a sequence of a protein complex, we use the ColabFold implementation of AF2-multimer to obtain a predictive template.”

      We clarify this in the methods section as:

      “For each target in the DB5.5 dataset, we first extracted the corresponding FASTA sequence for the bound complex and then obtained AlphaFold predicted models with the ColabFold v1.5.2 implementation of AlphaFold and AlphaFold-multimer (v.2.3.0).”

      The antibody-antigen dataset is small. It could easily be expanded to thousands of proteins. It would be interesting to know the performance of ReplicaDock on a more extensive set of Antibodies and nanobodies.

      This work demonstrates performance on the docking benchmark, i.e., whether bound complexes can be predicted given the unbound structures. In this regard, our analysis has focused on targets for which both the unbound and bound structures are available, so that we could evaluate the ability of AlphaRED to model protein flexibility and its docking accuracy. For antibody-antigen complexes, there are only 67 targets with both unbound and bound structures available, and these constituted our dataset. Benchmarking AlphaRED on all antibody-antigen targets could give biased results, as most Ab-Ag complexes are in the AlphaFold training set. Further, our work is aimed at predicting conformational flexibility in docking rather than rigid-body docked complexes, so benchmarking on existing bound Ab-Ag structures is outside the scope of this work.

      Using pLDDT on the interface region to identify good/bad models is likely suboptimal. It was acceptable (as a part of the score) for AlphaFold-2.0 (monomer), but AFm behaves differently. Here, AFm provides a direct score to evaluate the quality of the interaction (ipTM or Ranking Confidence). The authors should use these to separate good/bad models (for global/local docking), or at least show that these scores are less good than the one they used.

      We thank the reviewers for this suggestion.

      Reviewer #2 (Recommendations For The Authors):

      Some Figures could be skipped/improved

      Fig 1: Use TM-score instead, which is a much better measure (and the figure is not necessary).

      Figure 1 compares the bias of AlphaFold towards unbound or bound forms of the proteins. We believe that this figure highlights the slight inherent bias of AlphaFold towards bound structures over unbound.

      As the reviewers have suggested we have included a plot comparing the TM-scores for the structures. Further, we have moved this figure to the Supplementary.

      Fig 2. Skip B (why compare RMSD with pLDDT?). Add a figure to see how this correlates over all targets not just two.

      RMSD and LDDT are both metrics for evaluating conformational variability between two structures, such as the bound and unbound forms of the same protein. Whereas RMSD measures the overall deviation of residues, LDDT allows the estimation of relative domain orientations and concerted proteins. We have elaborated on this in the Methods as well as in the Results section titled “AlphaFold pLDDT provides a predictive confidence measure for backbone flexibility”.

      The data for the benchmark targets is now included in the Supplementary (Supplementary Figures S3-S4).
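
      To make the distinction between the two metrics concrete, the sketch below computes an RMSD and a simplified, superposition-free lDDT on toy coordinates. It is an illustration only and not the authors' implementation: the function names `rmsd` and `lddt` are hypothetical, the RMSD assumes the two structures already share a coordinate frame, and the lDDT uses one point per residue with the commonly cited 15 Å inclusion radius and 0.5/1/2/4 Å tolerances rather than the full all-atom, per-residue definition.

```python
import math

def rmsd(a, b):
    """RMSD between two equally sized coordinate lists in a shared frame
    (no superposition is performed here)."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def lddt(ref, model, cutoff=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified global lDDT using one point per residue (e.g. C-alpha).

    For every residue pair closer than `cutoff` in the reference, count how
    many tolerance thresholds the model distance satisfies.
    """
    kept = total = 0
    for i in range(len(ref)):
        for j in range(i + 1, len(ref)):
            d_ref = math.dist(ref[i], ref[j])
            if d_ref >= cutoff:
                continue
            diff = abs(d_ref - math.dist(model[i], model[j]))
            kept += sum(diff < t for t in thresholds)
            total += len(thresholds)
    return kept / total if total else 0.0

# Toy example (hypothetical data): a straight 10-residue chain, and a model
# in which the second half swings by 90 degrees about residue 4 (a "hinge").
ref = [(3.8 * i, 0.0, 0.0) for i in range(10)]
hinge = ref[:5] + [(3.8 * 4, 3.8 * (i - 4), 0.0) for i in range(5, 10)]

print("RMSD:", rmsd(ref, hinge))
print("lDDT:", lddt(ref, hinge))
```

      In the hinge example, distances within each half are preserved, so only pairs spanning the hinge lower the lDDT, whereas every displaced residue in the moved half contributes to the RMSD.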

      Fig 3. Color the different chains of a protein differently. Thereby the Receptor/Ligand/Bound labels can be omitted.

      We thank the reviewers for this suggestion. However, the color scheme was chosen to highlight two things. (1) The orientation of the protein partners relative to each other: we have ensured that the alignment is over one partner (Receptor) so that the relative orientation of the other partner (Ligand) in the modeled protein can be compared with the bound structure (shown in one color). (2) The receptor and ligand chains are colored by pLDDT (from red to blue) to highlight that, for decoys with incorrectly predicted interfaces, the pLDDT scores of the interface residues are indeed lower and can be a discriminating metric. We elaborate on this in the caption of Figure 3 as well as in the section “Interface-pLDDT correlates with DockQ and discriminates poorly docked structures”. Coloring the chains of a protein differently would obscure the point we are aiming to make: readers would need to rely only on the quantitative metrics reported (Irms and DockQ) and would not be able to visualize the interface pLDDT of the incorrectly bound structures. We hope that this justifies our choice of color scheme.

      Fig 4. Include RankConf, ipTM, pDockQ, and other measures in the plots (they are likely better). Include DockQ for the top targets. It is difficult to estimate for multi-chain complexes.

      We thank the reviewer for this suggestion. We have now included the DockQ performances for all targets in Figure 5 (previously Figure 6) as well as re-evaluated our final success rates based on the DockQ calculations in Figure 8 (previously Figure 9).

      Fig 5. use a better measure to split (see above).

      We have elaborated on the choice of the split for the comments above and the interface pLDDT threshold of 85 is a decision made post observation on the docking benchmark. We do want to highlight that the cut-off is arbitrary and in our online server (ROSIE) as well as in custom scripts, this cut-off can be tuned by the user as required. We would suggest a cut-off of 85 based on our observations but the users are welcome to tune this as per their needs.
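
      A tunable cut-off of this kind can be sketched as follows. This is a minimal illustration, not the ROSIE server's or the authors' script: the function name `choose_docking_mode` is hypothetical, and we assume that the interface pLDDT is the mean over interface residues, that values at or above the cut-off lead to local docking, and that values below it lead to global docking.

```python
from statistics import mean

def choose_docking_mode(interface_plddt, cutoff=85.0):
    """Pick a docking mode from per-residue interface pLDDT values.

    Assumption for illustration: mean interface pLDDT >= cutoff means the
    predicted interface is trusted and only local refinement is needed;
    otherwise a global search is run. The default of 85 follows the
    suggested cut-off, and callers can override it.
    """
    if not interface_plddt:
        raise ValueError("no interface residues supplied")
    return "local" if mean(interface_plddt) >= cutoff else "global"

# Hypothetical per-residue values for two predicted interfaces.
print(choose_docking_mode([90.1, 92.4, 88.0]))
print(choose_docking_mode([70.0, 80.0, 95.0]))
print(choose_docking_mode([70.0, 80.0, 95.0], cutoff=80.0))
```

      Exposing the cut-off as a keyword argument mirrors the point made above: 85 is a suggested default, not a fixed property of the method.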

      Fig 6. Replace lrms/fnat with DockQ.

      We have now included DockQ scores in our manuscript.

      Fig 7. Color the different chains of a protein differently.

      We have colored the protein chains differently. AlphaFold models are in Orange, Bound complexes are in Gray, and predicted proteins from AlphaRED are in Blue-Green indicating the two partners. All models are aligned over the receptor so relative orientations of the ligand protein can be observed.

      Fig 8 Color the different chains of a protein differently.

      The chains are colored differently. We would like the reviewer to elaborate more on what they would like to observe as we believe our color scheme makes intuitive sense for readers.

      Fig 9. Use DockQ instead of CAPRI criteria.

      The figure has been updated based on DockQ. To elaborate, the CAPRI criteria are set based on DockQ scores, as described in the figure caption.
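
      For readers unfamiliar with this mapping, the sketch below converts a DockQ score into a CAPRI-style quality class. The thresholds used (0.23, 0.49, 0.80) are the ones commonly quoted for DockQ; we assume, but have not verified, that they match those given in the figure caption. The function name `capri_class` is hypothetical.

```python
def capri_class(dockq):
    """Map a DockQ score in [0, 1] to a CAPRI-style quality class.

    Thresholds are the commonly quoted DockQ cut-offs and are an
    assumption here; the figure caption may state them differently.
    """
    if not 0.0 <= dockq <= 1.0:
        raise ValueError("DockQ must lie in [0, 1]")
    if dockq >= 0.80:
        return "high"
    if dockq >= 0.49:
        return "medium"
    if dockq >= 0.23:
        return "acceptable"
    return "incorrect"

# Hypothetical scores, one per quality class.
for score in (0.10, 0.30, 0.60, 0.85):
    print(score, capri_class(score))
```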

  3. Jan 2025
    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      MacDonald et al. investigated the consequence of double knockout of substance P and CGRPα on pain behaviors using a newly created mouse model. The investigators used two methods to confirm knockout of these neuropeptides: traditional immunolabeling and a neat in vitro assay where sensory neurons from either wildtype or double knockout mice are co-cultured with substance P "sniffer cells", HEK cells stably expressing NKR1 (a substance P receptor), GCaMP6s and Gα15. It should be noted that functional assays confirming CGRPα knockout were not performed. Subsequently, the authors assayed double knockout mice (DKO) and wildtype (WT) mice in numerous behavioral assays using different pain models, including acute pain and itch stimuli, intraplantar injection of Complete Freund's Adjuvant, prostaglandin E2, capsaicin, AITC, oxaliplatin, as well as the spared nerve injury model. Surprisingly, the authors found that pain behaviors did not differ between DKO and WT mice in any of the behavioral assays or pain paradigms. Importantly, female and male mice were included in all analyses. These data are important and significant, as both substance P and CGRPα have been implicated in pain signaling, though the magnitude of the effect of a single knockout of either gene has been variable and/or small between studies.

      The conclusions of the study are largely supported by the data; however, additional experimental controls and analyses would strengthen the authors' claims.

      We thank the reviewer for their insightful comments and have answered them below.

      (1) The authors note that single knockout models of either substance P or CGRPα have produced variable effects on pain behaviors that are study-dependent. Therefore, it would have strengthened the study if the authors included these single knockout strains in a side-by-side analysis (in at least some of the behavioral assays), as has been done in prior studies in the field when using double- or triple-knockout mouse models (for example, see PMID: 33771873). If, in the authors' hands, single knockouts of either peptide also show no significant differences in pain behaviors, then the finding that double knockouts also do not show significant differences would be less surprising.

      In our study, we found no phenotypic differences between WT and DKO mice, suggesting Substance P and CGRPα are largely dispensable for pain behavior. We agree that if we had observed significant changes in behavior, it would have been interesting to examine the effects of knocking out each gene individually to determine which peptide is responsible for the phenotype. However, given that the double deletion had no effect, we can predict that loss of each alone would have no or minor effects. In line with this, a more recent study that comprehensively phenotyped the Calca KO mouse found no deficits in a range of danger-related behaviors (PMID: 34376756). Overall, as we are reporting negative data about the Double KO, we do not believe extensive studies of the single KOs are necessary to support the findings of our paper.

      (2) It is unclear why the authors only show functional validation of substance P knockout using "sniffer" cells, but not CGRPα. Inclusion of this experiment would have added an additional layer of rigor to the study.

      Imaging of CGRPα release is more challenging using the ‘sniffer’ approach because functional CGRP receptors require the expression of two genes: Calcrl (or Calcr) along with Ramp1. We now have succeeded in generating a new stable cell line expressing Calcrl and Ramp1, along with GCaMPs and human Galpha15 and include new data in the revised Figure 1F-H and Figure Supplement 1B. These cells respond robustly to CGRPalpha, but not to SP. In contrast, the existing SP cell line responds to SP but not CGRPalpha. Capsaicin evokes a strong response in these cells in co-culture with DRGs. This response is dramatically reduced in the DKO. This data therefore confirms our mice have a loss of CGRPalpha signaling as indicated by IHC.

      (3) The authors should be a bit more reserved in the claims made in the manuscript. The main claim of the study is that "CGRPα and substance P are not required for pain transmission." However, the authors also note that neuropeptides can have opposing effects that may produce a net effect of no change. In my view, the data presented show that double knockout of substance P and CGRPα do not affect somatic pain behaviors, but do not preclude a role for either of these molecules in pain signaling more generally. Indeed, the authors also note that these neuropeptides could be involved in nociceptor crosstalk with the immune or vascular systems to promote headache. The authors only assayed pain responses to glabrous skin stimulation. How the DKO mice would behave in orofacial pain assays, migraine assays, visceral pain assays, or bone/joint pain assays, for example, was not tested. I do not suggest the authors include these experiments, only that they address the limitations/weaknesses of their study more thoroughly.

      The reviewer makes an important point that we agree with. Our study assesses acute and chronic pain in peptide DKO mice lacking Substance P and CGRPα. Most of our data focuses on the hindpaw as pain in the paw is the gold-standard approach for phenotyping pain targets and numerous well-validated chronic pain models have been developed for this body site. However, to extend the conclusions to other tissues, we did also look at visceral pain and GI distress using acetic acid and LiCl models (Figure 2J and Figure 2 supplement). We agree with the reviewer that given the utility of CGRP monoclonal antibodies, migraine experiments would be interesting for future studies using these mice, a point we highlight in the discussion. Bone/joint pain is also clearly important from a translational perspective, but outside the scope of the current study.

      (4) A more minor but important point: the authors do not describe the nature of the WT animals used. Are they littermates or a separately maintained colony of WT animals? The WT strain background should be included in the methods section.

      The WT strain is C57BL/6J from Jackson Lab. This has been added to the methods.

      Reviewer #2 (Public Review):

      Summary:

      The paper aimed to examine the effect of co-ablating Substance P and CGRPα peptides on pain using Tac1 and Calca double knockout (DKO) mice. The authors observed no significant changes in acute, inflammatory, and neuropathic pain. These results suggest that Substance P and CGRPα peptides do not play a major role in mediating pain in mice. Moreover, they reveal that the lack of behavioral phenotype cannot be explained by the redundancy between the two peptides, which are often co-expressed in the same neuron.

      Strengths:

      The paper uses a straightforward approach to address a significant question in the field. The authors confirm the absence of Substance P and CGRPα peptides at the levels of DRG, spinal cord, and midbrain. Subsequently, they employ a comprehensive battery of behavioral tests to examine pain phenotypes, including acute, inflammatory, and neuropathic pain. Additionally, they evaluate neurogenic inflammation by measuring edema and extravasation, revealing no changes in DKO mice. The data are compelling, and the study's conclusions are well-supported by the results. The manuscript is succinct and well-presented.

      We thank the reviewer for their enthusiasm for the importance of our work.

      Reviewer #3 (Public Review):

      In this study, the authors assessed the role of double global knockout of substance P and CGRPα in the transmission of acute and chronic pain. The authors first generated the double knockout (DKO) mice and validated their animal model. This was followed by a series of acute and chronic pain assessments to evaluate whether these neuropeptides are important in modulating acute and chronic pain behaviors. The authors found that, in these DKO mice, Substance P and CGRPα are not required for the transmission of acute and chronic pain, although both neuropeptides are strongly implicated in chronic pain. This study does provide more insight into the role of these neuropeptides in chronic pain processing; however, more work still needs to be done (see the comments below).

      We thank the reviewer for their detailed and constructive feedback, and below outline the steps we have taken to answer their concerns.

      (1) In assessing the double KO (result #1), why are different regions of the brain shown for substance P and CGRPα (for example, midbrain for substance P and amygdala for CGRPα)? Since the authors mentioned that these peptides are co-expressed in the brain (as in the introduction), shouldn't the same brain regions be shown for both IHC? It would be ideal if the authors could show both regions (midbrain and amygdala) in addition to the DRG and spinal cord for both peptides in their findings. In addition, since this is a double KO, the authors should show more representative IHC-stained brain regions (spanning from the anterior to posterior).

      We could not co-stain both SP and CGRP in the same sections as the DKO mouse has endogenous GFP and RFP fluorescence, limiting us to one channel (far red). Specifically, we use a Calca KO that is a Cre:GFP knock-in/knockout (Chen et al 2018, PMID: 30344042) and a Tac1 KO that is a tagRFP knock-in/knockout (Wu et al 2018, PMID: 29485996). This is why we show different brain sections.

      (2) It is also unclear as to why the authors only assessed the loss of substance P signaling in the double KO mice. Shouldn't the same be done for CGRPα signaling? Either the authors assess this, or the authors have to provide clear explanations as to why only substance P signaling was assessed.

      As noted in our response to Reviewer 1, imaging of CGRP release is more challenging using the ‘sniffer’ approach because functional CGRP receptors require the expression of two genes: Calcrl (or Calcr) along with Ramp1. We have now generated this cell line and performed the experiment (see revised Figure 1 and Figure 1 Supplement).

      (3) Has these animals' naturalistic behavior been assessed after the double KO (food intake, sleep, locomotion for example)? I think this is important as changes to these naturalistic behaviors can affect pain processes or outcomes.

      We agree that assessment of naturalistic behavior including food intake, sleep and locomotion would be interesting to look at in DKO mice. However, our study is focused on acute and chronic pain behavior of these animals, and therefore a comprehensive phenotypic assessment of naturalistic home-cage behavior is outside the scope of our study.

      (4) Figure 2H: The authors acknowledge that there is a trend to decrease with capsaicin-evoked coping-like responses. However, a close look at the graph suggests that the lack of significance could be driven by 1 mouse. Have the authors run an outlier test? Alternatively, the authors should consider adding more n to these experiments to verify their conclusions.

      We were reluctant to add more animals searching for significance. Instead, we investigated the potential phenotype further by looking at cFos staining in the spinal cord and found no differences (Figure 2, supplement 1). This result suggests that loss of the two peptides does not grossly disrupt capsaicin-evoked pain signal transmission between the nociceptor and post-synaptic dorsal horn neurons in the spinal cord.

      (5) Similarly, the values for WT in the evoked cFos activity (Figure 2- Suppl Figure 1) are pretty variable. Considering that the n number is low (n = 5), authors should consider adding more n. Also, since the n number is low in this experiment (eg. 5 vs 4), does this pass the normality test to run a parametric unpaired t-test? Either the authors increase their n numbers or run the appropriate statistical test.

      As described in the statistical tables, the Shapiro-Wilk test indicates these data do pass the normality test. Therefore, we retain the use of the unpaired t test, which demonstrates no significant difference between the groups.

      (6) In most of the results, authors ran a parametric test despite the low n number. Authors have to ensure that they are carrying out the appropriate statistical test for their dataset and n number.

      We now provide a table of the statistical results, which provides detailed information about all statistical tests performed in this study. For experiments where we make a single comparison between the two distributions (WT vs DKO), we have run a Shapiro-Wilk test. Where the data from both groups pass the normality test, we retain the use of the unpaired t test. Where the Shapiro-Wilk test indicates data from either group are unlikely to be normally distributed, we now use a Mann-Whitney U test to compare the groups, as this non-parametric test makes no assumptions about the underlying distribution.

      Many experiments involved two factors (genotype, and e.g. temperature, drug, time-point). These data were analyzed in the original submission using 2-WAY ANOVA or Repeated Measures 2-WAY ANOVA, followed by post-hoc Sidak’s tests to compute p values adjusted for multiple comparisons. Because there is no widely agreed non-parametric alternative to 2-WAY ANOVA that handles two factors and also accounts for multiple comparisons, we retained 2-WAY ANOVA, as is typical in the field for these kinds of experiments. We reasoned that this was the best course of action based on information provided by the statistical software used for this study - https://www.graphpad.com/support/faq/with-two-way-anova-why-doesnt-prism-offer-a-nonparametric-alternative-test-for-normality-test-for-homogeneity-of-variances-test-for-outliers/

      We note that, regardless of the test, our conclusion that there are no major changes in acute or chronic pain behaviors is clear and strongly supported.
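
      For the single-comparison case, the non-parametric branch of the procedure described above can be illustrated with a small, dependency-free sketch of an exact two-sided Mann-Whitney U test. This is not the analysis code used in the study, which was run in statistical software; the function names `midranks` and `mann_whitney_exact` are hypothetical, the Shapiro-Wilk step is not implemented here, and the exhaustive enumeration is only practical for small groups such as those in these experiments.

```python
from itertools import combinations

def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_exact(x, y):
    """Exact two-sided Mann-Whitney U test for two small samples.

    Returns (U for sample x, p-value). The p-value is the fraction of all
    possible group labelings whose U is at least as far from its expected
    value as the observed U.
    """
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))

    def u_stat(idx):
        return sum(ranks[i] for i in idx) - n1 * (n1 + 1) / 2

    u_obs = u_stat(range(n1))
    mean_u = n1 * n2 / 2
    dev_obs = abs(u_obs - mean_u)
    total = extreme = 0
    for idx in combinations(range(n1 + n2), n1):
        total += 1
        if abs(u_stat(idx) - mean_u) >= dev_obs - 1e-12:
            extreme += 1
    return u_obs, extreme / total

# Hypothetical measurements for two small groups (5 vs 4 animals).
wt = [4.1, 5.0, 3.8, 4.6, 5.3]
dko = [4.4, 3.9, 4.9, 4.2]
print(mann_whitney_exact(wt, dko))
```

      Because the test uses only ranks, it makes no assumption about the shape of the underlying distributions, which is why it serves as the fallback when normality is doubtful.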

      (7) Along the same lines as the previous comment, the authors should increase the n number for DKO for staining (Figure 4), as the n number is only 3 and there is variability in the cFos quantification on the ipsilateral side.

      We believe this is not necessary as the finding is clear that there is no difference.

      (8) Authors should provide references for the statement made in Lines 319-321, where they mention that there is accumulating evidence indicating that secretion of these neuropeptides from nociceptor peripheral terminals modulates immune cells and the vasculature in diverse tissues.

      We now provide several references to primary papers and reviews supporting this statement.

      (9) Authors state that the sample size used was similar to those from previous studies, but no references were provided. Also, even though the sample sizes used were similar, I believe that the right statistical test should be used to analyze the data.

      We have now cited several classic studies phenotyping mouse KOs in pain in the methods that used similar sample sizes. As detailed above, we have taken the reviewer’s feedback on board and performed normality testing to ensure the correct statistical test is used for each experiment.

      (10) In the discussion, the authors noted that knocking out of a gene remains the strongest test of whether the molecule is essential for a biological phenomenon. At the same time, it was acknowledged that Substance P infusion into the spinal cord elicits pain, but it is analgesic in the brain. The authors might want to expand more on this discussion, including how we can selectively assess the role of these neuropeptides in areas of interest. For example, knocking out both Substance P and CGRPα in selected areas instead of the global KO since there are reported compensatory effects.

      This is highlighted in the closing paragraph: “Emerging approaches to image and manipulate these molecules (Girven et al., 2022; Kim et al., 2023), as well as advances in quantitating pain behaviors (Bohic et al., 2023; MacDonald and Chesler, 2023), may ultimately reveal the fundamental roles of neuropeptides in generating our experience of pain.” The Kim preprint (now published, and so the citation has been updated in the text) describes a method of inactivating neuropeptide transmission in select brain regions in a cell-type specific manner.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      I do not have any major comments. My minor comments are as follows:

      (1) What was the control group for all behavioral studies? Was it WT from an independent colony or one of the littermates was used for generating controls?

      We used C57/Bl6 mice from Jax. This is now mentioned in methods.

      (2) In Fig. 2H, it seems that the effect will become significant if several mice are added.

      We are reluctant to add mice searching for significance. Sample sizes were determined before we collected the data blind.

      (3) There is no figure 3, but two figures 4.

      Thank you. This has been corrected.

      (4) Multiple typos in the legend for figure 4 (lines 234-254). Line 242 (& n=8 (3M, 3F)), line 243 (swelling and plasma), line 252 ((n=8 for) & n=6 for DKO (4M, 4F)).

      Thank you. This has been corrected.

      (5) In Figure 4 (lines 273-285), the contralateral side is mentioned in B but no images are shown.

      Thank you. We removed the mention.

      (6) Although ligand knockouts cannot be compared directly with receptor inhibition, the readers could benefit from discussing studies of receptor ablation and/or pharmacological inhibition.

      We do discuss the classic studies of receptor KO, and the clinical data on receptor blockers here –

      “However, selective antagonists of the Substance P receptor NKR1 failed to relieve chronic pain in human clinical trials (Hill, 2000). Although CGRP monoclonal antibodies and receptor blockers have proven effective for subsets of migraine patients, their usefulness for other types of pain in humans is unclear (De Matteis et al., 2020; Jin et al., 2018). In line with this, knockout mice deficient in Substance P, CGRPα or their receptors have been reported to display some pain deficits, but the analgesic effects are neither large nor consistent between studies (Cao et al., 1998; De Felipe et al., 1998; Guo et al., 2012; Salmon et al., 2001, 1999; Zimmer et al., 1998).” 

      Reviewer #3 (Recommendations For The Authors):

      Minor comments:

      (1) Figure 1E: What does chambers mean? Additionally, are the 12 chambers equally from the male and female samples (6 from male and 6 from female)?

      We have changed this to well. Each replicate is an individual well from an 8-well chamber slide. In all these experiments, the wells are approximately evenly distributed by mouse, because from each mouse we cultured around 8 wells’ worth of DRGs.

      (2) Figure 1D: What does low and high mean in the Hargreaves test?

      These refer to a low and a high active intensity of the radiant heat stimulus: 40 and 55, respectively, in the intensity units used by the instrument. The values are now described in the methods.

      (3) Figure 2-Suppl Figure 1: Authors should provide a bigger version of the image so that it is clearer to the readers.

      We think the image is of a reasonable size and comparable to the images used elsewhere in the paper.

      (4) Authors should consider labeling their supplementary figures in running numbers or combining supplementary figures together to avoid confusion. For example, Figure 2-Supplementary Figure 1 and Figure 2- Supplementary Figure 2 can be combined as just Supplementary Figure 2.

      We agree with the reviewer this would be clearer, but we have followed eLife’s convention for labelling and numbering supplements.

      (5) Figure 3 is mislabeled as Figure 4.

      Thank you. We have corrected this.

      (6) Only female mice were used in the CFA experiment, which does not go in line with the rest of the results which consist of both sexes.

      We have repeated the experiment with additional male mice. To be consistent with the von Frey data, these were followed for 7 days, and so the figure now shows a 7-day time course.

      (7) Typo in line 243. The word "and" is subscript.

      Thank you. We have corrected this.

      (8) There is a typo in the legend for Figure 4 where E is labeled I, G is labeled as F, and J is labeled as J.

      Thank you. We have corrected this.

      (9) Authors should specify what "several weeks" means (Line 263).

      It means three weeks. We tested to 21 days. We will replace with three.

      (10) Authors should specify what "one day" means (Line 267). For example, how many days after the intraplantar oxaliplatin treatment? Also, authors should justify why that specific time point was selected or have a reference for it.

      This means one day after, i.e., 24 hours. Please see PMID: 33693512. Two references are provided in the methods.

      (11) Figure 4 legend: authors should again be specific on what "prolonged" entails (Line 277).

      We have replaced prolonged with 30 minutes brushing. Specifically, 3 x 10 min stim period, with 1 min rest between stim. It is in the methods.

      (12) In the methods section, authors state that both male and female mice were used for all experiments. However, only female mice were used in the CFA experiment (see minor comment #6). Authors should verify and correct this.

      This is correct. We only used female mice for one of the groups. We have since repeated with males, now included in the data.

      (13) Authors should be more specific in the methods section on how long the habituation lasted per day, how many days it covered, and what the mice were habituated to (experimenter, room, chamber, etc).

      As noted in the methods, mice are habituated for at least an hour to the chambers, and thus implicitly to the room. We do not perform explicit habituation to the investigator such as repeated handling.

      (14) Authors need to provide more information on the semi-automated procedure they are referring to in Line 397. Also, authors should also provide the criteria for cFos quantification (eg. Intensity, etc). If this has been published before, they should provide the reference.

      We have added this. We used the ‘Find maxima’ and ‘Analyze particles’ functions in FIJI, followed by a manual curation step.

      (15) How much acetone was applied and how was it applied to the paw? (Line 495)

      We used the same applicator (1ml syringe with a well at the top) to generate a droplet of acetone that was used for all mice. This has been added to methods.

      (16) Authors should specify the amount of capsaicin injected in μl (Line 500).

      20 ul. We have added this.

      (17) Authors should explain or reference why they are analyzing the 15 min interval between 5 and 20 minutes for injection (Lines 507-508).

      Acetic acid behaviour lasts around 30 mins in our hands. We chose the 15 minute interval because it reduces burdensome hand scoring time by 50% versus doing the whole 30 mins. We reasoned that in the first 5 mins post injection the animal behaviour may be contaminated by stress related to handling, injection and return to chamber. Thus, 5 and 20 minutes provided a sensible time-frame for scoring the behavior when it is at its peak.

      (18) Authors have to provide more information/explanation on how they decide on the conditioned taste aversion protocol. Like why they do 30 mins exposure to a single water-containing bottle followed 90 mins exposure to both bottles. If this has been published before, they should provide the reference.

      We read dozens of different published protocols in the literature, and piloted one that was something of an amalgam of some of them with various adaptations of convenience. Because it worked on our first attempt, we stuck to it. The advantage of the CTA assay is it is incredibly robust to changes in the specificities of the paradigm, evincing the clear survival value of learning to avoid tastes that make you sick.

      (19) Authors again should provide more detail in their methods section.

      a. Specify the time frame that they are assessing here (Line 533).

      This can be seen in the Figure. 0 to 120 mins. We have added it to the methods.

      b. How long were the mice allowed to recover post-SNI before mechanical allodynia was assessed (Line 545)?

      This is apparent in the figures. 2 days to 21 days. We have added it to the methods.

      c. How much of the oxaliplatin was injected into the mice?

      40 μg / 40 μl (see PMID:33693512)

      Editor's note: Reviewers agreed that addressing the concerns about power, outliers, and statistics, as well as functional validation of CGRPα would raise the strength of evidence to compelling, and inclusion of comparison to single KO would raise it to exceptional.

      Should you choose to revise your manuscript, please check to ensure full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, Satouh et al. report giant organelle complexes in oocytes and early embryos. Although these structures have often been observed in oocytes and early embryos, their exact nature has not been characterized. The authors named these structures "endosomal-lysosomal organelles form assembly structures (ELYSAs)". ELYSAs contain organelles such as endosomes, lysosomes, and probably autophagic structures. ELYSAs are initially formed in the perinuclear region and then migrate to the periphery in an actin-dependent manner. When ELYSAs are disassembled after the 2-cell stage, the V-ATPase V1 subunit is recruited to make lysosomes more acidic and active. The ELYSAs are most likely the same as the "endolysosomal vesicular assemblies (ELVAs)", reported by Elvan Böke's group earlier this year (Zaffagnini et al. doi.org/10.1016/j.cell.2024.01.031). However, it is clear that Satouh et al. identified and characterized these structures independently. These two studies could be complementary. Although the nature of the present study is generally descriptive, this paper provides valuable information about these giant structures. The data are mostly convincing, and only some minor modifications are needed for clarification and further explanation to fully understand the results.

      Reviewer #2 (Public Review):

      Satouh et al report the presence of spherical structures composed of endosomes, lysosomes, and autophagosomes within immature mouse oocytes. These endolysosomal compartments have been named as Endosomal-LYSosomal organellar Assembly (ELYSA). ELYSAs increase in size as the oocytes undergo maturation. ELYSAs are distributed throughout the oocyte cytoplasm of GV stage immature oocytes but these structures become mostly cortical in the mature oocytes. Interestingly, they tend to avoid the region which contains metaphase II spindle and chromosomes. They show that the endolysosomal compartments in oocytes are less acidic and therefore non-degradative but their pH decreases and becomes degradative as the ELYSAs begin to disassemble in the embryos post-fertilization. This manuscript shows that lysosomal switching does not happen during oocyte development, and the formation of ELYSAs prevents lysosomes from being activated. Structures similar to these ELYSAs have been previously described in mouse oocytes (Zaffagnini et al, 2024) and these vesicular assemblies are important for sequestering protein aggregates in the oocytes but facilitate proteolysis after fertilization. The current manuscript, however, provides further details of endolysosomal disassembly post-fertilization. Specifically, the V1-subunit of V-ATPase targeting the ELYSAs increases the acidity of lysosomal compartments in the embryos. This is a well-conducted study and their model is supported by experimental evidence and data analyses.

      Reviewer #3 (Public Review):

      Fertilization converts a cell defined as an egg to a cell defined as an embryo. An essential component of this switch in cell fate is the degradation (autophagy) of cellular elements that serve a function in the development of the egg but could impede the development of the embryo. Here, the authors have focused on the behavior during the egg-to-embryo transition of endosomes and lysosomes, which are cytoplasmic structures that mediate autophagy. By carefully mapping and tracking the intracellular location of well-established marker proteins, the authors show that in oocytes endosomes and lysosomes aggregate into giant structures that they term Endosomal LYSosomal organellar Assembl[ies] (ELYSA). Both the size distribution of the ELYSAs and their position within the cell change during oocyte meiotic maturation and after fertilization. Notably, during maturation, there is a net actin-dependent movement towards the periphery of the oocyte. By the late 2-cell stage, the ELYSAs are beginning to disintegrate. At this stage, the endo-lysosomes become acidified, likely reflecting the activation of their function to degrade cellular components.

      This is a carefully performed and quantified study. The fluorescent images obtained using well-known markers, using both antibodies and tagged proteins, support the interpretations, and the quantification method is sophisticated and clearly explained. Notably, this type of quantification of confocal z-stack images is rarely performed and so represents a real strength of the study. It provides sound support for the conclusions regarding changes in the size and position of the ELYSAs. Another strength is the use of multiple markers, including those that indicate the activity state of the endo-lysosomes. Altogether, the manuscript provides convincing evidence for the existence of ELYSAs and also for regulated changes in their location and properties during oocyte maturation and the first few embryonic cell cycles following fertilization.

      At present, precisely how the changes in the location and properties of the ELYSAs affect the function of the endo-lysosomal system is not known. While the authors' proposal that they are stored in an inactive state is plausible, it remains speculative. Nonetheless, this study lays the foundation for future work to address this question.

      Minor point: l. 299. If I am not mistaken, there is a typo. It should read that the inhibitors of actin polymerization prevent redistribution from the cytoplasm to the cortex during maturation.

      Minor point: A few statements in the Introduction would benefit from clarification. These are noted in the comments to the authors.

      We sincerely appreciate the editorial board of eLife and the reviewers for their helpful and constructive comments on our manuscript. We are pleased that the reviewers acknowledged that we identified and characterized this assembly structure independently. In the revised manuscript, we have carefully considered the reviewers’ comments and conducted additional analysis to address each of them.

      Regarding the typographical errors, we revised the description to fit with our findings and the reviewers’ comments. We also found that the primer sequence was correct, and we carefully checked the accuracy of the entire manuscript.

      We hope that the revised version will now be deemed suitable for publication in eLife.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Q. 1) The authors state in the Abstract that ELYSAs contain autophagosome-like membranes in the outer layer. However, this seems to be just speculation based on the LC3 staining results and is not directly shown. Are there autophagosome-like double membrane structures in ELYSAs?

      We appreciate this comment. We also agree with this concern; however, it was difficult to assert that they are autophagosomes based on the observation of the electron micrographs. For this reason, we rephrased it to be “Most ELYSAs are also positive for an autophagy regulator, LC3.” (line 33). In addition, we revised the notation to LC3-positive structures in the Results and Discussion sections (lines 165–169 and 286).

      Q. 2) The data in Figure 2A, showing a decrease in the number of LAMP1 structures, seems to contradict the data in Figure 1B, showing an apparent increase in LAMP1 structures. Please explain this discrepancy. If the authors did not count structures just below the plasma membrane, please explain the rationale for this.

      We really appreciate the valuable comment. Regarding the number of LAMP1-positive structures, it is not suitable for comparison with Figure 1B, etc., as pointed out by the reviewer, since the distribution of the LAMP1 signal differs from plane to plane. To avoid any potential confusion, we added new images of the Z-projection of the immunostained images that can better reflect the number of positive structures in the whole oocyte/embryo in Figure 2.

      In addition, as the reviewer pointed out, there is a technical difficulty in measuring the LAMP1-positive signal on the plasma membrane or just below it. We explained how and why we had to delete plasma membrane signals in our response #21.

      Q. 3) The actin dependence is not observed in Figure 5C. What is the difference between Figure 5C and 5E? Please explain further.

      We apologize for the lack of clarity; Figures 5C and 5E show the average number of LAMP1-positive structures (5C) and the percentage of the sum of granule volumes in LAMP1 positive structure (5E), respectively, after classifying the LAMP1 positive granules by their diameters.

      We removed Figure 5E for the sake of conciseness since we already mentioned a similar fact in Figure 5C. To clarify the corresponding explanations, we moved figures that were not classified by diameter to Supplementary Figure 8 to improve readability. Moreover, we have rewritten the main text on lines 200–211.

      Q. 4) While the actin inhibitors reduce the number of peripheral LAMP1 structures (Figure 5F), they do not affect their number in the central region (Figure 5G). How can the authors conclude that actin inhibitors inhibit the migration of LAMP1 structures?

      We appreciate the comment. As pointed out, the number of large LAMP1-positive structures in the medial region did not change. Therefore, we have avoided the description that ELYSAs migrate from the middle region to the cell periphery and have unified the description of whether large structures in the periphery occur. Please refer to the subsection title (line 188), the following descriptions (lines 189–199), the related description in the Results (lines 200–211), and the title and the legend of Figure 5.

      Q. 5) The authors show that the V1A subunit associates with the surface of LAMP1 structures as punctate structures (Figure 6B). What are these V1A-positive structures? Is V1A recruited to some specific domains of ELYSAs, or are V1A-positive active lysosomes recruited to ELYSAs? Please provide an interpretation of these data. The phrase "The V1-subunit of V-ATPase is targeted to these structures" (line 262) is not appropriate because it is indistinguishable whether only the V1 subunits are recruited or active lysosomes containing the V1 subunit are recruited.

      Thank you for the valuable comment. Indeed, our analysis, including the analysis of Fig. 8 described on line 262, did not clarify whether free V1A-mCherry molecules accessed the ELYSA periphery or whether lysosomes with V1A-mCherry molecules newly merged into the ELYSA. Therefore, we added this interpretation to lines 232–234 of the Results and revised the Discussion as "The number of membrane structures positive for V1A-mCherry increase upon ELYSA disassembly, indicating further acidification of the endosomal/lysosomal compartment" (lines 292–294).

      Q. 6) Why did the authors use LysoSensor as a marker for ELYSA instead of LAMP1 in Figure 8 and 9? Some reasons should be given.

      There is a clear technical reason for this: when LAMP1-EGFP was expressed in a zygote, it largely migrated to the plasma membrane before and after the 2-cell stage, making it difficult to capture the change of ELYSAs. To circumvent this difficulty, we used LysoSensor to visualize ELYSAs instead of LAMP1-EGFP. This explanation was added to lines 258–260.

      Q. 7) In Figure 9A, it is not clear whether the activity of LysoSensor-positive structures is lower at this stage compared to other stages. It may be shown in Figure S7, but the data are not clearly visible. A direct comparison would be ideal.

      A new analysis similar to that shown in Fig. 9 for early 2-cells and 4-cells was performed and added to Figure S7. To support direct comparison, the ranges of axes were set to be similar.

      As a result, the quantified MagicRed signal on the isolated LysoSensor-positive punctate structures in MII oocytes was nearly the same as that in early 2-cells and 4-cells. In early 2-cells, LysoSensor gave a signal at the cellular boundary, where MagicRed staining was not observed, confirming that MagicRed activity is higher in the interior than in the cell periphery in post-fertilization embryos. We have included an additional description in the main text (lines 280–282).

      Q. 8) In the phrase "pregnant mare serum gonadotropin or an anti-inhibin antibody" (line 382), is "or" correct?

      When inducing superovulation, an anti-inhibin antibody (distributed as CARD HyperOva) can be used as a substitute for PMSG (after additional stimulation with hCG), which results in the production of eggs of similar quality to those obtained with PMSG. This was used in most experiments. To amend the lack of clarity, a reference (Takeo and Nakagata, PLoS One, 2015) was added to the description of HyperOva (line 417).

      Q. 9) In almost all graphs, please indicate what the X-axis is indicating (not just "number") so that readers can understand what number is being represented without reading the legends.

      We revised the axis titles in all figures.

      Q. 10) Since grayscale images provide better contrast than color images, it is recommended that single-color images be shown in grayscale.

      We replaced all single-color images with grayscale images.

      Reviewer #2 (Recommendations For The Authors):

      Specific comments:

      Q. 11) Figure 1 and S1- Both Rab5 and Rab7 co-localize with LAMP1. However, there seems to be a lot of LAMP1-free Rab5 dots as compared to the Lamp1-free Rab7. As a result, LAMP1 and Rab7 are co-localized more frequently than LAMP1 and Rab5 (video1). Could it be that early endosomes (Rab5+) are yet to be incorporated into ELYSAs? If so, a brief discussion of this phenomenon would be nice.

      Thank you very much for the comment. We agree with the reviewer’s interpretation. In accordance with this suggestion, we clearly stated in the main text: “Although small punctate structures that are RAB5-positive but LAMP1-negative also spread over the cytosol, most giant structures were positive for RAB5 and LAMP1 (Video 1)” (lines 91–93). In the Discussion section, a brief statement was included: “Considering the large number of RAB5-positive and LAMP1-negative punctate structures in MII oocytes, these layers may also reflect the assembly mechanism of the ELYSA” (lines 318–320).

      Q. 12) Video 3 (and Figure 6) clearly shows the dynamics of LAMP1-labelled vesicles during maturation, which is impressive. In contrast to the live cell imaging after LAMP1 mRNA injection, Figure 1 used anti-LAMP1 Ab to detect endogenous levels of LAMP1. It appears that mRNA microinjection causes LAMP1 overexpression causing more (but smaller) vesicles to form. It should be easy to quantify and compare the vesicles in Figure 1 and 6.

      We appreciate the comment. As mentioned, injections of EGFP-LAMP1 mRNA are useful for the visualization of LAMP1 dynamics during the maturation phase from GV to MII by live cell imaging, which is not feasible with immunostaining. However, the fluorescence emitted by EGFP-LAMP1 is only a few tenths of that of antibody staining, and because of the technical difficulty of microinjection into GV oocytes, only about one in ten oocytes gave a signal-to-noise ratio sufficient for imaging. In addition, live cell imaging of oocytes in Figure 6 had to be carried out with very low excitation light exposure to reduce the toxicity. It was also performed with a low magnification lens and a longer step size in the z-axis. For these reasons, in examining the point raised, we performed an additional 3D object analysis, in the same way as in Figure 2, on the data of IVM oocytes injected with EGFP-LAMP1 mRNA using the same lens as in Figure 1 and with a longer exposure time than in live imaging. The results were compared with the MII data of Figures 1 and 2.

      As a result, as shown in the new Figure S8, more objects with a diameter of 0.2–0.4 µm were found than in the immunostaining data, which fits the reviewer’s point. In addition, the counts were lower for the 0.6–1.0 µm diameter, but there was no significant difference in the number of larger LAMP1 positive structures corresponding to the ELYSA size. We consider that this was appropriate for the original purpose of characterizing the ELYSA formation process. A description of these points has been added to lines 221–225.

      Q. 13) In Figure 4A and B- Seems like not all LAMP1-positive structures were LC3-positive. Is there any size or location within the oocyte that determines LC3 positivity?

      We appreciate the valuable comment. To answer this comment, we proceeded with a new 3D object-based co-localization analysis on LAMP1 and LC3, determined the number, volume, and distribution within the oocyte, and incorporated the results as Supplementary Figure 6. To examine the positivity, we further analyzed the percentage of double-positive structures among all the LAMP1-positive structures. The results showed that their average diameter significantly shifted from 2.36 µm (GV) to 3.78 µm (MII). Moreover, it was clearly indicated that LAMP1-positive structures smaller than 2 µm in diameter are rarely positive for LC3. In terms of location, measuring the distance of the double-positive structures from the oocyte center (the cellular geometric center) indicated that they tend to be observed at the periphery at both oocyte stages (more than 80% were located > 30 µm from the center in the MII oocyte). Of note, no clear tendency of double positivity was observed. A description of these points has been added to lines 174–186.
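
      The size dependence described above can be illustrated with a minimal sketch. It assumes that each LAMP1-positive object is summarized by its volume and by a flag for LC3 overlap, and that "diameter" means the diameter of a sphere of equal volume; neither assumption is stated in the response. The function names, the input format, and the toy values are hypothetical and are not data from the study.

```python
import math


def equivalent_diameter_um(volume_um3):
    """Diameter (um) of a sphere with the given volume (um^3).

    Assumption for this sketch: object size is reported as an
    equivalent-sphere diameter, d = (6V / pi)^(1/3).
    """
    return (6.0 * volume_um3 / math.pi) ** (1.0 / 3.0)


def double_positive_fraction_by_size(objects, threshold_um=2.0):
    """Fraction of LAMP1-positive objects that are also LC3-positive,
    reported separately below and at/above a diameter threshold.

    objects: iterable of (volume_um3, lc3_positive) tuples
             (hypothetical input format).
    """
    counts = {"small": [0, 0], "large": [0, 0]}  # [double-positive, total]
    for volume_um3, lc3_positive in objects:
        size = "small" if equivalent_diameter_um(volume_um3) < threshold_um else "large"
        counts[size][1] += 1
        if lc3_positive:
            counts[size][0] += 1
    # None marks an empty size class, where no fraction can be computed.
    return {
        size: (positive / total if total else None)
        for size, (positive, total) in counts.items()
    }


# Toy values for illustration only; not measurements from the study.
toy_objects = [
    (0.5, False), (1.0, False), (3.0, False),    # diameters below 2 um
    (14.0, True), (28.0, True), (20.0, False),   # diameters above 2 um
]
fractions = double_positive_fraction_by_size(toy_objects)
```

      Splitting at a fixed diameter keeps the comparison independent of how many small objects a given oocyte happens to contain, which matters because small objects far outnumber large ones.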

      Q. 14) In discussion, line 256- Small ELYSAs are formed in GV oocytes. Since you haven't checked the smaller-sized, growing oocytes, I suggest rephrasing this sentence as 'are present' rather than 'are formed'.

      We agree with the reviewer’s suggestion and changed it to "present" (line 287).

      Q. 15) Line 188- ELISA should instead be ELYSA

      Thank you for pointing this out. We have found a few more typographical errors, and all of them have been corrected (lines 213 and 321).

      Reviewer #3 (Recommendations For The Authors):

      Q. 16) Line 42: What do you mean by 'zygotic gene expression following the degradation of the cellular components of each maternal and paternal gamete'? ZGA requires this degradation? Please provide supporting references from the literature.

      We apologize for the confusing wording. We meant to say that both ZGA and degradation of parental components are required. To avoid misunderstanding, we have revised the text to “zygotic gene expression as well as the degradation of the cellular components of each maternal and paternal gamete” and inserted a new reference (line 44).

      Q. 17) 50: MII means metaphase II, not meiosis II.

      We corrected the clerical mistake (line 50).

      Q. 18) 51: Define LC3.

      We added the definition of LC3 (lines 51–52).

      Q. 19) 60: 'lysosomal activity in oocytes is upregulated by sperm-derived factors as the oocytes grow and mature'. As written, the sentence implies that oocytes grow and mature after fertilization. This may be true for maturation, but I would be surprised to learn that there is growth of the oocyte after fertilization.

      We appreciate this valuable comment.

      C. elegans lives mainly as a hermaphrodite, which contains a pair of U-shaped gonad arms, including the ovary, spermatheca, and uterus, in the body. Oocytes grow in the ovary, mature upon receiving major sperm proteins secreted from sperm, and are ovulated to the spermatheca for fertilization. In 2017, Kenyon’s group reported that major sperm proteins act as sperm-secreted hormones to upregulate the lysosomal activity in oocytes during oocyte growth and maturation. To avoid misunderstanding, we have revised our manuscript to 'lysosomal activity in oocytes is upregulated by major sperm proteins secreted from sperms as the oocytes grow and mature' (lines 61–66).

      Q. 20) 94 and Figure 1B: While it is clear that many LAMP1 foci at the late 2-cell stage do not also contain RAB5, it seems that the majority of RAB5 loci also stain for LAMP1. This may be a minor point in the context of the paper but could be clarified.

      We could not easily agree with the suggestion because of the possibility that the images might give different impressions on each plane. Therefore, as a way to verify this point, we attempted to quantify the co-localization by reconstructing the 3D puncta information based on the two types of antibody staining data. Unfortunately, as shown in Fig. 1AB, Rab5 had a high cytoplasmic background, and although we were able to extract peaks, we could not reliably recalibrate the three-dimensional punctate structure (please refer to the new Supplementary Fig. 6). Therefore, co-localization on each other's punctate structure (LAMP1/RAB5 vs. RAB5/LAMP1) could not be verified. The validation using specific planes also showed large differences between planes, with overlapping punctate structures counted separately in adjacent planes, making reliable quantification difficult. This is an issue that will be addressed in the future.

      On the other hand, the newly added Z-projection figure (Fig. 1AB) shows that RAB5-positive and LAMP1-negative punctate structures tend to accumulate along the LAMP1-positive punctate structures larger than 1 µm at the late 2-cell stage in all observed embryos; we added this statement on lines 99–101.

      Q. 21) 100-102 and Figure 2A: Does the decrease in the total number of LAMP1 foci refer just to cytoplasmic or also to membrane foci? If the former, what was the reason for not including the membrane in the analysis?

      We appreciate the critical question. The LAMP1 signal on the plasma membrane interfered with the measurement of the signals just below the plasma membrane. This increased signal on the plasma membrane, shown in Fig. 2E, seemed to be caused by the migration of the LAMP1 signals post-fertilization, which was also reported in a previous paper by Zaffagnini et al. (2024), published in Cell.

      Because oocytes are giant cells, confocal imaging has a technical limitation in obtaining the same fluorescence intensity along the z-axis. However, 3D-object analysis requires thresholding based on absolute values. As a result, the presence of the plasma membrane signal caused punctate structures located close to the membrane to be captured and recognized as a single, very large LAMP1-positive structure, resulting in the loss of the punctate structures that should be measured.

      To avoid this issue, we have used several programs to correct the fluorescence difference along the z-axis; nonetheless, these attempts were unsuccessful. Therefore, as described in the Materials and Methods section, we applied only background subtraction at each z-position and then manually removed the plasma membrane signal (which was thin and continuous at the edges). Furthermore, when the plasma membrane and punctate structure signals overlapped, we took care to separate the signals rather than remove them. Thus, we believe that the decrease in the number and volume of LAMP1-positive structures after fertilization is still a phenomenon associated with the shift of LAMP1 to the plasma membrane.
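
      Background subtraction at each z-position can be sketched as follows. This is only an illustration under stated assumptions: the response does not say how the background was estimated, so the per-plane median used here is a placeholder, and the function name and toy stack are hypothetical. The sketch removes a per-plane offset only; it does not correct for depth-dependent loss of signal, which is the limitation the response describes.

```python
from statistics import median


def subtract_background_per_plane(stack):
    """Subtract a per-plane background estimate from a z-stack.

    stack: list of z-planes, each a list of rows of pixel intensities.
    Assumption for this sketch: the background of each plane is its
    median intensity. Negative results are clipped to zero.
    """
    corrected = []
    for plane in stack:
        background = median(value for row in plane for value in row)
        corrected.append(
            [[max(value - background, 0) for value in row] for row in plane]
        )
    return corrected


# Toy two-plane stack for illustration: the second plane has a lower
# background offset and a dimmer punctum than the first.
toy_stack = [
    [[10, 10, 10], [10, 50, 10], [10, 10, 10]],
    [[4, 4, 4], [4, 24, 4], [4, 4, 4]],
]
corrected_stack = subtract_background_per_plane(toy_stack)
```

      After subtraction both planes have a zero baseline, so a single absolute threshold treats their backgrounds alike, although the dimmer punctum in the second plane remains dimmer.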

      Q. 22) Figure 2B, F, G: As the x-axis does not represent a continuous variable, adjacent data points should not be connected by a line. The histogram representations in A, C, and E are much easier to understand. I suggest presenting all data in this format.

      We revised the line graphs to bar graphs. In addition, to make the significance among populations clearer, significance is now expressed using alphabetical indicators.

      Q. 23) Figure 2B, C: It seems that the values for the different stages are expressed relative to the value at MII. Why not use the GV value as the baseline? This would follow the developmental trajectory of the oocyte/embryo more directly and would not (I believe) change the conclusions.

      We appreciate the comment. We meant to express that ELYSA develops most in the MII phase and that it decreases after fertilization, so considering the reviewer’s suggestion, we expressed GV-MII changes based on GV and changes after fertilization based on the MII phase (Fig. 2C, D).

    1. Reviewer #1 (Public review):

      Summary:

      This paper is an elegant, mostly observational work, detailing observations that polysome accumulation appears to drive nucleoid splitting and segregation. Overall I think this is an insightful work with solid observations.

      Strengths:

      The strengths of this paper are the careful and rigorous observational work that leads to their hypothesis. They find the accumulation of polysomes correlates with nucleoid splitting, and that the nucleoid segregation occurring right after splitting correlates with polysome segregation. These correlations are also backed up by other observations:

      (1) Faster polysome accumulation and DNA segregation at faster growth rates.

      (2) Polysome distribution negatively correlating with DNA positioning near asymmetric nucleoids.

      (3) Polysomes form in regions inaccessible to similarly sized particles.

      These above points are observational; I have no comments on these observations leading to their hypothesis.

      Weaknesses:

      It is hard to state weaknesses in any of the observational findings, and furthermore, their two tests of causality, while not being completely definitive, are likely the best one could do to examine this interesting phenomenon.

      Points to consider / address:

      Notably, demonstrating causality here is very difficult (given the coupling between transcription, growth, and many other processes) but an important part of the paper. They do two experiments toward demonstrating causality that help bolster - but not prove - their hypothesis. These experiments have minor caveats, my first two points.

      (1) First, "Blocking transcription (with rifampicin) should instantly reduce the rate of polysome production to zero, causing an immediate arrest of nucleoid segregation". Here they show that adding rifampicin does indeed lead to polysome loss and an immediate halting of segregation - data that does fit their model. This is not definitive proof of causation, as rifampicin also (a) stops cell growth, and (b) stops the translation of secreted proteins. Neither of these two possibilities is ruled out fully.

      1a) As rifampicin also stops all translation, it also stops translational insertion of membrane proteins, which in many old models has been put forward as a possible driver of nucleoid segregation, and perhaps independent of growth. This should at least be mentioned in the discussion, or if there are past experiments that rule this out it would be great to note them.

      1b) They address at great length in the discussion the possibility that growth may play a role in nucleoid segregation. However, this is testable - by stopping surface growth with antibiotics. Cells should still accumulate polysomes for some time, it would be easy to see if nucleoids are still segregated, and to what extent, thereby possibly decoupling growth and polysome production. If successful, this or similar experiments would further validate their model.

      (2) In the second experiment, they express excess TagBFP2 to delocalize polysomes from midcell. Here they again see the anticorrelation of the nucleoid and the polysomes, and in some cells, it appears similar to normal (polysomes separating the nucleoid) whereas in others the nucleoid has not separated. The one concern about this data - and the differences between the "separated" and "non-separated" nuclei - is that the over-expression of TagBFP2 has a huge impact on growth, which may also have an indirect effect on DNA replication and termination in some of these cells. Could the authors demonstrate these cells contain 2 fully replicated DNA molecules that are able to segregate?

      (3) What is not clearly stated, and is needed in this paper, is an explanation of how polysomes do (or could) "exert force" in this system to segregate the nucleoid: what a "compaction force" is by definition, and what mechanism causes it to arise (what causes the "force"), given that the "compaction force" arises from new polysomes being added into the gaps between them caused by thermal motions.

      They state, "polysomes exert an effective force", and they note their model requires "steric effects (repulsion) between DNA and polysomes" for the polysomes to segregate, which makes sense. But this makes it unclear to the reader what is giving the force. As written, it is unclear if (a) these repulsions alone are making the force, or (b) the accumulation of new polysomes in the center, by adding more "repulsive" material, produces the force that causes the nucleoids to move. If polysomes are concentrated more between nucleoids, and the polysome concentration does not increase, the DNA will not be driven apart (as in the first case). However, in the second case (which seems to be their model), the addition of new material (new polysomes) into a sterically crowded space is not exerting force - it is filling in the gaps between the molecules in that region, space that needs to arise somehow (like via Brownian motion). In other words, if the polysome region is crowded with polysomes, space must be made between these polysomes for new polysomes to be inserted, and this space must be made by thermal (or ATP-driven) fluctuations of the molecules. Thus, if polysome accumulation drives the DNA segregation, it is not "exerting force", but rather the addition of new polysomes is iteratively rectifying gaps being made by Brownian motion.

      The authors use polysome accumulation and phase separation to describe what is driving nucleoid segregation. Both terms are accurate, but it might help the less physically inclined reader to have one term, or have what each of these means explicitly defined at the start. I say this most especially in terms of "phase separation", as the currently huge momentum toward liquid-liquid interactions in biology causes the phrase "phase separation" to often evoke a number of wider (and less defined) phenomena and ideas that may not apply here. Thus, a simple clear definition at the start might help some readers.

      (4) Line 478. "Altogether, these results support the notion that ectopic polysome accumulation drives nucleoid dynamics". Is this right? Should it not read "results support the notion that ectopic polysome accumulation inhibits/redirects nucleoid dynamics"?

      (5) It would be helpful to clarify what happens as the RplA-GFP signal decreases at midcell in Figure 1- is the signal then increasing in the less "dense" parts of the cell? That is, (a) are the polysomes at midcell redistributing throughout the cell? (b) is the total concentration of polysomes in the entire cell increasing over time?

      (6) Line 154. "Cell constriction contributed to the apparent depletion of ribosomal signal from the mid-cell region at the end of the cell division cycle (Figure 1B-C and Movie S1)" - It would be helpful if when cell constriction began and ended was indicated in Figures 1B and C.

      (7) In Figure 7 they demonstrate that radial confinement is needed for longitudinal nucleoid segregation. It should be noted (and cited) that past experiments of Bacillus L-forms in microfluidic channels showed a clear requirement for rod shape (and a given width) in the positioning and the spacing of the nucleoids.
      Wu et al., Nature Communications, 2020. "Geometric principles underlying the proliferation of a model cell system" https://dx.doi.org/10.1038/s41467-020-17988-7

      (8) "The correlated variability in polysome and nucleoid patterning across cells suggests that the size of the polysome-depleted spaces helps determine where the chromosomal DNA is most concentrated along the cell length. This patterning is likely reinforced through the displacement of the polysomes away from the DNA dense region"

      It should be noted that this likely functions not just in one direction (polysomes dictating DNA location), but also in the reverse, as the footprint of compacted DNA should also exclude (and thus affect) the location of polysomes.

      (9) Line 159. "Rifampicin is a transcription inhibitor that causes polysome depletion over time. This indicates that all ribosomal enrichments consist of polysomes and therefore will be referred to as polysome accumulations hereafter". Here and throughout this paper they use the term polysome, but cells also have monosomes (and 2-somes, etc.). Rifampicin stops the assembly of all of these, and thus the loss of localization could occur from both. Thus, is it accurate to state that all transcription events occur in polysomes? Or are they grouping all of the n-somes into one group?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this report, the authors investigated the effects of reproductive secretions on sperm function in mice. The authors attempt to weave together an interesting mechanism whereby a testosterone-dependent shift in metabolic flux patterns in the seminal vesicle epithelium supports fatty acid synthesis, which they suggest is an essential component of seminal plasma that modulates sperm function by supporting linear motility patterns.

      Strengths:

      The topic is interesting and of general interest to the field. The study employs an impressive array of approaches to explore the relationship between mouse endocrine physiology and sperm function mediated by seminal components from various glandular secretions of the male reproductive tract.

      Thank you for your positive evaluation of our study's topic and approach. We are pleased that you found our investigation into the effects of reproductive secretions on sperm function to be of general interest to the field. We appreciate your positive feedback on the diverse methods we employed to explore this complex relationship.

      Weaknesses:

      Unfortunately, the proposed mechanism is not convincingly supported by the data. The experimental design and methodology need more rigor and detail, and the presence of numerous (uncontrolled) confounding variables in almost every experimental group significantly reduces confidence in the overall conclusions of the study.

      The methodological detail as described is insufficient to support replication of the work. Many of the statistical analyses are not appropriate for the apparent designs (e.g. t-tests without corrections for multiple comparisons). This is important because the notion that different seminal secretions will affect sperm function would likely have a different conclusion if the correct controls were selected for post hoc comparison. In addition, the HTF condition was not adjusted to match the protein concentrations of the secretion-containing media, likely resulting in viscosity differences as a major confounding factor on sperm motility patterns.

      We appreciate you highlighting concerns regarding our weak points and apologize for our unclear description. We revised the manuscript to be as rigorous and detailed as possible. In addition, some experimental designs were changed to simpler direct comparisons, and additional experiments were conducted (New Figure 1A-F, lines 103-113). We have made our explanations more consistent with the provided data, which includes further experimentation with additional controls and larger sample sizes to increase the reliability of the findings.

      To address the multiple testing problem, a multiple testing correction was made by making the statistical tests more stringent (Please see Statistical analysis in the Methods section and the Figure legends). Based on different statistical methods, the analysis results did not require significant revisions of the previous conclusions.
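
      As an illustration of what a multiple testing correction does, the sketch below applies the Holm step-down adjustment to a set of p-values. The response does not say which correction the authors used, so the choice of Holm, the function name `holm_adjust`, and the example p-values are all assumptions made purely for illustration.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the input order.

    Illustrative only: the correction actually used in the manuscript
    is described in its Methods, not here.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values from three pairwise comparisons.
    raw = [0.01, 0.04, 0.03]
    for p, q in zip(raw, holm_adjust(raw)):
        print(f"raw p = {p:.3f}, Holm-adjusted p = {q:.3f}")
```

      With these made-up inputs, only the first comparison stays below 0.05 after adjustment, which shows how repeated uncorrected t-tests can overstate the number of significant differences.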

      Because the experiments on mixing extracts from the seminal vesicles were exploratory, we planned to avoid correcting for multiple comparisons. Repeating the t-test could lead to a Type I error in some results, so we apologize for not interpreting and annotating them. In the revised version, we removed the dataset for experiments on mixing extracts from the seminal vesicles and prostate, and we changed the description to refer to the clearer dataset mentioned above.

      The viscosity of the secretion-containing medium was measured with a viscometer, confirming that secretions did not significantly affect the viscosity of the solution. In addition, as the reviewer pointed out, we addressed the issue that the HTF condition could not be used as a control because of the heterogeneity in protein concentration (New Fig.1G, lines 110-111).

      Overall, we concluded that seminal vesicle secretion improves the linear motility of sperm more than prostate secretion.

      There is ambiguity in many of the measurements due to the lack of normalization (e.g. all Seahorse Analyzer measurements are unnormalized, making cell mass and uniformity a major confounder in these measurements). This would be less of a concern if basal respiration rates were consistently similar across conditions and there were sufficient independent samples, but this was not the case in most of the experiments.

      We apologize for the many ambiguities in the first manuscript. Cell culture experiments in the paper, including the flux analysis, were performed under conditions normalized or fixed by the number of viable cells. The description has also been revised to emphasize that the measurement values are standardized by cell count (lines 183-185, 189-190, 194-197). We emphasize that testosterone affects metabolism under the same number of viable cells (New Fig.4). This change in basal respiration is thought to be due to the shift in the metabolic pathway of seminal vesicle epithelial cells to a “non-normal TCA cycle” in which testosterone suppresses mitochondrial oxygen consumption, even under aerobic conditions (New Figs.3, 4, 5).

      The observation that oleic acid is physiologically relevant to sperm function is not strongly supported. The cellular uptake of 10-100uM labeled oleic acid is presumably due to the detergent effects of the oleic acid, and the authors only show functional data for nM concentrations of exogenous oleic acid. In addition, the effect sizes in the supporting data were not large enough to provide a high degree of confidence given the small sample sizes and ambiguity of the design regarding the number of biological and technical replicates in the extracellular flux analysis experiments.

      Thank you for your important critique. As you noted, the too-high oleic acid concentration did not reflect physiological conditions. Therefore, we changed the experimental design of an oleic acid uptake study and started again. We added an in vitro fertilization experiment corresponding to the functional data of exogenous oleic acid at nM concentrations (New Fig.7J,K, Lines 274-282).

      For the flux data to determine the effect of oleic acid on sperm metabolism, we have indicated in the text that the data were obtained based on eight male mice and two technical replicates. Pooled sperm isolated and cultured from multiple mice were placed in one well. The measurements were taken in three different wells, and each experiment was repeated four times. We did not use the extracellular flux analyzers XFe24 or XFe96. The measurements were also repeated because the XF HS Mini was used in an 8-well plate (only a maximum of 6 samples at a run since 2 wells were used for calibration).

      Overall, the most confident conclusion of the study was that testosterone affects the distribution of metabolic fluxes in a cultured human seminal vesicle epithelial cell line, although the physiological relevance of this observation is not clear.

      We appreciate the comment that this finding is one of the more robust conclusions of our study. Below we have written our thoughts on the physiological relevance of the observation results and our proposed revisions. In the mouse experiments, when the action of androgens was inhibited by flutamide, oleic acid was no longer synthesized in the seminal vesicles. The results of the experiments using cultured seminal vesicle epithelial cells showed that oleic acid was not being synthesized because of a change in metabolism dependent on testosterone. We have also added IVF data on the effects of oleic acid on sperm function (New Fig.7 and Supplementary Fig. 5, lines 274-282).

      As you can see, we have obtained consistent data in vitro and in vivo in mice. Our data also showed that the effects of testosterone on metabolic fluxes in vitro are similar in mouse and human seminal vesicle epithelial cells (New Fig.9). Therefore, it can be assumed that a decrease in testosterone levels causes abnormalities in the components of human semen. However, the conclusion was overstated in the original manuscript, so we changed the wording as follows: It could be assumed that a decrease in testosterone levels causes abnormalities in the components of human semen. (lines 422-423)

      In the introduction, the authors suggest that their analyses "reveal the pathways by which seminal vesicles synthesize seminal plasma, ensure sperm fertility, and provide new therapeutic and preventive strategies for male infertility." These conclusions need stronger or more complete data to support them.

      We appreciate your comments about the suggestion presented in the introduction.

      We also removed our conclusions regarding treatment and prevention strategies for male infertility (lines 96-98). We wanted to discuss our findings not conclusively but as future applications that could result from further research based on our initial findings.

      The last sentence of the introduction has been revised to tone down these assertions as follows: These analyses revealed that testosterone promotes the synthesis of oleic acid in seminal vesicle epithelial cells and its secretion into seminal plasma, and the oleic acid ensures the linear motility and fertilization ability of sperm.

      We are grateful for your suggestions, which have prompted us to refine our manuscript.

      Reviewer #2 (Public Review):

      Summary:

      Using a combination of in vivo studies with testosterone-inhibited and aged mice with lower testosterone levels, as well as isolated mouse and human seminal vesicle epithelial cells, the authors show that testosterone induces an increase in glucose uptake. They find that testosterone induces differential gene expression with a focus on metabolic enzymes. Specifically, they identify increased expression of enzymes that regulate cholesterol and fatty acid synthesis, leading to increased production of 18:1 oleic acid.

      Strength:

      Oleic acid is secreted by seminal vesicle epithelial cells and taken up by sperm, inducing an increase in mitochondrial respiration. The difference in sperm motility and in vivo fertilization in the presence of 18:1 oleic acid and the absence of testosterone is small but significant, suggesting that the authors have identified one of the fertilization-supporting factors in seminal plasma.

      Thank you for your positive comments regarding our work on the role of testosterone in regulating metabolic enzymes and the subsequent production of 18:1 oleic acid in seminal vesicle epithelial cells. We are pleased that the strength of our findings, particularly identifying oleic acid as a factor influencing sperm motility and mitochondrial respiration, has been recognized.

      Weaknesses:

      Further studies are required to investigate the effect of other seminal vesicle components on sperm capacitation to support the authors' conclusions. The authors' experiments, which focused on potential testosterone-induced changes in the rate of seminal vesicle epithelial cell glycolysis and oxphos, provide conflicting results, however, and a potential correlation with seminal vesicle epithelial cell proliferation should be confirmed by additional experiments.

      Thank you very much for your valuable criticism. Although we fully agree with your comment, conducting experiments to investigate the effects of other seminal vesicle components on the fertilization potential of sperm would be a great challenge for us. This is because it has taken us the last three years to identify oleic acid as a key factor in seminal plasma. We are considering a follow-up study to explore the effect of other seminal vesicle components on sperm capacitation. Therefore, we have revised the Introduction and conclusions to tone down our assertions.

      The revised manuscript also includes additional data showing a correlation between changes in metabolic flux and the proliferation of seminal vesicle epithelial cells using shRNA. As a result, it was shown that cell proliferation is promoted when mitochondrial oxidative phosphorylation is promoted by ACLY knockdown (New Fig.8D, lines 303-305). This shows a close relationship between the metabolic shift in seminal vesicle epithelial cells and cell proliferation. The revised manuscript includes an interpretation and discussion of these results (lines 369-379).

      We are grateful for your suggestions, which have prompted us to refine our manuscript.

      Reviewer #3 (Public Review):

      Summary:

      Male fertility depends on both sperm and seminal plasma, but the functional effect of seminal plasma on sperm has been relatively understudied. The authors investigate the testosterone-dependent synthesis of seminal plasma and identify oleic acid as a key factor in enhancing sperm fertility.

      Strengths:

      The evidence for changes in cell proliferation and metabolism of seminal vesicle epithelial cells and the identification of oleic acid as a key factor in seminal plasma is solid.

      Weaknesses:

      The evidence that oleic acids enhance sperm fertility in vivo needs more experimental support, as the main phenotypic effect in vitro provided by the authors remains simply as an increase in the linearity of sperm motility, which does not necessarily correlate with enhanced sperm fertility.

      We appreciate the positive feedback on the solid evidence of cell proliferation and metabolic changes in seminal vesicle epithelial cells and the identification of oleic acid as an important factor in seminal plasma. We fully agree with the assessment that the evidence linking oleic acid and increased sperm fertility in vivo needs further experimental support. To address this concern, we changed the experimental design of an oleic acid study and started again to be more physiological regarding the effect of oleic acid on fertility outcomes, increased the replicates of artificial insemination, and added in vitro fertilization assessments (New Fig.7 and supplementary Fig.5, lines 274-282). The revised manuscript describes these experiments and discusses the association between oleic acid and fertility.

      We are grateful for your suggestions, which have prompted us to refine our manuscript.

      Recommendations for the authors:

      Reviewing Editor's note:

      As you can see from the three reviewers' comments, the reviewers agree that this study can be potentially important if major concerns are adequately addressed. The major concern common to all the reviewers is the incomplete mechanistic link between the physiological androgen effect on the production of oleic acid and its effect on sperm function. Statistical analyses need more rigor and consideration of other important capacitation parameters are needed to address these concerns and to improve the manuscript to support the current conclusions.

      Thank you for summarizing the reviewers' feedback and for your insights regarding the major concerns raised. We appreciate the reviewers' understanding of the potential importance of our work and have addressed the issues highlighted to strengthen the manuscript. We believe these changes will improve the quality of the manuscript and provide a clearer and more complete understanding of the role of androgens and oleic acid in sperm function.

      Reviewer #1 (Recommendations For The Authors):

      The following comments are provided with the hope of aiding the authors in improving the alignment between the data and their interpretations.

      Thank you for allowing us to strengthen our manuscript with your valuable comments and queries. We have made our best efforts to reflect your feedback.

      Major Comments:

      (1) The methodological detail is not sufficient to reproduce the work. For example:
      a. Manufacturer protocols are referred to extensively. These protocols are neither curated nor version-controlled. Please consider describing the underlying components of the assays. If information is not available, please consider providing catalog numbers and lot numbers in the methods (if appropriate for journal style requirements).

      We appreciate this suggestion, which we believe is important to ensure reproducibility. We described the catalog number in our Methodology and included as much information as possible.

      b. Please consider describing the analyses in full, with consideration given to whether blinding was part of the design. For example- line 492: "apoptotic cells were quantified using ImageJ". How was this quantified? How were images pre-processed? Etc.

      Although blinding was not performed, experiments and analyses based on Fisher's three principles were conducted to eliminate bias (lines 549-552). In order to avoid false-positive or false-negative results, it is clearly stated that tissue sections treated with DNase were used as positive controls, and tissue sections without TdT were used as negative controls for apoptosis. We have added detailed quantification methods (lines 544-546).

      c. Please consider providing versions of all acquisition and analysis software used.

      We have added software version information in Materials and Methods.

      (2) Please consider revisiting the statistical analyses. Many of the analyses don't seem appropriate for the design. For example, the use of a t-test with multiple comparisons for repeated measures design in Figure 2 and the use of t-test for two-factor design in Figure 8. etc.

      To address the multiple testing issues, the statistical methodology was changed to a more rigorous one. Details are given in the Statistical analysis in the Methods section and the Figure legends.

      (3) The increase in % LIN in Figure 1 may be confounded by differences in viscosity between HTF and the fluid secretion mixtures. For this reason, HTF may not be an appropriate control for the ANOVA post hoc analysis. HTF protein was not adjusted to the same concentration as the secretion mixtures, correct? Ultimately, it does not appear that there would be a significant statistical effect of the different fluid mixtures if appropriate statistical comparisons were made. This detracts from the notion that the secretions impact sperm function.

      (4) Figure 1, the statistical analysis in the legend suggests that the experiments were analyzed with a t-test. Were corrections made for multiple comparisons in B-D? An ANOVA would probably be more appropriate.

      We used a viscometer to measure the viscosity of a solution of prostate and seminal vesicle secretions adjusted to a protein concentration of 10 mg/mL. The results showed that the secretions did not cause any significant viscosity changes (New Fig.1G, Lines 110-111).

      As you pointed out, the protein levels in the HTF medium and the secretion mixture are not adjusted to the same concentration. In addition, the original manuscript was not a controlled experiment because the two factors, seminal vesicle and prostate extracts, were modified. Therefore, to investigate the effect of prostate and seminal vesicle secretions on sperm motility, we modified the experimental design to directly compare the effects of the two groups: seminal vesicle and prostate extracts (New Fig.1A-G, lines 101-113). To show the sperm quality used in this study, motility data from sperm cultured in the HTF medium are presented independently in New Supplemental Fig.1A.

      (5) Additionally in Figure 1, there is no baseline quality control data to show that there are no intrinsic differences between sperm sampled from the two treatment groups. So baseline differences in sperm quality/viability remain a potential confounder.

      We thank you for this important point. Epididymal sperm were collected from healthy mice. We recovered only the seminal vesicle secretions from the flutamide-treated mice to pursue its role in the accessory reproductive glands, since testosterone targets the testes and accessory reproductive organs. So, there was no qualitative difference between the epididymal sperm before treatment. Nevertheless, incubation with seminal vesicle secretion for one hour altered the sperm motility pattern and in vivo fertilization results. Sperm function was altered by seminal vesicle secretion in a short period of culture time. We apologize for the confusion, and we have revised the text and figure to carry a clearer message (lines 128-132).

      (6) Figure 1E, did the authors confirm that flutamide-treated mice had decreased serum androgens? How often were mice treated with flutamide? This is important because flutamide has a relatively short half-life and is rapidly metabolized to inert hydroxyflutamide.

      Serum testosterone levels were unchanged. Flutamide was administered every 24 hours for 7 consecutive days. Although there was no change in blood testosterone levels (New Supplemental Fig.1B), a decrease in the weight of the seminal vesicles, prostate, and epididymis was confirmed. This is thought to be due to the pharmacological activity of flutamide.

      (7) Figure 1H, the meaning of 'relative activity of mitochondria' isn't clear. JC-1 does not measure 'activity'. A decreased average voltage potential across the inner mitochondrial membrane may indicate that more of the sperm from the flutamide group were dead. Additionally, J-aggregates are slow to form, generally requiring long incubation periods of at least 90 minutes or more. Additional positive and negative controls for predictable mitochondrial transmembrane voltage potential polarization states would have improved the quality of this experiment.

      Thank you for pointing this out. We have replaced the relative activity of mitochondria with high mitochondrial membrane potential (New Fig.1M, lines 125-128). Actually, it is thought that the sperm cultured in seminal vesicle secretions from mice that had been administered flutamide died because the motility of the sperm was also significantly reduced. Since antimycin reduces mitochondrial membrane potential, we have added an experiment in which 10 µM antimycin-treated sperm were used as a control to confirm that the JC-1 reaction is sensitive to changes in membrane potential.

      (8) Figure 4, the extracellular flux data appear to be unnormalized. The Seahorse instruments are extremely sensitive to the mass and uniformity of the cells at the bottom of the well. This may be a significant confounder in these results. For example, all of the observed differences between groups could simply be a product of differential cell mass, which is in line with the reduced growth potential of testosterone-treated cells indicated by the authors in the results section.

      We thank you for this important point. After correcting for cell viability, we seeded the same number of viable cultured cells into wells between experimental groups before measuring them in the flux analyzer. There were no significant differences in survival rates in all experiments. As a result, an increase in glucose-induced ECAR and a suppression of mitochondrial respiration were observed. We would like to emphasize that this difference based on metabolic data does not imply a reduction in the growth potential of the cells due to testosterone treatment.

      We described that these measurements are normalized based on cell count and viability (lines 184, 190, 195).
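
      To make the normalization concrete, the following minimal sketch divides a raw extracellular flux reading by the number of viable cells in the well. The function name `normalize_rate`, the choice of "per 1,000 viable cells" as the reporting unit, and the example numbers are hypothetical and are not taken from the manuscript.

```python
def normalize_rate(raw_rate, viable_cells, per_cells=1000):
    """Express a raw flux reading (e.g. OCR in pmol/min) per `per_cells` viable cells.

    All values passed in the example below are invented for illustration.
    """
    if viable_cells <= 0:
        raise ValueError("viable_cells must be positive")
    return raw_rate * per_cells / viable_cells


if __name__ == "__main__":
    # Two hypothetical wells: the second holds half as many viable cells
    # and shows half the raw signal, so the normalized rates coincide.
    well_a = normalize_rate(raw_rate=120.0, viable_cells=30000)
    well_b = normalize_rate(raw_rate=60.0, viable_cells=15000)
    print(well_a, well_b)
```

      The point of the example is that an apparent twofold difference in raw signal can disappear entirely once cell number is accounted for, which is the confound raised in the reviewer's comment.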

      (9) How did the authors know that the isolated mouse primary cells were epithelial cells? Was this confirmed? What was the relative sample purity?

      The cells were labeled with multiple epithelial cell markers (cytokeratin) and confirmed using immunostaining and flow cytometry. The percentage of cells positive for epithelial cell markers was approximately 80%. A stromal cell marker (vimentin) was also used to confirm purity, but only a few percent of cells were positive. The contaminating cell type was considered to be mainly muscle cells because the gene expression levels of muscle cell markers verified by RNA-seq were relatively high.

      (10) It is misleading to include the lactate/pyruvate media measurements in the middle of the figure in Figure 4 D and E because it seems at first glance like these measurements were made in the seahorse media but they are completely unrelated. Additionally, these measures are not normalized and are sensitive to confounding differences in cell viability, seeding density, mass, etc.

      Thank you for pointing this out. We have placed the lactate and pyruvate measurement graphs after the flux data of ECAR. We noted that these measurements are normalized based on cell count and viability (lines 189-190). The doubling time of seminal vesicle epithelial cells was approximately 3 days, and testosterone inhibited cell proliferation. Therefore, the seeding concentration of cells was increased 4-fold in the testosterone-treated group compared to the control, and experiments were conducted to ensure that the confluency at the time of measurement after 7 days of culture was comparable between groups.
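
      The seeding adjustment can be checked with simple growth arithmetic. The sketch below assumes idealized exponential growth and treats "comparable confluency" as equal final cell number; both are simplifying assumptions that the response does not state. The helper names are hypothetical, and the implied doubling time for treated cells is a back-of-the-envelope figure, not a measurement reported by the authors.

```python
import math


def fold_growth(days, doubling_time_days):
    """Fold increase in cell number under idealized exponential growth."""
    return 2.0 ** (days / doubling_time_days)


def implied_doubling_time(days, fold):
    """Doubling time that would produce `fold` growth over `days` (requires fold > 1)."""
    if fold <= 1.0:
        raise ValueError("fold must be greater than 1")
    return days * math.log(2.0) / math.log(fold)


if __name__ == "__main__":
    culture_days = 7          # culture duration stated in the response
    control_doubling = 3      # approximate doubling time stated in the response
    seeding_ratio = 4         # treated cells were seeded at 4x the control density

    control_fold = fold_growth(culture_days, control_doubling)
    # If both groups end at the same cell number, treated cells only need
    # to expand by control_fold / seeding_ratio.
    treated_fold = control_fold / seeding_ratio
    print(f"control fold growth over {culture_days} days: {control_fold:.2f}")
    print(f"treated fold growth needed: {treated_fold:.2f}")
    print(f"implied treated doubling time: "
          f"{implied_doubling_time(culture_days, treated_fold):.1f} days")
```

      Under these assumptions a 4-fold seeding difference corresponds to a markedly slower doubling time in the treated group, which is why matching confluency at the time of measurement matters for interpreting unnormalized readings.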

      (11) The flux analyzer assays sold by Agilent have many ambiguities and problems of interpretation. Unfortunately, Agilent's interest in marketing/sales has outpaced their interest in scientific rigor. Please consider revising some of the language regarding the measurements. For example, 'ATP production rate' is not directly measured. Rather, oligomycin-sensitive respiration rate is measured. The conversion of OCR to ATP production rate is an estimation that depends on complex assumptions often requiring additional testing and validation. The same is true for other ambiguous terms such as 'maximal respiration' referring to FCCP uncoupled respiration, and glycolytic rate- which is also not measured directly. If the authors are interested in a more detailed description of the problems with Agilent's interpretation of these assays please see the following reference (PMID: 34461088).

      Thank you for your critical comments and thoughtful advice, as well as for sharing the excellent reference. We agree with you on the flux analyzer ambiguities and data interpretation problems. The description of the measured values has been revised as follows.

      We have replaced the “ATP production rate” with the “oligomycin-sensitive respiratory rate.” Similarly, we have replaced “maximal respiration” with “FCCP-induced unbound respiration.” (lines 197-202) We chose not to deal with the conversion of OCR to ATP production rate because it is outside the scope of interest in our study.

      We have avoided the term "glycolytic capacity" and use “Oligomycin-sensitive ECAR” instead (line 186). We recognize that the ECARs measured in this study reflect experimental conditions and may not fully represent physiological glycolytic flux in vivo. So, the main section includes a data set of glucose uptake studies to emphasize the significance of the changes obtained with the flux analyzer assay. (New Fig.6, lines 230-254)

      Figure 6, it's not surprising to see the accumulation of labeled oleic acid in the cells, however, this does not mean that oleic acid is participating in normal metabolic processes. Oleic acid will have detergent effects at high (uM) concentrations. The observation that sperm 'take up' OA at 10-100 uM concentrations should also be validated against sperm function, as the health of the cells is very likely to be negatively impacted. Additionally, no apparent accumulation is noted in the fluorescence imaging at 1uM, but the authors insinuate that uptake occurs at low nM concentrations. The effects in Figure 6D-F are nominal at best and are likely a result of the small sample sizes.

      Thank you for your good suggestion. We agree with the reviewer that high concentrations of oleic acid had a detergent effect. To improve the consistency of functional data and observations, oleic acid uptake tests were performed under the same concentration range as the sperm motility tests (New Fig.7A-C). The oleic acid concentration used was chosen with reference to the oleic acid concentration in seminal fluid recovered from mice, as detected by GC-MS, to reflect in vivo conditions.

      Epididymal sperm were incubated with fluorescently labeled oleic acid and observed after quenching of extracellular fluorescence. Fluorescent signals were detected selectively in the midpiece of the sperm. The fluorescence intensity of sperm quantified by flow cytometry increased significantly in a dose-dependent manner (New Fig.7A-C, lines 261-264).

      Furthermore, increasing the sample size did not change the trend of the sperm motility data. Although the effect size of oleic acid on sperm motility was small (New Fig.7D-G, lines 265-268), an improvement in fertilization ability was observed both in vitro (IVF) and in vivo (AI) (New Fig.7J-L, lines 274-282, 286-291). We conclude that the effect of oleic acid on sperm is of substantial significance. These data and interpretations have been revised in the text in the Results section.

      (12) Figure 6H, I applaud the authors for attempting intrauterine insemination experiments to test their previous findings. That said, there is no supporting data included to show that the sperm from the treatment groups had comparable starting viability/quality. Additionally, it is difficult to tell if the results are due to the small sample sizes and particularly the apparent outlier in the flutamide-only group.

      Thanks for the praise and comments for improvement. As we answered in your comment #5 above, the epididymal sperm was collected from healthy mice. Therefore, there is no qualitative difference in the epididymal sperm before treatment. This is described in the figure legend (lines 1130-1131). We apologize again for this complication. We also more than doubled the number of replications of the experiment. The impact of the outlier would have been minimal.

      (13) One final question related to Figure 6H: how did the authors know they were retrieving all of the possible 2-cell embryos from the uterus? Perhaps the authors could provide the raw counts of unfertilized eggs and 2-cell embryos so we can see if there were differences between the mice.

      We retrieved the pronuclear stage embryos from the fallopian tubes. It is not certain whether all embryos were recovered. Therefore, we added the number of embryos in the graph and in the supplementary data.

      (14) Figure 7 has the same seahorse assay normalization problem as mentioned earlier. Without normalization, it is difficult to tell if the effects are simply due to differences in cell mass. Were the replicates indicated in the graphs run on the same plate? If so, it would be much more convincing to see a nested design, with technical replicates within plates, and additional replicates run on separate plates.

      As we answered in response to your comment #8 above, these measurements were normalized to sperm count. This is now noted in the text and the figure legend (lines 1123-1124).

      Sperm isolated from multiple mice were pooled, cultured, and placed in one well. The measurements were taken in three different wells, and each experiment was repeated four times. We did not use the XFe24 or XFe96 extracellular flux analyzers. Instead, we used the XF HS Mini, which takes an 8-well plate (a maximum of 6 samples per run, since 2 wells are used for calibration), so the measurements had to be repeated over several runs.

      (15) The statistical test in Figures 8E and F described in the legend is inappropriate (t-test), this appears to be a two-factor design.

      Thank you for pointing this out. Differences between groups were assessed using a two-way analysis of variance (ANOVA). When the two-way ANOVA was significant, differences among values were analyzed using Tukey's honest significant difference test for multiple comparisons.
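      The two-factor analysis described above can be illustrated with a minimal sketch. It assumes a balanced design (the same number of replicates in every cell) and uses made-up placeholder values with generic factor labels, not data from the manuscript. It computes only the sums of squares and F statistics; p-values and the Tukey post hoc comparisons need an F distribution and a studentized range distribution, which a statistics package would normally supply.

```python
from itertools import product


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction.

    cells maps (level_a, level_b) -> list of replicate values.
    Every cell must hold the same number of replicates (n >= 2).
    Returns (sums of squares, degrees of freedom, F statistics).
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if n < 2 or any(len(cells[k]) != n for k in product(a_levels, b_levels)):
        raise ValueError("design must be balanced with n >= 2 per cell")

    def mean(xs):
        return sum(xs) / len(xs)

    grand = mean([x for v in cells.values() for x in v])
    mean_a = {a: mean([x for b in b_levels for x in cells[(a, b)]]) for a in a_levels}
    mean_b = {b: mean([x for a in a_levels for x in cells[(a, b)]]) for b in b_levels}
    mean_ab = {k: mean(v) for k, v in cells.items()}

    na, nb = len(a_levels), len(b_levels)
    ss = {
        # Main effect of factor A: marginal means versus the grand mean.
        "A": nb * n * sum((mean_a[a] - grand) ** 2 for a in a_levels),
        # Main effect of factor B.
        "B": na * n * sum((mean_b[b] - grand) ** 2 for b in b_levels),
        # Interaction: what the cell means show beyond both main effects.
        "AxB": n * sum(
            (mean_ab[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
            for a, b in product(a_levels, b_levels)
        ),
        # Residual: replicate scatter within each cell.
        "error": sum((x - mean_ab[k]) ** 2 for k, v in cells.items() for x in v),
    }
    df = {"A": na - 1, "B": nb - 1, "AxB": (na - 1) * (nb - 1), "error": na * nb * (n - 1)}
    ms_error = ss["error"] / df["error"]
    f_stats = {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AxB")}
    return ss, df, f_stats


# Placeholder values for a 2 x 2 design with three replicates per cell.
cells = {
    ("a1", "b1"): [1.0, 1.2, 0.8],
    ("a1", "b2"): [1.1, 0.9, 1.0],
    ("a2", "b1"): [2.0, 2.2, 1.8],
    ("a2", "b2"): [1.2, 1.0, 1.1],
}
ss, df, f_stats = two_way_anova(cells)
```

      Unlike a series of t-tests, this decomposition separates the two main effects from their interaction, which is the point raised about the two-factor design.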

      (16) The data in Figure 8 are interesting, and the effects appear to be a little more consistent compared with the mouse primary cells, potentially due to cell uniformity. However, the data are unnormalized, causing significant ambiguity, and there are no measures of cell viability to determine if the effects are due to cell death (or at least relative cell mass).

      As we answered in response to your comments #8 and #14 above, these measurements were normalized to cell count and viability. This is now noted in the figure legend (lines 1185-1186).

      Minor Comments:

      (1) The section title indicating the beginning of the results section is missing.

      A section title has been added to indicate the beginning of the results section.

      (2) There were several typos and confusingly worded statements throughout. Please consider additional editing.

      We used a proofreading service and corrected as much as possible.

      (3) In the introduction, a brief description of seminal fluid physiology is provided, but the reference is directed toward human physiology. Given that the research is performed solely in the mouse, a brief comparative description of mouse physiology would be helpful. For example, what is the role of mouse seminal fluid in the formation of the mating plug? What are the implications of the relative size disparity in seminal vesicles in mice versus humans? Etc.

      The third paragraph of the introduction has been revised (lines 57-60).

      Reviewer #2 (Recommendations For The Authors):

      Thank you for allowing us to strengthen our manuscript with your valuable comments and queries. We have made our best efforts to reflect your feedback.

      (1) The abstract is confusing and partly misleading and should be revised to more clearly and accurately summarize the study.

      The abstract was revised to be clearer and more accurate (lines 20-34).

      (2) The introduction should be revised to more accurately describe the sperm life cycle. Spermatogenesis, per definition, for example, exclusively takes place in the testis, sperm do not gain fertilization competence in the epididymis, sperm isolated from the epididymis cannot fertilize an oocyte unless in vitro capacitated, etc. In the last paragraph the connection between changes in fructose and citrate concentration, sperm metabolism and testicular-derived testosterone and AR remain unclear.

      The introduction was revised to be clearer and more accurate (lines 44-45).

      Citric acid and fructose are chemical components that are the subject of biochemical testing and are commonly used as semen testing items for humans and livestock. This is because the secretory function of the prostate and seminal vesicles is dependent on androgens. The measurement of citric acid and fructose concentrations in semen is routinely used to indicate testicular androgen production function (ISBN: 978-1-4471-1300-3, 978 92 4 0030787).

      (3) Throughout the manuscript the concept of (in vitro) capacitation is missing. Mixing sperm with seminal plasma is not the only way to achieve sperm that can fertilize the oocyte. Since media containing bicarbonate and albumin is the standard procedure in the field to capacitate epididymal mouse sperm in vitro, the manuscript would gain value from a comparison between the effect of seminal plasma and in vitro capacitating media. Interesting readouts in addition to motility would be, e.g., sAC activation, PKA-substrate phosphorylation, and acrosomal exocytosis.

      Thank you for raising this important point. As the reviewer points out, fertilization can be achieved in artificial insemination and in vitro fertilization using epididymal sperm which have not been exposed to seminal plasma. This has historically led to an underestimation of the role of accessory reproductive glands, such as the prostate and seminal vesicles. However, it has been reported that the removal of seminal vesicles in rodents decreases the fertilization rate after natural mating. This has been shown to be due to multiple factors affecting sperm motility rather than factors involved in plug formation (PMID: 3397934), but details of these factors and the whole picture of the role of the accessory glands were not known. This led us to become interested in the effects of seminal plasma on sperm other than fertilization and to begin research on the role of the accessory glands that synthesize seminal plasma.

      Early in our study, we found that simply exposing sperm to seminal vesicle extracts for 1 hour before IVF dramatically reduced fertilization rates, even in HTF medium containing bicarbonate and albumin. The experiment was designed on the assumption that seminal plasma contains factors that inhibit sperm from acquiring fertilizing ability. Therefore, we conducted experiments using modified HTF without albumin to avoid unintended motility patterns.

      However, we also respect the reviewer's opinion, and we have added our preliminary data related to IVF (New supplementary Fig.5).

      (4) In the introduction and throughout the manuscript it is unclear what the authors mean by "linear motility". An increase in VSL doesn't mean that the sperm swim in a more linear or straight way, or even that the sperm are 'straightened', it means that they swim faster from point A to point B. Do the authors mean progressive or hyperactivated motility? Please clarify.

      For all conditions tested the authors should follow the standard in the field and include the % of motile, progressively motile, and hyperactivated sperm.

      Thank you for pointing this out. We appreciate your feedback regarding the terminology. In our manuscript, "linear motility" refers to the degree to which sperm move in a straight line. We have clarified this by explaining that VSL (Straight-Line Velocity) and LIN (Linearity) are used to quantify and describe linear motility in sperm analysis: Higher VSL values indicate more direct, linear movement. A higher LIN value indicates a straighter path, thus representing greater linear motility. These terms have been standardized, and explanations have been added to the main text (lines 111-113).
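      For readers unfamiliar with these parameters, the sketch below shows how VSL and LIN follow from a tracked sperm head path under the standard CASA definitions (VCL is the point-to-point path length per unit time, VSL is the first-to-last displacement per unit time, and LIN is VSL divided by VCL). The function name, the example tracks, and the frame rate are hypothetical and are not taken from the authors' CASA software or settings.

```python
import math


def casa_kinematics(track, frame_rate_hz):
    """Return (VCL, VSL, LIN) for one sperm head track.

    track: list of (x, y) positions in micrometres, one per video frame.
    frame_rate_hz: acquisition frame rate in frames per second.
    """
    if len(track) < 2:
        raise ValueError("a track needs at least two positions")
    duration_s = (len(track) - 1) / frame_rate_hz
    # Curvilinear path: sum of frame-to-frame displacements.
    path_um = sum(math.dist(p, q) for p, q in zip(track, track[1:]))
    # Straight-line path: first position to last position.
    straight_um = math.dist(track[0], track[-1])
    vcl = path_um / duration_s
    vsl = straight_um / duration_s
    lin = vsl / vcl if vcl > 0 else 0.0  # dimensionless, between 0 and 1
    return vcl, vsl, lin


# Hypothetical tracks sampled at 2 frames per second.
straight_track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
zigzag_track = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)]

straight_result = casa_kinematics(straight_track, frame_rate_hz=2.0)
zigzag_result = casa_kinematics(zigzag_track, frame_rate_hz=2.0)
```

      In this sketch both tracks have the same VCL, but the zigzag track has a lower VSL and a lower LIN. VSL on its own is a speed, whereas LIN is a ratio that expresses how straight the path is, which is why the two are reported together.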

      In response to your suggestion, we have included the percentage of motility and progressive motility for all conditions tested. However, since the experiment was performed using modified HTF without albumin, we have decided not to report the percentage of hyperactivation to avoid confusion.

      (5) Did the authors confirm that the injection of flutamide decreases androgen levels? That control needs to be included in the experiment to validate the conclusion.

      Injection of flutamide did not reduce androgen levels (see reviewer #1, comment 6). This is because flutamide's mechanism of action is based on antagonizing androgen and inhibiting its binding to the androgen receptor (New Fig.2A).

      (6) The role of mitochondrial activity in sperm progressive motility is still under investigation. PMID: 37440924 i.e. showed that inhibition of the ETC does not affect progressive but hyperactivated motility. The authors should either include additional experiments to confirm the correlation between mitochondrial activity and sperm progressive motility or tone down that conclusion.

      We have previously shown that treatment with D-chloramphenicol, an inhibitor of mitochondrial translation, significantly reduced sperm mitochondrial membrane potential, ATP levels, and linear motility (PMID: 31212063). Also, in the previous manuscript, we did not address progressive motility or hyperactivated motility in our analysis. We have chosen to discuss the effect of mitochondrial activity on linear motility rather than on progressive motility and hyperactivation of sperm.

      Was mitochondrial activity also altered in epididymal sperm incubated with and without seminal plasma or in aged mice?

      The mitochondrial membrane potential of epididymal sperm cultured with seminal vesicle extract (SV) was higher than that of epididymal sperm cultured without seminal vesicle extract (without SV: 67.3 ± 0.8%, with SV: 83.4 ± 1.8%). On the other hand, the mitochondrial membrane potential of epididymal sperm cultured with seminal vesicle extract recovered from aged mice was decreased (SV from aged: 60.3 ± 2.7%). It should be noted that the epididymal spermatozoa used in these experiments were from healthy individuals, different from those from which seminal vesicle extracts were collected. (See also the response to reviewer 1's comment #5.)

      (7) The quality of the provided images showing AR, Ki67, and TUNEL staining should be improved or additional images should be included. Especially the AR staining is hard to detect in the provided images. The authors should also include a co-staining between AR and vesicle epithelial cells. That epithelial cells are multilayered does not come across in the pictures provided.

      We apologize for any inconvenience caused. The image has been replaced with one of higher resolution, in which the multilayered structure of the epithelial cells can also be seen.

      For the 12-month-old mice, an age-matched control should be included to support the authors' conclusion.

      To clarify the seminal vesicle changes associated with aging, we included images of 3-month-old mice as controls (New Supplementary Fig.2D).

      Overall, the rationale for the experiment does not become clear. How are the amount of seminal vesicle epithelial cells, testosterone, and AR expression connected to seminal plasma secretions? Why is it a disadvantage to have proliferating seminal vesicle epithelial cells? How is proliferation connected to the proposed switch in metabolic pathway activity?

      We have added some explanations and supporting data to the manuscript (New Fig.8D, lines 303-305, 315-319, 369-379). Cell proliferation stopped when the metabolic shift occurred, redirecting glucose toward fatty acid synthesis. Fatty acid synthesis is an important function of the seminal vesicle, and in the presence of testosterone, fatty acid synthesis enhancement and arrest of proliferation occur simultaneously. The connection between metabolism and cell proliferation was further demonstrated when ACLY was knocked down by shRNA, which stopped fatty acid synthesis and released the proliferative arrest induced by testosterone, allowing the cells to proliferate again. However, we do not know what effects occur when cell proliferation is stopped.

      (8) The experiments provided for glycolysis and oxphos are inconsistent and insufficient to support the authors' conclusion that testosterone shifts glycolytic and oxphos activity of seminal vesicle epithelial cells. Multiple groups (PMID 37440924, 37655160, 32823893) have shown that the increased flux through central carbon metabolism during capacitation is accompanied by an accumulation of intracellular lactate and increased secretion of lactate into the surrounding media. How do the authors explain that they see an increase in glucose uptake and ECAR but not in lactate and a decrease in pyruvate? Did the authors additionally quantify intracellular pyruvate and lactate? Since pyruvate and lactate are in constant equilibrium, it is odd that one metabolite is changing and the other one is not.

      Thank you for pointing this out. Since ECAR is often used as an alternative to lactate production but does not directly measure lactate levels, we measured changes in lactate and pyruvate concentrations in the culture medium. Under our experimental conditions, glucose appeared to be directed primarily towards anabolic processes, such as fatty acid synthesis, rather than the OXPHOS pathway, which may explain the lack of lactate production. The observed decrease in pyruvate might indicate its conversion to acetyl-CoA in the mitochondria, supporting both fatty acid synthesis and the TCA cycle. This shift would be consistent with the metabolic reprogramming toward anabolic activity.

      What do the authors mean by "the glycolytic pathway was not enhanced despite the activation of glycolysis" Seahorse, especially using a series of pathway inhibitors, only provides an indirect measurement of glycolysis and oxphos since the instrument does not provide a distinction from which pathways the detected protons are originating. The authors should consider a more optimized experimental design, i.e. the authors could monitor ECAR and OCR in the presence of glucose over time with and without the addition of testosterone. That would be less invasive since the sperm are not starved at the beginning of the experiment and would provide a more direct read-out. Did the authors normalize cell numbers in their experiment? Alternatively, the authors could consider performing metabolomics experiments.

      We agree with the reviewer. Buzzwords such as “glycolytic capacity” simply do not make sense, so we have removed them from the phrases noted by the reviewer. Please refer to our responses to reviewer 1's points regarding the ambiguity of the data measured by the flux analyzer. Nevertheless, the flux analysis assay can serve as a good “starting point” and provide information on the glycolytic system and respiratory control. The interpretation of the flux analysis is therefore supported by the subsequent data sets.

      (9) The authors would strengthen their results by confirming their gene expression data by quantifying the expression of the respective proteins.

      Does testosterone treatment increase GLUT4 protein levels in isolated seminal vesicle epithelial cells? Or does it change the localization of the transporter? Are GLUT4 gene and protein levels altered in flutamide-treated cells? How do the authors explain that testosterone increases glucose uptake without changing Glut gene expression?

      We performed Western blot analysis to measure GLUT4 protein levels in seminal vesicle epithelial cells after testosterone treatment. The results showed that testosterone does not alter the expression of GLUT4 protein but simply changes its subcellular localization (New Fig.6C,D, lines 238-244).

      The discussion includes the interpretation of the observation that testosterone increases glucose uptake by altering localization without altering GLUT4 gene expression, a phenomenon commonly seen in other cells, such as cardiomyocytes (lines 362-364). The revised main figure also includes a data set of changes in GLUT4 localization, including flutamide-treated data. See also Reviewer 3's main comment #1.

      (10) Considering that the authors claim that SV secretions are crucial for sperm fertilization capacity, how do they explain that fertilization rates are still at 40 % when sperm are treated with flutamide?

      The fertilization rate with HTF alone is actually about 50%, because fertilization can occur without SV secretions. Considering this baseline, we found that seminal vesicle secretions positively affect the in vivo fertilization ability of sperm. On the other hand, seminal plasma from flutamide-treated mice reduced the fertilization ability of healthy sperm. These points are described in the text (lines 283-294).

      (11) It would be beneficial for the reader to include a schematic summarizing the results.

      Thank you for your advice from the reader's point of view. We have visualized the summaries of this study and added them to the manuscript (New Fig.10).

      Minor comments:

      Line 38: Male fertility, no article, please revise.

      I have changed “The male fertility” to “Male fertility” and added some references (lines 42-43).

      Line 55: Seminal plasma or TGFb? Please clarify.

      Corrected as follows. “TGFβ, a component of seminal plasma, increases antigen-specific Treg cells in the uterus of mice and humans, which induces immune tolerance, resulting in pregnancy.” (lines 60-62)

      Line 63: Why do the authors find it surprising that blood and seminal plasma have different compositions?

      This is because seminal plasma contains unique biochemical components that are not normally found in blood or only in small quantities. The intention was to emphasize the unique function of seminal plasma in supporting the physiological functions of sperm and to highlight its complex role by comparing it to blood. We clarified these intentions and reflected them in the revised text (lines 62-67).

      Line 94: The headline causes confusion. Seminal plasma does not induce sperm motility, it increases progressive sperm motility.

      Corrected as follows. “The effect of androgen-dependent changes in mouse seminal vesicle secretions on the linear motility of sperm” (lines 101-102)

      Reviewer #3 (Recommendations For The Authors):

      Thank you for allowing us to strengthen our manuscript with your valuable comments and queries. We have made our best efforts to reflect your feedback.

      Major:

      Figure 4 and Figure 5: The trend shows that GLUT3 is up-regulated and GLUT4 is downregulated although both of them are not statistically significant. However, GLUT4 is picked for all the following experiments based on protein localization. Providing other evidence/discussion why not to further consider other GLUTs will help to justify. Also, this reviewer suggests including GLUT4 localization data in the main figure as it is important data for the logical flow to link the following figures.

      We focused on GLUT4 because it was known that testosterone increases glucose uptake by changing the localization of GLUT4 without changing its expression (lines 230-231). In the revised manuscript, the increasing trend in Glut3 gene expression was also mentioned in the discussion, in addition to GLUT4 (lines 360-362). In any case, the results showed that testosterone increased glucose uptake by regulating the function of glucose transporters.

      Immunostaining of GLUT1~4 was performed to compare seminal vesicles from flutamide-treated mice with controls, and localization changes were observed only in GLUT4. Therefore, we hypothesized that GLUT4 is regulated by testosterone and performed the experiment. Fortunately, we were able to obtain a GLUT4-specific inhibitor, which dramatically inhibited the testosterone-dependent glucose uptake and subsequent lipid synthesis in seminal vesicle epithelial cells, leading us to believe that GLUT4 is a major glucose transporter.

      Increasing sperm linearity by oleic acid is observed and interpreted as enhanced sperm fertilizing potential. It is not clear why and how sperm linearity can be a determinant factor for enhancing sperm fertility in vivo. Providing an explanation of the effect of oleic acid on another key motility parameter more proven to be directly correlated with fertility (i.e., hyperactivation), and more direct evidence of oleic acid on enhancing sperm linearity indeed increasing sperm fertilization using IVF, is strongly recommended to support the author's main conclusion.

      Thank you for pointing this out. It is known that proteins derived from the seminal vesicles inhibit the hyperactivation of sperm and the acrosome reaction. Therefore, we conducted an experiment to add oleic acid, focusing on fatty acid synthesis caused by the metabolic shift of the seminal vesicles, which had not been known until now.

      Sperm were pretreated with an oleic acid-containing medium before IVF, and oleic acid enhanced sperm linearity. When the sperm count was sufficient, there was no change in the cleavage rate after in vitro fertilization, but when the sperm count was reduced to one-tenth of the normal, the cleavage rate increased compared to the control (lines 274-282). In other words, the physiological role of oleic acid is to increase the probability of fertilization by keeping the sperm motility pattern linear or progressive. This increases the likelihood of the sperm passing through the female reproductive tract and environments that are unfavorable to sperm survival. Our research has uncovered significant insights into the role of seminal vesicle fluid and oleic acid in sperm fertilization. Due to the strong effect of the decapacitation factor, we found that seminal vesicle fluid reduces the fertilization rate in IVF. However, it does not interfere with the fertilization rate in vivo during artificial insemination. This emphasizes the importance of oleic acid, along with other protein components of seminal plasma, in ensuring the in vivo fertilization ability of sperm.

      Minor:

      Please correct a typo in Line 173: sifts to shifts

      All typographical errors have been corrected.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors employed a combinatorial CRISPR-Cas9 knockout screen to uncover synthetically lethal kinase genes that could play a role in drug resistance to kinase inhibitors in triple-negative breast cancer. The study successfully reveals FYN as a mediator of resistance to depletion and inhibition of various tyrosine kinases, notably EGFR, IGF-1R, and ABL, in triple-negative breast cancer cells and xenografts. Mechanistically, they demonstrate that KDM4 contributes to the upregulation of FYN and thereby is an important mediator of drug resistance. All together, these findings suggest FYN and KDM4A as potential targets for combination therapy with kinase inhibitors in triple-negative breast cancer. Moreover, the study may also have important implications for other cancer types and other inhibitors, as the authors suggest that FYN could be a general feature of drug-tolerant persister cells.

      Strengths:

      (1) The authors used a large combination matrix of druggable tyrosine kinase gene knockouts, enabling studying of co-dependence of kinase genes. This approach mitigates off-target effects typically associated with kinase inhibitors, enhancing the precision of the findings.

      (2) The authors demonstrate the importance of FYN in drug resistance in multiple ways. They demonstrate synergistic interactions using both knockouts and inhibitors, while also revealing its transcriptional upregulation upon treatment, strengthening the conclusion that FYN plays a role in the resistance.

      (3) The study extends its impact by demonstrating the potent in vivo efficacy of certain combination treatments, underscoring the clinical relevance of the identified strategies.

      Weaknesses:

      (1) The methods and figure legends are incomplete, posing a barrier to the reproducibility of the study and hindering a comprehensive understanding and accurate interpretation of the results.

      We thank the reviewer for pointing this out. We have added as much detail to the methods and figure legends as possible to maximize reproducibility and support accurate interpretation of our results, as described in our responses to the recommendations for the authors.

      (2) The authors make use of a large quantity of public data (Fig. 2D/E, Fig. 3F/L/M, Fig 4C, Fig 5B/H/I), whereas it would have strengthened the paper to perform these experiments themselves. While some of this data would be hard to generate (e.g. patient data) other data could have been generated by the authors. The disadvantage of the use of public data is that it merely comprises associations, but does not have causal/functional results (e.g. FYN inhibition in the different cancer models with various drugs). Moreover, by cherry-picking the data from public sources, the context of these sources is not clear to the reader, and thus harder to interpret correctly. For example, it is not directly clear whether the upregulation of FYN in these models is a very selective event or whether it is part of a very large epigenetic re-programming, where other genes may be more critical. While some of the used data are from well-known curated databases, others are from individual papers that the reader should assess critically in order to interpret the data. Sometimes the public data was redundant, as the authors did do the experiments themselves (e.g. lung cancer drug-tolerant persisters), in this case, the public data could also be left out.

      More importantly, the original sources are not properly cited. While the GEO accession numbers are shown in a supplementary table, the articles corresponding to this data should be cited in the main text, and preferably also in the figure legend, to clarify that this data is from public sources, which is now not always the case (e.g. line 224-226). If these original papers do already mention the upregulation of FYN, and the findings from the authors are thus not original, these findings should be discussed in the Discussion section instead of shown in the Results.

      We welcome the reviewer’s concern. As the reviewer pointed out, our analysis of FYN expression levels across multiple studies of drug-tolerant cells may merely reflect association and not causal relationships. We have at least shown that FYN inhibition may reduce drug tolerance in TNBC and in EGFR inhibitor-treated lung cancer cells (figures 2H, 5E). The causal role of FYN in the emergence of drug tolerance in other cancers treated with different drugs (such as irinotecan-treated colon adenocarcinoma and gemcitabine-treated pancreatic adenocarcinoma) may be beyond the scope of this study. We added a brief discussion addressing this concern in lines 273-275.

      We also added proper citations of the public data used in this study to the main text and figure legends (lines 267-269). The GEO accession numbers are listed in supplementary table S2. Importantly, none of the referenced studies identified FYN as a key factor in generating drug-tolerant cells.

      (3) The claim in the abstract (and discussion) that the study "highlights FYN as broadly applicable mediator of therapy resistance and persistence", is not sufficiently supported by the results. The current study only shows functional evidence for this for an EGFR, IGF1R, and Abl inhibitor in TNBC cells. Further, it demonstrates (to a limited extent) the role of FYN in gefitinib and osimertinib resistance (also EGFR inhibitors) in lung cancer cells. Thus, the causal evidence provided is only limited to a select subset of tyrosine kinase inhibitors in two cancer types. While the authors show associations between FYN and drug resistance in other cancer types and after other treatments, these associations are not solid evidence for a causal connection as mentioned in this statement. Epigenetic reprogramming causing drug resistance can be accompanied by altered gene expression of many genes, and the upregulation of FYN may be a consequence, but not a cause of the drug resistance. Therefore, the authors should be more cautious in making such statements about the broad applicability of FYN as a mediator of therapy resistance.

      We fully agree with the reviewer’s concern that FYN upregulation is only an association and may not be the cause of drug tolerance and resistance. Therefore, to convey our findings accurately, we edited lines 34-36 of the abstract to “FYN expression is associated with therapy resistance and persistence by demonstrating its upregulation in various experimental models of drug-tolerant persisters and residual disease following targeted therapy, chemotherapy, and radiotherapy” and lines 288-290 of the discussion to “Upregulation of FYN is a general feature of drug tolerant cancer cells, suggesting the association of FYN expression with drug resistance and tumor recurrence after treatment.” We hope this satisfies the reviewer.

      (4) The rationale for picking and validating FYN as the main candidate gene over other genes such as FGFR2, FRK2, and TEK is not clear.

      a. While gene pairs containing FGFR2 knockouts seemed to be equally effective as FYN gene pairs in the primary screening, these could not be validated in the validation experiment. It is unclear whether multiple individual or a pool of gRNAs were used for this validation, or whether only 1 gRNA sequence was picked per gene for this validation. If only 1 gRNA per gene was used, this likely would have resulted in variable knockout efficiencies. Moreover, the T7 endonuclease assay may not have been the best method to check knockout efficiency, as it only implies endonuclease activity around a gene (but not to the extent of indels that can cause frameshifts, such as by TIDE analysis, or extent of reduction in protein levels by western blot).

      b. Moreover, FRK2 and TEK, also demonstrated many synergistic gene pairs in the primary screen. However, many of these gene pairs were not included in the validation screening. The selection criteria of candidate gene pairs for validation screening is not clear. Still, TEK-ABL2 was also validated as a strong hit in the validation screen. The authors should better explain the choice of FYN over other hits, and/or mention that TEK and FRK2 may also be important targets for combination treatment that can be further elucidated.

      We thank the reviewer for helping to improve our manuscript. We had concerns about the generalizability of FGFR2, FRK and TEK in TNBC, as their expression is very low in MDA-MB-231 and they are not enriched in TNBC compared to cancer cell lines of other subtypes. We added a brief comment on this concern in the results and discussion sections (lines 150-154, figure S3). We acknowledge that the validation in figure 2B is based on only one guide RNA. However, given the additional validation by pharmacological inhibition of FYN (figure 2F-I), we hope the reader and reviewer will be convinced of our key findings on synthetic lethality between FYN and other tyrosine kinases.

      (5) On several occasions, the right controls (individual treatments, performed in parallel) are not included in the figures. The authors should include the responses to each of the single treatments, and/or better explain the normalization that might explain why the controls are not shown.

      a. Figure 2G: The effect of PP2 treatment, without combined treatment, is not shown.

      b. Figure 2H/3G: The effect of the knockouts on growth alone, compared to sgGFP, is not demonstrated. It is unclear whether the viability of knockouts is normalized to sgGFP, or to each untreated knockout.

      c. Figure 2L: The effect of SB203580 as a single treatment is not shown.

      We thank the reviewer for pointing this out. The data shown in all figures listed in these concerns were normalized to the change in viability caused by the pharmacological or genetic perturbation that synergized with the TKIs (NVP-ADW742, gefitinib…etc.) used in the figures in the original manuscript. As the reviewer suggested, we have added the effects of SB203580 and PP2 treatment on cell viability in supplementary figures S4A and S4K. SB203580 had no significant effect on cell viability, whereas PP2 treatment caused a significant decrease in cell viability, which is expected because PP2 can inhibit the activity of multiple Src family kinases. Regardless of the effects of SB203580 and PP2 on cell viability as single agents, it is evident that TKI treatment synergistically decreased cell viability in cancer cell lines. The change in viability caused by FYN or histone lysine demethylase knockout is also provided in the newly added figures S4D and S6C. Notably, genetic ablation of FYN or histone lysine demethylases had modest, if any, influence on cell viability.

      (6) The study examines the effects at a single, relatively late time point after treatment with inhibitors, without confirming the sequential impact on KDM4A and FYN. The proposed sequence of transcriptional upregulation of KDM4A followed by epigenetic modifications leading to FYN upregulation would be more compellingly supported by demonstrating a consecutive, rather than simultaneous, occurrence of these events. Furthermore, the protein level assessment at 48 hours (for RNA levels not clearly described), raises concerns about potential confounding factors. At this late time point, reduced cell viability due to the combination treatment could contribute to observed effects such as altered FYN expression and P38 MAPK phosphorylation, making it challenging to attribute these changes solely to the specific and selective reduction of FYN expression by KDM4A.

      We thank the reviewer for pointing this out. We performed a time-course experiment of NVP-ADW742 treatment in MDA-MB-231 cells, shown in our newly added figure 3E. Surprisingly, NVP-ADW742 treatment increased the KDM4A protein level within two hours. FYN protein accumulation followed KDM4A accumulation after 24 hours. This observation, together with our chromatin immunoprecipitation data in figure 3O, provides evidence that FYN accumulation is a consequence of KDM4A accumulation and H3K9me3 demethylation upon TKI treatment. We discuss these data in the Results and Discussion sections in lines 214-216.

      (7) The cut-off for considering interactions "synergistic" is quite low. The manual of the used "SynergyFinder" tool itself recommends values above 10 as synergistic and between -10 and 10 as additive (https://synergyfinder.fimm.fi/synergy/synfin_docs/). Here, values between 5-10 are also considered synergistic. Caution should be taken when discussing those results. Showing the actual dose response (including responses to each single treatment) may be required to enable the reader to critically assess the synergy, along with its standard deviation.

      We thank the reviewer for the careful comments. We reanalyzed our data with the SynergyFinder Plus tool (Zheng, Genomics, Proteomics, and Bioinformatics 2022), which implements mathematical models distinct from those of SynergyFinder 3 for a more faithful implementation of the Bliss and Loewe models and, more critically, calculates the statistical significance of the synergy. We provide updated synergy plots with statistics in figures 2F, 3J, and S4B. All drug combinations show statistically significant synergy (p<0.01). We also added the raw data used to calculate synergy in figures 2F, 3J and S4B as supplementary dataset S2.
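
      To make the cut-off discussion concrete, the following is a minimal sketch of a Bliss excess score for a single dose pair, assuming fractional inhibition values between 0 and 1. It is not the SynergyFinder or SynergyFinder Plus implementation; the function names and the example numbers are hypothetical, and the real tools summarize scores over a full dose matrix and add statistics.

```python
def bliss_expected(inh_a: float, inh_b: float) -> float:
    """Expected fractional inhibition if drugs A and B act independently (Bliss)."""
    return inh_a + inh_b - inh_a * inh_b


def bliss_excess(inh_a: float, inh_b: float, inh_ab: float) -> float:
    """Observed minus Bliss-expected inhibition, in percentage points."""
    return 100.0 * (inh_ab - bliss_expected(inh_a, inh_b))


def classify(score: float, cutoff: float = 10.0) -> str:
    """Label a score using a symmetric cut-off (10 is the value the reviewer cites)."""
    if score > cutoff:
        return "synergistic"
    if score < -cutoff:
        return "antagonistic"
    return "additive"


# Hypothetical example: 20% and 30% inhibition alone, 60% in combination.
score = bliss_excess(0.20, 0.30, 0.60)
print(round(score, 1), classify(score))
```

      With a cut-off of 10, a score of 7.5 would be labeled additive; lowering the cut-off to 5 relabels the same score as synergistic, which is the sensitivity the reviewer is pointing to.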

      (8) As the effect size on Western blots is quite limited and sometimes accompanied by differences in loading control, these data should be further supported by quantifications of signal intensities of at least 3 biological replicates (e.g. especially Figure 3A/5A). The figure legends should also state how many independent experiments the blots are representative of.

      We added quantifications for figures 3A and 5A to better depict our results. The figure legends were edited to indicate that each blot is representative of three independent experiments.

      (9) While the article provides mechanistic insights into the likely upregulation of FYN by KDM4A, this constitutes only a fragment of the broader mechanism underlying drug resistance associated with FYN. The study falls short in investigating the causes of KDM4A upregulation and fails to explore the downstream effects (except for p38 MAPK phosphorylation, which may not be complete) of FYN upregulation that could potentially drive sustained cell proliferation and survival. These omissions limit the comprehensive understanding of the complete molecular pathway, and the discussion section does not address potential implications or pathways beyond the identified KDM4A-FYN axis. A more thorough exploration of these aspects would enhance the study's contribution to the field.

      We welcome the reviewer’s careful concern. We agree that our delineation of the mechanisms underlying TKI resistance in TNBC involving KDM4 and FYN is far from complete. Increased expression of histone demethylases has been observed in cancers treated with different drugs. The mechanisms governing this increase in histone demethylase expression are not known and are beyond the scope of this paper. We added this point to the Discussion section in lines 299-304.

      (10) FYN has been implied in drug resistance previously, and other mechanisms of its upregulation, as well as downstream consequences, have been described previously. These were not evaluated in this paper, and are also not discussed in the discussion section. Moreover, the authors did not investigate whether any of the many other mechanisms of drug resistance to EGFR, IGF1R, and Abl inhibitors that have been described, could be related to FYN as well. A more comprehensive examination of existing literature and consideration of alternative or parallel mechanisms in the discussion would enhance the paper's contribution to understanding FYN's involvement in drug resistance.

      FYN has been implicated in TKI resistance in CML cell lines (Irwin, Oncotarget, 2015). In that study, FYN was similarly transcriptionally upregulated in imatinib-resistant CML, and this upregulation depended on the EGR1 transcription factor. To address this concern, we generated EGR1 KO MDA-MB-231 cells and tested whether these cells retain the ability to accumulate FYN. Consistent with the previous study, imatinib treatment increased the EGR1 protein level. However, EGR1 knockout did not influence FYN accumulation in MDA-MB-231 cells. EGR1-mediated accumulation of FYN may be a context-specific phenomenon in CML (Figure S5B). We discuss this result in the Results section in lines 187-190. We also acknowledge that SRC family kinases are generally involved in drug resistance in many cancers. We discuss recent findings regarding SRC family kinases in drug resistance in the Results section in lines 145-147 and in the Discussion section in lines 315-317.

      Reviewer #2 (Public Review):

      Summary:

      Kim et al. conducted a study in which they selected 76 tyrosine kinases and performed CRISPR/Cas9 combinatorial screening to target 3003 genes in Triple-negative breast cancer (TNBC) cells. Their investigation revealed a significant correlation between the FYN gene and the proliferation and death of breast cancer cells. The authors demonstrated that depleting FYN and using FYN inhibitors, in combination with TKIs, synergistically suppressed the growth of breast cancer tumor cells. They observed that TKIs upregulate the levels of FYN and the histone demethylase family, particularly KDM4, promoting FYN expression. The authors further showed that KDM4 weakens the H3K9me3 mark in the FYN enhancer region, and the inhibitor QC6352 effectively inhibits this process, leading to a synergistic induction of apoptosis in breast cancer cells along with TKIs. Additionally, the authors discovered that FYN is upregulated in various drug-resistant cancer cells, and inhibitors targeting FYN, such as PP2, sensitize drug-resistant cells to EGFR inhibitors.

      Strengths:

      This study provides new insights into the roles and mechanisms of FYN and KDM4 in tumor cell resistance.

      Weaknesses:

      It is important to note that previous studies have also implicated FYN as a potential key factor in drug resistance of tumor cells, including breast cancer cells. While the current study is comprehensive and provides a rich dataset, certain experiments could be refined, and the logical structure could be more rigorous. For instance, the rationale behind selecting FYN, KDM4, and KDM4A as the focus of the study could be more thoroughly justified.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The methods and figure legends are incomplete, posing a barrier to the reproducibility of the study and hindering a comprehensive understanding and accurate interpretation of the results. A critical revision of these aspects is needed, for example:

      a. Catalogue numbers of certain products critical to reproduce the study (e.g. antibodies) and/or at what company they have been purchased (e.g. used compounds)

      b. On several occasions the used concentrations of drugs or exposure time are not mentioned (e.g. Figure 2H, G (PP2), I, J, K, L, etc.)

      c. Figure legend of figure panels E-I in Figure 5 seems to be completely incorrect and not consistent with the figure axis etc.

      d. RT-qPCR methodology is not described in Methods.

      e. Western blot methods are very limited: these should be described in more detail or cite an article that does.

      f. Organoid culture: Information about the source of tumour cells (e.g. pre-treatment biopsy, material after surgery), isolation of tumor cells (e.g. methodology, characterization of material) and culture conditions (e.g. culture time before the experiment) is lacking.

      g. Information about how gefitinib/osimertinib-resistant PC9 and HCC827 cells are generated (as well as culture conditions and where they are from) is missing.

      We thank the reviewer for pointing these out. We have done our best to add the experimental details needed for reproducibility to the Methods section and figure legends in lines 343-348, 408-426, 431-432, 439-453, 648-650, 671-672 and 691-693.

      (2) Figure 1B/C/D: it would be more meaningful if the most important hits (at least in one of these panels) were highlighted (e.g. line with gene-pair named), or visualized separately, so that the reader does not have to read the supplementary table to know what the most important hits were.

      We thank the reviewer for the careful comment. We added labels for the key synergistic gene pairs in figure 1D, as the reviewer suggested.

      (3) qPCR data shown in Figure S4 is from 1 independent experiment. As these experiments (especially qPCR) can be rather variable and the effect size is not very large, I would highly recommend repeating these experiments, or excluding them, as conclusions from them are not solid.

      We judged that performing qPCR with the many drugs that did not cause substantial synergistic cell death with NVP-ADW742 in figure S5C (figure S4A in the previous version of the manuscript) would not provide much additional insight. Also, as we were more interested in finding direct regulators of FYN expression, we focused on drugs that inhibit epigenetic regulators that activate transcription. Therefore, we performed FYN qPCR with drug combinations involving GSK-J4 (KDM6 inhibitor) and pinometostat (DOT1L inhibitor). As shown in our newly added figure S5D, GSK-J4 inhibited FYN expression, whereas pinometostat failed to do so. We also confirmed that knockout of KDM5 or KDM6 reproducibly failed to decrease FYN expression upon TKI treatment (figures S5E and S5G). The new results are discussed in lines 193-198. We hope these additions satisfy the reviewer.

      (4) For validation of synergistic knockouts, it would be helpful for the interpretation to also show the viability/growth of each knockout (or treatment), instead of mostly normalized scores. For example, the reader now has no insight into whether FYN knockout itself already affects cell viability, or not. If it (or EGFR/IGF1R/ABL knockout) would already substantially affect cell viability, a further reduction in cell viability may not be as relevant as when it would not affect cell viability at all.

      We thank the reviewer for pointing this out. We revised figure 2A and now show the raw changes in cell viability for each single- and double-knockout cell line in figure S2A. We hope this satisfies the reviewer.

      (5) The curve fitting as in Figure 2G is somewhat misleading. While the curve seems to be forced to go from 1-0, the +PP2 dose-response curve does actually not seem to start at 1, but rather at 0.8, likely resulting from the effect of PP2 as a single treatment, thus, effects may be interpreted as more synergistic than that they truly are.

      The results shown in figure 2G are normalized to cells treated or not treated with PP2, to better reflect the effects of NVP-ADW742, gefitinib and imatinib in the presence of PP2. A viability value starting at 0.8 is therefore not due to the effect of PP2 as a single agent (because the data are normalized to PP2-treated cells), but arises because a very small dose of NVP-ADW742 in particular caused a modest decrease in viability. To depict our findings more accurately, we added a data point to figure 2G at a TKI dose of 0 µM with viability 1. We also added details of the viability normalization to the figure legends.
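
      The normalization described above can be illustrated with a minimal sketch. It assumes each dose-response arm is divided by its own 0 µM TKI reading, so that the +PP2 arm is expressed relative to PP2-only cells. The function name and the readings are hypothetical and are not taken from the manuscript.

```python
def normalize_to_baseline(raw_viability: list[float]) -> list[float]:
    """Divide each reading by the first one (the 0 µM TKI condition of that arm)."""
    baseline = raw_viability[0]
    if baseline <= 0:
        raise ValueError("baseline viability must be positive")
    return [value / baseline for value in raw_viability]


# Hypothetical raw readings at increasing TKI doses, first entry = 0 µM TKI.
minus_pp2 = [1.00, 0.95, 0.80, 0.60]   # relative to untreated cells
plus_pp2 = [0.80, 0.64, 0.40, 0.16]    # PP2 alone already lowers raw viability

print(normalize_to_baseline(minus_pp2))
print(normalize_to_baseline(plus_pp2))  # both arms now start at 1.0
```

      Under this scheme the single-agent effect of PP2 is divided out, so any drop below 1 in the +PP2 arm reflects the added TKI dose rather than PP2 itself.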

      (6) The readability of the paper could be enhanced by higher-quality images (now the text is quite pixelated).

      We had technical difficulties in converting file types. We have replaced figures for better resolution for all main and supplementary figures.

      (7) The discussion now contains one paragraph about the selectivity of kinase inhibitors, and that repurposing of inhibitors with more relaxed specificity or multi-kinase inhibitors can be beneficial. This does not seem to fall within the scope of the study, as there was no comparison between selective and non-selective inhibitors. It was also not clearly mentioned that the non-selective inhibitors worked better than the gene knockouts, or that for example, KDM3 and KDM4 knockout together worked better than only KDM4 knockout. It is recommended to either remove this paragraph, or rephrase it so that it better fits the actual results

      We agree with the reviewer. We chose to remove this paragraph in lines 308-313.

      (8) The entire paper does not discuss any known functions of FYN. Its function could be very briefly introduced in the results section when highlighting it as an important hit. More importantly, its known role in cancer and especially drug resistance should be discussed in the discussion (see also Public review).

      We thank the reviewer for pointing this out. We added a brief description of the role of FYN in cancer malignancy and drug resistance in lines 145-147. In particular, FYN accumulation driven by the EGR1 transcription factor has been described in the context of imatinib-resistant chronic myeloid leukemia (Irwin, Oncotarget, 2015). To address this, we tested whether EGR1 knockout decreases the FYN level in MDA-MB-231 cells (Figure S5A). Notably, EGR1 knockout failed to decrease the FYN protein level. This result is discussed in lines 187-190.

      (9) Textual changes including:

      a. Line 29 (and others) "Massively parallel combinatorial CRISPR screens": I would rather choose a more descriptive term, such as "combinatorial tyrosine kinase knockout CRISPR screen", which already clarifies the screen used knockouts of (druggable) tyrosine kinases only. Using both "Parallel" and "combinatorial" is somewhat redundant, and "massively" is subjective, in my opinion.

      Manuscript edited as suggested (lines 29, 63, 86, 283). The term “massively parallel” has been removed, as its removal does not change our scientific findings.

      b. Line 67 (and others): "to identify ... for elimination of TNBC": while this may be its potential implication, this study has identified genes in (mostly) TNBC cell lines and cell line xenografts. Please rephrase to something more within the scope of this research.

      Manuscript edited as suggested (lines 68-69) as “we utilize CombiGEM-CRISPR technology to identify tyrosine kinase inhibitor combinations with synergistic effect in TNBC cell line and xenograft models for potential combinatorial therapy against TNBC.” We hope it satisfies the reviewer.

      c. Line 31 (and others): Please check the capitals of words describing inhibitors, and make them consistent (e.g. Imatinib written with capital I, other inhibitors without capitals).

      We thank the reviewer for catching this error. We changed all “imatinib” and “osimertinib” to lowercase.

      d. Line 71: "... combining PP2, saracatinib (FYN inhibitor), ...". Here it is not clear PP2 is a FYN inhibitor, and, as saracatinib is a well-known Src-inhibitor, it is not correct to just say "FYN inhibitor". Better to rephrase to something such as: "combining PP2 (Lck/Fyn inhibitor), saracatinib (Src/FYN inhibitor)".

      As the reviewer noted, most Src family kinase inhibitors are not selective for a specific member of the Src family. Therefore, we changed line 73 to “PP2, saracatinib (Src family kinase / FYN inhibitor).”

      e. Line 81: "The resulting library enabled massively parallel screens of pairwise knockouts, .." To clarify this is for the selected kinases only: "The resulting library enabled screens of pairwise knockouts of the 76 tyrosine kinase genes, .."

      Manuscript edited as suggested by the reviewer in line 86.

      f. Line 88 (and others): "after infection" consider rephrasing to "after transduction" as this is more commonly used when using lentiviral vectors only.

      We thank the reviewer for this. Every instance of “infection” that designates lentiviral transduction was changed to “transduction”.

      g. Line 97-99: While being described as "good" correlation, a correlation of the same sgRNA pair, yet in a different order, of r=0.5 does not seem to be very good, neither does a correlation of r=0.74 for biological replicates. Please consider describing in a less subjective way.

      We removed the subjective terms and changed the manuscript as follows: “sgRNA pair (e.g., sgRNA-A + sgRNA-B and sgRNA-B + sgRNA-A) were positively correlated (r = 0.50) and were combined when calculating Z (Fig. S1D). The Z scores for three biological replicates were also correlated with r = 0.74 between replicates #2 and #3 (Fig. S1E).” in lines 97-101.

      h. Lines 92-96 and lines 102-115: The results section here contains quite a lot of technical information. While some information may be directly needed to understand the described results (such as a very short and simple explanation of how to interpret gene interaction score), other information may be more appropriate for the Methods section, to enhance the readability of the paper. Consider simplifying here and giving a more detailed overview in the Methods section. Also, the text is not entirely clear. You seem to give two separate explanations of how the GI scores were calculated (Starting in lines 106 and 111): please rephrase and clearly indicate the connections between those two explanations (in the Methods section).

      We thank the reviewer for the valuable suggestion. We moved significant portions of the technical descriptions to the Methods section. We also clarified the text regarding the procedures for calculating GI scores in lines 385-387.

      i. Line 142: "These findings suggest that gene A could represent an attractive drug target.." "Gene A" should be "FYN"?

      We thank the reviewer for catching this. Indeed, it is “FYN” and we changed it in line 154.

      j. Line 149: Introduce Saracatinib, and make the reader aware that it actually mostly targets Src, and FYN with lower affinity.

      We newly added text in lines 73 and 164 to indicate that saracatinib is an inhibitor against Src family kinases.

      k. Line 469: "by the two sgRNA." "by the two sgRNAs".

      Corrected

      l. Throughout text/figures/figure legends, please check for consistency in the naming of cell lines, compounds, referring to figures etc. (E.g. MDA-MB-231/MDA MB 231/MDAMB-231 ; Fig. 1/Figure 1).

      Corrected. Thank you for catching this error.

      m. In Methods, frequently ug or uL are used instead of µg or µL

      Corrected.

      n. Legend Figure 5: Clarify what A, G, I, D, and P mean.

      Corrected in line 685-686 to: “A: NVP-ADW742, G: gefitinib, I: imatinib, D: doxorubicin, P: Paclitaxel.”

      o. Line 303: What is meant by: "The six variable nucleotides were added in reverse primer for multiplexing". Could you clarify this in the text?

      We apologize for the confusion. The six nucleotides are an index sequence for the multiplexed NGS run. The text in lines 373-374 is edited to: “The six nucleotides described as “NNNNNN” in reverse primer above represents unique index to identify biological replicates in multiplexed NGS run.”

      Reviewer #2 (Recommendations For The Authors):

      To enhance the robustness of the conclusions drawn from this study, certain concerns merit attention.

      Concerns:

      (1) Line 130 indicates that eight synergistic target gene combinations were validated. It would be helpful to clarify the criteria used to select these gene pairs and provide the rationale for studying these specific combinations of genes.

      In fact, we selected the gene pairs for which we had sgRNAs available when we performed the experiments, so we do not have a strong rationale for our selections. Instead, we added a brief discussion in lines 304-306 stating that further validation is required for the gene pairs not experimentally tested.

      (2) According to Figure 2C, FYN was identified as crucial among the 30 gene pairs, and its upregulation in TNBC prompted further investigation. It would be informative to discuss the expression levels of TEK, FRK, and FGFR2 in TNBC and explain why these nodes were not studied. Is there existing evidence demonstrating the superiority of FYN over these other genes?

      A similar concern was raised by reviewer #1. The expression levels of TEK, FRK and FGFR2 were relatively low in MDA-MB-231 and in TNBCs in general, and we were concerned about the generalizability of these targets for treating TNBC. While validating these genes for possible synthetic lethality may yield valuable insight, this is beyond the scope of this paper. This concern is now discussed in the Results and Discussion sections in lines 150-154.

      (3) The screening process employed only one cell line, and validation was conducted with only one cell line (Figure 2A). Consider supplementing the findings with more convincing evidence from other breast cancer cell lines to strengthen the conclusions.

      Although the CRISPR screens and primary validations were done with only one cell line, further validations with drug combinations were done in independent cancer cell lines such as Hs578T (figures S4E-J). In addition, the possible association of FYN expression with drug tolerance was also demonstrated in lung cancer cells. We hope this satisfies the reviewer.

      (4) The network analysis in Figure 2C lacks a description of the methodology used. It would be beneficial to provide a brief explanation of the methods employed for this analysis.

      The network analysis was done manually with the size of each node proportional to the number of gene pairs. We newly added text in figure legend in line 638 to clarify this.

      (5) The significance of gene A mentioned in line 142 is unclear. Please provide a clear explanation or context for the importance of this gene.

      This is a mistake that was also pointed out by reviewer #1. The “gene A” should have been “FYN”. We corrected this in line 154.

      (6) In Figure 2J and Figure 2K, it would be more informative to measure the phosphorylation levels of FYN and SRC rather than just their baseline levels. Consider revising the figures accordingly.

      We thank the reviewer for the careful comment. We now provide supplementary figure S5A to show that the phosphorylation level of FYN is increased, but this increase was proportional to the increase in FYN protein level, so the pFYN/FYN ratio did not change significantly. We discuss this result in lines 187-190.

      (7) Figure S4B lacks biological replicates, which could impact the reliability of the experimental results. Consider adding biological replicates to enhance the robustness of the findings.

      This was also pointed out by reviewer #1. Instead of performing qPCR for all drugs, we focused on validating the decrease in FYN mRNA level for drug combinations that synergistically kill cancer cells. We also aimed to identify direct mediators of FYN mRNA upregulation, so we focused on drug combinations involving inhibitors of epigenetic regulators that promote transcription. To this end, we tested the impact of GSK-J4 (KDM6 inhibitor) and pinometostat (DOT1L inhibitor), in combination with TKI, on FYN expression level. Notably, while GSK-J4 attenuated FYN mRNA accumulation upon NVP-ADW742 treatment, pinometostat failed to do so (figure S5C). We describe these results in lines 192-197 of the Results section.

      (8) Line 186 indicates that KDM3 knockout was not tested in Figure S5A. It would be helpful to provide an explanation for this omission or consider including the data if available.

      We thank the reviewer for pointing this out. The T7 endonuclease assay results for KDM3, KDM4 and PHF8 are added in figure S6B. All guide RNAs used in the study efficiently generated indel mutations.

      (9) In line 206, KDM4A is introduced, but Figures 3J and 3M had already pointed to KDM4A. The authors did not analyze the ChIP results for other members of the KDM4 family at this point. Please address this inconsistency and provide a rationale for focusing on KDM4A. Additionally, in Figure 3M, consider adding peak labeling to the enriched portion for clarity.

      We welcome the reviewer’s careful concern. KDM4 family enzymes catalyze identical reactions and are thought to be redundant. Therefore, we judged that the most abundantly expressed gene of the KDM4 family should be the primary target to focus on. To this end, we analyzed the expression levels of KDM4 family genes in supplementary figure S6A. Indeed, KDM4A expression was the highest among the KDM4 family genes. We discuss this in the Results section in lines 218-220.

      (10) The author only indicated the relationship between the H3K9me3 level in the enhancer region and FYN expression. It would be valuable to verify the activity of the enhancers and investigate additional markers such as H3K27ac and H3K4me1. Consider discussing these aspects to provide a more comprehensive understanding.

      Since we and others had shown that histone demethylases are increased upon drug treatment, we focused on histone methylation marks that are associated with gene repression and whose removal by demethylases is associated with drug resistance. To this end, KDM6 demethylases, which remove H3K27me3, may serve as an attractive alternative. In our newly added supplementary figure S6E, ADW742 treatment did not decrease the H3K27me3 level at the FYN promoter, indicating that H3K9me3 may be the dominant epigenetic change that modulates FYN expression upon drug treatment. This is briefly discussed in lines 233-235.

      (11) In Figure 4A, the addition of the drug alone does not inhibit tumor growth. Please provide an explanation for this result and consider discussing potential reasons for the observed lack of inhibition.

      The drug doses were adjusted carefully to minimize tumor shrinkage by each single drug, so that synergistic tumor shrinkage would be clearer.

      (12) Line 208 indicates missing parentheses in the text describing Figure 4C. Please correct the text accordingly to ensure clarity.

      Corrected. Thank you for catching this error.

      (13) The figure legends for Figures 5E, F, G, and H contain errors. Please correct the figure legends to accurately describe the respective figures.

      We thank the reviewer for catching this error. We have changed the figure legends in lines 691-697 to accurately describe the figures.

      (14) It may be beneficial for the authors to divide the results section into several subsections and add headings to improve the overall understanding of the findings.

      This is an excellent suggestion. We divided our results section into subsections and added headings in lines 80, 141, 181, 237 and 251 to help readers understand our findings.

      (15) The authors should include the sgRNA sequences used for gene targeting, along with details of the target genes and negative/positive controls, in the Supplementary Materials to enhance reproducibility and transparency.

      This is a critical point for improving reproducibility of our work. The sgRNA sequences used in the study are newly added in supplementary table S3.

      (16) The resolution of the figures in the Supplementary Materials is too low, which may impede the authors' ability to interpret the data. Consider providing higher-resolution figures for better readability.

      A similar concern was raised by reviewer #1. We have provided higher-resolution images for all main and supplementary figures.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      By using biophysical chromosome stretching, the authors measured the stiffness of chromosomes of mouse oocytes in meiosis I (MI) and meiosis II (MII). This study was the follow-up of previous studies in spermatocytes (and oocytes) by the authors (Biggs et al. Commun. Biol. 2020; Hornick et al. J. Assist. Rep. and Genet. 2015). They showed that MI chromosomes are much stiffer (~10 fold) than mitotic chromosomes of mouse embryonic fibroblast (MEF) cells. MII chromosomes are also stiffer than the mitotic chromosomes. The authors also found that oocyte aging increases the stiffness of the chromosomes. Surprisingly, the stiffness of meiotic chromosomes is independent of meiotic chromosome components, Rec8, Stag3, and Rad21L.

      Strengths:

      This provides a new insight into the biophysical property of meiotic chromosomes, that is chromosome stiffness. The stiffness of chromosomes in meiosis prophase I is ~10-fold higher than that of mitotic chromosomes, which is independent of meiotic cohesin. The increased stiffness during oocyte aging is a novel finding.

      Weaknesses:

      A major weakness of this paper is that it does not provide any molecular mechanism underlying the difference between MI and MII chromosomes (and/or prophase I and mitotic chromosomes).

      We acknowledge that our study does not provide a comprehensive explanation for the stage-related alterations in chromosome stiffness; however, we believe that the observation of these changes is itself of broad interest. Initially, we hypothesized that DNA damage or depletion of meiosis-specific cohesin might contribute to the observed increase in chromosome stiffness. However, our experimental findings did not support these hypotheses, indicating that neither DNA damage nor cohesin depletion is responsible for the stiffness increase. The molecular basis underlying the stage-related stiffness increase remains elusive and requires exploration in future studies. In the Discussion, we propose that factors such as condensin, nuclear proteins, and histone methylation may play a role in regulating meiotic chromosome stiffness. The involvement of these factors in stage-related chromosome stiffening requires future investigation.

      Reviewer #2 (Public Review):

      This paper reports investigations of chromosome stiffness in oocytes and spermatocytes. The paper shows that prophase I spermatocytes and MI/MII oocytes yield high Young's modulus values in the assay the authors applied. The authors claim that deficiency in each of three meiosis-specific cohesins did not affect this result; increased stiffness was seen in aged oocytes but not in oocytes treated with the DNA-damaging agent etoposide.

      The paper reports some interesting observations which are in line with a report by the same authors of 2020 where increased stiffness of spermatocyte chromosomes was already shown. In that sense, the current manuscript is an extension of that previous paper, and thus novelty is somewhat limited. The paper is also largely descriptive, as it neither proposes a mechanism nor reports factors that determine the chromosomal stiffness.

      There are several points that need to be considered.

      (1) Limitations of the study and the conclusions are not discussed in the "Discussion" section and that is a significant gap. Even more so as the authors rely on just one experimental system for all their data - there is no independent verification - and that in vitro system may be prone to artefacts.

      Our experimental system has been used to study different types of chromosome stiffness as well as nuclear stiffness. We have compared our results with previously published data and found that the data are consistent across different experiments. To address the reviewer’s concern, we describe the limitations of our in vitro experimental approach in the Discussion section.

      (2) It is somewhat unfortunate that they jump between oocytes and spermatocytes to address the cohesin question. Prophase I (pachytene) spermatocyte chromosomes are not directly comparable to MI or MII oocyte chromosomes. In fact, the authors report Young's modulus values of 3700 for MI oocytes and only 2700 for spermatocyte prophase chromosomes, illustrating this difference. Why not use oocyte-specific cohesin deficiencies?

      In this study, our goal was to investigate the mechanism underlying the increased chromosome stiffness observed during prophase I. Ideally, we would have compared wild-type and cohesin-deleted mouse oocytes at the metaphase I (MI) stage. However, experimental constraints made this approach unfeasible: spermatocytes and oocytes from Rec8<sup>-/-</sup> and Stag3<sup>-/-</sup> mutant mice cannot reach MI stage, and Rad21l<sup>-/-</sup> mutant mice are sterile in males and subfertile in females, because cohesin proteins are crucial for germline cell development.

      Additionally, collecting prophase I chromosomes from oocytes is exceptionally challenging and requires fetal mice as prophase I oocyte sources because female oocytes progress to the diplotene stage during fetal development. The process is further complicated by the difficulty of genotyping fetal mice, making the study of female prophase I impracticable. By contrast, spermatocytes are continuously generated in males throughout life, with meiotic stages readily identifiable, making them more accessible for analysis.

      Our findings consistently showed increased chromosome stiffness in both prophase I spermatocytes and MI oocytes, suggesting that the phenomenon is not sex-specific. This observation implies that similar effects on chromosome stiffness may occur across meiotic stages, from prophase I to MI.

      (3) It remains unclear whether the treatment of oocytes with the detergent TritonX-100 affects the spindle and thus the chromosomes isolated directly from the Triton-lysed oocytes. In fact, it is rather likely that the detergent affects chromatin-associated proteins and thus structural features of the chromosomes.

      Regarding the use of Triton X-100, it is important to emphasize that the concentration used (0.05%) is very low and unlikely to significantly affect chromosome stiffness. To support this assertion, we have provided additional evidence in the revised manuscript demonstrating that this low concentration of Triton X-100 has a negligible effect on chromosome stiffness (Supplement Fig. 5, Right panel).

      (4) Why did the authors use mouse strains of different genetic backgrounds, CD-1, and C57BL/6? That makes comparison difficult. Breeding of heterozygous cohesin mutants will yield the ideal controls, i.e. littermates.

      The genetic mutant mice, all in a C57BL/6 background, were generously provided by Dr. Philip Jordan and delivered to our lab. As our lab does not currently maintain a C57BL/6 colony, and given that this strain typically produces small litter sizes (which would have complicated the remainder of the study), we chose CD-1 mice as the control group and used C57BL/6 mice specifically for the cohesin study. To address potential concerns regarding genetic background differences, we compared our results with previously published data from C57BL/6 mice and found no significant differences (2710 ± 610 Pa versus 3670 ± 840 Pa, P = 0.4809) (Biggs et al., 2020). Furthermore, prophase I spermatocytes from CD-1 mice showed no significant difference compared to any of the three cohesin-deleted C57BL/6 mutant mice, suggesting that chromosome stiffness is not significantly influenced by genetic background.

      (5) How did the authors capture chromosome axes from STAG3-deficient spermatocytes which feature very few if any axes? How representative are those chromosomes that could be captured?

      We isolated chromosomes from prophase I mutant spermatocytes, which were identified by their large size, round shape, and thick chromosomal threads, characteristics indicative of advanced condensation and a zygotene-like stage during prophase I (Supplemental Fig. 3). The methodology for isolating these chromosomes has been described in detail in our previous publication (Biggs et al., 2020), which is referenced in the current manuscript.

      Reviewer #3 (Public Review):

      Summary:

      Understanding the mechanical properties of chromosomes remains an important issue in cell biology. Measuring chromosome stiffness can provide valuable insights into chromosome organization and function. Using a sophisticated micromanipulation system, Liu et al. analyzed chromosome stiffness in MI and MII oocytes. The authors found that chromosomes in MI oocytes were ten-fold stiffer than mitotic ones. The stiffness of chromosomes in MI mouse oocytes was significantly higher than that in MII oocytes. Furthermore, the knockout of the meiosis-specific cohesin component (Rec8, Stag3, Rad21l) did not affect meiotic chromosome stiffness. Interestingly, the authors showed that chromosomes from old MI oocytes had higher stiffness than those from young MI oocytes. The authors claimed this effect was not due to the accumulated DNA damage during the aging process because induced DNA damage reduced chromosome stiffness in oocytes.

      Strengths:

      The technique used (isolating the chromosomes in meiosis and measuring their stiffness) is the authors' specialty. The results are intriguing and informative to the chromatin/chromosome and other related fields.

      Weaknesses:

      (1) How intact the measured chromosomes were is unclear.

      Currently, a well-calibrated chromosome mechanics experiment requires the extracellular isolation of chromosomes. In experiments conducted parallel to those in our previous study (Biggs et al., 2020), we obtained quantitatively consistent results, including measurements of the Young’s modulus for prophase I spermatocyte chromosomes. Our isolation approach is significantly gentler than bulk methods that rely on hypotonic buffer-driven cell lysis and centrifugation. If substantial chromosomal damage had occurred during isolation, we would expect greater variation between experiments, as different amounts or types of damage could influence the results.

      (2) Some control data needs to be included.

      We used wild-type prophase I spermatocytes and metaphase I (MI) oocytes as controls. To validate our findings, we compared some of our results with those reported in a previous study and observed consistent outcomes (Biggs et al., 2020).

      (3) The paper was not well-written, particularly the Introduction section.

      We have revised the paper and improved the overall quality of the manuscript.

      (4) How intact were the measured chromosomes? Although the structural preservation of the chromosomes is essential for this kind of measurement, the meiotic chromosomes were isolated in PBS with Triton X-100 and measured at room temperature. It is known that chromosomes are very sensitive to cation concentrations and macromolecular crowding in the environment (PMID: 29358072, 22540018, 37986866). It would be better to discuss this point.

      As suggested, we investigated the impact of PBS and Triton X-100 on chromosome stiffness. Our findings indicate that neither PBS nor Triton X-100 caused significant changes in chromosome stiffness (Supplemental Fig. 5).

      Recommendations For The Authors:

      Major points of Reviewers that the Editor indicated should be addressed

      (1) Reviewer's point 3, the effect of the high concentration of etoposide: It would be advisable to use lower concentrations of etoposide to observe the effect of DNA damage on chromosome stiffness more accurately.

      The effect of etoposide on oocytes is dose-dependent (Collins et al., 2015). Oocytes are generally not highly sensitive to DNA damage, and even at relatively high concentrations, not all may exhibit a response. To ensure sufficient DNA damage in the oocytes we isolated, we used a relatively high concentration of etoposide for the experiment. This concentration (50 μg/ml) falls within the typical range reported in the literature (Marangos and Carroll, 2012; Cai et al., 2023; Lee et al., 2023). As the reviewer suggested, we tested two additional lower concentrations of etoposide (5 μg/ml and 25 μg/ml) (see Fig. 5C). We did not observe any significant difference in chromosome stiffness in oocytes treated with 5 μg/ml etoposide compared to the control. However, the higher concentration of etoposide (25 μg/ml) significantly reduced oocyte chromosome stiffness compared to the control.

      Revision to manuscript:

      “Results at lower etoposide concentrations revealed that chromosome stiffness in untreated control oocytes was not significantly different from that in oocytes treated with 5 μg/ml etoposide (3780 ± 700 Pa versus 3930 ± 400 Pa, P = 0.8624). However, chromosome stiffness in untreated oocytes was significantly higher than that in oocytes treated with 25 μg/ml etoposide (3780 ± 700 Pa versus 1640 ± 340 Pa, P = 0.015) (Figure 5C).”

      (2) Reviewer's point 3, the effect of Triton X-100: This is related to the concern of Reviewer #3. It is critical to check whether the detergent affects the stiffness indirectly.

      To demonstrate that the low concentration of Triton X-100 does not influence chromosome stiffness, we conducted additional experiments. First, we isolated chromosomes and measured their stiffness. Then, we treated the chromosomes with 0.05% Triton X-100 via micro-spraying and remeasured the stiffness. The results showed no significant difference (see Supplement Fig. 5 right panel).

      Revision to manuscript:

      “In addition to past experiments indicating that mitotic chromosomes are stable for long periods after their isolation (Pope et al., 2006), we carried out control experiments on mouse oocyte chromosomes where we incubated them for 1 hour in PBS, or exposed them to a flow of Triton X-100 solution for 10 minutes; there was no change in chromosome stiffness in either case (Methods and Supplementary Fig. 5).”

      (3) Reviewer's point 1, the effect of the buffer composition: Please describe how the composition affects the stiffness of the chromosomes.

      PBS is an economical and effective buffer solution that closely mimics the osmotic conditions of the cytoplasm, which is important for maintaining chromosomal structural integrity. Appropriate ion concentrations are crucial for preserving chromosome integrity, as imbalances, whether too high or too low, can alter chromosome morphology (Poirier and Marko, 2002). When chromosomes are stored in PBS, their stiffness remains relatively stable, even with prolonged exposure, ensuring minimal changes to their physical properties. To confirm this, we isolated chromosomes and measured their stiffness. After a one-hour incubation in PBS, we remeasured stiffness and observed no significant differences, which demonstrated that chromosomes remain stable in PBS (see Supplement Fig. 5 left panel).

      Revision to manuscript:

      “In this study, we developed a new way to isolate meiotic chromosomes and measure their stiffness. However, one concern is that the measurements were conducted in PBS solution, which is different from the intracellular environment. To address this, we monitored chromosome stiffness over time in PBS solution and found that it remained stable over a period of one hour (Supplement Fig. 5 Left panel).”

      Reviewer #1 (Recommendations For The Authors):

      Major points:

      (1) The role of condensin complexes in chromosome stiffness has been shown previously (Sun et al. Chromosome Research, 2018). Thus, the authors should at least describe the condensin staining on MI and MII chromosomes.

      We have added sentences in the discussion to elaborate on the role of condensin.

      Revision to manuscript:

      “Several factors, including condensin, have been found to affect chromosome stiffness (Sun et al., 2018). Condensin exists in two distinct complexes, condensin I and condensin II, and both are active during meiosis. Published studies indicate that condensin II is more sharply defined and more closely associated with the chromosome axis from anaphase I to metaphase II (Lee et al., 2011). Additionally, condensin II appears to play a more significant role in mitotic chromosome mechanics compared to condensin I (Sun et al., 2018). Thus, condensin II likely contributes more significantly to meiotic chromosome stiffness than condensin I.”

      (2) Although the authors nicely showed the difference in the stiffness between MI and MII chromosomes (Figure 2), as known, MI chromosomes are bivalent (with four chromatids) while MII chromosomes are univalent (with two chromatids). The physical property of the chromosomes would be affected by the number of chromatids. It would be essential for the authors to measure the physical properties of a univalent of MI chromosomes from mice defective in meiotic recombination such as Spo11 and/or Mlh3 KO mice.

      The reviewer correctly pointed out that the number of chromatids in chromosomes differs between metaphase I (MI) and metaphase II (MII) stages. We have addressed this difference by calculating Young’s modulus (E), a mechanical property that describes the elasticity of a material, independent of its geometry. Young’s modulus describes the intrinsic properties of the material itself, rather than the specific characteristics of the object being tested. It is calculated as E = (F/A)/(∆L/L0), where F is the force applied to stretch the chromosome, A is the cross-sectional area, ∆L is the change in chromosome length, and L0 is the original length of the chromosome. Although an increase in chromosome or chromatid number results in a larger cross-sectional area, and therefore a higher doubling force (F), this variation in chromosome number or cross-sectional area does not affect the calculated chromosome stiffness, i.e., Young’s modulus (E). While study of the mutants suggested by the referee would certainly be interesting, it is likely that the absence of these key recombination factors would impact chromosome stiffness in a more complex way than just changing chromosome thickness; this type of study is beyond the scope of the present manuscript and is an exciting direction for future studies.
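
      The geometry argument can be illustrated with a short numerical sketch. It assumes an idealized, uniform, linearly elastic rod, which is a simplification and not necessarily how the authors processed their data. All numbers, and the function names youngs_modulus and doubling_force, are hypothetical and chosen only for illustration. The sketch shows that doubling the cross-sectional area doubles the force needed to double the length, while the recovered Young’s modulus is unchanged.

```python
import math


def youngs_modulus(force_n, area_m2, delta_length_m, initial_length_m):
    """Return E = (F/A) / (dL/L0) in pascals."""
    stress = force_n / area_m2                   # force per unit cross-sectional area (Pa)
    strain = delta_length_m / initial_length_m   # fractional extension (dimensionless)
    return stress / strain


def doubling_force(modulus_pa, area_m2):
    """Return the force that doubles the length (strain = 1), F = E * A, in newtons."""
    return modulus_pa * area_m2


# Hypothetical values for illustration only; not measurements from the study.
MODULUS_PA = 1000.0                       # same material stiffness for both objects
area_thin = math.pi * (0.5e-6) ** 2       # circular cross-section, radius 0.5 um
area_thick = 2.0 * area_thin              # twice the cross-sectional area

force_thin = doubling_force(MODULUS_PA, area_thin)
force_thick = doubling_force(MODULUS_PA, area_thick)

# Stretch each object by its own 5 um length (strain = 1) and recover the modulus.
L0 = 5e-6
e_thin = youngs_modulus(force_thin, area_thin, L0, L0)
e_thick = youngs_modulus(force_thick, area_thick, L0, L0)

print(f"doubling force ratio (thick/thin): {force_thick / force_thin:.2f}")
print(f"Young's modulus, thin:  {e_thin:.1f} Pa")
print(f"Young's modulus, thick: {e_thick:.1f} Pa")
```

      In this idealized picture the doubling force scales with cross-sectional area, whereas E does not, which is why the response compares stages by modulus as well as by doubling force.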

      (3) In Figure 5, the authors measure the stiffness of etoposide-treated MI chromosomes. The concentration of the drug was 50 ug/ml, which is very high. The authors should analyze the different concentrations of the drug to check the chromosome stiffness. Moreover, etoposide is an inhibitor of Topoisomerase II. The effect of the drug might be caused by the defective Top2 activity, rather than Top2-adducts, thus DNA damage. It is very important to check the other Top2 inhibitors or DNA-damaging agents to generalize the effect of DNA damage on chromosome stiffness. Moreover, DNA damage induces the DNA damage response. It is important to check the effect of DDR inhibitors on the damage-induced change of stiffness.

      The reviewer is correct in noting that etoposide can induce DNA damage and inhibit Top2 activity. To address this concern, we refer to our previous DNase experiment, which provides further clarity and supports the results of this study (Biggs et al., 2020). That experiment was conducted in vitro, where DNase treatment caused DNA damage on chromosomes without affecting Top2 activity or triggering a DNA damage response. The results demonstrated that DNase treatment led to reduced chromosome stiffness, which aligns with the findings presented in our manuscript.

      (4) In the same line as the #3 point, the authors also need to check the effect of etoposide on the stiffness of mitotic chromosomes from MEF.

      Experiments on MEF mitotic chromosomes were designed to serve as a reference for the meiotic chromosome studies. The etoposide experiments on meiotic chromosomes specifically aimed to investigate how DNA damage affects meiotic chromosome structure. While it would be interesting to explore the effects of etoposide-induced DNA damage on mitotic chromosomes, it represents a distinct research question that falls outside the scope of the current study.

      Minor points:

      (1) Line 141-142: Previous studies by the author analyzed the stiffness of mitotic chromosomes from pro-metaphase. Which stage of cell cycles did the authors analyze here?

      To ensure consistency in our experiments, we also measured the stiffness of mitotic chromosomes at the prometaphase stage. The precise stage used is very near to metaphase, at the very end of the prometaphase stage. We have modified the manuscript to clarify this point.

      Revision to manuscript:

      “For comparison with the meiotic case, we measured the chromosome stiffness of Mouse Embryonic Fibroblasts (MEFs) at late pro-metaphase (just slightly before their attachment to the mitotic spindle) and found that the average Young’s modulus was 340 ± 80 Pa (Figure 2B). The value is consistent with our previously published data, where the modulus for MEFs was measured to be 370 ± 70 Pa (Biggs et al., 2020).”

      (2) Line 157: Here, the doubling force of MI (and MII) oocytes should be described in addition to those of spermatocytes.

      The purpose of this paragraph is to demonstrate the reproducibility and consistency of our experiments. In this section, we compared our data with previously published findings. Published data do not include chromosome stiffness measurements from MI mouse oocytes; our experiment is the first to assess this. Therefore, we did not include MI mouse oocytes in that comparison. To clarify this, we have added sentences to highlight the comparison of doubling force.

      Revision to manuscript:

      “Here, we found that the doubling forces of chromosomes from MI and MII oocytes are 3770 ± 940 pN and 510 ± 50 pN, respectively. We conclude that chromosomes from MI oocytes are much stiffer than those from both mitotic cells and MII oocytes (Supplement Fig. 2), in terms of either Young’s modulus or doubling force.”

      (3) Line 202: What stage of prophase I do the authors mean by the spermatocyte stage here? Diakinesis, Metaphase I or prometaphase I? I am not sure how the authors can determine a specific stage of prophase I by only looking at the thickness of the chromosomes. Please show the thickness distribution of WT and Rec8<sup>-/-</sup> chromosomes.

      We have reworded the sentence and clarified that the spermatocyte stage is prophase I. Since Rec8<sup>-/-</sup> spermatocytes cannot progress beyond the pachytene stage of prophase I, the isolated chromosomes must be in prophase I rather than diakinesis, metaphase I, prometaphase I, or any later stages (Xu et al., 2005). Based on the cell size and degree of chromosome condensation (Biggs et al., 2020), it is most likely that the measured chromosomes are at the zygotene-like stage. However, as we cannot definitively determine the exact substage of prophase I, we have referred to them simply as prophase I.

      Revision to manuscript:

      “We isolated chromosomes from Rec8<sup>-/-</sup> prophase I spermatocytes, which displayed large and round cell size and thick chromosomal threads, indicative of advanced chromosome compaction after stalling at a zygotene-like prophase I stage (Supplement Fig. 3). The combination of large cell size and degree of chromosome compaction allowed us to reliably identify Rec8<sup>-/-</sup> prophase I chromosomes. Using micromanipulation, we measured chromosome stiffness by stretching the chromosomes (Supplement Fig. 3) (Biggs et al., 2019).”

      Reviewer #2 (Recommendations For The Authors):

      (1) Line 135: that statement is not substantiated; better to show retraction data and full reversibility.

      We added a figure showing oocyte chromosome stretching, which showed that the oocyte chromosome is elastic, and that the stretching process is reversible (Supplement Fig. 1).

      (2) Line 144: the authors claim that the Young Modulus of MII oocytes is "slightly" higher than that of mitotic cells (MEFs). Well, "slightly" means it is rather similar, and therefore the commonly used statement that MII is similar to mitosis is OK - contrary to the authors' claim.

      We have removed the word “slightly” in the manuscript. The difference is statistically significant.

      Revision to manuscript:

      “Surprisingly, despite this reduction, the stiffness of MII oocyte chromosomes was still significantly higher than that for mitotic cells (Figure 2B).”

      (3) There are a lot of awkward sentences in this text. Some sentences lack words, are not sufficiently precise in wording and/or logic, and there are numerous typos. Some examples can be found in lines 89 (grammar), 94, 95 ("looked"), 98, 101 ("difference" - between what?), and some are commonplaces or superficial (lines 92/93, 120..., ). Occasionally the present and past tense are mixed (e.g. in M&M). Thus the manuscript is quite poorly written.

      We thank the reviewer for these comments. We have revised all the sentences highlighted by the reviewer and polished the entire manuscript.

      Reviewer #3 (Recommendations For The Authors):

      (1) Line 48. "We then investigated the contribution of meiosis-specific cohesin complexes to chromosome stiffness in MI and MII oocytes." There is no data on oocytes with meiosis-specific cohesin KO. This part should be corrected.

      We have corrected this error.

      Revision to manuscript:

      “We examined the role of meiosis-specific cohesin complexes in regulating chromosome stiffness.”

      (2) Lines 155-157. The result of MI mouse oocyte chromosomes should also be mentioned here (Supplementary Figure 1).

      Please see our response to Reviewer 1 – Minor Point 2.

      (3) Line 163. "The stiffness of chromosomes in MI mouse oocytes is significantly higher compared to MII oocytes." Is this because two homologs are paired in MI chromosomes (but not in MII chromosomes)? The authors may want to discuss the possible mechanism.

      Please see our response to Reviewer 1 – Major Point 2.

      (4) Line 188: "We hypothesized that MI oocytes... would have higher chromosome stiffness than MII oocytes." Why did the authors measure chromosomes from spermatocytes but not MI oocytes?

      Both spermatocytes and oocytes from Rec8<sup>-/-</sup>, Stag3<sup>-/-</sup>, and Rad21l<sup>-/-</sup> mutant mice cannot reach MI stage because cohesin proteins are crucial for germline-cell development. We chose to use spermatocytes in our study because collecting fetal meiotic oocytes is extremely difficult, and genotyping fetal mice adds another layer of complexity to the experiments. In females, all oocytes complete prophase I and progress to the dictyotene stage during the fetal stage. Obtaining individual oocytes at this stage is challenging. In contrast, spermatocytes are continuously generated at all stages in males.

      (5) To support the authors' conclusion, verifying the KO of REC8, STAG3, and RAD21L by immunostaining or other methods is essential.

      These mice were provided by one of the authors, Dr. Philip Jordan, who has published several papers using these knockout mice (Hopkins et al., 2014; Ward et al., 2016). The immunostaining of these models has already been well-characterized in those previous studies. In addition to performing double genotyping, we also used the size of the collected testes as an additional verification of the mutant genotype. These knockout mice have significantly smaller testes compared to their wild-type counterparts, providing a clear physical indicator of the mutation.

      (6) Some of the cited papers and descriptions in the Introduction are not appropriate and confusing. This part should be improved:

      Line 79. Recent studies have revealed that the 30-nm fiber is not considered the basic structure of chromatin (e.g., review, PMID: 30908980; original papers, PMID: 19064912, 22343941, 28751582). This point should be included.

      We have corrected the references as needed. Additionally, thank you for the updated information regarding the 30-nm fiber. We have removed all the descriptions about the 30-nm fiber to ensure the information is accurate and up to date.

      (7) Line 83. Reviews on mitotic chromosomes, rather than Ref. 9, should be cited here. For instance, PMID: 33836947, 31230958.

      We have corrected it and added references according to the reviewer’s suggestion.

      (8) Line 85. Refs. 10 and 11 are not on the "Scaffold/Radial-Loop" model. For instance, PMID: 922894, 277351, 12689587. The other popular model is the hierarchical helical folding model (PMID: 98280, 15353545).

      We have corrected it and added appropriate references according to the reviewer’s suggestion. Regarding the hierarchical helical folding model, our experiments do not provide data that either support or refute this model. Thus, we have opted not to include any discussion of this model in our manuscript.

      (9) Figure legends. There is no description of the statistical test.

      We have added the description of the statistical test at the end of the figure legends for clarity.

      (10) Line 156. The authors should mention which stages in spermatocyte prophase I (pachytene?) were used for their measurement.

      We cannot precisely determine the substage of prophase I in the spermatocytes although it is most likely in the pachytene stage.

      (11) Line 241. "DNA damage reduces chromosome stiffness in oocytes." It would be better to show how much damage was induced in aged and etoposide-treated chromosomes, for example, by gamma-H2AX immunostaining. In addition, there are some papers that show DNA damage makes chromatin/chromosomes softer (e.g., PMID: 33330932). The authors need to cite these papers.

      The effects of etoposide and age on meiotic oocytes have been published (Collins et al., 2015; Marangos et al., 2015; Winship et al., 2018).

      We are grateful for the citation information provided by the reviewer and have added it to our manuscript.

      Revision to manuscript:

      “Overall, these findings suggest that DNA damage reduces chromosome stiffness in oocytes instead of increasing it, which aligns with other studies showing that DNA damage can make chromosomes softer (Dos Santos et al., 2021). These results suggest that the increased chromosome stiffness observed in aged oocytes is not due to DNA damage.”

      (12) Line 328. Senescence?

      This error is corrected in the revised manuscript.

      Revision to manuscript:

      “Defective chromosome organization is often related to various diseases, such as cancer, infertility, and senescence (Thompson and Compton, 2011; Harton and Tempest, 2012; He et al., 2018).”

      References:

      Biggs, R., P.Z. Liu, A.D. Stephens, and J.F. Marko. 2019. Effects of altering histone posttranslational modifications on mitotic chromosome structure and mechanics. Mol. Biol. Cell. 30:820–827. doi:10.1091/mbc.E18-09-0592.

      Biggs, R.J., N. Liu, Y. Peng, J.F. Marko, and H. Qiao. 2020. Micromanipulation of prophase I chromosomes from mouse spermatocytes reveals high stiffness and gel-like chromatin organization. Commun. Biol. 3:1–7. doi:10.1038/s42003-020-01265-w.

      Cai, X., J.M. Stringer, N. Zerafa, J. Carroll, and K.J. Hutt. 2023. Xrcc5/Ku80 is required for the repair of DNA damage in fully grown meiotically arrested mammalian oocytes. Cell Death Dis. 14:1–9. doi:10.1038/s41419-023-05886-x.

      Collins, J.K., S.I.R. Lane, J.A. Merriman, and K.T. Jones. 2015. DNA damage induces a meiotic arrest in mouse oocytes mediated by the spindle assembly checkpoint. Nat. Commun. 6. doi:10.1038/ncomms9553.

      Harton, G.L., and H.G. Tempest. 2012. Chromosomal disorders and male infertility. Asian J. Androl. 14:32–39. doi:10.1038/aja.2011.66.

      He, Q., B. Au, M. Kulkarni, Y. Shen, K.J. Lim, J. Maimaiti, C.K. Wong, M.N.H. Luijten, H.C. Chong, E.H. Lim, G. Rancati, I. Sinha, Z. Fu, X. Wang, J.E. Connolly, and K.C. Crasta. 2018. Chromosomal instability-induced senescence potentiates cell non-autonomous tumourigenic effects. Oncogenesis. 7. doi:10.1038/s41389-018-0072-4.

      Hopkins, J., G. Hwang, J. Jacob, N. Sapp, R. Bedigian, K. Oka, P. Overbeek, S. Murray, and P.W. Jordan. 2014. Meiosis-Specific Cohesin Component, Stag3 Is Essential for Maintaining Centromere Chromatid Cohesion, and Required for DNA Repair and Synapsis between Homologous Chromosomes. PLoS Genet. 10:e1004413. doi:10.1371/journal.pgen.1004413.

      Lee, C., J. Leem, and J.S. Oh. 2023. Selective utilization of non-homologous end-joining and homologous recombination for DNA repair during meiotic maturation in mouse oocytes. Cell Prolif. 56:1–12. doi:10.1111/cpr.13384.

      Lee, J., S. Ogushi, M. Saitou, and T. Hirano. 2011. Condensins I and II are essential for construction of bivalent chromosomes in mouse oocytes. Mol. Biol. Cell. 22:3465–3477. doi:10.1091/mbc.E11-05-0423.

      Marangos, P., and J. Carroll. 2012. Oocytes progress beyond prophase in the presence of DNA damage. Curr. Biol. 22:989–994. doi:10.1016/j.cub.2012.03.063.

      Marangos, P., M. Stevense, K. Niaka, M. Lagoudaki, I. Nabti, R. Jessberger, and J. Carroll. 2015. DNA damage-induced metaphase I arrest is mediated by the spindle assembly checkpoint and maternal age. Nat. Commun. 6:1–10. doi:10.1038/ncomms9706.

      Poirier, M.G., and J.F. Marko. 2002. Mitotic chromosomes are chromatin networks without a mechanically contiguous protein scaffold. Proc. Natl. Acad. Sci. U. S. A. 99:15393–15397. doi:10.1073/pnas.232442599.

      Pope, L.H., C. Xiong, and J.F. Marko. 2006. Proteolysis of Mitotic Chromosomes Induces Gradual and Anisotropic Decondensation Correlated with a Reduction of Elastic Modulus and Structural Sensitivity to Rarely Cutting Restriction Enzymes. Mol. Biol. Cell. 17:104. doi:10.1091/MBC.E05-04-0321.

      Dos Santos, Á., A.W. Cook, R.E. Gough, M. Schilling, N.A. Olszok, I. Brown, L. Wang, J. Aaron, M.L. Martin-Fernandez, F. Rehfeldt, and C.P. Toseland. 2021. DNA damage alters nuclear mechanics through chromatin reorganization. Nucleic Acids Res. 49:340–353. doi:10.1093/nar/gkaa1202.

      Sun, M., R. Biggs, J. Hornick, and J.F. Marko. 2018. Condensin controls mitotic chromosome stiffness and stability without forming a structurally contiguous scaffold. Chromosom. Res. 26:277–295. doi:10.1007/s10577-018-9584-1.

      Thompson, S.L., and D.A. Compton. 2011. Chromosomes and cancer cells. Chromosom. Res. 19:433–444. doi:10.1007/s10577-010-9179-y.

      Ward, A., J. Hopkins, M. McKay, S. Murray, and P.W. Jordan. 2016. Genetic Interactions Between the Meiosis-Specific Cohesin Components, STAG3, REC8, and RAD21L. G3 (Bethesda). 6:1713–24. doi:10.1534/g3.116.029462.

      Winship, A.L., J.M. Stringer, S.H. Liew, and K.J. Hutt. 2018. The importance of DNA repair for maintaining oocyte quality in response to anti-cancer treatments, environmental toxins and maternal ageing. Hum. Reprod. Update. 24:119–134. doi:10.1093/humupd/dmy002.

      Xu, H., M.D. Beasley, W.D. Warren, G.T.J. van der Horst, and M.J. McKay. 2005. Absence of Mouse REC8 Cohesin Promotes Synapsis of Sister Chromatids in Meiosis. Dev. Cell. 8:949–961. doi:10.1016/j.devcel.2005.03.018.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer 1

      Major issue #1. Regarding the conclusions on IRE1 signaling, both yeast species have different IRE1 activities (https://elifesciences.org/articles/00048), the total deletion of IRE1 in S. pombe appears to indicate that expansion of perinuclear ER is independent of IRE1, however since IRE1 signaling has exclusively a negative impact on mRNA expression, it might be relevant to identify mRNAs whose expression is stabilized under those circumstances and evaluate whether those could confer a mechanism which would also yield perinuclear ER expansion (e.g. differential deregulation of ER stress controlled lipid biosynthesis required for lipid membrane synthesis). In S. cerevisiae, do the authors observe HAC1 mRNA splicing?

      We have not tested whether HAC1 mRNA is processed in S. cerevisiae. We will address this question by RT-PCR.

      In addition, as requested by the reviewers, we will further test the involvement of Ire1 in the HU/DIA-induced phenotype in S. pombe. To that end, we will reassess our RNA-seq data and compare them with the data from Kimmig et al. (2012) on UPR activation in S. pombe. We will test the levels and splicing of Bip1 mRNA upon HU/DIA treatments by RT-PCR, and finally we will test the levels of Gas2p, which has been described to decrease upon Ire1/UPR activation in S. pombe.

      We are confident that the results of these experiments and the re-analysis of our RNA-Seq data will help us to infer the mechanisms that modulate the ER response to HU or DIA treatment.

      Major issue #2. The authors indicate that HU and DIA lead to thiol stress, it might be relevant to evaluate the thiol-redox status of major secretory proteins in S. pombe (or even cargo reporters if necessary) to fully document the stress impact on global protein redox status.

      We agree with the reviewer that it is important to determine the redox and the functional state of the secretory pathway in our conditions to fully understand the cellular consequences of these treatments, especially in the case of HU, as it is routinely used in clinics.

      In this context, we have already included new data showing that HU or DIA treatment leads to alterations in the Golgi apparatus and in the distribution of secretory proteins (Figures 3A-B).

      In addition, we plan to perform mass spectrometry experiments to detect protein glutathionylation in our conditions, as it has been previously shown that DIA treatment leads to glutathionylation of key ER proteins such as Bip1, Pdi or Ero1 (Lind et al., 2002; Wang & Sevier, 2016), which might be reproduced upon HU treatment. We will specifically test the redox state of Bip1, Pdi and/or Ero1 by immunoprecipitation and western blot.

      Finally, we plan to test the folding and processing of specific secretory cargoes by western blot in our experimental conditions (See below, Reviewer 2, Major issue #1).

      What happens if HU-treated yeast cells are grown in the presence of n-acetyl cysteine?

      We have tested whether the addition of this antioxidant could prevent and/or reverse the N-Cap phenotype. We found that NAC in combination with HU increased N-Cap incidence (Figure 5H). As NAC is a GSH precursor and we find that GSH is required to develop the N-Cap phenotype (Figure 5A-B, D, G), this result further supports the idea that the HU-induced cellular damage might involve ectopic glutathionylation of proteins.

      Unfortunately, we have not tested NAC in combination with DIA, as NAC seems to reduce DIA as soon as the two come into contact, as judged by the change in the characteristic orange color of DIA. The same happens when we combine GSH and DIA (Supplementary Figure 5A-B).

      In this regard, the following information has been added to the manuscript (pages 32-33, highlighted in blue):

      "We also tested GSH addition to the medium in combination with either HU or DIA. When mixed with DIA, we noticed that the color of the culture changed after GSH addition (Figure S5A), which suggests that GSH and DIA can interact extracellularly, thus preventing us from being able to draw conclusions from those experiments. On the other hand, combining GSH with HU increased N-Cap incidence (Figure 5G), as expected based on our previous observations. Additionally, we checked whether the addition of the antioxidant N-acetyl cysteine (NAC), a GSH precursor, impacted upon the N-Cap phenotype. The results were the same as with GSH addition: when combined with HU, NAC increased N-Cap incidence (Figure 5H), whereas in combination, the two compounds interacted extracellularly (Figure S5B). These data align with NAC being a precursor of GSH, as incrementing GSH levels augments the penetrance of the HU-induced phenotype".

      Major issue #3. The appearance of cytosolic aggregates is intriguing, do the authors have any idea on the nature of the protein aggregates?

      DIA is a strong oxidant, and HU treatment results in the production of reactive oxygen species (ROS). Therefore, one hypothesis would be that cytoplasmic chaperone foci represent oxidized and/or misfolded soluble proteins. Indeed, this hypothesis is supported by the appearance of cytoplasmic foci containing the guk1-9-GFP and Rho1.C17R-GFP soluble reporters of misfolding upon HU or DIA treatment (Figure 4I-J). We have already tested whether they contain Vgl1, which is one of the main components of heat-shock-induced stress granules in S. pombe (Wen et al., 2010). However, we found that HU- or DIA-induced foci lacked this stress granule marker, and indeed Vgl1 did not form any foci in response to these treatments. Therefore, our aggregates differ from the canonical stress-induced granules. We have yet to include these data in the manuscript, but we plan to do so for the final version.

      To further explore the nature of the cytoplasmic aggregates induced by HU and DIA, we will test whether Hsp104-containing foci colocalize with guk1-9-GFP and/or Rho1.C17R-GFP foci.

      Are those resulting from proficient retrotranslocation or reflux of misfolded proteins from the ER?

      To test whether these cytosolic aggregates result from retrotranslocation from the ER, we plan to use the vacuolar Carboxypeptidase Y mutant reporter CPY*, which is misfolded. This misfolded protein is imported into the ER lumen but does not reach the vacuole. Instead, it is retrotranslocated to the cytoplasm, where it is ubiquitinated and degraded by the proteasome (Mukaiyama et al., 2012). We will analyze by fluorescence microscopy the localization of CPY*-GFP and Hsp104-containing aggregates upon HU or DIA treatment, with or without proteasome inhibitors. We can also test the levels, processing and ubiquitination of CPY*-GFP by western blot, as ubiquitination of retrotranslocated proteins occurs once they are in the cytoplasm.

      Are those aggregates membrane bound or do they correspond to aggresomes as initially defined? The Walter lab has demonstrated a tight balance between ER phagy and ER membrane expansion (https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.0040423), which could also impact on the presence of protein aggregates in the cytosol.

      Our results suggest that these aggregates are not bound to ER membranes, as they do not appear in close proximity to the ER area marked by mCherry-AHDL in fluorescence microscopy images.

      To fully rule out this possibility, we will test whether these Hsp104-aggregates colocalize with ER transmembrane proteins such as Rtn1 or Yop1, with Gma12-GFP that marks the Golgi apparatus and with the dye FM4-64 that stains endosomal-vacuole membranes.

      We have tested whether deletion of key genes involved in autophagy affected the N-Cap phenotype. To this end, we used deletions of ypt1, vac8 and atg8 in strains expressing Cut11-GFP and/or mCherry-AHDL and found that none of them affected N-Cap formation. These data suggest that the core machinery of autophagy is not critical for HU/DIA-induced ER expansion. We plan to include these data in the final version of the manuscript along with the rest of the proposed experiments.

      To gain deeper insight and to fully rule out a possible contribution of macro-autophagy to the HU- and DIA-induced phenotypes, we plan to analyze by western blot whether GFP-Atg8 is induced and cleaved upon HU or DIA treatment, which would be indicative of macroautophagy activation.

      To test whether the cytoplasmic aggregates are the result of an imbalance between ER-expansion and ER-phagy we plan to analyze the localization of GFP-Atg8 and Hsp104-RFP in the atg7Δ mutant, impaired in the core macro-autophagy machinery. In these conditions, the number or size of the cytoplasmic aggregates might be impacted.

      On the other hand, it has been recently shown that an ER-selective microautophagy occurs in yeasts upon ER stress (Schäfer et al., 2020; Schuck et al., 2014). This micro-ER-phagy involves the direct uptake of ER membranes into lysosomes, is independent of the core autophagy machinery, depends on the ESCRT system and is influenced by the Nem1-Spo7 phosphatase. ESCRT directly functions in scission of the lysosomal membrane to complete the uptake of the ER membrane. Interestingly, N-Caps are fragmented in the absence of cmp7 and especially in the absence of vps4 or lem2, the nuclear adaptor of the ESCRT (Figure 3E). We had initially interpreted these results as the need to maintain nuclear membrane identity during the process of ER expansion (Kume et al., 2019); however, the appearance of fragmented ER upon HU treatment in the absence of ESCRT might also be due to an inability to complete microautophagic uptake of ER membranes. To test this hypothesis, we plan to analyze whether the fragmented ER in these conditions co-localizes with lysosome/vacuole markers.

      Major issue #4. Nucleotide depletion was previously shown to lead to HSP16 expression through activation of the spc1 MAPK pathway (https://academic.oup.com/nar/article/29/14/3030/2383924), one might think that HU (or diamide) could lead to this through a nucleotide dependent mechanism and not necessarily through a thiol-redox protein misfolding stress. This issue has to be sorted out to ensure that the HSP effect is independent of nucleotide depletion.

      As stated in Taricani et al. (2001), hsp16 expression is strongly induced in a cdc22-M45 mutant background. We performed experiments in this mutant that were included in the original version of the manuscript and remain in the current version (Sup. Fig. 2C) and, under restrictive conditions, we do not see spontaneous N-Cap formation. If Hsp16 overexpression and nucleotide depletion were key to the mechanism triggering N-Cap appearance, we would expect this mutant to eventually form N-Caps when placed at the restrictive temperature. Furthermore, Taricani et al. show that Hsp16 expression was abolished in a Δatf1 mutant background in the presence of HU, and we found that this mutant is still able to produce N-Caps in HU; therefore, our results strongly suggest that the N-Cap phenotype is independent of the MAPK pathway and of the expression of hsp16.

      Minor issues

      1. P1 - UPR = Unfolded Protein Response: Corrected in the manuscript.
      2. P22 - HSP upregulation "might" be indicative of a folding stress: Corrected in the manuscript.
      3. The abstract does not reflect the findings presented in the manuscript. In addition, I would recommend the authors revise the storytelling in their manuscript to push forward the message on either the specific phenotype associated with perinuclear ER or on the characterization of protein misfolding stress.

         We have modified the abstract to better reflect our findings and will further revise our arguments in the final version of the manuscript once we have the results of the proposed experiments.

      Reviewer 2

      Major issue #1. The authors state the cytoplasmic and ER folding are both disrupted. The impact on ER protein biogenesis would be bolstered with some biochemical data focused on the folding of one or more nascent secretory proteins. Is disulfide bond formation and/or protein folding indeed disrupted?

      We have addressed the status of secretion in cells treated with HU or DIA by assessing the morphology of the Golgi apparatus and the localization of several secretory proteins by fluorescence microscopy and found that both HU and DIA treatments impact the secretion system. In addition, we plan on addressing the redox status of ER proteins (Bip1, Pdi or Ero1) by biochemical approaches. Please see the answer to major issue #2 from reviewer 1.

      We will also analyze by western blot the biogenesis and processing of the wild-type vacuolar Carboxypeptidase Y (Cpy1-GFP) and alkaline phosphatase (Pho8-GFP), two widely used markers to test the functionality of the ER/endomembrane system.

      Major issue #2. Increased signal of Bip1 in the expanded perinuclear ER is shown and is suggested as consistent with immobilization of BiP upon binding of misfolded proteins. The authors suggest that this increased signal must reflect Bip1 redistribution because "Bip1 levels are constant". Yet, the western image (Figure 4B) looks to show increased level of Bip1 protein upon HU treatment. Given the abundance of Bip1 in cells, it seems possible that a two-fold increase in newly synthesized proteins in the perinuclear region may account for the increased signal. The original data cited by the authors use photobleaching (not just fluorescence intensity) to show a change in crowding / mobility, which the authors should consider to support their conclusion. Alternatively, a detected increased engagement of Bip1 with substrates (e.g. pulldown experiment) would be similarly strengthening.

      Reviewer 3 raised this same issue, so we replaced the western blot image with a less exposed one and added a quantification showing that Bip1-GFP levels remain mostly constant between control conditions and treatments with HU and DIA.

      We have also performed the suggested photobleaching experiment to analyze potential changes in crowding and mobility in Bip1-GFP upon HU treatment. We found that Bip1-GFP signal recovers after photobleaching the perinuclear ER in HU-treated cells that had not yet expanded the ER, showing that Bip1-GFP is dynamic in these conditions. However, Bip1-GFP signal did not recover after photobleaching the whole N-Cap in cells that had fully developed the expanded perinuclear ER phenotype, whereas it did recover when only half of the N-Cap region was bleached. This suggests that Bip1-GFP is mobile within the expanded perinuclear ER but cannot freely diffuse between the cortical and the perinuclear ER once the N-Cap is formed.

      These data have been included in the revised version of the manuscript, in Figure 4B, Sup. Figures 4A-B, and on page 23.
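      The photobleaching readout described above (recovery or no recovery of Bip1-GFP signal after bleaching) is often summarized as a mobile fraction. The Python below is a minimal illustrative sketch of that calculation, not the analysis used in the manuscript; the function names and the example intensity values are hypothetical, and background subtraction and acquisition-bleaching correction are assumed to have been done beforehand.

```python
from typing import List, Sequence


def mobile_fraction(pre_bleach: float, post_bleach: float, plateau: float) -> float:
    """Fraction of the bleached signal that recovers.

    pre_bleach  : mean intensity in the region before bleaching
    post_bleach : intensity immediately after bleaching
    plateau     : intensity once recovery has levelled off
    """
    if pre_bleach <= post_bleach:
        raise ValueError("pre-bleach intensity must exceed post-bleach intensity")
    return (plateau - post_bleach) / (pre_bleach - post_bleach)


def normalize_recovery(trace: Sequence[float], pre_bleach: float) -> List[float]:
    """Rescale a post-bleach trace so 0 = just after bleaching, 1 = pre-bleach level."""
    f0 = trace[0]
    if pre_bleach <= f0:
        raise ValueError("pre-bleach intensity must exceed the first post-bleach value")
    return [(f - f0) / (pre_bleach - f0) for f in trace]


if __name__ == "__main__":
    # Hypothetical intensities (arbitrary units), for illustration only.
    trace = [20.0, 50.0, 80.0]
    print(normalize_recovery(trace, pre_bleach=100.0))
    print(mobile_fraction(pre_bleach=100.0, post_bleach=20.0, plateau=80.0))
```

      A mobile fraction close to 0 would correspond to the "no recovery" outcome reported for whole N-Cap bleaching, and a value close to 1 to full exchange with the unbleached pool.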

      Major issue #3. It is curious that cycloheximide (CHX) has a distinct impact on HU versus DIA treatment. Blocking protein synthesis with CHX exacerbates the phenotype with DIA, but not HU. The authors use the data with CHX to argue that their drug treatments are interfering with folding during synthesis and translation into the ER. If so, what is the rationale as to why CHX treatment decreases expansion upon HU treatment? Relatedly, is protein synthesis and/or ER import impacted upon treatment with HU and/or DIA?

      As all three reviewers had comments about the CHX- and Pm-related data, we revisited those experiments and noticed a phenotype upon HU+CHX treatment that had previously gone unnoticed and that changed our understanding of the effect of these drugs on the ER. Briefly, although CHX treatment decreases the HU-induced expansion of the perinuclear ER, it does induce expansion, in this case in the cortical area of the ER. This means that the ER expansion phenotype in HU is not suppressed by the addition of CHX, but rather takes place in another area of the ER (the cortical ER). We do not understand why this happens; however, these results show that ER expansion is exacerbated in both DIA and HU when combined with CHX. We have included these data in Figures 3C-D and on page 22.

      We also examined the trafficking of secretory proteins that go from the ER to the cell tips and noticed that this transit was affected under both drugs (Figures 3A-B). This suggests that, although protein synthesis continues when cells are exposed to the drugs (as shown by the higher levels of chaperones induced by both stresses; Figure 4C-E), their protein synthesis capacity is possibly impaired to a certain degree. All this information is now included in the manuscript (page 19).

      Major issue #4. While the authors suggest that there is disulfide stress in the ER / nucleus, the redox environment in these compartments is not tested directly (only cytoplasmic probes).

      Although we have only included experiments using one redox sensor in the manuscript, we have tested the oxidation of several biosensors during HU and DIA exposure, monitoring cytoplasmic, mitochondrial and glutathione-specific probes. We have tried to use ER-directed probes; however, we have not been successful due to oversaturation of the probe in the highly oxidative environment of the ER lumen.

      Although so far we have not been able to directly test the redox status of the ER with optical probes, we plan to test the folding and redox status of several ER proteins and secretory markers by biochemical approaches, so hopefully these experiments will give us more information on this question (see answers to Reviewer 1, Major issue #2 and Reviewer 2, Major issue #1).

      Major Issue #5. What do the authors envision is the role of the cytoplasmic chaperone foci? Do CHX / Pm treatment with HU/DIA reverse the chaperone foci?

      Pm causes premature termination of translation, leading to the release of truncated, misfolded, or incomplete polypeptides into the cytosol and the re-engagement of ribosomes in a new cycle of unproductive translation, as puromycin does not block ribosomes (Aviner, 2020; Azzam & Algranati, 1973). This is likely to decrease the number of peptides entering the ER that can be targeted by either HU or DIA, decreasing in turn ER expansion. Indeed, we have found that Pm treatment alone results in the formation of multiple cytoplasmic protein aggregates marked by Hsp104-GFP (Figure 4K), consistent with a continuous release of incomplete and misfolded nascent peptides to the cytoplasm. This would explain why Pm treatment suppresses N-Cap formation when cells are treated with either HU or DIA.

      To further test this idea, we plan to carefully analyze the number, size and dynamics of Hsp104-containing cytoplasmic aggregates in cells treated with HU or DIA and Pm, where N-Caps are suppressed. We expect to find increased proteotoxicity in the cytoplasm in these conditions.

      On the other hand, CHX inhibits translation elongation by stalling ribosomes on mRNAs, preventing further peptide elongation but leaving incomplete polypeptides tethered to the blocked ribosomes. This reduces overall protein load entering the ER by blocking new protein synthesis and stabilizes misfolded proteins bound to ribosomes. Accordingly, it has been shown previously that blocking translation with CHX abolishes protein aggregation (Cabrera et al., 2020; Zhou et al., 2014). Similarly, we have found that Hsp104 foci are not observed when we add CHX alone or in combination with HU or DIA (Figures 4K-L). These results suggest that cytoplasmic foci that we observe upon HU or DIA treatment likely contain misfolded proteins derived from ongoing translation.

      As this question has also been raised by Reviewer 1, we have decided to further explore the nature of these cytoplasmic foci (please see answer to Reviewer 1, Major issue #3). Briefly:

      • We plan to test whether they colocalize with the foci of Guk1-9-GFP and Rho1.C17R-GFP reporters of misfolding that appear upon HU or DIA treatments.
      • We will test whether these foci are membrane bound.
      • We plan to test whether the cytoplasmic foci represent proteins retro-translocated from the ER.
      • We will also test whether autophagy or an imbalance between ER expansion and ER-phagy might contribute to the accumulation of cytoplasmic protein foci.

      The new data regarding the suppression of cytoplasmic foci by CHX treatment have already been included in the current version of the manuscript in Figure 4K and in the text (page 30).

      The authors argue that cytoplasmic foci are "independent" from ER expansion and are "not a direct consequence of thiol stress" based on the observation that DTT does not reverse these foci. This seems like a strong statement based on the limited analysis of these foci.

      We agree with the reviewer. We have toned down our statements about the relationship between thiol stress, the cytoplasmic chaperone foci and their relationship with ER expansion. We have removed from the text the statement that cytoplasmic foci are independent from ER expansion and thiol stress and have further revised our claims about CHX and Pm in the main text and the discussion to address these and the other reviewers' concerns.

      Major Issue #6. Based on the transcriptional data, the authors speculate on a potential role in iron-sulfur cluster protein biogenesis. This would seem to be rather straightforward to test.

      To address this issue, we plan to analyze the localization of proteins involved in iron-sulfur cluster assembly and/or containing iron-sulfur clusters by in vivo fluorescence microscopy, such as DNA polymerase Dna2 or Grx5, during HU or DIA treatments.

      Related to this, we have found that a subunit of the ribonucleotide reductase (RNR) aggregated in the cytoplasm upon HU exposure (Figure S2B). It is worth noting that RNR is an iron-containing protein whose maturation needs cytosolic Grxs (Cotruvo & Stubbe, 2011; Mühlenhoff et al., 2020). The catalytic site, the activity site (which governs overall RNR activity through interactions with ATP) and the specificity site (which determines substrate choice) are located in the R1 (Cdc22) subunits, which are the ones that aggregate, while the R2 subunits (Suc22) contain the di-nuclear iron center and a tyrosyl radical that can be transferred to the catalytic site during RNR activity (Aye et al., 2015). The aggregation of an RNR subunit could reflect impaired synthesis and/or maturation due to defects in iron-sulfur cluster formation: it has recently been reported that RNR cofactor biosynthesis shares components with cytosolic iron-sulfur protein biogenesis and that the iron-sulfur cluster assembly machinery is essential for iron loading and cofactor assembly in RNR in yeast (Li et al., 2017). This information has been added to the discussion.

      Major Issue #7. The authors suggest that "pre-treatment" with DTT before HU addition suppresses formation of the N-Caps. However, these samples (Figure 2J) contain DTT coincident with the treatment as well. To say it is the effect of pre-treatment, the DTT should be added and then washed out prior to HU or DIA addition. Alternatively, the language used to describe these experiments and their outcomes could be revised.

      We modified the language used to describe the experiment in the manuscript, as suggested by the reviewer, to clarify that while DTT is kept in the medium, N-Caps never form. In addition, we have also performed a true pre-treatment with DTT: we added 1 mM DTT one hour before HU, washed out the reducing agent and then added HU to the medium. The result indicates that pre-treating cells with DTT significantly reduces N-Cap formation after a 4-hour incubation with HU, which suggests that triggering reducing stress "protects" cells from the oxidative damage induced by HU and DIA. This information has also been added to the manuscript (Figure 2J).

      Major Issue #8. For a manuscript with 128 references there is rather limited discussion of the data in the context of the wider literature. The discussion primarily focuses on a recap of the results. The authors do cite several prior works focused on redox-dependent nuclear expansion. However, while cited, there is no real discussion of the relationship between this work in the context of that previously published (including several known disulfide bonded proteins that are involved in nuclear/ER architecture).

      We have revised and expanded our discussion. In addition, in the final revision of our work we will expand the discussion further in the context of the new results obtained.

      Minor points

      Figure numbering goes from figure 4 to S6 to 5.

      We have updated the numbering of the figures after merging several supplementary figures, so this issue is now fixed.

      It would be helpful to the reader to explain what some of the reporters are in brief. For example, Guk1-9-GFP and Rho1.C17R-GFP reporters.

      Guk1-9-GFP and Rho1.C17R-GFP are thermosensitive mutants of guanylate kinase and Rho1 GTPase, respectively, that have been previously used in S. pombe as soluble reporters of misfolding in conditions of heat stress. During mild heat shock, both mutants aggregate into reversible protein aggregate centers (Cabrera et al., 2020). This information has now been added to the manuscript.

      Supplementary Figure 3. The main text suggests panel 3A is focused on diamide treatment. The figure legend discusses this in terms of HU treatment. Which is correct?

      We thank the reviewer for pointing out this mistake. The experiment was performed in 75 mM HU; the legend was correct. The error has now been corrected in the manuscript.

      The authors use ref 110 and 111 to suggest the importance of UPR-independent signaling. However, they do not point out that this UPR-independent signaling referred to in these papers is dependent on the UPR transmembrane kinase IRE1.

      We have included pertinent clarification in the new discussion.

      Reviewer 3

      Major issue #1. It is hard to see how the claim of ER stress can be supported if BiP levels do not change (Fig. 4B). Also, this figure is overexposed. The RNA-seq data should be able to establish ER stress as well, but no rigorous analysis of ER stress markers is presented.

      Regarding the levels of Bip1, we now show in Figure 4 a less exposed image of the western blot, and a quantification of Bip1-GFP intensity from three independent experiments. We find that, in our experimental conditions, neither HU nor DIA treatments significantly altered Bip1 levels.

      With respect to the RNA-Seq, as mentioned in our answer to Major issue #1 from Reviewer 1, we plan to reassess our data to further clarify and add information about ER stress markers induced or repressed by HU and DIA. We will also test the levels of Bip1 and several UPR targets by RT-PCR and by western blot.

      Major issue #2. The interpretation of the CHX and puromycin experiments of Figure 3A-B is hard to follow. My best guess is that the authors argue that CHX decreases misfolded protein load and that puromycin increases misfolded protein load, and that since DIA is a stronger oxidative stress than HU hence CHX is only protective under HU and not DIA. However, while CHX decreases misfolded protein load, puromycin hasn't been shown directly to increase it and I don't see how this explains puromycin being protective at all.

      We have found that puromycin treatment alone results in the formation of cytoplasmic foci containing Hsp104, suggesting that puromycin indeed increases folding stress in the cytoplasm. We have now included this data in Figure 4K (please see Major issue #5 from Reviewer 2). Pm suppresses the formation of N-Caps induced by HU or DIA; however, we have not addressed cell survival or fitness in these conditions and therefore we cannot conclude whether it is protective.

      In addition, upon the reevaluation of our data, we have realized that CHX treatment suppresses HU-induced perinuclear expansion, although it does not suppress but instead enhances ER expansion in the cortical region. This data has been added to the present version of the manuscript in Figure 3C-D (page 22).

      Furthermore, puromycin causes Ca leakage from the ER (which can be recapitulated with thapsigargin and blocked with anisomycin; easy experiments), which could be responsible for the differences from CHX, and the model does not address the effects on downstream stress signaling. The authors should be much more clear regarding their argument, since this data is used to support the argument of disrupted ER proteostasis.

      As the reviewer requested, we plan to test the effect of anisomycin. Thapsigargin has been described not to work in yeast, as they lack the SERCA-type Ca2+ pump that this drug targets (Strayle et al., 1999).

      Regarding the downstream effects of HU or DIA treatment on ER proteostasis, we plan to further explore the effect of these drugs on the secretory system (please see Major issue #2 from Reviewer 1) and to evaluate the redox state and processing of several key ER and secretory proteins. We will further explore the nature of the aggregates that appear in the cytoplasm in our experimental conditions, which will also shed light on the downstream effects of these drugs on cytoplasmic proteostasis (please see answer to Major issue #5 from Reviewer 2).

      Major issue #3. The claim that a canonical UPR is not induced is weak. First, the transcriptional program of S. cerevisiae from Travers et al is used as the canonical UPR, and compared to HU/DIA induced stress in S. pombe. These organisms may not be similar enough to assume that they have transcriptionally identical UPRs. Second, no consideration is given to the mechanism by which the different transcripts are modulated between "canonical" and HU/DIA induced UPR. Is it solely through RIDD, or does it point to differences in sensing or signaling transduction?

      We plan to readdress this topic by analyzing the genes that have been described to be differentially expressed during UPR activation in S. pombe and comparing them with our data, first by reevaluating our transcriptomic data and second by choosing Bip1 and some of the other differentially expressed genes in Kimmig et al. (2012) (for example, Gas2, Pho1 or Yop1) and assessing their mRNA levels by RT-PCR in our experimental conditions. As an alternative approach, we will also analyze the levels of UPR targets by western blot upon HU or DIA treatment.

      We are confident that the results of these experiments and the re-analysis of our RNA-Seq data will allow us to infer the mechanisms that modulate the ER response to HU or DIA treatment.

      Finally, the p-values used are unadjusted (e.g. by Bonferroni's method or by ANOVA or at least controlled by an FDR approach) and unmodulated (extremely important when n = 3 and variance is poorly sampled), which makes them not dependable. It looks like HSF1 targets are induced, which should be addressed.

      We thank the reviewer for pointing this out. We had omitted this information, which now appears in the M&M section as follows:

      "A gene was considered as differentially expressed when it showed an absolute value of log2FC (LFC) ≥ 1 and an adjusted p-value […]

      In this regard, we plan to perform proteome-wide mass spectrometry experiments to detect protein glutathionylation in our conditions, as it has been previously shown that DIA treatment leads to glutathionylation of key ER proteins such as Bip1, Pdi or Ero1 (Lind et al., 2002; Wang & Sevier, 2016), which might be reproduced upon HU treatment. We will also specifically test the redox state of Bip1, Pdi and/or Ero1 by immunoprecipitation and western blot. We also plan to test the folding and processing of specific secretory cargoes by western blot in our experimental conditions (see below, and Reviewer 2, Major issue #1).
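
The differential-expression criterion quoted above (an absolute LFC of at least 1 combined with an adjusted p-value cutoff) can be illustrated with a short sketch. This is not the authors' pipeline: the adjustment method (Benjamini-Hochberg), the strict "<" comparison, the `alpha` value, and the gene names are all assumptions made for illustration, since the response as reproduced here does not state the cutoff or the correction method.

```python
from typing import List, Tuple


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differentially_expressed(
    genes: List[Tuple[str, float, float]],
    lfc_cutoff: float = 1.0,   # |LFC| >= 1, as stated in the quoted text
    alpha: float = 0.05,       # hypothetical cutoff; not given in the source
) -> List[str]:
    """Select genes with |LFC| >= lfc_cutoff and adjusted p-value < alpha.

    Each gene is a (name, log2 fold change, raw p-value) tuple.
    """
    padj = benjamini_hochberg([p for _, _, p in genes])
    return [
        name
        for (name, lfc, _), q in zip(genes, padj)
        if abs(lfc) >= lfc_cutoff and q < alpha
    ]


# Hypothetical example values, not data from the study.
example = [
    ("geneA", 2.0, 0.005),
    ("geneB", -1.5, 0.01),
    ("geneC", 0.5, 0.03),
    ("geneD", 3.0, 0.04),
]
selected = differentially_expressed(example)
```

With these made-up inputs, `geneC` is excluded by the fold-change filter regardless of its p-value, which shows why both conditions must hold at once.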

      We have already tested whether mutant strains with deletions of key enzymes in both cytoplasmic and ER redox systems are able to expand the ER upon HU or DIA treatment. We have found that only pgr1Δ (glutathione reductase), gsa1Δ (glutathione synthetase) and gcs1Δ (glutamate-cysteine ligase) mutants fully suppressed N-Cap formation, which suggests that glutathione has an important role in the phenotype of ER expansion. We have now added the pgr1Δ mutant strain to the main text of the manuscript (Figure 5C, page 31).

      Major issue #5. Figure S5 presents weak ER expansion in fibrosarcoma cells in response to HU (at very low concentrations and DIA is not included). The lack of any other phenotypes being presented could suggest that such experiments were done but didn't show any effect. The authors should straightforwardly discuss whether they performed experiments looking for perinuclear ER expansion or NPC clustering, and if not, what challenges precluded such experiments. Given how important this line of experimentation is for establishing generality, much more discussion is needed here.

      We investigated the effects on the ER in mammalian cells not only of HU but also of DIA. The results of this experiment mimicked the effect of HU (an increase in ER-ID fluorescence intensity upon DIA treatment). We had excluded this information from the manuscript only because we were focusing on HU at that point, given its current clinical use. In this new version of the manuscript, we have included an extra panel in supplementary figure 5 to show the results from DIA in mammalian cells.

      Minor concerns

      1) Figure 1A should show individual data points (i.e. 3 averages of independent experiments) in the bar graph.

      Although we initially changed the graph, we believe the bar plot layout is easier to understand, and we went back to the original one. Also, as the other graphs similar to 1A are all bar plots, changing one would mean changing all of them to avoid visual noise. Therefore, we preferred to keep the figure as it was in the original version. However, we include here the graph with each of the averages of the independent experiments.

      2) It is argued that Figure 1B demonstrates that the SPB is clustered with the NPC cluster. However, a single image is not enough to support this claim, as the association could be coincidental.

      We have changed the image to show a whole population of cells, with several of them having NPC clusters, and we have indicated the position of SPB in each of them (all colocalizing with the N-Cap).

      3) Figures 1B through 1D do not indicate the HU concentration.

      We thank the reviewer for pointing out this mistake. Figures 1B and 1C represent cells exposed to 15 mM HU for 4 hours, while the graph in 1D shows the results from cells exposed to 75 mM HU over a 4-hour period. This information has now been added to the corresponding figure legend.

      4) I was confused by the photobleaching experiments of Figure S1. How do the authors know that there is complete photobleaching of the cytoplasm or nucleus in the absence of a positive control? If photobleaching is incomplete, they could be measuring motility without compartments rather than transport between compartments, and hence the conclusion that trafficking is unaffected could be wrong.

      Our control is the background of each microscopy image; we make sure that after the laser bleaches a cell, the bleached area coincides with the background noise. That way, we make sure that fluorescence from any remaining GFP is completely removed from the bleached area.

      5) On page 8, they say "exposure to DIA" when they intend HU.

      This has been corrected in the manuscript.

      6) In Figure S3A, the colocalization of INM proteins with the ER are presented. It is not clearly explained what conclusions are meant to be drawn from this figure, but it seems it would have been more useful to compare INM and Cut11, to see whether the NPCs are localizing at the INM or ONM.

      We have added an explanation in the main text to clarify the main conclusions derived from this figure. We think that NPCs localize in a section of the nucleus where the two membranes (INM and ONM) are still bound together.

      7) I had to read Figure 2C's description and caption several times to understand the experiment. A schematic would be helpful. 20 mM HU is low compared to most conditions used. Does repositioning eventually take place for 75 mM HU or 3 mM DIA treatment, or do the cells just die before they get a chance?

      20 mM HU was used in this experiment to provide a time frame suitable for analysis after HU addition, as a higher HU concentration increases the repositioning time. We found that both HU (75mM 4h) and DIA (3mM 4h)-induced ER expansions are reversible upon drug washout. If HU is kept in the media, ER expansions are eventually resolved. However, DIA is a strong oxidant and if it is kept in the media ER expansions are not resolved and cells do not survive.

      8) Figure 2D shows little oxidative consequence from 75 mM HU treatment until 40 min., the same time that phenotypes are observed (Figure 1D). Is this relationship consistent with the kinetics of other concentrations of HU, or of DIA? Seems like a pretty important mechanistic consideration that can rationalize the effects of the two oxidants.

      Thanks to this comment, we realized that the notation underneath Figure 1D (1E in the new version of the manuscript) could lead to misunderstandings, as the timings there were "random". We have now clarified this panel: the timings are normalized to the moment when NPCs cluster. The fact that this moment previously coincided with "40 minutes" does not mean that N-Caps appear at that time point; quite the opposite, as most of them start to appear after >2 hours in HU. We hope this can be better understood now.

      9) Figure S4 is missing the asterisk on the lower left cell.

      Fixed in the corresponding figure.

      10) How is roundness determined in Figure S4B?

      Roundness in Figure S4B (now S2E) is determined in the same way as in Figure 1D, as described in the Methods section (copied below). A clarification has been added to the legend to address this.

      The 'roundness' parameter in the 'Shape Descriptors' plugin of Fiji/ImageJ was used after applying a threshold to the image in order to select only the more intense regions and subtract background noise (Schindelin et al., 2012). Roundness descriptor follows the function:

              Round = 4 × [Area] / (π × [Major axis]²)

      where [Area] constitutes the area of an ellipse fitted to the selected region in the image and [Major axis] is the diameter of the round shape that in this case would fit the perimeter of the nucleus.
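
As a worked illustration of the roundness descriptor described above, the following sketch evaluates the same formula on idealized shapes. The function names and the example dimensions are hypothetical and chosen only for illustration; in the actual analysis Fiji/ImageJ derives the area and major axis from the thresholded image. For a perfect ellipse the value reduces to the ratio of the minor to the major axis, so a circle scores 1 and elongated shapes score less.

```python
import math


def roundness(area: float, major_axis: float) -> float:
    """Roundness = 4 * Area / (pi * MajorAxis**2)."""
    if area < 0 or major_axis <= 0:
        raise ValueError("area must be >= 0 and major_axis must be > 0")
    return 4.0 * area / (math.pi * major_axis ** 2)


def ellipse_area(major_axis: float, minor_axis: float) -> float:
    """Area of an ellipse given its full major and minor axis lengths."""
    return math.pi * major_axis * minor_axis / 4.0


# Hypothetical shapes (arbitrary length units), not measurements from the study.
circle_score = roundness(ellipse_area(2.0, 2.0), 2.0)      # circle of diameter 2
elongated_score = roundness(ellipse_area(4.0, 2.0), 4.0)   # 4 x 2 ellipse
```

Here `circle_score` evaluates to 1 and `elongated_score` to 0.5, matching the minor-to-major axis ratio in each case.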

      11) What threshold is used to determine whether cells analyzed in Figures S4C have "small ER" or "large ER"?

      ER expansions are classified as large when their area along the projection of a 3-Z section exceeds 4 μm² (more than twice the mean area of the ER in cells with N-Caps in milder conditions). This has now been clarified in the legend of the corresponding figure.

      12) The authors interpret Figure 4K as indicating that ER expansion is not involved in the generation of punctal misfolded protein aggregates. However, the washout occurs only after the proteins have already aggregated. The proper interpretation is that the aggregates are not reversible by resolution of the stress, and hence are not physically reliant on disulfide bonds.

      We agree with the reviewer and have modified the interpretation of the indicated figure accordingly (page 30).

      The speculation that these proteins are iron dependent is a stretch; there is no reason to believe that losses of iron metabolism are the most important stress in these cells. It seems at least as likely that oxidizing cysteine-containing proteins in the cytosol or messing with the GSH/GSSG ratio in the cytosol would make plenty of proteins misfold; oxidative stress in budding yeast does activate hsf1. However, this point could be addressed by centrifugation and mass spectrometry to identify the aggregated proteome. It is also surprising that the authors did not investigate ER protein aggregation, perhaps by looking at puncta formation of chaperones beyond BiP. By contrast, the fact that gcs1 deletion prevents ER expansion but does not prevent Hsp104 puncta does support the idea that cytoplasmic aggregation is not dependent on ER expansion.

      To address this suggestion, we plan to analyze the localization of other chaperones and components of the protein quality control such as the ER Hsp40 Scj1 or the ribosome-associated Hsp70 Sks2.

      13) Figure 4L is cited on page 28 when Figure 4K is intended.

      This has been corrected in the text, although new panels have been added and now it is 4N.

      • *
    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors' research group had previously demonstrated the release of large multivesicular body-like structures by human colorectal cancer cells. This manuscript expands on their findings, revealing that this phenomenon is not exclusive to colorectal cancer cells but is also observed in various other cell types, including different cultured cell lines, as well as cells in the mouse kidney and liver. Furthermore, the authors argue that these large multivesicular body-like structures originate from intracellular amphisomes, which they term "amphiectosomes." These amphiectosomes release their intraluminal vesicles (ILVs) through a "torn-bag mechanism." Finally, the authors demonstrate that the ILVs of amphiectosomes are either LC3B positive or CD63 positive. This distinction implies that the ILVs either originate from amphisomes or multivesicular bodies, respectively.

      Strengths:

      The manuscript reports a potential origin of extracellular vesicle (EV) biogenesis. The reported observations are intriguing.

      Weaknesses:

      It is essential to note that the manuscript has issues with experimental designs and lacks consistency in the presented data. Here is a list of the major concerns:

      (1) The authors culture the cells in the presence of fetal bovine serum (FBS) in the culture medium. Given that FBS contains a substantial amount of EVs, this raises a significant issue, as it becomes challenging to differentiate between EVs derived from FBS and those released by the cells. This concern extends to all transmission electron microscopy (TEM) images (Figure 1, 2P-S, S5, Figure 4 P-U) and the quantification of EV numbers in Figure 3. The authors need to use an FBS-free cell culture medium.

      Although FBS indeed contains bovine EVs, the presence in FBS of the very large multivesicular EVs (amphiectosomes) on which our manuscript focuses has never been observed or reported. For reported size distributions of EVs in FBS, please find a few relevant references below:

      PMID: 29410778, PMID: 33532042, PMID: 30940830 and PMID: 37298194

      All the above publications show that the number of lEVs > 350-500 nm is negligible in FBS. The average diameter of the MV-lEVs (amphiectosomes) described in our manuscript is around 1.00-1.50 micrometers.

      Reviewer #1: These papers evaluated the effectiveness of various methods to eliminate EVs from FBS, emphasizing the challenges associated with the presence of EVs in FBS. They also caution against using FBS in EV studies due to these issues. However, I did not find a clear indication regarding the size distributions of EVs in FBS in these papers.

      Please provide accurate reference supporting the claim that 'lEVs > 350-500 nm are negligible in FBS.' The papers cited by the authors do not address this specific point.

      In the revised manuscript, we address this point: because FBS is sterile-filtered, it cannot contain large EVs (>0.22 µm).

      Our response to Reviewer #1 point 2. When we presented TEM images of isolated EVs, we consistently used serum-free conditioned medium (Fig2 P-S, Fig2S5 J, O) as described previously (Németh et al 2021, PMID: 34665280).

      Reviewer #1: This is an important point that is not mentioned in the original main text, figure legend or method. Please address.

      We agree and apologize for the omission. We have added this information to the revised manuscript.

      Our response to Reviewer #1 point 3. Our TEM images show cells captured in the process of budding and scission of large multivesicular EVs excluding the possibility that these structures could have originated from FBS.

      Reviewer #1: These images may also depict the engulfment of EVs in FBS. Hence, it is crucial to utilize EV-free or EV-depleted FBS.

      As we mentioned earlier, we added to the revised manuscript the information that sterile filtering of the FBS presumably removed particles, including EVs, >0.22 µm.

      Our response to Reviewer #1 point 4. In addition, in our confocal analysis, we studied Palm-GFP positive, cell-line derived MV-lEVs. Importantly, in these experiments, FBS-derived EVs are non-fluorescent, therefore, the distinction between GFP positive MV-lEVs and FBS-derived EVs was evident.

      Reviewer #1: I agree that these fluorescent-labeled assays conclusively indicate that the MV-lEVs are originating from the cells. However, the images of concern are the non-fluorescent-labeled images (Figure 1, 2P-S, S5, Figure 4 P-U and Figure 3). The MV-lEVs may derive from both the cells and FBS.

      Please see above our response to points 1-3.

      Our response to Reviewer #1 point 5. In addition, culturing cells in FBS-free medium (serum starvation) significantly affects autophagy. Given that in our study, we focused on autophagy related amphiectosome secretion, we intentionally chose to use FBS supplemented medium.

      Reviewer #1: If this is a concern, the authors should use EV-depleted FBS.

      As we discussed above, sterile filtration of FBS removes particles >0.22 µm. In addition, based on our preliminary experiments, EV-depleted serum may affect cell physiology.

      Our response to Reviewer #1 point 6. Even though the authors of this manuscript are not familiar with the technological details how FBS is processed before commercialization, it is reasonable to assume that the samples are subjected to sterile filtration (through a 0.22 micron filter) after which MV-lEVs cannot be present in the commercial FBS samples.

      Reviewer #1: This is a fair comment that needs to be included in the manuscript.

      As you suggested, this comment is now included in the revised manuscript.

      (2) The data presented in Figure 2 is not convincingly supportive of the authors' conclusion. The authors argue that "...CD81 was present in the plasma membrane-derived limiting membrane (Figures 2B, D, F), while CD63 was only found inside the MV-lEVs (Fig. 2A, C, E)." However, in Figure 2G, there is an observable CD63 signal in the limiting membrane (overlapping with the green signals), and in Figure 2J, CD81 also exhibits overlap with MV-IEVs.

      Both CD63 and CD81 are tetraspanins known to be present both in the membrane of sEVs and in the plasma membrane of cells (for references, please see Uniprot subcellular location maps: https://www.uniprot.org/uniprotkb/P08962/entry#subcellular_location https://www.uniprot.org/uniprotkb/P60033/entry#subcellular_location). However, following the reviewer's feedback, for clarity, we will delete the implicated sentence from the text.

      Reviewer #1: Please also justify the statement questioned in (3) as these arguments are interconnected.

      We hope you find our above responses to your comment acceptable.

      (3) Following up on the previous concern, the authors argue that CD81 and CD63 are exclusively located on the limiting membrane and MV-IEVs, respectively (Figure 2-A-M). However, in lines 104-106, the authors conclude that "The simultaneous presence of CD63, CD81, TSG101, ALIX, and the autophagosome marker LC3B within the MV-lEVs..." This statement indicates that CD63 and CD81 co-localize to the MV-IEVs. The authors need to address this apparent discrepancy and provide an explanation.

      There must be a misunderstanding because we did not claim or imply in the text that “CD81 and CD63 are exclusively located on the limiting membrane and MV-IEVs”. Here we studied co-localization of the above proteins in the case of intraluminal vesicles (ILVs). In Fig 2, we did not show any analysis of limiting membrane co-localization.

      Reviewer #1: I have indicated that this statement is found in lines 104-106, where the authors argue, 'The simultaneous presence of CD63, CD81, TSG101, ALIX, and the autophagosome marker LC3B within the MV-lEVs...' If the authors acknowledge the inaccuracy of this statement, please provide a justification for this argument.

      For clarity, we modified the description of data shown in Fig2 in the revised manuscript.

      (4) The specificity of the antibodies used in Figure 2 should be validated through knockout or knockdown experiments. Several of the antibodies used in this figure detect multiple bands on western blots, raising doubts about their specificity. Verification through additional experimental approaches is essential to ensure the reliability and accuracy of all the immunostaining data in this manuscript.

      We will consider this suggestion during the revision of the manuscript.

      Reviewer #1: Please do so.

      We carefully considered the suggestion, but we realized that it was not feasible for us to perform gene silencing for all the antibodies we used before resubmission of our revised manuscript. However, we repeated the Western blot for mouse anti-CD81 (Invitrogen MAA5-13548) and used it to replace the previous Western blot in the revised manuscript (Fig.2-S4H).

      (5) In Figures 2P-R, the morphology of the MV-IEVs does not resemble those shown in Figures 1-A, H, and D, indicating a notable inconsistency in the data.

      EM images in Figure2 P-R show sEVs separated from serum-free conditioned media as opposed to MV-lEVs, which were captured in situ in fixed tissue cultures (Fig1). Therefore, the two EV populations necessarily differ in size and structure. Furthermore, Fig. 1 shows images of ultrathin sections while in Figure 2P-R, we used negative-positive contrasting of intact sEVs without embedding and sectioning.

      (6) There are no loading controls provided for any of the western blot data.

      Not even the latest MISEV 2023 guidelines give recommendations for proper loading control for separated EVs in Western blot (MISEV 2023, DOI: 10.1002/jev2.12404, PMID: 38326288). Here we applied our previously developed method (PMID: 37103858), which in our opinion, is the most reliable approach to be used for sEV Western blotting. For whole cell lysates, we used actin as loading control (Fig3-S2B).

      Reviewer #1: The blots referenced here (Fig2-S3; Fig2-S4B; Fig3-S2B) were conducted using total cell lysates, not EV extracts. Only one blot in Fig3-S2B includes an actin control. All remaining blots should incorporate actin controls for consistency.

      Fig2-S3 (corresponding to Fig2-S4 in the revised manuscript) only shows reactivity of the used antibodies. This Western blot is not intended to serve as a basis of any quantitative conclusions. Fig2-S4 (corresponding to Fig2-S5 in the revised manuscript) includes the actin control. Fig3-S2B shows the complete membrane, which was cut into 4 pieces, and the immune reactivity of different antibodies was tested. The actin band was included on the anti-LC3B blot. For clarity, we rephrased the figure legend.

      Additionally, for Figures 2-S4B, the authors should run the samples from lanes i-iii in a single gel.

      Please note that in Figure 2- S4B, we did run a single gel, and the blot was cut into 4 pieces, which were tested by anti-GFP, anti-RFP, anti-LC3A and anti-LC3B antibodies. Full Western blots are shown in Fig.3_S2 B, and lanes “1”, “2” and “3” correspond to “i”, “ii” and “iii” in Fig.2-S4, respectively.

      Reviewer #1: In the original Figure 2- S4B, the blots were sectioned into 12 pieces. If lanes "i," "ii," and "iii" were run on the same blot, the authors are advised to eliminate the grids between these lanes.

      Grids separating the lanes have been eliminated on Fig.2_S4 (now Fig.2_S5 in the revised manuscript).

      (7) In Figure 2-S4, is there co-localization observed between LC3RFP (LC3A?) with other MV-IFV markers? How about LC3B? Does LC3B co-localize with other MV-IFV markers?

      In Supplementary Figure 2-S4, we showed successful generation of HEK293T-PalmGFP-LC3RFP cell line. In this case we tested the cells, and not the released MV-lEVs. LC3A co-localized with the RFP signal as expected.

      Reviewer #1: Does LC3RFP colocalize with MV-IFV markers in HEK293T-PalmGFP-LC3RFP cell line? This experiment aims to clarify the conclusion made in lines 104-106, where the authors assert that 'The concurrent existence of CD63, CD81, TSG101, ALIX, and the autophagosome marker LC3B within the MV-lEVs...'

      In the case of PalmGFP-LC3RFP cells, LC3-RFP is overexpressed. Simultaneous assessment of this overexpressed protein with non-overexpressed molecules detected by fluorescent antibodies proved to be challenging because of spectral overlaps and inadequate signal-to-noise ratios. Furthermore, in association with EVs, the number of antibody-detected molecules is substantially lower than in cells. Therefore, even though we tried, we could not successfully perform these experiments.

      (8) The TEM images presented in Figure 2-S5, specifically F, G, H, and I, do not closely resemble the images in Figure 2-S5 K, L, M, N, and O. Despite this dissimilarity, the authors argue that these images depict the same structures. The authors should provide an explanation for this observed discrepancy to ensure clarity and consistency in the interpretation of the presented data.

      As indicated in Material and Methods, Fig 2-S5 F, G, H and I are conventional TEM images fixed by 4% glutaraldehyde 1% OsO<sub>4</sub> 2h and embedded into Epon resin with a post contrasting of 3.75% uranyl acetate 10 min and 12 min lead citrate. Samples processed this way have very high structure preservation and better image quality, however, they are not suitable for immune detection. In contrast, Fig.2.-S5 K,L,M,N shows immunogold labelling of in situ fixed samples. In this case we used milder fixation (4% PFA, 0.1% glutaraldehyde, postfixed by 0.5% OsO<sub>4</sub> 30 min) and LR-White hydrophilic resin embedding. This special resin enables immunogold TEM analysis. The sections were exposed to H<sub>2</sub>O<sub>2</sub> and NaBH<sub>4</sub> to render the epitopes accessible in the resin. Because of the different applied techniques, the preservation of the structure is not the same. In the case of Fig.2 J, O, separated sEVs were visualised by negative-positive contrast and immunogold labelling as described previously (PMID: 37103858).

      Reviewer #1: Please include this justification in the revised version.

      We included this justification in the revised manuscript.

      (9) For Figures 3C and 3-S1, the authors should include the images used for EV quantification. Considering the concern regarding potential contamination introduced by FBS (concern 1), it is advisable for the authors to employ an independent method to identify EVs, thereby confirming the reliability of the data presented in these figures.

      In our revised manuscript, we will provide all the images used for EV quantification in Figure 3C. Given that Figures 3C and 3-S1 show MV-lEVs released by HEK293T-PalmGFP cells, the possible interference by FBS-derived non-fluorescent EVs can be excluded.

      Reviewer #1: Please provide all the images.

      Original LASX files are provided (DOI: 10.6019/S-BIAD1456).

      Reviewer #1: The images raising concerns regarding the contamination of EVs in FBS primarily consist of transmission electron microscopy (TEM) images, namely, Figure 1, 2P-S, S5, and Figure 4 P-U, along with the quantification of EV numbers in Figure 3. These concerns persist despite the use of fluorescent-labeled experiments. While fluorescent-labeled MV-lEVs are conclusively identified as originating from the cells, the MV-lEVs observed in Figure 1, 2P-S, S5, and Figure 4 P-U and Figure 3 may derive from both the cells and FBS.

      Large EVs (with diameter >800 nm) derived from FBS were not present in our experiments, as discussed above.

      (10) Do the amphiectosomes released from other cell types as well as cells in mouse kidneys or liver contain LC3B positive and CD63 positive ILVs?

      Based on our confocal microscopic analysis, in addition to the HEK293T-PalmGFP cells, HT29 and HepG2 cells also release similar LC3B and CD63 positive MV-lEVs. Preliminary evidence shows MV-lEV secretion by additional cell types.

      The response of Reviewer #1: Please show these data in the revised manuscript. Moreover, do cells in mouse kidneys or liver contain LC3B positive and CD63 positive ILVs?

      We have added new confocal microscopic images to Fig2-S3 showing amphiectosomes released also by the H9c2 (ATCC) cardiomyoblast cell line. To preserve the ultrastructure of MV-lEVs in complex organs like kidney and liver, fixation with 4% glutaraldehyde with 1% OsO4 appears to be essential. This fixation does not allow for immune detection to assess LC3B and CD63 positive MV-lEVs in the ultrathin sections.

      Reviewer #2 (Public Review):

      Summary:

      The authors had previously identified that a colorectal cancer cell line generates small extracellular vesicles (sEVs) via a mechanism where a larger intracellular compartment containing these sEVs is secreted from the surface of the cell and then tears to release its contents. Previous studies have suggested that intraluminal vesicles (ILVs) inside endosomal multivesicular bodies and amphisomes can be secreted by the fusion of the compartment with the plasma membrane. The 'torn bag mechanism' considered in this manuscript is distinctly different because it involves initial budding off of a plasma membrane-enclosed compartment (called the amphiectosome in this manuscript, or MV-lEV). The authors successfully set out to investigate whether this mechanism is common to many cell types and to determine some of the subcellular processes involved.

      The strengths of the study are:

      (1) The high-quality imaging approaches used seem to show good examples of the proposed mechanism.

      (2) They screen several cell lines for these structures, also search for similar structures in vivo, and show the tearing process by real-time imaging.

      (3) Regarding the intracellular mechanisms of ILV production, the authors also try to demonstrate the different stages of amphiectosome production and differently labelled ILVs using immuno-EM.

      Several of these techniques are technically challenging to do well, and so these are critical strengths of the manuscript.

      The weaknesses are:

      (1) Most of the analysis is undertaken with cell lines. In fact, all of the analysis involving the assessment of specific proteins associated with amphiectosomes and ILVs are performed in vitro, so it is unclear whether these processes are really mirrored in vivo. The images shown in vivo only demonstrate putative amphiectosomes in the circulation, which is perhaps surprising if they normally have a short half-life and would need to pass through an endothelium to reach the vessel lumen unless they were secreted by the endothelial cells themselves.

      Our previous results analyzing PFA-fixed, paraffin embedded sections of colorectal cancer patients provided direct evidence that MV-lEV secretion also occurs in humans in vivo (PMID: 31007874). Regarding your comment on the presence of amphiectosomes in the circulation despite their short half-lives, we would like to point out that Fig1.X shows a circulating lymphocyte which releases MV-lEV within the vessel lumen. Furthermore, in the revised manuscript, an additional Fig.1-S1 is provided. Here, we show the release of MV-lEVs both by an endothelial and a sub-endothelial cell (Fig.1-S1G). In addition, these images show the simultaneous presence of MV-lEVs and sEVs in the circulation (Fig.1-S1.A,C,D,H and I). The transmission electron micrographs of mouse kidney and liver sections provide additional evidence that the MV-lEVs are released by different types of cells, and the “torn bag release” also takes place in vivo (Fig.1.V).

      (2) The analysis of the intracellular formation of compartments involved in the secretion process (Figure 2-S5) relies on immuno-EM, which is generally less convincing than high-/super-resolution fluorescence microscopy because the immuno-labelling is inevitably very sporadic and patchy. High-quality EM is challenging for many labs (and seems to be done very well here), but high-/super-resolution fluorescence microscopy techniques are more commonly employed, and the study already shows that these techniques should be applicable to studying the intracellular trafficking processes.

      As you suggested, in the revised manuscript, we present additional super-resolution microscopy (STED) data. The intracellular formation of amphisomes, the fragmentation of LC3B-positive membranes and the formation of LC3B-positive ILVs were captured (Fig. 3B-F).

      (3) One aspect of the mechanism, which needs some consideration, is what happens to the amphisome membrane, once it has budded off inside the amphiectosome. In the fluorescence images, it seems to be disrupted, but presumably, this must happen after separation from the cell to avoid the release of ILVs inside the cell. There is an additional part of Figure 1 (Figure 1Y onwards), which does not seem to be discussed in the text (and should be), that alludes to amphiectosomes often having a double membrane.

      We agree with your comment regarding the amphisome membrane and we added a sentence to the Discussion of the revised manuscript. Fig1Y onwards is now discussed in the manuscript. In addition, we labelled the surface of living HEK293 cells with wheat germ agglutinin (WGA), which binds to sialic acid and N-acetyl-D-glucosamine. After removing the unbound WGA by washes, the cells were cultured for an additional 3 hours, and the release of amphiectosomes was studied. The budding amphiectosome had a WGA-positive membrane, providing evidence that the external limiting membrane had a plasma membrane origin (Fig.3G).

      (4) The real-time analysis of the amphiectosome tearing mechanism seemed relatively slow to me (over three minutes), and if this has been observed multiple times, it would be helpful to know if this is typical or whether there is considerable variation.

      Thank you for this comment. In the revised manuscript, we highlight that the first released LC3-positive ILV was detected within as little as 40 sec.

      Overall, I think the authors have been successful in identifying amphiectosomes secreted from multiple cell lines and demonstrating that the ILVs inside them have at least two origins (autophagosome membrane and late endosomal multivesicular body) based on the markers that they carry. The analysis of intracellular compartments producing these structures is rather less convincing and it remains unclear what cells release these structures in vivo.

      I think there could be a significant impact on the EV field and consequently on our understanding of cell-cell signalling based on these findings. It will flag the importance of investigating the release of amphiectosomes in other studies, and although the authors do not discuss it, the molecular mechanisms involved in this type of 'ectosomal-style' release will be different from multivesicular compartment fusion to the plasma membrane and should be possible to be manipulated independently. Any experiments that demonstrate this would greatly strengthen the manuscript.

      We appreciate the reviewer's comments. Experiments are under way to elucidate the mechanism of the “ectosomal style” exosome release, and they will be the topic of our next publication.

      In general, the EV field has struggled to link up analysis of the subcellular biology of sEV secretion and the biochemical/physical analysis of the sEVs themselves, so from that perspective, the manuscript provides a novel angle on this problem.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors describe a novel mode of release of small extracellular vesicles. These small EVs are released via the rupture of the membrane of so-called amphiectosomes that resemble "morphologically" Multivesicular Bodies.

      These structures have been initially described by the authors as released by colorectal cancer cells (https://doi.org/10.1080/20013078.2019.1596668). In this manuscript, they provide experiments that allow us to generalize this process to other cells. In brief, amphiectosomes are likely released by ectocytosis of amphisomes that are formed by the fusion of multivesicular endosomes with autophagosomes. The authors propose that their model puts forward the hypothesis that LC3 positive vesicles are formed by "curling" of the autophagosomal membrane which then gives rise to an organelle where both CD63 and LC3 positive small EVs co-exist and would be released then by a budding mechanism at the cell surface that appears similar to the budding of microvesicles /ectosomes. Very correctly the authors make the distinction from migrasomes because these structures appear very similar in morphology.

      Strengths:

      The findings are interesting, although it is unclear what the functional relevance of such a process would be or even how it could be induced. It points to a novel mode of release of extracellular vesicles.

      Weaknesses:

      This reviewer has comments and concerns about the interpretation of the data and the proposed model. In addition, in my opinion, some of the results, in particular the micrographs and immunoblots (even those shown as supplementary data), are not of sufficient quality to support the conclusions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Highlight MV-IEV, ILV and limiting membrane in Figure-1G, N, and U.

      Based on the suggestion, we revised Figure 1.

      (2) Figure 1-Y-AF are not mentioned in the text.

      In the revised manuscript, we discuss Figure 1Y-AF.

      (3) The term "IEVs" in Figure 2-S2 is not defined.

      We modified the figure legend: we changed MV-lEV to amphiectosome

      (4) Need to quantify co-localization in Figure 2-S2.

      As suggested, we carried out the co-localisation analysis (Fig2-S2I), and Fig2-S2 was re-edited

      Reviewer #2 (Recommendations For The Authors):

      I have two recommendations for improving the manuscript through additional experiments:

      (1) I think the description of the intracellular processes taking place in order to form amphiectosomes would be much stronger if some super-resolution imaging could be undertaken. This should label the different compartments before and after fusion with specific markers that highlight the protein signature of the different limiting and ILV membranes much more clearly than immuno-EM. It will also help in characterising the double-membrane structure of amphiectosomes at the point of budding and reveal whether the patchy labelling of the inner membrane emerges after amphiectosome release (the schematic model currently suggests that it happens before).

      Thank you for your suggestion. STED microscopy was applied; the results are shown in the new Fig. 3, and the schematic model was modified accordingly.

      (2) The implications of the manuscript would be more wide-ranging if the authors could test genetic manipulations that are believed to block exosome or ectosome release, eg. Rab27a or Arrdc1 knockdown. This may allow them to determine whether MV-lEVs can be released independently of the classical exosome release mechanism because they use a different route to be released from the plasma membrane. This experiment is not essential, but I think it would start to address the core regulatory mechanisms involved, and if successful, would easily allow the authors to determine the ratio of CD63-positive sEVs being secreted via classical versus amphiectosome routes.

      The suggestion is very valuable for us and these studies are being performed in a separate project.

      I think there are several other ways in which the manuscript could be improved to better explain some of the approaches, findings and interpretation:

      (1) Include some explanation in the text of certain key tools, particularly:

      a. Palm-GFP and whether its expression might alter the properties of the plasma membrane since this is used in a lot of experiments and is the only marker that seems to uniformly label the outer membrane of amphiectosomes. One concern might be that its expression drives amphiectosome secretion.

      We also found evidence for amphiectosome release in several different cell types that do not express Palm-GFP. We believe this excludes the possibility that Palm-GFP expression induces amphiectosome release. By both fluorescence and electron microscopy, the cells not expressing Palm-GFP showed very similar MV-lEVs. In addition, we made similar observations in non-transduced HEK293 cells labelled with fluorescent WGA.

      b. Lactadherin - does this label the amphiectosomes after their release or does the wash-off step mean that it only labels cells, which subsequently release amphiectosomes?

      Lactadherin labels the amphiectosomes after their release and fixation. Living cells cannot be labelled by lactadherin as PS is absent in the external plasma membrane layer of living cells. We used WGA on HEK293 cells to further support the plasma membrane origin of the external membrane of amphiectosomes.

      (2) Explain the EM and confocal imaging approaches more clearly. Most importantly, is a 3D reconstruction always involved to confirm that 'separated' amphiectosomes are not joined to cells in another Z-plane.

      Thank you for your suggestion. We have modified the manuscript accordingly

      (3) Presenting triple-labelled images with red, green and yellow channels does not allow individual labelling to be determined without single-channel images and even then, it is much more informative to use three distinguishable colours that make a different colour with overlap, eg. CMY? Fig.2_S2D and E do not display individual channels, so definitely need to be changed.

      In the case of Fig.2_S2D, we now show the individual channels, and the earlier E image has been removed. In the case of the STED images, CMY colors were used, as you suggested.

      (4) Please discuss in the text the data in Figure 1Y onwards concerning single/double membranes on MV-lEVs.

      In the revised manuscript, we discuss the question of single/double membranes and refer to Figure 1Y-AF.

      (5) On line 162, reword 'intraluminal TSPAN4 only' to 'one in which TSPAN4 is only intraluminal' to make it clear that other proteins are also marking the intraluminal region, not TSPAN4 only.

      We modified the text accordingly.

      (6) Points for further discussion and further conclusions:

      a. In vivo experiments - discuss the limitations of this part of the analysis - it seems that none of the amphiectosome markers have been analysed in this part of the study and the MV-lEVs are only in the circulation.

      b. Can the authors give any further indication of the levels of MV-lEVs relative to free sEVs from any of their studies?

      Using our current approach, it is not possible to determine the levels of MV-lEVs relative to free sEVs. Without analyzing serial ultrathin sections, the apparent ratio of MV-lEVs to sEVs would depend on the actual section plane. In future projects, we will determine the ratio of LC3-positive and LC3-negative sEVs by single EV analysis techniques (such as SP-IRIS). In the revised manuscript, additional TEM images are included to provide evidence for the simultaneous presence of sEVs and MV-lEVs, and for the presence of MV-lEVs both inside and outside of the circulation.

      c. Please discuss the single versus double membrane issue (relating to experiments proposed above).

      We discuss this question in more detail in the revised manuscript.

      d. Please point out that the release mechanism (plasma membrane budding) will involve different molecular mechanisms to establish exosome release, and this might provide a route to determine relative importance.

      We are currently running a systematic analysis of the release mechanism of amphiectosomes, and this will be the topic of a separate manuscript.

      Reviewer #3 (Recommendations For The Authors):

      * The model is not supported.

      * The data is not of quality.

      * The appropriate methods are not exploited.

      We are sorry, we cannot respond to these unsupported critiques.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors identified that

      (1) CDK4/6i treatment attenuates the growth of drug-resistant cells by prolongation of the G1 phase; 

      (2) CDK4/6i treatment results in an ineffective Rb inactivation pathway and suppresses the growth of drug-resistant tumors;

      (3) Addition of endocrine therapy augments the efficacy of CDK4/6i maintenance;

      (4) Addition of CDK2i to CDK4/6i treatment as second-line treatment can suppress the growth of resistant cells;

      (5) The role of cyclin E as a key driver of resistance to CDK4/6 and CDK2 inhibition.

      Strengths:

      To prove their complicated proposal, the authors employed orchestration of several kinds of live cell markers, timed in situ hybridization, IF and Immunoblotting. The authors strongly recognize the resistance of CDK4/6 + ET therapy and demonstrated how to overcome it.

      Weaknesses:

      The authors need to underscore their proposed results from what is to be achieved by them and by other researchers. 

      Thank you for your thoughtful review and for highlighting both the strengths and weaknesses of our manuscript. We appreciate your recognition of the methodological rigor and the significance of our findings in addressing resistance to CDK4/6 inhibitors combined with endocrine therapy.

      To address your concern regarding the need to delineate our results from those achieved by other researchers, we will incorporate clarifications in the revised manuscript. Specifically, we will:

      (1) Clearly distinguish our novel contributions from prior findings in the field.

      (2) Explicitly cite and discuss relevant studies to contextualize our work, ensuring that our contributions are appropriately framed within the broader body of knowledge.

      These revisions will enhance the transparency and impact of our manuscript, as well as highlight the originality and significance of our findings. Thank you again for your constructive feedback.

      Reviewer #2 (Public review):

      Summary:

      This study elucidated the mechanism underlying drug resistance induced by CDK4/6i as a single agent and proposed a novel and efficacious second-line therapeutic strategy. It highlighted the potential of combining CDK2i with CDK4/6i for the treatment of HR+/HER2- breast cancer.

      Strengths:

      The study demonstrated that CDK4/6 induces drug resistance by impairing Rb activation, which results in diminished E2F activity and a delay in G1 phase progression. It suggests that the synergistic use of CDK2i and CDK4/6i may represent a promising second-line treatment approach. Addressing critical clinical challenges, this study holds substantial practical implications.

      Weaknesses: 

      (1) Drug-resistant cell lines: Was a drug concentration gradient treatment employed to establish drug-resistant cell lines? If affirmative, this methodology should be detailed in the materials and methods section. 

      We greatly appreciate the reviewer for raising this important question. In the revised manuscript, we will update the methods section to include a detailed description of how the drug-resistant cell lines were developed. Specifically, we will clarify whether a drug concentration gradient treatment was employed and provide step-by-step details to ensure reproducibility.

      (2) What rationale informed the selection of MCF-7 cells for the generation of CDK6 knockout cell lines? Supplementary Figure 3A indicates that CDK6 expression levels in MCF-7 cells are not notably elevated.

      We appreciate the reviewer’s insightful question about the rationale for selecting MCF-7 cells to generate CDK6 knockout cell lines. This choice was guided by prior studies highlighting the significant role of CDK6 in mediating resistance to CDK4/6 inhibitors (1-4). Moreover, we observed a 4.6-fold increase in CDK6 expression in CDK4/6 inhibitor-resistant MCF-7 cells compared to their drug-naïve counterparts (Supplementary Figure 3A). While we did not detect notable differences in CDK4/6 activity between wild-type and CDK6 knockout cells under CDK4/6 inhibitor treatment, these findings point to a potential non-canonical function of CDK6 in conferring resistance to CDK4/6 inhibitors.

      (3) For each experiment, particularly those involving mice, the author must specify the number of individuals utilized and the number of replicates conducted, as detailed in the materials and methods section. 

      We sincerely thank the reviewer for bringing this to our attention. In the revised manuscript, we will provide explicit details regarding the number of replicates and mice used for each experiment. This information will be included in the materials and methods section, figure legends, and relevant text to ensure transparency and clarity.

      (4) Could this treatment approach be extended to triple-negative breast cancer? 

      We greatly appreciate the reviewer’s inquiry about extending our findings to triple-negative breast cancer (TNBC). Based on our data presented in Figure 1 and Supplementary Figure 2, which include the TNBC cell line MDA-MB-231, we anticipate that the benefits of maintaining CDK4/6 inhibitors could indeed be applied to TNBC with an intact Rb/E2F pathway.

      Reviewer #3 (Public review):

      Summary:

      In their manuscript, Armand and colleagues investigate the potential of continuing CDK4/6 inhibitors or combining them with CDK2 inhibitors in the treatment of breast cancer that has developed resistance to initial therapy. Utilizing cellular and animal models, the research examines whether maintaining CDK4/6 inhibition or adding CDK2 inhibitors can effectively control tumor growth after resistance has set in. The key findings from the study indicate that the sustained use of CDK4/6 inhibitors can slow down the proliferation of cancer cells that have become resistant, and the combination of CDK2 inhibitors with CDK4/6 inhibitors can further enhance the suppression of tumor growth. Additionally, the study identifies that high levels of Cyclin E play a significant role in resistance to the combined therapy. These results suggest that continuing CDK4/6 inhibitors along with the strategic use of CDK2 inhibitors could be an effective strategy to overcome treatment resistance in hormone receptor-positive breast cancer.

      Strengths:

      (1) Continuous CDK4/6 Inhibitor Treatment Significantly Suppresses the Growth of Drug-Resistant HR+ Breast Cancer: The study demonstrates that the continued use of CDK4/6 inhibitors, even after disease progression, can significantly inhibit the growth of drug-resistant breast cancer.

      (2) Potential of Combined Use of CDK2 Inhibitors with CDK4/6 Inhibitors: The research highlights the potential of combining CDK2 inhibitors with CDK4/6 inhibitors to effectively suppress CDK2 activity and overcome drug resistance.

      (3) Discovery of Cyclin E Overexpression as a Key Driver: The study identifies overexpression of cyclin E as a key driver of resistance to the combination of CDK4/6 and CDK2 inhibitors, providing insights for future cancer treatments.

      (4) Consistency of In Vitro and In Vivo Experimental Results: The study obtained supportive results from both in vitro cell experiments and in vivo tumor models, enhancing the reliability of the research.

      (5) Validation with Multiple Cell Lines: The research utilized multiple HR+/HER2- breast cancer cell lines (such as MCF-7, T47D, CAMA-1) and triple-negative breast cancer cell lines (such as MDA-MB-231), validating the broad applicability of the results.

      Weaknesses:

      (1) The manuscript presents intriguing findings on the sustained use of CDK4/6 inhibitors and the potential incorporation of CDK2 inhibitors in breast cancer treatment. However, I would appreciate a more detailed discussion of how these findings could be translated into clinical practice, particularly regarding the management of patients with drug-resistant breast cancer. 

      We greatly appreciate this opportunity to further contextualize our findings within clinical practice. In the revised manuscript, we will expand the discussion to explore how the identified mechanisms can inform patient stratification and therapeutic combinations. We will also highlight the potential of integrating CDK2 inhibitors with continued CDK4/6 inhibition as a second-line strategy for HR+ breast cancer patients who exhibit resistance to CDK4/6 inhibitors, leveraging insights from current and ongoing clinical trials. This will provide a clearer framework for translating our findings into actionable therapeutic strategies.

      (2) While the emergence of resistance is acknowledged, the manuscript could benefit from a deeper exploration of the molecular mechanisms underlying resistance development. A more thorough understanding of how CDK2 inhibitors may overcome this resistance would be valuable. 

      Thank you for this insightful suggestion. In the revised manuscript, we will delve deeper into the molecular mechanisms by which CDK2 inhibitors counteract resistance to CDK4/6 inhibitors and endocrine therapy. We will emphasize the role of the non-canonical Rb inactivation pathway and upregulated transcriptional activity in reactivating CDK2, which contribute to resistance under CDK4/6 inhibition. Furthermore, we will discuss how dual inhibition of CDK4/6 and CDK2 effectively suppresses this resistance pathway, offering a mechanistic rationale for the therapeutic potential of this combination strategy.

      (3) The manuscript supports the continued use of CDK4/6 inhibitors, but it lacks a discussion on the long-term efficacy and safety of this approach. Additional studies or data to support the safety profile of prolonged CDK4/6 inhibitor use would strengthen the manuscript. 

      We greatly appreciate the reviewer for raising this important point. To address this, we will incorporate a discussion on the long-term safety and efficacy of CDK4/6 inhibitor maintenance therapy. Drawing from clinical trials and retrospective analyses (5-9), we will highlight data supporting the tolerability of prolonged CDK4/6i treatment, particularly in combination with endocrine therapy. We will also discuss its clinical benefits over chemotherapy or endocrine therapy alone, contextualizing these findings with our proposed therapeutic approach (6,8-11).

      References:

      (1) Yang C, Li Z, Bhatt T, Dickler M, Giri D, Scaltriti M, et al. Acquired CDK6 amplification promotes breast cancer resistance to CDK4/6 inhibitors and loss of ER signaling and dependence. Oncogene 2017;36:2255-64

      (2) Li Q, Jiang B, Guo J, Shao H, Del Priore IS, Chang Q, et al. INK4 Tumor Suppressor Proteins Mediate Resistance to CDK4/6 Kinase Inhibitors. Cancer Discov 2022;12:356-71

      (3) Ji W, Zhang W, Wang X, Shi Y, Yang F, Xie H, et al. c-myc regulates the sensitivity of breast cancer cells to palbociclib via c-myc/miR-29b-3p/CDK6 axis. Cell Death & Disease 2020;11:760

      (4) Wu X, Yang X, Xiong Y, Li R, Ito T, Ahmed TA, et al. Distinct CDK6 complexes determine tumor cell response to CDK4/6 inhibitors and degraders. Nature Cancer 2021;2:429-43

      (5) Martin JM, Handorf EA, Montero AJ, Goldstein LJ. Systemic Therapies Following Progression on First-line CDK4/6-inhibitor Treatment: Analysis of Real-world Data. Oncologist 2022;27:441-6

      (6) Xi J, Oza A, Thomas S, Ademuyiwa F, Weilbaecher K, Suresh R, et al. Retrospective Analysis of Treatment Patterns and Effectiveness of Palbociclib and Subsequent Regimens in Metastatic Breast Cancer. J Natl Compr Canc Netw 2019;17:141-7

      (7) Basile D, Gerratana L, Corvaja C, Pelizzari G, Franceschin G, Bertoli E, et al. First- and second-line treatment strategies for hormone-receptor (HR)-positive HER2-negative metastatic breast cancer: A real-world study. Breast 2021;57:104-12

      (8) Kalinsky K, Accordino MK, Chiuzan C, Mundi PS, Sakach E, Sathe C, et al. Randomized Phase II Trial of Endocrine Therapy With or Without Ribociclib After Progression on Cyclin-Dependent Kinase 4/6 Inhibition in Hormone Receptor–Positive, Human Epidermal Growth Factor Receptor 2–Negative Metastatic Breast Cancer: MAINTAIN Trial. Journal of Clinical Oncology;0:JCO.22.02392

      (9) Kalinsky K, Bianchini G, Hamilton EP, Graff SL, Park KH, Jeselsohn R, et al. Abemaciclib plus fulvestrant vs fulvestrant alone for HR+, HER2- advanced breast cancer following progression on a prior CDK4/6 inhibitor plus endocrine therapy: Primary outcome of the phase 3 postMONARCH trial. Journal of Clinical Oncology 2024;42:LBA1001-LBA

      (10) Mayer EL, Wander SA, Regan MM, DeMichele A, Forero-Torres A, Rimawi MF, et al. Palbociclib after CDK and endocrine therapy (PACE): A randomized phase II study of fulvestrant, palbociclib, and avelumab for endocrine pre-treated ER+/HER2- metastatic breast cancer. Journal of Clinical Oncology 2018;36:TPS1104-TPS

      (11) Llombart-Cussac A, Harper-Wynne C, Perello A, Hennequin A, Fernandez A, Colleoni M, et al. Second-line endocrine therapy (ET) with or without palbociclib (P) maintenance in patients (pts) with hormone receptor-positive (HR[+])/human epidermal growth factor receptor 2-negative (HER2[-]) advanced breast cancer (ABC): PALMIRA trial. Journal of Clinical Oncology 2023;41:1001-

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      We are grateful to the reviewers and the editorial team for their feedback and thorough revisions of our paper. We also appreciate their acknowledgement that this study represents a significant advancement in the field of reproductive neuroendocrinology and offers insights on the contribution of obesity vs melanocortin signaling in women’s fertility. In the revised version, we will provide a more detailed clarification of the data and methodology and adhere to the reviewers’ suggestions.

      Please find below our answers to specific concerns in the public review:

      Given the fact that mice lacking MC4R in Kiss1 neurons remained fertile despite some reproductive irregularities, the overall tone and some of the conclusions of the manuscript (e.g., from the abstract: "... Mc4r expressed in Kiss1 neurons is required for fertility in females") were overstated. Perhaps this can be described as a contributing pathway, but other mechanisms must also be involved in conveying metabolic information to the reproductive system.

      We will tone down these statements throughout the manuscript to indicate that MC4R in Kiss1 neurons plays a role in the metabolic control of fertility (rather than “…is required for fertility”).

      The mechanistic studies evaluating melanocortin signalling in Kiss1 neurons were all completed in ovariectomised animals (with and without exogenous hormones) that do not experience cyclical hormone changes. Such cyclical changes are fundamental to how these neurons function in vivo and may dynamically alter the way they respond to neuropeptides. Therefore, eliminating this variable makes interpretation difficult.

      Mice lack true follicular and luteal phases and therefore it is impossible to separate estrogen-mediated changes from progesterone-mediated changes (e.g., in a proestrous female). Therefore, we use an ovariectomized female model in which we can generate a LH surge with an E2-replacement regimen [1]. This model enables us to focus on estrogen effects, exclude progesterone effects, and minimize variability. Inclusion of cycling females would make interpretation much more difficult.

      (1) Bosch et al., 2013 Mol & Cell Endo; https://doi.org/10.1016/j.mce.2012.12.021

      Use of the POMC-Cre to target optogenetic inputs to Kiss1 neurons might have targeted a wider population of cells than intended.

      POMC is transiently expressed during embryonic development in a portion of cells fated to be Kiss1 or NPY/AgRP neurons [1-2]. Therefore, this is a valid concern when crossing with a floxed mouse. However, use of AAVs in adult animals avoids this issue and leads to specific expression in POMC neurons [3]. This POMC-Cre mouse has been used extensively with AAVs to drive specific expression in POMC neurons by other laboratories [4-7]. Therefore, we are confident that our optogenetic studies have narrowly targeted POMC inputs.

      (1) Padilla et al., 2010 Nat Med; https://doi.org/10.1038/nm.2126

      (2) Lam et al., 2017 Mol Metab; https://doi.org/10.1016/j.molmet.2017.02.007

      (3) Stincic et al., 2018 eNeuro; https://doi.org/10.1523/eneuro.0103-18.2018

      (4) Fenselau et al., 2017 Nat Neuro; https://doi.org/10.1038/nn.4442

      (5) Rau & Hentges, 2019 J Neuro; https://doi.org/10.1523/jneurosci.3193-18.2019

      (6) Fortin et al., 2021 Nutrients; https://doi.org/10.3390/nu13051642

      (7) Villa et al., 2024 J Neuro; https://doi.org/10.1523/jneurosci.0222-24.2024

      Recommendations for Authors

      We thank the reviewers and the editorial team for their comments and thorough revisions of our paper. We have now addressed their comments and edited the manuscript accordingly:

      Reviewer #1 (Recommendations For The Authors):

      L80 -This is an awkward sentence; it isn't an inverse agonist of the AgRP; this may read better just to say that the inverse agonist, AgRP.

      Thank you for this comment. This has now been changed in the text (L80).

      L86 - This text reads as if mice have an inherent obesity issue.

      This has also now been addressed in the text (L86).

      L131 - The numbers of digits past the decimal point should match for both mean and SEM.

      This has also now been addressed throughout the text.

      Figure 1D: Revise the bar graphs with distinct SEM bars, as these data are not generated within the same mice.

      The graphs are now changed, and they include distinct SEM and individual data points.

      Figure 2I-L - An n of 3 for controls is pretty minimal, though the clustering of data points is tight.

      We thank the reviewer for this comment. While we agree that an n=3 for controls is minimal, the mRNA levels in this group are close to one another, so the data points cluster tightly. We are happy to provide the raw data values for these groups if the reviewer wishes.

      L159 - The role of reduced dynorphin mRNA is pretty speculative with regard to basal levels of LH, especially since no other indices of LH secretion were affected. It should also be recognized that mRNA levels do not always equate to activity.

      We agree with the reviewer that our explanation of the role of reduced dynorphin with regard to the elevated basal LH is speculative. However, we only report that the higher LH levels correlate with the lower expression of the Pdyn gene, which is in line with the well-documented role of dynorphin in inhibiting LH secretion. We also recognize that mRNA levels do not necessarily reflect activity. We have now added this statement to the text (L159).

      L164 - Given the ovary data, it seems that the increase seen in KO mice isn't quite sufficient, but is it known how much of a surge is necessary for ovulation in mice?

      We agree with the reviewer’s comment that the LH surge in the Kiss1MC4RKO group is not enough to consistently induce ovulation, which is supported by the decrease in the number of corpora lutea (Figure 2O).

      According to the literature, an LH surge in female mice is defined by an LH value >4 ng/ml (Bahougne et al., 2020). By this criterion, our data show that only two of six females in the KO group had an LH surge, whereas four of five females in the control group did.
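
The threshold criterion described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the assumption that an animal counts as surging if any single sample exceeds 4 ng/ml is ours, and the example values are made-up placeholders rather than data from the study.

```python
from typing import Dict, Iterable, List, Tuple

# Threshold quoted in the response (Bahougne et al., 2020): LH > 4 ng/ml.
SURGE_THRESHOLD_NG_ML = 4.0


def has_lh_surge(lh_values_ng_ml: Iterable[float],
                 threshold: float = SURGE_THRESHOLD_NG_ML) -> bool:
    """Return True if any LH sample strictly exceeds the threshold.

    Assumption: the criterion is applied to individual samples, so one
    value above the threshold is enough to call a surge.
    """
    return any(value > threshold for value in lh_values_ng_ml)


def count_surges(animals: Dict[str, List[float]],
                 threshold: float = SURGE_THRESHOLD_NG_ML) -> Tuple[int, int]:
    """Return (number of animals with a surge, total number of animals)."""
    surged = sum(has_lh_surge(values, threshold) for values in animals.values())
    return surged, len(animals)


# Made-up placeholder values for illustration only (NOT study data).
example_group = {
    "animal_a": [0.5, 1.2, 6.3],
    "animal_b": [0.4, 0.9, 2.1],
    "animal_c": [0.3, 4.0, 3.8],  # exactly 4.0 is not above the threshold
}
print(count_surges(example_group))
```

Using a strict inequality mirrors the ">4 ng/ml" wording; whether a value of exactly 4 ng/ml should count is a choice the source does not specify.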

      L211 - According to the figure, LH pulses were not recovered and remained similar to KO levels. Looking at the LH secretory patterns presented, it seems like the pulse frequency data should be interpreted with some caution, given that some of the pulses identified are tenuous at best.

      We agree that the LH pulses identified by our software (criteria described in the Methods) are variable in shape (LH pulses are difficult to detect clearly in gonad-intact females) and did not differ in number between groups. However, the reinsertion of Mc4r within Kiss1 neurons restored LH basal levels, amplitude and total secretory mass, which are clear indicators of a significant improvement in the ability of these mice to release LH.

      L218 - Is there a reason why the surge was not looked at in these groups?

      Ovarian histology is the best indicator of ovulation. In these mice, corpora lutea were absent, indicating impaired ovulation; thus, we did not consider it necessary to perform an LH surge protocol.

      L244 - This would also fit with previous findings in sheep that not all Kiss neurons express MC receptors

      We agree with this comment.

      L329 - Given the rapidity of its actions, how would this membrane ER function during a normal surge?

      Rapid estrogen signaling can act to ease transitions between states. Membrane delimited E2 actions can quickly attenuate or enhance coupling between receptors and signaling cascades. These effects will precede E2-driven changes in gene expression that produce more stable alterations in signaling. This combination of mechanisms will reduce any lag between rises in serum E2 and physiological effects. Considering the abbreviated mouse reproductive cycle, parallel mechanisms acting on different timescales are particularly important.

      L365 - I'm a little confused as to how this particular work sheds light on a role for MC3R. Is the relative distribution of the two isoforms within Kiss neurons known?

      In the present study, we report that hypothalamic Mc3r expression decreases leading up to the age of puberty onset (p30), in line with the profile of expression of Mc4r and a recent publication involving Mc3r in puberty onset (Lam et al., 2021), suggesting that both receptors may be involved in the control of reproductive function, potentially through the direct regulation of Kiss1 neurons as characterized in our present study.

      L422 - While I understand the nature of this statement, the receptor may simply reflect the activity of what binds to it, i.e., AgRP vs. alpha-MSH, suggesting that maybe the prepubertal period is more AgRP-dominated.

      We agree with this statement, and this needs to be further investigated.

      L495 - Reinsertion of Mc4R in Kiss1 neurons

      Thank you for this comment. This is now corrected in the text (L501).

      L524 - Bilateral ovariectomy of 6-month

      Thank you for this comment. This is now corrected in the text (L530).

      L538 - Is it known what stage of the cycle these mice were in when samples were collected?

      Yes, the samples were collected in diestrus. This is now mentioned in the text (L548).

      L556 - Pulse amplitude is usually measured relative to the preceding nadir.

      The method that we have consistently used in our lab takes the average of the 4 highest LH values in the sample collection period for each animal. We have found this to be consistent and representative of the overall amplitude (McCarthy et al., 2021; Talbi et al., 2021).
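
      The amplitude measure described above (mean of the 4 highest LH values per animal) can be sketched as follows. This is a minimal illustration, not the lab's analysis code: the function name `lh_amplitude` and the example series are hypothetical placeholders, and the values are not data from the study.

```python
from statistics import mean


def lh_amplitude(lh_values, n_highest=4):
    """Mean of the n highest LH values in one animal's sampling period."""
    if len(lh_values) < n_highest:
        raise ValueError("need at least n_highest samples")
    # Sort descending and average the top n values.
    return mean(sorted(lh_values, reverse=True)[:n_highest])


# Placeholder LH series (ng/ml) for one animal; NOT data from the study.
example_series = [0.5, 1.0, 2.0, 4.0, 3.0, 1.5, 0.5, 5.0]
amplitude = lh_amplitude(example_series)
print(f"amplitude = {amplitude} ng/ml")
```

      Unlike a nadir-referenced amplitude, this measure does not depend on pulse detection, so it also reflects any shift in basal LH.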

      L594 - This is a little confusing - the whole MBH would contain the ARH, but only the ARH was collected from the KO mice. If the whole MBH, dynorphin and Tac3, and Tac3 are expressed outside of the ARC, making interpretation of changes specifically within the ARH is difficult.

      Here (L592), we describe two different experiments, as mentioned by i) and ii).

      For experiment 1 (i): MBH was used in the WT mice at ages P10, P15, P22 and P30 to investigate the expression of the melanocortin genes (Agrp, Pomc, Mc3r and Mc4r).

      For experiment 2 (ii): In both KO and control groups, only the micro-dissected ARH was used to investigate the expression of Pdyn, Kiss1, Tac2 and Tacr3.

      Reviewer #2 (Recommendations For The Authors):

      The validation experiments for the various manipulations are currently presented in the supplementary data. Still, in my opinion, these are critically important for interpreting the data, and it should be considered to present these more comprehensively in the main body of the manuscript. In Figure S1, it seems that the exposure of the two images is not the same, with a higher background in the control. Has this image been adjusted to highlight the staining, while the other has not? It looks like there remains a low level of expression still present in at least some of the KO cells - this may reflect difficulties using RNAscope (with its extreme amplification) to detect the absence of a signal, or it could also be that the knockout is incomplete. A percentage of cells still express MC4R. I think this should be acknowledged or discussed.

      We thank the reviewer for the feedback. While we agree that the validation of the mouse model is critical, we would like to keep it in the supplemental data.

      We also agree that the exposure looks different between the KO and WT controls, and we thank the reviewer for this comment. The quality of the photograph decreased when it was transferred to the manuscript. This has now been improved in the revised figure.

      As for the MC4R expression in some of the KO cells, we believe that MC4R is expressed in non-Kiss1 cells, as shown in the merged figure. Therefore, we believe that the knockout of Mc4r in Kiss1 neurons is complete in these mice.

      The clear difference from the PVN's lack of effect is convincing and indicates that a specific knockout has been achieved. Is equivalent data also available for the AVPV population of cells that are examined later in the manuscript? Do those Kiss1 neurons also express the MC4R? The same question applies to the knock-in experiment: Was the expression of MC4R also driven in the AVPV population using this approach?

      Yes, Kiss1 neurons in the AVPV also express MC4R as indicated in this study, and thus Mc4r is removed/reinserted in the AVPV as well in this mouse model.

      The quantitative RT-qPCR data on developmental changes in metabolic signaling molecules are really peripheral to the paper's main question. Relative to the validation experiments (as discussed above), I think these are less important data and could be placed into a supplementary figure. The discussion of these data becomes problematic, e.g., on line 359, the changes are described as "a low melanocortin tone..." but this seems problematic when referring to reduced expression of AgRP, an inverse agonist at the MC4R. If you are going to present these data, individual data points should be shown. Similarly, the question about whether this is a PCOS-like phenotype is perhaps worth asking. Still, the simple assessment of T and AMH could also be reported in a sentence without necessarily showing the data (or placing it in a supplementary figure). Better to focus on the key question - which is the role of MC4R signaling in Kiss1 neurons.

      We understand this reviewer’s concerns; however, given the impact of MC4R signaling (particularly in the context of AgRP) on puberty, we strongly believe that the reader will benefit from the expression profile across ages, so we respectfully disagree and will keep these data in the main figure.

      Per this reviewer’s comment, we have now added individual data points to Figure 1D.

      We also agree with the reviewer that the T and AMH data are not in the main scope of the paper, but since we uncovered a PCOS-like phenotype in female mice with specific deletion of Mc4r from Kiss1 neurons, it is important to keep these data in the main figure to show that the phenotype does not fully resemble a PCOS model.

      Having praised the experimental design, I think it is fair to acknowledge that the reproductive data from these experiments remain difficult to interpret. I understand that it is difficult to illustrate estrous cycles, but the "quantitative" data on percentages of time spent in any one stage are not as informative as seeing the actual individual patterns in Figure 2B. Were all of the animals consistently like the one illustrated, with persistent diestrus and only occasional evidence of ovulation?

      We agree that Figure 2C may be difficult to interpret, but it is the best way to capture all the data points for each group.

      All 5 Kiss1MC4RKO females had persistent diestrus phases with only one or two days of estrus over 15 days (except for one female that had 4 days of estrus), compared to control females, which had 7 to 9 days of estrus, as shown in the graph (except for one female that had 5 days of estrus over the 15-day period).

      Given that LH pulses appear to be normal, does this, in fact, suggest an ovarian problem? Is that possible? Are MC4R and Kiss1 co-expressed in the ovary? Or do you think this suggests an ovulation problem, perhaps driven by the impaired LH surge?

      This reviewer is correct in that our findings suggest a central defect in ovulation based on the deficit observed in the preovulatory LH surge. Thus, it is possible to have normal LH pulses, which are driven by one population of Kiss1 neurons (ARH), alongside an impaired LH surge, which is driven by a distinct population of Kiss1 neurons (AVPV).

      Similarly, the response to the "LH surge induction protocol" is impaired (why not look at endogenous LH surges?). It seems that ovulation should be an all-or-none phenomenon in that if the LH surge is sufficient to induce ovulation, then all available follicles would be ovulated. If it is not, then no follicles will be ovulated. Why fewer follicles are ovulated in the gene-targeted animals seems more likely to be due to impaired follicular development rather than a subthreshold LH surge. So, this again points back to the ovary. Or perhaps we need a more thorough assessment of the pattern of LH pulses throughout the cycles in these animals.

      An LH surge induction protocol allows us to subject all female mice to the same conditions and expect a similar response, which is optimal for comparison with animals with an expected ovulation deficit, as it eliminates external factors. We disagree that ovulation is an all-or-none phenomenon, because in mice numerous follicles mature at the same time, and thus a decrease in the number of ovulated oocytes may be significant between groups even if the animals are not completely infertile.

      Collectively, my assessment of these data is that there are effects on reproduction, but they are actually relatively subtle. There were abnormal cycles and impaired LH surge in response to exogenous estrogen. But the animals are not actually infertile, so can ovulate and express normal reproductive behavior. So while there is a role for MC4R signalling in Kiss1 neurons, it may be a contributing modulatory role rather than a major regulatory mechanism. I think the tone of the descriptions should reflect this. I like the way it is framed in some parts of the discussion ("reproductive impairments...mediated by MC4R in Kiss1 neurons and not by their obese phenotype"), but the overall significance of this is overstated in some places, such as the abstract and in other parts of the discussion ("this population is tightly controlled by melanocortins").

      As mentioned in previous responses, ovulation in mice is not all-or-nothing. So, while the mice can reproduce, the disruption of the central mechanisms that control ovulation and the irregular estrous cycles are a significant advance for the field, with strong translational potential for species in which only one oocyte is usually ovulated, such as humans, where reproductive disorders in MC4R patients had been attributed to the obesity phenotype rather than to a central action of MC4R (as the reviewer captured in their comment). This is one of the main findings of this study.

      The overstatement has now been addressed throughout the text.

      For in vitro studies, all mice were ovariectomized and given estradiol "replacement." What was the rationale for this? Wouldn't this suppress the basal activity of these neurons? Then it appears that some of the animals were studied as ovariectomised (for an unspecified time, but apparently ">7 days") without hormone replacement. The basal activity of these cells would be dramatically different. I think these artificial manipulations make these data quite difficult to interpret. How does this reflect the situation in a normal (or abnormal) estrous cycle? My understanding is that the brain slice approach already compromises the ability of this population of cells to function as a coordinated network (i.e., coordinated episodes of activity that are seen in vivo have not been observed in vitro in brain slices). Ovariectomizing and providing exogenous hormones also removes the additional regulatory elements of the cyclical changes in hormone inputs, so the cells may or may not behave like they would in vivo. Perhaps the authors could justify their choice of experimental model.

      We have clarified that the mice were ovariectomized for 7-10 days. A group of 3 mice is OVXed at once and then used on subsequent days a week later. This delay is both for the recovery of the animal and to allow for “washout” of endogenous ovarian hormones. For optogenetic studies, we were not measuring basal activity. Rather, we prioritized the ability to detect a postsynaptic response. While E2 decreases the networked activity of Kiss1<sup>ARH</sup> neurons, the Hcn channels, calcium channels, and Vglut2 expression are all increased. This leads to increased excitability and more glutamate release. Mice lack true follicular and luteal phases and therefore it is impossible to separate estrogen-mediated changes from progesterone-mediated changes (e.g., in a proestrous female). Therefore, we use an ovariectomized female model in which we can generate an LH surge with an E2-replacement regimen (Bosch et al., J Mol Cell Endocrinology 2013). This model enables us to focus on estrogen effects, exclude progesterone effects, and minimize variability. Finally, we have documented that Kiss1<sup>ARH</sup> neurons retain the synchronization of their neuronal firing in the hypothalamic slice preparation (Qiu et al., eLife 2016).

      Figure 4E shows neurons' staining after expressing a Cre-dependent channel rhodopsin vector into POMC-Cre mice. The number of labelled cells looks markedly larger than expected for adult POMC neurons. Was the specificity of this approach to neurons expressing POMC checked? I understand that the POMC-Cre mice have been criticised for ectopic expression of Cre during development in other populations of neurons in the arcuate nucleus that does not express POMC, such as the AgRP neurons (e.g., PMID: 22166984). Is it possible that this is not a problem in adult animals? Has that been validated in these animals? The description of the method suggests that it is acknowledged that some of the expression driven in these animals might be in AgRP neurons. Still, optogenetic activation of these cells will include all cells expressing Cre at the time of AAV administration.

      POMC is transiently expressed during embryonic development in a portion of cells fated to be Kiss1 or NPY/AgRP neurons. Therefore, this is a valid concern when crossing with a floxed mouse. However, use of AAVs in adult animals avoids this issue and leads to specific expression in POMC neurons. This POMC-Cre mouse has been used extensively with AAVs to drive specific expression in POMC neurons by other laboratories (Padilla et al., Nat Med 2010; Lam et al., Mol Metab 2017; Stincic et al., eNeuro 2018; Fenselau et al., Nat Neuro 2017). We have previously shown that AAV-driven mCherry expression is limited to cells labeled with a beta-endorphin antibody (Stincic et al., eNeuro 2018). Therefore, we are confident that our optogenetic studies have narrowly targeted POMC inputs.

      Some additional explanation of the electrophysiology result may be required. For example, on Line 292, I'm confused by Fig 4M. Why is the response to 20Hz stimulation different in this cell (compared to the one in 4L) before administering naloxone? What proportion of cells showed this opposite response? On line 307: Is 5 cells sufficient for testing the POMC inputs onto AVPV and PeN Kiss1 neurons? How many slices/animals are included in collecting these 5 cells? The rapid action of STX illustrates the ability to modulate the response to MTII, but I am struggling to understand the implications of this in a physiological context. Suppose this response is desensitized by longer-term treatment with E2 (as indicated in the manuscript). Is it relevant to normal regulation during the cycle (particularly in the AVPV, where the key regulatory step seems to be the prolonged exposure to high estradiol as part of the preovulatory signals leading up to the LH surge)?

      As stated in the text, E2 has been shown to increase POMC expression and beta-endorphin immunostaining. We do not know the effects of E2 on aMSH expression and release. E2 also tends to attenuate the coupling between inhibitory postsynaptic metabotropic (Gi,o-coupled) receptors and signaling cascades. So, there is likely a combination of pre- and post-synaptic mechanisms contributing to these responses. However, the focus of the current studies was on the predominant melanocortin signaling and, as such, we chose to eliminate the influence of opioid signaling. We have added two more cells to this group, both of which were successfully rescued, for a total of 5 of 6 cells (6 slices, 5 animals). Between the labeling of beta-endorphin fibers and the high rate of rescue, we do believe that this is sufficient evidence to support a direct POMC input to Kiss1<sup>AVPV/PeN</sup> neurons.

      Line 52: "Here, we show that Mc4r expressed in Kiss1 neurons is required for fertility in females." The knockout animals remain fertile, so this conclusion needs to be re-worded.

      Thank you for this comment. This has now been changed (L52).

      Line 80: "The melanocortin 4 receptor (MC4R) binds α-melanocyte stimulating hormone (αMSH), an agonist product of the pro-opiomelanocortin (Pomc) gene, and the inverse agonist of the agouti-related peptide (AgRP) to regulate food intake and energy expenditure" Is this the correct wording? I think it should be stated that AgRP is an inverse agonist at the MC4R, not that αMSH is the inverse agonist of AgRP. Re-work this sentence.

      Thank you for this comment. This has now been changed (L79-80).

      Line 88: "... however, conflicting reports exist". Describe what these conflicting reports show. Many MC4 variants ("mutations") are expressed in humans, but few will fully inactivate signalling like the mouse knockout.

      We thank the reviewer for this comment. By conflicting data, we refer to the studies that report no reproductive impairments in women with MC4R mutations. This may be either because the metabolic impairments (obesity, hyperphagia, hyperinsulinemia, hyperleptinemia, etc.) are so strong that the focus is skewed to these issues, without a full reproductive assessment in these women, or simply because, as the reviewer mentioned, not all MC4R mutations fully inactivate its signaling in humans, as opposed to mouse models, where reproductive disruption has been described previously in full-body MC4RKOs.

      Line 91: "...that largely affects females". Is this a genuine sex difference, or are reproductive deficits simply more overt in female rodents? I think the Coss paper (reference 19 in the manuscript) showed a greater effect of diet-induced obesity in males than in females.

      We believe that sex differences exist with regards to the role of MC4R in the regulation of fertility, as we show that most of this effect is mediated by MC4R signaling in Kiss1 AVPV neurons, a neuronal population that is specific to the female brain.

      As far as we can tell, the Coss paper (Villa et al., 2024) tested only males, not females. Moreover, they investigated the effect of diet-induced obesity on fertility in mice (specifically LH secretion), while in this study we are specifically looking at the deletion of MC4R from Kiss1 neurons, and these mice were not obese (Figure 2A). While both of these conditions impair fertility, the mechanisms and signaling pathways are different (our mice lack MC4R signaling, while the obese mice have a decrease in MC4R expression but the signaling is still functional).

      Line 392: also Hessler et al. PMID: 32337804.

      This reference is now added to the text (Line 393).

      Line 433. The discussion of how advanced puberty onset (seen in the Kiss1-specific KO animals) might be caused by MC4R signalling in AVPV Kiss1 neurons, which are sexually dimorphic, which might explain sex differences in puberty timing in mammals seems extremely speculative and based on limited data. More targeted experiments would be needed to address this, and I think this speculation should be removed here.

      This speculation has now been removed from the text.

      Line 438: "Furthermore, our findings suggest that metabolic cues, through the regulation of the melanocortin output onto Kiss1AVPV/PeN neurons, are essential for the timing and magnitude of the GnRH/LH surge." Again, I think this is overstating the present data, which has only looked at an artificial hormone administration regime. The animals are fertile and, thus, must be able to mount a sufficient LH surge. The major effect, in fact, seems to be on their cycle, perhaps leading to impaired follicular development. Please acknowledge that this will be one of the multiple pathways by which metabolic information is fed into the HPG axis.

      In addition to the effect on their cycles as mentioned by the reviewer, the Kiss1MC4RKO females also display impaired fertility (Figure 2, S-T) and fewer corpora lutea, which is in line with the impaired mounting of the LH surge (Figure 2, M). Even if the LH surge is induced by the hormone administration protocol, it still reflects the natural ability of the HPG axis to mount the surge, as this regimen is only there to mimic, in a controlled manner, the endogenous hormonal changes leading to the LH surge and therefore ovulation. Nonetheless, we agree with this reviewer that this is not the sole mechanism by which metabolism regulates reproductive function, and this has been emphasized in the paper (line 443).

      Reviewer #3 (Recommendations For The Authors):

      The decreased melanocortin tone drives puberty onset (Figure 1D), and this is correlative. The transgenic animals' hypothalamic expression of Agrp, Pomc, Mc4r, and Mc3r can be measured to strengthen the claim. Hprt expression should be demonstrated, as this housekeeping gene was used as a common denominator.

      We thank the reviewer for this comment. While we agree that measuring Agrp, Pomc, Mc4r, and Mc3r gene expression in the transgenic mice would strengthen our claim and give more insight into the melanocortin tone during pubertal maturation, this is unfortunately not feasible, as it would involve generating a large number of mice (at least n=40 pups for an n=5/group, KO and control littermates, females only, which would require setting up many breeding pairs) at different ages throughout puberty.

      As for the gene expression of Hprt, because we have 6 mice per age and 4 ages in total, every gene (Agrp, Pomc, Mc4r, Mc3r) was run on a separate plate with Hprt as its own housekeeping gene. Samples were run in duplicate for both Hprt and the melanocortin gene in a 96-well plate: 48 wells for Hprt and 48 wells for the melanocortin gene. Therefore, it is not possible to present a single Hprt expression value for all four genes; however, every gene was normalized to the Hprt expression measured on the same plate.
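
      The plate arithmetic above, and the per-plate normalization to Hprt, can be sketched as follows. The well counts follow directly from the numbers given (6 mice, 4 ages, duplicates). The normalization formula is an assumption: the response does not state which method was used, so the sketch uses the common 2^(-ΔCt) form with an assumed amplification efficiency of 100%, and the function names are hypothetical.

```python
from statistics import mean

mice_per_age = 6
ages = 4
replicates = 2  # each sample run in duplicate

# 6 mice x 4 ages x 2 replicates = 48 wells per gene.
wells_per_gene = mice_per_age * ages * replicates
# One plate holds Hprt plus one melanocortin gene: 48 + 48 = 96 wells.
wells_per_plate = 2 * wells_per_gene


def delta_ct(ct_target, ct_hprt):
    """Delta-Ct from replicate Ct values measured on the same plate."""
    return mean(ct_target) - mean(ct_hprt)


def relative_expression(ct_target, ct_hprt):
    """ASSUMED 2^(-delta-Ct) normalization; not confirmed by the source."""
    return 2 ** (-delta_ct(ct_target, ct_hprt))


# Placeholder Ct values for one sample; NOT data from the study.
print(wells_per_gene, wells_per_plate)
print(relative_expression([25.0, 25.0], [23.0, 23.0]))
```

      Because each target gene is paired with the Hprt wells from its own plate, plate-to-plate variation in Hprt cancels within each normalization, which is why a single shared Hprt trace is not needed.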

      In Figures 4 and 5, dot plots can be used (as opposed to the bar graphs) to better reflect the individual data points.

      Figures 4 and 5 have been revised to include individual data points.

      The electrophysiology experiment requires more details in the method section. In addition to the publication cited, a brief recap of the methodology used in this paper, such as the focal application of MTII (Figure 4B), is also needed.

      We have added more details to the Methods.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public review):

      Summary:

      In the manuscript, the authors describe a new pipeline to measure changes in vasculature diameter upon optogenetic stimulation of neurons. The work is useful to better understand the hemodynamic response on a network/graph level.

      Strengths:

      The manuscript provides a pipeline that detects changes in vessel diameter and simultaneously locates the neurons driven by stimulation.

      The resulting data could provide interesting insights into the graph level mechanisms of regulating activity dependent blood flow.

      Weaknesses:

      (1) The manuscript contains (new) wrong statements and (still) wrong mathematical formulas.

      The symbols in these formulas have been updated to disambiguate them, and the accompanying statements have been adjusted for clarity.

      (2) The manuscript does not compare results to existing pipelines for vasculature segmentation (open-source or commercial). Comparing the performance of the pipeline to a random forest classifier (ilastik) on images that are not preprocessed (i.e. corrected for background etc.) does not seem a particularly useful comparison.

      We have now included comparisons with Imaris (commercial) for segmentation and with VesselVio (open-source) for graph extraction.

      For the ilastik comparison, the images were preprocessed prior to ilastik segmentation, specifically by doing intensity normalization.

      Example segmentations utilizing Imaris have now been included. Imaris leaves gaps and discontinuities in the segmentation masks, as shown in Supplementary Figure 10. The Imaris segmentation masks also tend to be more circular in cross-section despite irregularities on the surface of the vessels observable in the raw data and identified in manual segmentation. This approach also requires days to months to generate per image stack.

      “Comparison with commercial and open-source vascular analysis pipelines

      To compare our results with those achievable on these data with other pipelines for segmentation and graph network extraction, we compared segmentation results qualitatively with Imaris version 9.2.1 (Bitplane) and vascular graph extraction with VesselVio [1]. For the Imaris comparison, three small volumes were annotated by hand to label vessels. Example slices of the segmentation results are shown in Supplementary Figure 10. Imaris tended to either over- or under-segment vessels, disregard fine details of the vascular boundaries, and produce jagged edges in the vascular segmentation masks. In addition to these issues with segmentation mask quality, manual segmentation of a single volume took days for a rater to annotate. To compare to VesselVio, binary segmentation masks (one before and one after photostimulation) generated with our deep learning models were loaded into VesselVio for graph extraction, as VesselVio does not have its own method for generating segmentation masks. This also facilitates a direct comparison of the benefits of our graph extraction pipeline to VesselVio. Visualizations of the two graphs are shown in Supplementary Figure 11. VesselVio produced many hairs at both time points, and the total number of segments varied considerably between the two sequential stacks: while the baseline scan resulted in 546 vessel segments, the second scan had 642 vessel segments. These discrepancies are difficult to resolve in post-processing and preclude a direct comparison of individual vessel segments across time. As the segmentation masks we used in graph extraction derive from the union of multiple time points, we could better trace the vasculature and identify more connections in our extracted graph.
      Furthermore, VesselVio relies on the distance transform of the user-supplied segmentation mask to estimate vascular radii; consequently, these estimates are highly susceptible to variations in the input segmentation masks. We repeatedly saw slight variations between boundary placements of all of the models we utilized (ilastik, UNet, and UNETR) and those produced by raters. Our pipeline mitigates this segmentation method bias by using intensity gradient-based boundary detection from centerlines in the image (as opposed to using the distance transform of the segmentation mask, as in VesselVio).”

      (3) The manuscript does not clearly visualize performance of the segmentation pipeline (e.g. via 2d sections, highlighting also errors etc.). Thus, it is unclear how good the pipeline is, under what conditions it fails or what kind of errors to expect.

      In response to the reviewer’s comment, 2D slices have been added to Supplementary Figure 4.

      (4) The pipeline is not fully open-source due to use of matlab. Also, the pipeline code was not made available during review contrary to the authors claims (the provided link did not lead to a repository). Thus, the utility of the pipeline was difficult to judge.

      All code has been uploaded to GitHub and is available at the following location: https://github.com/AICONSlab/novas3d

      The Matlab code for skeletonization is better at preserving centerline integrity during the pruning of hairs from centerlines than the currently available open-source methods.

      - Generalizability: The authors addressed the point of generalizability by applying the pipeline to other data sets. This demonstrates that their pipeline can be applied to other data sets and makes it more useful. However, the visualizations make it hard to judge the performance of the pipeline, where the pipeline fails, etc. The 3D visualizations are not particularly helpful in this respect. In addition, the dice measure seems quite low, indicating roughly 20-40% of voxels do not overlap between inferred and ground truth. I did not notice this high discrepancy earlier. A thorough discussion of the errors appearing in the segmentation pipeline would be necessary in my view to better assess the quality of the pipeline.

      2D slices from the additional datasets have been added to Supplementary Figure 13 to aid in visualizing the models’ ability to generalize to other datasets.

      The Dice range we report (0.7-0.8) is good when compared with that (0.56-0.86) of 3D segmentations of large datasets in microscopy [2], [3], [4], [5], [6]. Furthermore, we had two additional raters segment three images from the original training set. We found that the raters had a mean inter class correlation of 0.73 [7]. Our model outperformed this Dice score on unseen data: Dice scores from our generalizability tests on C57 mice and Fischer rats were on par with or higher than this baseline.
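
      For readers less familiar with the metric under discussion, the Dice coefficient between two binary masks is 2|A∩B| / (|A| + |B|). The sketch below is a minimal illustration on flattened toy masks; it is not the pipeline's evaluation code, and the function name `dice` and the example masks are hypothetical.

```python
def dice(mask_a, mask_b):
    """Dice = 2|A and B| / (|A| + |B|) for flat sequences of 0/1 voxel labels."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy flattened masks (1 = vessel voxel); NOT data from the study.
predicted = [1, 1, 1, 0, 0, 1, 0, 0]
reference = [1, 1, 0, 0, 1, 1, 0, 0]
score = dice(predicted, reference)
print(f"Dice = {score:.2f}")
```

      For thin structures such as capillaries, a boundary shift of a single voxel changes a large share of the foreground, so Dice values are typically lower than for compact objects segmented with the same boundary accuracy.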

      Reviewer #2 (Public review):

      The authors have addressed most of my concerns sufficiently. There are still a few serious concerns I have. Primarily, the temporal resolution of the technique still makes me dubious about nearly all of the biological results. It is good that the authors have added some vessel diameter time courses generated by their model. But I still maintain that data sampling every 42 seconds - or even 21 seconds - is problematic. First, the evidence for long vascular responses is lacking. The authors cite several papers:

      Alarcon-Martinez et al. 2020 show and explicitly state that their responses (stimulus-evoked) returned to baseline within 30 seconds. The responses to ischemia are long lasting but this is irrelevant to the current study using activated local neurons to drive vessel signals.

      Mester et al. 2019 show responses that all seem to return to baseline by around 50 seconds post-stimulus.

      In Mester et al. 2019, diffuse stimulations with blue light showed a return to baseline around 50 seconds post-stimulus (cf. Figure 1E,2C,2D). However, focal stimulations, where the stimulation light is raster scanned over a small region focused in the field of view, show longer-lasting responses (cf. Figure 4) that have not returned to baseline by 70 seconds post-stimulus [8]. Alarcon-Martinez et al. do report that their responses return to baseline within 30 seconds; however, their physiological stimulation may lead to different neuronal and vessel response kinetics than those elicited by optogenetic stimulation in the current work.

      O'Herron et al. 2022 and Hartmann et al. 2021 use opsins expressed in vessel walls (not neurons as in the current study) and directly constrict vessels with light. So this is unrelated to neuronal activity-induced vascular signals in the current study.

      We agree that optogenetic activation of vessel-associated cells is distinct from optogenetic activation of neurons, but we do expect the effects of such perturbations on the vasculature to have some commonalities.

      There are other papers including Vazquez et al 2014 (PMID: 23761666) and Uhlirova et al 2016 (PMID: 27244241) and many others showing optogenetically-evoked neural activity drives vascular responses that return back to baseline within 30 seconds. The stimulation time and the cell types labeled may be different across these studies which can make a difference. But vascular responses lasting 300 seconds or more after a stimulus of a few seconds are just not common in the literature and so are very suspect - likely at least in part due to the limitations of the algorithm.

      Vazquez et al. 2014 used diffuse photostimulation with a fiberoptic probe, similar to the diffuse stimulation in Mester et al. 2019, as opposed to the raster-scanned focal stimulation used in this study and in Mester et al. 2019, where focal photostimulation elicited vascular responses lasting longer than a minute. Uhlirova et al. 2016 used photostimulation powers between 0.7 and 2.8 mW, likely lower than our 4.3 mW/mm2 photostimulation. Further, even with focal photostimulation, we do see light intensity dependence of the duration of the vascular responses. Indeed, in Supplementary Figure 2, 1.1 mW/mm2 photostimulation leads to briefer dilations/constrictions than does 4.3 mW/mm2; the 1.1 mW/mm2 responses are in line, duration-wise, with those in Uhlirova et al. 2016.

      Critically, as per Supplementary Figure 2, the analysis of the experimental recordings acquired at 3-second temporal resolution did likewise show responses in many vessels lasting for tens of seconds and even hundreds of seconds in some vessels.

      Another major issue is that the time courses provided show that the same vessel constricts at certain points and dilates later. So where in the time course the data is sampled will have a major effect on the direction and amplitude of the vascular response. In fact, I could not find how the "response" window is calculated. Is it from the first volume collected after the stimulation - or an average of some number of volumes? But clearly down-sampling the provided data to 42 or even 21 second sampling will lead to problems. If the major benefit to the field is the full volume over large regions that the model can capture and describe, there needs to be a better way to capture the vessel diameter in a meaningful way.

      In the main experiment (i.e. excluding the additional experiments presented in the Supplementary Figure 2 that were collected over a limited FOV at 3s per stack), we have collected one stack every 42 seconds. The first slice of the volume starts following the photostimulation, and the last slice finishes at 42 seconds. Each slice takes ~0.44 seconds to acquire. The data analysis pipeline (as demonstrated by the Supplementary Figure 2) is not in any way limited to data acquired at this temporal resolution and - provided reasonable signal-to-noise ratio (cf. Figure 5) - is applicable, as is, to data acquired at much higher sampling rates.
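
      The timing described above implies that each depth in a stack is sampled at a different delay after photostimulation, which is the basis of the reviewer's concern about depth-dependent response phases. A rough sketch of that arithmetic follows. It assumes slices are acquired back to back with no dead time; the slice count and the helper `slice_delay_s` are derived here for illustration and are not taken from the manuscript.

```python
# Sketch: per-slice acquisition delay within one stack.
STACK_PERIOD_S = 42.0  # one stack every 42 s (from the response)
SLICE_TIME_S = 0.44    # approximate per-slice time (from the response)

# Assumption: slices are contiguous in time, with no overhead between them.
n_slices = int(STACK_PERIOD_S // SLICE_TIME_S)


def slice_delay_s(slice_index: int) -> float:
    """Delay between the end of stimulation and the start of a given slice."""
    if not 0 <= slice_index < n_slices:
        raise ValueError("slice_index out of range")
    return slice_index * SLICE_TIME_S


first_delay = slice_delay_s(0)
last_delay = slice_delay_s(n_slices - 1)
```

      Under these assumptions, the first and last slices of a stack are separated by roughly 41 seconds, so the acquisition direction determines which depth is sampled early and which late in the response.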

      It still seems possible that if responses are bi-phasic, then depth dependencies of constrictors vs dilators may just be due to where in the response the data are being captured - maybe the constriction phase is captured in deeper planes of the volume and the dilation phase more superficially. This may also explain why nearly a third of vessels are not consistent across trials - if the direction the volume was acquired is different across trials, different phases of the response might be captured.

      Alternatively, like neuronal responses to physiological stimuli, the vascular responses elicited by increases in neuronal activity may themselves be variable in both space and time.

      I still have concerns about other aspects of the responses but these are less strong. Particularly, these bi-phasic responses are not something typically seen and I still maintain that constrictions are not common. The authors are right that some papers do show constriction. Leaving out the direct optogenetic constriction of vessels (O'Herron 2022 & Hartmann 2021), the Alarcon-Martinez et al. 2020 paper and others such as Gonzales et al 2020 (PMID: 33051294) show different capillary branches dilating and constricting. However, these are typically found either with spontaneous fluctuations or due to highly localized application of vasoactive compounds. I am not familiar with data showing activation of a large region of tissue - as in the current study - coupled with vessel constrictions in the same region. But as the authors point out, typically only a few vessels at a time are monitored so it is possible - even if this reviewer thinks it unlikely - that this effect is real and just hasn't been seen.

      Uhlirova et al. 2016 (PMID: 27244241) observed biphasic responses in the same vessel with optogenetic stimulation in anesthetized and unanesthetized animals (cf Fig 1b and Fig 2, and section “OG stimulation of INs reproduces the biphasic arteriolar response”). Devor et al. (2007) and Lindvere et al. (2013) also reported on constrictions and dilations being elicited by sensory stimuli.

      I also have concerns about the spatial resolution of the data. It looks like the data in Figure 7 and Supplementary Figure 7 have a resolution of about 1 micron/pixel. It isn't stated so I may be wrong. But detecting changes of less than 1 micron, especially given the noise of an in vivo prep (brain movement and so on), might just be noise in the model. This could also explain constrictions as just spurious outputs in the model's diameter estimation. The high variability in adjacent vessel segments seen in Figure 6C could also be explained the same way, since these also seem biologically and even physically unlikely.

      Thank you for your comment. To address this important issue, we performed an additional validation experiment in which we ordered fluorescent beads with a known diameter of 7.32 ± 0.27um, imaged them following our imaging protocol, and subsequently used our pipeline to estimate their diameter. Our analysis converged on the manufacturer-specified diameter, yielding an estimate of 7.34 ± 0.32um. The manuscript has been updated to detail this experiment, as below:

      Methods section insert

      “Second, our boundary detection algorithm was used to estimate the diameters of fluorescent beads of a known radius imaged under similar acquisition parameters. Polystyrene microspheres labelled with Flash Red (Bangs Laboratories, Inc., CAT# FSFR007) with a nominal diameter of 7.32um and a specified range of 7.32 ± 0.27um, as determined by the manufacturer using a Coulter counter, were imaged on the same multiphoton fluorescence microscope set-up used in the experiment (identical light path, resonant scanner, objective, detector, excitation wavelength and nominal lateral and axial resolutions, with 5x averaging). The images of the beads had a higher SNR than our images of the vasculature, so Gaussian noise was added to the images to degrade the SNR to the same level as that of the blood vessels. The images of the beads were segmented with a threshold, centroids were calculated for individual spheres, and planes with a random normal vector were extracted from each bead and used to estimate the diameter of the beads. The same smoothing and PSF deconvolution steps were applied in this task. We then reported the mean and standard deviation of the distribution of the diameter estimates. A variety of planes were used to estimate the diameters.”

      Results Section Insert

      “Our boundary detection algorithm successfully estimated the radius of precisely specified fluorescent beads. The bead images had a signal-to-noise ratio of 6.79 ± 0.16 (about 35% higher than our in vivo images): to match their SNR to that of in vivo vessel data, following deconvolution, we added Gaussian noise with a standard deviation of 85 SU to the images, bringing the SNR down to 5.05 ± 0.15. The data processing pipeline was kept unaltered except for the bead segmentation, performed via image thresholding instead of our deep learning model (trained on vessel data). The bead boundary was computed following the same algorithm used on vessel data: i.e., by the average of the minimum intensity gradients computed along 36 radial spokes emanating from the centreline vertex in the orthogonal plane. To demonstrate an averaging-induced decrease in the uncertainty of the bead radius estimates on a scale that is finer than the nominal resolution of the imaging configuration, we tested four averaging levels in 289 beads. Three of these averaging levels were lower than that used on the vessels, and one matched that used on the vessels (36 spokes per orthogonal plane and a minimum of 10 orthogonal planes per vessel). As the amount of averaging increased, the uncertainty on the diameter of the beads decreased, and our estimate of the bead's diameter converged upon the manufacturer's Coulter counter-based specifications (7.32 ± 0.27um), as tabulated in Table 1.”
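
      The spoke-based boundary estimate described in this insert can be illustrated with a minimal sketch. It uses a noise-free synthetic disk rather than bead or vessel data, and the function name `estimate_radius`, the sampling step and the blur width are assumptions made for illustration; it is not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates


def estimate_radius(plane, center, n_spokes=36, max_r=20.0, step=0.25):
    """Mean distance from `center` to the steepest intensity drop along radial spokes."""
    radii = np.arange(0.0, max_r, step)
    angles = np.linspace(0.0, 2.0 * np.pi, n_spokes, endpoint=False)
    edge_r = []
    for theta in angles:
        rows = center[0] + radii * np.sin(theta)
        cols = center[1] + radii * np.cos(theta)
        # Linear interpolation of the image along the spoke.
        profile = map_coordinates(plane, [rows, cols], order=1, mode="nearest")
        # The boundary is taken where the intensity gradient is most negative.
        grad = np.gradient(profile, step)
        edge_r.append(radii[np.argmin(grad)])
    return float(np.mean(edge_r))


# Synthetic bright disk (hypothetical stand-in for a bead cross-section).
size, true_r = 65, 8.0
yy, xx = np.mgrid[:size, :size]
c = (size // 2, size // 2)
disk = ((yy - c[0]) ** 2 + (xx - c[1]) ** 2 <= true_r ** 2).astype(float)
plane = gaussian_filter(disk, sigma=1.0)  # mild blur as a crude PSF stand-in

est = estimate_radius(plane, c)
```

      Averaging over many spokes is what allows the estimate to be finer than the step along any single spoke, which is the effect the bead experiment quantifies.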

      Reviewer #1 (Recommendations for the authors):

      Comments to the authors replies to the reviews:

      - Supplementary Figure 13:

      As indicated before the 3d images + scale makes it impossible to judge the quality of the outputs.

      As aforementioned, 2D slices have been added to the Supplementary Figure 13.

      - Supplementary Table 3:

      There is a significant increase in the Hausdorff and Mean Surface Distance measures for the new data. Why?

      A single vessel being missed by either the rater or the model would significantly affect the Hausdorff distance (HD) and by extension Mean Surface Distance: this is particularly pertinent in the LSFM image with its much larger FOV and thus a potential for much larger max distances to result from missed vessels in the prediction or ground truth data. Large Hausdorff distances may indicate a vessel was missed in either the ground truth or the segmentation mask.
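
      The sensitivity of the Hausdorff distance to a single missed vessel can be seen in a small example. The point sets below are hypothetical; `scipy.spatial.distance.directed_hausdorff` is used to build the symmetric distance.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])


# A straight "vessel" and a prediction offset from it by one voxel.
truth = np.array([[0.0, float(x)] for x in range(50)])
pred = truth + np.array([1.0, 0.0])
hd_close = hausdorff(truth, pred)

# Same prediction, but the ground truth contains a distant vessel it missed.
extra = np.array([[40.0, float(x)] for x in range(10)])
truth_with_extra = np.vstack([truth, extra])
hd_missed = hausdorff(truth_with_extra, pred)
```

      Here the distance jumps from 1 voxel to 39 voxels even though the shared vessel is segmented equally well in both cases, and the jump grows with the size of the field of view.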

      Of note, a different rater annotated these additional datasets from the raters labeling the ground truth data. There is a high variability in boundary placements between raters. On a test where three raters segmented the same three images from the original dataset, we computed an ICC of 0.73 across their segmentations. Our model Dice scores on predictions in out-of-distribution data sets were on par with the inter-rater ICC on the Thy1ChR2 2PFM data.

      - Supplementary Figure 2: The authors provide useful data on the time responses. However, looking at those figures, it is puzzling why certain vessels were selected as responding as there seems almost no change after stimulation. In addition, some of the responses seem to actually start several tens of seconds before the actual stimulus (particularly in A).

      Only some traces in C and D (dark blue) seem to be actually responding vessels.

      This is not discussed and unclear.

      Supplementary Figure 2 displays the time courses of vessel calibre for all vessels in the FOV, not just those deemed responders.

      The aforementioned effects are due to the loess smoothing filter having been applied to the time courses for the preliminary response, which has been rectified in the updated figures. In particular, Supplementary Figure 2 has been updated with separate loess smoothing before and after photostimulation. The (pre-stimulation) effect is gone once the loess smoothing has been separated.
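
      Why joint smoothing produces an apparent pre-stimulus response can be shown with a toy trace. The sketch below uses a centered moving average as a simple stand-in for loess (an assumption made to keep the example dependency-free), and the step trace and window length are hypothetical.

```python
import numpy as np


def moving_average(x, window):
    """Centered moving average with edge padding (a simple stand-in for loess)."""
    pad = window // 2
    padded = np.pad(x, pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")


# Hypothetical vessel calibre trace: flat baseline, then a step at stimulation.
stim_idx = 20
trace = np.concatenate([np.full(stim_idx, 10.0), np.full(30, 12.0)])

# Smoothing across the stimulus leaks post-stimulus values into the baseline.
joint = moving_average(trace, 9)

# Smoothing each epoch separately keeps the baseline untouched.
split = np.concatenate([
    moving_average(trace[:stim_idx], 9),
    moving_average(trace[stim_idx:], 9),
])
```

      In the jointly smoothed trace the last baseline samples rise before the stimulus, whereas the separately smoothed baseline stays flat.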

      - R Point 7: As indicated before and in agreement with the alternative reviewer, the quality of the results in 3d is difficult to judge. No 2d sections that compare 'ground truth' with inferred results are shown in the current manuscript which would enable a much better judgment. The provided video is still 3d and not a video going through 2d slices. Also, in the video the overlap of vasculature and raw data seems to be very good and near 100%, why is the dice measure reported earlier so low ? Is this a particularly good example ?

      Some examples, indicating where the pipeline fails (and why) would be helpful to see, to judge its performance better (ideally in 2d slices).

      As discussed in the public comments, the 2D slices are now included in Suppl. Fig. 4 and suppl. Fig 13 to facilitate visual assessment. The vessels are long and thin so that slight dilations or constrictions impact the Dice scores without being easily visualizable.
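
      The effect of small boundary shifts on the Dice score of thin structures can be made concrete with a toy example. The masks below are hypothetical: a 3-pixel-wide vessel and a prediction that is one pixel wider on each side.

```python
import numpy as np


def dice(a, b):
    """Dice coefficient between two binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0


truth = np.zeros((64, 64), dtype=bool)
truth[30:33, :] = True  # 3-pixel-wide vessel

pred = np.zeros((64, 64), dtype=bool)
pred[29:34, :] = True   # same vessel, one pixel wider on each side

score = dice(truth, pred)
```

      The score is 2·3/(3+5) = 0.75, although the two masks would look nearly identical in a 3D rendering.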

      - Author response images 6 and 7. From the presented data the constrictions measured in the smaller vessels may be a result (at least partly) of noise. This seems to be particularly the case in Author response image 7 left top and bottom for example. It would be helpful to show the actual estimates of the vessels radii overlaid in the (raw) images. In some of the pictures the noise level seems to reach higher values than the 10-20% of noise used in the tests by the authors in the revision.

      The vessel radii are estimated as averages across all vertices of the individual vessels, so it is not possible to overlay them meaningfully on 2D slices. In Figure 2B, we do show a rendering of sample vessel-wise radii estimates.

      - "We tested the centerline detection in Python, scipy (1.9.3) and Matlab. We found that the Matlab implementation performed better due to its inclusion of a branch length parameter for the identification of terminal branches, which greatly reduced the number of false branches; the Python implementation does not include this feature (in any version) and its output had many more such "hair" artifacts. Clearmap skeletonization uses an algorithm by Palagyi & Kuba(1999) to thin segmentation masks, which does not include hair removal. Vesselvio uses a parallelized version of the scipy implementation of Lee et al. (1994) algorithm which does not do hair removal based on a terminal branch length filter; instead, Vesselvio performs a threshold-based hair removal that is frequently overly aggressive (it removes true positive vessel branches), as highlighted by the authors."

      This statement is wrong. The removal of small branches in skeletons is algorithmically independent of the skeletonization algorithm itself. The authors cite a reference concerned with the algorithm they are currently employing for the skeletonization. Careful assessment of that reference shows that this algorithm removes small length branches after skeletonization is performed. This feature is available in open-source packages as well, or could be easily implemented.

      We appreciate that skeletonization is distinct from hair removal and have reworded this paragraph for clarity. We are currently working with SciPy developers to implement hair removal in their image processing pipeline so as to render our pipeline fully open-source.

      The removal of hairs after skeletonization with length based thresholding leads to the possibility of removing parts of centerlines in the main part of vessels after branch points with hairs. The Matlab implementation does not do this and leaves the main branches intact.

      This text has been updated to:

      “Hair” segments shorter than 20 μm and terminal on one end were iteratively removed, starting with the shortest hairs and merging the longest hairs at junctions with 2 terminal branches with the main vessel branch to reduce false positive vascular branches and minimize the amount of centerlines removed. This iterative hair removal functionality of the skeletonization algorithm is currently unavailable in Python, but is available in Matlab [9].
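
      The iterative pruning described in this passage can be sketched on a graph representation of the skeleton. This is a simplified illustration and not the Matlab routine: each branch is assumed to be a single edge carrying a hypothetical `length` attribute in μm, and the function name `prune_hairs` is made up for this example.

```python
import networkx as nx


def prune_hairs(graph, min_length_um=20.0):
    """Iteratively remove the shortest terminal branch attached to a junction."""
    g = graph.copy()
    while True:
        hairs = []
        for node in g.nodes:
            if g.degree(node) != 1:
                continue  # only terminal nodes can be hair tips
            neighbor = next(iter(g.neighbors(node)))
            length = g.edges[node, neighbor]["length"]
            # A hair is a short terminal branch hanging off a true junction.
            if g.degree(neighbor) >= 3 and length < min_length_um:
                hairs.append((length, node))
        if not hairs:
            return g
        _, tip = min(hairs)  # shortest hair first
        g.remove_node(tip)   # degrees are re-evaluated on the next pass


skeleton = nx.Graph()
skeleton.add_weighted_edges_from(
    [
        ("a", "j", 100.0),  # main branch
        ("j", "h1", 5.0),   # hair on the main branch
        ("j", "k", 80.0),   # main branch continues
        ("k", "t1", 8.0),   # shorter of two terminal branches at junction k
        ("k", "t2", 15.0),  # longer terminal branch at junction k
    ],
    weight="length",
)
pruned = prune_hairs(skeleton)
```

      Because degrees are recomputed after every removal, the longer of the two terminal branches at `k` survives: once `t1` is removed, `k` is no longer a junction and `t2` is treated as the continuation of the main branch.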

      - "On the reviewer's comment, we did try inputting normalized images into Ilastik, but this did not improve its results." This is surprising. Reasonable standard preprocessing (e.g. background removal, equalization, and vessel enhancement) would probably restore most of ilastik's performance in the indicated panel.

      While the improvement may be present in a particular set of images, the generalizability of such improvement to other patches is often poor in our experience, as reflected by the aforementioned results and the widespread uptake of DL approaches to image segmentation. The in vivo datasets also contain artifacts, arising e.g. from bleeding into the FOV, to which ilastik is highly sensitive. This is an example of noise that is not easily removed by standard preprocessing.

      - "Typical pre-processing/standard computer vision techniques with parameter tuning do not generalize on out-of-distribution data with different image characteristics, motivating the shift to DL-based approaches."

      I disagree with this statement. DL approaches can typically generalize when trained with a sufficient amount of diverse data. However, DL approaches can also fail with new out-of-distribution data. In that situation they can only be 'rescued' via new, time-intensive data generation and retraining. Simple standard image pre-processing steps (e.g. to remove background or boost vessel structures) have well-defined parameters that can be easily adapted to new out-of-distribution data, as clear interpretations are available. The time to adapt those parameters is typically much smaller than that needed to retrain DL frameworks.

      We find that the standard image processing approaches with parameter tuning work robustly only if fine-tuned on individual images; i.e., the fine-tuning does not generalize across datasets. This approach thus does not scale to experiments that yield large images or run at high throughput. While DL models may not generalize to out-of-distribution data, fine-tuning DL models with a small subset of labels generally produces models that are superior to parameter tuning and can be applied to entire studies. Moreover, DL fine-tuning is typically an efficient process due to the very limited labelling and training time required.

      - It is still unclear how the authors' pipeline performs compared with other (open-source or commercially) available pipelines. As indicated before, comparing to ilastik, particularly when feeding non-preprocessed data, does not seem to be a particularly high bar.

      This question has also been raised by the other reviewer who asked to compare to commercially available pipelines.

      This question was not answered by the authors, and instead the authors reply by claiming to provide an open-source pipeline. In fact, the use of Matlab in their pipeline does not make it fully open-source either. Moreover, as mentioned before, open-source pipelines for comparisons do exist.

      As discussed above, the manuscript now includes comparisons to Imaris for segmentation and Vesselvio for graph extraction. The pipeline is on github.

      -"We agree with the review that this question is interesting; however, it is not addressable using present data: activated neuronal firing will have effects on their postsynaptic neighbors, yet we have no means of measuring the spread of activation using the current experimental model."

      Distances to the closest neuron in the manuscript are measured without checking if it's active. Thus, distances to the first set of n neurons could be measured in the same way, ignoring activation effects.

      Shorter distances to an entire ensemble of neurons would still be (more) informative of metabolic demands.

      This could indeed be done within the existing framework. The connected-components-3d package can be used to extract individual neurons in the FOV from the neuron segmentation mask. The distance from each neuron to each point on the vessel centerlines could then be calculated.
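
      A minimal sketch of this ensemble-distance idea is given below. It substitutes `scipy.ndimage.label` for the connected-components-3d package to stay within SciPy, represents each neuron by its centroid, and uses a hypothetical function name and toy data; none of this is taken from the authors' pipeline.

```python
import numpy as np
from scipy import ndimage
from scipy.spatial import cKDTree


def neuron_to_centerline_distances(neuron_mask, centerline_pts, k=2,
                                   voxel_size=(1.0, 1.0, 1.0)):
    """Distances from each centerline point to its k nearest neuron centroids."""
    labels, n_neurons = ndimage.label(neuron_mask)
    if n_neurons == 0:
        raise ValueError("no neurons found in mask")
    centroids = np.array(
        ndimage.center_of_mass(
            neuron_mask.astype(float), labels, np.arange(1, n_neurons + 1)
        )
    )
    scale = np.asarray(voxel_size, dtype=float)
    tree = cKDTree(centroids * scale)
    pts = np.asarray(centerline_pts, dtype=float) * scale
    dists, _ = tree.query(pts, k=min(k, n_neurons))
    return dists.reshape(len(pts), -1)  # one row per centerline point


# Toy volume with two separate "neurons" and one centerline point.
mask = np.zeros((20, 20, 20), dtype=bool)
mask[2:4, 2:4, 2:4] = True
mask[10:12, 10:12, 10:12] = True
points = [[2.5, 2.5, 5.5]]

d = neuron_to_centerline_distances(mask, points, k=2)
```

      Each row of `d` holds the sorted distances to the nearest neurons, so a summary such as the mean over the first n columns gives the ensemble measure the reviewer suggests.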

      - model architecture:

      It is unclear from the description if any positional encoding was used for the image patches.

      It is unclear if the architecture / pipeline can handle any volume sizes or is trained on a fixed volume shapes? In the latter case how is the pipeline applied?

      The model includes positional encoding, as described in Hatamizadeh et al. 2021.

      The model can be applied to images of any size, as demonstrated on larger images in Supplementary Figure 9 and on smaller images in Supplementary Figure 2. The pipeline is applied in the same way. It will read in the size of an input image and output an image of the same size.

      - transformer models often show better results when using a learning rate scheduler that adjust the learning rate (up and down ramps typically). Did the authors test such approaches?

      We did not use a learning rate scheduler, as we found we were getting good results without using one.

      - formula (4): The 95% percentile of two numbers is the max, and thus (5) is certainly not what the HD95 metric is. The formula is simply wrong as displayed.

      Thank you. The formula has been updated.

      - formula (5): formula 5 is certainly wrong: n_X, n_y are either integer numbers as indicated by the sum indices or sets when used in the distances, but can't be both at the same time.

      Thank you for your comment. The Formula has been updated.
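
      For orientation, one commonly used form of the two metrics discussed in these comments is sketched below. The notation is assumed here and may differ from the updated manuscript formulas: $X$ and $Y$ are the two sets of surface points, $d(x, Y)$ is the distance from a point to the nearest point of the other set, and $P_{95}$ denotes the 95th percentile taken over all points of a set.

```latex
\[
d(x, Y) = \min_{y \in Y} \lVert x - y \rVert_2
\]
\[
\mathrm{HD}_{95}(X, Y) = \max\Bigl\{ \operatorname*{P_{95}}_{x \in X} d(x, Y),\;
                                     \operatorname*{P_{95}}_{y \in Y} d(y, X) \Bigr\}
\]
\[
\mathrm{MSD}(X, Y) = \frac{1}{|X| + |Y|}
  \Bigl( \sum_{x \in X} d(x, Y) + \sum_{y \in Y} d(y, X) \Bigr)
\]
```

      In this form the percentile is taken over the point-wise distances within each set, not over the two directed distances, and set cardinalities $|X|$, $|Y|$ are kept distinct from the sets themselves.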

      - The statement:

      "this functionality of the skeletonization algorithm is currently unavailable in any python implementation, but is available in Matlab [56]."

      is not correct (see reply above)

      Please see the response above. This text has been updated to:

      “Hair” segments shorter than 20 μm and terminal on one end were iteratively removed, starting with the shortest hairs and merging the longest hairs at junctions with 2 terminal branches with the main vessel branch to reduce false positive vascular branches and minimize the amount of centerlines removed. This iterative hair removal functionality of the skeletonization algorithm is currently unavailable in Python, but is available in Matlab [9].

      - the centerline extraction is performed after taking the union of smoothed masks. The union operation can induce novel 'irregular' boundaries that degrade skeletonization performance. I would expect to apply smoothing after the union?

      Indeed the images were smoothed via dilation after taking the union, as described in the previous set of responses to private comments.

      - "The radius estimate defined the size of the Gaussian kernel that was convolved with the image to smooth the vessel: smaller vessels were thus convolved with narrower kernels."

      It's unclear which images were filtered.

      We have updated this text for clarity:

      The radius estimate defined the size of the Gaussian kernel that was convolved with the 2D image slice to smooth the vessel: smaller vessels were thus convolved with narrower kernels.

      - Was deconvolution on the raw images applied or after Gaussian filtering ?

      The deconvolution was applied before Gaussian filtering.

      - ",we extracted image intensities in the orthogonal plane from the deconvolved raw registered image. A 2D Gaussian kernel with sigma equal to 80% of the estimated vessel-wise radius was used to low-pass filter the extracted orthogonal plane image and find the local signal intensity maximum searching, in 2D, from the center of the image to the radius of 10 pixels from the center."

      Would it not be better to filter the 3d image before extracting a 2d plane and filter then ?

      That could be done, but would incur a significant computational speed penalty. 2D convolutions are faster, and produced excellent accuracy when estimating radii in our bead experiment.

      What algorithm was used to obtain the 2d images.

      The 2d images were obtained using scipy.ndimage.map_coordinates.
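
      As an illustration of this step, the sketch below resamples a plane orthogonal to a vessel tangent with `scipy.ndimage.map_coordinates` and then applies a 2D Gaussian with sigma equal to 80% of the radius estimate, as described above. The function name `orthogonal_plane`, the way the in-plane axes are chosen, and the synthetic cylinder are assumptions for illustration only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates


def orthogonal_plane(volume, center, tangent, half_size=10):
    """Resample a (2*half_size+1)^2 plane through `center`, normal to `tangent`."""
    t = np.asarray(tangent, dtype=float)
    t /= np.linalg.norm(t)
    # Pick a helper axis that is not parallel to the tangent.
    helper = np.array([1.0, 0.0, 0.0]) if abs(t[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(t, helper)
    u /= np.linalg.norm(u)
    v = np.cross(t, u)  # u and v span the orthogonal plane
    offsets = np.arange(-half_size, half_size + 1)
    a, b = np.meshgrid(offsets, offsets, indexing="ij")
    coords = (
        np.asarray(center, dtype=float)[:, None, None]
        + u[:, None, None] * a
        + v[:, None, None] * b
    )
    return map_coordinates(volume, coords, order=1, mode="nearest")


# Synthetic vessel: a cylinder of radius 4 voxels running along axis 0.
radius = 4.0
zz, yy, xx = np.mgrid[:32, :32, :32]
volume = ((yy - 16) ** 2 + (xx - 16) ** 2 <= radius ** 2).astype(float)

plane = orthogonal_plane(volume, center=(16, 16, 16), tangent=(1, 0, 0))
smoothed = gaussian_filter(plane, sigma=0.8 * radius)
peak = np.unravel_index(np.argmax(smoothed), smoothed.shape)
```

      Filtering only the extracted 21×21 plane keeps the cost per vertex small, which is the speed argument made in the response above.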

      - Figure 2: H is this the filtered image or the raw data ?

      Panel H is raw data.

      - It would be good to see a few examples of the raw data overlaid with the radial estimates to evaluate the approach (beyond the example in K).

      Additional examples are shown in Figure 5.

      - Figure 2 K: Why are boundary points greater than 2 standard deviations away from the mean excluded ?

      They are excluded to account for irregularities as vessels approach junctions [10], [11].
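
      The exclusion rule can be sketched in a few lines. The function name and the example radii are hypothetical; only the 2-standard-deviation criterion comes from the text.

```python
import numpy as np


def robust_mean_radius(boundary_r, n_sd=2.0):
    """Mean boundary distance after dropping points beyond n_sd standard deviations."""
    r = np.asarray(boundary_r, dtype=float)
    mu, sd = r.mean(), r.std()
    if sd == 0:
        return float(mu)
    keep = np.abs(r - mu) <= n_sd * sd
    return float(r[keep].mean())


# 35 spokes land on the vessel wall; one spoke runs into a neighbouring branch.
spokes = np.array([4.0] * 35 + [12.0])
radius = robust_mean_radius(spokes)
```

      Without the exclusion the mean would be pulled up to about 4.2, whereas with it the single outlying spoke is dropped and the estimate is 4.0.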

      - Figure 2 L: what exactly is plotted here ? What are vertex wise changes, is that the difference between the minimum and maximum of all the detected radii for a single vertex? Why do some vessels (red) show high values consistently throughout the vessel ?

      Figure 2L displays the change in the radius of each vertex in this FOV following photostimulation, relative to baseline.

      - Assortativity: to calculate the assortativity, are radius changes binned in any form to account for the fact that otherwise, $e_{xy}$ and related measures will likely be based on single data points?

      Assortativity is not calculated from single data points. It can be calculated by either binning into categories or computing it on scalars i.e. average radius across a vessel segment:

      See here for info on calculating assortativity from binned categories (ie classifying a vessel as a constrictor, dilator or non-responder):

      https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.assortativity.attribute_assortativity_coefficient.html#networkx.algorithms.assortativity.attribute_assortativity_coefficient

      And see here for calculating assortativity from scalar values:

      https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.assortativity.numeric_assortativity_coefficient.html#networkx.algorithms.assortativity.numeric_assortativity_coefficient

      We calculated the assortativity using scalar values.

      In both cases, one uses all nodes and calculates the correlation between each node and its neighbours with an attribute that can be binned or is a scalar. Binning the value on a given node would not affect the number of nodes in a graph.
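
      The two NetworkX functions linked above can be compared on a toy graph. The graph, the integer-valued `radius_change` attribute and the `response` labels below are hypothetical; the manual cross-check treats numeric assortativity as the Pearson correlation of attribute values across the two ends of every edge, counted in both directions.

```python
import networkx as nx
import numpy as np

# Toy graph: nodes stand for vessel segments, edges for adjacency.
g = nx.path_graph(6)
change = {0: 1, 1: 1, 2: 1, 3: 5, 4: 5, 5: 5}  # hypothetical radius changes
nx.set_node_attributes(g, change, "radius_change")

# Scalar version: uses the attribute values directly.
r_scalar = nx.numeric_assortativity_coefficient(g, "radius_change")

# Binned version: each node is assigned a categorical label first.
labels = {n: ("dilator" if v > 3 else "non_responder") for n, v in change.items()}
nx.set_node_attributes(g, labels, "response")
r_binned = nx.attribute_assortativity_coefficient(g, "response")

# Manual cross-check of the scalar version.
pairs = [(change[u], change[v]) for u, v in g.edges]
pairs += [(b, a) for a, b in pairs]  # undirected edges count in both directions
x, y = np.array(pairs, dtype=float).T
r_manual = np.corrcoef(x, y)[0, 1]
```

      With only two distinct values the scalar and binned coefficients coincide; in both cases the coefficient is computed over all edges of the graph, so no entry of the mixing matrix rests on a single node.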

      - "Ilastik tended to over-segment vessels, i.e. the model returned numerous false positives, having a high recall (0.89 ± 0.19) but low precision (0.37 ± 0.33) (Figure 3, Supplementary Table 3)."

      As indicated before, and looking at Figure 4, over segmentation seems due to too high background. A suggested preprocessing step on the raw images to remove background could have avoided this.

      The images were normalized in preprocessing.

      - Figure 4: The 3d panels are not much easier to read in the revised version. As suggested by other reviewers, 2d sections indicating the differences and errors would be much more helpful to judge the pipelines quality more appropriately.

      As discussed above, 2D sections are now available in a supplementary figure.

      - Figure 3: What would be the Dice score (and other measures) between two ground truths extracted from annotations by two humans (assisted e.g. by ilastik)?

      Two additional human raters annotated images. We observed an ICC of 0.73 across a total of three raters on the three images.

      - Figure 5: The authors only provide the absolute value of SU for the sigma noise levels. This only has some meaning when compared to the mean or median SU of the images. In the text the maximal intensity of 1023 SU is mentioned, but what are those values in images with weaker / smaller vessels (as provided in the constriction examples in the revision)?

      I am unclear why this validation figure should be part of the main manuscript while generalization performance is left out.

      The manuscript has been updated with the mean SNR value of 5.05 ± 0.15 to provide context for the quality of our images.

      Bibliography

      (1) J. R. Bumgarner and R. J. Nelson, “Open-source analysis and visualization of segmented vasculature datasets with VesselVio,” Cell Rep. Methods, vol. 2, no. 4, Apr. 2022, doi: 10.1016/j.crmeth.2022.100189.

      (2) G. Tetteh et al., “DeepVesselNet: Vessel Segmentation, Centerline Prediction, and Bifurcation Detection in 3-D Angiographic Volumes,” Front. Neurosci., vol. 14, Dec. 2020, doi: 10.3389/fnins.2020.592352.

      (3) N. Holroyd, Z. Li, C. Walsh, E. Brown, R. Shipley, and S. Walker-Samuel, “tUbe net: a generalisable deep learning tool for 3D vessel segmentation,” Jul. 24, 2023, bioRxiv. doi: 10.1101/2023.07.24.550334.

      (4) W. Tahir et al., “Anatomical Modeling of Brain Vasculature in Two-Photon Microscopy by Generalizable Deep Learning,” BME Front., vol. 2020, p. 8620932, Dec. 2020, doi: 10.34133/2020/8620932.

      (5) R. Damseh, P. Delafontaine-Martel, P. Pouliot, F. Cheriet, and F. Lesage, “Laplacian Flow Dynamics on Geometric Graphs for Anatomical Modeling of Cerebrovascular Networks,” ArXiv191210003 Cs Eess Q-Bio, Dec. 2019, Accessed: Dec. 09, 2020. [Online]. Available: http://arxiv.org/abs/1912.10003

      (6) T. Jerman, F. Pernuš, B. Likar, and Ž. Špiclin, “Enhancement of Vascular Structures in 3D and 2D Angiographic Images,” IEEE Trans. Med. Imaging, vol. 35, no. 9, pp. 2107–2118, Sep. 2016, doi: 10.1109/TMI.2016.2550102.

      (7) T. B. Smith and N. Smith, “Agreement and reliability statistics for shapes,” PLOS ONE, vol. 13, no. 8, p. e0202087, Aug. 2018, doi: 10.1371/journal.pone.0202087.

      (8) J. R. Mester et al., “In vivo neurovascular response to focused photoactivation of Channelrhodopsin-2,” NeuroImage, vol. 192, pp. 135–144, May 2019, doi: 10.1016/j.neuroimage.2019.01.036.

      (9) T. C. Lee, R. L. Kashyap, and C. N. Chu, “Building Skeleton Models via 3-D Medial Surface Axis Thinning Algorithms,” CVGIP Graph. Models Image Process., vol. 56, no. 6, pp. 462–478, Nov. 1994, doi: 10.1006/cgip.1994.1042.

      (10) M. Y. Rennie et al., “Vessel tortuousity and reduced vascularization in the fetoplacental arterial tree after maternal exposure to polycyclic aromatic hydrocarbons,” Am. J. Physiol.-Heart Circ. Physiol., vol. 300, no. 2, pp. H675–H684, Feb. 2011, doi: 10.1152/ajpheart.00510.2010.

      (11) J. Steinman, M. M. Koletar, B. Stefanovic, and J. G. Sled, “3D morphological analysis of the mouse cerebral vasculature: Comparison of in vivo and ex vivo methods,” PLOS ONE, vol. 12, no. 10, p. e0186676, Oct. 2017, doi: 10.1371/journal.pone.0186676.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The hypothesis is based on the idea that inversions capture genetic variants that have antagonistic effects on male sexual success (via some display traits) and survival of females (or both sexes) until reproduction. Furthermore, a sufficiently skewed distribution of male sexual success will tend to generate synergistic epistasis for male fitness even if the individual loci contribute to sexually selected traits in an additive way. This should favor inversions that keep these male-beneficial alleles at different loci together at a cis-LD. A series of simulations are presented and show that the scenario works at least under some conditions. While a polymorphism at a single locus with large antagonistic effects can be maintained for a certain range of parameters, a second such variant with somewhat smaller effects tends to be lost unless closely linked. It becomes much more likely for genomically distant variants that add to the antagonism to spread if they get trapped in an inversion; the model predicts this should drive accumulation of sexually antagonistic variants on the inversion versus standard haplotype, leading to the evolution of haplotypes with very strong cumulative antagonistic pleiotropic effects. This idea has some analogies with one of the predominant hypotheses for the evolution of sex chromosomes, and the authors discuss these similarities. The model is quite specific, but the basic idea is intuitive and thus should be robust to the details of the model assumptions. It makes perfect sense in the context of the geographic pattern of inversion frequencies. One prediction of the model (notably that it leads to the evolution of nearly homozygously lethal haplotypes) does not seem to reflect the reality of chromosomal inversions in Drosophila, as the authors carefully discuss, but it is the case for some other "supergenes", notably in ants. So the theoretical part is a strong novel contribution.

      We appreciate the detailed and accurate summary of our main theoretical results.

      To provide empirical support for this idea, the authors study the dynamics of inversions in population cages over one generation, tracking their frequencies through amplicon sequencing at three time points: (young adults), embryos and very old adult offspring of either sex (>2 months from adult emergence). Out of four inversions included in the experiment, two show patterns consistent with antagonistic effects on male sexual success (competitive paternity) and the survival of offspring, especially females, until an old age, which the authors interpret as consistent with their theory.

      As I have argued in my comments on previous versions, the experiment only addresses one of the elements of the theoretical hypothesis, namely antagonistic effects of inversions on male reproductive success and other fitness components, in particular of females. Furthermore, the design of this experiment is not ideal from the viewpoint of the biological hypothesis it is aiming to test. This is in part because, rather than testing for the effects of inversion on male reproductive success versus the key fitness components of survival to maturity and female reproductive output, it looks at the effects on male reproductive success versus survival to a rather old age of 2 months. The relevance of survival until old age to fitness under natural conditions is unclear, as the authors now acknowledge. Furthermore, up to 15% of males that may have contributed to the next generation did not survive until genotyping, and thus the difference between these males' inversion frequency and that in their offspring may be confounded by this potential survival-based sampling bias. The experiment does not test for two other key elements of the proposed theory: the assumption of frequency-dependence of selection on male sexual success, and the prediction of synergistic epistasis for male fitness among genetic variants in the inversion. To be fair, particularly testing for synergistic epistasis would be exceedingly difficult, and the authors have now included a discussion of the above caveats and limitations, making their conclusions more tentative. This is good but of course does not make these limitations of the experiment go away. These limitations mean that the paper is stronger as a theoretical than as an empirical contribution.

      We discuss the choice to focus on exploring the potential antagonistic effects of the inversion karyotype on male reproductive success and survival in our general response above. Primarily, this prediction seemed to be the most specific to the proposed model as compared to other alternate models. Still, further studies are clearly needed to elucidate the potential frequency dependence and genetic architecture of the inversions.

      Regarding the choice of age at collection, it is unknown to what degree our selected collection age of 10 weeks correlates with survival in the wild, but we feel confident that there will be some positive correlation.

      We now further clarify that across our experiments, a minimum of 5% and a mean of 9% of the males used in the parental generation died before collection. These proportions do not appear sufficient to explain the differences between paternal and embryo inversion frequencies shown in Figure 9.

      Reviewer #2 (Public review):

      Summary:

      In their manuscript the authors address the question whether the inversion polymorphism in D. melanogaster can be explained by sexually antagonistic selection. They designed a new simulation tool to perform computer simulations, which confirmed their hypothesis. They also show a tradeoff between male reproduction and survival. Furthermore, some inversions display sex-specific survival.

      Strengths:

      It is an interesting idea on how chromosomal inversions may be maintained.

      Weaknesses:

      The authors motivate their study by the observation that inversions are maintained in D. melanogaster and because inversions are more frequent closer to the equator, the authors conclude that it is unlikely that the inversion contributes to adaptation in more stressful environments. Rather the inversion seems to be more common in habitats that are closer to the native environment of ancestral Drosophila populations.

      While I do agree with the authors that this observation is interesting, I do not think that it rules out a role in local adaptation. After all, the inversion is common in Africa, so it is perfectly conceivable that the non-inverted chromosome may have acquired a mutation contributing to the novel environment.

      Based on their hypothesis, the authors propose an alternative strategy, which could maintain the inversion in a population. They perform some computer simulations, which are in line with the predicted behavior. Finally, the authors perform experiments and interpret the results as empirical evidence for their hypothesis. While the reviewer is not fully convinced about the empirical support, the key problem is that the proposed model does not explain the patterns of clinal variation observed for inversions in D. melanogaster. According to the proposed model, the inversions should have a similar frequency along latitudinal clines. So in essence, the authors develop a complicated theory because they felt that the current models do not explain the patterns of clinal variation, but this model also fails to explain the pattern of clinal variation.

      To the contrary – in the Discussion paragraph beginning on Line 671, we explain why we would predict that a tradeoff between survival and reproduction should lead to clinal inversion frequencies. We suggest that a karyotype associated with a survival penalty should be increasingly disadvantageous in more challenging environments (such as high altitudes and latitudes for this species). Furthermore, an advantage in male reproductive competition conferred by that same haplotype may be reduced by the lower population densities that we would expect in more challenging environments (meaning that each female should encounter fewer males). Individually or jointly, these two factors predict that the equilibrium frequency of a balanced inversion polymorphism should depend on a local population’s environmental harshness and population density, with the ensuing prediction that inversion frequency should correlate with certain environmental variables.

      Reviewer #3 (Public review):

      Summary:

      In this study, McAllester and Pool develop a new model to explain the maintenance of balanced inversion polymorphism, based on (sexually) antagonistic alleles and a trade-off between male reproduction and survival (in females or both sexes). Simulations of this model support the plausibility of this mechanism. In addition, the authors use experiments on four naturally occurring inversion polymorphisms in D. melanogaster and find tentative evidence for one aspect of their theoretical model, namely the existence of the above-mentioned trade-off in two out of the four inversions.

      Strengths:

      (1) The study develops and analyzes a new (Drosophila melanogaster-inspired) model for the maintenance of balanced inversion polymorphism, combining elements of (sexually) antagonistically (pleiotropic) alleles, negative frequency-dependent selection and synergistic epistasis. Simulations of the model suggest that the hypothesized mechanism might be plausible.

      (2) The above-mentioned model assumes, as a specific example, a trade-off between male reproductive display and survival; in the second part of their study, the authors perform laboratory experiments on four common D. melanogaster inversions to study whether these polymorphisms may be subject to such a trade-off. The authors observe that two of the four inversions show suggestive evidence that is consistent with a trade-off between male reproduction and survival.

      Open issues:

      (1) A gap in the current modeling is that, while a diploid situation is being studied, the model does not investigate the effects of varying degrees of dominance. It would thus be important and interesting, as the authors mention, to fill this gap in future work.

      (2) It will also be important to further explore and corroborate the potential importance and generality of trade-offs between different fitness components in maintaining inversion polymorphisms in future work.

      We appreciate the work put in to evaluating, improving, and summarizing our study. We agree that further work studying the effects of dominance and of the fitness components of the inversions is important.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      l. 354: I don't understand what the authors mean by "an antagonistic and non-antagonistic allele". If there is an antagonistic polymorphism at a locus, then both alleles have antagonistic effects; i.e., allele B increases trait 1 and reduces trait 2 relative to allele A and vice versa.

      Edited, agreed that the terminology used here was sub-optimal.

      Reviewer #2 (Recommendations for the authors):

      The motivation for their model is their claim that the clinal inversion frequencies are not compatible with local adaptation. The reviewer doubts this strong statement. Furthermore, the proposed model also fails to explain the inversion frequencies in natural populations.

      Hence, rather than building a straw man, it would be better if the authors first show their experiments and then present their model as an explanation for the empirical results. Nevertheless, it is also clear that the empirical data are not very strong and cannot be fully explained by the proposed model.

      This claim that we reject any role of local adaptation in clinal variation and selection upon inversion polymorphism does not hold up in a reading of our manuscript. We even suggest that locally varying selective pressures must be playing some role, although that does not imply that local adaptation is the ultimate driver of inversion frequencies. Indeed, we suggest that local adaptation alone is an insufficient explanation for inversion frequency clines in D. melanogaster, in part because (1) these frequency clines do not approach the alternate fixed genotypes predicted by local directional selection, and (2) these derived inversions tend to be more frequent in more ancestral environments (l.113-158).

      In our public review response above, and in the Discussion section of our paper, we explain why our model can predict both the clinal frequencies of many Drosophila inversions and their intermediate maximal frequencies. Of course, we do not predict that most inversions in this species should follow the specific tradeoff investigated here. In fact, we were surprised to find even two inversions that experimentally supported our predicted tradeoff. Still, it remains possible that other inversions in this species are subject to other balanced tradeoffs not investigated here, which could help explain why they rarely reach high local frequencies.

      Reviewer #3 (Recommendations for the authors):

      My previous comments have been adequately addressed.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      […]

      To provide empirical support for this idea, the authors study the dynamics of inversions in population cages over one generation, tracking their frequencies through amplicon sequencing at three time points: (young adults), embryos and very old adult offspring of either sex (>2 months from adult emergence). Out of four inversions included in the experiment, two show patterns consistent with antagonistic effects on male sexual success (competitive paternity) and the survival of offspring, especially females, until an old age, which the authors interpret as consistent with their theory.

      There are several reasons why the support from these data for the proposed theory is not watertight.

      (1) As I have already pointed out in my previous review, survival until 2 months (in fact, it is 10 weeks and so 2.3 months) of age is of little direct relevance to fitness, whether under natural conditions or under typical lab conditions.

      The authors argue this objection away with two arguments:

      First, citing Pool (2015) they claim that the average generation time (i.e. the average age at which flies reproduce) in nature is 24 days. That paper made an estimate of 14.7 generations per year under the North Carolina climate. As also stated in Pool (2015), the conditions in that locality for Drosophila reproduction and development are not suitable during three months of the year. This yields an average generation length of about 19.5 days during the 9 months during which the flies can reproduce. On the highly nutritional food used in the lab and at the optimal temperature of 25 °C, Drosophila need about 11-12 days to develop from egg to adult. Even assuming these perfect conditions, the average age (counted from adult eclosion) would be about 8 days. In practice, larval development in nature is likely longer for nutritional and temperature reasons, and thus the genomic data analyzed by Pool imply that the average adult age of reproducing flies in nature would be about 5 days, and not 24 days, let alone 10 weeks. This corresponds neatly to the 2-6 days median life expectancy of Drosophila adults in the field based on capture-recapture (e.g., Rosewell and Shorrocks 1987).

      Second, the authors also claim that survival over a period of 2 months is highly relevant because flies have to survive long periods where reproduction is not possible. However, to survive the winter flies enter a reproductive diapause, which involves profound physiological changes that indeed allow them to survive for months, remaining mostly inactive, stress resistant and hidden from predators. Flies in the authors' experiment were not diapausing, given that they were given plentiful food and kept warm. It is still possible that survival to the ripe old age of 10 weeks under these conditions correlates well with surviving diapause under harsh conditions, but if so, the authors should cite relevant data. Even then, I do not think this allows the authors to conclude that longevity is "the main selective pressure" on Drosophila (l. 936).

      This is overall a thoughtfully presented critique and we have endeavored to improve our discussion of Pool (2015) and to clarify some of the language used about survival elsewhere. While we agree that challenges other than survival to 10 weeks are very relevant to Drosophila melanogaster, collection at 10 weeks does encompass some of these other challenges. Egg to adult viability still contributes to the frequencies of the inversions at collection and is not separable from longevity in this data. Collection at longevity was chosen in part to encompass all lifetime fitness challenges that might influence the inversion frequency at collection, albeit still within permissive laboratory conditions. Future experiments exploring specific stressors independently and beyond permissive lab conditions would generate a clearer picture.

      In addition to general edits, the specific phrase mentioned at l. 936 [now line 1003] has been revised from “In many such cases females are in reproductive diapause, and so longevity is the main selective pressure.” to “While longevity is a key selective pressure underlying overwintering, the relationship between longevity in permissive lab conditions without diapause and in natural conditions under diapause is unclear (Schmidt et al. 2005; Flatt 2020), and our experiment represents just one of many possible ways to examine tradeoffs involving survival.”

      (2) It appears that the "parental" (in fact, paternal) inversion frequency was estimated by sequencing sires that survived until the end of the two-week mating period. No information is provided on male mortality during the mating period, but substantial mortality is likely given constant courtship and mating opportunities. If so, the difference between the parental and embryo inversion frequency could reflect the differential survival of males until the point of sampling rather than / in addition to sexual selection.

      We have further clarified that when referenced as parental frequency, the frequency presented is ½ the paternal frequency as the mothers were homokaryotypic for the standard arrangement. We chose to present both due to considerations in representing the frequency change from paternal to embryo frequencies, where a hypothetical change from 0.20 frequency in fathers to 0.15 frequency in embryos represents a selective benefit (a frequency increase in the population), despite the reality that this is a decrease in allele frequency between paternal and embryo cohorts.
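
      The halving described above can be written out as a small worked example. This is only an illustrative sketch: it assumes autosomal inheritance with equal contributions from both parents and no selection between parents and zygotes, and the function name and the numbers (taken from the hypothetical 0.20 and 0.15 values in the paragraph above) are not part of the manuscript's analysis code.

```python
def parental_frequency(paternal_freq: float, maternal_freq: float = 0.0) -> float:
    """Expected inversion frequency among zygotes under neutral transmission.

    Assumes an autosomal inversion, so each parent contributes half of the
    offspring's chromosomes. maternal_freq defaults to 0.0 because the
    mothers carried only the standard arrangement.
    """
    return 0.5 * (paternal_freq + maternal_freq)


# Hypothetical values from the example in the text.
paternal = 0.20   # inversion frequency among fathers
embryo = 0.15     # inversion frequency observed among embryos

expected = parental_frequency(paternal)  # neutral expectation among embryos
change = embryo - expected               # positive means the inversion gained

print(f"expected under neutrality: {expected:.2f}")
print(f"observed minus expected:   {change:+.2f}")
```

      Under these assumptions the embryo frequency exceeds the neutral expectation, even though 0.15 is numerically lower than the paternal value of 0.20.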

      We mentioned a maximum 15% paternal mortality at line 827 [now l.1056], but have now added complete data on the counts of flies in the experiment as a supplemental table (Table S1) and have added or corrected further references to this in the results and methods [lines 555, 638, 975]. It is true that this may influence the observed frequency changes to some degree, and while we adjusted our sampling method to account for the effects of this mortality on statistical power [l.1056ff], we have now edited the manuscript to better highlight potential effects of this phenomenon on the recorded frequency changes.

      It is also worth noting that, if mortality among fathers over the mating period is codirectional with mortality among aged offspring, this would bias the results against detecting an opposing antagonistic selective effect of the inversions on paternity share. This is now also mentioned in the manuscript, l.639ff.

      (3) Finally, irrespective of the above caveats, the experimental data only address one of the elements of the theoretical hypothesis, namely antagonistic effects of inversions on reproduction and survival, notably that of females. It does not test for two other key elements of the proposed theory: the assumption of frequency-dependence of selection on male sexual success, and the prediction of synergistic epistasis for male fitness among genetic variants in the inversion. To be fair, particularly testing the latter prediction would be exceedingly difficult. Nonetheless, these limitations of the experiment mean that the paper is a much stronger theoretical than empirical contribution.

      This is a fair criticism of the limitations of our results, and we now summarize such caveats more directly in the discussion summary, lines 876ff.

      Reviewer #2 (Public Review): 

      […]

      Comments on the latest version:

      I would like to give an example of the confusing terminology of the authors:

      "Additionally, fitness conveyed by an allele favoring display quality is also frequency-dependent: since mating success depends on the display qualities of other males, the relative advantage of a display trait will be diminished as more males carry it..."

      I do not understand how this differs from an ordinary advantageous allele: as such an allele increases in frequency, its rate of frequency increase declines, but this has nothing to do with frequency-dependent selection. In my opinion, the authors redefine frequency-dependent selection: for selection to be frequency dependent, selection itself needs to change with frequency, but this is not clear from their verbal description.

      We have edited this text for greater clarity, now line 232ff. We did not seek to redefine frequency dependence, and did mean by “the relative advantage of a display trait will be diminished” that an equivalent s would diminish with frequency. We have now remedied terminological issues introduced in the prior revision with regard to frequency dependent selection.

      One example of how challenging the style of the manuscript is comes from their description of the DNA extraction procedure. In principle a straightforward method, but even here the authors provide a convoluted uninformative description of the procedure.

      We have edited for clarity the text on lines 1016-1020. Citing a published protocol and mentioning our modifications seems an appropriate trade-off between representing what was done accurately, citing the sources we relied on in doing it, and limiting the volume of information in the main text for such a straightforward and common method. 

      It is not apparent to the reviewer why the authors have not invested more effort to make their manuscript digestible.

      We have invested a great deal of effort in making this manuscript as clear as we are able to. We regret that our writing has not been to this reviewer’s liking. We believe we have been highly responsive to all specific criticisms, including revising all passages cited as unclear. In this round, we have again scrutinized the entire manuscript for any opportunity to clarify it, and we have made further changes throughout. Although our subject matter is conceptually nuanced, we nevertheless remain optimistic that a careful, fresh reading of our revised manuscript would yield a more favorable impression.

      Reviewer #3 (Public Review):

      […]

      Weaknesses:

      A gap in the current modeling is that, while a diploid situation is being studied, the model does not investigate the effects of varying degrees of dominance. It would be important and interesting to fill this gap in future work.

      Agreed, and now reinforced at lines 892ff.

      Comments on the latest version:

      Most of the comments which I have made in my public review have been adequately addressed.

      Some of the writing still seems somewhat verbose and perhaps not yet maximally succinct; some additional line-by-line polishing might still be helpful at this stage in terms of further improving clarity and flow (for the authors to consider and decide).

      We have made further changes and some polishing in this draft, and greatly appreciate the guidance provided in improving the draft so far. 

      Reviewer #1 (Recommendations For The Authors):

      (1) While the model results are convincing, some of the verbal interpretation is confusing. In particular, the authors state that in their model the allele favoring male display quality shows a negative frequency dependence whereas the alternative allele has a positive frequency dependence. This does not make sense to me in the context of population genetics theory. For a one-locus, two-allele model the change of allele frequency under selection depends on the fitness of the genotypes concerned relative to each other. Thus, at least under no dominance assumed in this model, if the relative fitness of AA decreases with the frequency of allele A, the relative fitness of aa must decrease with the frequency of allele a. I.e., if selection is negatively frequency dependent, then it is so for both alleles.

      This phrasing was wrong, and we have edited the relevant section.

      (2) I am still not entirely sure that the synergistic epistasis assumed in the verbal model is actually generated in the simulations; this would be easy enough to check by extracting the mating success of males with different genotypes from the simulation output, and the result should be reported, e.g., as a figure supplement.

      Our new Figure S2, which depicts haplotype frequencies for a set of the simulations presented in Figure 4, should demonstrate a necessary presence of synergistic epistasis. These results further clarify that the weaker allele B is only kept when linked to A. The same fitness classes of genotype are present in the simulations with and without the inversion, so the only mechanical difference is the rate of recombination, and the only way this might change selection on the alleles is if a variant has a different fitness in one haplotype background than another – i.e. epistasis. The maintenance of haplotypes AB and ab to the exclusion of Ab and aB relies on the lesser relative fitness of Ab and aB. And since survival values are multiplicative, this additional contribution must come from the mate success of AB being disproportionately larger than Ab or aB, indicating the emergent synergistic epistasis posited by our model. We have clarified this point in the text at line 363ff.
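
      The intuition that a skewed mate competition can turn additive display scores into non-additive mating success can be checked with a short calculation. The sketch below is not SAIsim code and makes several simplifying assumptions that are ours alone: haploid male classes, a deterministic rule in which each female picks the top-scoring male among a fixed number of sampled males (ties split evenly), equal class frequencies, and hypothetical effect sizes (A adds 2, B adds 1). It reports epistasis only as the additive-scale contrast between the four classes; the sign of epistasis can differ on other scales.

```python
def win_probability(score, scores, freqs, n_males):
    """Chance that a focal male with `score` wins one competition.

    The focal male competes against n_males - 1 rivals drawn independently
    from the population. The top scorer wins; ties are split evenly.
    Closed form of sum_t C(n-1, t) * p_eq**t * p_low**(n-1-t) / (t + 1).
    """
    p_low = sum(f for s, f in zip(scores, freqs) if s < score)
    p_eq = sum(f for s, f in zip(scores, freqs) if s == score)
    return ((p_low + p_eq) ** n_males - p_low ** n_males) / (n_males * p_eq)


# Hypothetical additive display effects: allele A adds 2, allele B adds 1.
haplotypes = ["ab", "aB", "Ab", "AB"]
scores = [0, 1, 2, 3]
freqs = [0.25, 0.25, 0.25, 0.25]
N_MALES = 5  # small for readability; the manuscript's simulations sample 100

w = {h: win_probability(s, scores, freqs, N_MALES)
     for h, s in zip(haplotypes, scores)}

# Additive-scale epistasis: zero if A and B contributed independently.
epistasis = w["AB"] - w["Ab"] - w["aB"] + w["ab"]

for h in haplotypes:
    print(h, round(w[h], 4))
print("additive-scale epistasis:", round(epistasis, 4))
```

      In this toy setting the double-display class gains more mating success than the sum of the single-allele gains, although the display scores themselves are strictly additive. Raising `N_MALES` makes the competition more winner-take-all.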

      (3) l. 318ff: What was this set number of males? I could not find this information anywhere. Also, this model of the mating system is commonly referred to as "best of N", so the authors may want to include this label in the description.

      We indicate this detail just after the referenced line, now reworded and on l. 338-340 as “For each female’s mating competition, 100 males were sampled, though see Figure S1 for plots with varying encounter number.” Among these edits, “one hundred” has been changed to a numeral for easier skimming, and Figure S1 is now referenced here earlier in the text. Several edits have also been made in the caption of Figures 2 and 3, and in the relevant methods section to clarify the number of encountered males simulated, mention best of N terminology, and clarify how the quality score is used in the mate competition.

      (4) The description of the experiment is still confusing. The number of individuals of each sex entered in each mating cage is missing from the Methods (l. 914); although I did finally find it in the Results. These flies were laying over 2 weeks - does this mean that offspring from the entire period were used to obtain the embryo and aged offspring frequencies, or only from a particular egg collection? If the former, does this mean that the offspring obtained from different egg batches were aged separately? Were the offspring aged in cages or bottles, at what density? Given that only those males that survived until the end of the two-week mating period were sequenced, it is important to know what % of the initial number of males these survivors were. A substantial mortality of the parental males could bias the estimate of parental frequencies. How many parental males, embryos and aged offspring were sequenced? Were all individuals of a given cage and stage extracted and sequenced as a single pool or were there multiple pools? The description could also be structured better. For example, the food and grape agar recipes and cage construction are inserted at random points of the description of the crossing design, which does not help.

      We have now reorganized and edited these portions of the Methods text. Portions of this comment overlap with edits responding to (2) of the Public Review and below for l. 921 in Details. Offspring from different laying periods were aged in different bottles, further separated by the time at which they eclosed. They were then pooled for DNA extraction and library preparation by sex and a binary early or late eclosion time. This data was present in the “D. mel. Sample Size” column of supplemental tables S6 and S7 (now S7 and S8), but we have added and referenced a new table to specifically collate the sample sizes of different experimental stages, table S1. Now referenced at lines 555, 638, 975, 1057.

      (5) The caption of figure 9 and the discussion of its results should be clear and explicit about the fact that "adult offspring" in Fig 9A and "female" and "male" refers to adults surviving to old age (whereas "parental" in Fig 9A refers to young adults in their reproductive prime). This has consequences for the interpretation of the difference between "parental" and "adult offspring", as it combines one generation of usual selection as it occurs under the conditions of the lab culture (young adult at generation t -> young adult in generation t+1) with an additional step of selection for longevity. Thus, a marked change in allele frequency does not imply that the "parental" frequency does not represent an equilibrium frequency of the inversions under the lab culture conditions. Furthermore, it would be useful to state explicitly that Figure 9B represents the same results as figure 9A, but with the aged offspring split by sex.

      Figure caption edited to provide further clarity on the age of cohorts and presented data, along with the relevant results section (2.3) referencing this figure.

      We avoid making any statements about the equilibrium frequencies of inversions under lab conditions, and whether or not any step of our experiment reflects such equilibria, because our investigation does not rely upon or test for such conditions. Instead, our analysis focuses on whether inversions have contrasting effects (as indicated by frequency changes that are incompatible with neutral sampling) between different life history components. Under our model, such frequency reversals might be detectable both at equilibrium balanced inversion frequencies and also at frequencies some distance away from equilibria. We have now clarified this point at l. 970-972.

      Details:

      l. 211: this should be modified as male-only costs are now included.

      Edited. “survival likelihood (of either or both sexes).”

      l. 343: misplaced period

      Edited.

      l. 814: "We confirmed model predictions...": This sounds like it refers to an empirical confirmation of a theory prediction, but I think the authors just want to say that their simulations predicted antagonistic variants can be maintained at an intermediate equilibrium frequency. So the wording should be changed to avoid ambiguity.

      Edited. Now line 869.

      l. 853: How can a genome be "empty"? Do the authors mean an absence of any polymorphism?

      Edited to: “In SAIsim, a population is instantiated as a python object, and populated with individuals which are also represented by python objects. These individuals may be instantiated using genomes specified by the user, or by default carry no genomic variation.” Lines 913ff.

      l. 853: I do not see this diagramed in Figure 5

      Apologies, fixed to Fig. 2

      l. 864: is crossing-over in the model limited to female gametogenesis (reflecting the Drosophila case) or does it occur in both sexes?

      There is a variable in the simulator to make crossover female-specific. All simulations were performed with female-only crossover. Edited for clarity. “While the simulator can allow recombination in both sexes, all simulations presented only generate crossovers and gene conversion events for female gametes, in accordance with the biology of D. melanogaster.” Lines 928-929.
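
      For readers unfamiliar with sex-limited recombination, the toggle described above can be illustrated as follows. This is a hypothetical sketch, not the SAIsim implementation: the function name, the `female_only` flag, and the per-interval `crossover_rate` are our own labels, and the sketch ignores gene conversion and any suppression of crossovers in inversion heterozygotes.

```python
import random


def make_gamete(hap_a, hap_b, sex, crossover_rate, rng, female_only=True):
    """Return one gamete built from two parental haplotypes.

    hap_a and hap_b are equal-length lists of alleles ordered along the
    chromosome. crossover_rate is the probability of a crossover between
    each pair of adjacent loci. When female_only is True, only parents
    with sex == "F" recombine, mirroring D. melanogaster biology.
    """
    # Pick which parental haplotype the gamete starts copying from.
    current, other = (hap_a, hap_b) if rng.random() < 0.5 else (hap_b, hap_a)
    recombines = (sex == "F") or not female_only

    gamete = [current[0]]
    for i in range(1, len(hap_a)):
        # A crossover switches the template haplotype from this locus on.
        if recombines and rng.random() < crossover_rate:
            current, other = other, current
        gamete.append(current[i])
    return gamete


rng = random.Random(1)
standard = [0, 0, 0, 0]
inverted = [1, 1, 1, 1]

print("male gamete:  ", make_gamete(standard, inverted, "M", 0.5, rng))
print("female gamete:", make_gamete(standard, inverted, "F", 0.5, rng))
```

      With `female_only=True`, a male always transmits one intact parental haplotype, whereas a female may transmit a recombinant.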

      l. 906: "F2" is ambiguous; does this mean that the mix of lines was allowed to breed for two generations? Also, in other places in the manuscript these flies appear to be referred to as "parental". So do not use F2.

      Edited, F2 language removed and replaced with being allowed to breed for two generations. Now lines 967ff.

      l. 910: this is incorrect/imprecise; what can be inferred is the frequency of the inversions in male gametes that contributed to fertilization. This would correspond to the frequency in successful males only if each successful male genotype had the same paternity share.

      Edited, now “Since no inversions could be inherited through the mothers, inversion frequencies among successful male gametes could be inferred from their pooled offspring.” Now line 994.

      l. 912: "without a controlled day/night cycle" meaning what? Constant light? Constant darkness? Daylight falling through the windows?

      Edited to “Unless otherwise noted, all flies were kept in a lab space of 23°C with around a degree of temperature fluctuation and without a controlled day/night cycle. Light exposure was dependent on the varying use of the space by laboratory workers but amounted to near constant exposure to at least a minimal level of lighting, with some variable light due to indirect lighting from adjacent rooms with exterior windows.” Now lines 1007-1010.

      l. 921: I cannot parse this sentence. Were the offspring isolated as virgins?

      No, the logistics of collecting virgins would have been prohibitive, and it did not seem essential for our experiment. Hopefully the edits to this section are clearer, now lines 978ff.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This study offers a useful treatment of how the population of excitatory and inhibitory neurons integrates principles of energy efficiency in their coding strategies. The analysis provides a comprehensive characterisation of the model, highlighting the structured connectivity between excitatory and inhibitory neurons. However, the manuscript provides an incomplete motivation for parameter choices. Furthermore, the work is insufficiently contextualized within the literature, and some of the findings appear overlapping and incremental given previous work.

      We are genuinely grateful to the Editors and Reviewers for taking time to provide extremely valuable suggestions and comments, which will help us to substantially improve our paper. We decided to do our very best to implement all suggestions, as detailed in the point-by-point rebuttal letter below. We feel that our paper has improved considerably as a result. 

      Public Reviews:

      Reviewer #1 (Public Review): 

      Summary: Koren et al. derive and analyse a spiking network model optimised to represent external signals using the minimum number of spikes. Unlike most prior work using a similar setup, the network includes separate populations of excitatory and inhibitory neurons. The authors show that the optimised connectivity has a like-to-like structure, leading to the experimentally observed phenomenon of feature competition. They also characterise the impact of various (hyper)parameters, such as adaptation timescale, ratio of excitatory to inhibitory cells, regularisation strength, and background current. These results add useful biological realism to a particular model of efficient coding. However, not all claims seem fully supported by the evidence. Specifically, several biological features, such as the ratio of excitatory to inhibitory neurons, which the authors claim to explain through efficient coding, might be contingent on arbitrary modelling choices. In addition, earlier work has already established the importance of structured connectivity for feature competition. A clearer presentation of modelling choices, limitations, and prior work could improve the manuscript.

      Thanks for these insights and for this summary of our work.  

      Major comments:

      (1) Much is made of the 4:1 ratio between excitatory and inhibitory neurons, which the authors claim to explain through efficient coding. I see two issues with this conclusion: (i) The 4:1 ratio is specific to rodents; humans have an approximate 2:1 ratio (see Fang & Xia et al., Science 2022 and references therein); (ii) the optimal ratio in the model depends on a seemingly arbitrary choice of hyperparameters, particularly the weighting of encoding error versus metabolic cost. This second concern applies to several other results, including the strength of inhibitory versus excitatory synapses. While the model can, therefore, be made consistent with biological data, this requires auxiliary assumptions.

      We now better describe the ratio of numbers of E and I neurons found in real data, as suggested. The first submission already contained an analysis of how the optimal ratio of E vs I neuron numbers depends in our model on the relative weighting of the loss of E and I neurons and on the relative weighting of the encoding error vs the metabolic cost in the loss function (see Fig. 7E). We revised the text on page 12 describing Fig. 7E.

      To allow readers to easily form a clear idea of how the weighting of the error vs the cost may influence the optimal network configuration, we now present how optimal parameters depend on the weighting in a systematic way, by always including this type of analysis when studying all other model parameters (time constants of single E and I neurons, noise intensity, metabolic constant, ratio of mean I-I to E-I connectivity). These results are shown in Supplementary Fig. S4 A-D and H, and we comment briefly on each of them in the Results sections (pages 9, 10, 11 and 12) that analyze each of these parameters.

      Following this Reviewer’s comment, we have now included a joint analysis of network performance relative to the ratio of E-I neuron numbers and the ratio of mean I-I to E-I connectivity (Fig. 7J). We found a positive correlation between the optimal values of these two ratios. This implies that a lower ratio of E-I neuron numbers, such as the 2:1 ratio in human cortex mentioned by the Reviewer, predicts a lower optimal ratio of I-I to E-I connectivity and thus weaker inhibition in the network. We made sure that this finding is suitably described in the revision (page 13).

      (2) A growing body of evidence supports the importance of structured E-I and I-E connectivity for feature selectivity and response to perturbations. For example, this is a major conclusion from the Oldenburg paper (reference 62 in the manuscript), which includes extensive modelling work. Similar conclusions can be found in work from Znamenskiy and colleagues (experiments and spiking network model; bioRxiv 2018, Neuron 2023 (ref. 82)), Sadeh & Clopath (rate network; eLife, 2020), and Mackwood et al. (rate network with plasticity; eLife, 2021). The current manuscript adds to this evidence by showing that (a particular implementation of) efficient coding in spiking networks leads to structured connectivity. The fact that this structured connectivity then explains perturbation responses is, in the light of earlier findings, not new.

      We agree that the main contribution of our manuscript in this respect is to show how efficient coding in spiking networks can lead to structured connectivity implementing lateral inhibition similar to that proposed in the recent studies mentioned by the Reviewer. We apologize if this was not clear enough in the previous version. We streamlined the presentation to make it clearer in the revision. We nevertheless think it useful to report the effects of perturbations within this network because these results give information about how lateral inhibition works in our network. Thus, we kept these results in the revised version, although we de-emphasized and simplified their presentation. We now give more emphasis to the novelty of the derivation of this connectivity rule from the principles of efficient coding (pages 4 and 6). We also better describe (page 8) what the specific results of our simulated perturbation experiments add to the existing literature.

      (3) The model's limitations are hard to discern, being relegated to the manuscript's last and rather equivocal paragraph. For instance, the lack of recurrent excitation, crucial in neural dynamics and computation, likely influences the results: neuronal time constants must be as large as the target readout (Figure 4), presumably because the network cannot integrate the signal without recurrent excitation. However, this and other results are not presented in tandem with relevant caveats.

      We improved the Limitations paragraph in Discussion, and also anticipated caveats in tandem with results when needed, as suggested. 

      We now mention the assumption of equal time constants between the targets and readouts in the Abstract. 

      We have now added the analysis of the network performance and dynamics as a function of the time constant of the target (τ<sub>x</sub>) to Supplementary Fig. S5 (C-E). These results are briefly discussed in the text on page 13. The only measure sensitive to τ<sub>x</sub> is the encoding error of E neurons, with a minimum at τ<sub>x</sub> = 9 ms, while the encoding error of I neurons and the metabolic cost show no dependency. Firing rates, variability of spiking as well as the average and instantaneous balance show no dependency on τ<sub>x</sub>. We note that τ<sub>x</sub> = τ, with τ = 1/λ the time constant of the population readout (Eq. 9), is an assumption we use when we derive the model from the efficiency objective (Eq. 18 to 23). In our new and preliminary work (Koren, Emanuel, Panzeri, bioRxiv 2024), we derived a more general class of models where this assumption is relaxed, which gives a network with E-E connectivity that adapts to the time constant of the stimulus. Thus, the Reviewer is correct in the intuition that the network requires E-E connectivity to better integrate target signals with a different time constant than the time constant of the membrane. We now better emphasize this limitation in Discussion (page 16).

      (4) On repeated occasions, results from the model are referred to as predictions claimed to match the data. A prediction is a statement about what will happen in the future – but most of the “predictions” from the model are actually findings that broadly match earlier experimental results, making them “postdictions”.

      This distinction is important: compared to postdictions, predictions are a much stronger test because they are falsifiable. This is especially relevant given (my impression) that key parameters of the model were tweaked to match the data.

      We now comment on every result from the model as either matching earlier experimental results, or being a prediction for experiments. 

      In Section “Assumptions and emergent properties of the efficient E-I network derived from first principles”, we report (page 4) that neural networks have connectivity structure that relates to tuning similarity of neurons (postdiction). 

      In Section “Encoding performance and neural dynamics in an optimally efficient E-I network” we report (page 5) that in a network with optimal parameters, I neurons have higher firing rate than E neurons (postdiction), that single neurons show temporally correlated synaptic currents (postdiction) and that the distribution of firing rates across neurons is log-normal (postdiction). 

      In Section “Competition across neurons with similar stimulus tuning emerging in efficient spiking networks” we report (page 6)  that the activity perturbation of E neurons induces lateral inhibition on other E neurons, and that the strength of lateral inhibition depends on tuning similarity (postdiction). We show that activity perturbation of E neurons induces lateral excitation in I neurons (prediction). We moreover show that the specific effects of the perturbation of neural activity rely on structured E-I-E connectivity (prediction for experiments, but similar result in Sadeh and Clopath, 2020). We show strong voltage correlations but weak spike-timing correlations in our network (prediction for experiments, but similar result in Boerlin et al. 2013). 

      In Section “The effect of structured connectivity on coding efficiency and neural dynamics”, we report (page 7) that our model predicts a number of differences between networks with structured and unstructured (random) connectivity. In particular, structured networks differ from unstructured ones by showing better encoding performance, lower metabolic cost, weaker variance over time in the membrane potential of each neuron, lower firing rates and weaker average and instantaneous balance of synaptic currents.

      In Section “Weak or no spike-triggered adaptation optimizes network efficiency”, we report (page 9) that our model predicts better encoding performance in networks with adaptation compared to facilitation. Our results suggest that adaptation should be stronger in E compared to I (PV+) neurons (postdiction). In the same section, we report (page 10) that our results suggest that the instantaneous balance is a better predictor of model efficiency than average balance (prediction).

      In Section “Non-specific currents regulate network coding properties”, we report (page 10) that our model predicts that more than half of the distance between the resting potential and firing threshold is taken by external currents that are unrelated to feedforward processing (postdiction). We also report (page 11) that our model predicts that moderate levels of uncorrelated (additive) noise is beneficial for efficiency (prediction for experiments, but similar results in Chalk et al., 2016, Koren et al., 2017, Timcheck et al. 2022).

      In Section “Optimal ratio of E-I neuron numbers and of mean I-I to E-I synaptic efficacy coincide with biophysical measurements”, we predict the optimal ratio of E to I neuron numbers to be 4:1 (postdiction) and the optimal ratio of mean I-I to E-I connectivity to be 3:1 (postdiction). Further, we report (page 13) that our results predict that a decrease in the ratio of E-I neuron numbers is accompanied by a decrease in the ratio of mean I-I to E-I connectivity.

      Finally, in Section “Dependence of efficient coding and neural dynamics on the stimulus statistics”, we report (page 13) that our model predicts that the efficiency of the network has almost no dependence on the time scale of the stimulus (prediction). 

      Reviewer #2 (Public Review):

      Summary:

      In this work, the authors present a biologically plausible, efficient E-I spiking network model and study various aspects of the model and its relation to experimental observations. This includes a derivation of the network into two (E-I) populations, the study of single-neuron perturbations and lateral-inhibition, the study of the effects of adaptation and metabolic cost, and considerations of optimal parameters. From this, they conclude that their work puts forth a plausible implementation of efficient coding that matches several experimental findings, including feature-specific inhibition, tight instantaneous balance, a 4 to 1 ratio of excitatory to inhibitory neurons, and a 3 to 1 ratio of I-I to E-I connectivity strength. It thus argues that some of these observations may come as a direct consequence of efficient coding.

      Strengths:

      While many network implementations of efficient coding have been developed, such normative models are often abstract and lacking sufficient detail to compare directly to experiments. The intention of this work to produce a more plausible and efficient spiking model and compare it with experimental data is important and necessary in order to test these models.

      In rigorously deriving the model with real physical units, this work maps efficient spiking networks onto other more classical biophysical spiking neuron models. It also attempts to compare the model to recent single-neuron perturbation experiments, as well as some longstanding puzzles about neural circuits, such as the presence of separate excitatory and inhibitory neurons, the ratio of excitatory to inhibitory neurons, and E/I balance. One of the primary goals of this paper, to determine if these are merely biological constraints or come from some normative efficient coding objective, is also important.

      Though several of the observations have been reported and studied before (see below), this work arguably studies them in more depth, which could be useful for comparing more directly to experiments.

      Thanks for these insights and for the kind words of appreciation of the strengths of our work.  

      Weaknesses:

      Though the text of the paper may suggest otherwise, many of the modeling choices and observations found in the paper have been introduced in previous work on efficient spiking models, thereby making this work somewhat repetitive and incremental at times. This includes the derivation of the network into separate excitatory and inhibitory populations, discussion of physical units, comparison of voltage versus spike-timing correlations, and instantaneous E/I balance, all of which can be found in one of the first efficient spiking network papers (Boerlin et al. 2013), as well as in subsequent papers. Metabolic cost and slow adaptation currents were also presented in a previous study (Gutierrez & Deneve 2019). Though it is perfectly fine and reasonable to build upon these previous studies, the language of the text gives them insufficient credit.

      We indeed built our work on these important previous studies, and we apologize if this was not clear enough. We thus improved the text to make sure that credit to previous studies is more precisely and more clearly given (see detailed reply for the list of changes made). 

      To clarify how we built on previous work, we expanded the comparison of our results with the results of Boerlin et al. (2013) about voltage correlations and uncorrelated spiking (page 7), the comparison with the derivation of physical units of Boerlin et al. (2013) (page 3), the discussion of how results on the ratio of the number of E to I neurons relate to Calaim et al. (2022) and Barrett et al. (2016) (page 16), and the comment on the previous work by Gutierrez and Deneve about adaptation (page 8).

      Furthermore, the paper makes several claims of optimality that are not convincing enough, as they are only verified by a limited parameter sweep of single parameters at a time, are unintuitive and may be in conflict with previous findings of efficient spiking networks. This includes the following. 

      Coding error (RMSE) has a minimum at intermediate metabolic cost (Figure 5B), despite the fact that intuitively, zero metabolic cost would indicate that the network is solely minimizing coding error and that previous work has suggested that additional costs bias the output. 

      Coding error also appears to have a minimum at intermediate values of the ratio of E to I neurons (effectively the number of I neurons) and the number of encoded variables (Figures 6D, 7B). These both have to do with the redundancy in the network (number of neurons for each encoded variable), and previous work suggests that networks can code for arbitrary numbers of variables provided the redundancy is high enough (e.g., Calaim et al. 2022). 

      Lastly, the performance of the E-I variant of the network is shown to be better than that of a single cell type (1CT: Figure 7C, D). Given that the E-I network is performing a similar computation as to the 1CT model but with more neurons (i.e., instead of an E neuron directly providing lateral inhibition to its neighbor, it goes through an interneuron), this is unintuitive and again not supported by previous work. These may be valid emergent properties of the E-I spiking network derived here, but their presentation and description are not sufficient to determine this.

      With regard to the concern that our previous analyses considered optimal parameter sets determined with a sweep of a single parameter at a time, we have addressed this issue in two ways. First, we presented (Figure 6I and 7J and text on pages 11 and 13) results of joint sweeps of pairs of parameters whose joint variations are expected to influence optimality in a way that cannot be understood by varying one parameter at a time. These new analyses complement the joint parameter sweep of the time constants of single E and I neurons (τ<sub>r</sub><sup>E</sup> and τ<sub>r</sub><sup>I</sup>) that has already been presented in Fig. 5A (former Fig. 4A). Second, we conducted, within a reasonable/realistic range of possible variations of each individual parameter, a Monte-Carlo random joint sampling (10,000 simulations with 20 trials each) of all 6 model parameters that we explored in the paper. We present these new results in Fig. 2 and discuss them on pages 5-6.

      The Reviewer is correct in stating that the error (RMSE) exhibits a counterintuitive minimum as a function of the metabolic constant despite the fact that, intuitively, for vanishing metabolic constant the network is solely minimizing the coding error (Fig. 6B). In our understanding, this counterintuitive finding is due to the presence of noise in the membrane potential dynamics. In the presence of noise, a non-vanishing metabolic constant is needed to suppress “inefficient” spikes purely induced by noise that do not contribute to coding and increase the error. This gives rise to a form of “stochastic resonance”, where the noise improves detection of the signal coming from the feedforward currents. We note that the metabolic constant and the noise variance both appear in the non-specific external current (Eq. 29f in Methods), and, thus, a covariation in their optimal values is expected. Indeed, we find that the optimal metabolic constant monotonically increases as a function of the noise variance, with stronger regularization (larger beta) required to compensate for larger variability (larger sigma) (Fig. 6I). Finally, we note that a moderate level of noise (which, in turn, induces a non-trivial minimum of the coding error as a function of beta) in the network is optimal. The beneficial effect of moderate levels of noise on performance in networks with efficient coding has been shown in different contexts in previous work (Chalk et al. 2016, Koren and Deneve, 2017). The intuition is that the noise prevents the excessive synchronization of the network and insufficient single neuron variability that decrease the performance. The points above are now explained in the revised text on page 11.

      The Reviewer is also correct in stating that the network exhibits an optimal performance for intermediate values of the number of I neurons and the number of encoded features. In our understanding, the optimal number of encoded features of M=3 arises simply because all the other parameters were optimized for that value of M. The purpose of those analyses was not to state that a network optimally encodes only a given number of features, but to show how a network whose parameters are optimized for a given M performs reasonably well when M is varied. We clarify this on page 13 of Results and in Discussion on page 16. In the same Discussion paragraph we refer also to the results of Calaim et al. (2022) mentioned by the Reviewer.

      To address the concern about the comparison of efficiency between the E-I and the 1CT model, we took advantage of the Reviewer’s suggestions to consider this issue more deeply. In the revision, we now compare the efficiency of the 1CT model with the E population of the E-I model (Fig. 8H). This new comparison changes the conclusion about which model is more efficient, as it shows the 1CT model is slightly more efficient than the E-I model. Nevertheless, the E-I model performance is more robust to small variations of optimal parameters, e.g., it exhibits biologically plausible firing rates for non-optimal values of the metabolic constant. See also the reply to point 3 of the Public Review of Reviewer 3 for more detail. We added these results and the ensuing caveats for the interpretation of this comparison on page 14, and also revised the title of the last subsection of Results.

      Alternatively, the methodology of the model suggests that ad hoc modeling choices may be playing a role. For example, an arbitrary weighting of coding error and metabolic cost of 0.7 to 0.3, respectively, is chosen without mention of how this affects the results. Furthermore, the scaling of synaptic weights appears to be controlled separately for each connection type in the network (Table 1), despite the fact that some of these quantities are likely linked in the optimal network derivation. Finally, the optimal threshold and metabolic constants are an order of magnitude larger than the synaptic weights (Table 1). All of these considerations suggest one of the following two possibilities. One, the model has a substantial number of unconstrained parameters to tune, in which case more parameter sweeps would be necessary to definitively make claims of optimality. Or two, parameters are being decoupled from those constrained by the optimal derivation, and the optima simply corresponds to the values that should come out of the derivation.

      We thank the Reviewer for raising these important questions.

      In the first submission, we presented both the encoding error and the metabolic cost separately as a function of the parameters, so that readers could get an understanding of how stable optimal parameters would be to the change of the relative weighting of encoding error and metabolic cost. We specified this in Results (page 5) and we kept presenting separately encoding and metabolic terms in the revision.

      However, we agree that it is important to present an explicit quantification of how the optimal parameters may depend on the weighting g<sub>L</sub>. In the first submission, we showed the analysis for all possible weightings in the case of the two parameters for which we found this analysis most relevant: the ratio of neuron numbers (Fig. 7E, Fig. 6E in first submission) and the optimal number of input features M (see last paragraph on page 13 and Fig. 8D). We now show this analysis also for the remaining model parameters studied, in Supplementary Fig. S4 (A-D and H). This is discussed on pages 9, 10, 11 and 12.
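
      The dependence of an optimum on the weighting can be illustrated with a minimal sketch. It assumes only what is stated above, namely that the loss is a weighted combination of encoding error and metabolic cost with weights g<sub>L</sub> and 1 - g<sub>L</sub> (0.7 and 0.3 in the case mentioned by the Reviewer). The two curves, the parameter `p`, and all function names are hypothetical and chosen for illustration; they are not the simulated curves of the model.

```python
# Toy illustration only: the two curves below are invented for this sketch and
# are NOT the simulated error/cost curves of the network model.

def encoding_error(p):
    # Hypothetical error curve with its minimum at p = 4.0
    return (p - 4.0) ** 2 + 1.0

def metabolic_cost(p):
    # Hypothetical cost curve with its minimum at p = 2.0
    return (p - 2.0) ** 2 + 0.5

def average_loss(p, g_l):
    # Weighted combination: g_l weights the error, (1 - g_l) weights the cost
    return g_l * encoding_error(p) + (1.0 - g_l) * metabolic_cost(p)

def optimal_parameter(g_l, grid):
    # Grid search for the parameter value that minimises the weighted loss
    return min(grid, key=lambda p: average_loss(p, g_l))

# Candidate parameter values from 0.00 to 6.00 in steps of 0.01
grid = [i / 100 for i in range(0, 601)]

# Optimal parameter for several weightings, including the 0.7 / 0.3 case
optima = {g_l: optimal_parameter(g_l, grid) for g_l in (0.0, 0.3, 0.5, 0.7, 1.0)}
```

      For these two toy quadratics the minimiser is 2 + 2 g<sub>L</sub>, so the optimum moves smoothly from the cost minimum to the error minimum as the weighting changes. Whether the optimum of a real model parameter shifts at all depends on how far apart the minima of its error and cost curves lie.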

      With regard to the concern that the scaling of synaptic weights should not be controlled separately for each connection type in the network, we agree and we would like to clarify that we did not control such scaling separately. Apologies if this was not clear enough. From the optimal analytical solution, we obtained that the connectivity scales with the standard deviation of decoding weights (σ<sub>w</sub><sup>E</sup> and σ<sub>w</sub><sup>I</sup>) of the pre- and postsynaptic populations (Methods, Eq. 32). We studied the network properties as a function of the ratio of average I-I to E-I connectivity (Fig. 7 F-I; Supplementary Fig. S4 D-H), which is equivalent to the ratio of standard deviations σ<sub>w</sub><sup>I</sup>/σ<sub>w</sub><sup>E</sup> (see Methods, Eq. 35). We clarified this in the text on page 12.

      Next, it is correct that our synaptic weights are an order of magnitude smaller than the metabolic constant. We analysed a simpler version of the network that has coding and dynamics identical to our full model (Methods, Eq. 25) but without the external currents. We found that the optimal parameters determining the firing threshold in such a simpler network were biologically implausible (see Supplementary Text 2 and Supplementary Table S1). As another simple solution, we considered rescaling the synaptic efficacy so as to obtain a biologically plausible threshold. However, that gave an implausible mean synaptic efficacy (see Supplementary Text 2). Thus, to be able to define a network with a biologically plausible firing threshold and mean synaptic efficacy, we introduced the non-specific external current. After introducing this current, we were able to shift the firing threshold to biologically plausible values while keeping realistic values of mean synaptic efficacy. Biologically plausible values for the firing threshold are around 15–20 mV above the resting potential (Constantinople and Bruno, 2013), which is the value that we have in our model. A plausible value for the average synaptic strength is between a fraction of one millivolt and a couple of millivolts (Constantinople & Bruno, 2013, Campagnola et al. 2022), which also corresponds to the values that the synaptic weights take. The above results are briefly explained in the revised text on page 4.

      Finally, to study the optimality of the network when changing multiple parameters at a time, we added a new analysis with Monte-Carlo random joint sampling (10,000 parameter sets with 20 trials for each set) of all 6 model parameters that we explored in the paper. We compared (Fig. 2) the results of each simulation with those obtained from the understanding gained from varying one or two parameters at a time (optimal parameters reported in Table 1 and used throughout the paper). We found (Fig. 2) that the optimal configuration in Table 1 was never outperformed by any of the other simulations we performed, and that the first three random simulations that came the closest to the optimal one of Table 1 had stronger noise intensity but also stronger metabolic cost than the configuration in Table 1. The second, third and fourth configurations had longer time constants of both E and I single neurons (adaptation time constants). The ratio of E-I neuron numbers and the ratio of I-I to E-I connectivity in the second, third and fourth best configurations were either jointly increased or jointly decreased with respect to our configuration. These results are reported in Fig. 2 and in Tables 2-3 and they are discussed in Results (page 5).
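
      The logic of such a random joint sampling can be sketched as follows. This is a schematic illustration under stated assumptions and not the simulation code used for the paper: the parameter names, their ranges, the reference values and the function `toy_loss` (a stand-in for a full network simulation averaged over trials) are all hypothetical. Only the 4:1 and 3:1 ratios are borrowed from the text.

```python
import random

# Hypothetical ranges for six model parameters. Names and bounds are
# assumptions made for this sketch, not the ranges explored in the paper.
RANGES = {
    "tau_e_ms": (5.0, 20.0),        # time constant of single E neurons
    "tau_i_ms": (5.0, 20.0),        # time constant of single I neurons
    "noise_sigma": (0.0, 10.0),     # noise intensity
    "metabolic_beta": (0.0, 30.0),  # metabolic constant
    "ratio_ne_ni": (1.0, 8.0),      # ratio of E to I neuron numbers
    "ratio_ii_ei": (1.0, 6.0),      # ratio of mean I-I to E-I connectivity
}

# Stand-in for a reference configuration. Only the two ratios (4 and 3)
# come from the text; the other values are invented for illustration.
REFERENCE = {
    "tau_e_ms": 10.0,
    "tau_i_ms": 10.0,
    "noise_sigma": 5.0,
    "metabolic_beta": 14.0,
    "ratio_ne_ni": 4.0,
    "ratio_ii_ei": 3.0,
}

def toy_loss(params):
    # Placeholder for a network simulation: a quadratic bowl centred on
    # REFERENCE, with each parameter normalised by the width of its range.
    total = 0.0
    for name, (low, high) in RANGES.items():
        total += ((params[name] - REFERENCE[name]) / (high - low)) ** 2
    return total

def sample_joint(rng):
    # Draw all parameters jointly and independently, uniform in each range
    return {name: rng.uniform(low, high) for name, (low, high) in RANGES.items()}

def random_search(n_samples, seed=0):
    # Evaluate the loss of n_samples random parameter sets, sorted by loss
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        params = sample_joint(rng)
        results.append((toy_loss(params), params))
    results.sort(key=lambda item: item[0])
    return results

results = random_search(n_samples=1000)
best_loss, best_params = results[0]

# Number of random configurations that outperform the reference configuration
n_better = sum(loss < toy_loss(REFERENCE) for loss, _ in results)
```

      In this toy setting the reference configuration sits at the minimum of the loss by construction, so no random sample can outperform it. In an actual analysis the loss would come from simulations, and the count of better configurations would be an empirical result.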

      Reviewer #3 (Public Review):

      Summary:

      In their paper the authors tackle three things at once in a theoretical model: how can spiking neural networks perform efficient coding, how can such networks limit the energy use at the same time, and how can this be done in a more biologically realistic way than previous work?

      They start by working from a long-running theory on how networks operating in a precisely balanced state can perform efficient coding. First, they assume split networks of excitatory (E) and inhibitory (I) neurons. The E neurons have the task to represent some lower dimensional input signal, and the I neurons have the task to represent the signal represented by the E neurons. Additionally, the E and I populations should minimize an energy cost represented by the sum of all spikes. All this results in two loss functions for the E and I populations, and the networks are then derived by assuming E and I neurons should only spike if this improves their respective loss. This results in networks of spiking neurons that live in a balanced state, and can accurately represent the network inputs.

      They then investigate in-depth different aspects of the resulting networks, such as responses to perturbations, the effect of following Dale's law, spiking statistics, the excitation (E)/inhibition (I) balance, optimal E/I cell ratios, and others. Overall, they expand on previous work by taking a more biological angle on the theory and showing the networks can operate in a biologically realistic regime.

      Strengths:

      (1) The authors take a much more biological angle on the efficient spiking networks theory than previous work, which is an essential contribution to the field.

      (2) They make a very extensive investigation of many aspects of the network in this context, and do so thoroughly.

      (3) They put sensible constraints on their networks, while still maintaining the good properties these networks should have.

      Thanks for this summary and for these kind words of appreciation of the strengths of our work.  

      Weaknesses:

      (1) The paper has somewhat overstated the significance of their theoretical contributions, and should make much clearer what aspects of the derivations are novel. Large parts were done in very similar ways in previous papers. Specifically: the split into E and I neurons was also done in Boerlin et al. (2013) and in Barrett et al. (2016). Defining the networks in terms of realistic units was already done by Boerlin et al. (2013). It would also be worth it to discuss Barrett et al. (2016) specifically more, as there they also use split E/I networks and perform biologically relevant experiments.

      We improved the text to make sure that credit to previous studies is more precisely and more clearly given (see rebuttal to the specific suggestions of Reviewer 2 for a full list).

      We apologize if this was not clear enough in the previous version. 

      With regard to the specific point raised here about the E-I split, we revised the text on page 2. With regard to the realistic units, we revised the text on page 3. Finally, we commented on relation between our results and results of the study by Barrett et al. (2016) on page 16.

      (2) It is not clear from an optimization perspective why the split into E and I neurons and following Dale's law would be beneficial. While the constraints of Dale's law are sensible (splitting the population in E and I neurons, and removing any non-Dalian connection), they are imposed from biology and not from any coding principles. A discussion of how this could be done would be much appreciated, and in the main text, this should be made clear.

      We indeed removed non-Dalian connections because Dale’s law is a major constraint for biological plausibility. Our logic was to consider efficient coding within the space of networks that satisfy this (and other) biological plausibility constraints. We did not intend to claim that removing the non-Dalian connections was the result of an analytical optimization. We clarified this in revision (page 4).

      (3) Related to the previous point, the claim that the network with split E and I neurons has a lower average loss than a 1 cell-type (1-CT) network seems incorrect to me. Only the E population coding error should be compared to the 1-CT network loss, or the sum of the E and I populations (not their average). In my author recommendations, I go more in-depth on this point.

      We carefully considered these possibilities and decided to compare only the E population of the E-I model with the 1-CT model. In Fig. 8G (7C of the first submission), E neurons have a slightly higher error and cost than the 1-CT network. In the revision, we compared the loss of the E neurons of the E-I model with the loss of the 1-CT model. With this comparison, we found that the 1-CT network has a lower loss and is more efficient than the E neurons of the E-I model. We revised Figure 8H and the text on page 14 to address this point.

      (4) While the paper is supposed to bring the balanced spiking networks they consider in a more experimentally relevant context, for experimental audiences I don't think it is easy to follow how the model works, and I recommend reworking both the main text and methods to improve on that aspect.

      We tried to make the presentation of the model more accessible to a non-computational audience in the revised paper. We carefully edited the text throughout to make it as accessible as possible. 

      Assessment and context:

      Overall, although much of the underlying theory is not necessarily new, the work provides an important addition to the field. The authors succeeded well in their goal of making the networks more biologically realistic, and incorporating aspects of energy efficiency. For computational neuroscientists, this paper is a good example of how to build models that link well to experimental knowledge and constraints, while still being computationally and mathematically tractable. For experimental readers, the model provides a clearer link between efficient coding spiking networks to known experimental constraints and provides a few predictions.

      Thanks for these kind words. We revised the paper to make sure that these points emerge more clearly and in a more accessible way from the revised paper.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Referring to the major comments:

      (1) Be upfront about particular modelling choices and why you made them; avoid talk of a "striking/surprising", etc. ability to explain data when this actually requires otherwise-arbitrary choices and auxiliary assumptions. Ideally, this nuance is already clear from the abstract.

      We removed all the "striking/surprising" and similar expressions from the text. 

      We added to the Abstract the assumption of equal time constants of the stimulus and of the membrane of E and I neurons and the assumption of the independence of encoded stimulus features.

      In the revision, we performed additional analyses (joint parameter sweeps, Monte-Carlo joint sampling of all 6 model parameters) providing additional evidence that the network parameters in Table 1 capture the optimal solution reasonably well. These are reported in Figs. 2, 6I and 7J and in Results (pages 5, 11 and 13). See the rebuttal to the weaknesses in the public review of Reviewer 2 for details.

      (2) Make even more of an effort to acknowledge prior work on the importance of structured E-I and I-E connectivity.

      We have revised the text (page 4) to better place our results within previous work on structured E-I and I-E connectivity.

      (3) Be clear about the model's limitations and mention them throughout the text. This will allow readers to interpret your results appropriately.

      We now comment more on the model's limitations, in particular the simplifying assumption about the network's computation (page 16), the lack of E-E connectivity (page 3), the absence of long-term adaptation (page 10), and the simplification of having only one type of inhibitory neuron (page 16).

      (4) Present your "predictions" for what they are: aspects of the model that can be made consistent with the existing data after some fitting. Except in the few cases where you make actual predictions, which deserve to be highlighted.

      We followed the suggestion of the reviewer and distinguished cases where the model is consistent with the data (postdictions) from actual predictions, where empirical measurements are not available or not conclusive. We compiled a list of predictions and postdictions in response to the point 4 of Reviewer 1. In revision, we now comment about every property of the model as either reproducing a known property of biological networks (postdiction) or being a prediction. We improved the text in Results on pages 4, 5, 6, 7, 9, 10, 11, 12 and 13 to accommodate these requests.

      Minor comments and recommendations

      It's a sizable list, but most can be addressed with some text edits.

      (1) The image captions should give more details about the simulations and analyses, particularly regarding sample sizes and statistical tests. In Figure 5, for example, it is unclear if the lines represent averages over multiple signals and, if so, how many. It's probably not a single realization, but if it is, this might explain the otherwise puzzling optimal number of three stimuli. Box plots visualize the distribution across simulation trials, but it's not clear how many. In Figure 7d, a star suggests statistical significance, but the caption does not mention the test or its results; the y-axis should also have larger limits.

      All statistical results were computed on 100 or 200 simulation trials, depending on the figure, with a trial duration of 1 second of simulated time. To compute statistical results in Fig. 1, we used 10 trials with a duration of 10 seconds each. Each trial consisted of M independent realizations of Ornstein-Uhlenbeck (OU) processes as stimuli, independent noise in the membrane potential and an independent draw of tuning parameters, such that the results generalize over specific realizations of these random variables. Realizations of the OU processes were independent across stimulus dimensions and across trials. We added this information to the caption of each figure.
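
      The trial structure described above can be sketched as follows. This is a minimal illustration only, not the simulation code used for the paper: the function names, the Euler-Maruyama discretization, the time step `dt`, the noise amplitude `sigma` and the time constant `tau` are all assumptions chosen for the example.

```python
import math
import random


def simulate_ou(n_dims, n_steps, dt, tau, sigma, rng):
    """Euler-Maruyama simulation of n_dims independent, zero-mean OU processes.

    tau is either one time constant (seconds) shared by all dimensions or a
    list with one time constant per dimension. All names are hypothetical.
    """
    taus = [tau] * n_dims if isinstance(tau, (int, float)) else list(tau)
    x = [0.0] * n_dims
    trajectory = [list(x)]
    for _ in range(n_steps - 1):
        # Each dimension gets its own noise draw, so dimensions stay independent.
        x = [
            xi - (xi / t) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for xi, t in zip(x, taus)
        ]
        trajectory.append(list(x))
    return trajectory  # n_steps rows, n_dims columns


def make_trials(n_trials, n_dims, n_steps, dt, tau, sigma, seed=0):
    """Draw independent stimulus realizations, one per trial."""
    rng = random.Random(seed)
    return [simulate_ou(n_dims, n_steps, dt, tau, sigma, rng) for _ in range(n_trials)]


# 100 trials of 1 s each (1000 steps of an assumed 1 ms) with M = 3 stimulus features.
trials = make_trials(n_trials=100, n_dims=3, n_steps=1000, dt=1e-3, tau=0.01, sigma=1.0)
```

      Passing a list such as `tau=[0.01, 0.02, 0.04]` gives each stimulus dimension its own time constant; a full trial would additionally redraw the membrane noise and the tuning parameters, which this sketch leaves out.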

      The optimal number of M=3 stimuli is the result of measuring the performance of the network in 100 simulation trials (for each parameter value), thus following the same procedure as for all other parameters. Boxplots in Fig. 8G-H were also generated from results computed in 100 simulation trials, which we have now specified in the caption of the figure, together with the statistical test used for assessing the significance (two-tailed t-test). We also enlarged the limits of Fig. 8H (7D in the previous version).

      (2) The Oldenburg paper (reference 62) finds suppression of all but nearby neurons in response to two-photon stimulation of small neural ensembles (instead of single neurons, as in Chettih & Harvey). This isn't perfectly consistent with the model's results, even though the Oldenburg experiments seem more relevant given the model's small size, and strong connectivity/high connection probability between similarly tuned neurons. What might explain the potential mismatch?

      We sincerely apologize for not having been precise enough on this point when comparing our model against Chettih & Harvey and Oldenburg et al. We corrected the sentence (page 6) to remove the claim that our model reproduces both. 

      We speculate that the discrepancy between perturbing our model and the Oldenburg data may arise from the lack of E-E connectivity in our model. Synaptic connections between E neurons with similar selectivity could create an enhancement instead of suppression between neuronal pairs with very similar tuning. We added a sentence about this in the section with perturbation experiments “Competition across neurons with similar stimulus tuning emerging in efficient spiking networks” (page 7) where we discuss this limitation of our model. We feel that this example shows the utility of deriving perturbation results from our model, as not all networks with some degree of lateral inhibition will show the same perturbation results. Comparing perturbation results from our model with those from real data thus has some value for better appreciating the strengths and limitations of our approach.

      (3) "Previous studies optogenetically stimulated E neurons but did not determine whether the recorded neurons were excitatory or inhibitory " (p. 11). I believe Oldenburg et al. did specifically image excitatory neurons.

      The reviewer is correct about Oldenburg et al. imaging specifically excitatory neurons. We have revised this part of the Discussion (page 15). 

      (4) The authors write that efficiency is particularly achieved where adaptation is stronger in E compared to I neurons (p. 7; Figure 4). Although this would be consistent with experimental data (the I neurons in the model seem akin to fast-spiking Pv+ cells), I struggle to see it in the figure. Instead, it seems like there are roughly two regimes. If either of the neuronal timescales is faster than the stimulus timescale, the optimisation fails. If both are at least as slow, optimisation succeeds.

      We agree with the reviewer that the adaptation properties of our inhibitory neurons are compatible with Pv+ cells. What is essential for determining the dynamical regime of the network is less the relation to the time constant of the stimulus (t<sub>x</sub>) but rather the relation between the time constant of the population readout (t, which is also the membrane time constant) and the time constant of the single neuron (t<sub>r</sub><sup>y</sup> for y=E and y=I; see Eq. 23, 25 or 29e). The relation between t and t<sub>r</sub><sup>y</sup> determines if single neurons generate spike-triggered adaptation (t<sub>r</sub><sup>y</sup> > t) or spike-triggered facilitation (t<sub>r</sub><sup>y</sup> < t; see Table 4). In regimes with facilitation in either E or I neurons (or both), the network performance strongly deteriorates compared to regimes with adaptation (Fig. 5A). 
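
      The rule relating the two time constants to the sign of the spike-triggered current can be written compactly. The sketch below is illustrative only: the function name is hypothetical, and the sign convention (the difference of inverse time constants taken as 1/t minus 1/t<sub>r</sub><sup>y</sup>) is an assumption made for the example rather than a quotation of the paper's equations.

```python
def spike_triggered_regime(tau, tau_r):
    """Classify the single-neuron regime from two time constants (same units).

    tau   : time constant of the population readout (membrane time constant)
    tau_r : time constant of the single neuron
    The sign convention below is an assumption made for this illustration.
    """
    if tau <= 0 or tau_r <= 0:
        raise ValueError("time constants must be positive")
    diff = 1.0 / tau - 1.0 / tau_r  # difference of inverse time constants
    if diff > 0:
        return "adaptation"    # tau_r > tau
    if diff < 0:
        return "facilitation"  # tau_r < tau
    return "none"              # equal time constants: no spike-triggered current


for tau_r in (0.005, 0.010, 0.020):
    print(tau_r, spike_triggered_regime(tau=0.010, tau_r=tau_r))
```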

      Beyond adaptation leading to better performance, we also found different effects of adaptation in E and I neurons. We acknowledge that the difference between these effects was difficult to see in Fig. 4B of the first submission. We have now replotted the results previously shown in Fig. 4B to focus on the adaptation regime only (since Fig. 5A already establishes that this is the regime with better performance). We also added figures showing the differential effect of adaptation in the E and I cell types on the firing rate and on the average loss (Fig. 5C-D). Fig. 5B and C (top plots) show that with adaptation in E neurons, the error and the loss increase more slowly than with adaptation in I neurons. Moreover, the firing rate in both cell types decreases with adaptation in E neurons, while this is not the case with adaptation in I neurons (Fig. 5D). These results are added to the figure panels specified above and discussed in the text on page 9.

      To clarify the relation between neuronal and stimulus timescale, we now also added the analysis of network performance as a function of the time constant of the stimulus t<sub>x</sub> (Supplementary Fig. S5 C-E). We found that the model's performance is optimal when the time constant of the stimulus is close to the membrane time constant t. This result is expected, because the equality of these time constants was imposed in our analytical derivation of the model (t<sub>x</sub>  = t). We see a similar decrease in performance for values of t<sub>x</sub>  that are faster and slower with respect to the membrane time constant (Supplementary Fig. S5C, top). These results are added to the figure panels specified above and discussed in text on page 13.

      (5) A key functional property of cortical interneurons is their lower stimulus selectivity. Does the model replicate this feature?

      We think that whether I neurons are less selective than E neurons is still an open question. A number of recent empirical studies reported that the selectivity of I neurons is comparable to the selectivity of E neurons (see, e.g., Kuan et al. Nature 2024, Runyan et al. Neuron 2010, Najafi et al. Neuron 2020). In our model, the optimal solution prescribes a precise structure in recurrent connectivity (see Eq. 24 and Fig. 1C(ii)), and this structured connectivity endows I neurons with stimulus selectivity. To show this, we added plots of example tuning curves and the distribution of the selectivity index across E and I neurons (Fig. 8E-F) and described these new results in Results (page 14). Tuning curves in our network were similar to those computed in a previous work that addressed stimulus tuning in efficient spiking networks (Barrett et al. 2016). We evaluated tuning curves using M=3 constant stimulus features, varying one of the features while keeping the two others fixed. We provided details on how the tuning curves and the selectivity index were computed in a new Methods subsection (“Tuning curves and selectivity index”) on page 50.

      (6) The final panels of Figure 4 are presented as an approach to test the efficiency of biological networks. The authors seem to measure the instantaneous (and time-averaged) E-I balance while varying the adaptation parameter and then correlate this with the loss. If that is indeed the approach (it's difficult to tell), this doesn't seem to suggest a tractable experiment. Also, the conclusion is somewhat obvious: the tighter the single neuron balance, the fewer unnecessary spikes are fired. I recommend that the authors clearly explain their analysis and how they envision its application to biological data.

      We indeed measured the instantaneous (and time-averaged) E-I balance while varying the adaptation parameters and then correlated this with the loss. We did not want to imply that the latter panels of Figure 4 are a means to test the efficiency of biological networks or that we are suggesting new and possibly unfeasible experiments. We see it as a way to better conceptually understand how spike-triggered adaptation helps the network’s coding efficiency, by tightening the E-I balance in a way that reduces the number of unnecessary spikes. We apologize if the previous text was confusing in this respect. We have now removed the initial paragraph of the former Results subsection (including the subsection title) and added new text about the different effects of adaptation in E and I neurons on page 9. We also thoroughly revised Figure 5.

      (7) The external stimuli are repeatedly said to vary (or be tracked) across "multiple time scales", which might inadvertently be interpreted as (i) a single stimulus containing multiple timescales or (ii) simultaneously presented stimuli containing different timescales. These scenarios are potential targets for efficient coding through neuronal adaptation (reference 21 in the manuscript and Pozzorini et al. Nat. Neuro. 2013), but they are not addressed in the current model. I recommend the authors clarify their statements regarding timescales (and if they're up for it, acknowledge this as a limitation).

      We thank the reviewer for bringing up this interesting point. To address the second point raised by the Reviewer (simultaneously presented stimuli containing multiple timescales), we performed new analyses to test the model with simultaneously presented stimuli that have different timescales. We found that the model efficiently encodes such stimuli. We tested the case of a 3-dimensional stimulus where each dimension is an Ornstein-Uhlenbeck process with a different time constant. More precisely, we kept the time constant in the first dimension fixed (at 10 ms), and varied the time constants in the second and third dimensions such that the time constant in the third dimension is double that in the second dimension. We plotted the encoding error in every stimulus dimension for E and I neurons (Fig. 8B, left plot) as well as the encoding error and the metabolic cost averaged across stimulus dimensions (Fig. 8B, right plot). The results are briefly described in the text on page 13.

      Regarding the case i) (single stimulus containing multiple timescales), we considered two possibilities. One possibility is that timescales of the stimulus are separable, and in this case a single stimulus containing several time scales can be decomposed in several stimuli with a single time scale each. As we assign a new set of weights for each dimension of the decomposed stimulus, this case is similar to the case ii) that we already addressed. Another possibility is that timescales of the stimulus cannot be separated. This case is not covered in the present analysis and we listed it among the limitations of the model. We revised the text (page 13) around the question of multiple time scales and included the citation of Pozzorini et al. (2013). 

      (8) It is claimed that the model uses a mixed code to represent signals, citing reference 47 (Rigotti et al., Nature 2013). But whereas the model seems to use linear mixed selectivity, the Rigotti reference highlights the virtues of nonlinear mixed selectivity. In my understanding, a linearly mixed code does not enjoy the same benefits since it’s mathematically equivalent to a non-mixed code (simply rotate the readout matrix). I recommend that the authors clarify the type of selectivity used by their model and how it relates to the paper(s) they cite.

      The reviewer is correct that our selectivity is a linear mixing of input variables, and differs from the selectivity in Rigotti et al. (2013) which is non-linear. We revised the sentence on page 4 to clarify better that the mixed selectivity we consider is linear and we removed Rigotti’s citation. 

      (9) Reference 46 is cited as evidence that leaky integration of sensory features is a relevant computation for sensory areas. I don’t think this is quite what the reference shows. Instead, it finds certain morphological and electrophysiological differences between single pyramidal neurons in the primary visual cortex compared to the prefrontal cortex. Reference 46 then goes on to speculate that these are differences relevant to sensory computation. This may seem like a quibble, but given the centrality of the objective function in normative theories, I think it's important to clarify why a particular objective is chosen.

      We agree that our citation of Amatrudo et al was not the best reference and that the previous text was confusing. We thus tried to improve its clarity. We looked at the previous theoretical efficient coding papers that introduced this leaky integration and could not find in them a justification of this assumption based on experimental papers. However, there is evidence that neurons in sensory structures and in cortical association areas respond to time-varying sensory evidence by summing stimuli over time with a weight that decreases steadily going back in time from the time of firing, which suggests that neurons integrate time-varying sensory features. In many cases, these integration kernels decay approximately exponentially going back in time, and several models that successfully explain perceptual readouts of neural activity work by assuming leaky integration. This suggests that the mathematical approximation of leaky integration of sensory evidence, though possibly simplistic, is reasonable. We revised the text in this respect (page 2).

      (10) The definition of the objective function uses beta as a tuning parameter, but later parts of the text and figures refer to a parameter g_L which might only be introduced in the convex combination of Eq. 40a.

      This is correct. Parameter optimization has been performed on a weighted sum of the average encoding error and cost as given by the Eq. 39a (40a in first submission), with the weighting g<sub>L</sub> for the error versus the cost, and not the beta that is part of the objective in Eq.10. The convex combination in Eq. 39a allowed us to find a set of optimal parameters that is within biologically realistic parameter ranges, which includes realistic values for the firing threshold. The average encoding error and metabolic cost (the two terms on the right-hand side of Eq. 39a, without weighting with g<sub>L</sub>) in our network are of the same order (see Fig 8G for the E-I model where these values are plotted separately for the optimal network). Weighing the cost with optimal beta that is in the range of ~10 would have yielded a network that optimizes almost exclusively the metabolic cost and would bias the results towards solutions with poor encoding accuracy.

      To document more fully how the choice of weighting of the error with the cost (g<sub>L</sub>) affects the optimal parameters, we now added a new analysis (Fig. 8D and Supplementary Fig. S4 A-D and H) showing optimal parameters as a function of this weighting. We commented on these results in the text on pages 9-11 and 12. For further details, please see also the reply to point 1 of Reviewer 1.
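
      To make the difference between the two weightings concrete, here is a minimal numerical sketch. It is not the paper's Eq. 10 or Eq. 39a: it assumes that the convex combination takes the form g_L * error + (1 - g_L) * cost and that beta multiplies the cost in an objective of the form error + beta * cost. All function names are hypothetical.

```python
def convex_loss(error, cost, g_l):
    """Assumed convex combination: g_l weights the error, (1 - g_l) the cost."""
    if not 0.0 <= g_l <= 1.0:
        raise ValueError("g_l must lie in [0, 1]")
    return g_l * error + (1.0 - g_l) * cost


def cost_share_convex(error, cost, g_l):
    """Fraction of the convex loss contributed by the metabolic cost."""
    return (1.0 - g_l) * cost / convex_loss(error, cost, g_l)


def cost_share_beta(error, cost, beta):
    """Fraction of an assumed objective error + beta * cost due to the cost term."""
    return beta * cost / (error + beta * cost)


# Error and cost of the same order, as stated for the optimal network.
error, cost = 1.0, 1.0
print("convex weighting, g_L = 0.7:", cost_share_convex(error, cost, 0.7))
print("beta weighting, beta = 10:  ", cost_share_beta(error, cost, 10.0))
```

      With error and cost of equal magnitude, a weight of about 10 on the cost makes the cost term dominate the objective, whereas the convex weighting keeps both terms comparable.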

      (11) Figure 1J: "In E neurons, the distribution of inhibitory and of net synaptic inputs overlap". In my understanding, they are in fact identical, and this is by construction. It might help the reader to state this.

      We apologize for an unclear statement. In E neurons, net synaptic current is the sum of the feedforward current and of recurrent inhibition (Eq. 29c and Eq. 42). With our choice of tuning parameters that are symmetric around zero and with stimulus features that have vanishing mean, the mean of the feedforward current is close to zero. Because of this, the mean of the net current is negative and is close to the mean of the inhibitory current. We have clarified this in the text (page 5).

      (12) A few typos:

      -  p1. "Minimizes the encoding accuracy" should be "maximizes..."

      -  p1: "as well the progress" should be something like "as well as the progress"

      -  p.11 "In recorded neurons where excitatory or inhibitory", "where" should be "were"

      -  Fig3: missing parentheses (B)

      -  Fig4B: the 200 ticks on the y-scale are cut off.

      -  Panel Fig. 5a: "stimulus" should be "stimuli".

      -  Ref 24 "Efficient andadaptive sensory codes" is missing a space.

      -  p. 26: "requires" should be "required".

      -  On several occasions, the article "the" is missing.

      We thank the reviewer for kindly pointing out the typos that we now corrected.

      Reviewer #2 (Recommendations For The Authors):

      I would like to give the authors more details about the two main weaknesses discussed above, so that they may address specific points in the paper. First, there is the relation to previous work. Several published articles have presented very similar results to those discussed here, including references 5, 26, 28, 32, 33, 42, 43, 48, and an additional reference not cited by the authors (Calaim et al. 2022 eLife e73276). This includes:

      (1) Derivation of an E-I efficient spiking network, which is found in refs. 28, 42, 43, and 48. This is not reflected in the text: e.g., "These previous implementations, however, had neurons that did not respect Dale's law" (Introduction, pg. 1); "Unlike previous approaches (28, 48), we hypothesize that E and I neurons have distinct normative objectives...". The authors should discuss how their derivation compares to these.

      We have now fully clarified on page 3 that our model builds on the seminal previous works that introduced E-I networks with efficient coding (Supplementary text in Boerlin et al. 2013, Chalk et al. 2016, Barrett et al. 2016). 

      (2) Inclusion of a slow adaptation current: I believe this also appears in a previous paper (Gutierrez & Deneve 2019, ref. 33) in almost the exact same form, and is again not reflected in the text: "The strength of the current is proportional to the difference in inverse time constants ... and is thus absent in previous studies assuming that these time constants are equal (... ref. 33)". Again, the authors should compare their derivation to this previous work.

      We thank the reviewer for pointing this out. We sincerely apologize if our previous version did not recognize sufficiently clearly that the previous work of Gutierrez and Deneve (eLife 2019; ref 33) introduced first the slow adaptation current that is similar to spike-triggered adaptation in our model. We have made sure that the revised text recognizes it more clearly. We also explained better what we changed or added with respect to this previous work (see revised text on page 8). 

      The work by Gutierrez and Deneve (2019) emphasizes the interplay between a single-neuron property (an adapting current in single neurons) and a network property (network-level coding through structured recurrent connections). They use a network that does not distinguish E and I neurons. Our contribution instead focuses on adaptation in an E-I network. To improve the presentation following the Reviewer’s comment, we now better emphasize the differential effect of adaptation in E and in I neurons in the revision (Fig. 5 B-D). Moreover, Gutierrez and Deneve studied the effect of adaptation on slower time scales (1 or 2 seconds), while we study adaptation on a finer time scale of tens of milliseconds. The revised text detailing this is on page 8.

      (3) Background currents and physical units: Pg. 26: "these models did not contain any synaptic current unrelated to feedforward and recurrent processing" and "Moreover previous models on efficient coding did not thoroughly consider physical units of variables" - this was briefly described in ref. 28 (Boerlin et al. 2013), in which the voltage and threshold are transformed by adding a common constant, and additional aspects of physical units are discussed.

      It is correct that Boerlin et al (2013) suggested adding a common constant to introduce physical units. We now revised the text to make clearer the relation between our results and the results of Boerlin et al. (2013) (page 3). In our paper, we built on Boerlin et al. (2013) and assigned physical units to computational variables that define the model's objective (the targets, the estimates, the metabolic constant, etc.). We assigned units to computational variables in such a way that physical variables (such as membrane potential, transmembrane currents, firing thresholds and resets) have the correct physical units.  We have now clarified how we derived physical units in the section of Results where we introduce the biophysical model (page 3) and specified how this derivation relates to the results in Boerlin et al. (2013).

      (4) Voltage correlations, spike correlations, and instantaneous E/I balance: this was already pointed out in Boerlin et al. 2013 (ref 28; from that paper: "Despite these strong correlations of the membrane potentials, the neurons fire rarely and asynchronously") and others including ref. 32. The authors mention this briefly in the Discussion, but it should be more prominent that this work presents a more thorough study of this well-known characteristic of the network.

      We agree that it is important to comment on how our results relate to these results in Boerlin et al. (2013). It is correct that in Boerlin et al. (2013) neurons have strong correlations in the membrane potentials but fire asynchronously, similarly to what we observe in our model. However, asynchronous dynamics in Boerlin et al. (2013) strongly depend on the assumption of instantaneous synaptic transmission and on time discretization, with a “one spike per time bin” rule in the numerical implementation. This rule enforces that at most one spike is fired in each time bin, thus actively preventing any synchronization across neurons. If this rule is removed, their network synchronizes, unless the metabolic constant is strong enough to control such synchronization and bring the network back to the asynchronous regime (see ref. 36). Our implementation does not contain any specific rule that would prevent synchronization across neurons. We now cite the paper by Boerlin and colleagues and briefly summarize this discussion when we describe the result of Fig. 3D on page 7.

      (5) Perturbations and parameters sweep: I found one previous paper on efficient spiking networks (Calaim et al. 2022) which the authors did not cite, but appears to be highly relevant to the work presented here. Though the authors perform different perturbations from this previous study, they should ideally discuss how their findings relate to this one. Furthermore, this previous study performs extensive sweeps over various network parameters, which the authors might discuss here, when relevant. For example, on pg. 8, the authors write “We predict that, if number of neurons within the population decreases, neurons have to fire more spikes to achieve an optimal population readout” – this was already shown in Calaim et al. 2022 Figure 5, and the authors should mention if their results are consistent.

      We apologize for not being aware of Calaim et al. (2022) when we submitted the first version of our paper. This important study is now cited in the revised version. We have now, as suggested, performed sweeps of multiple parameters inspired by the work of Calaim et al. This new analysis is described extensively in the reply to the Weaknesses in the Public Review of Reviewer 2, is shown in Figs. 2, 6I and 7J, and is described on pages 5, 11 and 13.

      The Reviewer is also correct that the compensation mechanism that applies when changing the ratio of E-I neuron numbers is similar to the one described in Barrett et al. (2016) and related to our claim “if number of neurons within the population decreases, neurons have to fire more spikes to achieve an optimal population readout”. We have now added (page 11) that this prediction is consistent with the finding of Barrett et al. (2016).

      With regard to the dependence of optimal coding properties on the number of neurons, we have tried to better describe similarities and differences with our work and that of Calaim et al as well as with the work of Barrett et al. (2016) which reports highly relevant results. These additional considerations are summarized in a paragraph in Discussion (page 16).

      (6) Overall, the authors should distinguish which of their results are novel, which ones are consistent with previous work on efficient spiking networks, and which ones are consistent in general with network implementations of efficient and sparse coding. In many of the above cases, this manuscript goes into much more depth and study of each of the network characteristics, which is interesting and commendable, but this should be made clear. In clarifying the points listed above, I hope that the authors can better contextualize their work in relation to previous studies, and highlight what are the unique characteristics of the model presented here.

      We made a number of clarifications of the text to provide better contextualization of our model within existing literature and to credit more precisely previous publications. This includes commenting on previous studies that introduced separate objective functions of E and I neurons (page 2), spike-triggered adaptation (page 8), physical units (page 3), and changes in the number of neurons in the network (page 16). 

      Next, there are the claims of optimal parameters. As explained on pg. 35 (criterion for determining optimal model parameters), it appears to me that they simply vary each parameter one at a time around the optimal value. This argument appears somewhat circular, as they would need to know the optimal parameters before starting this sweep. In general, I find these optimality considerations to be the most interesting and novel part of the paper, but the simulations are relatively limited, so I would ask the authors to either back them up with more extensive parameter sweeps that consider covariations in different parameters simultaneously (as in Calaim et al. 2022). Furthermore, the authors should make sure that they are not breaking any of the required relationships between parameters necessary for the optimization of the loss function. Again, some of the results (such as coding error not being minimized with zero metabolic cost) suggests that there might be issues here. 

      We thank the reviewer for this insightful suggestion. We have now added a joint sweep of all relevant model parameters using a Monte-Carlo parameter search with 10,000 iterations. We randomly drew parameter configurations from predetermined parameter ranges, which are detailed in the newly added Table 2. Parameters were sampled from a uniform distribution. We varied all six model parameters studied in the paper (metabolic constant, noise intensity, time constants of single E and I neurons, ratio of E to I neurons, and ratio of the mean I-I to E-I connectivity). We now present these results in a new Figure 2. We did not find any set of parameters with lower loss than the parameters in Table 1 when the weighting of the error with the cost was in the range 0.4 < g<sub>L</sub> < 0.81 (Fig. 2C). While our large but finite Monte-Carlo random sampling does not fully prove that the configuration we selected as optimal (in Table 1) is a global optimum, it shows that this configuration is highly efficient. Further, and as detailed in the rebuttal to the Weaknesses of the Public Review of Referee 2, analyses of the near-optimal solutions are compatible with the notion (resulting from the joint parameter sweep studies that we added to Figures 6 and 7) that network optimality may be influenced by joint covariations in parameters. These new results are reported in Results (pages 5, 11 and 13) and in Figures 2, 6I and 7J.
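
      As a rough illustration of the sampling procedure described above, the following minimal sketch draws each parameter independently from a uniform distribution and keeps the configuration with the lowest loss. The parameter names, the ranges, and the `toy_loss` function are placeholders chosen for illustration only; the actual ranges are those of Table 2, and the actual loss requires simulating the network.

```python
import numpy as np

# Hypothetical ranges for the six swept parameters (illustrative values only;
# the ranges actually used are listed in Table 2 of the manuscript).
PARAM_RANGES = {
    "metabolic_constant": (0.0, 30.0),
    "noise_intensity": (0.0, 10.0),
    "tau_E": (5.0, 20.0),
    "tau_I": (5.0, 20.0),
    "ratio_E_to_I": (1.0, 8.0),
    "ratio_II_to_EI": (1.0, 6.0),
}


def sample_configuration(rng, ranges):
    """Draw one configuration, each parameter uniform within its range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}


def random_search(loss_fn, ranges, n_iter, seed=0):
    """Monte-Carlo search: keep the sampled configuration with the lowest loss."""
    rng = np.random.default_rng(seed)
    best_params, best_loss = None, np.inf
    for _ in range(n_iter):
        params = sample_configuration(rng, ranges)
        loss = loss_fn(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


def toy_loss(params):
    """Placeholder loss; a real run would simulate the network instead."""
    return sum((value - 5.0) ** 2 for value in params.values())


best_params, best_loss = random_search(toy_loss, PARAM_RANGES, n_iter=200)
```

      A finite random search of this kind can only show that no sampled configuration beats a reference configuration; it cannot certify a global optimum.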

      Some more specific points:

      (1) In general, I find it difficult to understand the scaling of the RMSE, cost, and loss values in Figures 4-7. Why are RMSE values in the range of 1-10, whereas loss and cost values are in the range of 0-1? Perhaps the authors can explicitly write the values of the RMSE and loss for the simulation in Figure 1G as a reference point.

      Encoding error (RMSE), metabolic cost (MC) and average loss for a well-performing network are within the range of 1-10 (see Fig. 8G, or Fig. 7C in the first submission). To ease the visualization of results, we normalized the cost and the loss in Figs. 6-8 in order to plot them on the same figure (the computation of the optima follows Eq. 39 and is done without normalization). We have now explicitly written the values of RMSE, MC and the average loss (non-normalized) for the simulation in Fig. 1D on page 5, as suggested by the reviewer. We have also revised Fig. 4 and now show the absolute and not the relative values of the RMSE and the MC.

      (2) Optimal E-I neuron ratio of 4:1 and efficacy ratio of 3:1: besides being unintuitive in relation to previous work, are these two optimal settings related to one another? If there are 4x more excitatory neurons than inhibitory neurons, won't this affect the efficacy ratio of the weights of the two populations? What happens if these two parameters are varied together?

      Thanks for this insightful point. Indeed, the optima of these two parameters are interdependent and positively correlated: if we decrease the E-I neuron ratio, the optimal efficacy ratio decreases as well. To better show this relation, we added figures with a 2-dimensional parameter search (Fig. 7J) in which we varied the two ratios jointly. The red cross on the right figure marks the ratios used as optimal parameters in our study. These findings are discussed on page 13.

      (3) Optimal dimensionality of M=[1,4]: Again, previous work (Calaim et al. 2022) would suggest that efficient spiking networks can code for arbitrary dimensional signals, but that performance depends on the redundancy in the network - the more neurons, the better the coding. From this, I don't understand how or why the authors find a minimum in Figure 7B. Why does coding performance get worse for small M?

      We optimized all model parameters with M=3 and this is the reason why M=3 is the optimal number of inputs when we vary this parameter. Our network shows a distinct minimum of the encoding error as a function of the stimulus dimensionality for both E and I neurons (Fig. 8C, top). This minimum is reflected in the minimum of the average loss (Fig. 8C, bottom). The minimum of the loss is shifted (or biased) by the metabolic cost, with strong weighting of the cost lowering the optimal number of inputs. This is discussed on pages 13-14.

      Here is a list of other, more minor points that the authors can consider addressing to make the results and text clearer:

      (1) Feedforward efficient coding models: in the introduction (pg. 1) and discussion (pg. 11) it is mentioned that early efficient coding models, such as that of Olshausen & Field 96, were purely feedforward, which I believe to be untrue (e.g., see Eq. 2 of O&F 96). Later models made this even more explicit (Rozell et al. 2008). Perhaps the authors can either clarify what they meant by this, or downplay this point.

      We sincerely apologize for the oversight present in the previous version of the text. We agree with the reviewer that the model in Olshausen and Field (1996) indeed defines a network with recurrent connections, and the same type of recurrent connectivity has been used by Rozell et al. (2008, 2013). The structure of the connectivity in Olshausen and Field (as well as in Rozell et al (2008)) is closely related to the structure of connectivity that we derived in our model. We have corrected the text in the introduction (page 1) to remove these errors.

      (2) Pg. 2 - The authors state: "We draw tuning parameters from a normal distribution...", but in the methods, it states that these are then normalized across neurons, so perhaps the authors could add this here, or rephrase it to say that weights are drawn uniformly on the hypersphere.

      We rephrased the description of how weights were determined (page 2).

      (3) Pg. 2 - "We hypothesize the time-resolved metabolic cost to be proportional to the estimate of a momentary firing rate of the neural population" - from what I can see, this is not the usual population rate, which would be an average or sum of rates across the population.

      Indeed, the time-dependent metabolic cost is not the population rate (in the sense of the sum of instantaneous firing rates across neurons), but is proportional to it by a factor of 1/t. More precisely, we can define the instantaneous estimate of the firing rate of a single neuron i as z<sub>i</sub>(t) = 1/t<sub>r</sub> r<sub>i</sub>(t) with r<sub>i</sub>(t) as in Eq. 7. We have clarified this in the revised text on page 3. 

      (4) Pg. 3: "The synaptic strength between two neurons is proportional to their tuning similarity if the tuning similarity is positive" - based on the figure and results, this appears to be the case for I-E, E-I, and I-I connections, but not for E-E connections. This should be clarified in the text. Furthermore, one reference given in the subsequent sentence (Ko et al. 2011, ref. 51), is specifically about E-E connections, so doesn't appear to be relevant here.

      We have now specified that the Eq. 24 does not describe E-E connections. We also agree that the reference (Ko et al. 2011) did not adequately support our claim and we thus removed it and revised the text on page 3 accordingly.

      (5) Pg. 3: "the relative weight of the metabolic cost over the encoding error controls the operating regime of the network" and "and an operating regime controlled by the metabolic constant" - what do you mean by operating regime here?

      We used the expression “operating regime” in the sense of a dynamical regime of the network.  However, we agree that this expression may be confusing and we removed it in revision. 

      (6) Pg. 3: "Previous studies interpreted changes of the metabolic constant beta as changes to the firing thresholds, which has less biological plausibility" - can the authors explain why this is less plausible, or ideally provide a reference for it?

      In biological networks, global variables such as brain state can strongly modulate the way neural networks respond to a feedforward stimulus. These variables influence neural activity in at least two distinct ways. One is by changing non-specific synaptic inputs to neurons, which is a network-wide effect (Destexhe and Pare, Nature Reviews Neurosci. 2003). This is captured in our model by changing the strength of the mean and fluctuations in the external currents. Beyond modulating synaptic currents, another way of modulating neural activity is by changing cell-intrinsic factors that modulate the firing threshold in biological neurons (Pozzorini et al. 2013). Previous studies on spiking networks with efficient coding interpreted the effect of the metabolic constant as changes to the firing threshold (Koren and Deneve, 2017, Gutierrez and Deneve 2019), which corresponds to cell-intrinsic factors. Here we instead propose that the metabolic constant modulates the neural activity by changing the non-specific synaptic input, homogeneously across all neurons in the network. Interpreting the metabolic constant as setting the mean of the non-specific synaptic input was necessary in our model to find an optimal set of parameters (as in Table 1) that is also biologically plausible. We revised the text accordingly (page 4).

      (7) Pg. 4: Competition across neurons: since the model lacks E-E connectivity, it seems trivial to conclude that there is competition through lateral inhibition, and it can be directly determined from the connectivity. What is gained from running these perturbation experiments?

      We agree that a reader with a good understanding of sparse / efficient coding theory can tell that there is competition across neurons with similar tuning already from the equation for the recurrent connectivity (Eq. 24). However, we presume that not all readers can see this from the equations and that it is worth showing this with simulations.

      Following the reviewer's comment, we have now downplayed the result about the model manifesting lateral inhibition in general on page 6. We have also removed its extensive elaboration in Discussion.

      One reason to run perturbation experiments was to test to what extent the optimal model qualitatively replicates empirical findings, in particular, single neuron perturbation experiments in Chettih and Harvey, 2019, without specifically tuning any of the model parameters. We found that the model reproduces qualitatively the main empirical findings, without tuning the model to replicate the data. We revised the text on page 5 accordingly.

      A further reason to run these experiments was to refine predictions about the minimal amount of connectivity structure that generates perturbation response profiles that are qualitatively compatible with empirical observations. To establish this, we did perturbation experiments while removing the connectivity structure of particular connectivity sub-matrices (E-I, I-I or I-E; Fig. S3 F). This allowed us to determine which connectivity matrix has to be structured to observe results that qualitatively match empirical findings. We found that the structure of E-I and I-E connectivity is necessary, but not the structure of I-I connectivity. Finally, we tested partial removal of the connectivity structure, where we replaced the precise (and optimal) connectivity structure with a simpler connectivity rule. In the optimal connectivity, the connection strength is proportional to the tuning similarity. A simpler connectivity rule, in contrast, only specifies that neurons with similar tuning share a connection, and beyond this the connection strength is random. Running perturbation experiments in a network obeying this simpler connectivity rule still qualitatively replicated empirical results from Chettih and Harvey (2019). This is shown in Supplementary Fig. S2F and described on page 8.

      (8) Pg. 4: "the optimal E-I network provided a precise and unbiased estimator of the multidimensional and time-dependent target signal" - from previous work (e.g., Calaim et al. 2022), I would guess that the estimator is indeed biased by the metabolic cost. Why is this not the case here? Did you tune the output weights to remove this bias?

      Output weights were not tuned to remove the bias. On Fig. 1H in the first submission we plotted the bias for the network that minimizes the encoding error. We forgot to specify this in the text and figure caption, for which we apologize. We now replaced this figure with a new one (Fig. 1E) where we plot the bias of the network minimizing the average loss (with parameters as in Table 1). The bias of the network minimizing the error is close to zero, B^E = 0.02 and B^I = 0.03.  The bias of the network minimizing the loss is stronger and negative, B^E = -0.15 and B^I=-0.34. In the text of Results, we now report the bias of both networks (i.e., optimizing the encoding error and optimizing the loss). We also added a plot showing trial-averaged estimates and a time-dependent bias in each stimulus dimension (Supplementary figure S1 F). Note that the network minimizing the encoding error requires a lower metabolic constant (β = 6) than the network optimizing the loss (β=14), however, the optimal metabolic cost in both networks is nonzero. We revised the text and explained these points on page 5.

      (9) Pg. 4: "The distribution of firing rates was well described by a log-normal distribution" - I find this quite interesting, but it isn't clear to me how much this is due to the simulation of a finite-time noisy input. If the neurons all have equal tuning on the hypersphere, I would expect that the variability in firing is primarily due to how much the input correlates with their tuning. If this is true, I would guess that if you extend the duration of the simulation, the distribution would become tighter. Can you confirm that this is the stationary distribution of the firing rates?

      We now simulated the network with longer simulation time (10 seconds of simulated time instead of 2 seconds used previously) and also iterated the simulation across 10 trials to report a result that is general across random draws of tuning parameters (previously a single set of tuning parameters was used). The reviewer is correct that the distribution of firing rates of E neurons has become tighter with longer simulation time, but distributions remain log-normal. We also recomputed the coefficient of variation (CV) using the same procedure. We updated these plots on Fig. 1F.

      (10) Pg. 4: "We observed a strong average E-I balance" - based on the plots in Figure 1J, the inputs appear to be inhibition-dominated, especially for excitatory neurons. So by what criterion are you calling this strong average balance?

      The reviewer is correct about the fact that the net synaptic input to single neurons in our optimal network shows excess inhibition and the network is inhibition-dominated, so we revised this sentence (page 5) accordingly.  

      (11) Pg. 4: Stronger instantaneous balance in I neurons compared to E neurons - this is curious, and I have two questions: (1) can the authors provide any intuition or explanation for why this is the case in the model? and (2) does this relate to any literature on balance that might suggest inhibitory neurons are more balanced than excitatory neurons?

      In our model, I neurons receive excitatory and inhibitory synaptic currents through synaptic connections that are precisely structured. E neurons receive structured inhibition and a feedforward current. The feedforward current consists of M=3 independent OU processes projected on the tuning vectors of E neurons w<sub>i</sub><sup>E</sup>. We speculate that because the synaptic inhibition and feedforward current are different processes and the 3 OU inputs are independent, it is harder for E neurons to achieve the instantaneous balance that would be as precise as in I neurons. While we think that the feedforward current in our model reflects biologically plausible sensory processing, it is not a mechanistic model of feedforward processing. In biological neurons, real feedforward signals are implemented as a series of complex feedforward synaptic inputs from downstream areas, while the feedforward current in our model is a sum of stimulus features, and is thus a simplification of a biological process that generates feedforward signals. We speculate that a mechanistic implementation of the feedforward current could increase the instantaneous balance in E neurons.  Furthermore, the presence of EE connections could potentially also increase the instantaneous balance in E neurons. We revised the Discussion about these important questions that lie on the side of model limitations and could be advanced in future work. We could not find any empirical evidence directly comparing the instantaneous balance in E versus I neurons.  We have reported these considerations in the revised Discussion (page 16).

      (12) Pg. 5, comparison with random connectivity: "Randomizing E-I and I-E connectivity led to several-fold increases in the encoding error as well as to significant increases in the metabolic cost" and Discussion, pg. 11: "the structured network exhibits several fold lower encoding error compared to unstructured networks": I'm wondering if these comparisons are fair. First, regarding activity changes that affect the metabolic cost - it is known that random balanced networks can have global activity control, so it is not straightforward that randomizing the connectivity will change the metabolic cost. What about shuffling the weights but keeping an average balance for each neuron's input weights? Second, regarding coding error, it is trivial that random weights will not map onto the correct readout. A fairer comparison, in my opinion, would at least be to retrain the output weights to find the best-fitting decoder for the three-dimensional signal, something more akin to a reservoir network.

      Thank you for raising these interesting questions. The purpose of comparing networks with and without connectivity structure was to observe causal effects of the connectivity structure on the neural activity. We agree that the effect on the encoding error is close to trivial, because shuffling of connectivity weights decouples neural dynamics from decoding weights. We have carefully considered the Reviewer's suggestions to better compare the performance of structured and unstructured networks.

      In reply to the first point, we followed the reviewer's suggestion and compared the optimal network with a shuffled network that matched the optimal network in its average balance. This was achieved by increasing the metabolic constant, decreasing the noise intensity and slightly decreasing the feedforward stimulus (we did not find a way to match the net current in both cell types by changing a single parameter). As we compared the metabolic cost between the optimal and the shuffled network with matched average balance, we still found lower metabolic cost in the optimal network, even though the difference was now smaller. We replaced Fig. 3B from the first submission with these new results in Fig. 4B and commented on them in the text (page 7).

      In reply to the second point, we followed the reviewer's suggestion and compared the encoding error (RMSE) of the optimal network with that of the network with shuffled connectivity, where decoding weights are trained so as to optimally reconstruct the target signal. As suggested, we analyzed the encoding error of the networks using decoding weights trained on the set of spike trains generated by the network, using linear least-squares regression to minimize the decoding error. For a fair and quantitative comparison, and because we did not train decoding weights of our structured model, we performed this same analysis using spike trains generated by networks with structured and with shuffled recurrent connectivity. We found that the encoding error is smaller in the E population and much smaller in the I population in the structured compared to the random network. Decoding weights found numerically in the optimal network approach the uniform distribution of weights that we used in our model (Fig. 4A, right). In contrast, decoding weights obtained from the random network do not converge to a uniform distribution, but instead form a much sparser distribution, in particular in I neurons (Supplementary Fig. S3 A). These additional results, reported in the above-mentioned figures, are discussed in the text on page 14.
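
      To make the decoder-training step concrete, the sketch below fits linear decoding weights by least squares and evaluates the resulting RMSE. It is a minimal illustration on synthetic data, assuming that the network output is available as a matrix of low-pass filtered spike trains (`rates`, time by neurons) and that the target is a matrix of shape time by M; these array names and the synthetic data are assumptions, not the code used for the manuscript.

```python
import numpy as np


def fit_decoder(rates, target):
    """Least-squares decoding weights mapping (T, N) activity to a (T, M) target."""
    weights, *_ = np.linalg.lstsq(rates, target, rcond=None)
    return weights  # shape (N, M)


def rmse(target, estimate):
    """Root mean squared error between the target and its reconstruction."""
    return np.sqrt(np.mean((target - estimate) ** 2))


# Synthetic stand-in for filtered spike trains and an M=3 dimensional target.
rng = np.random.default_rng(0)
T, N, M = 500, 20, 3
rates = rng.random((T, N))
true_weights = rng.normal(size=(N, M))
target = rates @ true_weights

decoder = fit_decoder(rates, target)
error = rmse(target, rates @ decoder)
```

      In an actual comparison, the same fitting procedure would be applied to the activity of the structured and of the shuffled network, so that any remaining difference in RMSE cannot be attributed to a mismatched readout.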

      (13) Pg. 5: "a shift from mean-driven to fluctuation-driven spiking" and Pg. 11 "a network structured as in our efficient coding solution operates in a dynamical regime that is more stimulus-driven, compared to an unstructured network that is more fluctuation driven" - I would expect that the balanced condition dictates that spiking is always fluctuation driven. I'm wondering if the authors can clarify this.

      We agree with the reviewer that networks with and without connectivity structure are fluctuation-driven, because in a mean-driven network the mean current must be suprathreshold (Ahmadian and Miller, 2021), which is not the case of either network. We removed the claim of the change from mean to fluctuation driven regime in the revised paper. We are grateful to the Reviewer for helping us tighten the elaboration of our findings.

      (14) Pg. 5: "suggesting that variability of spiking is independent of the connectivity structure" - the literature of balanced networks argues against this. Is this not simply because you have a noisy input? Can you test this claim?

      We thank the reviewer for the suggestion. We tested this claim by measuring the coefficient of variation in networks receiving a constant stimulus. In particular, we set the same strength in each of the M=3 stimulus dimensions and set the stimulus amplitude such as to match the firing rate of the optimal network in response to the OU stimulus. We computed the coefficient of variation in 200 simulation trials.  The removal of connectivity structure did not cause significant change of the coefficient of variation in a network driven by a constant stimulus (Fig. 4E). These additional results are discussed in text on page 7. 

      We have also taken up the suggestion about variability of spiking being independent of the connectivity structure. We removed this claim in the revision, because we only tested a few specific cases in which the connectivity is structured with respect to tuning similarity (fully structured, fully unstructured and partially unstructured networks). These cases do not exhaust all possible structures that recurrent connectivity may have.
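
      For reference, a common way to quantify the variability of spiking is the coefficient of variation of inter-spike intervals. The sketch below assumes this ISI-based definition, which may differ in detail from the procedure used in the manuscript, and the spike trains are synthetic.

```python
import numpy as np


def coefficient_of_variation(spike_times):
    """CV of inter-spike intervals: standard deviation divided by the mean."""
    intervals = np.diff(np.sort(np.asarray(spike_times, dtype=float)))
    if intervals.size < 2:
        return np.nan  # too few spikes to estimate variability
    return intervals.std() / intervals.mean()


rng = np.random.default_rng(1)

# Perfectly regular spiking gives a CV close to 0.
regular_train = np.arange(0.0, 10.0, 0.1)
cv_regular = coefficient_of_variation(regular_train)

# Poisson-like spiking (exponential intervals) gives a CV close to 1.
poisson_train = np.cumsum(rng.exponential(scale=0.1, size=5000))
cv_poisson = coefficient_of_variation(poisson_train)
```

      Averaging such single-neuron values across neurons and trials gives a summary that can be compared between networks with and without connectivity structure.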

      (15) Pg. 6: "we also removed the connectivity structure only partially, keeping like-to-like connectivity structure and removing all structure beyond like-to-like" - can you clarify what this means, perhaps using an equation? What connectivity structure is there besides like-to-like?

      In the optimal model, the strength of the synapse between a pair of neurons is proportional to the tuning similarity of the two neurons, Y<sub>ij</sub> proportional to J<sub>ij</sub> for Y<sub>ij</sub> >0 (see Eq. 24 and Fig. 1C(ii)). Besides networks with optimal connectivity, we also tested networks with a simpler connectivity rule. Such a simpler rule prescribes a connection if the pair of neurons has similar tuning (Y<sub>ij</sub> >0), and no connection otherwise. The strength of the connection following this simpler connectivity rule is otherwise random (and not proportional to pairwise tuning similarity Y<sub>ij</sub> as it is in the optimal network). We clarified this in the revision (page 8), also by avoiding the term “like-to-like” for the second type of networks, which could indeed be prone to confusion.
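
      The two rules can be contrasted in a short sketch. It assumes that the tuning similarity Y<sub>ij</sub> is the dot product of the tuning vectors of the two neurons and that the random strengths of the simpler rule are drawn uniformly; both choices, as well as the function names, are illustrative assumptions rather than the exact definitions of Eq. 24.

```python
import numpy as np


def tuning_similarity(w_post, w_pre):
    """Pairwise similarity Y[i, j] between postsynaptic i and presynaptic j."""
    return w_post.T @ w_pre  # w_*: (M, N) matrices of tuning vectors


def optimal_rule(similarity):
    """Strength proportional to similarity where positive, zero otherwise."""
    return np.maximum(similarity, 0.0)


def simpler_rule(similarity, rng):
    """Connection present where similarity is positive, with random strength."""
    random_strength = rng.random(similarity.shape)
    return np.where(similarity > 0, random_strength, 0.0)


rng = np.random.default_rng(2)
M, N_E, N_I = 3, 8, 4
w_E = rng.normal(size=(M, N_E))
w_I = rng.normal(size=(M, N_I))

Y = tuning_similarity(w_I, w_E)        # similarity between I (rows) and E (columns)
J_optimal = optimal_rule(Y)            # graded, similarity-dependent strengths
J_simple = simpler_rule(Y, rng)        # same connected pairs, random strengths
```

      Both matrices connect the same pairs of neurons; they differ only in whether the strength of an existing connection carries information about the degree of tuning similarity.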

      (16) Pgs. 6-7: "we indeed found that optimal coding efficiency is achieved with weak adaptation in both cell types" and "adaptation in E neurons promotes efficient coding because it enforces every spike to be error-correcting" - this was not clear to me. First, it appears as though optimal efficiency is achieved without adaptation nor facilitation, i.e., when the time constants are all equal. Indeed, this is what is stated in Table 1. So is there really a weak adaptation present in the optimal case? Second, it seems that the network already enforces each spike to be error-correcting without adaptation, so why and how would adaptation help with this?

      We agree with the Reviewer that the network without adaptation in E and I neurons is already optimal. It is also true that most spikes in an optimal network should already be error-correcting (besides some spikes that might be caused by the noise). However, regimes with weak adaptation in E neurons remain close to optimality. Spike-triggered facilitation, meanwhile, adds spikes that are unnecessary and decreases network efficiency. We revised Fig. 5 (Fig. 4 in the first submission) and replaced the 2-dimensional plots in Fig. 4C-F with plots that show the differential effect of adaptation in E neurons (top) and in I neurons (bottom) on the encoding error (RMSE), the efficiency (average loss) and the firing rate (Fig. 5B-D). In the new Fig. 5C it is evident that the loss of the E and I populations grows slowly with adaptation in E neurons (top), while it grows faster with adaptation in I neurons (bottom). These considerations are explained in the revised text on page 9.

      (17) Pg. 7: "adaptation in E neurons resulted in an increase of the encoding error in E neurons and a decrease in I neurons" - it would be nice if the authors could provide any explanation or intuition for why this is the case. Could it perhaps be because the E population has fewer spikes, making the signal easier to track for the I population?

      We agree that this could indeed be the case. We commented on it in revision (page 9).

      (18) Pg. 7: "The average balance was precise...with strong adaptation in E neurons, and it got weaker when increasing the adaptation in I neurons (Figure 4E)" - I found the wording of this a bit confusing. Didn't the balance get stronger with larger I time constants?

      By increasing the time constant of I neurons, the average imbalance got weaker (closer to zero) in E neurons (Fig. 5G, left), but stronger (further away from zero) in I neurons (Fig. 5G, right). We have revised the text on page 9 to make this clearer.

      (19) Pg. 7: Figure 4F is not directly described in the text.

      We have now added text (page 9) commenting on this figure in revision.

      (20) Pg. 8: "indicating that the recurrent network dynamics generates substantial variability even in the absence of variability in the external current" -- how does this observation relate to your earlier claim (which I noted above) that "variability of spiking is independent of connectivity structure"?

      We agree that the claim about variability of spiking being independent of connectivity structure was overstated and we thus removed it. The observation that we wanted to report is that both structured and unstructured networks have very similar levels of variability of spiking of single neurons. The fact that much of the variability of the optimal network is generated by recurrent connections is not incompatible. We revised the related text (page 11) for clarity.

      (21) Pg. 9: "We found that in the optimally efficient network, the mean E-I and I-E synaptic efficacy are exactly balanced" - isn't this by design based on the derivation of the network?

      True, the I-E connectivity matrix is the transpose of the E-I connectivity matrix, and their means are the same by the analytical solution. This however remains a finding of our study. We have clarified this in the revised text (page 12).

      (22) Pg. 30, eq. 25: the authors should verify if they include all possible connectivity here, or if they exclude EE connectivity beforehand.

      We now specify that the equation for recurrent connectivity (Eq. 24, Eq. 25 in first submission) does not include the E-E connectivity in the revised text (page 41).

      Reviewer #3 (Recommendations For The Authors):

      Essential

      (1)  Currently, they measure the RMSE and cost of the E and I population separately, and the 1CT model. Then, they average the losses of the E and I populations, and compare that to the 1CT model, with the conclusion that the 1CT model has a higher average loss. However, it seems to me that only the E population should be compared to the 1CT model. The I population loss determines how well the I population can represent the E population representation (which it can do extremely well). But the overall coding accuracy of the network of the input signal itself is only represented by the E population. Even if you do combine the E and I losses, they should be summed, not averaged. I believe a more fair conclusion would be that the E/I networks have generally slightly worse performance because of needing to follow Dale's law, but are still highly efficient and precise nonetheless. Of course, I might be making a critical error somewhere above, and happy to be convinced otherwise!

      We carefully considered the reviewer's comment and tested different ways of combining the losses of the E and I populations. We decided to follow the reviewer's suggestion and to compare the loss of the E population of the E-I model with the loss of the one cell type (1CT) model. As is already evident from Fig. 8G, this comparison indeed changes the result and makes the 1CT model more efficient. Summing the losses of E and I neurons likewise results in the 1CT model being more efficient than the E-I model. Note, however, the robustness of the E-I model to changes in the metabolic constant (Fig. 6C, top). The firing rates of the E-I model stay within physiological ranges for any value of the metabolic constant, while the firing rate of the 1CT model skyrockets when the metabolic constant is lower than optimal (Fig. 8I).

      We added to Results (page 14) a summary of these findings.

      (2) The methods and main text should make much clearer what aspects of the derivation are novel, and which are not novel (see review weaknesses for specifics).

      We specified these aspects, as discussed in more detail in the above reply to point 4 of the public review of Reviewer 1.

      Request:

      If possible, I would like to see the code before publication and give recommendations on that (is it easy to parse and reproduce, etc.)

      We are happy to share the computer code with the reviewer and the community. We added a link to our public repository containing the computer code that we used for simulations and analysis to the preprint and submission (section “Code availability” on page 17). 

      Suggestions:

      (1) I believe that for an eLife audience, the main text is too math-heavy at the beginning, and it could be much simplified, or more effort could be made to guide the reader through the math.

      We tried to do our best to improve the clarity of description of mathematical expressions in the main text.

      (2) Generally vector notation makes network equations for spiking neurons much clearer and easier to parse, I would recommend using that throughout the paper (and not just in the supplementary methods).

      We now use vector notation throughout the paper whenever we think that this improves the intelligibility of the text. 

      (3) In the discussion or at the end of the results adding a clear section summarizing what the minimal requirements or essential assumptions are for biological networks to implement this theory would be helpful for experimentalists and theorists alike.

      We have added such a section in Discussion (page 15). 

      (5) I think the title is a bit too cumbersome and hard to parse. Might I suggest something like 'Efficient coding and energy use in biophysically realistic excitatory-inhibitory spiking networks' or 'Biophysically constrained excitatory-inhibitory spiking networks can efficiently implement efficient coding'.

      We followed the reviewer’s suggestion and changed the title to “Efficient coding in biophysically realistic excitatory-inhibitory spiking networks.”

      (6) How the connections were shuffled exactly was not clear to me in how it was described now. Did they just take the derived connectivity, and shuffle the connections around? I recommend a more explicit methods section on it (I might have missed it).

      Indeed, the connections of the optimal network were randomly shuffled, without repetition, between all neuronal pairs of a specific connectivity matrix. This preserves all properties of the distribution of connectivity weights and removes only the structure of the connectivity, which is precisely what we wanted to test. We have now added a section in Methods (“Removal of connectivity structure”) on pages 51-52, where we explain how the connectivity structure is removed.
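
      The shuffling described above can be illustrated with a minimal sketch. It is not the code from the authors' repository: the function name `shuffle_connectivity`, the toy matrix, and the choice to permute every entry of the matrix (including any zero entries) are assumptions made for illustration only.

```python
import numpy as np

def shuffle_connectivity(weights, rng):
    """Return a copy of `weights` whose entries are randomly reassigned
    to neuronal pairs. Every weight is used exactly once (a permutation,
    i.e. sampling without repetition), so the weight distribution is
    preserved while the structure of the connectivity is destroyed."""
    flat = weights.ravel()
    permuted = rng.permutation(flat)  # permuted copy; input is untouched
    return permuted.reshape(weights.shape)

# Hypothetical structured connectivity matrix (rows: postsynaptic,
# columns: presynaptic); values are illustrative only.
rng = np.random.default_rng(0)
weights = np.outer(np.arange(1, 5), np.arange(1, 6)).astype(float)
shuffled = shuffle_connectivity(weights, rng)
```

      Because the shuffle is a permutation rather than a resampling, summary statistics of the weights (mean, variance, histogram) are identical before and after, so any change in network performance can be attributed to the loss of structure.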

      (7) Figure 1 sub-panel ordering was confusing to read (first up down, then left right). Not sure if re- arranging is possible, but perhaps it could be A, B, and C at the top, with subsublabels (i) and (ii). Might become too busy though.

      We followed this suggestion and rearranged Fig. 1 as suggested by the reviewer.

      (8) Equation 3 in the main text should specify that 'y' stands for either E or I.

      This has been specified in the revision (page 3). 

      (9) Figure 1D shows a rough sketch of the types of connectivities that exist, but I would find it very useful to also see the actual connection strengths and the effect of enforcing Dale's law.

      We revised this figure (now Fig. 1B (ii)) and added connection strengths as well as a sketch of a connection that was removed because of Dale’s law.

      (10) The main text mentions how the readout weights are defined (normal distributions), but I think this should also be mentioned in the methods.

      Agreed. The Methods section “Parametrization of synaptic connectivity” (page 46) explains how the readout weights are defined. We apologize if the pointer to this section was not salient enough in the first submission. We have made sure that the revised main text contains a clear pointer to this Methods section for details.

      (11) The text seems to mix ‘decoding weights’ and ‘readout weights’.

      Thanks for this suggestion to use consistent language. We opted for ‘decoding weights’ and removed ‘readout weights’.

      (12) The way the paper is written makes it quite hard to parse what are new experimental predictions, and what results reproduce known features. I wonder if some sort of 'box' is possible with novel predictions that experimentalists could easily look at and design an experiment around.

      We have now revised the text. For every property of the model, we clarified whether it is a prediction that has not yet been experimentally tested or whether it accounts for previously observed properties of biological neurons. Please see the reply to point 4 of Reviewer 1.

      (13) Typo's etc.:

      Page 5 bottom -- ("all") should have one of the quotes change direction (common latex typo, seems to be the only place with the issue).

      We thank the reviewer for pointing out this typo that has been removed in revision.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      This manuscript details the results of a small pilot study of neoadjuvant radiotherapy followed by combination treatment with hormone therapy and dalpiciclib for early-stage HR+/HER2-negative breast cancer.

      Strengths:

      The strengths of the manuscript include the scientific rationale behind the approach and the inclusion of some simple translational studies.

      Weaknesses:

      The main weakness of the manuscript is that overly strong conclusions are made by the authors based on a very small study of twelve patients. A study this small is not powered to fully characterize the efficacy or safety of a treatment approach, and can, at best, demonstrate feasibility. These data need validation in a larger cohort before they can have any implications for clinical practice, and the treatment approach outlined should not yet be considered a true alternative to standard evidence-based approaches.

      I would urge the authors and readers to exercise caution when comparing results of this 12-patient pilot study to historical studies, many of which were much larger, and had different treatment protocols and baseline patient characteristics. Cross-trial comparisons like this are prone to mislead, even when comparing well powered studies. With such a small sample size, the risk of statistical error is very high, and comparisons like this have little meaning.

      We greatly appreciate your evaluation of our study and fully agree with the limitations you have pointed out. We have clearly stated the limitations of the small sample size and emphasized the need for a larger population to validate our preliminary findings in the discussion section (Lines 311-316).

      We acknowledge that this small sample size is not powered to characterize this regimen as a promising alternative regimen in the treatment of patients with HR-positive, HER2-negative breast cancer. Therefore, we have revised the description of this regimen to serve as a feasible option for neoadjuvant therapy in HR-positive, HER2-negative breast cancers both in the discussion (Lines 317-320) and the abstract (Lines 71-72).

      We agree with you that cross-trial comparisons should be approached with caution due to differences in study designs and patient populations. In our discussion section, we acknowledge that the small sample size limits the comparison of our data with historical data in the literature because of potential bias (Lines 312-313). We clearly state that such comparisons hold limited significance (Lines 313-314) and suggest a larger population to validate our preliminary findings.

      • Why was dalpiciclib chosen, as opposed to another CDK4/6 inhibitor?

      Thank you for your comments. The rationale for selecting dalpiciclib over other CDK4/6 inhibitors in our study is primarily based on the following considerations:

      (1) Clinical Efficacy: In several clinical trials, including DAWNA-1 and DAWNA-2, the combination of dalpiciclib with endocrine therapies such as fulvestrant, letrozole, or anastrozole has been shown to significantly extend the progression-free survival (PFS) in patients with hormone receptor-positive, HER2-negative advanced breast cancer (1-2).

      (2) Tolerability and Management of Adverse Reactions: The primary adverse reactions associated with dalpiciclib are neutropenia, leukopenia, and anemia. Despite these potential side effects, the majority of patients are able to tolerate them, and with proper monitoring and management, these reactions can be effectively mitigated (1-2).

      (3) Comparable pharmacodynamics to other CDK4/6 inhibitors: The combination of CDK4/6 inhibitors, including palbociclib, ribociclib, and abemaciclib, with aromatase inhibitors has demonstrated an enhanced ability to suppress tumor proliferation and increase the rate of clinical response in neoadjuvant therapy for HR-positive, HER2-negative breast cancer (3-5). Furthermore, preclinical studies have shown that dalpiciclib has comparable in vivo and in vitro pharmacodynamic activity to palbociclib, suggesting its potential effectiveness in similar treatment regimens (6).

      (4) Accessibility and Regulatory Approval: Dalpiciclib gained marketing approval in China on December 31, 2021, which facilitates access to this medication and makes it a more convenient option when considering treatment plans.

      References:

      (1) Zhang P, Zhang Q, Tong Z, et al. Dalpiciclib plus letrozole or anastrozole versus placebo plus letrozole or anastrozole as first-line treatment in patients with hormone receptor-positive, HER2-negative advanced breast cancer (DAWNA-2): a multicentre, randomised, double-blind, placebo-controlled, phase 3 trial(J). The Lancet Oncology, 2023, 24(6): 646-657.

      (2) Xu B, Zhang Q, Zhang P, et al. Dalpiciclib or placebo plus fulvestrant in hormone receptor-positive and HER2-negative advanced breast cancer: a randomized, phase 3 trial(J). Nature medicine, 2021, 27(11): 1904-1909.

      (3) Hurvitz S A, Martin M, Press M F, et al. Potent cell-cycle inhibition and upregulation of immune response with abemaciclib and anastrozole in neoMONARCH, phase II neoadjuvant study in HR+/HER2− breast cancer(J). Clinical Cancer Research, 2020, 26(3): 566-580.

      (4) Prat A, Saura C, Pascual T, et al. Ribociclib plus letrozole versus chemotherapy for postmenopausal women with hormone receptor-positive, HER2-negative, luminal B breast cancer (CORALLEEN): an open-label, multicentre, randomised, phase 2 trial(J). The lancet oncology, 2020, 21(1): 33-43.

      (5) Ma C X, Gao F, Luo J, et al. NeoPalAna: neoadjuvant palbociclib, a cyclin-dependent kinase 4/6 inhibitor, and anastrozole for clinical stage 2 or 3 estrogen receptor–positive breast cancer(J). Clinical Cancer Research, 2017, 23(15): 4055-4065.

      (6) Long F, He Y, Fu H, et al. Preclinical characterization of SHR6390, a novel CDK 4/6 inhibitor, in vitro and in human tumor xenograft models(J). Cancer science, 2019, 110(4): 1420-1430.

      • The eligibility criteria are not consistent throughout the manuscript, sometimes saying early breast cancer, other times saying stage II/III by MRI criteria.

      Thank you for pointing out the inconsistency in the eligibility criteria in our manuscript. We deeply apologize for any confusion caused by these inconsistencies. We have revised the term from “early-stage HR-positive, HER2-negative breast cancer” to “early or locally advanced HR-positive, HER2-negative breast cancer” (Lines 128 and 150). The term “early or locally advanced” encompasses two different stages of breast cancer, whereas “Stage II/III by MRI criteria” refers to specific stages within the TNM staging system.

      • The authors should emphasize the 25% rate of conversion from mastectomy to breast conservation and also report the type and nature of axillary lymph node surgery performed. As the authors note in the discussion section, rates of pathologic complete response/RCB scores are less prognostic for hormone-receptor-positive breast cancer than other subtypes, so one of the main rationales for neoadjuvant medical therapy is for surgical downstaging. This is a clinically relevant outcome.

      We appreciate your constructive comments. Based on your suggestions, we have made the following revisions and additions to the article.

      The breast conservation rate serves as a secondary endpoint in our study (Lines 62 and 179). We have highlighted the significant 25% conversion rate from mastectomy to breast conservation in both the results (Lines 229-230) and discussion sections (Lines 290-292).

      In our study, all patients underwent lymph node surgery, including sentinel lymph node biopsy or axillary lymph node dissection. Among them, 58.3% of patients (7/12) underwent sentinel lymph node biopsies.

      We agree with your point that the prognostic value of pathologic complete response/RCB score is lower for hormone receptor-positive breast cancer than for other subtypes. We have revised the discussion section to clarify that one of the principal objectives of neoadjuvant therapy in this patient population is to facilitate downstaging and enhance the rate of breast conservation (Lines 289-290). We also emphasized that this neoadjuvant therapeutic regimen appeared to improve the likelihood of pathological downstaging and of achieving a margin-free resection, particularly for those with locally advanced and high-risk breast cancer (Lines 293-295).

      Reviewer #2 (Public review):

      Firstly, as this is a single-arm preliminary study, we are curious about the order of radiotherapy and endocrine therapy. In addition, given the radiotherapy, we are also concerned about wound recovery after surgery and whether related data were collected.

      Thanks for the comments. The treatment sequence in this study is to first administer radiotherapy, followed by endocrine therapy. A meta-analysis has indicated that concurrent radiotherapy with endocrine therapy does not significantly impact the incidence of radiation-induced toxicity or survival rates compared to a sequential approach (1). In light of preclinical research suggesting enhanced therapeutic efficacy when radiotherapy is delivered prior to CDK4/6 inhibitors, we have opted to administer radiotherapy before the combination therapy of CDK4/6 inhibitors and hormone therapy (2).

      In our study, we collected data on surgical wound recovery. All 12 patients had Class I incisions, which healed by primary intention. The wounds exhibited no signs of redness, swelling, exudate, or fat necrosis.

      References:

      (1) Li Y F, Chang L, Li W H, et al. Radiotherapy concurrent versus sequential with endocrine therapy in breast cancer: A meta-analysis(J). The Breast, 2016, 27: 93-98.

      (2) Petroni G, Buqué A, Yamazaki T, et al. Radiotherapy delivered before CDK4/6 inhibitors mediates superior therapeutic effects in ER+ breast cancer(J). Clinical Cancer Research, 2021, 27(7): 1855-1863.

      Secondly, in the methodology, please describe the sample size estimation of this study and follow up details.

      Thanks for pointing out this crucial omission. Sample size estimation for this study and follow-up details have been added in the methodology section. The section on sample size estimation has been revised to state in Statistical analysis: “This exploratory study involves 12 patients, with the sample size determined based on clinical considerations, not statistical factors (Lines 210-211).” The section on follow up has been revised to state in Procedures section “A 5-year follow-up is conducted every 3 months during the first 2 years, and every 6 months for the subsequent 3 years. Additionally, safety data are collected within 90 days after surgery for subjects who discontinue study treatment (Lines 169-172).”

      Thirdly, in Table 1, the item HER2 expression, it's better to categorise HER2 into 0, 1+, 2+ and FISH-.

      Thank you very much for pointing out this issue. The item HER2 expression in Table 1 has been revised from “negative, 1+, 2+ and FISH-” to “0, 1+, 2+ and FISH-”.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      Summary: The manuscript by Yang et al. describes a new CME accessory protein. CCDC32 has been previously suggested to interact with AP2 and in the present work the authors confirm this interaction and show that it is a bona fide CME regulator. In agreement with its interaction with AP2, CCDC32 recruitment to CCPs mirrors the accumulation of clathrin. Knockdown of CCDC32 reduces the amount of productive CCPs, suggestive of a stabilisation role in early clathrin assemblies. Immunoprecipitation experiments mapped the interaction of CCDC32 to the α-appendage of the AP2 complex α-subunit. Finally, the authors show that the CCDC32 nonsense mutations found in patients with cardio-facial-neuro-developmental syndrome disrupt the interaction of this protein with the AP2 complex. The manuscript is well written and the conclusions regarding the role of CCDC32 in CME are supported by good quality data. As detailed below, a few improvements/clarifications are needed to reinforce some of the conclusions, especially the ones regarding CFNDS.

      Response: We thank the referee for their positive comments. In light of a recently published paper describing CCDC32 as a co-chaperone required for AP2 assembly (Wan et al., PNAS, 2024, see reviewer 2), we have added several additional experiments to address all concerns and consequently gained further insight into CCDC32-AP2 interactions and the important dual role of CCDC32 in regulating CME.

      Major comments:

      1) Why could the protein only be visualized at CCPs after knockdown of the endogenous protein? This is highly unusual, especially in stable cell lines. Could it be that the tag interferes with the function of the expressed protein, rendering it incapable of outcompeting the endogenous protein? Does this point to a regulated recruitment?

      Response: The reviewer is correct, this would be unusual; however, it is not the case. We misspoke in the text (although the figure legend was correct): these experiments were performed without siRNA knockdown, and we can indeed detect eGFP-CCDC32 being recruited to CCPs in the presence of endogenous protein. Nonetheless, we repeated the experiment to be certain.

      2) The disease mutation used in the paper does not correspond to the truncation found in patients. The authors use a 1-54 truncation, but the patients described in Harel et al. have frameshifts at positions 19 (Thr19Tyrfs*12) and 64 (Glu64Glyfs*12), while the patient described in Abdalla et al. has a deletion of two introns, leading to a frameshift around amino acid 90. Moreover, to precisely test the function of these disease mutations, one would need to add the extra amino acids generated by the frameshift. For example, as denoted in the mutation description in Harel et al., the frameshift at position 19 changes Threonine 19 to a Tyrosine and adds a run of 12 extra amino acids (Thr19Tyrfs*12).

      Response: The labels of the disease mutants p.(Thr19Tyrfs∗12) and p.(Glu64Glyfs∗12) are based on a 194 aa polypeptide version of CCDC32 initiated at a nonconventional start site that contains a 9 aa peptide (VRGSCLRFQ) upstream of the N-terminus we show. Thus, we are indeed using the appropriate mutation site (see: https://www.uniprot.org/uniprotkb/Q9BV29/entry). The reviewer is correct that we have not included the extra 12 aa in our construct; however, as these residues are not present in the other CFNDS mutants, we think it unlikely that they contribute to the disease phenotype. Rather, as neither of the clinically observed mutations contains the 78-98 aa sequence required for AP2 binding and CME function, we are confident that this defect contributed to the disease. Thus, we are including the data on the CCDC32(1-54) mutant, as we believe these results provide a valuable physiological context to our studies.

      3) The frameshift caused by the CFNDS mutations (especially the one studied) will likely lead to nonsense mediated RNA decay (NMD). The frameshift is well within the rules where NMD generally kicks in. Therefore, I am unsure about the functional insights of expressing a disease-related protein which is likely not present in patients.

      Response: We thank the reviewer for bringing up this concern. However, as shown in new Figure S1, the mutant protein is expressed at levels comparable to the WT, suggesting that NMD is not occurring.

      4) Coiled coils generally form stable dimers. The typically hydrophobic core of these structures is not suitable for transient interactions. This complicates the interpretation of the results regarding the role of this region as the place where the interaction with AP2 occurs. If the coiled coil holds a stable CCDC32 dimer, disrupting this dimer could reduce the affinity of AP2 (by reduced avidity) for the actual binding site. A construct with an orthogonal dimeriser or a pulldown of the delta78-98 protein with the GST AP2a-AD could be a good way to sort out this issue.

      Response: We were unable to model a stable dimer (or other oligomer) of this protein with high confidence using AlphaFold 3.0. Moreover, we were unable to detect endogenous CCDC32 co-immunoprecipitating with eGFP-CCDC32 (Fig. S6C). Thus, we believe that the moniker, based solely on the alpha-helical content of the protein, is a misnomer. We have explained this in the main text.

      Minor comments:

      1) The authors interchangeably use the term "flat CCPs" and "flat clathrin lattices". While these are indeed related, flat clathrin lattices have been also used to refer to "clathrin plaques". To avoid confusion, I suggest sticking to the term "flat CCPs" to refer to the CCPs which are in their early stages of maturation.

      Response: Agreed. Thank you for the suggestion. We have renamed these structures flat clathrin assemblies, as they do not acquire the curvature needed to classify them as pits, and do not grow to the size that would classify them as plaques.

      Significance

      General assessment: CME drives the internalisation of hundreds of receptors and surface proteins in practically all tissues, making it an essential process for various physiological processes. This versatility comes at the cost of a large number of molecular players and regulators. To understand this complexity, unravelling all the components of this process is vital. The manuscript by Yang et al. gives an important contribution to this effort as it describes a new CME regulator, CCDC32, which acts directly at the main CME adaptor AP2. The link to disease is interesting, but the authors need to refine their experiments. The requirement for endogenous knockdown for recruitment of the tagged CCDC32 is unusual and requires further exploration.

      Advance: The increased frequency of abortive events presented by CCDC32 knockdown cells is very interesting, as it hints to an active mechanism that regulates the stabilisation and growth of clathrin coated pits. The exact way clathrin coated pits are stabilised is still an open question in the field.

      Audience: This is a basic research manuscript. However, given the essential role of CME in physiology and the growing number of CME players involved in disease, this manuscript can reach broader audiences.

      Response: We thank the referee for recognizing the 'interesting' advances our studies have made and for considering these studies as 'an important contribution' to 'an essential process for various physiological processes' and able 'to reach broader audiences'. We have addressed and reconciled the reviewer's concerns in our revised manuscript.

      Field of expertise of the reviewer: Clathrin mediated endocytosis, cell biology, microscopy, biochemistry.


      Reviewer #2

      Evidence, reproducibility and clarity

      In this manuscript, the authors demonstrate that CCDC32 regulates clathrin-mediated endocytosis (CME). Some of the findings are consistent with a recent report by Wan et al. (2024 PNAS), such as the observation that CCDC32 depletion reduces transferrin uptake and diminishes the formation of clathrin-coated pits. The primary function of CCDC32 is to regulate AP2 assembly, and its depletion leads to AP2 degradation. However, this study did not examine AP2 expression levels. CCDC32 may bind to the appendage domain of AP2 alpha, but it also binds to the core domain of AP2 alpha. Overall, while this work presents some interesting ideas, it remains unclear whether CCDC32 regulates AP2 beyond the assembly step.

      Response: We thank the reviewer for drawing our attention to the Wan et al. paper, that appeared while this work was under review. However, our in vivo data are not fully consistent with the report from Wan et al. The discrepancies reveal a dual function of CCDC32 in CME that was masked by complete knockout vs siRNA knockdown of the protein, and also likely affected by the position of the GFP-tag (C- vs N-terminal) on this small protein. Thus:

      • Contrary to Wan et al., we do not detect any loss of AP2 expression (see new Figure S3A-B) upon siRNA knockdown. Most likely the ~40% residual CCDC32 present after siRNA knockdown is sufficient to fulfill its catalytic chaperone function but not its structural role in regulating CME beyond the AP2 assembly step.
      • Contrary to Wan et al., we have shown that CCDC32 indeed interacts with the intact AP2 complex (Figure S3C and 6B,C): all 4 subunits of the AP2 complex co-IP with full-length eGFP-CCDC32. Interestingly, whereas full-length CCDC32 pulls down the intact AP2 complex, the ∆78-98 mutant retains its ability to pull down the β2-µ2 hemicomplex but its interactions with α:σ2 are severely reduced. While this result is consistent with the report of Wan et al. that CCDC32 binds to the α:σ2 hemicomplex, it also suggests that the interactions between CCDC32 and AP2 are more complex and will require further study.
      • Contrary to Wan et al., we provide strong evidence that CCDC32 is recruited to CCPs. Interestingly, modeling with AlphaFold 3.0 identifies a highly probable interaction between alpha helices encoded by residues 66-91 on CCDC32 and residues 418-438 on α. The latter are masked by µ2-C in the closed conformation of the AP2 core, but exposed in the open conformation triggered by cargo binding, suggesting that CCDC32 might only bind to membrane-bound AP2. Thus, our findings are indeed novel and indicate striking multifunctional roles for CCDC32 in CME, making the protein well worth further study.

      • Besides its role in AP2 assembly, CCDC32 may potentially have another function on the membrane. However, there is no direct evidence showing that CCDC32 associates with the plasma membrane.

      Response: We disagree. Our data clearly show that CCDC32 is recruited to CCPs (Fig. 1B) and that CCPs that fail to recruit CCDC32 are short-lived and likely abortive (Fig. 1C). Wan et al. did not observe any colocalization of C-terminally tagged CCDC32 with CCPs, whereas we detect recruitment of our N-terminally tagged construct, which we also show is functional (Fig. 6F). Further, we have demonstrated the importance of the C-terminal region of CCDC32 in membrane association (see new Fig. S7). Thus, we speculate that a C-terminally tagged CCDC32 might not be fully functional. Indeed, SIM images of the C-terminally tagged CCDC32 in Wan et al. show large (~100 nm) structures in the cytosol, which may reflect aggregation.

      CCDC32 binds to multiple regions on AP2, including the core domain. It is important to distinguish the functional roles of these different binding sites.

      Response: We have localized the AP2-ear binding region to residues 78-98 and shown these to be critical for the functions we have identified. As described above, we now include data that are complementary to those of Wan et al. However, our data also clearly point to additional binding modalities. We agree that it will be important to map these additional interactions and identify their functional roles, but this is beyond the scope of this paper.

      AP2 expression levels should be examined in CCDC32 depleted cells. If AP2 is gone, it is not surprising that clathrin-coated pits are defective.

      Response: Agreed and we have confirmed this by western blotting (Figure S3A-B) and detect no reduction in levels of any of the AP2 subunits in CCDC32 siRNA knockdown cells. As stated above this could be due to residual CCDC32 present in the siRNA KD vs the CRISPR-mediated gene KO.

      If the authors aim to establish a secondary function for CCDC32, they need to thoroughly discuss the known chaperone function of CCDC32 and consider whether and how CCDC32 regulates a downstream step in CME.

      Response: Agreed. We have described the Wan et al paper, which came out while our manuscript was in review, in our Introduction. As described above, there are areas of agreement and of discrepancies, which are thoroughly documented and discussed throughout the revised manuscript.

      The quality of Figure 1A is very low, making it difficult to assess the localization and quantify the data.

      Response: The low signal:noise in Fig. 1A that concerns the reviewer is due to a diffuse distribution of CCDC32 on the inner surface of the plasma membrane. We now describe this binding more explicitly. We believe it reflects a specific interaction mediated by the C-terminus of CCDC32; thus, the degree of diffuse membrane binding we observe follows: eGFP-CCDC32(FL) > eGFP-CCDC32(∆78-98) > eGFP-CCDC32(1-54) ~ eGFP/background (see new Fig. S7). Importantly, the colocalization of CCDC32 at CCPs is confirmed by the dynamic imaging of CCPs (Fig. 1B).

      In Figure 6, why aren't AP2 mu and sigma subunits shown?

      Response: Agreed. Not being aware of CCDC32's possible dual role as a chaperone, we had assumed that the AP2 complex was intact. We have now added this data in Figure 6 B,C and Fig. S3C, as discussed above.

      Page 5, top, this sentence is confusing: "their surface area (~17 x 10 nm2) remains significantly less than that required for the average 100 nm diameter CCV (~3.2 x 103 nm2)."

      Response: Thank you for the criticism. We have clarified the sentence and corrected a typo, which would definitely be confusing. The section now reads, "While the flat CCSs we detected in CCDC32 knockdown cells were significantly larger than in control cells (Fig. 4D, mean diameter of 147 nm vs. 127 nm, respectively), they are much smaller than typical long-lived flat clathrin lattices (d ≥ 300 nm) (Grove et al., 2014). Indeed, the surface area of the flat CCSs that accumulate in CCDC32 KD cells (mean ~1.69 × 10⁴ nm²) remains significantly less than the surface area of an average 100 nm diameter CCV (~3.14 × 10⁴ nm²). Thus, we refer to these structures as 'flat clathrin assemblies' because they are neither curved 'pits' nor large 'lattices'. Rather, the flat clathrin assemblies represent early, likely defective, intermediates in CCP formation."
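
      The two surface areas quoted in the revised sentence can be checked with a short calculation. This is a sketch under stated assumptions, not the authors' analysis: it treats a flat CCS as a circular disc of the reported mean diameter and a CCV as a sphere of 100 nm diameter, and the function names are hypothetical. Under those assumptions the results agree with the quoted values of roughly 1.69 × 10⁴ nm² and 3.14 × 10⁴ nm².

```python
import math

def disc_area(diameter_nm):
    """Area of a flat circular disc: pi * r^2 (assumed shape of a flat CCS)."""
    return math.pi * (diameter_nm / 2.0) ** 2

def sphere_surface_area(diameter_nm):
    """Surface area of a sphere: 4 * pi * r^2 = pi * d^2 (assumed shape of a CCV)."""
    return math.pi * diameter_nm ** 2

flat_kd_area = disc_area(147.0)         # flat CCS in CCDC32 knockdown cells
ccv_area = sphere_surface_area(100.0)   # average 100 nm diameter CCV

print(f"flat CCS (KD): {flat_kd_area:.3g} nm^2")
print(f"100 nm CCV:    {ccv_area:.3g} nm^2")
```

      On these assumptions a flat assembly of 147 nm diameter holds only about half the clathrin-coated area needed to enclose a 100 nm vesicle, which is the comparison the revised sentence relies on.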

      Significance

      Please see above.(from above: Overall, while this work presents some interesting ideas, it remains unclear whether CCDC32 regulates AP2 beyond the assembly step)

      Response: Our responses above argue that we have indeed established that CCDC32 regulates AP2 beyond the assembly step. We have also identified several discrepancies between our findings and those reported by Wan et al., most notably binding between CCDC32 and mature AP2 complexes and the AP2-dependent recruitment of CCDC32 to CCPs. It is possible that these discrepancies may be due to the position of the GFP tag (ours is N-terminal, theirs is C-terminal; we show that the N-terminal tagged CCDC32 rescues the knockdown phenotype, while Wan et al., do not provide evidence for functionality of the C-terminal construct).

      Reviewer #3

      Evidence, reproducibility and clarity (Required):

      In this manuscript, Yang et al. characterize the endocytic accessory protein CCDC32, which has implications in cardio-facio-neuro-developmental syndrome (CFNDS). The authors clearly demonstrate that the protein CCDC32 has a role in the early stages of endocytosis, mainly through the interaction with the major endocytic adaptor protein AP2, and they identify regions taking part in this recognition. Through live cell fluorescence imaging and electron microscopy of endocytic pits, the authors characterize the lifetimes of endocytic sites, the formation rate of endocytic sites and pits and the invagination depth, in addition to transferrin receptor (TfnR) uptake experiments. Binding between CCDC32 and CCDC32 mutants to the AP2 alpha appendage domain is assessed by pull down experiments. Together, these experiments allow deriving a phenotype of CCDC32 knock-down and CCDC32 mutants within endocytosis, which is a very robust system, in which defects are not so easily detected. A mutation of CCDC32, known to play a role in CFNDS, is also addressed in this study and shown to have endocytic defects.

      Response: We thank the reviewer for their positive remarks regarding the quality of our data and the strength of our conclusions.

      In summary, the authors present a strong combination of techniques, assessing the impact of CCDC32 in clathrin mediated endocytosis and its binding to AP2, whereby the following major and minor points remain to be addressed:

      • The authors show that CCDC32 depletion leads to the formation of brighter and static clathrin coated structures (Figure 2), but that these were only prevalent to 7.8% and masked the 'normal' dynamic CCPs. At the same time, the authors show that the absence of CCDC32 induces pits with shorter life times (Figure 1 and Figure 2), the 'majority' of the pits. Clarification is needed as to how the authors arrive at these conclusions and these numbers. The authors should also provide (and visualize) the corresponding statistics. The same statement is made again later on in the manuscript, where the authors explain their electron microscopy data. Was the number derived from there?

      These points are critical to understanding CCDC32's role in endocytosis and are key to understanding the model presented in Figure 8. The numbers of how many pits accumulate in flat lattices versus normal endocytosis progression and the actual time scales could be included in this model and would make the figure much stronger.

      Response: Thank you for these comments. We understand the paradox between the visual impression and the reality of our dynamic measurements. We have been visually misled by this in previous work (Chen et al., 2020), which emphasizes the importance of the unbiased image analysis afforded to us by the well-documented cmeAnalysis pipeline, developed by us (Aguet et al., 2013) and now used by many others (e.g., He et al., 2020).

      The % of static structures was not derived from electron microscopy data, but quantified using cmeAnalysis, which automatically provides the lifetime distribution of CCPs. We have now clarified this in the manuscript and added a histogram (Fig. S4) quantifying the fraction of CCPs in lifetime cohorts 150s (static).
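
      As a rough illustration of how such a cohort fraction can be obtained from a list of CCP lifetimes, the Python sketch below counts the share of tracks whose lifetime exceeds a cutoff. The function name, the example lifetimes and the reading of the static cohort as lifetimes above 150 s are assumptions made for illustration only; this is not the cmeAnalysis code.

```python
def static_fraction(lifetimes_s, cutoff_s=150.0):
    """Fraction of CCP tracks with lifetime above cutoff_s (assumed static cohort)."""
    if not lifetimes_s:
        raise ValueError("need at least one lifetime")
    n_static = sum(1 for t in lifetimes_s if t > cutoff_s)
    return n_static / len(lifetimes_s)


# Hypothetical lifetimes in seconds, not data from the study.
example_lifetimes = [12.0, 25.0, 40.0, 65.0, 90.0, 120.0, 160.0, 300.0]
print(f"static fraction: {static_fraction(example_lifetimes):.3f}")
```

      Keeping the cutoff as a parameter makes it easy to check how sensitive the reported fraction is to the exact cohort boundary.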

      • In relation to the above point, the statistics of Figure 2E-G and the analysis leading there should also be explained in more detail: For example, what are the individual points in the plot (also in Figures 6G and 7G)? The authors should also use a few phrases to explain software they use, for example DASC, in the main text.

      Response: Each point in these bar graphs represents a movie, where n ≥ 12. These details have been added to the respective figure legends. We have also added a brief description of DASC analysis in the text.

      • There are several questions related to the knock-down experiments that need to be addressed:

      Firstly, knock-down of CCDC32 does not seem to be very strong (Figure S2B). Can the level of knock-down be quantified?

      Response: We have now quantified the KD efficiency. It is ~60%. This turns out to be fortuitous (see responses to reviewer 2), as a recent publication, which came out after we completed our study, has shown by CRISPR-mediated knockout that CCDC32 also plays an essential chaperone function required for AP2 assembly. We do not see any reduction in AP2 levels or its complex formation under our conditions (see new Supplemental Figure S3), which suggests that the effects of CCDC32 on CCP dynamics are more sensitive to CCDC32 concentration than its role as a chaperone. Our phenotypes would have been masked by more efficient depletion of CCDC32.

      In page 6 it is indicated that the eGFP-CCDC32(1-54) and eGFP-CCDC32(∆78-98) constructs are siRNA-resistant. However in Fig S2B, these proteins do not show any signal in the western blot, so it is not clear if they are expressed or simply not detected by the antibody. The presence of these proteins after silencing endogenous CCDC32 needs to be confirmed to support Figures 6 and Figures 7, which critically rely on the presence of the CCDC32 mutants.

      Response: Unfortunately, the C-terminally truncated CCDC32 proteins are not detected because they lack the antibody epitope; indeed, even the ∆78-98 deletion is poorly detected (compare the GFP blot in new S1A with the anti-CCDC32 blot in S1B). However, these constructs contain the same siRNA-resistance mutation as the full-length protein. That they are expressed and siRNA resistant can be seen in Fig. S2A (now Fig. S1A) blotting for GFP.

      In Figures 6 and 7, siRNA knock-down of CCDC32 is only indicated for sub-figures F to G. Is this really the case? If not, the authors should clarify. The siRNA knock-down in Figure 1 is also only mentioned in the text, not in the figure legend. The authors should pay attention to make their figure legends easy to understand and unambiguous.

      Response: No, it is not the case. Thank you for pointing out the uncertainty. We have added these details to the Figure legends and checked all Figure legends to ensure that they clearly describe the data shown.

      • It is not exactly clear how the curves in Figure 3C (lower panel) on the invagination depth were obtained. Can the authors clarify this a bit more? For example, what are kT and kE in Figure 3A? What is I0? And how did the authors derive the logarithmic function used to quantify the invagination depth? In the main text, the authors say that the traces were 'logarithmically transformed'. This is not a technical term. The authors should refer to the actual equation used in the figure.

      Response: This analysis was developed by the Kirchhausen lab (Saffarian and Kirchhausen, 2008). We have added these details and reference them in the Figure legend and in the text. We also now use the more accurate descriptor 'log-transformed'.

      • In the discussion, the claim 'The resulting dysregulation of AP2 inhibits CME, which further results in the development of CFNDS.' is maybe a bit too strong of a statement. Firstly, because the authors show themselves that CME is perturbed, but by no means inhibited. Secondly, the molecular link to CFNDS remains unclear. Even though CCDC32 mutants seem to be responsible for CFNDS and one of the mutant has been shown in this study to have a defect in endocytosis and AP2 binding, a direct link between CCDC32's function in endocytosis and CFNDS remains elusive. The authors should thus provide a more balanced discussion on this topic.

      Response: We have modified and softened our conclusions, which now read that the phenotypes we see likely "contribute to" rather than "cause" the disease.

      • In Figure S1, the authors annotate the presence of a coiled-coil domain, which they also use later on in the manuscript to generate mutations. Could the authors specify (and cite) where and how this coiled-coil domain has been identified? Is this predicted helix indeed a coiled-coil domain, or just a helix, as indicated by the authors in the discussion?

      Response: See response to Reviewer 1, point 4. We have changed this wording to alpha-helix. The 'coiled-coil' reference is historical and unlikely to be a true reflection of CCDC32 structure. AlphaFold 3.0 predictions were unable to identify with certainty any coiled-coil structures, even if we modelled potential dimers or trimers; and we find no evidence of dimerization of CCDC32 in vivo. We have clarified this in the text.

      Minor comments

      • In general, a more detailed explanation of the microscopy techniques used and the information they report would be beneficial to provide access to the article also to non-expert readers in the field. This concerns particularly the analysis methods used, for example: How were the cohort-averaged fluorescence intensity and lifetime traces obtained? How do the tools cmeAnalysis and DASC work? A brief explanation would be helpful.

      Response: We have expanded Methods to add these details, and also described them in the main text.

      • The axis label of Figure 2B is not quite clear. What does 'TfnR uptake % of surface bound' mean? Maybe the authors could explain this in more detail in the figure legend? Is the drop in uptake efficiency also accessible by visual inspection of the images? It would be interesting to see that.

      Response: This is a standard measure of CME efficiency. 'TfnR uptake % of surface bound' = Internalized TfnR/Surface bound TfnR. Again, images may be misleading as defects in CME lead to increased levels of TfnR on the cell surface, which in turn would result in more Tfn uptake even if the rate of CME is decreased.
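
      To make this ratio concrete, the short Python sketch below computes uptake as a percentage of surface-bound TfnR and shows why raw images can mislead. The function name and the signal values are hypothetical and chosen only for illustration; they are not measurements from this study.

```python
def tfnr_uptake_percent(internalized: float, surface_bound: float) -> float:
    """Return internalized TfnR as a percentage of surface-bound TfnR."""
    if surface_bound <= 0:
        raise ValueError("surface-bound signal must be positive")
    return 100.0 * internalized / surface_bound


# Hypothetical signals (arbitrary units), not data from the study.
control = tfnr_uptake_percent(internalized=30.0, surface_bound=100.0)
knockdown = tfnr_uptake_percent(internalized=36.0, surface_bound=200.0)

# The knockdown cell internalizes more Tfn in absolute terms (36 vs 30),
# yet its uptake efficiency is lower because more TfnR sits on the surface.
print(f"control: {control:.1f}%  knockdown: {knockdown:.1f}%")
```

      In this made-up case an image of internalized Tfn would look brighter in the knockdown cell even though the normalized uptake is lower, which is why the normalized measure is preferred.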

      • Figure 4: How is the occupancy of CCPs in the plasma membrane measured? What are the criteria used to divide CCSs into Flat, Dome or Sphere categories?

      Response: We have expanded Methods to add these details. Based on the degree of invagination, the shapes of CCSs were classified as either: flat CCSs with no obvious invagination; dome-shaped CCSs that had a hemispherical or less invaginated shape with visible edges of the clathrin lattice; and spherical CCSs that had a round shape in which the edges of the clathrin lattice were not visible in 2D projection images. In most cases, the shapes were obvious in 2D PREM images. In uncertain cases, the degree of CCS invagination was determined using images tilted at ±10-20 degrees. The areas of CCSs were measured using ImageJ and used for the calculation of the CCS occupancy on the plasma membrane.
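
      The occupancy calculation can be sketched in a few lines of Python. This is a minimal sketch, assuming that occupancy is the summed CCS area divided by the imaged plasma membrane area, expressed as a percentage; the function name, the units and the example areas are hypothetical, and the shape classification itself was done by visual inspection and is not modelled here.

```python
def ccs_occupancy_percent(ccs_areas_um2, membrane_area_um2):
    """Percentage of the imaged membrane area covered by CCSs (assumed definition)."""
    if membrane_area_um2 <= 0:
        raise ValueError("membrane area must be positive")
    return 100.0 * sum(ccs_areas_um2) / membrane_area_um2


# Hypothetical per-structure areas (square micrometres), e.g. exported from ImageJ.
example_areas = [0.5, 0.25, 0.25]
occupancy = ccs_occupancy_percent(example_areas, membrane_area_um2=50.0)
print(f"CCS occupancy: {occupancy:.1f}%")
```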

      • Figure 5B: Can the authors explain, where exactly the GFP was engineered into AP2 alpha? This construct does not seem to be explained in the methods section.

      Response: We have added this information. The construct, which corresponds to an insertion of GFP into the flexible hinge region of AP2, at aa649, was first described by (Mino et al., 2020) and shown to be fully functional. This information has been added to the Methods section.

      • Figure S1B: The authors should indicate the colour code used for the structural model.

      Response: We have expanded our structural modeling using AlphaFold 3.0 in light of the recent publication suggesting that CCDC32 interacts with the µ2 subunit and does not bind full-length AP2. These results are described in the text. The color coding now reflects certainty values given by AlphaFold 3.0 (Fig. S6B, D).

      • The list of primers referred to in the materials and methods section does not exist. There is a Table S1, but this contains different data. The actual Table S1 is not referenced in the main text. This should be done.

      Response: We apologize for this error. We have now added this information in Table S2.

      Significance (Required):

      In this study, the authors analyse a so-far poorly understood endocytic accessory protein, CCDC32, and its implication for endocytosis. The experimental tool set used, allowing to quantify CCP dynamics and invagination is clearly a strength of the article that allows assessing the impact of an accessory protein towards the endocytic uptake mechanism, which is normally very robust towards mutations. Only through this detailed analysis of endocytosis progression could the authors detect clear differences in the presence and absence of CCDC32 and its mutants. If the above points are successfully addressed, the study will provide very interesting and highly relevant work allowing a better understanding of the early phases in CME with implication for disease.

      The study is thus of potential interest to an audience interested in CME, in disease and its molecular reasons, as well as for readers interested in intrinsically disordered proteins to a certain extent, claiming thus a relatively broad audience. The presented results may initiate further studies of the so-far poorly understood and less well known accessory protein CCDC32.

      Response: We thank the reviewer for their positive comments on the significance of our findings and the importance of our detailed phenotypic analysis made possible by quantitative live cell microscopy. We also believe that our new structural modeling of CCDC32 and our findings of complex and extensive interactions with AP2 make the reviewer's point regarding intrinsically disordered proteins even more interesting and relevant to a broad audience. We trust that our revisions indeed address the reviewer's concerns.

      The field of expertise of the reviewer is structural biology, biochemistry and clathrin mediated endocytosis. Expertise in cell biology is rather superficial.


      References:

      Aguet, F., C.N. Antonescu, M. Mettlen, S.L. Schmid, and G. Danuser. 2013. Advances in Analysis of Low Signal-to-Noise Images Link Dynamin and AP2 to the Functions of an Endocytic Checkpoint. Developmental Cell. 26:279-291.

      Chen, Z., R.E. Mino, M. Mettlen, P. Michaely, M. Bhave, D.K. Reed, and S.L. Schmid. 2020. Wbox2: A clathrin terminal domain-derived peptide inhibitor of clathrin-mediated endocytosis. Journal of Cell Biology. 219.

      Grove, J., D.J. Metcalf, A.E. Knight, S.T. Wavre-Shapton, T. Sun, E.D. Protonotarios, L.D. Griffin, J. Lippincott-Schwartz, and M. Marsh. 2014. Flat clathrin lattices: stable features of the plasma membrane. Mol Biol Cell. 25:3581-3594.

      He, K., E. Song, S. Upadhyayula, S. Dang, R. Gaudin, W. Skillern, K. Bu, B.R. Capraro, I. Rapoport, I. Kusters, M. Ma, and T. Kirchhausen. 2020. Dynamics of Auxilin 1 and GAK in clathrin-mediated traffic. J Cell Biol. 219.

      Mino, R.E., Z. Chen, M. Mettlen, and S.L. Schmid. 2020. An internally eGFP-tagged α-adaptin is a fully functional and improved fiduciary marker for clathrin-coated pit dynamics. Traffic. 21:603-616.

      Saffarian, S., and T. Kirchhausen. 2008. Differential evanescence nanometry: live-cell fluorescence measurements with 10-nm axial resolution on the plasma membrane. Biophys J. 94:2333-2342.

    1. Acknowledgments

      The authors acknowledge the COBE SST2 data provided by the NOAA/OAR/ESRL (PSL, Boulder, Colorado, USA), obtained from their website at https://psl.noaa.gov/data/gridded/data.cobe2.html, and the public IBTrACS database provided by the National Oceanic and Atmospheric Administration. Also, A.P-A. acknowledges the support from UVigo PhD grants. J.C.F-A. and R.S. acknowledge the support from the Xunta de Galicia (Galician Regional Government).

      References

      Aiyyer, A. & Thorncroft, C. 2006. “Climatology of vertical shear over the tropical Atlantic”. Journal of Climate, 19: 2969-2983, ISSN: 0894-8755, DOI: 10.1175/JCLI3685.1.

      Andrews, D. G.; Holton, J. R. & Leovy, C. B. 1987. Middle Atmosphere Dynamics. 1st ed., vol. 40, United Kingdom: Academic Press, 489p., ISBN: 9780080511672, Available: <https://www.sciencedirect.com/bookseries/international-geophysics/vol/40/suppl/C>, [Consulted: February 10, 2021].

      Arora, K. & Dash, P. 2016. “Towards Dependence of Tropical Cyclone Intensity on Sea Surface Temperature and Its Response in a Warming World”. Climate, 4(2): 30, ISSN: 2225-1154, DOI: 10.3390/cli4020030.

      Bhatia, K. T.; Vecchi, G. A.; Knutson, T. R.; Murakami, H.; Kossin, J.; Dixon, K. W. & Whitlock, C. E. 2019. “Recent increases in tropical cyclone intensification rates”. Nature Communications, 10: 635, ISSN: 2041-1723, DOI: 10.1038/s41467-019-08471-z.

      Bister, M. & Emanuel, K. A. 2002. “Low frequency variability of tropical cyclone potential intensity 1. Interannual to interdecadal variability”. Journal Geophysical Research Atmosphere, 107(D24): 4801, ISSN: 2169-8996, DOI: 10.1029/2001JD000776.

      Camargo, S. J.; Emanuel, K. A. & Sobel, A. H. 2007. “Use of a Genesis Potential Index to Diagnose ENSO Effects on Tropical Cyclone Genesis”. Journal of Climate, 20: 4819-4834, ISSN: 0894-8755, DOI: 10.1175/JCLI4282.1.

      Caron, L.; Boudreault, M. & Bruyère, C. L. 2015. “Changes in large-scale controls of Atlantic tropical cyclone activity with the phases of the Atlantic multidecadal oscillation”. Climate Dynamics, 44: 1801-1821, ISSN: 1432-0894, DOI: 10.1007/s00382-014-2186-5.

      Chang, E. K. M. & Guo, Y. 2007. “Is the number of North Atlantic tropical cyclones significantly underestimated prior to the availability of satellite observations?”. Geophysical Research Letters, 34: L14801, ISSN: 1944-8007, DOI: 10.1029/2007GL030169.

      Chiang, J. C. H. & Vimont, D. J. 2004. “Analogous meridional modes of atmosphere-ocean variability in the tropical Pacific and tropical Atlantic”. Journal of Climate, 17(21): 4143-4158, ISSN: 0894-8755, DOI: 10.1175/JCLI4953.1.

      Cione, J. J. & Uhlhorn, E. W. 2003. “Sea Surface Temperature Variability in Hurricanes: Implications with Respect to Intensity Change”. Monthly Weather Review, 131(8): 1783-1796, ISSN: 1520-0493, DOI: 10.1175//2562.1.

      Dare, R. A. & McBride, J. L. 2011. “The threshold sea surface temperature condition for tropical cyclogenesis”. Journal of Climate, 24: 4570-4576, ISSN: 0894-8755, DOI: 10.1175/JCLI-D-10-05006.1.

      DeMaria, M.; Knaff, J. A. & Connell, B. H. 2001. “A Tropical Cyclone Genesis Parameter for the Tropical Atlantic”. Weather and Forecasting, 16: 219-233, ISSN: 1520-0434, DOI: 10.1175/1520-0434(2001)016<0219:ATCGPF>2.0.CO;2.

      Deser, C.; Alexander, M. A.; Xie, S.-P. & Phillips, A. S. 2010. “Sea surface temperature variability: Patterns and mechanisms”. Annual Review of Marine Science, 2: 115-143, ISSN: 1941-0611, DOI: 10.1146/annurev-marine-120408-151453.

      Elsner, J. B. 2003. “Tracking hurricanes”. Bulletin of the American Meteorological Society, 84: 353-356, ISSN: 1520-0477, DOI: 10.1175/BAMS-84-3-353.

      Emanuel, K. A. 2007. “Environmental factors affecting tropical cyclone power dissipation”. Journal of Climate, 20: 5497-5509, ISSN: 0894-8755, DOI: 10.1175/2007JCLI1571.1.

      Emanuel, K. A. 2013. “Downscaling CMIP5 climate models shows increased tropical cyclone activity over the 21st century”. Proceedings of the National Academy of Sciences, 110: 12219-12224, ISSN: 1091-6490, DOI: 10.1073/pnas.1301293110.

      Enfield, D. B.; Mestas-Nunez, A. M. & Trimble, P. J. 2001. “The Atlantic Multidecadal Oscillation and its relationship to rainfall and river flows in the continental U.S”. Geophysical Research Letters, 28: 2077-2080, ISSN: 1944-8007, DOI: 10.1029/2000GL012745.

      Enfield, D. B.; Mestas, A. M.; Mayer, D. A. & Cid-Serrano, L. 1999. “How ubiquitous is the dipole relationship in tropical Atlantic sea surface temperatures?”. Journal of Geophysical Research Ocean, 104: 7841-7848, ISSN: 2169-9291, DOI: 10.1029/1998JC900109.

      Fraza, E. & Elsner, J. B. 2015. “A climatological study of the effect of sea-surface temperature on North Atlantic hurricane intensification”. Physical Geography, 36(5): 395-407, ISSN: 1930-0557, DOI: 10.1080/02723646.2015.1066146.

      Goldenberg, S. B.; Landsea, C. W.; Mestas-Nuñez, A. M. & Gray, W. M. 2001. “The Recent Increase in Atlantic Hurricane Activity: Causes and Implications”. Science, 293: 474-479, ISSN: 1095-9203, DOI: 10.1126/science.1060040.

      Gray, W. M. 1968. “Global view of the origin of tropical disturbances and storms”. Monthly Weather Review, 96(10): 669-700, ISSN: 1520-0493, DOI: 10.1175/1520-0493(1968)096<0669:GVOTOO>2.0.CO;2.

      Gray, W. M. 1984. “Atlantic seasonal hurricane frequency. Part I: El Niño and 30 mb quasi-biennial oscillation influences”. Monthly Weather Review, 112(9): 1649-1668, ISSN: 1520-0493, DOI: 10.1175/1520-0493(1984)112<1649:ASHFPI>2.0.CO;2.

      Hakkinen, S. & Rhines, P. B. 2004. “Decline of subpolar North Atlantic gyre circulation during the 1990s”. Science, 304: 555-559, ISSN: 1095-9203, DOI: 10.1126/science.1094917.

      Hakkinen, S. & Rhines, P. B. 2009. “Shifting surface currents in the northern North Atlantic Ocean”. Journal Geophysical Research, 114: C04005, ISSN: 2169-9291, DOI: 10.1029/2008JC004883.

      Held, I. M. & Soden, B. J. 2006. “Robust responses of the hydrological cycle to global warming”. Journal of Climate, 19: 5686-5699, ISSN: 0894-8755, DOI: 10.1175/JCLI3990.1.

      Hirahara, S.; Ishii, M. & Fukuda, Y. 2014. “Centennial-scale sea surface temperature analysis and its uncertainty”. Journal of Climate, 27: 57-75, ISSN: 0894-8755, DOI: 10.1175/JCLI-D-12-00837.1.

      Hurrell, J. W. 1995. “Decadal trends in the North Atlantic Oscillation and relationships to regional temperature and precipitation”. Science, 269: 676-679, ISSN: 1095-9203.

      Jiang, H.; Halverson, J. B. & Zipser, E. J. 2008. “Influence of environmental moisture on TRMM-derived tropical cyclone precipitation over land and ocean”. Geophysical Research Letters, 35: L17806, ISSN: 1944-8007, DOI: 10.1029/2008GL034658.

      Jones, P. D.; Jónsson, T. & Wheeler, D. 1997. “Extension to the North Atlantic Oscillation using early instrumental pressure observations from Gibraltar and South-West Iceland”. International Journal of Climatology, 17: 1433-1450, ISSN: 1097-0088, DOI: 10.1002/(SICI)1097-0088(19971115)17:13<1433::AID-JOC203>3.0.CO;2-P.

      Keith, E. & Xie, L. 2009. “Predicting Atlantic Tropical Cyclone Seasonal Activity in April”. Weather and Forecasting, 24: 436-455, ISSN: 1520-0434, DOI: 10.1175/2008WAF2222139.1.

      Killick, R.; Fearnhead, P. & Eckley, I. A. 2012. “Optimal detection of change points with a linear computational cost”. Journal of the American Statistical Association, 107(500): 1590-1598, ISSN: 1537-274X, DOI: 10.1080/01621459.2012.737745.

      Klotzbach, P. J. 2010. “On the Madden-Julian oscillation-Atlantic hurricane relationship”. Journal of Climate, 23: 282-293, ISSN: 0894-8755, DOI: 10.1175/2009JCLI2978.1.

      Klotzbach, P. J. & Gray, V. M. 2008. “Multidecadal variability in North Atlantic tropical cyclone activity”. Journal of Climate, 21: 3929-3935, ISSN: 0894-8755, DOI: 10.1175/2008JCLI2162.1.

      Knaff, J. A. 1998. “Predicting summertime Caribbean pressure in early April”. Weather and Forecasting, 13: 740-752, ISSN: 1520-0434, DOI: 10.1175/1520-0434(1998)013<0740:PSCPIE>2.0.CO;2.

      Knapp, K. R.; Kruk, M. C.; Levinson, D. H.; Diamond, H. J. & Neumann, C. J. 2010. “The International Best Track Archive for Climate Stewardship (IBTrACS): Unifying tropical cyclone best track data”. Bulletin of the American Meteorological Society, 91: 363-376, ISSN: 1520-0477, DOI: 10.1175/2009BAMS2755.1.

      Kossin, J. P.; Camargo, S. J. & Sitkowski, M. 2010. “Climate modulation of North Atlantic hurricane tracks”. Journal of Climate, 23: 3057-3076, ISSN: 0894-8755, DOI: 10.1175/2010JCLI3497.1.

      Kossin, J. P.; Olander, T. L. & Knapp, K. R. 2013. “Trend Analysis with a New Global Record of Tropical Cyclone Intensity”. Journal of Climate, 26: 9960-9976, ISSN: 0894-8755, DOI: 10.1175/JCLI-D-13-00262.1.

      Kossin, J.; Emanuel, K. & Vecchi, G. 2014. “The poleward migration of the location of tropical cyclone maximum intensity”. Nature, 509: 349-352, ISSN: 1476-4687, DOI: 10.1038/nature13278.

      Knutson, T. R.; Sirutis, J. J.; Zhao, M.; Tuleya, R. E.; Bender, M.; Vecchi, G. A.; Villarini, G. & Chavas, D. 2015. “Global projections of intense tropical cyclone activity for the late twenty-first century from dynamical downscaling of CMIP5/RCP4.5 scenarios”. Journal of Climate, 28(18): 7203-7224, ISSN: 0894-8755, DOI: 10.1175/jcli-d-15-0129.1.

      Krishnamurthy, L.; Vecchi, G. A.; Msadek, R.; Murakami, H.; Wittenberg, A. & Zeng, F. 2016. “Impact of Strong ENSO on Regional Tropical Cyclone Activity in a High-Resolution Climate Model in the North Pacific and North Atlantic Oceans”. Journal of Climate, 29: 2375-2394, ISSN: 0894-8755, DOI: 10.1175/JCLI-D-15-0468.1.

      Lim, Y.; Schubert, S. D.; Reale, O.; Molod, A. M.; Suarez, M. J. & Auer, B. M. 2016. “Large-Scale Controls on Atlantic Tropical Cyclone Activity on Seasonal Time Scales”. Journal of Climate, 29: 6727-6749, ISSN: 0894-8755, DOI: 10.1175/JCLI-D-16-0098.1.

      Lim, Y. K.; Schubert, S. D.; Kovach, R.; Molod, A. M. & Pawson, S. 2018. “The Roles of Climate Change and Climate Variability in the 2017 Atlantic Hurricane Season”. Scientific Reports, 8: 16172, ISSN: 2045-2322, DOI: 10.1038/s41598-018-34343-5.

      Lin, I.-I.; Camargo, S. J.; Patricola, C. M.; Boucharel, J.; Chand, S.; Klotzbach, P.; Chan, J. C. L.; Wang, B.; Chang, P.; Li, T. & Jin, F. F. 2020. ENSO and Tropical Cyclones. In McPhaden, M. J.; Santoso, A. & Cai, W. (eds). El Niño Southern Oscillation in a Changing Climate. United States of America: American Geophysical Union (AGU), ISBN: 9781119548164, DOI: 10.1002/9781119548164.ch17.

      Liu, M.; Vecchi, G. A.; Smith, J. A. & Knutson, T. R. 2019. “Causes of large projected increases in hurricane precipitation rates with global warming”. npj Climate and Atmospheric Science, 2(1): 1-5, ISSN: 23973722, DOI: 10.1038/s41612-019-0095-3.

      Loader, C. R. 1999. “Bandwidth Selection: Classical or Plug-In?”. The Annals of Statistics, 27(2): 415-438, ISSN: 00905364.

      Mendelsohn, R.; Emanuel, K. A.; Chonabayashi, S. & Bakkensen, L. 2012. “The impact of climate change on global tropical cyclone damage”. Nature Climate Change, 2: 205-209, ISSN: 1758-6798, DOI: 10.1038/nclimate1357.

      Molinari, J.; Knight, D.; Dickinson, M.; Vollaro, D. & Skubis, S. 1997. “Potential vorticity, easterly waves, and eastern Pacific tropical cyclogenesis”. Monthly Weather Review, 125: 2699-2708, ISSN: 1520-0493, DOI: 10.1175/1520-0493(1997)125<2699:PVEWAE>2.0.CO;2.

      Montgomery, M. T. 2016. Recent Advances in Tropical Cyclogenesis. In Mohanty, U. C. & Gopalakrishnan, S. G. (eds). Advanced Numerical Modeling and Data Assimilation Techniques for Tropical Cyclone Prediction. Switzerland: Springer, ISBN: 978-94-024-0895-9, DOI: 10.5822/978-94-024-0896-6_22.

      Murakami, H.; Li, T. & Hsu, P. 2014. “Contributing Factors to the Recent High Level of Accumulated Cyclone Energy (ACE) and Power Dissipation Index (PDI) in the North Atlantic”. Journal of Climate, 27: 3023-3034, ISSN: 0894-8755, DOI: 10.1175/JCLI-D-13-00394.1.

      Naujokat, B. 1986. “An update of the observed quasi-biennial oscillation of the stratospheric winds over the tropics”. Journal of Atmospheric Science, 43: 1873-1877, ISSN: 1520-0469, DOI: 10.1175/1520-0469(1986)043<1873:AUOTOQ>2.0.CO;2.

      Neumann, C. J. 1993. Global climatology. Global Guide to Tropical Cyclone Forecasting, (ser. WMO/TD No. 560, Rep. TCP-31), Technical Document, Geneva: World Meteorological Organization. Available: <https://library.wmo.int/index.php?lvl=notice_display&id=305#.YR_jPVuxXeM>, [Consulted: February 15, 2021].

      Noy, I. 2016. “Tropical storms: the socioeconomics of cyclones”. Nature Climate Change, 6: 343, ISSN: 1758-6798, DOI: 10.1038/nclimate2975.

      Park, W. & Latif, M. 2005. “Ocean dynamics and the nature of air-sea interactions over the North Atlantic at decadal timescales”. Journal of Climate, 18: 982-95, ISSN: 0894-8755, DOI: 10.1175/JCLI-3307.1.

      Pazos, M. & Gimeno, L. 2017. “Identification of moisture sources in the Atlantic Ocean for cyclogenesis processes”. In: 1st International Electronic Conference on Hydrological Cycle (ChyCle-2017). Sciforum Electronic Conference Series, Vol. 1, Basel, Switzerland: MDPI, DOI: 10.3390/CHyCle-2017-04862.

      Penland, C. & Matrosova, L. 1998. “Prediction of tropical Atlantic sea surface temperatures using Linear Inverse Modeling”. Journal of Climate, 11(3): 483-496, ISSN: 0894-8755, DOI: 10.1175/1520-0442(1998)011<0483:POTASS>2.0.CO;2.

      Pérez-Alarcón, A.; Sorí, R.; Fernández-Alvarez, J. C.; Nieto, R. & Gimeno, L. 2020. “Moisture Sources for Tropical Cyclones Genesis in the Coast of West Africa through a Lagrangian Approach”. Environmental Sciences Proceedings, 4: 3, ISSN: 2673-4931, DOI: 10.3390/ecas2020-08126.

      Saffir, H. S. 1973. “Hurricane wind and storm surge”. Military Engineering, 65(423): 4-5, ISSN: 00263982.

      Scott, A. J. & Knott, M. 1974. “A Cluster Analysis Method for Grouping Means in the Analysis of Variance”. Biometrics, 30(3): 507-512, ISSN: 0006341X.

      Shen, W. X.; Tuleya, R. E. & Ginis, I. 2000. “A sensitivity study of the thermodynamic environment on GFDL model hurricane intensity: Implications for global warming”. Journal of Climate, 13: 109-121, ISSN: 0894-8755, DOI: 10.1175/1520-0442(2000)013<0109:ASSOTT>2.0.CO;2.

      Simpson, R. H. 1974. “The hurricane disaster-potential scale”. Weatherwise, 27: 169-186, ISSN: 1940-1310, DOI: 10.1080/00431672.1974.9931702.

      Smith, C. A. & Sardeshmukh, P. 2000. “The Effect of ENSO on the Intraseasonal Variance of Surface Temperature in Winter”. International Journal of Climatology, 20: 1543-1557, ISSN: 1097-0088, DOI: 10.1002/1097-0088(20001115)20:13<1543::AID-JOC579>3.0.CO;2-A.

      Tang, B. H. & Neelin, J. D. 2004. “ENSO influence on Atlantic hurricanes via tropospheric warming”. Geophysical Research Letters, 31: L24204, ISSN: 1944-8007, DOI: 10.1029/2004GL021072.

      Toggweiler, J. R. & Russell, J. 2008. “Ocean circulation in a warming climate”. Nature, 451: 286-288, ISSN: 1476-4687, DOI: 10.1038/nature06590.

      Vecchi, G. A. & Knutson, T. R. 2008. “On Estimates of Historical North Atlantic Tropical Cyclone Activity”. Journal of Climate, 21(14): 3580-3600, ISSN: 0894-8755, DOI: 10.1175/2008JCLI2178.1.

      Vecchi, G. & Soden, B. 2007. “Effect of remote sea surface temperature change on tropical cyclone potential intensity”. Nature, 450: 1066-1070, ISSN: 1476-4687, DOI: 10.1038/nature06423.

      Vimont, J. P. & Kossin, J. P. 2007. “The Atlantic meridional mode and hurricane activity”. Geophysical Research Letters, 34: L07709, ISSN: 1944-8007, DOI: 10.1029/2007GL029683.

      Wang, X.; Liu, H. & Foltz, G. R. 2017. “Persistent influence of tropical North Atlantic wintertime sea surface temperature on the subsequent Atlantic hurricane season”. Geophysical Research Letters, 44: 7927-7935, ISSN: 1944-8007, DOI: 10.1002/2017GL074801.

      Wehner, M.; Prabhat; Reed, K. A.; Stone, D.; Collins, W. D. & Bacmeister, J. 2015. “Resolution Dependence of Future Tropical Cyclone Projections of CAM5.1 in the U.S. CLIVAR Hurricane Working Group Idealized Configurations”. Journal of Climate, 28: 3905-3925, ISSN: 0894-8755, DOI: 10.1175/JCLI-D-14-00311.1.

      Xie, L.; Yan, T.; Pietrafesa, L. J.; Morrison, J. M. & Karl, T. 2005. “Climatology and Interannual Variability of North Atlantic Hurricane Tracks”. Journal of Climate, 18: 5370-5381, ISSN: 0894-8755, DOI: 10.1175/JCLI3560.1.

      Xu, J.; Wang, Y. & Tan, Z. 2016. “The Relationship between Sea Surface Temperature and Maximum Intensification Rate of Tropical Cyclones in the North Atlantic”. Journal of Atmospheric Sciences, 73: 4979-4988, ISSN: 1520-0469, DOI: 10.1175/JAS-D-16-0164.1.

      Ye, M.; Wu, J.; Liu, W.; He, X. & Wang, C. 2020. “Dependence of tropical cyclone damage on maximum wind speed and socioeconomic factors”. Environmental Research Letters, 15(9): 094061, ISSN: 1748-9326, DOI: 10.1088/1748-9326/ab9be2.

      CIT: Possible resources

    1. BoM: Severe tropical cyclone Yasi, http://www.bom.gov.au/cyclone/history/yasi.shtml (last access: 14 July 2016), 2011.
       Chou, M.-D. and Suarez, M.: An efficient thermal infrared radiation parameterization for use in general circulation models, NASA Tech. Memo, NASA, Greenbelt, MD, USA, p. 84, 1994.
       Dare, R. A. and McBride, J. L.: The Threshold Sea Surface Temperature Condition for Tropical Cyclogenesis, J. Climate, 24, 4570–4576, 2011.
       Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A. C. M., van de Berg, L., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S. B., Hersbach, H., Hólm, E. V., Isaksen, L., Kållberg, P., Köhler, M., Matricardi, M., McNally, A. P., Monge-Sanz, B. M., Morcrette, J.-J., Park, B.-K., Peubey, C., de Rosnay, P., Tavolato, C., Thépaut, J.-N., and Vitart, F.: The ERA-Interim reanalysis: configuration and performance of the data assimilation system, Q. J. Roy. Meteorol. Soc., 137, 553–597, 2011.
       Emanuel, K.: An air–sea interaction theory for tropical cyclones. Part I: Steady-state maintenance, J. Atmos. Sci., 43, 585–604, 1986.
       Emanuel, K.: Environmental Factors Affecting Tropical Cyclone Power Dissipation, J. Climate, 20, 5497–5509, 2007.
       Emanuel, K. and Sobel, A.: Response of tropical sea surface temperature, precipitation, and tropical cyclone-related variables to changes in global and local forcing, J. Adv. Model. Earth Syst., 5, 447–458, 2013.
       Evans, J. L., Ryan, B. F., and McGregor, J. L.: A numerical exploration of the sensitivity of tropical cyclone rainfall intensity to sea surface temperature, J. Climate, 7, 616–623, 1994.
       Gray, W.: Global view of the origin of tropical disturbances and storms, Mon. Weather Rev., 96, 669–700, 1968.
       Holland, G. J.: The maximum potential intensity of tropical cyclones, J. Atmos. Sci., 54, 2519–2541, 1997.
       Imielska, A.: Seasonal climate summary southern wettest Australian summer on record and one of the strongest La Niña events on record, Aust. Meteorol. Oceanogr. J., 61, 241–251, 2011.
       Kain, J.: The Kain–Fritsch convective parameterization: an update, J. Appl. Meteorol., 43, 170–181, 2004.
       Kilic, C. and Raible, C. C.: Investigating the sensitivity of hurricane intensity and trajectory to sea surface temperatures using the regional model WRF, Meteorol. Z., 22, 685–698, 2013.
       Knapp, K. R., Kruk, M. C., Levinson, D. H., Diamond, H. J., and Neumann, C. J.: The International Best Track Archive for Climate Stewardship (IBTrACS), B. Am. Meteorol. Soc., 91, 363–376, 2010.
       Knutson, T. R., McBride, J. L., Chan, J., Emanuel, K., Holland, G., Landsea, C., Held, I., Kossin, J. P., Srivastava, A. K., and Sugi, M.: Tropical cyclones and climate change, Nat. Geosci., 3, 157–163, 2010.
       Lesser, G. R., Roelvink, J. A., van Kester, J. A. T. M., and Stelling, G. S.: Development and validation of a three-dimensional morphological model, Coast. Eng., 51, 883–915, https://doi.org/10.1016/j.coastaleng.2004.07.014, 2004.
       Lonfat, M., Marks, F. D., and Chen, S. S.: Precipitation Distribution in Tropical Cyclones Using the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager: A Global Perspective, Mon. Weather Rev., 132, 1645–1660, 2004.
       Miglietta, M. M., Moscatello, A., Conte, D., Mannarini, G., Lacorata, G., and Rotunno, R.: Numerical analysis of a Mediterranean 'hurricane' over south-eastern Italy: Sensitivity experiments to sea surface temperature, Atmos. Res., 101, 412–426, 2011.
       Mlawer, E. J., Taubman, S. J., Brown, P. D., Iacono, M. J., and Clough, S. A.: Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the longwave, J. Geophys. Res., 102, 16663–16682, 1997.
       Nakanishi, M. and Niino, H.: An Improved Mellor–Yamada Level-3 Model: Its Numerical Stability and Application to a Regional Prediction of Advection Fog, Bound.-Lay. Meteorol., 119, 397–407, 2006.
       NCEP: NCEP FNL Operational Model Global Tropospheric Analyses, continuing from July 1999, National Centers for Environmental Prediction/National Weather Service/NOAA/US Department of Commerce, Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory, Boulder, Colorado, https://doi.org/10.5065/D6M043C6, 2000.
       Ooyama, K.: Numerical simulation of the life cycle of tropical cyclones, J. Atmos. Sci., 26, 3–39, 1969.
       Palmén, E.: On the formation and structure of tropical hurricanes, Geophysica, 3, 26–38, 1948.
       Parker, C. L., Lynch, A. H., and Mooney, P. A.: Factors affecting the simulated trajectory and intensification of Tropical Cyclone Yasi (2011), Atmos. Res., 194, 27–42, 2017.
       Powell, M. D. and Reinhold, T. A.: Tropical Cyclone Destructive Potential by Integrated Kinetic Energy, B. Am. Meteorol. Soc., 88, 513–526, 2007.
       Queensland Government: Tropical Cyclone Yasi – 2011 Post Cyclone Coastal Field Investigation, Queensland Department of Science, Information Technology, Innovation and the Arts, Brisbane, Australia, 2012.
       Ramsay, H. A. and Sobel, A. H.: Effects of Relative and Absolute Sea Surface Temperature on Tropical Cyclone Potential Intensity Using a Single-Column Model, J. Climate, 24, 183–193, 2011.
       Thompson, G., Field, P. R., Rasmussen, R. M., and Hall, W. D.: Explicit Forecasts of Winter Precipitation Using an Improved Bulk Microphysics Scheme. Part II: Implementation of a New Snow Parameterization, Mon. Weather Rev., 136, 5095–5115, 2008.
       Ummenhofer, C. C., Sen Gupta, A., England, M. H., Taschetto, A. S., Briggs, P. R., and Raupach, M. R.: How did ocean warming affect Australian rainfall extremes during the 2010/2011 La Niña event?, Geophys. Res. Lett., 42, 9942–9951, https://doi.org/10.1002/2015GL065948, 2015.
       Vecchi, G. A. and Soden, B. J.: Effect of remote sea surface temperature change on tropical cyclone potential intensity, Nature, 450, 1066–1070, https://doi.org/10.1038/nature06423, 2007.
       Whiteway, T.: Australian bathymetry and topography grid, Geoscience Australia, Canberra, 2009.

      Cit: Possible resources I can use

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      We are grateful for the overall positive feedback and constructive suggestions. We have been able to experimentally address several of the suggested points and provide here a revision plan addressing all of the reviewers’ additional concerns.

      In summary, this study is of fundamental novelty and high impact as it:

      1. Reveals an unexpected role of ErbB3 in controlling Integrin β1 trafficking and thus epithelial cell motility and extracellular vesicle secretion. This may provide important insights into the role of ErbB3 in cancer.
      2. Uncovers the first ligand-independent, non-canonical cellular function for ErbB3 as a scaffold for the Arf6-Rabaptin5-GGA3 endosomal sorting complex.
      3. Provokes the notion that pseudo-RTKs may have evolved cellular functions beyond receptor signaling, such as scaffolding endosomal sorting compartments.

      We hope that you share our view that these conceptually groundbreaking findings will be of interest to a broad cross-disciplinary audience interested in cell signaling, cancer biology, endocytic trafficking and integrin biology.

      1. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      ErbB3 is well-known for its significance in cancer, which is dependent on ligand-binding and heterodimerization with other ErbB family members. In the current work, Rodrigues-Junior et al. identified novel, unexpected functions of ErbB3 in promoting early endocytic recycling and restricting exocytic trafficking (extracellular vesicles secretion) of membrane receptors, such as integrin b1 and transferrin receptor, via stabilizing the Arf6-GGA3-Rabaptin5 endosomal sorting complex. Via ErbB3 siRNA knockdown, they observed an impaired recycling of transferrin receptor and integrin b1 back to the cell membrane. The recycling assay condition (growth factor-deprived) provided a very clean result to support that this ErbB3-dependent endocytic trafficking is ligand-binding independent. The trafficking-dependence on ErbB3 (both the endocytic and the exocytic) was further supported by integrin b1 functional assays (scratch closure assay and Matrigel invasion assay). There are still some details that need to be clarified to fully understand the conclusion.

      Major points:

        • The manuscript started with a pathological correlation between high ErbB3 level and poor patient survival rate. In Fig.1, the impaired TfR recycling, and the co-localization between ErbB3 and integrin b1 were also performed in the pathological breast cancer cell line, MCF7. While investigating integrin b1 recycling, the authors suddenly switched to another two non-malignant human breast epithelial cell lines, which led to a difficult correlation of ErbB3-mediated recycling back to the disease situation. The authors should state more clearly this point, rather than data not shown. This inconsistency occurred also in other assays, for example, when addressing the trafficking from TGN to cell surface, MCF7 was utilized; while when addressing extracellular vesicle secretion, MCF10A was utilized.

      Response: we thank the reviewer for the comment. The rationale for using different cell lines or primary cells is now better explained in the manuscript. We found that depletion of ErbB3 impaired recycling of Integrin β1 in the non-malignant cells, including MCF10A and primary breast epithelial cells, but not in malignant MCF7 cells that overexpress ErbB3 (data not shown). We now speculate in the manuscript that the dependence on ErbB3 for Integrin β1 recycling may be lost at some point during carcinogenesis, although further studies will be needed to address this possibility. MCF7 cells were used to detect endogenous ErbB3 because normal expression levels of ErbB3 (primary MECs and MCF10A) were not detectable by immunofluorescence microscopy in our hands with the range of antibodies we tested. With regard to the transferrin recycling assay, we first attempted to use MCF10A cells for consistency. However, we found that transferrin internalized poorly in these cells and that the limited pool of internalised transferrin was retained for an extended time (3 h), rendering these cells unsuitable for our transferrin experiments.

      Concerning the data on trafficking from the TGN to the cell surface, we mistakenly wrote that they were obtained in MCF7 cells although they were in fact obtained in MCF10A cells. This is now corrected in the new version of the manuscript.

      Additionally, based on the constructive comment by this reviewer, we have now extended the analysis of EV secretion in ErbB3-, Rab4- and Rabaptin5-silenced cells to MCF7 cells. The new data are in line with our findings in MCF10A and prHMEC cells, namely that the absence of ErbB3 significantly increased EV secretion. Moreover, Rab4 and Rabaptin5 knockdown also enhanced the amount of EVs secreted by MCF7 cells. These results were incorporated in the manuscript as new Supplementary Figure S7F-G and new Supplementary Figure S9F-G, as recommended. Furthermore, we now show that GGA3 and, to a lesser extent, Rab GTPase-binding effector protein 1 (Rabaptin5 or RABPT5) colocalise with endogenous ErbB3 in MCF7 cells (new Supplementary Figure 9A, B). Finally, we also attempted to conduct the Arf6 IP in MCF7 cells, but the yield of Arf6 in pull-down experiments was much lower than in MCF10A cells, and interacting proteins were not detectable.

      It was shown before that ErbB3 undergoes constitutive internalization and degradation within several hours that is independent of ligand-binding (ref#13). Can the authors provide experimental evidences to show the correlation of TfR or integrin b1 recycling with this dynamic ErbB3 levels rather than ErbB3 knockdown?

      Response: we have performed colocalization of ErbB3, traced Integrin β1 and the recycling endosome marker EHD1, showing triple colocalization in a subset of endosomes, as shown in the new Supplementary Figure S2H. Experimental limitations prevented us from including EEA1 in triple staining for mCherry-ErbB3 or endogenous ErbB3 protein. Furthermore, ectopically expressed ErbB3 in MCF10A cells did not show convincing co-localisation. We hope that the new EHD1 triple colocalization with ErbB3 and Integrin β1 in endosomal compartments satisfies this specific comment.

      As mentioned above, regarding the transferrin recycling assay, we first attempted to use MCF10A cells for consistency, however we found that transferrin internalized poorly in these cells and the limited pool of transferrin that internalised was retained in these cells for an extended time (3 h), thus preventing their use.

      The efficiency of siRNA knockdown of ErbB3 (both #1 and #2) should support the observed phenotype (Fig. 1I-J, K-L). Is there a correlation between the ErbB3 level with integrin recycling? For example, siRNA#2 led to more efficient knockdown of ErbB3 in MCF10A?

      Response: the immunoblots presented to assess the efficiency of the two siRNAs are a single representative example. We noted some variability between experiments, but both siRNAs work well and yield comparable effects on recycling of Integrin β1. Importantly, the recycling data represent biological repeats of independently performed experiments, in which both siRNAs yielded reproducible and consistent ErbB3 silencing. This is reflected in the lack of a significant difference between the two ErbB3 knockdown conditions in Fig. 1I-J and K-L. Hence, we consider that both siRNAs against ErbB3 worked efficiently with comparable outcomes. Please also note our reply to Rev2 #07.

      ErbB3 loss led to more extracellular vesicles secretion, but also lysosomal degradation of integrin b1. This conclusion is supported by results shown in Fig.4D-E and Fig. S8A-B, while the analysis from the same cell line (MCF10A, Fig. S3A) results in no change of integrin b1 levels upon ErbB3 depletion. Fig. S3B showed also no change in a second non-malignant cell line (prHMEC). How do the authors explain this conflict?

      Response: we thank the reviewer for this comment. We believe that the increase in EV secretion and lysosomal degradation is compensated by an increase in de novo synthesis of Integrin β1 (see data below, from Fig. S3C). In the original manuscript we did not perform the appropriate statistical analysis of the RT-qPCR data. The unpaired two-tailed Student’s t-test is only suitable for normally distributed samples, which is not the case here. Instead, we performed the Mann-Whitney U-test, which does not assume a normal distribution, yielding an exact p-value of 0.017. Figure S3A and the associated text have been modified accordingly.
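      For readers unfamiliar with the test named above, the following is a minimal sketch of a two-sided exact Mann-Whitney U-test for small samples. It is an illustration only, not the authors' analysis: the function name and the input values are hypothetical, and the exact p-value is obtained by enumerating every possible split of the pooled observations, which is only practical for small group sizes.

```python
from itertools import combinations


def mann_whitney_u_exact(x, y):
    """Two-sided exact Mann-Whitney U-test by full enumeration.

    Returns (U statistic for x, exact two-sided p-value).
    Intended for small samples; ties are counted as half a win.
    """
    n1, n2 = len(x), len(y)

    def u_stat(a, b):
        # Number of (a_i, b_j) pairs in which a_i exceeds b_j.
        return sum((ai > bj) + 0.5 * (ai == bj) for ai in a for bj in b)

    u_obs = u_stat(x, y)
    mean_u = n1 * n2 / 2  # expected U under the null hypothesis
    pooled = list(x) + list(y)
    indices = range(n1 + n2)

    extreme = total = 0
    for chosen in combinations(indices, n1):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Count splits at least as far from the null mean as the observed one.
        if abs(u_stat(a, b) - mean_u) >= abs(u_obs - mean_u) - 1e-12:
            extreme += 1
    return u_obs, extreme / total


# Hypothetical relative expression values, for illustration only.
control = [1.0, 2.0, 3.0, 4.0]
knockdown = [5.0, 6.0, 7.0, 8.0]
u, p = mann_whitney_u_exact(control, knockdown)
```

      Because the test uses only the ordering of the observations, it remains valid when the data are not normally distributed, which is the reason given above for preferring it to the t-test.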

      Minor points:

      1. Is TfR also colocalizing with endogenous ErbB3?

      Response: as mentioned in the major comment #02, we attempted to perform the transferrin recycling assay using MCF10A cells to enable direct comparisons with the Integrin β1 recycling, but found that transferrin internalized poorly in these cells.

      Fig. 3J, TSG101, T is masked by 3I

      Response: we apologize for this oversight. We have gone through the manuscript in detail and corrected all pointed errors accordingly.

      Page 10, the description of the EV secretion in prHMEC cells is annotated to the wrong figure. Fig. S5D → S7D; S5E → S7E

      Response: we apologize for this oversight and have now corrected the mistake.

      Fig. 4M: How was the motility/invasion into Matrigel determined? Images? Only quantifications are shown.

      Response: the Matrigel invasion assay is described in the Materials and Methods section. The data are expressed as the percentage of invasion, i.e. the ratio of the mean number of cells invading through the Matrigel matrix to the mean number of cells in the uncoated support. For this rebuttal letter, the reviewer can find representative images of invaded MCF10A siCtrl cells, either non-treated (Ctrl) or treated with VSF secreted from MCF10A siCtrl or siErbB3 cells. Since this is an established method to measure cell invasion, we hope the reviewer agrees that these images do not add value to the manuscript.
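      The percentage-of-invasion calculation described above can be sketched as follows. This is a minimal illustration under the stated definition (mean count through Matrigel divided by mean count through the uncoated support); the function name and the cell counts are hypothetical and are not taken from the manuscript.

```python
from statistics import mean


def percent_invasion(matrigel_counts, uncoated_counts):
    """Percentage of invasion from replicate cell counts.

    matrigel_counts: cells counted after invading through Matrigel-coated inserts
    uncoated_counts: cells counted on uncoated control inserts
    """
    mean_uncoated = mean(uncoated_counts)
    if mean_uncoated == 0:
        raise ValueError("uncoated control counts must not average to zero")
    return 100.0 * mean(matrigel_counts) / mean_uncoated


# Hypothetical replicate counts per field of view.
invasion = percent_invasion([10, 20, 30], [40, 40, 40])  # 50.0
```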

      Fig. 4M: Exosomes collected from ErbB3-depleted cells promotes the migration in MCF10A-wild type cells, how about the effects on ErbB3-depleted cells? This group should be included for analysis.

      Response: as proposed, we have treated both control and ErbB3-silenced MCF10A cells with normalized concentrations of EVs secreted from siCtrl and siErbB3 cells (1 × 10^9 nanoparticles/mL) for 48 hours, followed by cell viability and cell invasion assays. The new data show that both EV pools modestly increased cell viability and substantially increased invasiveness of both wild-type and ErbB3-depleted cells through Matrigel (new Figures 4K and L). Together, our results indicate that while ErbB3-silenced MCF10A cells exhibited lower basal motility, ErbB3 is not required for the observed EV-induced motility. The new Figures 4K and L were included and are further discussed in the manuscript.
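      Normalizing two EV preparations to the same particle concentration is a simple dilution calculation (C1·V1 = C2·V2). The sketch below is an illustration only: the function name, the stock concentrations and the final volume are hypothetical, and only the target of 1 × 10^9 nanoparticles/mL comes from the response above.

```python
def dilution_volumes(stock_conc, target_conc, final_volume_ml):
    """Volumes of EV stock and diluent needed to reach a target concentration.

    stock_conc, target_conc: particles per mL (e.g. from nanoparticle tracking)
    final_volume_ml: desired final volume in mL
    Returns (stock volume in mL, diluent volume in mL).
    """
    if stock_conc <= 0 or target_conc <= 0:
        raise ValueError("concentrations must be positive")
    if target_conc > stock_conc:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ml = target_conc * final_volume_ml / stock_conc
    return stock_ml, final_volume_ml - stock_ml


TARGET = 1e9  # nanoparticles/mL, as stated in the response
# Hypothetical stock concentrations for the two EV pools.
sictrl_stock, sictrl_diluent = dilution_volumes(4e9, TARGET, 2.0)
sierbb3_stock, sierbb3_diluent = dilution_volumes(8e9, TARGET, 2.0)
```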

      Quantification of the blots should be provided for Fig. 5A (GGA3), 5B (GGA3, Rabaptin5 and Arf6), 5F (GGA3) and 5G (GGA3, Rabaptin5 and Arf6). What is mock IP in each graph? The mock IP is neither mentioned in methods nor in legends.

      Response: we have now carried out densitometry analysis on all the requested immunoblots shown in Figure 5. We also changed the term mock IP to IgG IP for clarity. The use of non-immunogenic IgG in control IPs is now specified in the methods and the respective figure legend.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: In their manuscript, Rodrigues-Junior and colleagues identify a novel ligand-independent function of the tyrosine kinase receptor (RTK) ErbB3 as a regulator of integrin β1 recycling. In particular, the authors demonstrate that ErbB3 depletion reduce β1 integrin surface expression, triggering its lysosomal degradation and increasing its secretion in extracellular vesicles (EVs). Moreover, the authors show that these EVs enhanced the invasive capacity of ErbB3 wild type breast epithelial cells. In addition, the authors evidence the interaction between ErbB3, GGA3 and Rabaptin5. Loss of any of these proteins destabilizes this interaction, which abrogates integrin β1 recycling and leads to its degradation and secretion. The work is potentially interesting; however, there are some aspects that need to be analyzed in a more robust manner.

      Major comments:

      1. The manuscript is mainly focused on β1 integrin endocytic and post-endocytic fate following ErbB3 silencing, describing also a molecular mechanism underlying these observations. Despite the cited manuscript by Deneka, A. and colleagues indicates a similar mechanism for transferrin receptor (TfR) recycling, the Authors only studied the receptor internalization upon ErbB3 silencing. Therefore, this observation does not add any significance to the main topic of the manuscript and its removal should be considered.

      Response: we agree with the reviewer that the fate of Integrin β1 is the main focus of this manuscript. We would, however, favour retaining the TfR data as they imply a wider role of ErbB3, beyond trafficking of Integrin β1. We ask for the reviewer’s understanding of our rationale.

      2. Data from Figure S1A seems to be not normally distributed. Have the Authors tested the data for normal distribution? If not, please consider it. If the data is not normally distributed, a non parametric Mann-Whitney U-Test would be more suitable.

      Response: we thank the reviewer for the comment. The differential ErbB3 mRNA expression analysis was retrieved from the widely used GEPIA2 portal (to date about 600 manuscripts cite this portal on PubMed), based on the selected datasets (“TCGA tumors vs TCGA normal + GTEx normal” or “TCGA tumors vs TCGA normal”). The method for differential analysis is one-way ANOVA, using disease state (Tumor or Normal) as variable for calculating differential expression, as it considers differential expression among several tumors.

      Tang, Z., Kang, B., Li, C., Chen, T., and Zhang, Z. (2019). GEPIA2: an enhanced web server for large-scale expression profiling and interactive analysis. Nucleic Acids Res. 47, W556–W560. https://doi.org/10.1093/nar/gkz430.

      3. The Authors studied the colocalization of ErbB3, Rab4 and Rab11, observing an increased colocalization between ErbB3 and Rab4 10 minutes following primaquine. However, the Authors previously referred to Sönnichsen, B et al. manuscript, in which TfR colocalized with Rab11 at 30min. It would be interesting to see whether ErbB3 and Rab11 colocalize at later time points in the presence or absence of primaquine. This will reinforce the conclusion that ErbB3 is involved in early Rab4-dependent recycling.

      Response: we appreciate the reviewer’s comment. However, we consider that these requested experiments will not add significant value to the novelty of this manuscript and hope that the reviewer accepts that we politely refrain from reproducing them.

      In Figure 4C the Authors observed a reduction in β1 integrin levels in ErbB3 silenced cells compared to the control already at the beginning of tracing (0 min), which might be due to accelerated turnover at the internalization step of their experimental design. To confirm this, immunofluorescence of β1 integrin in control and ErbB3 silenced cells could be performed just right after the 15min integrin internalization.

      Response: this is likely a misunderstanding as the timepoint (0 min) is defined as the point after the 15 min internalization step when the imaging-based tracing begins, which aligns perfectly with the reviewer’s request.

      In the discussion, the Authors indicate that "loss of ErbB3 redirects Integrin β1 towards lysosomes for degradation, mimicking loss of GGA3 that similarly redirects both Integrin β1 and c-Met towards lysosomal degradation, or Rabaptin5 depletion that we find similarly redirects trafficking of internalised Integrin β1 towards lysosomal degradation". However, the involvement of lysosomal degradation was only studied for ErbB3 silencing by employing chloroquine. To further support this statement, the use of chloroquine in Rabaptin5- and GGA3-depleted cells is recommended.

      Response: we appreciate the reviewer’s comment, but since these findings have been published earlier, we think that they will not add significant value to the manuscript and hope that the reviewer accepts that we politely refrain from reproducing them.

      Minor comments:

      6. The Authors should consider shortening the following sentences from the Introduction: "GGA proteins contain several functional domains that...thereby regulating sorting of cargo including Integrin β3 and TfR into recycling endosomes".

      Response: we thank the reviewer for the comment. We have now divided this sentence into two for smoother reading.

      The Authors do not show ErbB3 silencing efficiency at the protein level until Figure 3G, which should have been shown in Figure 1 or Supplementary Figure 1, as all the research is based on it. Moreover, GGA3 silencing efficiency was never tested.

      Response: we thank the reviewer for this comment. We have included a new immunoblot confirming the silencing of ErbB3 by two independent siRNAs in MCF7 cells, as the new Supplementary Figure S2A. Please note that GGA3 silencing was shown in the main Figure 6J.

      Figure 1I and Figure 1K may include the representative images for the missing siErbB3 to properly illustrate the associated quantification.

      Response: we thank the reviewer for the comment. We have now included the representative images, as suggested.

      Consider including a Western blot showing the effect of lapatinib in EGFR, ErbB2 and ErbB3 protein expression, including their phosphorylated forms.

      Response: we thank the reviewer for the comment. As requested, we now show that, at the concentration used, lapatinib efficiently blocked tyrosine phosphorylation of ErbB3 and ERK1/2 without perturbing EGFR or ErbB3 expression levels. We also considered it relevant to show that the 1 µM lapatinib used was not cytotoxic to MCF10A and MCF7 cells. We hope that these new results satisfy this specific request.

      Some supplementary figures are mislabelled, such as Supplementary Figure S5D and S5E on page 10, which should be S7D and S7E, respectively. Supplementary Figure S7C on page 15 should be S9C.

      Response: we apologize for this oversight and have performed the corrections.

      The following sentence on page 8 should be revised as a verb is missing: "which corresponds to the reported peak time when colocalization of Rab4 with traced TfR, preceding Rab11 and TfR colocalization that peaks later at 30 minutes".

      Response: we apologize for this oversight. It now reads: "which corresponds to the reported peak time of colocalization of Rab4 with traced TfR, which precedes Rab11 and TfR colocalization that peaks later at 30 minutes".

      The main text indicates that the amount of VSV-G transported to the cell surface after 30min it is not affected by ErbB3 silencing. However, in Figure 3E seems to slightly decrease following the silencing. The Authors may consider employing another Western blot image to match the main text and the quantification in Figure 3F.

      Response: as the reviewer noted the immunoblot showed a slight decrease. It is however a very modest decrease that is also observed in the positive control (MUC1) in the same Streptavidin IP sample. We ask for permission to keep these representative images.

      In the main text, a significant difference in the nanoparticles/cell between ErbB3-depleted cells and wild type or control cells were reported. However, Figure 3I only showed the statistics of each siRNA vs the control and not the wild type condition.

      Response: we apologize for this oversight. We removed from the text the comparison with the wild-type non-transfected cells to avoid misunderstanding.

      The Authors concluded that "chloroquine treatment significantly restored traced Integrin β1 levels". However, this conclusion is not reflected in the statistical analysis reported in Figure 4H, which only showed the differences between control and ErbB3 silenced cells. Thus, the statistics reported for the chloroquine results should be added.

      Response: we appreciate the comment by the reviewer. The requested comparison is now included in the new Figure 4H.

      The Authors concluded that "loss of either GGA3 or Rabaptin5 mimics the effect of loss of ErbB3 on endocytic trafficking of Integrin β1, consistent with the hypothesis that GGA3 and Rabaptin5 are effectors of ErbB3 in promoting endosomal recycling and impeding EV release". To confirm this conclusion, the inclusion of siRabaptin5 results in Figures 6H and 6J is suggested.

      Response: we thank the reviewer for the comment. We have now included immunoblots of MCF10A cell lysate after silencing ErbB3 or Rabaptin5, as the results shown in the previous Figure 6G. We believe that these new data satisfy the specific request.

      To be consistent with the results presentation:

      • The inclusion of Modal size is recommended in Figure 6I.

      • Some graphs show the number of cells or biological replicates while other ones no.

      • Figure 4E showed different time points for both siRNAs.

      Response: we appreciate the comment and have now included, as the new main Figure 6H, the modal size of the EVs secreted by MCF10A cells upon Rabaptin5 silencing. We will ensure that all respective figure legends indicate the number of replicates. The intermediate time points shown in the main Figure 4E are different. However, since the final readout at 9 h using two independent siRNAs against ErbB3 is directly comparable, we ask permission to maintain the time points used in our analysis.

      Figure 1E represents the squared regions of Figure 1D, but it is not indicated in the figure legend.

      Response: we apologize for this oversight. We have now indicated in Figure 1 legend that Figure 1E represents the squared regions of Figure 1D, as suggested.

      In the legend of Figure 1D-G, 30min of integrin internalization is reported, where it should be 15min according to main text and methods.

      Response: we apologize for this oversight and we thank the reviewer for this comment. We have now indicated the correct time point in Figure 1 legend.

      The addition of representative images in Figure 6A is recommended, as already present in Figure 1I.

      Response: we thank the reviewer for the comment. Representative images of Fig. 6A-D were included as the new panel Fig. 6B.

      As two different siRNAs for ErbB3 were used and not in all experiments, the employed siRNA should be indicated in each experiment. In the cases where both ErbB3 siRNAs were employed, figures should report them either as main results or supplementary.

      Response: we appreciate this meticulous comment. We have now indicated in the figure and in the respective figure legends which siRNA was used in the respective set of experiments (siErbB3 #01 or #02).

      Why do the Authors use EVs enriched in the VSF or by UC to show the same result? What is the criteria to choose one or the other one? For example, in Figures 6G and 6K.

      Response: based on the guidelines suggested by MISEV 2018 and 2023, there is no gold standard method for EV isolation. Thus, by using at least two independent methods (i.e., tangential flow filtration, followed by immuno-affinity and ultracentrifugation; UC) we validate the enrichment of EVs in our sample preparations, showing reproducible results among the different EV enrichment protocols (Figure 3).


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The paper by Dorival Mendes Rodrigues-Junior et al., focuses on a novel ligand-independent role of ErbB3 receptor, modulating Transferrin receptor and integrin beta1 early recycling. Authors perform several in vitro studies where they show how ErbB3 depletion diverts integrin beta1 from recycling towards lysosomal degradation and extracellular vesicle secretion, impairing cell migration. They also provide mechanistic experiments showing the role of ErbB3 on Arf6-GGA3-Rabaptin5 endosomal complex assembly.

      Major comments:

      1. Fig. 1. Authors should co-stain with early endosomal markers (such as EEA1) to clearly show endogenous ErbB3 and Beta1 integrin endosomal co-localization. Including some insets with higher magnifications would also improve visual inspection of such interactions.

      Response: as requested, we have performed colocalization of ErbB3, traced Integrin β1 and the recycling endosome marker EHD1, showing triple colocalization in a subset of endosomes, as shown in the new Supplementary Figure S2H. Experimental limitations prevented us from including EEA1 in triple staining for mCherry-ErbB3 or endogenous ErbB3 protein. Furthermore, ectopically expressed ErbB3 in MCF10A cells did not show convincing co-localisation with EEA1. We believe that the new triple colocalization showing ErbB3 and Integrin β1 in EHD1-positive endosomal compartments satisfies this specific comment.

      Fig. 1H and 1I. Authors need to provide TIRF penetration depth to better evaluate the potential cytosolic contribution. Additionally, plasma membrane purification studies would help to validate their live imaging results.

      Response: the TIRF penetration depth was 83 nm, which has now been added to the methods section. Purification of plasma membrane fractions following recycling of traced surface-labelled Integrin β1 in control or ErbB3-depleted cells, by cell surface biotinylation and immunoblotting of the recovered proteins, is indeed a valuable approach to validate our findings. Nevertheless, we are confident in our confocal imaging results, and including these additional results might not contribute significantly to the novelty of this manuscript. Hence, we ask permission to publish the paper at this stage without the plasma membrane purification, as this requires optimization and will delay the publication of our paper, in addition to exhausting our limited financial resources.

      Fig. 1J. Authors should explain better how they calculated normalized fluorescence.

      Response: the normalized fluorescence is explained in the Fig. 1J legend and in the respective method section. Alexa488 intensity was normalized between 0 and 1, with the control as reference, where Fnorm = (F - Fmin)/(Fmax - Fmin). All data points were background corrected, followed by normalization to the pre-stimulatory level (F/F0).
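      The normalization steps named above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis code: the function names and the intensity values are hypothetical, the 0-to-1 scaling is assumed to be conventional min-max scaling, and the baseline F0 is assumed to be the mean of the first few frames.

```python
from statistics import mean


def background_correct(trace, background):
    """Subtract a constant background intensity from every frame."""
    return [f - background for f in trace]


def min_max_normalize(trace):
    """Scale a trace to the range 0..1: (F - Fmin) / (Fmax - Fmin)."""
    f_min, f_max = min(trace), max(trace)
    if f_max == f_min:
        raise ValueError("trace is constant; min-max scaling is undefined")
    return [(f - f_min) / (f_max - f_min) for f in trace]


def relative_to_baseline(trace, n_baseline):
    """Express a trace as F/F0, with F0 the mean of the first n_baseline frames."""
    f0 = mean(trace[:n_baseline])
    if f0 == 0:
        raise ValueError("baseline intensity must not be zero")
    return [f / f0 for f in trace]


# Hypothetical raw Alexa488 intensities (arbitrary units) and background level.
raw = [12.0, 12.0, 22.0, 32.0]
corrected = background_correct(raw, 2.0)
scaled = min_max_normalize(corrected)
fold_change = relative_to_baseline(corrected, n_baseline=2)
```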

      Fig. 2B. Authors should include some plasma membrane markers (such as WGA) to better localize cell surface after beta1 integrin tracing.

      Response: we appreciate the reviewer’s comment, and have attempted the suggested experiment, but in our hands, WGA did not give a clear membrane staining but a diffuse faint signal in MCF10A cells for reasons we do not fully understand.

      Fig. 1J, 1M-1L: beta1 integrin endocytic recycling should be compared across the same time-points to better evaluate kinetic differences.

      Response: the intermediate time points shown in the main Figure 1J, M-L are based on the final readout. We understand that it could be interesting to evaluate the kinetic differences, but this would generate a substantial number of comparisons that might be difficult to visualize. We ask permission to keep the comparisons among the latest respective time points used in our analysis.

      Fig. 3. Author should consider adding additional experiments with Rab4 and Rab11 dominant negative forms to validate their results.

      Response: the proposed experiments have been performed, but the ectopic expression of dominant negative Rab4 and Rab11 had detrimental effects on the cells, with the formation of large endosomal blobs and rounding up of the MCF10A cells. Consequently, we do not feel confident about the conclusions that could be drawn from these data. We ask the reviewer to understand this technical limitation and accept that we are not able to address this point.

      Fig. 4M. To validate authors' claim on the role of integrin Beta1-containing EVs on invasive behaviour, they should repeat the experiment using blocking beta1 antibodies prior to EV addition.

      Response: we thank the reviewer for this comment. As requested, we performed the experiment using the Integrin β1 blocking monoclonal antibody (mAb; clone P4C10). The new data show that P4C10 treatment alone or in combination with EVs derived from MCF10A cells transfected with siCtrl or siErbB3 significantly reduced invasiveness in comparison to IgG treatments, confirming the mechanistic role of Integrin β1 in promoting MCF10A invasive behaviour. The new Figure 4M was included and further discussed in this manuscript.

      While authors claim that their results could potentially clarify different aspects of tumour dissemination, most of their experiments are done in MCF10A, a non-tumorigenic epithelial cell line. To better support their conclusion, they should reproduce key experiments in MCF7 or other tumorigenic cell line.

      Response: we thank the reviewer for the comment. As explained in response to reviewer 1, the rationale for using different cell lines or primary cells is now better explained in the manuscript. We found that depletion of ErbB3 impaired recycling of Integrin β1 in normal non-malignant cells, including MCF10A and primary breast epithelial cells, but not in malignant MCF7 cells that overexpress ErbB3 (data not shown), which is now discussed in the paper. Moreover, MCF7 cells were used to detect endogenous ErbB3, as normal expression levels of ErbB3 (primary MECs and MCF10A) were not detectable by immunofluorescence microscopy with the range of antibodies we tested. Furthermore, we also included in this new version that GGA3 and Rab GTPase-binding effector protein 1 (Rabaptin5 or RABPT5) colocalise with endogenous ErbB3 in MCF7 cells, as shown in the new Supplementary Figure 9A, B. Finally, we also attempted to conduct the Arf6 IP in MCF7 cells, but the yield of Arf6 in pull-down experiments was much lower than in MCF10A cells, and interacting proteins were not detectable.

      Minor comments:

      1. Fig. 1D-1F: please explain better if beta1 integrin surface signal was quenched in this specific set of studies.

      Response: Beta1 Integrin was quenched on ice with an antibody against Alexa488 as described by Arjonen et al. (Traffic, 2012; DOI: 10.1111/j.1600-0854.2012.01327.x), and further outlined in the methods section and results section (page 6 and schematic Fig. 4A).

      Suppl. Fig. 3A: last WB lane should read "siErbB2" instead of "siErbB3".

      Response: we thank the reviewer and we apologize for this oversight. We corrected the siErbB2 lane in Supplementary Figure 3A, as requested.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      ErbB3 is well-known for its significance in cancer, which is dependent on ligand-binding and heterodimerization with other ErbB family members. In the current work, Rodrigues-Junior et al. identified novel, unexpected functions of ErbB3 in promoting early endocytic recycling and restricting exocytic trafficking (extracellular vesicles secretion) of membrane receptors, such as integrin b1 and transferrin receptor, via stabilizing the Arf6-GGA3-Rabaptin5 endosomal sorting complex.

      Via ErbB3 siRNA knockdown, they observed an impaired recycling of transferrin receptor and integrin b1 back to the cell membrane. The recycling assay condition (growth factor-deprived) provided a very clean result to support that this ErbB3-dependent endocytic trafficking is ligand-binding independent. The trafficking-dependence on ErbB3 (both the endocytic and the exocytic) was further supported by integrin b1 functional assays (scratch closure assay and Matrigel invasion assay). There are still some details that need to be clarified to fully understand the conclusion.

      Major points:

      1. The manuscript started with a pathological correlation between high ErbB3 level and poor patient survival rate. In Fig. 1, the impaired TfR recycling and the co-localization between ErbB3 and integrin b1 were also shown in the pathological breast cancer cell line, MCF7. While investigating integrin b1 recycling, the authors suddenly switched to two other, non-malignant human breast epithelial cell lines, which makes it difficult to relate ErbB3-mediated recycling back to the disease situation. The authors should state this point more clearly, rather than as data not shown. This inconsistency also occurred in other assays: for example, MCF7 was used to address trafficking from the TGN to the cell surface, while MCF10A was used to address extracellular vesicle secretion.
      2. It was shown before that ErbB3 undergoes constitutive internalization and degradation within several hours that is independent of ligand-binding (ref#13). Can the authors provide experimental evidence to show the correlation of TfR or integrin b1 recycling with these dynamic ErbB3 levels rather than with ErbB3 knockdown?
      3. The efficiency of siRNA knockdown of ErbB3 (both #1 and #2) should support the observed phenotype (Fig. 1I-J, K-L). Is there a correlation between the ErbB3 level with integrin recycling? For example, siRNA#2 led to more efficient knockdown of ErbB3 in MCF10A?
      4. ErbB3 loss led to more extracellular vesicles secretion, but also lysosomal degradation of integrin b1. This conclusion is supported by results shown in Fig.4D-E and Fig. S8A-B, while the analysis from the same cell line (MCF10A, Fig. S3A) results in no change of integrin b1 levels upon ErbB3 depletion. Fig. S3B showed also no change in a second non-malignant cell line (prHMEC). How do the authors explain this conflict?

      Minor points:

      1. Is TfR also colocalizing with endogenous ErbB3?
      2. Fig. 3J, TSG101, T is masked by 3I
      3. Page 10, the description of the EV secretion in prHMEC cells is annotated to the wrong figure. Fig. S5D → S7D; S5E → S7E
      4. Fig. 4M: How was the motility/invasion into Matrigel determined? Images? Only quantifications are shown.
      5. Fig. 4M: Exosomes collected from ErbB3-depleted cells promote migration in wild-type MCF10A cells; what about the effects on ErbB3-depleted cells? This group should be included for analysis.
      6. Quantification of the blots should be provided for Fig. 5A (GGA3), 5B (GGA3, Rabaptin5 and Arf6), 5F (GGA3) and 5G (GGA3, Rabaptin5 and Arf6). What is mock IP in each graph? The mock IP is neither mentioned in methods nor in legends.

      Significance

      Strength: The recycling assay condition (growth factor-deprived) provided a very clean result to support that this ErbB3-dependent endocytic trafficking is ligand-binding independent.

      Limitations: Constantly changing cell lines when addressing different questions

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In this work, Williamson L et al address two intriguing questions in the field of gene regulation and chromatin organization: whether enhancer activity can overcome TAD boundaries and whether regulatory elements shared between two genes interact with them concomitantly or competitively. For that, they use as a paradigm the well-characterized regulatory landscape of the mouse Shh locus. This gene is located in a large TAD containing enhancers for different embryonic structures, including the ZPA of the developing limbs, foregut, lung buds and floor plate of the embryonic neural tube. Mnx1, a gene located in the adjacent TAD near the boundary of the Shh TAD, is mainly expressed in differentiating motor neurons of the spinal cord. However, the authors observed that Mnx1 transcription overlaps that of Shh in the ZPA, lung and foregut.

      Combining high-resolution RNA and DNA FISH experiments, they demonstrate that: (a) Shh enhancers located near the TAD border (up to a few hundred kb from it) can overcome TAD insulation and drive Mnx1 transcription, while enhancers located near the Mnx1 and Shh loci or more internally in the Shh TAD specifically activate only their respective loci; (b) deletion of the TAD boundary increases the fraction of cells coexpressing Shh and Mnx1 in the ZPA; (c) coactivation of Shh and Mnx1 occurs in cis (i.e. on the same allele), suggesting concomitant activity of the shared ZRS enhancer on the two promoters; (d) the co-activation correlates with the tightening of the distances between Shh, Mnx1 and the ZRS enhancer and is dependent on ZRS activity, as shown by the overexpression of tZRS-VP64 in mouse ES cells; (e) cohesin, but not CTCF, activity is required for the looping of the ZRS element with Shh and Mnx1. The experiments are well designed and the findings provide important insights, improving our understanding that TAD boundaries constitute somewhat permeable rather than absolute barriers and reinforcing previous evidence in the field, although the functional significance of this permeability is not addressed in this work. There are two main points which, in my opinion, need to be addressed by the authors to improve the overall quality and clarity of the work:

      Major comments:

      • The authors claim that co-expression of Mnx1 and Shh in the foregut and lung buds is also driven by boundary-crossing contacts with the MACS1 enhancer. However, the effect of the boundary deletion on the co-transcription of Shh and Mnx1 is only shown for the ZPA. In this sense I find the authors' statement in the following paragraph potentially misleading: "In the ZPA, the foregut, and the lung buds, the majority of Mnx1 RNA-FISH signals are at alleles that show simultaneous signal for Shh nascent transcript from the same allele (closely apposed signals) (Fig. 2a, b and Extended Data Fig. 2a). In del 35 embryos, an even higher proportion of Mnx1 transcribing alleles also transcribe Shh (Fig. 2b,Extended Data Fig. 2a, Extended Data Table 3.). These data suggest that both the ZRS and MACS1 enhancers are able to simultaneously activate transcription at two gene loci on the same chromosome". In my opinion this phrasing implicitly extends the increase in Mnx1-Shh co-expressing nuclei observed in the ZPA of 35 del embryos to the expression of these two genes in the foregut and lung buds (driven by the MACS1 enhancer), although this effect has not been specifically addressed. In a previous work, the authors showed that boundary deletion does not impact Mnx1 expression in the foregut and lungs. It would be important to clarify whether the more precise analyses in this study have led to different conclusions or, alternatively, to discuss the results appropriately. Ideally the authors should analyse the effect of the 35 del allele in the foregut / lung buds or rephrase the statement about the sharing of the MACS1 enhancer.
      • The authors use the quantification of nuclei co-expressing Mnx1 and Shh from the same allele as an indicator of simultaneous transcription of the two genes through sharing of the enhancer, as opposed to a model of alternating transcriptional bursts. However, I am concerned that the time scale at which looping and transcriptional bursts occur is at odds with the detection of nascent transcription in FISH experiments, so it cannot be excluded that shifting of the enhancer from one promoter to the other could still result in detection of nascent RNA of the two genes on the same allele. In any case, following the argumentation of the authors, the fraction of nuclei expressing Mnx1 alone does not appear to be significantly different from that expressing Mnx1 and Shh, and the increase of Mnx1-expressing nuclei upon boundary deletion seems proportionally similar to the increase of Mnx1+/Shh+ nuclei. In my opinion, this makes it difficult to interpret the detection of Mnx1 alone or both Mnx1-Shh expression as a reflection of alternate looping and transcriptional bursts from enhancer sharing. Determining whether the two promoters compete for the interaction with the enhancer or share it would require estimating whether Shh expression is reduced in the 35 del homozygote embryos compared to wild types, as a result of the increased interaction of the ZRS with the enhancer. The authors claim that there are no differences in the % of cells expressing Shh upon boundary deletion, but in my opinion this measurement is not sufficient to estimate a change in transcriptional rate (frequency of bursting). Detection of nascent mRNA levels in single cells would allow a better assessment of competition or concomitant activation of the two genes. Not being an expert in the RNA FISH technique, it is not clear to me whether fluorescence intensity could be used as an estimator of transcription. From the images of the authors, in some cases it seems that expression of Shh alone is higher than when both Shh and Mnx1 are transcribed from the same allele (Fig. 2a, left panel; Fig. 2c, left vs right panel). However, in other cases an opposite trend can be observed (Mnx1 intensity in Fig. 2a, central vs right panel). Thus, a single-nucleus PCR or RNA-seq approach may be better suited for this assessment.

      Minor comments:

      • In the mESC model overexpressing the tZRS-VP64 construct, Shh and Mnx1 seem to be transcribed at similar rates, in contrast to what is observed in vivo (where only a minor fraction of Shh+ cells express Mnx1). Thus, despite the fact that TAD boundary deletion increases Mnx1, but not Shh, expression, the ZRS activity seems to overcome the border more easily in this context than in vivo. Could the authors comment on this interesting observation? May it relate to the insulation score of TAD boundaries in the mESCs compared to in vivo? Alternatively, could it reflect that combinatorial TF binding to an enhancer contributes to its directionality?
      • Overall, figure organization and clarity could be improved. For example, the magnified RNA FISH images in Fig. 1 could be enlarged (to the same size as the broad view image) and the RNA FISH signal could be highlighted with arrowheads. Panel distribution could also be optimized.

      Significance

      The work presented by Williamson L et al provides interesting insights into how TAD borders contribute to the insulation of topological domains and restrict enhancer interactions, showing that some enhancers are able to overcome TAD insulation and that enhancer looping and TAD border crossing rely on enhancer activity and cohesin loop extrusion. As mentioned above, these findings reinforce and extend previous reports (Chakraborty et al 2023, Kessler S et al 2023, Balasubramanian et al 2024, Tzu-Chiao H. et al 2024). This work does not specifically address whether the ability of the tested enhancers (really focusing on the ZRS enhancer) to overcome the Shh TAD boundary depends on their intrinsic properties (e.g. TFBS composition) or on their distance to the border. This would require more complex genetic rearrangements (for example bringing floor plate enhancers into proximity of the border, in combination with the TAD boundary deletion) and would significantly increase the scientific relevance of the work, yet at the expense of a significant amount of work that could not be addressed in a reviewing process. In summary, the research of Williamson L et al constitutes an overall well-performed piece of work that integrates well with other evidence in the field of gene regulation and chromatin organization. Thus, without constituting a major conceptual breakthrough in the field, it is a valuable contribution to our understanding of basic principles of genome organization and gene transcription.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary and significance in the context of the field:

      In this work, the authors conduct a detailed investigation of the 'ectopic'/'bystander' activation of the gene Mnx1 by enhancers of Shh, located in the neighboring TAD. TAD borders have been shown in a number of works to contribute to the remarkable specificity of enhancer-promoter choice, and the current dogma in the field is to view them as perfect boundaries to enhancer-promoter interaction. Notably, this current dogma also highlights a conundrum in our understanding of gene regulation, as available 3D genome data from both sequencing and microscopy show that TAD borders are regions of abrupt decrease in 3D proximity, but far from perfect borders, with numerous cross-TAD interactions detected by Hi-C and its variants and by single-cell microscopy (albeit fewer than the local intra-TAD interactions).

      The authors show convincing data that Mnx1 indeed responds transcriptionally to several Shh-enhancers located over 100 kb distal and on the wrong side of the TAD boundary. The data come from developing mouse embryos, span several tissues, and include key controls for specificity of the method. This provides convincing data with which to challenge the currently widely accepted view of TADs as a significant boundary, complementing the few examples that indicate that such regulation is possible in special cases (see further discussion in 2b below). I believe this work represents an important and substantive contribution to the field and should ultimately be published, after a few notable issues have been addressed.

      Major comments:

      Does the CTCF degron substantially remove CTCF from the Mnx1/Shh TAD border?

      In prior AID-CTCF degron studies 1,2, a considerable fraction of cohesin dependent TAD borders are retained upon CTCF removal. Moreover, CTCF sites at these retained borders still have clear ChIP-seq peaks - even though the protein is >95% depleted and scarcely detectable by western. Thus, while I suspect that the authors are correct that the shorter distance of the 35 kb border deletion contributes substantially to the increased crosstalk between the Mnx1 and Shh-enhancers, I suspect part of the reason for a lack of a similar effect in the CTCF degron is due to the known challenges in removing CTCF from this border. To argue that the border but not the CTCF is important, I think it would be helpful to show the CTCF signal is sufficiently lost in the degron by ChIP-seq and/or show that this TAD border has been lost by Hi-C. Alternatively, the authors could tone down this claim to something more conservative, as I did not find it to be presented as a key conclusion of the paper as a whole.

      Minor comments:

      I believe the manuscript could be strengthened by some textual revisions of the introduction:

      2a) In particular, in my opinion, the existing data on the importance of TAD borders in enhancer-promoter regulation are not described in a sufficiently balanced and complete manner, and the overall impression given by the text is that CTCF-marked borders have little serious evidence for a role in developmental enhancer specificity and are maybe a cancer thing. This is doubly unfortunate, as it undermines the impact of the authors' work in expanding our view of what TAD borders are in a regulatory sense, and presents an unbalanced view of work in the field. This is of course easily corrected. In particular I recommend the following revisions:

      The text states that "depletion of CTCF has only a small effect on transcription in cell culture (Nora et al., 2017; Hsieh et al., 2022)." It should be clarified that there is only a small acute effect on transcription (in the first 6-12 hours), which may tell us more about the timescale at which promoters sample, integrate and respond to changes in their enhancer environment than about the roles of CTCF particularly. Notably, this degradation is lethal, it results in massive changes in transcription after 4 days, and I suspect the authors agree that this lethal effect arises from CTCF's role in transcription regulation (if you remove some key cytoskeletal protein or metabolic enzyme the primary cause of cell death is not transcriptional, but almost all the evidence for CTCF's vital role in the cell is linked in one way or another to transcription).

      The discussion of TAD border deletions is more one-sided than ideal. I appreciate the discussion is usually even more unbalanced when presenting the opposite view in the literature - many works only cite the examples where border deletion does lead to ectopic expression and phenotypes. The current text presented a subset of these border deletion data in such a way as to give me the impression the authors are deeply skeptical that CTCF plays a role as an insulator of E-P interactions in a developmental context (rather than just as a weird cancer thing). For example:

      Pennacchio's lab has analyzed a series of TAD border deletions with more examples of both lethal effects and effects with no apparent phenotype 3

      Deletion of TAD borders upstream of the FGF3/4/15 locus in mouse is embryonic lethal (particularly the border Kim et al label TB1 and didn't delete in their cancer model). https://www.biorxiv.org/content/10.1101/2024.08.03.606480v1

      I appreciate that Bickmore and colleagues found quite phenotypically normal mice upon deletion of CTCF sites from Shh, but it might be balanced to still reference the work from Ushiki et al, which indicates that in humans the CTCF site does play a role in Shh - ZRS communication: 4

      As the authors are doubtless aware, Andrey and colleagues show a CTCF dependent enhancement of a sensitized ZRS enhancer. 5

      Zuin et al. in an elegant experiment in which an enhancer is mobilized to different distances away from its promoter using transposon induction, reported a complete lack of detection of enhancers mobilizing outside the TAD to activate gene expression 6.

      A balanced presentation of the data on CTCF role might include some discussion of the above. In light of these earlier works, the findings the authors report about border bypass are all the more surprising.

      2b) By contrast, direct evidence for cross TAD interactions at endogenous loci has not to my knowledge been shown as clearly as described in the current manuscript.

      Recent work from Rocha and colleagues 7 showed evidence that some enhancers upstream of Sox2 can pass ectopically induced boundaries. While recent work has described examples of 'TAD border bypass' at endogenous loci (e.g. for Pitx1 8, Hoxa regulation 9), these reports really just expand the view of regulatory boundaries rather than provide evidence against it. They invoke a 3D stacking of boundaries that allows boundary proximal enhancers and promoters to stack with (and so bypass) an intervening TAD boundary. Notably, in this view enhancers and promoters that lie away from the border of their respective TADs are still separate, and indeed intervening genes between distal enhancers for Pitx1 and Hoxa appear to follow these rules 2. Mnx1 and the Shh enhancers by contrast do not appear to be an example of border stacking. Given that Sox2 at least is also a TAD border, and the position of the bypassing enhancers is not precisely known in the work from Rocha, it is possible that that case is also an example of boundary stacking, which appears less likely in the case of Mnx1 (which does not appear to be at a CTCF-marked border, at least in mESCs).

      Statistics

      Some of the bar graphs quantifying the %-expressing cells do not obviously have associated n-values, nor do some of the violin plots of the distances. I think all these bar graphs could also benefit from adding error bars (e.g. by bootstrapping from the sampled population). This will help the reader more easily appreciate how sampling error and sample size affect the variation seen in the plots.
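
      As an illustration of the bootstrapping suggestion, a percentile bootstrap for the fraction of expressing nuclei could look like the sketch below. It assumes each scored nucleus is recorded as True (nascent transcript detected) or False; the function name, the resampling settings and the counts are hypothetical and are not taken from the manuscript.

```python
import random


def bootstrap_proportion_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the fraction of True values."""
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(outcomes)
    estimates = []
    for _ in range(n_resamples):
        # Resample the scored nuclei with replacement, keeping the sample size.
        resample = rng.choices(outcomes, k=n)
        estimates.append(sum(resample) / n)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(outcomes) / n, lower, upper


# Hypothetical scoring: 30 of 100 nuclei show a nascent transcript signal.
scored_nuclei = [True] * 30 + [False] * 70
fraction, low, high = bootstrap_proportion_ci(scored_nuclei)
print(f"{fraction:.0%} expressing (95% CI {low:.0%} to {high:.0%}, n={len(scored_nuclei)})")
```

      The interval narrows as the number of scored nuclei grows, so reporting it alongside n makes the effect of sample size on each bar visible.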

      Recommendations for improving the figures

      Figure 2

      I would have preferred the authors zoom in more on the FISH spots to help the reader appreciate the proximity. I do appreciate also seeing a field of more than 1 cell (to give some sense of the variability), but these images mostly have only 1 spot pair per panel, which is exceedingly small as they contain parts of more than 1 nucleus. There is also unnecessary white space in this figure that could have been used to show zoom in panels.

      Figure 3 -image panels

      The same applies to the image panels in this figure as for figure 2 - there is considerable unused whitespace, the image panels capture mostly a single nucleus and its pattern of DAPI dense heterochromatin (which isn't particularly relevant to the narrative) while the fluorescent spots that are the focus of the narrative are quite small. It is nice to have an example of the cell to see that this isn't just random background (that there is just one spot per cell) - in that sense though it's equally helpful to show it's not just 1 cell in the field that has the signal-to-noise (SNR) shown.

      For this figure and the panels in figure 2, I'd recommend showing a zoom out showing ~3 nuclei with transcription foci (at least in the regions where the % transcribing is >60% it should be fine to have adjacent nuclei transcribing, for those where it is 10%, 1 of 3 nuclei transcribing in the image selected would also help get the sense of the data). These zoom out images would also give a sense of the SNR in the image, and then a zoom in where the FISH spots are sizable would make it easier to see the neighboring transcripts. Extended Data Fig 3 does a better job showing the context of the limb and then zooming in to an image where the RNA spots are appreciable. It looks like the resolution of the zoom in is lower, such that zooming in further on the spots in this data may not enhance the image.

      Figure 3 - DNA FISH

      It would be helpful to include a diagram indicating where the DNA FISH probes are located on the genome and their size in kb as an inset in the figure.

      References cited above

      1. Nora, E. P., Goloborodko, A., Valton, A.-L., Gibcus, J. H., Uebersohn, A., Abdennur, N., Dekker, J., Mirny, L. A. & Bruneau, B. G. Targeted Degradation of CTCF Decouples Local Insulation of Chromosome Domains from Genomic Compartmentalization. Cell 169, 930-944.e22 (2017).
      2. Kubo, N., Ishii, H., Gorkin, D., Meitinger, F., Xiong, X., Fang, R., Liu, T., Ye, Z., Li, B., Dixon, J., Desai, A., Zhao, H. & Ren, B. Preservation of Chromatin Organization after Acute Loss of CTCF in Mouse Embryonic Stem Cells. bioRxiv 118737 (2017).
      3. Rajderkar, S., Barozzi, I., Zhu, Y., Hu, R., Zhang, Y., Li, B., Alcaina Caro, A., Fukuda-Yuzawa, Y., Kelman, G., Akeza, A., Blow, M. J., Pham, Q., Harrington, A. N., Godoy, J., Meky, E. M., von Maydell, K., Hunter, R. D., Akiyama, J. A., Novak, C. S., Plajzer-Frick, I., Afzal, V., Tran, S., Lopez-Rios, J., Talkowski, M. E., Lloyd, K. C. K., Ren, B., Dickel, D. E., Visel, A. & Pennacchio, L. A. Topologically associating domain boundaries are required for normal genome function. Commun. Biol. 6, 435 (2023).
      4. Ushiki, A., Zhang, Y., Xiong, C., Zhao, J., Georgakopoulos-Soares, I., Kane, L., Jamieson, K., Bamshad, M. J., Nickerson, D. A., University of Washington Center for Mendelian Genomics, Shen, Y., Lettice, L. A., Silveira-Lucas, E. L., Petit, F. & Ahituv, N. Deletion of CTCF sites in the SHH locus alters enhancer-promoter interactions and leads to acheiropodia. Nat. Commun. 12, 2282 (2021).
      5. Paliou, C., Guckelberger, P., Schöpflin, R., Heinrich, V., Esposito, A., Chiariello, A. M., Bianco, S., Annunziatella, C., Helmuth, J., Haas, S., Jerković, I., Brieske, N., Wittler, L., Timmermann, B., Nicodemi, M., Vingron, M., Mundlos, S. & Andrey, G. Preformed chromatin topology assists transcriptional robustness of Shh during limb development. Proc. Natl. Acad. Sci. U. S. A. 116, 12390-12399 (2019).
      6. Zuin, J., Roth, G., Zhan, Y., Cramard, J., Redolfi, J., Piskadlo, E., Mach, P., Kryzhanovska, M., Tihanyi, G., Kohler, H., Eder, M., Leemans, C., van Steensel, B., Meister, P., Smallwood, S. & Giorgetti, L. Nonlinear control of transcription through enhancer-promoter interactions. Nature 604, 571-577 (2022).
      7. Chakraborty, S., Kopitchinski, N., Zuo, Z., Eraso, A., Awasthi, P., Chari, R., Mitra, A., Tobias, I. C., Moorthy, S. D., Dale, R. K., Mitchell, J. A., Petros, T. J. & Rocha, P. P. Enhancer-promoter interactions can bypass CTCF-mediated boundaries and contribute to phenotypic robustness. Nat. Genet. 55, 280-290 (2023).
      8. Hung, T.-C., Kingsley, D. M. & Boettiger, A. N. Boundary stacking interactions enable cross-TAD enhancer-promoter communication during limb development. Nat. Genet. 56, 306-314 (2024).
      9. Hafner, A., Park, M., Berger, S. E., Murphy, S. E., Nora, E. P. & Boettiger, A. N. Loop stacking organizes genome folding from TADs to chromosomes. Mol. Cell 83, 1377-1392.e6 (2023).

      Significance

      The authors show convincing data that Mnx1 indeed responds transcriptionally to several Shh-enhancers located over 100 kb distal and on the wrong side of the TAD boundary. The data come from developing mouse embryos, span several tissues, and include key controls for specificity of the method. This provides convincing data with which to challenge the currently widely accepted view of TADs as a significant boundary, complementing the few examples that indicate that such regulation is possible in special cases (see further discussion in 2b below). I believe this work represents an important and substantive contribution to the field and should ultimately be published, after a few notable issues have been addressed.

      Audience: I believe this work will be of general interest to the eukaryotic transcription community, the 4D genome community, and the developmental biology community.

      My expertise: developmental biology, 4D genome biology, microscopy

    1. Author response:

      eLife Assessment

      This study addresses a novel and interesting question about how the rise of the Qinghai-Tibet Plateau influenced patterns of bird migration, employing a multi-faceted approach that combines species distribution data with environmental modeling. The findings are valuable for understanding avian migration within a subfield, but the strength of evidence is incomplete due to critical methodological assumptions about historical species-environment correlations, limited tracking data, and insufficient clarity in species selection criteria. Addressing these weaknesses would significantly enhance the reliability and interpretability of the results.

      We would like to thank you and two anonymous reviewers for your careful, thoughtful, and constructive feedback on our manuscript. These reviews made us revisit a lot of our assumptions and we believe the paper will be much improved as a result. In addition to minor points, we will make three main changes to our manuscript in response to the reviews. First, we will address the concerns on the assumptions of historical species-environment correlations from perspectives of both theoretical and empirical evidence. Second, we will discuss the benefits and limitations of using tracking data in our study and demonstrate how the findings of our study are consolidated with results of previous studies. Third, we will clarify our criteria for selecting species in terms of both eBird and tracking data.

      Below, we respond to each comment in turn. Once again, we thank you all for your feedback.

      Reviewer #1 (Public review):

      Strengths:

      This is an interesting topic and a novel theme. The visualisations and presentation are to a very high standard. The Introduction is very well-written and introduces the main concepts well, with a clear logical structure and good use of the literature. The methods are detailed and well described and written in such a fashion that they are transparent and repeatable.

      We appreciate the reviewer’s careful reading of our manuscript, encouraging comments and constructive suggestions.

      Weaknesses:

      I only have one major issue, which is possibly a product of the structure requirements of the paper/journal. This relates to the Results and Discussion, line 91 onwards. I understand the structure of the paper necessitates delving immediately into the results, but it is quite hard to follow due to a lack of background information. In comparison to the Methods, which are incredibly detailed, the Results in the main section reads as quite superficial. They provide broad overviews of broad findings but I found it very hard to actually get a picture of the main results in its current form. For example, how the different species factor in, etc.

      Yes, the journal requires this format (Methods following the Results and Discussion) for the article type of short reports. As suggested, in the revision we will elaborate on details of our findings, especially the species-specific responses, in terms of (i) shifts of distribution of avian breeding and wintering areas under the influence of the uplift of the Qinghai-Tibetan Plateau, and (ii) major factors that shape current migration patterns of birds in the Plateau. We will also better reference the approaches we used in the study.

      Reviewer #2 (Public review):

      Summary:

      The study tries to assess how the rise of the Qinghai-Tibet Plateau affected patterns of bird migration between their breeding and wintering sites. They do so by correlating the present distribution of the species with a set of environmental variables. The data on species distributions come from eBird. The main issue lies in the problematic assumption that species correlations between their current distribution and environment were about the same before the rise of the Plateau. There is no ground truthing and the study relies on Movebank data of only 7 species which are not even listed in the study. Similarly, the study does not outline the boundaries of breeding sites NE of the Plateau. Thus it is absolutely unclear potentially which breeding populations it covers.

      We are very grateful for the careful review and helpful suggestions. We will revise the manuscript carefully in response to the reviewer’s comments and believe that it will be much improved as a result. Below are our point-by-point replies to the comments.

      Strengths:

      I like the approach for how you combined various environmental datasets for the modelling part.

      We appreciate the reviewer’s encouragement.

      Weaknesses:

      The major weakness of the study lies in the assumption that species correlations between their current distribution and environments found today are back-projected to the far past before the rise of the Q-T Plateau. This would mean that species responses to the environmental cues do not evolve which is clearly not true. Thus, your study is a very nice intellectual exercise of too many ifs.

      This is a valid concern. We will address this from both the perspectives of the theoretical design of our study and empirical evidence.

      First, we agree with the reviewer that species responses to environmental cues might vary over time. Nonetheless, the simulated environments before the uplift of the plateau serve as a counterfactual state in our study. The counterfactual is an important concept to support causation claims by comparing what happened to what would have happened in a hypothetical situation: “If event X had not occurred, event Y would not have occurred” (Lewis 1973). Recent years have seen an increasing application of the counterfactual approach to detect biodiversity change, i.e., comparing diversity between the counterfactual state and real estimates to attribute the factors causing such changes (e.g., Gonzalez et al. 2023). Whilst we do not aim to provide causal inferences for avian distributional change, using the counterfactual approach, we are able to estimate the influence of the plateau uplift by detecting the changes of avian distributions, i.e., by comparing where the birds would have been distributed without the plateau to where they are currently distributed. We regard the counterfactual environments as a powerful tool for eliminating, to the extent possible, vagueness, as opposed to simply describing the current distributions of birds. Therefore, we assume species’ responses to environments are conservative and their evolution should not discount our findings. We will clarify this in both the Introduction and Methods.

      Second, we used species distribution modelling to contrast the distributions of birds before and after the uplift of the plateau under the assumption that species tend to keep their ancestral ecological traits over time (i.e., niche conservatism). This indicates a high probability for species to distribute in similar environments wherever suitable. Particularly, considering birds are more likely to be influenced by food resources (Martins et al. 2024), and the distribution of available food before the uplift (Jia et al. 2020), we believe the findings can provide valuable insights into the influence of the plateau on avian migratory patterns. Having said that, we acknowledge other factors, e.g., carbon dioxide concentrations (Zhang et al. 2022), can influence the simulations of environments and our prediction of avian distribution. We will clarify the assumptions and evidence we have for the modelling in Methods. We will further point out the direction for future studies in the Discussion.

      The second major drawback lies in the way you estimate the migratory routes of particular birds. No matter how good the data eBird provides is, you do not know population-specific connections between wintering and breeding sites. Some might overwinter in India, some populations in Africa and you will never know the teleconnections between breeding and wintering sites of particular species. The few available tracking studies (seven!) are too coarse and too limited in aspects of migratory connectivity to answer the target questions of your study.

      We agree with the reviewer that establishing interconnections for birds is important for estimating the migration patterns of birds. We employed a dynamic model to assess their weekly distributions. Thus, we can track the movement of species every week, and capture the breeding and wintering areas for specific populations. That being said, we acknowledge that our approach is subject to the patchy sampling of eBird data. We will better demonstrate this in the main text.

      Tracking data can provide valuable insights into the movement patterns of species but are limited to small numbers of species due to the considerable costs and time needed. We aimed to adopt the tracking data to examine the influence of focal factors on avian migration patterns, but, to the best of our ability, we were able to acquire data for only seven species. Moreover, similar results were found in studies that used tracking data to estimate the distribution of breeding and wintering areas of birds in the plateau (e.g., Prosser et al. 2011, Zhang et al. 2011, Zhang et al. 2014, Liu et al. 2018, Kumar et al. 2020, Wang et al. 2020, Pu and Guo 2023, Yu et al. 2024, Zhao et al. 2024). We believe the conclusions based on seven species are rigorous, but their implications could be restricted by the number of tracked species we obtained. We will demonstrate how our findings on breeding and wintering areas of birds are reinforced by other studies reporting the locations of those areas. We will also add a separate caveat section to discuss the limitations stated above.

      Your set of species is unclear, selection criteria for the 50 species are unknown and variability in their migratory strategies is likely to affect the direction of the effects.

      We will clarify the selection criteria for the 50 species. We first obtained a full list of birds in the plateau from Prins and Namgail (2017). We then extracted species identified as full migrants in Birdlife International (https://datazone.birdlife.org/species/spcdistPOS) from the full list.

      In addition, the position of the breeding sites relative to the Q-T plate will affect the azimuths and resulting migratory flyways. So in fact, we have no idea what your estimates mean in Figure 2.

      We calculated the azimuths not only by the angles between breeding sites and wintering sites but also based on the angles between the stopovers of birds. Therefore, the azimuths are influenced by the relative positions of breeding, wintering and stopover sites. We will better explain this both in the Methods and legend of Figure 2.
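
      The response above does not give the formula used for the azimuths. As a minimal sketch only, assuming the standard initial great-circle bearing between consecutive sites, the per-leg azimuths along a breeding, stopover and wintering sequence could be computed as follows. The function names and the example coordinates are hypothetical and are not taken from the manuscript.

```python
import math


def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2.

    Inputs are in decimal degrees; the result is in degrees clockwise
    from north, in the range [0, 360).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    east = math.sin(dlon) * math.cos(phi2)
    north = (math.cos(phi1) * math.sin(phi2)
             - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(east, north)) % 360.0


def leg_azimuths(track):
    """Bearing of every leg in an ordered list of (lat, lon) sites."""
    return [initial_bearing(*a, *b) for a, b in zip(track, track[1:])]


if __name__ == "__main__":
    # Hypothetical breeding -> stopover -> wintering sequence (lat, lon).
    track = [(37.0, 100.0), (30.0, 95.0), (22.0, 90.0)]
    for azimuth in leg_azimuths(track):
        print(f"{azimuth:.1f} degrees")
```

      Because each leg gets its own bearing, the values depend on the relative positions of all three kinds of site and not only on the breeding and wintering endpoints.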

      There is no way one can assess the performance of your statistical exercises, e.g. performances of the models.

      As suggested, we will add the AUC values to assess the performances of the models.
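
      For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen presence site receives a higher predicted suitability than a randomly chosen background (or absence) site. The sketch below computes it directly from that definition. It is an illustration under that assumption, not the authors' evaluation code, and the function name and scores are hypothetical.

```python
def auc(presence_scores, background_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (presence, background) pairs in which the
    presence site scores higher; ties count as half a win.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


if __name__ == "__main__":
    # Hypothetical predicted suitabilities from a distribution model.
    presence = [0.9, 0.4]
    background = [0.5, 0.1]
    print(auc(presence, background))
```

      A value of 0.5 corresponds to a model that ranks sites no better than chance, and 1.0 to perfect separation of presence from background sites.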

      References

      Gonzalez, A., J. M. Chase, and M. I. O'Connor. 2023. A framework for the detection and attribution of biodiversity change. Philosophical Transactions of the Royal Society B: Biological Sciences 378: 20220182.

      Jia, Y., H. Wu, S. Zhu, Q. Li, C. Zhang, Y. Yu, and A. Sun. 2020. Cenozoic aridification in Northwest China evidenced by paleovegetation evolution. Palaeogeography, Palaeoclimatology, Palaeoecology 557:109907.

      Kumar, N., U. Gupta, Y. V. Jhala, Q. Qureshi, A. G. Gosler, and F. Sergio. 2020. GPS-telemetry unveils the regular high-elevation crossing of the Himalayas by a migratory raptor: implications for definition of a “Central Asian Flyway”. Scientific Reports 10:15988.

      Lewis, D. 1973. Counterfactuals. Oxford: Blackwell.

      Liu, D., G. Zhang, H. Jiang, and J. Lu. 2018. Detours in long-distance migration across the Qinghai-Tibetan Plateau: individual consistency and habitat associations. PeerJ 6:e4304.

      Martins, L. P., D. B. Stouffer, P. G. Blendinger, K. Böhning-Gaese, J. M. Costa, D. M. Dehling, C. I. Donatti, C. Emer, M. Galetti, R. Heleno, Í. Menezes, J. C. Morante-Filho, M. C. Muñoz, E. L. Neuschulz, M. A. Pizo, M. Quitián, R. A. Ruggera, F. Saavedra, V. Santillán, M. Schleuning, L. P. da Silva, F. Ribeiro da Silva, J. A. Tobias, A. Traveset, M. G. R. Vollstädt, and J. M. Tylianakis. 2024. Birds optimize fruit size consumed near their geographic range limits. Science 385:331-336.

      Prins, H. H. T., and T. Namgail. 2017. Bird migration across the Himalayas: wetland functioning amidst mountains and glaciers. Cambridge University Press, Cambridge.

      Prosser, D. J., P. Cui, J. Y. Takekawa, M. Tang, Y. Hou, B. M. Collins, B. Yan, N. J. Hill, T. Li, Y. Li, F. Lei, S. Guo, Z. Xing, Y. He, Y. Zhou, D. C. Douglas, W. M. Perry, and S. H. Newman. 2011. Wild bird migration across the Qinghai-Tibetan Plateau: a transmission route for highly pathogenic H5N1. PloS One 6:e17622.

      Pu, Z., and Y. Guo. 2023. Autumn migration of black-necked crane (Grus nigricollis) on the Qinghai-Tibetan and Yunnan-Guizhou plateaus. Ecology and Evolution 13:e10492.

      Wang, Y., C. Mi, and Y. Guo. 2020. Satellite tracking reveals a new migration route of black-necked cranes (Grus nigricollis) in Qinghai-Tibet Plateau. PeerJ 8:e9715.

      Yu, X., G. Song, H. Wang, Q. Wei, C. Jia, and F. Lei. 2024. Migratory flyways and connectivity of brown headed gulls (Chroicocephalus brunnicephalus) revealed by GPS tracking. Global Ecology and Conservation 56:e03340.

      Zhang, G.G., D.P. Liu, Y.Q. Hou, H.X. Jiang, M. Dai, F.W. Qian, J. Lu, T. Ma, L.X. Chen, and Z. Xing. 2014. Migration routes and stopover sites of Pallas’s gulls Larus ichthyaetus breeding at Qinghai Lake, China, determined by satellite tracking. Forktail 30:104-108.

      Zhang, G.G., D.P. Liu, Y.Q. Hou, H.X. Jiang, M. Dai, F.W. Qian, J. Lu, Z. Xing, and F.S. Li. 2011. Migration routes and stop-over sites determined with satellite tracking of bar-headed geese (Anser indicus) breeding at Qinghai Lake, China. Waterbirds 34:112-116.

      Zhang, R., D. Jiang, C. Zhang, and Z. Zhang. 2022. Distinct effects of Tibetan Plateau growth and global cooling on the eastern and central Asian climates during the Cenozoic. Global and Planetary Change 218:103969.

      Zhao, T., W. Heim, R. Nussbaumer, M. van Toor, G. Zhang, A. Andersson, J. Bäckman, Z. Liu, G. Song, M. Hellström, J. Roved, Y. Liu, S. Bensch, B. Wertheim, F. Lei, and B. Helm. 2024. Seasonal migration patterns of Siberian Rubythroat (Calliope calliope) facing the Qinghai–Tibet Plateau. Movement Ecology 12:54.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Plasmacytoid dendritic cells (pDCs) are the major producers of type I interferon after viral infections and play key role in antiviral immune response. This article by Joshi et al. investigates the role of pDCs in regulating the Hepatitis E virus (HEV) infection. In Fig. 1, the authors investigated the immunocompetence of different cell lines and HepG2/C3A and PLC3 were chosen for further studies. By utilizing a combination of flow cytometry, RT-qPCR and other techniques, the authors showed in Fig. 2 that the cell-cell contacts between pDCs and HEV infected cells induce the pDCs to secrete interferon (IFN). This interaction is mediated by cell adhesion molecules and is dependent on TLR7 signaling. The authors then went on to show that the IFN produced by pDCs controlled the viral spread. Further, using several mutant forms of ORF2 protein and utilizing imaging, RT-qPCR and other techniques, in Fig. 3 and 4 the authors elucidated the importance of the glycosylation pattern, localization of different forms of HEV ORF2 protein, cell-cell contact in triggering the immune response. Overall, this study provided insights in the pDC mediated IFN response against HEV.

      Major comments:

      1. The authors report that in the PLC3 cells, STOP mutation significantly reduced IFN⍺ production (Fig. 3f), significantly reduced pDC contact with infected cells (Fig. 4c) and thus concluded that the ORF2g/c is involved in pDC-infected cell interaction and IFN⍺ production. However, in the HepG2/C3A cells, the STOP mutation does not decrease the IFN⍺ production (Fig. 3e). In the manuscript, one of the key conclusions is that the glycosylated form of ORF2 leads to better recognition of the infected cells by pDC. So, it is critical that the difference in the IFN⍺ production between these two cell lines with STOP mutation is addressed with further details.
      2. The authors show that the IFN⍺ response was reduced in 5R/5A mutant HepG2/C3A cells (Fig. 3e), whereas the IFN⍺ response was completely absent in 5R/5A mutant PLC3 cells (Fig. 3f). The authors suggested that the difference in IFN⍺ response may be due to lack of ORF2i in PLC3 and other cell specific regulation in HepG2/C3A. Further evidence for this differential regulation would strengthen the claim.
      3. In the PLC3-pDC co-culture experiment (Fig. 2b), there is already an induction of IFN-λ1 (Interferon Lambda 1) in the uninfected PLC3-pDC co-culture (right panel, Fig. 2b). An explanation for the IFN-λ1 (Interferon Lambda 1) expression in the uninfected state would be helpful.

      Additional comments:

      1. Authors checked the expression of two ISGs- MXA, ISG15 in Fig. 1a-c, 2a-b. Were the expressions of other ISGs, such as members of OAS family (OAS1, OAS2 etc.), IFITM family or any other ISGs checked? This may be helpful, since in the Fig. 2c there is IFN⍺ production in pDC-infected PLC3 co-culture, but the ISGs (MXA, ISG15) are not upregulated significantly in Fig. 2b.
      2. In the HepG2/C3A-pDC co-culture experiment (Fig. 2a), there is not much difference in IFN-λ1 (Interferon Lambda 1) level in the infected HepG2/C3A-pDC co-culture (right panel, Fig. 2a) in comparison to infected HepG2/C3A alone (left panel, Fig. 2b), and also this outcome is different from that in the PLC3 experiment (Fig. 2b). Further clarification would help to support the conclusion regarding the IFN-λ1 (Interferon Lambda 1) upregulation in HEV infected cells-pDC co-culture.
      3. The authors show that in the pDC-PLC3 co-culture system, IFN⍺ was induced at 18h (Fig. 2c-2e), but the viral replication was not decreased in PLC3 cells (Fig. 2g). But, the HepG2/C3A-pDC co-culture has reduced viral replication at 18h (Fig. 2f). An explanation for the difference in the observation in two different cell lines at the same timepoints would strengthen the antiviral role of pDCs on HEV infected cells.
      4. The authors quantified the fold change in HEV infected PLC3+ cells in Fig. 2h. Was it performed by flow cytometry? It would be helpful to mention it in the figure legend. Also, if the said quantitation was done by flow cytometry, performing a similar assay with HepG2/C3A cells at 48h would provide the readers a better idea about the antiviral response across the cell lines at comparable timepoints.

      Minor comments:

      1. Was it expected to observe the increased induction of IL6 (Fig. 1b) in HepG2/C3A cells (but not in other cell lines) after IFN-λ (Interferon Lambda) treatment?
      2. In Fig. 3e, for the WT cells, 4 datapoints are visible while in the legend it is mentioned n=5.
      3. Typo: IRS661 in line 263, 699, Figure 2e.
      4. Typo: 200l in line 579.
      5. Catalogue number for ELISA kit is missing (Line 584).
      6. It would be helpful if the color code for the imaging in Supplementary figure 2f is provided on the top of the images, as it is provided in other images.

      Significance

      This article by Joshi et al. provides insight about the role of pDCs in controlling the HEV infection. However, the importance of pDC-infected cell contact mediated IFN-I secretion in antiviral response has been previously shown by the authors' group (Assil et al., 2019, Cell Host & Microbe) and others as well (E.g., Yun et al., 2021, Sci. Immunol.). The involvement of integrin mediated cell adhesion and TLR signaling in mediating this response was also shown. Though this manuscript does not advance the field of pDC biology or virology significantly, it does provide better understanding of the pDC antiviral response in the landscape of HEV infection. Although, it is out of the scope of this manuscript, elucidation of the mechanistic regulation how ORF2g/c controls the pDC-infected cell contact would be of great interest and significance. Overall, this study could be of interest to a general audience, especially to the virologists and researchers working in pDC biology.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Using a knock-out mutant strain, the authors tried to decipher the role of the last gene in the mycofactocin operon, mftG. They found that MftG was essential for growth in the presence of ethanol as the sole carbon source, but not for the metabolism of ethanol, evidenced by the equal production of acetaldehyde in the mutant and wild type strains when grown with ethanol (Fig 3). The phenotypic characterization of ΔmftG cells revealed a growth-arrest phenotype in ethanol, reminiscent of starvation conditions (Fig 4). Investigation of cofactor metabolism revealed that MftG was not required to maintain redox balance via NADH/NAD+, but was important for energy production (ATP) in ethanol. Since mycobacteria cannot grow via substrate-level phosphorylation alone, this pointed to a role of MftG in respiration during ethanol metabolism. The accumulation of reduced mycofactocin points to impaired cofactor cycling in the absence of MftG, which would impact the availability of reducing equivalents to feed into the electron transport chain for respiration (Fig 5). This was confirmed when looking at oxygen consumption in membrane preparations from the mutant and wild type strains with reduced mycofactocin electron donors (Fig 7). The transcriptional analysis supported the starvation phenotype, as well as perturbations in energy metabolism, and may be beneficial if described prior to respiratory activity data.

      The data and conclusions support the role of MftG in ethanol metabolism.

      We thank the reviewer for the positive evaluation of our manuscript.

      Reviewer #3 (Public review):

      Summary:

      The work by Graca et al. describes a GMC flavoprotein dehydrogenase (MftG) in the ethanol metabolism of mycobacteria and provides evidence that it shuttles electrons from the mycofactocin redox cofactor to the electron transport chain.

      Strengths:

      Overall, this study is compelling, exceptionally well designed and thoroughly conducted. An impressively diverse set of different experimental approaches is combined to pin down the role of this enzyme and scrutinize the effects of its presence or absence in mycobacteria cells growing on ethanol and other substrates. Other strengths of this work are the clear writing style and stellar data presentation in the figures, which makes it easy also for non-experts to follow the logic of the paper. Overall, this work therefore closes an important gap in our understanding of ethanol oxidation in mycobacteria, with possible implications for the future treatment of bacterial infections.

      Weaknesses:

      I see no major weaknesses of this work, which in my opinion leaves no doubt about the role of MftG.

      We thank the reviewer for the positive evaluation of our manuscript.

      Reviewer #4 (Public review):

      Summary:

      The manuscript by Graça et al. explores the role of MftG in the ethanol metabolism of mycobacteria. The authors hypothesise that MftG functions as a mycofactocin dehydrogenase, regenerating mycofactocin by shuttling electrons to the respiratory chain of mycobacteria. Although the study primarily uses M. smegmatis as a model microorganism, the findings have more general implications for understanding mycobacterial metabolism. Identifying the specific partner to which MftG transfers its electrons within the respiratory chain of mycobacteria would be an important next step, as pointed out by the authors.

      Strengths:

      The authors have used a wide range of tools to support their hypothesis, including co-occurrence analyses, gene knockout and complementation experiments, as well as biochemical assays and transcriptomics studies.

      An interesting observation that the mftG deletion mutant grown on ethanol as the sole carbon source exhibited a growth defect resembling a starvation phenotype.

      MftG was shown to catalyse the electron transfer from mycofactocinol to components of the respiratory chain, highlighting the flexibility and complexity of mycobacterial redox metabolism.

      Weaknesses:

      Could the authors elaborate more on the differences between the WT strains in Fig. 3C and 3E? In Fig. 3C, the ethanol concentration for the WT strain is similar to that of WT-mftG and ∆mftG-mftG, whereas the acetate concentration in the WT strain differs significantly from the other two strains. How does this observation relate to ethanol oxidation, as indicated on page 12?

      This is a good question, and we agree with the reviewer that the sum of processes leading to the experimental observations shown in Figure 3 are not completely understood. For instance, when looking at ethanol concentrations, evaporation is a dominating effect and the situation is furthermore confounded by the fact that the rate of ethanol evaporation appears to be inversely correlated to the optical density of the samples (see Figure 3E and compare media control as well as the samples of ∆mftG and ∆mftG at OD600 = 1). Additionally, the growth rate and thus the OD600 of all strains monitored are different at each time point, thus further complicating the analysis. This is why we assume that the rate of ethanol oxidation is mirrored more clearly by acetate formation, at least in the early phase before 48 h (Figure 3E), i.e., before acetate consumption becomes dominant in ∆mftG-mftG and WT-mftG. Here, we see that the rate of acetate formation is zero for media controls, low for ∆mftG, but high for WT as well as ∆mftG-mftG and WT-mftG. The latter two strains also showed an earlier starting point of growth as well as acetate formation and the following phase of acetate depletion.

      All of these observations are in line with our general statement, i.e., “Parallel to the accelerated and enhanced growth described above (Figure 3A), the overexpression strains displayed higher rates of ethanol consumption as well as an earlier onset of acetate overflow metabolism and acetate consumption (Figure 3D).” We are still convinced that this summary describes the findings well and avoids unnecessary speculation.

      The authors conclude from their functional assays that MftG catalyses single-turnover reactions, likely using FAD present in the active site as an electron acceptor. While this is plausible, the current experimental setup doesn't fully support this conclusion, and the language around this claim should be softened.

      This is a fair point. We revised our claim accordingly. In particular, we changed:

      Page 28: we added “possibly”

      Page 28: we changed “single-turnover reactions” to “reactions reminiscent of a single-turnover process”.

      The authors suggest in the manuscript that the quinone pool (page 24) may act as the electron acceptor from mycofactocinol, but later in the discussion section (page 30) they propose cytochromes as the potential recipients. If the authors consider both possibilities valid, I suggest discussing both options in the manuscript.

      This is true. However, no change to the manuscript is necessary, since both options were discussed on page 30.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The authors addressing some of the original recommendations is appreciated e.g. title change. Other recommendations that were not adequately addressed would mostly improve the clarity and help comprehension for the reader, but they are at the author's discretion.

      Reviewer #3 (Recommendations for the authors):

      Abstract: "Here, we show that MftG enzymes strictly require mft biosynthetic genes and are found in 75% of organisms harboring these genes". I read this sentence several times and I am still somewhat confused and not sure what exactly is meant here. I suggest to rephrase, e.g., to "Here, we show that in 75% of all organisms that harbour the mft biosynthetic genes, MftG enzymes are also encoded and functionally associated with these genes" (if that was meant; also the abbreviation mft should be introduced in the abstract or otherwise the full name be used).

      We thank the reviewer for the good hint. We changed the sentence to “Here, we show that MftG enzymes are almost exclusively found in genomes containing mycofactocin biosynthetic genes and are present in 75% of organisms harboring these genes”.

      p.3, 2nd paragraph: "Although the role of MFT in alcohol metabolism is well established, further biological roles of mycofactocin appear to exist." Mycofactocin is once written as MFT and once in full length, which is slightly confusing. Consider rephrasing, e.g., to "...further biological roles of this cofactor appear to exist".

      Thank you, we adopted the suggested change.

      Fig. 1: Consider adding MftG in brackets after "mycofactocin dehydrogenase" in panel B.

      Good suggestion. We added (MftG) to the figure.

      Fig. 3: Legend should be corrected. The color of the signs should be teal diamond for "M. smegmatis double presence of the mftG gene" and orange upward facing triangle for "Medium with 10 g L-1 of ethanol without bacterial inoculation". Aside from the coloration, the order should ideally also be identical to the one shown in the upper right part.

      Thank you for the valuable hint! We corrected the legend and unified the legends in the figure caption and figure.

      p.20 : It is not exactly clear to me why "semipurified cell-free extracts from M. smegmatis ∆mftG-mftGHis6 " were used here rather than the purified enzyme. Was the purification by HisTrap columns not feasible or was the protein unstable when fully purified? In any case, it would help the reader to quickly state the reason in this section.

      Indeed, the problem with M. smegmatis as an expression host was a combination of low protein yield and poor binding to Ni-NTA columns. In E. coli, poor expression, low solubility or poor binding was the issue. Unfortunately, the usage of other affinity tags resulted in either poor expression or inactive protein. We have briefly mentioned the major issues on page 21 and prefer not to focus on failed attempts too much.

      p. 21: "We, therefore, concluded that MftG can indeed interact with mycofactocins as electron donors but might require complex electron acceptors, for instance, proteins present in the respiratory chain." I agree. For the future it might be worthwhile to determine the redox potential of MftG, which could provide hints on the natural electron acceptor.

      Thank you for the suggestion. We will consider this question in our future work.

      p. 23: "In M. smegmatis, cyanide is a known inhibitor of the cytochrome bc/aa3 but not of cytochrome bd (34), therefore, the decrease of oxygen consumption when MFTs were added to the membrane fractions in combination with KCN (Figure 7), revealed that MFT-induced oxygen consumption is indeed linked to mycobacterial respiration." It might be a good idea to quickly recapitulate the functions of these cytochromes here. Also, I think it should read "bc1aa3" (also correct in legend of Fig. 8 that says "bcc-aa3").

      Thank you for the good observation. We changed all instances to the correct designation (bc1-aa3).

      Reviewer #4 (Recommendations for the authors):

      Abstract: revise the wording "MftG enzymes strictly require mft biosynthetic genes". It should be either mftG gene with the mft biosynthetic genes or MftG enzyme with the Mft biosynthetic proteins. I also suggest replacing "require" with a more appropriate term.

      This was taken care of. See above.

      Page 3, end of the first paragraph; does the alcohol dehydrogenase refer to Mno/Mdo?

      Partially, yes, but also to other alcohol dehydrogenases.

      Page 4, radical SAM; define upon first use

      Good point; we changed “radical SAM” to “radical S-adenosyl methionine (rSAM)”.

      Page 6; Rossman fold refers to the fold and not only the FAD binding pocket.

      Good point. We deleted “(Rossman fold)”

      Page 11; not exactly sure what this means "the growth curve of the complemented strain, which could be dysregulated in mftG expression"

      By “dysregulated” expression, we mean that the expression of mftG could be higher or lower than in the WT and could follow different regulatory signals than in the wild type. Since this phenomenon is not well understood, we would like to avoid speculative discussions.

      Page 11; Figures 2E and 2C should be 3E and 3C. Likewise on page 12 Figure 2D.

      Thank you very much for the valuable hint. We corrected the figure numbers as suggested.

      Page 12; the last Figure 3D in the page should be 3E?

      Yes, good catch, we corrected the Figure number.

      Page 17, KO; define upon first use.

      Good suggestion. We changed both instances of “KO” to “knockout”.

      Page 24; revise: "for instance. For example"

      We deleted “for instance”.

      Page 26; change 6.506 to 6,506

      Corrected.

      Page 23; "In M. smegmatis, cyanide is a known inhibitor ..." is too long and not easy to understand/follow.

      Good suggestion. We simplified the sentence to “Therefore, the decrease of oxygen consumption in the presence of KCN (Figure 7) revealed…”

      Page 29; "single-turnover reactions could be observed". There are no experiments to support this statement, except the results shown in Figure 7F. I suggest softening the language, as it has been done on page 21. To claim single-turnover, a proper kinetic analysis would be necessary, which is not included in the current manuscript.

      This is true and has been taken care of. See above.

      Figure 1; Indicate mycofactocin dehydrogenase as MftG

      Done.

      Figure 5A; what is the significance of comparing ∆mftG glucose with WT ethanol?

      We agree that, although the difference between the two columns is significant, it has no relevant meaning. Therefore, we removed the bracket with the p-value in Panel A.

      Make HdB-Tyl/HdB-tyloxapol usage consistent throughout the document. Likewise, re the usage of mycobacteria/Mycobacteria/Mycobacteria

      Thank you for the valuable hint. We unified the usage throughout the document.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      This is a comprehensive study that sheds light on how Wag31 functions and localises in mycobacterial cells. A clear link to interactions with CL is shown using microscopy in combination with fluorescent fusion constructs and lipid-specific dyes. Furthermore, studies using mutant versions of Wag31 shed light on the functionalities of each domain in the protein. My concerns/suggestions for the manuscript are minor:

      (1) Ln 130. A better clarification/discussion is required here. It is clear that both depletion and overexpression have an effect on levels of various lipids, but subsequent descriptions show that they affect different classes of lipids.

      We thank the reviewer for the comments. We will clarify Ln 130 in the manuscript. The lipid classes affected by Wag31 depletion differ from those affected by its overexpression. Wag31 is an adaptor protein that interacts with proteins of the ACCase complex (Meniche et al., 2014; Xu et al., 2014), which synthesize fatty acid precursors, and regulates their activity (Habibi Arejan et al., 2022).

      The varied response to lipid homeostasis could be attributed to a change in the stoichiometry of these interactions with Wag31. While Wag31 depletion would prevent such interactions from occurring and might affect lipid synthesis that directly depends on Wag31-protein partner interactions, its overexpression would lead to promiscuous interactions and a change in the stoichiometry of native interactions, ultimately modulating lipid synthesis pathways.

      (2) The pulldown assays results are interesting, but links are tentative.

      The interactome of Wag31 was identified through the immunoprecipitation of Flag-tagged Wag31 complemented at an integrative locus in Wag31 mutant background to avoid overexpression artifacts. We used Msm::gfp expressing an integrative copy (at L5 locus) of FLAG-GFP as a control to subtract non-specific interactions. The experiment was performed in biological triplicates, and interactors that appeared in all replicates were selected for further analysis. Although we identified more than 100 interactors of Wag31, we analyzed only the top 25 hits, with a PSM cut-off ≥18 and unique peptides ≥5. Additionally, two of Wag31's established interactors, AccD5 and Rne, were among the top five hits, thus validating our data.

      Though we agree that the interactions can either be direct or through a third partner, the fact that we obtained known interactors of Wag31 makes us believe these interactions are genuine. Moreover, we performed pulldown experiments for validation by mixing E. coli lysates expressing His-Wag31 full-length or truncated protein with M. smegmatis lysates expressing FLAG-tagged interacting proteins. The wash conditions used for these pull-down assays were quite stringent: the wash buffer contained 1% Triton X100, eliminating all non-specific and indirect interactions. However, we agree that we cannot conclusively state that the interactions are direct without purifying the proteins and performing the experiment. We will describe this caveat in the revised manuscript.
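
      The selection criteria described in this response (presence in all three replicates, absence from the FLAG-GFP control, PSM ≥18, unique peptides ≥5, top 25 hits) can be sketched as a simple filter. This is only an illustration: the protein names and counts are hypothetical placeholders, and ranking the surviving hits by PSM is an assumption, since the response does not state how the top 25 were ordered.

```python
from dataclasses import dataclass

# All protein names and numbers below are hypothetical placeholders,
# not values from the study.
@dataclass
class Hit:
    protein: str
    psm: int               # peptide-spectrum matches
    unique_peptides: int
    replicates_seen: int   # biological replicates in which the hit appeared
    in_control: bool       # also recovered with the FLAG-GFP control

def select_interactors(hits, n_replicates=3, min_psm=18, min_unique=5, top_n=25):
    """Keep hits that pass every criterion, then return the top_n by PSM."""
    kept = [
        h for h in hits
        if not h.in_control
        and h.replicates_seen == n_replicates
        and h.psm >= min_psm
        and h.unique_peptides >= min_unique
    ]
    # Assumption: hits are ranked by PSM count, highest first.
    kept.sort(key=lambda h: h.psm, reverse=True)
    return kept[:top_n]

hits = [
    Hit("proteinA", 40, 12, 3, False),
    Hit("proteinB", 25, 4, 3, False),   # too few unique peptides
    Hit("proteinC", 30, 9, 2, False),   # missing from one replicate
    Hit("proteinD", 50, 15, 3, True),   # also present in the control
    Hit("proteinE", 18, 5, 3, False),   # exactly at both cut-offs
]
selected = [h.protein for h in select_interactors(hits)]
print(selected)
```

      A filter of this kind removes background binders but, as noted above, cannot by itself distinguish direct from indirect interactions.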

      (3) The authors may perhaps like to rephrase claims of effects on lipid homeostasis, as my understanding is that lipid localisation rather than catabolism/breakdown is affected.

      In this manuscript, we are trying to convey that Wag31 is a spatiotemporal regulator of lipid metabolism. It is a peripheral protein that is hooked to the membrane via Cardiolipin and forms a scaffold at the poles, which helps localize several enzymes involved in lipid metabolism.

      Homeostasis is the process by which an organism maintains a steady state of balance and stability in response to changes. Depletion of Wag31 not only results in delocalisation of lipids in intracellular lipid inclusions but also leads to changes in the levels of various lipid classes. Advances in the field of spatial biology underscore the importance of the native localization of various biological molecules for maintaining the steady state of the cell. Hence, we have used the word “homeostasis” to describe both of the changes observed in lipid metabolism.

      Reviewer #2 (Public review):

      Summary

      Kapoor et al. investigated the role of the mycobacterial protein Wag31 in lipid and peptidoglycan synthesis and sought to delineate the role of the N- and C-terminal domains of Wag31. They demonstrated that modulating Wag31 levels influences lipid homeostasis in M. smegmatis and cardiolipin (CL) localisation in cells. Wag31 was found to preferentially bind CL-containing liposomes, and deleting the N-terminus of the protein significantly decreased this interaction. Novel interactions between Wag31 and proteins involved in lipid metabolism and cell wall synthesis were identified, suggesting that Wag31 recruits proteins to the intracellular membrane domain by direct interaction.

      Strengths:

      (1) The importance of Wag31 in maintaining lipid homeostasis is supported by several lines of evidence.

      (2) The interaction between Wag31 and cardiolipin, and the role of the N-terminus in this interaction was convincingly demonstrated.

      Weaknesses:

      (1) MS experiments provide some evidence for novel protein-protein interactions. However, the pull-down experiments lack a valid negative control.

      We thank the reviewer for the comments. We will include a valid negative control in the experiment. We would choose ~2 mycobacterial proteins that are not a part of our interactome study and perform a similar pull-down experiment with them and a positive control (known interactor of Wag31).

      (2) The role of the N-terminus in the protein-protein interaction has not been ruled out.

      Previously, we attempted to express the N-terminal (1-60 aa) and the C-terminal (60-212 aa) proteins in various mycobacterial shuttle vectors to perform MS/MS experiments. Despite numerous efforts, neither was expressed, with or without an N/C-terminal FLAG tag, in episomal or integrative vectors due to the instability of the protein. Eventually, we successfully expressed the C-terminal Wag31 with N- and C-terminal hexa-His tags. However, this expression was not sufficient or stable enough for us to perform Ni affinity pull-down experiments for mass spectrometry. The N-terminal of Wag31 could not be expressed in M. smegmatis even with N- and C-terminal hexa-His tags.

      To rule out the role of the N-terminal in mediating protein-protein interactions, we plan to attempt to express the N-terminal of Wag31 with N- and C-terminal hexa-His tags in E. coli. If this clone successfully expresses in E. coli, we will perform pull-down experiments as described in Figure 7.

      Reviewer #3 (Public review):

      Summary:

      This manuscript describes the characterization of mycobacterial cytoskeleton protein Wag31, examining its role in orchestrating protein-lipid and protein-protein interactions essential for mycobacterial survival. The most significant finding is that Wag31, which directs polar elongation and maintains the intracellular membrane domain, was revealed to have membrane tethering capabilities.

      Strengths:

      The authors provided a detailed analysis of Wag31 domain architecture, revealing distinct functional roles: the N-terminal domain facilitates lipid binding and membrane tethering, while the C-terminal domain mediates protein-protein interactions. Overall, this study offers a robust and new understanding of Wag31 function.

      Weaknesses:

      The following major concerns should be addressed.

      • Authors use 10-N-Nonyl-acridine orange (NAO) as a marker for cardiolipin localization. However, given that NAO is known to bind to various anionic phospholipids, how do the authors know that what they are seeing is specifically visualizing cardiolipin and not a different anionic phospholipid? For example, phosphatidylinositol is another abundant anionic phospholipid in mycobacterial plasma membrane.

      We thank the reviewer for the comments. Despite its promiscuous binding to other anionic phospholipids, 10-N-Nonyl-acridine orange is widely used to stain Cardiolipin and determine its localisation in bacterial cells and mitochondria of eukaryotes (Garcia Fernandez et al., 2004; Mileykovskaya & Dowhan, 2000; Renner & Weibel, 2011). This is because it has a stronger affinity for Cardiolipin than for other anionic phospholipids, with an affinity constant of 2 × 10⁶ M⁻¹ for Cardiolipin and 7 × 10⁴ M⁻¹ for phosphatidylserine and phosphatidylinositol (Petit et al., 1992). Additionally, no other stain is yet available for detecting Cardiolipin. Our protein-lipid binding assays suggest that Wag31 preferentially binds to Cardiolipin over other anionic phospholipids (Fig. 4b). Hence, it is likely that most of the redistribution of NAO fluorescence that we observe reflects Cardiolipin mislocalization due to altered Wag31 levels, with a smaller degree of NAO redistribution coming indirectly from other anionic phospholipids displaced from the membrane by the loss of membrane integrity and the cell shape changes that accompany altered Wag31 levels.
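
      As a quick check on how strong this preference is, the two affinity constants quoted above from Petit et al. (1992) can simply be divided. The variable names are ours, and the calculation says nothing about the relative abundance of each lipid in the mycobacterial membrane, which also shapes the staining pattern.

```python
# Affinity constants quoted above from Petit et al. (1992), in M^-1.
K_CARDIOLIPIN = 2e6
K_OTHER_ANIONIC = 7e4  # phosphatidylserine and phosphatidylinositol

# Ratio of the two association constants.
fold_preference = K_CARDIOLIPIN / K_OTHER_ANIONIC
print(f"NAO affinity for cardiolipin is about {fold_preference:.1f}-fold higher")
```

      The ratio is a little under 30-fold, which is consistent with describing NAO as preferential rather than strictly specific for Cardiolipin.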

      • Authors' data show that the N-terminal region of Wag31 is important for membrane tethering. The authors' data also show that the N-terminal region is important for sustaining mycobacterial morphology. However, the authors' statement in Line 256 "These results highlight the importance of tethering for sustaining mycobacterial morphology and survival" requires additional proof. It remains possible that the N-terminal region has another unknown activity, and this yet-unknown activity rather than the membrane tethering activity drives the morphological maintenance. Similarly, the N-terminal region is important for lipid homeostasis, but the statement in Line 270, "the maintenance of lipid homeostasis by Wag31 is a consequence of its tethering activity" requires additional proof. The authors should tone down these overstatements or provide additional data to support their claims.

      We agree with the reviewer that there exists a possibility for another function of the N-terminal that may contribute to sustaining mycobacterial physiology and survival. We will revise our statements in the paper to accurately reflect the data. The results shown suggest that the tethering activity of the N-terminal region may contribute to mycobacterial morphology and survival. However, additional functions of this region can’t be ruled out. Similarly, the maintenance of lipid homeostasis by Wag31 may be associated with its tethering activity, although other mechanisms could also contribute to this process.

      • Authors suggest that Wag31 acts as a scaffold for the IMD (Fig. 8). However, Meniche et al. have shown that MurG as well as GlfT2, two well-characterized IMD proteins, do not colocalize with Wag31 (DivIVA) (https://doi.org/10.1073/pnas.1402158111). IMD proteins are always slightly subpolar while Wag31 is localized to the tip of the cell. Therefore, the authors' biochemical data cannot be easily reconciled with microscopic observations in the literature. This raises a question regarding the validity of the protein-protein interactions shown in Figure 7. Since this pull-down assay was conducted by mixing E. coli lysate expressing Wag31 and Msm lysate expressing Wag31 interactors like MurG, it is possible that the interactions are not direct. Authors should interpret their data more cautiously. If authors cannot provide additional data and sufficient justifications, they should avoid proposing a confusing model like Figure 8 that contradicts published observations.

      In the literature, MurG and GlfT2 have been shown to have polar localization (Freeman et al., 2023; Hayashi et al., 2016; Kado et al., 2023), and two groups have shown slightly sub-polar localization of MurG (García-Heredia et al., 2021; Meniche et al., 2014). Additionally, Freeman et al. (2023) showed SepIVA to be a spatio-temporal regulator of MurG. MS/MS analysis of Wag31 immunoprecipitation data yielded both MurG and SepIVA as interactors of Wag31 (Fig. 3). Given that Wag31 also displays polar localisation, it likely associates with the polar MurG. However, since a sub-polar localization of MurG has also been reported, it is possible that they do not interact directly and that another protein mediates their interaction. We will modify the model proposed in Fig. 8 based on the above.

      To validate the interactions, we performed pulldown experiments by mixing E. coli lysates expressing His-Wag31 full-length or truncated protein with M. smegmatis lysates expressing FLAG-tagged interacting proteins. The wash conditions used for these pull-down assays were quite stringent: the wash buffer contained 1% Triton X100, which eliminates all non-specific and indirect interactions. However, we agree that we cannot conclusively state that the interactions are direct without purifying the proteins and performing the experiment. We will describe this caveat in the revised manuscript and propose a model reflecting our results.

      References:

      Freeman, A. H., Tembiwa, K., Brenner, J. R., Chase, M. R., Fortune, S. M., Morita, Y. S., & Boutte, C. C. (2023). Arginine methylation sites on SepIVA help balance elongation and septation in Mycobacterium smegmatis. Mol Microbiol, 119(2), 208-223. https://doi.org/10.1111/mmi.15006

      Garcia Fernandez, M. I., Ceccarelli, D., & Muscatello, U. (2004). Use of the fluorescent dye 10-N-nonyl acridine orange in quantitative and location assays of cardiolipin: a study on different experimental models. Anal Biochem, 328(2), 174-180. https://doi.org/10.1016/j.ab.2004.01.020

      García-Heredia, A., Kado, T., Sein, C. E., Puffal, J., Osman, S. H., Judd, J., Gray, T. A., Morita, Y. S., & Siegrist, M. S. (2021). Membrane-partitioned cell wall synthesis in mycobacteria. eLife, 10. https://doi.org/10.7554/eLife.60263

      Habibi Arejan, N., Ensinck, D., Diacovich, L., Patel, P. B., Quintanilla, S. Y., Emami Saleh, A., Gramajo, H., & Boutte, C. C. (2022). Polar protein Wag31 both activates and inhibits cell wall metabolism at the poles and septum. Front Microbiol, 13, 1085918. https://doi.org/10.3389/fmicb.2022.1085918

      Hayashi, J. M., Luo, C. Y., Mayfield, J. A., Hsu, T., Fukuda, T., Walfield, A. L., Giffen, S. R., Leszyk, J. D., Baer, C. E., Bennion, O. T., Madduri, A., Shaffer, S. A., Aldridge, B. B., Sassetti, C. M., Sandler, S. J., Kinoshita, T., Moody, D. B., & Morita, Y. S. (2016). Spatially distinct and metabolically active membrane domain in mycobacteria. Proc Natl Acad Sci U S A, 113(19), 5400-5405. https://doi.org/10.1073/pnas.1525165113

      Kado, T., Akbary, Z., Motooka, D., Sparks, I. L., Melzer, E. S., Nakamura, S., Rojas, E. R., Morita, Y. S., & Siegrist, M. S. (2023). A cell wall synthase accelerates plasma membrane partitioning in mycobacteria. eLife, 12, e81924. https://doi.org/10.7554/eLife.81924

      Meniche, X., Otten, R., Siegrist, M. S., Baer, C. E., Murphy, K. C., Bertozzi, C. R., & Sassetti, C. M. (2014). Subpolar addition of new cell wall is directed by DivIVA in mycobacteria. Proc Natl Acad Sci U S A, 111(31), E3243-3251. https://doi.org/10.1073/pnas.1402158111

      Mileykovskaya, E., & Dowhan, W. (2000). Visualization of phospholipid domains in Escherichia coli by using the cardiolipin-specific fluorescent dye 10-N-nonyl acridine orange. J Bacteriol, 182(4), 1172-1175. https://doi.org/10.1128/JB.182.4.1172-1175.2000

      Petit, J. M., Maftah, A., Ratinaud, M. H., & Julien, R. (1992). 10N-nonyl acridine orange interacts with cardiolipin and allows the quantification of this phospholipid in isolated mitochondria. Eur J Biochem, 209(1), 267-273. https://doi.org/10.1111/j.1432-1033.1992.tb17285.x

      Renner, L. D., & Weibel, D. B. (2011). Cardiolipin microdomains localize to negatively curved regions of Escherichia coli membranes. Proc Natl Acad Sci U S A, 108(15), 6264-6269. https://doi.org/10.1073/pnas.1015757108

      Xu, W. X., Zhang, L., Mai, J. T., Peng, R. C., Yang, E. Z., Peng, C., & Wang, H. H. (2014). The Wag31 protein interacts with AccA3 and coordinates cell wall lipid permeability and lipophilic drug resistance in Mycobacterium smegmatis. Biochem Biophys Res Commun, 448(3), 255-260. https://doi.org/10.1016/j.bbrc.2014.04.116

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      This study asks whether the phenomenon of crossmodal temporal recalibration, i.e. the adjustment of time perception by consistent temporal mismatches across the senses, can be explained by the concept of multisensory causal inference. In particular, the authors ask whether causal inference explains temporal recalibration better than a model assuming that crossmodal stimuli are always integrated, regardless of how discrepant they are.

      The study is motivated by previous work in the spatial domain, where it has been shown consistently across studies that the use of crossmodal spatial information is explained by the concept of multisensory causal inference. It is also motivated by the observation that the behavioral data showcasing temporal recalibration feature nonlinearities that, by their nature, cannot be explained by a fixed integration model (sometimes also called mandatory fusion).

      To probe this the authors implemented a sophisticated experiment that probed temporal recalibration in several sessions. They then fit the data using the two classes of candidate models and rely on model criteria to provide evidence for their conclusion. The study is sophisticated, conceptually and technically state-of-the-art, and theoretically grounded. The data clearly support the authors’ conclusions.

      I find the conceptual advance somewhat limited. First, by design, the fixed integration model cannot explain data with a nonlinear dependency on multisensory discrepancy, as already explained in many studies on spatial multisensory perception. Hence, it is not surprising that the causal inference model better fits the data.

      We have addressed this comment by including an asynchrony-contingent model, which is capable of predicting the nonlinearity of recalibration effects by employing a heuristic approximation of the causal-inference process (Fig. 3). We also updated the previous competitor model with a more reasonable asynchrony-correction model as the baseline of model comparison, which assumes recalibration aims to restore synchrony whenever the sensory measurement of SOA indicates an asynchrony. The causal-inference model outperformed both models, as indicated by model evidence (Fig. 4A). Furthermore, model predictions show that the causal-inference model more accurately captures recalibration at large SOAs at both the group (Fig. 4B) and the individual levels (Fig. S4).

      Second, and again similar to studies on spatial paradigms, the causal inference model fails to predict the behavioral data for large discrepancies. The model predictions in Figure 5 show the (expected) vanishing recalibration for large delta, while the behavioral data don’t decay to zero. Either the range of tested SOAs is too small to show that both the model and data converge to the same vanishing effect at large SOAs, or the model's formula is not the best for explaining the data. Again, the studies using spatial paradigms have the same problem, but in my view, this poses the most interesting question here.

      We included an additional simulation (Fig. 5B) to show that the causal-inference model can predict non-zero recalibration for long adapter SOAs, especially in observers with a high common-cause prior and low sensory precision. This ability to predict a non-zero recalibration effect even at large SOA, such as 0.7 s, is one key feature of the causal-inference model that distinguishes it from the asynchrony-contingent model.

      In my view there is nothing generally wrong with the study, it does extend the 'known' to another type of paradigm. However, it covers little new ground on the conceptual side.

      On that note, the small sample size of n=10 is likely not an issue, but still, it is on the very low end for this type of study.

      This study used a within-subject design, which included 3 phases each repeated in 9 sessions, totaling 13.5 hours per participant. This extensive data collection allows us to better constrain the model for each participant. Our conclusions are based on the different models’ ability to fit individual data.

      Reviewer #2 (Public Review):

      Summary:

      Li et al.’s goal is to understand the mechanisms of audiovisual temporal recalibration. This is an interesting challenge that the brain readily solves in order to compensate for real-world latency differences in the time of arrival of audio/visual signals. To do this they perform a 3-phase recalibration experiment on 9 observers that involves a temporal order judgment (TOJ) pretest and posttest (in which observers are required to judge whether an auditory and visual stimulus were coincident, auditory leading or visual leading) and a conditioning phase in which participants are exposed to a sequence of AV stimuli with a particular temporal disparity. Participants are required to monitor both streams of information for infrequent oddballs, before being tested again in the TOJ, although this time there are 3 conditioning trials for every 1 TOJ trial. Like many previous studies, they demonstrate that conditioning stimuli shift the point of subjective simultaneity (pss) in the direction of the exposure sequence.

      These shifts are modest - maxing out at around -50 ms for auditory leading sequences and slightly less than that for visual leading sequences. Similar effects are observed even for the longest offsets where it seems unlikely listeners would perceive the stimuli as synchronous (and therefore under a causal inference model you might intuitively expect no recalibration, and indeed simulations in Figure 5 seem to predict exactly that which isn't what most of their human observers did). Overall I think their data contribute evidence that a causal inference step is likely included within the process of recalibration.

      Strengths:

      The manuscript performs comprehensive testing over 9 days and 100s of trials and accompanies this with mathematical models to explain the data. The paper is reasonably clearly written and the data appear to support the conclusions.

      Weaknesses:

      While I believe the data contribute evidence that a causal inference step is likely included within the process of recalibration, this to my mind is not a mechanism but might be seen more as a logical checkpoint to determine whether whatever underlying neuronal mechanism actually instantiates the recalibration should be triggered.

      We have addressed this comment by replacing the fixed-update model with an asynchrony-correction model, which assumes that the system first evaluates whether the measurement of SOA is asynchronous, thus indicating a need for recalibration (Fig. 3). If it does, it shifts the audiovisual bias by a proportion of the measured SOA. We additionally included an asynchrony-contingent model, which is capable of replicating the nonlinearity of recalibration effects by a heuristic approximation of the causal-inference process.

      Model comparisons indicate that the causal-inference model of temporal recalibration outperforms both alternative models (Fig. 4A). Furthermore, the model predictions demonstrate that the causal-inference model more accurately captures recalibration at large SOAs at both the group level (Fig. 4B) and individual level (Fig. S4).
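
      The asynchrony-correction rule described in this response (check whether the measured SOA counts as asynchronous, and if so shift the audiovisual bias by a proportion of it) can be sketched as follows. This is a minimal, noise-free illustration, not the authors' implementation: the criterion, the learning rate, and the sign convention (the bias is subtracted from the adapter SOA to give the measurement) are all assumptions made for the example.

```python
def asynchrony_correction_update(bias, adapter_soa, criterion=0.05, rate=0.02):
    """One exposure trial of a simplified asynchrony-correction rule (seconds)."""
    # Assumption: noise-free measurement, with the current bias subtracted.
    measured_soa = adapter_soa - bias
    # Recalibrate only when the measurement is judged asynchronous.
    if abs(measured_soa) > criterion:
        bias += rate * measured_soa
    return bias

def adapt(adapter_soa, n_trials=250, **kwargs):
    """Run the update over an exposure phase, starting from zero bias."""
    bias = 0.0
    for _ in range(n_trials):
        bias = asynchrony_correction_update(bias, adapter_soa, **kwargs)
    return bias

small = adapt(0.03)   # adapter SOA inside the synchrony criterion
large = adapt(0.30)   # adapter SOA clearly outside it
print(small, large)
```

      In this sketch the bias stays at zero for the small adapter SOA and, for the large one, grows until the residual asynchrony falls inside the criterion. The shift therefore keeps increasing with adapter SOA, which is why a rule of this kind serves as a baseline rather than as an account of the nonlinearity.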

      The authors’ causal inference model strongly predicts that there should be no recalibration for stimuli at 0.7 s offset, yet only 3/9 participants appear to show this effect. They note that a significant difference in their design and that of others is the inclusion of longer lags, which are unlikely to originate from the same source, but don’t offer any explanation for this key difference between their data and the predictions of a causal inference model.

      We added further simulations to show that the causal-inference model can predict non-zero recalibration also for longer adapter SOAs, especially in observers with a large common-cause prior (Fig. 5A) and low sensory precision (Fig. 5B). This ability to predict a non-zero recalibration effect even at longer adapter SOAs, such as 0.7 s, is a key feature of the causal-inference model that distinguishes it from the asynchrony-contingent model.
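
      The intuition behind these simulations can be illustrated with the posterior probability of a common cause given a measured SOA. The sketch below is a deliberately simplified Gaussian version and not the model fitted in the manuscript: it assumes that a common cause implies a true SOA of zero, that separate causes imply a zero-mean Gaussian spread of SOAs (`separate_sd`), and that measurement noise is Gaussian (`sensory_sd`). All parameter values are hypothetical.

```python
import math

def normal_pdf(x, sd):
    """Density of a zero-mean normal distribution with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def p_common(measured_soa, prior_common, sensory_sd, separate_sd=0.5):
    """Posterior probability that sound and light share a cause (times in s)."""
    # Common cause: the true SOA is zero, so only sensory noise spreads m.
    like_common = normal_pdf(measured_soa, sensory_sd)
    # Separate causes: true SOAs vary, and sensory noise adds to that spread.
    like_separate = normal_pdf(measured_soa, math.hypot(sensory_sd, separate_sd))
    numerator = prior_common * like_common
    return numerator / (numerator + (1.0 - prior_common) * like_separate)

precise = p_common(0.7, prior_common=0.5, sensory_sd=0.1)
noisy = p_common(0.7, prior_common=0.5, sensory_sd=0.4)
noisy_high_prior = p_common(0.7, prior_common=0.9, sensory_sd=0.4)
print(precise, noisy, noisy_high_prior)
```

      Under these assumptions, a precise observer assigns almost no probability to a common cause at a measured SOA of 0.7 s, whereas a noisier observer, and especially one with a high common-cause prior, still assigns it substantial probability. If recalibration scales with this posterior, such observers would keep recalibrating at long adapter SOAs.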

      I’m also not completely convinced that the causal inference model isn’t ‘best’ simply because it has sufficient free parameters to capture the noise in the data. The tested models do not (I think) have equivalent complexity - the causal inference model fits best, but has more parameters with which to fit the data. Moreover, while it fits ‘best’, is it a good model? Figure S6 is useful in this regard but is not completely clear - are the red dots the actual data or the causal inference prediction? This suggests that it does fit the data very well, but is this based on predicting held-out data, or is it just that by having more parameters it can better capture the noise? Similarly, S7 is a potentially useful figure but it's not clear what is data and what are model predictions (what are the differences between each row for each participant; are they two different models or pre-test post-test or data and model prediction?!).

      I'm not an expert on the implementation of such models but my reading of the supplemental methods is that the model is fit using all the data rather than fit and tested on held-out data. This seems problematic.

      We recognize the risk of overfitting with the causal-inference model. We now rely on Bayesian model comparisons, which use model evidence for model selection. This method automatically incorporates a penalty for model complexity through the marginalization over the parameter space (MacKay, 2003).

      Our design is not suitable for cross-validation because the model-fitting process is computationally intensive and time-consuming. Each fit of the causal-inference model takes approximately 30 hours, and multiple fits with different initial starting points are required to rule out that the parameter estimates correspond to local minima.
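
      The complexity penalty that model evidence carries can be seen in a toy example unrelated to the recalibration models: a coin whose bias is either fixed at 0.5 (no free parameter) or free with a uniform prior (one free parameter). This sketch uses an exact integral, not Variational Bayesian Monte Carlo, and the counts are made up for illustration.

```python
import math

def log_evidence_fixed(k, n, p=0.5):
    """Log probability of one sequence with k successes in n trials, p fixed."""
    return k * math.log(p) + (n - k) * math.log(1.0 - p)

def log_evidence_flexible(k, n):
    """Log evidence with a uniform prior on p: the integral of
    p^k (1-p)^(n-k) over [0, 1], i.e. the Beta function B(k+1, n-k+1)."""
    return math.lgamma(k + 1) + math.lgamma(n - k + 1) - math.lgamma(n + 2)

def max_log_likelihood_flexible(k, n):
    """Best achievable log likelihood of the flexible model (needs 0 < k < n)."""
    p_hat = k / n
    return k * math.log(p_hat) + (n - k) * math.log(1.0 - p_hat)

# Hypothetical data: 11 successes in 20 trials, close to a fair coin.
k, n = 11, 20
best_fit_flexible = max_log_likelihood_flexible(k, n)
log_bayes_factor = log_evidence_fixed(k, n) - log_evidence_flexible(k, n)
print(best_fit_flexible, log_evidence_fixed(k, n), log_bayes_factor)
```

      With these counts the flexible model attains the higher maximum likelihood, yet the fixed model has the higher evidence, because marginalizing over the free parameter spreads the flexible model's predictions across outcomes that did not occur. With strongly unbalanced counts (for example 18 of 20) the ordering reverses, so the penalty does not prevent a more complex model from winning when the data call for it.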

      I would have liked to have seen more individual participant data (which is currently in the supplemental materials, albeit in a not very clear manner as discussed above).

      We have revised Supplementary Figures S4-S6 to show additional model predictions of the recalibration effect for individual participants, and participants’ temporal-order judgments are now shown in Supplementary Figure S7. These figures confirm the better performance of the causal-inference model.

      The way that S3 is described in the text (line 141) makes it sound like everyone was in the same direction, however, it is clear that 2/9 listeners show the opposite pattern, and 2 have confidence intervals close to zero (albeit on the -ve side).

      We have revised the text to clarify that the asymmetry occurs in both directions and is idiosyncratic (lines 168-171). We summarized the distribution of the individual asymmetries of the recalibration effect across visual-leading and auditory-leading adapter SOAs in Supplementary Figure S2.

      Reviewer #3 (Public Review):

      Summary:

      Li et al. describe an audiovisual temporal recalibration experiment in which participants perform baseline sessions of ternary order judgments about audiovisual stimulus pairs with various stimulus-onset asynchronies (SOAs). These are followed by adaptation at several adapting SOAs (each on a different day), followed by post-adaptation sessions to assess changes in psychometric functions. The key novelty is the formal specification and application/fit of a causal-inference model for the perception of relative timing, providing simulated predictions for the complete set of psychometric functions both pre and post-adaptation.

      Strengths:

      (1) Formal models are preferable to vague theoretical statements about a process, and prior to this work, certain accounts of temporal recalibration (specifically those that do not rely on a population code) had only qualitative theoretical statements to explain how/why the magnitude of recalibration changes non-linearly with the stimulus-onset asynchrony of the adapter.

      (2) The experiment is appropriate, the methods are well described, and the average model prediction is a fairly good match to the average data (Figure 4). Conclusions may be overstated slightly, but seem to be essentially supported by the data and modelling.

      (3) The work should be impactful. There seems a good chance that this will become the go-to modelling framework for those exploring non-population-code accounts of temporal recalibration (or comparing them with population-code accounts).

      (4) A key issue for the generality of the model, specifically in terms of recalibration asymmetries reported by other authors that are inconsistent with those reported here, is properly acknowledged in the discussion.

      Weaknesses:

      (1) The evidence for the model comes in two forms. First, two trends in the data (non-linearity and asymmetry) are illustrated, and the model is shown to be capable of delivering patterns like these. Second, the model is compared, via AIC, to three other models. However, the main comparison models are clearly not going to fit the data very well, so the fact that the new model fits better does not seem all that compelling. I would suggest that the authors consider a comparison with the atheoretical model they use to first illustrate the data (in Figure 2). This model fits all sessions but with complete freedom to move the bias around (whereas the new model constrains the way bias changes via a principled account). The atheoretical model will obviously fit better, but will have many more free parameters, so a comparison via AIC/BIC or similar should be informative.

      In the revised manuscript, we switched from AIC to Bayesian model selection, which approximates and compares model evidence. This method incorporates a strong penalty for model complexity through marginalization over the parameter space (MacKay, 2003).
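      The way marginalization penalizes complexity can be seen in a toy case that is unrelated to the recalibration models. The sketch below assumes a sequence of binary outcomes and compares a model with no free parameter (success probability fixed at 0.5) against a model with one free probability under a uniform prior. The function names and example counts are illustrative assumptions, not part of the manuscript.

```python
from math import lgamma, log


def log_evidence_fixed(k: int, n: int, p: float = 0.5) -> float:
    """Log probability of one specific sequence with k successes in n trials
    under a model with no free parameters (p is fixed)."""
    return k * log(p) + (n - k) * log(1.0 - p)


def log_evidence_free(k: int, n: int) -> float:
    """Log marginal likelihood of the same sequence when p is free with a
    uniform prior on [0, 1]: the integral of p^k (1-p)^(n-k) dp equals the
    Beta function B(k+1, n-k+1)."""
    return lgamma(k + 1) + lgamma(n - k + 1) - lgamma(n + 2)


def max_log_likelihood_free(k: int, n: int) -> float:
    """Best-fitting log-likelihood of the free-p model (p set to k/n).
    Terms with a zero count contribute nothing."""
    p_hat = k / n
    total = 0.0
    if k > 0:
        total += k * log(p_hat)
    if k < n:
        total += (n - k) * log(1.0 - p_hat)
    return total


if __name__ == "__main__":
    # Hypothetical counts, chosen only for illustration.
    for k, n in [(11, 20), (18, 20)]:
        print(k, n,
              "fixed:", log_evidence_fixed(k, n),
              "free (marginal):", log_evidence_free(k, n),
              "free (max lik.):", max_log_likelihood_free(k, n))
```

      With 11 successes in 20 trials, the free model always attains the higher maximum likelihood, yet its marginal likelihood is lower than that of the fixed model, because its prior mass is spread over values of p that predict the data poorly. With 18 successes in 20 trials, the extra parameter is warranted and the free model wins on evidence as well.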

      We have addressed this comment by updating the former competitor model into a more reasonable version that induces recalibration only for some measured SOAs and by including another (asynchrony-contingent) model that is capable of predicting the nonlinearity and asymmetry of recalibration (Fig. 3) while heuristically approximating the causal inference computations. The causal-inference model outperformed the asynchrony-contingent model, as indicated by model evidence (Fig. 4A). Furthermore, model predictions show that the causal-inference model more accurately captures recalibration at large SOAs at both the group (Fig. 4B) and the individual level (Fig. S4).

      (2) It does not appear that some key comparisons have been subjected to appropriate inferential statistical tests. Specifically, lines 196-207 - presumably this is the mean (and SD or SE) change in AIC between models across the group of 9 observers. So are these differences actually significant, for example via t-test?

      We statistically compared the models using Bayes factors (Fig. 4A). The model evidence for each model was approximated using Variational Bayesian Monte Carlo. Bayes factors provided strong evidence in support of the causal-inference model relative to the other models.
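      For readers unfamiliar with this step, the following minimal sketch shows how approximate log model evidence values (for example, those returned by a variational method) can be turned into log Bayes factors and, under equal prior model probabilities, posterior model probabilities. The model labels and numbers are placeholders chosen for illustration; they are not results from the manuscript.

```python
from math import exp
from typing import Dict


def log_bayes_factors(log_evidence: Dict[str, float], reference: str) -> Dict[str, float]:
    """Log Bayes factor of the reference model over every other model:
    log BF = log evidence(reference) - log evidence(other)."""
    ref = log_evidence[reference]
    return {name: ref - value
            for name, value in log_evidence.items()
            if name != reference}


def posterior_model_probabilities(log_evidence: Dict[str, float]) -> Dict[str, float]:
    """Posterior model probabilities assuming equal prior probability per model.
    The maximum is subtracted before exponentiating for numerical stability."""
    peak = max(log_evidence.values())
    weights = {name: exp(value - peak) for name, value in log_evidence.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    # Placeholder values for one hypothetical participant.
    hypothetical = {"model_a": -1000.0, "model_b": -1010.0}
    print(log_bayes_factors(hypothetical, reference="model_a"))
    print(posterior_model_probabilities(hypothetical))
```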

      (3) The manuscript tends to gloss over the population-code account of temporal recalibration, which can already provide a quantitative account of how the magnitude of recalibration varies with adapter SOA. This could be better acknowledged, and the features a population code may struggle with (asymmetry?) are considered.

      We simulated a population-code model to examine its prediction of the recalibration effect for different adapter SOAs (lines 380–388, Supplement Section 8). The population-code model can predict the nonlinearity of recalibration, i.e., a decreasing recalibration effect as the adapter SOA increases. However, to capture the asymmetry of recalibration effects across auditory-leading and visual-leading adapter stimuli, we would need to assume that the auditory-leading and visual-leading SOAs are represented by neural populations with unequal tuning curves.

      (4) The engagement with relevant past literature seems a little thin. Firstly, papers that have applied causal inference modeling to judgments of relative timing are overlooked (see references below). There should be greater clarity regarding how the modelling here builds on or differs from these previous papers (most obviously in terms of additionally modelling the recalibration process, but other details may vary too). Secondly, there is no discussion of previous findings like that in Fujisaki et al.’s seminal work on recalibration, where the spatial overlap of the audio and visual events didn’t seem to matter (although admittedly this was an N = 2 control experiment). This kind of finding would seem relevant to a causal inference account.

      References:

      Magnotti JF, Ma WJ and Beauchamp MS (2013) Causal inference of asynchronous audiovisual speech. Front. Psychol. 4:798. doi: 10.3389/fpsyg.2013.00798

      Sato, Y. (2021). Comparing Bayesian models for simultaneity judgement with different causal assumptions. J. Math. Psychol., 102, 102521.

      We have revised the Introduction and Discussion to better situate our study within the existing literature. Specifically, we have incorporated the suggested references (lines 66–69) and provided clearer distinctions on how our modeling approach builds on or differs from previous work on causal-inference models, particularly in terms of modeling the recalibration process (lines 75–79). Additionally, we have discussed findings that might contradict the assumptions of the causal-inference model (lines 405–424).

      (5) As a minor point, the model relies on simulation, which may limit its take-up/application by others in the field.

      Upon acceptance, we will publicly share the code for all models (simulation and parameter fitting) to enable researchers to adapt and apply these models to their own data.

      (6) There is little in the way of reassurance regarding the model’s identifiability and recoverability. The authors might for example consider some parameter recovery simulations or similar.

      We conducted a model recovery for each of the six models described in the main text and confirmed that the asynchrony-contingent and causal-inference models are identifiable (Supplement Section 11). Simulations of the asynchrony-correction model were sometimes best fit by causal-inference models, because the latter behaves similarly when the prior of a common cause is set to one.

      We also conducted a parameter recovery for the winning model, the causal-inference model with modality-specific precision (Supplement Section 13).

      Key parameters, including the audiovisual bias, the amount of auditory latency noise, the amount of visual latency noise, the criterion, and the lapse rate, showed satisfactory recovery performance. The less accurate recovery of the remaining parameter is likely due to a tradeoff with the learning rate.
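      The logic of a parameter recovery check can be sketched with a deliberately simplified stand-in for the full model: simulate data from known parameter values, refit, and compare the estimates with the ground truth. The example below assumes a single exponentially distributed latency with one time constant, for which the maximum-likelihood estimate is the sample mean. The function names, trial count, and time constants are assumptions for illustration only.

```python
import random
from statistics import mean
from typing import List, Sequence, Tuple


def simulate_latencies(tau_ms: float, n_trials: int, rng: random.Random) -> List[float]:
    """Draw n_trials latencies from an exponential distribution with mean tau_ms."""
    return [rng.expovariate(1.0 / tau_ms) for _ in range(n_trials)]


def fit_tau(latencies: Sequence[float]) -> float:
    """Maximum-likelihood estimate of the exponential mean (the sample mean)."""
    return mean(latencies)


def parameter_recovery(true_taus: Sequence[float],
                       n_trials: int = 2000,
                       seed: int = 0) -> List[Tuple[float, float]]:
    """Return (ground truth, recovered estimate) pairs, one per simulated data set."""
    rng = random.Random(seed)
    return [(tau, fit_tau(simulate_latencies(tau, n_trials, rng)))
            for tau in true_taus]


if __name__ == "__main__":
    # Hypothetical ground-truth time constants in milliseconds.
    for truth, estimate in parameter_recovery([20.0, 60.0, 120.0]):
        print(f"true tau = {truth:6.1f} ms, recovered tau = {estimate:6.1f} ms")
```

      In a multi-parameter model, the same loop is run over sampled parameter vectors. A tradeoff between two parameters then shows up as estimates that scatter along a ridge rather than around the identity line.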

      (7) I don't recall any statements about open science and the availability of code and data.

      Upon acceptance of the manuscript, all code (simulation and parameter fitting) and data will be made publicly available on OSF.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      In addition to the comments below, we would like to offer the following summary based on the discussion between reviewers:

      The major shortcoming of the work is that there should ideally be a bit more evidence to support the model, over and above a demonstration that it captures important trends and beats an account that was already known to be wrong. We suggest you:

      (1) Revise the figure legends (Figure 5 and Figure 6E).

      We revised all figures and figure legends.

      (2) Additionally report model differences in terms of BIC (which will favour the preferred model less under the current analysis);

      We now base the model comparison on Bayesian model selection, which approximates and compares model evidence. This method incorporates a strong penalty for model complexity through marginalization over the parameter space (MacKay, 2003).

      (3) Move to instead fitting the models multiple times in order to get leave-one-out estimates of best-fitting loglikelihood for each left-out data point (and then sum those for the comparison metric).

      Unfortunately, our design is not suitable for cross-validation methods because the model-fitting process is computationally intensive and time-consuming. Each fit of the causal-inference model takes approximately 30 hours, and multiple fits with different initial starting points are required to rule out local minima.

      (4) Offer a comparison with a more convincing model (for example, an atheoretical fit with free parameters for all adapters, as suggested by Reviewer 3).

      We updated the previous competitor model and included an asynchrony-contingent model, which is capable of predicting the nonlinearity of recalibration (Fig. 3). The causal-inference model still outperformed the asynchrony-contingent model (Fig. 4A). Furthermore, model predictions show that only the causal-inference model captures non-zero recalibration effects for long adapter SOAs at both the group level (Fig. 4B) and individual level (Figure S4).

      Reviewer #1 (Recommendations For The Authors):

      A larger sample size would be better.

      This study used a within-subject design, which included 9 sessions, totaling 13.5 hours per participant. This extensive data collection allows us to better constrain the model for each participant. Our conclusions are based on the different models’ ability to fit individual data rather than on group statistics.

      It would be good to better put the study in the context of spatial ventriloquism, where similar model comparisons have been done over the last ten years and there is a large body of work to connect to.

      We now discuss our model in relation to models of cross-modal spatial recalibration in the Introduction (lines 70–78) and Discussion (lines 324–330).

      Reviewer #2 (Recommendations For The Authors):

      Previous authors (e.g. Yarrow et al.,) have described latency shift and criterion change models as providing a good fit of experimental data. Did the authors attempt a criterion shift model in addition to a shift model?

      We have considered criterion-shift variants of our atheoretical recalibration models in Supplement Section 1. To summarize the results, we varied two model assumptions: 1) the use of either a Gaussian or an exponential measurement distribution, and 2) recalibration being implemented either as a shift of bias or a criterion. We fit each model variant separately to the ternary TOJ responses of all sessions. Bayesian model comparisons indicated that the bias-shift model with exponential measurement distributions best captured the data of most participants.

      Figure 4B - I'm not convinced that the modality-independent uncertainty is anything but a straw man. Models not allowed to be asymmetric do not show asymmetry? (the asymmetry index is irrelevant in the fixed update model as I understand it so it is not surprising the model is identical?).

      We included the assumption that temporal uncertainty might be modality-independent for several reasons. First, there is evidence suggesting that a central mechanism governs the precision of temporal-order judgments (Hirsh & Sherrick, 1961), indicating that precision is primarily limited by a central mechanism rather than the sensory channels themselves. Second, from a modeling perspective, it was necessary to test whether an audio-visual temporal bias alone, i.e., assuming modality-independent uncertainty, could introduce asymmetry across adapter SOAs. Additionally, most previous studies implicitly assumed symmetric likelihoods, i.e., modality-independent latency noise, by fitting cumulative Gaussians to the psychometric curves derived from 2AFC-TOJ tasks (Di Luca et al., 2009; Fujisaki et al., 2004; Harrar & Harris, 2005; Keetels & Vroomen, 2007; Navarra et al., 2005; Tanaka et al., 2011; Vatakis et al., 2007, 2008; Vroomen et al., 2004).

      Why does a zero SOA adapter shift the pss towards auditory leading? Is this a consequence of the previous day’s conditioning - it’s not clear from the methods whether all listeners had the same SOA conditioning sequence across days.

      The auditory-leading recalibration effect for an adapter SOA of zero has been consistently reported in previous studies (e.g., Fujisaki et al., 2004; Vroomen et al., 2004). This effect reflects the asymmetry in recalibration, which can be explained by differences across modalities in the noisiness of the latencies (Figure 5C) in combination with audiovisual temporal bias (Figure S8).

      We added details about the order of testing to the Methods section (lines 456–457).

      Reviewer #3 (Recommendations For The Authors):

      Abstract

      “Our results indicate that human observers employ causal-inference-based percepts to recalibrate cross-modal temporal perception” Your results indicate this is plausible. However, this statement (basically repeated at the end of the intro and again in the discussion) is - in my opinion - too strong.

      We have revised the statement as suggested.

      Intro and later

      Within the wider literature on relative timing perception, the temporal order judgement (TOJ) task refers to a task with just two response options. Tasks with three response options, as employed here, are typically referred to as ternary judgments. I would suggest language consistent with the existing literature (or if not, the contrast to standard usage could be clarified).

      Ref: Ulrich, R. (1987). Threshold models of temporal-order judgments evaluated by a ternary response task. Percept. Psychophys., 42, 224-239.

      We revised the term for the task as suggested throughout the manuscript.

      Results, 2.2.2

      “However, temporal precision might not be due to the variability of arrival latency.” Indeed, although there is some recent evidence that it might be.

      Ref: Yarrow, K., Kohl, C, Segasby, T., Kaur Bansal, R., Rowe, P., & Arnold, D.H. Neural-latency noise places limits on human sensitivity to the timing of events. Cognition, 222, 105012 (2022).

      We included the reference as suggested (lines 245–248).

      Methods, 4.3.

      Should there be some information here about the order of adaptation sessions (e.g. random for each observer)?

      We added details about the order of testing to the Methods section (lines 456–457).

      Supplemental material section 1.

      Here, you test whether the changes resulting from recalibration look more like a shift of the entire psychometric function or an expansion of the psychometric function on one side (most straightforwardly compatible with a change of one decision criterion). Fine, but the way you have done this is odd, because you have introduced a further difference in the models (Gaussian vs. exponential latency noise) so that you cannot actually conclude that the trend towards a win for the bias-shift model is simply down to the bias vs. criterion difference. It could just as easily be down to the different shapes of psychometric functions that the two models can predict (with the exponential noise model permitting asymmetry in slopes). There seems to be no reason that this comparison cannot be made entirely within the exponential noise framework (by a very simple reparameterization that focuses on the two boundaries rather than the midpoint and extent of the decision window). Then, you would be focusing entirely on the question of interest. It would also equate model parameters, removing any reliance on asymptotic assumptions being met for AIC.

      We revised our exploration of atheoretical recalibration models. To summarize the results, we varied two model assumptions: 1) the use of either a Gaussian or an exponential measurement distribution, and 2) recalibration being implemented either as a shift of the cross-modal temporal bias or as a shift of the criterion. We fit each model separately to the ternary TOJ responses of all sessions. Bayesian model comparisons indicated that the bias-shift model with exponential measurement distributions best described the data of most participants.

      References

      Di Luca, M., Machulla, T.-K., & Ernst, M. O. (2009). Recalibration of multisensory simultaneity: cross-modal transfer coincides with a change in perceptual latency. Journal of Vision, 9(12), Article 7.

      Fujisaki, W., Shimojo, S., Kashino, M., & Nishida, S. ’ya. (2004). Recalibration of audiovisual simultaneity. Nature Neuroscience, 7(7), 773–778.

      Harrar, V., & Harris, L. R. (2005). Simultaneity constancy: detecting events with touch and vision. Experimental Brain Research. Experimentelle Hirnforschung. Experimentation Cerebrale, 166(3-4), 465–473.

      Hirsh, I. J., & Sherrick, C. E., Jr. (1961). Perceived order in different sense modalities. Journal of Experimental Psychology, 62(5), 423–432.

      Keetels, M., & Vroomen, J. (2007). No effect of auditory-visual spatial disparity on temporal recalibration. Experimental Brain Research. Experimentelle Hirnforschung. Experimentation Cerebrale, 182(4), 559–565.

      MacKay, D. J. (2003). Information theory, inference and learning algorithms. https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=201b835c3f3a3626ca07be68cc28cf7d286bf8d5

      Navarra, J., Vatakis, A., Zampini, M., Soto-Faraco, S., Humphreys, W., & Spence, C. (2005). Exposure to asynchronous audiovisual speech extends the temporal window for audiovisual integration. Brain Research. Cognitive Brain Research, 25(2), 499–507.

      Tanaka, A., Asakawa, K., & Imai, H. (2011). The change in perceptual synchrony between auditory and visual speech after exposure to asynchronous speech. Neuroreport, 22(14), 684–688.

      Vatakis, A., Navarra, J., Soto-Faraco, S., & Spence, C. (2007). Temporal recalibration during asynchronous audiovisual speech perception. Experimental Brain Research. Experimentelle Hirnforschung. Experimentation Cerebrale, 181(1), 173–181.

      Vatakis, A., Navarra, J., Soto-Faraco, S., & Spence, C. (2008). Audiovisual temporal adaptation of speech: temporal order versus simultaneity judgments. Experimental Brain Research. Experimentelle Hirnforschung. Experimentation Cerebrale, 185(3), 521–529.

      Vroomen, J., Keetels, M., de Gelder, B., & Bertelson, P. (2004). Recalibration of temporal order perception by exposure to audio-visual asynchrony. Brain Research. Cognitive Brain Research, 22(1), 32–35.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The manuscript by Oleh et al. uses in vitro electrophysiology and compartmental modeling (via NEURON) to investigate the expression and function of HCN channels in mouse L2/3 pyramidal neurons. The authors conclude that L2/3 neurons have developmentally regulated HCN channels, the activation of which can be observed when subjected to large hyperpolarizations. They further conclude via blockade experiments that HCN channels in L2/3 neurons influence cellular excitability and pathway-specific EPSP kinetics, which can be neuromodulated. While the authors perform a wide range of slice physiology experiments, concrete evidence that L2/3 cells express functionally relevant HCN channels is limited. There are serious experimental design caveats and confounds that make drawing strong conclusions from the data difficult. Furthermore, the significance of the findings is generally unclear, given modest effect sizes and a lack of any functional relevance, either directly via in vivo experiments or indirectly via strong HCN-mediated changes in known operations/computations/functions of L2/3 neurons.

      Specific points:

      (1) The interpretability and impact of this manuscript are limited due to numerous methodological issues in experimental design, data collection, and analysis. The authors have not followed best practices in the field, and as such, much of the data is ambiguous and/or weak and does not support their interpretations (detailed below). Additionally, the authors fail to appropriately explain their rationale for many of their choices, making it difficult to understand why they did what they did. Furthermore, many important references appear to be missing, both in terms of contextualizing the work and in terms of approach/method. For example, the authors do not cite Kalmbach et al 2018, which performed a directly comparable set of experiments on HCN channels in L2/3 neurons of both humans and mice. This is an unacceptable omission. Additionally, the authors fail to cite prior literature regarding the specificity or lack thereof of Cs+ in blocking HCN. In describing a result, the authors state "In line with previous reports, we found that L2/3 PCs exhibited an unremarkable amount of sag at 'typical' current commands" but they then fail to cite the previous reports.

      We thank the reviewer for the thorough examination of our manuscript; however, we disagree with many of the raised concerns for several reasons, as detailed here:

      To address the lack of certain citations, we would like to emphasize that the Introduction initially focused on the decades-long line of investigation into the HCN channel content of layer 2/3 pyramidal cells (L2/3 PCs), where there has undoubtedly been some controversy as to their functional contribution. We did not explicitly cite papers that reported few or no HCN channels or little sag (although this would be a substantial list of publications from some excellent investigators), because the methods used may have differed from ours, leading to different interpretations. Simply stated, unless one was explicitly looking for HCN in L2/3 PCs, it might go unobserved. However, we have now addressed this more clearly in the revision:

      Just to take one example: in the publication mentioned by the reviewer (Kalmbach et al 2018), the investigators did not carry out voltage clamp or dynamic clamp recordings, as we did in our work here. Furthermore, the reported input resistance values in the aforementioned paper were far above other reports in mice (Routh et al. 2022, Brandalise et al 2022, Hedrick et al 2012; which were similar to our findings here), suggesting that recordings in Kalmbach were carried out at membrane potentials where HCN activation may be less available (Routh, Brager and Johnston 2022).

      Another reason for some mixed findings in the field is undoubtedly the small or nonexistent sag in L2/3 current-clamp recordings (in mice). We also observed a very small sag, which can be explained as follows. The ‘sag’ potential is a biphasic voltage response emerging from a relatively fast passive membrane response and a slower Ih activation. In L2/3 PCs, hyperpolarization-activated currents are apparently faster than previously described, and are located proximally (Figure 2 & Figure 5). Therefore, their recruitment in mouse L2/3 PCs is on a similar timescale to the passive membrane response, resulting in a more monophasic response. We now include a fuller set of citations in the updated Introduction to highlight the importance of HCN channels in L2/3 PCs in mice (and other species).

      The justification for using cesium (i.e., ‘best practices’) is detailed below.

      (2) A critical experimental concern in the manuscript is the reliance on cesium, a nonspecific blocker, to evaluate HCN channel function. Cesium blocks HCN channels but also acts at potassium channels (and possibly other channels as well). The authors do not acknowledge this or attempt to justify their use of Cs+ and do not cite prior work on this subject. They do not show control experiments demonstrating that the application of Cs+ in their preparation only affects Ih. Additionally, the authors write 1 mM cesium in the text but appear to use 2 mM in the figures. In later experiments, the authors switch to ZD7288, a more commonly used and generally accepted more specific blocker of HCN channels. However, they use a very high concentration, which is also known to produce off-target effects (see Chevaleyre and Castillo, 2002). To make robust conclusions, the authors should have used both blockers (at accepted/conservative concentrations) for all (or at least most) experiments. Using one blocker for some experiments and then another for different experiments is fraught with potential confounds.

      To address the concerns regarding the use of cesium to block HCN channels, we would like to state that neither cesium nor ZD-7288 is without off-target effects; however, in our case the potential off-target effects of external cesium were deemed less impactful, especially in experiments measuring AP firing output. Extracellular cesium has been widely accepted as a blocker of HCN channels (Lau et al. 2010, Wickenden et al. 2009, Rateau and Ropert 2005, Hemond et al. 2009, Yang et al. 2015, Matt et al. 2010). However, at higher concentrations it is well known to act on potassium channels as well, as has been demonstrated with both intracellular and extracellular application (Puil et al. 1981, Fleidervish et al. 2008, Williams et al. 1991, 2008).

      Although we initially performed ‘internal’ control experiments to ensure the cesium concentration was unlikely to greatly block voltage gated K+ channels during our recordings, we recognize these were not included in the original manuscript. These are detailed as follows: during our recordings cesium had no significant effect on action potential halfwidth, ruling out substantial blocking of potassium channels, nor did it affect any other aspects of suprathreshold activity (now reported in results, page 4 - line 113). Furthermore, we observed similar effects on passive properties (resting membrane potential, input resistance) following ZD-7288 as with cesium, which we now also updated in our figures (Supplementary Figure 1). We did acknowledge that ZD-7288 is a widely accepted blocker of HCN, and for this reason we carried out some of our experiments using this pharmacological agent instead of cesium.

      On the other hand, ZD-7288 suffers from its own side effects, such as potential effects on sodium channels (Wu et al. 2012) and calcium channels (Sánchez-Alonso et al. 2008, Felix et al. 2003). As our aim was to provide functional evidence for the importance of HCN channels, we initially deemed these potential effects unacceptable in experiments where AP firing output (e.g., in cell-attached experiments) was measured. Nonetheless, in new experiments now included here, we found the effects of ZD and cesium on AP output were similar as shown in new Supplemental Figure 1.

      Many experiments were supported by complementary findings using external cesium and ZD-7288. For example, the effect of ZD-7288 on EPSPs was confirmed by similar synaptic stimulation experiments using cesium. This is important, as synaptic inputs of L2/3 PCs are modulated by both dendritic sodium (Ferrarese et al. 2018) and calcium channels (Landau 2022), therefore the application of ZD-7288 alone may have been difficult to interpret in isolation. We thank the reviewer for bringing up this important point.

      (3) A stronger case could be made that HCN is expressed in the somatic compartment of L2/3 cells if the authors had directly measured HCN-isolated currents with outside-out or nucleated patch recording (with appropriate leak subtraction and pharmacology). Whole-cell voltage-clamp in neurons with axons and/or dendrites does not work. It has been shown to produce erroneous results over and over again in the field due to well-known space clamp problems (see Rall, Spruston, Williams, etc.). The authors could have also included negative controls, such as recordings in neurons that do not express HCN or in HCN-knockout animals. Without these experiments, the authors draw a false equivalency between the effects of cesium and HCN channels, when the outcomes they describe could be driven simply by multiple other cesium-sensitive currents. Distortions are common in these preparations when attempting to study channels (see Williams and Womzy, J Neuro, 2011). In Fig 2h, cesium-sensitive currents look too large and fast to be from HCN currents alone given what the authors have shown in their earlier current clamp data. Furthermore, serious errors in leak subtraction appear to be visible in Supplementary Figure 1c. To claim that these conductances are solely from HCN may be misleading.

      We disagree with the argument that “Whole-cell voltage-clamp in neurons with axons and/or dendrites does not work”. Although this method is not without its confounds (i.e., space clamp), it is still a useful initial measure, as demonstrated countless times in the literature. However, the reviewer is correct that the best approach to establish the somatodendritic distribution of ion channels is by direct somatic and dendritic outside-out patches. Due to the small diameter of L2/3 PC dendrites, to our knowledge such experiments have not yet been reported for any ion channel. Mapping this distribution electrophysiologically may be outside the scope of the current manuscript, but it was hard for us to ignore the sheer size of the Cs+-sensitive hyperpolarization-activated currents in whole-cell recordings. Thus, we will opt to report these data.

      Also, we should point out that space clamp-related errors manifest in the overestimation of frequency-dependent features, such as activation kinetics, and the underestimation of steady-state current amplitudes. The activation time constant of our measured currents is somewhat faster than previously reported, reducing major concerns regarding space clamp errors. Furthermore, we simply do not understand what “too large… to be from HCN currents” means. Our voltage-clamp measured currents are similar to previously reported HCN currents (Meng et al. 2011, Li 2011, Zhao et al. 2019, Yu et al. 2004, Zhang et al. 2008, Spinelli et al. 2018, Craven et al. 2006, Ying et al. 2012, Biel et al. 2009).

      Furthermore, we should point out that our measured currents activated at hyperpolarized voltages, had the same voltage dependence as HCN currents, did not show inactivation, influenced both input resistance and resting membrane potential, and are blocked by low concentration extracellular cesium. Each of these features would point to HCN.

      (4) The authors present current-clamp traces with some sag, a primary indicator of HCN conductance, in Figure 2. However, they do not show example traces with cesium or ZD7288 blockade. Additionally, the normalization of current injected by cellular capacitance and the lack of reporting of input resistance or estimated cellular size makes it difficult to determine how much current is actually needed to observe the sag, which is important for assessing the functional relevance of these channels. The sag ratio in controls also varies significantly without explanation (Figure 6 vs Figure 7). Could this variability be a result of genetically defined subgroups within L2/3? For example, in humans, HCN expression in L2/3 varies from superficial and deep neurons. The authors do not make an effort to investigate this. Regardless of inconsistencies in either current injection or cell type, the sag ratio appears to be rather modest and similar to what has already been reported previously in other papers.

      We thank the reviewer for pointing out that our explanation for the modest sag ratio may not have been sufficient to convey why this measurement cannot be applied to layer 2/3 pyramidal cells. Briefly: the sag potential emerges from a passive membrane response that is fast relative to Ih and a slower HCN recruitment. The opposing polarity and different timescales of these two mechanisms result in a biphasic response called the “sag” potential. However, if the timescales of these two mechanisms are similar, the voltage response is not predicted to be biphasic. We have shown that hyperpolarization-activated currents in our preparations are fast and proximal; therefore, they are recruited during the passive response (see Figure 2g). This means that although a substantial amount of HCN current is activated during hyperpolarization, its activation will not result in substantial sag. Therefore, sag ratio measurement is not necessarily applicable to approximate the HCN content of mouse L2/3 PCs. We would like to emphasize that sag ratio measurements are valid for other cell types (i.e., L5 and CA1 PCs), and our aim is not to discredit the method, but rather to show that it cannot be applied similarly in the case of mouse L2/3 PCs.

      Our own measurements, similar to others in the literature, show that L2/3 PCs exhibit modest sag ratios; however, this does not mean that HCN is not relevant. Ih activation in L2/3 PCs does not manifest in a large sag potential but rather in a continuous distortion of steady-state responses (Figure 2b). The reviewer is correct that L2/3 PCs are non-homogeneous; therefore, we sampled along the entire L2/3 axis. This yielded some potential variability in our results (i.e., passive properties); yet we did not observe any cells where hyperpolarization-activated, Cs+-sensitive currents could not be resolved. As structural variability of L2/3 cells does result in variability in cellular capacitance, we compensated for this variability by injecting cellular capacitance-normalized currents. Our measured cellular capacitances were in accordance with previously published values, in the range of 50-120 pF. Therefore, the injected currents were not outside frequently used values. Together, we would like to state that whether substantial sag potential is present or not, initial estimates of the HCN content for each L2/3 PC should be treated with caution.
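      The timescale argument above can be illustrated with a minimal single-compartment sketch (not the NEURON model used in the manuscript). It assumes a leak conductance plus one Ih-like conductance with first-order activation, integrated with forward Euler. Every parameter value below (capacitance, conductances, reversal potentials, half-activation voltage, slope, step current) is an assumption chosen for illustration, and `sag_ratio` is a hypothetical helper name.

```python
from math import exp


def m_inf(v_mv: float, v_half_mv: float = -90.0, slope_mv: float = 8.0) -> float:
    """Steady-state activation of the Ih-like conductance (opens on hyperpolarization)."""
    return 1.0 / (1.0 + exp((v_mv - v_half_mv) / slope_mv))


def sag_ratio(tau_h_ms: float,
              i_step_pa: float = -200.0,
              c_pf: float = 100.0,
              g_leak_ns: float = 10.0,
              g_h_ns: float = 5.0,
              e_leak_mv: float = -70.0,
              e_h_mv: float = -30.0,
              dt_ms: float = 0.05,
              baseline_ms: float = 1000.0,
              step_ms: float = 1000.0) -> float:
    """Simulate a hyperpolarizing current step and return
    (V_end - V_min) / (V_rest - V_min), i.e. the sag as a fraction of the peak deflection."""
    v = e_leak_mv
    m = m_inf(v)
    n_base = int(baseline_ms / dt_ms)
    n_step = int(step_ms / dt_ms)
    v_rest = v
    v_min = v
    for k in range(n_base + n_step):
        if k == n_base:
            # End of baseline: record the settled resting potential.
            v_rest = v
            v_min = v
        i_inj = i_step_pa if k >= n_base else 0.0
        # nS * mV = pA, and pA / pF = mV / ms.
        dv = (-g_leak_ns * (v - e_leak_mv) - g_h_ns * m * (v - e_h_mv) + i_inj) / c_pf
        dm = (m_inf(v) - m) / tau_h_ms
        v += dt_ms * dv
        m += dt_ms * dm
        if k >= n_base:
            v_min = min(v_min, v)
    v_end = v
    return (v_end - v_min) / (v_rest - v_min)


if __name__ == "__main__":
    # Passive membrane time constant here is c_pf / g_leak_ns = 10 ms.
    print("slow Ih (tau_h = 100 ms):", sag_ratio(100.0))
    print("fast Ih (tau_h = 10 ms): ", sag_ratio(10.0))
```

      With identical maximal conductance in both runs, the sag ratio is smaller when the activation time constant is comparable to the membrane time constant than when it is ten times slower. In this toy setting, a small sag is therefore compatible with a substantial Ih-like conductance.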

      (5) In the later experiments with ZD7288, the authors measured EPSP half-width at greater distances from the soma. However, they use minimal stimulation to evoke EPSPs at increasingly far distances from the soma. Without controlling for amplitude, the authors cannot easily distinguish between attenuation and spread from dendritic filtering and additional activation and spread from HCN blockade. At a minimum, the authors should share the variability of EPSP amplitude versus the change in EPSP half-width and/or stimulation amplitudes by distance. In general, this kind of experiment yields much clearer results if a more precise local activation of synapses is used, such as dendritic current injection, glutamate uncaging, sucrose puff, or glutamate iontophoresis. There are recording quality concerns here as well: the cell pictured in Figure 3a does not have visible dendritic spines, and a substantial amount of membrane is visible in the recording pipette. These concerns also apply to the similar developmental experiment in 6f-h, where EPSP amplitude is not controlled, and therefore, attenuation and spread by distance cannot be effectively measured. The outcome, that L2/3 cells have dendritic properties that violate cable theory, seems implausible and is more likely a result of variable amplitude by proximity.

      To resolve this issue, we made a supplementary figure showing elicited amplitudes, which showed no significant distance dependence and minimal variability (new Supplementary Figure 6). We thank the reviewer for suggesting an amplitude-halfwidth comparison control, which is included in the same figure. Regarding the non-visible spines, we would like to note that these images were acquired at a magnification and laser power too low to resolve them. The presence of dendritic spines was confirmed in every recorded pyramidal cell observed using 2P microscopy at higher magnification.

      We would like to emphasize that although our recordings “seemingly” violated the cable theory, this is only true if we assume a completely passive condition. As shown in our manuscript, cable theory was not violated, as the presence of NMDA receptor boosting explained the observed ‘non-Rallian’ phenomenon.

      (6) Minimal stimulation used for experiments in Figures 3d-i and Figures 4g-h does not resolve the half-width measurement's sensitivity to dendritic filtering, nor does cesium blockade preclude only HCN channel involvement. Example traces should be shown for all conditions in 3h; the example traces shown here do not appear to even be from the same cell. These experiments should be paired (with and without cesium/ZD). The same problem appears in Figure 4, where it is not clear that the authors performed controls and drug conditions on the same cells. 4g also lacks a scale bar, so readers cannot determine how much these measurements are affected by filtering and evoked amplitude variability. Finally, if we are to believe that minimal stimulation is used to evoke responses of single axons with 50% fail rates, NMDA receptor activation should be minimal to begin with. If the authors wish to make this claim, they need to do more precise activation of NMDA-mediated EPSPs and examine the effects of ZD7288 on these responses in the same cell. As the data is presented, it is not possible to draw the conclusion that HCN boosts NMDA-mediated responses in L2/3 neurons.

      As stated in the figure legends, the control and drug application traces are from the same cell, both in Figure 3 and Figure 4, and the scale bar is not included as the amplitudes were normalized for clarity. We have addressed the effects of dendritic filtering above in answer (5), and cesium blockade above in answer (2). To reiterate, dendritic filtering alone cannot explain our observations, and cesium is often a better choice for blocking HCN channels compared to ZD-7288, which blocks sodium channels as well.

      When an excitatory synaptic signal arrives at a pyramidal cell in typical conditions, neurotransmitter-sensitive receptors transmit a synaptic current to the dendritic spine. This dendritic spine is electrically isolated by the high resistance of the spine neck, and due to the small membrane surface of the spine, the synaptic current can elicit remarkably large voltage changes. These voltage changes can be large enough to depolarize the spine close to zero millivolts upon even single small inputs (Jayant et al. 2016). Therefore, to state that single inputs arriving at dendritic spines cannot be large enough to recruit NMDA receptor activation is incorrect. This is further exemplified by the substantial literature showing ‘miniature’ NMDA recruitment via stochastic vesicle release alone.

      (7) The quality of recordings included in the dataset has concerning variability: for example, resting membrane potentials vary by >15-20 mV and the AP threshold varies by 20 mV in controls. This is indicative of either a very wide range of genetically distinct cell types that the authors are ignoring or the inclusion of cells that are either unhealthy or have bad seals.

      Although we are aware of the diversity of L2/3 PCs, resolving further layer depth differences is outside the scope of our current manuscript. However, as shown in Kalmbech et al, resting membrane potential can greatly vary (>15-20 mV) in L2/3 PCs depending on distance from pia. We acknowledge that the variance in AP threshold is large and could be due to genetically distinct cell types.

      (8) The authors make no mention of blocking GABAergic signaling, so it must be assumed that it is intact for all experiments. Electrical stimulation can therefore evoke a mixture of excitatory and inhibitory responses, which may well synapse at very different locations, adding to interpretability and variability concerns.

      We thank the reviewer for pointing out our lack of detail regarding the GABAergic signaling blocker SR 95531. We did include this drug in our recordings of (50Hz stim.) signal summation, so GABAergic responses did not contaminate our recordings. We have now included this information in the results section (page 5) and the methods section (page 15).

      (9) The investigation of serotonergic interaction with HCN channels produces modest effect sizes and suffers the same problems as described above.

      We do not agree with the reviewer that a 50% drop in neuronal AP firing responses (Figure 7b) is a modest effect size. Thus, we opted to keep this data in the manuscript.

      (10) The computational modeling is not well described and is not biologically plausible. Persistent and transient K channels are missing. Values for other parameters are not listed. The model does not seem to follow cable theory, which, as described above, is not only implausible but is also not supported by the experimental findings.

      The model was downloaded from the Cell Type Database from the Allen Institute, with only minor modifications including the addition of dendritic HCN channels and NMDA receptors, which were varied along a wide parameter space to find a ‘best fit’ to our observations. These additions were necessary to recapitulate our experimental findings. We agree the model likely does not fully recapitulate all aspects of the dendrites, which, as we hope to convey in this manuscript, are not fully resolved in mouse L2/3 PCs. This is a previously published neuronal model, and despite its potential shortcomings, is one among a handful of open-source neuronal models of a fully reconstructed L2/3 PC.

      Reviewer #2 (Public Review):

      Summary:

      This paper by Olah et al. uncovers a previously unknown role of HCN channels in shaping synaptic inputs to L2/3 cortical neurons. The authors demonstrate using slice electrophysiology and computational modeling that, unlike layer 5 pyramidal neurons, L2/3 neurons have an enrichment of HCN channels in the proximal dendrites. This location provides a locus of neuromodulation for inputs onto the proximal dendrites from L4 without an influence on distal inputs from L1. The authors use pharmacology to demonstrate the effect of HCN channels on NMDA-mediated synaptic inputs from L4. The authors further demonstrate the developmental time course of HCN function in L2/3 pyramidal neurons. Taken together, this is a well-constructed investigation of HCN channel function and the consequences of these channels on synaptic integration in L2/3 pyramidal neurons.

      Strengths:

      The authors use careful, well-constrained experiments using multiple pharmacological agents to assess HCN channel contributions to synaptic integrations. The authors also use a voltage clamp to directly measure the current through HCN channels across developmental ages. The authors also provide supplemental data showing that their observation is consistent across multiple areas of the cerebral cortex.

      Weaknesses:

      The gradient of the HCN channel function is based almost exclusively on changes in EPSP width measured at the soma. While providing strong evidence for the presence of HCN current in L2/3 neurons, there are space clamp issues related to the use of somatic whole-cell voltage clamps that should be considered in the discussion.

      We thank the reviewer for pointing out our careful and well-constrained experiments and for making suggestions. The potential effects of space clamp errors are detailed in the extended explanations under Reviewer 1, Specific points (3).

      Reviewer #3 (Public Review):

      Summary:

      The authors study the function of HCN channels in L2/3 pyramidal neurons, employing somatic whole-cell recordings in acute slices of visual cortex in adult mice and a bevy of technically challenging techniques. Their primary claim is a non-uniform HCN distribution across the dendritic arbor with a greater density closer to the soma (roughly opposite of the gradient found in L5 PT-type neurons). The second major claim is that multiple sources of long-range excitatory input (cortical and thalamic) are differentially affected by the HCN distribution. They further describe an interesting interplay of NMDAR and HCN, serotonergic modulation of HCN, and compare HCN-related properties at 1, 2 and 6 weeks of age. Several results are supported by biophysical simulations.

      Strengths:

      The authors collected data from both male and female mice, at an age (6-10 weeks) that permits comparison with in vivo studies, in sufficient numbers for each condition, and they collected a good number of data points for almost all figure panels. This is all the more positive, considering the demanding nature of multi-electrode recording configurations and pipette-perfusion. The main strength of the study is the question and focus.

      Weaknesses:

      Unfortunately, in its present form, the main claims are not adequately supported by the experimental evidence: primarily because the evidence is indirect and circumstantial, but also because multiple unusual experimental choices (along with poor presentation of results) undermine the reader's confidence. Additionally, the authors overstate the novelty of certain results and fail to cite important related publications. Some of these weaknesses can be addressed by improved analysis and statistics, resolving inconsistent data across figures, reorganizing/improving figure panels, more complete methods, improved citations, and proofreading. In particular, given the emphasis on EPSPs, the primary data (for example EPSPs, overlaid conditions) should be shown much more.

      However, on the experimental side, addressing the reviewer's concerns would require a very substantial additional effort: direct measurement of HCN density at different points in the dendritic arbor and soma; the internal solution chosen here (K-gluconate) is reported to inhibit HCN; bath-applied cesium at the concentrations used blocks multiple potassium channels, i.e. is not selective for HCN (the fact that the more selective blocker ZD7288 was used in a subset of experiments makes the choice of Cs+ as the primary blocker all the more curious); pathway-specific synaptic stimulation, for example via optogenetic activation of specific long-range inputs, to complement / support / verify the layer-specific electrical stimulation.

      We thank the reviewer for their very careful examination of our manuscript and helpful suggestions. We addressed the concerns raised in the review and presented more raw traces in our figures. Although direct dendritic HCN mapping measurements are outside the scope of the current manuscript due to the morphological constraints presented by L2/3 PCs (which explains why no other full dendritic nonlinearity distribution has been described in L2/3 PCs with this method), we nonetheless supplemented our manuscript with additional experiments as suggested. For example, we included the excellent suggestion of pathway-specific optogenetic stimulation to further validate the disparate effect of HCN channels for distal and proximal inputs. We agree that ZD-7288 is a widely accepted blocker of HCN channels. However, the off-target effects on sodium channels may have significantly confounded our measurements of AP output using extracellular stimulation. Therefore, we chose low-concentration cesium as the primary blocker for those experiments, but have now validated several other Cs<sup>+</sup>-based results with ZD-7288 as well.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      I have some issues that need clarification or correction.

      (1) On page 3, line 90, the authors state "We found that bath application of Cs+ (1mM)..." but the methods and Figure 1 state "2mM Cs+". Please check and correct.

      Correct, typo corrected.

      (2) Related to Cs+ application, the methods state that "CsMeSO4 (2mM) was bath applied..." Is this correct? CsMeSO4 is typically used intracellularly while CsCl is used extracellularly. If so, please justify. If not, please correct.

      It is correct. The justification for not using CsCl selectively extracellularly is that introducing intracellular chloride ions can significantly alter basic biophysical properties, unrelated to the cesium effect. However, no similar distinction has been made for CsMeSO4, which would exclude the use of this drug extracellularly.

      (3) The authors normalize the current injections by cell capacitance (pA/pF). Was this done because there is a significant variance in cell morphology? A bit of justification for why the authors chose to normalize the current injection this way would help. If there is significant variation in cell capacitance across cells (or developmental ages), the authors could also include these data.

      Indeed, we chose to normalize current injection to cellular capacitance due to the markedly different morphology of deep and superficial L2/3 PCs. Deeper L2/3 PCs have a pronounced apical branch, closely resembling other pyramidal cell types such as L5 PCs, while superficial L2/3 PCs lack a thick main apical branch and instead are equipped with multiple, thinner apical dendrites. This morphological variation would yield an inherent bias in several of the reported measurements; therefore, we corrected for it by normalizing current injection to cellular capacitance, similar to our previous recent publications (Olah, Goettemoeller et al., 2022, Goettemoeller et al. 2024, Kumar et al. 2024).
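
      As a worked example of this normalization, the sketch below converts a current density in pA/pF into the absolute current injected into a given cell. The function name and the 100 pF example cell are hypothetical and used only for illustration; the -6 pA/pF and -2 pA/pF steps are the values quoted elsewhere in this response.

```python
def injected_current_pa(current_density_pa_per_pf, capacitance_pf):
    """Absolute current (pA) for a capacitance-normalized step (pA/pF)."""
    if capacitance_pf <= 0:
        raise ValueError("capacitance must be positive")
    return current_density_pa_per_pf * capacitance_pf


if __name__ == "__main__":
    # Hypothetical cell with a measured capacitance of 100 pF.
    print(injected_current_pa(-6.0, 100.0))  # standard hyperpolarizing step
    print(injected_current_pa(-2.0, 100.0))  # smaller step, as used for Fig. 7
```

      With this scaling, a larger cell receives proportionally more current, so the injected charge per unit of membrane is comparable across morphologically different cells.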

      (4) On page 15, line 445, the section heading is "PV cell NEURON modeling". Is this a typo? The models are of L2/3 pyramidal neurons, correct?  

      Correct, typo corrected.

      (5) Figures 3F and 3I are plots of the voltage integral for different inputs before and after Cs+. The y-axis label units are "pA*ms". This should be "mV*ms" for a voltage integral.  

      Correct, typo corrected.

      (6) On page 9, line 273, the text reads "Voltage clamp experiments revealed that the rectification of steady-state voltage responses to hyperpolarizing current injection was amplified with 5-CT (Fig. 7c)". Both the text and Figure 7C describe current clamp, not voltage clamp, recordings. Please check and correct.

      Correct, typo corrected.

      (7) Figure 2i looks to be a normalized conductance vs voltage (i.e. activation) plot. The y-axis shows 0-1 but the units are in nS. Is that a coincidence or an error?

      Correct, typo corrected.

      Reviewer #3 (Recommendations For The Authors):

      This is your paper. My comments are my own opinion, I don't expect you to agree or to respond. But I hope that what I wrote below will help you to understand my perspective.

      Please pardon my directness (and sheer volume) in this section - I have a lot of notes/thoughts and hope you may find some of them helpful. My high-level comments are unfortunately rather critical, and in (small) part that is because I encountered too many errors/typos/ambiguities in figures, legend, and text. I expect many would be caught with good proofreading, but uncorrected caused confusion on my part, or an inability to interpret your figures with confidence, given some ambiguity.

      The paper reads a bit like patchwork - likely a result of many "helpful" reviewers who came before me. Consider starting with and focusing on the synaptic findings, expanding the number of figures and panels dedicated to that, showing example traces for all conditions, and giving yourself the space to portray these complex experiments and results. While I'm not a fan of a large number of supplemental figures, I feel you could move the "extra" results to the supplementals to improve the focus and get right to the meat of it.

      For me, the main concern is that the evidence you present for the non-uniform HCN distribution is rather indirect. Ideally, I'd like to see patch recordings from various dendritic locations (as others have done in rats, at least; I'm not sure if L2/3 mice have had such conductance density measurements made in basal and apical dendrites). Otherwise, perhaps optical mapping, either functional or via staining. I also mention some concerns about the choice of internal and cesium. More generally, I want to see more primary data (traces), in particular for the big synaptic findings (non-uniform, L1-vs-L4 differences, NMDAR).

      We thank the reviewer for the helpful suggestions. Indeed, direct patch clamp recording is widely considered to be the best method to identify dendritic ion channel distribution; however, we chose an in silico approach instead, for several reasons. Undoubtedly, one of the main reasons to omit direct dendritic recordings was that, due to the uniquely narrow apical dendrites, this method is extremely challenging, with no previous examples in the literature where isolated dendritic outside-out patch recordings were achieved from this cell type. However, there are theoretical considerations as well. In primates, it has been demonstrated that HCN1 channels are concentrated on dendritic spines (Datta et al., 2023); therefore, direct outside-out recordings are not adequate in these circumstances. In future experiments we could directly target L2/3 PC dendrites for outside-out recordings in order to resolve dendritic nonlinearity distribution, although a cell-attached methodology may be better suited because the biophysical properties of HCN channels are closely regulated by intracellular signaling pathways.

      The introduction and Figures 1 and 2 are not so interesting and not entirely accurate: L2/3 do not have "abundant" HCN, nor is there an actual controversy about whether they have HCN. It's been clear (published) for years that they have about the same as all other non-PT neocortical pyramidal neurons (see e.g. Larkum 2007; Sheets 2011). Your own Figure 1A has a logarithmic scale and shows L2/3 as having the lowest expression (?) of all pyramidals and roughly 10x lower than L5 PT, but the text says "comparable", which is misleading.

      We thank the reviewer for this comment. Although there are sporadic reports in the literature about the HCN content of L2/3 PCs, most of these publications arrive at the same conclusion from the negligible sag potential (as in the mentioned Larkum et al., 2007 publication); namely, that L2/3 PCs do not contain a significant amount of HCN channels. We have shown with voltage and current clamp recordings that this assumption is false, as sag potential is not a reliable indicator of HCN content in L2/3 PCs. With the term “controversial” we aimed to highlight the different conclusions of functional investigations (e.g. Sheets et al., 2011) and sag potential recordings (e.g. Larkum et al., 2007), regarding the importance of HCN channels in L2/3 PCs.

      Non-uniform HCN with distal lower density has already been published for a (rare) pyramidal neuron in CA1 (Bullis 2007), similar to what you found in L2/3, and different from the main CA1 population.

      We thank the reviewer for this suggestion. We have now included the mentioned citation in the introduction section (page 3).

      Express sag as a ratio or percentage, consistently. Figure out why in Figure 7 the average sag ratio is 0.02 while in Fig. S1 it is 0.07 (for V1) - that is a massive difference.

      The calculation of sag ratio is consistent across the manuscript (at -6 pA/pF), except for experiments depicted in Fig. 7, where sag ratio was calculated from -2 pA/pF steps. Explanation below:

      Sag should be measured at a common membrane potential, with each neuron receiving a current pulse appropriate to reach that potential. Your approach of capacitance-based may allow for the same, but it is not clear which responses are used to calculate a single sag value per cell (as in Figure 2d).

      Thank you. Sag potential was measured at the peak voltage of the -6 pA/pF step, except for Fig. 7 as noted above. We have now included this detail in the methods section (page 14). The recordings in Fig. 7 took significantly longer than any other recording in the manuscript, as it took considerable time to reach a steady-state response after 5-CT application. -6 pA/pF corresponds to a current injection in the range of 400-800 pA, which proved too severe for continued application in cells after more than an hour of recording. Accordingly, we decided to lower the hyperpolarizing current step in these recordings. The absolute value of sag is thus different in Fig. 7, but nonetheless the 5-CT effect was still significant. Notably, we probably would not have noticed the small sag in L2/3 PCs (and thus would not have begun the entire study) had we not looked at -6 pA/pF to begin with.

      In a paper focused on HCN, I would have liked to see resonance curves in the passive characterization.

      We thank the reviewer for the suggestion. Resonance curves can indeed provide useful insights into the impact of HCN on a cell’s physiological behavior, however, these experiments are outside the scope of our current manuscript as without in vivo recordings, resonance curves do not contribute to the manuscript in our opinion.

      How did you identify L2/3? Did you target cells in L2 or L3 or in the middle, or did you sample across the full layer width for each condition? A quantitative diagram showing where you patched (soma) and where you stimulated (L1, L4) with actual measurements, would be helpful (supplemental perhaps). You mention in the text that some L2/3 don't have a tuft, suggesting some variability in morphology - some info on this would be useful, i.e. since you did fill at least some of the neurons (eg 3A), how similar/different are the dendritic arbors?

      We sampled the entire L2/3 region during our recordings. It has been published that deep and superficial L2/3 PCs are markedly different in their morphology, and a recent publication (Brandelise et al. 2023) has even separated these two subpopulations into broad-tufted and slender-tufted pyramidal cells, which receive distinct subcortical inputs. Although this differentiation opens exciting avenues for future research, examining potential layer gradients in our dataset would warrant significantly higher sample numbers and is currently out of the scope of our manuscript.

      Distal vs proximal: this could use more clarification, considering how central it is to your results. What about a synapse on a basal dendrite, but 150 or 200 um from the soma, is that considered proximal? Is the distance to the soma you report measured along the 3D dendrite, along the 2D dendrite, as a straight line to the soma, or just relative to some layers or cortical markers? (I apologize if I missed this).

      We thank the reviewer for pointing out the missing description in the results section. We have amended this oversight (p15).  Furthermore, although deeper L3 PCs have characteristic apical and basal dendritic branches, when recordings were made from more superficial L2 cells, a large portion of their dendrites extended radially, which made their classification ambiguous. Therefore, we did not use “apical” and “basal” terminology in the paper to avoid confusion. Distances were measured along the 3D reconstructed surface of the recovered pyramidal cells. This information is now included in the methods.

      Line 445, "PV cell NEURON modeling" ... hmm. Everyone re-uses methods sections to some degree, but this is not confidence-inspiring, and also not from a proofreading perspective.

      We have corrected the typo.

      It seems that you constructed a new HCN NEURON mechanism when several have been published/reviewed already. Please explain your reasons or at least comment on the differences.

      There are slight differences in our model compared to previously published models. Nevertheless, we took a previously published HCN model as a base (Gasparini et al, 2004), and created our own model to fit our whole-cell voltage clamp recordings.

      Bath-applied Cs+ can change synaptic transmission (in the hippocampus; Chevaleyre 2002). But also ZD7288 has some such effects. Also, see (Harris 1995) for a Cs+ and ZD7288 comparison. As well as (Harris 1994) for more Cs+ side-effects (it broadens APs, etc). Bath-applied blockers may affect both long-range and local synapses in your recordings, via K-channels or perhaps presynaptic HCN (though I am aware of your Fig. 1e). Since you can do intracellular perfusion, you could apply ZD7288 postsynaptically (Sheets 2011), an elegant solution.

      We thank the reviewer for the suggestion. We were aware of the potential presynaptic effects of cesium (i.e., presynaptic Kv or other channel effects) and did measure PPR after cesium application (Fig. 1h), noting no effect. At Cs<sup>+</sup> concentrations used here, we now also include new data in the results showing no effect on somatically recorded AP waveform (i.e., representative of a Kv channel effect). As stated earlier for reviewer 1, we now performed additional experiments using either cesium or ZD-7288 for comparison (e.g., see updated Fig. 1; Supplementary Figure 1; Fig. 3b-e). Intracellular ZD re-perfusion is an elegant solution which we will absolutely consider in future experiments.

      K-Gluconate is reported to inhibit Ih (Velumian 1997), consider at least some control experiments with a different internal for the main synaptic finding - maybe you'll find no big change ...

      We thank the reviewer for the suggestion. Although K-Gluconate can inhibit HCN current, this intracellular solution is often used in the literature to measure this current (Huang & Trussel 2014). We have chosen this intracellular solution to improve recording stability.

      (Biel 2009) is a very comprehensive HCN review, you may find it useful.

      We thank the reviewer for bringing this to our attention, we have now included the citation in the introduction.

      "Hidden" in your title seems too much.

      We changed the title to more accurately describe our findings and removed ‘hidden’.

      While I'm glad you didn't record at room temperature, the choice of 30C seems a bit unfortunate - if you go to the trouble to heat the bath, why not at least 34C, which is reasonably standard as an approximation for physiological temperature?

      We thank the reviewer for pointing this out. The choice of 30C was made to approach physiological temperature levels while preserving the slices for extended periods, which is a standard approach. Future in vivo experiments should be performed to further understand the naturalistic relevance at ~37C.

      Line 506: do you mean "Hz" here? It's not a frequency, is it? I think it's a unitless ratio?

      Correct, we have amended the typo.

      Line 95: you have not shown that HCN is "essential" for "excess" AP firing.

      We have corrected the phrasing, we agree.

      Fig. 2b,c: is this data from a single example neuron, maybe the same neuron as in 2a? Or from all recorded neurons pooled?

      The data is from several recorded cells pooled.

      Fig. 3 (important figure):

      Why did you not use a paired test for panels e and f? You have the same number of neurons for each condition and the expectation is that you record each neuron in control and then in cesium condition, which would be a paired comparison. Or did you record only 1 condition per neuron?

      This figure presents your main finding (in my opinion). You should show examples of the synaptic responses, i.e. raw traces, for each condition and panel, and overlaid in such a way that the reader can immediately see the relevant comparison - it's worth the space it requires.

      We thank the reviewer for the suggestions. Traces are only overlaid in the paper when they come from the same cell. For Fig. 3d-i, EPSPs in every neuron were evoked in 2-3 different locations (i.e., 1-2 ‘L4’ locations for Type-I and Type-II synapses, and one ‘L1’ location in each) with the same stimulation pipette and one pharmacological condition per cell. Therefore, two-sample t-tests were used, since the control and cesium conditions came from separate cells (i.e., separate observations). This was necessary, as we can never assume that the stimulating electrode can return to the same synapse after moving it. We were not comfortable with showing overlaid traces from different cells; however, we did show representative traces from the control and Cs<sup>+</sup> conditions in Fig. 3h. Complementary ZD-7288 experiments can be found in panels b and c, where we did perform within-cell pharmacology (and thus used paired t-tests) from one stimulation area per cell. We hope these complementary experiments increase overall confidence, as neither pharmacological approach is 100% without off-target effects. We have now also included more overlaid traces where appropriate (i.e., Fig. 3b, and in the new Fig. 3k experiments using within-cell pharmacology comparisons). We do realize these complementary approaches could cause confusion to the reader, and have now done our best to make the slightly different approaches in this Figure clearer in the results section.
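
      The distinction between the two designs can be made concrete with a short sketch. The code below computes the t statistic for a paired (within-cell) comparison and for a two-sample (separate cells) comparison. The function names and the numbers are hypothetical and are not data from the manuscript; the two-sample version assumes Student's pooled-variance form, which may differ from the exact test variant used in the analysis.

```python
import math
import statistics


def paired_t(control, drug):
    """t statistic and degrees of freedom for within-cell (paired) data."""
    if len(control) != len(drug):
        raise ValueError("paired data need one control and one drug value per cell")
    diffs = [d - c for c, d in zip(control, drug)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


def two_sample_t(control, drug):
    """Student's t statistic (pooled variance) for values from separate cells."""
    n1, n2 = len(control), len(drug)
    pooled_var = ((n1 - 1) * statistics.variance(control)
                  + (n2 - 1) * statistics.variance(drug)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (statistics.mean(drug) - statistics.mean(control)) / se, n1 + n2 - 2


if __name__ == "__main__":
    # Hypothetical EPSP half-widths in ms, for illustration only.
    control = [10.0, 12.0, 14.0, 16.0]
    drug = [12.0, 14.0, 16.0, 18.5]
    print("paired:    ", paired_t(control, drug))
    print("two-sample:", two_sample_t(control, drug))
```

      The paired statistic removes cell-to-cell variability by working on per-cell differences, which is only valid when both conditions are measured in the same cell at the same stimulation site. When each cell contributes a single condition, the two-sample form is the appropriate, more conservative choice.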

      Consider repeating at least some of these critical experiments with ZD7288 instead of Cs+ (and not K-gluc), or even with ZD7288 pipette perfusion, if it's technically feasible here.

      We thank the reviewer for the suggestions. Although many of our recordings using Cs<sup>+</sup> already had complementary experiments (such as synaptic experiments Figure 3e vs Figure 3b), we recognize the need to extend the manuscript with more ZD-7288 experiments. We have now extended Figure 1 with three panels (Figure 1 c,d,e), which recapitulates a fundamental finding, the change in overall excitability upon HCN channel blockade, using ZD-7288 as well.

      Fig. 3a, why show a schematic (and weirdly scaled) stimulating electrode? Don't you have a BF photo showing the actual stimulating electrode, which you could trace to scale or overlay? Could you use this panel to indicate what counts as "distal" and what as "proximal", visually?

      The stimulating electrode was unfortunately not filled with fluorescent material; therefore it was not captured in the z-stack.

      Fig. 3b: is the y-axis labeled correctly? A "100% change" would mean a doubling, but based on the data points here I think y=100% means "no change"?

      The scale is labeled correctly, 100% means doubling.
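      To make the axis convention concrete, here is a minimal sketch, assuming the y-axis plots percent change relative to the control value. The function name and the numbers are illustrative only and are not taken from the manuscript.

```python
def percent_change(control: float, treated: float) -> float:
    """Percent change of `treated` relative to `control` (0 means no change)."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return 100.0 * (treated - control) / control


# Hypothetical half-width values in ms, for illustration only.
print(percent_change(10.0, 10.0))  # 0.0   -> no change
print(percent_change(10.0, 20.0))  # 100.0 -> doubling
```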

      Fig. 3b, c: again, show traces representing distal and proximal, not just one example (without telling us how far it was). And use those traces to illustrate the half-width measurement, which may be non-trivial.

      We have extended Figure 3b with an inset showing the effect of ZD-7288 at a proximal stimulation site. The legend now includes additional information indicating that the stimulation site was 28 µm away from the soma, with traces shown in control conditions (black trace) and upon ZD-7288 application (green trace).

      Line 543, 549: it seems you swapped labels "h" and "i"?

      Typo corrected.

      Fig. 4b: to me, MK-801 only *partially* blocks amplification, but in the text L198 you write "abolish".

      We thank the reviewer for pointing this out. Indeed, there are several other subthreshold mechanisms that are still intact after pipette perfusion, which can cause amplification. We have now clarified this in the text (p7).

      Fig. 4e,f: what is the message? Uniform NMDAR? The red asterisk in (e) is at a proximal/distal ratio of roughly 1. I don't understand the meaning of the asterisk (the legend is too basic) and I'm surprised to see a ratio of 1 as the best fit, and also that the red asterisk is at a dendritic distance of 0 um in (f). This could use more explanation (if you feel it's relevant).

      We thank the reviewer for pointing this out. We have now included a better explanation in the results and figure legend. We have also updated the figure to make it clearer and added model traces in Fig. 4f, which correspond to example data from slices in Fig. 4g (both green). The graph suggests a nonuniform, proximally abundant NMDA distribution. The color coding corresponds to the proximal EPSP halfwidth divided by the distal EPSP halfwidth. It is true that the dendritic distance ‘center’ was best fit very close to the soma, but note also that the dispersion (distribution) half-width was >150 µm, so there is quite a significant dendritic spread despite the predicted proximal bias. Based on this model, there is likely NMDA spread throughout the entire dendrite, but biased proximally. Naturally, future work will need to map this at the spine level, so this is currently an oversimplification. Nonetheless, a proximal NMDA bias was necessary to recapitulate findings from Fig. 3, and additional slice recordings in Fig. 4 were consistent with this interpretation.

      Fig. 4g: I feel your choice of which traces to overlay is focusing on the wrong question. As the reader, what I want to see here is an overlay of all 4 conditions for one pathway. If this is a sequential recording in a single cell (Cs, Cs+MK801, wash out Cs, MK801), then the overlay would be ideal and need not be scaled. Otherwise, you can scale it. But the L1/L4 comparison does not seem appropriate to me. I find myself trying to imagine what all the dark lines would look like overlaid, and all the light lines overlaid separately. Also, the time axis is missing from this panel. Consider a subtraction of traces (if appropriate).

      In these recordings, all EPSPs were measured using a stimulating electrode that was moved between L1 and L4 (only once, to keep the exact input consistent) to measure the different inputs in a single neuron. In separate sets of experiments, the same method was used but in the presence of Cs<sup>+</sup>, Cs<sup>+</sup> + MK-801, or MK-801 alone. This was the most controlled method in our hands for this type of approach, as drug washouts were either impractical or not possible. Overlaying four traces would have presented a more cluttered image, and such sequential recordings were not actually performed experimentally. As our aim was to resolve the proximal-distal halfwidth relationship, we deemed the within-cell L1 vs. L4 comparison appropriate. We have nonetheless added model traces in Fig. 4f, which correspond to example data from slices in Fig. 4g (both green). The bar graphs should also serve to illustrate the input-specific relationship, i.e., that the only time the L1 and L4 EPSP relationship was inverted was in the presence of Cs<sup>+</sup> (green bars) and that this effect was occluded with simultaneous MK-801 in the pipette (red bars).

      Line 579: should "hyperpolarized" be depolarized?

      Corrected

      Fig. 5a: it looks like the HCN density is high in the most basal dendrites (black curve above), then drops towards the soma, then rises again in the apicals (red curve). Is that indeed how the density was modeled? If so, this is completely at odds with the impression I received from reading your text and experimental data - there, "proximal" seems to mean where the L4 axons are, and "distal" seems to mean where the L1 axons are, in other words, high HCN towards the pia and low HCN towards the white matter. But this diagram suggests a biphasic hill-valley-hill distribution of HCN (meaning there is a second "distal" region below the soma). In that case, would the laterally-distant basal dendrites also be considered distal? How does the model implement the distribution - is it 1D, 2D or 3D? As you can probably tell, this figure raised more questions for me and made me wonder why I don't have a better understanding yet of your definitions.

      We thank the reviewer for pointing this out. We agree our initial cartoon of the parameter fitting procedure was not accurate and should have depicted just a single ‘curve’. We have now simplified it to better demonstrate what the model is testing, and also made the terms more consistent and accurate. There is no ‘second’ region in the model. We hope this better illustrates it now. We also edited the legend to be clearer. Because the model description in Fig. 4d suffered from similar shortcomings, we modified it and its figure legend accordingly.

      Fig. 5b: why is the best fit at a proximal/distal ratio of 1, yet sigma is 50 um?

      Proximal/distal bias on this figure was fitted to 0.985 (prox/distal ratio) as we modeled control conditions, with intact NMDA and HCN channels, which closely approximated the control recording comparisons.

      Fig. 6h, Line 662: "vs CsMeSO4 ... for putative LGN events" The panel shows proximal vs distal, not control vs Cs+. What's going on here?

      Typo corrected.

      Fig. 7e: the ctrl sag ratio here averages 0.02, while in Fig. S1 the average (for V1 and others) is about 0.07.

      Please refer to our answer to the previous question regarding sag ratio measurements. Briefly, recordings with 5-CT application used a less severe, -2 pA/pF current injection to test sag responses. This more modest hyperpolarization activated fewer HCN channels; the sag ratio is therefore lower than in the previously reported data points.

      We have included this explanation in the methods section (page 14)

      Now here you are using a paired test for this pharmacology, but you didn't previously (see my earlier comments/questions).

      Paired t-tests were used for these experiments, as the control and test data points came from the same cell. Cells were recorded in control conditions and again after drug application.
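      The distinction between the two designs can be sketched as follows. This is a minimal illustration using only the Python standard library, assuming Student's pooled-variance form for the two-sample case. The function names and data values are hypothetical and are not measurements from the study.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic and degrees of freedom for a paired t-test (same cells)."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


def two_sample_t(group_a, group_b):
    """Student's t statistic (pooled variance) and df for independent groups."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * statistics.variance(group_a)
        + (nb - 1) * statistics.variance(group_b)
    ) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(group_b) - statistics.mean(group_a)) / se, na + nb - 2


# Hypothetical EPSP half-widths (ms); illustration only.
control = [10.0, 12.0, 14.0, 16.0]
drug = [11.0, 13.5, 15.0, 17.5]

t_paired, df_paired = paired_t(control, drug)   # same cells, before vs. after
t_indep, df_indep = two_sample_t(control, drug)  # treated as separate cells
print(round(t_paired, 3), df_paired)
print(round(t_indep, 3), df_indep)
```

      With the same numbers, the paired statistic is much larger because cell-to-cell variability cancels in the within-cell differences. This is why the choice of test must follow the experimental design, not the other way round.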

      Line 137: single-axon activation: but cortical axons make multi-synaptic contacts, at least for certain types of pre- and post-synaptic neurons, and (e.g. in L5-L5 pairs) those contacts can be distributed across the entire dendritic arbor. In other words, it's possible that when you stimulate in L1, you activate local axons, and the signal could then propagate to multiple synaptic contact locations, some being distal and some proximal. Maybe you have reasons to believe you're able to avoid this?

      We thank the reviewer for this question. Cortical axons often make distributed contacts; however, bottom-up and top-down pathways innervating L2/3 PCs are at least somewhat restricted to L2/3/L4 and L1, respectively (Shen et al. 2022, Sermet et al. 2019). Therefore, given the lack of evidence for a heavily mixed topographical distribution of top-down and bottom-up inputs, we have reason to believe that L1 stimulation will mainly recruit distal inputs, while L4 stimulation will mainly excite proximal dendritic regions. The resolution of our experiments was also improved by the minimal stimulation and visual guidance (subset of experiments) of the stimulation. Furthermore, new optogenetic experiments stimulating LGN and LM axons, which have been anatomically defined previously as biased to deeper layers and L1, respectively, were now also performed (Fig. 3j-l), with cesium effects analogous to those in our local electrical stimulation experiments. Future work using varying optogenetic stimulation parameters will expand on this.

      L140: "previous reports" ==> citation needed.

      We have inserted the citation needed.

      L149: "arriving to layer 1"; but I think earlier you noted that some or many L2/3 neurons lack a dendritic tuft; do they all nevertheless have dendrites in L1? Note that cortico-cortical long-range axons still need to pass through all cortical layers on their way up to L1.

      We thank the reviewer for the question. Although the more superficial L2/3 PCs lack a distinct apical tuft, their dendrites reach the pia similarly to those of deeper L2/3 PCs. All of our recorded and post-hoc recovered cells had dendrites in L1, except where they were clearly cut during the slicing procedure; those cells were excluded from the study.

      When you write "L4 axons" or "L4 inputs", do you specifically mean long-range thalamic axons? Or axons from local L4 neurons? What about axons in L4 that originate from L5 pyramidal neurons?

      In the case of ‘L4’ axons, we cannot disambiguate these inputs a priori, as they are both part of the bottom-up pathway and are possibly experimentally indistinguishable. Even with restricted opto LGN stimulation, disynaptic inputs via L4 PCs cannot be completely ruled out under our conditions. On the other hand, the probability of L5 PC axons terminating on L2/3 PCs is exceedingly low (single reported connection out of 1145 potential connections; Hage et al. 2022). We did find two clearly different synaptic subpopulations (Supp. Fig 3) in L4, which it was tempting to classify as one or the other. However, we felt there was not enough evidence in the literature or in our additional optogenetic experiments to classify the source of these different L4 inputs. Thus we deemed them Type-I or Type-II for now.

      Do you inject more holding current to compensate for the resting membrane potential when Cs+ or ZD7288 is in the bath?

      We thank the reviewer for the question. We did not inject a compensatory current, as we wanted to investigate the dual, physiologically relevant action of HCN channels (George et al. 2009).

      I'd like to see distributions (histograms) of L4 and L1 EPSP amplitudes, under control conditions and ideally also under HCN block.

      We have now extended the manuscript with a supplementary figure (Supplementary Figure 6) to show that EPSP peak was not distance dependent in control conditions, and there was no relationship between peak and halfwidth in our dataset.

      Line 186, custom pipette perfusion: why not use this for internal ZD7288, to make it cell-specific?

      We thank the reviewer for the question, this is a good point. In future work we will consider this when applicable. It is certainly a way to control for bath application confounds in many ways.

      L205: "recapitulate our experimental findings" - which findings do you mean? I think a bit of explanation/referencing would help.

      Corrected.

      Line 210: L4-evoked were narrower than L1-evoked: is this not expected based on filtering?

      We thank the reviewer for pointing this out; the word “Intriguingly” has been omitted.

      Line 231 and 235: "in L5 PCs" should be restricted to L5 PT-type PCs.

      We have corrected this throughout the manuscript.

      Neuromodulation, Fig. 7, L263-282: the neuromodulation finding is interesting. However, a bit like the developmental figure, it feels "tacked on" and the transition feels a bit awkward. I think you may want to discuss/cite more of the existing literature on neuromodulatory interactions with HCN (not just L2/3). Most importantly, what I feel is missing is a connection to your main finding, namely L1 and L4 inputs. Does serotonergic neuromodulation put L1 and L4 back on equal footing, or does it exaggerate the differences?

      We thank the reviewer for the question. We agree with the reviewer that Figure 7 does not give a complete picture about how the adult brain can capitalize on this channel distribution, as our intention was to show that HCN channels are not a stationary feature of L2/3 PC, but a feature which can be regulated developmentally and even in the adult brain via neuromodulation. In other words, the subthreshold NMDA boosting we observed can be gated by HCN, depending on developmental stage and/or neuromodulatory state of the system. We have now added some brief language to better introduce the transition and its relevance to the current study in the results (p8), and discussed the implications in the discussion section of the original manuscript.

      General comment: different types/sources of synapses may have different EPSP kinetics. I feel this is not mentioned/discussed adequately, considering your emphasis on EPSPs/HCN.

      See points above on input-specific synaptic diversity.

      Line 319/320: enriched distal HCN is found in L5 PT-type, not in all L5 PCs.

      Corrected

      L320: CA1 reportedly has a subset of pyramidal neurons that have higher proximal HCN than distal (I gave the citation above). In light of that, I think "unprecedented" is an overstatement.

      Corrected.

      Methods:

      L367: What form of anesthesia was used?

      Amended.

      Which brain areas, and how?

      Amended.

      Why did you first hold slices at 34C, but during recording hold at 30C?

      We held the slices at 34C to accelerate the degradation of superficial damaged parts of the slice, which is in line with currently used acute slice preparation methodologies, regardless of the subsequent recording temperature.

      Pipette resistance/tip size?

      Amended.

      Cell-attached recordings (L385): provide details of recordings. What was the command potential (fixed value, or did you adjust it per neuron by some criteria)?

      Amended.

      What type of stimulating electrode did you use? If glass, what solution is inside, and what tip size?

      We thank the reviewer for pointing these out; the specific points were added to the methods section.

      L392/393: you adjusted the holding (bias) current to sit at -80 mV. What were the range and max values of holding current? Was -80 mV the "raw" potential, or did it account for liquid junction? If you did not account for liquid junction potential, then would -80 in your hands effectively be between -95 and -90 mV? That seems unusually hyperpolarized.

      All cells were held with bias holding currents between -50 pA and 150 pA. To be clear, as mentioned below, we did not change the bias current after any drug applications. We did not correct for liquid junction potential, and cells were ‘held’ with bias current at -80 mV during our recordings 1) because this value was apparently close to the RMP (i.e., little bias current was needed at this voltage on average) (Fig. 2e) and 2) to keep conditions consistent across recordings. The uncorrected -80 mV is in the range of previously reported membrane potential values both in vivo and in vitro (Svoboda et al. 1999, Oswald et al. 2008, Luo et al. 2017), which found the (corrected) RMP to be below -80 mV. Naturally, this will not reflect every in vivo condition completely, and further investigation using naturalistic conditions is warranted.

      Did you adjust the bias current during/after pharmacology?

      Bias current was not adjusted in order to resolve the effect on resting membrane potential.

      L398: sag calculation could use better explanation: how did you combine/analyze multiple steps from a single neuron when calculating sag? Did you choose one level (how) or did you average across step sizes or ...?

      Sag ratio was measured at -6 pA/pF current step except for one set of experiments in Fig. 7. Methods section was amended.
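      As an illustration of the measurement, the sketch below scales the current step to cell capacitance and computes a sag ratio. The definition used here (the fraction of the peak hyperpolarizing deflection that relaxes back by steady state) is a commonly used one and is an assumption. The exact formula is given in the amended Methods, not in this response, and all names and values below are hypothetical.

```python
def step_current_pa(density_pa_per_pf: float, capacitance_pf: float) -> float:
    """Scale the injected step to cell size (e.g. -6 pA/pF times capacitance)."""
    return density_pa_per_pf * capacitance_pf


def sag_ratio(v_baseline: float, v_peak: float, v_steady: float) -> float:
    """Assumed definition: fraction of the peak deflection that relaxes back."""
    peak_deflection = v_peak - v_baseline
    if peak_deflection == 0:
        raise ValueError("no voltage deflection")
    return (v_peak - v_steady) / peak_deflection


# Hypothetical cell: 50 pF capacitance, voltages in mV.
print(step_current_pa(-6.0, 50.0))               # injected step in pA
print(round(sag_ratio(-80.0, -100.0, -98.6), 3))  # assumed sag ratio
```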

      L400, 401: 10 uM Alexa-594 or 30 um Alexa-594, which is correct?

      10 µM is correct, typo was corrected

      L445: "PV cell" seems like a typo?

      Typo is corrected.

      L450: "altered", please describe the algorithm or manual process.

      Alterations were made manually.

      L474: NDMA, typo.

      Typo is fixed.

      L474: "were adjusted", again please describe the process.

      Adjustments were made by a grid-search algorithm.
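      For readers unfamiliar with the approach, a grid search evaluates every combination of candidate parameter values and keeps the best-scoring one. Below is a minimal sketch, assuming two free parameters (the centre and width of a Gaussian density profile) and a sum-of-squared-errors cost. The toy model, parameter names, grids and target values are hypothetical and do not reproduce the model used in the manuscript.

```python
import itertools
import math


def gaussian_density(distance_um, center_um, sigma_um):
    """Toy channel density profile along the dendrite (arbitrary units)."""
    return math.exp(-((distance_um - center_um) ** 2) / (2 * sigma_um ** 2))


def grid_search(distances, targets, centers, sigmas):
    """Return (cost, center, sigma) with the lowest sum of squared errors."""
    best = None
    for center, sigma in itertools.product(centers, sigmas):
        cost = sum(
            (gaussian_density(d, center, sigma) - t) ** 2
            for d, t in zip(distances, targets)
        )
        if best is None or cost < best[0]:
            best = (cost, center, sigma)
    return best


# Hypothetical target profile generated from known parameters (0 um, 150 um).
distances = [0, 50, 100, 150, 200, 250]
targets = [gaussian_density(d, 0, 150) for d in distances]
cost, center, sigma = grid_search(
    distances, targets, centers=range(0, 201, 50), sigmas=range(50, 301, 50)
)
print(center, sigma, cost)
```

      Because the target profile was generated from parameters that lie on the grid, the search recovers them with zero cost. With real data the minimum would be non-zero, and the grid spacing would limit the resolution of the fit.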

      Biel, M., Wahl-Schott, C., Michalakis, S., & Zong, X. (2009). Hyperpolarization-activated cation channels: from genes to function. Physiological reviews, 89(3), 847-885. https://journals.physiology.org/doi/full/10.1152/physrev.00029.2008 - (very comprehensive review of HCN)

      Bullis JB, Jones TD, Poolos NP. Reversed somatodendritic I(h) gradient in a class of rat hippocampal neurons with pyramidal morphology. J Physiol. 2007 Mar 1;579(Pt 2):431-43. doi: 10.1113/jphysiol.2006.123836. Epub 2006 Dec 21. PMID: 17185334; PMCID: PMC2075407. https://physoc.onlinelibrary.wiley.com/doi/full/10.1113/jphysiol.2006.123836 - (CA1 subset (PLPs) have a reversed HCN gradient; cell-attached patches, NMDAR)

      Velumian AA, Zhang L, Pennefather P, Carlen PL. Reversible inhibition of IK, IAHP, Ih, and ICa currents by internally applied gluconate in rat hippocampal pyramidal neurones. Pflugers Arch. 1997 Jan;433(3):343-50. doi: 10.1007/s004240050286. PMID: 9064651. https://link.springer.com/article/10.1007/s004240050286 - (K-Gluc internal inhibits HCN)

      Sheets, P. L., Suter, B. A., Kiritani, T., Chan, C. S., Surmeier, D. J., & Shepherd, G. M. (2011). Corticospinal-specific HCN expression in mouse motor cortex: I h-dependent synaptic integration as a candidate microcircuit mechanism involved in motor control. Journal of neurophysiology, 106(5), 2216-2231. https://journals.physiology.org/doi/full/10.1152/jn.00232.2011 - (L2/3 IT have same sag ratio as all other non-PT pyramidals, roughly 5% (vs 20% PT); intracellular ZD7288 used at 10 or 25 um)

      Harris NC, Constanti A. Mechanism of block by ZD 7288 of the hyperpolarization-activated inward rectifying current in guinea pig substantia nigra neurons in vitro. J Neurophysiol. 1995 Dec;74(6):2366-78. doi: 10.1152/jn.1995.74.6.2366. PMID: 8747199. https://journals.physiology.org/doi/abs/10.1152/jn.1995.74.6.2366 - (comparison Cs+ and ZD7288)

      Harris, N. C., Libri, V., & Constanti, A. (1994). Selective blockade of the hyperpolarization-activated cationic current (Ih) in guinea pig substantia nigra pars compacta neurones by a novel bradycardic agent, Zeneca ZM 227189. Neuroscience letters, 176(2), 221-225. https://www.sciencedirect.com/science/article/abs/pii/0304394094900876 - (Cs+ is not HCN-selective; it also broadens APs, reduces the AHP)

      Chevaleyre, V., & Castillo, P. E. (2002). Assessing the role of Ih channels in synaptic transmission and mossy fiber LTP. Proceedings of the National Academy of Sciences, 99(14), 9538-9543. https://pnas.org/doi/abs/10.1073/pnas.142213199 - (Cs+ blocks K channels, increases transmitter release; but also ZD7288 affects synaptic transmission)

      Thank you

    1. The cultural studies departments of many North American universities have adopted “postcolonial studies” in their curricula with an academi-

      no political urgency- post colonial suggests colonialism is a past system

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The use of antalarmin, a selective CRF1 receptor antagonist, prevents the deficits in sociability in (acutely) morphine-treated males, but not in females. In addition, cell-attached experiments show a rescue to control levels of the morphine-induced increased firing in PVN neurons from morphine-treated males. Similar results are obtained in CRF receptor 1-/- male mice, confirming the involvement of CRF receptor 1-mediated signaling in both sociability deficits and neuronal firing changes in morphine-treated male mice.

      Strengths:

      The experiments and analyses appear to be performed to a high standard, and the manuscript is well written and the data clearly presented. The main finding, that CRF-receptor plays a role in sociability deficits occurring after acute morphine administration, is an important contribution to the field.

      Weaknesses:

      The link between the effect of pharmacological and genetic modulation of CRF 1 receptor on sociability and on PVN neuronal firing, is less well supported by the data presented. No evidence of causality is provided.

      Major points:

      (1) The results of behavioral tests and the neural substrate are purely correlative. To establish causality, it would be important to selectively delete or re-express the CRF1 receptor sequence in the PVN. Re-expressing the CRF1 receptor in the PVN of male mice and testing them for social behavior and for neuronal firing would be the easier step in this direction.

      We agree with this comment and have acknowledged that further studies, such as genetic or pharmacological inactivation of CRF<sub>1</sub> receptors selectively in the paraventricular nucleus of the hypothalamus (PVN), are warranted to address this issue (page 17, line 25 to page 18, line 1).

      We would also like to mention that our manuscript title intentionally presented our findings separately without implying causality. Our idea was simply to pair the behavioral data to neural activity within a network of interest, i.e., the PVN CRF-oxytocin (OXY)/arginine-vasopressin (AVP) network, which is thought to play a critical role at the interface of substance use disorders and social behavior. Accordingly, we previously reported that genetic CRF<sub>2</sub> receptor deficiency reliably eliminated sociability deficits and hypothalamic OXY and AVP expression induced by cocaine withdrawal (Morisot et al., 2018). Thus, the present manuscript reliably shows that CRF<sub>1</sub> receptor-mediated effects of acute morphine administration upon social behavior are consistently mirrored by neural activity changes within the PVN, and particularly within its OXY<sup>+</sup>/AVP<sup>+</sup> neuronal populations. In addition, we demonstrate that the latter effects are sex-linked, which is in line with previous reports of sex-biased CRF<sub>1</sub> receptor roles in rodents (Rosinger et al., 2019; Valentino et al., 2013) and humans (Roy et al., 2018; Weber et al., 2016).

      (2) It would be interesting to discuss the relationship between morphine dose and CRF1 receptor expression.

      We are not aware of studies reporting CRF<sub>1</sub> receptor expression following acute morphine administration. However, repeated heroin self-administration was shown to increase CRF<sub>1</sub> receptor expression in the ventral tegmental area (VTA). We have mentioned the latter study in the present revised version of our manuscript at page 18, lines 1-2.

      (3) It would be important to show the expression levels of CRF1 receptors in PVN neurons in controls and morphine-treated mice, both males and females.

      We agree with this reviewer comment and, in the present version of the manuscript, have mentioned that examination of CRF<sub>1</sub> receptor expression in the PVN might help to understand the brain mechanisms underlying morphine effects upon social behavior (page 18, lines 2-6). Moreover, at page 15, lines 11-19 we have mentioned studies showing higher levels of the CRF<sub>1</sub> receptor in the PVN of adult (2 months) and old (20-24 months) male mice, as compared to adult and old female mice (Rosinger et al., 2019). Thus, differences in PVN CRF<sub>1</sub> receptor expression between male and female mice might underlie the sex-linked effects of CRF<sub>1</sub> receptor antagonism by antalarmin reported in our manuscript.

      (4) It would be important to discuss the mechanisms by which CRF1 receptor controls the firing frequency of AVP+/OXY+ neurons in the PVN of male mice.

      Using the in situ hybridization technique, studies reported relatively low expression of the CRF<sub>1</sub> receptor in the PVN (Van Pett et al., 2000). However, more recent studies using genetic approaches identified a substantial population of CRF<sub>1</sub> receptor-expressing neurons within the PVN (Jiang et al., 2019, 2018). These CRF<sub>1</sub> receptor-expressing neurons are believed to respond to local CRF release and likely form bidirectional connections with both CRF and OXY+/AVP+ neurons (Jiang et al., 2019, 2018). Thus, one proposed mechanism of action is that morphine increases intra-PVN release of CRF, which may act on intra-PVN CRF<sub>1</sub> receptor-expressing neurons. The latter neurons might in turn influence the activity of PVN OXY+/AVP+ neurons, which largely project to the VTA and the bed nucleus of the stria terminalis (BNST) to modulate social behavior. Within this framework, pharmacological or genetic inactivation of CRF<sub>1</sub> receptors might deregulate the activity of intra-PVN CRF-OXY/AVP interactions and thus interfere with opiate-induced social behavior deficits. In particular, the latter phenomenon might be more pronounced in male mice since they express more CRF<sub>1</sub> receptor-positive neurons in the PVN, as compared to female mice (Rosinger et al., 2019). The putative mechanisms of action described herein are also mentioned at page 16, lines 12 to page 17, line 7 of the present revised version of the manuscript.

      Minor points:

      (1) The phase of the estrous cycles in which females are analyzed for both behavior and electrophysiology should be stated.

      The normal estrous cycle of laboratory mice is 4-5 days in length, and it is divided into four phases (proestrus, estrus, metestrus and diestrus). The three-chamber experiments were generally carried out over a 5-day period, thus spanning across the entire estrous cycle. In particular, on each test day approximately the same number of mice was assigned to each experimental group. Thus, within each group the number of female mice tested on each phase of the estrous cycle was likely similar. Moreover, except for firing frequency displayed by vehicle/morphine-treated mice, female and male mice showed similar results variability, indicating a marginal role for the estrous cycle in the spread of data. We would also like to mention relatively recent studies indicating no significant difference over different phases of the estrous cycle in the social interaction test as well as in anxiety-like and anhedonia-like behavioral tests in C57BL/6J female mice (Zhao et al., 2021). Accordingly, similar findings were also reported by other authors who found no difference across the diestrus and estrus phases of the estrous cycle in C57BL/6J female mice tested in behavioral assays of anxiety-like, depression-like and social interaction (Zeng et al., 2023).

      A paragraph has been added to page 20, lines 1-9 of the present version of the manuscript to explain why we did not monitor the estrous cycle in female mice.

      (2) It would be important to show the statistical analysis between sexes.

      Following this reviewer comment, we examined the sociability ratio results by a three-way ANOVA with sex (males vs. females), pretreatment (vehicle vs. antalarmin) and treatment (saline vs. morphine) as between-subjects factors. The latter analysis revealed an almost significant sex X pretreatment X treatment interaction effect (F<sub>1,53</sub>=3.287, P=0.075), which could not allow for post-hoc individual group comparisons. Nevertheless, Newman-Keuls post-hoc comparisons revealed that male mice treated with antalarmin/morphine showed higher sociability ratio than female mice treated with antalarmin/morphine (P<0.05). The latter statistical results have been added to the present revised version of the manuscript at page 7, lines 2-8.

      We also examined neuronal firing frequency by a three-way ANOVA with sex (males vs. females), pretreatment (vehicle vs. antalarmin) and treatment (saline vs. morphine) as between-subjects factors. Analysis of firing frequency of all of the recorded cells in C57BL/6J mice revealed a sex X pretreatment X treatment interaction effect (F<sub>1,195</sub>=4.765, P<0.05). Newman-Keuls post-hoc individual group comparisons revealed that male mice treated with vehicle/morphine showed higher firing frequency than all other male and female groups (P<0.0005). Moreover, male mice treated with antalarmin/morphine showed lower firing frequency than male mice treated with vehicle/morphine (P<0.0005). In marked contrast, female mice treated with antalarmin/morphine did not differ from female mice treated with vehicle/morphine (P=0.914). The latter statistical results have been added to the present revised version of the manuscript at page 8, lines 4-12. Finally, similar results were obtained following the three-way ANOVA (sex X pretreatment X treatment) of firing frequency recorded in the subset of neurons co-expressing OXY and AVP (data not shown).

      Thus, sex-linked responses to morphine were detected also by three-way ANOVAs including sex as a variable. However, in the revised version of the manuscript we did not include novel figures combining the two sexes because it would have been largely redundant with the figures already reported, especially with Fig. 1D, Fig. 1G, Fig. 2B and Fig. 2D.

      Reviewer #2 (Public review):

      This manuscript reports a series of studies that sought to identify a biological basis for morphine-induced social deficits. This goal has important translational implications and is, at present, incompletely understood in the field. The extant literature points to changes in periventricular CRF and oxytocin neurons as critical substrates for morphine to alter social behavior. The experiments utilize mice, administered morphine prior to a sociability assay. Both male and female mice show reduced sociability in this procedure. Pretreatment with the CRF1 receptor antagonist, antalarmin, clearly abolished the morphine effect in males, and the data are compelling. Consistently, CRF1-/- male mice appeared to be spared of the effect of morphine (while wild-type and het mice had reduced sociability). The same experiment was reported as non-feasible in females due to the effect of dose on exploratory behavior per se. Seeking a neural correlate of the behavioral pharmacology, acute cell-attached recordings of PVN neurons were made in acute slices from mice pretreated with morphine or antalarmin. Morphine increased firing frequencies, and both antalarmin and CRF1-/- mice were spared of this effect. Increasing confidence that this is a CRF1-mediated effect, there is a gene deletion dose effect where hets had an intermediate response to morphine. In general, these experiments are well-designed and sufficiently powered to support the authors' inferences. A final experiment repeated the cell-attached recordings with later immunohistochemical verification of the recorded cells as oxytocin or vasopressin positive. Here the data are more nuanced. The majority of sampled cells were positive for both oxytocin and vasopressin. In cells obtained from males, morphine pretreatment increased firing in this population and was CRF1 dependent; in females, however, the effect of morphine was more modest without sensitivity to CRF1. Given that only ~8 cells were immunoreactive for oxytocin alone, it may be premature to attribute the changes in behavior and physiology strictly to oxytocinergic neurons.

      In sum, the data provide convincing behavioral pharmacological evidence and a regional (and possibly cellular) correlation of these effects suggesting that morphine leads to sociality deficits via CRF interacting with oxytocin in the hypothalamus. While this hypothesis remains plausible, the current data do not go so far as directly testing this mechanism in a site or cell-specific way.

      We agree with this reviewer’s comment and acknowledge that further studies are needed to better understand the neural substrates of CRF<sub>1</sub> receptor-mediated sociability deficits induced by morphine. This has been mentioned at page 17, line 25 to page 18, line 6 of the present revised version of the manuscript.

      With regard to the presentation of these data and their interpretation, the manuscript does not sufficiently draw a clear link between mu-opioid receptors, their action on CRF neurons of the PVN, and the synaptic connectivity to oxytocin neurons. Importantly, sex, cell, and site-specific variations in the CRF are well established (see Valentino & Bangasser) yet these are not reviewed nor are hypotheses regarding sex differences articulated at the outset. The manuscript would have more impact on the field if the implications of the sex-specific effects evident here were incorporated into a larger literature.

      At page 15, line 19 to page 16, line 2 of the present version of the manuscript, we have mentioned prior studies reporting differences in CRF<sub>1</sub> receptor signaling or cellular compartmentalization between male and female rodents (Bangasser et al., 2013, 2010). However, the latter studies were conducted in cortical or locus coeruleus brain tissues. Thus, more studies are needed to examine CRF<sub>1</sub> receptor signaling or cellular compartmentalization in the PVN and their relationship to the sex-linked results reported in our manuscript.

      With regards to the model proposed in the discussion, it seems that there is an assumption that ip morphine or antalarmin have specific effects on the PVN and that these mediate behavior - but this is impossible to assume and there are many meaningful alternatives (for example, both MOR and CRF modulation of the raphe or accumbens are worth exploration).

      We focused our discussion on PVN OXY/AVP systems because our electrophysiology studies examined neurons expressing OXY and/or AVP in this brain area. However, we understand that other brain areas/systems might mediate the effect of systemic administration of the CRF<sub>1</sub> receptor antagonist antalarmin or whole-body genetic disruption of the CRF<sub>1</sub> receptor upon morphine-induced social behavior deficits. For this reason, at page 16, line 12 to page 17, line 7 of the present version of the manuscript we have mentioned the possible involvement of BNST OXY or VTA dopamine systems in the CRF<sub>1</sub> receptor-mediated social behavior effects of morphine reported herein. Indeed, the literature suggests important CRF-OXY and CRF-dopamine interactions in the BNST and the VTA, which might be relevant to the expression of social behavior. Nevertheless, to date the implication of these interactions in social behavior alterations induced by substances of abuse remains to be elucidated.

      While it is up to the authors to conduct additional studies, a demonstration that the physiology findings are in fact specific to the PVN would greatly increase confidence that the pharmacology is localized here. Similarly, direct infusion of antalarmin to the PVN, or cell-specific manipulation of OT neurons (OT-cre mice with inhibitory DREADDs) combined with morphine pre-exposure would really tie the correlative data together for a strong mechanistic interpretation.

      We agree with this reviewer’s comment that the suggested experiments would greatly increase the understanding of the brain mechanisms underlying the social behavior deficits induced by opiate substances. We have acknowledged this at page 17, line 25 to page 18, line 6.

      Because the work is framed as informing a clinical problem, the discussion might have increased impact if the authors describe how the acute effects of CRF1 antagonists and morphine might change as a result of repeated use or withdrawal.

      Prior studies reported behavioral and neuroendocrine (hypothalamus-pituitary-adrenal axis) effects of chronic systemic administration of CRF<sub>1</sub> receptor antagonists, such as R121919 and antalarmin (Ayala et al., 2004; Dong et al., 2018). However, to our knowledge, no studies have directly compared the behavioral effects of acute vs. repeated administration of CRF<sub>1</sub> receptor antagonists. We previously reported that acute administration of antalarmin increased the expression of somatic opiate withdrawal in mice, indicating that this compound is effective following withdrawal from repeated morphine administration (Papaleo et al., 2007). Nevertheless, further studies are needed to specifically address this reviewer’s comment.

      Reviewer #3 (Public review):

      Summary:

      In the current manuscript, Piccin et al. identify a role for CRF type 1 receptors in morphine-induced social deficits using a 3-chamber social interaction task in mice. They demonstrate that pre-treatment with a CRFR1 antagonist blocks morphine-induced social deficits in male, but not female, mice, and this is associated with the CRFR1 antagonist blocking morphine-induced increases in PVN neuronal excitability in male but not female mice. They followed up by using a transgenic CRFR1 knockout mouse line. CRFR1 genetic deletion also blocked morphine-induced social deficits, similar to the pharmacological approach, in male mice. This was also associated with morphine-induced increases in PVN neuronal excitability being blocked in CRFR1 knockout mice. Interestingly, they found that pharmacological antagonism of the CRFR1 specifically blocked morphine-induced increases in the firing of oxytocin/AVP neurons in the PVN in male mice.

      Strengths:

      The authors used both male and female mice where possible and the studies were fairly well controlled. The authors provided sufficient methodological detail and detailed statistical information. They also examined measures of locomotion in all of the behavioral tasks to separate changes in sociability from overall changes in locomotion. The experiments were well thought out and well controlled. The use of both the pharmacological and genetic approaches provides converging lines of evidence for the role of CRFR1 in morphine-induced social deficits. Additionally, they have identified the PVN as a potential site of action for these CRFR1 effects.

      Weaknesses:

      While the authors included both sexes they analyzed them independently. This was done for simplicity's sake as they have multiple measures but there are several measures where the number of factors is reduced and the inclusion of sex as a factor would be possible.

      Please see our response above to the same comment made by Reviewer 1.

      Additionally, single doses of both the CRFR1 antagonist and morphine are used within an experiment without justification for the doses. In fact, a lower dose of morphine was needed for the genetic CRFR1 mouse line. This would suggest that the dose of morphine being used is likely causing some aversion that may be more present in the females, as they have lower overall time in the ROI areas of both the object and the mouse following morphine exposure.

      The morphine dose was chosen based on our prior study showing that morphine (2.5 mg/kg) impaired sociability in male and female C57BL/6J mice, without affecting locomotor activity (Piccin et al., 2022). Also, the antalarmin dose (20 mg/kg) and the route of administration (per os) were chosen based on our prior studies demonstrating behavioral effects of this CRF<sub>1</sub> receptor antagonist administered per os (Contarino et al., 2017; Ingallinesi et al., 2012; Piccin and Contarino, 2020). This is now mentioned in the “materials and methods” section of the present revised version of the manuscript at page 23, lines 6-13. We also agree with this reviewer that female mice seemed more sensitive to morphine than male mice. Indeed, during the habituation phase of the three-chamber test, female mice treated with morphine (2.5 mg/kg) spent less time in the ROIs containing the empty wire cages, as compared to saline-treated female mice (Fig. 1E). However, morphine did not affect locomotor activity in female mice (Fig. S1B), suggesting that social approach and ambulation are independent.

      As for the discussion, the authors do not sufficiently address why CRFR1 has an effect in males but not females and what might be driving that difference, or why male and female mice have different distribution of PVN cell types during the recordings.

      At page 15, line 11 to page 16, line 2, we have mentioned possible mechanisms that might underlie the sex-linked results reported in our manuscript. Moreover, at page 16, lines 6-9 we have mentioned a seminal review reporting sex-linked expression of PVN OXY and AVP in a variety of animal species that is similar to the present results. Nevertheless, as mentioned in the “discussion” section, further studies are needed to elucidate the neural substrates underlying sex-linked effects of opiate substances upon social behavior.

      Additionally, the authors attribute their effect to CRF and CRFR1 within the PVN but do not consider the role of extrahypothalamic CRF and CRFR1. While the PVN does contain the largest density of CRF neurons there are other CRF neurons, notably in the central amygdala and BNST, that have been shown to play important roles in the impact of stress on drug-related behavior. This also holds true for the expression of CRFR1 in other regions of the brain, including the VTA, which is important for drug-related behavior and social behavior. The treatments used in the current manuscript were systemic or brain-wide deletion of CRFR1. Therefore, the authors should consider that the effects could be outside the PVN.

      Even if they suggest a role for PVN CRF<sub>1</sub>-OXY circuits, we are aware that the present data do not support a direct link between behavior and PVN CRF<sub>1</sub> receptors. Thus, at page 16, line 12 to page 17, line 7 of the present version of the manuscript we have mentioned some studies showing a role for PVN OXY, BNST OXY or VTA dopamine systems in social behavior. Interestingly, the latter brain systems are thought to interact with the CRF system. However, more studies are warranted to understand the implication of CRF-OXY or CRF-dopamine interactions in social behavior deficits induced by substances of abuse.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      I commend the authors on crafting a well-written and clear manuscript with excellent figures. Furthermore, the data analysis and rigor are quite high. I have a few suggestions in the order they appear in the manuscript:

      The introduction has a number of abrupt transitions. For example, the sentence beginning with "Besides," in paragraph 2 jumps from CRF to oxytocin and vasopressin without a transition or justification. In all, vasopressin may be better removed from the introduction. There is sufficient evidence in the literature to support the CRF-OT circuit that might mediate behavioral pharmacology and this should be clearly described in the introduction.

      We have added a sentence at page 3, lines 22-23 to introduce possible interactions of the CRF system with other brain systems implicated in social behavior. Also, in the “introduction” section both OXY and AVP systems are mentioned because our electrophysiology studies examined the effect of morphine upon the activity of OXY- and AVP-positive neurons.

      Our interest in the PVN CRF-OXY/AVP network also stems from previous findings from our laboratory showing that genetic inactivation of the CRF<sub>2</sub> receptor eliminated both sociability deficits and increased hypothalamic OXY and AVP expression associated with long-term cocaine withdrawal in male mice (Morisot et al., 2018). Moreover, evidence suggests the implication of AVP systems in opiate effects. In particular, pharmacological antagonism of AVP-V1b receptors decreased the acquisition of morphine-induced conditioned place preference in male C57BL/6N mice housed with morphine-treated mice (Bates et al., 2018).

      Throughout the manuscript, it seems that there is an assumption that ip morphine or antalarmin have specific effects on the PVN and that these mediate behavior - this is impossible to assume and there are many meaningful alternatives (for example, both MOR and CRF modulation of the raphe or accumbens are worth exploration). While it is up to the authors to conduct additional studies, a demonstration that the physiology findings are in fact specific to the PVN would greatly increase confidence that the pharmacology is localized here. Similarly, direct infusion of antalarmin to the PVN, or cell-specific manipulation of OT neurons (OT-cre mice with inhibitory DREADDs) combined with morphine pre-exposure would really tie the correlative data together for a strong mechanistic interpretation.

      We agree that the suggested experiments would greatly increase the understanding of the brain mechanisms underlying the social behavior deficits induced by opiate substances. This has been acknowledged at page 17, line 25 to page 18, line 6 of the present version of the manuscript.

      Also in the introduction, the reference to shank3b mice is not the most direct evidence of oxytocin involvement in sociability. It may be helpful to point reviewers to studies with direct manipulation of these populations (Grinevich group, for example).

      At page 4, lines 4-6 of the “introduction” section, we have added a sentence to mention a seminal paper by the Grinevich group demonstrating an important role for OXY-expressing PVN parvocellular neurons in social behavior (Tang et al., 2020). Moreover, at page 4, lines 8-10 we have mentioned a recent study showing that targeted chemogenetic silencing of PVN OXY neurons in male rats impaired short- and long-term social recognition memory (Thirtamara Rajamani et al., 2024).

      It would be helpful in the figures to indicate which panels contain male or female data.

      The sex of the mice is mentioned above each panel of the main and supplemental figures, except for the studies with CRF<sub>1</sub> receptor-deficient mice wherein only experiments carried out with male mice were illustrated. In the latter case, the sex (male) of the mice is mentioned in the related legend.

      The discussion itself departs from the central data in a few ways - the passages suggesting that morphine produces a stress response and that CRF1 antagonists would block the stress state are highly speculative (although testable). The manuscript would have more impact if the sex-specific effects and alternative hypotheses were enhanced in the discussion.

      At page 16, line 12 to page 17, line 7 of the “discussion” section, we have suggested that interaction of the CRF system with other brain systems implicated in social behavior (i.e., OXY, dopamine) might underlie the sex-linked CRF<sub>1</sub> receptor-mediated effects of morphine reported in our manuscript. Also, at page 15, line 19 to page 16, line 2 we have mentioned studies showing sex-linked CRF<sub>1</sub> receptor signaling and cellular compartmentalization that might be relevant to the present findings. Finally, to further support the notion of morphine-induced PVN CRF activity, at page 15, lines 4-6 we have mentioned a study suggesting that activation of presynaptic mu-opioid receptors located on PVN GABA terminals might reduce GABA release (and related inhibitory effects) onto PVN CRF neurons (Wamsteeker Cusulin et al., 2013). Nevertheless, we believe that more work is needed to better understand the role of the CRF<sub>1</sub> receptor in opiate-induced stress responses and in the activity of OXY and dopamine systems implicated in social behavior.

      Reviewer #3 (Recommendations for the authors):

      (1) You should provide justification for the doses selected for treatments and the route of administration for the CRFR1 antagonist, especially for females.

      This has been added at page 23, lines 6-13 of the present version of the manuscript. In particular, the doses and routes of administration for morphine and antalarmin used in the present study were chosen based on previous work from our laboratory. Indeed, the intraperitoneal administration of morphine (2.5 mg/kg) impaired social behavior in male and female mice, without affecting locomotor activity (Piccin et al., 2022). Moreover, the oral route of administration for antalarmin was chosen for its translational relevance, as it could be easily employed in clinical trials assessing the therapeutic value of pharmacological CRF<sub>1</sub> receptor antagonists.

      (2) For the electrophysiology data you should include the number of cells per animal that were obtained. It appears that fewer cells from more females were obtained than in males and so the distribution of individual animals to the overall variance may be different between males and females.

      The number of cells examined and animals used in the electrophysiology experiments are reported above each panel of the related Figures 2, 3 and 4 as well as in the supplementary tables S1B and S1C. Overall, the number of cells examined in male and female mice was quite similar. Also, the number of male and female mice used was comparable. Standard errors of the mean (SEM) were quite similar across the different male and female groups (Fig. 2B and 2D), except for vehicle/morphine-treated male mice. Indeed, in the latter group a considerable number of cells displayed elevated firing responses to morphine, which accounted for the higher spread of the data. Accordingly, as mentioned above, the three-way ANOVA with sex (males vs. females), pretreatment (vehicle vs. antalarmin) and treatment (saline vs. morphine) as between-subjects factors revealed that male mice treated with vehicle/morphine showed higher firing frequency than all other male and female groups (P<0.0005). Finally, a similar pattern of firing frequency was observed also in neurons co-expressing OXY and AVP, wherein vehicle/morphine-treated male mice displayed higher SEM, as compared to all other male and female groups (Fig. 4C and 4F). Thus, except for vehicle/morphine-treated mice, distribution of the firing frequency data did not seem to be linked to the sex of the animal.

      (3) You should consider using a nested analysis for the slice electrophysiology data as that is more appropriate.

      We thank the reviewer for this suggestion. However, after careful consideration, we have decided to keep the current statistical analyses. In particular, given the relatively low variability of our data, we believe that the use of parametric ANOVA tests is appropriate. Moreover, additional details supporting our choice are provided just above in our response to comment #2.

      (4) While it makes sense to not want to directly compare male and female data that results in needing to run a 4-way ANOVA, there are many measures, such as sociability, firing rate, etc., that if including sex as a factor would result in running a 3-way ANOVA and would allow for direct comparison of male and female mice.

      Please see our response above to the same comment made by Reviewer 1. Notably, the results of our new statistical analyses including sex as a variable further support sex-linked effects of the CRF<sub>1</sub> receptor antagonist antalarmin upon morphine-induced sociability deficits and PVN neuronal firing. Nevertheless, we would like to keep the figures illustrating our findings as they are, since this makes the observed sex-linked results easy to detect. Finally, we hope that this reviewer agrees with our choice, which is consistent with the wording of the title (i.e., “in male mice”).

      (5) There are grammatical and phrasing issues throughout the manuscript and the manuscript would benefit from additional thorough editing.

      We appreciate this reviewer’s feedback. Thus, upon revising, we have carefully edited the manuscript with regard to possible grammatical and phrasing errors. We hope that our changes have made the manuscript clearer in order to facilitate readability by the audience.

      (6) The discussion should be edited to include consideration of an explanation for the presence of the effect in male, but not female, mice more clearly. The discussion should also include some discussion as to why the distribution of cell types used in the electrophysiology recordings was different between males and females and whether the distribution of CRFR1 is different between males and females. Lastly, the authors need to include consideration of extrahypothalamic CRF and CRFR1 as a possible explanation for their effects. While they have PVN neuron recordings, the treatments that they used are brain-wide and therefore the possibility that the critical actions of CRFR1 could be outside the PVN.

      At page 15, line 11 to page 16, line 2 of the “discussion” section, we have suggested several mechanisms that might underlie the sex-linked behavioral and brain effects of CRF<sub>1</sub> receptor antagonism reported in our manuscript. With regard to the distribution of cell types examined in the electrophysiology studies, at page 16, lines 6-9 we have mentioned a seminal review reporting sex-linked expression of PVN OXY and AVP in a variety of animal species that is similar to our results. Moreover, at page 18, lines 2-6 we mentioned that more studies are needed to examine PVN CRF<sub>1</sub> receptor expression in male and female animals, an issue that is still poorly understood. Finally, at page 16, line 12 to page 17, line 7 of the “discussion” section we also suggest that CRF<sub>1</sub> receptor-expressing brain areas other than the PVN, such as the BNST or the VTA, might contribute to the sex-linked effects of morphine reported in our manuscript. Thus, in agreement with this reviewer’s suggestion, in the present version of the manuscript we have further emphasized the possible implication of CRF<sub>1</sub> receptor-expressing extrahypothalamic brain areas in social behavior deficits induced by opiate substances.

      References

      Ayala AR, Pushkas J, Higley JD, Ronsaville D, Gold PW, Chrousos GP, Pacak K, Calis KA, Gerald M, Lindell S, Rice KC, Cizza G. 2004. Behavioral, adrenal, and sympathetic responses to long-term administration of an oral corticotropin-releasing hormone receptor antagonist in a primate stress paradigm. J Clin Endocrinol Metab 89:5729–5737. doi:10.1210/jc.2003-032170

      Bangasser DA, Curtis A, Reyes BAS, Bethea TT, Parastatidis I, Ischiropoulos H, Van Bockstaele EJ, Valentino RJ. 2010. Sex differences in corticotropin-releasing factor receptor signaling and trafficking: potential role in female vulnerability to stress-related psychopathology. Mol Psychiatry 15:877, 896–904. doi:10.1038/mp.2010.66

      Bangasser DA, Reyes BAS, Piel D, Garachh V, Zhang X-Y, Plona ZM, Van Bockstaele EJ, Beck SG, Valentino RJ. 2013. Increased vulnerability of the brain norepinephrine system of females to corticotropin-releasing factor overexpression. Mol Psychiatry 18:166–173. doi:10.1038/mp.2012.24

      Bates MLS, Hofford RS, Emery MA, Wellman PJ, Eitan S. 2018. The role of the vasopressin system and dopamine D1 receptors in the effects of social housing condition on morphine reward. Drug Alcohol Depend 188:113–118. doi:10.1016/j.drugalcdep.2018.03.021

      Contarino A, Kitchener P, Vallée M, Papaleo F, Piazza P-V. 2017. CRF1 receptor-deficiency increases cocaine reward. Neuropharmacology 117:41–48. doi:10.1016/j.neuropharm.2017.01.024

      Dong H, Keegan JM, Hong E, Gallardo C, Montalvo-Ortiz J, Wang B, Rice KC, Csernansky J. 2018. Corticotrophin releasing factor receptor 1 antagonists prevent chronic stress-induced behavioral changes and synapse loss in aged rats. Psychoneuroendocrinology 90:92–101. doi:10.1016/j.psyneuen.2018.02.013

      Ingallinesi M, Rouibi K, Le Moine C, Papaleo F, Contarino A. 2012. CRF2 receptor-deficiency eliminates opiate withdrawal distress without impairing stress coping. Mol Psychiatry 17:1283–1294. doi:10.1038/mp.2011.119

      Jiang Z, Rajamanickam S, Justice NJ. 2019. CRF signaling between neurons in the paraventricular nucleus of the hypothalamus (PVN) coordinates stress responses. Neurobiol Stress 11:100192. doi:10.1016/j.ynstr.2019.100192

      Jiang Z, Rajamanickam S, Justice NJ. 2018. Local Corticotropin-Releasing Factor Signaling in the Hypothalamic Paraventricular Nucleus. J Neurosci 38:1874–1890. doi:10.1523/JNEUROSCI.1492-17.2017

      Morisot N, Monier R, Le Moine C, Millan MJ, Contarino A. 2018. Corticotropin-releasing factor receptor 2-deficiency eliminates social behaviour deficits and vulnerability induced by cocaine. Br J Pharmacol 175:1504–1518. doi:10.1111/bph.14159

      Papaleo F, Kitchener P, Contarino A. 2007. Disruption of the CRF/CRF1 receptor stress system exacerbates the somatic signs of opiate withdrawal. Neuron 53:577–589. doi:10.1016/j.neuron.2007.01.022

      Piccin A, Contarino A. 2020. Sex-linked roles of the CRF1 and the CRF2 receptor in social behavior. J Neurosci Res 98:1561–1574. doi:10.1002/jnr.24629

      Piccin A, Courtand G, Contarino A. 2022. Morphine reduces the interest for natural rewards. Psychopharmacology (Berl) 239:2407–2419. doi:10.1007/s00213-022-06131-7

      Rosinger ZJ, Jacobskind JS, De Guzman RM, Justice NJ, Zuloaga DG. 2019. A sexually dimorphic distribution of corticotropin-releasing factor receptor 1 in the paraventricular hypothalamus. Neuroscience 409:195–203. doi:10.1016/j.neuroscience.2019.04.045

      Roy A, Laas K, Kurrikoff T, Reif A, Veidebaum T, Lesch K-P, Harro J. 2018. Family environment interacts with CRHR1 rs17689918 to predict mental health and behavioral outcomes. Prog Neuropsychopharmacol Biol Psychiatry 86:45–51. doi:10.1016/j.pnpbp.2018.05.004

      Tang Y, Benusiglio D, Lefevre A, Hilfiger L, Althammer F, Bludau A, Hagiwara D, Baudon A, Darbon P, Schimmer J, Kirchner MK, Roy RK, Wang S, Eliava M, Wagner S, Oberhuber M, Conzelmann KK, Schwarz M, Stern JE, Leng G, Neumann ID, Charlet A, Grinevich V. 2020. Social touch promotes interfemale communication via activation of parvocellular oxytocin neurons. Nat Neurosci 23:1125–1137. doi:10.1038/s41593-020-0674-y

      Thirtamara Rajamani K, Barbier M, Lefevre A, Niblo K, Cordero N, Netser S, Grinevich V, Wagner S, Harony-Nicolas H. 2024. Oxytocin activity in the paraventricular and supramammillary nuclei of the hypothalamus is essential for social recognition memory in rats. Mol Psychiatry 29:412–424. doi:10.1038/s41380-023-02336-0

      Valentino RJ, Van Bockstaele E, Bangasser D. 2013. Sex-specific cell signaling: the corticotropin-releasing factor receptor model. Trends Pharmacol Sci 34:437–444. doi:10.1016/j.tips.2013.06.004

      Van Pett K, Viau V, Bittencourt JC, Chan RK, Li HY, Arias C, Prins GS, Perrin M, Vale W, Sawchenko PE. 2000. Distribution of mRNAs encoding CRF receptors in brain and pituitary of rat and mouse. J Comp Neurol 428:191–212. doi:10.1002/1096-9861(20001211)428:2<191::aid-cne1>3.0.co;2-u

      Wamsteeker Cusulin JI, Füzesi T, Inoue W, Bains JS. 2013. Glucocorticoid feedback uncovers retrograde opioid signaling at hypothalamic synapses. Nat Neurosci 16:596–604. doi:10.1038/nn.3374

      Weber H, Richter J, Straube B, Lueken U, Domschke K, Schartner C, Klauke B, Baumann C, Pané-Farré C, Jacob CP, Scholz C-J, Zwanzger P, Lang T, Fehm L, Jansen A, Konrad C, Fydrich T, Wittmann A, Pfleiderer B, Ströhle A, Gerlach AL, Alpers GW, Arolt V, Pauli P, Wittchen H-U, Kent L, Hamm A, Kircher T, Deckert J, Reif A. 2016. Allelic variation in CRHR1 predisposes to panic disorder: evidence for biased fear processing. Mol Psychiatry 21:813–822. doi:10.1038/mp.2015.125

      Zeng P-Y, Tsai Y-H, Lee C-L, Ma Y-K, Kuo T-H. 2023. Minimal influence of estrous cycle on studies of female mouse behaviors. Front Mol Neurosci 16:1146109. doi:10.3389/fnmol.2023.1146109

      Zhao W, Li Q, Ma Y, Wang Z, Fan B, Zhai X, Hu M, Wang Q, Zhang M, Zhang C, Qin Y, Sha S, Gan Z, Ye F, Xia Y, Zhang G, Yang L, Zou S, Xu Z, Xia S, Yu Y, Abdul M, Yang J-X, Cao J-L, Zhou F, Zhang H. 2021. Behaviors Related to Psychiatric Disorders and Pain Perception in C57BL/6J Mice During Different Phases of Estrous Cycle. Front Neurosci 15:650793. doi:10.3389/fnins.2021.650793

    1. Reviewer #2 (Public review):

      Summary:

      The authors aim to provide a comprehensive understanding of the evolutionary history of the Major Histocompatibility Complex (MHC) gene family across primate species. Specifically, they sought to:

      (1) Analyze the evolutionary patterns of MHC genes and pseudogenes across the entire primate order, spanning 60 million years of evolution.

      (2) Build gene and allele trees to compare the evolutionary rates of MHC Class I and Class II genes, with a focus on identifying which genes have evolved rapidly and which have remained stable.

      (3) Investigate the role of often-overlooked pseudogenes in reconstructing evolutionary events, especially within the Class I region.

      (4) Highlight how different primate species use varied MHC genes, haplotypes, and genetic variation to mount successful immune responses, despite the shared function of the MHC across species.

      (5) Fill gaps in the current understanding of MHC evolution by taking a broader, multi-species perspective using (a) phylogenomic analytical computing methods such as Beast2, Geneconv, BLAST, and the much larger computing capacities that have been developed and made available to researchers over the past few decades, (b) literature review for gene content and arrangement, and genomic rearrangements via haplotype comparisons.

      (6) The authors' overall conclusion, based on their analyses and results, is that 'different species employ different genes, haplotypes, and patterns of variation to achieve a successful immune response'.

      Strengths:

      Essentially, much of the information presented in this paper is already well-known in the MHC field of genomic and genetic research, with few new conclusions and with insufficient respect to past studies. Nevertheless, while MHC evolution is a well-studied area, this paper potentially adds some originality through its comprehensive, cross-species evolutionary analysis of primates, focus on pseudogenes and the modern, large-scale methods employed. Its originality lies in its broad evolutionary scope of the primate order among mammals with solid methodological and phylogenetic analyses.

      The main strengths of this study are the use of large publicly available databases for primate MHC sequences, the intensive computing involved, the phylogenetic tool Beast2 to create multigene Bayesian phylogenetic trees using sequences from all genes and species, separated into Class I and Class II groups to provide a backbone of broad relationships to investigate subtrees, and the presentation of various subtrees as species and gene trees in an attempt to elucidate the unique gene duplications within the different species. The study provides some additional insights with summaries of MHC reference genomes and haplotypes in the context of a literature review to identify the gene content and haplotypes known to be present in different primate species. The phylogenetic overlays or ideograms (Figures 6 and 7) in part show the complexity of the evolution and organisation of the primate MHC genes via the orthologous and paralogous gene and species pathways progressively from the poorly-studied NWM, across a few moderately studied ape species, to the better-studied human MHC genes and haplotypes.

      Weaknesses:

      The title 'The Primate Major Histocompatibility Complex: An Illustrative Example of Gene Family Evolution' suggests that the paper will explore how the Major Histocompatibility Complex (MHC) in primates serves as a model for understanding gene family evolution. The term 'Illustrative Example' in the title would be appropriate if the paper aimed to use the primate MHC as a clear and representative case to demonstrate broader principles of gene family evolution. That is, the MHC gene family would be not just one instance of gene family evolution but a well-studied, insightful example that can highlight key mechanisms and concepts applicable to other gene families. However, this is not the case: the paper covers only specific details of primate MHC evolution without drawing broader lessons for any other gene families. So, the term 'Illustrative Example' is too broad or generalizing. In this case, a term like 'Case Study' or simply 'Example' would be more suitable. Perhaps 'An Example of Gene Family Diversity' would be more precise. The authors should also remind readers that this study is not about the origins of the MHC genes from the earliest jawed vertebrates per se (~600 mya), but is an extension within a subset of species that emerged relatively late (~60 mya) in the evolutionary divergent pathways of the MHC genes, systems, and various vertebrate species.

      Phylogenomics. Particular weaknesses in this study are the limitations and problems associated with providing phylogenetic gene and species trees to try to solve the complex issue of the molecular mechanisms involved with imperfect gene duplications, losses, and rearrangements in a complex genomic region such as the MHC that is involved in various effects on the response and regulation of the immune system. A particular deficiency is drawing conclusions based on a single exon of the genes. Different exons present different trees. Which are the more reliable? Why were introns not included in the analyses? The authors attempt to overcome these limitations by including genomic haplotype analysis, duplication models, and the supporting or contradictory information available in previous publications. They succeed in part with this multidisciplinary approach, but much is missed because of biased literature selection. The authors should include a paragraph about the benefits and limitations of the software that they have chosen for their analysis, and perhaps suggest some alternative tools that they might have tried comparatively. How were problems with Bayesian phylogeny such as computational intensity, choosing probabilities, choosing particular exons for analysis, assumptions of evolutionary models, rates of evolution, systemic bias, and absence of structural and functional information addressed and controlled for in this study?

      Gene families as haplotypes. In the Introduction, the MHC is referred to as a 'gene family', and in paragraph 2, it is described as being united by the 'MHC fold', despite exhibiting 'very diverse functions'. However, the MHC region is more accurately described as a multigene region containing diverse, haplotype-specific Conserved Polymorphic Sequences, many of which are likely to be regulatory rather than protein-coding. These regulatory elements are essential for controlling the expression of multiple MHC-related products, such as TNF and complement proteins, a relationship demonstrated over 30 years ago. Non-MHC fold loci such as TNF, complement, POU5F1, lncRNA, TRIM genes, LTA, LTB, NFkBIL1, etc, are present across all MHC haplotypes and play significant roles in regulation. Evolutionary selection must act on genotypes, considering both paternal and maternal haplotypes, rather than on individual genes alone. While it is valuable to compile databases for public use, their utility is diminished if they perpetuate outdated theories like the 'birth-and-death model'. The inclusion of prior information or assumptions used in a statistical or computational model, typically in Bayesian analysis, is commendable, but they should be based on genotypic data rather than older models. A more robust approach would consider the imperfect duplication of segments, the history of their conservation, and the functional differences in inheritance patterns. Additionally, the MHC should be examined as a genomic region, with ancestral haplotypes and sequence changes or rearrangements serving as key indicators of human evolution after the 'Out of Africa' migration, and with disease susceptibility providing a measurable outcome. There are more than 7000 different HLA-B and -C alleles at each locus, which suggests that there are many thousands of human HLA haplotypes to study. In this regard, the studies by Dawkins et al (1999 Immunol Rev 167,275), Shiina et al. (2006 Genetics 173,1555) on human MHC gene diversity and disease hitchhiking (haplotypes), and Sznarkowska et al. (2020 Cancers 12,1155) on the complex regulatory networks governing MHC expression, both in terms of immune transcription factor binding sites and regulatory non-coding RNAs, should be examined in greater detail, particularly in the context of MHC gene allelic diversity and locus organization in humans and other primates.

      Diversifying and/or concerted evolution. Both this and past studies highlight diversifying or balancing selection as the dominant force in MHC evolution. This is primarily because the extreme polymorphism observed in MHC genes is advantageous for populations in terms of pathogen defence. Diversification increases the range of peptides that can be presented to T cells, enhancing the immune response. The peptide-binding regions of MHC genes are highly variable, and this variability is maintained through selection for immune function, especially in the face of rapidly evolving pathogens. In contrast, concerted evolution, which typically involves the homogenization of gene duplicates through processes like gene conversion or unequal crossing-over, seems to play a minimal role in MHC evolution. Although gene duplication events have occurred in the MHC region leading to the expansion of gene families, the resulting paralogs often undergo divergent evolution rather than being kept similar or homogeneous by concerted evolution. Therefore, unlike gene families such as ribosomal RNA genes or histone genes, where concerted evolution leads to highly similar copies, MHC genes display much higher levels of allelic and functional diversification. Each MHC gene copy tends to evolve independently after duplication, acquiring unique polymorphisms that enhance the repertoire of antigen presentation, rather than undergoing homogenization through gene conversion. Also, in some populations with high polymorphism or genetic drift, allele frequencies may become similar over time without the influence of gene conversion. This similarity can be mistaken for gene conversion when it is simply due to neutral evolution or drift, particularly in small populations or bottlenecked species. Moreover, gene conversion might contribute to greater diversity by creating hybrids or mosaics between different MHC genes. In this regard, can the authors indicate what percentage of the genes in their study have been homogenised by gene conversion compared to those that have been diversified by gene conversion?

      Duplication models. The phylogenetic overlays or ideograms (Figures 6 and 7) show considerable imperfect multigene duplications, losses, and rearrangements, but the paper's Discussion provides no in-depth consideration of the various multigenic models or mechanisms that can be used to explain the occurrence of such events. How do their duplication models compare to those proposed by others? For example, their text simply says on line 292, 'the proposed series of events is not always consistent with phylogenetic data'. How, why, when? Duplication models for the generation and extension of the human MHC class I genes as duplicons (extended gene or segmental genomic structures) by parsimonious imperfect tandem duplications with deletions and rearrangements in the alpha, beta, and kappa blocks were already formulated in the late 1990s and extended to the rhesus macaque in 2004 based on genomic haplotypic sequences. These studies were based on genomic sequences (genes, pseudogenes, retroelements), dot plot matrix comparisons, and phylogenetic analyses of gene and retroelement sequences using computer programs. It already was noted or proposed in these earlier 1999 studies that (1) the ancestor of HLA-P(90)/-T(16)/W(80) represented an old lineage separate from the other HLA class I genes in the alpha block, (2) HLA-U(21) is a duplicated fragment of HLA-A, (3) HLA-F and HLA-V(75) are among the earliest (progenitor) genes or outgroups within the alpha block, (4) distinct Alu and L1 retroelement sequences adjoining HLA-L(30), and HLA-N genomic segments (duplicons) in the kappa block are closely related to those in the HLA-B and HLA-C in the beta block, suggesting an inverted duplication and transposition of the HLA genes and retroelements between the beta and kappa regions. None of these prior human studies were referenced by Fortier and Pritchard in their paper. How does their human MHC class I gene duplication model (Fig. 6), such as gene duplication numbers and turnovers, differ from those previously proposed and described by Kulski et al (1997 JME 45,599), (1999 JME 49,84), (2000 JME 50,510), Dawkins et al (1999 Immunol Rev 167,275), and Gaudieri et al (1999 GR 9,541)? Is this a case of reinventing the wheel?

      Results. The results are presented as new findings, whereas most if not all of the results' significance and importance already have been discussed in various other publications. Therefore, the authors might do better to combine the results and discussion into a single section with appropriate citations to previously published findings presented among their results for comparison. Do the trees and subsets differ from previous publications, albeit that they might have fewer comparative examples and samples than the present preprint? Alternatively, the results and discussion could be combined and presented as a review of the field, which would make more sense and be more honest than the current format of essentially rehashing old data.

      Minor corrections:

      (1) Abstract, line 19: 'modern methods'. Too general. What modern methods?

      (2) Abstract, line 25: 'look into [primate] MHC evolution.' The analysis is on the primate MHC genes, not on the entire vertebrate MHC evolution with a gene collection from sharks to humans. The non-primate MHC genes are often differently organised and structurally evolved in comparison to primate MHC.

      (3) Introduction, line 113. 'In a companion paper (Fortier and Pritchard, 2024)' This paper appears to be unpublished. If it's unpublished, it should not be referenced.

      (4) Figures 1 and 2. Use the term 'gene symbols' (circle, square, triangle, inverted triangle, diamond) or 'gene markers' instead of 'points'. 'Asterisks "within symbols" indicate new information.'

      (5) Figures. A variety of colours have been applied for visualisation. However, some coloured texts are so light in colour that they are difficult to read against a white background. Could darker colours or black be used for all or most texts?

      (6) Results, line 135. '(Fortier and Pritchard, 2024)' This paper appears to be unpublished. If it's unpublished, it should not be referenced.

      (7) Results, lines 152 to 153, 164, 165, etc. 'Points with an asterisk'. Use the term 'gene symbols' (circle, square, triangle, inverted triangle, diamond) or 'gene markers' instead of 'points'. A point is a small dot such as those used in data points for plotting graphs .... The figures are so small that the asterisks in the circles, squares, triangles, etc, look like points (dots) and the points/asterisks terminology that is used is very confusing visually.

      (8) Line 178 (BEA, 2024) is not listed alphabetically in the References.

      (9) Lines 188-190. 'NWM MHC-G does not group with ape/OWM MHC-G, instead falling outside of the clade containing ape/OWM MHC-A, -G, -J and -K.' This is not surprising given that MHC-A, -G, -J, and -K are paralogs of each other and that some of them, especially in NWM have diverged over time from the paralogs and/or orthologs and might be closer to one paralog than another and not be an actual ortholog of OWM, apes or humans.

      (10) Line 249. Gene conversion: This is recombination between two different genes where a portion of the genes are exchanged with one another so that different portions of the gene can group within one or other of the two gene clades. Alternatively, the gene has been annotated incorrectly if the gene does not group within either of the two alternative clades. Another possibility is that one or two nucleotide mutations have occurred without a recombination, resulting in a mistaken interpretation or conclusion of a recombination event. What measures are taken to avoid false-positive conclusions? How many MHC gene conversion (recombination) events have occurred according to the authors' estimates?

      (11) Lines 284-286. 'The Class I MHC region is further divided into three polymorphic blocks-alpha, beta, and kappa blocks-that each contains MHC genes but are separated by well-conserved non-MHC genes.' The MHC class I region was first designated into conserved polymorphic duplication blocks, alpha and beta by Dawkins et al (1999 Immunol Rev 167,275), and kappa by Kulski et al (2002 Immunol Rev 190,95), and should be acknowledged (cited) accordingly.

      (12) Lines 285-286. 'The majority of the Class I genes are located in the alpha-block, which in humans includes 12 MHC genes and pseudogenes.' This is not strictly correct for many other species, because the majority of class I genes might be in the beta block of new and old-world monkeys, and the authors haven't provided respective counts of duplication numbers to show otherwise. The alpha block in some non-primate mammalian species such as pigs, rats, and mice has no MHC class I genes or only a few. Most MHC class I genes in non-primate mammalian species are found in other regions. For example, see Ando et al (2005 Immunogenetics 57,864) for the pig alpha, beta, and kappa regions in the MHC class I region. There are no pig MHC genes in the alpha block.

      (13) Lines 297 to 299. 'The alpha-block also contains a large number of repetitive elements and gene fragments belonging to other gene families, and their specific repeating pattern in humans led to the conclusion that the region was formed by successive block duplications (Shiina et al., 1999).' There are different models for successive block duplications in the alpha block and some are more parsimonious based on imperfect multigenic segmental duplications (Kulski et al 1999, 2000) than others (Shiina et al., 1999). In this regard, Kulski et al (1999, 2000) also used duplicated repetitive elements neighbouring MHC genes to support their phylogenetic analyses and multigenic segmental duplication models. For comparison, can the authors indicate how many duplications and deletions they have in their models for each species?

      (14) Lines 315-315. 'Ours is the first work to show that MHC-U is actually an MHC-A-related gene fragment.' This sentence should be deleted. Other researchers had already inferred that MHC-U is actually an MHC-A-related gene fragment more than 25 years ago (Kulski et al 1999, 2000) when the MHC-U was originally named MHC-21.

      (15) Lines 361-362. 'Notably, our work has revealed that MHC-V is an old fragment.' This is not a new finding or hypothesis. Previous phylogenetic analysis and gene duplication modelling had already inferred HLA-V (formerly HLA-75) to be an old fragment (Kulski et al 1999, 2000).

      (16) Lines 431-433. 'the Class II genes have been largely stable across the mammals, although we do see some lineage-specific expansions and contractions (Figure 2 and Figure 2-gure Supplement 2).' Please provide one or two references to support this statement. Is 'gure' a typo?

      (17) Line 437. 'We discovered far more "specific" events in Class I, while "broad-scale" events were predominant in Class II.' Please define the difference between 'specific' and 'broad-scale'.

      Lines 450-451. 'This shows that classical genes experience more turnover and are more often affected by long-term balancing selection or convergent evolution.' Is balancing selection a form of divergent evolution that is different from convergent evolution? Please explain in more detail how and why balancing selection or convergent evolution affects classical and nonclassical genes differently.

      References. Some references in the supplementary materials such as Alvarez (1997), Daza-Vamenta (2004), Rojo (2005), Aarnink (2014), Kulski (2022), and others are missing from the Reference list. Please check that all the references in the text and the supplementary materials are listed correctly and alphabetically.

    1. Emouvante critique de Brague.

      1) un gros paquet d'insultes: islamophobes (considés comme) les Urvoy et Brague lui même. "ses fantasmes, de ses erreurs parfois et de sa mauvaise foi souvent." "Rémi Brague enrobe les poncifs les plus éculés sur l’islam, qu’il ressert sans la moindre originalité, avec une apparence savante."

      La déqualification des époux Urvoy universitaires spécialistes de l'islam reconnus, ayant démythologisé Al Andalous et son "multiculturalisme" et bien leur qualification d'"islamophobe" signe son fréro. Voilà c'est fait. On se contentera donc de répondre de même: défenseur raciste, hypocrite et prétentieux de l'insupportable.

      2) Les détails. a) les "sciences" islamiques ilm el-khawâtîr la sciences de pensées, traditionnellement associées aux élaborations spirituelles, à la mystique et certainement pas à ce qu'on appelle la psychologie ou alors de manière indirecte. Toutes ces "sciences" datent de l'an mille, et les sciences "objectives" sont antérieures à la grande fermeture du XIIIème siècle, science dont le rapport à l'islam est rien moins que lointain. L'astronomie arabe (disons plutôt persane du moyen âge) n'est pas musulmane et ne doit rien au coran, pas plus que l'algèbre...<br /> L'appropriation tardive de ces savoirs moyen âgeux par des berbères se fantasmant arabes pour éviter un déclassement problématique reste ce qu'il est: pitoyable.

      b) Ibn Hazm. L'auteur du traité poétique "le collier de la colombe" bien qu'accusé de mutazilisme, la doctrine qui voulait imposer la raison partout et qui fut réfutée pour cela, est cité à ce sujet et établit précisément l'asservissement de la raison en théologie à la révélation coranique. La citation est pertinente. Le zahirisme n'accorde pas une moindre place à la raison mais réfute l'analogie et Ibn Hazm reste un docteur important de l'islam.

      De manière générale, la statégie qui consiste à réfuter toute citation islamique gênante par son attribution à un partisan d'une école nécessairement non représentative de l'"islam global" est bien connue du frère musulman moyen. Pas vu pas pris.

      c) Shah Kazemi, islamologue, est chiite et écrit des ouvrages sur l'imam Ali et sa spirituaité. Ne pas le référencer est coupable, certes... Seyyed Hussein Nasr iranien réfugié aux Usa, et pourquoi pas Gilbert Bourdin dont l'oubli est au moins autant coupable ?

      Le name dropping de tout obscur nom oriental , tel celui de abdul al azherd fait rire le mécréant...

      d) Nawawi ne contredit en rien Brague, n'évoquant pas spécifiquement les dhimmis dans ses 40 hadiths. L'indulgence générale envers les dhimmis est une vaste blague raciste et méprisante. La dhimma comme "sauvegarde" est tout simplement risible quand elle cherche à ne pas passer pour ce qu'elle est: une discrimination.

      Les laboureurs ne sont PAS exemptés de la zakat. C'est simple. Il l'étaient d'une forme spéciale c'est tout.

      e) les 4 écoles reconnaissent la polygamie et la monogamie est recommandée par le Coran lui même ! Quand à l'autorisation de troncher ses esclaves, elle est claire...

      f) Le verset de la tolérance, pont aux ânes hypocrite, : Plus de contrainte dans la religion maintenant que le vrai se distingue de l’erreur. capito pepito ? La citation de Qutb illustre parfaitement l'ambiguité constitutive de la chose...

      g) le califat et son interprétation est historiquement fondée, et qualifier de manière insultante la thèse relève du (1).

      h) Brague est trop humble et son humilité a trompé le fréro: l'orthodoxie continue de l'islam est bien celle qu'il décrit, hélas et ne pas le voir, c'est l'accuser ... d'islamophobie.

      i) le soufisme est parfaitement caractérisé: 80 % de l'islam le rejette et avec force. Le "cinquième" c'est 20%... Le soufisme est bien profondément réactionnaire de manière générale, et adossé sur une stricte orthodoxie, pour mieux se faire excuser son mysticisme, profondément rejeté par l'islam juridique.

      j) se contenter de rejeter le wahhabisme comme le fait Al Azhar de Daech (en recommandant de crucifier les djihadistes, suprême forme de foutage de gueule, qui plus est citée par le monsieur , pas dégouté) est parfaitement ridicule.

      k) kazemi est donc la référence absolue... voir https://philitt.fr/2018/10/20/daoud-riffi-en-islam-lintolerance-est-lexception-face-a-une-tolerance-fondamentale-1-2/

      Un disciple de Martin Lings, britannique des années 30, converti à l'islam soufi et sectateur de Guénon...

      l) suspecter Brague d'ignorer la différence entre fiqh et Sharia, sachant que le coran et donc dieu est bien le législateur suprême est un foutage de gueule cynique absurde. Le fréro se mord les couilles, cela impressionne. Rappelons aussi que la sharia INCLUT le fiqh qui n'est absolument pas une "interprétation" de la shariah !

      La citation de Linant de Bellefonds , qui creusa le canal de Suez, est décisive pour apprécier la vraie nature du fiqh ! A hurler de rire ! C'est donc lui qui "prouve" que la raison en islam a tout réglé...

      Le fiqh (la doctrine des fuqaha) est une jurisprudence qui déduit un droit (heureusement non positif, sinon cela serait terrible) des source islamiques. Ca va, vous suivez?

      "libre réflexion sur les fondements" ? Ah bon ?

      m) le droit "naturel" que serait la shariah. Strauss et Villey font partie de ceux qui critiquent l'abandon moderniste de loi naturelle transcendant la loi positive et AUSSI l'interprétation du droit naturel comme seul droit de l'individu hors du contexte de l'Etat. Les citer comme des acquis est symptomatique d'une culture superficielle, celle d'un formé aux écoles multiculturelles fréristes.

      De manière générale, la manière sentencieuse méprisante, et pour tout dire raciste dont un arabo mes couilles fait la leçon à des mécréants en les insultant est tout bonnement insupportable.

      Ibn Arabi ne distingue pas des "voies" mais des aspects, l'ésotérique et l'exotérique, zahir et batir. La conception personnelle du monsieur, d'inspiration vaguement soufie et sans doute à moitié chiite n'est en rien une analyse globale de l'infection globale qu'est l'islam, la suffisance fidéique avec laquelle il cherche à l'imposer de manière méprisante et prétentieuse est merdique et insupportable.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      How reconsolidation works - particularly in humans - remains largely unknown. With an elegant, 3-day design, combining fMRI and psychopharmacology, the authors provide evidence for a certain role for noradrenaline in the reconsolidation of memory for neutral stimuli. All memory tasks were performed in the context of fMRI scanning, with additional resting-state acquisitions performed before and after recall testing on Day 2. On Day 1, 3 groups of healthy participants encoded word-picture associates (with pictures being either scenes or objects) and then performed an immediate cued recall task to presentation of the word (answering is the word old or new, and whether it was paired with a scene or an object). On Day 2, the cued recall task was repeated using half of the stimulus set words encoded on Day 1 (only old words were presented, with subjects required to indicate prior scene vs object pairing). This test was immediately preceded by the oral administration of placebo, cortisol, or yohimbine (to raise noradrenaline levels) depending on group assignment. On Day 3, all words presented on Day 1 were presented. As expected, on Day 3, memory was significantly enhanced for associations that were cued and successfully retrieved on Day 2 compared to uncued associations. However, for associative d', there was no Cued × Group interaction nor a main effect of Group, i.e., on the standard measure of memory performance, post-retrieval drug presence on Day 2 did not affect memory reconsolidation. As further evidence for a null result, fMRI univariate analyses showed no Cued × Group interactions in whole-brain or ROI activity.

      Strengths:

      There are some aspects of this study that I find impressive. The study is well-designed and the fMRI analysis methodology is innovative and sound. The authors have made meticulous and thorough physiological measurements, and assays of mood, throughout the experiment. By doing so, they have overcome, to a considerable extent, the difficulties inherent in the timing of human oral drug delivery in reconsolidation tasks, where it is difficult to have the drug present in the immediate recall period without affecting recall itself. This is beautifully shown in Figure 3. I also think that having some neurobiological assay of memory reactivation when studying reconsolidation in humans is critical, and the authors provide this. While multi-voxel patterns of hemodynamic responses are, in my view, very difficult to equate with an "engram", these patterns do have something to do with memory.

      We thank the reviewer for considering aspects of our work impressive, the study to be well-designed, and the methodology to be innovative and sound.

      Weaknesses:

      I have major issues regarding the behavioral results and the framing of the manuscript.

      (1) To arrive at group differences in memory performance, the authors performed median splitting of Day 3 trials by short and long reaction times during memory cueing on Day 2, as they took this as a putative measure of high/low levels of memory reactivation. Associative category hits on Day 3 showed a Group by Day 2 Reaction time (short, long) interaction, with post-hocs showing (according to the text) worse memory for short Day 2 RTs in the Yohimbine group. These post-hocs should be corrected for multiple comparisons, as the result is not what would be predicted (see point 2). My primary issue here is that we are not given RT data for each group, nor is the median splitting procedure described in the methods. Was this across all groups, or within groups? Are short RTs in the yohimbine group any different from short RTs in the other two groups? Unfortunately, we are not given Day 2 picture category memory levels or reaction times for each group. This is relevant because (as given in Supplemental Table S1) memory performance (d´) for the Yohimbine group on Day 1 immediate testing is (roughly speaking) 20% lower than the other 2 groups (independently of whether the pairs will be presented again the following day). I appreciate that this is not significant in a group x performance ANOVA but how does this relate to later memory performance? What were the group-specific RTs on Day 1? So, before the reader goes into the fMRI results, there are questions regarding the supposed drug-induced changes in behavior. Indeed, in the discussion, there is repeated mention of subsequent memory impairment produced by yohimbine but the nature of the impairment is not clear.

      Thank you for the opportunity to clarify these important issues.

      Reaction times are well-established proxies (correlates) of memory strength and memory confidence in previous research, as they reflect cognitive processes involved in retrieving information. Faster reaction times indicate stronger mnemonic evidence and higher confidence in the accuracy of a memory decision, while slower responses suggest weaker evidence and decision uncertainty or doubt. This relationship is supported by an extensive literature (e.g., Starns 2021; Robinson et al., 1997; Ratcliff & Murdock, 1976; amongst others). Importantly, distinguishing between high and low confidence choices in a memory task serves the purpose of differentiating between particularly strong memory evidence (e.g., in associative cued recall, when remembering is particularly vivid) and weaker memory evidence. Separating low from high confidence responses based on participants’ reaction times was especially important in the current analyses, because previous research demonstrates that reaction times during cued recall tasks inversely correlate with hippocampal involvement (Heinbockel et al., 2024; Gagnon et al. 2019) and that stress effects on human memory may be particularly pronounced for high-confidence memories (Gagnon et al., 2019).

      In response to Reviewer 1’s comments, we have elaborated on our rationale for the distinction between short and long reaction times in the introduction, results, and methods. Please see page 4, lines 144 to 148:

      “We distinguished between responses with short and long reaction times indicative of high and low confidence responses because previous research showed that reaction times are inversely correlated with hippocampal memory involvement(58-60) and memory strength(61,62), and that high confidence memories associated with short reaction times may be particularly sensitive to stress effects(63).”

      On page 13, lines 520 to 523:

      “Reaction times in the Day 2 Memory cueing task revealed a trial-specific gradient in reactivation strength. Thus, we turned to single-trial analyses, differentiating Day 3 trials by short and long reaction times during memory cueing on Day 2 (median split), indicative of high vs. low memory confidence(58–60) and hippocampal reactivation(26,63).”

      And on page 26, lines 1046 to 1053:

      “Reaction times serve as a proxy for memory confidence and memory strength, with faster responses reflecting higher confidence/strength and slower responses suggesting greater uncertainty/weaker memory. The association between reaction times and memory confidence has been established by previous research(58–60), suggesting that the distinction between high and low confidence responses differentiates vividly recalled associations from decisions based on weaker memory evidence. Reaction times are further linked to hippocampal activity during recall tasks(26,53), and stress effects on memory are particularly pronounced for high-confidence memories(53).”

      With respect to behavioral data reporting, we agree that the critical median-split procedure was not sufficiently clear in the original manuscript. We elaborate on this important aspect of the analysis now on page 26, lines 1053 to 1057:

      “We conducted a median-split within each participant to categorize trials as fast vs. slow reaction time trials during Day 2 memory cueing. We conducted this split on the participant- and not group-level because there is substantial inter-individual variability in overall reaction times. This approach also results in an equal number of trials in the low and high confidence conditions.”
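      The within-participant median split described above can be illustrated with a minimal sketch. This is not the authors' analysis code; the function name, the data layout (a list of participant/reaction-time pairs), and the rule for trials that fall exactly on the median are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import median


def median_split_by_participant(trials):
    """Label each (participant_id, rt) trial 'fast' or 'slow' relative to
    that participant's own median reaction time (hypothetical helper)."""
    rts_by_participant = defaultdict(list)
    for participant_id, rt in trials:
        rts_by_participant[participant_id].append(rt)

    # One median per participant, not one median for the whole sample.
    medians = {pid: median(rts) for pid, rts in rts_by_participant.items()}

    # Assumption: a trial exactly at the median is labelled 'slow'.
    return [
        (pid, rt, "fast" if rt < medians[pid] else "slow")
        for pid, rt in trials
    ]


# Hypothetical reaction times in seconds for two participants.
trials = [
    ("p1", 0.8), ("p1", 1.2), ("p1", 1.0), ("p1", 1.6),
    ("p2", 2.0), ("p2", 2.4), ("p2", 3.0), ("p2", 2.2),
]
for row in median_split_by_participant(trials):
    print(row)
```

      In this toy data set, the second participant's "fast" trials are slower than the first participant's "slow" trials, which is the kind of inter-individual variability a participant-level split is meant to absorb. With an even number of trials and no ties at the median, each participant contributes equally many fast and slow trials.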

      We completely agree that the relevant post-hoc test should be corrected for multiple comparisons. Please note that all reported post-hoc tests had already been Bonferroni-corrected. We now clarify this by explicitly referring to corrected p-values (P<sub>corr</sub>) and indicate in the methods that P<sub>corr</sub> refers to Bonferroni-corrected p-values (please see page 25, lines 1036 to 1038).
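
      For readers unfamiliar with the correction, the Bonferroni adjustment multiplies each p-value by the number of tests in the family and caps the result at 1. The sketch below is illustrative only; the function name is hypothetical, and which tests form a family is an analysis decision that the response does not spell out.

```python
def bonferroni(p_values):
    """Return Bonferroni-corrected p-values for one family of tests."""
    n_tests = len(p_values)
    # Multiply by the number of comparisons and cap at 1.0,
    # since a probability cannot exceed 1.
    return [min(1.0, p * n_tests) for p in p_values]
```

      The cap explains why a corrected value can be reported as exactly 1.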

      We further agree that for a comprehensive overview of the behaviour in terms of memory performance and RTs, these data need to be provided for each group and experimental day. Therefore, we now extended Supplementary Table S1 to include descriptive indices of memory performance (hits, dprime) and RTs for each group for each day. Moreover, we now report ANOVAs for reaction times for each of the experimental days in the main text.
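
      As a reminder of what the sensitivity index summarizes, d′ is the difference between the z-transformed hit rate and the z-transformed false-alarm rate. The sketch below is a generic illustration under stated assumptions: the function name and the log-linear correction (adding 0.5 to each count) are choices made for this example, and the response does not specify how d′ was computed in the study.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    # Assumption: log-linear correction keeps rates away from 0 and 1,
    # where the inverse normal CDF is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```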

      The ANOVA for Day 1 is now reported on page 6, lines 200 to 204: “To test for potential group differences in reaction times for correctly remembered associations on Day 1, we fit a linear model including the factors Group and Cueing. Critically, we did not observe a significant Group x Cueing interaction, suggesting no RT difference between groups for later cued and not cued items (F(2,58) = 1.41, P = .258, η<sup>2</sup> = 0.01; Supplemental Table S1).”

      The ANOVA for Day 2 is now reported on page 7, lines 243 to 248: “To test for potential group differences in reaction times for correctly remembered associations on Day 2, we fit a linear model including the factors Group and Reaction time (slow/fast) following the subject-specific median split. The model did not reveal any main effect or interaction including the factor Group (all Ps > .535; Supplemental Table S1), indicating that there was no RT difference between groups, nor between low and high RT trials in the groups.”

      The ANOVA for Day 3 is reported on page 13, lines 487 to 494: “To test for potential group differences in reaction times for correctly remembered associations on Day 3, we fit a linear model including the factors Group and Cueing. This model did not reveal any main effect or interaction including the factor Group (all Ps > .267), indicating that there was no average RT difference between groups. As expected, we observed a main effect of the factor Cueing, indicating a significant difference of reaction times across groups between trials that were successfully cued and those not cued on Day 2 (F(2,58) = 153.07, P < .001, η<sup>2</sup> = 0.22; Supplemental Table S1).”

      (2) The authors should be clearer as to what their original hypotheses were, and why they did the experiment. Despite being a complex literature, I would have thought the hypotheses would be reconsolidation impairment by cortisol and enhancement by yohimbine. Here it is relevant to point out that - only when the reader gets to the Methods section - there is mention of a paper published by this group in 2024. In this publication, the authors used the same study design but administered a stress manipulation after Day 2 cued recall, instead of a pharmacological one. They did not find a difference in associative hit rate between stress and control groups, but - similar to the current manuscript - reported that post-retrieval stress disrupts subsequent remembering (Day 3 performance) depending on neural memory reinstatement during reactivation (specifically driven by the hippocampus and its correlation with neocortical areas).

      Instead of using these results, and other human studies, to motivate the current work, reference is made to a recent animal study: Line 169 "Building on recent findings in rodents (Khalaf et al. 2018), we hypothesized that the effects of post-retrieval noradrenergic and glucocorticoid activation would critically depend on the reinstatement of the neural event representation during retrieval". It is difficult to follow that a rodent study using contextual fear conditioning and examining single neuron activity to remote fear recall and extinction would be relevant enough to motivate a hypothesis for a human psychopharmacological study on emotionally neutral paired associates.

      We agree that our recent publication utilizing a very similar experimental design including three days is highly relevant in the context of the current study and we now refer to this recent study earlier in our manuscript. Please see page 3, lines 89 to 94:  

      “Recently, we showed a detrimental impact of post-retrieval stress on subsequent memory that was contingent upon reinstatement dynamics in the hippocampus, VTC and PCC during memory reactivation(26). While this study provided initial insights into the potential brain mechanisms involved in the effects of post-retrieval stress on subsequent memory, the underlying neuroendocrine mechanisms remained elusive.”

      Moreover, we explicitly state our hypothesis regarding the neural mechanism, with reference to our recent work, on page 5, lines 166 to 169:

      “Building on our recent findings in humans(26) as well as current insights from rodents(47), we hypothesized that the effects of post-retrieval noradrenergic and glucocorticoid activation would critically depend on the reinstatement of the neural event representation during retrieval.”

      Concerning the potential direction of the effects of post-retrieval cortisol and noradrenaline, the literature is indeed mixed with partially contradicting results, which made it, in our view, difficult to derive a clear hypothesis of potentially opposite effects of cortisol and yohimbine. We summarize the relevant evidence in the introduction on pages 3 to 4, lines 100 to 113:

      “Some studies, using emotional recognition memory or fear conditioning in healthy humans, suggest enhancing effects of post-retrieval glucocorticoids on subsequent memory(30,31). However, rodent studies on neutral recognition memory(21), fear conditioning(32), as well as evidence from humans on episodic recognition memory(33) report impairing effects of glucocorticoid receptor activation on post-retrieval memory dynamics. For noradrenaline, post-retrieval blockade of noradrenergic activity impairs putative reconsolidation or future memory accessibility in human fear conditioning(34), as well as drug (alcohol) memory(35) and spatial memory in rodents(36). However, this effect is not consistently observed in human studies on fear conditioning(40), speaking anxiety(37), inhibitory avoidance(39), traumatic mental imagination (PTSD patients)(38), and might depend on the arousal state of the individual(21) or the exact timing of drug administration as suggested by studies in humans(41) and rodents(42). Thus, while there is evidence that glucocorticoid and noradrenergic activation after retrieval can affect subsequent memory, the direction of these effects remains elusive.”

      In addition to these reviewer comments and in response to the eLife assessment, we would like to emphasize that the present findings are in our view not only relevant for a subfield but may be of considerable interest for researchers from various fields, beyond experimental memory research, including Neurobiology, Psychiatry, Clinical Psychology, Educational Psychology, or Law Psychology. We highlight the relevance of the topic and our findings now more explicitly in the introduction and discussion. Please see page 3:

      “The dynamics of memory after retrieval, whether through reconsolidation of the original trace or interference with retrieval-related traces, have fundamental implications for educational settings, eyewitness testimony, or mental disorders(5,11,12). In clinical contexts, post-retrieval changes of memory might offer a unique opportunity to retrospectively modify or render less accessible unwanted memories, such as those associated with posttraumatic stress disorder (PTSD) or anxiety disorders(13–15). Given these potential far reaching implications, understanding the mechanisms underlying post-retrieval dynamics of memory is essential.”

      On page 17:

      “Upon their retrieval, memories can become sensitive to modification(1,2). Such post-retrieval changes in memory may be fundamental for adaptation to volatile environments and have critical implications for eyewitness testimony, clinical or educational contexts(5,11–15). Yet, the brain mechanisms involved in the dynamics of memory after retrieval are largely unknown, especially in humans.”

      And on page 19:

      “Beyond their theoretical relevance, these findings may have relevant implications for attempts to employ post-retrieval manipulations to modify unwanted memories in anxiety disorders or PTSD(97,98). Specifically, the present findings suggest that such interventions may be particularly promising if combined with cognitive or brain stimulation techniques ensuring a sufficient memory reactivation.”

      Reviewer #1 (Recommendations for the authors):

      (1) Related to major issue 2 in the Public Review. In the introduction, it would be helpful to be specific about the type of memory being probed in the different studies referenced (episodic vs conditioning). For the former, please make it clear whether stimuli to be remembered were emotional or neutral, and for which stimulus class drug effects were observed. This is particularly important given that in the first paragraph, you describe memory reactivation in the context of traumatic memories via mention of PTSD. It would also be helpful to know to which species you refer. For example, in line 115, "timing of drug administration..." a rodent and a human study are cited.

      We completely agree that these aspects are important. We have therefore rewritten the corresponding paragraph in the introduction to clarify the type of memory probed, the emotionality of the stimuli and the species tested. Please see pages 3 to 4, lines 100 to 113:

      “Some studies, using emotional recognition memory or fear conditioning in healthy humans, suggest enhancing effects of post-retrieval glucocorticoids on subsequent memory(30,31). However, rodent studies on neutral recognition memory(21), fear conditioning(32), as well as evidence from humans on episodic recognition memory(33) report impairing effects of glucocorticoid receptor activation on post-retrieval memory dynamics. For noradrenaline, post-retrieval blockade of noradrenergic activity impairs putative reconsolidation or future memory accessibility in human fear conditioning(34), as well as drug (alcohol) memory(35) and spatial memory in rodents(36). However, this effect is not consistently observed in human studies on fear conditioning(40), speaking anxiety(37), inhibitory avoidance(39), traumatic mental imagination (PTSD patients)(38), and might depend on the arousal state of the individual(21) or the exact timing of drug administration as suggested by studies in humans(41) and rodents(42). Thus, while there is evidence that glucocorticoid and noradrenergic activation after retrieval can affect subsequent memory, the direction of these effects remains elusive.”

      (2) The Bos 2014 reference appears incorrect. I think you mean the Frontiers paper of the same year.

      Thank you for noticing this mistake, which has been corrected.

      (3) Line 734 "The study employed a fully crossed, placebo-controlled, double-blind, between-subjects design". What is a fully crossed design?

      A fully-crossed design refers to studies in which all possible combinations of multiple between-subjects factors are implemented. However, because the factor reactivation/cueing was manipulated within-subject in the present study and there is only one between-subjects factor (group/drug), “fully-crossed” may be misleading here. We removed it from the manuscript.

      (4) Supplemental Table S3. Are these ordered in terms of significance? A t- or Z-value for each cluster (either of the peak or a summed value) would be helpful.

      We agree that the ordering of the clusters was not clearly described. In the revised Supplemental Table S3, we have now added a column with the cluster-peak specific T-values and added an explanation in the table caption: “Depicted clusters are ordered by cluster-peak T-values.”

      (5) Please provide the requested memory performance and reaction time data, and relevant group comparisons.

      In response to general comment #1 above, we now provide all relevant accuracy and reaction time data for all groups and experimental days in the revised Supplemental Table S1. Moreover, we now report the relevant group comparisons in the main text on page 6, lines 200 to 204, on page 7, lines 243 to 248, and on page 13, lines 487 to 494.

      (6) Please rewrite the introduction with specific hypotheses, mention your recent results published in Science Advances, and attend to suggestions made in the first comment above.

      We have rewritten parts of the introduction to make the link to our recent publication clearer and to clarify the types of memories and species tested, as suggested by the reviewer (please see pages 3 to 4, lines 100 to 113). Moreover, we explicitly state our hypothesis regarding the neural mechanism on page 5, lines 166 to 169:

      “Building on our recent findings in humans(26) as well as current insights from rodents(47), we hypothesized that the effects of post-retrieval noradrenergic and glucocorticoid activation would critically depend on the reinstatement of the neural event representation during retrieval.”

      In terms of the direction of the potential cortisol and yohimbine effects, we have elaborated on the relevant literature, which in our view does not allow a clear prediction regarding the nature of the drug effects. We have made this explicit by stating that “… while there is evidence that glucocorticoid and noradrenergic activation after retrieval can affect subsequent memory, the direction of these effects remains elusive.” (please see page 4, lines 111 to 113). It would be, in our view, inappropriate to retrospectively add another, more specific “hypothesis”.

      Reviewer #2 (Public review):

      Summary:

      The authors aimed to investigate how noradrenergic and glucocorticoid activity after retrieval influence subsequent memory recall with a 24-hour interval, by using a controlled three-day fMRI study involving pharmacological manipulation. They found that noradrenergic activity after retrieval selectively impairs subsequent memory recall, depending on hippocampal and cortical reactivation during retrieval.

      Overall, there are several significant strengths of this well-written manuscript.

      Strengths:

      (1) The study is methodologically rigorous, employing a well-structured three-day experimental design that includes fMRI imaging, pharmacological interventions, and controlled memory tests.

      (2) The use of pharmacological agents (i.e., hydrocortisone and yohimbine) to manipulate glucocorticoid and noradrenergic activity is a significant strength.

      (3) The clear distinction between online and offline neural reactivation using MVPA and RSA approaches provides valuable insights into how memory dynamics are influenced by noradrenergic and glucocorticoid activity distinctly.

      We thank the reviewer for these very positive and encouraging remarks.

      Weaknesses:

      (1) One potential limitation is the reliance on distinct pharmacodynamics of hydrocortisone and yohimbine, which may complicate the interpretation of the results.

      We agree that the pharmacodynamics of hydrocortisone and yohimbine are different. However, we took these pharmacodynamics into account when designing the experiment and have made an effort to accurately track the indicators for noradrenergic arousal and glucocorticoids across the experiment. As shown in Figure 2, these indicators confirm that both drugs are active within the time window of approximately 40-90 minutes after reactivation. This time window corresponds to the proposed reconsolidation window, which is assumed to open around 10 minutes post-reactivation and to remain open for a few hours (approximately 90 minutes; Monfils & Holmes, 2018; Lee et al., 2017; Monfils et al., 2009).

      We have now acknowledged the distinct pharmacodynamics of hydrocortisone and yohimbine on page 21, lines 845 to 847: “We note that yohimbine and hydrocortisone follow distinct pharmacodynamics(104,105), yet selected the administration timing to ensure that both substances are active within the relevant post-retrieval time window.”

      In the results section, on page 11, lines 437 to 439, we further emphasize this differential dynamic: “Our data demonstrate that, despite the distinct pharmacodynamics of CORT and YOH, both substances are active within the time window that is critical for potential reconsolidation effects(3,4,43).”

      (2) Another point related above, individual differences in pharmacological responses, physiological and cortisol measures may contribute to memory recall on Day 3.

      The administered drugs elicit a pronounced adrenergic and glucocorticoid response, respectively. Specifically, the cortisol levels reached after 20 mg of hydrocortisone correspond to those observed after exposure to a significant stressor. Moreover, individual variation in stress system activation following drug intake tends to be less pronounced than in response to a natural stressor. Nevertheless, we fully agree that individual factors, such as metabolism or body weight, can influence the drug's action.

      We therefore re-analysed the reported Day 3 models, now including individual measures of baseline-to-peak changes in cortisol and systolic blood pressure, respectively. We report these additional analyses in the supplement and refer the interested reader to these analyses on page 15, lines 580 to 586:

      “As individual factors, such as metabolism or body weight, can influence the drug's action, we ran an additional analysis in which we included individual (baseline-to-peak) differences in salivary cortisol and (systolic) blood pressure, respectively. This analysis did not show any group by baseline-to-peak difference interaction suggesting that the observed memory effects were mainly driven by the pharmacological intervention group per se and less by individual variation in responses to the drug (see Supplemental Results).”

      And in the Supplemental Results:

      “To account for individual differences in cortisol responses after pill intake, we fit additional GLMMs predicting Day 3 subsequent memory of cued and correct trials including the factors Individual baseline-to-peak cortisol and Group. Doing so allowed us to account for variation in Day 3 performance, which might have resulted from within-group variation in cortisol responses, in particular in the CORT group. Importantly, none of the models predicting Day 3 memory performance by Day 2 cortisol-increase and Group, median-split RTs (high/low), hippocampal activity and RTs, or hippocampal activity and VTC category reinstatement revealed a significant Group × baseline-to-peak cortisol interaction (all Ps > .122). These results suggest that inter-individual differences in cortisol responses did not have a significant impact on subsequent memory, beyond the influence of group per se. The same analyses were repeated for systolic blood pressure employing GLMMs predicting Day 3 subsequent memory of cued and correct trials including the factors Individual baseline-to-peak systolic blood pressure and Group to account for variation in Day 3 performance, which might have resulted from within-group variation in blood pressure response, in particular in the YOH group. Although the model predicting Day 3 memory performance revealed a significant Individual baseline-to-peak systolic blood pressure × Group × median-split RTs (high/low) interaction (β = -0.05 ± 0.02, z = -2.04, P = .041, R<sup>2</sup><sub>conditional</sub> = 0.01), post-hoc slope tests did not show any significant difference between groups (all P<sub>Corr</sub> > .329). The remaining models including hippocampal activity and RTs, or hippocampal activity and VTC category reinstatement did not reveal a significant Group × Individual baseline-to-peak systolic blood pressure interaction (all Ps > .101). These results suggest that inter-individual differences in systolic blood pressure responses did not have a significant impact on subsequent memory, beyond the influence of group per se.”

      Although we acknowledge that our study may not have been sufficiently powered for an analysis of individual differences, these data suggest that our memory effects were mainly driven by the pharmacological intervention group per se and less by individual variation in responses. It is to be noted, however, that all participants of the respective groups showed a pronounced increase in cortisol concentrations (on average > 1000% in the CORT group) and autonomic arousal (on average > 10% in the YOH group), respectively. These increases appeared to be sufficient to drive the observed memory effects, irrespective of some individual variation in the magnitude of the response.
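
      The baseline-to-peak measure used as an individual covariate can be illustrated with a short sketch. The function name and the input layout (one baseline value plus a list of post-administration samples) are assumptions for this example; the exact sampling time points and units are those of the study and are not reproduced here.

```python
def baseline_to_peak_change(baseline, post_samples):
    """Return (absolute, percent) change from baseline to the highest later sample.

    `baseline` is the pre-administration value and `post_samples` the values
    measured afterwards (hypothetical layout for this sketch).
    """
    peak = max(post_samples)
    absolute = peak - baseline
    # Percent change is expressed relative to the participant's own baseline.
    percent = 100.0 * absolute / baseline
    return absolute, percent
```

      For example, a participant whose value rises from 10 to a peak of 110 shows an absolute change of 100 and a relative increase of 1000%.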

      (3) Median-splitting approach for reaction times and hippocampal activity should better be justified.

      Reaction times are well-established proxies (correlates) of memory strength and memory confidence in previous research, as they reflect cognitive processes involved in retrieving information. Faster reaction times indicate stronger mnemonic evidence and higher confidence in the accuracy of a memory decision, while slower responses suggest weaker evidence and decision uncertainty or doubt. This relationship is supported by an extensive literature (e.g., Starns, 2021; Robinson et al., 1997; Ratcliff & Murdock, 1976; amongst others). Importantly, distinguishing between high and low confidence choices in a memory task serves the purpose of differentiating between particularly strong memory evidence (e.g., in associative cued recall, when remembering is particularly vivid) and weaker memory evidence. Separating low from high confidence responses based on participants’ reaction times was especially important in the current analyses, because previous research demonstrates that reaction times during cued recall tasks inversely correlate with hippocampal involvement (Heinbockel et al., 2024; Gagnon et al., 2019) and that stress effects on human memory may be particularly pronounced for high-confidence memories (Gagnon et al., 2019).

      In response to the Reviewer comments, we have elaborated on our rationale for the distinction between short and long reaction times in the introduction, results, and methods. Please see page 4, lines 144 to 148:

      “We distinguished between responses with short and long reaction times indicative of high and low confidence responses because previous research showed that reaction times are inversely correlated with hippocampal memory involvement(58–60) and memory strength(61,62), and that high confidence memories associated with short reaction times may be particularly sensitive to stress effects(63).”

      On page 13, lines 520 to 523:

      “Reaction times in the Day 2 Memory cueing task revealed a trial-specific gradient in reactivation strength. Thus, we turned to single-trial analyses, differentiating Day 3 trials by short and long reaction times during memory cueing on Day 2 (median split), indicative of high vs. low memory confidence(58–60) and hippocampal reactivation(26,63).”

      And on page 26, lines 1046 to 1053:

      “Reaction times serve as a proxy for memory confidence and memory strength, with faster responses reflecting higher confidence/strength and slower responses suggesting greater uncertainty/weaker memory. The association between reaction times and memory confidence has been established by previous research(58–60), suggesting that the distinction between high and low confidence responses differentiates vividly recalled associations from decisions based on weaker memory evidence. Reaction times are further linked to hippocampal activity during recall tasks(26,53), and stress effects on memory are particularly pronounced for high-confidence memories(53).”

      We agree that the critical median-split procedure was not sufficiently clear in the original manuscript. We elaborate on this important aspect of the analysis now on page 26, lines 1053 to 1057:

      “We conducted a median-split within each participant to categorize trials as slow vs. fast reaction time trials during Day 2 memory cueing. We chose to conduct this split on the participant- and not group-level because there is substantial inter-individual variability in overall reaction times and because this approach retains an equal number of trials in the low and high confidence conditions.”

      In addition to these reviewer comments and in response to the eLife assessment, we would like to emphasize that the present findings are in our view not only relevant for a subfield but may be of considerable interest for researchers from various fields, beyond experimental memory research, including Neurobiology, Psychiatry, Clinical Psychology, Educational Psychology, or Law Psychology. We highlight the relevance of the topic and our findings now more explicitly in the introduction and discussion. Please see page 3:

      “The dynamics of memory after retrieval, whether through reconsolidation of the original trace or interference with retrieval-related traces, have fundamental implications for educational settings, eyewitness testimony, or mental disorders(5,11,12). In clinical contexts, post-retrieval changes of memory might offer a unique opportunity to retrospectively modify or render less accessible unwanted memories, such as those associated with posttraumatic stress disorder (PTSD) or anxiety disorders(13–15). Given these potential far reaching implications, understanding the mechanisms underlying post-retrieval dynamics of memory is essential.”

      On page 17:

      “Upon their retrieval, memories can become sensitive to modification(1,2). Such post-retrieval changes in memory may be fundamental for adaptation to volatile environments and have critical implications for eyewitness testimony, clinical or educational contexts(5,11–15). Yet, the brain mechanisms involved in the dynamics of memory after retrieval are largely unknown, especially in humans.”

      And on page 19:

      “Beyond their theoretical relevance, these findings may have relevant implications for attempts to employ post-retrieval manipulations to modify unwanted memories in anxiety disorders or PTSD(97,98). Specifically, the present findings suggest that such interventions may be particularly promising if combined with cognitive or brain stimulation techniques ensuring a sufficient memory reactivation.”

      Reviewer #2 (Recommendations for the authors):

      My comments and/or questions for the authors to improve this well-written manuscript.

      (1) This study identifies the modulatory role of the hippocampus and VTC in the effects of norepinephrine on subsequent memory. Are there functional interactions between these ROIs and other brain regions that could be wise to consider for a more comprehensive understanding of the underlying neural mechanisms?

      We agree that functional interactions of hippocampus and VTC and other regions that were active during Day 2 memory cueing are relevant for our understanding of the underlying mechanisms. We therefore now performed connectivity analyses using general psycho-physiological interaction analysis (gPPI; as implemented in SPM) and report the results of this analysis on page 16, lines 635 to 644, and added Supplemental Table S4 including gPPI statistics.

      “We conducted general psycho-physiological interaction (gPPI) analyses on the Day 2 memory cueing task (remembered – forgotten), which revealed that successful cueing was accompanied by significant functional connectivity between the left hippocampus, VTC, PCC and MPFC (see Supplemental Table S4). However, using these connectivity estimates to predict Day 3 subsequent memory performance (dprime) via regression did not reveal any significant Group × Connectivity interactions, indicating that the pharmacological manipulation (i.e. noradrenergic stimulation) did not modulate subsequent memory based on functional connectivity during memory cueing (all P<sub>Corr</sub> > .228). The same pattern of results was observed when including single trial beta estimates from multiple ROIs during memory cueing to predict Day 3 memory (all interaction effects P<sub>Corr</sub> > .288).”

      (2) In theory, noradrenergic activity would have a profound impact on activity in widespread brain regions that are closely related to memory function. It would be interesting to know other possible effects beyond the hippocampus and VTC.

      We agree and included in our analysis additional ROIs beyond the HC and VTC; we now report these explorative results on page 16, lines 616 to 633:

      “Beyond hippocampal and VTC activity during memory cueing (Day 2), we exploratively reanalysed the GLMMs predicting Day 3 memory performance including the PCC, which was relevant during memory cueing in the current study and in our previous work(26). Predicting Day 3 memory performance by the factors Group and Single trial beta activity during memory cueing in the PCC did not reveal a significant interaction (P<sub>Corr</sub> = 1); adding the factor Reaction time to the model also did not result in a significant interaction (P<sub>Corr</sub> = 1). We also included the Medial Prefrontal Cortex (MPFC) to predict Day 3 memory performance, as the MPFC has been shown to be sensitive to noradrenergic modulation in previous work(75). Predicting Day 3 memory performance by the factors Group and Single trial beta activity during memory cueing in the MPFC did not reveal a significant interaction (P<sub>Corr</sub> = 1); adding the factor Reaction time to the model also did not result in a significant interaction (P<sub>Corr</sub> = 1), which indicates that the MPFC was not modulated by either pharmacological intervention. Finally, we investigated memory cueing from all remaining ROIs that were significantly activated during the Day 2 memory cueing task (Day 2 whole-brain analysis; correct-incorrect; Supplemental Table S3). We again fit GLMMs predicting Day 3 memory performance by the factors Group and Single trial beta activity during memory cueing. Again, we did not observe any significant interaction effect in any of the ROIs (all interaction P<sub>Corr</sub> > .060) and these results did not change when adding the factor Reaction time to the respective models (all P<sub>Corr</sub> > .075).”

      (3) There are substantial individual differences in pharmacological responses, physiological and cortisol measures, as shown in Figure 3A&B. If such individual differences are taken into account, are there any potential effects on subsequent recall on Day 3 pertaining to the hydrocortisone group?

      In response to this comment (and the General comment #1 of this reviewer), we now re-analyzed the respective models including individual measures of baseline-to-peak cortisol and systolic blood pressure.

      We re-analysed the reported Day 3 models, now including individual measures of baseline-to-peak changes in cortisol and systolic blood pressure, respectively. We report these additional analyses in the supplement and refer the interested reader to these analyses on page 15, lines 580 to 586:

      “As individual factors, such as metabolism or body weight, can influence the drug's action, we ran an additional analysis in which we included individual (baseline-to-peak) differences in salivary cortisol and (systolic) blood pressure, respectively. This analysis did not show any group by baseline-to-peak difference interaction suggesting that the observed memory effects were mainly driven by the pharmacological intervention group per se and less by individual variation in responses to the drug (see Supplemental Results).”

      And in the Supplemental Results:

      “To account for individual differences in cortisol responses after pill intake, we fit additional GLMMs predicting Day 3 subsequent memory of cued and correct trials including the factors Individual baseline-to-peak cortisol and Group. Doing so allowed us to account for variation in Day 3 performance, which might have resulted from within-group variation in cortisol responses, in particular in the CORT group. Importantly, none of the models predicting Day 3 memory performance by Day 2 cortisol-increase and Group, median-split RTs (high/low), hippocampal activity and RTs, or hippocampal activity and VTC category reinstatement revealed a significant group x baseline-to-peak cortisol interaction (all Ps > .122). These results suggest that inter-individual differences in cortisol responses did not have a significant impact on subsequent memory, beyond the influence of group per se. The same analyses were repeated for systolic blood pressure employing GLMMs predicting Day 3 subsequent memory of cued and correct trials including the factors Individual baseline-to-peak systolic blood pressure and Group to account for variation in Day 3 performance, which might have resulted from within-group variation in blood pressure response, in particular in the YOH group. While the model predicting Day 3 memory performance revealed a significant Individual baseline-to-peak systolic blood pressure × Group × median-split RTs (high/low) interaction (β = -0.05 ± 0.02, z = -2.04, P = .041, R<sup>2</sup><sub>conditional</sub> = 0.01), post-hoc slope tests, however, did not show any significant difference between groups (all P<sub>Corr</sub> > .329). The remaining models including hippocampal activity and RTs, or hippocampal activity and VTC category reinstatement did not reveal a significant Group × Individual baseline-to-peak systolic blood pressure interaction (all Ps > .101). 
These results suggest that inter-individual differences in systolic blood pressure responses did not have a significant impact on subsequent memory, beyond the influence of group per se.”

      (4) Median-splitting approach for reaction times and hippocampal activity should better be justified.

      Reaction times are well-established proxies (correlates) of memory strength and memory confidence, as they reflect the cognitive processes involved in retrieving information. Faster reaction times indicate stronger mnemonic evidence and higher confidence in the accuracy of a memory decision, while slower responses suggest weaker evidence and decision uncertainty or doubt. This relationship is supported by an extensive literature (e.g., Starns 2021; Robinson et al., 1997; Ratcliff & Murdock, 1976; amongst others). Importantly, distinguishing between high and low confidence choices in a memory task serves to differentiate particularly strong memory evidence (e.g., in associative cued recall, when remembering is particularly vivid) from weaker memory evidence. Separating low from high confidence responses based on participants’ reaction times was especially important in the current analyses, because previous research demonstrates that reaction times during cued recall tasks correlate inversely with hippocampal involvement (Heinbockel et al., 2024; Gagnon et al., 2019) and that stress effects on human memory may be particularly pronounced for high-confidence memories (Gagnon et al., 2019).

      In response to the Reviewer comments, we have elaborated on our rationale for the distinction between short and long reaction times in the introduction, results, and methods. Please see page 4, lines 144 to 148:

      “We distinguished between responses with short and long reaction times indicative of high and low confidence responses because previous research showed that reaction times are inversely correlated with hippocampal memory involvement(58–60) and memory strength(61,62), and that high confidence memories associated with short reaction times may be particularly sensitive to stress effects(63).”

      On page 13, lines 520 to 523:

      “Reaction times in the Day 2 Memory cueing task revealed a trial-specific gradient in reactivation strength. Thus, we turned to single-trial analyses, differentiating Day 3 trials by short and long reaction times during memory cueing on Day 2 (median split), indicative of high vs. low memory confidence(58–60) and hippocampal reactivation(26,63).”

      And on page 26, lines 1046 to 1053:

      “Reaction times serve as a proxy for memory confidence and memory strength, with faster responses reflecting higher confidence/strength and slower responses suggesting greater uncertainty/weaker memory. The association between reaction times and memory confidence has been established by previous research(58–60), suggesting that the distinction between high and low confidence responses differentiates vividly recalled associations from decisions based on weaker memory evidence. Reaction times are further linked to hippocampal activity during recall tasks(26,53), and stress effects on memory are particularly pronounced for high-confidence memories(53).”
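
      The median split described in these passages can be illustrated with a minimal sketch. It assumes the split is computed over one set of Day 2 reaction times (whether per participant or across the sample is not stated here), and the function name, tie-handling rule and example values are hypothetical rather than taken from the study.

```python
from statistics import median

def median_split(reaction_times):
    """Label each trial 'short' or 'long' relative to the median reaction time."""
    cutoff = median(reaction_times)
    # Assumption: trials exactly at the median are labelled 'short'.
    return ["short" if rt <= cutoff else "long" for rt in reaction_times]

# Hypothetical reaction times (in seconds) from the memory cueing task
rts = [0.8, 1.9, 1.1, 2.4, 0.9, 1.6]
labels = median_split(rts)
```

      Under this reading, 'short' trials would correspond to the high-confidence responses and 'long' trials to the low-confidence responses.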

      Minor comments:

      (5) Please include the full names of key abbreviations in the figure legends, such as "ass.cat.hit" and among others.

      We now include the full names of key abbreviations in all figure legends (e.g., ass.cat.hit = associative category hit).

      (6) Please introduce various metrics used in the study to aid readers in better understanding the measurements they utilized.

      We agree that various measures that were included in our analyses had not been described clearly enough before, especially concerning the multivariate analyses. We therefore added short explanations across the results section.

      Page 8, lines 279 to 280: “Classifier accuracy is derived from the sum of correct predictions the trained classifier made in the test-set, relative to the total amount of predictions.”

      Page 8, lines 290 to 292:  “Neural reinstatement reflects the extent to which a neural activity pattern (i.e., for objects) that was present during encoding is reactivated during retrieval (e.g., memory cueing).”

      Page 8, lines 299 to 301:  “The logits here reflect the log-transformed trial-wise probability of a pattern either representing a scene or an object.”

      Page 10, lines 378 to 380:  “Beyond category-level reinstatement, we assessed event-level memory trace reinstatement from initial encoding (Day 1) to memory cueing (Day 2), via RSA, correlating neural patterns in each region (hippocampus, VTC, and PCC) across days.”
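
      Two of the metrics defined above can be written out as a short sketch. The accuracy function follows the quoted definition directly; the logit function assumes that "log-transformed probability" refers to the usual log-odds, which the quoted text does not state explicitly. All names and example values are hypothetical.

```python
import math

def classifier_accuracy(predicted, actual):
    """Correct test-set predictions divided by the total number of predictions."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

def logit(probability):
    """Log-odds of a trial-wise class probability (assumed transform)."""
    return math.log(probability / (1.0 - probability))

# Hypothetical test-set labels for a scene-versus-object classifier
predicted = ["scene", "object", "scene", "object"]
actual = ["scene", "object", "object", "object"]
accuracy = classifier_accuracy(predicted, actual)
```

      With this transform, a probability of 0.5 maps to a logit of 0, and probabilities above 0.5 map to positive values.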

      (7) Please explain what the different colors represent in Figures 5B and 5C to avoid confusion. It would be good to indicate significant differences in the figures if applicable.

      We have now added line legends to the figure and expanded the caption to clarify what exactly is depicted. We also added asterisks to mark significant differences.

      References:

      Monfils, M. H., Cowansage, K. K., Klann, E., & LeDoux, J. E. (2009). Extinction-reconsolidation boundaries: key to persistent attenuation of fear memories. Science, 324(5929), 951-955.

      Monfils, M. H., & Holmes, E. A. (2018). Memory boundaries: opening a window inspired by reconsolidation to treat anxiety, trauma-related, and addiction disorders. The Lancet Psychiatry, 5(12), 1032-1042.

      Lee, J. L. C., Nader, K., & Schiller, D. (2017). An Update on Memory Reconsolidation Updating. Trends in Cognitive Sciences, 21, 531-545.

      Radley, J. J., Williams, B., & Sawchenko, P. E. (2008). Noradrenergic innervation of the dorsal medial prefrontal cortex modulates hypothalamo-pituitary-adrenal responses to acute emotional stress. Journal of Neuroscience, 28(22), 5806-5816.

      Heinbockel, H., Wagner, A. D., & Schwabe, L. (2024). Post-retrieval stress impairs subsequent memory depending on hippocampal memory trace reinstatement during reactivation. Science Advances, 10(18), eadm7504.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      We thank the reviewers for their general comment and for the critical evaluation of our analyses and results interpretation. Their comments greatly helped us to improve the manuscript.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: An analysis of an Arabidopsis VPS13 presumed lipid transporter is provided. The analysis pretty much follows similar studies done on yeast and human homologs. Key findings are the identification of multiple products from the locus due to differential splicing, analysis of lipid binding and transport properties, subcellular location, tissue-specific promoter activity, mutant analysis suggesting a role in lipid remodeling following phosphate deprivation, but no physiological or growth defects of the mutants. Major points: The paper is generally well written and documented, the experiments are well conducted and follow established protocols. The following major points should be considered:

      1. There are complementary lipid binding assays that should be considered such as liposome binding assays, or lipid/western dot blots. All of these might give slightly different results and may inform a consensus. Of course, non-membrane lipids such as TAG cannot be tested in a liposome assay.

      Concerning lipid transfer proteins (LTPs), it is important to differentiate the lipid binding capacity related to the transport specificity (which lipids are transported by a LTP?) from the lipid binding capacity linked to the targeting of a LTP to a specific membrane (a LTP can bind a specific lipid via a domain distinct from the lipid transfer domain to be targeted in cells, but will not transport this lipid). Both aspects are of high interest to be determined. Our goal here was to focus on the identification of the lipids bound to AtVPS13M1 and to be likely transported, which is why we used a truncation (1-335) corresponding to the N-term part of the hydrophobic tunnel. Liposome binding assays and lipid dot blots are necessary to answer the question of the membrane binding capacity of the protein. We think that this aspect is out of the scope of the current article as it will require to express and purify other AtVPS13M1 domains that are known to bind lipids such as the two PH domains and the C2. This will be the scope of future investigations in our lab.

      Similarly, lipid transfer based only on fluorophore-labeled lipids may be misleading because the fluorophore could affect binding. It is mentioned that the protein in this assay is tethered by 3xHis to the liposomes. Unless I am missing something, I do not understand how that should work. This needs to be better explained.

      We fully agree with Reviewer 1 that the presence of a fluorophore could affect lipid binding to the protein. In this assay, lipids are labeled on their polar head, and it is therefore difficult to draw conclusions about the transport specificity of our protein. This assay is used as a qualitative assay to show that AtVPS13M1(1-335) is able to transfer lipids in vitro. In the manuscript, we did not draw any conclusion about transport specificity from this assay, but rather used the binding assay to assess the binding, and likely transport, specificity of AtVPS13M1. The FRET-based assay is well accepted in the lipid transfer community as an easy way to probe lipid transport in vitro and has been used in the past to assess the transfer capacity of different proteins, including VPS13 proteins (for examples, see (Kumar et al., 2018; Hanna et al., 2022; Valverde et al., 2019)).

      To transfer lipids from one liposome to another, both liposomes have to be in close proximity. Therefore, we attached our protein to the donor liposomes, to favor the transport of the fluorescent lipids from the donor to the acceptor liposomes. Then, we progressively increased the acceptor liposome concentration to favor liposome proximity and the chance of lipid transfer. We added a scheme to Figure 3B of the revised version of the manuscript to clarify the principle of the assay. In addition, we provided further control experiments suggested by Reviewers 2 and 3 showing that the fluorescence signal intensity depends on AtVPS13M1(1-335) protein concentration and that no fluorescence increase is measured with a control protein (Tom20.3) (see Figure 3C-D of the revised manuscript).

      The in vivo lipid binding assay could be obscured by the fact that the protein was produced in insect cells and lipid binding occurs during production. What is the evidence that added plant calli lipids can replace lipids already present during isolation?

      Actually, we do not know whether the insect cell lipids initially bound to AtVPS13M1(1-335) are replaced by calli lipids or whether the calli lipids bind to lipid binding sites that are still available on the protein. But we have two main lines of evidence showing that our purified protein can bind plant lipids even in the presence of insect cell lipids: 1) our protein can bind SQDG and MGDG, two plant-specific lipids, and 2) as explained p.8 (lines 243-254), lipids coming from the two organisms have a specific acyl-chain composition, with insect cell fatty acids mainly composed of C16 and C18 with 0 or 1 unsaturation, whereas plant lipids can have up to 3 unsaturations. By analyzing and presenting on the histograms lipid species from insect cells, calli and those bound to AtVPS13M1(1-335), we were able to conclude that for all the lipid classes besides PS, a wide range of lipid species deriving from both organisms was bound to our protein. The data about the lipid species bound to AtVPS13M1(1-335) are presented in Figure 2E and S2.

      The effects on lipid composition of the mutants are not very drastic from what I can tell. Furthermore, how does this fit with the lipid composition of mitochondria where the protein appears to be mostly located?

      It is true that lipid composition variations in the mutants are not drastic but still statistically significant. As a general point in the field of lipid transfer, it is not very common to have major changes in total lipidome on single mutants of lipid transfer proteins because of a high redundancy of lipid transport pathway in cells. This is particularly true for VPS13 proteins, as exemplified by multiple studies. Major lipid phenotypes can be revealed in specific conditions, such as phosphate starvation in our case, or when looking at specific organelles or specific tissues and/or developmental stages. This is explained and illustrated by examples in the discussion part p. 16 (line 526-532). In addition, as suggested by Reviewer 3, we performed further lipid analysis on calli and also on rosettes under Pi starvation and found a similar trend (Figure 4 and S4 of the revised version of the manuscript). Thus, we believe that, even if not drastic, these variations during Pi starvation are a real phenotype of our mutants.

      As we found that our protein is located at the mitochondrial surface, we agree that Reviewer 1’s suggestion to perform lipidomic analyses on isolated mitochondria is of high interest, but this will be the scope of future studies in our lab. First, we would like to identify all the organelles at which AtVPS13M1 is localized before performing subfractionations of these different organelles from the same pool of cell cultures grown in the presence or absence of phosphate.

      For the localization of the fusion protein, has it been tested whether the fusion is functional? This should be tested (e.g. by reversion of lipid composition).

      As we did not observe major developmental phenotypes in our mutants, complementation should be indeed tested by performing lipidomic analyses in calli or plants grown in presence or absence of Pi, which is a time-consuming and expensive experiment. Because we used the fusions mainly for tissue expression study and subcellular localization and not for functional analyses, we believe that this is not an essential control to be performed for this work.

      It is speculated that different splice forms are located to different compartments. Can that be tested and used to explain the observed subcellular location patterns?

      Indeed, some splice forms can modify the sequence of domains putatively involved in protein localization. This could be tested by producing synthetic constructs with one specific exon organization, which is challenging given the size of the AtVPS13M1 cDNA (around 12 kb). In addition, our long-read sequencing experiment and PCR analyses revealed the existence of six transcripts, a major one representing around 92% and the five others each representing less than 2.5% (Figure 1D). Among the five less abundant transcripts, four produce proteins with a premature stop codon and are likely to arise from splicing defects, as explained in the discussion part p. 15 (lines 488-496). One produces a full-length protein with an additional loop in the VAB domain, but because of the low abundance of this alternative transcript (1.4%), we believe it does not contribute significantly to the major localization we observed in plants and we did not attempt to analyze its localization.

      GUS fusion data only probe promoter activity but not all levels of gene expression. That caveat should be discussed.

      We are aware of this drawback, which is why we fused the GUS enzyme directly to our protein expressed from its native locus (i.e. with the endogenous promoter and exons/introns), as depicted in Figure 5A. Our construct therefore allows a direct assessment of AtVPS13M1 protein level in plant tissues.

      Minor points: 1. Extraplastidic DGDG and export from chloroplasts following phosphate deprivation was first reported in PMID: 10973486.

      We added this reference in the text.

      Check throughout the correct usage of gene expression as genes are expressed and proteins produced.

      Many thanks for this remark; we modified the text accordingly.

      In general, the paper is too long. Redundancies between introduction, results and discussion should be removed to streamline.

      We reduced the text to avoid redundancy.

      I suggest to redraw the excel graphs to increase line thickness and enlarge font size to increase presentation and readability.

      We enlarged the graphs and font sizes as much as possible to improve readability.

      Reviewer #1 (Significance (Required)):

      Significance: Interorganellar lipid trafficking is an important topic and especially understudied in plants. Identifying components involved represents significant progress in the field. Similarly, lipid remodeling following phosphate deprivation is an important phenomenon and the current work advances our understanding.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: The manuscript "AtVPS13M1 is involved in lipid remodelling in low phosphate and is located at the mitochondria surface in plants" by Leterme et al. identifies the protein VPS13M1 as a lipid transporter in Arabidopsis thaliana with important functions during phosphate starvation. The researchers were able to localise this protein to mitochondria via GFP-targeting in Arabidopsis. Although VPS13 proteins are well described in yeast and mammals, highlighting their importance in many vital cellular processes, there is very little information on them in plants. This manuscript provides new insights into plant VPS13 proteins and contributes to a better understanding of these proteins and their role in abiotic stress responses, such as phosphate starvation.

      Major points: - Please describe and define the domains of the VPS13M1 protein in detail, providing also a figure for that. Figure 1 is mainly describing possible splice variants, whereas the characteristics of the protein are missing.

      We have added information on AtVPS13M1 domain organization in the introduction (p.4, lines 103-109) and referred to Figure 1A, which describes the protein domain organization. We did not add much detail, as the domain organization of plant VPS13 proteins was extensively described in two previous studies cited several times in the manuscript (Leterme et al., 2023; Levine, 2022).

      • Please compare the expression level of VPS13M1 in the presence and in the absence of phosphate.

      Many thanks for this suggestion. We performed qRT-PCR analyses of AtVPS13M1 from mRNA extracted from calli grown six days in presence and absence of phosphate. The results obtained did not reveal variations in mRNA level. The results were added in Figure S1A of the revised version of the manuscript and discussed in p.5 (lines 154-156).

      • Page 9, second paragraph: Here, the lipid transport capability of AtVPS13M1 is described. Varying concentrations of this recombinant protein should be used in this test. Further, it is not highlighted that a truncated version of VPS13M1 is able to transport lipids. This is surprising, since this truncated version is less than 10% of the total protein (only aa 1-335).

      We agree with reviewer 2 that increasing protein concentration is an important control to perform. We included an experiment with an increasing quantity of protein (2X and 4X) in the revised version of the manuscript and showed that the signal intensity increased faster when protein concentration is higher (Figure 3D of the revised manuscript). As requested by Reviewer 3, we also included a negative control with Tom20.3 to show that the signal increase after the addition of AtVPS13M1(1-335) is specific to this protein (Figure 3C of the revised manuscript).

      The transport ability of the N-terminal part of VPS13 was demonstrated for yeast VPS13 and mammalian VPS13D (Kumar et al., 2018; Wang et al., 2021). We highlighted this p. 7 (lines 213-218) of the revised version of the manuscript. This is explained by the inherent structure of VPS13 proteins, which are composed of several repeats of the same domain type called RBG (for repeating β-groove), each forming a β-sheet with a hydrophobic surface. The higher the number of RBG repeats, the longer the hydrophobic tunnel. The (1-335) N-terminal region corresponds to two RBG unit repeats forming a “small” tunnel able to bind and transfer lipids. The number of RBG repeats influences the quantity of lipids bound per protein in vitro: the longer the protein, the higher the number of lipid molecules bound (Kumar et al., 2018). To the best of our knowledge, however, the effect of protein length on in vitro lipid transfer capacity has not yet been investigated.

      • Also, for phenotype analysis, T-DNA insertion mutants are used that still contain VPS13M1 transcripts. Although protein fragments were not detected by proteomic analysis, this might be due to low sensitivity of the proteomic assay. Further, the lipid transport domain of VPS13M1 (aa 1-335) might not be affected by the T-DNA insertions at all. Here, more detailed analysis needs to be done to prove that loss of protein function indeed occurs in the mutants.

      We do not have methods other than proteomics to test whether our mutants are KO or not. We tried unsuccessfully to produce antibodies. Mass spectrometry is the most sensitive method, but the absence of detection indeed does not mean the absence of the protein. From the proteomic data, we can conclude that our mutants at least present a decrease in AtVPS13M1 protein level; we therefore called them “knock down” in the revised version of the manuscript and added the following sentence p. 9 (lines 297-300): “As the absence of detection of a protein by mass spectrometry-based proteomics does not allow us to strictly claim the absence of this protein in the sample, we concluded that AtVPS13M1 expression in both atvps13m1-1 and atvps13m1-4 was below the detection limit and consider them as knock down (KD) for AtVPS13M1.”

      • Localisation in mitochondria: As the Yepet signal is very weak, a control image of not transfected plant tissue needs to be included. Otherwise, it might be hard to distinguish the Yepet signal from background signal. The localisation data presented in Figure 5 does not allow the conclusion that VPS13M1 is localized at the surface of mitochondria as stated in the title. It only indicates (provided respective controls see above) that VPS13M1 is in mitochondria. Please provide more detailed analysis such as targeting to tobacco protoplasts, immunoblots or in vitro protein import assays. Also test +Pi vs. -Pi to see if VPS13M1 localisation is altered in dependence of Pi.

      Indeed, our Yepet signal is not very strong, but in the experiments we performed on Col0 non-transformed plants, we rarely saw background fluorescence in the leaves’ vascular tissue, which is why we focused our study on this tissue. We sometimes observed background signals in some cells, but these are clearly different from AtVPS13M1-3xYepet signals and never co-localized with mitochondria. Examples of these nonspecific signals are presented in Figure S6E of the revised version of the manuscript.

      We agree with reviewer 2 that our confocal images suggested, but did not demonstrate, a localization at the surface of mitochondria. To confirm the localization, we generated calli cell cultures from AtVPS13M1-3xYepet lines and performed subcellular fractionations and western blot analyses, confirming that AtVPS13M1 was indeed enriched in mitochondria and also in microsomal fractions (Figure 6G of the revised version). We then performed mild proteolytic digestion of the isolated mitochondria with thermolysin and showed that AtVPS13M1 was degraded, like the outer membrane protein Tom20.3 but unlike the inner membrane protein AtMic60, showing that AtVPS13M1 is indeed at the surface of mitochondria (Figure 5H of the revised manuscript). We believe that this experiment, in addition to the confocal images showing a signal around mitochondria, convincingly demonstrates that AtVPS13M1 is located at the surface of mitochondria.

      The localization of AtVPS13M1 under Pi starvation is a very important question that we tried to investigate without success. We intended to perform confocal imaging on seedlings grown in liquid media, in which Pi starvation is easy to apply, as described for the analysis of AtVPS13M1 tissue expression with β-glucuronidase constructs. However, the level of background fluorescence was very high in seedlings and no clear differences between non-transformed and AtVPS13M1-3xYepet lines were observed, even in root tips, where the protein is supposed to be the most highly expressed according to β-glucuronidase assays. Examples of the images obtained are presented in Figure R1. We concluded that the level of expression of our construct was too low in seedlings. The construction of lines with a higher AtVPS13M1 expression level, by changing the promoter, to better analyze AtVPS13M1 in different tissues or in response to Pi starvation will be the scope of future work in our laboratory.

      Phenotype analysis needs to be done under Pi stress and not under cold stress! Further, root architecture and root growth should also be done under Pi depletion. Here the title is also misleading, it is not at all clear why the authors switch from phosphate starvation to cold stress.

      In the revised version of the manuscript, we analyzed seedling root growth of two mutants (atvps13m1-3 and m1-4) under low Pi and did not notice significant differences (Figure 7E, S7D of the revised version). We had analyzed growth under cold stress because this stress also promotes lipid remodeling, but we agree that it goes beyond the scope of this article, which is focused on Pi starvation, and we removed this part from the revised manuscript.

      Minor points: Page 3, line 1: what does the abbreviation VPS stand for?

      The definition of VPS (Vacuolar Protein Sorting) was added.

      Page 3, line 1: change "amino acids residues" to "amino acid residues"

      This was done.

      Page 3, line 8 - 12: please rewrite this sentence. You write, that because of their distribution VPS13 proteins do exhibit many important physiological roles. The opposite is true: They are widely distributed in the cell because of their involvement in many physiological processes.

      We changed the sentence to “ VPS13 proteins localize to a wide variety of membranes and membrane contact sites (MCSs) in yeast and human (Dziurdzik and Conibear, 2021). This broad distribution on different organelles and MCSs is important to sustain their important roles in numerous cellular and organellar processes such as meiosis and sporulation, maintenance of actin skeleton and cell morphology, mitochondrial function, regulation of cellular phosphatidylinositol phosphates level and biogenesis of autophagosome and acrosome (Dziurdzik and Conibear, 2021; Hanna et al., 2023; Leonzino et al., 2021).”

      Page 6, line 6: change "cDNA obtained from A. thaliana" to "cDNA generated from A. thaliana".

      This was done.

      Page 6, line 10: change "7.6kb" to "7.6 kb"

      This was done.

      Page 7: address this question: can the isoforms form functional VPS13 proteins? This might help to postulate whether these isoforms are a result of defective splicing events.

      We addressed this aspect in the discussion p.15 at lines 486-502.

      Figure 2 B: Change "AtVPS13M1"to "AtVPS13M1(1-335)"

      This was done.

      Figure 2, legend: -put a blank before µM in each case.

      This was done.

      -Change 0,125µM to 0.125 µM

      This was done.

      -what does "in absence (A-0µM)" mean?

      This means that the Acceptor liposomes are at 0 µM. To clarify, we changed it to “Acceptor 0 µM” in the revised version of the manuscript (Figure 3C).

      -Which statistical analysis was employed?

      We performed a non-parametric Mann-Whitney test in the revised version of the manuscript. This was indicated in the legend.

      -Further, rewrite the sentence "Mass spectrometry (MS) analysis of lipids bound to AtVPS13M1(1-335) or Tom20 (negative control) after incubation with calli total lipids. Results are expresses in nmol of lipids per nmol of proteins (C) or in mol% (D)". -"C" and "D" are not directly comparable, as in "D" no Tom20 was used and in "C" no insect cells were used.

      -Further, in "D" the experimental setup is not clear. AtVPS13M1(1-335) is supposed to be purified protein after incubation with calli lipids (figure 2, A). Further, in the same figure, lipid composition of "insect cells" and "calli-Pi" are compared → why? Please clarify this.

      C and D are two different representations of the same results, providing different types of information. In C, the results are expressed in nmol of lipids / nmol of proteins to assess 1) whether the level of lipids found in AtVPS13M1(1-335) purifications is significantly higher than what we can expect from the background (assessed using Tom20) and 2) which classes of lipids do or do not associate with AtVPS13M1(1-335). In D, the lipid distribution in mol% is presented for AtVPS13M1(1-335) as well as for total extracts from calli and insect cells, to show whether a lipid class is particularly enriched or depleted in AtVPS13M1(1-335) purifications compared to the initial extracts with which the protein was incubated. As an example, it shows that the absence of DGDG in the AtVPS13M1(1-335) purifications is not linked to a low level of DGDG in the calli extract, where it represented around 15 mol%, but likely to a weak affinity of the protein for this lipid. We did not represent the Tom20 lipid distribution on this graph because it reflects background lipid binding to the purification column, and showing it might suggest that Tom20 binds lipids. We changed the legend as follows and hope that it is clearer now: “C-D. Mass spectrometry (MS) analysis of lipids bound to AtVPS13M1(1-335) or Tom20 (negative control) after incubation with calli total lipids and repurification. Results are expressed in nmol of lipids per nmol of proteins in order to analyze the absolute quantity of the different lipid classes bound to AtVPS13M1(1-335) compared to Tom20 negative control (C), and in mol% to assess the global distribution of lipid classes in AtVPS13M1(1-335) purifications compared to the total lipid extract of insect cells and calli (D).”
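
      The difference between the two representations can be made concrete with a small sketch. It only illustrates the arithmetic behind "nmol of lipids per nmol of proteins" and "mol%"; the function names and all quantities are hypothetical and are not data from the study.

```python
def lipid_per_protein(lipid_nmol, protein_nmol):
    """Absolute loading: nmol of each lipid class per nmol of protein (as in panel C)."""
    return {cls: amount / protein_nmol for cls, amount in lipid_nmol.items()}

def mol_percent(lipid_nmol):
    """Relative distribution: share of each lipid class in the total lipids (as in panel D)."""
    total = sum(lipid_nmol.values())
    return {cls: 100.0 * amount / total for cls, amount in lipid_nmol.items()}

# Hypothetical amounts (nmol) of lipid classes recovered with 5 nmol of protein
bound = {"PC": 6.0, "PE": 3.0, "PS": 1.0}
loading = lipid_per_protein(bound, protein_nmol=5.0)
distribution = mol_percent(bound)
```

      The first quantity can be compared against a background control, whereas the second is normalized to the sample's own total and is therefore suited to comparing class distributions between a purification and the starting extracts.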

      Figure 3: -t-test requires a normal distribution of the data. This is not possible for an n=3. Please use an adequate analysis.

      We performed more replicates and used non-parametric Mann-Whitney analyses in the revised version of the manuscript.
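
      When a t-test is replaced by a Mann-Whitney test on small samples, the smallest attainable p-value is bounded by the group sizes. The sketch below computes the exact two-sided p-value by enumerating every possible rank assignment. The function name and the data are illustrative assumptions, not taken from the manuscript, and the sketch assumes there are no tied values.

```python
from itertools import combinations


def mann_whitney_exact(x, y):
    """Exact two-sided Mann-Whitney U test by full enumeration (no ties allowed)."""
    pooled = sorted(list(x) + list(y))
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch assumes no tied values")
    n_x, n_y = len(x), len(y)
    n = n_x + n_y
    # Observed U for x: number of (x_i, y_j) pairs with x_i > y_j.
    u_obs = sum(1 for a in x for b in y if a > b)
    mean_u = n_x * n_y / 2
    # Under the null hypothesis, every choice of n_x rank positions is equally likely.
    extreme = 0
    total = 0
    for idx in combinations(range(n), n_x):
        chosen = set(idx)
        u = sum(1 for i in idx for j in range(n) if j not in chosen and i > j)
        total += 1
        if abs(u - mean_u) >= abs(u_obs - mean_u):
            extreme += 1
    return u_obs, extreme / total


# Completely separated hypothetical groups with 3 and with 5 replicates each.
print(mann_whitney_exact([1.2, 1.5, 1.9], [2.4, 2.8, 3.1]))
print(mann_whitney_exact([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]))
```

      With three values per group, even complete separation gives p = 2/20 = 0.1, so significance at the 0.05 level cannot be reached; with five values per group the minimum is 2/252, roughly 0.008. This is one reason additional replicates matter for a rank-based test.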

      -Please clarify the meaning of the letters on the top of the bars in the legend.

      This corresponded to the significance of t-tests performed in the first version of the manuscript that were reported in Table S3. As in the new version we performed Mann-Whitney tests, we highlighted the significance by stars and in the figure legends.

      Please, make it clear that two figures belong to C.

      This was clarified in the legend.

      -Reorganise the order of figure 3 (A→B→C→D)

      Because of the configuration of the different histograms presented in the figure, we were not able to change the order, but we believe that the graphs can be easily read this way.

      Page 10, 3. Paragraph: since the finding, that no peptides were found in the VPS13M1 ko lines, although transcription was not altered, is surprising, please include the proteomic data in the supplement

      Proteomic data were deposited on PRIDE with the identifier PXD052019. They will remain not publicly accessible until the acceptance of the manuscript.

      Page 11, line 17: The in vitro experiments showed a low affinity of VPS13M1 towards galactolipids. It is further claimed that this is consistent with the finding of the AtVPS13M1 Ko line in vivo, that in absence of PI, no change in DGDG content could be observed. However, the "absence" of VPS13M1 in vivo might still result in a bigger VPS13M1 protein, than the truncated form (1-335) used for the in vitro experiments

      It is true that our in vitro experiments were performed only with a portion of AtVPS13M1 and that the length of the protein could influence protein binding specificity. We removed this assessment from the manuscript.

      Page 13, lane 8: you should reconsider the use of a triple Yepet tag: If two or more identical fluorescent molecules are in close proximity, their fluorescence emission is quenched, which results in a weak signal (as the one that you obtained). See: Zhuang et al. 2000 (PNAS) Fluorescence quenching: A tool for single-molecule protein-folding study

      Many thanks for pointing out this paper. We used a triple Yepet because AtVPS13M1 has a very low level of expression and because this strategy was used successfully to visualize proteins for which the signal was below the detection level with a single GFP (Zhou et al., 2011). The quenching of the 3xYepet might also depend on the conformation they adopt on the target protein.

      Page 13, line 14: change 1µm to 1 µm

      This was done.

      Page 13, line 29: please reduce the sentence to the first part: if A does not colocalize with B, it is not necessary to mention that B does not colocalise with A.

      The sentence was modified accordingly.

      Page 14, 2. Paragraph: it is not conclusive that phenotype analysis is suddenly conducted with plants under cold stress, since everything was about Pi-starvation and the role of VPS13M1. Lipid remodelling under Pi stress completely differs from the lipid remodelling under cold stress.

      We eliminated this part in the revised version of the manuscript.

      Page 14, line 20: change figure to Figure

      This was done.

      Page 07, line 17: change artifact to artefact

      This was done.

      Reviewer #2 (Significance (Required)):

      General assessment: The paper is well written and technically sound. However, some points could be identified, that definitely need a revision. Overall, we got the impression that so far, the data gathered are still quite preliminary and need some more detailed investigations prior to publication (see major points).

      Advance: The study definitely fills a gap of knowledge since not much is known on the function of plant VPS13 proteins so far.

      Audience: The study is of very high interest to the plant lipid community but as well of general interest for Plant Molecular Biology and intracellular transport.

      Our expertise: Plant membrane transport and lipid homeostasis.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Leterme et al. (2024) describes the characterization of VPS13M1 from Arabidopsis. VPS13 proteins have been analyzed in yeast and animals, where they establish lipid transfer connections between organelles, but not much is known about VPS13 proteins in plants. First, different splicing forms were characterized, and the form A was identified as the most relevant one with 92% of the transcripts. The protein (just N-terminal 335 amino acids out of ca. 3000 amino acids) was expressed in insect cells and purified. Next, the protein was used for lipid binding assays with NBD-labeled lipids followed by analysis in polyacrylamide gel electrophoresis. VPS13M1 bound to PC, PE, PS and PA. Then, the protein from insect cells was incubated with Arabidopsis callus lipids, and lipids bound to VPS13M1 analyzed by LC-MS/MS. Lipid transfer between liposomes was measured by the change in fluorescence in donor liposomes derived from two labeled lipids after addition of the protein caused by lipid transfer and dilution to acceptor liposomes. T-DNA insertion mutants were isolated and the lipids measured in callus derived from these mutants. Protein localization in different plant organs was recorded with a GUS fusion construct transferred into transgenic plants. The protein was localized to mitochondria using a VPS13M1-Yepet fusion construct transferred into mutant plants. The mutant plants show no visible difference to wild type, even when the plants were grown under stress conditions like low temperature. The main message of the title is that VPS13M1 localizes to the mitochondria which is well documented, and it is involved in lipid remodeling under low phosphate conditions.

      The lipid transfer assay shown in Figure 2F lacks a negative control. This would be the experiment with donor and acceptor liposomes in the presence of another protein like Tom20.

      Many thanks for this suggestion. In the revised version of the manuscript, we performed a fluorescent lipid transport assay with Tom20.3 in the presence of 25 µM of donor liposomes and 1.5 mM of acceptor liposomes, the condition for which we observed a maximum of transport for AtVPS13M1(1-335). As expected, no fluorescence increase was observed. The results are presented in the Figure 3C of the revised manuscript.

      The lipid data (Fig. 3 and Fig. S4) do not sufficiently support the second claim, i.e. that the protein is involved in lipid remodeling under low P. Data in Fig. 3C are derived from only 3 replicates and in Fig. S4 from only 2 replicates with considerable error bars. Having only 2 replicates is definitely not sufficient. Fig. 3C shows a suppression in the decrease in PE and PC at 4 d of P deprivation (significant for two mutants for PE, for only one for PC). Fig. S4A shows suppression of the decrease in PC at 6 d after P deprivation (significant for both mutants), but no significant effect on PE. Fig. S4B shows no significant change in PE or PC at -P after 8 d of P deprivation. The data are not consistent. There are also problems with the statistics in Fig. 3 and Fig. S4. The authors used T-test, but place letters a, b, c on top of the bars. Usually, asterisks should be used to indicate significant differences. Data indicate medians and ranges, not mean and SD. In Fig. S4, how can you indicate median and range if you have only 2 replicates? Why did the authors use callus for lipid measurements? Why not use leaves and root tissues? What does adjusted nmol mean? What does the dashed line at 1.05 on the y axis mean? Taken together, I suggest to repeat lipid measurements with leaves and roots from plantlets grown under +P and -P conditions in tissue culture with 5 replicates. Significant differences can be analyzed on the level of absolute (nmol per mg FW/DW) or relative (%) amounts.

      Here are our answers to concerns about the design of our lipidomics experiments:

      We used calli for lipid measurement because it is very easy to control growth conditions and to perform phosphate starvation on these cell cultures. The second reason is that it is a non-photosynthetic tissue with a high level of phospholipids and a low level of galactoglycerolipids, and it is easier to monitor the modification of the phospholipid/galactoglycerolipid balance in this system. The lipid analysis on calli at 4 days of growth in presence or absence of Pi was performed on 3 biological replicates but on two different mutants (atvps13m1-1 and m1-3), and we drew our conclusions based on variations that were significant for both mutants. In the revised version of the manuscript, we performed further lipidomic analyses on calli from Col0 and another mutant (atvps13m1-2) after 6 days of growth in presence or absence of Pi (Figure 4E, S4A-C, n=4-5) and added new data on a photosynthetic tissue (rosettes) from Col0 and the atvps13m1-3 mutant. For rosette analysis, seeds were germinated for 4 days on plates with 1 mM Pi and then transferred to plates with 1 mM or 5 µM of Pi. Rosettes were harvested and lipids analyzed after 6 days (Figure 4F-G, S4D, n=4-5). All the data are represented with medians and ranges because we believe that the median is less sensitive to extreme values than the mean and might better represent what is occurring. Ranges highlight the minimal and maximal values of the data analyzed, and we believe they give a representative view of the variability we obtained between biological samples.

      Lipid measurements are done by mass spectrometry. As already reported, mass spectrometry quantification is not trivial, as the intensity of the response depends on the nature of the molecule (for a review, see (Jouhet et al., 2024)). To counteract this ionisation problem, we developed a method with an external standard that we called Quantified Control (QC), corresponding to an A. thaliana callus lipid extract for which the precise lipid composition was determined by TLC and GC-FID. All our MS signals were “adjusted” to the signal of this QC as described in (Jouhet et al., 2017). Therefore our lipid measurements are in adjusted nmol. In Materials and Methods we modified the sentence accordingly, p22 lines 720-723: “Lipid amounts (pmol) were adjusted for response differences between internal standards and endogenous lipids and by comparison with a quality control (QC).” This allows us to represent all the lipid classes on the same graph and to have an estimation of the lipid class distribution. To assess the significance of our results, we used non-parametric Mann-Whitney tests in the revised version of the manuscript and added stars representing the p-value on charts. This was indicated in the figure legends.

      Here are our answers to concerns about the interpretation of our lipidomics experiments:

      To summarize, in the revised version of the manuscript, lipid analyses were performed in calli from 3 different mutants (two at day 4, one at day 6) and in the rosettes from one of these mutants. All the results are presented in Figure 4 and S4. In all the experiments, we found that in +Pi, there are no major modifications in the lipid content or composition. In –Pi, we found that the total glycerolipid content is always higher in the mutant compared to Col0, whatever the tissue or mutant considered (Figure 4A and S4A, D). In calli, this higher lipid content is mainly due to an accumulation of phospholipids, and in rosettes, of galactolipids. Because of high variability between our biological replicates, we did not always find significant differences in the absolute amount of lipids in –Pi. However, the analysis of the fold change in lipid content in –Pi vs +Pi always pointed toward a reduced extent of phospholipid degradation. We also added to these graphs the fold change for the total phospholipid and total galactolipid contents in the revised version of the manuscript. We believe that the new analyses we performed strengthen our conclusion about the role of AtVPS13M1 in phospholipid degradation and not in the recycling of precursor backbones to feed galactoglycerolipid synthesis at the chloroplast envelope.
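
      The fold-change comparison described above can be sketched as follows. The definition used here (ratio of the –Pi median to the +Pi median) is an assumption made for illustration, since the exact calculation is not spelled out in this response; the replicate values and function names are invented.

```python
from statistics import median


def summarize(values):
    """Median and range (min, max) of a set of replicate measurements."""
    return median(values), (min(values), max(values))


def fold_change(minus_pi, plus_pi):
    """Ratio of median lipid content in -Pi to +Pi (assumed definition)."""
    return median(minus_pi) / median(plus_pi)


# Invented phospholipid contents (adjusted nmol per mg), 4 replicates each.
wild_type = {"+Pi": [10.0, 11.0, 9.0, 10.0], "-Pi": [4.0, 5.0, 6.0, 5.0]}
mutant = {"+Pi": [10.0, 10.0, 11.0, 9.0], "-Pi": [7.0, 8.0, 6.0, 7.0]}

for name, data in (("wild type", wild_type), ("mutant", mutant)):
    print(name, summarize(data["-Pi"]), fold_change(data["-Pi"], data["+Pi"]))
```

      In this toy data set the mutant fold change is closer to 1 than the wild-type value, which is the pattern that would be read as a reduced extent of phospholipid degradation under –Pi.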

      Page 9, line 15: Please use the standard form of abbreviations of lipid molecular species with colon, e.g. PC32:0, not PC32-0

      The lipid species nomenclature has been changed accordingly.

      Page 11, line 4, (atvps13m1.1 and m1.3: please indicate the existence of mutant alleles with dashes, i.e. (atvps13m1-1 and atvps13m1-3

      Names of the mutants have been changed accordingly.

      Page 14, line 21: which line is indicated by atvps13m1.2-4? What does -4 indicate here?

      This indicates that mutants m1-2 to m1-4 were analyzed.

      Page 16, line 25: many abbreviations used here are very specific and not well known to the general audience e.g. ONT, IR, PTC, NMD etc. I think it is OK to mention them here, but still use the full terms, given that they are not used very frequently in the manuscript.

      We kept ONT abbreviation because it was cited many times in both the results and discussion part. IR, PTC and NMD were cited only in the discussion and were eliminated.

      Page 19, line 11. The authors cite Hsueh et al and Yang et al for LPTD1 playing a role in lipid homeostasis during P deficiency. But Yang et al. described the function of a SEC14 protein in Arabidopsis and rice during P deficiency. Is SEC14 related to LPTD1?

      Many thanks for noticing this mistake. We removed the citation Yang et al. in the revised version of the manuscript.

      Reference Tangpranomkorn et al. 2022: In the text, it says that this is a preprint, but in the Reference list, this is indicated with "Plant Biology" as Journal. In the internet, I could only find this manuscript in bioRxiv.

      This manuscript was accepted in “New Phytologist” in December 2024 and is now cited accordingly in the new version of the manuscript.

      Reviewer #3 (Significance (Required)):

      The manuscript by Leterme et al describes the characterization of the lipid binding and transport protein VPS13M1 from Arabidopsis. I think that the liposome assay needs to be done with a negative control. Furthermore, I have major concerns with the lipid data in Fig. 3C and Fig. S4. These lipid data of the current manuscript need to be redone. I do not agree that the lipid data allow the conclusion that "AtVPS13M1 is involved in lipid remodeling in low phosphate" as stated in the title.

      References cited in this document:

      Dziurdzik, S.K., and E. Conibear. 2021. The Vps13 Family of Lipid Transporters and Its Role at Membrane Contact Sites. Int J Mol Sci. 22:2905. doi:10.3390/ijms22062905.

      Hanna, M., A. Guillén-Samander, and P. De Camilli. 2023. RBG Motif Bridge-Like Lipid Transport Proteins: Structure, Functions, and Open Questions. Annu Rev Cell Dev Biol. 39:409–434. doi:10.1146/annurev-cellbio-120420-014634.

      Hanna, M.G., P.H. Suen, Y. Wu, K.M. Reinisch, and P. De Camilli. 2022. SHIP164 is a chorein motif lipid transfer protein that controls endosome–Golgi membrane traffic. Journal of Cell Biology. 221:e202111018. doi:10.1083/jcb.202111018.

      Jouhet, J., E. Alves, Y. Boutté, S. Darnet, F. Domergue, T. Durand, P. Fischer, L. Fouillen, M. Grube, J. Joubès, U. Kalnenieks, J.M. Kargul, I. Khozin-Goldberg, C. Leblanc, S. Letsiou, J. Lupette, G.V. Markov, I. Medina, T. Melo, P. Mojzeš, S. Momchilova, S. Mongrand, A.S.P. Moreira, B.B. Neves, C. Oger, F. Rey, S. Santaeufemia, H. Schaller, G. Schleyer, Z. Tietel, G. Zammit, C. Ziv, and R. Domingues. 2024. Plant and algal lipidomes: Analysis, composition, and their societal significance. Progress in Lipid Research. 96:101290. doi:10.1016/j.plipres.2024.101290.

      Jouhet, J., J. Lupette, O. Clerc, L. Magneschi, M. Bedhomme, S. Collin, S. Roy, E. Maréchal, and F. Rébeillé. 2017. LC-MS/MS versus TLC plus GC methods: Consistency of glycerolipid and fatty acid profiles in microalgae and higher plant cells and effect of a nitrogen starvation. PLoS ONE. 12:e0182423. doi:10.1371/journal.pone.0182423.

      Kumar, N., M. Leonzino, W. Hancock-Cerutti, F.A. Horenkamp, P. Li, J.A. Lees, H. Wheeler, K.M. Reinisch, and P. De Camilli. 2018. VPS13A and VPS13C are lipid transport proteins differentially localized at ER contact sites. J Cell Biol. 217:3625–3639. doi:10.1083/jcb.201807019.

      Leonzino, M., K.M. Reinisch, and P. De Camilli. 2021. Insights into VPS13 properties and function reveal a new mechanism of eukaryotic lipid transport. Biochimica et Biophysica Acta (BBA) - Molecular and Cell Biology of Lipids. 1866:159003. doi:10.1016/j.bbalip.2021.159003.

      Leterme, S., O. Bastien, R.A. Cigliano, A. Amato, and M. Michaud. 2023. Phylogenetic and Structural Analyses of VPS13 Proteins in Archaeplastida Reveal Their Complex Evolutionary History in Viridiplantae. Contact (Thousand Oaks). 6:1–23. doi:10.1177/25152564231211976.

      Levine, T.P. 2022. Sequence Analysis and Structural Predictions of Lipid Transfer Bridges in the Repeating Beta Groove (RBG) Superfamily Reveal Past and Present Domain Variations Affecting Form, Function and Interactions of VPS13, ATG2, SHIP164, Hobbit and Tweek. Contact. 5:251525642211343. doi:10.1177/25152564221134328.

      Valverde, D.P., S. Yu, V. Boggavarapu, N. Kumar, J.A. Lees, T. Walz, K.M. Reinisch, and T.J. Melia. 2019. ATG2 transports lipids to promote autophagosome biogenesis. J Cell Biol. 218:1787–1798. doi:10.1083/jcb.201811139.

      Wang, J., N. Fang, J. Xiong, Y. Du, Y. Cao, and W.-K. Ji. 2021. An ESCRT-dependent step in fatty acid transfer from lipid droplets to mitochondria through VPS13D−TSG101 interactions. Nat Commun. 12:1252. doi:10.1038/s41467-021-21525-5.

      Zhou, R., L.M. Benavente, A.N. Stepanova, and J.M. Alonso. 2011. A recombineering-based gene tagging system for Arabidopsis. Plant J. 66:712–723. doi:10.1111/j.1365-313X.2011.04524.x.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public review):

      Previous experimental studies demonstrated that membrane association drives avidity for several potent broadly HIV-neutralizing antibodies and its loss dramatically reduces neutralization. In this study, the authors present a tour de force analysis of molecular dynamics (MD) simulations that demonstrate how several HIV-neutralizing membrane-proximal external region (MPER)-targeting antibodies associate with a model lipid bilayer.

      First, the authors compared how three MPER antibodies, 4E10, PGZL1, and 10E8, associated with model membranes, constructed with two lipid compositions similar to native viral membranes. They found that the related antibodies 4E10 and PGZL1 strongly associate with a phospholipid near heavy chain loop 1, consistent with prior crystallographic studies. They also discovered that a previously unappreciated framework region between loops 2-3 in the 4E10/PGZL1 heavy chain contributes to membrane association. Simulations of 10E8, an antibody from a different lineage, revealed several differences from published X-ray structures. Namely, a phosphatidylcholine binding site was offset and includes significant interaction with a nearby framework region. The revised manuscript demonstrates that these lipid interactions are robust to alterations in membrane composition and rigidity. However, it does not address the reverse: that phospholipids known experimentally not to associate with these antibodies (if any such lipids exist) also fail to interact in MD simulations.

      Next, the authors simulate another MPER-targeting antibody, LN01, with a model HIV membrane either containing or missing an MPER antigen fragment within. Of note, LN01 inserts more deeply into the membrane when the MPER antigen is present, supporting an energy balance between the lowest energy conformations of LN01, MPER, and the complex. These simulations recapitulate lipid binding interactions solved in published crystallographic studies but also lead to the discovery of a novel lipid binding site the authors term the "Loading Site", which could guide future experiments with this antibody.

      The authors next established coarse-grained (CG) MD simulations of the various antibodies with model membranes to study membrane embedding. These simulations facilitated greater sampling of different initial antibody geometries relative to membrane. These CG simulations, which cannot resolve atomistic interactions, are nonetheless compelling because negative controls (ab 13h11, BSA) that should not associate with membrane indeed sample significantly less membrane.

      Distinct geometries derived from CG simulations were then used to initialize all-atom MD simulations to study insertion in finer detail (e.g., phospholipid association), which largely recapitulate their earlier results, albeit with more unbiased sampling. The multiscale model of an initial CG study with broad geometric sampling, followed by all-atom MD, provides a generalized framework for such simulations.

      Finally, the authors construct velocity pulling simulations to estimate the energetics of antibody membrane embedding. Using the multiscale modelling workflow to achieve greater geometric sampling, they demonstrate that their model reliably predicts lower association energetics for known mutations in 4E10 that disrupt lipid binding. However, the model does have limitations, namely in its ability to predict more subtle changes along a lineage: intermediate mutations that reduce lipid binding are indistinguishable from mutations that completely ablate lipid association. Thus, while large/binary differences in lipid affinity might be predictable, the use of this method as a generative model is likely more limited.

      The MD simulations conducted throughout are rigorous and the analyses are extensive, creative, and biologically inspired. Overall, these analyses provide an important mechanistic characterization of how broadly neutralizing antibodies associate with lipids proximal to membrane-associated epitopes to drive neutralization.

      Reviewer #2 (Public review):

      In this study, Maillie et al. have carried out a set of multiscale molecular dynamics simulations to investigate the interactions between the viral membrane and four broadly neutralizing antibodies that target the membrane proximal exposed region (MPER) of the HIV-1 envelope trimer. The simulation recapitulated in several cases the binding sites of lipid head groups that were observed experimentally by X-ray crystallography, as well as some new binding sites. These binding sites were further validated using a structural bioinformatics approach. Finally, steered molecular dynamics was used to measure the binding strength between the membrane and variants of the 4E10 and PGZL1 antibodies.

      The use of multiscale MD simulations allows for a detailed exploration of the system at different time and length scales. The combination of MD simulations and structural bioinformatics provides a comprehensive approach to validate the identified binding sites. Finally, the steered MD simulations offer quantitative insights into the binding strength between the membrane and bnAbs.

      While the simulations and analyses provide qualitative insights into the binding interactions, they do not offer a quantitative assessment of energetics. The coarse-grained simulations exhibit artifacts and thus require careful analysis.

      This study contributes to a deeper understanding of the molecular mechanisms underlying bnAb recognition of the HIV-1 envelope. The insights gained from this work could inform the design of more potent and broadly neutralizing antibodies.

      Recommendations for the authors:

      Reviewing Editor:

      We recommend the authors remove the figure and section related to bnAb LN01, perform additional analysis (e.g., further expanding on the differences in antibody binding in the presence or absence of antigen), and present this as a separate manuscript in a follow-up study.

      We consider the analysis of a bnAb with a transmembrane antigen and of LN01 to be essential to the manuscript and to constitute novel results. The study of LN01 provides many insights distinct from those for the other MPER bnAbs in this study. We agree that further characterization of LN01 and of bnAbs with transmembrane antigen or full-length Env is intriguing and necessary to complete the full mechanistic understanding of lipid-associated antibodies. The LN01 section in this paper is novel in the field and presents the preliminary evidence motivating further work, which we agree is beyond the scope of this already long and detailed study.

      Reviewer #1 (Recommendations for the authors):

      I appreciate the degree to which the authors responded to my previous points raised in the private review, including edits where I might have missed something in the manuscript or relevant literature. I imagine such a point-by-point response was quite onerous. Thank you also for balancing presentation/clarity with content/rigor considering the large information content of this manuscript; in silico results are inherently hard to present given the delicate balance between rigorous validation and novel information content. I apologize if I repeat points raised and addressed previously and commend the authors on their revised study, which is much improved in clarity; any additional revisions are of course entirely at your discretion.

      "...now having more diversity in lipid headgroup chemistries" references the wrong figure-it should be: Figure 2-figure supplement 2A-C. The incorrect figure is also referenced again several sentences down: "...relevant CDR and framework surface loops..."

      Thank you for pointing out this error. We have corrected figure references.

      "One shared conformational difference observed for these bnAbs the higher cholesterol bilayers was slightly more extensive and broader interaction profiles as well as modestly deeper embedding of the relevant CDR and framework surfaces loops" please rephrase

      Thank you for this suggestion.  We rephrased this for improved clarity and flow. 

      "These results bolster the feasibility for using all-atom MD as an in silico platform to explore differential phospholipid affinity at these sites (i.e., specificity studies) and influence on antibody preferred conformation as membrane composition and lipid chemistry are systematically varied" Please tone down these speculations-you have demonstrated that simulations are robust to different headgroup chemistries but have not provided evidence for the exclusion of lipids that are known not to associate with these antibodies.

      We rephrased this speculation to highlight the potential of this application. We also emphasize future studies that would be required to achieve this application in the following sentence.

      “These results motivate use of all-atom MD as an in silico approach for exploring differential phospholipid affinity at these sites…”

      Figure 2A: Specify which PDB entry corresponds to the displayed crystal structures in the main figure or caption.

      We clarified these PDB entries in the figure caption. 

      Check reference formatting in supplemental figures when generating VOR.

      I am not sure how relevant this might be to the claims of Figure 2-figure supplement 3, but AlphaFold3-based phospholigand docking might provide an additional orthogonal approach if relevant ligand(s) are available for such analysis (particularly for the newly proposed 10E8 POPC complex).

      Thank you for this suggestion. AI/ML-based prediction methods like AF3 and RoseTTAFold All-Atom (RFAA) are interesting new methods that have emerged since our initial submission. We have decided that these experiments are beyond the scope of this already long and detailed study. We have added a sentence suggesting use of these methods in future work.

      "We next studied bnAb LN01 to interrogate differences" --> this transition still reads a bit unclear. Why shift gears and change antibodies? Also, while you do go into its interactions both +/- antigen, there's no lead into the simulation initialization with and without antigen to guide the reader into the comparisons you will draw in the figure. Also, the order of information presentation is a bit strange, where the rationale for choosing a single monomeric helix is brought up in the middle of the paragraph instead of at the beginning of the section. In the next paragraph, it goes back to the initialization of the membrane composition again, which feels a bit disorganized-I do appreciate the unique challenge of having to weave through so much quality data! In fact, if you were to conduct simulations of membrane + antigen vs. membrane + LN01 vs. membrane + LN01 + antigen, I am tempted to say that this could be removed from this manuscript and flow better as a paper in and of itself.

      We thank the reviewer for the suggestion to improve the writing style. We feel this section adds a lot of value to the manuscript, so we have kept it in the paper and improved the transition as well as the rationale.

      We chose to study the additional antibody LN01 and the monomeric MPER-TM antigen conformation because of the existing structural evidence available without additional creative model building. This rationale has been updated in the new text.

      We changed the order of information as suggested, moving the rationale for the antigen fragment earlier in the paragraph, followed by the background on the lipid sites from the crystal structure, which leads into the simulation set-up. We clarified in the opening sentence of the paragraph that the simulation initialization was similar for systems with and without antigen.

      "previously observed snorkeling and hydration of TM Arg686" --> Is this R696 (numbering could be different based on the particular Env)?

      Thank you for noting this typo, we have corrected the numbering.

      Potential font color issue with Figure 3-Figure supplement 1 B and part of A text-could be fixed in typesetting.

      The discussion reads very well. Is it possible to direct antibody maturation, even in an engineered context, towards membrane affinity without increasing immunogenic polyreactivity? This is mentioned very briefly and cited with ref 36, but I would be interested in the author's thoughts on this topic.

      We thank the reviewer for the insightful idea to explore in future work. Our conclusion alludes to the possibility of artificially evolving membrane affinity studied by MD, as done in vitro by Nieva and co-workers. Because of their hypothetical nature, we’ve chosen not to elaborate on those ideas in this manuscript.

      Reviewer #2 (Recommendations for the authors):

      To ensure reproducibility and facilitate further research, the authors should publicly deposit the code for running the MD simulations and analyses (e.g., on GitHub) along with the underlying data used in the study (e.g., on Zenodo.org).

      We appreciate the consideration for open-source code and analysis. Representative code and simulation trajectories were uploaded to the following repositories:

      https://github.com/cmaillie98/mper_bnAbs.git

      https://zenodo.org/records/13830877

      —-

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Previous experimental studies demonstrated that membrane association drives avidity for several potent broadly HIV-neutralizing antibodies and its loss dramatically reduces neutralization. In this study, the authors present a tour de force analysis of molecular dynamics (MD) simulations that demonstrate how several HIV-neutralizing membrane-proximal external region (MPER)-targeting antibodies associate with a model lipid bilayer.

      First, the authors compared how three MPER antibodies, 4E10, PGZL1, and 10E8, associated with model membranes, constructed with a lipid composition similar to the native virion. They found that the related antibodies 4E10 and PGZL1 strongly associate with a phospholipid near heavy chain loop 1, consistent with prior crystallographic studies. They also discovered that a previously unappreciated framework region between loops 2-3 in the 4E10/PGZL1 heavy chain contributes to membrane association. Simulations of 10E8, an antibody from a different lineage, revealed several differences from published X-ray structures. Namely, a phosphatidylcholine binding site was offset and includes significant interaction with a nearby framework region.

      Next, the authors simulate another MPER-targeting antibody, LN01, with a model HIV membrane either containing or missing an MPER antigen fragment within. Of note, LN01 inserts more deeply into the membrane when the MPER antigen is present, supporting an energy balance between the lowest energy conformations of LN01, MPER, and the complex. Additional contacts and conformational restraints imposed by ectodomain regions of the envelope glycoprotein, however, remain unaddressed; the size of such simulations likely runs into technical limitations including sampling and compute time.

      The authors next established coarse-grained (CG) MD simulations of the various antibodies with model membranes to study membrane embedding. These simulations facilitated greater sampling of different initial antibody geometries relative to the membrane. Distinct geometries derived from CG simulations were then used to initialize all-atom MD simulations to study insertion in finer detail (e.g., phospholipid association), which largely recapitulate their earlier results, albeit with more unbiased sampling. The multiscale model of an initial CG study with broad geometric sampling, followed by all-atom MD, provides a generalized framework for such simulations.

      Finally, the authors construct velocity pulling simulations to estimate the energetics of antibody membrane embedding. Using the multiscale modelling workflow to achieve greater geometric sampling, they demonstrate that their model reliably predicts lower association energetics for known mutations in 4E10 that disrupt lipid binding. However, the model does have limitations, namely in its ability to predict more subtle changes along a lineage: intermediate mutations that reduce lipid binding are indistinguishable from mutations that completely ablate lipid association. Thus, while large/binary differences in lipid affinity might be predictable, the use of this method as a generative model is likely more limited.

      The MD simulations conducted throughout are rigorous and the analyses are extensive. However, given the large amount of data presented within the manuscript, the text would benefit from clearer subsections that delineate discrete mechanistic discoveries, particularly for experimentalists interested in antibody discovery and design. One area the paper does not address involves the polyreactivity associated with membrane-binding antibodies; MD simulations and/or pulling velocity experiments with model membranes of different compositions, with and without model antigens, would be needed. Finally, given the challenges in initializing these simulations and their limitations, the text regarding their generalized use for discovery, rather than mechanism, could be toned down.

      Overall, these analyses provide an important mechanistic characterization of how broadly neutralizing antibodies associate with lipids proximal to membrane-associated epitopes to drive neutralization.

      Reviewer #2 (Public Review):

      In this study, Maillie et al. have carried out a set of multiscale molecular dynamics simulations to investigate the interactions between the viral membrane and four broadly neutralizing antibodies that target the membrane-proximal external region (MPER) of the HIV-1 envelope trimer. The simulation recapitulated in several cases the binding sites of lipid head groups that were observed experimentally by X-ray crystallography, as well as some new binding sites. These binding sites were further validated using a structural bioinformatics approach. Finally, steered molecular dynamics was used to measure the binding strength between the membrane and variants of the 4E10 and PGZL1 antibodies.

      The conclusions from the paper are mostly well supported by the simulations, however, they remain very descriptive and the key findings should be better described and validated. In particular:

      It has been shown that the lipid composition of HIV membrane is rich in cholesterol [1], which accounts for almost 50% molar ratio. The authors use a very different composition and should therefore provide a reference. It has been shown for 4E10 that the change in lipid composition affects dynamics of the binding. The robustness of the results to changes of the lipid composition should also be reported.

      The real advantage of the multiscale approach (coarse-grained (CG) simulation followed by a back-mapped all-atom simulation) remains unclear. In most cases, the binding mode in the CG simulations seems to be an artifact.

      The results reported in this study should be better compared to available experimental data. For example, how does the approach angle compare to cryo-EM structures of the bnAbs engaging with the MPER region, e.g. [2-3]? How do the results from this study compare to previous molecular dynamics studies, e.g. [4-5]?

      References

      (1) Brügger, Britta, et al. "The HIV lipidome: a raft with an unusual composition." Proceedings of the National Academy of Sciences 103.8 (2006): 2641-2646.

      (2) Rantalainen, Kimmo, et al. "HIV-1 envelope and MPER antibody structures in lipid assemblies." Cell Reports 31.4 (2020).

      (3) Yang, Shuang, et al. "Dynamic HIV-1 spike motion creates vulnerability for its membrane-bound tripod to antibody attack." Nature Communications 13.1 (2022): 6393.

      (4) Carravilla, Pablo, et al. "The bilayer collective properties govern the interaction of an HIV-1 antibody with the viral membrane." Biophysical Journal 118.1 (2020): 44-56.

      (5) Pinto, Dora, et al. "Structural basis for broad HIV-1 neutralization by the MPER-specific human broadly neutralizing antibody LN01." Cell Host & Microbe 26.5 (2019): 623-637.

      Considering the reviewers' suggestions, we slightly reorganized the results section into specific sub-sections with headings and changed the order in which key results were presented to make the subsequent analysis more accessible for readers.  Supplemental materials were redistributed into eLife format, with each supplemental item grouped with a corresponding main figure. Many slight modifications of detail were made to figures (mostly supplemental items) without changing their character, such as clearer axis labels or revised annotations within panels.

      The major additions within the results sections based on the reviews were:

      (1) An expanded comparison of our simulation analyses with previous simulations and with existing cryo-EM structural evidence for MPER antibodies’ membrane orientation in the context of full-length antigen, resulting in new supplemental figure panels.

      (2) New atomistic simulations of 10E8, PGZL1, and 4E10 evaluating the phospholipid binding predictions in a different lipid composition more closely modeling HIV membranes.

      Minor edits to the analyses and interpretations include:

      (1) Outlining the geometric components contributing to variance in substates after clustering the atomistic 10E8, 4E10, and PGZL1 simulations.

      (2) Better defining the variance and durability of membrane interactions within and across systems in the coarse grain methods section.

      (3) Removed interpretations in the original results sections regarding polyreactivity and energetics for MPER bnAbs that were not explicitly supported by data.   

      (4) More context on the prevalence of bnAb loop geometries in the structural informatics section.

      (5) Rationale for the choice of the continuous helix MPER-TM conformation in LN01-antigen conformations, and citations to previous gp41 TM simulations.

      (6) Removed language on the novelty of the coarse grain and steered pulling simulations as newly developed approaches; tempering the potential discriminating power and applications of those approaches, in light of their limitations.

      The discussion was revised to provide more context for the results within the field, including discussing the direct relevance of the simulation methods for evaluating immune tolerance mechanisms and for antibody engineering.   We have shared custom scripts used for molecular dynamics analysis on GitHub (https://github.com/cmaillie98/mper_bnAbs.git) and uploaded trajectories to a public repository hosted on Zenodo (https://zenodo.org/records/13830877).

      Recommendations for the authors:

      Below, I provide an extensive list of minor edits associated with the text and figures for the authors to consider. I provide these with the hope of increasing the accessibility of the manuscript to broader audiences but leave changes to the discretion of the authors.

      Text/clarity

      Figure 1 main text

      The main text discussing Figure 1 is disorganized, making the analysis difficult to follow. I would suggest the following: moving the sentence, "4E10 and PGZL1 are structurally homologous" immediately after the paragraph discussing the simulation initiation. Then, add a sentence that directly compares the experimental affinity, neutralization, and polyreactivity of 4E10 and PGZL1 (later, an unintroduced idea pops up, "These patterns may in part explain 4E10's greater polyreactivity"). Next, lead into the discussion of the MD simulation data with something to the effect of: "Given these similarities, we first compared mechanisms of membrane insertion between 4E10 and PGZL1 to bolster confidence in our predictions". Later, the sentence "Across 4E10 and PGZL1 simulations, the bound lipid phosphates"

      We thank the reviewer for the suggestion and we have restructured the beginning of the results to implement this style: to first introduce then discuss the comparative PGZL1 & 4E10 results, i.e. Figure 1 plus associated supplements.

      In the background and the introduction text leading up to Figure 1, CDR-H3 is discussed at length, however, the first figure focuses almost entirely on how CDR-H1 coordinates a lipid phosphate headgroup. Are there experimental mutations in this loop that do not affect affinity (e.g., to a soluble gp41 peptide), but do affect neutralization (like the WAWA mutation for CDR-H3, discussed later)?

      We have altered the Introduction (para 2) and Results (4E10/PGZL1 sub-section) to give a more balanced discussion of CDRs H1 & H3.  That includes referencing experimental data addressing the reviewer’s question: a PGZL1 clone, H4K3, in which mutations to CDR-H1 were introduced and shown to have minimal impact on affinity to MPER peptide via ELISA and BLI, but those mutant bnAbs had significantly reduced neutralization efficacy (PMC6879610).

      The sentence "These phospholipid binding events were highly stable, typically persisting for hundreds of nanoseconds" should be moved down to immediately precede, "[However], in a PGZL1 simulation, we observed a". This would be a good place for a paragraph break following, "Thus, these bnABs constitutively", since this block of text is very long.

      Similarly, the sentence and parts of the section, "Likewise, the interactions coordinating the lipid phosphate oxygens at CDR-H1" more appropriately belongs immediately before or after the sentence, "Our simulations uncover the CDR-lipid interactions that are the most feasible".

      Thank you for the detailed guidance in reorganizing the Figure 1 results.  We followed the advice to directly compare 4E10 and PGZL1 results separately from 10E8, moving those sections of text appropriately.  New paragraph breaks were added to improve accessibility and flow of concepts throughout the Results.

      In the sentence, "our simulations uncover CDR-lipid interactions that are the most feasible and biologically relevant in the context of a full [HIV] lipid bilayer... validation to which of the many possible ions" --> Have you confidently determined lipid binding and positioning outside of the site validated in figure 1? Which site(s) are these referencing? The next two sentences then introduce two new ideas on the loop backbone stability then lead into lipid exchange, which is a bit jarring.

      We have adjusted the language concerning the putative ions/lipids electron density across the many PGZL1 and 4E10 crystal structures, and additionally make the explicit point that we confidently determined the lack of lipid binding outside of the site focused on in Figure 1.

      “… both bnAbs showed strong hotspots for a lipid phosphate bound within the CDR-H1 loops, with minimal phospholipid or cholesterol ordering around the proteins elsewhere.  The simulated lipid phosphates bound within CDR-H1 have exceptional overlap with electron densities and atomic details of modelled headgroups from respective lipid-soaked co-crystal structures…”

      Figure 2 main text

      "We similarly investigated bnAb 10E8" - Please make this a separate subheader, the block text is very long up to this point.

      Thank you for the suggestion. We introduced a sub-header to separate work on 10E8 all-atom simulations.

      "we observed a POPC complexed with... modelled as headgroup phosphoglycerol anions..." - please cite the references within the text.

      Thank you for pointing out this missing reference, we added the appropriate reference.

      "One striking and novel observation" - please remove the phrase "striking" throughout, for following best practices in scientific writing (PMC10212555)-this is generally well-done throughout.

      We removed “striking” from our text per your suggestion.

      "This CDR-L1 site highlights... (>500 fold) across HIV strains" - How much do R29 and Y32 also contribute to antigen binding and the conformation of this loop? These mutants also decreased Kd by approximately 20X, and based on the co-crystal structure with the TM antigen (PDB: 4XCC), seem to play a more direct role in antigen contact. Additionally, these residues should be highlighted on a figure, otherwise it's difficult to understand why they are important for membrane association.

      We thank the reviewer for the deep engagement with these supporting experimental details.  The R29A+Y32A 10E8 mutant referenced in the text showed only a 4-fold Kd increase, a modest change for an SPR binding experiment, whereas the R29E+Y32E 10E8 mutant resulted in a 40x Kd increase, the “20x” the reviewer refers to.  Both 10E8 mutants showed similarly drastic reductions in breadth and potency, of over 2 orders of magnitude on average.

      These mutated CDR-L1 residues are not directly involved in antigen contact and adopt the same loop helix conformation when antigen is bound.  A minor impact on antigen binding affinity could be due to altered pre-organization of the CDR loops upon losing interactions from the Tyr & Arg sidechains, particularly Tyr31 in contact with CDR-H3.

      As per the suggestion, a more clearly annotated figure panel denoting these sidechains has been added to Figure 2-Figure Supplement 1 for the 10E8 analysis.

      "Structural searches querying... identified between 10^5 and 2*10^6..." - why is this value represented as such a large range? Does this depend on the parameters used for analysis? Please clarify.

      Additionally, how prevalent are any random loop conformations compared to the ones you searched? It's otherwise difficult to attribute number of occurrences within the 2 A cutoff to biological significance, as this number is not put in context.

      We appreciate the reviewer's comment to contextualize the range and relative frequency of the bnAb loop conformations.  RMSD and loop length are the key parameters, which can be controlled for by searching reference loops of similar length.  The main point of the backbone-level searching is simply to show that the bnAb loops are not particularly rare when compared with loops of similar length.

      We did as was suggested and added comparison to random loops of the same length to the main text, including a new Supplementary Table 4.   

      “…identified between 10^5 and 2∙10^6 geometrically similar sub-segments within natural proteins (<2 Å RMSD)40, reflecting they are relatively prevalent (not rare) in the protein universe, comparing well with frequency of other surface loops of similar length in antibodies (Supplementary Table 3).”

      "We next examined the geometries" could start after its own new subheading. Moreover, while there's an emphasis on tilt for neutralization, there is not a figure clearly modelling the proposed Env tilt compared to the relatively planar bilayer. It would be helpful to have an additional panel somewhere that shows the orientation of the antibody (e.g., a representative pose) in the simulations relative to an appropriately curved membrane, Env, the binding conformation of the antibody to Env, and apo Env, given the tilting observed in PMID: 32348769 and theorized in PMC5338832. What additional conformational changes or tilting need to occur between the antibodies and Env to accomplish binding to their respective epitopes?

      Thank you for outlining an interesting element to consider in our analysis of a multi-step binding mechanism for MPER antibodies. We added additional figure panels in the supplement to outline the similarities and differences between our simulations and Fabs with the inferred membranes in cryo-EM experiments of full-length HIV Env.  The simulated Fabs’ angles are very similar with only minor tilting to match the cryo-EM antibody-membrane geometries. 

      We added Figure 1-figure supplement 1A & Figure 2-figure supplement 2A, and altered the text to reflect this:

      “The primary difference is Env-bound Fabs in cryo-EM adopt slightly more shallow approach angles (~15°) relative to the bilayer normal.  The simulated bnAbs in isolation prefer orientations slightly more upright, but presenting CDRs at approximately the same depth and orientation.  Thus, these bnAbs appear pre-disposed in their membrane surface conformations, needing only a minor tilt to form the membrane-antibody-antigen neutralization complex.”

      Env tilt dynamics and membrane curvature of natural virions may reconcile some of these differences.  Recent in situ tomography of full-length Env in pseudo-virions corroborates our approximation of flat bilayers over the short length scales around Env.
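
      As an illustration only (this is not the analysis code deposited with the manuscript), the approach angle discussed above can be sketched as the angle between a vector along the Fab long axis and the bilayer normal. The function names, the choice of the z-axis as the membrane normal, and the example vectors are all assumptions made for this sketch.

```python
import math


def approach_angle_deg(axis, normal=(0.0, 0.0, 1.0)):
    """Angle in degrees between a Fab long-axis vector and the bilayer normal.

    Assumption for this sketch: the bilayer is flat and its normal is the z-axis.
    """
    dot = sum(a * n for a, n in zip(axis, normal))
    norm_axis = math.sqrt(sum(a * a for a in axis))
    norm_normal = math.sqrt(sum(n * n for n in normal))
    if norm_axis == 0.0 or norm_normal == 0.0:
        raise ValueError("vectors must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_axis * norm_normal)))
    return math.degrees(math.acos(cos_theta))


def tilt_difference_deg(axis_a, axis_b, normal=(0.0, 0.0, 1.0)):
    """Difference in approach angle between two Fab poses (pose a minus pose b)."""
    return approach_angle_deg(axis_a, normal) - approach_angle_deg(axis_b, normal)
```

      With hypothetical axis vectors taken from a simulated pose and from a cryo-EM fitted pose, `tilt_difference_deg` gives the tilt needed to move from one orientation to the other under the flat-bilayer assumption.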

      The sentence "we next examined the geometries" mentions "potential energy cost, if any, for reorienting...". However, there's no further discussions of geometry or energy cost within this section. Please rephrase, or move this figure to main and increase discussion associated with the various conformational ensembles, their geometry, and their phospholipid association.

      As the reviewer highlights, the unbiased simulations and our analysis do not explicitly evaluate energetics.  We removed this phrase, and now only allude to the minimal energy barrier between the similar geometric conformations, relative to the tilting & access requirements for antigen binding mechanism.

      “The apparent barrier for re-orientation is likely much less energetically constraining than shielding glycans and accessibility of MPER”

      ".. describing the spectrum of surface-bound conformations" cites the wrong figure.

      Thank you for noticing this error; we corrected the figure reference to (Figure 2-figure supplement 4).

      Please comment on the significance of how global clustering (Fig. S5A-C) was similar for 4E10 and PGZL1, but different for 10E8 (e.g., blue, orange, and yellow clusters for 4E10 and PGZL1 versus cyan, red, and green clusters for 10E8). As the cyan cluster seems to be much closer in Euclidean space to the 4E10/PGZL1 clusters, it might warrant additional analysis. What do these clusters represent in terms of structure/conformation? How do these clusters differ in membrane insertion as in (A)?

      We are grateful that you identified analysis in the geometric clustering section that may be of interest to other readers. We have added an additional supplementary table (Table 2) detailing the CDR loop membrane insertion and global Fab angles that describe each cluster, to demonstrate their similarities and differences.  We also better describe how global clustering was similar for 4E10 and PGZL1, but different for 10E8, in the relevant results section.

      The cyan cluster is not close in structure to the 4E10/PGZL1 clusters.  We note the original figure panel had an error.  The updated Figure 2-supplement 4B shows the correct Euclidean distance hierarchy with an early split between the 4E10/PGZL1 and 10E8 clusters.

      Figure 3 main text

      The start of this section, "We next studied bnAb LN01...", is a good place for a new subheader.

      We have added an additional subheader here: Antigen influence on membrane bound conformations and lipid binding sites for LN01

      There should be a sentence in the main text defining the replicate setup and production MD run time. Is the apo and complex based on a published structure? How do you embed the MPER? Is the apo structure docked to membrane like in 4E10? The MD setup could also be better delineated within the methods.

      The first two paragraphs in this section have been updated to clarify the relevant simulation configuration and Fab membrane docking prediction details.

      The procedure was the same for predicting an initial membrane insertion, albeit now we use the LN01-TM complex and the calculation accounts for the membrane burial of the TM domain and MPER fragment.  As mentioned, LN01 is predicted to insert its CDR loops similarly with or without the TM-MPER fragment.  The geometry differs from PGZL1/4E10 and 10E8, as noted in the text.

      Please comment on the oligomerization state of the antigen used in the MD simulation: how does the simulation differ from a crossed MPER as observed in an MPER antibody-bound Env cryo-EM structure (PMID: 32348769), a three-helix bundle (PMC7210310), or single transmembrane helix (PMC6121722)? How does the model MPER monomer embed in the membrane compared to simulations with a trimeric MPER (PMC6035291, PMID: 33882664)-namely, key arginine residues such as R696?

      We thank the reviewer for pointing out the critical underlying rationale for modeling this TM-MPER-LN01 complex, which we have clarified in the revised draft. The range of potential conformations and display of MPER based on TM domain organization could easily be its own paper; we in fact have a manuscript in preparation on the topic.

      The updated text expands the rationale for choosing the monomeric uninterrupted helix form of the MPER-TM model antigen (para 1 of LN01 section). The alternative conformations we did not explore are called out, with references provided by the reviewer.

      The discussion now qualifies that the MPER presentation is likely oversimplified here, noting that MPER display in the full-length Env trimer will vary in different conformational states or membrane environments. However, the only cryo-EM structures of full-length Env with TM domains resolved have this continuous helix MPER-TM conformation, seen both within crossing TM dimers and in dissociated TM monomers.

      Are there additional analyses that can validate the dynamics of the MPER monomer in the membrane and relative to LN01? Such as key contacts you would expect to maintain over the duration of the MD simulation?

      We also expanded the description of this TM domain’s behavior and dynamics (tilt, orientation, Arg696 snorkeling, and complex with LN01) to provide a clearer picture of the simulation results, which align with past MD of the gp41 TM domain as a monomer (para 2 of LN01 section).  We also noted key LN01-MPER contacts that were maintained.

      How does the model MPER modulate membrane properties like lipid density and lipid proximities near LN01?

      We checked and didn’t notice differences for the types of lipids (chol, etc) proximal to the MPER-TM or the CDR loops versus the bulk lipid bilayer distributions.  Due to the already long & detailed nature of this manuscript, we elect not to include discussion on this topic.

      Supplemental figure 1H-I would be better positioned as a figure 3-associated supplemental figure.

      We rearranged to follow the eLife format and have paired supplemental panels with their most relevant main figures.

      Figure 3F/H reference a "loading site" but this site is defined much later in the text, which was confusing.

      Thank you for pointing out this source of confusion, we rearranged our discussion to reflect the order in which we present data in figures.

      What evidence suggests that lipids "quickly exchange from the Loading site into the X-ray site by diffusion"? I do not gather this from Figure S1H/I.

      We have rearranged the loading site and X-ray site RMSD maps in Figure 3-Figure supplement 1 to better illustrate how a lipid exchanges between these sites.

      Figure 4 main text

      The authors assert that in the CG simulations, restraints, "[maintain] Fab tertiary and quaternary structure". However, backbone RMSD does not directly support this claim; an additional analysis of the key interfacial residues between chains, or geometric analysis between the chains, would better support it.

      Thank you for raising this point.  We rephrased to add that the major sidechain contacts between heavy and light chain persist, in addition to backbone RMSD, to describe how these Fabs stably maintain the fold in CG representation.

      In several cases, CG models sample and then dissociate from the membrane. In the text, the authors mention, "coarse-grained models can distinguish unfavorable and favorable membrane-bound conformations". Is there a particular orientation that causes/favors membrane association and dissociation? This analysis could look at conformations immediately preceding association and dissociation to give clues as to what orientation(s) favor each state.

      Thank you for suggesting this interesting analysis.  Clustering analysis of associated states are presented in Figure 5, Figure 5-Figure Supplement 1, and Figure 6, which show all CDR and framework loop directed insertion.  This feature is currently described in the main text.  

      We did not find a strong correlation of specific orientations with “pre-dissociation” states or ineffective non-inserting “scanning” events.  We revised the key sentence to reflect the major takeaway, namely that non-CDR alternative conformations did not insert and most of those having CDRs inserted in a different manner than in the all-atom simulations were also prone to dissociate:

      “Given that non-CDR directed and alternative CDR-embedded orientations readily dissociate, we conclude that coarse-grained models can distinguish unfavorable and favorable membrane-bound conformations to an extent that provides utility for characterizing antibody-bilayer interaction mechanisms.”

      Figure 6 main text

      "For 4E10, trajectories initiated from all three geometries..." only two geometries are shown for each antibody. Please include all three on the plot.

      The plots include markers for all three geometries for 4E10, highlighted in stars or with letters on the density plots of angles sampled (Figure 6B,C)

      "Aligning a full-length IgG... unlikely that two Fabs simultaneously..." Are there theoretical conformations in which two Fabs could simultaneously associate with membrane? If this was physiological or could be designed rationally, could an antibody benefit further from avidity?

      Our modeling suggests the theoretical conformations having two Fabs on the membrane are infeasible.  It’s even less likely multiple Env antigens could be engaged by one IgG.  We have revised the text to express this more clearly.

      Figure 7 main text

      "An intermediate... showed a modest reduction in affinity..." what affinity does PGZL1 have for this antigen?

      The preceding sentence provides this information: “Mature PGZL1 has relatively high affinity to the MPER epitope peptide (Kd = 10 nM) and demonstrates great breadth and potency, neutralizing 84% of a 130 strain panel”

      Figures

      Figure 1

      It would be helpful to have an additional panel at the top of this figure further zoomed out showing the orientation of the antibody (e.g., a representative pose) in the simulations relative to an appropriately curved membrane, Env, the binding conformation of the antibody to Env, and apo Env, given the tilting observed in PMID: 32348769 and theorized in PMC5338832. What additional conformational changes or tilting need to occur between the antibodies and Env to accomplish binding to their respective epitopes?

      Thank you for the suggestion to include this analysis.  We have added text reflecting this information and made new supplemental panels for 4E10 and 10E8 in which we compare the simulated 4E10 and 10E8 Fab conformations to cryo-EM density maps with Fabs bound to full-length HIV Env (Figure 1-figure supplement 1A & Figure 2-figure supplement 2A).

      In Figure 1, space permitting, it would be helpful to annotate the distances between the phosphates and side chains (similarly, for Figure S1A).

      To avoid overloading the main figure panels with text, the relevant distances are listed in the methods section.  Those distances are used to define the “bound” lipid phosphate state.  Generally, we note the interactions are within hydrogen bonding distance.

      Annotating "Replicate 1" and "Replicate 2" on the left side of Figure 1C/D would make this figure immediately intuitive.

      We have added these labels.

      Figure caption 1C: Please clarify the threshold/definition of a contact used to binarize "bound" versus "unbound" (for example, "mean distance cutoff of 2A between the phosphate oxygen and the COM of CDR-H1") [on further reading of the methods section, this criterion is quite involved and might benefit from: a sentence that includes "see methods"]. Additionally, C could use a sentence explaining the bar such as in E, "Phosphate binding is mapped to above each MD trajectory" Please define FR-H3 in the figure caption for E/F.

      We have added these details to the figure caption.
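
      For readers unfamiliar with this kind of binarization, a minimal sketch of the idea follows. It is not the criterion used in the manuscript, which is defined in the methods and involves several interactions; here a single per-frame distance and one cutoff are assumed, and the function names, distances, and the 3.5 Å cutoff are hypothetical.

```python
def bound_mask(distances, cutoff):
    """Return one boolean per frame: True where the distance is within the cutoff."""
    return [d <= cutoff for d in distances]


def bound_fraction(mask):
    """Fraction of frames classified as bound."""
    if not mask:
        raise ValueError("need at least one frame")
    return sum(mask) / len(mask)


def longest_bound_run(mask):
    """Length, in frames, of the longest uninterrupted bound stretch."""
    longest = current = 0
    for is_bound in mask:
        current = current + 1 if is_bound else 0
        longest = max(longest, current)
    return longest


# Hypothetical per-frame distances (in angstroms) and an illustrative cutoff.
example_distances = [2.8, 3.0, 5.2, 3.4, 6.0]
example_mask = bound_mask(example_distances, cutoff=3.5)
```

      The boolean mask corresponds to the bar mapped above each trajectory, and the longest run, multiplied by the frame spacing, gives a rough persistence time for a binding event.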

      Because Figure 1 is aggregated simulation time, it would be helpful to also represent the data as individual replicates or incorporate this information to calculate standard deviations/statistics (e.g., 1 microsecond max using the replicates to compute a standard deviation).

      We believe the current quantification & display of data via sharing all trajectories is sufficient to convey the major point of how often each CDR-phospholipid binding site is occupied.  Further tracking and statistics of inter-atomic distances would likely be too tedious & add minimal value. There are some dynamics of the phosphate oxygens between the polar groups within the CDR site, but our “bound” state definitions sufficiently describe when the key participating interactions are made.
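
      For readers who nonetheless want the per-replicate summary described in the comment above, a minimal sketch is given here. It assumes one bound-state fraction has already been computed per replicate; the function name and the example values are hypothetical and are not results from the manuscript.

```python
from statistics import mean, stdev


def summarize_replicates(bound_fractions):
    """Mean and sample standard deviation of per-replicate bound fractions."""
    if len(bound_fractions) < 2:
        raise ValueError("need at least two replicates for a standard deviation")
    return mean(bound_fractions), stdev(bound_fractions)


# Hypothetical bound fractions from two replicate trajectories.
example_mean, example_sd = summarize_replicates([0.5, 0.7])
```

      With only two or three replicates the standard deviation is a coarse estimate, so it is best read alongside the individual trajectories rather than in place of them.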

      Figure 2

      For A, it would be helpful to annotate the yellow and blue mesh on the figure itself.

      We have defined the orange phosphate and blue choline densities.

      Also, where are R29 and Y32 relative to this site? In the X-ray panels, Y38 is not shown, and the box delineating the zoom-in is almost imperceptible.

      Thank you for this suggestion to include those amino acids, which are referenced in the text as critical sites where mutation impacts function. To clarify, Y32 is the PDB numbering for residue Y38 in IMGT numbering. We have added a panel to Figure 2-Figure Supplement 1 showing a cartoon of the 10E8 loop groove with side chains and annotating R29 and Y38, staying consistent with our use of IMGT numbering in the manuscript.

      Figure 3

      It might read clearer to have "LN01+MPER-TM" and "LN01-Apo" in the middle of A/B and C/D, respectively, and a dotted line delineating the left and right side of the figure panels.

      We have added these details to the figure for clarity for readers.

      It would be helpful to show some critical interactions that are discussed in the text, such as the salt bridge with K31, by labeling these on the figure (e.g., in E-H).

      We drafted figure panels with dashed lines to indicate those key interactions. However, the lines were almost imperceptible, and the panels became overloaded with annotations that distracted from the overall details. For K31, the interaction is present in the LN01 crystal structures, to which readers can refer.

      Why are axes cut off for J?

      We corrected this.

      Please re-define K/L plots as in Figure 1, and explain abbreviations.

      We updated the figure caption to reflect these changes.

      Figure 4

      The caption for panel A states that the Fab begins in solvent 1-2 nm above the bilayer, but the main text states 0.5-2 nm.

      We have reconciled this difference and listed the correct distances: 0.5-2 nm.

      Please label the y-axis as "Replicate" for relevant figure panels so that they are more immediately interpretable.

      This label has been added.

      A legend with "membrane-associated" and "non-associated" within the figure would be helpful. Additionally, the average percent membrane associated, with a standard deviation, should be shown (Similar to 1C, albeit with the statistics).

      This legend has been added.  We also added the additional statistical metrics requested to strengthen our analysis.

      The text references "10, 14, and 12 extended insertion events" for the three antibody-based simulations. How do you define "extended insertion events"? Would breaking this into average insertion time and standard deviation better highlight the association differences between MPER antibodies and controls, in addition to the variability due to different random initializations?

      We thank the reviewer for the insightful suggestion on how to better organize quantitative analysis to support the method. Supplemental Table 3 includes these numbers.

      Figure 5

      The analysis in Fig. S6C could be included here as a main figure.

      The draft of the revised figure that added S6C to Figure 5 contained too much information. Moving panel S6C would also have separated it from its parent clustering data in S6B, so we decided to keep these figures separate. Figure S6 is now Figure 5-figure supplement 1.

      Figure 6

      Please annotate membrane insertion on E as %.

      These panels show phosphate-binding RMSD/occupancy versus time. The panels are now too small to annotate with percentages, and the qualitative presentation is sufficient at this stage. The quantitative percentages are given in the text where relevant to support the assertions made.

      Please use the figure caption to explain why certain clusters (e.g., 10E8 cluster A, artifact, Fig. S6E) are not included in panel E.

      We have added this information in the figure caption.

      Figure 7

      Please show all points on the box and whisker plots (panels E and F), and perform appropriate statistical tests to see if means are significantly different (these are mentioned in the text, but should be annotated on the graph and mentioned within the figure caption).

      We have changed these plots to show all data points along with the relevant statistical comparisons. The figure captions describe the unpaired t-tests used.
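
      As a purely illustrative aside (not the analysis code used for these panels), an unpaired two-sample t statistic of the kind referred to above could be computed as sketched below. The sketch assumes the equal-variance (Student) form, the sample values are invented, and converting t to a p-value is left to a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(sample_a, sample_b):
    """Student's unpaired t statistic and degrees of freedom.

    Assumes equal variances in the two groups; each sample needs
    at least two values so that a sample variance is defined.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Invented values for two groups (hypothetical, for illustration only).
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
t_stat, dof = unpaired_t(group_a, group_b)
```

      If the two groups cannot be assumed to share a variance, Welch's form of the test would be the more cautious choice.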

      Figure S1

      G, H, and I do not belong here-they should be moved to accompany their relevant text section, which associates with Figure 3. It would be helpful to associate this with Figure 3 in the eLife format, "Figure 3-Supplemental Figure 1" or its equivalent.

      It's very difficult to distinguish the green and blue circles on panel G.

      We darkened the shading and added an outline for better visualization.

      Subfigure I is missing a caption, could be included with H: "(H,I) Additional replicates for LN01+TM (H) and LN01 (I)".

      We corrected this as suggested.

      Why is H only 3 simulations and not 4? Does it not have a lipid in the x-ray site? Also, the caption states "(top, green)" and "(bottom, cyan)", but the green vs. cyan figures are organized on the left and right. Additional labels within the figure would help make this more intuitive.

      If the point of H and I is to illustrate that POPC exchanges between the X-ray and loading sites, this is unclear from the figure. Consider clarifying these figures.

      Thank you for describing the confusion in this figure; we have added labels to clarify.

      Figure S2 (panels split between revised Figure 4 associated figure supplements)

      The LN01 figures should likely follow later so that they can associate with Figure 3, despite being a similar analysis.

      We corrected supplements to eLife format so supplements are associated with relevant main figures.

      Figure S3 (panels split between revised Figure 1 & 2 associated figure supplements)

      As hydrophobicity is discussed as a driving factor for residue insertion, it would be helpful to have a rolling hydrophobicity chart underneath each plot to make this claim obvious.

      We prefer the current format because we are concerned about adding too much information to these already data-rich panels. In addition, some residues are not apolar yet are deeply inserted.

      Figure S4 (panels split between revised Figure 1 & 2 associated figure supplements)

      It would be helpful to label the relevant loops on these figures.

      We have labeled loops for clarity.

      Do any of these loops have minor contacts with Env in the structure?

      The 4E10 and PGZL1 CDR-H1 loops do not directly contact the MPER peptides bound in the crystal structures.

      FRL-3 and CDR-H1 in 10E8 do not contact the MPER peptide antigen component based on x-ray crystal structures.

      Do motif contacts with lipid involve minor contacts with additional loops other than those displayed in this figure?

      The phosphate-loop interactions in motifs used as query bait here are mediated solely by the backbone and side chain interactions of the loops displayed. We visually inspected most matches and did not see any “consensus” additional peripheral interactions common across each potential instance in the unrelated proteins.  The supplied Supplemental Table 2 contains the information if a reader wanted to conduct a detailed search. 

      Why is there such a difference between the loop conformation adopted in the X-ray structure and that in the MD simulation, and why does this lead to the large observed differences in ligand-binding structure matches?

      We thank the reviewer for carefully noting our error in labeling of CDR loop and framework region input queries. We revised the labeling to clarify the issue.

      There is minimal structural difference between the loops in the x-ray structures and in MD.

      Figure S5 (Figure 2-Figure supplement 4)

      This figure is not colorblind friendly-it would be helpful to change to such a palette as the data are interesting, but uninterpretable to some.

      We have left this figure the same.

      "Susbstates" - "Substates"

      Corrected, thank you.

      Panel B is uninterpretable-please break the axis so that the Euclidean distances can be represented accurately but the histograms can be interpreted.

      We have adjusted the axis of this plot to better illustrate the cluster thresholds.

      The clusters in D-H should be analyzed in greater depth. What is the structural relevance of these clusters other than differences in phospholipid occupancy in (I)? Snapshots of representative poses for each cluster could help clarify these differences.

      We have adjusted the text to describe the geometric differences between those clusters that result in their different, and in some cases exceptionally low, propensities for forming the key phospholipid interaction.

      The figure caption should make it clear that 3 μs of aggregate simulation time is being used here instead of 4 μs to start with unique tilt initializations. E.g., "unique starting membrane-bound conformations (0 degrees, -15 degrees, 15 degrees initialization relative to the docked pose)". Further, why was the particular 0-degree replicate chosen while the other was thrown out? Or was this information averaged? Why is the full 4 μs then used for D-I?

      We thank the reviewer for noting these details. We did not want to bias the comparison between 10E8 and 4E10/PGZL1 by including the replicate simulations. The analysis was mainly intended to achieve a coarser-resolution distinction between 10E8 and the similar PGZL1/4E10.

      In the subsequent clustering of individual bnAb simulation groups, the replicate 0-degree simulations had sufficiently different geometric sampling and unique lipid-binding behavior that we thought they should be used (4 μs total) to achieve finer conformational resolution for each bnAb.

      Figure S6 (now Figure 5-Figure Supplement 1)

      Please label the CDRs in C and provide a color key like in other figures. Also, please label the y-axes. This figure could move to main below 5B with the clusters "A,B,C" labeled on 5B.

      We have added the axis labels and a color key legend. We retained a minimal CDR loop labeling scheme for the higher-throughput interaction profiles here, where colored sections on the residue axes denote CDR loop regions.

      Figure S7 (Figure 7 Figure Supplement 1)

      Panels A and B would likely read better if swapped.

      We have swapped these panels for a better flow.

      For panel C, please display mean and standard deviation, and compare these values with an appropriate statistical test.

      This is already displayed in the main figure, so we have removed it from the supplement.

      For E and F, please clarify from which trajectory(s) you are extracting this conformation from. Are these the global mean/representative poses? How do they compare to other geometrically distinct clusters?

      The requested information was added to the supplemental figure caption. These are phosphate-bound frames selected from 2 distinct time points of the 0-degree tilt replicates for both 4E10 and 10E8, representing at least 2 distinct macroscopic substates that differ in the global orientation of the light and heavy chains towards the membrane.

      Table S2 (now Supplementary Table 3)

      Please add details for the 13h11 simulation.

      Additionally, please add average contact time and their standard deviation to the table, rather than just the aggregated total time. This will highlight the variability associated with the random initializations of each simulation.

      We have added the details for 13h11 and the requested analysis (average aggregated time ± standard deviation and average time per association event ± standard deviation) to supplement our summary statistics for this method.
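
      For readers who want to reproduce this kind of summary, the sketch below shows one way such statistics could be derived from per-replicate association time series. It is an illustration under stated assumptions, not the script used for Supplementary Table 3: the binary series, the 10 ns frame spacing, and the function names are all hypothetical.

```python
from statistics import mean, stdev


def association_events(series, dt_ns):
    """Durations (ns) of each contiguous run of membrane-associated frames."""
    durations = []
    run = 0
    for associated in series:
        if associated:
            run += 1
        elif run:
            durations.append(run * dt_ns)
            run = 0
    if run:  # close a run that lasts until the final frame
        durations.append(run * dt_ns)
    return durations


def summarize(replicates, dt_ns):
    """Mean and sample SD of total associated time per replicate and of
    the duration of individual association events (pooled over replicates).

    Needs at least two replicates and two events for the SDs to be defined.
    """
    events = [association_events(r, dt_ns) for r in replicates]
    totals = [sum(e) for e in events]
    pooled = [d for e in events for d in e]
    return {
        "total_mean": mean(totals),
        "total_sd": stdev(totals),
        "event_mean": mean(pooled),
        "event_sd": stdev(pooled),
        "n_events": len(pooled),
    }


# Hypothetical toy data: 1 = associated frame, 0 = non-associated frame.
replicates = [
    [0, 1, 1, 0, 1, 1, 1, 0],
    [1, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 1],
]
stats = summarize(replicates, dt_ns=10.0)
```

      Reporting the per-replicate spread alongside the aggregate makes the variability due to random initialization visible, which is the point raised by the reviewer.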

      Reviewer #2 (Recommendations For The Authors):

      (1) The structure of the manuscript should be improved. For example, almost half of the introduction (three paragraphs) summarize the results. I found it hard to navigate all the data and specific interactions described in the result section. Furthermore, the claims at the end of several sections seem unsupported. Especially for the generalization of the approach. This should be moved to the discussion section. The discussion is pretty general and does not provide much context to the results presented in this study.

      We have significantly reorganized the results section to improve the flow of the manuscript and its accessibility for readers, especially the first sections on all-atom simulations. We also removed claims not directly supported by data from our results, and expanded on some of these concepts in the discussion to provide more context for the results.

      (2) The author should cite more rigorously previous work and refrain from using the term "develop" to describe the simple use of a well established method. E.g. Several studies have investigated membrane protein interactions e.g. [1], membrane protein-bilayer self-assembly [2], steered molecular dynamics [3], etc.

      Thank you for identifying relevant work for the simulations that set precedent for our novel application to antibody-membrane interactions.  We have removed language about development of simulation methods from the text and now better reference the precedent simulation methods used here.

      (3) Have the authors considered estimating the PMF by combining the steered MD simulation through the application of Jarzynski's equality?

      We performed some preliminary PMF calculations for Fab-membrane binding, but found that they took upward of 40 μs to reach convergence. Steered simulations focused on a key lipid may be easier.

      Although PMFs are beyond the scope of this work, we added proposals for, and allusions to, their utility as next steps towards more rigorous quantification of Fab-membrane interactions.
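
      For context, Jarzynski's equality relates the free-energy difference to an exponential average over the non-equilibrium work values from repeated pulls, ΔF = -kT ln⟨exp(-W/kT)⟩. The sketch below is a minimal illustration of that estimator only; it was not applied in this work, and the temperature, units (kJ/mol), and work values are assumptions chosen for the example.

```python
import math

# Molar gas constant in kJ/(mol*K), so work values are taken in kJ/mol.
R_KJ_PER_MOL_K = 0.008314462618


def jarzynski_free_energy(works_kj_mol, temperature_k=310.0):
    """Estimate dF = -kT * ln( mean( exp(-W / kT) ) ) from pulling works.

    A log-sum-exp shift keeps the exponential average numerically stable.
    """
    kt = R_KJ_PER_MOL_K * temperature_k
    exponents = [-w / kt for w in works_kj_mol]
    shift = max(exponents)
    log_mean = shift + math.log(
        sum(math.exp(x - shift) for x in exponents) / len(exponents)
    )
    return -kt * log_mean


# Hypothetical work values from three steered pulls (kJ/mol).
example_works = [40.0, 50.0, 60.0]
delta_f = jarzynski_free_energy(example_works)
```

      Because the exponential average is dominated by rare low-work trajectories, the estimate always lies at or below the mean work and converges slowly when the work distribution is broad, which is one reason such estimates can require many pulls.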

      Minor

      (4) The term "integrative modeling" is usually used for computational pipelines which incorporate experimental data. Multiscale modeling would be more appropriate for this study.

      We altered descriptions throughout the manuscript to reflect this comment.

      (5) Units to report the force in the steered molecular dynamics are incorrect. They should be 98.

      We changed axes and results to correctly report this unit.

      (6) Labels for axes of several graphs are missing.

      We added labels to all axes of graphs, except for a few where stacked labels can be easily interpreted to save space and reduce complexity in figures.

      (7) Figure 3 K & L is this really < 1% of total? The term "total" should also be clarified.

      Thank you for pointing this out, we changed the % labels to be correct with axes from 0-100%. We clarified total in the figure caption.

      (8) The font size in figures should be uniformized.

      This suggestion has been applied.

      (9) Time needed for steered MD should be reported in CPUh and not hours (page 17).

      We removed comments on explicit time measurements for our simulations.

      (10) Version of Martini force field is missing in methods section

      We used Martini 2.6 and added this to the methods.

      References

      (1) Prunotto, Alessio, et al. "Molecular bases of the membrane association mechanism potentiating antibiotic resistance by New Delhi metallo-β-lactamase 1." ACS infectious diseases 6.10 (2020): 2719-2731.

      (2) Scott, Kathryn A., et al. "Coarse-grained MD simulations of membrane protein-bilayer self-assembly." Structure 16.4 (2008): 621-630.

      (3) Izrailev, S., et al. "Computational molecular dynamics: challenges, methods, ideas. Chapter 1. Steered molecular dynamics." (1997).

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements [optional]

      We thank the reviewers for their insightful comments regarding our study and for appreciating the range of experiments used, the depth of our study and the significance of our work. We also thank reviewers with expertise in evolutionary biology for highlighting the need for precise wording in some parts of the manuscript and the need to balance the various viewpoints on the current understanding of early metazoan evolution. A point-by-point response to each reviewer comment is given below. We believe that we can effectively address most reviewer comments in a revised version. The revised, improved manuscript will be the first insightful study of intracellular signalling pathways in the context of early animal evolution. We thank the reviewer for noting that this study is highly impactful and can have a broader influence on the scientific community.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: The researchers identified PIP4K (phosphatidylinositol 5 phosphate 4-kinase) as a lipid kinase that is specific to metazoans. In order to determine its conserved function across metazoans, they compared PIP4K activity in both early-branching metazoans and bilaterian animals. Biochemical assays demonstrated a conserved catalytic activity between the sponge Amphimedon queenslandica (AqPIP4K) and human PIP4K. In in-vivo experiments, AqPIP4K was found to rescue the reduced cell size, growth, and development phenotype in larvae of a Drosophila PIP4K null mutant. Based on these findings, the authors suggest that the function of PIP4K was established in early metazoans to facilitate intercellular communication. The experiments were well designed, and a range of biochemical, in vitro, and in vivo experiments were conducted.

      That being said, there are some questions that require further discussion before we can fully accept the authors' conclusion of an evolutionarily conserved function of PIP4K across metazoans.

      Major comments:

      • The authors mentioned that PIP4K is metazoan-specific and involved in intercellular communication. How can we explain the presence of PIP4K in choanoflagellate genomes? Despite its high similarity with conserved domains and functionally important residues, experimental results with the PIP4K from the choanoflagellate Monosiga brevicollis (MbPIP4K), such as the mass spectrometry-based kinase assay and the Drosophila PIP4K mutant experiments, didn't show similar activity to sponge AqPIP4K. The authors suggested that "In the context of other ancient PIP4K it is possible that since choanoflagellates exist as both single-cell and a transient multicellular state and do not have the characteristics of metazoans, PIP4K does not play any important functional role in these." However, this explanation is not well justified; they need to provide a more detailed discussion on this.

      Response: PIP4K is found in the genome of the choanoflagellate M. brevicollis. MbPIP4K has the requisite kinase domain, and the critical residue in the activation loop (A381) required for PIP4K activity is also conserved with the Amphimedon enzyme. Despite this, and unlike AqPIP4K, MbPIP4K was unable to rescue the growth and cell size phenotype of dPIP4K mutants (dPIP4K29).

      We have previously published a comparison of the in vitro activity versus in vivo function for the three PIP4K enzymes in the human genome (Mathre et al. PMID: 30718367). While all three human PIP4K isoforms can functionally rescue the Drosophila dPIP4K mutant, there is a nearly 10^4-fold difference in in vitro activity between them, with PIP4K2C showing almost no in vitro activity. The difference in in vitro enzyme activity between MbPIP4K and AqPIP4K is similarly notable. We would, however, highlight that this is more likely a reflection of the limitations of the in vitro PIP4K activity assay.

      However, while AqPIP4K can rescue function in vivo (rescue of fly mutant phenotypes), MbPIP4K could not when expressed in fly cells. This must imply that there are differences in the polypeptide sequences of AqPIP4K and MbPIP4K that allow the former but not the latter to couple to the Insulin PI3K signalling pathway in fly cells. Given that Amphimedon and choanoflagellates are separated by 100-150 million years of evolution, this is possible. Our data on expression of AqPIP4K and MbPIP4K in fly S2 cells show that they do not have equivalent localization (Fig 2C). What are the differences in the two polypeptides that lead to this? We will perform a multiple sequence alignment using PIP4K sequences from multiple choanoflagellates and sponges to identify these differences.

      We will include the results of this analysis and an appropriate discussion in the revised manuscript.

      • Likewise, the PIP4K gene has been identified in cnidarians, which are a sister group to bilaterian animals. However, the Cnidaria HvPIP4K showed no activity in biochemical or functional assays. In comparison to sponges, cnidarians are relatively complex organisms, and I believe that PIP4K is highly important for intercellular communication, as it is in bilaterians. The authors attempted to explain this by suggesting that "Based on theories of parallel evolution between cnidarians and sponges during early metazoan evolution, it is possible that the PIP4K gene was retained functional in one lineage and not in other." However, I am not convinced by this statement.

      Response: This is a really interesting and challenging question from the reviewer. We are aware that both sponges (Porifera) and Cnidaria are examples of primitive metazoans separated by 80-90 million years of evolution, yet while AqPIP4K shows activity and can functionally rescue dPIP4K mutants, HvPIP4K cannot. What does this mean?

      A key difference between sponges and cnidarians is that while cnidarians have a simple “nerve-net” like nervous system, sponges do not have such a mode of communication. Therefore, it is possible that PIP4K, which we propose works in the context of hormone-based communication, is functionally important in sponges.

      We are of course aware, and acknowledge, that in a like-for-like experimental system (Drosophila cells) our data show that the two proteins behave differently, be it in terms of in vitro activity or in vivo function. This must imply inherent differences in the two polypeptides.

      What we propose to do is to compare available PIP4K sequences from multiple Porifera and Cnidaria genomes and try and understand differences in the protein sequence that might explain differences in function. These results and their implications will be included in the revised manuscript.

      • Please provide the versions of the databases used (UniProtKB, NCBI sequence database, Pfam). After identifying the specific PIP4K protein in each species (e.g. AqPIP4K and HvPIP4K), have you considered performing a reciprocal blast against the human genome to see if you have a top hit to PIP4K? Since the main focus of the project is on PIP4K as a metazoan-specific protein, we need to include a wider representation of non-bilaterian animals, including multiple species from sponges, ctenophores, placozoans, and cnidarians. Additionally, please check if homologues of PIP4K are present in other unicellular holozoans besides choanoflagellates.

      Response: We will add the NCBI IDs for all the sequences. We carried out a reciprocal BLAST against the human proteome before classifying the selected sequences as PIP4K, and we will add these results to the supplementary material. We will add more species of sponges, ctenophores, placozoans, and cnidarians to our analysis of PIP4K sequences. We will also include an analysis of other unicellular holozoans for which genome sequences are available.
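
      To make the logic of a reciprocal best-hit check concrete, the sketch below pairs a query with a candidate only when each is the other's top-scoring hit. It is a generic illustration rather than the pipeline used in the study: the hit tuples, identifiers such as "cand_1", and the bit scores are hypothetical, and in practice the tuples would be parsed from the search tool's tabular output.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Pairs (query, subject) where each is the other's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Hypothetical forward search: human PIP4K against a species proteome.
forward = [
    ("HsPIP4K2A", "cand_1", 310.0),
    ("HsPIP4K2A", "cand_2", 120.0),
]
# Hypothetical reverse search: candidates against the human proteome.
reverse = [
    ("cand_1", "HsPIP4K2A", 305.0),
    ("cand_1", "HsPIP5K1A", 140.0),
    ("cand_2", "HsPIP5K1A", 200.0),
    ("cand_2", "HsPIP4K2A", 110.0),
]
pairs = reciprocal_best_hits(forward, reverse)
```

      In this toy example only cand_1 is retained, because the top human hit of cand_2 is a PIP5K rather than a PIP4K.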

      • The authors suggested the identification of other components of the PI signaling pathway along with PIP4K in the sponge. What is the status of these PI signaling pathway genes in other non-bilaterians and choanoflagellates?

      Response: We will add these details to the revised manuscript, and we agree that this will help enhance the interpretation of our results.

      • Phylogenetic tree of all PIP4K sequences (Figure 1C): How can the authors be certain that the identified PIP4K sequences (e.g. AqPIP4K, HvPIP4K, and MbPIP4K) are indeed PIP4K, especially when there are several closely related proteins? It is important to conduct phylogenetic analysis alongside other PIP sequences (such as PI3K, PI4K, PIP5K, and PIP4K). If this analysis is carried out, the identified AqPIP4K, HvPIP4K, and MbPIP4K should be grouped together with human PIP4K in the same cluster.

      Response: As described in the methods, we searched all the individual genomes analyzed for all PIK and PIPK enzyme sequences. We marked the domains (using Pfam and InterPro) on these sequences, eliminated other PIK and PIPK sequences (such as PI3K, PI4K, PIP5K) and selected only PIP4K. To additionally confirm the distinction between PIP5K and PIP4K, we manually inspected each sequence to establish the identity of the A381 amino acid residue in the activation loop. The identity of the amino acid at this position in the activation loop has been experimentally demonstrated to be an essential feature of PIP4K (Kunze et al. PMID: 11733501), and we have also confirmed this independently in a recent study (Ghosh et al. PMID: 37316298).

      We will perform the phylogenetic analysis of the phosphoinositide kinases in the format suggested by the reviewer and add it to the revision as supporting evidence.

      Minor comments:

      • Line 157: Phylogenetic conservation of PIP4Ks: Please provide details about bootstrap analysis. Response: Will be added

      • Line 230: symbol correction 30{degree sign}C Response: Will be done

      • Line 429-430: "from early metazoans like Sponges, Cnidaria and Nematodes." Nematodes are not considered early metazoans. Response: Apologies for the typo. This will be corrected. We agree that nematodes are not early metazoans.

      • Line 477-478: "However, interestingly, MbPIP4K::GFP localizes only at the plasma membrane in S2 cells (Figure 2C)." This part was not further discussed. Can you please elaborate on why MbPIP4K::GFP localizes only at the plasma membrane in S2 cells? Response: We have discussed this point specifically in response to major comment by the reviewer and it will be addressed as described.

      • Line 598: "the earliest examples of metazoa, namely the coral A. queenslandica" A. queenslandica is a sponge, not coral. Response: Apologies for the error. We will correct it.

      • Line 602: "Amphimedon and human enzyme, although separated by 50Mya years of evolution" I think it's 500 million years ago, not 50 million years ago. Response: This typo will be corrected.

      • Line 612: "coordinated communication between the cells is the most likely function" the cell. Response: Will change the sentence accordingly

      • Line 614: "intracellular phosphoinositide signalling the identity of the hormone" missing full stop punctuation. Response: Will change the sentence accordingly

      • Line 802 - 804: "other by way of difference in colour. The sub clusters have been numbered (1- early metazoans, 2- Nematodes, 3- Arthropods, 4- Molluscs, 5- Vertebrates (isoform PIP4K2C), 6- Vertebrates (isoform PIP4K2A), 7- Vertebrates (isoform PIP4K2B)." In the Figure, I can't find numbers on the subclusters. Response: Will add the numbers in the figure.

      • Line 805- 807: "Phylogenetic analysis of selected PIP4K sequences from model organisms of interest. PIP4K from A. queenslandica has been marked in rectangular box." The rectangular box is missing in the figure. Response: Will change the figure accordingly

      • Figure 1C: full forms of species names are missing. Response: Will change the figure accordingly

      Reviewer #1 (Significance (Required)):

      The data is presented well, and the authors used a wide range of assays to support their conclusion. The study is highly impactful and can have a broader influence on the scientific community, particularly in evolutionary molecular biology, development, and biochemistry.

      The study provides interesting findings; however, the reasons for PIP4K not being functional in cnidarians as in sponges and why PIP4K is present in unicellular holozoans but not functional are unclear.

      We thank the reviewer for appreciating the significance and impact of our study. The very helpful questions raised by the reviewer will help enhance the quality of our study even further. We will make every effort to address these queries.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Krishnan et al. uses molecular phylogenetics, in vitro kinase assays, heterologous expression assays in Drosophila S2 cells and mutant complementation assays in yeast to study the evolution and function of putative PIP4 kinase genes from a sponge, a cnidarian and a choanoflagellate. Based on these experiments, the authors conclude that PIP4K is metazoan-specific and that the sponge PIP4K has conserved functions in selectively phosphorylating PI5P.

      The study is in principle of interest and it could all be valid data, but the large number of flaws in the data presentation and/or analysis just makes it hard to assess the quality and thus validity of the data and conclusions.

      We thank the reviewer for appreciating the potential interest in our findings of PIP4K function in early metazoans. We thank them for noting the need for correcting data presentation and these will be done in the revision.

      Major comments:

      Overall, the manuscript lacks scientific rigor in the analysis and representation of the results, and the validity of many of the conclusions is therefore difficult to assess.

      Major problems are:

      (i) The authors base their study on the evolution of PIP4K genes on a deeply flawed concept of animal evolution. On multiple occasions, including the title, the authors refer to extant species (e.g. Amphimedon) as 'early metazoan', 'regarded as the earliest evolved metazoan' (l. 46-7) or 'the earliest examples of metazoans' just to name a few. This reflects a 'ladder-like' view on evolution that suggests that extant sponges are identical to early 'steps' of animal evolution.

      We thank the reviewer, who is clearly vastly more experienced in the field of evolutionary biology, for pointing out the possibly imprecise or incorrect usage of the term “ancient metazoan”. As new entrants to this area of evolutionary biology, we have of course referred to the existing literature, such as PMID: 20686567, to guide us. This paper describes the sequencing of the A. queenslandica genome. It is clear that there is perceived value in studying this sponge in the context of early animal evolution, although we are aware that there are a multitude of sponges and not all of them may be of value in the study of early animal evolution. We will peruse the literature more carefully and revise the manuscript to provide a more balanced view of this very interesting but unresolved area.

      Also, the authors' interpretation that one cluster of genes 'contained the sequences from early metazoans like sponges, cnidaria and nematodes' is referring to an outdated idea of animal phylogeny where nematodes were thought to be ancestrally simple organisms grouped as 'Acoelomata'. This idea of animal phylogeny has, however, been disproven by molecular phylogenetics since the 1990s.

      Response: We are aware that the field of animal classification is undergoing continuous evolution. While earlier classifications may have been based on the presence or absence of a coelom and/or other anatomical features, we are aware of the use of molecular phylogenetics.

      The phylogeny presented in Fig 1C is based on the sequence relationships between the PIP4K sequences from various animal genomes. Any errors in the labelling of groups such as that highlighted by the reviewer will be revised or corrected after a careful consideration of extant views in the field, which are somewhat varied.

      (ii) The description of taxa in the phylogenetic tree in Fig. 1B lacks any understanding of phylogenetic relationships between animals and other eukaryotic groups. What kind of taxa are 'invertebrates' or 'parasites'? And why would 'invertebrates' exclude cnidarians and sponges? Also, why is the outgroup of opisthokonts named 'Eukaryota'?? Are not all organisms represented on the tree eukaryotes?

      Response: We apologize for this imprecision in labelling taxa. This will be corrected.

      (iii) The methods part lacks any information about the type of analysis (ML, Bayesian, Parsimony?) used to perform the phylogenetic analysis shown in Fig. 1C. Also, the authors mention three distinct clusters (l.428) that are not labelled in the figure.

      Response: We will update the methods to include the additional details requested by the reviewer. Fig 1C will be re-labelled.

      (iv) The validity of the Western Blot is difficult to assess as the authors have cut away the MW markers. Without, it is for example difficult to assess the size differences visible between Hydra and Monosiga PIP4K-GFP proteins on Fig. 2B. Also, it has become standard practice to show the whole Western blot as supplementary data in order to assess the correct size of the bands and the specificity of the antibody. This is also missing from this manuscript.

      Response: Cropped Western blots have been shown to facilitate figure preparation in the main manuscript. The complete uncropped Western blots, in all cases, will be shared as source data, as is the standard practice for multiple journals in the Review Commons portfolio.

      (v) The authors claim that AqPIP4K was able to convert PI3P into PI with very low efficiency (Figure 2E), but without further label in the figure or explanation, it remains unclear how the authors come to this conclusion.

      Response: We regret the typo in line 500 of the manuscript, where we stated that “Further,……… was able to convert PI3P into PI with very low efficiency (Figure 2E).” What we intended to write was “Further,……… was able to convert PI3P into PI(3,4)P2 with very low efficiency (Figure 2E).” The efficiency with which this reaction takes place is very low, as has been reported by us (Ghosh et al. PMID: 31652444) and others (Zhang et al. PMID: 9211928). At the exposure of the TLC shown in Fig 2E, the PI(3,4)P2 spot cannot be seen. Much longer exposures of the TLC plate would be needed to see the PI(3,4)P2 spot. This will be corrected in a revised version of the manuscript.

      (vi) The box plots in Fig. 3C and D lack error bars and thus seem to be consisting of only single data points without replicates. Also, Fig. 3C is a quantification of Fig. 3B but it remains unclear what has been quantified and how. It is also unclear how %PIP2 was determined.

      Response: For Fig 3C, the colony count was done on three replicates and the average was used to calculate the % growth for each genotype. We will include error bars and clarify this in the revised figure legend. For Fig 3D, the PIP2/PIP ratio was calculated from biological replicates and the average is represented in the graphs. The individual values can be provided as supplementary data.

      (vii) Throughout Fig. 4, I do not understand the genotypes indicated on the x-axis of the plots and below the images. I read the figure legends and manuscript describing these results at least 3 times, but cannot figure out what it all means. On Fig. 4C, what is the wild-type situation?

      Response: We apologize for the lack of precision in labelling the figures versus the figure legends. This will be corrected in the revision.

      The genotypes are as follows:

      • w1118 (control) * Act-GAL4. This has been referred to as wild type in the figure legend and called Act-Gal4 in Fig4 panels A-E
      • dPIP4K29 – This refers to the protein null strain of dPIP4K. This strain is the background in which all reconstitutions of PIP4K genes have been done.
      • AqPIP4K – PIP4K transgene from A. queenslandica.
      • AqPIP4KKD – Kinase-dead PIP4K transgene from A. queenslandica. In panels A, B, D and E, Act-GAL4: dPIP4K29 indicates the genetic background in which either AqPIP4K or AqPIP4KKD has been reconstituted.

      Reviewer #2 (Significance (Required)):

      If validated and put in the right phylogenetic context, the study could expand our knowledge of the evolution of metazoan-specific features, especially the evolution of proteins involved in cell-cell signalling and growth control. My field of expertise is broadly in evo-devo, molecular phylogenetics, developmental genetics and cell biology. The in vitro lipid analysis seems interesting and potentially valid but I do not have sufficient expertise to evaluate its validity.

      We thank the reviewer for appreciating the novelty of our contribution and its potential to contribute to understanding the evolution of metazoan specific signalling systems, once appropriate corrections have been made. We also appreciate their positive comment on our in vitro experimental analysis. This paper is a big effort to not only perform phylogenetic analysis but address the emerging interpretations experimentally as much as possible.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: In this manuscript, the authors investigate the evolutionary origins of metazoan Phosphatidylinositol phosphates (PIPs) signaling by elucidating the sequence and function of the PIP4K enzyme, which is crucial for converting PI5P to PI(4,5)P2 through phosphorylation. The authors have described PIP4K-like sequences distributed throughout metazoans and choanoflagellates through an extensive sequence screening. With in vitro and in vivo functional assays, the authors have shown that the sponge A. queenslandica PIP4K (AqPIP4K) is functionally similar to its human counterpart and highlight the major discovery of this study - that PIP4K protein function dates back to as early as sponges.

      We thank the reviewer for noting the major finding of our study and our efforts to experimentally validate, using multiple approaches, the findings of our detailed bioinformatics analysis of PIP4K gene distribution across the tree of life.

      Major comments

      There are two key limitations to this paper. Like the sponges, ctenophores are one of the earliest branching metazoans. They are not well addressed in the paper. Secondly, despite finding PIP4 homologs in choanoflagellates, the authors claim that PIP4 is metazoan-specific.

      We thank the reviewer for highlighting these two points; we recognize that both of these are important to address, to the extent that it is possible to do so. These will be addressed using the approaches detailed in the response to reviewer 1 comments.

      1. Line 46: A. queenslandica is the earliest branching metazoan. The phylogeny of sponges and ctenophores is not conclusively defined and hence, the statement must be rephrased. Despite the brief description of the evolution of metazoan lineage in the discussion section, ctenophores are missing from the phylogenetic tree. At least sequence-level information on PIP4K in ctenophores would strongly back the claims of the manuscript. Here is the link to the Mnemiopsis database.

      Response: We thank the reviewer for highlighting this point and pointing us to the Mnemiopsis database. We will most certainly analyse ctenophore genome sequences and add the ctenophore PIP4K sequence to the phylogeny; following this analysis, the discussion will be modified to reflect the findings.

      Mentioning that choanoflagellates contain homologs of PIP4K contradicts the statement that PIP4K is metazoan-specific. As per Fig 1E., the domain organization of PIP4K is conserved among choanoflagellates and metazoans. What is the percent sequence similarity to the query? This could answer why it doesn't show activity in Drosophila rescues - the system might simply not be compatible with the choanoflagellate homolog. The same may apply to the cnidarian homolog HvPIP4K. Further evidence is needed before concluding that MbPIP4K doesn't phosphorylate PIP5. It is additionally fascinating that MbPIP4K localizes at the plasma membrane unlike other homologs - this function might be choano-specific. Overall, PIP4K's possible origin in the choanoflagellate-metazoan common ancestor backs the current research that choanoflagellates indeed hold clues to understanding metazoan evolution. Further research is necessary before concluding (as in line 648) in the discussions section, where it is mentioned that "PIP4K does not play any important functional role in choanos".

      Response: We thank the reviewer for highlighting the very interesting but incompletely understood facets of our study vis-à-vis choanoflagellates versus metazoans. The proposal for additional analysis is indeed interesting and we will carry out these analysis and revise the text accordingly.

      Minor Comments

      1. A detailed comparison of the sequence of the hydra PIP4K might help understand why it may not have worked like the sponge PIP4K. The discussion on the cnidarian PIP4K evolution is not convincing. It may not have worked because of it being expressed in a non-natural system. Structure prediction and comparison of proteins from different early branching animals should be used.

      Response: Thank you for these suggestions to understand why the cnidarian PIP4K may not have been functional. We will perform the suggested analysis and incorporate the data into the revision.

      78 - Multicellularity evolved many times. Maybe say 'first evolved metazoans'

      Response: Thank you for the suggestion.

      Line 598 A. queenslandica is not a coral, it's a sponge.

      Response: Text will be changed accordingly

      Line 612 'thcells' → 'the cells'

      Response: Text will be changed.

      Line 623 - full stop missing after metazoans.

      Response: Text will be changed

      Figure 1B - Classification should be consistent - C. elegans is a species name, whereas ctenophores and vertebrates belong to a different classification. Invertebrates is not a scientific group. The edges of the lines of the phylogenetic tree don't join and they need to be arranged correctly.

      Response: The names in the phylogeny will be changed to maintain uniformity. The representation of the phylogeny will be changed as mentioned.

      Figure 2B The full blot could be shown in the supplement.

      Response: Full blot will be provided as source data on resubmission or included as supplementary based on the destination journal’s specification.


      Optional

      1. Heterologous overexpression does not always provide the full picture of the gene functionality. To make claims on the evolution of function, testing gene functions in homologous systems can give a better picture. For example, performing in vitro kinase activity assays of MbPIP4K after overexpressing PIP4K in Monosiga brevicollis would be great. Data is also missing about the presence and function of ctenophore PIP4K. Overexpression of ctenophore-PIP4K in Drosophila for functional analyses could help in understanding the distribution/diversity of function of PIP4K in early animals.

      Response: We agree with the reviewer that heterologous expression may sometimes not replicate the biochemical environment of cells in the organism from which the expressed gene was originally derived. Yet heterologous expression experiments do sometimes provide insight into properties that depend solely on the polypeptide, with limited or no contribution from the cellular environment. In principle, expressing PIP4K in M. brevicollis cells and then performing kinase assays would be a very good idea. However, we would like to highlight that to date there has been only one study in which septins were transfected into choanoflagellates and their localization observed. We are not set up to culture M. brevicollis and will be unable to do this for a revision of the current manuscript. However, we appreciate the importance of this experiment and will do this in collaboration with a choanoflagellate lab in a follow-up study to this one.

      Ctenophores, like cnidarians, have two main layers of cells that sandwich a middle layer of jelly-like material, while more complex animals have three main cell layers and no intermediate jelly-like layer. Hence ctenophores and cnidarians have traditionally been labelled diploblastic. Studies have shown that ctenophores and unicellular eukaryotes share ancestral metazoan patterns of chromosomal arrangements, whereas sponges, bilaterians, and cnidarians share derived chromosomal rearrangements. Conserved syntenic characters unite sponges with bilaterians, cnidarians, and placozoans in a monophyletic clade while ctenophores are excluded from this clustering, placing ctenophores as the sister group to all other animals. The ctenophore PIP4K sequence can be identified and compared, as discussed before, to the other PIP4K sequences used in this study.

      Reviewer #3 (Significance (Required)):

      Significance: This is the first study that addresses PIP signaling pathway in early metazoans. The findings of this manuscript contribute to the understanding of second-messenger signaling and its link with the origin and evolution of metazoan multicellularity. PIP signaling is crucial in different metazoan aspects such as cytoskeletal dynamics, neurotransmission, and vesicle trafficking, and hence, plays a critical role in metazoan multicellularity. Through this study, it was interesting to see that some components of the PIP signaling pathway are conserved in yeast, but some, such as the PIP4K protein evolved at the brink of metazoan evolution, highlighting the need for complexity in metazoans and their close relatives - the facultatively multicellular choanoflagellates. Since this is a crucial pathway in human biology and has medical significance due to its role in tumorigenesis and cancer cell migration, this study serves the audience in basic research such as evolutionary biology, and applied research such as human medicine. My field of expertise is molecular biology, cell biology and microbiology, with specific expertise on choanoflagellates. Therefore, it is exciting to see the homologs of PIP4K present in choanoflagellates.

      Evidence, Reproducibility, and clarity:

      The authors have made a clear case for why PIP4K needs to be studied. They have thoroughly mapped PIP4K throughout the tree of life. The results are clear and reproducible. With the findings of this study, they have linked the PIP signalling cascade and metazoan evolution. Using the heterologous expression of sponge A. queenslandica PIP4K, they have provided compelling evidence that AqPIP4K functions in PIP5 phosphorylation, as seen in humans and Drosophila. However, it was not convincing why the hydra PIP4K was not functional. It was also not convincing why the PIP4K is metazoan-only when there is a conserved sequence (with conserved domain structure) present in choanoflagellates.


      We thank the reviewer for appreciating the novelty and importance of our findings in multiple areas of basic biology related to early metazoans and basic biomedical sciences. We also note their comments on the clear and reproducible results presented. Points raised related to the lack of functionality of PIP4K from Hydra and choanoflagellates are noted and will be addressed as indicated in response to other reviewer comments.


      Experiments/Analysis to be done

      1. We will perform a multiple sequence alignment using PIP4K sequences from multiple choanoflagellates and sponges to identify these differences.
      2. What we propose to do is to compare available PIP4K sequences from multiple Porifera and Cnidaria genomes and try and understand differences in the protein sequence that might explain differences in function.
      3. We will add more species of sponges, ctenophores, placozoans, and cnidarians in our analysis of PIP4K sequences. We will also include an analysis of other unicellular holozoans where genome sequence is available.
      4. We will perform the phylogenetic analysis of the phosphoinositide kinases in the format suggested by the reviewer and add it in the revision as a supporting evidence.
      5. Structure prediction and comparison of proteins from different early branching animals should be used.
      6. Uniformity of terminology and alignment with conventions in the field of animal taxonomy
      7. NCBI IDs of sequences to be added; include more non-bilaterian animal sequences and redo the phylogeny.
      8. Check for PI signalling genes in choanoflagellates
      9. More detailed description of phylogenetic analysis.
      10. Add complete Western blot as source data.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.


      4. Description of analyses that authors prefer not to carry out

      • Expression of PIP4K in choanoflagellates and in vitro kinase assays with lysates. It is beyond our technical ability to perform these experiments at this stage.
    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Bell et al. describes an analysis of the effects of removing one of two mutually exclusive splice exons at two distinct sites in the Drosophila CaV2 calcium channel Cacophony (Cac). The authors perform imaging and electrophysiology, along with some behavioral analysis of larval locomotion, to determine whether these alternatively spliced variants have the potential to diversify Cac function in presynaptic output at larval neuromuscular junctions. The authors provide valuable insights into how alternative splicing at two sites in the calcium channel alters its function.

      Strengths:

      The authors find that both of the second alternatively spliced exons (I-IIA and I-IIB) that are found in the intracellular loop between the first and second set of transmembrane domains can support Cac function. However, loss of the I-IIB isoform (predicted to alter potential beta subunit interactions) results in 50% fewer channels at active zones, a decrease in neurotransmitter release, and a reduced ability to support presynaptic homeostatic potentiation. Overall, the study provides new insights into Cac diversity at two alternatively spliced sites within the protein, adding to our understanding of how presynaptic calcium channel function can be regulated by splicing.

      Weaknesses:

      The authors find that one splice isoform (IS4B) in the first S4 voltage sensor is essential for the protein's function in promoting neurotransmitter release, while the other isoform (IS4A) is dispensable. The authors conclude that IS4B is required to localize Cac channels to active zones. However, I find it more likely that IS4B is required for channel stability and leads to the protein being degraded, rather than any effect on active zone localization. More analysis would be required to establish that as the mechanism for the unique requirement for IS4B.

      (1) We thank the reviewer for this important point. In fact, all three reviewers raised the same question, and the reviewing editor pointed out that caution or additional experiments were required to distinguish between IS4 splicing being important for cac channel localization versus channel stability/degradation. We provide multiple sets of experiments as well as text and figure revisions to strengthen our claim that the IS4B exon is required for cacophony channels to enter motoneuron presynaptic boutons and localize to active zones.

      a. If IS4B were indeed required for cac channel stability (and not for localization to active zones), IS4A channels should be unstable wherever they are. This is not the case, because we have recorded somatodendritic cacophony currents from IS4A-expressing adult motoneurons that were devoid of cac channels with the IS4B exon. Therefore, IS4A cac channels are not unstable but underlie somatodendritic voltage-dependent calcium currents in these motoneurons. These new data are now shown in the revised figure 3C and referred to in the text on page 7, line 42 to page 8, line 9.

      b. Similarly, if IS4B were required for channel stability, IS4A channels should not be present anywhere in the nervous system. We tested this by immunohistochemistry for GFP-tagged IS4A channels in the larval CNS. Although IS4A channels are sparsely expressed, which is consistent with the low expression levels seen in the Western blots (Fig. 1E), there are always defined and reproducible patterns of IS4A label in the larval brain lobes as well as in the anterior part of the VNC. This again shows that the absence of IS4A from presynaptic active zones is not caused by channel instability, because the channel is expressed in other parts of the nervous system. These data are shown in the new supplementary figure 1 and referred to in the text on page 15, lines 3 to 8.

      c. As suggested in a similar context by reviewers 1 and 2, we now show enlargements of the presence of IS4B channels in presynaptic active zones as well as enlargements of the absence of IS4A channels in presynaptic active zones in the revised figures 2A-C and 3A. In these images, no IS4A label is detectable in active zones or anywhere else throughout the axon terminals, thus indicating that IS4B is required for expressing cac channels in the axon terminal boutons and localizing them to active zones. Text and figure legends have been adjusted accordingly.

      d. Related to this, reviewer 1 also recommended quantifying the IS4A and IS4B channel intensity and co-localization with the active zone marker brp (recommendation for authors). After following the reviewers’ suggestion to adjust the background values in the IS4A and IS4B immunolabels to identical levels (revised Figs. 2A-C), it becomes obvious that IS4A channels are not detectable above background in presynaptic terminals or active zones; thus, intensity is close to zero. We still calculated the Pearson’s co-localization coefficient for both IS4 variants with the active zone marker brp. For IS4B channels the Pearson’s correlation coefficient is control-like, just above 0.6, whereas for IS4A channels we do not find colocalization with brp (Pearson’s below 0.25). These new analyses are now shown in the revised figure 2D and referred to on page 6, lines 33 to 38.

      e. Consistent with our finding that IS4B is required for cac channel localization to presynaptic active zones, upon removal of IS4B we find no evoked synaptic transmission (Fig. 2 in initial submission, now Fig. 3B).

      Together these data are in line with a unique requirement of IS4B at presynaptic active zones (not excluding additional functions of IS4B), whereas IS4A containing cac isoforms are not found in presynaptic active zones and mediate different functions.

      Reviewer #2 (Public Review):

      This study by Bell et al. focuses on understanding the roles of two alternatively spliced exons in the single Drosophila Cav2 gene cac. The authors generate a series of cac alleles in which one or the other mutually exclusive exons are deleted to determine the functional consequences at the neuromuscular junction. They find alternative splicing at one exon encoding part of the voltage sensor impacts the activation voltage as well as localization to the active zone. In contrast, splicing at the second exon pair does not impact Cav2 channel localization, but it appears to determine the abundance of the channel at active zones.

      Together, the authors propose that alternative splicing at the Cac locus enables diversity in Cav2 function generated through isoform diversity generated at the single Cav2 alpha subunit gene encoded in Drosophila.

      Overall this is an excellent, rigorously validated study that defines unanticipated functions for alternative splicing in Cav2 channels. The authors have generated an important toolkit of mutually exclusive Cac splice isoforms that will be of broad utility for the field, and show convincing evidence for distinct consequences of alternative splicing of this single Cav2 channel at synapses. Importantly, the authors use electrophysiology and quantitative live sptPALM imaging to determine the impacts of Cac alternative splicing on synaptic function. There are some outstanding questions regarding the mechanisms underlying the changes in Cac localization and function, and some additional suggestions are listed below for the authors to consider in strengthening this study. Nonetheless, this is a compelling investigation of alternative splicing in Cav2 channels that should be of interest to many researchers.

      (2) We believe that the additional data on cac IS4A isoform localization and function as detailed above (response to public review 1) has strengthened the manuscript and answered some of the remaining questions the reviewer refers to. We are also grateful for the specific additional reviewer suggestions which we have addressed point-by-point and refer to below (section recommendations for authors).

      Reviewer #3 (Public Review):

      Summary:

      Bell and colleagues studied how different splice isoforms of voltage-gated CaV2 calcium channels affect channel expression, localization, function, synaptic transmission, and locomotor behavior at the larval Drosophila neuromuscular junction. They reveal that one mutually exclusive exon located in the fourth transmembrane domain encoding the voltage sensor is essential for calcium channel expression, function, active zone localization, and synaptic transmission. Furthermore, a second mutually exclusive exon residing in an intracellular loop containing the binding sites for Caβ and G-protein βγ subunits promotes the expression and synaptic localization of ~50% of CaV2 channels, thereby contributing to ~50% of synaptic transmission. This isoform enhances release probability, as evident from increased short-term depression, is vital for homeostatic potentiation of neurotransmitter release induced by glutamate receptor impairment, and promotes locomotion. The roles of the two other tested isoforms remain less clear.

      Strengths:

      The study is based on solid data that was obtained with a diverse set of approaches. Moreover, it generated valuable transgenic flies that will facilitate future research on the role of calcium channel splice isoforms in neural function.

      Weaknesses:

      (1) Based on the data shown in Figures 2A-C, and 2H, it is difficult to judge the localization of the cac isoforms. Could they analyze cac localization with regard to Brp localization (similar to Figure 3; the term "co-localization" should be avoided for confocal data), as well as cac and Brp fluorescence intensity in the different genotypes for the experiments shown in Figure 2 and 3 (Brp intensity appears lower in the dI-IIA example shown in Figure 3G)? Furthermore, heterozygous dIS4B imaging data (Figure 2C) should be quantified and compared to heterozygous cacsfGFP/+.

      According to the reviewer’s suggestion, we have quantified cac localization relative to brp localization by computing the Pearson’s correlation coefficient for controls and IS4A as well as IS4B animals. These new data are shown in the revised Fig. 2D and referred to on page 6, lines 33-38. Furthermore, we now confirm control-like Pearson’s correlation coefficients for all exon out variants except ΔIS4B and show Pearson’s correlation coefficients for all genotypes side-by-side in the revised Fig. 4D (legend has been adjusted accordingly). In addition, in response to the recommendations to authors, we now provide selective enlargements for the co-labeling of Brp and each exon out variant in the revised figures 2-4. We have also adjusted the background in Fig. 2C (ΔIS4B) to match that in Figs. 2A and B (control and ΔIS4A). This allows a fair comparison of cac intensities following excision of IS4B versus excision of IS4A and control (see also Fig 3). Together, this demonstrates the absence of IS4A label in presynaptic active zones much more clearly. As suggested, we have also quantified brp puncta intensity on m6/7 across homozygous exon excision mutants and found no differences (this is now stated for IS4A/IS4B in the results text on page 6, lines 37/38 and for I-IIA/I-IIB on page 8, lines 42-44). We did not quantify the intensity of cacophony puncta upon excision of IS4B because the label revealed no significant difference from background (which can be seen much better in the images now), but the brp intensities remained control-like even upon excision of IS4B.
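      For readers unfamiliar with the measure, the following is a minimal sketch of how a Pearson's correlation coefficient between two image channels can be computed. It is not the analysis pipeline used for the revised figures; the function name `pearson_colocalization`, the optional region-of-interest mask, and the synthetic arrays in the usage lines are assumptions introduced purely for illustration.

```python
import numpy as np


def pearson_colocalization(cac, brp, mask=None):
    """Pearson's r between two equally sized intensity images.

    cac, brp : arrays of pixel intensities (hypothetical channel names).
    mask     : optional boolean array selecting the pixels to include,
               e.g. a bouton outline; all pixels are used if omitted.
    """
    cac = np.asarray(cac, dtype=float)
    brp = np.asarray(brp, dtype=float)
    if cac.shape != brp.shape:
        raise ValueError("both channels must have the same shape")
    if mask is None:
        mask = np.ones(cac.shape, dtype=bool)

    # Mean-centre the selected pixels of each channel.
    x = cac[mask] - cac[mask].mean()
    y = brp[mask] - brp[mask].mean()

    # r = sum(x*y) / sqrt(sum(x^2) * sum(y^2)); undefined for a flat channel.
    denom = np.sqrt((x ** 2).sum() * (y ** 2).sum())
    if denom == 0:
        return float("nan")
    return float((x * y).sum() / denom)


# Usage with synthetic data: a channel that is a linear rescaling of the
# other is perfectly correlated, regardless of gain and offset.
rng = np.random.default_rng(0)
channel_a = rng.random((16, 16))
channel_b = 2.0 * channel_a + 1.0
r = pearson_colocalization(channel_a, channel_b)
```

      Because the coefficient is insensitive to linear gain and offset, it also shows why matching background levels across genotypes matters mainly for visual comparison of intensities rather than for the coefficient itself.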

      (2) They conclude that I-II splicing is not required for cac localization (p. 13). However, cac channel number is reduced in dI-IIB. Could the channels be mis-localized (e.g., in the soma/axon)? What is their definition of localization? Could cac be also mis-localized in dIS4B? Furthermore, the Western Blots indicate a prominent decrease in cac levels in dIS4B/+ and dI-IIB (Figure 1D). How do the decreased protein levels seen in both genotypes fit to a "localization" defect? Could decreased cac expression levels explain the phenotypes alone?

      We have now precisely defined what we mean by cac localization, namely the selective label of cac channels in presynaptic active zones that are defined as brp puncta, but no cac label elsewhere in the presynaptic bouton (page 6, lines 18 to 20). On the level of CLSM microscopy this corresponds to overlapping cac puncta and brp puncta, but no cac label elsewhere in the bouton. Based on the additional analysis and data sets outlined in our response 1 (see above) we conclude that excision of IS4B does not cause channel mislocalization because we find reproducible expression patterns elsewhere in the nervous system as well as somatodendritic cac current in ΔIS4B (for detail see above). Therefore, the isoforms containing the mutually exclusive IS4A exon are expressed and mediate other functions, but cannot substitute IS4B containing isoforms at the presynaptic AZ. In fact, our Western blots are in line with reduced cac expression if all isoforms that mediate evoked release are missing, again indicating that the presynapse specific cac isoforms cannot be replaced by other cac isoforms. This is also in line with the sparse expression of IS4A throughout the CNS as seen in the new supplementary figure 1 (for detail see above).

      (3) Cac-IS4B is required for Cav2 expression, active zone localization, and synaptic transmission. Similarly, loss of cac-I-IIB reduces calcium channel expression and number. Hence, the major phenotype of the tested splice isoforms is the loss of/a reduction in Cav2 channel number. What is the physiological role of these isoforms? Is the idea that channel numbers can be regulated by splicing? Is there any data from other systems relating channel number regulation to splicing (vs. transcription or post-transcriptional regulation)?

      Our data are not consistent with the idea that splicing regulates channel numbers. Rather, splicing can be used to generate channels with specific properties that match the demand at the site of expression. For the IS4 exon pair we find differences in activation voltage between IS4A and IS4B channels (revised Fig. 3C), with IS4B being required for sustained HVA current. IS4A does not localize to presynaptic active zones at the NMJ and is only sparsely expressed elsewhere in the NS (new supplementary Fig. 1). By contrast, IS4B is abundantly expressed in many neuropils. Therefore, taking out IS4B takes out the more abundant IS4 isoform. This is consistent with different expression levels for IS4 isoforms that have different functions, but we do not find evidence for splicing regulating expression levels per se.

      Similarly, the I-II mutually exclusive exon pair differs markedly in the presence or absence of G-protein βγ binding sites that play a role in acute channel regulation, as well as in the conservation of the sequence for β-subunit binding (see page 5, lines 9-17). Channel number reduction in active zones occurs specifically if expression of the cac channels with the Gβγ-binding site as well as the more conserved β-subunit binding is prohibited by excision of the I-IIB exon (see Fig. 5F). Vice versa, excision of I-IIA does not result in reduced channel numbers. This scenario is consistent with the hypothesis that conserved β-subunit binding affects channel number in the active zone (see page 17, lines 3 to 6 and lines 33-36), but we have no evidence that I-II splicing per se affects channel number.

      (4) Although not supported by statistics, and as appreciated by the authors (p. 14), there is a slight increase in PSC amplitude in dIS4A mutants (Figure 2). Similarly, PSC amplitudes appear slightly larger (Figure 3J), and cac fluorescence intensity is slightly higher (Figure 3H) in dI-IIA mutants. Furthermore, cac intensity and PSC amplitude distributions appear larger in dI-IIA mutants (Figures 3H, J), suggesting a correlation between cac levels and release. Can they exclude that IS4A and/or I-IIA negatively regulate release? I suggest increasing the sample size for Canton S to assess whether dIS4A mutant PSCs differ from controls (Figure 2E). Experiments at lower extracellular calcium may help reveal potential increases in PSC amplitude in the two genotypes (but are not required). A potential increase in PSC amplitude in either isoform would be very interesting because it would suggest that cac splicing could negatively regulate release.

      There are several possibilities to explain this, but as none of the effects is statistically significant, we prefer to not investigate this in further depth. However, given that we cannot find IS4A in presynaptic active zones (revised figures 2C and 3A plus the new enlargements 2Ci and 3Ai, revised text page 6, lines 22 to 24 and 29 to 31, and page 7, second paragraph, same as public response 1D), IS4A channels cannot have a direct negative effect on release probability. Nonetheless, given that IS4A-containing cac isoforms mediate functions in other neuronal compartments (see revised Fig. 3C), they may regulate release indirectly by affecting, e.g., action potential shape. Moreover, in response to the more detailed suggestions to authors we provide new data that give additional insight.

      (5) They provide compelling evidence that IS4A is required for the amplitude of somatic sustained HVA calcium currents. However, the evidence for effects on biophysical properties and activation voltage (p. 13) is less convincing. Is the phenotype confined to the sustained phase, or are other aspects of the current also affected (Figure 2J)? Could they also show the quantification of further parameters, such as CaV2 peak current density, charge density, as well as inactivation kinetics for the two genotypes? I also suggest plotting peak-normalized HVA current density and conductance (G/Gmax) as a function of Vm. Could a decrease in current density due to decreased channel expression be the only phenotype? How would changes in the sustained phase translate into altered synaptic transmission in response to AP stimulation?

      Most importantly, sustained HVA current is abolished upon excision of IS4B (not IS4A, we think the reviewer accidentally mixed up the genotype) and presynaptic active zones at the NMJ contain only cac isoforms with the IS4B exon. This indicates that the cac isoforms that mediate evoked release encode HVA channels. The somatodendritic currents shown in the revised figure 3C (previously 2J) that remain upon excision of IS4B are mediated by IS4A-containing cac isoforms. Please note that these never localize to the presynaptic active zone, and thus do not contribute to evoked release. Therefore, the interpretation is that specifically sustained HVA current encoded by IS4B cac isoforms is required for synaptic transmission. Reduced cac current density due to decreased channel expression is not the cause of impaired evoked release upon IS4B excision; instead, the cause is the absence of any cac channels in active zones. IS4B-containing cac isoforms encode sustained HVA current, and we speculate that this might be a well-suited current to minimize cacophony channel inactivation in the presynaptic active zone. Given that HVA current shows fast voltage-dependent activation and fast inactivation upon repolarization, it is useful at large intraburst firing frequencies as observed during crawling (Kadas et al., 2017) without excessive cac inactivation (see page 15, lines 16 to 20).

      However, we agree with the reviewer that a deeper electrophysiological analysis of splice isoform specific cac currents will be instructive. We have now added traces of control and ΔIS4B from a holding potential of -90 mV (revised Fig. 3C, bottom traces and revised text on page 7, line 43 to page 8, lines 1 to 10), and these are also consistent with IS4B mediating sustained HVA cac current. However, further analysis of activation and inactivation voltages and kinetics suffers from space clamp issues in recordings from the somata of such complex neurons (DLM motoneurons of the adult fly contain roughly 6000 µm of dendrites with over 4000 branches, Ryglewski et al., 2017, Neuron 93(3):632-645). Therefore, we will analyze the currents in a heterologous expression system and present these data to the scientific community as a separate study at a later time point.

      (6) Why was the STED data analysis confined to the same optical section, and not to max. intensity z-projections? How many and which optical sections were considered for each active zone? What were the criteria for choosing the optical sections? Was synapse orientation considered for the nearest neighbor Cac - Brp cluster distance analysis? How do the nearest-neighbor distances compare between "planar" and "side-view" Brp puncta?

      Maximum intensity z-projections would be imprecise because they can artificially suggest close proximity of label that is close by in x and y but far away in z. Therefore, the analysis was executed in xy-direction of various planes of entire 3D image stacks. We considered active zones of different orientations (Figs. 5C, D) to account for all planes. In fact, we searched the entire z-stacks until we found active zones of all orientations within the same boutons, as shown in figures 5C1-C6. The same active zone orientations were analyzed for all exon-out mutants with cac localization in active zones. The distance between cac and brp did not change if viewed from the side or any other orientation. We now explain this more clearly in the results text on page 9, lines 23/24.
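
      The projection artifact described above can be illustrated with a minimal sketch. The coordinates are hypothetical values chosen for illustration, not measurements from our stacks, and the function names are ours:

```python
import math

def distance_3d(p, q):
    """Euclidean distance between two (x, y, z) points in the same units."""
    return math.dist(p, q)

def distance_xy(p, q):
    """Distance after discarding z, as happens in a z-projection."""
    return math.dist(p[:2], q[:2])

# Hypothetical punctum centres in nanometres (illustrative only).
cac_punctum = (100.0, 100.0, 0.0)
brp_punctum = (130.0, 140.0, 600.0)

projected = distance_xy(cac_punctum, brp_punctum)  # looks close in xy
true_3d = distance_3d(cac_punctum, brp_punctum)    # far apart once z is included

print(f"projected: {projected:.1f} nm, 3D: {true_3d:.1f} nm")
```

      In this toy case the two puncta appear about 50 nm apart in the projection although they are separated by roughly 600 nm in the stack, which is why distances were measured within individual planes.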

      (7) Cac clusters localize to the Brp center (e.g., Liu et al., 2011). They conclude that Cav2 localization within Brp is not affected in the cac variants (p. 8). However, their analysis is not informative regarding a potential offset between the central cac cluster and the Brp "ring". Did they/could they analyze cac localization with regard to Brp ring center localization of planar synapses, as well as Brp-ring dimensions?

      In the top views (planar) we did not find any clear offset in cac orientation to brp between genotypes. In such planar synapses (top views, Fig. 5D, left row) we did not find any difference in Brp ring dimensions. We did not quantify brp ring dimensions rigorously, because this study focusses on cac splice isoform-specific localization and function. Possible effects of different cac isoforms on brp-ring dimensions or other aspects of scaffold structure are not central to our study, in particular given that brp puncta are clearly present even if cac is absent from the synapse (Fig. 3A), indicating that cac is not instructive for the formation of the brp scaffold.

      (8) Given the accelerated PSC decay/ decreased half width in dI-IIA (Fig. 5Q), I recommend reporting PSC charge in Figure 3, and PPR charge in Figures 5A-D. The charge-based PPRs of dI-IIA mutants likely resemble WT more closely than the amplitude-based PPR. In addition, miniature PSC decay kinetics should be reported, as they may contribute to altered decay kinetics. How could faster cac inactivation kinetics in response to single AP stimulation result in a decreased PSC half-width? Is there any evidence for an effect of calcium current inactivation on PSC kinetics? On a similar note, is there any evidence that AP waveform changes accelerate PSC kinetics? PSC decay kinetics are mainly determined by GluR decay kinetics/desensitization. The arguments supporting the role of cac splice isoforms in PSC kinetics outlined in the discussion section are not convincing and should be revised.

      We agree that reporting charge in figure 3 is informative and do so in the revised text. Since the result (no significant difference in the PSCs between CS, cac<sup>GFP</sup>, <sup>ΔI-IIA</sup>, and transheterozygous I-IIA/I-IIB, but significantly smaller values in ΔI-IIB) remained unchanged no matter whether charge or amplitude were analyzed, we decided to leave the figure as is and report the additional analysis in the text (page 8, lines 40 to 42). This way, both types of analysis are reported. Please note that EPSC amplitude is slightly but not significantly increased upon excision of I-IIA (Fig. 4J), whereas EPSC half amplitude width is significantly smaller (Fig. 5Q, now revised Fig 6R). Together, a tendency of increased EPSC amplitudes and smaller half amplitude width result in statistically insignificant changes in EPSC in ∆I-IIA (now discussed on page 15, lines 37 to 40). We also understand the reviewer’s concern attributing altered EPSC kinetics to presynaptic cac channel properties. We have toned down our interpretation in the discussion and list possible alterations in presynaptic AP shape or cac channel kinetics as alternative explanations (not conclusions; see revised discussion on page 15, line 40 to page 16, line 2). Moreover, we have quantified postsynaptic GluRIIA abundance to test whether altered PSC kinetics are caused by altered GluRIIA expression. In our opinion, the latter is more instructive than mini decay kinetic analysis because this depends strongly on the distance of the recording electrode to the actual site of transmission in these large muscle cells. Although we find no difference in GluRIIA expression levels we now clearly state that we cannot exclude other changes in GluR receptor fields, which of course, could also explain altered PSC kinetics. We have updated the discussion on page 16, lines 2/3 accordingly.

      (9) Paired-pulse ratios (PPRs): On how many sweeps are the PPRs based? In which sequence were the intervals applied? Are PPR values based on the average of the second over the first PSC amplitudes of all sweeps, or on the PPRs of each sweep and then averaged? The latter calculation may result in spurious facilitation, and thus to the large PPRs seen in dI-IIB mutants (Kim & Alger, 2001; doi: 10.1523/JNEUROSCI.21-24-09608.2001).

      We agree that the PP protocol and analyses had to be described more precisely in the methods and have done so on page 23, lines 31 to 37 in the methods. Mean PPR values are based on the PPRs of each sweep and then averaged. We are aware of the study of Kim and Alger 2001 and have re-analyzed the PP data in both ways outlined by the reviewer. We get identical results with either analysis method. Spurious facilitation is thus not an issue in our data. We now explain this in the methods section along with the PPR protocol. The large spread seen in dI-IIB is indeed caused by reduced calcium influx into active zones with fewer channels, as anticipated by the reviewer (see next point).
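
      For readers unfamiliar with the distinction, the two ways of computing a mean PPR can be sketched as follows. This is an illustration with made-up amplitudes, not our analysis code, and the function names are hypothetical; it shows why averaging per-sweep ratios can inflate the PPR when the first PSC fluctuates strongly from sweep to sweep:

```python
from statistics import mean

def ppr_mean_of_ratios(first, second):
    """Compute one PPR per sweep, then average the per-sweep ratios."""
    return mean(s / f for f, s in zip(first, second))

def ppr_ratio_of_means(first, second):
    """Average each pulse across sweeps, then take a single ratio."""
    return mean(second) / mean(first)

# Made-up PSC amplitudes (nA). Both pulses have the same mean (2.5 nA),
# but the first pulse varies strongly between sweeps.
first_psc = [1.0, 4.0, 1.0, 4.0]
second_psc = [2.5, 2.5, 2.5, 2.5]

by_sweep = ppr_mean_of_ratios(first_psc, second_psc)
by_mean = ppr_ratio_of_means(first_psc, second_psc)

print(f"mean of ratios: {by_sweep:.4f}, ratio of means: {by_mean:.4f}")
```

      With these numbers the ratio of means is 1.0, whereas the mean of per-sweep ratios is about 1.56, i.e. apparent facilitation that arises purely from variability in the first pulse. In our recordings both calculations gave the same result.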

      (10) Could the dI-IIB phenotype be simply explained by a decrease in channel number/ release probability? To test this, I propose investigating PPRs and short-term dynamics during train stimulation at lower extracellular Ca2+ concentration in WT. The Ca2+ concentration could be titrated such that the first PSC amplitude is similar between WT and dI-IIB mutants. This experiment would test if the increased PPR/depression variability is a secondary consequence of a decrease in Ca2+ influx, or specific to the splice isoform.

      In fact, the interpretation that decreased PSC amplitude upon I-IIB excision is caused mainly by reduced channel number is precisely our interpretation (see discussion page 14, last paragraph to page 15, first paragraph in the original submission, now page 16, second paragraph). In addition, we are grateful for the reviewer’s suggestion to titrate the external calcium such that the first PSC amplitude matches in ∆I-IIB and control. This experiment tests whether altered short term plasticity is solely a function of altered channel number or whether additional causes, such as altered channel properties, also play into this. We titrated the first pulse amplitude in ∆I-IIB to match control and find that paired pulse ratio and the variance thereof are not different anymore. Therefore, the differences observed in identical external calcium can be fully explained by altered channel numbers. This additional dataset is shown in the revised figures 6D and E and referred to in the results section on page 10, lines 14 to 25 and the discussion on page 16, lines 36 to 38.

      (11) How were the depression kinetics analyzed? How many trains were used for each cell, and how do the tau values depend on the first PSC amplitude? Time constants in the range of a few (5-10) milliseconds are not informative for train stimulations with a frequency of 1 or 10 Hz (the unit is missing in Figure 5H). Also, the data shown in Figures 5E-K suggest slower time constants than 5-10 ms. Together, are the data indeed consistent with the idea that dI-IIB does not only affect cac channel number, but also PPR/depression variability (p. 9)?

      For each animal the amplitudes of all subsequent PSCs in each train were plotted over time and fitted with a single exponential. For depression at 1 and 10 Hz, we used one train per animal, and 5-6 animals per genotype (as reflected in the data points in Figs. 6I, M). This is now explained in more detail in the revised methods section (page 23, lines 39 to 41). The tau values are not affected by the amplitude of the first PSC. First, we carefully re-fitted new and previously presented depression data and find that the taus for depression at low stimulation frequencies (1 and 10 Hz) are not affected by exon excisions at the I-II site. We thank the reviewer for detecting our error in units and tau values in the previous figure panels 5H and L (this has now been corrected in the revised figure panels 6I and M). Given that PSC amplitude upon I-IIB excision is significantly smaller than in controls and following I-IIA excision, we suspected that the time course of depression at low stimulation frequency is not significantly affected by the amount of calcium influx during the first PSC. To further test this, we followed the reviewer’s suggestion and re-measured depression at 1 and 10 Hz for cac-GFP controls and for delta I-IIB in a higher external calcium concentration (1.8 mM), so that the first PSC was increased in amplitude in both genotypes (1.8 mM external calcium titrates the PSC amplitude in delta I-IIB to match that of controls measured in 0.5 mM external calcium, see revised Figs. 6H, L). Neither in control, nor in delta I-IIB did this affect the time course of synaptic depression (see revised Figs. 6I, M). This indicates that at low stimulation frequencies (1 and 10 Hz) the time course of depression is not affected by mean quantal content. This is consistent with the paired pulse ratio at 100 ms interpulse interval shown in figures 6A-D.
However, for synaptic depression at 1 Hz stimulation the variability of the data is higher for delta I-IIB (independent of external calcium concentration, see rev. Fig. 6I), which might also be due to reduced channel number in this genotype. Taken together, the data are in line with the idea that altered cac channel numbers in active zones are sufficient to explain all effects that we observe upon I-IIB excision on PPRs and synaptic depression at low stimulation frequencies. This is now clarified in the revised text on page 12, lines 3 to 7.
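
      To make the fitting step concrete, a single-exponential fit of PSC amplitudes over a train can be sketched as below. This is a simplified illustration on synthetic data and not the routine used for the figures: it assumes decay towards zero (no steady-state plateau) so that the fit reduces to a linear regression on log amplitudes, and the function name and the numbers are hypothetical:

```python
import math

def fit_single_exponential(times, amplitudes):
    """Fit A(t) = A0 * exp(-t / tau) by least squares on ln(A).

    Assumes strictly positive amplitudes that decay towards zero.
    Returns (A0, tau) with tau in the units of `times`.
    """
    logs = [math.log(a) for a in amplitudes]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx                      # equals -1 / tau
    intercept = y_mean - slope * t_mean    # equals ln(A0)
    return math.exp(intercept), -1.0 / slope

# Synthetic 10 Hz train (one stimulus every 0.1 s) with a made-up
# time constant of 0.5 s and a made-up first amplitude of 100 nA.
stim_times = [i * 0.1 for i in range(10)]
psc_amplitudes = [100.0 * math.exp(-t / 0.5) for t in stim_times]

a0, tau = fit_single_exponential(stim_times, psc_amplitudes)
print(f"A0 = {a0:.1f} nA, tau = {tau:.3f} s")
```

      Because tau is reported in the units of the time axis, plotting stimulus times in seconds versus milliseconds changes the numerical value by a factor of 1000, which is the kind of unit error corrected in the revised panels.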

      (12) The GFP-tagged I-IIA and mEOS4b-tagged I-IIB cac puncta shown in Figure 6N appear larger than the Brp puncta. Endogenously tagged cac puncta are typically smaller than Brp puncta (Gratz et al., 2019). Also, the I-IIA and I-IIB fluorescence sometimes appear to be partially non-overlapping. First, I suggest adding panels that show all three channels merged. Second, could they analyze the area and area overlap of I-IIA and I-IIB with regard to each other and to Brp, and compare it to cac-GFP? Any speculation as to how the different tags could affect localization? Finally, I recommend moving the dI-IIA and dI-IIB localization data shown in Figure 6N to an earlier figure (Figure 1 or Figure 3).

      We now show panels with the two I-II cac isoforms merged in the revised figure 7H (previously 6N). We also tested merging all three labels as suggested, but found this not instructive for the reader. We thank the reviewer for pointing out that the Brp puncta appeared smaller than the cac puncta in some panels. We carefully went through the data and found that the Brp puncta are not systematically smaller than the cac puncta. Please note that punctum size can appear quite different, depending on different staining qualities as well as different laser intensities and different point spread in different imaging channels. The purpose of this figure was not to analyze punctum size and labeling intensity, but instead, to demonstrate that I-IIA and I-IIB are both present in most active zones, but some active zones show only I-IIB labeling, as quantified in figure 7I. We did not follow the suggestion to conduct additional co-localization analyses and compare them with cac-GFP controls, because Pearson co-localization coefficients for cac-GFP and all exon-out variants analyzed, including delta I-IIA and delta I-IIB are presented in the revised figure 4D. Moreover, delta I-IIA and delta I-IIB show similar Manders 1 and 2 co-localization coefficients with Brp (see Figs. 4E, F). We do not want to speculate whether the different tags have any effect on localization precision. Artificial differences in localization precision can also be suggested by different antibodies, but we know from our STED analyses with identical tags and antibodies for all isoforms that I-IIA and I-IIB co-localize identically with Brp (see Figs. 5A-E). Finally, we prefer to not move the figure because we believe it is informative to show our finding that active zones usually contain both I-II splice variants together with the finding that only I-IIB is required for PHP.
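
      As a reminder of what the Pearson and Manders coefficients measure, the following sketch computes them on two tiny made-up intensity profiles. It is an illustration only: the pixel values are invented, the threshold of zero is an assumption, and the function names are ours rather than those of any imaging package:

```python
import math

def pearson(a, b):
    """Pearson correlation of two equally long pixel intensity lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def manders(a, b, threshold_b=0.0):
    """Fraction of the summed intensity of `a` located in pixels
    where `b` exceeds `threshold_b`."""
    overlapping = sum(x for x, y in zip(a, b) if y > threshold_b)
    return overlapping / sum(a)

# Invented pixel intensities for a cac channel and a Brp channel.
cac = [0, 10, 20, 0, 5, 0]
brp = [0, 20, 40, 0, 0, 8]

r = pearson(cac, brp)
m1 = manders(cac, brp)  # share of cac signal in Brp-positive pixels
m2 = manders(brp, cac)  # share of Brp signal in cac-positive pixels

print(f"Pearson r = {r:.3f}, M1 = {m1:.3f}, M2 = {m2:.3f}")
```

      Pearson's coefficient reports how well intensities co-vary across pixels, whereas the two Manders coefficients report the fraction of each label that overlaps the other, so the two measures are complementary.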

      Recommendations for the authors:

      Reviewing Editor Comments:

      We thank you for your submission. All three reviewers urge caution in interpreting the S4 splice variant playing a role specifically in Cac localization, as opposed to just leading to instability and degradation. There are other issues with the electrophysiological experiments, a need for improved imaging and analyses, and some areas of interpretation detailed in the reviews.

      We agree that additional data was required to conclude that IS4 splicing plays a specific role in cac channel localization and is not just leading to channel instability and degradation. As outlined in detail in our response to reviewer 1, comment 1, we conducted several sets of experiments to support our interpretation. First, electrophysiological experiments show that upon removal of IS4B, which eliminates synaptic transmission at the larval NMJ and cac positive label in presynaptic active zones, somatodendritic cac current is reliably recorded (new data in revised figure 3C). This is not in line with a channel instability or degradation effect, but instead with IS4B containing isoforms being required and sufficient for evoked release from NMJ motor terminals, whereas IS4A isoforms are not sufficient for evoked release from axon terminals, but IS4A isoforms alone can mediate a distinct component of somatodendritic calcium current. Second, immunohistochemical analyses reveal that IS4A, which is not present in NMJ presynaptic active zones, is expressed sparsely, but in reproducible patterns in the larval brain lobes and in specific regions of the anterior VNC parts (new supplementary figure 1). Again, the absence of an IS4A-containing cac isoform from presynaptic active zones but their simultaneous presence in other parts of the nervous system is in accord with isoform specific localization, but not with general channel isoform instability. Third, enlargements of NMJ boutons with brp positive presynaptic active zones confirm the absence of IS4A and the presence of IS4B in active zones (these enlargements are now shown in the revised figures 2A-C, 3A, and 4A-C). Fourth, as suggested we have quantified the Pearson co-localization of IS4 isoforms with Brp in presynaptic active zones (revised Fig. 2D). This confirms quantitatively similar co-localization of IS4B and control with Brp, but no co-localization of IS4A with Brp.
In fact, the labeling intensity of IS4A in presynaptic active zones is quantitatively not significantly different from background, and no IS4A label is detected anywhere in the axon terminals at the NMJ, but we find IS4A label in the CNS. Together, these data strongly support our interpretation that the IS4 splice site plays a distinct role in cac channel localization. Figure legends as well as results and discussion section have been modified accordingly (the respective page and line numbers are listed in our point-by-point responses).

      In addition, we have carefully addressed all other public comments as well as all other recommendations for authors by providing multiple new data sets, new image analyses, and revising text. Addressing the insightful comments of all three reviewers and the reviewing editor has greatly helped to make the manuscript better.

      Reviewer #1 (Recommendations For The Authors):

      The conclusion that the IS4B exon controls Cac localization to active zones versus simply being required for channel abundance is not well supported. The authors need to either mention both possibilities or provide stronger support for the active zone localization model if they want to emphasize this point.

      We agree and have included several additional data sets as outlined in our response to point 1 of reviewer 1 and to the reviewing editor (see above). These new data strongly support our interpretation that the IS4B exon controls Cac localization to active zones and is not simply required for channel abundance. The additions to the figures and accompanying text (including the respective figure panel, page, and line numbers) are listed in the point-by-point responses to the reviewers’ public suggestions.

      Figure 2C staining for Cac localization in the delta 4B line is difficult to compare to the others, as the background staining is so high (muscles are green for example). As such, it is hard to determine whether the arrows in C are just background.

      We had over-emphasized the green label to show that there really is no cacophony label in active zones. However, we agree that this hampered image interpretation. Thus, we have adjusted brightness such that it matches the other genotypes (see new figure panel 2C, and figure 3A, bottom). Revising the figure as suggested by the reviewer shows much more clearly that IS4B puncta are detected exclusively in presynaptic active zones, whereas IS4A channels are not detectable in active zones or anywhere else in the axon terminal boutons. Quantification of IS4A label in brp positive active zones confirms that labeling intensity is not significantly above background (page 6, lines 29 to 31 and page 7, lines 19 to 21). Therefore, IS4A is not detectable in active zones at the NMJ.

      It seems more likely that the removal of the 4B exon simply destabilizes the protein and causes it to be degraded (as suggested by the Western), rather than mislocalizing it away from active zones. It's hard to imagine how some residue changes in the S4 voltage sensor would control active zone localization to begin with. The authors should note that the alternative explanation is that the protein is just degraded when the 4B exon is removed.

      Based on additional data and analyses, we disagree with the interpretation that removal of IS4B disrupts protein integrity and present multiple lines of evidence that support sparse expression of IS4A channels (ΔIS4B). As outlined in our response to reviewer 1 and to the reviewing editor, we show (1) in new immunohistochemical stainings (new supplementary figure 1) that upon removal of IS4B, sparse label is detectable in the VNC and the brain lobes (for detail see above). (2) In our new figure 3C, we show cacophony-mediated somatodendritic calcium currents recorded from adult flight motoneurons in a control situation and upon removal of IS4B that leaves only IS4A channels. This clearly demonstrates that IS4A underlies a substantial component of the HVA somatodendritic calcium current, although it is absent from axon terminals. This is in line with isoform specific functions at different locations, but not with IS4A instability/degradation. (3) We do not agree with the reviewer’s interpretation of the Western Blot data in figure 1E (formerly figure 1D). Together with our immunohistochemical data that show sparse cacophony IS4A expression, we think that the faint band upon removal of IS4B in a heterozygous background (that reduces labeled channels even further) reflects the sparseness of IS4A expression. This sparseness is not due to channel instability, but to IS4A functions that are less abundant than the ubiquitously expressed cac<sup>IS4B</sup> channels at presynaptic active zones of fast chemical synapses (see page 15, lines 24 to 29).

      If they really want to claim the 4B exon governs active zone localization, much higher quality imaging is required (with enlarged views of individual boutons and their AZs, rather than the low-quality full NMJ imaging provided). Similarly, higher resolution imaging of Cac localization at Muscle 12 (Figure 2H) boutons would be very useful, as the current images are blurry and hard to interpret. Figure 6N shows beautiful high-resolution Cac and Brp imaging in single boutons for the I-II exon manipulations - the authors should do the same for the 4B line. For all immuno in Figure 2, it is important to quantify Cac intensity as well. There is no quantification provided, just a sample image. The authors should provide quantification as they do for the delta I-II exons in Figure 3.

      We did as suggested and added figure panels to figure 2A-C and to new figures 3A (formerly part of figure 2) and 4A-C (formerly figure 3) showing magnified label at the NMJ AZs to better judge cacophony expression after exon excision. These data are now referred to in the results section on page 6, lines 22 to 24, page 7, lines 18 to 21 and page 8, lines 17/18.

      As suggested, we now also provide quantification of co-localization with brp puncta as Pearson’s correlation coefficient for control, IS4B, and IS4A in the new figure panel 2D (text on page 6, lines 34 to 38). This further underscores control-like active zone localization of IS4B but no significant active zone localization of IS4A. As suggested, we now also quantified the intensity of IS4B label in active zones, and it was not different from control (see revised figure 4H and text on page 8, lines 38/39). We did not quantify the intensity of IS4A label, because it was not above background (text, page 6, lines 30/31).

      Reviewer #2 (Recommendations For The Authors):

      (1a) Questions about the engineered Cac splice isoform alleles:

      The authors use CRISPR gene editing to selectively remove the entire alternatively spliced exons of interest. Do the authors know what happens to the cac transcript with the deleted exon? Is the deleted exon just skipped and spliced to the next exon? Or does the transcript instead undergo nonsense-mediated decay?

      We do not believe that there is nonsense-mediated mRNA decay, because for all exon excisions the respective mRNA and protein are made. Protein has been detected on the level of Western blotting and immunocytochemistry. Therefore, we are certain that the mRNA is viable for each exon excision (and we have confirmed this for low abundance cac protein isoforms by rt-PCR), but only subsets of cac isoforms can be made from mRNAs that are lacking specific exons. However, we cannot make any statements as to whether the lack of specific protein isoforms exerts feedback on mRNA stability, the rate of transcription and translation, or other unknown effects.

      (1b) While it is clear that the IS4 exons encode part of the voltage sensor in the first repeat, are there studies in Drosophila to support the putative Ca-beta and G-protein beta-gamma binding sites in the I-II loop? Or are these inferred from Mammalian studies?

      To the best of our knowledge, there are no studies in Drosophila that unambiguously show Caβ and Gβγ binding sites in the I-II loop of cacophony. However, sequence analysis strongly suggests that I-IIB contains both a Caβ and a Gβγ binding site (AID: α-interacting domain) because the binding motif QXXER is present. In mouse Ca<sub>v</sub>2.1 and Ca<sub>v</sub>2.2 channels the sequence is QQIER, while in Drosophila cacophony I-IIB it is QQLER. In the alternative I-IIA, this motif is not present, strongly suggesting that G<sub>βγ</sub> subunits cannot interact at the AID. However, as already suggested by Smith et al. (1998), based on sequence analysis, Ca<sub>β</sub> should still be able to bind, although possibly with a lower affinity. We agree that this information should be given to the reader and have revised the text accordingly on page 5, lines 9 to 17.
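
      The kind of motif check described above can be sketched in a few lines. This is a toy illustration, not our sequence analysis pipeline: only the five-residue motifs QQIER and QQLER are taken from the text, while the flanking residues, the motif-free peptide, and the function name are invented placeholders:

```python
import re

# QXXER, where X stands for any single residue.
QXXER = re.compile(r"Q..ER")

def find_qxxer(sequence):
    """Return (start index, matched residues) for each QXXER hit."""
    return [(m.start(), m.group()) for m in QXXER.finditer(sequence)]

# Toy peptides: flanking residues are placeholders, not real sequence.
mouse_like = "GGAAQQIERAAGG"      # contains QQIER
cac_i_iib_like = "GGAAQQLERAAGG"  # contains QQLER
motif_free = "GGAAQQLAKAAGG"      # invented example without the motif

for name, seq in [("mouse-like", mouse_like),
                  ("I-IIB-like", cac_i_iib_like),
                  ("motif-free", motif_free)]:
    print(name, find_qxxer(seq))
```

      A pattern match of this kind only indicates that the consensus residues are present; it does not by itself demonstrate binding, which is why we phrase the conclusion as a strong suggestion.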

      (1c) The authors assert that splicing of Cav2/cac in flies is a means to encode diversity, as mammals obviously have 4 Cav2 genes vs 1 in flies. However, as the authors likely know, mammalian Cav2 channels also have various splice isoforms encoded in each of the 4 Cav2 genes. The authors should discuss in more detail what is known about the splicing of individual mammalian Cav2 channels and whether there are any homologous properties in mammalian channels controlled by alternative splicing.

      We agree and now provide a more comprehensive discussion of vertebrate Ca<sub>v</sub>2 splicing and its impact on channel function. In line with what we report in Drosophila, properties like G<sub>βγ</sub> binding and activation voltage can also be affected by alternative splicing in vertebrate Ca<sub>v</sub>2 channels, though the exon patterns are quite different from Drosophila. We integrated this part in the revised discussion (page 14, first paragraph). The respective text is below for the reviewer’s convenience:

      “However, alternative splicing increases functional diversity also in mammalian Ca<sub>v</sub>2 channels. Although the mutually exclusive splice site in the S4 segment of the first homologous repeat (IS4) is not present in vertebrate Ca<sub>v</sub> channels, alternative splicing in the extracellular linker region between S3 and S4 is at a position to potentially change voltage sensor properties (Bezanilla 2002). Alternative splice sites in rat Ca<sub>v</sub>2.1 exon 24 (homologous repeat III) and in exon 31 (homologous repeat IV) within the S3-S4 loop modulate channel pharmacology, such as differences in the sensitivity of Ca<sub>v</sub>2.1 to Agatoxin. Alternative splicing is thus a potential cause for the different pharmacological profiles of P- and Q-channels (both Ca<sub>v</sub>2.1; Bourinet et al. 1999). Moreover, the intracellular loop connecting homologous repeats I and II is encoded by 3-5 exons and provides strong interaction with G<sub>βγ</sub>-subunits (Herlitze et al. 1996). In Ca<sub>v</sub>2.1 channels, binding to G<sub>βγ</sub> subunits is potentially modulated by alternative splicing of exon 10 (Bourinet et al. 1999). Moreover, whole cell currents of splice forms α1A-a (no Valine at position 421) and α1A-b (with Valine) represent alternative variants for the I-II intracellular loop in rat Ca<sub>v</sub>2.1 and Ca<sub>v</sub>2.2 channels. While α1A-a exhibits fast inactivation and more negative activation, α1A-b has delayed inactivation and a positive shift in the IV-curve (Bourinet et al. 1999). This is phenotypically similar to what we find for the mutually exclusive exons at the IS4 site, in which IS4B mediates high voltage activated cacophony currents while IS4A channels activate at more negative potentials and show transient current (Fig. 3; see also Ryglewski et al. 2012). Furthermore, altered Ca<sub>β</sub> interactions have been shown for splice isoforms in loop III (Bourinet et al. 1999), similar to what we suspect for the I-II site in cacophony.
Finally, in mammalian VGCCs, the C-terminus presents a large splicing hub affecting channel function as well as coupling distance to other proteins. Taken together, Ca<sub>v</sub>2  channel diversity is greatly enhanced by alternative splicing also in vertebrates, but the specific two mutually exclusive exon pairs investigated here are not present in vertebrate Ca<sub>v</sub>2 genes.”

      (1d) In Figure 1, it would be helpful to see the entire cac genomic locus with all introns/exons and the 4 specific exons targeted for deletion.

      We agree and have changed figure 1 accordingly.

      (2a) Cav2.IS4B deletion alleles:

      More work is necessary to explain the localization of Cac controlled by the IS4B exon. First, can the authors determine whether actual Cac channels are present at NMJ boutons? The authors seem to indicate that in the IS4B deletion mutants, some Cac (GFP) signal remains in a diffuse pattern across NMJ boutons. However, from the imaging of wild-type Cac-GFP (and previous studies), there is no Cac signal outside of active zones defined by the BRP signal. It would benefit the study to a) take additional, higher resolution images of the remaining Cac signal at NMJs in IS4B deletion mutants, and b) comment on whether the apparent remaining signal in these mutants is only observed in the absence of IS4B-containing Cac channels, or if the IS4A-positive channels are normally observed (but perhaps mis-localized?).

      We have conducted additional analyses to show convincingly that IS4A channels (that remain upon IS4B deletion) are absent from presynaptic active zones. Please see also responses to reviewers 1 and 3. By adjusting the background values in CLSM images to identical values in control, delta IS4A, and delta IS4B, as well as by providing selective enlargements as suggested, the figure panels 2C, Ci and 3A now show much more clearly that upon deletion of IS4B no cac label remains in active zones or anywhere else in the axon terminal boutons (see text on page 6, lines 22 to 24). This is further confirmed by quantification showing that in IS4B mutants cac labeling intensity in active zones is not above background (see text on page 6, lines 27 to 31). We never intended to indicate that there was cac signal outside of active zones defined by the brp signal, and we have now carefully gone through the text so as not to indicate this possibility unintentionally anywhere in the manuscript.

      (2b) Do the authors know whether any presynaptic Ca2+ influx is contributed by IS4A-positive Cac channels at boutons, given the potential diffuse localization? There are various approaches for doing presynaptic Ca2+ imaging that could provide insight into this question.

      We agree that this is an interesting question. However, based on the revisions made, we now show with more clarity that IS4A channels are absent from the presynaptic terminal at the NMJ. IS4A labeling intensities within active zones and anywhere else in the axon terminals are not different from background (see text on page 6, lines 27 to 31 and revised Figs. 2C, Ci, and 3A with new selective enlargements in response to comments of both other reviewers). This is in line with our finding that evoked synaptic transmission from NMJ axon terminals to muscle cells is mostly absent upon excision of IS4B (see Fig. 3B). The very small amplitude EPSC (below 5 % of the normal amplitude of evoked EPSCs) that can still be recorded in the absence of IS4B is similar to what is observed in cac null mutant junctions and is mediated by calcium influx through another voltage-gated calcium channel, a Ca<sub>v</sub>1 homolog named Dmca1D, as we have previously published (Krick et al., 2021, PNAS 118(28):e2106621118). Gathering additional support for the absence of IS4A from presynaptic terminals by calcium imaging experiments would suffer significantly from the presence of additional types of VGCCs in presynaptic terminals (for sure Dmca1D (Krick et al., 2021) and potentially also the Ca<sub>v</sub>3 homolog DmαG or Dm-α1T). Such experiments would require mosaic null mutants for cac and DmαG channels in a mosaic IS4B excision mutant, which, if feasible at all, would be very hard and time consuming to generate. In the light of the additional clarification that IS4A is not located in NMJ axon terminal boutons, as shown by additional labeling intensity analysis, revised figures with selective enlargement, and revised text, we feel confident to state that IS4A is not sufficient for evoked SV release.

      (2c) Mechanistically, how are amino acid changes in one of the voltage sensing domains in Cac related to trafficking/stabilization/localization of Cac to AZs?

      This is an exciting question that has occupied our discussions a lot. Some sorting mechanism must exist that recognizes the correct protein isoforms, just as sorting and transport mechanisms exist that transport other synaptic proteins to the synapse. We do not think that the few amino acid changes in the voltage sensor are directly involved in protein targeting. We rather believe that the cacophony variants that happen to contain this specific voltage sensor are selected for transport out to the synapse. There are cell biological possibilities to achieve this, but we have not further addressed potential mechanisms because we do not want to enter the realms of speculation.

      (3) How are auxiliary subunits impacted in the Cac isoform mutants?

      Recent work by Kate O'Connor-Giles has shown that both Stj and Ca-Beta subunits localize to active zones along with Cac at the Drosophila NMJ. Endogenously tagged Stj and CaBeta alleles are now available, so it would be of interest to determine if Stj and particular Cabeta levels or localization change in the various Cac isoform alleles. This would be particularly interesting given the putative binding site for Ca-beta encoded in the I-II linker.

      We agree that the synthesis of the work of Kate O'Connor-Giles' group and our study opens up new avenues to explore exciting hypotheses about differential coupling of specific cacophony splice isoforms with distinct accessory proteins such as Caβ and α<sub>2</sub>δ subunits. However, this requires numerous full sets of additional experiments and is beyond the scope of this study.

      (4a) Interpretation of short-term plasticity in the I-IIB exon deletion:

      The changes in short-term plasticity presented in Figure 5 are interpreted as an additional phenotype due to the loss of the I-IIB exon, but it seems this might be entirely explained simply due to the reduced Cac levels. Reduced Cac levels at active zones will obviously reduce Ca2+ influx and neurotransmitter release. This may really be the only phenotype/function of the I-IIB exon. Hence, to determine whether loss of the I-IIB exon encodes any functions in short-term plasticity, separate from reduced Cac levels, the authors should compare short-term plasticity in I-IIB loss alleles with wild type when starting EPSC amplitudes are equal (for example by reducing extracellular Ca2+ levels in wild type to achieve the same levels as in Cac I-IIB exon deleted alleles). Reduced release probability, simply by reduced Ca2+ influx (either by reduced Cac abundance or extracellular Ca2+) should result in more variability in transmission, so I am not sure there is any particular function of the I-IIB exon in maintaining transmission variability beyond controlling Cac abundance at active zones.

      For two reasons we are particularly grateful for this comment. First, it shows us that we needed to explain much more clearly that our interpretation is that changes in paired pulse ratios (PPRs) and in depression at low stimulation frequencies are a causal consequence of lower channel numbers upon I-IIB exon deletion, precisely as pointed out by the reviewer. We have carefully revised the text accordingly on page 10, lines 14-25, page 11, lines 3-7 and 22-28; page 16, lines 36-38. Second, the experiment suggested by the reviewer is superb to provide additional evidence that the cause of altered PPRs is in fact reduced channel number, but not altered channel properties. Accordingly, we have conducted additional TEVC recordings in elevated external calcium (1.8 mM) so that the single PSC amplitudes in I-IIB excision animals match those of controls in 0.5 mM extracellular calcium. This makes the amplitudes and the variance of PPR for all interpulse intervals tested control-like (see revised Figs. 6D, E). This strongly indicates that differences observed in PPRs as well as the variance thereof were caused by the amount of calcium influx during the first EPSC, and thus by different channel numbers in active zones.

      (4b) Another point about the data in Figure 5: If "behaviorally relevant" motor neuron stimulation and recordings are the goal, the authors should also record under physiological Ca2+ conditions (1.8 mM), rather than the highly reduced Ca2+ levels (0.5 mM) they are using in their protocols.

      Although we doubt that the effective extracellular calcium concentration that determines the electromotive force for calcium to enter the ensheathed motoneuron terminals in vivo during crawling is known, we partly followed the reviewer’s suggestion and have repeated the high frequency stimulation trains for ΔI-IIB in 1.8 mM calcium. As for short-term plasticity, this brings the charge conducted to values as observed in control and in ΔI-IIA in 0.5 mM calcium. Therefore, all differences observed in previous figure 5 (now revised figure 6) can be attributed to different channel numbers in presynaptic active zones. This is now explained on page 11, lines 19-28. For controls, recordings at high frequency stimulation in higher external calcium (e.g. 2 mM) have previously been published and show significant synaptic depression (e.g. Krick et al., 2021, PNAS). Given that in the exon out variants we do not expect any differences except for those caused by different channel numbers, we did not repeat these experiments for control and ΔI-IIA.

      (5a) Mechanism of Cac's role in PHP :

      As the authors likely know, mutations in Cac were previously reported to disrupt PHP expression (see Frank et al., 2006 Neuron). Inexplicably, this finding and publication were not cited anywhere in this manuscript (this paper should also be cited when introducing PhTx, as it was the first to characterize PhTx as a means of acutely inducing PHP). In the Frank et al. paper (and in several subsequent studies), PHP was shown to be blocked in mutations in Cac, namely the CacS allele. This allele, like the I-IIB excision allele, reduces baseline transmission presumably due to reduced Ca2+ influx through Cac. The authors should at a minimum discuss these previous findings and how they relate to what they find in Figure 6 regarding the block in PHP in the Cac I-IIB excision allele.

      We thank the reviewer for pointing this out and apologize for this oversight. We agree that it is imperative to cite the 2006 paper by Frank et al. when introducing PhTx-mediated PHP as well as when discussing the effects of cac mutants on PHP together with other published work. We have revised the text accordingly on page 12, lines 9-11 and 21-23 and on page 17, lines 29-33.

      In terms of data presentation in Fig. 6, as is typical in the field, the authors should normalize their mEPSC/QC data as a percentage of baseline (+PhTx/-PhTx). This makes it easier to see the reduction in mEPSC values (the "homeostatic pressure" on the system) and then the homeostatic enhancement in QC. Similarly, in Fig. 6M, the authors should show both mEPSC and QC as a percentage of baseline (wild type or non-GluRIIA mutant background).

      We agree and have changed figure presentation accordingly. Figure 7 (formerly figure 6) was updated as was the accompanying results text on page 12, lines 23-40.
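
      The normalization requested here (expressing mEPSC amplitude and quantal content as a percentage of the -PhTx baseline) can be sketched as follows. This is only an illustrative sketch and not the analysis code used for Figure 7: the function name and all numerical values are hypothetical placeholders chosen to show the arithmetic.

```python
from statistics import mean


def percent_of_baseline(treated, baseline):
    """Express the mean of `treated` as a percentage of the mean of `baseline`."""
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean must be non-zero")
    return 100.0 * mean(treated) / base


# Hypothetical placeholder values, NOT data from the study.
mepsc_minus_phtx = [0.80, 0.90, 1.00, 0.90]  # mEPSC amplitude without PhTx (nA)
mepsc_plus_phtx = [0.40, 0.50, 0.45, 0.45]   # mEPSC amplitude with PhTx (nA)
qc_minus_phtx = [30.0, 32.0, 28.0, 30.0]     # quantal content without PhTx
qc_plus_phtx = [58.0, 62.0, 60.0, 60.0]      # quantal content with PhTx

mepsc_pct = percent_of_baseline(mepsc_plus_phtx, mepsc_minus_phtx)
qc_pct = percent_of_baseline(qc_plus_phtx, qc_minus_phtx)

print(f"mEPSC: {mepsc_pct:.0f} % of baseline")
print(f"QC:    {qc_pct:.0f} % of baseline")
```

      In this representation a value below 100 % for mEPSC amplitude reflects the reduced quantal size after PhTx, and a value above 100 % for quantal content reflects the homeostatic compensation, so both can be read off the same unitless axis.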

      (6) Cac I-IIA and I-IIB excision allele colocalization at AZs:

      These are very nice and important experiments shown in Figures 6N and O, which I suggest the authors consider analyzing in further detail. Most significantly:

      (6i) The authors nicely show that most AZs have a mix of both Cac IIA and IIB isoforms. Using simple intensity analysis, can the authors say anything about whether there is a consistent stoichiometric ratio of IIA vs IIB at single AZs? It is difficult to extract actual numbers of IIA vs IIB at individual AZs without having both isoforms labeled mEOS4b, but as a rough estimate can the authors say whether the immunofluorescence intensity of IIA:IIB is similar across each AZ? Or is there broad heterogeneity, with some AZs having low vs high ratios of each isoform (as the authors suggest across proximal to distal NMJ AZs)?

      We agree and have conducted experiments and analyses to provide these data. We measured the cac puncta fluorescence intensities for heterozygous cac<sup>sfGFP</sup>/cac, cacI-IIA<sup>sfGFP</sup>/cacI-IIB, and cacI-IIB<sup>sfGFP</sup>/cacI-IIA animals. We preferred this strategy, because intensity was always measured from cac puncta with the same GFP tag. Next, we normalized all values to the intensities obtained in active zones from heterozygous cac<sup>sfGFP</sup>/cac controls and then plotted the intensities of I-IIA versus I-IIB containing active zones side by side. Across junctions and animals, we find a consistent ratio of 2:1 in the relative intensities of I-IIB and I-IIA, thus indicating on average roughly twice as many I-IIB as compared to I-IIA channels across active zones. This is consistent with the counts in our STED analysis (see Fig. 5F). These new data are shown in the new figure panel 7J and referred to on page 13, lines 10-16 in the revised text.
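
      The normalization and ratio estimate described above can be illustrated with a short sketch. It is a hedged illustration only: the function name, variable names, and intensity values are hypothetical and are not taken from the study; only the two steps (normalize to the heterozygous tagged control, then compare the isoform means) follow the description in the response.

```python
from statistics import mean


def normalize_to_control(intensities, control_intensities):
    """Divide each punctum intensity by the mean intensity of the control puncta."""
    ctrl = mean(control_intensities)
    if ctrl == 0:
        raise ValueError("control mean must be non-zero")
    return [x / ctrl for x in intensities]


# Hypothetical punctum intensities (arbitrary units), NOT data from the study.
control_puncta = [100.0, 120.0, 80.0]  # heterozygous tagged control
iib_puncta = [60.0, 70.0, 50.0]        # only I-IIB channels carry the tag
iia_puncta = [30.0, 35.0, 25.0]        # only I-IIA channels carry the tag

norm_iib = normalize_to_control(iib_puncta, control_puncta)
norm_iia = normalize_to_control(iia_puncta, control_puncta)

# Ratio of mean normalized intensities, I-IIB relative to I-IIA.
ratio_iib_to_iia = mean(norm_iib) / mean(norm_iia)
print(f"I-IIB : I-IIA intensity ratio = {ratio_iib_to_iia:.2f}")
```

      Because both genotypes carry the same fluorescent tag and are normalized to the same control, the ratio of the means can be compared directly, under the assumption that fluorescence intensity scales roughly linearly with the number of tagged channels.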

      (6ii) Intensity analysis of Cac IIA vs IIB after PHP: Previous studies have shown Cac abundance increases at NMJ AZs after PHP. Can the authors determine whether both Cac IIA vs IIB isoforms increase after PHP or whether just one isoform is targeted for this enhancement?

      We already show that PHP is not possible in the absence of I-IIB channels (see figure 7). However, we agree that it is an interesting question to test whether I-IIA channels are added in the presence of I-IIB channels during PHP, but we consider this a detail beyond the scope of this study.

      Minor points:

      (1) Including line numbers in the manuscript would help to make reviewing easier.

      We agree and now provide line numbers.

      (2) Several typos (abstract "The By contrast", etc).

      We carefully double checked for typos.

      (3) Throughout the manuscript, the authors refer to Cac alleles and channels as "Cav2", which is unconventional in the field. Unless there is a compelling reason to deviate, I suggest the authors stick to referring to "Cac" (i.e. cacdIS4B, etc) rather than Cav2. The authors make clear in the introduction that Cac is the sole fly Cav2 channel, so there shouldn't be a need to constantly reinforce that cac=Cav2.

      We agree and have changed all fly Ca<sub>v</sub>2 references to cac.

      (4) In some figures/text the authors use "PSC" to refer to "postsynaptic current", while in others (i.e. Figure 6) they switch to the more conventional terms of mEPSC or EPSC. I suggest the authors stick to a common convention (mEPSC and EPSC).

      We have changed PSC to EPSC throughout.

      Reviewer #3 (Recommendations For The Authors):

      (1) The abstract could focus more on the results at the expense of the background.

      We agree and have deleted the second introductory background sentence and added information on PPRs and depression during low frequency stimulation.

      (2) What does "strict" active zone localization refer to? Could they please define the term strict?

      Strict active zone localization means that cac puncta are detected in active zones but no cac label above background is found anywhere else throughout the presynaptic terminal, now defined on page 6, lines 27-29.

      (3) Single boutons/zoomed versions of the confocal images shown in Figures 2A-C, 2H, and 3A-C would be very helpful.

      We have provided these panels as suggested (see above and revised figures 2-4). Figure 3 is now figure 4.

      (4) The authors cite Ghelani et al. (2023) for increased cac levels during homeostatic plasticity. I recommend citing earlier work making similar observations (Gratz et al., 2019; DOI: 10.1523/JNEUROSCI.3068-18.2019), and linking them to increased presynaptic calcium influx (Müller & Davis, 2012; DOI: 10.1016/j.cub.2012.04.018).

      We agree and have added Gratz et al. 2019 and Müller and Davis 2012 to the results section on page 12, lines 17/18 and lines 21-23, and to the discussion on page 17, lines 29-33.

      (5) The data shown in Figure 3 does not directly support the conclusion of altered release probability in dI-IIB. I therefore suggest changing the legend's title.

      We have reworded to “Excisions at the I-II exon do not affect active zone cacophony localization but can alter cac<sup>sfGFP</sup> label intensity in active zones and PSC amplitude” as this reflects the data shown in the figure panels more directly.

      (6) It would be helpful to specify "adult flight muscle" in Figure 2J.

      We agree that it is helpful to specify in the figure (now revised figure 3C) that the voltage clamp recordings of somatodendritic calcium current were conducted in adult flight motoneurons and have revised the headline of figure panel 3C and the legend accordingly. Please note, these are not muscle cells but central neurons.

      (7) Do dIS4B/Cav2null MNs indeed show an inward or outward current at -90 to -70 mV/-40 and -50 mV, or is this an analysis artifact?

      No, this is due to baseline fluctuations, as is typical for voltage clamp in central neurons with more than 6000 µm dendritic length and more than 4000 dendritic branches.

      (8) Loss of several presynaptic proteins, including Brp (Kittel et al., 2006), and RBP (Liu et al., 2011), induce changes in GluR field size (without apparent changes in miniature amplitude). The statement regarding the Cav2 isoform and possible effects on GluR number (p. 8) should be revised accordingly.

      We understand and have done two things. First, we measured the intensity of GluRIIA immunolabel in ΔI-IIA, ΔI-IIB, and controls and found no differences. Second, we reworded the statement. It now reads on page 9, lines 1-6: “It seems unlikely that presynaptic cac channel isoform type affects glutamate receptor types or numbers, because the amplitude of spontaneous miniature postsynaptic currents (mEPSCs, Fig. 4K) and the labeling intensity of postsynaptic GluRIIA receptors are not significantly different between controls, I-IIA, and I-IIB junctions (see suppl. Fig. 2, p = 0.48, ordinary one-way ANOVA, mean and SD intensity values are 61.0 ± 6.9 (control), 55.8 ± 8.5 (∆I-IIA), 61.1 ± 17.3 (∆I-IIB)). However, we cannot exclude altered GluRIIB numbers and have not quantified GluR receptor field sizes.”

      (9) The statement relating miniature frequency to RRP size is unclear (p. 8). Is there any evidence for a correlation between miniature frequency to RRP size? Could the authors please clarify?

      We agree that this statement requires caution. Although there is some published evidence for a correlation of RRP size and mini frequency (Neuron, 2009 61(3):412-24. doi: 10.1016/j.neuron.2008.12.029 and Journal of Neuroscience 44 (18) e1253232024; doi: 10.1523/JNEUROSCI.1253-23.2024), which we now refer to on page 9, it is not clear whether this is true for all synapses and how linear such a relationship may be. Therefore, we have revised the text on page 9, lines 6-9. It now reads: “Similarly, the frequency of miniature postsynaptic currents (mEPSCs) remains unaltered. Since mEPSC frequency has been related to RRP size at some synapses (Pan et al., 2009; Ralowicz et al., 2024) this indicates unaltered RRP size upon I-IIB excision, but we have not directly measured RRP size.”

      (10) Please define the "strict top view" of synapses (p. 8).

      Top view is what this reviewer referred to as “planar view” in the public review points 6 and 7. In our responses to these public review points we now also define “strict top view”, see page 9, lines 17-19.

      (11) Two papers are cited regarding a linear relationship between calcium channel number and release probability (p. 15). Many more papers could be cited to demonstrate a supralinear relationship (e.g., Dodge & Rahamimoff, 1967; Weyhersmüller et al., 2011 doi: 10.1523/JNEUROSCI.6698-10.2011). The data of the present study were collected at an extracellular calcium concentration of 0.5 mM, whereas Medeiros et al. (2023) used 1.5 mM. The relationship between calcium and release is supra-linear around 0.5 mM extracellular calcium (Weyhersmüller et al. 2011). This should be discussed/the statements be revised. Also, the reference to Medeiros et al. (2023) should be included in the reference list.

      We have now updated the Medeiros reference (updated version of that paper appeared in eLife in 2024) in the text and reference list. We agree that the relationship of the calcium concentration and P<sub>r</sub> can also be non-linear and refer to this on page 16, lines 26-32, but the point we want to make is to relate defined changes in calcium channel number (not calcium influx) as assessed by multiple methods (CLSM intensity measures and sptPALM channel counting) to release probability. We now also clearly state that we measured at 0.5 mM external calcium (page 16, lines 27/28) whereas Medeiros et al. 2024 measured at 1.5 mM calcium (page 16, lines 31/32).

      (12) Figure 6: Quantal content does not have any units - please remove "n vesicles".

      We have revised this figure in response to reviewer 2 (comment 5) and quantal content is now expressed as percent baseline, thus without units (see revised figure 7).

      (13) Figure 6C should be auto-scaled from zero.

      This has been fixed by revising that figure in response to reviewer 2 (comment 5).

      (14) The data supporting the statement on impaired motor behavior and reduced vitality of adult IS4A should be either shown, or the statement should be removed (p. 13). Any hypotheses as to why IS4A is important for behavior and or viability?

      As suggested, we have removed that statement.

      (15) They do not provide any data supporting the statement that changes in PSC decay kinetics "counteract" the increase in PSC amplitude (p. 14). The sentence should be changed accordingly.

      We agree and have toned down the statement. It now reads on page 16, lines 7-9: “During repetitive firing, the median increase of PSC amplitude by ~10 % is potentially counteracted by the significant decrease in PSC half amplitude width by ~25 %...”.

      (16) How do they explain the net locomotion speed increase in dI-IIA larvae? Although the overall charge transfer is not affected during the stimulus protocols used, could the accelerated PSC decay affect PSP summation (I would actually expect a decrease in summation/slower speed)? Independent of the voltage-clamp data, is muscle input resistance changed in dI-IIA mutants?

      Muscle input resistance is not altered in I-II mutants. We refer to potential causes of the locomotion effects of I-IIA excision in the discussion. On page 16, lines 12 to 21 it reads: “there is no difference in charge transfer from the motoneuron axon terminal to the postsynaptic muscle cell between ∆I-IIA and control. Surprisingly, crawling is significantly affected by the removal of I-IIA, in that the animals show a significantly increased mean crawling speed but no significant change in the number of stops. Given that the presynaptic function at the NMJ is not strongly altered upon I-IIA excision, and that I-IIA likely mediates also Ca<sub>v</sub>2 functions outside presynaptic AZs (see above) and in other neuron types than motoneurons, and that the muscle calcium current is mediated by Ca<sub>v</sub>1 and Ca<sub>v</sub>3, the effect of I-IIA excision of increasing crawling speed is unlikely to be caused by altered pre- or postsynaptic function at the NMJ. We judge it more likely that excision of I-IIA has multiple effects on sensory and pre-motor processing, but identification of these functions is beyond the scope of this study.”

  4. Dec 2024
    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This valuable study reveals how a rhizobial effector protein cleaves and inhibits a key plant receptor for symbiosis signaling, while the host plant counters by phosphorylating the effector. The molecular evidence for the protein-protein interaction and modification is solid, though biological evidence directly linking effector cleavage to rhizobial infection is incomplete. With additional functional data, this work could have implications for understanding intricate plant-microbe dynamics during mutualistic interactions.

      Thank you for this positive comment. Our data strongly support the view that NFR5 cleavage by NopT impairs Nod factor signaling resulting in reduced rhizobial infection. However, other mechanisms may also have an effect on the symbiosis, as NopT targets other proteins in addition to NFR5. In our revised manuscript version, we discuss the possibility that negative NopT effects on symbiosis could be due to NopT-triggered immune responses. As mentioned in our point-by-point answers to the Reviewers, we included additional data into our manuscript. We would also like to point out that we are generally more cautious in our revised version in order to avoid over-interpreting the data obtained.

      Public Reviews:

      Reviewer #1 (Public Review):

      Bacterial effectors that interfere with the inner molecular workings of eukaryotic host cells are of great biological significance across disciplines. On the one hand they help us to understand the molecular strategies that bacteria use to manipulate host cells. On the other hand they can be used as research tools to reveal molecular details of the intricate workings of the host machinery that is relevant for the interaction/defence/symbiosis with bacteria. The authors investigate the function and biological impact of a rhizobial effector that interacts with and modifies, and curiously is modified by, legume receptors essential for symbiosis. The molecular analysis revealed a bacterial effector that cleaves a plant symbiosis signaling receptor to inhibit signaling and the host counterplay by phosphorylation via a receptor kinase. These findings have potential implications beyond bacterial interactions with plants.

      Thank you for highlighting the broad significance of rhizobial effectors in understanding legume-rhizobia interactions. We fully agree with your assessment and have expanded our Discussion (and Abstract) regarding the potential implications of our findings beyond bacterial interactions with plants. We mention the prospect of developing specific kinase-interacting proteases to fine-tune cellular signaling processes in general.

      Bao and colleagues investigated how rhizobial effector proteins can regulate the legume root nodule symbiosis. A rhizobial effector is described to directly modify symbiosis-related signaling proteins, altering the outcome of the symbiosis. Overall, the paper presents findings that will have a wide appeal beyond its primary field.

      Out of 15 identified effectors from Sinorhizobium fredii, they focus on the effector NopT, which exhibits proteolytic activity and may therefore cleave specific target proteins of the host plant. They focus on two Nod factor receptors of the legume Lotus japonicus, NFR1 and NFR5, both of which were previously found to be essential for the perception of rhizobial Nod factor, and the induction of symbiotic responses such as bacterial infection thread formation in root hairs and root nodule development (Madsen et al., 2003, Nature; Tirichine et al., 2003, Nature). The authors present evidence for an interaction of NopT with NFR1 and NFR5. The paper aims to characterize the biochemical and functional consequences of these interactions and the phenotype that arises when the effector is mutated.

      Thank you for your positive feedback. We have now emphasized the interdisciplinary significance of our work in the Introduction and Discussion of our revised manuscript. We highlight how the insights gained from our study can contribute to a better understanding of microbial interactions with eukaryotic hosts in general, and hope that our findings could benefit future research in the fields of pathogenesis, immunity, and symbiosis.

      We appreciate your detailed summary of our work, which is focused on NopT and its interaction with Nod factor receptors. To ensure that the readers can easily follow the rationale behind our work, we have included a more detailed explanation of how NopT was identified to target Nod factor receptors. In particular, we now better describe the test system (Nicotiana benthamiana cells co-expressing NFR1/NFR5 with a given effector of Sinorhizobium fredii NGR234). In addition, we provide now a more thorough background on the roles of NFR1 and NFR5 in symbiotic signaling and refer to the two Nature papers from 2003 on NFR1 and NFR5 (Madsen et al., 2003; Radutoiu et al., 2003).

      Evidence is presented that in vitro NopT can cleave NFR5 at its juxtamembrane region. NFR5 appears also to be cleaved in vivo, and NFR1 appears to inhibit the proteolytic activity of NopT by phosphorylating NopT. When NFR5 and NFR1 are ectopically over-expressed in leaves of the non-legume Nicotiana benthamiana, they induce cell death (Madsen et al., 2011, Plant Journal). Bao et al. found that this cell death response is inhibited by the coexpression of nopT. Mutation of nopT alters the outcome of rhizobial infection in L. japonicus. These conclusions are well supported by the data.

      We appreciate your recognition of the robustness of our conclusions. In the context of your comments, we made the following improvements to our manuscript:

      We included a more detailed description of the experimental conditions under which the cleavage of NFR5 by NopT was observed in vitro and in vivo. Furthermore, additional experiments were added to strengthen the evidence for NFR5 cleavage by NopT (Fig 3, S3, S6, and S14).

      We provided more comprehensive data on the phosphorylation of NopT by NFR1, including phosphorylation assays (Fig. 4) and mass spectrometry results (Fig. S7 and Table S1). These data provide additional information on the mechanism by which NFR1 inhibits the proteolytic activity of NopT.

      We expanded the discussion on the cell death response induced by ectopic expression of NFR1 and NFR5 in Nicotiana benthamiana. We also included further details from Madsen et al. (2011) to contextualize our findings within the known literature.

      We believe that these additions and clarifications have improved the significance and impact of our study.

      The authors present evidence supporting the interaction of NopT with NFR1 and NFR5. In particular, there is solid support for cleavage of NFR5 by NopT (Figure 3) and the identification of NopT phosphorylation sites that inhibit its proteolytic activity (Figure 4C). Cleavage of NFR5 upon expression in N. benthamiana (Figure 3A) requires appropriate controls (inactive mutant versions) that have been provided, since Agrobacterium as a closely rhizobia-related bacterium, might increase defense related proteolytic activity in the plant host cells.

      We appreciate your recognition of the importance of appropriate controls in our experimental design. In response to your comments, we revised our manuscript to ensure that the figures and legends provide a clear description of the controls used. We also included a more detailed description of our experimental design at several places. In particular, we have highlighted the use of the protease-dead version of NopT as a control (NopT<sup>C93S</sup>). Therefore, NFR5-GFP cleavage in N. benthamiana clearly depended on protease activity of NopT and not on Agrobacterium (Fig. 3A). In the revised text, we are now more cautious in our wording and don’t conclude at this stage that NopT proteolyzes NFR5. However, our subsequent experiments, including in vitro experiments, clearly show that NopT is able to proteolyze NFR5.

      We are convinced that these changes have improved the quality of our work.

      Key results from N. benthamiana appear consistent with data from recombinant protein expression in bacteria. For the analysis in the host legume L. japonicus transgenic hairy roots were included. To demonstrate that the cleavage of NFR5 occurs during the interaction in plant cells the authors build largely on western blots. Regardless of whether Nicotiana leaf cells or Lotus root cells are used as the test platform, the Western blots indicate that only a small proportion of NFR5 is cleaved when co-expressed with nopT, and most of the NFR5 persists in its full-length form (Figures 3A-D). It is not quite clear how the authors explain the loss of NFR5 function (loss of cell death, impact on symbiosis), as a vast excess of the tested target remains intact. It is also not clear why a large proportion of NFR5 is unaffected by the proteolytic activity of NopT. This is particularly interesting in Nicotiana in the absence of Nod factor that could trigger NFR1 kinase activity.

      Thank you for your comments regarding the cleavage of NFR5 by NopT and its functional implications. We acknowledge that our immunoblots show only a relatively small proportion of the NFR5 cleavage product. Possible explanations are as follows:

      (1) The presence of full-length NFR5 does not preclude a significant impact of NopT on the function of NFR5, as NopT is able to bind to NFR5. In other words, the NopT-NFR5 and NopT-NFR1 interactions at the plasma membrane might influence the function of the NFR1/NFR5 receptor without proteolytic cleavage of NFR5. In fact, protease-dead NopT<sup>C93S</sup> expressed in NGR234Δ_nopT_ showed certain effects in L. japonicus (fewer infection foci were formed compared to NGR234Δ_nopT_; Fig. 5E). In this context, it is worth mentioning that the non-acylated NopT<sup>C93S</sup> (Fig. 1B) and NopT<sub>USDA257</sub> (Fig. 6B) proteins were unable to suppress NFR1/NFR5-induced cell death in N. benthamiana, but this could be explained by the lack of acylation and altered subcellular localization.

      (2) The cleaved NFR5 fraction, although small, may be sufficient to disrupt signaling pathways, leading to the observed phenotypic changes  (loss of cell death in N. benthamiana; altered infection in L. japonicus).

      (3) The used expression systems produce high levels of proteins in the cell. This may not reflect the natural situation in L. japonicus cells.

      (4) Cellular conditions could impair cleavage of NFR5 by NopT.  Expression of proteins in E. coli may partially result in formation of protein aggregates (inactive NopT; NFR5 resistant to proteolysis).

      (5) In N. benthamiana co-expressing NFR1/NFR5, the NFR1 kinase activity is constitutively active (i.e., does not require Nod factors), suggesting an altered protein conformation of the receptor complex, which may influence the proteolytic susceptibility of NFR5.

      (6) The proteolytic activity of NopT may be reduced by the interaction of NopT with other proteins such as NFR1, which phosphorylates NopT and inactivates its protease activity.

      In our revised manuscript version, we now provide quantitative data on the efficiency of NFR5 cleavage by NopT in the different expression systems used (Supplemental Fig. 14). We have also improved our Discussion in this context. Future research will be necessary to better understand the loss of NFR5 function caused by NopT.

      It is also difficult to evaluate how the ratios of cleaved and full-length protein change when different versions of NopT are present without a quantification of band strengths normalized to loading controls (Figure 3C, 3D, 3F). The same is true for the blots supporting NFR1 phosphorylation of NopT (Figure 4A).

      Thank you for pointing this out. Following your suggestions, we quantified the band intensities for cleaved and full-length NFR5 in our different expression systems (N. benthamiana, L. japonicus and E. coli). The protein bands were normalized to loading controls. The data are shown in the new Supplemental Fig. 14. Similarly, the bands of immunoblots supporting phosphorylation of NopT by NFR1 were quantified. The data on band intensities are shown in Fig. 4B of our revised manuscript. These improvements provide a clearer understanding of how the ratios of cleaved to full-length proteins change in different protein expression systems, and to what extent NopT was phosphorylated by NFR1.

      Nodule primordia and infection threads are still formed when L. japonicus plants are inoculated with ∆nopT mutant bacteria, but it is not clear if these primordia are infected or develop into fully functional nodules (Figure 5). A quantification of the ratio of infected and non-infected nodules and primordia would reveal whether NopT is only active at the transition from infection focus to thread or perhaps also later in the bacterial infection process of the developing root nodule.

      Thank you for highlighting this aspect of our study. In response to your comment, we have conducted additional inoculation experiments with L. japonicus plants inoculated with NGR234 and the NGR234_ΔnopT_ mutant. The new data are shown in Fig. 5A, 5E, and 5G. However, we could not find any uninfected (empty) nodules when roots were inoculated with these strains, and we mention this observation in the Results section of our revised manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript presents data demonstrating NopT's interaction with Nod Factor Receptors NFR1 and NFR5 and its impact on cell death inhibition and rhizobial infection. The identification of a truncated NopT variant in certain Sinorhizobium species adds an interesting dimension to the study. These data try to bridge the gaps between classical Nod-factor-dependent nodulation and T3SS NopT effector-dependent nodulation in legume-rhizobium symbiosis. Overall, the research provides interesting insights into the molecular mechanisms underlying symbiotic interactions between rhizobia and legumes.

      Strengths:

      The manuscript nicely demonstrates NopT's proteolytic cleavage of NFR5, regulated by NFR1 phosphorylation, promoting rhizobial infection in L. japonicus. Intriguingly, authors also identify a truncated NopT variant in certain Sinorhizobium species, maintaining NFR5 cleavage but lacking NFR1 interaction. These findings bridge the T3SS effector with the classical Nod-factor-dependent nodulation pathway, offering novel insights into symbiotic interactions.

      Weaknesses:

      (1) In the previous study, when transiently expressed NopT alone in Nicotiana tobacco plants, proteolytically active NopT elicited a rapid hypersensitive reaction. However, this phenotype was not observed when expressing the same NopT in Nicotiana benthamiana (Figure 1A). Conversely, cell death and a hypersensitive reaction were observed in Figure S8. This raises questions about the suitability of the exogenous expression system for studying NopT proteolysis specificity.

      We appreciate your attention to these plant-specific differences. Previous studies showed that NopT expressed in tobacco (N. tabacum) or in specific Arabidopsis ecotypes (with PBS1/RPS5 genes) causes rapid cell death (Dai et al. 2008; Khan et al. 2022). Khan et al. (2022) recently reported that cell death does not occur in N. benthamiana unless the leaves were transformed with PBS1/RPS5 constructs. Our data shown in Fig. S15 confirm these findings. As cell death (effector-triggered immunity) is usually associated with induction of plant protease activities, we considered N. tabacum and A. thaliana plants unsuitable for testing NFR5 cleavage by NopT. In fact, no NopT/NFR5 experiments were performed with these plants in our study. In response to your comment, we now better describe the N. benthamiana expression system and cite the previous articles. Furthermore, we have revised the Discussion section to better emphasize effector-induced immunity in non-host plants and the negative effect of rhizobial effectors during symbiosis. We believe our revisions provide a clearer understanding of the advantages and limitations of the N. benthamiana expression system.

      (2) NFR5 Loss-of-function mutants do not produce nodules in the presence of rhizobia in lotus roots, and overexpression of NFR1 and NFR5 produces spontaneous nodules. In this regard, if the direct proteolysis target of NopT is NFR5, one could expect the NGR234's infection will not be very successful because of the Native NopT's specific proteolysis function of NFR5 and NFR1. Conversely, in Figure 5, authors observed the different results.

      Thank you for this comment, which points out that we did not address this aspect precisely enough in the original manuscript version. We improved our manuscript and now write that nfr1 and nfr5 mutants do not produce nodules (Madsen et al., 2003; Radutoiu et al., 2003) and that over-expression of either NFR1 or NFR5 can activate NF signaling, resulting in formation of spontaneous nodules in the absence of rhizobia (Ried et al., 2014). In fact, compared to the nopT knockout mutant NGR234_ΔnopT_, wildtype NGR234 (with NopT) is less successful in inducing infection foci in root hairs of L. japonicus (Fig. 5). With respect to formation of nodule primordia, we repeated our inoculation experiments with NGR234_ΔnopT_ and wildtype NGR234 and also included a nopT over-expressing NGR234 strain in the analysis. Our data clearly showed that nodule primordium formation was negatively affected by NopT. The new data are shown in Fig. 5 of our revised version. Our data show that NGR234's infection is not really successful, especially when NopT is over-expressed. This is consistent with our observations that NopT targets Nod factor receptors in L. japonicus and inhibits NF signaling (NIN promoter-GUS experiments). Our findings indicate that NopT is an “Avr effector” for L. japonicus. However, in other host plants of NGR234, NopT has a symbiosis-promoting role (Dai et al. 2008; Kambara et al. 2009). Such differences could be explained by different NopT targets in different plants (in addition to Nod factor receptors), which may influence the outcome of the infection process. Indeed, our work shows that NopT can interact with various kinase-dead LysM domain receptors, suggesting a role of NopT in suppression or activation of plant immunity responses depending on the host plant.
      We discuss such alternative mechanisms in our revised manuscript version and emphasize the need for further investigation to elucidate the precise mechanisms underlying the observed infection phenotype and the role of NopT in modulating symbiotic signaling pathways. In this context, we would also like to mention the two new figures of our manuscript, which show (i) the efficiency of NFR5 cleavage by NopT in different expression systems, (ii) the interaction between NopT<sup>C93S</sup> and His-SUMO-NFR5<sup>JM</sup>-GFP, and (iii) cleavage of His-SUMO-NFP<sup>JM</sup>-GFP by NopT (Supplementary Figs. S8 and S9).

      (3) In Figure 6E, the model illustrates how NopT digests NFR5 to regulate rhizobia infection. However, it raises the question of whether it is reasonable for NGR234 to produce an effector that restricts its own colonization in host plants.

      Thank you for mentioning this point. We are aware of the possible paradox that the broad-host-range strain NGR234 produces an effector that appears to restrict its infection of host plants. As mentioned in our answer to the previous comment, NopT could have additional functions beyond the regulation of Nod factor signaling. In our revised manuscript version, we have modified our text as follows:

      (1) We mention the potential evolutionary aspects of NopT-mediated regulation of rhizobial infection and discuss the possibility that interactions between NopT and Nod factor receptors may have evolved to fine-tune Nod factor signaling to avoid rhizobial hyperinfection in certain host legumes.

      (2) We also emphasize that the presence of NopT may confer selective advantages in other host plants than L. japonicus due to interactions with proteins related to plant immunity. Like other effectors, NopT could suppress activation of immune responses (suppression of PTI) or cause effector-triggered immunity (ETI) responses, thereby modulating rhizobial infection and nodule formation. Interactions between NopT and proteins related to the plant immune system may represent an important evolutionary driving force for host-specific nodulation and explain why the presence of NopT in NGR234 has a negative effect on symbiosis with L. japonicus but a positive one with other legumes.

      (4) The failure to generate stable transgenic plants expressing NopT in Lotus japonicus is surprising, considering the manuscript's claim that NopT specifically proteolyzes NFR5, a major player in the response to nodule symbiosis, without being essential for plant development.

      We also thank you for this comment. We have revised the Discussion section of our manuscript and now discuss our failure to generate stable transgenic L. japonicus plants expressing NopT. We observed that the protease activity of NopT in aerial parts of L. japonicus had a negative effect on plant development, whereas NopT expression in hairy roots was possible. Such differences may be explained by different NopT substrates in roots and aerial parts of the plant. In this context, we also discuss our finding that NopT not only cleaves NFR5 but is also able to proteolyze other proteins of L. japonicus such as LjLYS11, suggesting that NopT not only suppresses Nod factor signaling, but may also interfere with signal transduction pathways related to plant immunity. We speculate that, depending on the host legume species, NopT could suppress PTI or induce ETI, thereby modulating rhizobial infection and nodule formation.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Overall the text and figure legends must be double-checked for correctness of scientific statements. The few listed here are just examples. There are more that are potentially damaging the perception by the readers and thus the value of the manuscript.

      The nopT mutant leads to more infections. In line 358 the statement: "...and the proteolysis of NFR5 are important for rhizobial infection", is wrong, as the infection works even better without it. It is, according to my interpretation of the results, important for the regulation of infection. Sounds a small difference, but it completely changes the meaning.

      We appreciate your thorough review and have taken the opportunity to correct this error. Following your suggestions, we carefully rephrased the whole text and the figure legends to ensure that the scientific statements accurately reflect the findings of our study. We are convinced that these changes have increased the value of this study.

      In line 905 the authors state that NopTC indicates the truncated version of NopT after autocleavage by releasing about 50 a.a. at its N-terminus.

      They do not analyse this cleavage product to support this claim. So better rephrase.

      According to Dai et al. (2008), NopT expressed in E. coli is autocleaved. The N-terminal sequence GCCA obtained by Edman sequencing suggests that NopT was cleaved between M49 and G50.  We improved our manuscript and now write:

      (1) “A previous study has shown that NopT is autocleaved at its N-terminus to form a processed protein that lacks the first 49 amino acid residues (Dai et al., 2008)”

      (2) “However, NopT<sup>ΔN50</sup>, which is similar to autocleaved NopT, retained the ability to interact with NFR5 but not with NFR1 (Fig. S2D).”.

      In line 967: "Both NopT and NopTC after autocleavage exert proteolytic activities" This is confusing as it was suggested earlier that NopTc is a product of the autocleavage. There is no indication of another round of NopTc autocleavage or did I miss something?

      Thank you for bringing this inaccuracy to our attention. There is no second round of NopT autocleavage. We have corrected the text and now write: “NopT and NopT<sup>C</sup> (autocleaved NopT) proteolytically cleave NFR5 at the juxtamembrane domain to release the intracellular domain of NFR5.”

      Given the amount of work that went into the research, the presentation of the figures should be considerably improved. For example, in Figure 3F the mutant is not correctly annotated. In figure 5 the term infection foci and IT occur but it is not explained in the legend what these are, where they can be seen in the figure and how the researchers discriminated between the two events.

      In general, the labeling of the figure panels should be improved to facilitate the understanding. For example, in Figure 3 the panels switch between different host plant systems. The plant could be clarified for each panel to aid the reader. The asterisks are not in line with the signal that is supposed to be marked. And so on. I strongly advise to improve the figures.

      Thank you for your valuable suggestions. We acknowledge the importance of clear and informative figure presentation to enhance the understanding of our research findings. In response to your comments, we made a comprehensive revision of the figures to address the mentioned issues:

      (1) We corrected annotations of the mutant in Figure 3F to accurately represent the experimental conditions.

      (2) We revised the legend of Figure 5 and provide clear explanations of the terms "infection foci" and "IT" (infection threads) in the Methods section.

      (3) We improved the labeling of the figure panels and the wording of the figure legend, specifying the protein expression system (N. benthamiana, L. japonicus and E. coli, respectively). We ensured that the asterisks indicating statistically significant results are properly aligned.

      Furthermore, we carefully reviewed each figure to enhance clarity and readability, including optimizing font size and line thickness. Captions and annotations were also revised.

      Figure 1

      • To verify that the lack of observed cell death is not linked to differential expression levels, an expression control Western blot is essential. In the expression control Western blot given in the supplemental materials (Supplemental fig. 1E), NFR5 is not visible in the first lane.

      We appreciate your comments on the control immunoblot, which was performed to verify the presence of NFR1, NFR5 and NopT in N. benthamiana. However, as shown in Supplemental Fig. 1E, intact NFR5 could not be immunodetected when co-expressed with NFR1 and NopT. To ensure co-expression of NFR1/NFR5, A. tumefaciens carrying a binary vector with both NFR1 and NFR5 was used. In the revised version, we modified the figure legend accordingly and also included a detailed description of the procedure at lines 165-166.

      • Labeling of NFR1/LjNFR1 should be kept consistent between the text and the figures. Currently, the text refers to both NFR1 and LjNFR1 and figures are labelled NFR1. The same is true for NFR5.

      Thank you for pointing out this inconsistency. We revised our manuscript and now consistently use NFR1 and NFR5 without a prefix to avoid any confusion.

      • A clearer description of how cell death was determined would be useful. In the selected pictures in panel D, leaves coexpressing nopT with Bax1 or Cerk1 appear very different from the pictures selected for NopM and AVr3a/R3a.

      We agree that a clearer description of our cell death experiments with N. benthamiana was necessary. We have reworded the figure legend to provide more detailed information on the criteria used for assessing cell death. Additionally, we now show our images at higher resolution.

      • In panel D, the "Death/Total" ratio is only shown for leaf discs where nopT was coexpressed with the cell-death triggering proteins. Including the ratio for leaf discs where only the cell-death triggering protein (without nopT ) was expressed would make the figure more clear.

      Thank you for this suggestion. To provide a more comprehensive comparison, we included the "Cell death/Total" ratio for all leaf disc images shown in Fig. 1D. 

      Figure 2:

      • A: Split-YFP is not ideal as evidence for colocalization because of the chemical bond formed between the YFP fragments that may lead to artificial trapping/accumulation outside the main expression domains. Overall, the authors should revise if this figure aims to show colocalization or interaction. In the current text, both terms are used, but these are different interpretations.

      We appreciate your concern regarding the use of Split-YFP for colocalization analysis. We carefully reviewed the figure and corresponding text to ensure clarity in the interpretation of the results. The primary aim of this figure was to explore protein-protein interactions rather than strict colocalization. Protein-protein interactions have also been validated by other experiments in our work. We have revised the text accordingly and no longer emphasize “co-localization”.

      • Given the focus on proteolytic activity in this paper, all blots need to be clearly labeled with size markers, and it would be good to include a supplemental figure with all other bands produced in the Western blot, regardless of their size. Without this, the results in panel 2D seem inconsistent with results presented in figure 3A, since NFR5 does not appear to be cleaved in the Western blot in 2D, but 3A shows cleavage when the same proteins (with different tags) are coexpressed in the same system.

      Thank you for bringing up this point. We ensured that all immunoblots are clearly labeled with size markers in our revised manuscript. We also carefully checked the consistency of the results presented in Figures 2D and 3A and included appropriate clarifications in the revised manuscript. In Figure 2D, we show the bands at around 75 kD (multiple bands would be detected below this size, including NFR5 cleaved by NopT, but also other non-specific bands).

      Figure 3:

      • In panel E, NopTC93S cannot cleave His-Sumo-NFR5JM-GFP, but it would be interesting to also show if NopTC93S can bind the NFR5JM fragment. It would also be useful to see this experiment done with the JM of NFP.

      Thank you for the suggestion. We agree that investigating the binding of NopT<sup>C93S</sup> to the NFR5<sup>JM</sup> fragment provides valuable insights into the interaction between NopT and NFR5. In our revised version, we show in the new Supplemental Fig. S4 that NopT interacts with NFR5<sup>JM</sup> and cleaves NFP<sup>JM</sup>. The Results section has been modified accordingly.

      • The panels in this figure require better labeling. In many panels, asterisks are misplaced relative to the bands they should highlight, and not all blots have size markers or loading controls.

      Thank you for bringing this to our attention. We carefully reviewed the labeling of all panels in Figure 3 to ensure accuracy and clarity. We ensured that asterisks are correctly placed in the figures. We also included size markers and loading controls to improve the quality of the shown immunoblots.

      • Since there is no clear evidence in this figure that the smear in the blot in panel C is phosphorylated NopT, it is recommended to provide a less interpretative label on the blot, and explain the label in the text.

      We appreciate your suggestion regarding the labeling of the blot in panel C of Fig. 3. We revised the label and provided a less interpretative designation in Fig. 3C. We also rephrased the figure legend and the text in the Results section as recommended.

      Figure 4

      • In B, a brief introduction in the text to the function of the Zn-phostag would make the figure easier to understand for more readers.

      Thank you for the suggestion. We agree and have provided a brief explanation in the Results section: “On such gels, a Zn<sup>2+</sup>-Phos-tag bound phosphorylated protein migrates slower than its unbound nonphosphorylated form.” Furthermore, we have included the reference (Kato & Sakamoto, 2019) in the Methods section.

      Figure 5:

      • Change "Scar bar" to "Scale bar" in the figure captions

      Thank you for spotting that typo. We have corrected it.

      • Correct the references to the figures in the text

      We carefully reviewed Figure 5 and made the corresponding corrections to improve the quality of our manuscript. Please check lines 394-451.

      • It should be clarified what was quantified as "infection foci" (C, F, G)

      We revised the legend of Figure 5 and now provide explanations of the terms "infection foci" and "IT" (infection threads) in the Methods section. Please check lines 399-451.

      • It is recommended to use pictures that are from the same region of the plant root (the susceptible zone). The pictures in panel A appear to be from different regions, since the density of root hairs is different.

      Thank you for bringing this to our attention. We ensured that the images selected for panel A were from the same region of the plant root to guarantee consistency and accuracy of the comparison.

      • Panel G should be labeled so it is clearer that nopT is being expressed in L. japonicus transgenic roots.

      We have labeled this panel more clearly to help the reader understand that nopT was expressed in transgenic L. japonicus roots.

      • Panel F is missing statistical tests for ITs

      We apologize for the omission and have now included the results of our statistical tests for ITs.

      Figure 6:

      • The model presented in panel E misrepresents the role of NFR5 according to the results in the paper. From the evidence presented, it is not clear if the observed rhizobial infection phenotype is due to reduced abundance of full-length NFR5, or if the cleaved NFR5 fragment is suppressing infection. Additionally, S. fredii should not be drawn so close to the plasma membrane, since the bacteria are located outside the cell wall when the T3SS is active.

      We appreciate your comment which helps us to improve the interpretation of our results. We agree that the model should accurately reflect the uncertainties regarding the role of NFR5. We revised the model (positioning of S. fredii etc.) and write in the Discussion:

      “NopT impairs the function of the NFR1/NFR5 receptor complex. Cleavage of NFR5 by NopT reduces its protein levels. Possible inhibitory effects of NFR5 cleavage products on NF signaling are unknown but cannot be excluded.”

      Reviewer #2 (Recommendations For The Authors):

      (1) Some minor weaknesses need addressing: In Figure 5A, the root hair density in the two images appears significantly different. Are these images representative of each treatment?

      We appreciate your attention to detail and the importance of ensuring that the images in Figure 5A are representative. We carefully reviewed our image selection process and confirm that the shown images are indeed representative of each treatment group. In our revised version, we show additional images and have also improved the text in the figure legend. Furthermore, we performed additional GUS staining tests, and the new data are shown in Fig. 5A and 5B.

      (2) Additionally, please ensure consistency in the format of genotype names throughout the manuscript. For instance, in Line 897, italics should be used for "N. benthamiana."

      We thank you for pointing out the format of genotype names and have corrected our manuscript as requested.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      BMP signaling is, arguably, best known for its role in the dorsoventral patterning, but not in nematodes, where it regulates body size. In their paper, Vora et al. analyze ChIP-Seq and RNA-Seq data to identify direct transcriptional targets of SMA-3 (Smad) and SMA-9 (Schnurri) and understand the respective roles of SMA-3 and SMA-9 in the nematode model Caenorhabditis elegans. The authors use publicly available SMA-3 and SMA-9 ChIP-Seq data, own RNA-Seq data from SMA-3 and SMA-9 mutants, and bioinformatic analyses to identify the genes directly controlled by these two transcription factors (TFs) and find approximately 350 such targets for each. They show that all SMA-3-controlled targets are positively controlled by SMA-3 binding, while SMA-9-controlled targets can be either up or downregulated by SMA-9. 129 direct targets were shared by SMA-3 and SMA-9, and, curiously, the expression of 15 of them was activated by SMA-3 but repressed by SMA-9. Since genes responsible for cuticle collagen production were eminent among the SMA-3 targets, the authors focused on trying to understand the body size defect known to be elicited by the modulation of BMP signaling. Vora et al. provide compelling evidence that this defect is likely to be due to problems with the BMP signaling-dependent collagen secretion necessary for cuticle formation.

      We thank the reviewer for this supportive summary. We would like to clarify the status of the publicly available ChIP-seq data. We generated the GFP tagged SMA-3 and SMA‑9 strains and submitted them to be entered into the queue for ChIP-seq processing by the modENCODE (later modERN) consortium. Thus, the publicly available SMA-3 and SMA-9 ChIP-seq datasets used here were derived from our efforts.  Due to the nature of the consortium’s funding, the data were required to be released publicly upon completion. Nevertheless, our current manuscript provides the first comprehensive analysis of these datasets. We have updated the text to clarify this point.

      Strengths:

      Vora et al. provide a valuable analysis of ChIP-Seq and RNA-Seq datasets, which will be very useful for the community. They also shed light on the mechanism of the BMP-dependent body size control by identifying SMA-3 target genes regulating cuticle collagen synthesis and by showing that downregulation of these genes affects body size in C. elegans.

      Weaknesses:

      (1) Although the analysis of the SMA-3 and SMA-9 ChIP-Seq and RNA-Seq data is extremely useful, the goal "to untangle the roles of Smad and Schnurri transcription factors in the developing C. elegans larva", has not been reached. While the role of SMA-3 as a transcriptional activator appears to be quite straightforward, the function of SMA-9 in the BMP signaling remains obscure. The authors write that in SMA-9 mutants, body size is affected, but they do not show any data on the mechanism of this effect.

      We thank the reviewer for directing our attention to the lack of clarity about SMA-9’s function. We have revised the text to highlight what this study and others demonstrate about SMA-9’s role in body size. Simply stated, SMA-9 is needed together with SMA-3 to promote the expression of genes involved in one-carbon metabolism, collagens, and chaperones, all of which are required for body size. SMA-3 has additional, SMA-9-independent transcriptional targets, including chaperones and ER secretion factors, that also contribute to body size. Finally, SMA-9 regulates additional targets independent of SMA-3 that likely have a minimal role in body size. We have adjusted Figure 5 with new graphs of the original data to make these points clearer.

      (2) The authors clearly show that both TFs can bind independently of each other, however, by using distances between SMA-3 and SMA-9 ChIP peaks, they claim that when the peaks are close these two TFs act as complexes. In the absence of proof that SMA-3 and SMA-9 physically interact (e.g. that they co-immunoprecipitate - as they do in Drosophila), this is an unfounded claim, which should either be experimentally substantiated or toned down.

      We acknowledge that we have not demonstrated a physical interaction between SMA-3 and SMA-9 through a co-immunoprecipitation, and we have indicated in the text that a formal biochemical demonstration would be required to make this point. Moreover, we toned down the text by stating that our results suggest that SMA-3 and SMA-9 frequently bind either as subunits in a complex or in close vicinity to each other along the DNA. As the reviewer has indicated, a physical interaction between Smads and Schnurris has been amply demonstrated in other systems. A limitation in these previous studies is that only a small number of target genes were analyzed. Our goal in this study was to determine how widespread this interaction is on a genomic scale. Our analyses demonstrate for the first time that a Schnurri transcription factor has significant numbers of both Smad-dependent and Smad-independent target genes. We have revised the text to clarify this point.
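
      The idea of using distances between SMA-3 and SMA-9 peaks as a proxy for co-binding can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the summit coordinates, the 100 bp threshold, and the simplification that all peaks lie on a single chromosome. It is not the authors' pipeline.

```python
from bisect import bisect_left

def nearest_distance(position, sorted_positions):
    """Distance from `position` to the closest entry of a sorted, non-empty list."""
    i = bisect_left(sorted_positions, position)
    candidates = []
    if i < len(sorted_positions):
        candidates.append(sorted_positions[i] - position)
    if i > 0:
        candidates.append(position - sorted_positions[i - 1])
    return min(candidates)

def count_close_peaks(peaks_a, peaks_b, max_distance):
    """Count peaks in `peaks_a` that have a peak in `peaks_b` within `max_distance` bp."""
    sorted_b = sorted(peaks_b)
    if not sorted_b:
        return 0
    return sum(1 for p in peaks_a if nearest_distance(p, sorted_b) <= max_distance)

# Hypothetical peak summits (bp) on one chromosome, for illustration only.
sma3_summits = [1_000, 5_200, 12_000, 30_500]
sma9_summits = [1_050, 9_000, 12_400, 50_000]

close_100 = count_close_peaks(sma3_summits, sma9_summits, max_distance=100)
close_500 = count_close_peaks(sma3_summits, sma9_summits, max_distance=500)
print(close_100, close_500)
```

      As the reviewer notes, proximity of this kind is compatible with complex formation but does not demonstrate it; the count also depends strongly on the chosen distance threshold.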

      (3) The second part of the paper (the collagen story) is very loosely connected to the first part. dpy-11 encodes an enzyme important for cuticle development, and it is a differentially expressed direct target of SMA-3. dpy-11 can be bound by SMA-9, but it is not affected by this binding according to RNA-Seq. Thus, technically, this part of the paper does not require any information about SMA-9. However, this can likely be improved by addressing the function of the 15 genes, with the opposing mode of regulation by SMA-3 and SMA-9.

      We appreciate this suggestion and have clarified in the text how SMA-9 contributes to collagen organization and body size regulation.

      (4) The Discussion does not add much to the paper - it simply repeats the results in a more streamlined fashion.

      We thank the reviewer for this suggestion. We have added more context to the Discussion.

      Reviewer #2 (Public Review):

      In the present study, Vora et al. elucidated the transcription factors downstream of the BMP pathway components Smad and Schnurri in C. elegans and their effects on body size. Using a combination of a broad range of techniques, they compiled a comprehensive list of genome-wide downstream targets of the Smads SMA-3 and SMA-9. They found that both proteins have an overlapping spectrum of transcriptional target sites they control, but also unique ones. Thereby, they also identified genes involved in one-carbon metabolism or the endoplasmic reticulum (ER) secretory pathway. In an elaborate effort, the authors set out to characterize the effects of numerous of these targets on the regulation of body size in vivo as the BMP pathway is involved in this process. Using the reporter ROL-6::wrmScarlet, they further revealed that not only collagen production, as previously shown, but also collagen secretion into the cuticle is controlled by SMA-3 and SMA-9. The data presented by Vora et al. provide in-depth insight into the means by which the BMP pathway regulates body size, thus offering a whole new set of downstream mechanisms that are potentially interesting to a broad field of researchers.

      The paper is mostly well-researched, and the conclusions are comprehensive and supported by the data presented. However, certain aspects need clarification and potentially extended data.

      (1) The BMP pathway is active during development and growth. Thus, it is logical that the data shown in the study by Vora et al. is based on L2 worms. However, it raises the question of if and how the pattern of transcriptional targets of SMA-3 and SMA-9 changes with age or in the male tail, where the BMP pathway also has been shown to play a role. Is there any data to shed light on this matter or are there any speculations or hypotheses?

      We agree that these are intriguing questions, and we are interested in the roles of transcriptional targets at other developmental stages and in other physiological functions, but these analyses are beyond the scope of the current study.

      (2) As it was shown that SMA-3 and SMA-9 potentially act in a complex to regulate the transcription of several genes, it would be interesting to know whether the two interact with each other or if the cooperation is more indirect.

      A physical interaction between Smads and Schnurri has been amply demonstrated in other systems. Our goal in this study was not to validate this physical interaction, but to analyze functional interactions on a genome-wide scale.

      (3) It would help the understanding of the data even more if the authors could specifically state if there were collagens among the genes regulated by SMA-3 and SMA-9 and which.

      We thank the reviewer for this suggestion. col-94 and col-153 were identified as direct targets of both SMA-3 and SMA-9. We noted this in the Discussion.

      (4) The data on the role of SMA-3 and SMA-9 in the regulation of the secretion of collagens from the hypodermis is highly intriguing. The authors use ROL-6 as a reporter for the secretion of collagens. Is ROL-6 a target of SMA-9 or SMA-3? Even if this is not the case, the data would gain even more strength if a comparable quantification of the cuticular levels of ROL-6 were shown in Figure 6, and potentially a ratio of cuticular versus hypodermal levels. By that, the levels of secretion versus production can be better appreciated.

      We previously showed that rol-6 mRNA levels are reduced in dbl-1 mutants at L2, but RNA-seq analysis did not detect a change in rol-6 that was statistically significant enough to qualify it as a transcriptional target, and total protein levels are also not significantly reduced in mutants. We added this information to the text.

      (5) It is known that the BMP pathway controls several processes besides body size. The discussion would benefit from a broader overview of how the identified genes could contribute to body size. The focus of the study is on collagen production and secretion, but it would be interesting to have some insights into whether and how other identified proteins could play a role or whether they are likely to not be involved here (such as the ones normally associated with lipid metabolism, etc.).

      We have added more information to the Discussion.

      Reviewer #1 (Recommendations For The Authors):

      Figure 1 - Figure 3: The authors might want to think about condensing this into two figures.

      To avoid confusion with the different workflows, we prefer to keep these as three separate figures.

      Figure 1a-b: Measurement unit missing on X.

      We added the unit “bps” to these graphs.

      Line 244-246: The authors should stress in the Results that they analyzed publicly available ChIP-Seq data, which was not generated by them, - not just by providing a reference to Kudron et al., 2018. As far as I understood, ChIP was performed with an anti-GFP antibody. Please mention this, and specify the information about the vendor and the catalog number in the Methods.

      We would like to clarify the status of the publicly available ChIP-seq data. We generated the GFP-tagged SMA-3 and SMA-9 strains and submitted them to be entered into the queue for ChIP-seq processing by the modENCODE (later modERN) consortium. Thus, the publicly available SMA-3 and SMA-9 ChIP-seq datasets used here were derived from our efforts. Due to the nature of the consortium’s funding, the data were required to be released publicly upon completion. Nevertheless, our current manuscript provides the first comprehensive analysis of these datasets. We have clarified these issues in the text. We have also added information regarding the anti-GFP antibody to the Methods.

      Line 267-270: The authors should either provide experimental evidence that SMA-3 and SMA-9 form complexes or write something like "significant overlap between SMA-3 and SMA-9 peaks may indicate complex formation between these two transcription factors as shown in Drosophila" - but in the absence of proof, this must be a point for the Discussion, not for the Results. Moreover, similar behavior of fat-6 (overlapping ChIP peaks) and nhr-114 (non-overlapping ChIP peaks) in SMA-3 and SMA-9 mutants may be interpreted as a circumstantial argument against SMA-3/SMA-9 complex formation (see Lines 342-348). Importantly, since ChIP-Seq data are available for a wide array of C. elegans TFs, it would be very useful to have an estimate of whether SMA-3/SMA-9 peak overlap is significantly higher than the peak overlap between SMA-3 and several other TFs expressed at the same L2 stage.

      We have clarified our goals regarding SMA-3 and SMA-9 interactions and softened our conclusions by indicating in the text that a formal biochemical analysis would be required to demonstrate a physical interaction. Moreover, we toned down the text by stating that our results suggest that SMA-3 and SMA-9 frequently bind either as subunits in a complex or in close vicinity to each other along the DNA. We have added an analysis of HOT sites to address overlap of binding with other transcription factors. We disagree with the interpretation that transcription factors with non-overlapping sites cannot act together to regulate gene expression; however, nhr-114 also has an overlapping SMA-3 and SMA-9 site, so this point becomes less relevant. We have clarified the categorization of nhr-114 in the text.

      Lines 272-292: The authors do not comment on the seemingly quite small overlap between the RNA-Seq and the ChIP-Seq dataset, but I think they should. They have 3205 SMA-3 ChIP peaks and 1867 SMA-3 DEGs, but the amount of directly regulated targets is 367. It is important that the authors provide information on the number of genes to which their peaks have been assigned. Clearly, this will not be one gene per peak, but if it were, this would mean that just 11.5% of bound targets are really affected by the binding. The same number would be 4.7% for the SMA-9 peaks.

      We have added a discussion of the discrepancy between binding sites and DEGs. The high number of additional sites classified as non-functional could represent the detection of weak affinity targets that do not have an actual biological purpose. Alternatively, these sites could have an additional role in DBL-1 signaling besides transcriptional regulation of nearby genes, or they could be regulating the expression of target genes at a far enough distance to not be detected by our BETA analysis as per the constraints chosen for the analysis. The difference between total binding sites and those associated with changes in gene expression underscores the importance of combining RNA-seq with ChIP-seq to identify the most biologically relevant targets. And as the reviewer indicated, more than one gene can be assigned to a single neighboring peak.
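
      The reviewer's 11.5% figure can be reproduced with simple arithmetic under the one-gene-per-peak simplification described in the comment. The sketch below uses only the SMA-3 numbers quoted there (3205 peaks, 367 directly regulated targets); the function name is illustrative, and the SMA-9 counts are not stated in this passage, so they are not computed here.

```python
def percent_of_peaks_with_effect(direct_targets: int, total_peaks: int) -> float:
    """Percentage of peaks linked to a differentially expressed gene,
    assuming (as a simplification) exactly one gene per peak."""
    if total_peaks <= 0:
        raise ValueError("total_peaks must be positive")
    return 100.0 * direct_targets / total_peaks

# SMA-3 values quoted in the reviewer's comment.
sma3_percent = percent_of_peaks_with_effect(direct_targets=367, total_peaks=3205)
print(f"SMA-3: {sma3_percent:.1f}% of peaks")  # prints "SMA-3: 11.5% of peaks"
```

      Because more than one gene can be assigned to a peak, this percentage is only a rough guide to how many binding events are associated with expression changes.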

      Lines 294-323: I feel like there is a terminology problem, which makes reading very difficult. The authors use "direct targets" as bound genes with significant expression change, but then run into a problem when the gene is bound by SMA-9 and SMA-3, but significant expression change is only associated with one of the two factors. I am not sure this is consistent with the idea of the SMA3/SMA9 complex. Also, different modalities of the SMA3 and SMA9 effect in 15 cases can be explained by co-factors. Reading would be also simplified if the order of the panels in Figure 3 were different. Currently, the authors start their explanation by referring to the shared SMA-3/SMA-9 targets (Figures 3c-d), and only later come to Figure 3b. In general, the authors should start with a clear explanation of what is on the figure (currently starting on Line 313), otherwise, it is unclear why, if the authors only discuss common targets, it is not just 114+15=129 targets, but more.

      We have re-ordered the columns in Figure 3 to match the order discussed in the text. We also incorporated more precise language about regulation by SMA-3 and/or SMA-9 in the text.

      Lines 325-355: The chapter has a rather unfortunate name "Mechanisms of integration of SMA-3 and SMA-9 function", although the authors do not provide any mechanism. Using 3 target genes, they show that if the regulatory modality of SMA-3 and SMA-9 is the same (2 examples), there is no difference in the expression of the targets, but if the modalities are opposing (1 example), SMA-9 repressive action is epistatic to the SMA-3 activating action. Can this be generalized? The authors should test all their 15 targets with opposite regulations. Moreover, it seems obvious to ask whether the intermediate phenotype of the double-mutants can be attributed to the action of these 15 genes activated by SMA-3 and repressed by SMA-9. I would suggest testing this by RNAi. I would also suggest renaming the chapter to something better reflecting its content.

      We have removed the word “mechanism” from the title of this section. We also performed additional RT-PCR experiments on another 5 targets with opposing directions of regulation. The results from these genes are consistent with the result from C54E4.5, demonstrating that the epistasis of sma-9 is generalizable.

      Figure 4b: Why was a two-way ANOVA performed here? With the small number of measurements, I would consider using a non-parametric test.

      These data are parametric and the distribution of the data is normal, so we chose to use a parametric test (ANOVA).

      Lines 354-355. The authors offer two suggestions for the mechanism of the epistatic action of SMA-9 on SMA-3 in the case of C54E4.5, but this is something for the Discussion. If they want to keep it in the Results they should address this experimentally by performing SMA-3 ChIP-seq in the SMA-9 mutants and SMA-9 ChIP-Seq in the SMA-3 mutants.

      We moved these models to the discussion as suggested.

      Lines 365-367: "We expect that clusters of genes involved in fatty acid metabolism and innate immunity mediate the physiological functions of BMP signaling in fat storage and pathogen resistance, respectively." - This is pretty confusing since the Authors claim in the previous sentence that regulation of immunity by SMA-9 is TGF-beta independent.

      Co-regulation of immunity by BMP signaling and SMA-9 is already known. The novel insight is that SMA-9 may have an additional independent role in immunity. We have clarified the language to address this confusion.

      Lines 377, and 380: Please explain in non-C. elegans-specific terminology, what rrf-3 and LON-2 are (e.g. write "glypican LON-2" instead of just "LON-2") and add relevant references.

      We added information on the proteins encoded by these genes.

      Lines 382-384: I am not sure what the Authors mean here by "more limiting".

      We substituted the phrase “might have a more prominent requirement in mediating the exaggerated growth defect of a lon-2 mutant”.

      Lines 388-392: I found this very confusing. What were these 36 genes? Were these direct targets of SMA-3, SMA-9, or both? Top 36 targets? 36 targets for which mutants are available?

      The new Figure 5 clarifies whether target genes are SMA-3-exclusive, SMA-9-exclusive, or co-regulated. The text was also updated for clarity.

      Line 397: This is the first time the authors mention dpy-11 but they do not say what it is until later, and they do not say whether it is a target of SMA3/SMA9. Checking Figure 3, I found that it is among the 238 genes bound by both but upregulated only by SMA3. The authors need to explicitly state this - from this point on, they have a section for which SMA-9 appears to be irrelevant.

      We added the molecular function of dpy-11 at its first mention. Furthermore, we included the hypothesis that SMA-3 may regulate collagen secretion independently of SMA-9. Our subsequent results with sma-9 mutants disprove this hypothesis.

      Line 402: Is ROL-6 a SMA-3/SMA-9 target or just a marker gene?

      We previously showed that rol-6 mRNA levels are reduced in dbl-1 mutants at L2, but RNA-seq analysis did not detect a change in rol-6 that was statistically significant enough to qualify it as a transcriptional target, and total protein levels are also not significantly reduced in mutants. We added this information to the text.

      Line 421: I am not sure what "more skeletonized" means.

      Replaced with “thinner and skeletonized”

      Figure 2b and 2d legends: "Non-target genes nevertheless showing differential expression are indicated with green squares." (l. 581-582 and again l. 588-589) I think should be "Non-direct target genes...".

      Changed to “non-direct target genes”

      Figure 7 legend: Please indicate the scale bar size in the legend.

      Indicated the scale bar size in the legend.

      Figure 7: The ER marker is referred to as "ssGFP::KDEL" (in the image and Line 700), however in the text it is called "KDEL::oxGFP" (Line 419). Please use consistent naming.

      We fixed the inconsistent naming.

      All the experiment suggestions made are optional and can, in principle, be ignored if the authors tone down their claims (for example, the SMA-3/SMA-9 complex formation).

      Reviewer #2 (Recommendations For The Authors):

      (1) As a control: Have the authors found the known regulated genes among the differentially regulated ones?

      Previously known target genes such as fat-6 and zip-10 were identified here. We have added this information in the text.

      (2) How many repetitions were performed in Figure 4b? I am wondering as the deviation for C54E4.5 is quite large and that makes me worry that the significant differences stated are not robust.

      There were two biologically independent collections from which three cDNA syntheses were analyzed using two technical replicates per point.

      (3) Lines 333-336: Can you really make this claim that the antagonistic effects seen in the regulation of body size can be correlated with some targets being regulated in the opposite direction? I would assume that the situation is far more complex as SMADs also regulate other processes.

      We agree with the reviewer that multiple models could explain this antagonism, and we have added distinct alternatives in the text.

      (4) Lines 367-369: Add the respective reference please.

      We have added the relevant references.

    1. Author response:

      The following is the authors’ response to the original reviews.

      The revised manuscript contains new results and additional text. Major revisions:

      (1) Additional simulations and analyses of networks with different biophysical parameters and with identical time constants for E and I neurons (Methods, Supplementary Fig. 5).

      (2) Additional simulations and analyses of networks with modifications of connectivity parameters to further analyze effects of E/I assemblies on manifold geometry (Supplementary Fig. 6).

      (3) Analysis of synaptic current components (Figure 3 D-F; to analyze mechanism of modest amplification in Tuned networks). 

      (4) More detailed explanation of pattern completion analysis (Results).

      (5) Analysis of classification performance of Scaled networks (Supplementary Fig.8).

      (6) Additional analysis (Figure 5D-F) and discussion (particularly section “Computational functions of networks with E/I assemblies”) of functional benefits of continuous representations in networks with E-I assemblies. 

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Meissner-Bernard et al present a biologically constrained model of telencephalic area of adult zebrafish, a homologous area to the piriform cortex, and argue for the role of precisely balanced memory networks in olfactory processing. 

      This is interesting as it can add to recent evidence on the presence of functional subnetworks in multiple sensory cortices. It is also important in deviating from traditional accounts of memory systems as attractor networks. Evidence for attractor networks has been found in some systems, like in the head direction circuits in the flies. However, the presence of attractor dynamics in other modalities, like sensory systems, and their role in computation has been more contentious. This work contributes to this active line of research in experimental and computational neuroscience by suggesting that, rather than being represented in attractor networks and persistent activity, olfactory memories might be coded by balanced excitation-inhibitory subnetworks. 

      Strengths: 

      The main strength of the work is in: (1) direct link to biological parameters and measurements, (2) good controls and quantification of the results, and (3) comparison across multiple models. 

      (1) The authors have done a good job of gathering the current experimental information to inform a biologically constrained spiking model of the telencephalic area of adult zebrafish. The results are compared to previous experimental measurements to choose the right regimes of operation.

      (2) Multiple quantification metrics and controls are used to support the main conclusions and to ensure that the key parameters are controlled for - e.g. when comparing across multiple models.

      (3) Four specific models (random, scaled I / attractor, and two variants of specific E-I networks - tuned I and tuned E+I) are compared with different metrics, helping to pinpoint which features emerge in which model.

      Weaknesses: 

      Major problems with the work are: (1) mechanistic explanation of the results in specific E-I networks, (2) parameter exploration, and (3) the functional significance of the specific E-I model. 

      (1) The main problem with the paper is a lack of mechanistic analysis of the models. The models are treated like biological entities and only tested with different assays and metrics to describe their different features (e.g. different geometry of representation in Fig. 4). Given that all the key parameters of the models are known and can be changed (unlike biological networks), it is expected to provide a more analytical account of why specific networks show the reported results. For instance, what is the key mechanism for medium amplification in specific E/I network models (Fig. 3)? How does the specific geometry of representation/manifolds (in Fig. 4) emerge in terms of excitatory-inhibitory interactions, and what are the main mechanisms/parameters? Mechanistic account and analysis of these results are missing in the current version of the paper. 

      We agree that further mechanistic insights would be of interest and addressed this issue at different levels:

      (1) Biophysical parameters: to determine whether network behavior depends on specific choices of biophysical parameters in E and I neurons we equalized biophysical parameters across neuron types. The main observations are unchanged, suggesting that the observed effects depend primarily on network connectivity (see also response to comment [2]).

      (2) Mechanism of modest amplification in E/I assemblies: analyzing the different components of the synaptic currents demonstrates that the modest amplification of activity in Tuned networks results from an “imperfect” balance of recurrent excitation and inhibition within assemblies (see new Figures 3D-F and text p.7). Hence, E/I co-tuning substantially reduces the net amplification in Tuned networks as compared to Scaled networks, thus preventing discrete attractor dynamics and stabilizing network activity, but a modest amplification still occurs, consistent with biological observations.

      (3) Representational geometry: to obtain insights into the network mechanisms underlying effects of E/I assemblies on the geometry of population activity we tested the hypothesis that geometrical changes depend, at least in part, on the modest amplification of activity within E/I assemblies (see Supplementary Figure 6). We changed model parameters to either prevent the modest amplification in Tuned networks (increasing I-to-E connectivity within assemblies) or introduce a modest amplification in subsets of neurons by other mechanisms (concentration-dependent increase in the excitability of pseudo-assembly neurons; Scaled I networks with reduced connectivity within assemblies). Manipulations that introduced a modest, input-dependent amplification in neuronal subsets had geometrical effects similar to those observed in Tuned networks, whereas manipulations that prevented a modest amplification abolished these effects (Supplementary Figure 6). Note however that these manipulations generated different firing rate distributions. These results provide a starting point for more detailed analyses of the relationship between network connectivity and representational geometry (see p.12).

      In summary, our additional analyses indicate that effects of E/I assemblies on representational geometry depend primarily on network connectivity, rather than specific biophysical parameters, and that the resulting modest amplification of activity within assemblies makes an important contribution. Further analyses may reveal more specific relationships between E/I assemblies and representational geometry, but such analyses are beyond the scope of this study.

      (2) The second major issue with the study is a lack of systematic exploration and analysis of the parameter space. Some parameters are biologically constrained, but not all the parameters. For instance, it is not clear what the justification for the choice of synaptic time scales are (with E synaptic time constants being larger than inhibition: tau_syn_i = 10 ms, tau_syn_E = 30 ms). How would the results change if they are varying these - and other unconstrained - parameters? It is important to show how the main results, especially the manifold localisation, would change by doing a systematic exploration of the key parameters and performing some sensitivity analysis. This would also help to see how robust the results are, which parameters are more important and which parameters are less relevant, and to shed light on the key mechanisms.  

      We thank the reviewer for raising this point. We chose a relatively slow time constant for excitatory synapses because experimental data indicate that excitatory synaptic currents in Dp and piriform cortex contain a prominent NMDA component. Nevertheless, to assess whether network behavior depends on specific choices of biophysical parameters in E and I neurons, we have performed additional simulations with equal synaptic time constants and equal biophysical parameters for all neurons. Each neuron also received the same number of inputs from each population (see revised Methods). Results were similar to those observed previously (Supplementary Fig.5 and p.9 of main text). We therefore conclude that the main effects observed in Tuned networks cannot be explained by differences in biophysical parameters between E and I neurons but are primarily a consequence of network connectivity.

      (3) It is not clear what the main functional advantage of the specific E-I network model is compared to random networks. In terms of activity, they show that specific E-I networks amplify the input more than random networks (Fig. 3). But when it comes to classification, the effect seems to be very small (Fig. 5c). Description of different geometry of representation and manifold localization in specific networks compared to random networks is good, but it is more of an illustration of different activity patterns than proving a functional benefit for the network. The reader is still left with the question of what major functional benefits (in terms of computational/biological processing) should be expected from these networks, if they are to be a good model for olfactory processing and learning. 

      One possibility for instance might be that the tasks used here are too easy to reveal the main benefits of the specific models - and more complex tasks would be needed to assess the functional enhancement (e.g. more noisy conditions or more combination of odours). It would be good to show this more clearly - or at least discuss it in relation to computation and function. 

      In the previous manuscript, the analysis of potential computational benefits other than pattern classification was limited and the discussion of this issue was condensed into a single itemized paragraph to avoid excessive speculation. Although a thorough analysis of potential computational benefits exceeds the scope of a single paper, we agree with the reviewer that this issue is of interest and therefore added additional analyses and discussion.

      In the initial manuscript we analyzed pattern classification primarily to investigate whether Tuned networks can support this function at all, given that they do not exhibit discrete attractor states. We found this to be the case, which we consider a first important result.

      Furthermore, we found that precise balance of E/I assemblies can protect networks against catastrophic firing rate instabilities when assemblies are added sequentially, as in continual learning. Results from these simulations are now described and discussed in more detail (see Results p.11 and Discussion p.13).

      In the revised manuscript, we now also examine additional potential benefits of Tuned networks and discuss them in more detail (see new Figure 5D-F and text p.11). One hypothesis is that continuous representations provide a distance metric between a given input and relevant (learned) stimuli. To address this hypothesis, we (1) performed regression analysis and (2) trained support vector machines (SVMs) to predict the concentration of a given odor in a mixture based on population activity. In both cases, Tuned E+I networks outperformed Scaled and _rand_ networks in predicting the concentration of learned odors across a wide range of mixtures (Figure 5D-F). E/I assemblies therefore support the quantification of learned odors within mixtures or, more generally, assessments of how strongly a (potentially complex) input is related to relevant odors stored in memory. Such a metric assessment of stimulus quality is not well supported by discrete attractor networks because inputs are mapped onto discrete network states.

      The observation that Tuned networks do not map inputs onto discrete outputs indicates that such networks do not classify inputs as distinct items. Nonetheless, the observed geometrical modifications of continuous representations support the classification of learned inputs or the assessment of metric relationships by hypothetical readout neurons. Geometrical modifications of odor representations may therefore serve as one of multiple steps in multi-layer computations for pattern classification (and/or other computations). In this scenario, the transformation of odor representations in Dp may be seen as related to transformations of representations between different layers in artificial networks, which collectively perform a given task (notwithstanding obvious structural and mechanistic differences between artificial and biological networks). In other words, geometrical transformations of representations in Tuned networks may overrepresent learned (relevant) information at the expense of other information and thereby support further learning processes in other brain areas. An obvious corollary of this scenario is that Dp does not perform odor classification per se based on inputs from the olfactory bulb but reformats representations of odor space based on experience to support computational tasks as part of a larger system. This scenario is now explicitly discussed (p.14).

      Reviewer #2 (Public Review): 

      Summary: 

      The authors conducted a comparative analysis of four networks, varying in the presence of excitatory assemblies and the architecture of inhibitory cell assembly connectivity. They found that co-tuned E-I assemblies provide network stability and a continuous representation of input patterns (on locally constrained manifolds), contrasting with networks with global inhibition that result in attractor networks. 

      Strengths: 

      The findings presented in this paper are very interesting and cutting-edge. The manuscript effectively conveys the message and presents a creative way to represent high-dimensional inputs and network responses. Particularly, the result regarding the projection of input patterns onto local manifolds and continuous representation of input/memory is very intriguing and novel. Both computational and experimental neuroscientists would find value in reading the paper. 

      Weaknesses: 

      that have continuous representations. This could also be shown in Figure 5B, along with the performance of the random and tuned E-I networks. The latter networks have the advantage of providing network stability compared to the Scaled I network, but at the cost of reduced network salience and, therefore, reduced input decodability. The authors may consider designing a decoder to quantify and compare the classification performance of all four networks. 

      We have now quantified classification by networks with discrete attractor dynamics (Scaled) along with other networks. However, because the neuronal covariance matrix for such networks is low rank and not invertible, pattern classification cannot be analyzed by QDA as in Figure 5B. We therefore classified patterns from the odor subspace by template matching, assigning test patterns to one of the four classes based on correlations (see Supplementary Figure 8). As expected, Scaled networks performed well, but they did not outperform Tuned networks. Moreover, the performance of Scaled networks, but not Tuned networks, depended on the order in which odors were presented to the network. This hysteresis effect is a direct consequence of persistent attractor states and decreased the general classification performance of Scaled networks (see Supplementary Figure 8 for details). These results confirm the prediction that networks with discrete attractor states can efficiently classify inputs, but also reveal disadvantages arising from attractor dynamics. Moreover, the results indicate that the classification performance of Tuned networks is also high under the given task conditions, which simulate a biologically realistic scenario.
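
      Template matching by correlation can be sketched as follows. The function name, the toy patterns, and the noise level are hypothetical; the sketch only shows the general technique of assigning a test pattern to the class whose template it correlates with most strongly, not the exact procedure of Supplementary Figure 8.

```python
import numpy as np

def template_match(test_patterns, templates):
    """Assign each test pattern to the template with the highest Pearson correlation.

    test_patterns: (n_tests, n_neurons); templates: (n_classes, n_neurons).
    Returns an array of class indices of length n_tests.
    """
    # Z-score along the neuron axis so that a scaled dot product
    # equals the Pearson correlation.
    def zscore(a):
        a = a - a.mean(axis=1, keepdims=True)
        return a / a.std(axis=1, keepdims=True)

    n_neurons = templates.shape[1]
    corr = zscore(test_patterns) @ zscore(templates).T / n_neurons
    return corr.argmax(axis=1)

rng = np.random.default_rng(1)
templates = rng.normal(size=(4, 200))           # four odor classes, toy patterns
true_class = rng.integers(0, 4, size=40)
test_patterns = templates[true_class] + 0.5 * rng.normal(size=(40, 200))
predicted = template_match(test_patterns, templates)
accuracy = (predicted == true_class).mean()
```

      Unlike QDA, this classifier never inverts a covariance matrix, so it remains applicable when the covariance is low rank.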

      We would also like to emphasize that classification may not be the only task, and perhaps not even a main task, of Dp/piriform cortex or other memory networks with E/I assemblies. Conceivably, other computations could include metric assessments of inputs relative to learned inputs or additional learning-related computations. Please see our response to comment (3) of reviewer 1 for a further discussion of this issue. 

      Networks featuring E/I assemblies could potentially represent multistable attractors by exploring the parameter space for their reciprocal connectivity and connectivity with the rest of the network. However, for co-tuned E-I networks, the scope for achieving multistability is relatively constrained compared to networks employing global or lateral inhibition between assemblies. It would be good if the authors mentioned this in the discussion. Also, the fact that reciprocal inhibition increases network stability has been shown before and should be cited in the statements addressing network stability (e.g., some of the citations in the manuscript, including Rost et al. 2018, Lagzi & Fairhall 2022, and Vogels et al. 2011 have shown this).  

      We thank the reviewer for this comment. We now explicitly discuss multistability (see p. 12) and refer to additional references in the statements addressing network stability.

      Providing raster plots of the pDp network for familiar and novel inputs would help with understanding the claims regarding continuous versus discrete representation of inputs, allowing readers to visualize the activity patterns of the four different networks. (similar to Figure 1B). 

      We thank the reviewer for this suggestion. We have added raster plots of responses to both familiar and novel inputs in the revised manuscript (Figure 2D and Supplementary Figure 4A).

      Reviewer #3 (Public Review): 

      Summary: 

      This work investigates the computational consequences of assemblies containing both excitatory and inhibitory neurons (E/I assembly) in a model with parameters constrained by experimental data from the telencephalic area Dp of zebrafish. The authors show how this precise E/I balance shapes the geometry of neuronal dynamics in comparison to unstructured networks and networks with more global inhibitory balance. Specifically, E/I assemblies lead to the activity being locally restricted onto manifolds - a dynamical structure in between high-dimensional representations in unstructured networks and discrete attractors in networks with global inhibitory balance. Furthermore, E/I assemblies lead to smoother representations of mixtures of stimuli while those stimuli can still be reliably classified, and allow for more robust learning of additional stimuli. 

      Strengths: 

      Since experimental studies do suggest that E/I balance is very precise and E/I assemblies exist, it is important to study the consequences of those connectivity structures on network dynamics. The authors convincingly show that E/I assemblies lead to different geometries of stimulus representation compared to unstructured networks and networks with global inhibition. This finding might open the door for future studies for exploring the functional advantage of these locally defined manifolds, and how other network properties allow to shape those manifolds. 

      The authors also make sure that their spiking model is well-constrained by experimental data from the zebrafish pDp. Both spontaneous and odor stimulus triggered spiking activity is within the range of experimental measurements. But the model is also general enough to be potentially applied to findings in other animal models and brain regions. 

      Weaknesses: 

      I find the point about pattern completion a bit confusing. In Fig. 3 the authors argue that only the Scaled I network can lead to pattern completion for morphed inputs since the output correlations are higher than the input correlations. For me, this sounds less like the network can perform pattern completion but it can nonlinearly increase the output correlations. Furthermore, in Suppl. Fig. 3 the authors show that activating half the assembly does lead to pattern completion in the sense that also non-activated assembly cells become highly active and that this pattern completion can be seen for Scaled I, Tuned E+I, and Tuned I networks. These two results seem a bit contradictory to me and require further clarification, and the authors might want to clarify how exactly they define pattern completion. 

      We believe that this comment concerns a semantic misunderstanding and apologize for any lack of clarity. We added a definition of pattern completion in the text: “…the retrieval of the whole memory from noisy or corrupted versions of the learned input.”. Pattern completion may be assessed using different procedures. In computational studies, it is often analyzed by delivering input to a subset of the assembly neurons which store a given memory (partial activation). Under these conditions, we find recruitment of the entire assembly in all structured networks, as demonstrated in Supplementary Figure 3. However, these conditions are unlikely to occur during odor presentation because the majority of neurons do not receive any input.

      Another more biologically motivated approach to assess pattern completion is to gradually modify a realistic odor input into a learned input, thereby gradually increasing the overlap between the two inputs. This approach had been used previously in experimental studies (references added to the text p.6). In the presence of assemblies, recurrent connectivity is expected to recruit assembly neurons (and thus retrieve the stored pattern) more efficiently as the learned pattern is approached. This should result in a nonlinear increase in the similarity between the evoked and the learned activity pattern. This signature was prominent in Scaled networks but not in Tuned or rand networks. Obviously, the underlying procedure is different from the partial activation of the assembly described above because input patterns target many neurons (including neurons outside assemblies) and exhibit a biologically realistic distribution of activity. However, this approach has also been referred to as “pattern completion” in the neuroscience literature, which may be the source of semantic confusion here. To clarify the difference between these approaches we have now revised the text and explicitly described each procedure in more detail (see p.6). 

      The authors argue that Tuned E+I networks have several advantages over Scaled I networks. While I agree with the authors that in some cases adding this localized E/I balance is beneficial, I believe that a more rigorous comparison between Tuned E+I networks and Scaled I networks is needed: quantification of variance (Fig. 4G) and angle distributions (Fig. 4H) should also be shown for the Scaled I network. Similarly in Fig. 5, what is the Mahalanobis distance for Scaled I networks and how well can the Scaled I network be classified compared to the Tuned E+I network? I suspect that the Scaled I network will actually be better at classifying odors compared to the E+I network. The authors might want to speculate about the benefit of having networks with both sources of inhibition (local and global) and hence being able to switch between locally defined manifolds and discrete attractor states. 

      We agree that a more rigorous comparison of Tuned and Scaled networks would be of interest. We have added the variance analysis (Fig 4G) and angle distributions (Fig. 4H) for both Tuned I and Scaled networks. However, the Mahalanobis distances and Quadratic Discriminant Analysis cannot be applied to Scaled networks because their neuronal covariance matrix is low rank and not invertible. To nevertheless compare these networks, we performed template matching by assigning test patterns to one of the four odor classes based on correlations to template patterns (Supplementary Figure 8; see also response to the first comment of reviewer 2). Interestingly, Scaled networks performed well at classification but did not outperform Tuned networks, and exhibited disadvantages arising from attractor dynamics (Supplementary Figure 8; see also response to the first comment of reviewer 2). Furthermore, in further analyses we found that continuous representational manifolds support metric assessments of inputs relative to learned odors, which cannot be achieved by discrete representations. These results are now shown in Figure 5D-E and discussed explicitly in the text on p.11 (see also response to comment 3 of reviewer 1).
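
      The invertibility problem can be made concrete with a small sketch. The Mahalanobis distance of a pattern x from a class with mean μ and covariance Σ is sqrt((x − μ)ᵀ Σ⁻¹ (x − μ)), which requires Σ to have full rank. The data below are synthetic and the two-dimensional subspace is an arbitrary toy stand-in for activity that collapses onto a few states; none of the numbers come from the model.

```python
import numpy as np

rng = np.random.default_rng(2)
n_neurons, n_trials = 20, 200

# Full-rank case: independent variability in every neuron.
full = rng.normal(size=(n_trials, n_neurons))

# Low-rank case: all trials lie in a 2-dimensional subspace (toy assumption).
basis = rng.normal(size=(2, n_neurons))
low = rng.normal(size=(n_trials, 2)) @ basis

def mahalanobis(x, samples):
    """Mahalanobis distance of x from the distribution of `samples`."""
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    diff = x - mu
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

rank_full = np.linalg.matrix_rank(np.cov(full, rowvar=False))  # equals n_neurons
rank_low = np.linalg.matrix_rank(np.cov(low, rowvar=False))    # equals 2
d_full = mahalanobis(rng.normal(size=n_neurons), full)
```

      `mahalanobis` is deliberately not called on `low`: with a rank-deficient covariance the linear solve is ill-defined, which is the situation described for Scaled networks.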

      We preferred not to add a sentence in the Discussion about benefits of networks having both sources of inhibition, as we find this a bit too speculative.

      At a few points in the manuscript, the authors use statements without actually providing evidence in terms of a Figure. Often the authors themselves acknowledge this, by adding the term "not shown" to the end of the sentence. I believe it will be helpful to the reader to be provided with figures or panels in support of the statements.  

      Thank you for this comment. We have provided additional data figures to support the following statements:

      “d<sub>M</sub> was again increased upon learning, particularly between learned odors and reference classes representing other odors (Supplementary Figure 9)”

      “decreasing amplification in assemblies of Scaled networks changed transformations towards the intermediate behavior, albeit with broader firing rate distributions than in Tuned networks (Supplementary Figure 6 B)”  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Meissner-Bernard et al present a biologically constrained model of telencephalic area of adult zebrafish, a homologous area to the piriform cortex, and argue for the role of precisely balanced memory networks in olfactory processing. 

      This is interesting as it can add to recent evidence on the presence of functional subnetworks in multiple sensory cortices. It is also important in deviating from traditional accounts of memory systems as attractor networks. Evidence for attractor networks has been found in some systems, like in the head direction circuits in the flies. However, the presence of attractor dynamics in other modalities, like sensory systems, and their role in computation has been more contentious. This work contributes to this active line of research in experimental and computational neuroscience by suggesting that, rather than being represented in attractor networks and persistent activity, olfactory memories might be coded by balanced excitation-inhibitory subnetworks. 

      The paper is generally well-written, the figures are informative and of good quality, and multiple approaches and metrics have been used to test and support the main results of the paper. 

      The main strength of the work is in: (1) direct link to biological parameters and measurements, (2) good controls and quantification of the results, and (3) comparison across multiple models. 

      (1) The authors have done a good job of gathering the current experimental information to inform a biological-constrained spiking model of the telencephalic area of adult zebrafish. The results are compared to previous experimental measurements to choose the right regimes of operation. 

      (2) Multiple quantification metrics and controls are used to support the main conclusions and to ensure that the key parameters are controlled for - e.g. when comparing across multiple models.

      (3) Four specific models (random, scaled I / attractor, and two variants of specific E-I networks - tuned I and tuned E+I) are compared with different metrics, helping to pinpoint which features emerge in which model. 

      Major problems with the work are: (1) mechanistic explanation of the results in specific E-I networks, (2) parameter exploration, and (3) the functional significance of the specific E-I model. 

      (1) The main problem with the paper is a lack of mechanistic analysis of the models. The models are treated like biological entities and only tested with different assays and metrics to describe their different features (e.g. different geometry of representation in Fig. 4). Given that all the key parameters of the models are known and can be changed (unlike biological networks), it is expected to provide a more analytical account of why specific networks show the reported results. For instance, what is the key mechanism for medium amplification in specific E/I network models (Fig. 3)? How does the specific geometry of representation/manifolds (in Fig. 4) emerge in terms of excitatory-inhibitory interactions, and what are the main mechanisms/parameters? Mechanistic account and analysis of these results are missing in the current version of the paper. 

      Precise balancing of excitation and inhibition in subnetworks would lead to the cancellation of specific dynamical modes responsible for the amplification of responses (hence, deviating from the attractor dynamics with an unstable specific mode). What is the key difference in the specific E/I networks here (tuned I or/and tuned E+I) which make them stand between random and attractor networks? Excitatory and inhibitory neurons have different parameters in the model (Table 1). Time constants of inhibitory and excitatory synapses are also different (P. 13). Are these parameters causing networks to be effectively more excitation dominated (hence deviating from a random spectrum which would be expected from a precisely balanced E/I network, with exactly the same parameters of E and I neurons)? It is necessary to analyse the network models, describe the key mechanism for their amplification, and pinpoint the key differences between E and I neurons which are crucial for this. 

      To address these comments we performed additional simulations and analyses at different levels. Please see our reply to comment (1) of the public review (reviewer 1) for a detailed description. We thank the reviewer for these constructive comments.

      (2) The second major issue with the study is a lack of systematic exploration and analysis of the parameter space. Some parameters are biologically constrained, but not all of them. For instance, it is not clear what the justification for the choice of synaptic time scales is (with E synaptic time constants being larger than inhibition: tau_syn_I = 10 ms, tau_syn_E = 30 ms). How would the results change if these - and other unconstrained - parameters were varied? It is important to show how the main results, especially the manifold localisation, would change by doing a systematic exploration of the key parameters and performing some sensitivity analysis. This would also help to see how robust the results are, which parameters are more important and which parameters are less relevant, and to shed light on the key mechanisms.  

      We thank the reviewer for this comment. We have now carried out additional simulations with equal time constants for all neurons. Please see our reply to the public review for more details (comment 2 of reviewer 1).

      (3) It is not clear what the main functional advantage of the specific E-I network model is compared to random networks. In terms of activity, they show that specific E-I networks amplify the input more than random networks (Fig. 3). But when it comes to classification, the effect seems to be very small (Fig. 5c). Description of different geometry of representation and manifold localization in specific networks compared to random networks is good, but it is more of an illustration of different activity patterns than proving a functional benefit for the network. The reader is still left with the question of what major functional benefits (in terms of computational/biological processing) should be expected from these networks, if they are to be a good model for olfactory processing and learning. 

      One possibility for instance might be that the tasks used here are too easy to reveal the main benefits of the specific models - and more complex tasks would be needed to assess the functional enhancement (e.g. more noisy conditions or more combination of odours). It would be good to show this more clearly - or at least discuss it in relation to computation and function.

      Please see our reply to the public review (comment 3 of reviewer 1).

      Specific comments: 

      Abstract: "resulting in continuous representations that reflected both relatedness of inputs and *an individual's experience*" 

      It didn't become apparent from the text or the model where the role of "individual's experience" component (or "internal representations" - in the next line) was introduced or shown (apart from a couple of lines in the Discussion) 

      We consider the scenario that assemblies are the outcome of an experience-dependent plasticity process. To clarify this, we have now made a small addition to the text: “Biological memory networks are thought to store information by experience-dependent changes in the synaptic connectivity between assemblies of neurons.”.

      P. 2: "The resulting state of "precise" synaptic balance stabilizes firing rates because inhomogeneities or fluctuations in excitation are tracked by correlated inhibition" 

      It is not clear what the "inhomogeneities" specifically refers to - they can be temporal, or they can refer to the quenched noise of connectivity, for instance. Please clarify what you mean. 

      The statement has been modified to be more precise: “…“precise” synaptic balance stabilizes firing rates because inhomogeneities in excitation across the population or temporal variations in excitation are tracked by correlated inhibition…”.

      P. 3 (and Methods): When odour stimulus is simulated in the OB, the activity of a fraction of mitral cells is increased (10% to 15 Hz) - but also a fraction of mitral cells is suppressed (5% to 2 Hz). What is the biological motivation or reference for this? It is not provided. Is it needed for the results? Also, it is not explained how the suppressed 5% are chosen (e.g. randomly, without any relation to the increased cells?). 

      We thank the reviewer for this comment. These changes in activity directly reflect experimental observations. We apologize that we forgot to include the references reporting these observations (Friedrich and Laurent, 2001 and 2004); this is now fixed.

      In our simulation, OB neurons do not interact with each other, and the suppressed 5% were indeed randomly selected. We changed the text in Methods accordingly to read: “An additional 75 randomly selected mitral cells were inhibited” 

      P. 4, L. 1-2: "... sparsely connected integrate-and-fire neurons with conductance-based synapses (connection probability ≤5%)." 

      Specify the connection probability of specific subtypes (EE, EI, IE, II).  

      We now refer to the Methods section, where this information can be found. 

      “... conductance-based synapses (connection probability ≤5%, Methods)”  

      P. 4, L. 6-7: "Population activity was odor-specific and activity patterns evoked by uncorrelated OB inputs remained uncorrelated in Dp (Figure 1H)" 

      What would happen to correlated OB inputs (e.g. as a result of mixture of two overlapping odours) in this baseline state of the network (before memories being introduced to it)? It would be good to know this, as it sheds light on the initial operating regime of the network in terms of E/I balance and decorrelation of inputs.  

      This information was present in the original manuscript (Figure 3), but we have improved the writing to further clarify this issue: “ (…) we morphed a novel odor into a learned odor (Figure 3A), or a learned odor into another learned odor (Supplementary Figure 3B), and quantified the similarity between morphed and learned odors by the Pearson correlation of the OB activity patterns (input correlation). We then compared input correlations to the corresponding pattern correlations among E neurons in Dp (output correlation). In rand networks, output correlations increased linearly with input correlations but did not exceed them (Figure 3B and Supplementary Figure 3B)”
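
      The input-correlation measure can be sketched as follows. The morphing rule used here (linear interpolation between two firing-rate vectors), the number of input channels, and the gamma-distributed toy rates are assumptions for illustration only and are not necessarily the procedure used in the manuscript.

```python
import numpy as np

rng = np.random.default_rng(3)
n_mitral = 1500  # hypothetical number of OB input channels

# Toy firing-rate patterns for a novel and a learned odor (illustrative only).
novel = rng.gamma(shape=2.0, scale=3.0, size=n_mitral)
learned = rng.gamma(shape=2.0, scale=3.0, size=n_mitral)

def pearson(a, b):
    return float(np.corrcoef(a, b)[0, 1])

# Assumed morphing rule: linear interpolation from the novel to the learned odor.
fractions = np.linspace(0.0, 1.0, 11)
morphs = [(1.0 - f) * novel + f * learned for f in fractions]

# Input correlation: similarity of each morph to the learned odor.
input_corr = np.array([pearson(m, learned) for m in morphs])
```

      The output correlation would be computed in the same way from the evoked activity of E neurons in Dp. Plotting output against input correlation and comparing the curve to the identity line then distinguishes a linear relationship from a supralinear one.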

      P. 4, L. 12-13: "Shuffling spike times of inhibitory neurons resulted in runaway activity with a probability of ~80%, .."

      Where is this shown? 

      (There are other occasions too in the paper where references to the supporting figures are missing). 

      We now provide the statistics: “Shuffling spike times of inhibitory neurons resulted in runaway activity with a probability of 0.79 ± 0.20”

      P. 4: "In each network, we created 15 assemblies representing uncorrelated odors. As a consequence, ~30% of E neurons were part of an assembly ..." 

      15 x 100 / 4000 = 37.5% - so it's closer to 40% than 30%. Unless there is some overlap? 

      Yes: despite odors being uncorrelated and connectivity being random, some neurons (6 % of E neurons) belong to more than one assembly.
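
      A back-of-the-envelope check shows that overlap of this size is expected by chance. The sketch assumes that each assembly is an independent, uniformly random draw of 100 out of 4000 E neurons; this is a simplifying assumption and may differ from how assemblies are selected in the model. Under it, about 32% of E neurons belong to at least one assembly and about 5% to more than one, which is in line with the ~30% and 6% quoted above.

```python
n_e = 4000           # E neurons (from the reviewer's calculation)
assembly_size = 100
n_assemblies = 15

# Assumption: each assembly is an independent uniform random draw, so a
# given neuron joins each assembly with probability p.
p = assembly_size / n_e

naive_fraction = n_assemblies * p                   # ignores overlap
in_none = (1 - p) ** n_assemblies
in_exactly_one = n_assemblies * p * (1 - p) ** (n_assemblies - 1)
in_any = 1 - in_none                                # at least one assembly
in_multiple = 1 - in_none - in_exactly_one          # two or more assemblies
```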

      P. 4: "When a reached a critical value of ~6, networks became unstable and generated runaway activity (Figure 2B)." 

      Can this transition point be calculated or estimated from the network parameters, and linked to the underlying mechanisms causing it? 

      We thank the reviewer for this interesting question. The instability arises when inhibition fails to efficiently counterbalance the increased recurrent excitation within Dp. The transition point is difficult to estimate, as it can depend on several parameters, including the probability of E to E connections, their strength, assembly size, and others. We have therefore not attempted to estimate it analytically.

      P. 4: "Hence, non-specific scaling of inhibition resulted in a divergence of firing rates that exhausted the dynamic range of individual neurons in the population, implying that homeostatic global inhibition is insufficient to maintain a stable firing rate distribution." 

      I don't think this is justified based on the results and figures presented here (Fig. 2E) - the interpretation is a bit strong and biased towards the conclusions the authors want to draw. 

      To more clearly illustrate the finding that in Scaled networks, assembly neurons are highly active (close to maximal realistic firing rates) whereas non-assembly neurons are nearly silent we have now added Supplementary Fig. 2B. Moreover, we have toned down the text: “Hence, non-specific scaling of inhibition resulted in a large and biologically unrealistic divergence of firing rates (Supplementary Figure 2B) that nearly exhausted the dynamic range of individual neurons in the population, indicating that homeostatic global inhibition is insufficient to maintain a stable firing rate distribution”

      P. 5, third paragraph: Description of Figure 2I, inset is needed, either in the text or caption. 

      The inset is now referred to in the text: “we projected synaptic conductances of each neuron onto a line representing the E/I ratio expected in a balanced network (“balanced axis”) and onto an orthogonal line (“counter-balanced axis”; Figure 2I inset, Methods).”
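
      The projection can be sketched as a change of basis in the plane spanned by excitatory and inhibitory conductance. The parameterization of the balanced axis by an expected inhibitory-to-excitatory ratio, the sign convention of the orthogonal axis, and the example values are all assumptions made for illustration.

```python
import numpy as np

def project_balance(g_exc, g_inh, expected_ratio):
    """Project (g_exc, g_inh) points onto a balanced and an orthogonal axis.

    expected_ratio: g_inh / g_exc expected in a balanced network (assumed
    parameterization of the balanced axis).
    """
    balanced_axis = np.array([1.0, expected_ratio])
    balanced_axis /= np.linalg.norm(balanced_axis)
    # Rotate by 90 degrees; with this sign convention, positive values mean
    # more excitation than the balanced ratio predicts.
    counter_axis = np.array([balanced_axis[1], -balanced_axis[0]])
    points = np.column_stack([g_exc, g_inh])
    return points @ balanced_axis, points @ counter_axis

g_exc = np.array([1.0, 2.0, 2.0])
g_inh = np.array([3.0, 6.0, 4.0])   # third neuron receives less inhibition than 3:1
along, across = project_balance(g_exc, g_inh, expected_ratio=3.0)
```

      Neurons whose conductances follow the expected ratio have a zero component on the orthogonal axis, so the spread along that axis measures deviations from balance.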

      P. 5, last paragraph: another example of writing about results without showing/referring to the corresponding figures: 

      "In rand networks, firing rates increased after stimulus onset and rapidly returned to a low baseline after stimulus offset. Correlations between activity patterns evoked by the same odor at different time points and in different trials were positive but substantially lower than unity, indicating high variability ..." 

      And the continuation with similar lack of references on P. 6: 

      "Scaled networks responded to learned odors with persistent firing of assembly neurons and high pattern correlations across trials and time, implying attractor dynamics (Hopfield, 1982; Khona and Fiete, 2022), whereas Tuned networks exhibited transient responses and modest pattern correlations similar to rand networks." 

      Please go through the Results and fix the references to the corresponding figures on all instances. 

      We thank the reviewer for pointing out these overlooked figure references, which are now fixed.

      P. 8: "These observations further support the conclusion that E/I assemblies locally constrain neuronal dynamics onto manifolds." 

      As discussed in the general major points, mechanistic explanation in terms of how the interaction of E/I dynamics leads to this is missing. 

      As discussed in the reply to the public review (comment 1 of reviewer 1), we have now provided more mechanistic analyses of our observations.

      P. 9: "Hence, E/I assemblies enhanced the classification of inputs related to learned patterns."

      The effect seems to be very small. Also, any explanation for why for low test-target correlation the effect is negative (random doing better than tuned E/I)? 

      The size of the effect (p<sub>learned</sub> – p<sub>novel</sub> = 0.074; difference of means; Figure 5C) may appear small in terms of absolute probability, but it is substantial relative to the maximum possible increase (1 – p<sub>novel</sub> = 0.133; Figure 5C). The fact that the effect is negative for low test-target correlations is a direct consequence of the positive effect for high test-target correlations and the presence of 2 learned odors in the 4-way forced choice task. 
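
      The relative size of the effect follows directly from the two quoted numbers. The short calculation below uses only those values; the variable names are illustrative. It shows that the gain corresponds to roughly 56% of the available headroom.

```python
# Values quoted in the response (Figure 5C).
gain = 0.074       # p_learned - p_novel
max_gain = 0.133   # 1 - p_novel

p_novel = 1 - max_gain
p_learned = p_novel + gain

# Share of the maximum possible increase that is actually realized.
relative_gain = gain / max_gain
```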

      P. 9: "In Scaled I networks, creating two additional memories resulted in a substantial increase in firing rates, particularly in response to the learned and related odors"

      Where is this shown? Please refer to the figure. 

      We thank the reviewer again for pointing this out. We forgot to include a reference to the relevant figure which has now been added in the revised manuscript (Figure 6C).

      P. 10: "The resulting Tuned networks reproduced additional experimental observations that were not used as constraints including irregular firing patterns, lower output than input correlations, and the absence of persistent activity" 

      It is difficult to present these as "additional experimental observations", as all of them are negative, and can exist in random networks too - hence cannot be used as biological evidence in favour of specific E/I networks when compared to random networks. 

      We agree with the reviewer that these additional experimental observations cannot be used as biological evidence favouring Tuned E+I networks over random networks. We merely wanted to point out that observations not used to fit the model do not argue against the existence of E-I assemblies in biological networks. Because assemblies tend to produce persistent activity in other types of networks, we feel that this observation is worth pointing out.

      Methods: 

      P. 13: Describe the parameters of Eq. 2 after the equation. 

      Done.

      P. 13: "The time constants of inhibitory and excitatory synapses were 10 ms and 30 ms, respectively." 

      What is the (biological) justification for the choice of these parameters? 

      How would varying them affect the main results (e.g. local manifolds)? 

      We chose a relatively slow time constant for excitatory synapses because experimental data indicate that excitatory synaptic currents in Dp and piriform cortex contain a prominent NMDA component. We have now also simulated networks with equal time constants for excitatory and inhibitory synapses and equal biophysical parameters for excitatory and inhibitory neurons, which did not affect the main results (see also reply to the public review: comment 2 of reviewer 1).

      P. 14: "Care was also taken to ensure that the variation in the number of output connections was low across neurons"

      How exactly?

      More detailed explanations have now been added in the Methods section: “connections of a presynaptic neuron y to postsynaptic neurons x were randomly deleted when their total number exceeded the average number of output connections by ≥5%, or added when they were lower by ≥5%.”
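
      One way to implement such a rule is sketched below. The matrix orientation (`conn[post, pre]`), the choice to trim or fill each neuron to the rounded mean, the function name, and the toy network size are assumptions; the quoted Methods text specifies only the ±5% criterion. Self-connections are not treated specially in this sketch.

```python
import numpy as np

def equalize_out_degree(conn, tol=0.05, rng=None):
    """Reduce variation in the number of output connections per neuron.

    conn: boolean array with conn[post, pre] == True for a connection pre -> post.
    Columns whose count deviates from the mean by >= tol are randomly trimmed
    or filled to the rounded mean (assumed target).
    """
    rng = np.random.default_rng() if rng is None else rng
    conn = conn.copy()
    mean_out = conn.sum(axis=0).mean()
    target = int(round(mean_out))
    for pre in range(conn.shape[1]):
        col = conn[:, pre]                    # view into the copied matrix
        n_out = int(col.sum())
        if n_out >= (1 + tol) * mean_out:     # too many outputs: delete at random
            n_drop = max(n_out - target, 0)
            drop = rng.choice(np.flatnonzero(col), size=n_drop, replace=False)
            col[drop] = False
        elif n_out <= (1 - tol) * mean_out:   # too few outputs: add at random
            n_add = max(target - n_out, 0)
            add = rng.choice(np.flatnonzero(~col), size=n_add, replace=False)
            col[add] = True
    return conn

rng = np.random.default_rng(4)
conn = rng.random((400, 400)) < 0.05          # toy random connectivity
balanced = equalize_out_degree(conn, rng=rng)
mean_before = conn.sum(axis=0).mean()
out_counts = balanced.sum(axis=0)
```

      After the adjustment, every value in `out_counts` lies within 5% of `mean_before`, whereas the input matrix is left unchanged.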

      Reviewer #2 (Recommendations For The Authors): 

      Congratulations on the great and interesting work! The results were nicely presented and the idea of continuous encoding on manifolds is very interesting. To improve the quality of the paper, in addition to the major points raised in the public review, here are some more detailed comments for the paper: 

      (1) Generally, citations have to improve. Spiking networks with excitatory assemblies and different architectures of inhibitory populations have been studied before, and the claim about improved network stability in co-tuned E-I networks has been made in the following papers that need to be correctly cited: 

      • Vogels TP, Sprekeler H, Zenke F, Clopath C, Gerstner W. 2011. Inhibitory Plasticity Balances Excitation and Inhibition in Sensory Pathways and Memory Networks. Science 334:1-7. doi:10.1126/science.1212991 (mentions that emerging precise balance on the synaptic weights can result in the overall network stability) 

      • Lagzi F, Bustos MC, Oswald AM, Doiron B. 2021. Assembly formation is stabilized by Parvalbumin neurons and accelerated by Somatostatin neurons. bioRxiv doi: https://doi.org/10.1101/2021.09.06.459211 (among other things, contrasts stability and competition which arises from multistable networks with global inhibition and reciprocal inhibition)

      • Rost T, Deger M, Nawrot MP. 2018. Winnerless competition in clustered balanced networks: inhibitory assemblies do the trick. Biol Cybern 112:81-98. doi:10.1007/s00422-017-0737-7 (compares different architectures of inhibition and their effects on network dynamics) 

      • Lagzi F, Fairhall A. 2022. Tuned inhibitory firing rate and connection weights as emergent network properties. bioRxiv 2022.04.12.488114. doi:10.1101/2022.04.12.488114 (here, see the eigenvalue and UMAP analysis for a network with global inhibition and E/I assemblies) 

      Additionally, a lot of pioneering work on the tracking of excitatory synaptic inputs by inhibitory populations is missing from the references. Experimental work showing the existence of cell assemblies in the brain is also largely missing. On the other hand, some references that do not fit the focus of the statements have been incorrectly cited.

      The authors may consider referencing the following more pertinent studies on spiking networks to support the statement regarding attractor dynamics in the first paragraph in the Introduction (the current citations of Hopfield and Kohonen are for rate-based networks): 

      • Wong, K.-F., & Wang, X.-J. (2006). A recurrent network mechanism of time integration in perceptual decisions. Journal of Neuroscience, 26(4), 1314-1328. https://doi.org/10.1523/JNEUROSCI.3733-05.2006 

      • Wang, X.-J. (2008). Decision making in recurrent neuronal circuits. Neuron, 60(2), 215-234. https://doi.org/10.1016/j.neuron.2008.09.034  

      • F. Lagzi, & S. Rotter. (2015). Dynamics of competition between subnetworks of spiking neuronal networks in the balanced state. PloS One. 

      • Goldman-Rakic, P. S. (1995). Cellular basis of working memory. Neuron, 14(3), 477-485. 

      • Rost T, Deger M, Nawrot MP. 2018. Winnerless competition in clustered balanced networks: inhibitory assemblies do the trick. Biol Cybern 112:81-98. doi:10.1007/s00422-017-0737-7. 

      • Amit DJ, Tsodyks M (1991) Quantitative study of attractor neural network retrieving at low spike rates: I. substrate-spikes, rates and neuronal gain. Network 2:259-273. 

      • Mazzucato, L., Fontanini, A., & La Camera, G. (2015). Dynamics of Multistable States during Ongoing and Evoked Cortical Activity. Journal of Neuroscience, 35(21), 8214-8231. 

      We thank the reviewer for the reference suggestions. We have carefully reviewed the reference list and made the following changes, which we hope address the reviewer’s concerns:

      (1) We adjusted the references on network stability in co-tuned E-I networks.

      (2) We added the Lagzi & Rotter (2015), Amit & Tsodyks (1991), Mazzucato et al. (2015) and Goldman-Rakic (1995) papers in the Introduction as studies on attractor dynamics in spiking neural networks. We preferred to omit the two X.-J. Wang papers, as they describe attractors in decision making rather than memory processes.

      (3) We added the Ko et al. 2011 paper as experimental evidence for assemblies in the brain. In our view, there are few experimental studies showing the existence of cell assemblies in the brain, which we distinguish from cell ensembles, i.e., groups of coactive neurons.

      (4) We also included Hennequin 2018, Brunel 2000, Lagzi et al. 2021 and Eckmann et al. 2024, which we had not cited in the initial manuscript.

      (5) We removed the Wiechert et al. 2010 reference as it does not support the statement about geometry-preserving transformation by random networks.

      (2) The gist of the paper is about how the architecture of inhibition (reciprocal vs. global in this case) can determine network stability and salient responses (related to multistable attractors and variations) for classification purposes. It would improve the narrative of the paper if this point is raised in the Introduction and Discussion section. Also see a relevant paper that addresses this point here: 

      Lagzi F, Bustos MC, Oswald AM, Doiron B. 2021. Assembly formation is stabilized by Parvalbumin neurons and accelerated by Somatostatin neurons. bioRxiv doi: https://doi.org/10.1101/2021.09.06.459211 

      Classification has long been proposed to be a function of piriform cortex and autoassociative memory networks in general, and we consider it important. However, the computational function of Dp or piriform cortex is still poorly understood, and we do not focus only on odor classification as a possibility. In fact, continuous representational manifolds also support other functions such as the quantification of distance relationships of an input to previously memorized stimuli, or multi-layer network computations (including classification). In the revised manuscript, we have performed additional analyses to explore these notions in more detail, as explained above (response to public reviews, comment 3 of reviewer 1). Furthermore, we have now expanded the discussion of potential computational functions of Tuned networks and explicitly discuss classification but also other potential functions. 

      (3) A plot for the values of the inhibitory conductances in Figure 1 would complete the analysis for that section. 

      In Figure 1, we decided to only show the conductances that we use to fit our model, namely the afferent and total synaptic conductances. As the values of the inhibitory conductances can be derived from panel E, we refrained from plotting them separately for the sake of simplicity. 

      (4) How did the authors calculate correlations between activity patterns as a function of time in Figure 2E, bottom row? Does the color represent correlation coefficient (which should not be time dependent) or is it a correlation function? This should be explained in the Methods section. 

      The color represents the Pearson correlation coefficient between activity patterns within a narrow time window (100 ms). We updated the Figure legend to clarify this: “Mean correlation between activity patterns evoked by a learned odor at different time points during odor presentation. Correlation coefficients were calculated between pairs of activity vectors composed of the mean firing rates of E neurons in 100 ms time bins. Activity vectors were taken from the same or different trials, except for the diagonal, where only patterns from different trials were considered.”
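
      The computation described in the revised legend can be sketched as follows. This is an illustrative reading of the legend, not the authors' implementation: the nested-list layout `rates[trial][time_bin]`, the function names, and the plain averaging over trial pairs are assumptions. Each activity vector is taken to hold one mean firing rate per E neuron for one 100 ms bin.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)  # assumes neither vector is constant

def time_correlation_matrix(rates):
    """rates[trial][time_bin] is a vector of mean firing rates, one per neuron.

    Returns the mean correlation between patterns at time bins i and j.
    Requires at least two trials, because the diagonal uses different trials only.
    """
    n_trials, n_bins = len(rates), len(rates[0])
    corr = [[0.0] * n_bins for _ in range(n_bins)]
    for i in range(n_bins):
        for j in range(n_bins):
            values = [
                pearson(rates[a][i], rates[b][j])
                for a in range(n_trials)
                for b in range(n_trials)
                if not (i == j and a == b)  # diagonal: different trials only
            ]
            corr[i][j] = sum(values) / len(values)
    return corr
```

      Excluding same-trial pairs on the diagonal avoids the trivial value of 1 that a pattern has with itself, so the diagonal reflects trial-to-trial reproducibility instead.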

      (5) Figure 3 needs more clarification (both in the main text and the figure caption). It is not clear what the axes are exactly, and why the network responses for familiar and novel inputs are different. The gray shaded area in panel B needs more explanation as well.  

      We thank the reviewer for the comment. We have improved Figure 3A, the figure caption, as well as the text (see p.6). We hope that the figure is now clearer.

      (6) The "scaled I" network, known for representing input patterns in discrete attractors, should exhibit clear separation between network responses in the 2D PC space in the PCA plots. However, Figure 4D and Figure 6D do not reflect this, as all network responses are overlapped. Can the authors explain the overlap in Figure 4D? 

      In Figure 4D, activity of Scaled networks is distributed between three subregions in state space that are separated by the first 2 PCs. Two of them indeed correspond to attractor states representing the two learned odors while the third represents inputs that are not associated with these attractor states. To clarify this, please see also the density plot in Figure 4E. The few datapoints between these three subregions are likely outliers generated by the sequential change in inputs, as described in Supplementary Figure 8C.

      (7) The reason for writing about the ISN networks is not clear. Co-tuned E-I assemblies do not necessarily have to operate in this regime. Also, the results of the paper do not rely on any of the properties of ISNs, but they are more general. Authors should either show the paradoxical effect associated with ISN (i.e., if increasing input to I neurons decreases their responses) or show ISN properties using stability analysis (See computational research conducted at the Allen Institute, namely Millman et al. 2020, eLife ). Currently, the paper reads as if being in the ISN regime is a necessary requirement, which is not true. Also, the arguments do not connect with the rest of the paper and never show up again. Since we know it is not a requirement, there is no need to have those few sentences in the Results section. Also, the choice of alpha=5.0 is extreme, and therefore, it would help to judge the biological realism if the raster plots for Figs 2-6 are shown.

      We have toned down the part on ISN and reduced it to one sentence for readers who might be interested in knowing whether activity is inhibition-stabilized or not. We have also added the reference to the Tsodyks et al. 1997 paper from which we derive our stability analysis. The text now reads “Hence, pDp<sub>sim</sub> entered a balanced state during odor stimulation (Figure 1D, E) with recurrent input dominating over afferent input, as observed in pDp (Rupprecht and Friedrich, 2018). Shuffling spike times of inhibitory neurons resulted in runaway activity with a probability of 0.79 ± 0.20, demonstrating that activity was inhibition-stabilized (Sadeh and Clopath, 2020b, Tsodyks et al., 1997).”  

      We have now also added the raster plots as suggested by the reviewer (see Figure 2D, Supplementary Figure 1 G, Supplementary Figure 4). We thank the reviewer for this comment.

      (8) In the abstract, authors mention "fast pattern classification" and "continual learning," but in the paper, those issues have not been addressed. The study does not include any synaptic plasticity. 

      Concerning “continual learning”, we agree that we do not simulate the learning process itself. However, Figure 6 shows results of a simulation where two additional patterns were stored in a network that already contained assemblies representing other odors. We consider this a crude way of exploring the end result of a “continual learning” process. “Fast pattern classification” is mentioned because activity in balanced networks can follow fluctuating inputs with high temporal resolution, while networks with stable attractor states tend to be slow. This is likely to account for the occurrence of hysteresis effects in Scaled but not Tuned networks as shown in Supplementary Fig. 8.

      (9) In the Introduction, the first sentence in the second paragraph reads: "... when neurons receive strong excitatory and inhibitory synaptic input ...". The word strong should be changed to "weak".

      Also, see the pioneering work of Brunel 2000. 

      In classical balanced networks, strong excitatory inputs are counterbalanced by strong inhibitory inputs, leading to a fluctuation-driven regime. We have added Brunel 2000.

      (10) In the second paragraph of the introduction, the authors refer to studies about structural co-tuning (e.g., where "precise" synaptic balance is mentioned, and Vogels et al. 2011 should be cited there) and functional co-tuning (which is, in fact, different than tracking of excitation by inhibition, but the authors refer to that as co-tuning). It makes it easier to understand which studies talk about structural co-tuning and which ones are about functional co-tuning. The paper by Znamenski 2018, which showed both structural and functional tuning in experiments, is missing here. 

      We added the citation to the now published paper by Znamenskiy et al. (2024).

      (11) The third paragraph in the Introduction misses some references that address network dynamics that are shaped by the inhibitory architecture in E/I assemblies in spiking networks, like Rost et al 2018 and Lagzi et al 2021. 

      These references have been added.

      (12) The last sentence of the fourth paragraph in the Introduction implies that functional co-tuning is due to structural co-tuning, which is not necessarily true. While structural co-tuning results in functional co-tuning, functional co-tuning does not require structural co-tuning because it could arise from shared correlated input or heterogeneity in synaptic connections from E to I cells.  

      We generally agree with the reviewer, but we are uncertain which sentence the reviewer refers to.

      We assume the reviewer refers to the last sentence of the second (rather than the fourth) paragraph, which explicitly mentions the “…structural basis of E/I co-tuning…”. If so, we consider this sentence still correct because the “structural basis” refers not specifically to E/I assemblies, but also includes any other connectivity that may produce co-tuning, including the connectivity underlying the alternative possibilities mentioned by the reviewer (shared correlated input or heterogeneity of synaptic connections).

      (13) In order to ensure that the comparison between network dynamics is legit, authors should mention up front that for all networks, the average firing rates for the excitatory cells were kept at 1 Hz, and the background input was identical for all E and I cells across different networks.

      We slightly revised the text to make this clearer: “We (…) uniformly scaled I-to-E connection weights by a factor of χ until E population firing rates in response to learned odors matched the corresponding firing rates in rand networks, i.e., 1 Hz.”

      (14) In the last paragraph on page 5, my understanding was that an individual odor could target different cells within an assembly in different trials to generate trial-to-trial variability. If this is correct, this needs to be mentioned clearly.

      This is not correct: an odor consists of 150 activated mitral cells with defined firing rates. As now mentioned in the Methods, “Spikes were then generated from a Poisson distribution, and this process was repeated to create trial-to-trial variability.”
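
      A minimal sketch of this idea: the firing rate of each mitral cell is fixed for a given odor, and only the Poisson spike times differ between trials. The function names, the use of exponential inter-spike intervals (one standard way to sample a homogeneous Poisson process) and the 15 Hz rate in the usage lines are hypothetical and are not taken from the manuscript.

```python
import random

def poisson_spike_train(rate_hz, duration_s, rng):
    """Spike times of a homogeneous Poisson process with the given rate."""
    spikes, t = [], 0.0
    if rate_hz <= 0:
        return spikes
    while True:
        t += rng.expovariate(rate_hz)  # exponential inter-spike interval
        if t >= duration_s:
            return spikes
        spikes.append(t)

def odor_trials(rates_hz, duration_s, n_trials, seed=0):
    """Repeat the Poisson draw for every cell to create trial-to-trial variability."""
    rng = random.Random(seed)
    return [
        [poisson_spike_train(r, duration_s, rng) for r in rates_hz]
        for _ in range(n_trials)
    ]

# Hypothetical odor: 150 activated mitral cells, all at an illustrative 15 Hz.
trials = odor_trials([15.0] * 150, duration_s=2.0, n_trials=3)
print(len(trials), len(trials[0]))
```

      The set of activated cells and their rates are identical across trials; the variability arises only from the random spike times.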

      (15) The last paragraph on page 6 mentions that the four OB activity patterns were uncorrelated but if they were designed as in Figure 4A, due to the existing overlap between the patterns, they cannot be uncorrelated.

      This appears to be a misunderstanding. We mention in the text (and show in Figure 4B) that the four odors which “… were assigned to the corners of a square…” are uncorrelated.  The intermediate odors are of course not uncorrelated. We slightly modified the corresponding paragraph (now on page 7) to clarify this: “The subspace consisted of a set of OB activity patterns representing four uncorrelated pure odors and mixtures of these pure odors. Pure odors were assigned to the corners of a square and mixtures were generated by selecting active mitral cells from each of the pure odors with probabilities depending on the relative distances from the corners (Figure 4A, Methods).”

      (16) The notion of "learned" and "novel" odors may be misleading as there was no plasticity in the network to acquire an input representation. It would be beneficial for the authors to clarify that by "learned," they imply the presence of the corresponding E assembly for the odor in the network, with the input solely impacting that assembly. Conversely, for "novel" inputs, the input does not target a predefined assembly. In Figure 2 and Figure 4, it would be especially helpful to have the spiking raster plots of some sample E and I cells.  

      As suggested by the reviewer, we have modified the existing spiking raster plots in Figure 2, such that they include examples of responses to both learned and novel odors. We added spiking raster plots showing responses of I neurons to the same odors in Supplementary Figure 1F, as well as spiking raster plots of E neurons in Supplementary Figure 4A. To clarify the usage of “learned” and “novel”, we have added a sentence in the Results section: “We thus refer to an odor as “learned” when a network contains a corresponding assembly, and as “novel” when no such assembly is present.”

      (17) In the last paragraph of page 8, can the authors explain where the asymmetry comes from? 

      As mentioned in the text, the asymmetry comes from the difference in the covariance structure of different classes. To clarify, we have rephrased the sentence defining the Mahalanobis distance: 

      “This measure quantifies the distance between the pattern and the class center, taking into account covariation of neuronal activity within the class. In bidirectional comparisons between patterns from different classes, the mean dM may be asymmetric if neural covariance differs between classes.”
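
      The asymmetry can be made concrete with a small sketch. The standard Mahalanobis distance of a pattern v from a reference class Q with mean μ and covariance Σ is sqrt((v − μ)ᵀ Σ⁻¹ (v − μ)). The helper names and the two-dimensional toy classes below are assumptions for illustration, and the sketch assumes the reference class has more patterns than dimensions so that Σ is invertible.

```python
import math

def mean_vector(patterns):
    n = len(patterns)
    return [sum(col) / n for col in zip(*patterns)]

def covariance(patterns):
    """Sample covariance matrix (rows are patterns, columns are neurons)."""
    n, mu = len(patterns), mean_vector(patterns)
    dim = len(mu)
    return [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in patterns) / (n - 1)
             for j in range(dim)] for i in range(dim)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis(v, reference):
    """Distance of pattern v from the center of the reference class."""
    mu = mean_vector(reference)
    d = [vi - mi for vi, mi in zip(v, mu)]
    y = solve(covariance(reference), d)  # y = inverse(covariance) @ d
    return math.sqrt(sum(di * yi for di, yi in zip(d, y)))

# Toy classes: A is broad, B is tight (hypothetical values).
class_a = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
class_b = [[10.0, 0.0], [10.5, 0.0], [10.0, 0.5], [10.5, 0.5]]
print(mahalanobis(mean_vector(class_b), class_a))  # B's center seen from A
print(mahalanobis(mean_vector(class_a), class_b))  # A's center seen from B
```

      Because class B has the smaller variance, the same Euclidean separation corresponds to a larger distance when B is the reference class than when A is, which is the kind of asymmetry the quoted sentence describes.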

      (18) The first paragraph of page 9: random networks are not expected to perform pattern classification, but just pattern representation. It would have been better if the authors compared Scaled I network with E/I co-tuned network. Regardless of the expected poorer performance of the E/I co-tuned networks, the result would have been interesting. 

      Please see our reply to the public review (reviewer 2).

      (19) Second paragraph on page 9, the authors should provide statistical significance test analysis for the statement "... was significantly higher ...". 

      We have performed a Wilcoxon signed-rank test, and reported the p-value in the revised manuscript (p < 0.01). 

      (20) The last sentence in the first paragraph on page 11 is not clear. What do the authors mean by "linearize input-output functions", and how does it support their claim? 

      We have now amended this sentence to clarify what we mean: “…linearize the relationship between the mean input and output firing rates of neuronal populations…”.

      (21) In the first sentence of the last paragraph on page 11, the authors mentioned “high variability”, but it is not clear compared with which of the other 3 networks they observed high variability.

      Structurally co-tuned E/I networks are expected to diminish network-level variability. 

      “High variability” refers to the variability of spike trains, which is now mentioned explicitly in the text. We hope this more precise statement clarifies this point.

      (22) Methods section, page 14: "firing rates decreased with a time constant of 1, 2 or 4 s". How did they decrease? Was it an implementation algorithm? The time scale of input presentation is 2 s and it overlaps with the decay time constant (particularly with the one with 4 s decrease).  

      Firing rates decreased exponentially. We have added this information in the Methods section.
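
      As a brief illustration of the reviewer's point about overlapping time scales, exponential decay with time constant τ follows r(t) = r₀·exp(−t/τ). The sketch below assumes decay toward zero and uses a hypothetical initial rate of 10 Hz; neither detail is stated in the response.

```python
import math

def decayed_rate(initial_rate_hz, t_s, tau_s):
    """Rate after t_s seconds of exponential decay with time constant tau_s."""
    return initial_rate_hz * math.exp(-t_s / tau_s)

# Remaining rate after a 2 s presentation for the three quoted time constants.
for tau_s in (1.0, 2.0, 4.0):
    print(tau_s, decayed_rate(10.0, 2.0, tau_s))
```

      With τ = 4 s, more than half of the initial rate remains after 2 s, which is the overlap between decay and presentation time that the reviewer noted.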

      Reviewer #3 (Recommendations For The Authors): 

      In the following, I suggest minor corrections to each section which I believe can improve the manuscript. 

      - There was no github link to the code in the manuscript. The code should be made available with a link to github in the final manuscript. 

      The code can be found here: https://github.com/clairemb90/pDp-model. The link has been added in the Methods section.

      Figure 1: 

      - Fig. 1A: call it pDp not Dp. Please check if this name is consistent in every figure and the text. 

      Thank you for catching this. Now corrected in Figure 1, Figure 2 and in the text.

      - The authors write: "Hence, pDpsim entered an inhibition-stabilized balanced state (Sadeh and Clopath, 2020b) during odor stimulation (Figure 1D, E)." and then later "Shuffling spike times of inhibitory neurons resulted in runaway activity with a probability of ~80%, demonstrating that activity was indeed inhibition-stabilized. These results were robust against parameter variations (Methods)." I would suggest moving the second sentence before the first sentence, because the fact that the network is in the ISN regime follows from the shuffled spike timing result. 

      Also, I'd suggest showing this as a supplementary figure. 

      We thank the reviewer for this comment. We have removed “inhibition-stabilized” from the first sentence, as there is no strong evidence of this in Rupprecht and Friedrich, 2018, and removed “indeed” from the second sentence. We also provided more detailed statistics. The text now reads “Hence, pDpsim entered a balanced state during odor stimulation (Figure 1D, E) with recurrent input dominating over afferent input, as observed in pDp (Rupprecht and Friedrich, 2018). Shuffling spike times of inhibitory neurons resulted in runaway activity with a probability of 0.79 ± 0.20, demonstrating that activity was inhibition-stabilized (Sadeh and Clopath, 2020b).”

      Figure 2: 

      - "... Scaled I networks (Figure 2H." Missing ) 

      Corrected.

      - The authors write "Unlike in Scaled I networks, mean firing rates evoked by novel odors were indistinguishable from those evoked by learned odors and from mean firing rates in rand networks (Figure 2F)." 

      Why is this something you want to see? Isn't it that novel stimuli usually lead to high responses? Eg in the paper Schulz et al., 2021 (eLife) which is also cited by the authors it is shown that novel responses have high onset firing rates. I suggest clarifying this (same in the context of Fig. 3C). 

      In Dp and piriform cortex, firing rates evoked by learned odors are not substantially different from firing rates evoked by novel odors. While small differences between responses to learned versus novel odors cannot be excluded, substantial learning-related differences in firing rates, as observed in other brain areas, have not been described in Dp or piriform cortex. We added references in the last paragraph of p.5. Note that the paper by Schulz et al. (2021) models a different type of circuit.  

      - Fig. 2B: Indicate in figure caption that this is the case "Scaled I" 

      This is not exactly the case “Scaled I”, as the parameter χ (increased I-to-E strength) is set to 1.

      - Suppl Fig. 2I: Is E&F ever used in the manuscript? I couldn't find a reference. I suggest removing it if not needed. 

      Suppl. Fig 2I E&F is now Suppl Fig.1G&H. We now refer to it in the text: “Activity of networks with E assemblies could not be stabilized around 1 Hz by increasing connectivity from subsets of I neurons receiving dense feed-forward input from activated mitral cells (Supplementary Figure 1GH; Sadeh and Clopath, 2020).”

      Figure 3: 

      - As mentioned in my comment in the public review section, I find the arguments about pattern completion a little bit confusing. For me it's not clear why an increase of output correlations over input correlations is considered "pattern completion" (this is not to say that I don't find the nonlinear increase of output correlations interesting). For me, to test pattern completion with second-order statistics one would need to do a similar separation as in Suppl Fig. 3, ie measuring the pairwise correlation at cells in the assembly L that get direct input from L OB with cells in the assembly L that do not get direct input from OB. If the pairwise correlations of assembly cells which do not get direct input from OB increase in correlations, I would consider this as pattern completion (similar to the argument that increase in firing rate in cells which are not directly driven by OB are considered a sign of pattern completion). 

      Also, it now seems to me that there are contradictory results: in Fig. 3 only Scaled I can lead to pattern completion, while in the context of Suppl. Fig. 3 the authors write "We found that assemblies were recruited by partial inputs in all structured pDpsim networks (Scaled and Tuned) without a significant increase in the overall population activity (Supplementary Figure 3A)."   I suggest clarifying what exactly the authors mean by pattern completion, why the increase of output correlations above input correlations can be considered as pattern completion, and why the results differ when looking at firing rates versus correlations.

      Please see our reply to the public review (reviewer 3).

      - I actually would suggest adding Suppl. Fig. 3 to the main figure. It shows a more intuitive form of pattern completion and in the text there is a lot of back and forth between Fig. 3 and Suppl. Fig. 3 

      We feel that the additional explanations and panels in Fig.3 should clarify this issue and therefore prefer to keep Supplementary Figure 3 as part of the Supplementary Figures for simplicity.  

      - In the whole section "We next explored effects of assemblies ... prevented strong recurrent amplification within E/I assemblies." the authors could provide a link to the respective panel in Fig. 2 after each statement. This would help the reader follow your arguments. 

      We thank the reviewer for pointing this out. The references to the appropriate panels have been added. 

      - Fig. 3A: I guess the x-axis has been shifted upwards? Should be at zero. 

      We have modified the x-axis to make it consistent with panels B and C.  

      - Fig. 3B: In the figure caption, the dotted line is described as the novel odor but it is actually the unit line. The dashed lines represent the reference to the novel odor. 

      Fixed.

      - Fig. 3C: The " is missing for Pseudo-Assembly N

      Fixed.

      - "...or a learned odor into another learned odor." Have here a ref to the Supplementary Figure 3B.

      Added.

      Figure 4:   

      - "This geometry was largely maintained in the output of rand networks, consistent with the notion that random networks tend to preserve similarity relationships between input patterns (Babadi and Sompolinsky, 2014; Marr, 1969; Schaffer et al., 2018; Wiechert et al., 2010)." I suggest adding here reference to Fig. 4D (left). 

      Added.

      - Please add a definition of E/I assemblies. How do the authors define E/I assemblies? I think they consider both, Tuned I and Tuned E+I as E/I assemblies? In Suppl. Fig. 2I E it looks like tuned feedforward input is defined as E/I assemblies. 

      We thank the reviewer for pointing this out. E/I assemblies are groups of E and I neurons with enhanced connectivity. In other words, in E/I assemblies, connectivity is enhanced not only between subsets of E neurons, but also between these E neurons and a subset of I neurons. This is now clarified in the text: “We first selected the 25 I neurons that received the largest number of connections from the 100 E neurons of an assembly. To generate E/I assemblies, the connectivity between these two sets of neurons was then enhanced by two procedures.” We removed “E/I assemblies” in Suppl. Fig.2, where the term was not used correctly, and apologize for the confusion.

      - Suppl. Fig. 4: Could the authors please define what they mean by "Loadings" 

      The loadings indicate the contribution of each neuron to each principal component, see adjusted legend of Suppl. Fig. 4: “G. Loading plot: contribution of neurons to the first two PCs of a rand and a Tuned E+I network (Figure 4D).”

      - Fig. 4F: The authors might want to normalize the participation ratio by the number of neurons (see e.g. Dahmen et al., 2023 bioRxiv, "relative PR"), so the PR is bound between 0 and 1 and the dependence on N is removed. 

      We thank the reviewer for the suggestion, but we prefer to use the non-normalized PR as we find it more easily interpretable (e.g. number of attractor states in Scaled networks).
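
      For readers comparing the two options, the sketch below assumes the commonly used definition of the participation ratio, PR = (Σλ)² / Σλ², where λ are the eigenvalues of the activity covariance matrix C. It uses the equivalent form tr(C)² / tr(C²), which needs no eigendecomposition. Whether the manuscript uses exactly this definition is an assumption, and the function names are hypothetical. The normalized variant suggested by the reviewer divides PR by the number of neurons N.

```python
def covariance(patterns):
    """Sample covariance matrix (rows are patterns, columns are neurons)."""
    n = len(patterns)
    mu = [sum(col) / n for col in zip(*patterns)]
    dim = len(mu)
    return [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in patterns) / (n - 1)
             for j in range(dim)] for i in range(dim)]

def participation_ratio(patterns):
    """PR = tr(C)^2 / tr(C^2); equals the number of dimensions with equal variance."""
    c = covariance(patterns)
    trace = sum(c[i][i] for i in range(len(c)))
    trace_sq = sum(x * x for row in c for x in row)  # tr(C^2) for symmetric C
    return trace ** 2 / trace_sq

def relative_participation_ratio(patterns):
    """PR divided by the number of neurons, bounded between 0 and 1."""
    return participation_ratio(patterns) / len(patterns[0])

# Toy data: two independent dimensions with equal variance, one silent neuron.
toy = [[-1.0, -1.0, 0.0], [1.0, -1.0, 0.0], [-1.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
print(participation_ratio(toy), relative_participation_ratio(toy))
```

      The non-normalized value can be read directly as an effective number of dimensions, which is the interpretability argument made in the response, whereas the normalized value removes the dependence on N.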

      - Fig. 4G&H: as mentioned in the public review, I'd add the case of Scaled I to be able to compare it to the Tuned E+I case. 

      As already mentioned in the public review, we thank the reviewer for this suggestion, which we have implemented.

      - Figure caption Fig. 4H "Similar results were obtained in the full-dimensional space." I suggest showing this as a supplemental panel. 

      Since this only adds little information, we have chosen not to include it as a supplemental panel to avoid overloading the paper with figures.

      Figure 5: 

      - As mentioned in the public review, I suggest that the authors add the Scaled I case to Fig. 5 (it's shown in all figures and also in Fig. 6 again). I guess for Scaled I the separation between L and M will be very good? 

      Please see our reply to the public review (reviewer 3).

      - Fig. 5A&B: I am a bit confused about which neurons are drawn to calculate the Mahalanobis distance. In Fig. 5A, the schematic indicates that the vector B from which the neurons are drawn is distinct from the distribution Q. For the example of odor L, the distribution Q consists of pure odor L with odors that have little mixtures with the other odors. But the vector v for odor L seems to be drawn only from odors that have slightly higher mixtures (as shown in the schematic in Fig. 5A). Is there a reason to choose the vector v from different odors than the distribution Q? 

      The distribution Q and the vector v consist of activity patterns across the same neurons in response to different odors. The reason to choose a different odor for v was to avoid having this test datapoint being included in the distribution Q. We also wanted Q to be the same for all test datapoints. 

      What does "drawn from whole population" mean? Does this mean that the vectors are drawn from any neuron in pDp? If yes, then I don't understand how the authors can distinguish between different odors (L,M,O,N) on the y-axis. Or does "whole population" mean that the vector is drawn across all assemblies as shown in the schematic in Fig. 5A and the case "neurons drawn from (pseudo-) assembly" means that the authors choose only one specific assembly? In any case, the description here is a bit confusing, I think it would help the reader to clarify those terms better.  

      Yes, “drawn from whole population” means that we randomly draw 80 neurons from the 4000 E neurons in pDp. The y-axis means that we use the activity patterns of these neurons evoked by one of the 4 odors (L, M, N, O) as reference. We have modified the Figure legend to clarify this: “d<sub>M</sub> was computed based on the activity patterns of 80 E neurons drawn from the four (pseudo-) assemblies (top) or from the whole population of 4000 E neurons (bottom). Average of 50 draws.”

      - Suppl Fig. 5A: In the schematic the distance is called d_E(\bar{Q},\bar{V}) while the colorbar has d_E(\bar{Q},\bar{Q}) with the Qs in different color. The green Q should be a V. 

      We thank the reviewer for spotting this mistake, it is now fixed.

      - Fig. 5: Could the authors comment on the fact that a random network seems to be very good in classifying patterns on its own. Maybe in the Discussion?

      The task shown in Figure 5 is a relatively easy one, a forced-choice between four classes which are uncorrelated. In Supplementary Figure 9, we now show classification for correlated classes, which is already much harder.

      Figure 6: 

      - Is the correlation induced by creating mixtures like in the other Figures? Please clarify how the correlations were induced. 

      We clarified this point in the Methods section: “The pixel at each vertex corresponded to one pure odor with 150 activated and 75 inhibited mitral cells (…) and the remaining pixels corresponded to mixtures. In the case of correlated pure odors (Figure 6), adjacent pure odors shared half of their activated and half of their inhibited cells.” An explicit reference to the Methods section has also been added to the figure legend.

      - Fig. 6C (right): why don't we see the clear separation in PC space as shown in Fig. 4? Is this related to the existence of correlations? Please clarify. 

      Yes. The assemblies corresponding to the correlated odors X and Y overlap significantly, and therefore responses to these odors cannot be well separated, especially for Scaled networks. We added the overlap quantification in the Results section to make this clear: “These two additional assemblies had on average 16% of neurons in common due to the similarity of the odors.”

      - "Furthermore, in this regime of higher pattern similarity, dM was again increased upon learning, particularly between learned odors and reference classes representing other odors (not shown)." Please show this (maybe as a supplemental figure). 

      We now show the data in Supplementary Figure 9.

      Discussion: 

      - The authors write: "We found that transformations became more discrete map-like when amplification within assemblies was increased and precision of synaptic balance was reduced. Likewise, decreasing amplification in assemblies of Scaled networks changed transformations towards the intermediate behavior, albeit with broader firing rate distributions than in Tuned networks (not shown)." 

      Where do I see the first point? I guess when I compare in Fig. 4D the case of Scaled I vs Tuned E+I, but the sentence above sounds like the authors showed this in a more step-wise way eg by changing the strength of \alpha or \beta (as defined in Fig. 1). 

      Also I think if the authors want to make the point that decreasing amplification in assemblies changes transformation with a different rate distribution in scaled vs tuned networks, the authors should show it (eg adding a supplemental figure). 

      The first point is indeed supported by data from different figures. Please note that the revised manuscript now contains further simulations that reinforce this statement, particularly those shown in Supplementary Figure 6, and that this point is now discussed more extensively in the Discussion. We hope that these revisions clarify this general point.

      The data showing effects of decreasing amplification in assemblies are now shown in Supplementary Figure 6 (Scaled[adjust]).

      - I suggest adding the citation Znamenskiy et al., 2024 (Neuron; https://doi.org/10.1016/j.neuron.2023.12.013), which shows that excitatory and inhibitory (PV) neurons with functional similarities are indeed strongly connected in mouse V1, suggesting the existence of E/I assembly structure also in mammals.

      Done.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment:

      Developing a reliable method to record ancestry and distinguish between human somatic cells presents significant challenges. I fully acknowledge that my current evidence supporting the claim of lineage tracing with fCpG barcodes is inadequate. I agree with Reviewer 1 that fCpG barcodes are essentially a cellular division clock that diverges over time. A division clock could potentially document when cells cease to divide during development, with immediate daughter cells likely exhibiting more similar barcodes than those that are less related. Although it remains uncertain whether the current fCpG barcodes capture useful biological information, refinement of this type of tool could complement other approaches that reconstruct human brain function, development, and aging.

      Due to my lack of clarity, the fCpG barcode was perceived to be a new type of cell classifier. However, it is fundamentally different. fCpG sites are selected based on their differences between cells of the same type, while traditional cell classifiers focus on sites with consistent methylation patterns in cells of the same type. Despite these opposing criteria, fCpG barcodes and traditional cell classifiers may align because neuron subtypes often share common progenitors. As a result, cells of the same phenotype are also closely related by ancestry, and ex post facto, have similar fCpG barcodes. fCpG barcodes are complementary to cell type classifiers, and potentially provide insights into aspects such as mitotic ages, diversity within a clade, and migration of immediate daughters---information which is otherwise difficult to obtain. The title has been modified to “Human Brain Ancestral Barcodes” to better reflect the function of the fCpG barcodes. The manuscript is edited to correct errors, and a new Supplement is added to further explain fCpG barcode mechanics and present new supporting data.

      Reviewer #1 (Public review):

      I thank Reviewer 1 for his constructive comments. Major noted weaknesses were 1) insufficient clarity and brevity of the methodology, 2) inconsistent or erroneous use of neurodevelopmental concepts, and 3) lack of consideration for alternative explanations.

      (1) The methodology is now outlined in detail in a new Supplement, including simulations that indicate that the error rate consistent with the experimental data is about 0.01 changes in methylation per fCpG site per division.

      (2) Conceptual and terminology errors noted by the Reviewers are corrected in the manuscript.

      (3) I agree completely with the alternative explanation of Reviewer 1 that fCpGs are “a cellular division clock that diverges over 'time'”. Differences between more traditional cell type classifiers and fCpG barcodes are more fully outlined in the new Supplement. Ancestry recorded by fCpGs and cell type classifiers are confounded because cells of the same phenotype typically have common progenitors---cells within a clade have similar fCpG barcodes because they are closely related. fCpG barcodes can complement cell type classifiers with additional information such as mitotic ages, ancestry within a clade, and daughter cell migration.

      Reviewer #1 (Recommendations for the authors):

      (1) A lot of the interpretations suffer from an extremely loose/erroneous use of developmental concepts and a lack of transparency. For instance:

      a) The thalamus is not part of the brain stem

      Corrected.

      b) The pons contains cells other than inhibitory neurons in the data; the same is true for the hippocampus which contains multiple cell types

      Corrected to refer to the specific cell types in these regions.

      c) The author talks about the rostral-caudal timing a lot which is not really discussed to this degree in the cited references. Thus, it is also unclear how interneurons fit in this model as they are distinguished by a ventral-dorsal difference from excitatory neurons. Also, it is unclear whether the timing is really as distinct as claimed. For instance, inhibitory neurons and excitatory neurons significantly overlap in their birth timing. Finally, conceptually, it does not make sense to go by developmental timing as the author proposes that it is the number of divisions that is relevant. While they are somewhat correlated there are potentially stark differences.

      The manuscript attempts to describe what might be broadly expected when barcodes are sampled from different cell types and locations. As a proposed mitotic clock, the fCpG barcode methylation level could time when each neuron ceased division and differentiated. The wide ranges of fCpG barcode methylation of each cell type (Fig 2A) would be consistent with significant overlap between cell types. The manuscript is edited to emphasize overlapping rather than distinct sequential differentiation of the cell types.

      d) Neocortical astrocytes and some oligodendrocytes share a lineage, whereas a subset of oligodendrocytes in the cortex shares an origin with interneurons. This could confound results but is never discussed.

      The manuscript does not assess glial lineages in detail because neurons were preferentially included in the sampling whereas glial cells were non-systematically excluded. This sampling information is now included in the section “fCpG barcode identification”.

      e) Neocortical interneurons should be more closely related in terms of lineage-to-excitatory neurons than other inhibitory neurons of, for instance, the pons. This is not clearly discussed and delineated.

      This is not discussed. It may not be possible to analyze these details with the current data. The ancestral tree reconstructions indicate that excitatory neurons that appear earlier in development (and are more methylated) are more often more closely related to inhibitory neurons.

      f) While there is some spread of excitatory neurons tangentially, there is no tangential migration at the scale of interneurons as (somewhat) suggested/implied here.

      The abstract and results have been modified to indicate greater inhibitory than excitatory neuron tangential migration, but that the extent of excitatory neuron tangential migration cannot be determined because of the sparse sampling and that barcodes may be similar by chance.

      g) The nature of the NN cells is quite important as cells not derived from the neocortical anlage are unlikely to share a developmental origin (e.g., microglia, endothelial cells). This should be clarified and clearly stated.

      The manuscript is modified to indicate that NN cells are microglial and endothelial cells. These cells have different developmental origins, and their data are present in Fig 2A, but are not further used for ancestral analysis.  

      (2) The presentation is often somewhat confusing to me and lacks detail. For instance:

      a) The methods are extremely short and I was unable to find a reference for a full pipeline, so other researchers can replicate the work and learn how to use the pipeline.

      The pipeline, including Python code, is outlined in the new Supplement.

      b) Often numbers are given as ~XX when the actual number with some indication of confidence or spread would be more appropriate.

      Data ranges are often indicated with the violin plots.

      c) Many figure legends are exceedingly short and do not provide an appropriate level of detail.

      Figure legends have been modified to include more detail.

      d) Not defining groups in the figure legends or a table is quite unacceptable to me. I do not think that referring to a prior publication (that does not consistently use these groups anyway) is sufficient.

      The cell groups are based on the annotations provided with each single cell in the public databases.

      e) The used data should be better defined and introduced (number of cells, different subtypes across areas, which cells were excluded; I assume the latter as pons and hippocampus are only mentioned for one type of neuronal cells, see also above).

      The data used are present in Supplemental File 2 under the tab “cell summary H01, H02, H04”.

      f) Why were different upper bounds used for filtering for H01 and H02, and H04 is not mentioned? Why are inhibitory and excitatory neurons specifically mentioned (Lines 61-66)?

      The filtering is used to eliminate, as much as possible, cell type specific methylation, or CpG sites with skewed neuron methylation. The filtering eliminates CpG sites with high or low methylation within each of the three brains, and within the two major neuron subtypes. The goal is to enrich for CpG sites with polymorphic but not cell type specific methylation. This process is ad hoc as success criteria are currently uncertain. The extent of filtering is balanced by the need to retain sufficient numbers of fCpGs to allow comparisons between the neurons.

      g) What 'progenitor' does the author refer to? The Zygote? If yes, can the methylation status be tested directly from a zygote? There is no single progenitor for these cells other than the zygote. Does the assumption hold true when taking this into account? See, for instance, PMID 33737485 for some estimation of lineage bottlenecks.

      A brain progenitor cell can be defined as the common ancestor of all adult neurons, and is the first cell where each of its immediate daughter cell lineages yield adult neurons. The zygote is a progenitor cell to all adult cells, and barcode methylation at the start of conception, from the oocyte to the ICM, was analyzed in the new Supplement. The proposed brain progenitor cell with a fully methylated barcode was not yet evident even in the ICM.

      (3) I am generally not convinced that the fCpGs represent anything but a molecular clock of cell divisions and that many of the similarities are a function of lower division numbers where the state might be more homogenous. This mainly derives from the issues cited above, the lack of convincing evidence to the contrary, and the sparsity of the assessed data.

      Agree that the fCpG barcode is a mitotic clock that becomes polymorphic with divisions. As outlined in the new Supplement, ancestry and cell type are confounded because cells of the same type typically have a common progenitor.

      a) There appears little consideration or modeling of what the ability to switch back does to the lineage reconstruction.

      fCpG methylation flipping is further analyzed and discussed in the new Supplement.

      b) None of the data convinced me that the observations cannot be explained by the aforementioned molecular clock and systematic methylation similarities of cell types due to their cell state.

      See above

      (4) Uncategorized minor issues:

      a) The author should explain concepts like 'molecular clock hypothesis' (line 27) or 'radial unit hypothesis' (line 154), as they are somewhat complex and might not be intuitive to readers.

      The molecular clock hypothesis is deleted and the radial unit hypothesis is explained in more detail in the manuscript.

      b) Line 32: '[...] replication errors are much higher compared to base replication [...]'. I think this is central to the method and should be better explained and referenced. Maybe even through a schematic, as this is a central concept for the entire manuscript.

      The fCpG barcode mechanics are better explained in the new Supplement. With simulations, the fCpG flip rate is about 0.01 per division per fCpG.
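      The quoted flip rate can be turned into a toy model of barcode drift. The sketch below rests on assumptions that are not statements from the Supplement: every fCpG starts methylated, sites flip independently, and methylated-to-unmethylated and reverse flips occur at the same rate of 0.01 per division. Under those assumptions the average methylation decays from 100% towards 50%, and back changes are included automatically. The function names are hypothetical.

```python
import random


def expected_methylation(n_divisions, flip_rate=0.01, start=1.0):
    """Closed-form mean methylation for a symmetric two-state flip model.

    Each division shrinks the deviation from 0.5 by a factor (1 - 2p).
    """
    return 0.5 + (start - 0.5) * (1.0 - 2.0 * flip_rate) ** n_divisions


def simulate_barcode(n_sites, n_divisions, flip_rate=0.01, seed=0):
    """Follow one lineage: start fully methylated (1) and flip each site
    independently with probability flip_rate at every division."""
    rng = random.Random(seed)
    barcode = [1] * n_sites
    for _ in range(n_divisions):
        barcode = [1 - s if rng.random() < flip_rate else s for s in barcode]
    return barcode


barcode = simulate_barcode(n_sites=20000, n_divisions=50, seed=1)
simulated_mean = sum(barcode) / len(barcode)
predicted_mean = expected_methylation(50)
```

      In this toy model the barcode's average methylation acts as a division counter only while it is still well above 50%; once it approaches 50%, further divisions barely change the average, although individual sites keep flipping.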

      c) Line 41: 'neonatal'. Does the author mean to say prenatal? Most of the cells discussed are postmitotic before birth.

      Corrected to prenatal.

      d) Line 96: what does 'flip' mean in this context? Please also see the comment on Figure 2C.

      Edited to “change”.

      e) Lines 134-135: I am not sure whether the author claims to provide evidence for this question, and I would be careful with claims that this work does resolve the question here.

      Have toned down claims as evidence for my analysis is currently inadequate.

      f) Lines 192-193: I disagree as the fCpGs can switch back and the current data does not convince me that this is an improvement upon mosaic mutation analysis. In my mind, the main advantage is the re-analysis of existing data and the parallel functional insights that can be obtained.

      Lineage analysis is more straightforward with DNA sequencing, but with an error rate of ~10⁻⁹ per base per division, one needs to sequence a billion base pairs to distinguish between immediate daughter cells. By contrast, with an inferred error rate of ~10⁻² per fCpG per division, much less sequencing (about a million-fold less) is needed to find differences between daughter cells.
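      The comparison of error rates is simple arithmetic and can be checked directly. The sketch below takes the two quoted rates at face value and asks how many sites must be read to expect one change per division; the function names are hypothetical, and the calculation ignores coverage, read length, and back changes. With the rates exactly as quoted, the ratio evaluates to roughly 10^7, so the "about a million-fold" figure is best read as an order-of-magnitude statement.

```python
def sites_for_one_expected_change(rate_per_site_per_division):
    """Number of sites to read so that one change is expected per division."""
    return 1.0 / rate_per_site_per_division


def prob_any_change(rate_per_site_per_division, n_sites):
    """Probability that at least one of n_sites changes in one division,
    assuming independent sites."""
    return 1.0 - (1.0 - rate_per_site_per_division) ** n_sites


BASE_RATE = 1e-9   # per base per division, as quoted in the response
FCPG_RATE = 1e-2   # per fCpG per division, as quoted in the response

bases_needed = sites_for_one_expected_change(BASE_RATE)
fcpgs_needed = sites_for_one_expected_change(FCPG_RATE)
fold_difference = bases_needed / fcpgs_needed
```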

      g) Lines 208-209: I would be careful with claims of complexity resolution given many of the limitations and inherent systematic similarities, as well as the potential of fCpGs to change back to an ancestral state later in the lineage.

      Have modified the manuscript to indicate the analysis would be more challenging due to back changes.

      h) There seem to be few figures that assess phenomena across the three brains. Even when they exist there is no attempt to provide any statistical analyses to support the conclusions or permutations to assess outlier status relative to expectations.

      The analysis could be more extensive, but with only three brains, any results, like this study itself, would be rightly judged inadequate.

      i) Figure 2B: there appears to be a higher number of '0s' for, for instance, inhibitory neurons compared to excitatory neurons. Is that correct and worth mentioning? The changing axes scales also make it hard to assess.

      Inhibitory neurons do appear to have more unmethylated fCpGs compared to excitatory neurons, but in general, most inhibitory fCpGs are methylated with a skew to fully methylated fCpGs, consistent with the barcode starting predominately methylated and inhibitory neurons generally appearing earlier in development relative to excitatory neurons.

      j) Figure 2C: I have several issues with this. A minor one is the use of 'Glial' which, I believe, does not appear anywhere else before this, so I am unclear what this curve represents. Generally, however, I am not sure what the y-axis represents, as it is not described in the methods or figure legend. I initially thought it was the cumulative frequency, but I do not think that this squares with the data shown in B. I appreciate the overall idea of having 'earlier'/samples with fewer divisions being shifted to the left, but it is very confusing to me when I try to understand the details of the plot.

      This graph is now better described in the legend. “Glial” cells are defined as oligodendrocytes and astrocytes. Other non-neuronal cells (such as microglial cells) have now been removed from the graph.

      This graph attempts to illustrate how it may be possible to reconstruct brain development from adult neurons, assuming barcodes are mitotic clocks that become polymorphic with cell division. The X axis is “time”, and the Y axis indicates when different cell types reach their adult levels. The cartoon indicates what is visually present along the X axis during development--- brainstem, then ganglionic eminences with a thin cortex, and finally the mature brain with a robust cortex. Time for the X axis is barcode methylation and starts at 100% and ends at 50% or greater methylation. The fCpG barcode methylation of each cell places it on this timeline and indicates when it ceased dividing and differentiated.

      The Y axis indicates the progressive accumulation of the final adult contents of each cell type during this timeline. Early in development, the brain is rudimentary and adult cells are absent. At 90% methylation, only the inhibitory neurons in the pons are present. At 80% methylation, some excitatory neurons are beginning to appear. Inhibitory neurons in the pons have reached their final adult levels and many other inhibitory neuron types are reaching adult levels. By 70% methylation, most inhibitory neurons have reached their adult levels, and more adult excitatory neurons (mainly low cortical neurons, L4-6) and glial cells are beginning to appear. By 60% methylation, inhibitory neurogenesis has largely finished. Adult excitatory neurons and glial cells are more abundant and reach their adult levels by 50% or greater cell barcode methylation levels.

      The graph illustrates a rough alignment between mitotic ages inferred by barcode methylation levels and the physical appearances of different neuronal types during development. Many neurons die during development, and this graph, if valid, indicates when neurons that survive to adulthood appear during development.
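      One way such an accumulation curve could be computed is sketched below. This is an assumption about the construction, not the author's code: for each methylation threshold along the "time" axis, the curve reports the fraction of cells of one type whose barcode methylation is at or above that threshold, that is, the cells that would already have stopped dividing by that point. The function name and the toy values are hypothetical.

```python
def accumulation_curve(methylation_levels, thresholds):
    """Fraction of cells with barcode methylation >= each threshold.

    Thresholds are read from high (early) to low (late) methylation, so the
    returned fractions rise as the timeline moves from 100% towards 50%.
    """
    if not methylation_levels:
        raise ValueError("need at least one cell")
    n = len(methylation_levels)
    return [sum(m >= t for m in methylation_levels) / n for t in thresholds]


# Toy barcode methylation levels for four cells of one type.
toy_cells = [0.95, 0.85, 0.65, 0.55]
timeline = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
curve = accumulation_curve(toy_cells, timeline)
```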

      k) Figure 4Bff: it is confusing to me that the text jumps to these panels after introducing Figure 5. This makes it very hard to read this section of the text.

      The Figures appear in the order they are first referred to in the text.

      l) Figure 5A: could any of this difference be explained by the shared lineage of excitatory neurons and dorsal neocortical glia?

      Not sure

      m) Figure 5B: after stating that interneurons have a higher lineage fidelity, the figure legend here states the opposite and I am somewhat confused by this statement.

      The legend and text have been clarified. Fig 5A restricts fidelity to within inhibitory cell types. Fig 5B compares between neuron subtypes, and illustrates more apparent inhibitory subtype switching, albeit there are more interneuron subtypes than excitatory subtypes.

      n) Figure 5E: generally, the use of tSNE for large pairwise distance analysis is often frowned upon (e.g., PMID 37590228), and I would reconsider this argument.

      This analysis was an attempt to illustrate that cells of the same phenotype based on their tSNE metrics can be either closely or more distantly related. Although the tSNE comparisons were restricted to subtypes (and not to the entire tSNE graph), tSNE is not designed for such comparisons. This graph and discussion are deleted.

      Reviewer #2 (Public review):

      The manuscript by Shibata proposed a potentially interesting idea that variation in methylcytosine across cells can inform cellular lineage in a way similar to single nucleotide variants (SNVs). The work builds on the hypothesis that the "replication" of methylcytosine, presumably by DNMT1, is inaccurate and produces stochastic methylation variants that are inherited in a cellular lineage. Although this notion can be correct to some extent, it does not account for other mechanisms that modulate methylcytosines, such as active gain of methylation mediated by DNMT3A/B activity and active demethylation mediated by TET activity. In some cases, it is known that the modulation of methylation is targeted by sequence-specific transcription factors. In other words, inaccurate DNMT1 activity is only one of the many potential ways that can lead to methylation variants, which fundamentally weakens the hypothesis that methylation variants can serve as a reliable lineage marker. With that being said (being skeptical of the fundamental hypothesis), I want to be as open-minded as possible and try to propose some specific analyses that might better convince me that the author is correct. However, I suspect that the concept of methylation-based lineage tracing cannot be validated without some kind of lineage tracing experiment, which has been successfully demonstrated for scRNA-seq profiling but not yet for methylation profiling (one example is Delgado et al., Nature, 2022).

      I thank Reviewer 2 for the careful evaluation. The validation experiment example (Delgado et al.) introduced sequence barcodes in mice, which is not generally feasible for human studies.

      (1) The manuscript reported that fCpG sites are predominantly intergenic. The author should also score the overlap between fCpG sites and putative regulatory elements and report p-values. If fCpG sites commonly overlap with regulatory elements, that would increase the possibility that these sites are being actively regulated by enhancer mechanisms other than maintenance methyltransferase activity.

      As mentioned for Reviewer 1, fCpGs are filtered to eliminate cell type specific methylation.

      (2) The overlap between fCpG and regulatory sequence is a major alternative explanation for many of the observations regarding the effectiveness of using fCpG sites to classify cell types correctly. One would expect the methylation level of thousands of enhancers to be quite effective in distinguishing cell types based on the published single-cell brain methylome works.

      As mentioned above, the manuscript did not clearly indicate that the fCpG barcode is not a cell type classifier. The distinctions between fCpG barcodes and cell type classifiers are better explained in the new Supplement.

      (3) The methylation level of fCpG sites is higher in hindbrain structures and lower in forebrain regions. This observation was interpreted as the hindbrain being the "root" of the methylation barcodes and, through "progressive demethylation" produced the methylation states in the forebrain. This interpretation does not match what is known about methylation dynamics in mammalian brains, in particular, there is no data supporting the process of "progressive demethylation". In fact, it is known that with the activation of DNMT3A during early postnatal development in mice or humans (Lister et al., 2013. Science), there is a global gain of methylation in both CH and CG contexts. This is part of the broader issue I see in this manuscript, which is that the model might be correct if "inaccurate mC replication" is the only force that drives methylation dynamics. But in reality, active enzymatic processes such as the activation of DNMT3A have a global impact on the methylome, and it is unclear if any signature for "inaccurate mC replication" survives the de novo methylation wave caused by DNMT3A activity.

      Reviewer 2 highlights a critical potential flaw in that any ancestral signal recorded by random replication errors could be overwritten by other active methylation processes. I cannot present data that indicates fCpG replication errors are never overwritten, but new data indicate barcode reproducibility and stability with aging.

      New data are also present where barcodes are compared between daughter cells (zygote to ICM) in the setting of active and passive demethylation, when germline methylation is erased. This new analysis shows that daughter cells in 2 to 8 cell embryos have more related barcodes than morula or ICM cells. The subsequent active remethylation by a wave of DNMT3A activity may underlie the observation that the barcode appears to start predominately methylated in brain progenitors.

      (3) Perhaps one way the author could address comment 3 is to analyze methylome data across several developmental stages in the same brain region, to first establish that the signal of "inaccurate mC replication" is robust and does not get erased during early postnatal development when DNMT3A deposits a large amount of de novo methylation.

      See above

      (4) The hypothesis that methylation barcodes are homogeneous among progenitor cells and more polymorphic in derived cells is an interesting one. However, in this study, the observation was likely an artifact caused by the more granular cell types in the brain stem, intermediate granularity in inhibitory cells, and highly continuous cell types in cortical excitatory cells. So, in other words, single-cell studies typically classify hindbrain cell types that are more homogenous, and cortical excitatory cells that are much more heterogeneous. The difference in cell type granularity across brain structures is documented in several whole-brain atlas papers such as Yao et al. 2023 Nature part of the BICCN paper package.

      As noted above, fCpG barcode polymorphisms and cell type differentiation are confounded because cells of the same phenotype tend to have common progenitors. The fCpG barcode is not a cell type classifier but more a cell division clock that becomes polymorphic with time. Although fCpG barcodes could be more polymorphic in cortical excitatory cells because there are many more types, fCpG barcodes would inherently become more polymorphic in excitatory cells because they appear later in development.

      (5) As discussed in comment 2, the author needs to assess whether the successful classification of cell types (brain lineage) using fCpG was, in fact, driven by fCpG sites overlapping with cell-type specific regulatory elements.

      Although unclear in the manuscript, the fCpG is not a cell classifier and the barcode is polymorphic between cells of the same type. fCpG barcodes can appear to be cell classifiers because cell types appear at different times during development, and therefore different cell types have characteristic average barcode methylation levels.

      (6) In Figure 5E, the author tried to address the question of whether methylation barcodes inform lineage or post-mitotic methylation remodeling. The Y-axis corresponds to distances in tSNE. However, tSNE involves non-linear scaling, and the distances cannot be interpreted as biological distances. PCA distances or other types of distances computed from high-dimensional data would be more appropriate.

      The Figure and discussion are deleted (similar comment by Reviewer 1).

      Reviewer #3 (Public review):

      Summary:

      In the manuscript entitled "Human Brain Barcodes", the author sought to use single-cell CpG methylation information to trace cell lineages in the human brain.

      Strengths:

      Tracing cell lineages in the human brain is important but technically challenging. Lineage tracing with single-cell CpG methylation would be interesting if convincing evidence exists.

      Weaknesses:

      As the author noted, "DNA methylation patterns are usually copied between cell division, but the replication errors are much higher compared to base replication". This unstable nature of CpG methylation would introduce significant problems in inferring the true cell lineage. The unreliable CpG methylation status also raises the question of what the "Barcodes" refer to in the title and across this study. Barcodes should be stable in principle and not dynamic across cell generations, as defined in Reference#1. It is not convincing that the "dynamic" CpG methylation fits the "barcodes" terminology. This problem is even more concerning in the last section of results, where CpG would fluctuate in post-mitotic cells.

      I thank Reviewer 3 for his thoughtful and careful evaluation. I think the “barcode” terminology is appropriate. Dynamic engineered barcodes such as CRISPR/Cas9 mutable barcodes are used in biology to record changes over time. The fCpG barcode appears to start with a single state in a progenitor cell and changes with cell division to become polymorphic in adult cells. Therefore, I think the description of a dynamic fCpG barcode is appropriate.

      Reviewer #3 (Recommendations for the authors):

      (1) As the author noted, "DNA methylation patterns are usually copied between cell division, but the replication errors are much higher compared to base replication". This unstable nature of CpG methylation would introduce significant problems in inferring the true cell lineage. To establish DNA methylation as a means for lineage tracing, one control experiment would be testing whether the DNA methylation patterns can faithfully track cell lineages for in vitro differentiated & visibly tracked cell lineages. Has this kind of experiment been done in the field?

      These types of experiments have not been performed to my knowledge and an appropriate tissue culture model is uncertain. New single cell WGBS data from the zygote to ICM indicate that more immediate daughter cells have more related barcodes even in the setting of active DNA demethylation.

      (2) The study includes assumptions that should be backed with solid rationale, supporting evidence, or reference. Here are a couple of examples:

      a) the author discarded stable CpG sites with <0.2 or >0.8 average methylation without a clear rationale in H02, and then used <0.3 and >0.7 for a specific sample H01.

      The filtering was ad hoc and was used to remove, as much as possible, CpG sites with cell type specific or patient specific methylation. CpG sites with skewed methylation are more likely cell type specific, whereas X chromosome CpG sites with methylation closer to 0.5 in male cells are more likely to be unstable. The ad hoc filtering attempted to remove cell specific CpG sites while still retaining enough CpG sites to allow comparisons between cells.
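      The thresholding step described in this exchange can be illustrated with a short sketch. It is a simplified stand-in, not the author's pipeline: the function names, the site identifiers, and the toy calls are hypothetical, and the real filtering is also applied within each brain and within the two major neuron subtypes. Only the cut-offs (0.2 to 0.8 for H02, 0.3 to 0.7 for H01) come from the text.

```python
def mean_methylation(calls):
    """Average of binary methylation calls, ignoring cells with no read (None)."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return None
    return sum(observed) / len(observed)


def select_candidate_sites(site_calls, low, high):
    """Keep sites whose mean methylation lies within [low, high]; sites with
    skewed (mostly methylated or mostly unmethylated) calls are discarded."""
    kept = []
    for site, calls in site_calls.items():
        avg = mean_methylation(calls)
        if avg is not None and low <= avg <= high:
            kept.append(site)
    return kept


# Toy data: one list of per-cell calls per CpG site (hypothetical site names).
toy_calls = {
    "site_a": [1, 1, 1, 1, None],  # mean 1.0, skewed
    "site_b": [1, 0, 1, 0, 0],     # mean 0.4, polymorphic
    "site_c": [0, 0, 0, None, 0],  # mean 0.0, skewed
    "site_d": [1, 1, 1, 0, None],  # mean 0.75, borderline
}
h02_like = select_candidate_sites(toy_calls, 0.2, 0.8)
h01_like = select_candidate_sites(toy_calls, 0.3, 0.7)
```

      The borderline site survives the wider window but not the narrower one, which shows the trade-off described above between removing skewed sites and retaining enough sites for comparisons.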

      b) The author assumed that the early-formed brain stem would resemble progenitors better and have a higher average methylation level than the forebrain. However, this difference in DNA methylation status could reflect developmental timing or cell type-specific gene expression changes.

      This observation that brain stem neurons that appear early in development have highly methylated fCpG barcodes in all 3 brains supports the idea that the fCpG barcode starts predominately methylated. Alternative explanations are possible.

      (3) The conclusion that excitatory neurons undergo tangential migration is unclear - how far away did the author mean for the tangential direction? Lateral dispersion is known, but it would be striking that the excitatory neurons travel across different brain regions. The question is, how would the author interpret shared or divergent methylation for the same cell type across different brain regions?

      As noted with Reviewer 1, this analysis is modified to indicate that evidence of tangential migration is greater for inhibitory than excitatory neurons, but the extent of excitatory neuron migration is uncertain because of sparse sampling, and because fCpG barcodes can be similar by chance.

      (4) The sparsity and resolution of the single-cell DNA methylation data. The methylation status is detected in only a small fraction (~500/31,000 = 1.6%) of fCpGs per cell, with only 48 common sites identified between cell pairs. Given that the human genome contains over 28 million CpG sites, it is important to evaluate whether these fCpGs are truly representative. How many of these sites were considered "barcodes"?

      fCpG barcodes are distinct from traditional cell type classifiers, and how fCpGs are identified is better outlined in the new Supplement.

      (5) While focusing on the X-chromosome may simplify the identification of polymorphic fCpGs, the confidence in determining its methylation status (0 or 1) is questionable when a CpG site is covered by only one read. Did the author consider the read number of detected fCpGs in each cell when calculating methylation levels? Certain CpG sites on autosomes may also have sufficient coverage and high variability across cells, meeting the selection criteria applied to X-chromosome CpGs.

      In most cases, an fCpG site was covered by only a single read.

      (6) The overall writing in the Title, the Main text, Figure legends, and Methods sections are overly simplified, making it difficult to follow. For instance, how did the author perform PWD analysis? How did they handle missing values when constructing lineage trees?

      There is not much introduction to lineage tracing in the human brain or the use of DNA methylation to trace cell lineage.

      These shortcomings are addressed in the manuscript and in the new Supplement. The analysis pipeline, including the Python programs, is outlined and included as new Supplemental materials. IQ-TREE can handle the binary fCpG barcode data and skips missing values with its standard settings.
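      As a rough illustration of how a pairwise distance (PWD) between two sparse binary barcodes can be computed while skipping missing values, consider the sketch below. It assumes that the distance is the fraction of mismatching calls among the sites covered in both cells; the authors' exact definition is given in their Supplement and may differ, and this is not the algorithm used by IQ-TREE. The function name, the `min_shared` parameter, and the toy barcodes are hypothetical.

```python
def pairwise_distance(a, b, min_shared=1):
    """Fraction of mismatching calls among sites covered in both barcodes.

    a, b: equal-length lists of 0/1 calls, with None for uncovered sites.
    Returns None when fewer than min_shared sites are covered in both cells.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(shared) < min_shared:
        return None
    mismatches = sum(1 for x, y in shared if x != y)
    return mismatches / len(shared)


# Toy barcodes (made-up values): only sites 0, 1 and 4 are covered in both cells.
cell_a = [1, 0, None, 1, 1]
cell_b = [1, 1, 0, None, 0]
d = pairwise_distance(cell_a, cell_b)
```

      Requiring a minimum number of shared sites is one simple way to avoid distances that rest on too few comparisons, which matters when only a small fraction of fCpGs is covered in each cell.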

      Line 80: it is unclear: "Brain patterns were similar"

      Clarified

      Line 98: The meaning is unclear here: "Outer excitatory and glial progenitor cells are present" What are these glial progenitor cells and when/how they stop dividing?

      The glial cells are the oligodendrocytes and astrocytes. The main takeaway is that these glial cells have low barcode methylation, consistent with their appearance later in development.

      Line 104: It is unclear if this is a conclusion or assumption -- "A progenitor cell barcode should become increasingly polymorphic with subsequent divisions." The "polymorphic" happens within the progenitors, their progenies, or their progenies at different time points.

      The statement is now clarified as an assumption in the manuscript.

      Similarly line 134 "Barcodes would record neuronal differentiation and migration." Is this a conclusion from this study or a citation? How is the migration part supported?

      The reasoning is better explained in the manuscript. Migration can be documented if immediate daughter cells with similar barcodes are found in different parts of the adult brain, although the analysis is confounded by sparse sampling and by the possibility that barcodes are similar by chance.

      Line 148 and 150: "Nearest neighbor ... neuron pairs" in DNA methylation status would conceivably reflect their cell type-specific gene expression, how did the author distinguish this from cell lineage?

      As noted above, because cells with similar phenotypes usually arise from common progenitors, cells within a clade are also usually related. However, the barcodes are still polymorphic within a clade and potentially add complementary information on mitotic ages, ancestry within a clade, and possible cell migration.

      Figure 3C: "Cells that emerge early in development" Where are they on the figure?

      Hindbrain neurons differentiate early in development and their barcodes are more methylated. The figure has been modified to label some of the values with their neuron types. Also, the older figure mistakenly included data from all 3 brains and now the data are only from brain H01.

      Figures 4D and 4E, distinguishing cell subtypes is challenging, as the same color palette is used for both excitatory and inhibitory neurons.

      This is an unfortunate limitation arising from the complexity of the data and the limited number of distinguishable colors.

      Figures 4 and 5, what are these abbreviations?

      The abbreviations are presented in Figure 1 and maintained in subsequent figures.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This provocative manuscript presents valuable comparisons of the morphologies of Archaean bacterial microfossils to those of microbes transformed under environmental conditions that mimic those present on Earth during the same Eon, although the evidence in support of the conclusions is currently incomplete. The reasons include that taphonomy is not presently considered, and a greater diversity of experimental environmental conditions is not evaluated -- which is important because we ultimately do not know much about Earth's early environments. The authors may want to reframe their conclusions to reflect this work as a first step towards an interpretation of some microfossils as 'proto-cells,' and less so as providing strong support for this hypothesis.

      Regarding the taphonomic alterations: The editor and reviewers are correct in pointing out this issue. Taphonomic alteration of the microfossils attains special significance in the case of microorganisms, as they lack rigid structures and are prone to morphological alterations during or after their fossilization. We are acutely aware of this issue and have conducted long-term experiments (lasting two years) to observe how cells die, decay, and get preserved. A large section of the manuscript (pages 11 to 20) and a substantial portion of the supplementary information is dedicated to understanding the taphonomic alterations. To the best of our knowledge, these are among the longest experiments done to understand the taphonomic alterations of the cells within laboratory conditions. 

      Recent reports by Orange et al. (1,2) showed that under favorable environmental conditions, cells could be fossilized rather rapidly with little morphological modification. We observed a similar phenomenon in this work. Cells in our study underwent rapid encrustation with cations from the growth media. We analyzed the morphological changes over a period of 18 months. After 18 months, the softer biofilms had become entirely encrusted in salt and turned solid (Fig. ). Despite this transformation, morphologically intact cells could still be observed within these structures. This suggests that the cells inhabiting Archaean coastal marine environments could undergo rather rapid encrustation, and their morphological features could be preserved in the geological record with little taphonomic alteration.

      Regarding the environmental conditions: We are in total agreement with the reviewers that much is unknown about Archaean geology and its environmental conditions. Like the present-day Earth, Archaean Earth certainly had regions that greatly differed in their environmental conditions—volcanic freshwater ponds, brines, mildly halophilic coastal marine environments, and geothermal and hydrothermal vents, to name a few. Our experimental design focuses on one environment we have a relatively good understanding of rather than the rest of the planet, of which we know little. Below, we list our reasons for restricting to coastal marine environments and studying cells under mildly halophilic experimental conditions.  

      (1) Very little continental crust from the Hadean and early Archaean Eons exists on the present-day Earth. Much of our geochemical understanding of this time period is a result of studying the Pilbara Iron Formations and the Barberton Greenstone Belt. Geological investigations suggest that these sites were coastal marine environments. The salinity of coastal marine environments is higher than that of open oceans due to the greater water evaporation within these environments. Moreover, brines were discovered within pillow basalts within the Barberton Greenstone Belt, suggesting that the salinity within these sites was higher than or similar to that of marine environments.

      (2) We are not certain about the environmental conditions that could have supported the origin of life. However, all currently known Archaean microfossils were reported from coastal marine environments (3.8-2.4Ga). This suggests that proto-life likely flourished in mildly halophilic environments, similar to the experimental conditions employed in our study. 

      (3) The chemical analysis of Archaean microfossils also suggests that they lived in salt-rich environments, as most, if not all, microfossils are closely associated with, and often encrusted in, a thin layer of salt.

      However, we concur with the reviewers that our interpretations should be reassessed if Archaean microfossils that greatly differ from the currently known microfossils are to be discovered or if new microfossils are to be reported from environments other than coastal marine sites.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Microfossils from the Paleoarchean Eon represent the oldest evidence of life, but their nature has been strongly debated among scientists. To resolve this, the authors reconstructed the lifecycles of Archaean organisms by transforming a Gram-positive bacterium into a primitive lipid vesicle-like state and simulating early Earth conditions. They successfully replicated all morphologies and life cycles of Archaean microfossils and studied cell degradation processes over several years, finding that encrustation with minerals like salt preserved these cells as fossilized organic carbon. Their findings suggest that microfossils from 3.8 to 2.5 billion years ago were likely liposome-like protocells with energy conservation pathways but without regulated morphology. 

      Strengths: 

      The authors have crafted a compelling narrative about the morphological similarities between microfossils from various sites and proliferating wall-deficient bacterial cells, providing detailed comparisons that have never been demonstrated in this detail before. The extensive number of supporting figures is impressive, highlighting numerous similarities. While conclusively proving that these microfossils are proliferating protocells morphologically akin to those studied here is challenging, we applaud this effort as the first detailed comparison between microfossils and morphologically primitive cells. 

      Weaknesses: 

      Although the species used in this study closely resembles the fossils morphologically, it would be beneficial to provide a clearer explanation for its selection. The literature indicates that many bacteria, if not all, can be rendered cell wall-deficient, making the rationale for choosing this specific species somewhat unclear. While this manuscript includes clear morphological comparisons, we believe the authors do not adequately address the limitations of using modern bacterial species in their study. All contemporary bacteria have undergone extensive evolutionary changes, developing complex and intertwined genetic pathways unlike those of early life forms. Consequently, comparing existing bacteria with fossilized life forms is largely hypothetical, a point that should be more thoroughly emphasized in the discussion. 

      Another weak aspect of the study is the absence of any quantitative data. While we understand that obtaining such data for microfossils may be challenging, it would be helpful to present the frequencies of different proliferative events observed in the bacterium used. Additionally, reflecting on the chemical factors in early life that might cause these distinct proliferation modes would provide valuable context. 

      Regarding our choice of using modern organisms or this particular bacterial species: 

      Based on current scientific knowledge, it is logical to infer that cellular life originated as protocells; nevertheless, there has been no direct geological evidence for the existence of such cells on early Earth. Hence, protocells remain an entirely theoretical concept. Moreover, protocells are considered to have been far more primitive than present-day cells. Surprisingly, this lack of sophistication was the biggest challenge in understanding protocells. Designing experiments in which cells are primitive (but not as primitive as non-living lipid vesicles) and still retain a functional resemblance to a living cell does pose some practical challenges. Laboratory experiments with substitute (proxy) protocells almost always come with some limitations. Although not a perfect proxy, we believe protocells and protoplasts share certain characteristics. Having said that, we would like to reemphasize that protoplasts are not protocells. Our reasons for using protoplasts as model organisms and working with this bacterial species (Exiguobacterium Strain-Molly) are based on several scientific and practical criteria listed below.

      (1) Irrespective of cell physiology and intracellular complexity, we believe that protoplasts and protocells share certain similarities in the biophysical properties of their cytoplasm. We explained our reasoning in the manuscript introduction and in our previous manuscripts (Kanaparthi et al., 2024 & Kanaparthi et al., 2023). In short, to be classified as a cell, even a protocell should possess minimal biosynthetic pathways, a physiological mechanism of harvesting free energy from the surroundings (energy-yielding pathways), and a means of replicating its genetic material and transferring it to the daughter cells. These minimal physiological processes could incorporate considerable cytoplasmic complexity. Hence, the biophysical properties of the protocell cytoplasm could have resembled those of the cytoplasm of protoplasts, irrespective of the genomic complexity.

      (2) Irrespective of their physiology, protoplasts exhibit several key similarities to protocells, such as their inherent inability to regulate their morphology or reproduction. This similarity was pointed out in previous studies (3). Despite possessing all the necessary genetic information, protoplasts undergo reproduction through simple physicochemical processes independent of canonical molecular biological processes. This method of reproduction is considered to have been erratic and rather primitive, akin to the theoretical propositions on protocells. Although protoplasts are fully evolved cells with considerable physiological complexity, the above-mentioned biophysical similarities suggest that the protoplast life cycle could morphologically resemble that of protocells (in no other aspect except for their morphology and reproduction).

      (3) Physiologically or genomically different species of Gram-positive protoplasts are shown to exhibit similar morphologies. This suggests that when Gram-positive bacteria lose their cell wall and turn into a protoplast,  they reproduce in a similar manner independent of physiological or genome-based differences. As morphology and only morphology is key to our study, at least from the scope of this study, intracellular complexity is not a key consideration. 

      (4) This specific strain was isolated from submerged freshwater springs in the Dead Sea. This isolate and members of this bacterial genus are known to have been well acclimatized to growing in a wide range of salt concentrations and in different salt species. This is important for our study (this and previous manuscript), in which cells must be grown not only at high salt concentrations (1-15%) but in different salts like NaCl, MgCl<sub>2</sub>, and KCl. 

      (5) Our initial interest in this isolate was due to its ability to reduce iron at high salt concentrations. Given that most spherical microfossils are found in Archaean-banded iron formations covered in pyrite, this suggests that these microfossils could have been reducing oxidized iron species like Fe(III). Nevertheless, over the course of our study, we realized the complexities of live cell staining and imaging under anoxic conditions. Given that the scope of the manuscript is restricted only to comparing the morphologies, not the physiology, we abandoned the idea of growing cells under anoxic conditions.  

      Based on these observations, cell physiology may not be a key consideration, at least within the scope of studying microfossil morphology. However, we want to emphasize again that “We do not claim present-day protoplasts are protocells.”  

      Regarding the absence of quantitative data:

      We are unsure what the reviewer meant by the absence of quantitative data. Is it from the cell size/reproductive pathways perspective or from a microfossil/ecological perspective? At the risk of being portrayed in a bad light, we admit that we did not present quantitative data from either of these perspectives. In our defense, this was not due to our lack of effort but due to the practical limitations imposed by our model organism. 

      If the reviewer means the quantitative data regarding cell sizes and morphology: In our previous work, we studied the relationship between protoplast morphology, growth rate, and environmental conditions. In that study, we proposed that the growth rate is one factor that regulates protoplast morphology. Nevertheless, we did not observe uniformity in the sizes of the cells. This lack of uniformity was not just between the replicates but even among the cells grown within the same culture flask or the cells within the same microscopic field. Moreover, cells are often observed to be reproducing by forming internal daughter cells, external daughter cells, or both at the same time. The size and morphological differences among cells within a growth stage could be explained by the physiological and growth rate heterogeneity among cells.

      Bacterial growth curves and their partition into different stages (lag, log & stationary), in general, represent the growth dynamics of an entire bacterial population. Nevertheless, averaging the data obscures the behavior of individual cells (4,5). It is known that genetically identical cells within a single bacterial population could exhibit considerable cell-to-cell variation in gene expression (6,7) and growth rates (8). The reason for such stochastic behavior among monoclonal cells has not been well understood. In the case of normal cells, morphological manifestation of these variations is restricted by a rigid cell wall. Given the absence of a cell wall in protoplasts, we assume such cell-to-cell variations in growth rate are manifested in cell morphology. This makes it challenging to quantitatively determine variations in cell sizes or the size increase in a statistically robust manner, even in monoclonal cells.

      This lack of uniformity in cell sizes should not be perceived as a limitation, as the same behavior is consistently observed among microfossils. Spherical microfossils of similar morphology but different sizes were reported from different microfossil sites (9,10). In this regard, both protoplasts and microfossils are very similar.

      If the reviewer means the quantitative data from an ecological perspective: 

      Based on the elemental composition and the isotopic signatures of the organic carbon, we can deduce if these structures are of biological origin or not. However, any further interpretation of this data to annotate these microfossils to a particular physiology group is fraught with errors. Hence, we refrain from making any inferences about the physiology and ecological function of these microfossils. This lack of clarity on the physiology of microfossils reduces the chance of quantitative studies on their ecological functions. Moreover, we would like to re-emphasize that the scope of this work is restricted to morphological comparison and is not targeted at understanding the ecological function of these microfossils. This narrow objective also limits the nature of the quantitative data we could present.

      Moreover, developing a quantitative understanding of some phenomena could be technically challenging. Many theories on the origin of life, like chemical evolution, started with the qualitative observation that lightning could mediate the synthesis of biologically relevant organic carbon. Our quantitative understanding of this process is still being explored and debated even to this day.     

      Reviewer #2 (Public Review): 

      Summary: 

      In summary, the manuscript describes life-cycle-related morphologies of primitive vesicle-like states (Em-P) produced in the laboratory from the Gram-positive bacterium (Exiguobacterium Strain-Molly) under assumed Archean environmental conditions. Em-P morphologies (life cycles) are controlled by the "native environment". In order to mimic Archean environmental conditions, soy broth supplemented with Dead Sea salt was used to cultivate Em-Ps. The manuscript compares Archean microfossils and biofilms from selected photos with those laboratory morphologies. The photos derive from publications on various stratigraphic sections of Paleo- to Neoarchean ages. Based on the similarity of morphologies of microfossils and Em-Ps, the manuscript concludes that all Archean microfossils are in fact not prokaryotes, but merely "sacks of cytoplasm".

      Strengths: 

      The approach of the authors to recognize the possibility that "real" cells were not around in the Archean time is appealing. The manuscript reflects the very hard work by the authors composing the Em-Ps used for comparison and selecting the appropriate photo material of fossils. 

      Weaknesses: 

      While the basic idea is very interesting, the manuscript includes flaws and falls short in presenting supportive data. The manuscript makes too simplistic assumptions on the "Archean paleoenvironment". First, like in our modern world, the environmental conditions during the Archean time were not globally the same. Second, we do not know much about the Archean paleoenvironment due to the immense lack of rock records. More so, the Archean stratigraphic sections from where the fossil material derived record different paleoenvironments: shelf to tidal flat and lacustrine settings, so differences must have been significant. Finally, the Archean spanned 2.500 billion years and it is unlikely that environmental conditions remained the same. Diurnal or seasonal variations are not considered. Sediment types are not considered. Due to these reasons, the laboratory model of an Archean paleoenvironment and the life therein is too simplistic. Another aspect is that eucaryote cells are described from Archean rocks, so it seems unlikely that prokaryotes were not around at the same time. Considering other fossil evidence preserved in Archean rocks except for microfossils, the many early Archean microbialites that show baffling and trapping cannot be explained without the presence of "real cells". With respect to lithology: chert is a rock predominantly composed of silica, not salt. The formation of Em-Ps in the "salty" laboratory set-up seems therefore not a good fit to evaluate chert fossils. Formation of structures in sediment is one step. The second step is their preservation. However, the second aspect of taphonomy is largely excluded in the manuscript, and the role of fossilization (lithification) of Em-Ps is not discussed. This is important because Archean rock successions are known for their tectonic and hydrothermal overprint, as well as recrystallization over time. 
      Some of the comparisons of laboratory morphologies with fossil microfossils and biofilms are incorrect because scales differ by magnitudes. In general, one has to recognize that prokaryote cell morphologies do not offer many variations. It is possible to arrive at the morphologies described in various ways including abiotic ones.

      Regarding the simplistic presumptions on the Archaean Eon environmental conditions, we provided a detailed explanation of this issue in our response to the eLife evaluation. In short, we agree with the reviewer that little is known about the Archaean Eon environmental conditions at a planetary scale. Hence, we restricted our study to one particular environment of which we had a comparatively good understanding. The Archaean Eon spanned 2.5 billion years. However, most of the microfossil sites we discussed in the manuscript are older than 3 billion years, with one exception (the 2.4-billion-year-old Turee Creek microfossils). We presume that conditions within this niche (coastal marine) environment could not have changed greatly until 2 Ga, after which there were major changes in the ocean salt composition and salinities.

      In the manuscript, we discussed extensively the reasons for restricting our study to these particular environmental conditions. Further explanations of these choices are presented in our response to the eLife evaluation (also see our previous manuscript). In short, the fact that all known microfossils are restricted to coastal marine environments justifies the experimental conditions employed in our study. Nevertheless, we agree with the reviewer that all lab-based studies involve some extent of simplification. This gap/mismatch is even wider when it comes to studies involving origin or early life on Earth.

      We are not arguing that prokaryotes were not around at this time. The key message of the manuscript is that they were present, but they had not developed intracellular mechanisms to regulate their morphology and remained primitive in this aspect.

      The sizes of the microfossils and cells from our study were similar in most cases. However, we agree with the reviewer that they deviated considerably in some cases, for example, S70, S73, and S83. These size variations are limited to sedimentary structures like laminations rather than cells. These differences should be expected as we try to replicate, in a 2 ml chamber slide, the real-life morphologies of biofilms that could have extended over large swaths of natural environments. More specifically, in Fig. S70, there is a considerable size mismatch. But, in Fig. S73, the sizes were comparable between A & C (of course, the size of our reproduction did not match B). In the case of Fig. S83, we do not see a huge size mismatch.

      Reviewer #1 (Recommendations For The Authors): 

      We would like to provide several suggestions for changes in text and additions to data analysis. 

      39-41: It has been stated that reconstructing the lifecycle is the only way of understanding the nature of these microfossils. First of all, I would rephrase this to 'the most promising way', as there are always multiple approaches to comparing phenomena. 

      We agree with the reviewer's suggestion. The suggested changes have been made (line 41). 

      125: Please rephrase "under the environmental condition of early Earth" to "under experimental conditions possibly resembling the conditions of the Paleoarchean Eon". Now it sounds like the exact environmental conditions have been produced, which has already been debated in the discussion. 

      We agree with the reviewer's suggestion. The suggested changes have been made (line 127). 

      125: Please mention the fold change in size, the original size in numbers, and whether this change is statistically significant. 

      In the above sections of this document, we explained our reservations about presenting the exact number.

      128: Have you found a difference in the relative percentages of modes of reproduction? In other words, is there a difference in percentage between forming internal daughter cells or a string of external daughter cells? 

      We explained our reservations about presenting the exact number above. But this has been extensively discussed in our accompanying manuscript. We want to reemphasize that the scope of this manuscript is restricted to comparing morphologies rather than providing a mechanistic explanation of the reproduction process.

      151: A similar model for endocytosis has already been described in proliferating wall-less cells (Kapteijn et al., 2023). In the discussion, please compare your results with the observations made in that paper. 

      This is an oversight on our part. The manuscript suggested by the reviewer has now been added (line 154 & 155).  

      163: Please use another word for uncanny. We suggest using 'strong resemblance'. 

      We changed this according to the reviewers' suggestion (line 168). 

      433: Please elaborate on why the results are not shown. This sounds like a statement that should be substantiated further. 

      To observe growth and simultaneously image the cells, we conducted these experiments in chamber slides (2 ml volume). Over time, we observed cells growing and breaking out of the salt crust (Fig. S86, S87 & Movie 22) and a gradual increase in the turbidity of the media. Although not quantitative, this is a qualitative indication of growth. We did not take precise measurements for several reasons. This sample is precious; it took us almost two years to solidify the biofilm completely, as shown in Fig. S84A. Hence, it was in limited supply, which prevented us from inoculating these salt crusts into large volumes of fresh media. After a long period of starvation, these cells often exhibited a long lag phase (several days), and there was not enough volume to do OD measurements over time.

      We also crushed the solidified biofilm with a sterile spatula before transferring it into the chamber slide with growth media. This resulted in debris in the form of small solid particles, which interfered with our OD measurements. These practical considerations made it challenging to determine the growth precisely. Despite these challenges, we measured an OD of 4 in some chamber slides after two weeks of incubation. Given that these measurements were done haphazardly, we chose not to present this data. 

      456: Could you please double-check whether the description is correct for the figure? 8C and 8D are part of Figure 8B, but this is stated otherwise in the description. 

      We thank the reviewer for pointing it out. It has now been rectified (line 461-472).

      Reviewer #2 (Recommendations For The Authors): 

      We thank Reviewer #2 for carefully reading the manuscript and for such an elaborate list of questions. The revisions suggested have definitely improved the quality of the manuscript. Here, we would like to address some of the questions that come up repeatedly below. One frequently asked question concerns the letters denoting the individual panels within the figures. For comparison purposes, we often reproduced previously published images. To maintain a consistent figure style, we often had to cover the previous labels with an opaque square and assign a new letter.

      The second question that appeared repeatedly below is the missing scale bars in some of the images within a figure. We often did not include a scale bar in the images when this image is an enlarged section of another image within the same figure.     

      Title: Please consider being more precise in the title. Microfossils are only one fossil group of "oldest life". Perhaps better: "On the nature of some microfossils in Archean rocks". (see also Line 37).  

      Authors’ response: The title conveys a broader message without quantitative insinuations. If our manuscript had been titled "On the nature of all known Archaean microfossils,” we should have agreed with the reviewer's suggestion and changed it to "On the nature of some microfossils in Archean rocks". As it is not, we respectfully decline to make this modification.     

      Abstract:  

      Line 41: "one way", not "the only way" 

      We agree with the reviewer’s comment, and necessary changes have been made (line 41).  

      Introduction: 

      Line 58f: "oldest sedimentary rock successions", not "oldest known rock formations". There are rocks of much older ages, but those are not well preserved due to metamorphic overprint, or the rocks are igneous to begin with. Minor issue: please note that "formations" are used as stratigraphic units, not so much to describe a rock succession in the field. 

      We agree with the reviewer’s comment and have made necessary changes (line 58).

      Line 67: Microfossils are widely accepted as evidence of life. Please rephrase. 

      We agree with the reviewer’s comment, and necessary changes have been made.

      Line 71 - 74: perhaps add a sentence of information here.

      We agree with the reviewer’s comment, and necessary changes have been made (line 71).

      Line 76: which "chemical and mineralogical considerations"? 

      This has been rephrased to “Apart from the chemical and δ<sup>13</sup>C-biomass composition” (line 76).

      Line 84ff: This is a somewhat sweeping statement. Please remember that there are microbialites in such rocks that require already a high level of biofilm organization. The existence of cyanobacteria-type microbes in the Archean is also increasingly considered. 

      We are aware of literature that labeled the clusters of Archaean microfossils as biofilms and layered structures as microbialites or stromatolite-like structures. However, the use of these terms is increasingly being discouraged. A more recent consensus among researchers suggests annotating these structures simply as sedimentary structures, or as microbially induced sedimentary structures (MISS).

      We respectfully disagree with the reviewer’s comment that Archaean microfossils exhibit a high level of biofilm organization. We are not aware of any studies that have conducted such comprehensive research on the architecture of Archaean biofilms. We are not even certain whether these clusters of Archaean cells could be labeled as biofilms in the true sense of the term. We presently lack an exact definition of a biofilm. In our study, we do see sedimentation of bacteria and their encapsulation in cell debris. From a broader perspective, any such aggregation of cells enclosed in cell debris could be annotated as a biofilm. However, more in-depth studies show that a biofilm is not a random but a highly organized structure. Different bacterial species have different biofilm architectures and chemical compositions. The multispecies biofilms in natural environments are even more complex. We do agree with the reviewer that these structures could broadly be labeled as biofilms, but we presently lack a good, if any, understanding of the Archaean biofilm architecture.

      Regarding the annotation of microfossils as cyanobacteria, we respectfully disagree with the reviewer. This is not a new concept. Many of the Archaean microfossils were annotated as cyanobacteria at the time of their discovery. This annotation is not without controversy. With the advent of genome-based studies, researchers are increasingly moving away from this school of thought.  

      Line 101ff: The conditions on early Earth are unknown - there are many varying opinions. Perhaps simply state that this laboratory model simulates an Archean Earth environment of these conditions outlined. 

      This is a good idea. We thank the reviewer for this suggestion, and we made appropriate changes. 

      Line 112: manuscript to be replaced by "paper"? 

      This change has been made (line 114).

      Line 116: "spanned years" - how many years? 

      We now added the number of years in the brackets (line 118).

      Results: 

      Line 125: see comment for 101ff. 

      We made appropriate changes.

      Figure 1: Caption: Please write out ICV the first time this abbreviation is used. Images: Note that some lettering appears to not fit their white labels underneath. (G, H, I, J0, and M). 

      We apologize; this is an oversight on our part. We now spell out ICV in full the first time the abbreviation is used.

      We took these images from previously published work (references in the figure legend), so we must block out the previous figure captions. This is necessary to maintain a uniform style throughout the manuscript. 

      Line 152ff.: here would be a great opportunity to show in a graph the size variations of modern ICVs and to compare the variations with those in the fossil material. 

      In the above sections of this document, we explained our reservations about presenting the exact number.

      Line 159f.: Fig.1K - what is to see here? Maybe a close-up or - better - a small sketch would help? 

      Fig. 1K shows the surface depressions formed during the vesicle formation. The surface characteristics of EM-P and microfossils are very similar.

      Line 161f.: reference?  

      The paragraph spanning lines 159 to 172 discusses the morphological similarities between EM-P and SPF microfossils. We rechecked reference no. 35 (Delarue 2019). This is the correct reference. If the reviewer meant the references to the figures, we do not see a mistake there either.

      Line 164ff.: A question may be asked, how many fossils of the Strelley Pool population would look similar to the "modeled" ones. Questions may rise in which way the environmental conditions control such morphology variations. Perhaps more details? 

      This relationship between the environmental conditions and the morphology is discussed extensively in our previous work (11).  

      Line 193: what is meant by "similar discontinuous distribution of organic carbon"?

      This statement highlights similarities between EM-P and microfossils. The distribution of cytoplasm within the cells is not uniform. There are regions with cytoplasm and regions devoid of it, which is quite unusual for bacteria. Some previous studies argued that this could indicate that these organic structures are of abiotic origin. Here, we show that EM-P-like cells could exhibit such a patchy distribution of cytoplasm within the cell.

      Line 218 - 291: The observations are very nice, however, the figures of fossil material in Figures 3 A, B, and C appear not to conform. Perhaps use D, E and I to K. Also, S48 does not show features as described here (see below).  

      We did not completely understand the reviewer’s question. As mentioned in the figure legend, both the microfossils and the cells exhibit strings with spherical daughter cells within them. Moreover, there are also other similarities, like the presence of hollow spherical structures devoid of organic carbon. We also saw several mistakes in the Fig. S48 legend. We have rectified them, and we thank the reviewer for pointing them out.

      Line 293f: Title with "." at end?

      This change has been made.

      Line 298: predominantly in chert. In clastic material preservation of cells and pores is unlikely due to the common lack of in situ entombment by silica. 

      We rephrased this entire paragraph to better convey our message. Either way, we are not arguing that hollow pore spaces exist. As the reviewer mentioned, they will, of course, be filled up with silica. In this entire paragraph, we did not refer to hollow spaces. So, we are not entirely sure what the question was.     

      Line 324, 328-349: Please see below comments on the supplementary figures 51-62. Some of the interpretations of morphologies may be incorrect. 

      Please find our response to the reviewer’s comments on individual figures below.  

      Figure 5 A to D look interesting, however E to J appear to be unconvincing. What is the grey frame in D (not the white insert). 

      The grey color is just the background that was added during the 3D rendering process.  

      Figure 6 does not appear to be convincing. - Erase? 

      We did not understand the reviewer’s reservations regarding this figure. Images A-F within the figure show the gradual transformation of cells into honeycomb-like structures, and images G-J show such structures from the Archaean that are closely associated with microfossils. Moreover, we did not come up with this terminology (honeycomb-like). Previous manuscripts proposed it.  

      Line 379ff: S66 and 69, please see my comments below. Microfossils "were often discovered" in layers of organic carbon. 

      Please see our response below.   

      Line 393-403: Laminae? There are many ways to arrive at C-rich laminae, especially, if the material was compressed during burial. Basically, any type of biofilm would appear as laminae, if compressed. The appearance of thin layers is a mere coincidence. Note that the scale difference in S70, S73, as well as S83, is way too high (cm versus μm!) to allow any such sweeping conclusions. What are α- and β- laminations, the one described by Tice et al.? The arguments are not convincing.

      We propose that cells are compressed to form laminae. We addressed the question about the differences in scale above. Yes, we are referring to the α- and β-laminations described by Tice et al.

      Figure 7: This is an interesting figure, but what are the arguments for B and C, the fossil material, being a membrane? Debris cannot be distinguished with certainty at this scale in the insert of C. B could also be a shriveled-up set of trichomes.  

      We agree with the reviewer that debris cannot be definitely differentiated. Traditionally, annotations given to microfossil structures such as biofilm, intact cells, or laminations were all based on morphological similarities with existing structures observed in microorganisms. Given that the structures observed in our study are very similar to the microfossil structures, it is logical to make such inferences. Scales in A & B match perfectly well. The structure in C is much larger, but, as we mentioned in reply to one of the reviewer’s earlier questions, some of the structures from natural environments could not be reproduced at scale in lab experiments. Working in 2 ml chamber slides does impose some restrictions.

      Figure 8: The figure does not show any honeycomb patterns. The "gaps" in the Moodies laminae are known as lenticular particles in biofilms. They form by desiccated and shriveled-up biofilm that mineralizes in situ. Sometimes also entrapped gases induce precipitation. Note also that the modelled material shows a kind of skin around the blobs that are not present in the Moodies material.

      We agree that entrapped gas bubbles could have formed lenticular gaps. In the manuscript, we did not discount this possibility. However, if that is the case, one should explain why we often find clumps of organic carbon within these gaps. As we presented a step-by-step transformation of parallel layers of cells into laminations, which also had similar lenticular gaps, we believe this is a more plausible way such structures could have formed. In the end, there could have been more than one way such structures could have been formed. 

      We do see the honeycomb pattern in the hollow gaps. Often, the 3D-rendering of the STED images obscures some details. Hence, in the figure legend, we referred to the supplementary figures, which also show the sequence of steps involved in the formation of such a pattern.

      Line 405-417: During deposition of clastic sediment any hollow space would be compressed during burial and settling. It is rare that additional pore space (except between the grain-grain contacts) remains visible, especially after consolidation. The exception would be if very early silicification took place filling in any pore space. What about EPS being replaced by mineralic substance? The arguments are not convincing.

      We are suggesting that EPS or cell debris is rapidly encrusted by cations from the surrounding environment and gets solidified into rigid structures. This makes it possible for the structures to be preserved in the fossil record. We believe that hollow structures like the lenticular gaps will be filled up with silica. 

      We do not agree with the reviewer’s comment that all biological structures will be compressed. If this is true, there should be no intact microfossils in the Archaean sedimentary structures, which is definitely not the case.      

      Line 419-430: Lithification takes place within the sediment and therefore is commonly controlled by the chemistry of pore water and chemical compounds that derive from the dissolution of minerals close by. Another aspect to consider is whether "desiccation cracks" on that small scale may be artefacts related to sample preparation (?).  

      We agree that desiccation cracks could have formed during the sample preparation for SEM imaging, as this involves drying the biofilms. However, we observed that the sample we used for SEM is a completely solidified biofilm (Fig. S84), so we expect little change in its morphology during drying. Moreover, visible cracks and pointy edges were also observed in wet samples, as shown in Fig. S87.        

      Line 432 - 439: Please see comments on the supplementary material below.

      Please find our response to the reviewer’s comments on individual figures below.  

      Discussion:  

      Line 477f: "all known microfossil morphologies" - is this a correct statement? Also, would the Archean world provide only one kind of "EM-P type"? Morphologies of prokaryote cells (spherical, rod-shaped, filamentous) in general are very simple, and any researcher of Precambrian material will appreciate the difficulties in concluding on taxonomy. There are papers that investigate putative microfossils in chert as features related to life cycles. Microfossil-papers commonly appear not to be controversial give and take some specific cases.  

      We made a mistake in using the term “all known microfossil morphologies.” We have now changed it to “all known spherical microfossils” (line 483). However, we do not agree with the statement that microfossil manuscripts tend not to be controversial. Assigning taxonomy to microfossils is anything but uncontroversial. This has been intensely debated among the scientific community.

      Line 494-496: This statement should be in the Introduction.

      We agree with the reviewer’s comment. In an earlier version of the manuscript, this statement was in the introduction. To put this statement in its proper context, it needs to be associated with a discussion about the importance of morphology in the identification of microfossils. The present version of the manuscript does not permit moving an entire paragraph into the introduction. Hence, we think making this statement in the discussion section is appropriate.

      Line 484ff. The discussion on biogenicity of microfossils is long-standing (e.g., biogenicity criteria by Buick 1990 and other papers), and nothing new. In paleontology, modern prokaryotes may serve as models but everyone working on Archean microfossils will agree that these cannot correspond to modern groups. An example is fossil "cyanobacteria" that is thought to have been around already in the early Archean. While morphologically very similar to modern cyanobacteria, their genetic information certainly differed - how much will perhaps remain undisclosed by material of that high age.  

      Yes, we agree with the reviewer that there has been a longstanding conflict on the topic of biogenicity of microfossils. However, we have never come across manuscripts suggesting that modern microorganisms should only be used as models. If at all, there have been numerous manuscripts suggesting that these microfossils represent cyanobacteria, streptomycetes, and methanotrophs. Regarding the annotation of microfossils as cyanobacteria, we addressed this issue in one of the previous questions raised by the reviewer.    

      Line 498ff: Can the variation of morphology and sizes of the EM-Ps be demonstrated statistically? Line 505ff are very speculative statements. Relabeling of what could be vesicles as "microfossils" appears inappropriate. Contrary to what is stated in the manuscript, the morphologies of the Dresser Formation vesicles do not resemble the S3 to S14 spheroids from the Strelley Pool, the Waterfall, and Mt Goldsworthy sites listed in the manuscript. The spindle-shaped vesicles in Wacey et al are not addressed by this manuscript. What roles in mineral and element composition would have played diagenetic alteration and the extreme hydrothermal overprint and weathering typical for Dresser material? S59, S60 do not show what is stated, and the material derives from the Barberton Greenstone Belt, not the Pilbara.

      Please see the comments below regarding the supplementary images. 

      We did not observe huge variations in the cell morphology. Morphologies, in most cases, were restricted to spherical cells with intracellular vesicles or filamentous extensions. Regarding the sizes of the cells, we see some variations. However, we are reluctant to provide exact numbers. We have presented our reasons above.

      We respectfully disagree with the reviewer’s comments. We see considerable similarities between the Dresser Formation microfossils and our cells. Beyond the similarities, we have provided the step-by-step transformation of cells that resulted in these morphologies. We fail to see what exactly the speculation is here. The argument that they should be classified as abiotic structures is based on the opinion that cells do not form such structures. We clearly show here that they can, and these biological structures resemble the Dresser Formation microfossils more closely than the abiotic structures do.

      Regarding Figures S3-S14, we think they are morphologically very similar. Often, it is not just a matter of comparing both images or making exact reproductions (which is not possible). We should focus on reproducing the distinctive morphological features of these microfossils.

      We agree with the reviewer that we did not reproduce all the structures reported in Wacey’s original manuscript, such as spherical structures. We are currently preparing another manuscript to address the filamentous microfossils. These spindle-like structures will be addressed in this subsequent work.

      We agree with the reviewer that we often have difficulties differentiating between cells and vesicles. This is not a problem in the early stages of growth. During the log phase, a significant volume of the cell consists of the cytoplasm, with hollow vesicles constituting only a minor volume (Fig. 1B or S1A). During the later growth stages (Fig. 1E & F or S11), cells were almost hollow, with numerous daughter cells within them. These cells often resemble hollow vesicles rather than cells. However, these are biologically formed structures, and one could argue that these vesicles are still alive as there is still a minimal amount of cytoplasm (Fig. S27). Hence, we should consider them as cells until they break apart to release daughter cells.

      Regarding Figures S59 and S60, we did not claim either of these microfossils is from Pilbara Iron Formations. The legend of Figure S59 clearly states that these structures are from Buck Reef Chert, originally reported by Tice et al., 2006 (Figure 16 in the original manuscript). The legend of Figure S60 says these structures were originally reported by Barlow et al., 2018, from the Turee Creek Formation. 

      Line 546f and 552: The sites including microfossils in the Archean represent different paleoenvironments ranging from marine to terrestrial to lacustrine. References 6 and 66 are well-developed studies focusing on specific stratigraphic successions, but cannot include information covering other Archean worlds of the over 2.5 Ga years Archean time.  

      All the Archaean microfossils reported to date are from volcanic coastal marine environments. We are aware that there are rocky terrestrial environments, but no microfossils have been reported from these sites. We are unaware of any Archaean microfossils reported from freshwater environments. 

      Line 570ff: The statements may represent a hypothesis, but the data presented are too preliminary to substantiate the assumptions.

      We believe this is a correct inference from an evolutionary, genomic, and now from a morphological perspective. 

      Figures:  

      Please check all text and supplementary figures, whether scale bars are of different styles within the figure (minor quibble). 

      S3 (no scale in C, D); S4, S5: Note that scale bars are of different styles. 

      We believe we addressed this issue above. 

      S6 D: depressions here are well visible - perhaps exchange with a photo in the main text? Note that scale bars are of different styles.  

      We agree that depressions are well visible in E. The same image of EM-P cell in E is also present in Fig. 1D in the main text.   

      S7: Scale bars should all be of the same style, if anyhow possible. Scale in D? 

      We believe we addressed this issue above. 

      S9: F appears to be distorted. Is the fossil like this? The figure would need additional indicators (arrows) pointing toward what the reader needs to see - not clear in this version. More explanation in the figure caption could be offered. 

      We rechecked the figure from the original publication to check if by mistake the figure was distorted during the assembly of this image. We can assure you that this is not the case. We are not sure what further could be said in the figure legend.     

      S13: What is shown in the inserts of D and E that is also visible in A and B? Here a sketch of the steps would help. 

      We did not understand the question.  

      S14: Scale in A, B? 

      We believe we addressed this issue above. 

      S15: Scales in A, E, C, D 

      We believe we addressed this issue above. 

      S16: scales in D, E, G, H, I, J?  

      We believe we addressed this issue above. 

      S17: "I" appears squeezed, is that so? If morphology is an important message, perhaps reduce the entire figure so it fits the layout. Note that labels A, B, C, and D are displaced. 

      As shown in several subsequent figures, the hollow spherical vesicles are compressed first into honeycomb-like structures, and they often undergo further compression to form lamination-like structures. Such images often give the impression that the entire figure is squashed, but this is not the case. If one examines the figure closely, one can see perfectly spherical vesicles together with laterally squeezed structures. Regarding the figure labels, we addressed this issue above.

      S18: The filamentous feature in C could also be the grain boundaries of the crystals. Can this be excluded as an interpretation? Are there microfossils with the cell membranes? That would be an excellent contribution to this figure. Note that scale bars are of different styles.

      If this were a one-off observation, we might have arrived at the reviewer's opinion. But spherical cells in a “string of beads” configuration have been reported too frequently, and from too many sites, to be discounted as mere interpretation.

      S19: The morphologies in A - insert appear to be similar to E - insert in the lower left corner. The chain of cells in A may look similar to the morphologies in E - insert upper right of the image. B - what is to see here? D - the inclusions do not appear spherical (?). Does C look similar to the cluster with the arrow in the lower part of image E? Note that scale bars are of different styles (minor quibble). A, B, C, and D appear compressed. Perhaps reduce the size of the overall image?  

      The structures highlighted (yellow box) in C are similar to the highlighted regions in E: the agglomeration of hollow vesicles. It is hard to convey this similarity in one figure. The similarities are apparent when one sees Movie 4 and Fig. S12, which clearly show the spherical daughter cells within the hollow vesicle. We have now added the movie reference to the figure legend.

      S20: A appears not to contribute much. The lineations in B appear to be diagenetic. However, C is suitable. Perhaps use only C, D, E? 

      We believe too many unrecognizable structures are being labeled as diagenetic. Nevertheless, we do not consider these earlier interpretations unduly lenient. They were justified, as such structures had not been reported from live cells. This is the first study to report that cells could form such structures. As we have now reproduced these structures, an alternate interpretation that these are organic structures derived from microfossils should be entertained.

      S 21: Note that scale bars are of different styles.  

      We believe we addressed this issue above. 

      S22: Perhaps add an arrow in F, where the cell opened, and add "see arrow" in the caption? Is this the same situation as shown in C (white arrow)? What is shown by the white arrow in A? Note that scale bars are of different styles.

      We made the necessary changes.

      S23: In the caption and main text, please replace "&" with "and" (please check also the other figure captions, e.g. S24). Note that scale bars are of different styles. What is shown in F? A, D - what is shown here?

      We replaced “&” with “and.”  

      S24: Note that scale bars are of different styles. Note that Wacey et al. describe the vesicles as abiotic not as "microfossils"; please correct in figure caption [same also S26; 25; 28].

      We are aware of Prof. Dr. Wacey’s interpretations. We discuss them at length in the discussion section of our manuscript. Based on the similarities between the Dresser Formation structures and structures formed by EM-P, we contest that these are abiotic structures.

      S25: Appears compressed; note different scale bars. 

      We believe we addressed this issue above. 

      S28: The label in B is still in the upper right corner; scale in D? What is to see in rectangles (blue and red) in A, B? In fossil material, this could be anything. 

      These figures are taken from a previous manuscript cited in the figure legend. We could not erase or modify these figures.  

      S33: "L"ewis; G appears a bit too diffuse - erase? Note that scale bars are of different styles.

      We believe we addressed this issue above. 

      S34: This figure appears unconvincing. Erase? 

      There are considerable similarities between the microfossils and structures formed by EM-P. If the reviewer expands a bit on what he finds unconvincing, we can address his reservations.    

      S35: It would be more convincing to show only the morphological similarities between the cell clusters. B and C are too blurry to distinguish much. Scales in D to F and in sketches? A appears compressed (?). 

      We rechecked the original manuscript to see if image A was distorted while making this figure, but this is not the case. Regarding B & C, cells in this image are faint as they are hollow vesicles and, by nature, do not generate too much contrast when imaged with a phase-contrast microscope. There are some limitations on how much we can improve the contrast. We now added scale bars for D-I. Similarly, faint hollow vesicles can be seen in Fig. S21 C & D, and Fig. 3H.  

      S36: Very nice; in B no purple arrow is visible. Note that scale bars are of different styles. S37 and S36 are very much the same - fuse, perhaps?  

      We are sorry for the confusion. There are purple arrows in Fig. S37B-D. 

      S38: this is a more unconvincing figure - erase? 

      Unconvincing in what sense? There are considerable similarities between the microfossils and structures formed by EM-P. If the reviewer expands a bit on what he finds unconvincing, we can address his reservations.

      S39: white rectangle in A? Arrow in A? Note that scale bars are of different styles.

      These are some of the unavoidable remnants from the image from the original publication. 

      S40: in F: CM, V = ?; Note that scale bars are of different style. 

      This is an oversight on our part. We have now added the definitions to the figure legend. We thank the reviewer for pointing it out.

      S41: Rectangles in D, E, F, G can be deleted? Scales and labels missing in photos lower right. 

      Those rectangles are added by the image processing software to the 3D-rendered images. Regarding the missing scale bars in H & I, these panels are magnified regions of F. The scale bar is already present in F.

      S42: appears compressed. G could be trimmed. Labels too small; scale in G? 

      This is a curled-up folded membrane. We needed to lower the resolution of some images to keep the supplement within the journal's size limits. It is not possible to present 85 figures in high resolution. But we assure you that the image is not laterally compressed in any manner.

      S43: This figure appears to be unconvincing. Reducing to pairing B, C, D with L, K? Spherical inclusions in B? Scales in E to G? Similar in S44: A, B, E only? Note that scale bars are of different styles. 

      Figures I to K are important. They show not just the morphological similarities but also the sequence of steps through which such structures are formed. We addressed the issue of the scale bars above.  

      S45: A, B, and C appear to show live or subrecent material. How was this isolated of a rock? Note that scale bars are of different styles.  

      It is common to treat rocks with acids to dissolve them and then retrieve organic structures within them. This technique is becoming increasingly common. The procedure is quite extensively discussed in the original manuscript. We do not see much difference in the scale bars of microfossils and EM-P cells; they are quite similar.

      S46: A: what is to see here? Note that scale bars are of different styles. 

      There are considerable similarities between the folded, fabric-like organic structures with spherical inclusions and structures formed by EM-P. If the reviewer expands a bit on what he finds unconvincing, we can address his reservations.

      S47: Perhaps enlarge B and erase A. Note that scale bars are of different styles. 

      S48: Image B appears to show the fossil material - is the figure caption inconsistent? There are no aggregations visible in the boxes in A. H is described in the figure caption but missing in the figure. Overall, F and G do not appear to mirror anything in A to E (which may be fossil material?). 

      S51; S52 B, C, E; S53: these figures appear unconvincing - erase? 

      Unconvincing in what sense? The structures from our study are very similar to the microfossils.   

      S54: North "Pole; scale bars in A to C =? 

      These figures were borrowed from an earlier publication referenced in the figure legend. That is the reason for the differences in the styles of scale bars.  

      S55: D and E appear not to contribute anything. Perhaps add arrow(s) and more explanation? Check the spelling in the caption, please. 

      D & E show morphological similarities between cells from our study and microfossils (A).   

      S56: Hexagonal morphologies may also be a consequence of diagenesis. Overall, perhaps erase this figure?  

      We certainly agree that this could be one of the reasons for the hexagonal morphologies. Such geometric polygonal morphologies have not previously been observed in living organisms. Nevertheless, as can be seen from the figure, such morphologies could also be formed by living organisms. Hence, this alternate interpretation should not be discounted.

      S57: The figure caption needs improvement. Please add more description. What show arrows in A, what are the numbers in A? What is the relation between the image attached to the right side of A? Is this a close-up? Note that scale bars are of different styles. 

      We expanded a bit on our original description of the figure. However, we ask the reviewer to keep in mind that parts of the figure are taken from a previous publication. We are not at liberty to modify them, for example by removing the arrows. This imposes some constraints.

      S58: There are no honeycomb-shaped features visible. What is to see here? Erase this figure? 

      Clearly, one can see spherical and polygonal shapes within the Archaean organic structures and mat-like structures formed by EM-P.  

      S59 and S60: What is to see here? - Erase? 

      Clearly, one can see spherical and polygonal shapes within the Archaean organic structures and mat-like structures formed by EM-P in Fig. S59. Further disintegration of these honeycomb-shaped mats into filamentous structures with spherical cells attached to them can be seen in both the Archaean organic structures and the structures formed by EM-P.

      S61: This figure appears to be unconvincing. B and F may be a good pairing. Note that scale bars are of different styles.  

      There are considerable similarities between the microfossils and structures formed by EM-P. If the reviewer expands a bit on what he finds unconvincing, we might be able to address his reservations.     

      S62: This figure appears to be unconvincing - erase?

      There are considerable similarities between the microfossils and structures formed by EM-P. If the reviewer expands a bit on what he finds unconvincing, we might be able to address his reservations.     

      S66: This figure is unconvincing - erase? 

      There are considerable similarities between the microfossils and structures formed by EM-P. If the reviewer expands a bit on what he finds unconvincing, we might be able to address his reservations.    

      S68: Scale in B, D, and E? 

      Image B is just a magnified image of a small portion of image A. Hence, there is no need for an additional scale bar. The same is true for images D and E. 

      S69: This figure appears to be unconvincing, at least the fossil part. Filamentous features are visible in fossil material as well, but nothing else. 

      We are not sure what filamentous features the reviewer is referring to. Both the figures show morphologically similar spherical cells covered in membrane debris.    

      S70 [as well as S82]: Good thinking here, but scales differ by magnitudes (cm to μm). Erase this figure? Very similar to Figure S73: Insert in C has which scale in comparison to B? Note that scale bars are of different styles.  

      We realize the scale bars are of different sizes. In our defense, our experiments are conducted in chamber slides with a volume of 1 ml. We don’t have the luxury of doing these experiments on a scale similar to that of natural environments. The size differences are to be expected.

      S71: Scale in E? 

      Image E is just a magnified image of a small portion of image D. Hence, we believe a scale bar is unnecessary. 

      S72: Scale in insert?  

      The insert is just a magnified region of A & C.

      S75: This figure appears to be unconvincing. This is clastic sediment, not chert. Lenticular gaps would collapse during burial by subsequent sediment. - Erase? 

      Regarding the similarities, we see similar lenticular gaps within the parallel layers of organic carbon in both the microfossils and the structures formed by EM-P.

      S76: A, C, D do not look similar to B - erase? Similar to S79, also with respect to the differences in scale. Erase? 

      Regarding the similarities, we see similar lenticular gaps within the parallel layers of organic carbon in both the microfossils and the structures formed by EM-P. We believe we addressed the issue of scale bars above.

      S80: A appears to be diagenetic, not primary. Erase? 

      These two structures share too many resemblances to ignore or to dismiss as merely diagenetic structures: raised filamentous structures originate from parallel layers of organic carbon (laminations), with spherical cells within this filamentous organic carbon.

      S85: What role would diagenesis play here? This figure appears unconvincing. Erase?

      We do believe that diagenesis plays a major role in microfossil preservation. However, we do not subscribe to the notion that all microfossil features should by default be attributed to diagenesis. Our study shows that there could be an alternative explanation for some of the observations.

      S86 and S87: These appear unconvincing. What is to see here? Erase? 

      The figures show the morphological similarities between these two structures: stellar-shaped organic structures with strings of spherical daughter cells growing out of them.

      S88: Does this image suggest the preservation of "salt" in organic material once preserved in chert?  

      That is one inference we draw from this observation. Crystalline NaCl was previously reported from within the microfossil cells.

      S89: What is to see here? Spherical phenomena in different materials? 

      At present, the presence of honeycomb-like structures is often considered an indication of volcanic pumice. We meant to show that biofilms of living organisms could result in honeycomb-shaped patterns similar to those of volcanic pumice.

      References 

      Please check the spelling in the references. 

      We found a few references that required correction. We have now rectified them.

      References  

      (1) Orange F, Westall F, Disnar JR, Prieur D, Bienvenu N, Le Romancer M, et al. Experimental silicification of the extremophilic archaea Pyrococcus abyssi and Methanocaldococcus jannaschii: Applications in the search for evidence of life in early Earth and extraterrestrial rocks. Geobiology. 2009;7(4). 

      (2) Orange F, Disnar JR, Westall F, Prieur D, Baillif P. Metal cation binding by the hyperthermophilic microorganism, Archaea Methanocaldococcus jannaschii, and its effects on silicification. Palaeontology. 2011;54(5). 

      (3) Errington J. L-form bacteria, cell walls and the origins of life. Open Biol. 2013;3(1):120143. 

      (4) Cooper S. Distinguishing between linear and exponential cell growth during the division cycle: Single-cell studies, cell-culture studies, and the object of cell-cycle research. Theor Biol Med Model. 2006; 

      (5) Mitchison JM. Single cell studies of the cell cycle and some models. Theor Biol Med Model. 2005; 

      (6) Kærn M, Elston TC, Blake WJ, Collins JJ. Stochasticity in gene expression: From theories to phenotypes. Nat Rev Genet. 2005; 

      (7) Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic gene expression in a single cell. Science. 2002; 

      (8) Strovas TJ, Sauter LM, Guo X, Lidstrom ME. Cell-to-cell heterogeneity in growth rate and gene expression in Methylobacterium extorquens AM1. J Bacteriol. 2007; 

      (9) Knoll AH, Barghoorn ES. Archean microfossils showing cell division from the Swaziland System of South Africa. Science. 1977;198(4315):396–8. 

      (10) Sugitani K, Grey K, Allwood A, Nagaoka T, Mimura K, Minami M, et al. Diverse microstructures from Archaean chert from the Mount Goldsworthy–Mount Grant area, Pilbara Craton, Western Australia: microfossils, dubiofossils, or pseudofossils? Precambrian Res. 2007;158(3–4):228–62. 

      (11) Kanaparthi D, Lampe M, Krohn JH, Zhu B, Hildebrand F, Boesen T, et al. The reproduction process of Gram-positive protocells. Sci Rep. 2024 Mar 25;14(1):7075.

    1. Author response:

      Reviewer #1 (Public review):

      This manuscript presents an interesting exploration of the potential activation mechanisms of DLK following axonal injury. While the experiments are beautifully conducted and the data are solid, I feel that there is insufficient evidence to fully support the conclusions made by the authors.

      In this manuscript, the authors exclusively use the puc-lacZ reporter to determine the activation of DLK. This reporter has been shown to be induced when DLK is activated. However, there is insufficient evidence to confirm that the absence of reporter activation necessarily indicates that DLK is inactive. As with many MAP kinase pathways, the DLK pathway can be locally or globally activated in neurons, and the level of DLK activation may depend on the strength of the stimulation. This reporter might only reflect strong DLK activation and may not be turned on if DLK is weakly activated. The results presented in this manuscript support this interpretation. Strong stimulation, such as axotomy of all synaptic branches, caused robust DLK activation, as indicated by puc-lacZ expression. In contrast, weak stimulation, such as axotomy of some synaptic branches, resulted in weaker DLK activation, which did not induce the puc-lacZ reporter. This suggests that the strength of DLK activation depends on the severity of the injury rather than the presence of intact synapses. Given that this is a central conclusion of the study, it may be worthwhile to confirm this further. Alternatively, the authors may consider refining their conclusion to better align with the evidence presented.

      We wish to further clarify a striking aspect of puc-lacZ induction following injury: it is bimodal. It is either induced (in various injuries that remove all synaptic boutons) or not induced, including in injuries that spared only 1-2 remaining boutons. This was particularly evident for injuries that spared the NMJ on muscle 29, which comprises only a few boutons. In some instances, only a single bouton was evident on muscle 29. While our injuries varied enormously in the number of branches and boutons that were lost, we did not see a comparable variability in puc-lacZ induction. In the revision we will include additional images to better demonstrate this observation.

      The reviewer (and others) fairly point out that our current study focuses on puc-lacZ as a reporter of Wnd signaling in the cell body. We consider this to be a downstream integration of events in axons that are more challenging to detect. It is striking that this integration appears strongly sensitized to the presence of spared synaptic boutons. Examination of Wnd’s activation in axons and synapses is a goal for our future work.

      As noted by the authors, DLK has been implicated in both axon regeneration and degeneration. Following axotomy, DLK activation can lead to the degeneration of distal axons, where synapses are located. This raises an important question: how is DLK activated in distal axons? The authors might consider discussing the significance of this "synapse connection-dependent" DLK activation in the broader context of DLK function and activation mechanisms.

      While it has been noted that inhibition of DLK can mildly delay Wallerian degeneration (Miller et al., 2009), this does not appear to be the case for retinal ganglion cell axons following optic nerve crush (Fernandes et al., 2014). It is also not the case for Drosophila motoneurons and NMJ terminals following peripheral nerve injury (Xiong et al., 2012; Xiong and Collins, 2012). Instead, overexpression of Wnd or activation of Wnd by a conditioning injury leads to the opposite phenotype: an increase in resiliency to Wallerian degeneration for axons that have been previously injured (Xiong et al., 2012; Xiong and Collins, 2012). The downstream outcome of Wnd activation is highly dependent on the context; it may be an integration of the outcomes of local Wnd/DLK activation in axons with downstream consequences of nuclear/cell body signaling. The current study suggests some rules for the cell body signaling; however, how Wnd is regulated at synapses and why it promotes degeneration in some circumstances but not others are important future questions.

      Following the reviewer’s suggestion, it is interesting to consider DLK’s potential contributions to the loss of NMJ synapses in a mouse model of ALS (Le Pichon et al., 2017; Wlaschin et al., 2023). Our findings suggest that the synaptic terminal is an important locus of DLK regulation, while dysfunction of NMJ terminals is an important feature of the ‘dying back’ hypothesis of disease etiology (Dadon-Nachum et al., 2011; Verma et al., 2022). We propose that the regulation of DLK at synaptic terminals is an important area for future study, and may reveal how DLK might be modulated to curtail disease progression. Of note, DLK inhibitors are in clinical trials (Katz et al., 2022; Le et al., 2023; Siu et al., 2018), but at least some have been paused due to safety concerns (Katz et al., 2022). Further understanding of the mechanisms that regulate DLK is needed to determine whether and how DLK and its downstream signaling can be tuned for therapeutic benefit.

      Reviewer #2 (Public review):

      Summary:

      The authors study a panel of sparsely labeled neuronal lines in Drosophila that each form multiple synapses. Critically, each axonal branch can be injured without affecting the others, allowing the authors to differentiate between injuries that affect all axonal branches versus those that do not, creating spared branches. Axonal injuries are known to cause Wnd (mammalian DLK)-dependent retrograde signals to the cell body, culminating in a transcriptional response. This work identifies a fascinating new phenomenon that this injury response is not all-or-none. If even a single branch remains uninjured, the injury signal is not activated in the cell body. The authors rule out that this could be due to changes in the abundance of Wnd (perhaps if incrementally activated at each injured branch) caused by Hiw, Wnd's known negative regulator. Thus there is both a yet-undiscovered mechanism to regulate Wnd signaling, and more broadly a mechanism by which the neuron can integrate the degree of injury it has sustained. It will now be important to tease apart the mechanism(s) of this fascinating phenomenon. But even absent a clear mechanism, this is new biology that will inform the interpretation of injury signaling studies across species.

      Strengths:

      (1) A conceptually beautiful series of experiments that reveal a fascinating new phenomenon is described, with clear implications (as the authors discuss in their Discussion) for injury signaling in mammals.

      (2) Suggests a new mode of Wnd regulation, independent of Hiw.

      Weaknesses:

      (1) The use of a somatic transcriptional reporter for Wnd activity is powerful; however, the reporter indicates whether the transcriptional response was activated, not whether the injury signal was received. It remains possible that Wnd is still activated in the case of a spared branch, but that this activation is either local within the axons (impossible to determine in the absence of a local reporter) or that the retrograde signal was indeed generated but it was somehow insufficient to activate transcription when it entered the cell body. This is more of a mechanistic detail and should not detract from the overall importance of the study.

      We agree. The puc-lacZ reporter tells us about signaling in the cell body, but whether and how Wnd is regulated in axons and synaptic branches, which we think occurs upstream of the cell body response, remains to be addressed in future studies.

      (2) That the protective effect of a spared branch is independent of Hiw, the known negative regulator of Wnd, is fascinating. But this leaves open a key question: what is the signal?

      This is indeed an important future question, and would still be a question even if Hiw were part of the protective mechanism by the spared synaptic branch. Our current hypothesis (outlined in Figure 4) is that regulation of Wnd is tied to the retrograde trafficking of a signaling organelle in axons. The Hiw-independent regulation complements other observations in the literature that multiple pathways regulate Wnd/DLK (Collins et al., 2006; Feoktistov and Herman, 2016; Klinedinst et al., 2013; Li et al., 2017; Russo and DiAntonio, 2019; Valakh et al., 2013). It is logical for this critical stress response pathway to have multiple modes of regulation that may act in parallel to tune and restrain its activation.

      Reviewer #3 (Public review):

      Summary:

      This manuscript seeks to understand how nerve injury-induced signaling to the nucleus is influenced, and it establishes a new location where these principles can be studied. By identifying and mapping specific bifurcated neuronal innervations in the Drosophila larvae, and using laser axotomy to localize the injury, the authors find that sparing a branch of a complex muscular innervation is enough to impair Wallenda-puc (analogous to DLK-JNK-cJun) signaling that is known to promote regeneration. It is only when all connections to the target are disconnected that cJun-transcriptional activation occurs.

      Overall, this is a thorough and well-performed investigation of the mechanism of spared-branch influence on axon injury signaling. The findings on control of wnd are important because this is a very widely used injury signaling pathway across species and injury models. The authors present detailed and carefully executed experiments to support their conclusions. Their effort to identify the control mechanism is admirable and will be of aid to the field as they continue to try to understand how to promote better regeneration of axons.

      Strengths:

      The paper does a very comprehensive job of investigating this phenomenon at multiple locations and through both pinpoint laser injury as well as larger crush models. They identify a non-hiw based restraint mechanism of the wnd-puc signaling axis that presumably originates from the spared terminal. They also present a large list of tests they performed to identify the actual restraint mechanism from the spared branch, which has ruled out many of the most likely explanations. This is an extremely important set of information to report, to guide future investigators in this and other model organisms on mechanisms by which regeneration signaling is controlled (or not).

      Weaknesses:

      The weakest data presented by this manuscript is the study of the actual amounts of Wallenda protein in the axon. The authors argue that increased Wnd protein is being anterogradely delivered from the soma, but no support for this is given. Whether this change is due to transcription/translation, protein stability, transport, or other means is not investigated in this work. However, because this point is not central to the arguments in the paper, it is only a minor critique.

      We agree and are glad that the reviewer considers this a minor critique; this is an area for future study. In Supplemental Figure 1 we present differences in the levels of an ectopically expressed GFP-Wnd-kinase-dead transgene, which is strikingly increased in axons that have received a full but not partial axotomy. We suspect this accumulation occurs downstream of the cell body response because of the timing. We observed the accumulations after 24 hours (Figure S1F) but not at early (1-4 hour) time points following axotomy (data not shown). Further study of the local regulation of Wnd protein and its kinase activity in axons is an important future direction.

      As far as the scope of impact: because the conclusions of the paper are focused on a single (albeit well-validated) reporter in different types of motor neurons, it is hard to determine whether the mechanism of spared branch inhibition of regeneration requires wnd-puc (DLK/cJun) signaling in all contexts (for example, sensory axons or interneurons). Is the nerve-muscle connection the rule or the exception in terms of regeneration program activation?

      DLK signaling is strongly activated in DRG sensory neurons following peripheral nerve injury (Shin et al., 2012), despite the fact that sensory neurons have bifurcated axons and their projections in the dorsal spinal cord are not directly damaged by injuries to the peripheral nerve. Therefore, it is unlikely that protection by a spared synapse is a universal rule for all neuron types. However, the molecular mechanisms that underlie this regulation may indeed be shared across different types of neurons but utilized in different ways. For instance, nerve growth factor withdrawal can lead to activation of DLK (Ghosh et al., 2011); however, neurotrophins and their receptors are regulated and implemented differently in different cell types. We suspect that the restraint of Wnd signaling by the spared synaptic branch shares a common underlying mechanism with the restraint of DLK signaling by neurotrophin signaling. Further elucidation of the molecular mechanism is an important next step towards addressing this question.

      Because changes in puc-lacZ intensity are the major readout, it would be helpful to better explain the significance of the amount of puc-lacZ in the nucleus with respect to the activation of regeneration. Is it known that scaling up the amount of puc-lacZ transcription scales functional responses (regeneration or others)? The alternative would be that only a small amount of puc-lacZ is sufficient to efficiently induce relevant pathways (threshold response).

      While induction of puc-lacZ expression correlates with Wnd-mediated phenotypes, including sprouting of injured axons (Xiong et al., 2010), protection from Wallerian degeneration (Xiong et al., 2012; Xiong and Collins, 2012) and synaptic overgrowth (Collins et al., 2006), we have not observed any correlation between the degree of puc-lacZ induction (e.g., modest, medium, or high) and the phenotypic outcomes (sprouting, overgrowth, etc.). Rather, there appears to be a striking all-or-none difference in whether puc-lacZ is induced or not induced. There may indeed be a threshold that can be restrained through multiple mechanisms. We posit in Figure 4 that restraint may take place in the cell body, where it can be influenced by the spared bifurcation.

      References Cited:

      Collins CA, Wairkar YP, Johnson SL, DiAntonio A. 2006. Highwire restrains synaptic growth by attenuating a MAP kinase signal. Neuron 51:57–69.

      Dadon-Nachum M, Melamed E, Offen D. 2011. The “dying-back” phenomenon of motor neurons in ALS. J Mol Neurosci 43:470–477.

      Feoktistov AI, Herman TG. 2016. Wallenda/DLK protein levels are temporally downregulated by Tramtrack69 to allow R7 growth cones to become stationary boutons. Development 143:2983–2993.

      Fernandes KA, Harder JM, John SW, Shrager P, Libby RT. 2014. DLK-dependent signaling is important for somal but not axonal degeneration of retinal ganglion cells following axonal injury. Neurobiol Dis 69:108–116.

      Ghosh AS, Wang B, Pozniak CD, Chen M, Watts RJ, Lewcock JW. 2011. DLK induces developmental neuronal degeneration via selective regulation of proapoptotic JNK activity. J Cell Biol 194:751–764.

      Hao Y, Frey E, Yoon C, Wong H, Nestorovski D, Holzman LB, Giger RJ, DiAntonio A, Collins C. 2016. An evolutionarily conserved mechanism for cAMP elicited axonal regeneration involves direct activation of the dual leucine zipper kinase DLK. Elife 5. doi:10.7554/eLife.14048

      Huntwork-Rodriguez S, Wang B, Watkins T, Ghosh AS, Pozniak CD, Bustos D, Newton K, Kirkpatrick DS, Lewcock JW. 2013. JNK-mediated phosphorylation of DLK suppresses its ubiquitination to promote neuronal apoptosis. J Cell Biol 202:747–763.

      Katz JS, Rothstein JD, Cudkowicz ME, Genge A, Oskarsson B, Hains AB, Chen C, Galanter J, Burgess BL, Cho W, Kerchner GA, Yeh FL, Ghosh AS, Cheeti S, Brooks L, Honigberg L, Couch JA, Rothenberg ME, Brunstein F, Sharma KR, van den Berg L, Berry JD, Glass JD. 2022. A Phase 1 study of GDC-0134, a dual leucine zipper kinase inhibitor, in ALS. Ann Clin Transl Neurol 9:50–66.

      Klinedinst S, Wang X, Xiong X, Haenfler JM, Collins CA. 2013. Independent pathways downstream of the Wnd/DLK MAPKKK regulate synaptic structure, axonal transport, and injury signaling. J Neurosci 33:12764–12778.

      Le K, Soth MJ, Cross JB, Liu G, Ray WJ, Ma J, Goodwani SG, Acton PJ, Buggia-Prevot V, Akkermans O, Barker J, Conner ML, Jiang Y, Liu Z, McEwan P, Warner-Schmidt J, Xu A, Zebisch M, Heijnen CJ, Abrahams B, Jones P. 2023. Discovery of IACS-52825, a potent and selective DLK inhibitor for treatment of chemotherapy-induced peripheral neuropathy. J Med Chem 66:9954–9971.

      Le Pichon CE, Meilandt WJ, Dominguez S, Solanoy H, Lin H, Ngu H, Gogineni A, Sengupta Ghosh A, Jiang Z, Lee S-H, Maloney J, Gandham VD, Pozniak CD, Wang B, Lee S, Siu M, Patel S, Modrusan Z, Liu X, Rudhard Y, Baca M, Gustafson A, Kaminker J, Carano RAD, Huang EJ, Foreman O, Weimer R, Scearce-Levie K, Lewcock JW. 2017. Loss of dual leucine zipper kinase signaling is protective in animal models of neurodegenerative disease. Sci Transl Med 9. doi:10.1126/scitranslmed.aag0394

      Li J, Zhang YV, Asghari Adib E, Stanchev DT, Xiong X, Klinedinst S, Soppina P, Jahn TR, Hume RI, Rasse TM, Collins CA. 2017. Restraint of presynaptic protein levels by Wnd/DLK signaling mediates synaptic defects associated with the kinesin-3 motor Unc-104. Elife 6. doi:10.7554/eLife.24271

      Miller BR, Press C, Daniels RW, Sasaki Y, Milbrandt J, DiAntonio A. 2009. A dual leucine kinase-dependent axon self-destruction program promotes Wallerian degeneration. Nat Neurosci 12:387–389.

      Nihalani D, Merritt S, Holzman LB. 2000. Identification of structural and functional domains in mixed lineage kinase dual leucine zipper-bearing kinase required for complex formation and stress-activated protein kinase activation. J Biol Chem 275:7273–7279.

      Russo A, DiAntonio A. 2019. Wnd/DLK is a critical target of FMRP responsible for neurodevelopmental and behavior defects in the Drosophila model of fragile X syndrome. Cell Rep 28:2581–2593.e5.

      Shin JE, Cho Y, Beirowski B, Milbrandt J, Cavalli V, DiAntonio A. 2012. Dual leucine zipper kinase is required for retrograde injury signaling and axonal regeneration. Neuron 74:1015–1022.

      Siu M, Sengupta Ghosh A, Lewcock JW. 2018. Dual Leucine Zipper Kinase Inhibitors for the Treatment of Neurodegeneration. J Med Chem 61:8078–8087.

      Valakh V, Walker LJ, Skeath JB, DiAntonio A. 2013. Loss of the spectraplakin short stop activates the DLK injury response pathway in Drosophila. J Neurosci 33:17863–17873.

      Verma S, Khurana S, Vats A, Sahu B, Ganguly NK, Chakraborti P, Gourie-Devi M, Taneja V. 2022. Neuromuscular junction dysfunction in amyotrophic lateral sclerosis. Mol Neurobiol 59:1502–1527.

      Wlaschin JJ, Donahue C, Gluski J, Osborne JF, Ramos LM, Silberberg H, Le Pichon CE. 2023. Promoting regeneration while blocking cell death preserves motor neuron function in a model of ALS. Brain 146:2016–2028.

      Xiong X, Collins CA. 2012. A conditioning lesion protects axons from degeneration via the Wallenda/DLK MAP kinase signaling cascade. J Neurosci 32:610–615.

      Xiong X, Hao Y, Sun K, Li J, Li X, Mishra B, Soppina P, Wu C, Hume RI, Collins CA. 2012. The Highwire ubiquitin ligase promotes axonal degeneration by tuning levels of Nmnat protein. PLoS Biol 10:e1001440.

      Xiong X, Wang X, Ewanek R, Bhat P, DiAntonio A, Collins CA. 2010. Protein turnover of the Wallenda/DLK kinase regulates a retrograde response to axonal injury. J Cell Biol 191:211–223.

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      1. General Statements

      We thank the reviewers for their thorough and positive evaluation of the manuscript.

      2. Point-by-point description of the revisions

      We revised the manuscript following the suggestions of the reviewers to make the article more concise and comprehensible to a wider audience. Specifically, we rearranged Section 5, rewrote the difficult-to-understand sections 5 and 6, and removed unnecessary or overlapping text in the Introduction and Discussion. We have also addressed the specific points raised by the reviewers. The responses to individual points are detailed below.

      Reviewer 1:

      The reviewer did not ask for any changes to the manuscript.

      We thank the reviewer for the positive evaluation of the manuscript.

      Reviewer 2:

      1/ Title: Structure-based mechanism of RyR channel operation by calcium and magnesium ions

      The authors may consider using an alternative term instead of "operation".

      Thank you for the suggestion. We considered and discussed the term "RyR channel operation" very thoroughly with several colleagues, including native English speakers, and we found that it represents the complex RyR behavior in situ and in experiments most accurately. Alternative terms such as "control" suggest a one-way deterministic action from the ion binding to the protein state, which is not the case. Terms such as "modulation" imply the presence of a higher RyR state-governing principle, such as phosphorylation, nitrosylation, binding of auxiliary proteins, etc.

      2/ Abstract: Please spell out CFF and MWC theorem.

      Thank you for the proposal. CFF was changed to caffeine; MWC was changed to Monod-Wyman-Changeux.

      3/ Line 87-88: "In striated muscle cells, RyR channels cluster at discrete sites of sarcoplasmic reticulum attached to the sarcolemma where electrical excitation triggers transient calcium release by activation of RyRs."

      There is no attachment between sarcoplasmic reticulum and sarcolemma, please rewrite.

      We respectfully disagree, since there is strong evidence for the existence of discrete contact sites between the sarcolemma and sarcoplasmic reticulum both at triads of skeletal muscle (Rossi et al., 2019) and at dyads of cardiac muscle (Mackrill, 2022), at which both membranes are firmly attached.

      However, to avoid potential misunderstanding, we changed the sentence to "In striated muscle cells, RyR channels cluster at the discrete sites of sarcoplasmic reticulum attached to the sarcolemma in triads or dyads, where electrical excitation triggers transient calcium release by activation of RyRs" (lines 86-87).

      4/ Lines 104-107: "Recently, mathematical modeling of the cardiac calcium release site (Iaparov et al., 2022) confirmed that Mg2+ ions could at the same time act as the negative competitor at the calcium activation site and as an inhibitor at the inhibition site. Unfortunately, the structural counterpart of RyR inactivation, an inhibitory binding site for divalent ions, has not been located yet in RyR structures."

      Note that the exact structural counterpart exists (Nayak et al., 2022, 2024), where Ca and Mg were found both at the activation and inhibition sites. The paragraph should be updated accordingly.

      We respectfully disagree. In the cited works of Nayak et al. (2022; 2024) it was shown that Ca and Mg ions bind firmly at the activation site. Both atoms were also observed at the ACP molecule bound at the ATP binding site. However, they were not observed at the divalent ion-binding inhibition site, which is distinct from the ATP binding site and resides in the loops of the EF-hand region.

      However, to clarify the meaning of the disputed sentence, we have changed it to: "Although binding of Ca2+ or Mg2+ to an inhibitory binding site has not been observed yet in RyR structures, a consensus is emerging that the EF-hand loops constitute this site (Gomez et al., 2016; Zheng and Wen, 2020; Nayak et al., 2024; Chirasani et al., 2024)" (lines 107-109).

      5/ Lines 108-110: The activation of RyR by agonists was shown to be accompanied by a conformational change around the Ca2+ binding site that leads to a decrease in the free energy and to a concomitant increase of the Ca2+ binding affinity and a population shift between the closed and open conformations (Dashti et al., 2020).

      Please clarify to what state does the "decrease in free energy" refer, to the open or to the closed state?

      Thank you for the proposal. The text was changed to: "The activation of RyR by agonists was shown to be accompanied by a conformational change around the Ca2+ binding site that leads to a decrease in the free energy of the open state and concomitantly to an increase of the Ca2+ binding affinity of the activation site. As a result, the occurrence probability of a RyR state/conformation shifts from the closed toward the open (Dashti et al., 2020)" (lines 110-113).

      6/ Figure 2: please indicate if distances were measured between the C-alphas or side chains.

      Thank you for the proposal. The figure legend was modified to "Distances D1 between the Cα atoms of E4075 and R4736 or equivalent. Right - Distances D2 between the Cα atoms of K4101 and D4730 or equivalent."

      7/ Line 353-357: "These data suggest that interactions between the basic arginine residue R4736 and the acidic residues at the start of the initial helix E of the EF1-hand are specific for Ca2+-dependent inactivation in RyR1, whereas the interactions between the lysine K4101 that immediately follows the F helix of EF1 and the middle of the S23 loop (corresponding to D4730 and I4731 in RyR1) may play a part in the inactivation of both RyR1 and RyR2 isoforms.

      Sentence is unclear; please rewrite. Overall, the entire section "Spatial interactions between the EF-hand and S23* regions" should be simplified and shortened.

      Thank you for the proposal. The text was changed to: "These data suggest that interactions between the basic arginine residue R4736 and the acidic residues E4075 and D4079 are specific for Ca2+-dependent inactivation in RyR1, whereas the interactions between the lysine K4101 and the residues D4730 and I4731 (rRyR1 notation)* may play a part in the inactivation of both RyR1 and RyR2 isoforms." (lines 334-337).

      We did not find a way to make the whole section simpler and shorter at the same time without losing clarity.

      8/ Lines 246-249 and Table 1. "all structures corresponding to rRyR1 residues 4063-4196 were subjected to energy minimization and submitted to the MIB2 server for evaluation of the ion binding score (IBS) of individual amino acid residues and the number of ion binding poses (NIBP) for Ca and Mg ions."

      Please elaborate on the "ion binding score" and "number of ion binding poses" concepts and provide reference for the MIB2 server.

      Thank you for the proposal. We added the reference for the server (Lu et al., 2022) (line 228) and added the information: "IBS values of individual residues are determined using sequence and structure conservation comparison with 409 and 209 respective templates from the PDB database for Ca2+ and Mg2+ (Lin et al., 2016) and assessing the similarity of the configuration of the residue to its configurations in known structures of its complexes with the given metal (Lu et al., 2012). Ion binding sites are determined by locally aligning the query protein with the metal ion-binding templates and calculating its score as the RMSD-weighted scoring function Z. The site is accepted if it has a scoring function Z>1, and based on the local 3D structure alignment between the query protein and the metal ion-binding template, the metal ion in the template is transformed into the query protein structure (Lin et al., 2016). The larger the IBS value, the higher the tendency of the residue to bind the ion. The larger the NIBP value, the larger the number of such complexes with acceptable structure" (lines 224-234).

      9/ Lines 460-466: Nine structural models of RyR were selected, and then these are referred to in the text only with the pdb code. The reviewer understands that it would be difficult to recapitulate all conditions but either a table in the main manuscript file or a minimal description in the text following the pdb code would increase clarity and help readers to follow the content.

      Thank you for the proposal. We added a new Table 2 "Model structures used for identifying the allosteric pathways" on line 452 that contains the required information, and inserted a reference to it in the text at line 446 "According to these criteria we selected five RyR1 model structures (Table 2)..."

      10/ Line 467: "In the selected structures, we identified residues with high allosteric coupling intensities (ACI) for both the inhibition and activation network and compared them with residues important for ligand binding and gating of RyR (Table 2)."

      Please define further the concept of "allosteric coupling intensities". The corresponding methods section appears to focus on the outputs of the OHM server without delving too much on the algorithm or principles followed. Is the allosteric coupling between neighboring residues, or reflect movement of the residues due to ligand binding? Is there a "reference" state or are the comparisons carried out within each allosteric state? This would help to introduce better the sections "The inhibition network" and "The activation network".

      Thank you for this suggestion. We have since realized, considering both the server output and the original work of Wang et al. (2020), that a better term for the variable depicting the role of the residue in the allosteric pathway would be the residue importance (RI) rather than the ACI. The allosteric pathway is determined on the basis of the network of contacts between pairs of residues in the given structure. The more contacts are present between two residues, the higher the probability that a perturbation will be propagated from one residue to the other (Eq. 3 of Wang et al. (2020)). An allosteric pathway is then defined as the pathway that transmits the signal the whole way from the allosteric site to the active site.

      Based on this, we have changed the term "allosteric coupling intensity" to "residue importance" throughout the text and figures of the manuscript. It should be emphasized that this change has no effect whatsoever on the presented data and conclusions. We inserted the following formulation in the Results section:

      "The term residue importance defines the extent to which the given residue is involved in the propagation of a perturbation from the allosteric site to the active site, i.e., the fraction of simulated perturbations transmitted through this particular residue. The more contacts are present between two residues, the higher is the probability that a perturbation will be propagated from one to the other residue (Wang et al., 2020)." (lines 439-443).

      We also inserted the following formulations into the Methods section: "The simulation of the perturbation propagation was performed 10 000 times per structure and pathway to estimate the values of residue importance." (lines 1093-1095), and we expanded the relevant sentence: "Allosteric pathways were traced using the server OHM (https://dokhlab.med.psu.edu/ohm/#/home, (Wang et al., 2020)), in which the allosteric pathway is determined on the basis of the network of contacts between pairs of residues in the given structure." (lines 1082-1084).
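
      The definition of residue importance quoted above can be illustrated with a toy simulation. The sketch below is not the OHM algorithm; it only assumes, for illustration, that a perturbation hops between residues with probability proportional to the number of contacts, and it reports the fraction of successful walks that pass through each residue. The function name, the four-residue network, and the step limit are all hypothetical.

```python
import random
from collections import Counter


def residue_importance(contacts, source, target,
                       n_trials=10_000, max_steps=50, seed=0):
    """Fraction of successful source->target walks visiting each residue.

    contacts maps residue -> {neighbor: number_of_contacts}. Each hop is
    chosen with probability proportional to the contact count (toy rule).
    """
    rng = random.Random(seed)
    visits = Counter()
    successes = 0
    for _ in range(n_trials):
        path = [source]
        current = source
        for _ in range(max_steps):
            neighbors = list(contacts[current])
            weights = [contacts[current][n] for n in neighbors]
            current = rng.choices(neighbors, weights=weights)[0]
            path.append(current)
            if current == target:
                break
        if path[-1] == target:
            successes += 1
            visits.update(set(path))  # count each residue once per walk
    if successes == 0:
        return {}
    return {res: count / successes for res, count in visits.items()}


# Hypothetical network: A = allosteric site, D = active site.
toy_contacts = {
    "A": {"B": 3, "C": 1},
    "B": {"A": 3, "D": 2},
    "C": {"A": 1, "D": 2},
    "D": {"B": 2, "C": 2},
}
importance = residue_importance(toy_contacts, "A", "D")
```

      In this toy network, residue B shares more contacts with the allosteric site than residue C does, so it lies on a larger fraction of the simulated pathways and receives the higher importance value.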

      11/ Figure 8: The figure would be more meaningful if the pathways were drawn in the context of the 3D structure.

      Thank you for the proposal. The pathways described in Fig. 8 are too complex to be depicted in the RyR 3D structure; therefore, they were not presented in the original manuscript. However, to follow the reviewer's proposal, we have illustrated the pathways observed in the inactivated RyR1 channel (7tdg) and the open RyR2 channel (7u9) in Expanded View Figure EV1 and added the corresponding Expanded View Movies EV1 and EV2. These RyR structures were selected for displaying both the intra- and inter-monomeric inactivation pathways.

      12/ Lines 610-612: "The structure of the inactivated RyR2 has not been determined yet; however, it is plausible to suppose that it exists at high concentrations of divalent ions and differs from the inactivated RyR1 structure by the extent of EF-hand - S23* coupling. "

      The speculation would be more fit for the discussion section.

      Thank you for the proposal; however, the sentence introduces a logical supposition, necessary there for reasoning on the construction of the model. We reformulated the sentence to: "In the absence of a structure of the inactivated RyR2, the model assumes that such a structure exists at high concentrations of divalent ions and differs from the inactivated RyR1 structure by the extent of EF-hand - S23* coupling." (lines 573-575).

      13/ Lines 617-619: Closed and primed macrostates could be combined into a single closed macrostate of the model since both are closed and cannot be functionally distinguished at a constant ATP concentration.

      The rationale for combining closed with primed does not seem a good idea, especially since the authors also mention that "the primed state is structurally very close to the open state" (lines 925-926). If the COI model is based on the structural findings, in principle it seems that primed should be treated separately.

      Thank you for the proposal. The use of both the closed and primed states was crucial for solving the model. As a matter of fact, although the primed and closed states are in part structurally different, functionally they are identical, that is, closed. Consequently, to be distinguished in a functional model we would need to incorporate single-channel data obtained under conditions when the ratio of closed and primed channels was modulated under otherwise identical conditions. Unfortunately, such a set of data, for instance at a varying ATP concentration for a range of cytosolic Ca2+ concentrations, does not exist for either RyR1 or RyR2 channels. Moreover, while there are several RyR1 high-resolution structures in the primed state (such as the 7tzc that we used; 2.45 Å; Melville et al. (2022)), the resolution of the corresponding RyR2 structures (6jg3, 6jh6, 6jhn; 4.5 - 6.1 Å; Chi et al. (2019)) is not sufficient for determination of allosteric pathways. Fortunately, however, the two sets of conditions for RyR2 open probability data that were available in the literature turned out to represent activation of channels either selectively from the closed state (Fig. 10C), or almost selectively from the primed state (Fig. 10A, B). This allowed us to interpret the difference in the allosteric coefficients as a consequence of this fact.

      To better clarify the idea, the corresponding text of the Discussion was modified as follows (lines 926-931): "RyR channels can be considered mostly in the primed state under these conditions since the binding of ATP analogs induces the primed structural macrostate in RyRs even in the absence of Ca2+ (Cholak et al., 2023). Fortunately, the two sets of conditions for RyR2 open probability data that were available in the literature turned out to represent activation of channels either selectively from the closed state (Fig. 10C), or selectively from the primed state (Fig. 10A, B).", and "construction of such a model is at present hampered by the lack of open probability data at a sufficiently wide range of experimental conditions and the absence of high-resolution structures of WT RyR2 in the primed state" (lines 934-937).

      14/ Line 619. Please define the "COI" acronym. I assume it is closed, open and inactivated but this is not mentioned.

      We thank the reviewer for noticing this omission. We expanded the specific sentence as follows: "Therefore, we constructed the model of RyR operation, termed the COI (closed-open-inactivated) model, in which we assigned a functional macrostate corresponding to each of the closed, open, and inactivated structural macrostates (Figure 9A)" (line 582).

      15/ Figure 9: The diagrams are difficult to follow. Something that could improve it is to differentiate more between open and closed gates, but further elaboration would help the reader.

      We thank the reviewer for paying attention to details. The open state was differentiated in Figure 9 (after line 603) by adding a pore opening to the gate.

      To elaborate on the gating transitions and to keep the manuscript concise, we added a new Expanded View Figure EV2, which illustrates the relationship between the ion binding within macrostates and the transitions between macrostates.

      Nevertheless, for the complexity of the model, which would need a multidimensional presentation, we had to limit the illustration to only the binding of the first ions at the binding sites. We hope that it will help the reader to grasp the principle of the model function more easily.

      16/ One comment is that the manuscript is too long; the manuscript exceeds the typical length required by most journals. To enhance its suitability for publication, the content needs to be synthesized and streamlined. The manuscript is written for an audience specialized in the RyR field and may be challenging for outsiders or for readers unfamiliar with structure and/or biophysical models.

      We thank the reviewer for raising this issue. The specific contribution to the understanding of RyR operation communicated by this manuscript was achieved by the synergy of approaches coming from different fields of RyR research - the structural, the functional, and the synthetic/systems ones. This needed deep immersion into complex studies performed over several decades to unwrap their complementary contributions. Only then could we synthesize the stepwise advances and integrate the mosaic of partial discoveries into the COI model. When conceptualizing the manuscript, we also considered a two-paper version, one on structural aspects and the other on modeling aspects. We realized that the two papers would need to have a very high overlap at the allosteric mechanism to be understandable in separation and would be difficult to publish in the same journal. We also anticipated a typical side effect that structuralists and modelers would read just their parts and would not appreciate enough the feedback from alternative views - how to design and interpret future structural, functional, and modeling studies.

      Compacting the manuscript would be extremely difficult for us. In our view, the dense text would make it even more challenging for readers unfamiliar with some of the numerous approaches used here, as often happens in prominent multidisciplinary journals. Maybe it would be possible with the help of AI, but for now, we prefer to remain authentic.

      Nevertheless, we made some effort. To shorten the manuscript, we have removed the paragraph describing the timeline of the search for the RyR inhibition site that was originally on lines 126-151 and replaced it with the paragraph on lines 129-134: "The regulatory domains involved in both, activation and inactivation of RyRs (Figure 1) are located in the C-terminal quarter of the RyR. The Central domain participates in the Ca2+ binding activation site; the C-terminal domain bears several residues of Ca-, ATP- and caffeine-binding activation sites; the U-motif participates at the ATP- and caffeine-binding sites; the EF-hand region contains the putative Ca-binding pair EF1 and EF2; and the S23 loop bears one residue of the caffeine-binding site and two residues interacting with the EF-hand region of a neighboring monomer (Samso, 2017; Hadiatullah et al., 2022)". We also removed the statements about the proposed kinetic mechanism of inactivation by Nayak et al. (2022), originally on lines 175-184. Finally, we removed the discussion of the work of Gomez et al. (2016) originally on lines 882-889, since it fully overlapped with the statements in Results on lines 358-367 (now lines 338-347). We also moved the text of the subsection "Relationship between the COI model and RyR allosteric pathways" (originally lines 670-685) into subsection "Construction of the model of RyR operation", lines 592-603 and 645-662 of the revised version.

      17/ Another comment is the limited consideration of two relevant published works. One is by Chirasani et al. (2024), focused on allosteric pathways similar to the ones described here. The other work is by Nayak et al (2024), with cryo-EM structures of RyR1 focused on the interplay with Mg2+ and Ca2+. Overall, the manuscript would be strengthened by incorporating such related results in the literature.

      We thank the reviewer for the concerns, but we cannot fully agree. The paper of Chirasani et al. (2024) was cited in the manuscript as its online-first version, Chirasani et al. (2023). The manuscript now refers to the printed version proposed by the reviewer. The Chirasani et al. work was discussed on lines 870-881. The paper concentrates on the interaction between the EF-hand region and the S23 segment and its effect on RyR inactivation, which we referenced in the manuscript, but not on the allosteric pathways as mentioned by the reviewer. To broaden the consideration of this important work, we have introduced a more detailed discussion of Chirasani et al. (2024) by adding the following text to the manuscript: Lines 881-888: "Based on their structural analysis of the open RyR1 structure 5tal, Chirasani et al. (2024) proposed that narrowing the gap between the EF-hand domain and S23 loop, resulting in H-bonding interactions between the EF-hand residue K4101 and the S23 loop residue D4730, and those between the EF-hand residues E4075, Q4076, D4079 and the S23 loop residue R4736, is a consequence of the binding of Ca2+ to the EF-hands. However, our PDBePISA analysis revealed a similar number of interactions between the EF-hand region and the S23 loop not only in open and inactivated but also in primed RyR1 structures (Figure 3). The presence of EF hand-S23 hydrogen bonds in the primed and open RyR1 structures suggests that the proximity of the EF-hand domain and S23 loop is a structural trait distinguishing RyR1 from RyR2, not a consequence of Ca2+ binding to the EF hand."

      The data and ideas of the illuminating work of Nayak et al. (2024) were discussed and referred to in the manuscript in several places, originally lines 74, 77, 164 (Introduction), 311, 340 (Results), 892-893, and 971 (Discussion). To broaden consideration of this work, we have expanded the discussion of this paper by adding the text shown in bold into the Introduction: "Recent studies reporting RyR structure at a high divalent ion concentration provide only indirect support for the molecular mechanism of Ca2+/Mg2+-dependent inactivation. Wei et al. (2016) and Nayak et al. (2024) observed a change in the conformation of the RyR1 EF-hands in the presence of 100 µM Ca2+ and 10 mM Mg2+, respectively, compared to low-calcium or low-magnesium conditions." (lines 135-138) and in the Discussion (lines 889-891): "The recent RyR1 structure 7umz (Nayak et al., 2024) provided evidence of Mg2+ ion bound in the RyR activation site, thus confirming the functional studies that established competition between Ca2+ and Mg2+ at this activation site (Laver et al., 1997; Zahradnikova et al., 2003; Zahradnikova et al., 2010)."

      Reviewer 3:

      Minor comment: While I am not an expert in allosteric model construction and therefore cannot fully assess their methodological approach, I observed that the authors fixed a number of parameters to achieve model convergence. A more detailed explanation of the rationale behind these fixed parameters would enhance clarity. Currently, these parameters are not clearly specified in the text and are somewhat obscured by the broader description of all parameters included in the model.

      We thank the reviewer very much for this comment, which made us realize that the relevant sections were written in too technical a manner, without sufficient explanation of the ideas behind the derivation and optimization of the model. To clarify the rationale of this process, we have rewritten the subsection "Derivation of the model open probability equation" and the section "Description of RyR operation by the COI model". In the subsection "Derivation of the model open probability equation", we have explained the simplification of the full set of equations (Eqs. 3A-C) into Eqs. 4A-C (lines 642 - 666). In the section "Description of RyR operation by the COI model", we have explained the extent of over-parametrization and the rationale of reducing it by three methods: combining the data into groups with common parameter values; eliminating parameter interdependence by fixation of one parameter at a preset value taken from the literature or postulated a priori; and sharing parameter values between data groups when no significant difference between these values was observed (lines 683-685, 702-710, 719-740).

      We hope that these changes make the manuscript more comprehensible.

      REFERENCES

      Chi, X., D. Gong, K. Ren, G. Zhou, G. Huang, J. Lei, Q. Zhou, and N. Yan. 2019. Molecular basis for allosteric regulation of the type 2 ryanodine receptor channel gating by key modulators. Proceedings of the National Academy of Sciences of the United States of America. 116:25575-25582.

      Chirasani, V.R., M. Elferdink, M. Kral, J.S. Carter, S. Heitmann, G. Meissner, and N. Yamaguchi. 2024. Structural and functional interactions between the EF hand domain and S2-S3 loop in the type-1 ryanodine receptor ion channel. The Journal of biological chemistry. 300:105606.

      Cholak, S., J.W. Saville, X. Zhu, A.M. Berezuk, K.S. Tuttle, O. Haji-Ghassemi, F.J. Alvarado, F. Van Petegem, and S. Subramaniam. 2023. Allosteric modulation of ryanodine receptor RyR1 by nucleotide derivatives. Structure. 31:790-800 e794.

      Dashti, A., G. Mashayekhi, M. Shekhar, D. Ben Hail, S. Salah, P. Schwander, A. des Georges, A. Singharoy, J. Frank, and A. Ourmazd. 2020. Retrieving functional pathways of biomolecules from single-particle snapshots. Nature communications. 11:4734.

      Gomez, A.C., T.W. Holford, and N. Yamaguchi. 2016. Malignant hyperthermia-associated mutations in the S2-S3 cytoplasmic loop of type 1 ryanodine receptor calcium channel impair calcium-dependent inactivation. American journal of physiology. 311:C749-C757.

      Hadiatullah, H., Z. He, and Z. Yuchi. 2022. Structural Insight Into Ryanodine Receptor Channelopathies. Frontiers in pharmacology. 13:897494.

      Laver, D.R., T.M. Baynes, and A.F. Dulhunty. 1997. Magnesium inhibition of ryanodine-receptor calcium channels: Evidence for two independent mechanisms. J.Membrane.Biol. 156:213-229.

      Lin, Y.F., C.W. Cheng, C.S. Shih, J.K. Hwang, C.S. Yu, and C.H. Lu. 2016. MIB: Metal Ion-Binding Site Prediction and Docking Server. Journal of chemical information and modeling. 56:2287-2291.

      Lu, C.H., C.C. Chen, C.S. Yu, Y.Y. Liu, J.J. Liu, S.T. Wei, and Y.F. Lin. 2022. MIB2: metal ion-binding site prediction and modeling server. Bioinformatics. 38:4428-4429.

      Lu, C.H., Y.F. Lin, J.J. Lin, and C.S. Yu. 2012. Prediction of metal ion-binding sites in proteins using the fragment transformation method. PLoS One. 7:e39252.

      Mackrill, J.J. 2022. Evolution of the cardiac dyad. Philosophical transactions of the Royal Society of London. Series B, Biological sciences. 377:20210329.

      Melville, Z., K. Kim, O.B. Clarke, and A.R. Marks. 2022. High-resolution structure of the membrane-embedded skeletal muscle ryanodine receptor. Structure. 30:172-180 e173.

      Nayak, A.R., W. Rangubpit, A.H. Will, Y. Hu, P. Castro-Hartmann, J.J. Lobo, K. Dryden, G.D. Lamb, P. Sompornpisut, and M. Samso. 2024. Interplay between Mg(2+) and Ca(2+) at multiple sites of the ryanodine receptor. Nature communications. 15:4115.

      Nayak, A.R., and M. Samso. 2022. Ca(2+) inactivation of the mammalian ryanodine receptor type 1 in a lipidic environment revealed by cryo-EM. eLife. 11.

      Rossi, D., A.M. Scarcella, E. Liguori, S. Lorenzini, E. Pierantozzi, C. Kutchukian, V. Jacquemond, M. Messa, P. De Camilli, and V. Sorrentino. 2019. Molecular determinants of homo- and heteromeric interactions of Junctophilin-1 at triads in adult skeletal muscle fibers. Proceedings of the National Academy of Sciences of the United States of America. 116:15716-15724.

      Samso, M. 2017. A guide to the 3D structure of the ryanodine receptor type 1 by cryoEM. Protein science : a publication of the Protein Society. 26:52-68.

      Wang, J., A. Jain, L.R. McDonald, C. Gambogi, A.L. Lee, and N.V. Dokholyan. 2020. Mapping allosteric communications within individual proteins. Nature communications. 11:3862.

      Wei, R., X. Wang, Y. Zhang, S. Mukherjee, L. Zhang, Q. Chen, X. Huang, S. Jing, C. Liu, S. Li, G. Wang, Y. Xu, S. Zhu, A.J. Williams, F. Sun, and C.C. Yin. 2016. Structural insights into Ca(2+)-activated long-range allosteric channel gating of RyR1. Cell research. 26:977-994.

      Zahradnikova, A., M. Dura, I. Gyorke, A.L. Escobar, I. Zahradnik, and S. Gyorke. 2003. Regulation of dynamic behavior of cardiac ryanodine receptor by Mg2+ under simulated physiological conditions. American journal of physiology. 285:C1059-1070.

      Zahradnikova, A., I. Valent, and I. Zahradnik. 2010. Frequency and release flux of calcium sparks in rat cardiac myocytes: a relation to RYR gating. The Journal of general physiology. 136:101-116.

      Zheng, W., and H. Wen. 2020. Investigating dual Ca(2+) modulation of the ryanodine receptor 1 by molecular dynamics simulation. Proteins. 88:1528-1539.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      We sincerely thank the reviewers for their comprehensive and constructive feedback.

      Reviewer #1

      Major comments:

      1. The data and key conclusions of the paper are convincing. However, the reliability of the findings in terms of the new interaction could be improved by not relying solely on proximity ligation approaches (BioID, PLA), but employing a complementary biochemical strategy. The authors state that an immunoprecipitation (IP) was not possible due to a lack of antibodies for IP. This does not seem convincing, since in the paper by Saito-Diaz et al., which they cite, commercial antibodies were used to immunoprecipitate APC. Alternatively, the cell line expressing tagged ROBO1 could be used together with endogenous or tagged APC for a biochemical interaction experiment.

      Response

      We thank the reviewer for this important suggestion. In our initial studies, we attempted co-immunoprecipitation (co-IP) experiments using several different antibodies directed to APC. The signal detected was very low, possibly reflecting relatively low endogenous expression of ROBO1 in COS-7 cells, technical challenges associated with co-IP of APC and ROBO1, which are both large proteins (>200 kDa), and/or transient interactions between the two proteins. As part of the revision plan we will carry out co-IP experiments using HEK293A cells stably expressing full length ROBO1 (5H9 cells).

      Regarding the PLA experiment, I was very surprised by the very strong labeling for Clathrin+ROBO1 shown in the representative image. It is hard to believe that this image is representative when the average number of dots in the quantification is about 100. From the image it is also hard to see how it would be possible to quantify individual dots. For this, a zoom would be helpful.

      Response

      We thank the reviewer for this helpful comment. In the revised manuscript, we have added a magnified panel to Figure 4E.

      Clathrin and ROBO1 are likely not even direct interactors but come together by their common interaction with AP2. Therefore, to back this surprisingly strong result up, I would recommend to include one more control such as another rabbit antibody recognizing a protein that does not associate with clathrin or use e.g. the ROBO1 wildtype vs the ROBO1 mutant, that does not bind AP2 and therefore should also not associate with clathrin, for the experiment. Even better, the authors could confirm the PLA results by the mentioned complementary biochemical experiments to bolster the findings by an independent approach.

      Response

      We thank the reviewer for this suggestion. As recommended, we will use the complementary biochemical approaches suggested, and will perform immunoprecipitation experiments to examine interactions with clathrin in cells that express wildtype ROBO1 vs. cells that express mutant ROBO1 that does not bind AP2. We will further perform experiments using a control antibody directed to a protein that does not associate with clathrin.

      Minor comments:

      In general, data and methods are presented in a manner that should make them reproducible by others. Some small things to improve are:

      1. In the paragraph on antibodies the used concentrations for the different applications should be provided.

      Response

      We thank the reviewer for this suggestion and apologize for the omission. In the revised manuscript, we have added a supplementary table to clarify the concentrations of antibodies used for different experimental applications. Please see Table S1.

      It should be described how the poly-D-lysine coating was exactly performed.

      Response

      We thank the reviewer for this comment. In the revised manuscript, we have added the procedure for poly-D-lysine coating in the "Materials and Methods" section. Please see page 7 line 143-144.

      The statistical analysis looks adequate. There are just some minor things that should be specified:

      • Just to make sure: Is it really always SD which is provided and not SEM? Sometimes the error bars look so small that I was wondering about this.

      Response

      We appreciate the opportunity to clarify that we used SD consistently in the manuscript.
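
      For readers comparing the two kinds of error bar, the relationship between them is SEM = SD / sqrt(n), so SEM bars are always narrower than SD bars for n > 1. A minimal sketch with hypothetical measurements (the values are illustrative and not taken from the manuscript):

```python
import math
import statistics


def sd_and_sem(values):
    """Return the sample standard deviation and standard error of the mean."""
    sd = statistics.stdev(values)        # sample SD, n - 1 in the denominator
    sem = sd / math.sqrt(len(values))    # SEM shrinks as the sample size grows
    return sd, sem


measurements = [98.0, 102.0, 100.0, 104.0, 96.0]  # hypothetical data
sd, sem = sd_and_sem(measurements)
```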

      • It should be specified for each experiment which post-hoc test is used or stated that one is always used for the One-Way ANOVA and the other for the Two-Way ANOVA resp. a rationale should be provided why two different post-hoc tests are used.

      Response

      We have added the post hoc tests used for each assay in the figure legend. The rationale for the different post hoc tests used has also been added in the "Materials and Methods" section as "Two-tailed paired Student's t-test was used for two-group comparisons. One-way ANOVA followed by Tukey's post hoc multiple comparison test was used for multiple-group comparisons with a single independent variable, and two-way ANOVA followed by Sidak's post hoc multiple comparison test was used for multiple-group comparisons with two independent variables". Please see page 12 line 275-279, page 20 line 519-520, page 21 line 524-525, 533-534, 537-538, 540-541, 543-544, page 22 line 551-552, 555-556, 561-562, 576-577, page 23 line 582, 584-585, 587, 589-590, 595, 599, page 24 line 624-625, 629, 635-636, page 25 line 638-639.

      • When using the t-test, it should be stated whether it is paired or unpaired and one- or two-tailed.

      Response

      Two-tailed paired Student's t-test was used in Fig. 5C. We have added this information to the "Materials and Methods" section and the figure legend in the revised manuscript. Please see page 12 line 275-276, page 23 line 587.
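
      As an illustration of what the paired test computes, the sketch below derives the t statistic from the per-pair differences, t = mean(d) / (SD(d) / sqrt(n)), with n - 1 degrees of freedom. The data and the function name are hypothetical; the two-tailed p-value would then be read from Student's t distribution, for example with `scipy.stats.ttest_rel` if SciPy is available.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired Student's t-test."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)            # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))     # mean difference over its SEM
    return t, n - 1


control = [10.0, 12.0, 11.0, 13.0]   # hypothetical paired measurements
treated = [12.0, 13.0, 14.0, 15.0]
t_stat, dof = paired_t_statistic(control, treated)
```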

      • It should be stated whether it was tested that the data fulfill the requirements for parametric tests (normal distribution).

      Response

      We have added "The data fulfilled the requirements for normal distribution using the Shapiro-Wilk test" in the "Materials and Methods" section in the revised manuscript. Please see page 12 line 274-275 in the revised manuscript.

      Text and figures are mostly clear, apart from some small things:

      • I was wondering about figure 1B. If I understand the methods description right, all cells were permeabilized prior to secondary antibody application. Why then is so little fluorescence for Flag visible in the first PBS row at 30 min? That would only make sense for me if the cell was not permeabilized and the protein internalized. So where did the majority of the protein end up after 30 min since you should see the entire population in a permeabilized cell? Could you please comment on this?

      Response

      We thank the reviewer for this comment. The cells were permeabilized prior to secondary antibody application. NSLIT2 binding to ROBO1 can facilitate ADAM10-mediated cleavage of ROBO1, which releases the extracellular domain of ROBO1 (Coleman et al., 2010); this may explain why little Flag fluorescence is visible in the first PBS row at 30 min. In the revised manuscript we have added a comment on this finding. Please see page 13 line 293-296.

      • In Fig. 2A, the upper left image (0 min PBS) should be very similar to the upper left image in Fig. 1B, shouldn't it? But it looks quite different to me in terms of surface amount of ROBO1-Flag. Could you please comment on this?

      Response

      We apologize for the confusing images included in the original version of the manuscript. As noted, the upper left image (0 min PBS) in Fig. 2A should be very similar to the upper left image in Fig. 1B. We have now instead included an image for Fig. 2A that is more representative of the data from the experiments we performed.

      • Please explain what the molecular difference between bio-active NSLIT2 and bio-inactive CSLIT2 is. Please provide a rationale why you sometimes use CSLIT2 as negative control and sometimes DD2SLIT2. In Fig. 3G you are using DD2SLIT2. Even though there is no significance reached with the analyzed n, it is very striking that the bars are consistently higher upon DD2SLIT2 application. Can you comment on this effect? Or am I misunderstanding the labeling of the figure?

      Response

      Bio-active NSLIT2 consists of the N-terminal fragment of SLIT2 and contains the second leucine-rich repeat (LRR) domain (D2), which binds to the first two Ig domains of the ROBO1 receptor (Ig1-2). Bio-inactive CSLIT2 consists of the C-terminal fragment of SLIT2, which does not bind ROBO1. DD2SLIT2 consists of the N-terminal fragment of SLIT2 but lacks the D2 LRR domain that is essential for ROBO1 binding. Neither CSLIT2 nor DD2SLIT2 can bind the ROBO1 receptor (Bhosle et al., 2020; Mukovozov et al., 2015; Patel et al., 2012). In Fig. 3G, DD2SLIT2 was used as a negative control and did not affect cell spreading, so the bars are consistently higher upon DD2SLIT2 application. The use of CSLIT2 or DD2SLIT2 in different experiments was due to the availability of these reagents. In Fig. 3F and 3G, we have modified the X axis labels to clarify this.

      • On page 3 it states "...endocytosis of ROBO1...requires...APC": I found this confusing since it is the dissociation of APC that is required for promoting endocytosis. Therefore, it would be good to rephrase this sentence.

      Response

      We apologize for the confusing language. In the revised manuscript, we have changed "endocytosis of ROBO1 from the cell surface requires the tumor suppressor protein, APC" to "endocytosis of ROBO1 from the cell surface requires the dissociation of the tumor suppressor protein, APC". Please see page 4 line 35-36.

      • On page 8 is written "...cells surface ROBO1 [is] removed". Please be more accurate since the acid wash does not remove ROBO1, but only the antibody bound to the extracellular epitope.

      Response

      We apologize for the confusing language. In the revised manuscript, we have changed "cell surface ROBO1 removed" to "anti-Flag antibody binding ROBO1 removed from the cell surface". Please see page 8 line 153-154.

      • On page 8 provide an explanation for the abbreviation HAC.

      Response

      To enhance clarity, in the revised manuscript we have used the full name "acetic acid" instead of using the abbreviation "HAC". Please see page 8 line 155.

      • On page 15 you speak of "mutant AP2". Please be more accurate since there is no mutant AP2 involved, but you are referring to ROBO1 with mutations in its AP2 binding motifs.
      • On page 14 you speak of "cells expressing the mutant alleles of AP2". As above, please be more accurate and replace with "cells expressing ROBO1 harboring mutations in both AP2 binding sites".

      Response

      We thank the reviewer for this suggestion and apologize for the confusion. For the sake of accuracy, we have made the changes as suggested by the reviewer. Please see page 15 line 351 and page 14 line 331-332.

      • On page 19 you write: "Using proximity ligation assays, we observed that ROBO1, APC and clathrin interact with one another". I am maybe a bit picky here, but in my eyes with these assays you only show that they are very close together and might be in a complex, but you do not show (direct) interaction in a strict sense. Therefore, I would tone this down a bit.

      Response

      We thank the reviewer for this important comment. As suggested, in the revised manuscript, we replaced "Using proximity ligation assays, we observed that ROBO1, APC and clathrin interact with one another" with "Using proximity ligation assays, we observed that ROBO1, APC and clathrin are in close proximity to one another". Please see page 18 line 458. We have similarly amended the language throughout the manuscript. Please see page 3 line 11-12, page 4 line 37, page 16 line 391, 394, 396, 398, page 22 line 564.

      • In Fig. 5B I would find it easier for the reader if siRNA and control were shown side by side for the different conditions.

      Response

      In the revised manuscript, we have made the changes suggested by the reviewer to enhance clarity.

      • Between the internalization assays and the spreading assays, you switch from HEK293 cells to COS7 cells. Please provide a rationale for this for the reader.

      Response

      Because the endogenous expression of ROBO1 is relatively low in COS-7 cells, we generated a HEK293A cell line that stably expresses ROBO1, and used these cells to examine subcellular traffic of ROBO1 and explore interactors of ROBO1. We next sought to explore the functional consequences of internalization of ROBO1 and the functional role of APC. As we and others previously showed that SLIT2-ROBO1 signaling inhibits cell spreading (Bhosle et al., 2020; Patel et al., 2012; Tole et al., 2009), we elected to use this measure as a biologic read-out. Because HEK293A cells do not spread as much as COS-7 cells, we instead used COS-7 cells for the spreading assays.

      • You provide a table with putative interactors within the paper and as a supplementary table. Could you please explain better to the reader what your criteria were for including hits in the "short-list" presented in Table 1?

      Response

      We chose proteins based on two criteria. The first was association with full-length ROBO1, but not with ROBO1 lacking the intracellular domain. The second was association with full-length ROBO1 under basal conditions, but loss of association with full-length ROBO1 after exposure of cells to NSLIT2. We have added these criteria to the revised manuscript. Please see page 15 line 363-366.

      Typos - p. 6: CO₂ instead of CO2

      • p21 last line: Immunoblotting should not be capitalized.

      • Figure s1 legend: full-lenth is missing a g

      Response

      We apologize for the oversight. In the revised manuscript, we have corrected these typos. Please see page 6 line 97, page 21 line 535 and page 24 line 611.

      Significance

      It was already known from Drosophila and for mammalian cells that SLIT2 induces the endocytosis of ROBO1 and that this is necessary for its repulsive function in axon guidance as the authors point out. The key advance of the study is the identification of APC as an interactor of ROBO1 which decreases its endocytosis until it dissociates upon SLIT2 binding to ROBO1. This is an interesting aspect which opens up parallels to the regulation of Wnt signaling by APC as the authors discuss. The significance of this finding would be even greater if it had been shown that this mechanism actually operates in axon guidance. That not being the case, the authors might want to discuss in more detail whether APC has previously been implicated in axon guidance.

      Researchers working on endocytosis, adhesion, cellular signaling and the development of the nervous system will be interested in these findings.

      Response

      We thank the reviewer for the positive comments regarding the significance of our findings. As recommended, in the discussion section of the revised manuscript we will discuss in more detail what is known about the role of APC in axon guidance.

      Reviewer #2

      Major comments:

      1. As the authors emphasize the role of NSlit2 in Robo1 internalization throughout their manuscript, I suggest authors include "NSlit" in their title. Something like this "Adenomatous polyposis coli (APC) regulates the NSlit2-induced internalization and signaling of the chemo repellent receptor, hRoundabout (ROBO) 1" or maybe a better title.

      Response

      As suggested, we have changed the title of the revised manuscript to "Adenomatous polyposis coli (APC) regulates the NSLIT2-induced internalization and signaling of the chemorepellent receptor, Roundabout (ROBO) 1".

      In addition to transferrin as the control for their internalization studies, have the authors tested the specificity of NSlit-2-induced internalization with other Robo receptors such as Robo2? Does the APC bind to Robo2 also?

      Response

      We thank the reviewer for this comment. Due to significant cost constraints, we focused our BioID experiments on identifying proteins that interact with ROBO1. In the revised manuscript, we will expand the discussion to consider the questions raised here by the reviewer.

      The N-Slit group at 0' in Figure 1 b and Figure 2a, the Flag-Robo staining looks very different. Is it because the authors did not use ADAM protease inhibitor in Figure 2a that's why they are seeing more internalized Flag-Robo at 0'? It is not very clear either in the Results or the legend.

      Response

      We apologize for the confusing images. We used ADAM protease inhibitor for all endocytosis assays, as mentioned in the "Materials and Methods" section. The upper left image (0 min PBS) in Fig. 2A should be very similar to the upper left image in Fig. 1B. We have now replaced the image in 2A with one that is more representative of the overall results.

      Have the authors tested the Surface Robo1 pool in siAPC cells induced with or without N-Slit2?

      Response

      We added NSLIT2 to the cells at the start of the endocytosis assay. At the 0 min time point, the surface ROBO1 pool was not affected by NSLIT2.

      Does the Robo1 mutated with AP2 binding motifs interact with APC? Have authors performed a Proximity ligation assay with AP2-binding motifs mutated Robo1 and APC?

      Response

      We thank the reviewer for this suggestion. As recommended, we will perform proximity ligation assays to examine interactions between APC and ROBO1 which lacks AP2-binding motifs.

      The resolution of PLA dots in the current version is very low. Authors should include higher magnification pictures for these interactions and also PLA dots channel should be separately represented in addition to the DAPI merged images for better clarity and interpretation.

      Response

      We thank the reviewer for these suggestions. In the revised manuscript, we have included figures with the recommended modifications to enhance clarity. Please see figure 4A, 4C and 4E.

      Do the Slit2 treated cells affect APC mRNA expression? Or does Slit2 only inhibit the interaction between APC and Robo1? Have the authors tested the mRNA expression of APC in slit2-treated and untreated cells?

      Response

      We thank the reviewer for this question. We will perform the experiments suggested and include the results in the revised manuscript.

      The authors have tested the effect of Slit2-induced inhibition of cell spreading under different experimental conditions however it is also important to test the cell migration/proliferation rates under control and siAPC conditions with or without Slit2 treatment.

      Response

      We thank the reviewer for this comment. In order to test the effect of APC on SLIT2-induced cell migration, a migratory cell type would be required. This would involve introducing a third cell type in addition to the HEK293 and COS-7 cells we have already used, and first validating our key experimental findings in the new cell type. Please see our response to the 10th sub-comment in Minor Comment 4) of Reviewer 1.

      Do authors see the inhibition of Robo1 and Cyfip interactions also in the presence of Slit2 by PLA assay?

      Response

      We thank the reviewer for this interesting question. As this was beyond the scope of the current study, we did not examine whether SLIT2 inhibits interactions between ROBO1 and CYFIP. In the Discussion section of the revised manuscript, we will address this question as a potential line of future investigation.

      Studying the endogenous Robo1 and APC interaction by PLA is good, but I suggest the authors do standard co-IP assays to visualize these interactions, since they have already generated a variety of general epitope-tagged constructs for both Robo1 and APC. Epitope-specific antibodies that are suitable for IP are easily available from many antibody companies. This is the first study to suggest an interaction between Robo1 and APC, so strong biochemistry would have a good impact on the findings.

      Response

      We appreciate this important suggestion. We will perform the recommended studies and include the results in the revised manuscript. Please also see our response to Reviewer 1, Major Comment 1).

      Minor comments:

      1. I suggest the authors show the single-channel images of Flag-robo (green) in Figure 2B for a clear visualization of internalized Robo in a cell. With DAPI-merged images, it is hard to specifically visualize Robo in these cells.

      Response

      We assume the reviewer was referring to Figure 2A instead of 2B. To enhance clarity, in the revised manuscript we have made the changes suggested by the reviewer.

      In Figure 1C, the Y axis should have a clear indication. Instead of "% internalized" it should be mentioned as "% Internalized Robo1".

      Response

      We thank the reviewer for this suggestion and apologize for the oversight. In the revised manuscript, we have made the suggested change in Figure 1C, 2B, 2D, 2E, 5B and 5D.

      I suggest the authors include a simple schematic of the mechanism they are proposing in the manuscript.

      Response

      We thank the reviewer for the suggestion. To enhance clarity, in the revised manuscript we will include a simple schematic of the mechanism our findings suggest.

      The authors should mention the rationale or the function of using the acid wash method for their experimental conditions for a better understanding of the reader.

      Response

      We thank the reviewer for this suggestion and apologize for the oversight. We performed acid wash experiments to remove the anti-Flag antibody that binds ROBO1 from the cell surface for the endocytosis assay. To increase the clarity, in the "Materials and Methods" section of the revised manuscript we have included the rationale for using acid wash. Please see page 8 line 153-154.

      siRNA-mediated knockdown of specific genes should be correctly denoted in the figure. For example, instead of "CLTC", it should be "siCLTC" for easy understanding. The same correction has to be done in all the figures with siRNA data.

      Response

      We thank the reviewer for this helpful comment and apologize for the oversight. As suggested, we have made these changes throughout the revised manuscript and in Figure 2C, 2D, 5A, 5B, 6A, 6B, 6C, s2C and s2D.

      References

      Bhosle, V.K., Mukherjee, T., Huang, Y.W., Patel, S., Pang, B.W.F., Liu, G.Y., Glogauer, M., Wu, J.Y., Philpott, D.J., Grinstein, S., et al. (2020). SLIT2/ROBO1-signaling inhibits macropinocytosis by opposing cortical cytoskeletal remodeling. Nat Commun 11, 4112.

      Coleman, H.A., Labrador, J.P., Chance, R.K., and Bashaw, G.J. (2010). The Adam family metalloprotease Kuzbanian regulates the cleavage of the roundabout receptor to control axon repulsion at the midline. Development (Cambridge, England) 137, 2417-2426.

      Mukovozov, I., Huang, Y.W., Zhang, Q., Liu, G.Y., Siu, A., Sokolskyy, Y., Patel, S., Hyduk, S.J., Kutryk, M.J., Cybulsky, M.I., et al. (2015). The Neurorepellent Slit2 Inhibits Postadhesion Stabilization of Monocytes Tethered to Vascular Endothelial Cells. J Immunol 195, 3334-3344.

      Patel, S., Huang, Y.W., Reheman, A., Pluthero, F.G., Chaturvedi, S., Mukovozov, I.M., Tole, S., Liu, G.Y., Li, L., Durocher, Y., et al. (2012). The cell motility modulator Slit2 is a potent inhibitor of platelet function. Circulation 126, 1385-1395.

      Tole, S., Mukovozov, I.M., Huang, Y.W., Magalhaes, M.A., Yan, M., Crow, M.R., Liu, G.Y., Sun, C.X., Durocher, Y., Glogauer, M., et al. (2009). The axonal repellent, Slit2, inhibits directional migration of circulating neutrophils. Journal of leukocyte biology 86, 1403-1415.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility, and clarity)

      This manuscript by Tsai et al. shows that phage resistance mutations (LPS truncation) confer a cost during interbacterial competition. The authors show that various phage resistant mutants of S. enterica are inhibited by E. cloacae in a contact-dependent manner (on a solid surface but not in liquid). Further experiments showed that this inhibition of S. enterica was mediated by T6SS in E. cloacae. The authors then dissect which parts of the LPS are required for resistance against T6SS attacks and show that a similar resistance is conferred against T6SS of B. thailandensis and C. rodentium. Moreover, the authors show that enzymatic degradation of LPS by a phage enzyme can also increase sensitivity to T6SS (including when such enzymes are on phage particles). Finally, the authors suggest that the change in the thickness of the LPS surface layer could be the reason for changes in T6SS susceptibility. Overall, the manuscript is very well-written. The experiments and controls are explained in sufficient detail and in a logical order. The figures are clear and easy to navigate. The findings are very interesting and important for the T6SS field but also for general understanding how different evolutionary pressures combine and influence each other. I believe that this manuscript will initiate further research in this direction.

      • We thank the reviewer for their positive remarks on our manuscript and the valuable suggestions for its improvement.

      Major comments

      The only major point that I would like to raise is that I am not generally convinced that the 2 nm difference in the thickness of LPS is the main reason for the observed differences in T6SS-mediated killing of S. enterica. Based on what we know about T6SS mode of action, we expect that it is potentially pushing effectors by up to several hundreds of nanometers. Therefore, the change in the LPS thickness by a few nanometers (as measured by AFM) seems insufficient to provide enough spacing between the attacker and the prey to significantly decrease T6SS effector delivery. While it is clear that understanding the exact reason for the LPS mediated resistance is beyond the scope of this manuscript, I would suggest that the authors consider the fact that T6SS is known to deliver proteins even to the cytoplasm of target gram-negative cells and discuss the mode of action of the machine in the context of their finding. If the T6SS was drawn to scale in the model figure, it would become apparent that 2 nm change in the distance between two cells has probably no major impact on killing by T6SS and the actual reason for the observed phenotype is likely more complicated than what is proposed.

      We appreciate the reviewer's comments and acknowledge that our manuscript leaves open questions regarding the exact mechanisms underlying LPS-mediated resistance. We have now moderated the Discussion in our revised manuscript to reflect the complexity of this phenomenon (Lines 410-423). Although we agree that the nanometer difference in LPS thickness may not fully explain the observed protective phenotype, we believe it remains a plausible contributing factor that is worth considering.

      To fully understand how LPS influences T6SS effector delivery, future studies will need to address key mechanistic questions regarding the T6SS injection process. For example, 1) how deeply does the T6SS apparatus penetrate the target Gram-negative cells during injection; 2) what is the magnitude of the injection force generated by the T6SS; and 3) does the structural integrity of the T6SS apparatus remain intact throughout and after contraction? While it is well documented that some T6SS effectors act in the cytosol of target cells, there is evidence to suggest that cytosolic effectors are initially delivered into the periplasm and subsequently translocated into the cytosol for intoxication¹,². Furthermore, although contraction of the T6SS apparatus occurs within milliseconds³,⁴, this rapid action does not preclude the possibility that the injection force could be influenced by the thickness of the LPS layer. In addition, the stability of T6SS structural or delivered proteins (such as PAAR, VgrG, and Hcp) within the delivery complex might be compromised upon encountering physical barriers such as the LPS layer and the outer membrane of target cells. These potential interactions could affect the efficiency of effector delivery, leading to reduced competitiveness during interbacterial antagonism, as shown in our study.

      • We appreciate the reviewer's suggestions and acknowledge that the precise reasons for LPS-mediated resistance likely involve a combination of factors beyond those proposed here. We are actively pursuing these questions as part of an ongoing, long-term effort to better elucidate the mechanisms of T6SS action.

      Minor comments

      Specify which T6SS of B. thailandensis was tested.

      • We now cite studies by Schwarz, S., et al., 2010⁵ and LeRoux, M., et al., 2015⁶, from which we used the tssM (BTH_I2954) gene deletion strain abrogating the T6SS-1 of B. thailandensis E264 (Line 234, Supplementary Table 1).

      Use a different naming of the two strains used in competition assays than "donor" and "recipient".

      • Thank you for this suggestion. In the revised manuscript, we have replaced the terms "donor" and "recipient" with "attacker" and "prey" for clarity. This change has been applied to the text (Lines 441 and 649-667) and to revised Figures 2c-h, Figures 3b, d, g, i, j, Figures 4f, g, Figures 5b, e, g, h, Supplementary Figures 3d-f, and Supplementary Figures 4b-d.

      Indicate in the material and methods ODs of bacterial mixtures used in the "Bacterial competition assays".

      • We apologize for this oversight. The ODs of bacterial mixtures used in the "Bacterial competition assays" have now been specified in the revised Methods section (Line 651).

      Reviewer #1 (Significance)

      This manuscript is interesting for researchers who study T6SS, phage predation and other evolutionary pressures shaping bacterial interactions. The work provides new and interesting insights. My expertise in LPS biology is limited.

      • We sincerely appreciate the reviewer's interest in and support of our study.

      Reviewer #2 (Evidence, reproducibility, and clarity)

      This work investigates the fitness trade-offs in Salmonella enterica resistant to phages. The authors performed co-culture experiments with S. enterica, E. coli, and E. cloacae and found that phage-resistant S. enterica strains displayed reduced fitness in the presence of E. cloacae. Further experiments demonstrated that phage-resistant S. enterica strains were more susceptible to the type VI secretion system (T6SS) of E. cloacae. The authors then examined the role of the O-antigen of lipopolysaccharide (LPS) in T6SS-mediated interbacterial antagonism. By constructing S. enterica mutants with varying O-antigen chain lengths, the authors demonstrated that the O-antigen protects S. enterica from T6SS attack. They then demonstrated that the O-antigen-deficient S. enterica, E. coli, and C. rodentium strains were more susceptible to T6SS attack by E. cloacae. Finally, the authors showed that phage tail spike proteins (TSPs) with endoglycosidase activity could cleave the bacterial O-antigen, thereby increasing susceptibility to T6SS attack.

      The study is well-designed and the experiments are well-executed. The findings are significant and have implications for the understanding of microbial community dynamics.

      • We thank the reviewer for their positive comments regarding our original submission.

      Major comments

      While the study elegantly demonstrates the link between phage resistance, LPS structure, and T6SS susceptibility, we must remember that these LPS-defective strains are likely at a significant disadvantage in real-world environments without the influence of competing bacteria. Whether it's the gut or external environments, Salmonella needs its LPS for protection against a myriad of host and environmental factors. It seems a bit redundant for T6SS mediated antagonism to select for LPS structures when those structures are essential for bacterial survival outside of this very specific context. It would benefit some discussion about the likelihood of these phage-resistant, LPS-defective strains actually persisting and competing effectively in a more natural setting.

      • We thank the reviewer for their insightful comments and appreciate the opportunity to clarify this point. We agree that LPS-defective bacterial strains face significant disadvantages in natural environments, where they must contend with various host and environmental stresses. Consequently, we did not intend to suggest that T6SS-mediated antagonism is the primary driving force in selecting specific LPS structures. Rather, our study highlights an additional role for LPS during interbacterial interactions, complementing its well-established functions. This notion aligns with the hypotheses proposed in prior studies⁷⁻⁹.

      The reviewer's comments raise an intriguing question about the essentiality of LPS in Gram-negative bacteria under natural conditions. During our revision process, we identified several examples in the literature demonstrating that LPS may not always be indispensable. For instance, LPS-depleted Neisseria meningitidis strains with an early block in lipid A biosynthesis have been shown to remain viable¹⁰,¹¹. These strains may possess adaptive advantages under specific circumstances¹². Similarly, some pathogenic bacteria produce truncated LPS structures lacking O-antigen or introduce modified LPS to evade host immune responses¹³. Additionally, evolutionary pressures, such as phage predation, often drive mutations in O-antigen biosynthesis pathways, resulting in alterations to or an absence of O-antigen¹⁴. Furthermore, recent studies have also indicated that trade-offs between abiotic and biotic stresses can influence LPS integrity. For instance, LPS-deficient strains may exhibit selective advantages in extreme environments¹⁵,¹⁶. These findings underscore the context-dependent nature of LPS functionality and its potential dispensability in certain ecological niches.

      We sincerely appreciate the reviewer's thought-provoking comments. Our current study aims to provide evidence for the role of interbacterial antagonism as an additional factor influencing LPS integrity. However, we did not mean to overstate the contribution of this mechanism. Instead, we only seek to contribute to a broader understanding of the multifaceted functions of LPS in bacterial survival and adaptation. We have modified the Discussion in our revised manuscript to clarify this idea (Lines 453-466).

      Minor comments

      Figure 5 could be more effective if panels b and c are together

      • We appreciate this suggestion. We have revised the manuscript accordingly, so panels b and c have been combined in revised Figure 5, and the respective figure legends have been modified for improved clarity (Lines 810-823).

      69 Authors should define mucoid

      • The term "mucoid" has now been defined in the revised manuscript (Lines 69-70).

      155 Authors should explain that this result is expected since T6SS acts on solid surface while CDI works in liquid cultures

      • Thank you for this comment. Prior studies have demonstrated that while CDI-mediated antibacterial activity is less efficient in liquid environments, it can still occur on both solid surfaces and in liquid cultures, provided the competitors possess the necessary CdiA binding unit, such as BamA¹⁷,¹⁸. This understanding supports our initial hypothesis that T6SS and/or CDI contribute to the observed protective phenotype in S. enterica phage-resistant variants (Figure 2).

      Clarify what is meant by unicellular cultures. Should it be monocultures?

      • We apologize for this error and have now replaced "unicellular cultures" with "monocultures" in the revised manuscript (Lines 137, 180, and 258).

      618 add to the text how much dead phage was added per bacterial cell

      • Apologies for this oversight. The multiplicity of infection (MOI) describing the amount of inactivated phages used to treat bacterial cells has now been included in the revised Methods section (Line 661).

      364 references needed for "consistent with predictions for intact LPS structures "

      • We thank the reviewer for pointing out this omission. The relevant reference has now been added to the revised manuscript¹⁹ (Line 368).

      Reviewer #2 (Significance)

      This study offers a new perspective on the interplay between phage resistance and bacterial fitness in the context of microbial communities. While the concept of fitness trade-offs associated with antibiotic resistance is well-established, the authors extend this paradigm to phage resistance. They demonstrate that phage-resistant Salmonella enterica strains exhibit reduced fitness in the presence of Enterobacter cloacae due to increased susceptibility to the type VI secretion system (T6SS). This finding is significant as it highlights the potential for interbacterial antagonism to shape the evolution of phage resistance. The authors further show that the O-antigen of lipopolysaccharide (LPS) plays a crucial role in protecting S. enterica from T6SS attack. This observation provides mechanistic insights into the fitness trade-offs associated with phage resistance.

      The study's strength lies in its elegant experimental design and the comprehensive analysis of the interplay between phage resistance, T6SS susceptibility, and O-antigen structure. The authors employ a combination of co-culture experiments, genetic manipulations, and structural analyses to dissect the underlying mechanisms. The findings are robust and have implications for understanding the evolution of bacterial communities in the presence of phages and competing bacterial species.

      This research will be of interest to a broad audience, including researchers in microbiology, synthetic biology, and microbial ecology. The findings have implications for understanding the evolution of phage resistance, and the dynamics of microbial communities. The study's insights into the role of the O-antigen in T6SS susceptibility could also inform the design of novel antimicrobial strategies.

      My expertise is microbial physiology

      • We thank the reviewer for their positive remarks and careful reading of our manuscript.

      Reviewer #3 (Evidence, reproducibility, and clarity)

      Tsai et al. describe LPS biosynthesis mutants arising in selection for phage resistance that increase susceptibility to T6SS-mediated interbacterial antagonism. Phage-derived LPS degrading enzymes also contribute to T6SS susceptibility, which may be due to weakening of the physical barrier of LPS. The mechanisms of this fitness trade-off are elucidated with well-executed and presented experiments.

      • We are grateful to the reviewer for their kind words and critical reading of the manuscript.

      Major comments

      No major critiques.

      Minor comments

      Others have described two T6SS in Enterobacter cloacae ATCC 13047 (PMID 33072020). Please clarify which of the two are inactivated by the tssM deletion in this study and either provide compelling evidence that both are inactive or change the text throughout to indicate T6SS-1 or T6SS-2 being inactivated.

      • We thank the reviewer for this comment. In our study, we refer to the work by Whitney, J., et al., 2014 (ref. 20), from which we used the tssM (ECL_01536) gene deletion strain in which T6SS-1 of E. cloacae ATCC 13047 is abrogated. Consistent with this detail, we have now clarified in the revised manuscript (Line 155, Supplementary Table 1) that T6SS-1 is inactivated. Moreover, the reference suggested by the reviewer provides additional evidence that T6SS-1, but not T6SS-2, is involved in bacterial competition (ref. 21), which we also now specify in the revised manuscript.

      It seems the authors used EHEC EDL933, which has T6SS, in co-culture experiments (Figure 1C). Why do the authors think the S. enterica LPS mutants don't have a competitive disadvantage against EHEC? It seems to run counter to the conclusion that LPS is broadly protective against T6SS.

      • We thank the reviewer for raising this point. While it is true that EHEC O157:H7 strain EDL933 possesses a T6SS gene cluster in its genome, a prior study has shown that the T6SS in this strain appears to be inactivated under laboratory conditions, likely due to repression by the global regulator H-NS (ref. 22). Consistent with these findings, our data indicate that the S. enterica LPS mutants did not exhibit a competitive disadvantage against EHEC EDL933. These results support the conclusion that, under the conditions tested, the truncated LPS in S. enterica does not affect its fitness against EHEC (Figure 1c), likely due to the inactivity of the EHEC T6SS (ref. 22).

      It's not clear if the only Felix O1 and P22 phage-resistant transposon hits were in LPS-related genes, or if that pattern was observed in a more complete transposon sequencing dataset and selected for further study. A complete list of the sequence-identified hits, including the non-LPS related variants, would help clarify this and provide a useful resource to the research community.

      • We thank the reviewer for the opportunity to clarify this point. For each phage, we initially isolated nine phage-resistant transposon variants, which were subsequently used for co-culture assays and transposon insertion site identification, as described in the original manuscript (Figure 1a and Supplementary Figure 2a). We agree with the reviewer that a broader screening approach could reveal non-LPS-related variants and provide a more comprehensive resource for the research community. To address this point, during the manuscript revision period, we followed the same procedure and isolated an additional nine phage-resistant variants for each phage (Supplementary Table 1). Interestingly, in this expanded dataset, the transposon insertions were again found exclusively in LPS-related genes (Author Response Figure 1). We have now included this new dataset in the revised manuscript and believe it strengthens the robustness of our findings. The expanded data are provided below for further reference.

      The fact that 8 of the 9 Felix O1 resistant variants all have transposon insertions in waaO should be stated in the results. The initial impression of showing R1-R9 is that 9 disrupted genes are being tested - in this case it's really only two. This is a minor critique because clean deletions by allelic exchange are shown for a more extensive set of genes anyway.

      • We thank the reviewer for this comment. As suggested, we have revised the Results section (Lines 126-131) to explicitly state that Felix O1-resistant variants harbor transposon insertions in only two genes (waaO and dagR), which were initially tested in the competition assay (Figure 2).

      The S. enterica serovar Typhimurium transposon mutagenesis library could benefit from clarification on details. The results section suggests use of a pre-existing "established" transposon library, but the methods and Figure 1 seem to indicate a new library was created based on prior methods. In either case, what is the genome coverage and redundancy of the library? If this is not known or saturation is not reached, the implications of potentially missing phage resistance genes with this approach should be discussed.

      • We thank the reviewer for the opportunity to clarify this point. For our study, we created a transposon library following previously established methods (ref. 23). The library comprises approximately 12,000 variants, as noted in Figure 1a. Although a library of this size provides substantial genome coverage, it did not achieve full saturation. We have now revised the Results section (Lines 93-94 and 115-117) to better describe the potential limitations of this approach, including the possibility that some phage-resistance genes may have been missed during the screening.

      There is some variation in phenotype among the strains with transposon insertion into the same gene, such as P22 resistant strain R7 which macroscopically agglutinates while the other waaJ insertions R5 and R1 don't. Is this due to polar effects on waaO, or could it be genetic alterations at other sites driven by stringent phage selection?

      • We thank the reviewer for this comment. We also suspect that the variation in the macroscopically agglutinative phenotypes among P22-resistant strains, such as strain R7 compared to R5 and R1, may be caused by polar effects on waaO. Additionally, the possibility of genetic alterations at other loci driven by stringent phage selection cannot be excluded. To address this potential variability and ensure consistency, we used clean deletions of each LPS biogenesis gene in all subsequent experiments. This approach eliminates the confounding effects of polar mutations or secondary genetic alterations, thereby providing more robust and interpretable data.

      Figure S1 - The graphs with 12 growth curves are difficult to decipher, and the error bars would suggest maybe there are subtle growth differences among the mutants. Quantifying curve parameter(s) and applying a statistical test may clarify. The CFU counts in panel D seem to be not in log scale. Likewise in Figure S3 panel A, the authors say there are no significant growth defects, but the growth curves are modestly right-shifted for several mutants. This is a point of precision rather than a major critique, because the reversal of competitive growth phenotypes by donor T6SS inactivation indicate the potential minor growth defects aren't playing a major role in competition.
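
      As an illustration of the kind of curve-parameter quantification suggested in this comment, the sketch below estimates a maximum specific growth rate from a single OD time series. It is not the analysis used for Author Response Figure 2: the function name `max_growth_rate`, the four-point sliding window, and the synthetic data are all assumptions made for this example.

```python
import math


def max_growth_rate(times_h, od, window=4):
    """Largest least-squares slope of ln(OD) versus time (per hour)
    over any `window` consecutive readings. Hypothetical helper."""
    if len(times_h) != len(od) or len(od) < window:
        raise ValueError("need equal-length series with at least `window` points")
    log_od = [math.log(v) for v in od]
    best = -math.inf
    for i in range(len(od) - window + 1):
        t = times_h[i:i + window]
        y = log_od[i:i + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        # Ordinary least-squares slope within the current window.
        sxx = sum((a - t_mean) ** 2 for a in t)
        sxy = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
        best = max(best, sxy / sxx)
    return best


# Synthetic, purely exponential curve with a known rate of 0.7 per hour.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
od = [0.01 * math.exp(0.7 * t) for t in times]
rate = max_growth_rate(times, od)
print(round(rate, 3))
```

      Computing one such rate per replicate well would give a per-strain distribution that can be compared with a statistical test of choice.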

      • We thank the reviewer for these suggestions and corrections. We have now revised the manuscript accordingly, including in Supplementary Figures 1 and 3. Quantitative analysis of growth curve parameters and statistical tests have been included below to clarify the observed differences (Author Response Figure 2). The slight right-shift of the growth curves for some mutants, as noted in Supplementary Figure 3, may be attributable to cell aggregation, as shown in Supplementary Figures 2e, f. The growth rate measurements were conducted in a 96-well plate with steady shaking at 200 rpm using a plate reader, which does not fully account for the aggregated cell phenotype. Despite these subtle growth differences, we agree with the reviewer that they do not appear to play a major role in the competitive growth phenotypes, as evidenced by the reversal of phenotypes upon donor T6SS inactivation (Supplementary Figure 3).

      Figure 3f - The authors say fepE is responsible for very long O-antigen chains, but it is not clear that the delta fepE LPS PAGE differs from wild type, which would fit with the lack of competitive disadvantage against E. cloacae in Figure 3g. The increased VL-modal O-antigen upon fepE overexpression in Figure 3h and increase protection in competition (figure 3i) are convincing. Is there another pathway(s) compensating for fepE deletion?

      • We thank the reviewer for this thoughtful comment. We have repeated the experiment independently at least three times and consistently observed a reduction in the VL-modal O-antigen in the ∆fepE strain. To provide additional clarity, we have included supplementary LPS profiles and quantifications below (Author Response Figure 3). We currently do not have evidence from the literature or our experiments to identify an alternative pathway compensating for the deletion of fepE. Nonetheless, we acknowledge this as a possibility and appreciate the reviewer's insight into this topic.

      Lines 199-200 - I believe the conclusion from wzzB deletion would be that L-modal O-antigen is necessary for protection against T6SS, and not necessarily sufficient.

      • We thank the reviewer for pointing out this important distinction. The respective sentence has now been revised in the manuscript (Line 204).

      Do the environmentally isolated phages As2 and As4 encode TSP homologs?

      • We thank the reviewer for this question. We did not identify TSP homologs in the genomes of the As2 and As4 phages. The genome sequences of As1 to As4 have been uploaded to NCBI's BioProject resource under accession number PRJNA1199570 (Lines 535-544, 741-743).

      Reviewer #3 (Significance)

      This manuscript provides a substantial advance in the field's understanding of how phages affect bacterial community interactions. To my knowledge, it is the first to bring together phage and T6SS defense with a strong mechanistic link. It's a conceptual advance in this regard that will stimulate more thought and experimentation on the roles of phage in bacterial communities like gut and environmental microbiomes. The manuscript's strengths include rigorous overall design, clarity of the communication, and depth of mechanistic investigation, all the way down to atomic force microscopy measurements. There are some minor revisions suggested, but these are addressable with minimal/no additional experiments.

      As someone with expertise in bacterial secretion systems and interbacterial interactions, I think this work will be of interest to microbiologists generally, and specifically in the fields of phage biology, bacterial secretion systems, and microbiome research. While the phage virology components are straightforward and well described, I think a review from someone with more expertise in this specific area would be beneficial.

      • We thank the reviewer for their careful reading of our manuscript and for the suggestions to improve it.

      References

      • Whitney, J.C., Quentin, D., Sawai, S., LeRoux, M., Harding, B.N., Ledvina, H.E., Tran, B.Q., Robinson, H., Goo, Y.A., Goodlett, D.R., et al. (2015). An interbacterial NAD(P)(+) glycohydrolase toxin requires elongation factor Tu for delivery to target cells. Cell 163, 607-619. 10.1016/j.cell.2015.09.027.

      • Ali, J., Yu, M., Sung, L.K., Cheung, Y.W., and Lai, E.M. (2023). A glycine zipper motif is required for the translocation of a T6SS toxic effector into target cells. EMBO Rep 24, e56849. 10.15252/embr.202356849.
      • LeRoux, M., De Leon, J.A., Kuwada, N.J., Russell, A.B., Pinto-Santini, D., Hood, R.D., Agnello, D.M., Robertson, S.M., Wiggins, P.A., and Mougous, J.D. (2012). Quantitative single-cell characterization of bacterial interactions reveals type VI secretion is a double-edged sword. Proc Natl Acad Sci U S A 109, 19804-19809. 10.1073/pnas.1213963109.
      • Basler, M., Pilhofer, M., Henderson, G.P., Jensen, G.J., and Mekalanos, J.J. (2012). Type VI secretion requires a dynamic contractile phage tail-like structure. Nature 483, 182-186. 10.1038/nature10846.
      • Schwarz, S., West, T.E., Boyer, F., Chiang, W.C., Carl, M.A., Hood, R.D., Rohmer, L., Tolker-Nielsen, T., Skerrett, S.J., and Mougous, J.D. (2010). Burkholderia type VI secretion systems have distinct roles in eukaryotic and bacterial cell interactions. PLoS Pathog 6, e1001068. 10.1371/journal.ppat.1001068.
      • LeRoux, M., Kirkpatrick, R.L., Montauti, E.I., Tran, B.Q., Peterson, S.B., Harding, B.N., Whitney, J.C., Russell, A.B., Traxler, B., Goo, Y.A., et al. (2015). Kin cell lysis is a danger signal that activates antibacterial pathways of Pseudomonas aeruginosa. Elife 4. 10.7554/eLife.05701.
      • Hersch, S.J., Manera, K., and Dong, T.G. (2020). Defending against the Type Six Secretion System: beyond Immunity Genes. Cell Rep 33, 108259. 10.1016/j.celrep.2020.108259.
      • Unterweger, D., Kitaoka, M., Miyata, S.T., Bachmann, V., Brooks, T.M., Moloney, J., Sosa, O., Silva, D., Duran-Gonzalez, J., Provenzano, D., and Pukatzki, S. (2012). Constitutive type VI secretion system expression gives Vibrio cholerae intra- and interspecific competitive advantages. PLoS One 7, e48320. 10.1371/journal.pone.0048320.
      • Toska, J., Ho, B.T., and Mekalanos, J.J. (2018). Exopolysaccharide protects Vibrio cholerae from exogenous attacks by the type 6 secretion system. Proc Natl Acad Sci U S A 115, 7997-8002. 10.1073/pnas.1808469115.
      • Steeghs, L., den Hartog, R., den Boer, A., Zomer, B., Roholl, P., and van der Ley, P. (1998). Meningitis bacterium is viable without endotoxin. Nature 392, 449-450. 10.1038/33046.
      • Steeghs, L., de Cock, H., Evers, E., Zomer, B., Tommassen, J., and van der Ley, P. (2001). Outer membrane composition of a lipopolysaccharide-deficient Neisseria meningitidis mutant. EMBO J 20, 6937-6945. 10.1093/emboj/20.24.6937.
      • Fransen, F., Heckenberg, S.G., Hamstra, H.J., Feller, M., Boog, C.J., van Putten, J.P., van de Beek, D., van der Ende, A., and van der Ley, P. (2009). Naturally occurring lipid A mutants in neisseria meningitidis from patients with invasive meningococcal disease are associated with reduced coagulopathy. PLoS Pathog 5, e1000396. 10.1371/journal.ppat.1000396.
      • Maldonado, R.F., Sa-Correia, I., and Valvano, M.A. (2016). Lipopolysaccharide modification in Gram-negative bacteria during chronic infection. FEMS Microbiol Rev 40, 480-493. 10.1093/femsre/fuw007.
      • Yu, J., Zhang, H., Ju, Z., Huang, J., Lin, C., Wu, J., Wu, Y., Sun, S., Wang, H., Hao, G., and Zhang, A. (2024). Increased mutations in lipopolysaccharide biosynthetic genes cause time-dependent development of phage resistance in Salmonella. Antimicrob Agents Chemother 68, e0059423. 10.1128/aac.00594-23.
      • Burmeister, A.R., Fortier, A., Roush, C., Lessing, A.J., Bender, R.G., Barahman, R., Grant, R., Chan, B.K., and Turner, P.E. (2020). Pleiotropy complicates a trade-off between phage resistance and antibiotic resistance. Proc Natl Acad Sci U S A 117, 11207-11216. 10.1073/pnas.1919888117.
      • Carretero-Ledesma, M., Garcia-Quintanilla, M., Martin-Pena, R., Pulido, M.R., Pachon, J., and McConnell, M.J. (2018). Phenotypic changes associated with Colistin resistance due to Lipopolysaccharide loss in Acinetobacter baumannii. Virulence 9, 930-942. 10.1080/21505594.2018.1460187.
      • Aoki, S.K., Pamma, R., Hernday, A.D., Bickham, J.E., Braaten, B.A., and Low, D.A. (2005). Contact-dependent inhibition of growth in Escherichia coli. Science 309, 1245-1248. 10.1126/science.1115109.
      • Aoki, S.K., Malinverni, J.C., Jacoby, K., Thomas, B., Pamma, R., Trinh, B.N., Remers, S., Webb, J., Braaten, B.A., Silhavy, T.J., and Low, D.A. (2008). Contact-dependent growth inhibition requires the essential outer membrane protein BamA (YaeT) as the receptor and the inner membrane transport protein AcrB. Mol Microbiol 70, 323-340. 10.1111/j.1365-2958.2008.06404.x.
      • Gao, Y., Widmalm, G., and Im, W. (2023). Modeling and Simulation of Bacterial Outer Membranes with Lipopolysaccharides and Capsular Polysaccharides. J Chem Inf Model 63, 1592-1601. 10.1021/acs.jcim.3c00072.
      • Whitney, J.C., Beck, C.M., Goo, Y.A., Russell, A.B., Harding, B.N., De Leon, J.A., Cunningham, D.A., Tran, B.Q., Low, D.A., Goodlett, D.R., et al. (2014). Genetically distinct pathways guide effector export through the type VI secretion system. Mol Microbiol 92, 529-542. 10.1111/mmi.12571.
      • Soria-Bustos, J., Ares, M.A., Gomez-Aldapa, C.A., Gonzalez, Y.M.J.A., Giron, J.A., and De la Cruz, M.A. (2020). Two Type VI Secretion Systems of Enterobacter cloacae Are Required for Bacterial Competition, Cell Adherence, and Intestinal Colonization. Front Microbiol 11, 560488. 10.3389/fmicb.2020.560488.
      • Wan, B., Zhang, Q., Ni, J., Li, S., Wen, D., Li, J., Xiao, H., He, P., Ou, H.Y., Tao, J., et al. (2017). Type VI secretion system contributes to Enterohemorrhagic Escherichia coli virulence by secreting catalase against host reactive oxygen species (ROS). PLoS Pathog 13, e1006246. 10.1371/journal.ppat.1006246.
      • Mandal, R.K., Jiang, T., and Kwon, Y.M. (2021). Genetic Determinants in Salmonella enterica Serotype Typhimurium Required for Overcoming In Vitro Stressors in the Mimicking Host Environment. Microbiol Spectr 9, e0015521. 10.1128/Spectrum.00155-21.
    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This valuable study provides in vivo evidence for the synchronization of projection neurons in the olfactory bulb at gamma frequency in an activity-dependent manner. This study uses optogenetics in combination with single-cell recordings to selectively activate sensory input channels within the olfactory bulb. The data are thoughtfully analyzed and presented; the evidence is solid, although some of the conclusions are only partially supported.

      We deeply thank all the reviewers for their time, effort, and insightful comments. Their feedback led to a significant improvement of the paper.

      The reviewers suggested toning down our claim that we found a mechanism that synchronizes all odor-evoked MTC activities, as we do not directly show that. We concur and address this in our revised version to ensure a precise interpretation of our findings. In short, we state that we revealed a synchronization mechanism between two groups of active mitral and tufted cells (MTCs) and show that this synchronization is activity-dependent and distance-independent. This mechanism can enable the synchronization of all odor-activated MTCs.

      Another issue raised is the interpretation of the results obtained under Ketamine anesthesia. Ketamine is an antagonist of NMDA receptors, which play a crucial role in the MTC-GC reciprocal synapse. To address this, we include new analyses demonstrating that optogenetic activation of granule cells (GCs) can inhibit the recorded MTCs during baseline activity but does not substantially affect odor-evoked MTC firing rates. We show that this holds in both Ketamine-anesthetized and awake mice (Dalal & Haddad, 2022). This indicates that GC-MTC connections are functional even under Ketamine anesthesia but do not exert substantial suppression on odor-evoked MTC responses. We added a paragraph to the discussion section on the potential influence of Ketamine anesthesia on GC-MTC synapses and its implications for our findings.

      Finally, we discuss several recent studies that are particularly relevant to our research and expand the discussion on our hypothesis that parvalbumin-positive cells in the olfactory bulb may serve as key mediators of the activity- and distance-dependent lateral inhibition observed in our findings.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Dalal and Haddad investigated how neurons in the olfactory bulb are synchronized in oscillatory rhythms at gamma frequency. Temporal coordination of action potentials fired by projection neurons can facilitate information transmission to downstream areas. In a previous paper (Dalal and Haddad 2022, https://doi.org/10.1016/j.celrep.2022.110693), the authors showed that gamma frequency synchronization of mitral/tufted cells (MTCs) in the olfactory bulb enhances the response in the piriform cortex. The present study builds on these findings and takes a closer look at how gamma synchronization is restricted to a specific subset of MTCs in the olfactory bulb. They combined odor and optogenetic stimulations in anesthetized mice with extracellular recordings.

      The main findings are that lateral synchronization of MTCs at gamma frequency is mediated by granule cells (GCs), independent of the spatial distance, and strongest for MTCs with firing rates close to 40 Hz. The authors conclude that this reveals a simple mechanism by which spatially distributed neurons can form a synchronized ensemble. In contrast to lateral synchronization, they found no evidence for the involvement of GCs in lateral inhibition of nearby MTCs.

      Strengths:

      Investigating the mechanisms of rhythmic synchronization in vivo is difficult because of experimental limitations for the readout and manipulation of neuronal populations at fast timescales. Using spatially patterned light stimulation of opsin-expressing neurons in combination with extracellular recordings is a nice approach. The paper provides evidence for an activity-dependent synchronization of MTCs in gamma frequency that is mediated by GCs.

      Weaknesses:

      An important weakness of the study is the lack of direct evidence for the main conclusion - the synchronization of MTCs in gamma frequency. The data shows that paired optogenetic stimulation of MTCs in different parts of the olfactory bulb increases the rhythmicity of individual MTCs (Figure 1) and that combined odor stimulation and GC stimulation increases rhythmicity and gamma phase locking of individual MTCs (Figure 4). However, a direct comparison of the firing of different MTCs is missing. This could be addressed with extracellular recordings at two different locations in the olfactory bulb. The minimum requirement to support this conclusion would be to show that the MTCs lock to the same phase of the gamma cycle. Also, showing the evoked gamma oscillations would help to interpret the data.

      We agree with the reviewer that direct evidence of mutual synchronization between multiple recorded MTCs has not been shown in our study. Our study only shows a mechanism that can enable this synchronization. We now state this clearly in the manuscript. We based this on previous studies that tested MTC spike synchronization. Specifically, Schoppa 2006 reported that electrical OSN stimulation evokes MTC spike synchronization in the gamma range in vitro. Kashiwadani et al., 1999 and Doucette et al., 2011 showed that odor-evoked MTC spike times are synchronized in vivo. Given these studies, we asked what underlying mechanism can support such synchronization. Our study demonstrates that activating a group of MTCs can entrain another MTC in an activity-dependent and distance-independent manner. We claim this could be the underlying mechanism for the odor-evoked synchronization demonstrated by these previous studies.
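
      For reference, the same-phase check described in the reviewer's comment can be sketched as follows. This is a minimal illustration, not an analysis from the manuscript: it assumes the gamma phase of the LFP at each spike time has already been extracted (in radians), and the function names and spike phases are hypothetical.

```python
import cmath
import math


def phase_locking(phases):
    """Return (vector_length, preferred_phase) for spike phases in radians.
    A vector length near 1 means tight locking; near 0 means none."""
    if not phases:
        raise ValueError("need at least one spike phase")
    mean_vec = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vec), cmath.phase(mean_vec)


def phase_difference(a, b):
    """Signed angular difference a - b, wrapped to [-pi, pi)."""
    return (a - b + math.pi) % (2 * math.pi) - math.pi


# Hypothetical gamma phases at the spike times of two separately recorded units.
unit_a = [0.1, -0.2, 0.05, 0.3, -0.1]
unit_b = [0.2, 0.0, -0.15, 0.25, 0.1]

len_a, pref_a = phase_locking(unit_a)
len_b, pref_b = phase_locking(unit_b)
print(len_a, len_b, phase_difference(pref_a, pref_b))
```

      Two units that are both strongly locked and have a preferred-phase difference close to zero would be consistent with firing at the same phase of the gamma cycle; this alone would not establish direct synchronization between them.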

      To make sure this is clearly stated in the manuscript we changed the title to “Activity-dependent lateral inhibition enables the synchronization of active olfactory bulb projection neurons”, and rephrased a sentence in the abstract to “This lateral synchronization was particularly effective when the recorded MTC fired at the gamma rhythm”. To further clarify this point, we made several other changes throughout the results and the discussion section.

      Another weakness is that all experiments are performed under anesthesia with ketamine/medetomidine. Ketamine is an antagonist of NMDA receptors and NMDA receptors are critically involved in the interactions of MTCs and GCs at the reciprocal synapses (see for example Lage-Rupprecht et al. 2020, https://doi.org/10.7554/eLife.63737; Egger and Kuner 2021, https://doi.org/10.1007/s00441-020-03402-7). This should be considered for the interpretation of the presented data.

      This issue has been raised by reviewers #1 and #2. We think, as reviewer #2 also acknowledged, that this issue does not compromise our results. However, to address this important point we added the following section to the Discussion:

      “Our experiments were performed under Ketamine anesthesia, an NMDA receptor antagonist that affects the reciprocal dendro-dendritic synapses between MTCs and GCs (Egger and Kuner, 2021; Lage-Rupprecht et al., 2020). Consistent with that, recent studies reported lower excitability of GC activity under anesthesia (Cazakoff et al., 2014; Kato et al., 2012).  This raises the concern that our result might not be valid in the awake state. We argue that this is unlikely. First, (Fukunaga et al., 2014) reported that GCs baseline activity in anesthetized and awake mice is similar, suggesting that MTC-GC synapses are functioning. Second, we show that light activation of GCL neurons strongly inhibits the MTC baseline activity (Figure 5) and increases MTC odor-evoked spike-LFP coupling in the gamma range (Figure 4). These experiments validate that GCL neurons can exert inhibition over MTCs in our experimental setup. Third, we have shown that light-activating all accessible GCL neurons has a minor effect on the MTC odor-evoked firing rates in an awake state (Dalal and Haddad, 2022), corroborating the finding that GCL neurons are unlikely to provide strong suppression to MTCs. Fourth, and most importantly, we showed that optogenetic stimulation of MTCs entrains other MTC spike times, which is achieved via the GCL neurons. This suggests that the lack of lateral suppression following MTC or GCL neuron opto-activation is not due to MTC-GC synapse blockage. That said, we cannot exclude the unlikely possibility that NMDA receptor blockage under anesthesia impairs MTC-to-MTC suppressive interactions but not the MTC-to-MTC mediated spike entrainment.”

      Figure 1A and D from Dalal & Haddad 2022 show the effect of GCL neurons opto-activation during odor stimulation on MTC firing rates in awake and anesthetized mice.

      Furthermore, the direct effect of optogenetic stimulation on GCs activity is not shown. This is particularly important because they use Gad2-cre mice with virus injection in the olfactory bulb and expression might not be restricted to granule cells and might not target all subtypes of granule cells (Wachowiak et al., 2013, https://doi.org/10.1523/JNEUROSCI.4824-12.2013). This should be considered for the interpretation of the data, particularly for the absence of an effect of GC stimulation on lateral inhibition.

      In this study we used Gad2-cre mice and the protocol for viral transfection of GCL neurons reported in Fukunaga et al., 2014. They reported that “more than 90% of Cre-expressing neurons in the GCL also expressed fluorescently tagged ArchT”. Consistently, when Fukunaga et al. expressed ChR2 in the GCL using the same viral infection as we used, they reported that “Light presentation in vivo resulted in rapid and strong depolarization of, and action potential (AP) discharges in, GCs (Fig. 3b), which in turn consistently and strongly hyperpolarized M/TCs (9 of 9 cells showed 100% AP suppression; Fig. 3c,d)”. This study clearly shows that this infection protocol is robust. Moreover, in new panels we added to the manuscript (Figure 5a-b), we show that optogenetic activation of GCL neurons strongly suppressed MTC activity during baseline conditions but not odor-evoked MTC responses. This is consistent with the reports by Fukunaga et al. and indicates that GCL neurons are functional, as they can suppress MTC baseline activity.

      Finally, since virus injection into the granule cell layer can target other GCL neuron types, we now refer to ‘GCL neurons’ (as was done in Gschwend et al., 2015) instead of ‘GCs’ throughout the text. We replaced the image in Figure 4A to show that ChR2 expression is restricted to GCL neurons. That said, it is still possible that our protocol did not infect all GC subtypes. To address this, we added this line to the Discussion: “We also note that our viral transfection protocol in Gad2-Cre mice might not transfect all subtypes of GCs.”

      Several conclusions are only supported by data from example neurons. The paper would benefit from a more detailed description of the analysis and the display of some additional analysis at the population level:

      - What were the criteria based on which the spots for light-activation were chosen from the receptive field map?

      To make this point clearer, we extended the explanation in the Methods on the selection criteria: “Spots were selected either randomly or manually. In the manual selection case, we selected spots that caused either significant or mild but insignificant inhibitory effect on the recorded MTC (e.g., local cold spots in the receptive-field map; see example in Figure 2a of example spots that were selected manually)”. We also added a reference in the text to the Methods: “see Methods for spots selection criteria”.

      - The absence of an effect on firing rate for paired stimulations is only shown for one example (Figure 1c). A quantification of the population level would be interesting.

      - Only one example neuron is shown to support the conclusion that "two different neural circuits mediate suppression and entrainment" in Figure 3. A population analysis would provide more evidence.

      Thank you very much for these comments. We added a population analysis in Figure 3. This analysis shows a dissociation between firing rate suppression and the entrainment groups (Figure 3c-d). This suggests that two different circuits mediate suppression and entrainment.

      - Only one example neuron is shown to illustrate the effect of GC stimulation on gamma rhythmicity of MTCs in Figures 4 f,g.

      In this figure, we show that the activation of subsets of GCL neurons elevated odor-evoked spike synchronization to the gamma rhythm. We thought it would be beneficial to demonstrate the change in spike entrainment following optogenetic activation of GCL neurons regardless of the ongoing OB gamma oscillations, using the method presented by Fukunaga et al., 2014. However, this analysis requires that the neuron has a relatively high firing rate. As we describe in the figure legend of this panel, this neuron is probably a tufted cell based on the findings shown in Fukunaga et al., 2014 and Burton & Urban, 2021. Most of our recorded cells had a lower firing rate, which is consistent with our typical recording depth (~400 µm), targeting mitral cells rather than tufted cells. Since this analysis is shown for only a single neuron, we moved it to Supplementary Figure 4.

      - In Figure 5 and the corresponding text, "proximal" and "distal" GC activation are not clearly defined.

      We agree. Initially, we used these terms to refer to GC columns that include the recorded MTC (proximal) and columns that are away from it (distal). We decided that instead of using a coarse division, we would show the whole range of distances. We updated the analysis in Figure 5d to show the effect of GC optogenetic activation on MTC odor-evoked responses as a function of the distance from the recorded MTC.

      Reviewer #2 (Public Review):

      Summary

      This study provides a detailed analysis and dissociation between two effects of activation of lateral inhibitory circuits in the olfactory bulb on ongoing single mitral/tufted cell (MTC) spiking activity, namely enhanced synchronization in the gamma frequency range or lateral inhibition of firing rate.

      The authors use a clever combination of single-cell recordings, optogenetics with variable spatial stimulation of MTCs and sensory stimulation in vivo, and established mathematical methods to describe changes in autocorrelation/synchronization of a single MTC's spiking activity upon activation of lateral glomerular MTC ensembles. This assay is rounded off by a gain-of-function experiment in which the authors enhance granule cell (GC) excitation to establish a causal relation between GC activation and enhanced synchronization to gamma (they had used this manipulation in their previous paper Dalal & Haddad 2022, but use a smaller illumination spot here for spatially restricted activation).

      Strengths

      This study is of high interest for olfactory processing - since it shows directly that interactions between only two selected active receptor channels are sufficient to enhance the synchronization of single neurons to gamma in one channel (and thus by inference most likely in both). These interactions are distance-independent over many 100s of µms and thus can allow for non-topographical inhibitory action across the bulb, in contrast to the center-surround lateral inhibition known from other sensory modalities.

      In my view, parallels between vision and olfaction might have been overemphasized so far, since the combinatorial encoding of olfactory stimuli across the glomerular map might require different mechanisms of lateral interaction versus vision. This result is indicative of such a major difference.

      Such enhanced local synchronization was observed in a subset of activated channel pairs; in addition, the authors report another type of lateral interaction that does involve the reduction of firing rates, drops off with distance and most likely is caused by a different circuit-mediated by PV+ neurons (PVN; the evidence for which is circumstantial).

      Weaknesses/Room for improvement

      Thus this study is an impressive proof of concept that however does not yet allow for broad generalization. Therefore the framing of results should be slightly more careful in my opinion.

      We agree with the reviewer. We copy here our response to reviewer #1, who raised the same issue.

      We agree that direct evidence of mutual synchronization between multiple recorded MTCs has not been shown in our study. Our study only shows a mechanism that can enable this synchronization. We now state this clearly in the manuscript. We relied on previous studies that tested MTC spike synchronization. Specifically, Schoppa 2006 reported that electrical OSN stimulation evokes MTC spike synchronization in the gamma range in-vitro. Kashiwadani et al., 1999 and Doucette et al., showed that odor-evoked MTC spike times are synchronized in-vivo. Given these studies, we asked what underlying mechanism can support such synchronization. Our study demonstrates that activating a group of MTCs can entrain another MTC in an activity-dependent and distance-independent manner. We claim this could be the underlying mechanism for the odor-evoked synchronization demonstrated by these previous studies.

      To make sure this is clearly stated in the manuscript we changed the title to “Activity-dependent lateral inhibition enables the synchronization of active olfactory bulb projection neurons”, and rephrased a sentence in the abstract to “This lateral synchronization was particularly effective when the recorded MTC fired at the gamma rhythm”. To further clarify this point, we made several other changes throughout the results and the discussion section.

      Along this line, the conclusions regarding two different circuits underlying lateral inhibition vs enhanced synchronization are not quite justified by the data, e.g.

      (1) The authors mention that their granule cell stimulation results in a local cold spot (l. 527 ff) - how can they then said to be not involved in the inhibition of firing rate (bullet point in Highlights)? Please elaborate further. In l.406 they also state that GCs can inhibit MTCs under certain conditions. The argument, that this stimulation is not physiological, makes sense, but still does not rule out anything. You might want to cite Aghvami et al 2022 on the very small amplitude of GC-mediated IPSPs, also McIntyre and Cleland 2015.

      We apologize for the lack of clarity. We reported that we found a local cold spot in the context of an additional experiment not presented in the manuscript and only described in the Methods section. Following the revision, we decided to add the analysis of this experiment to Figure 5. This experiment validated that optogenetic activation of GCs is potent and can affect the recorded MTC firing rates. This is particularly important as we performed all experiments under ketamine anesthesia, and ketamine is an NMDA receptor antagonist. In this experiment, we recorded the activity of MTCs at baseline conditions (without odor presentation) under optogenetic activation of GCs. We divided the OB surface into a grid and optogenetically activated GC columns in a random order, one light spot in each trial, using light patches of size 330 µm². We used the same light intensity as in the optogenetic GC activation during odor stimulation (reported in Figures 4-5). We show that the recorded MTC was strongly inhibited by GC light activation, mostly when activating GCs in its vicinity (within its column, i.e., local cold spot). This experiment validates that in our experimental setup, GCs can exert inhibition over MTCs at baseline conditions.

      (2) Even from the shown data, it appears that laterally increased synchronization might co-occur with lateral suppression (See also comment on Figures 1d,e and Figure S1c)

      We kindly note that the panels you referred to do not quantify the firing rate but the rhythmicity of MTC light-evoked responses. We should have explained these graphs better in the main text and not only in the Methods section. We added a panel to Supplementary Figure 1, which describes our analysis: In each of these examples, we performed a time-frequency wavelet analysis over the average response of the neurons across trials (computed using a sliding Gaussian with a standard deviation of 2 ms). The results of the wavelet analysis allowed us to visually capture the enhanced spike alignment across trials under paired activation as a function of the stimulus duration (as, for example, in Figure 1c, middle panel). The response amplitude to light stimulation did not change in this example (shown in Figure 1c, lower panel), whereas spike entrainment increased following paired activation of MTCs.
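
      The analysis described above can be illustrated with a minimal sketch. It is not the authors' code: the complex Morlet wavelet, the `n_cycles` parameter, the 1 ms bin width, and all function names are assumptions made for illustration. Only the 2 ms Gaussian smoothing of the trial-averaged response is taken from the text.

```python
import cmath
import math

def gaussian_smooth(values, sigma_bins):
    """Smooth a sequence with a Gaussian kernel truncated at 4 sigma."""
    half = int(math.ceil(4 * sigma_bins))
    kernel = [math.exp(-0.5 * (k / sigma_bins) ** 2) for k in range(-half, half + 1)]
    norm = sum(kernel)
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:  # samples outside the record are treated as zero
                acc += values[idx] * w
        out.append(acc / norm)
    return out

def trial_averaged_rate(spike_trains, sigma_ms=2.0, dt_ms=1.0):
    """spike_trains: list of trials, each a list of 0/1 bins. Returns rate in Hz."""
    n_trials = len(spike_trains)
    n_bins = len(spike_trains[0])
    mean_counts = [sum(trial[b] for trial in spike_trains) / n_trials
                   for b in range(n_bins)]
    smoothed = gaussian_smooth(mean_counts, sigma_ms / dt_ms)
    return [v * 1000.0 / dt_ms for v in smoothed]

def morlet_power(signal, freq_hz, dt_ms=1.0, n_cycles=5.0):
    """Power over time at one frequency, using a complex Morlet wavelet (assumed)."""
    dt_s = dt_ms / 1000.0
    sigma_t = n_cycles / (2 * math.pi * freq_hz)  # temporal width of the envelope
    half = int(math.ceil(4 * sigma_t / dt_s))
    wavelet = [cmath.exp(2j * math.pi * freq_hz * k * dt_s)
               * math.exp(-0.5 * (k * dt_s / sigma_t) ** 2)
               for k in range(-half, half + 1)]
    norm = sum(abs(w) for w in wavelet)
    mean = sum(signal) / len(signal)
    centered = [v - mean for v in signal]  # remove the DC component
    n = len(centered)
    power = []
    for i in range(n):
        acc = 0j
        for j, w in enumerate(wavelet):  # correlate the signal with the wavelet
            idx = i + j - half
            if 0 <= idx < n:
                acc += centered[idx] * w
        power.append(abs(acc / norm) ** 2)
    return power
```

      Calling `morlet_power` for a range of frequencies and stacking the results gives a time-frequency map of the trial-averaged response. Because the map is computed from the average across trials, spikes that are aligned across trials add up to a strong rhythmic component, whereas spikes with variable timing average out.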

      To address the relations between lateral suppression and synchronization at the population level, we added additional analyses to Figure 3c-d.

      (3) There are no manipulations of PVN activity in this study, thus there is no direct evidence for the substrate of the second circuit.

      We completely agree with the reviewer. Using the current data, we can only claim that optogenetic activation of GCL neurons did not affect the MTC odor-evoked response. This finding is consistent with the loss-of-function experiment reported by Fukunaga et al., 2014, where GC suppression did not change odor-evoked responses in both anesthetized and awake mice. Therefore, we speculated that PVN might be a candidate OB interneuron to mediate lateral inhibition between MTCs. This hypothesis is based on their higher likelihood of interconnecting two MTCs compared with GCs (Burton, 2017). We elaborated on this in the discussion and made sure it is clearly stated as a hypothesis.

      (4) The manipulation of GC activity was performed in a transgenic line with viral transfection, which might result in a lower permeation of the population compared to the line used for optogenetic stimulation of MTCs.

      We used a previously validated protocol for optogenetic manipulation of GCs from Fukunaga et al., 2014 in order to minimize this caveat. As we cited previously from their paper, following the expression of ChR2 in the GCL, ‘Light presentation in vivo resulted in rapid and strong depolarization of, and action potential (AP) discharges in, GCs (Fig. 3b), which in turn consistently and strongly hyperpolarized M/TCs (9 of 9 cells showed 100% AP suppression; Fig. 3c,d)’. These results are consistent with the additional experiment we added to the manuscript, where optogenetic activation of GCL neurons strongly suppressed MTC activity during baseline conditions (without odor presentation). The high similarity between these two reports, in which, in the case of Fukunaga et al., GC activation was directly measured, suggests that lack of opsin expression or insufficient light intensity is unlikely to explain the lack of GCL neuron activation effect on lateral inhibition. Moreover, GCL neurons' optogenetic activation during odor stimulation increased MTC spike-LFP coupling in the gamma range. Therefore, the dissociation between the effects of GCL neurons on spike entrainment and lateral inhibition suggests that the lack of lateral inhibition following GC activation is unlikely due to low expression rates.

      In some instances, the authors tend to cite older literature - which was not yet aware of the prominent contribution of EPL neurons including PVN to recurrent and lateral inhibition of MT cells - as if roles that then were ascribed to granule cells for lack of better knowledge can still be unequivocally linked to granule cells now. For example, they should discuss Arevian et al (2006), Galan et al 2006, Giridhar et al., Yokoi et al. 1995, etc in the light of PVN action.

      Therefore it is also not quite justified to state that their result regarding the role of GCs specifically for synchronization, not suppression, is "in contrast to the field" (e.g. l.70 f.,, l.365, l. 400 ff).

      We changed several sentences in the discussion and introduction to explain that previous studies attributed lateral suppression to GCs because they were not aware of the prominent contribution of EPL neurons, as has been demonstrated by more recent studies (Burton 2024, Huang et al., 2016, Kato et al., 2013, and more).

      We also toned down the statement that these findings are in contrast to the field. Instead, we state that our findings support the claim that GCs are not involved in affecting MTC odor-evoked firing rate.

      Why did the authors choose to use the term "lateral suppression", often interchangeably with lateral inhibition? If this term is intended to specifically reflect reductions of firing rates, it might be useful to clearly define it at first use (and cite earlier literature on it) and then use it consistently throughout.

      We agree and have changed the manuscript accordingly. We added the following in the introduction: “We use this phrase here to refer to a process that suppresses the firing rate of the post-synaptic neuron.”

      A discussion of anesthesia effects is missing - e.g. GC activity is known to be reportedly stronger in awake mice (Kato et al). This is not a contentious point at all since the authors themselves show that additional excitation of GCs enhances synchrony, but it should be mentioned.

      We completely agree and added a paragraph to the Discussion in this regard. Please see also the response to reviewer #1, who made a similar suggestion.

      Some citations should be added, in particular relevant recent preprints - e.g. Peace et al. BioRxiv 2024, Burton et al. BioRxiv 2024 and the direct evidence for a glutamate-dependent release of GABA from GCs (Lage-Rupprecht et al. 2020).

      We thank the reviewer for pointing us to these relevant recent manuscripts. We have now cited Peace et al. when discussing the spatial range of inhibition and gamma synchronization in the OB, Lage-Rupprecht et al. in the context of the involvement of the NMDA receptor in the MTC-GC reciprocal synapse, and Burton et al. when discussing the potential function of PV neurons.

      The introduction on the role of gamma oscillations in sensory systems (in particular vision) could be more elaborated.

      In our previous paper (Dalal & Haddad 2022), we included an elaborate introduction on the role of gamma oscillations in sensory processing, since that study focused on the effect of gamma synchronization on information transmission between brain regions. In the current study, we looked at gamma rhythms as a mechanism that can facilitate ensemble synchronization.

      Reviewer #3 (Public Review):

      Summary:

      This study by Dalal and Haddad analyzes two facets of cooperative recruitment of M/TCs as discerned through direct, ChR2-mediated spot stimulations:

      (1) mutual inhibition and

      (2) entrainment of action potential timing within the gamma frequency range.

      This investigation is conducted by contrasting the evoked activity elicited by a "central" stimulus spot, which induces an excitatory response alone, with that elicited when paired with stimulations of surrounding areas. Additionally, the effect of Gad2-expressing granule cells is examined.

      Based on the observed distance dependence and the impact of GC stimulations, the authors infer that mutual inhibition and gamma entrainment are mediated by distinct mechanisms.

      Strengths:

      The results presented in this study offer a nice in vivo validation of the significant in vitro findings previously reported by Arevian, Kapoor, and Urban in 2008. Additionally, the distance-dependent analysis provides some mechanistic insights.

      We thank the reviewer for his comments. Indeed, the current study provides in-vivo replication of the results reported in-vitro by Arevian et al., 2008, and adds further insights by showing that lateral inhibition is distance-dependent. However, this is not the main focus of the current study. Following the findings reported by Dalal & Haddad 2022, the motivation for this study was to test the mechanism that allows co-activated MTCs to entrain their spike timing. By light-activating pairs of MTCs at varying distances, we detected a subset of pairs in which paired light-activation evoked activity-dependent lateral inhibition, as was reported by Arevian et al., 2008. Moreover, we think it is highly important to know that a previous result from an in-vitro study is fully reproducible in-vivo.

      Weaknesses:

      The results largely reproduce previously reported findings, including those from the authors' own work, such as Dalal and Haddad (2022), where a key highlight was "Modulating GC activities dissociates MTCs odor-evoked gamma synchrony from firing rates." Some interpretations, particularly the claim regarding the distance independence of the entrainment effect, may be considered over-interpretations.

      We kindly disagree with the reviewer. We think the current study extends rather than reproduces the findings reported in Dalal & Haddad 2022. The 2022 study mainly focused on the effect of OB gamma synchronization on odor representation in the piriform cortex. We bidirectionally modulated the level of MTC gamma synchronization and found that it had bidirectional effects on odor representation in one of the MTCs' downstream targets, the anterior piriform cortex. The current study, however, focuses on the question of how spatially distributed odor-activated MTCs can synchronize their spiking activity. Our current main finding is that paired activation of MTCs can enhance the spike entrainment of the recorded MTC in an activity-dependent and spatially independent manner. We suggest that this mechanism is mediated by GCL neurons.

      The reviewer did not explain why he/she thinks that the distance independence of the entrainment effects is an over-interpretation. However, to make our claim more precise, we added the following sentence to the corresponding results section: “Furthermore, within the distance range that we were able to measure, the increased phase-locking did not significantly correlate with the distance from the MTC.”
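
      One way to check a statement of this kind is a permutation test on the correlation between the change in phase-locking and the distance. The sketch below is an illustration only: the response does not say which statistical test was used, so the choice of Pearson correlation with a permutation p-value, the function names, and the default parameters are all assumptions.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient (inputs must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value for the correlation, obtained by shuffling y."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # count shuffles whose correlation is at least as extreme as observed
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

      Here `x` would hold the distance of each lateral spot from the recorded MTC and `y` the corresponding change in phase-locking. A large p-value means only that no significant correlation was detected within the sampled distance range, which matches the cautious wording of the added sentence.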

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comments:

      (1) Line 17f: "This lateral synchronization was particularly effective when both MTCs fired at the gamma rhythm, ..."

      This sentence implies a direct comparison of the simultaneously recorded firing of MTCs but I could not find evidence for this in this manuscript. I would suggest to change this.

      We thank the reviewer. The sentence was changed to “This lateral synchronization was particularly effective when the recorded MTC fired at the gamma rhythm”.

      (2) Line 43f: A brief description of what glomeruli are could help to avoid confusion for readers less familiar with the OB. The phrasing of "activated glomeruli" and "each glomerulus innervates" are somewhat misleading given that they do not contain the cell bodies of the projection neurons.

      We edited this part of the introduction so it briefly describes what glomeruli are: ‘Olfactory processing starts with the activity of odorant-activated olfactory sensory neurons. The axons of these sensory neurons terminate in one or two anatomical structures called glomeruli located on the surface of the olfactory bulb (OB). Each glomerulus is innervated by several mitral and tufted cells (MTCs), which then project the odor information to several cortical regions.’

      (3) Line 78ff: The text sounds as if glomeruli are activated by the light stimulation but ChR2 is expressed in MTCs, the postsynaptic component of the glomeruli. It would be clearer to refer to the stimulation as light activation of MTCs.

      We corrected this sentence to: ‘We first mapped each recorded cell's receptive field, i.e., the set of MTCs on the dorsal OB that affect its firing rates when they are light-stimulated.’

      (4) Line 90: It would be great to mention somewhere in this paragraph that you are analyzing single-unit data sorted from extracellular recordings with tungsten electrodes.

      We added that to the description of the experimental setup: ‘To investigate how MTCs interact, we expressed the light-gated channel rhodopsin (ChR2) exclusively in MTCs by crossing the Tbet-Cre and Ai32 mouse lines (Grobman et al., 2018; Haddad et al., 2013), and extracellularly recorded the spiking activity of MTCs in anesthetized mice during optogenetic stimulation using tungsten electrodes.’

      (5) Line 97: The term "delta entrainment" could be easily confused with the entrainment of MTCs to respiration in the delta frequency band. Maybe better to use a different term or stick to "change in entrainment" also used in the text.

      We completely agree. The term was changed to “change in entrainment” throughout the manuscript and figures.

      (6) Line 121f: "Light stimulation did not affect ..." . Should this be "Paired light stimulation did not affect ..."?

      Corrected, thank you.

      (7) Supplementary Figure 1a: The example is not very convincing. It looks a bit like a rhythmic bursting neuron mildly depending on the stimulation.

      This panel serves to present our light stimulation method. The potency of the light stimulation protocol can be seen in the receptive field maps.

      (8) Supplementary Figure 1c: Why is there no confidence interval for 'Paired'?

      This panel shows the power spectrum density of the average neuron response across trials computed over the entire stimulus window (100ms). We decided to remove this panel, as panel Figure 1d shows the evolution of the entrainment in time and, therefore, provides better insight into the effect.

      (9) Line 166f: "... across any light intensities". Maybe better "... for the four light intensities tested"?

      We agree, we changed the text in accordance.

      (10) Figure 2f: It would be more intuitive to have the x-axis in the same orientation as in 2e.

      Corrected, thank you.

      (11) Figure 4a: The image in this panel is identical to Figure 1a in Dalal and Haddad 2022 in Cell reports just with a different intensity. The reuse of items and data from previous publications should be indicated somewhere but I could not find it.

      We apologize for this replication. We replaced it with a photo showing a larger portion of the OB, demonstrating the restricted viral expression within the GCL.

      (12) Line 408ff: A brief explanation for the hypothesis of EPL parvalbumin interneurons as the ones mediating lateral inhibition would be great.

      We agree. We added the following paragraph to the discussion section: “We speculate that MTC-to-MTC suppression is mediated by EPL neurons, most likely the Parvalbumin neuron (PV). This hypothesis is based on their activity and connectivity properties with MTCs (Burton, 2017; Kato et al., 2013; Miyamichi et al., 2013; Burton, 2024). More studies are required to reveal how PV neurons affect MTC activity.”

      (13) Line 425ff: You show that only activity of high firing rate neurons is suppressed by lateral inhibition, whereas "low and noise MTC responses" are not affected. Wouldn't this rather support the conclusion that lateral inhibition prevents excess activity from the OB?

      We found that lateral inhibition was mainly effective when the postsynaptic neurons fired at ~30-80Hz in response to light stimulation. That is, it affects MTC firing in this “intermediate” rate range, and to a lesser extent when the MTC has low or very high firing rates. To prevent excess activity, one would expect a mechanism that affects high firing rates more than medium ones. This was demonstrated in Kato 2013 for PV-MTC inhibition.

      (14) Line 387: "..., only ~20% of the tested MTC pairs exhibited significant lateral inhibition." This is higher than the 16% of neurons you reported to have lateral entrainment (line 100). Why do you consider the lateral inhibition as 'sparse' but the lateral entrainment as relevant?

      We apologize for this unclear statement. The papers we cited in this regard (Fantana et al., 2008; Lehmann et al., 2016; Pressler and Strowbridge, 2017) tested lateral inhibition when the recorded MTC was not active, which resulted in sparse MTC-MTC inhibition. We validated and replicated these findings in our setup by systematically projecting light spots over the dorsal OB without simultaneous activation of the recorded MTC, and found similarly sparse inhibition (data not shown). In this study, using a spike-triggered-average light stimulation protocol and paired activation of MTCs, we found higher rates of lateral inhibition, consistent with the reports by Isaacson and Strowbridge, 1998 and Urban and Sakmann, 2002. We changed this paragraph to the following:

      “We found that only ~20% of the tested MTC pairs exhibited significant lateral suppression. This rate is consistent with previous in-vitro studies that found lateral suppression between 10-20% of heterotypic MTC pairs (Isaacson and Strowbridge, 1998; Urban and Sakmann, 2002), and is higher compared to a case where the recorded MTC is not active (Lehmann et al., 2016).”

      Reviewer #2 (Recommendations For The Authors):

      Figure-by-figure comments:

      (1) Figures 1d,e: both these examples seem to show that the firing rate is decreased in the paired condition? From maxima at 110 to 58 Hz in d and 100 to 48 Hz in e. Please explain (see also comment on Figure S1c).

      Please see the response in the Public Review section, reviewer #2, bullet (2). We also added a panel to Supplementary Figure 1 to better explain this.

      (2) Figure 1 f The means and SEMs are hard to see. Why is the SEM bar plotted horizontally? Since this is a major finding of the paper, will there be a table provided that shows the distribution of ∆ shifts across animals?

      We apologize for the mistake. The horizontal bar was the marking of the mean. Since the SEM is small, we corrected the graph for better visualization of the SEM.

      (3) Figure 1g Showing the running average of data where there is almost none or no data points (beyond 50 Hz) seems not ideal. Is the enhanced entrainment around 40Hz significant? Perhaps the moving average should be replaced by binned data with indicated n?

      We prefer to show all data points instead of binning the data so the reader can see it all. We agree that such a wide range on the x-axis is unnecessary. We shortened this graph to include only the firing rate range spanned by the data points.

      (4) Figure 1h Impressive result!

      Thank you!

      (5) Figure S1a: since the authors show the respiratory pattern here and there obviously was no alignment of light stimulation with inspiration, was there any correlation between the respiratory phase and efficiency of light stimulation with respect to lateral interactions?

      This is an interesting idea. In Haddad et al., 2013, figure 7, the authors performed a similar analysis and showed that optogenetic activation of MTCs had a more pronounced effect on firing rate in the respiration phases in which the neuron fired less. However, we haven’t quantified the impact of lateral interactions with respect to the respiration phase. That being said, the data will be publicly available to test this question.

      (6) Figure S1c: Here the shift towards a lower firing rate seems to be obvious (see comment in Figures 1 d and e). Please also show the plot for Figure 1e.

      This panel shows the power spectrum density of the average neuron's response across trials computed over the entire stimulus window (100ms). We decided to remove this panel, as panel Figure 1d shows the evolution of the entrainment in time and, therefore, provides better insight into the effect.

      (7) Figure 2b: show the same plot also for pair 2? Why is it stated that there is no lateral suppression for lateral stimulation alone, if the MTC did not spike spontaneously in the first place and thus inhibition cannot be demonstrated?

      We use Figure 2b to demonstrate the effect of lateral inhibition, and in Figure 2c we detail the responses under each light intensity for both pairs. We think that showing the mean and SEM for one example is enough to give a sense of the effect, as in Figure 2c we show the average response across time together with significant assessment for each pair (panels without a p-value have no significant difference between the conditions).

      We agree with the comment on this specific example and therefore deleted this sentence. At the population level, however, we found no inhibition when activating the lateral spots, regardless of their firing rates (shown in Supplementary Figure 2a).

      (8) Figure 2d: why is there no distance-dependent color coding for the significant data points? Or, alternatively, since the distance plot is shown in 2e, perhaps drop this information altogether? Again, the moving average is problematic.

      Distance-dependent color coding is applied to all data points in this panel. Significant data points are shown in full circles and have distance-dependent color coding, which is mainly restricted to the lower part of the distance scale (cold colors).

      We used a moving average to relate to the similar result reported in Arevian 2008. In Figure 2e, the actual distance for each data point is indicated on the x-axis.

      (9) Figure 2f: the diagonal averaging method seems to neglect a lot of the data in Figure S2b, why not use radial coordinates for averaging?

      Thank you for the great suggestion. We indeed performed radial coordinates for the averaging, and the results are more robust and better summarize the entire data.

      (10) Figure 3: These are interesting observations, but are there cumulative data on such types of pairs? Please describe and show, otherwise this can only be a supplemental observation. Regarding 3b was it always the lower light intensity that resulted in suppression and the higher in sync? Since Burton et al. 2024 have just shown that PVNs require very little input to fire!

      This figure shows several examples of entrainment and inhibition properties. As suggested, we added a population analysis (Figure 3c-d). This analysis compares the firing rate changes in pairs that evoked significant suppression or entrainment. First, we found only a few pairs in which paired activation evoked both spike entrainment and suppression. Second, the mean firing rate change of pairs that evoked significant entrainment (N=50, shown in Figure 1f in full circles) is significantly different from the mean of the pairs that evoked significant lateral inhibition (N=51, shown in Figure 2d in full circles).

      (11) Figure 4: This Figure and the corresponding section should be entitled "Additional GC activation... ", otherwise it might be confusing for the reader. A loss of function manipulation (local GC silencing) would be also great to have! You did this in the previous paper, why not here? Raw LFP data are not shown. In Figure 4e the reported odor response firing rate ranges only up to 40Hz, but the example in g shows a much higher frequency. Is the maximum in 4e significant? (same issue as for Figure 1g).

      We changed the phrase to ‘optogenetic GCL neurons activation’. Unfortunately, we haven’t performed experiments where we suppress GC columns. In the previous paper, we suppressed the activity of all accessible GCs, which resulted in reduced spike synchronization to the OB gamma oscillations. Silencing only the GC column is, we think, unlikely to have a substantial effect, especially if the GCs have low activity (but this needs to be tested). Furthermore, we added examples of raw LFP data for odor stimulation and odor combined with GCL column activation (see Supplementary Figure 4a).

      The instantaneous firing rate is high (~80Hz); however, the firing rate values we report in Figure 4e are averages within a window of 2 seconds (the odor duration is 1.5 seconds, and we extend the window to account for responses with a late return to baseline). The average firing rate of this example neuron in this window was 28Hz.
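
      The difference between the two measures can be made concrete with a short sketch. The spike train below is made up for illustration and is not data from the study, and the function names are hypothetical. It is constructed so that a brief burst at about 80 Hz coexists with a 2-second window average of 28 Hz.

```python
def mean_rate_hz(spike_times_s, t_start_s, t_end_s):
    """Average firing rate (spikes/s) inside [t_start_s, t_end_s)."""
    n_spikes = sum(1 for t in spike_times_s if t_start_s <= t < t_end_s)
    return n_spikes / (t_end_s - t_start_s)

def peak_instantaneous_rate_hz(spike_times_s):
    """Highest instantaneous rate, taken as 1 / shortest inter-spike interval."""
    times = sorted(spike_times_s)
    shortest_isi = min(b - a for a, b in zip(times, times[1:]))
    return 1.0 / shortest_isi

# Toy spike train (illustrative only): a burst followed by sparse firing.
burst = [0.1 + k * 0.0125 for k in range(41)]   # 41 spikes, 12.5 ms apart
sparse = [0.7 + k * 0.08 for k in range(15)]    # 15 spikes, 80 ms apart
spikes = burst + sparse

window_mean = mean_rate_hz(spikes, 0.0, 2.0)    # 56 spikes over 2 s
burst_peak = peak_instantaneous_rate_hz(spikes)  # set by the 12.5 ms intervals
```

      A window average therefore understates the rate reached during the burst, which is why the panel showing the averaged response and the single-neuron example can report very different values for the same cell.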

      (12) Fig 5: what does "proximal" mean - does this mean stimulation of the GCs below the recorded MTC, that might actually belong to the same glomerular unit?

      Yes, by “proximal” we mean the activation of the GC in the column of the recorded MTC. However, we decided that instead of coarsely dividing the data into proximal and distal optogenetic activation of GCL neurons, we will show the data continuously to show that GC had no significant effect on MTC odor-evoked firing rates regardless of their location (Figure 5d).

      A comment on the title:

      Please tone it down: "Ensemble synchronization" is a hypothesis at this point, not directly shown in the paper. Also, the paper does not show lateral interactions between odor-activated neurons.

      We agree and have rephrased it to “Activity-dependent lateral inhibition enables the synchronization of active olfactory bulb projection neurons”.

      (1) Figure 1a, 2a scale bar missing.

      Corrected, thank you.

      (2) Figure 1 c is the "rebound" in the lateral stim trace (green) real or not significant?

      The activity during this rebound is not significantly different than the baseline activity before light stimulation.

      (3) Figure 2b legend: "lateral alone" instead of lateral?

      We appreciate the suggestion. For simplicity, we will keep it as “lateral”.

      (4) Figure 2c: some of the data plots seem to be breaking off, e.g. the blue line in the bottom third one.

      The line breaks because there are no spikes in this period. The PSTHs used in all analyses result from the convolution of the spike train with a Gaussian window with a standard deviation of 50 ms.
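
      As an illustration only, a Gaussian-smoothed PSTH of this kind can be sketched as follows. The 50 ms standard deviation comes from the response above; the 1 ms bin width, the truncation of the kernel at 4 standard deviations, and the function name are assumptions, and this is not the authors' code. With a truncated kernel, bins far from any spike are exactly zero, which is one way a plotted trace can appear to break off.

```python
import math


def gaussian_psth(spike_times_s, t_start_s, t_stop_s, bin_s=0.001, sigma_s=0.050):
    """Smooth a binned spike train with a Gaussian kernel; returns spikes/s per bin."""
    n_bins = int(round((t_stop_s - t_start_s) / bin_s))

    # Bin the spikes.
    counts = [0.0] * n_bins
    for t in spike_times_s:
        idx = int((t - t_start_s) / bin_s)
        if 0 <= idx < n_bins:
            counts[idx] += 1.0

    # Gaussian kernel truncated at +/- 4 sigma (assumption), normalized so that
    # it integrates to 1 and the smoothed trace is in spikes per second.
    half = int(round(4 * sigma_s / bin_s))
    kernel = [math.exp(-0.5 * ((k * bin_s) / sigma_s) ** 2) for k in range(-half, half + 1)]
    norm = sum(kernel) * bin_s
    kernel = [w / norm for w in kernel]

    # Convolve: spread each spike's contribution over neighboring bins.
    rate = [0.0] * n_bins
    for i, c in enumerate(counts):
        if c == 0.0:
            continue
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n_bins:
                rate[j] += c * w
    return rate


# A single synthetic spike about 1 s into a 2 s trial.
rate = gaussian_psth([1.0005], 0.0, 2.0)
```

      The smoothed trace peaks at the spike's bin, integrates to one spike, and is zero at the start of the trial where no spike falls within the kernel's reach.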

      (5) Figure 2f: Why is the x axis flipped vs 2d,e?

      This panel was mistakenly plotted that way, and was corrected.

      Comments on the text:

      Abstract - we had indicated suggestions by strike-throughs and color which are lost in the online submission system, please compare with your original text:

      Information in the brain is represented by the activity of neuronal ensembles. These ensembles are adaptive and dynamic, formed and truncated based on the animal's experience. One mechanism by which spatially distributed neurons form an ensemble is via synchronization of their spiking activity in response to a sensory event. In the olfactory bulb, odor stimulation evokes rhythmic gamma activity in spatially distributed mitral and tufted cells (MTCs). This rhythmic activity is thought to enhance the relay of odor information to the downstream olfactory targets. However, how only specifically the odor-activated MTCs are synchronized is unknown. Here, we demonstrate that light optogenetic activation of activating one set of MTCs can gamma-entrain the spiking activity of another set. This lateral synchronization was particularly effective when both MTCs fired at the gamma rhythm, facilitating the synchronization of only the odor-activated MTCs. Furthermore, we show that lateral synchronization did not depend on the distance between the MTCs and is mediated by granule cells. In contrast, lateral inhibition between MTCs that reduced their firing rates was spatially restricted to adjacent MTCs and was not mediated by granule cells. Our findings reveal lead us to propose ? a simple yet robust mechanism by which spatially distributed neurons entrain each other's spiking activity to form an ensemble.

      Thank you. We adopted most of the changes and edited the abstract to reflect the reported results better.

      "both MTCs fired at the gamma rhythm"/this is at this point unwarranted since the mutual entrainment is not shown - tone down or present as hypothesis?

      We completely agree. This sentence was changed to “This lateral synchronization was particularly effective when the recorded MTC fired at the gamma rhythm, facilitating the synchronization of the active MTC”.

      l. 28: distance-independent instead of "spatially independent"?

      Corrected

      l. 46: are there inhibitory neurons in the ONL? Or which 6 layers are you referring to here?

      Corrected to “spanning all OB layers”.

      l. 49: "is mediated" => "likely to be mediated". Schoppa's work is in vitro and did not account for PVNs, see comment in Public Review.

      Corrected. Indeed, Schoppa's work was performed in vitro. We cite it here since it showed that the synchronized firing of two MTC pairs depends on granule cells.

      l.52: "method"? rather "mechanism"? "specifically" instead of "only"?

      Corrected.

      l.52: perhaps more precise: a recent hypothesis is that GCs enable synchronization solely between odor-activated MTCs via an activity-dependent mechanism for GABA-release (Lage Rupprecht et al. 2020 - please cite the experimental paper here). Again, Galan has no direct evidence for GCs vs PVNs, see comment in Public Review.

      Thank you, we updated this sentence here and in the discussion and added the relevant citation.

      l. 66: spike timings instead of spike's timing?

      Corrected to spike timings

      l. 67 -71: this part could be dropped.

      We appreciate the suggestion; however, we think that it is convenient to briefly read the main results before the results section.

      l. 76 mouse instead of mice.

      Corrected.

      l. 77: for clarification: " a single MTC"?

      In some cases, we recorded more than one cell simultaneously.

      l. 89: just use "hotspot".

      Corrected

      l. 97 instead of "change", "positive change" or "increase"?

      We left the word change, since we wanted to report that the change between hotspot alone and paired stimulation was significantly higher than zero.

      l. 104: the postsyn MTC's firing rate.

      Corrected to MTC instead of MTCs

      l.108: "distributed on the OB surface" sounds misleading, perhaps "across the glomerular map"?

      Corrected.

      l. 254: "which the MTCs form with each other"- perhaps "which interconnect MTCs".

      Corrected.

      l. 270 Additional GC activation.

      Corrected to ‘optogenetic activation of GCL neurons’

      l. 284 somewhat unclear - please expand.

      Corrected to ‘This measure minimizes the bias of the neuron's firing rate on the spike-LFP synchrony value’.

      l. 371: no odors in Schoppa et al.

      Corrected to ‘It has been shown that two active MTCs can synchronize their stimulus-evoked and odor-evoked spike timings’

      l. 406 ff. good point - but where is the transition? How does this observation rule out that GCs can mediate lateral suppression?

      It is an important question. We tested two setups of optogenetic GC activation: either column activation (in this paper) or activation of all accessible GCs of the dorsal OB (Dalal & Haddad, 2022). Although the latter manipulation results in significant firing rate suppression, the effect of MTC suppression was relatively small in anesthetized mice and even smaller in awake mice. Optogenetically activating GCs at baseline conditions resulted in a strong suppression of only the adjacent MTCs. Taken together, we think that GCs are capable of strongly inhibiting MTCs, but that this is not their main function in natural olfactory sensation.

      l. 422 ff: again, this is a hypothesis, please frame accordingly.

      Corrected to ‘Activity-dependent synchronization can enable the synchronization of odor-activated MTCs that are dispersed across the glomerular map’

      l. 551 typo.

      Corrected.

      l 556 ff: Figure 2 does not show odor responses.

      Corrected.

      l 582: Mix up of above/below and low/high?

      Corrected to ‘The values in the STA map that were above or below these high and low percentile thresholds’

      Reviewer #3 (Recommendations For The Authors):

      Line 76: "Ai39" should be corrected to "Ai32".

      Corrected. Thank you.

      Figure Legends: The legends should describe the results rather than interpret the data. For instance, the legends for Figures 1f, g, and h contain interpretations. The authors should review all legends and revise them accordingly.

      We appreciate the comment. However, we kindly disagree. We don’t see these opening sentences as interpretations but as guidance to the reader. For example, ‘Paired stimulation increases spikes’ temporal precision’ is not an interpretation; instead, it describes the finding presented in this panel. We think that legends that only repeat what can already be deduced from the graph are not helpful and, in many cases, obsolete. Explaining what we think this graph shows is common, and we prefer it as it helps the reader.

      For Figures 1d and e, it may be beneficial to add the spectrograms for the second stimulation alone.

      We show the stimulation of the hotspot alone and when we stimulate both. The spectrogram of the lateral alone does not show anything of importance.

      Figures 1a and 2a: Please add color bars so that readers can understand the meaning of the colors plotted.

      Color bars were added.

      Figure 3: The purpose of this figure is unclear. Why does the baseline firing rate for the paired activation differ? Is this an isolated observation, or is it observed in other units as well?

      This issue was also raised by reviewer #2. Our response to reviewer #2 is reproduced here:

      This figure shows several examples of entrainment and inhibition properties. As suggested, we added population analysis (Figure 3c-d). This analysis compares the firing rate changes in pairs that evoked significant suppression or entrainment. First, we found only a few pairs in which paired activation evoked both spike entrainment and suppression. Second, the mean of firing rate changes of pairs that evoked significant entrainment (N=50, shown in Figure 1f in full circles) is significantly different from the mean of the pairs that evoked significant lateral inhibition (N=51, shown in Figure 2d in full circles).

      Figures 4 and 5 data seem to come from the same dataset as in Dalal and Haddad (2022) DOI: https://doi.org/10.1016/j.celrep.2022.110693. For example, the fluorescence image looks identical. If this is the case, the authors may want to state that the image and some of the data and analyses are reproduced.

      The recorded data shown in these figures are not reproduced from Dalal & Haddad 2022. We collected these data using GC-column activation instead of light-activating the entire dorsal OB surface, as was done in the 2022 paper.

      However, the histology image is the same and we now replaced it with a new image, which shows that the expression is restricted to the GCL.

      Figure 4d: the authors use the data plotted here to argue that the gamma entrainment is distance-independent. But there is a clear decrease over distance (e.g., delta PPC1 over 0.01 is not seen for distance beyond 1000 µm). The claim of distance independence may be an over-interpretation of the data. Peace et al. (2024) also claimed that coupling via gamma oscillations occurs over a large spatial extent.

      From a statistical point of view, we can’t state that there is a dependency on distance, as the correlation is insignificant (P = 0.86). PPC1 values of 0.01 can be found at 0, 500, and 700 microns. Lower values are found at far distances, but this can result from a smaller number of points. The reduced level of synchrony observed at distances above one mm could be the result of the reduced density of lateral interactions at these distances. That said, we rephrased the sentence as a more careful statement. Please see the rephrased sentence in the Public Review section.
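
      For readers who want to see what testing such a distance dependence can look like, here is a minimal sketch using a Pearson correlation with a permutation-based p-value. The response does not state which correlation test was used, so the choice of test, the function names, and the data points below are assumptions for illustration only; the values are synthetic and are not taken from Figure 4d.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: how often a shuffled pairing is at least as correlated."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Synthetic, hypothetical values: distance between stimulation site and
# recorded cell (microns) and change in spike-LFP synchrony (delta PPC1).
distance_um = [0, 100, 250, 400, 500, 700, 900, 1100, 1300, 1500]
delta_ppc1 = [0.004, -0.002, 0.006, 0.001, 0.009, 0.010, 0.002, 0.003, -0.001, 0.004]

r = pearson_r(distance_um, delta_ppc1)
p_value = permutation_p_value(distance_um, delta_ppc1)
```

      A large p-value in such a test means that the data do not demonstrate a dependence on distance. It does not by itself establish independence, which is why a more careful wording is appropriate.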

    1. Author response:

      Reviewer #1:

      Summary:

      In this manuscript, Bisht et al address the hypothesis that protein folding chaperones may be implicated in aggregopathies and in particular Tau aggregation, as a means to identify novel therapeutic routes for these largely neurodegenerative conditions.

      The authors conducted a genetic screen in the Drosophila eye, which facilitates the identification of mutations that either enhance or suppress a visible disturbance in the nearly crystalline organization of the compound eye. They screened by RNA interference all 64 known Drosophila chaperones and revealed that mutations in 20 of them exaggerate the Tau-dependent phenotype, while 15 ameliorated it. The enhancer of the degeneration group included 2 subunits of the typically heterohexameric prefoldin complex and other co-translational chaperones.

      In a previous paper, we identified 95 Drosophila chaperones (Raut et al., 2017). We request that “all 64 known Drosophila chaperones” be replaced with “64 out of 95 known Drosophila chaperones” to make it factually correct.

      Strengths:

      Regarding this memory defect upon V377M tau expression. Kosmidis et al (2010) pmid: 20071510, demonstrated that pan-neuronal expression of TauV377M disrupts the organization of the mushroom bodies, the seat of long-term memory in odor/shock and odor/reward conditioning. If the novel memory assay the authors use depends on the adult brain structures, then the memory deficit can be explained in this manner.

      If the mushroom bodies are defective upon TauV377M expression does overexpression of Pfdn5 or 6 reverse this deficit? This would argue strongly in favor of the microtubule stabilization explanation.

      We agree that the disrupted organization of the mushroom body may cause memory deficits upon hTauV337M expression and that expression of Pfdn5 or Pfdn6 could reverse the deficits. One possible mechanism by which overexpression of Pfdn5/6 could rescue the Tau-induced memory deficits is the stabilization of microtubules in the mushroom bodies.

      Proposed revision: We will assess if Tau-induced mushroom body disruption can be rescued with the overexpression of Pfdn5 or Pfdn6.

      Weakness:

      What is unclear however is how Pfdn5 loss or even overexpression affects the pathological Tau phenotypes. Does Pfdn5 (or 6) interact directly with TauV377M? Colocalization within tissues is a start, but immunoprecipitations would provide additional independent evidence that this is so.

      Our data suggest that Pfdn5 stabilizes neuronal microtubules by directly associating with them, and that loss of Pfdn5 exacerbates Tau phenotypes by destabilizing microtubules. However, as the reviewer notes, analysis of direct interaction between Pfdn5 and hTauV337M might provide further insights into the mechanism of Pfdn5 and Tau aggregation.

      Proposed revision: We will perform colocalization analysis and coimmunoprecipitation to ask if Pfdn5 colocalizes and directly interacts with Tau.

      Does Pfdn5 loss exacerbate TauV377M phenotypes because it destabilizes microtubules, which are already at least partially destabilized by Tau expression? Rescue of the phenotypes by overexpression of Pfdn5 agrees with this notion.

      However, Cowan et al (2010) pmid: 20617325 demonstrated that wild-type Tau accumulation in larval motor neurons indeed destabilizes microtubules in a Tau phosphorylation-dependent manner. So, is TauV377M hyperphosphorylated in the larvae? What happens to TauV377M phosphorylation when Pfdn5 is missing and presumably more Tau is soluble and subject to hyperphosphorylation as predicted by the above?

      Proposed revisions: We will overexpress Pfdn5 or Pfdn6 with hTauV337M and ask if the microtubule disruption caused by hTauV337M is rescued. Further, we will analyze the phospho-Tau levels in control and Pfdn5 mutant backgrounds.

      Expression of WT human Tau (which is associated with most common Tauopathies other than FTDP-17) as Cowan et al suggest has significant effects on microtubule stability, but such Tau-expressing larvae are largely viable. Will one mutant copy of the Pfdn5 knockout enhance the phenotype of these larvae? Will it result in lethality? Such data will serve to generalize the effects of Pfdn5 beyond the two FTDP-17 mutations utilized.

      Proposed revision: We will incorporate data about the effect of heterozygous mutation of Pfdn5 on the lethality and synaptic phenotypes associated with hTauWT and hTauV337M in the revised manuscript.

      Does the loss of Pfdn5 affect TauV377M (and WT Tau) levels? Could the loss of Pfdn5 simply result in increased Tau levels? And conversely, does overexpression of Pfdn5 or 6 reduce Tau levels? This would explain the enhancement and suppression of TauV377M (and possibly WT Tau) phenotypes. It is an easily addressed, trivial explanation at the observational level, which if true begs for a distinct mechanistic approach.

      We thank the reviewer for suggesting an alternate model for Pfdn5 function. We will perform Western blot analysis to assess TauWT and TauV337M levels in the absence of Pfdn5 and in animals coexpressing Tau and Pfdn5. We will incorporate these data and conclusions in the revised manuscript.

      Finally, the authors argue that TauV377M forms aggregates in the larval brain based on large puncta observed especially upon loss of Pfdn5. This may be so, but protocols are available to validate molecularly the presence of insoluble Tau aggregates (for example, pmid: 36868851) or soluble Tau oligomers, as these apparently differentially affect Tau toxicity. Does Pfdn5 loss exaggerate the toxic oligomers, and does overexpression promote the more benign large aggregates?

      We will perform the Tau solubility assay in controls, in the absence of Pfdn5, and in animals coexpressing Tau and Pfdn5. Moreover, we will also ask if the large Tau puncta formed in the absence of Pfdn5 are soluble oligomers or stable aggregates. We have found that the coexpression of Tau and Pfdn5 does not result in the formation of Tau aggregates. We will incorporate these and other relevant data in the revised manuscript.

      Reviewer #2 (Public review):

      Bisht et al detail a novel interaction between the chaperone, Prefoldin 5, microtubules, and tau-mediated neurodegeneration, with potential relevance for Alzheimer's disease and other tauopathies. Using Drosophila, the study shows that Pfdn5 is a microtubule-associated protein, which regulates tubulin monomer levels and can stabilize microtubule filaments in the axons of peripheral nerves. The work further suggests that Pfdn5/6 may antagonize Tau aggregation and neurotoxicity. While the overall findings may be of interest to those investigating the axonal and synaptic cytoskeleton, the detailed mechanisms for the observed phenotypes remain unresolved and the translational relevance for tauopathy pathogenesis is yet to be established. Further, a number of key controls and important experiments are missing that are needed to fully interpret the findings. The major weakness relates to the experiments and claims of interactions with Tau-mediated neurodegeneration. In particular, it is unclear whether knockdown of Pfdn5 may cause eye phenotypes independent of Tau. Further, the GMR>tau phenotype appears to have been incorrectly utilized to examine age-dependent neurodegeneration.

      We have consistently found the progression of eye degeneration in the population of animals expressing TauV337M, measured as the number of fused ommatidia/total number of ommatidia, with age. A few other studies have also shown age-dependent progressive degeneration in Drosophila retinal axons or lamina (Iijima-Ando et al., 2012; Sakakibara et al., 2018). We appreciate other studies that have proposed hTau-induced eye degeneration as a developmental defect (Malmanche et al., 2017; Sakakibara et al., 2023).

      Proposed revision: a) We will analyze the age-dependent neurodegeneration in the adult brain to further support our main conclusion that Pfdn5 ameliorates hTauV337M-induced progressive neurodegeneration.

      b) We have used three independent Pfdn5 RNAi lines (the RNAi lines target different regions of Pfdn5), all of which enhance the Tau phenotypes. Knockdown of Pfdn5 with any of these RNAi lines driven by GMR-Gal4 does not give detectable eye phenotypes. We will include these data in the revised manuscript.

      This manuscript argues that its findings may be relevant to thinking about mechanisms and therapies applicable to tauopathies; however, this is premature given that many questions remain about the interactions from Drosophila, the detailed mechanisms remain unresolved, and absent evidence that tau and Pfdn may similarly interact in the mammalian neuronal context. Therefore, this work would be strongly enhanced by experiments in human or murine neuronal culture or supportive evidence from analyses of human data.

      Proteome analysis of Alzheimer's brain tissue shows that the Pfdn5 level is reduced in patients (Askenazi et al., 2023; Tao et al., 2020). Moreover, the Pfdn5 expression level was found to be reduced in the blood samples from AD patients (Ji et al., 2022). Another study further validates the age-dependent reduction of Pfdn5 in the tauopathy transgenic murine model (Kadoyama et al., 2019). Together, these reports highlight a potential link between Pfdn5 levels and tauopathies. We will revise the manuscript to reflect these findings in more detail.

      References

      Askenazi, M., Kavanagh, T., Pires, G., Ueberheide, B., Wisniewski, T., and Drummond, E. (2023). Compilation of reported protein changes in the brain in Alzheimer's disease. Nat Commun 14, 4466. 10.1038/s41467-023-40208-x.

      Iijima-Ando, K., Sekiya, M., Maruko-Otake, A., Ohtake, Y., Suzuki, E., Lu, B., and Iijima, K.M. (2012). Loss of axonal mitochondria promotes tau-mediated neurodegeneration and Alzheimer's disease-related tau phosphorylation via PAR-1. PLoS Genet 8, e1002918. 10.1371/journal.pgen.1002918.

      Ji, W., An, K., Wang, C., and Wang, S. (2022). Bioinformatics analysis of diagnostic biomarkers for Alzheimer's disease in peripheral blood based on sex differences and support vector machine algorithm. Hereditas 159, 38. 10.1186/s41065-022-00252-x.

      Kadoyama, K., Matsuura, K., Takano, M., Maekura, K., Inoue, Y., and Matsuyama, S. (2019). Changes in the expression of prefoldin subunit 5 depending on synaptic plasticity in the mouse hippocampus. Neurosci Lett 712, 134484. 10.1016/j.neulet.2019.134484.

      Malmanche, N., Dourlen, P., Gistelinck, M., Demiautte, F., Link, N., Dupont, C., Vanden Broeck, L., Werkmeister, E., Amouyel, P., Bongiovanni, A., et al. (2017). Developmental Expression of 4-Repeat-Tau Induces Neuronal Aneuploidy in Drosophila Tauopathy Models. Sci Rep 7, 40764. 10.1038/srep40764.

      Raut, S., Mallik, B., Parichha, A., Amrutha, V., Sahi, C., and Kumar, V. (2017). RNAi-Mediated Reverse Genetic Screen Identified Drosophila Chaperones Regulating Eye and Neuromuscular Junction Morphology. G3 (Bethesda) 7, 2023-2038. 10.1534/g3.117.041632.

      Sakakibara, Y., Sekiya, M., Fujisaki, N., Quan, X., and Iijima, K.M. (2018). Knockdown of wfs1, a fly homolog of Wolfram syndrome 1, in the nervous system increases susceptibility to age- and stress-induced neuronal dysfunction and degeneration in Drosophila. PLoS Genet 14, e1007196. 10.1371/journal.pgen.1007196.

      Sakakibara, Y., Yamashiro, R., Chikamatsu, S., Hirota, Y., Tsubokawa, Y., Nishijima, R., Takei, K., Sekiya, M., and Iijima, K.M. (2023). Drosophila Toll-9 is induced by aging and neurodegeneration to modulate stress signaling and its deficiency exacerbates tau-mediated neurodegeneration. iScience 26, 105968. 10.1016/j.isci.2023.105968.

      Tao, Y., Han, Y., Yu, L., Wang, Q., Leng, S.X., and Zhang, H. (2020). The Predicted Key Molecules, Functions, and Pathways That Bridge Mild Cognitive Impairment (MCI) and Alzheimer's Disease (AD). Front Neurol 11, 233. 10.3389/fneur.2020.00233.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This valuable work advances our understanding of the foraging behaviour of aerial insectivorous birds. Its major strength is the large volume of tracking data and the accuracy of those data. However, the evidence supporting the main claim of optimal foraging is incomplete.

      We deeply appreciate the thoughtful review provided by the reviewers, including their valuable insights and meticulous attention to detail. Each comment has been thoroughly evaluated, leading to substantial improvements in the manuscript. Your constructive critique has been instrumental in refining our research and rectifying any oversights. We are confident that the revised article will make a substantial contribution to ecological research, particularly in advancing our understanding of foraging theories and the behaviors of aerial insectivores.

      Public Reviews:

      Reviewer #1 (Public Review):

      This study tests whether Little Swifts exhibit optimal foraging, which the data seem to indicate is the case. This is unsurprising as most animals would be expected to optimize the energy income: expenditure ratio; however, it hasn't been explicitly quantified before the way it was in this manuscript.

      The major strength of this work is the sheer volume of tracking data and the accuracy of those data. The ATLAS tracking system really enhanced this study and allowed for pinpoint monitoring of the tracked birds. These data could be used to ask and answer many questions beyond just the one tested here.

      The major weakness of this work lies in the sampling of insect prey abundance at a single point on the landscape, 6.5 km from the colony. This sampling then requires the authors to work under the assumption that prey abundance is simultaneously even across the study region - an assumption that is certainly untrue. The authors recognize this problem and say that sampling in a spatially explicit way was beyond their scope, which I understand, but then at other times try to present this assumption as not being a problem, which it very much is.

      Further, it is uncertain whether other aspects of the prey data are problematic. For example, the radar only samples insects at 50 m or higher from the ground - how often do Little Swifts forage under 50 m high?

      Another example might be that the phrases "high abundance" and "low abundance" are often used in the manuscript, but never defined.

      It may be fair to say that prey populations might be correlated over space but are not equal. It is this unknown degree of spatial correlation that lends confidence to the findings in the Results. As such, the finding that Little Swifts forage optimally is indeed supported by the data, notwithstanding some of the shortcomings in the prey abundance data. The authors achieved their aims and the results support their conclusions.

      Thanks for this comment.

      The basic assumption of this paper is that the abundance of insect bioflow in the airspace is correlated in space and varies over time. This has been demonstrated by different studies; see, for example, Bell et al. (Bell, J. R., Aralimarad, P., Lim, K. S., & Chapman, J. W. (2013). Predicting insect migration density and speed in the daytime convective boundary layer. PLoS ONE, 8(1), e54202), in which a positive correlation in insect bioflow is demonstrated between different sites that are more than 100 km apart in Southern England. Given the much closer proximity of the colony and the radar site, as well as the large foraging distance of the swifts, which often forage in the vicinity of the radar and beyond it, it is reasonable to assume that the radar was able to successfully capture between-day variation in the abundance of flying insects in the airspace, which is highly relevant for the foraging swifts. This is likely because meteorological variables such as temperature and wind, which tend to vary over a synoptic-system scale of several hundred kilometers, significantly influence the abundance of aerial insects. Furthermore, the direction of insect flight recorded by the radar points to an overall south-north directionality of the insects during the period of the study (Werber et al. Under Review: Werber, Y., Chapman, J. W., Reynolds, D. R. and Sapir, N. Active navigation and meteorological selectivity drive patterns of mass intercontinental insect migration through the Levant). Hence, since the colony is positioned approximately 6.5 km south of the radar site, it is reasonable to assume that the radar is able to reliably estimate the between-day variation in aerial insect abundance experienced by the foraging swifts. Importantly, this between-day variation is very high, and detailed information regarding this variation is provided in the paper.

      We thank the reviewer for the comments on the wording and have corrected it accordingly, so that it is explicitly stated that the spatial distribution of the flying insects is indeed not uniform but is expected to be simultaneously affected by environmental variables, creating spatially correlated bioflow of aerial insects.

      The terms "high abundance" and "low abundance" are relative to the variable being examined. Throughout the manuscript, we did not use these terms to describe an absolute amount or a certain threshold, but rather to describe the ecological circumstances experienced by the birds on different days that varied substantially in the abundance of insects recorded by the radar. However, we have improved the wording of the text so that it is now clear that we refer to relative and not to absolute values.

      At its centre, this work adds to our understanding of Little Swift foraging and extends to a greater understanding of aerial insectivores in general. While unsurprising that Little Swifts act as optimal foragers, it is good to have quantified this and show that the population declines observed in so many aerial insectivores are not necessarily a function of inflexible foraging habits. Further, the methods used in this research have great potential for other work. For example, the ATLAS system poses some real advantages and an exciting challenge to existing systems, like MOTUS. The radar that was used to quantify prey abundance also presents exciting possibilities if multiple units could be deployed to get a more spatially-explicit view.

      To improve the context of this work, it is worth noting that the authors suggest that this work is important because it has never been done before for an aerial insectivore; however, that justification is untrue as it has been assessed in several flycatcher and swallow species. A further justification is that this research is needed due to dramatic insect population declines, but the magnitude and extent of such declines are fiercely debated in the literature. Perhaps these justifications are unnecessary, and the work can more simply be couched as just a test of optimality theory.

      We appreciate the reviewer's helpful comment. A flycatcher is indeed an aerial insect eater, but its foraging strategy is very different from that of swifts. A comparison with the foraging strategy of the swallow is much more relevant. However, the methods used to quantify bird movement in the airspace in previous articles limited the ability to examine the optimal foraging theory in detail. Following the comment, we revised the text to better describe the uniqueness of our research. Further, since we studied insectivores, it is important to provide a broad context for potentially significant threats to the birds, even though these threats are debated.

      Reviewer #2 (Public Review):

      Summary:

      Bloch et al. investigate the relationships between aerial foragers (little swifts) tracked with an automated radio-telemetry system (Atlas) and their prey (flying insects) monitored with a small-scale vertical-looking radar device (BirdScan MR1). The aim of the study was to test whether little swifts optimise their foraging with the abundance of their prey. However, the results provided little evidence of optimal foraging behaviour.

      Strengths:

      This study addresses fundamental knowledge gaps on the prey-predator dynamics in the airspace. It describes the coincidence between the abundance of flying insects and features derived from tracking individual swifts.

      Weaknesses:

      The article uses hypotheses broadly derived from optimal foraging theory, but mixes the form of natural selection: parental energetics, parental survival (predation risks), nestling foraging, and breeding success.

      While this study explores additional behavioral theories alongside optimal foraging theory, its findings unequivocally support the latter. The highly statistically significant reduction in flight distance from the breeding colony in relation to increasing insect abundance (supporting predictions 1 and 2), coupled with an increased rate of colony visits (supporting prediction 5), demonstrates the Little Swifts' adeptness at optimizing their aerial foraging behavior. This behavior manifests in an enhanced frequency of visits to the breeding colony, underscoring their food provisioning maximization.

      Results are partly incoherent (e.g., "Thus, even when the birds foraged close to the colony under optimal conditions, the shorter traveling distance is not thought to not confer lower flight-related energetic expenditure because more return trips were made.", L285-287),

      Thanks for the comment. We have corrected this sentence.

      and confounding factors (e.g., brooding vs. nestling phase) are ignored.

      The breeding stage may indeed affect food provisioning properties, but this factor is not a confound: insect abundance, and the consequent changes in bird foraging properties, fluctuated between sequential days, whereas the brooding and nestling phases each take place over a period of several weeks. Further, despite the possible influence of breeding stage on bird behavior, variability in reproductive stage is expected among pairs in a breeding colony comprising dozens of pairs, even with some coordination in nesting initiation. Practically, the narrow and concealed nest openings hindered direct observation of the nests, making it difficult to determine the precise reproductive stage of each pair. In any case, we added a short description of the dense colony structure to the Methods section.

      Some limits are clearly recognised by the authors (L329 and ff).

      See above the response about the distribution of insects in space.

      To illustrate potential confounding effects, the daily flight duration (Prediction 4) should decrease with prey abundance, but how far does the daily flight duration coincide with departure and arrival at sunrise and sunset (note that day length increases between March and May), respectively, and how much do parents vary in the duration of nest attendance during the day across chick ages?

      We added the following explanation to the Methods section:

      To standardize the effect of day length on daily foraging duration, we calculated and subtracted the day length from the total daily foraging time (Day duration - Daily foraging duration = Net foraging duration). The resulting data represent the daily foraging duration in relation to sunrise and sunset, independent of day length.
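
      The subtraction in parentheses above can be illustrated with a minimal sketch. It follows the formula as written (day duration minus daily foraging duration); the function name, the example date, and the sunrise and sunset times are hypothetical and are not taken from the manuscript.

```python
from datetime import datetime


def net_foraging_hours(sunrise, sunset, foraging_hours):
    """Return day duration minus daily foraging duration, in hours.

    Implements the parenthetical formula quoted above:
    Day duration - Daily foraging duration = Net foraging duration.
    """
    day_hours = (sunset - sunrise).total_seconds() / 3600.0
    return day_hours - foraging_hours


# Hypothetical day: sunrise 06:00, sunset 19:00 (13 h of daylight),
# with a tagged bird foraging for 11.5 h in total.
sunrise = datetime(2019, 4, 1, 6, 0)
sunset = datetime(2019, 4, 1, 19, 0)
print(net_foraging_hours(sunrise, sunset, 11.5))  # 1.5
```

      Because each day's own daylight length is used, a value computed in March can be compared with one computed in May even though the days differ in length.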

      To conclude, insufficient analyses are performed to rigorously assess whether little swifts optimize their foraging.

      We disagree. See our responses above.

      Filters applied on tracking data are necessary but may strongly influence derived features based on maximum or mean values. Providing sensitivity tests or using features less dependent on extreme values may provide more robust results.

      Thank you for highlighting the importance of considering the impact of data filtering on derived features. In our analysis, we employed rigorous filtering methods to emphasize central data tendencies while mitigating the influence of extreme values. These methods, validated through consultation with experts in tracking data analysis, follow established practices in the literature. Detailed descriptions of our filtering procedures can be found in the Methods section, with citations to relevant published studies.

      Radar insect monitoring is incomplete and strongly size-dependent. What is the favourite prey size of swifts? How does it match with BirdScan MR1 monitoring capability?

      We added an explanation to the Methods section to address this comment:

      The Radar Cross Section (RCS) quantifies the reflectivity of a target, serving as a proxy for size by representing the cross-sectional area of a sphere with identical reflectivity to water, whose diameter equals the target's body length. Recent findings indicate that the BirdScan MR1 radar can detect insects with an RCS as low as 3 mm², enabling the detection of insects with body lengths as small as 2 mm. These capabilities make the radar suitable for locating the primary prey of swifts, which typically range in size from 1 to 16 mm.
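
      As a rough arithmetic check of the figures quoted above, the sketch below treats the RCS purely as the geometric cross-sectional area of a sphere whose diameter equals the body length, which is how the sentence describes it. This is a simplifying assumption for illustration only: real radar reflectivity also depends on wavelength, target shape, and orientation, and the function names are hypothetical.

```python
import math


def sphere_cross_section_mm2(diameter_mm):
    """Geometric cross-sectional area (mm^2) of a sphere of given diameter."""
    return math.pi * (diameter_mm / 2.0) ** 2


def diameter_from_cross_section_mm(area_mm2):
    """Diameter (mm) of the sphere whose cross-sectional area is area_mm2."""
    return 2.0 * math.sqrt(area_mm2 / math.pi)


# A 2 mm sphere has a cross-section of about 3.14 mm^2 ...
print(round(sphere_cross_section_mm2(2.0), 2))
# ... and a 3 mm^2 cross-section corresponds to a diameter just under 2 mm.
print(round(diameter_from_cross_section_mm(3.0), 2))
```

      Under this simplification, the stated detection limit of 3 mm² and the stated minimum body length of about 2 mm are consistent with each other.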

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Lines 53-59 - major run-on sentence

      Thanks for the comment. Done.

      Line 133 - describe better. Attached where? Were feathers clipped or removed?

      Thanks for the comment. Done.

      Line 153 - shouldn't be a new paragraph

      Done.

      Line 157 - justify choosing four 

      To ensure a robust analysis of swifts' behavior relative to food abundance across multiple individuals simultaneously, we opted to exclude data from instances where only 3 tags were active. This decision was motivated by the fact that these instances accounted for only 2.9% of the data, and their exclusion minimally impacted overall data volume while enhancing data quality. In contrast, instances with 4 tags, comprising 16.2% of the data, provided substantial insights. Omitting these instances would have resulted in significant data loss. Thus, setting a threshold of 4 simultaneous tags represents a balance between maintaining adequate data quantity and ensuring high data quality for meaningful analysis.
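
      The thresholding described above can be sketched as follows, assuming the tracking data have already been summarized as the number of simultaneously active tags per time bin. The function names and the toy counts are hypothetical; the toy data do not reproduce the 2.9% and 16.2% shares reported in the response.

```python
from collections import Counter

MIN_ACTIVE_TAGS = 4  # threshold stated in the response


def keep_time_bins(active_tags_per_bin, min_tags=MIN_ACTIVE_TAGS):
    """Keep only time bins with at least `min_tags` simultaneously active tags."""
    return [n for n in active_tags_per_bin if n >= min_tags]


def share_by_tag_count(active_tags_per_bin):
    """Fraction of time bins observed for each number of active tags."""
    counts = Counter(active_tags_per_bin)
    total = len(active_tags_per_bin)
    return {n: c / total for n, c in counts.items()}


# Toy example: number of active tags in eight consecutive time bins.
bins = [3, 4, 4, 5, 6, 4, 3, 7]
print(keep_time_bins(bins))         # bins with 3 tags are dropped
print(share_by_tag_count(bins)[3])  # share of data lost at this threshold
```

      Tabulating the share of data at each tag count in this way is what allows the cost of a given threshold (data lost) to be weighed against its benefit (more individuals tracked at once).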

      It took me a long time to determine whether the average and maximum flight distance was actual or Euclidean. It was only in the Results that I grasped it was actual. Define up front in the Methods.

      Thanks for the comment. Done.

      In my public review, I mention that optimal foraging has been assessed in other aerial insectivores. Here are some of the papers I was referring to:

      • Davies (1977) Prey selection and the search strategy of the spotted flycatcher (Muscicapa striata): A field study on optimal foraging. Animal Behaviour 25: 1016-1022.

      • Lifjeld & Slagsvold (1988) Effects of energy costs on the optimal diet: an experiment with pied flycatchers Ficedula hypoleuca feeding nestlings. Ornis Scandinavica 19: 111-118.

      • Quinney & Ankney (1985) Prey size selection by tree swallows. Auk 102: 245-250.

      • Turner (1982) Optimal foraging by the swallow (Hirundo rustica, L): Prey size selection. Animal Behaviour 30:862-872.

      Lastly, in terms of the work not being spatially-explicit, I do note that in lines 323-324 you acknowledge that prey populations can be patchy, then ten lines later, you provide citations to say that patchiness is not a problem because of spatial correlations. This is a bit overly dismissive, in my view, and to suggest (lines 336-337) that "patches of high insect concentration...might not exist at all" is certainly incorrect (and misleading). I do note the valiant attempt to address the spatial shortcoming in the remainder of the paragraph - although addressing it does not make the problem go away.

      Thanks for the comment.

      We revised the text to make it more coherent.

      Reviewer #2 (Recommendations For The Authors):

      L161: typo > missing space in 'meanof'

      Corrected.

      L192-193: Did the authors use the timing of sunrise and sunset to determine daytime?

      Yes. The daytime was calculated in relation to sunrise and sunset.

      Did the authors calculate the MTR from sunrise to sunset, or averaging the hourly MTR?

      If using hourly MTR, specify the criteria to assign an hourly MTR to daytime when sunset/sunrise is happening during that hour.

      A simplified terminology for "Average daily insect MTR" might be useful, in particular for the result section (insect MTR).

      Average daily insect MTR is calculated for a fixed period from 5 am to 8 pm local time. An explanation has been added to the Methods section, and the terminology in the text has been simplified as suggested.
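
      A minimal sketch of averaging over the fixed 5 am to 8 pm window is given below. It assumes, purely for illustration, that MTR is available as one value per clock hour and that the window covers the hours starting at 05:00 through 19:00; the response does not state whether hourly values were averaged, and the function name and data are hypothetical.

```python
def mean_daily_mtr(hourly_mtr, start_hour=5, end_hour=20):
    """Average MTR over the hours start_hour <= h < end_hour.

    `hourly_mtr` maps the starting clock hour (0-23, local time) to the
    MTR measured during that hour. Hours missing from the mapping are skipped.
    """
    values = [hourly_mtr[h] for h in range(start_hour, end_hour) if h in hourly_mtr]
    if not values:
        raise ValueError("no MTR values inside the daily window")
    return sum(values) / len(values)


# Toy day: constant background of 100 with a midday peak of 250.
hourly = {h: 100.0 for h in range(24)}
hourly[12] = 250.0
print(mean_daily_mtr(hourly))  # 110.0
```

      A fixed clock window avoids having to decide how to assign an hour in which sunrise or sunset falls, at the cost of including slightly different amounts of twilight as day length changes.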

      Note that the 'M' of MTR stands for migration, which may not be appropriate in this context, and simply using "insect traffic rate" may be a better terminology.

      Thanks for the comment. The 'M' of MTR can also stand for movement, as the insects detected by the radar move in the airspace. This is how this term has been defined in the paper (e.g. in line 23 of the Summary section). Therefore, we did not change the terminology to “insect traffic rate”, which is a term not used in other studies.

      Considering the large number of predictions (10!), it would be appropriate to list them in the results (e.g., "on the daily average flight distance from the breeding colony (Prediction 3)").

      We added prediction numbers to the Results and the Discussion.

      Note that the terminology varies; e.g., in the introduction "overall daily flight distance" (L75), in the results "average length of the daily flight route" (L236), and further confusion with "daily average flight distance from the breeding colony" (L232).

      Thanks for the comment. Fixed.

      The terminology - average daily 'air/flight' distance (L74-76) - needs clarification.

      Done.

      Results: Use only a relevant and consistent number of decimals to report on the effect size and p-values.

      Done.

      The authors are citing non-peer-reviewed publications:

      21. Bloch I, Troupin D, Sapir N. Movement and parental care characteristics during the nesting season of the Little Swift (Apus affinis) [Poster presentation]. 12th European Ornithologists' Union Congress. Cluj Napoca, Romania. 2019.

      62. Zaugg S, Schmid B, Liechti F. Ensemble approach for automated classification of radar echoes into functional bird sub-types. In: Radar Aeroecology. 2017. p. 1. doi:10.13140/RG.2.2.23354.80326

      It is acceptable to cite non-peer-reviewed sources if they have a significant contribution to the background of the article without a critical impact on the core of the research.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript explores the multiple cell types present in the wall of murine collecting lymphatic vessels with the goal of identifying cells that initiate the autonomous action potentials and contractions needed to drive lymphatic pumping. Through the use of genetic models to delete individual genes or detect cytosolic calcium in specific cell types, the authors convincingly determine that lymphatic muscle cells are the origin of the action potential that triggers lymphatic contraction.

      Strengths: 

      The experiments are rigorously performed, the data justify the conclusions, and the limitations of the study are appropriately discussed. 

      There is a need to identify therapeutic targets to improve lymphatic contraction and this work helps identify lymphatic muscle cells as potential cellular targets for intervention. 

      Weaknesses: 

      My only major comment would be that the manuscript provides a lot of rich information describing the cellular components of the muscular lymphatic vessel wall and that these data are not well represented by the title. The title (while currently accurate) could be tweaked to better represent all that is in this manuscript. Maybe something like

      "Characterization/Interrogation of the cellular components of murine collecting lymphatic vessels reveals that lymphatic muscle cells are the innate pacemaker cells regulating lymphatic contractions" or "Discovery/Confirmation of lymphatic muscle cells as innate pacemaker cells of lymphatic contraction through characterization of the cellular components of murine collecting lymphatic vessels". Potentially a cartoon summary figure of the components that make up the collecting lymphatic vessel wall could also be included. In my opinion, these changes will make this manuscript of more interest to a broader group of scientists. I have a few additional comments for consideration to improve the clarity and enhance the discussion of this work. 

      We agree with the reviewer that our original manuscript, and our resubmission even more so with the addition of the scRNAseq data, provides a significant amount of information regarding the composition of the lymphatic collecting vessel wall. We have changed our title to match one suggestion of the reviewer: “Characterization of the cellular components of murine collecting lymphatic vessels reveals that lymphatic muscle cells are the innate pacemaker cells regulating lymphatic contractions".

      Reviewer #2 (Public Review): 

      Summary: 

      This is a well-written manuscript describing studies directed at identifying the cell type responsible for pacemaking in murine collecting lymphatics. Using state-of-the-art approaches, the authors identified a number of different cell types in the wall of these lymphatics and then, using targeted expression of Channel Rhodopsin and GCaMP, convincingly demonstrate that only activation of lymphatic muscle cells produces coordinated lymphatic contraction and that only lymphatic muscle cells display pressure-dependent Ca2+ transients, as would be expected of a pacemaker in these lymphatics.

      Strengths: 

      The use of targeted expression of channel rhodopsin and GCaMP to test the hypothesis that lymphatic muscle cells serve as the pacemakers in murine lymphatic collecting vessels.

      Weaknesses: 

      The only significant weakness was the lack of quantitative analysis of most of the imaging data shown in Figures 1-11. In particular, the colocalization analysis should be extended to show cells not expected to demonstrate colocalization as a negative control for the colocalization analysis that the authors present.

      We understand the reviewer’s concern regarding the lack of a control for the colocalization analysis and that the colocalization analysis was limited to just one set of cell markers. We have now provided a colocalization analysis of Myh11 and PDGFRα to serve as a colocalization negative control based on our RT-PCR and scRNAseq findings, which is incorporated into the current Supplemental figure 1. For the other marker combinations, the representative images often made it quite clear that two separate cell populations were being stained, as was the case when labeling endothelial cells with CD31, macrophages with the MacGreen mice, or hematopoietic cells with CD45.

      During our lengthy rebuttal process we completed a single-cell RNA sequencing analysis of our isolated and cleaned mouse inguinal axillary lymphatic collecting vessels to aid in our characterization of the vessel wall and to answer these questions regarding colocalization more thoroughly and, arguably, more robustly. The generation of our scRNAseq dataset, derived from isolated and cleaned mouse inguinal axillary collecting vessels from 10 mice (5 males and 5 females), allowed us to profile over 2200 of the adventitial fibroblast-like cells (AdvCs) we had identified in our original submission. Using this dataset, we were able to confirm co-expression of Cd34 and Pdgfrα in AdvCs and assess the co-expression of other genes of interest from our RT-PCR experiments and immunofluorescence experiments. This approach will also allow other lymphatic investigators to assess their genes of interest, as our dataset is uploaded to the NIH Gene Expression Omnibus and will be uploaded to the Broad Institute Single Cell Portal upon publication.

      Here we show that the vast majority of non-muscle fibroblast-like cells referred to as AdvCs were double positive for both CD34 and PDGFRα. We also show that the AdvCs that express the commonly used pericyte markers Pdgfrb and Cspg4 also co-expressed Pdgfrα. Critically, these data also show that the AdvCs that express genes linked with lymphatic contractile dysfunction (Ano1, Gjc1 or connexin 45, and Cacna1c “Cav1.2”) co-express Pdgfrα, which would render these genes susceptible to Cre-mediated recombination using our Pdgfrα-CreER<sup>TM</sup> model.

      Reviewer #3 (Public Review): 

      Summary: 

      Zawieja et al. aimed to identify the pacemaker cells in the lymphatic collecting vessels. Authors have used various Cre-based expression systems and optogenetic tools to identify these cells. Their findings suggest these cells are lymphatic muscle cells that drive the pacemaker activity in the lymphatic collecting vessels. 

      Strengths: 

      The authors have used multiple approaches to test their hypothesis. Some findings are presented as qualitative images, while some quantitative measurements are provided.   

      Weaknesses: 

      -  More quantitative measurements. 

      -  Possible mechanisms associated with the pacemaker activity. 

      -  Membrane potential measurements. 

      We thank the reviewers for their concerns and have addressed them in the following manner. 

      - We added novel single cell RNA sequencing of isolated and cleaned inguinal axillary vessels from 10 mice (5 males and 5 females). This allowed us to quantify the number of AdvCs that coexpress CD34 and Pdgfrα as well as the number of cells co-expressing Pdgfrα and other markers.

      - We have added a negative control with quantification for the co-localization analysis assessing Myh11 and Pdgfrα. We have added a negative control with quantification for the ChR2-photo stimulated contraction experiments using Myh11CreERT2-ChR2 mice that were not injected with tamoxifen. 

      - We also used Biocytin-AF488 in our intracellular Vm electrodes to map the specific cells in which we recorded action potentials, as well as their neighboring cells, since Biocytin-AF488 is under 1 kDa and can pass through gap junctions. This approach independently labeled lymphatic muscle cells and their direct neighbors for 3 IALVs from 3 separate mice.

      - We performed membrane potential recordings in isolated, pressurized (under isobaric conditions), and spontaneously contracting inguinal axillary lymphatic collecting vessels at different pressures. 

      - We also show that the pressure-frequency relationship is dependent on the slope of the diastolic depolarization as no other parameter was significantly altered in our study and the diastolic depolarization slope was highly correlated with contraction frequency. 

      We believe the addition of these novel data, controls, experiments, and quantifications has improved the manuscript in line with the reviewers’ suggestions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Lines 149-162: The authors rule out the methylene blue staining cells in the cLV wall as pacemakers because they don't form continuous longitudinal connections to drive propagation. Is it possible for a pacemaker cell to only initiate the contraction and then have the LMCs make the axial electrical connections to propagate the electrical wave? I am not trying to suggest the methylene blue cells are pacemakers, but I am not sure the lack of longitudinal (or radial) connectivity is sufficient evidence to rule out the possibility. This comment also is relevant to the 3 criteria for a pacemaker cell listed in the Discussion (Lines 413-417). 

      We agree with the reviewer’s broader point that a pacemaker cell may not require direct contact with other ‘pacemaker’ cells within the tissue as long as they are still within the same electrical syncytium. However, we do expect a continuous presence of a pacemaker cell type throughout the vessel wall length to account for the persistence of spontaneous contractile behavior despite vessel length, and the ability for contraction initiation to shift (Akl et al 2011, Castorena et al 2018 and Castorena et al 2022) and the occurrence of spontaneous action potentials. In Dirk van Helden’s seminal work in 1993 on lymphatic pacemaking, a major finding was that “SM of small lymphangions or that of short segments, cut from lymphangions of any length, behaved similarly”. We have adjusted our phrase regarding the requirement of a contiguous network and instead suggest a continuous presence along the vessel network and integrated into the electrical syncytium. 

      Methylene blue is an alkaline stain that will stain acidic structures, and historically methylene blue has been noted to stain interstitial cells of Cajal (ICCs) in the gastrointestinal tract, which typically exist as a network of cells (Huizinga et al 1993 and Berezin 1988). No such network was readily apparent in our methylene blue staining, nor did the stained cells have a morphology similar to that of the ICCs of the gastrointestinal tract. Further, methylene blue staining is not limited to ICCs or pacemaker cells at large, as it has been used to kill cancer cells. Within the small intestine methylene blue was also noted to stain macrophage-like cells (Mikkelsen et al 1988), and we too draw parallels between the macrophage morphology observed with Macgreen mice and methylene-blue stained cells. The specific structure underlying the ICC affinity for methylene blue is not well described, and while the innate cytotoxicity of methylene blue and light has been used to kill ICCs and impair slow wave generation, the lack of specificity of this method leaves much to be desired. What is clear is that the ICC network highlighted by methylene blue in the gut is absent in lymphatic collecting vessels.

      In Figure 15/Video12, is it possible that the cells that are showing intracellular Ca2+ in diastole are the cells that reach a threshold membrane potential that then trigger the rest of the LMCs? As the authors have shown heterogeneity in the LMCs surface markers, is it possible that the cells with Ca2+ activity during diastole are identifiable by a distinct molecular phenotype? Or is the thought that these cells are randomly active in diastole? Some discussion/speculation about this seems appropriate. 

      We are in agreement with the reviewer’s conclusion that there is heterogeneity in the LMCs as it pertains to the calcium oscillations in diastole, either under normal buffer conditions or when L-type channels are inhibited with nifedipine. We also note significant heterogeneity in gene expression within the four LMC subclusters (0-3), though we did not see significant differences in either Ip3R1 or Ano1 expression. However, subcluster “0” had increased expression of Itprid2, also known as KRas-induced actin-interacting protein (KRAP), which is thought to tether, and thus immobilize, IP3 receptors to the actin cortex beneath the cell membrane. KRAP has recently been proposed to be a critical player in IP3 receptor “licensing”, which allows IP3 receptors to release calcium (Vorontsova et al., 2022). Whether a similar requirement for IP3R licensing applies in all cells or specifically in LMCs is unknown, but it is quite clear that there are specific release sites within the cell, and this topic is currently under further investigation for a separate manuscript. We would like to note that there is yet to be a clear consensus on whether IP3R licensing is required, as many of these studies were performed in cultured cells and this mechanism has only recently been described.

      Healthy lymphatic collecting vessels typically have a single pacemaker driving a coordinated propagated contraction in ex vivo isobaric myograph studies (Castorena-Gonzalez et al., 2018), which is typically at either end of the cannulated vessel. We believe that this is due to the lack of a bordering cell in one direction, which allows charge to accumulate and voltage to reach threshold at these sites preferentially. We have tried to image calcium at the pacemaking pole of the vessel to observe the specific Ca<sup>2+</sup> transients at these sites, though invariably the act of imaging GCaMP6f results in the pacemaker activity initiating from the other pole of the vessel. It is our opinion that the heterogeneity of LMCs in their Ca<sup>2+</sup> transients is a feature of the system, as it permits a wider range of depolarization signals and thus allows finer control of the pacing as different physical/pressure or signaling stimuli are encountered. Likely, the cells with the higher propensity for Ca<sup>2+</sup> transients act as the contraction initiation site in vivo, though it must also be noted that the LMC density decreases around lymphatic valve sites. In fact, in guinea pig collecting vessels there are very few LMCs at the valves, which can render them electrically uncoupled or poorly coupled (Van Helden, 1993). Thus, valve sites at which there is greater electrical resistance due to lower LMC-LMC coupling may allow for charge accumulation in the LMCs at the valve site, similar to the artificial condition achieved in our myograph preparations with two cut ends, allowing them to reach threshold first and drive coordination at the valve sites.

      An additional description of what the PTCL analysis is meant to represent physiologically would be helpful for readers. 

      We have better described the conversion of the calcium signals into “particles” for analysis at first mention in the methods and results section and have included the requisite reference to this specific methodology in Line 429-30. 

      A description of how DMAX is experimentally determined is needed. 

      We have adjusted our methods section to describe DMAX in line 774-775.

      “with Ca<sup>2+</sup>-free Krebs buffer (3mM EGTA) and diameter at each pressure recorded under passive conditions (DMAX).”

      I think the vessels referred to as popliteal lymphatic vessels are actually saphenous lymphatic vessels (afferent to the popliteal lymph node). Please clarify. 

      Indeed, some of the vessels used in this study are the afferents to the single popliteal node. They travel with the caudal branch of the saphenous vein, but have routinely been described as popliteal vessels, as opposed to saphenous lymphatic vessels, by the lymphatic field at large (Tilney 1971 PMCID: PMC1270981, Liao 2015 PMID: 25512945). To move away from this nomenclature would likely add to confusion although we agree that the lymphatic field may need to improve or correct the vessel naming paradigm to match the vascular pairs they follow.

      Reviewer #2 (Recommendations For The Authors): 

      Lines 214-215 - can you cite a reference for the observation that rhythmic contractions don't require the presence of valves? 

      We have added the reference. In Dr. Van Helden’s seminal work on the topic in 1993, “Vessel segments were then cut from selected small lymphangions (length 300-500 um) by cutting at the valves.” Additionally, work by Dr Anatoliy Gashev utilized sections of lymphatic vessels that lacked valves to study orthograde and retrograde shear sensitivity (Gashev et al., 2002).

      Lines 224-230 - It would have been nice to see colocalization analysis for all cell types so that "negative" results could be compared with the "positives" that you report. This would help bolster evidence of your ability to separate cell types. 

      We understand the reviewer’s sentiment and agree. We have now added a “negative control” colocalization staining and analysis for PDGFRα and Myh11, which has been added to the current SuppFigure 1. We stained 3 IALVs from 3 separate mice with PDGFRα and Myh11 and performed confocal microscopy. We ran the FIJI BIOP-JACOP colocalization plugin as before and observed very little colocalization of the two signals. Additionally, we have also added a coexpression assessment for CD34 and PDGFRα and other genes using our scRNAseq dataset.
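
      To illustrate what a negative-control colocalization result looks like numerically, the sketch below computes Pearson's correlation coefficient between two fluorescence channels, one of the standard measures reported by JACoP-type colocalization tools. It is not the authors' pipeline: the response does not state which coefficient was reported, the images here are flattened toy pixel lists, and the function name is hypothetical.

```python
import math


def pearson_coefficient(channel_a, channel_b):
    """Pearson's correlation between paired pixel intensities of two channels.

    Values near +1 indicate strong colocalization; values near 0 or below
    indicate little or none (as expected for a negative control).
    """
    if len(channel_a) != len(channel_b) or not channel_a:
        raise ValueError("channels must be non-empty and equally sized")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    return cov / math.sqrt(var_a * var_b)


# Toy pixels: the two markers label mutually exclusive pixels,
# as would be expected for two separate cell populations.
marker_1 = [0, 10, 0, 10]
marker_2 = [10, 0, 10, 0]
print(pearson_coefficient(marker_1, marker_2))  # -1.0
```

      In real images the coefficient for two separate populations would sit near zero rather than at exactly -1, because most pixels are background in both channels.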

      line 293 - Should read "Cx45 in..." 

      This has been corrected. 

      “The expression of the genes critically involved in cLV function—Cav1.2, Ano1, and Cx45—in the PdgfrαCreER<sup>TM</sup>-ROSA26mTmG purified cells and scRNAseq data prompted us to generate PdgfrαCreER<sup>TM</sup>-Ano1<sup>fl/fl</sup>, PdgfrαCreER<sup>TM</sup>-Cx45<sup>fl/fl</sup>, and PdgfrαCreER<sup>TM</sup>-Cav1.2<sup>fl/fl</sup> mice for contractile tests.”

      lines 470-473 - A reference for this statement should be cited. 

      We have added the reference. In Dr. Van Helden’s seminal work on the topic in 1993, “Vessel segments were then cut from selected small lymphangions (length 300-500 um) by cutting at the valves.” Additionally, work by Dr Anatoliy Gashev utilized sections of lymphatic vessels that lacked valves to study orthograde and retrograde shear sensitivity (Gashev et al., 2002).

      Lines 483-487 - References should be cited for these statements. 

      We have narrowed and clarified this statement and supported it with the necessary citations. 

      “Of course, mesenchymal stromal cells (Andrzejewska et al., 2019) and fibroblasts (Muhl et al., 2020; Buechler et al., 2021; Forte et al., 2022) are present, and it remains controversial to what extent telocytes are distinct from or are components/subtypes of either cell type (Clayton et al., 2022). Telocytes are not monolithic in their expression patterns, displaying both organ directed transcriptional patterns as well as intra-organ heterogeneity (Lendahl et al., 2022) as readily demonstrated by recent single cell RNA sequencing studies that provided immense detail about the subtypes and activation spectrum within these cells and their plasticity (Luo et al., 2022).”

      Lines 584-585 - Missing a reference citation. 

      Thank you for catching this error; the correct citation was for Boedtkjer et al 2013 and is now properly cited.

      Line 638 - "these this" should read "this" 

      Thank you for catching this error. This particular sentence was removed in light of the addition of the scRNAseq data.

      Reviewer #3 (Recommendations For The Authors): 

      This manuscript from Zawieja et al. explored an interesting hypothesis about the pacemaker cells in lymphatic collecting vessels. Many aspects of lymphatic collecting vessels are still under investigation; hence this work provides timely knowledge about the lymphatic muscle cells as a pacemaker. Although it is an important topic of the investigation, the data provided do not support the overall goal of the manuscript. Many figures (Figure 1-5) provide quantitative estimation and the description provided in the results section might only be useful for a restricted audience, but not to the broader audience. Some of the figures are very condensed with multiple imaging panels and it is hard to follow the differences in qualitative analysis. Overall, this manuscript can be improved by more streamlined description/writing and figure arrangements (some of the figures/panels can be moved to the supplementary figures). 

      We disagree with the notion that the original data provided did not support the goal of the manuscript, which was to identify and test putative pacemaker cell types. Nonetheless, we believe we have added ample novel data to the manuscript, including membrane potential recordings and scRNAseq, to further support our conclusion that the pacemaker cell is an LMC. We believe the scRNAseq data will also greatly enhance the appeal of the manuscript to a broader audience, and we have renamed the manuscript in line with the wealth of data we have collected on the components of the vessel wall as we tested for putative pacemaker cells.

      As requested, we have moved many figures to the supplement to allow readers to focus more on the more critical experiments.

      A few other points that need to be addressed: 

      (1) Authors used immunofluorescence-based differences in various cell types in the collecting vessels. Initially, they chose ICLC, pericytes, and lymphatic muscle cells. But then they started following adventitial cells and endothelial cells. It is not clear from the description, why these other cells could be possibly involved in the pacemaker activity. It will be easier to follow if authors provide a graphical abstract or summary figure about their hypothesis and what is known from their and others' work. 

      We would like to clarify that we used the endothelial cells as controls to ensure that what we observed via immunofluorescence and FACS-RT-PCR was a separate cell type from either lymphatic muscle or lymphatic endothelial cells on the vessel wall. Staining for the endothelium also allowed us to assess where these PDGFRα+CD34+ cells reside in the vessel wall. We started with a wide range of markers that are conventionally used for targeting specific cell types, but, as expected, those markers are not always 100% specific. Specifically, we focused on CD34, Kit, and Vimentin, as those were the markers for the non-muscle cells observed previously in the lymphatic collecting vessel wall. What we found was that CD34 and PDGFRα labeled the same cell type. As there was not a CD34Cre mouse available at the time, we instead utilized the inducible PDGFRαCreERTM. We are unsure how well an abstract figure will condense the conclusions from the experiments listed here, but if absolutely required for publication we can attempt to highlight the representative cell populations identified on the vessel wall.

      (2) Authors used many acronyms in the manuscript without defining them (when they appeared for the first time). Please follow the convention. 

      We have checked the manuscript and made several corrections regarding the use of abbreviations.

      (3) How specific PDGFR-alpha as a marker of the pericytes? It can also label the mesenchymal cells. Why did the author choose PDGFR-alpha over beta for their Cre-based expression approach? 

      We tried to assess whether a pericyte-like cell was present in or along the wall using PDGFRbeta (Pdgfrβ). Pdgfrβ is commonly used to identify pericytes (Winkler et al., 2010), while in contrast Pdgfrα is a known fibroblast marker (Lendahl et al., 2022). PdgfrβCreER<sup>T2</sup> resulted in recombination in both LMCs and AdvCs, preventing it from being a discriminating marker for our study, whereas Myh11CreER<sup>T2</sup> and PDGFRαCreER<sup>TM</sup> were specific, at least to cell type, based on our FACS-RT-PCR and staining. As can be seen from the scRNAseq data in Figure 5, there was no cell cluster for which Pdgfrβ was specific, in contrast to PDGFRα and Myh11. In Figure 6 we show the expression of another commonly used pericyte marker, NG2 (Cspg4), in our scRNAseq dataset, which was observed in both LMCs and AdvCs as well. Lastly, MCAM (Figure 6) can also be a marker for pericytes, though we see expression of this marker only in the LMCs and LECs. Notably, almost all of the AdvCs express PDGFRα, rendering the PDGFRαCreER<sup>TM</sup> a powerful tool to study this population of cells on the vessel wall, including those that were PDGFRα+Cspg4+ or PDGFRα+Pdgfrβ+.

      We were reliant on PDGFRαCreER<sup>TM</sup> as that was the only available PDGFRα Cre model at the time. Note we used PdgfrβCreER<sup>T2</sup> and Ng2Cre in our study but found that both Cre models recombined both LMCs and AdvCs.

      (4) Please include appropriate references for all the labeling markers (PDGFR-alpha, beta, and myc11 etc.) that are used in this manuscript. 

      We have added multiple references to the manuscript to support the use of these common cell “specific” markers as of course each marker is limited in some capacity to fully or specifically label a single population of cells (Muhl et al., 2020).

      (5) One of the criteria for the pacemaker cells is depolarization-induced propagated contractions. Authors have used optogenetics-induced depolarization to test this phenomenon. Please include negative controls for these experiments. 

      We have now added negative controls to this experiment, which were non-induced (no tamoxifen) Myh11CreER<sup>T2</sup>-Chr2 popliteal vessels. These data have been added to Figure 8.

      (6) What are the resting membrane potentials of Lymphatic muscle cells? The authors should provide some details about this in the manuscript. 

      We agree with the reviewer and have added membrane potential recordings (Figure 13) at different pressures and filled our recording electrode with the cell labeling molecule BiocytinAF488 to highlight the action potential exhibiting cells, which were the LMCs. Lymphatic resting membrane potential is dynamic in pressurized vessels, which appears to be a critical difference in this approach as compared to pinned out vessels or those on wire myographs likely due to improper stretch or damage to the vessel wall. In mesenteric lymphatic vessels isolated from rats the minimum membrane potential achieved during repolarization ranges from -45 to 50mV typically while IALVs from mice are typically around -40mV, though IALVs have a notably higher contraction frequency. Critically, we have also added novel membrane potential recordings to this manuscript in IALVs at different pressures and show that the diastolic depolarization rate is the critical factor driving the pressure-dependent frequency.

      (7) In the discussion, the authors discussed SR Ca2+ cycling in Pacemaking, but the relevant data are not included in this manuscript, but a manuscript from JGP (in revision) is cross-referenced. 

      As discussed above, we have recently published our work in which we studied IALVs from Myh11CreERT2-Ip3r1fl/fl (Ip3r1ismKO) and Myh11CreERT2-Ip3r1fl/fl-Ip3r2fl/fl-Ip3r3fl/fl mice (Zawieja et al., 2023). Deletion of Ip3r1 from LMCs recapitulated the dramatic reduction in frequency we previously published in Myh11CreERT2-Ano1fl/fl mice and the loss of pressure-dependent chronotropy. Furthermore, in that study we also showed that the diastolic calcium transients are nearly completely lost in IALVs from Myh11CreERT2-Ip3r1fl/fl knockout mice. There was no difference in contractile function between IALVs from single Ip3r1 knockout and triple Ip3r1-3 knockout mice, suggesting that it is Ip3r1 that is required for the diastolic calcium oscillations. Further, in the presence of 1 μM nifedipine there were still no calcium oscillations in the Myh11CreERT2-Ip3r1fl/fl LMCs. These findings provide further support for our interpretation that the pacemaking is of myogenic origin.

      Andrzejewska, A., B. Lukomska, and M. Janowski. 2019. Concise Review: Mesenchymal Stem Cells: From Roots to Boost. Stem Cells. 37:855-864.

      Buechler, M.B., R.N. Pradhan, A.T. Krishnamurty, C. Cox, A.K. Calviello, A.W. Wang, Y.A. Yang, L. Tam, R. Caothien, M. Roose-Girma, Z. Modrusan, J.R. Arron, R. Bourgon, S. Muller, and S.J. Turley. 2021. Cross-tissue organization of the fibroblast lineage. Nature. 593:575-579.

      Castorena-Gonzalez, J.A., S.D. Zawieja, M. Li, R.S. Srinivasan, A.M. Simon, C. de Wit, R. de la Torre, L.A. Martinez-Lemus, G.W. Hennig, and M.J. Davis. 2018. Mechanisms of Connexin-Related Lymphedema. Circ Res. 123:964-985.

      Clayton, D.R., W.G. Ruiz, M.G. Dalghi, N. Montalbetti, M.D. Carattino, and G. Apodaca. 2022. Studies of ultrastructure, gene expression, and marker analysis reveal that mouse bladder PDGFRA(+) interstitial cells are fibroblasts. Am J Physiol Renal Physiol. 323:F299-F321.

      Forte, E., M. Ramialison, H.T. Nim, M. Mara, J.Y. Li, R. Cohn, S.L. Daigle, S. Boyd, E.G. Stanley, A.G. Elefanty, J.T. Hinson, M.W. Costa, N.A. Rosenthal, and M.B. Furtado. 2022. Adult mouse fibroblasts retain organ-specific transcriptomic identity. Elife. 11.

      Gashev, A.A., M.J. Davis, and D.C. Zawieja. 2002. Inhibition of the active lymph pump by flow in rat mesenteric lymphatics and thoracic duct. J Physiol. 540:1023-1037.

      Lendahl, U., L. Muhl, and C. Betsholtz. 2022. Identification, discrimination and heterogeneity of fibroblasts. Nat Commun. 13:3409.

      Luo, H., X. Xia, L.B. Huang, H. An, M. Cao, G.D. Kim, H.N. Chen, W.H. Zhang, Y. Shu, X. Kong, Z. Ren, P.H. Li, Y. Liu, H. Tang, R. Sun, C. Li, B. Bai, W. Jia, Y. Liu, W. Zhang, L. Yang, Y. Peng, L. Dai, H. Hu, Y. Jiang, Y. Hu, J. Zhu, H. Jiang, Z. Li, C. Caulin, J. Park, and H. Xu. 2022. Pan-cancer single-cell analysis reveals the heterogeneity and plasticity of cancer-associated fibroblasts in the tumor microenvironment. Nat Commun. 13:6619.

      Muhl, L., G. Genove, S. Leptidis, J. Liu, L. He, G. Mocci, Y. Sun, S. Gustafsson, B. Buyandelger, I.V. Chivukula, A. Segerstolpe, E. Raschperger, E.M. Hansson, J.L.M. Bjorkegren, X.R. Peng, M. Vanlandewijck, U. Lendahl, and C. Betsholtz. 2020. Single-cell analysis uncovers fibroblast heterogeneity and criteria for fibroblast and mural cell identification and discrimination. Nat Commun. 11:3953.

      Van Helden, D.F. 1993. Pacemaker potentials in lymphatic smooth muscle of the guinea-pig mesentery. J Physiol. 471:465-479.

      Vorontsova, I., J.T. Lock, and I. Parker. 2022. KRAP is required for diffuse and punctate IP(3)-mediated Ca(2+) liberation and determines the number of functional IP(3)R channels within clusters. Cell Calcium. 107:102638.

      Winkler, E.A., R.D. Bell, and B.V. Zlokovic. 2010. Pericyte-specific expression of PDGF beta receptor in mouse models with normal and deficient PDGF beta receptor signaling. Mol Neurodegener. 5:32.

      Zawieja, S.D., G.A. Pea, S.E. Broyhill, A. Patro, K.H. Bromert, M. Li, C.E. Norton, J.A. Castorena-Gonzalez, E.J. Hancock, C.D. Bertram, and M.J. Davis. 2023. IP3R1 underlies diastolic ANO1 activation and pressure-dependent chronotropy in lymphatic collecting vessels. J Gen Physiol. 155.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This study presents valuable observations of white matter organisation from diffusion MRI and two types of synchrotron imaging in both monkeys and mice. Cross-modality comparisons are interesting as the different methods are able to probe anatomical structures at different length scales, from single axons in high-resolution synchrotron (ESRF) imaging, to clusters of axons in lower-resolution synchrotron (DESY) data, to axon populations at the mm-scale in diffusion MRI. By acquiring all modalities in monkey and mouse ex vivo samples, the authors can observe principles of fibre organisation, and characterise how fibre characteristics, such as tortuosity and micro-dispersion, vary across select brain regions and in healthy tissue versus a demyelination model. The results are solid, though some statements (in the abstract/discussion) do not appear to be fully supported, and statistical tests would help confirm whether tissue characteristics are similar/different between different conditions.

      R1.1: Thank you for the kind feedback. We have included statistical tests in the paper for tissue characteristics where appropriate.

      Due to the very high number of sample points (one per voxel) within the 3D synchrotron volumes, testing for statistical significance is challenging for the structure tensor-based tissue fractional anisotropy (FA) metric. This causes any standard statistical test to have sufficient power to evaluate even minute differences between the volumes as statistically significant with high confidence. In other words, the null hypothesis (H0) will always be rejected with p = 0, regardless of the practical significance of the difference. Therefore, we have not added statistical analysis for FA results.
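
The sample-size effect described above can be illustrated with a minimal sketch. It assumes a plain two-sample z-test on hypothetical FA-like values (means of 0.600 and 0.601, standard deviation 0.1); the function name, the numbers, and the choice of test are illustrative assumptions and are not taken from the study.

```python
import math

def two_sample_z_pvalue(mean_a, mean_b, sd, n):
    """Two-sided p-value of a z-test for two groups of equal size n and equal sd."""
    standard_error = sd * math.sqrt(2.0 / n)
    z = abs(mean_a - mean_b) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(z / math.sqrt(2.0))

# Hypothetical FA-like means that differ by only 0.001 (sd = 0.1).
p_small_n = two_sample_z_pvalue(0.600, 0.601, 0.1, 1_000)          # a few samples
p_large_n = two_sample_z_pvalue(0.600, 0.601, 0.1, 1_000_000_000)  # voxel-scale count
```

With a thousand samples per group the tiny difference is nowhere near significant, whereas with a voxel-scale sample count the p-value is numerically zero even though the effect size is unchanged.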

      For the tractography-based metrics, the number of sample points (one per streamline) is not as high as that for the structure tensor FA, thus making it more reasonable to test for statistical significance. The statistical analyses performed included tests for equality of distributions (two-sample Kolmogorov-Smirnov tests), equality of medians (two-sided Wilcoxon rank sum tests), and equality of variance (Brown-Forsythe tests). The results are described in relation to Figure 5(B, D), Figure 8(C-F), and detailed in the Methods section.
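
As a rough illustration of what two of these tests measure, the sketch below computes the two-sample Kolmogorov-Smirnov statistic and the Brown-Forsythe statistic in plain Python. The function names and example values are hypothetical, p-values and the rank sum test are omitted for brevity, and this is not the implementation used in the study.

```python
import bisect
import statistics

def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap

def brown_forsythe_statistic(*groups):
    """One-way ANOVA F statistic on absolute deviations from each group median."""
    deviations = []
    for group in groups:
        median = statistics.median(group)
        deviations.append([abs(x - median) for x in group])
    n_total = sum(len(d) for d in deviations)
    k = len(deviations)
    grand_mean = sum(sum(d) for d in deviations) / n_total
    between = sum(len(d) * (statistics.mean(d) - grand_mean) ** 2 for d in deviations)
    within = sum((v - statistics.mean(d)) ** 2 for d in deviations for v in d)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical per-streamline metric values for two samples.
narrow = [1.0, 2.0, 3.0]
wide = [0.0, 10.0, 20.0]
ks_value = ks_statistic(narrow, wide)
bf_value = brown_forsythe_statistic(narrow, wide)
```

The Kolmogorov-Smirnov statistic responds to any difference in distribution shape or location, while the Brown-Forsythe statistic responds only to a difference in spread, which is why the two are complementary.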

      One very interesting result is the observation of apparent laminar organisation of fibres in ex vivo monkey white matter samples. DESY data from the corpus callosum shows fibres with two dominant orientations (one L-R, one slightly inclined), clustered in laminar structures within this major fibre bundle. Thanks to the authors providing open data, I was able to look through the raw DESY volume and observe regions with different "textures" (different orientations) in the described laminar arrangement. That this organisation can be observed by eye, as well as by structure tensor, is fairly convincing. As not all readers will download the data themselves, the manuscript could benefit from additional figures/videos to demonstrate (1) the quality of the DESY data and (2) a more 3D visualisation of the laminar structures (where the coronal plane shows convincing columnar structure or stripes). Similarly in Figure 5A, though this nicely depicts two populations with different orientations, it is somewhat difficult to see the laminar structure in the current image.

      ESRF data of the centrum semiovale (CS) contributes evidence for similar laminar structures in a crossing fibre region, where primarily AP fibres are shown to cluster in 3 laminar structures. As above, further visualisations of the ESRF volume in the CS (as shown in Figure 4E) would be of value (e.g. showing consistency across the 4 volumes, 2D images showing stripey/columnar patterns along different axes, etc).

      R1.2: Conveying complex 3D geometry through 2D still images is indeed challenging, and we greatly appreciate the reviewer’s comments and suggestions. To better communicate the understanding of the 3D anatomical environments, we have taken the following actions:

      (1) To enhance insights into the tractography results in Figures 5A and 5D, we have rendered and added animations of the tractography scenes as supplemental material.

      (2) To visually support 3D insights concerning the consistency of the laminar organisation of the callosal fibres, we have replaced the 2D slice views in Figures 3A and 3B with 3D renderings similar to the one in Figure 4E.

      (3) An animation of Figure 4E was created to display the colour-coded structure tensor directions of all four stacked scans. This animation visually supports the complexity of the fibre orientation and the layered structural laminar organisation of the CS sample.

      A key limitation of this result is that, though the DESY data from the CC seems convincing, the same structures were not observed in high-resolution synchrotron (ESRF) data of the same tissue sample in the corpus callosum. This seems surprising and the manuscript does not provide a convincing explanation for this inconsistency. The authors argue that this is due to the limited FOV of the ESRF data (~200x200x800 microns). However, the observed laminar structures in DESY are ~40 microns thick, and ESRF data from the CST suggests laminar thicknesses in the range of 5-40 microns with a similar FOV. This suggests the ESRF FOV would be sufficient to capture at least a partial description of the laminar organisation. Further, the DESY data from the CC shows columnar variations along the LR axis, which we might expect to be observed along the long axis of the ESRF volume of the same sample. Additional analyses or explanations to reconcile these apparently conflicting observations would be of value. For example, the authors could consider down-sampling the ESRF data in an appropriate manner to make it more similar to the DESY data, and running the same analysis, to see if the observed differences are related to resolution (i.e. the thinner laminar structures cluster in ways that they look like a thicker laminar structure at lower resolution), or crop the DESY data to the size of the ESRF volume, to test whether the observed differences can be explained by differences in FOV. Laminar structures were not observed in mouse data, though it is unclear if this is due to anatomical differences or somewhat related to differences in data quality across species.

      R1.3: We have clarified and expanded upon the results regarding the laminar organisation observed in the monkey CC DESY data. As noted in R1.2, we replaced the 2D images in Figures 3A (DESY) and 3B (ESRF) with 3D renderings to better display the spatial outline of the laminar organisation in the volumes. The reviewer is correct that, although the smaller field of view (FOV) of the ESRF data should allow us to at least partially capture parts of the laminar organisation observed in the larger FOV of the DESY data, this is not guaranteed. It depends on how the smaller FOV is positioned relative to the structural organisation, and since we lack co-registration, we do not know this. It should now be visually evident that the ESRF FOV can be placed such that it does not cover the observed laminae, a point which is now also emphasised in the Discussion. 

      Secondly, it is important to emphasise that the voxel colouring using the primary structure tensor direction is just a visualisation technique, which has limitations when it comes to assessing laminar organisation. Mapping 3D directions to RGB colours is inherently difficult and will always have ambiguities. If we had used the standard R-G-B to LR-AP-IS colouring in Figure 3, the laminar organisation would not be evident. Additionally, the laminae will only be visible when there are clear angular differences. There can still be a layered organisation even if we don’t observe it, which is the case for the mouse. The primary direction differences of these layers could be very low (i.e., parallel layers), and consequently not visually evident. This point has been clarified in both the Results and Discussion sections.
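
One such ambiguity can be sketched in a few lines. The example assumes the conventional direction-encoded colour scheme in which the absolute x, y, and z components of a unit vector are mapped to red, green, and blue; the function names and the two example directions are hypothetical and are not taken from the manuscript.

```python
import math

def direction_to_rgb(direction):
    """Map a 3D direction to RGB using absolute components (|x|, |y|, |z|)."""
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    return (abs(x) / norm, abs(y) / norm, abs(z) / norm)

def angle_between_axes_deg(u, v):
    """Angle in degrees between two undirected axes (sign of the vector ignored)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_angle = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.degrees(math.acos(cos_angle))

# Two hypothetical fibre directions that cross at a right angle.
fibre_a = (1.0, 1.0, 0.0)
fibre_b = (1.0, -1.0, 0.0)
same_colour = direction_to_rgb(fibre_a) == direction_to_rgb(fibre_b)
crossing_angle = angle_between_axes_deg(fibre_a, fibre_b)
```

Here two perpendicular directions receive exactly the same colour, so a lamina boundary between them would be invisible under this particular mapping. Conversely, nearly parallel layers receive nearly identical colours under any smooth mapping.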

      Finally, in response to R1.6, we have added analyses regarding the shape of the FOD, specifically estimating the Orientation Dispersion Index (ODI) and Dispersion Anisotropy (DA). This provides further context to the reviewer’s comments about the discrepancies in laminar organisation. We have reflected on the relationship between DA and the visually observed laminar organisation, and this has been integrated into the relevant parts of the Results and Discussion sections.

      The changes to the manuscript reflecting the statements above are listed here:

      The Discussion section (page 21): “In the monkey CC DESY data, which has a field of view (FOV) comparable to a dMRI voxel, a columnar laminar organisation at a macroscopic level was visually revealed from the structure tensor (ST) direction colouring. However, this laminar organisation was not visible in the higher-resolution ESRF data for the same tissue sample. Although the two samples were not co-registered, the size of a single ESRF FOV within the DESY sample is illustrated in Fig. 3A. This demonstrates the possibility of placing the ESRF sample where the observed laminar structure is absent. Consequently, knowledge of the tissue structural organisation and its orientation is important to fully benefit from the stacked FOV of the ESRF sample and when choosing appropriate minimal FOV sizes in future experiments.

      Interestingly, when characterising FODs with measures like ODI and DA as indicators of fibre organisation, rather than relying on visualisation, results from large- and small-FOV data show no discrepancies. This statistical approach discards the spatial context (visually perceived as laminae), highlighting the need to combine both methods.” 

      The Results section (page 8): “The mid-level DA values suggest some anisotropic spread of the directions, reflecting the angled laminar organisation observed in the DESY sample. Interestingly, the DA value for the ESRF sample is almost identical, despite the laminar bands being less visually apparent.”

      The Results section (page 17): “Nevertheless, visualisation of orientations did not reveal any axonal organisation in the mouse CC due to the lack of local angular contrast, unlike the clear laminar structures seen in the monkey sample (Fig. 3A). Any parallel organisation in tissue remains undetectable because our visual contrast relies on angular differences.”

      The Discussion section (page 22): “In the monkey CC (mid-body), we observed laminar organisation indicated by clear spatial angular differences in the ST directions in the sample (Fig. 3A). Quantifications of the FOD shape showed DA indices of 0.55 and 0.59 for the DESY and ESRF samples, respectively. In contrast, the mouse CC (splenium) did not visually reveal a similar angled laminar organisation (Fig. 7C), and the DA indices were lower, at 0.49 and 0.32, respectively. Two possible explanations exist. First, the within-pathway laminar organisation may not be identical across the entire CC. Consequently, more scans from other CC regions would be required to confirm. Second, the different species might account for the differences. Larger brains like the monkey might foster a different level of within-pathway axon organisation compared to the smaller mouse. Although we could not visually detect laminar organisation from the colour coding of the ST direction in the mouse, the non-zero DA values suggest some level of organisation. This is supported by our streamline tractography, which indicates a vertical layered organisation (Fig. 8A, B). It further aligns with studies using histological tracer mapping that shows a stacked parallel organisation of callosal projections in mice, between cortex regions M1 and S1 (Zhou et al. 2013). Nevertheless, we cannot rely solely on voxel-wise ST directions to fully describe axonal organisation, as this method does not contrast almost parallel fasciculi (inclination angles approaching 0 degrees). Analysing patterns in tractography streamlines would be an interesting future direction for this purpose.”

      The authors further quantify various other characteristics of the white matter, such as micro-dispersion, tortuosity, and maximum displacement. Notably, the microscopic FA calculated via structure tensor is fairly consistent across regions, though not modalities. When fibre orientations are combined across the sample, they are shown to produce similar FODs to dMRI acquired in the same tissue, which is reassuring. As noted in the text, the estimates of tortuosity and max displacement are dependent on the FOV over which they are calculated. Calculating these metrics over the same FOV, or making them otherwise invariant to FOV, could facilitate more meaningful comparisons across samples and/or modalities.

      R1.4: This raises an interesting point about the necessity of normalising the FOV to obtain invariant, tractography-based metrics of tortuosity and maximum deviation across different samples and modalities. In general, achieving this is challenging, and in this study, it is practically not possible. Between species, we encounter significant differences in brain volume ratios, which complicates the establishment of a common reference FOV due to the distinct anatomical organisation of monkey and mouse brains (see our response to R1.8). Within species, we would encounter challenges due to missing contrast—such as issues with staining—and the lack of perfect co-registration.

      The Discussion section (page 28) has been extended to reflect this: ”Within the same species, assuming perfect co-registration of samples, it would be possible to perform correlative imaging and analysis. This would allow validation of whether tractography streamlines could be reproduced at different image resolutions within the same normalised FOV. Although this was not possible with the current data and experimental setup, it would be an interesting point to pursue in future work.”
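
The FOV dependence of the streamline metrics can be illustrated with a small sketch. It assumes common definitions that may differ from those in the manuscript: tortuosity as path length divided by end-to-end distance, and maximum deviation as the largest perpendicular distance from the straight line joining the endpoints. The streamline is a hypothetical circular arc of radius 100 (arbitrary units).

```python
import math

def tortuosity(points):
    """Path length divided by the straight-line distance between the endpoints."""
    path = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    return path / math.dist(points[0], points[-1])

def max_deviation(points):
    """Largest perpendicular distance from the chord joining the endpoints."""
    start, end = points[0], points[-1]
    chord = [e - s for s, e in zip(start, end)]
    chord_length = math.dist(start, end)
    largest = 0.0
    for p in points:
        rel = [c - s for s, c in zip(start, p)]
        t = sum(r * c for r, c in zip(rel, chord)) / chord_length ** 2
        foot = [s + t * c for s, c in zip(start, chord)]
        largest = max(largest, math.dist(p, foot))
    return largest

# Hypothetical streamline: quarter circle of radius 100 sampled at 101 points.
full_fov = [(100.0 * math.cos(i * math.pi / 200), 100.0 * math.sin(i * math.pi / 200), 0.0)
            for i in range(101)]
# The same streamline seen through a smaller field of view (first quarter of the arc).
cropped_fov = full_fov[:26]

full_metrics = (tortuosity(full_fov), max_deviation(full_fov))
cropped_metrics = (tortuosity(cropped_fov), max_deviation(cropped_fov))
```

The same curved trajectory appears almost straight when only a short segment falls within the FOV, so both metrics shrink as the FOV is reduced even though the underlying geometry is identical.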

      Though the results seem solid, some statements, particularly in the abstract and discussion, do not seem to be fully supported by the data. For example, the abstract states "Our findings revealed common principles of fibre organisation in the two species; small axonal fasciculi and major bundles formed laminar structures with varying angles, according to the characteristics of major pathways.", though the results show "no strong indication within the mouse CC of the axonal laminar organisation observed in the monkey". Similarly, the introduction states: "By these means, we demonstrated a new organisational principle of white matter that persists across anatomical length scales and species, which governs the arrangement of axons and axonal fasciculi into sheet-like laminar structures." Further comments on the text are provided below.

      R1.5: We understand that the text could be misread as claiming that the laminar organisation is identical in monkeys and mice, which is not the case. For example, we show that in the corpus callosum, pathways are parallel in the mouse but not in the monkey. We have clarified that while the principle of layered laminar organisation of pathways is shared between monkeys and mice, species-specific differences do exist.

      We have made the following clarifying changes to the manuscript:

      The Abstract (page 2): “Our findings revealed common principles of fibre organisation that apply despite the varying patterns observed across species” 

      The Introduction (page 4-5): “Through these methods, we demonstrated organisational principles of white matter that persists across anatomical length scales and species. These principles govern the organisation of axonal fasciculi into sheet-like laminar shapes (structures with a predominant planar arrangement). Interestingly, while these principles remain consistent, they result in varied structural organisations in different species.” 

      The Discussion (page 21): “despite species differences”.

      One observation not notably discussed in the paper is that the spherical histograms of Figure 3E/H appear to have an anisotropic spread of the white points about 0,0. It would be interesting if the authors could comment on whether this could be interpreted as the FOD having asymmetric dispersion and if so, whether the axis of dispersion relates to the fibre orientations of the laminar structures.

      R1.6: That is a good point, and to address it, we have fitted spherical Bingham distributions to the FODs, allowing us to quantify their shapes. From each Bingham distribution, we derived two well-known indices from the diffusion MRI community: the Orientation Dispersion Index (ODI) and Dispersion Anisotropy (DA) index. The ODI explains the dispersion of fibres for a single bundle FOD, whereas DA expresses the shape of the FOD on the unit sphere surface, i.e., the degree of anisotropy. We have integrated the Bingham-based analysis into the Methods, Discussion, and Results sections concerning Figures 3 and 7, but not Figure 4, which contains multiple fibre bundles that we cannot separate on a voxel level. The analysis does not impact the overall message and conclusion but adds interesting context to the discussion around laminar organisation.

      A limitation of the study is that it considers only small ex vivo tissue samples from two locations in a single postmortem monkey brain and slightly larger regions of mouse brain tissue. Consequently, further evidence from additional brain regions and subjects would be required to support more generalised statements about white matter organisation across the brain.

      R1.7: Collecting more samples from various locations in the brain would provide valuable insights into the consistency of white matter organisation across anatomical length scales, as well as the structure tensor-based anisotropy and tortuosity metrics. However, being awarded beamtime at two different synchrotron facilities to scan the same sample with different imaging setups is practically challenging. At the ESRF, we have gathered additional image volumes from other white matter regions of the monkey brain that support all our findings, which will be published separately. X-ray synchrotron imaging technology is advancing rapidly, with faster acquisition times enabling more image volumes to be stitched together. This extends the FOV and allows for a more robust statistical description of the anatomy. Consequently, future studies with an extended FOV and varying image resolutions could utilise a single synchrotron facility to collect additional samples, further supporting our findings.

      The Discussion section (page 27) has been extended to reflect this: “Increasing the number of samples across both species and examining laminar organisation at various length scales in more regions would strengthen our findings. However, securing beamtime at two different synchrotron facilities to scan the same sample with varying image resolutions is a limiting factor. Beamline development for multiresolution experimental setups, along with faster acquisition methods, is a rapidly advancing field. For instance, the Hierarchical Phase-Contrast Tomography (HiP-CT) imaging beamline at ID-18 at the ESRF, enables multi-resolution imaging within a single session to address this challenge, though it is currently limited to a resolution of 2.5 μm (Walsh et al. 2021).”

      Given the monkey results, the mouse study (section 2.5 onwards) lacks some motivation. In particular, it is unclear why a demyelination model was studied and if/how this would link to the laminar structure observed in the monkey data. Further, it is unclear how comparable tortuosity/max deviation values are across species, considering the differences in data quality and relative resolution, given that the presented results show these values are very modality-dependent.

      R1.8: We have clarified the motivation for including the mouse part of the study in both the Introduction and the Results sections.

      The Introduction section (page 5): “Furthermore, using a mouse model of focal demyelination induced by cuprizone (CPZ) treatment, we investigate the inflammation-related influence on axonal organisation. This is achieved through the same structure tensor-derived micro-anisotropy and tractography streamline metrics.”

      The Results section (page 15): “Finally, we investigated the organisation of fasciculi in both healthy mouse brains and a murine model of focal demyelination induced by five weeks of cuprizone (CPZ) treatment. This allowed for the exploration of the disease-related influence on axonal organisation, particularly under inflammation-like conditions with high glial cell density at the demyelination site (He et al. 2021). The experimental setup for DESY and ESRF is similar to that described for the monkey, with the exception that we did not perform dMRI and synchrotron imaging on the same brains, and only collected MRI data for healthy mouse brains. This approach allowed us to apply the same structure tensor and tractography streamline analysis used previously, but in a healthy versus disease comparison, demonstrating the methodology’s ability to provide insights into pathological conditions.”

      Across species, the comparison of tortuosity and maximum deviation must be approached with caution. On one hand, we observe a comparable influence of the extra-axonal environment in both the monkey and mice, as discussed in the section “Sources to the non-straight trajectories of axon fasciculi.” On the other hand, the anatomical scale and relative image resolution are significant factors, as correctly pointed out. In the mouse, for instance, the measures are influenced by white matter pathway macroscopic effects, making cross-species comparison challenging to perform in a normalised way.

      The limitations section of the Discussion (page 28) has been updated to reflect this: ”A limiting consequence of having samples imaged at differing anatomical scales is that certain measures become inherently hard to compare in a normalised way. The tractography-based metrics—tortuosity and maximum deviation—serve as good examples of this resolution and FOV dependence. In the ESRF samples, the anatomical scale was at the level of individual axons, and the streamline metrics primarily reflect micro-scale effects from the extra-axonal environment, such as the influence of cells and blood vessels. In comparison, the larger anatomical scale in the DESY samples represents the level of fasciculi and above, with metrics influenced by macroscopic effects, such as the bending of the CC pathway. Both scales are interesting and can provide valuable insights in their own right, but caution is required when comparing the numbers, especially for cross-species studies where there is a significant difference in brain volume ratios.”

      The paper introduces a new method of "scale-space" parameters for structure tensors. Since, to my understanding, this is the first description of the method, some simple validation of the method would be welcomed. Further, the same scale parameters are not used across monkeys and mice, with a larger kernel used in mice (Table 2) which is surprising given their smaller brain size. Some explanation would be helpful.

      R1.9: We have expanded the description of the scale-space structure tensor approach in the Methods section. Specifically, we have elaborated on the empirical process used to select the scale-space parameters shown in Table 2 and explained why multiple scales were applied only to the monkey samples scanned at ESRF (see Table 2, sample IDs 2 and 3) but not to the other datasets. Additionally, we have added a supplementary figure to assist in illustrating the concept.

      Reviewer #2 (Public Review):

      Summary:

      In this work, the authors combine diffusion MRI and high-resolution x-ray synchrotron phase-contrast imaging in monkey and mouse brains to investigate the 3D organization of brain white matter across different scales and species. The work is at the forefront of the anatomical investigation of the human connectome and aligns with several current efforts to bridge the resolution gap between what we can see in vivo at the millimeter scale and the complexity of the human brain at the sub-micron scale. The authors compare the 3D white matter organization across modalities within 2 small regions in one monkey brain (body of the corpus callosum, centrum semiovale) and within one region (splenium of the corpus callosum) in healthy mice and in one murine model of focal demyelination. The study compares measures of tissue anisotropy and fiber orientations across modalities, performs a qualitative comparison of fasciculi trajectories across brain regions and tissue conditions using streamline tractography based on the structure tensor, and attempts to quantify the shape of fasciculi trajectories by measuring the tortuosity index and the maximum deviation for each reconstructed streamline. Results show measures of anisotropy and fiber orientations largely agree across modalities, especially for larger FOV data. The high-resolution data allow us to explore the fiber trajectories in relation to tissue complexity and pathology. The authors claim the study reveals new common organization principles of white matter fibers across species and scales, for which axonal fasciculi arrange into sheet-like laminar structures.

      Strengths:

      The aim of the study is of central importance within present efforts to bridge the gap between macroscopic structures observable in vivo in humans using conventional diffusion MRI and the microscopic organization of white matter tissue. Results obtained from this type of study are important to interpret data obtained in vivo, inform the development of novel methodologies, and expand our knowledge of the structural and thus functional organization of brain circuits.

      Multi-scale data acquired across modalities within the same sample constitute extremely valuable data that is often hard to acquire and represent a precious resource for validation of both diffusion MRI tractography and microstructure methods.

      The inclusion of multi-species data adds value to the study, allowing the exploration of common organization principles across species.

      The addition of data from a murine cuprizone model of focal demyelination adds interesting opportunities to study the underlying biological changes that follow demyelination and how these impact tissue anisotropy and fiber trajectories. These data can inform the interpretation and development of diffusion MRI microstructure models.

      Weaknesses:

      The main claim of a newly discovered laminar organization principle that is consistent across scales and species is not supported strongly enough by the data. The main evidence in support of the claim comes from the larger FOV data obtained from the body of the corpus callosum in the monkey brain. A laminar organization principle is partially shown in the centrum semiovale in the monkey brain and it is not shown in mice data. Additionally, the methods lack details to help the correct interpretation of these findings (e.g., how were these fasciculi defined?; how well do they represent different axonal populations?; what is the effect of blood vessels on the structure tensor reconstruction?; how was laminar separation quantified?) and the discussion does not provide a biological background for this organization. The corpus callosum sample suggests axons within a bundle of fibers are organized in a sheet-like fashion, while data from the centrum semiovale suggest fibers belonging to different fiber bundles are organized in a sheet-like arrangement. While I acknowledge the challenges in acquiring such high-resolution data, additional samples from different regions in the same animals and from different animals would help strengthen this claim.

      R2.1 

      -  how were these fasciculi defined?

      In the introduction (page 3), we have clarified our definition of an axon fasciculus: “A fasciculus is a bundle of axons that travel together over short or long distances. Its size and shape can vary depending on its internal organisation and its relationship to neighbouring fasciculi.”

      Additionally, we emphasise in the Results section (page 12) that the centroid streamlines are not guaranteed to be actual fasciculi, but rather representations of them. The paragraph now states: “To ease visualisation and quantification, we used QuickBundles clustering (Garyfallidis et al. 2012) to group neighbouring streamlines with similar trajectories into a centroid streamline. This centroid streamline serves as an approximation of the actual trajectory of a fasciculus.”

      - what is the effect of blood vessels on the structure tensor reconstruction?

      Fair point; that was not clear from our description. The clarification contains two parts. First, the structure tensor is estimated in all voxels, and in that sense, the blood vessels respond very similarly to axons. Second, the blood vessels do influence the sample statistics derived from the structure tensor analysis (FA histograms and the FODs), albeit only slightly, given their low volume percentage within the FOVs. In the monkey samples, segmenting the blood vessels was achievable with little effort, allowing us to exclude their contribution from FA statistics and FODs. To make this clear, we have added a paragraph to the Methods section (page 34) titled “Structure tensor-based quantifications,” reflecting this clarification. Additionally, we have restructured the entire structure tensor methods description (starting on page 32) in response to the reviewer comments in R1.6 and R1.9.

      - how was laminar separation quantified?

      We have added a clarification in the Results section (page 7): “The laminar thickness was determined by manual measurements on laminae visually identified in the 3D volume”.

      - discussion does not provide a biological background for this organization.

      A good point. Including the biological background is relevant as it supports the laminar organisation of white matter pathways observed in our findings and those of others.

      We have added a section on this background in the Discussion (page 24): “We believe our observed topological rule of white matter laminar organisation can be explained by a biological principle known from studies of nervous tissue development. The first axons to reach their destination, guided by their growth cones, are known as “pioneering” axons. “Follower” axons use the shaft of the pioneering axon for guidance to efficiently reach the target region (Breau and Trembleau 2023). Axons can form a fasciculus by fasciculating or defasciculating along their trajectory through a zippering or unzipping mechanism, controlled by chemical, mechanical, and geometrical parameters. Zippering “glues” the axons together, while unzipping allows them to defasciculate at a low angle (Šmít et al. 2017). Although speculative, the zippering mechanism may be responsible for forming the laminar topology observed across length scales. The defasciculation effect can explain our results in the corpus callosum (CC) of monkeys, with laminar structures at low angles (~35 degrees) also observed by (Innocenti et al. 2019; Caminiti et al. 2009), as well as in other major pathways (Sarubbo et al. 2019). In contrast, a fasciculation mechanism may be observed in the mouse CC (0 degrees). If the geometrical angle between two axons is high, i.e., toward 90 degrees, the zippering mechanism will not occur, and the two axons (fasciculi) will cross (Šmít et al. 2017). This supports our and other findings that crossing fasciculi or pathways occur at high angles toward 90 degrees in the fully matured brain (Wedeen et al. 2012). Once myelination begins, the zippering mechanism is lost (Šmít et al. 2017), suggesting that laminar topology is established at the earliest stages of brain maturation.”

      - additional samples from different regions in the same animals and from different animals would help strengthen this claim

      Reviewer #1 also pointed to the inclusion of additional samples, and this is now discussed as part of the study limitations on page 27 (see also R1.7).

      The main goal of the study is to bridge the organization of white matter across anatomical length scales and species. However, given the substantial difference in FOVs between the two imaging modalities used, and the absence of intermediate-resolution data, it remains difficult to effectively understand how these results can be used to inform conventional diffusion MRI. In this sense, the introduction does not do a good enough job of building a strong motivation for the scientific questions the authors are trying to answer with these experiments and for the specific methodology used.

      R2.2: Indeed, this is an essential point now emphasised in the introduction, page 3, which now states: “Despite the limited resolution of dMRI, the water diffusion process can reveal microstructural geometrical features, such as axons and cell bodies, though these features are compounded at the voxel level. Consequently, estimating microstructural characteristics depends on biophysical modelling assumptions, which can often be simplistic due to limited knowledge of the 3D morphology of cells and axons and their intermediate-level topological organisation within a voxel. Thus, complementary high-resolution imaging techniques that directly capture axon morphology and fasciculi organisation in 3D across different length scales within an MRI voxel are essential for understanding anatomy and improving the accuracy of dMRI-based models (Alexander et al. 2019).”

      Additionally, in the introduction, page 4, we have made the following changes to strengthen the link across modalities, such that it now states: “In the x-ray synchrotron data, we applied a scale-space structure tensor analysis, which allowed for the quantification of structure tensor-derived tissue anisotropy and FOD in the same anatomical regime indirectly detected by dMRI.”

      The cuprizone data represent a unique opportunity to explore the effect of demyelination on white matter tissue. However, this specific part of the study is not well motivated in the introduction and seems to represent a missed opportunity for further exploration of the qualitative and quantitative relationship between diffusion MRI and sub-micron tissue information (although unfortunately not within the same brain sample). This is especially true considering the diffusion MRI protocol for mice would allow extrapolation of advanced measures from different tissue compartments.

      R2.3: A similar point was raised by Reviewer 1 (R1.8), and we have clarified the motivation for including the healthy mice and the demyelination samples.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Many thanks to the authors for providing open data. This was very helpful when reviewing the manuscript and is a valuable resource for the community.

      R1.10: We are happy to share our data with the community. Understanding anatomy in 3D is hard to achieve through still images and animations, so the ability to explore it on your own is quite important. The link to the data repository has been added in the Methods section in the following paragraph: “Due to the size of the data selected, processed image volumes, masks and results are available at https://zenodo.org/records/10458911. Other datasets can be shared on request.”

      One confusing element of the paper is that orientations (or axes) do not seem to be consistent across samples/modalities. For example, the green tensors in Figures 3 C and D are tilted up/down in opposite directions and the streamlines in Figure 5A seem opposite (SL) from what we would expect from Figure 2A (SR). Having consistent orientations across modalities and images would help the reader. When colouring tensors (e.g. in Figure 3), the authors could consider a 3D colour scheme (similar to that used by diffusion MRI) rather than colouring by only inclination, as this would provide useful information on whether different laminae have similar orientations, as implied by the tractography in Figure 4.

      R1.11: Thank you for spotting the suboptimal consistency between Figures 2, 3, and 5. Figure 2 has been corrected and updated. The left-right direction in the coronal views was not correctly displayed. Additionally, the glyph directions have been updated in Figures 2 and 3.

      By default, we use the “standard” RGB colour scheme used in dMRI. However, for the monkey CC (essentially Figure 3), this did not effectively illustrate our findings. We decided to use a different directional colour encoding scheme, which captures the angular deviation from the L-R axis. This was to assist in the visualisation of the inclination angle between the laminae. We have used the same colour scheme for the tensors in Figure 3 to avoid confusion.

      On a general note, the standard colour scheme has uniform “colour contrast” in all directions, but when there is only a single dominant direction in the sample, it can make sense to concentrate the colour contrast in that axis.

      Results: "even higher FA anisotropy in the micro-tensor domain of 0.997, i.e., the micro (μ)FA (20, 21)." I understand these references lead to a definition of μFA that is based on multiple diffusion tensor encodings which is quite different from that suggested by Kaden. It may be preferable to reference Kaden directly (since I understand this is the method used) to avoid confusion.

      R1.12: Correctly spotted, and we now reference the method from Kaden et al. and use the other references elsewhere when relevant.

      "and scanned the mouse brain in a whole." - typo?

      R1.13: Thank you for spotting the typo. The mouse brain was kept in the skull during MRI scanning, which has been clarified in the Methods section.

      The crossing fibre region appears to be sometimes referred to as the centrum semiovale, and other times as the CST. CS seems the better description and keeping this naming consistent would avoid confusion to the reader.

      R1.14: Well spotted, thank you. We have replaced the usage of Corticospinal Tract (CST) with centrum semiovale (CS) where relevant.

      Direct comments on the text:

      Abstract: "Individual axon fasciculi exhibited tortuous paths .... in a manner independent of fibre complexity and demyelination"

      Do statistical comparisons of the various distributions support this? The data shows somewhat increased tortuosity in the CST compared to the CC, and somewhat lower tortuosity in CPZ tissue.

      R1.15: The intention of the text was not to point to the comparison of tortuosity, but rather to highlight the maximum deviation. We observe a high probability density of maximum deviations at approximately 5-10 microns in all samples, which corresponds to the size of structures in the extra-axonal environment, such as blood vessels and cells.

      Additionally, we understand that the original statement might imply an expectation of a statistical analysis demonstrating independence, which is not the case. To clarify, we have reformulated the sentence in the Abstract (page 2) to address these points: “Fasciculi exhibited non-straight paths around obstacles like blood vessels, comparable across the samples of varying fibre complexity and demyelination.”

      Abstract: "A quantitative analysis of tissue anisotropies and fibre orientation distributions gave consistent results for different anatomical length scales and modalities, while being dependent on the field-of-view."

      To my understanding, the FODs here from different modalities are calculated over different FOVs (in monkeys at least), and FODs are only presented for a single FOV for each modality, meaning it is difficult to separate the effects of modality from FOV. The microscopic anisotropy is also noticeably different across modalities (DESY < ESRF < dMRI).

      R1.16: That is a fair point. Our statement was trying to capture too much condensed content to be correctly interpretable. We have reformulated the sentence to state: “Quantifications of fibre orientation distributions were consistent across anatomical length scales and modalities, whereas tissue anisotropy had a more complex relationship, both dependent on the field-of-view”.

      While it is true that we only present the ST-derived quantifications (FOD and FA statistics) for a single FOV per modality and sample, the results shown for the ESRF monkey samples (Figures 3 and 4) are a merge of four individually processed volumes. The quantifications of each individual sub-FOV have now been added as a supplementary figure (Figure S3) to highlight the consistency of the methodology and the effect of shifting the FOV position. In the case of the mouse, we have two volumes from different mice, which also display similar FOD and FA statistics.

      Abstract: "Our study emphasises the need to balance field-of-view and voxel size when characterising white matter features across anatomical length scales."

      This point does not seem very well explored in the paper, rather it is an observation of the limitations of the different imaging modalities. For example, there aren't analyses to compare metrics from high-resolution data at different FOVs (i.e. by taking neighbourhoods of different sizes), nor are metrics compared from data at different resolutions and the same FOV.

      R1.17: The question is related to R1.16, R1.4, and R1.8, and we have addressed this point in our responses to those comments.

      Figure 7 - Taking into account the eigenvalues can be helpful when interpreting the secondary and tertiary eigenvectors of tensors (V2 and V3). It would be interesting to know whether the eigenvalues L2 ~= L3 are approximately equal (suggesting isotropic diffusion about V1, where the definition of V2 versus V3 isn't very meaningful), or if L2 is noticeably larger than L3 (suggesting anisotropic diffusion about V1, potentially similar to the anisotropic dispersion discussed above).

      R1.18: It would be interesting to explore the eigenvalues of the structure tensor in more detail, as has been done for the diffusion tensor. However, we believe this belongs to future work, as such additional detailed methodological analysis would complicate the already complex story. As mentioned in response to R1.10, most processed data has been made publicly available, and the rest can be requested (due to the storage size of the data sets) to perform such additional analysis.

      Discussion: "Importantly, our findings revealed common principles of fibre organisation in both monkeys and mice; small axonal fasciculi and major bundles formed sheet-like laminar structures," See above regarding the lack of evidence for laminar structures in mouse data.

      R1.19: We have reformulated the text for clarification as part of R1.3. Additionally, we added FOD quantifications to support why we do not observe an apparent laminar organisation in the mouse CC— please see our response to R1.6.

      Discussion: "Interestingly, the dispersion magnitude is indicative of fasciculi that skirt around obstacles in the white matter such as cells and blood vessels, and the results are largely independent of both white matter complexity (straight vs crossing fibre region) and pathology." Again, do statistical tests of the various distributions support this?

      R1.20: As part of R1.1, we have added statistical tests of significance for the quantifications of how max deviation changes when bending around objects. Indeed, the distributions are not statistically the same, and we do not wish to convey that sentiment, but they are comparable in the object sizes that they detect. As done in the abstract, we have reformulated the sentence to avoid misunderstanding and have replaced “largely independent” with “observed across.”

      Discussion: "Tax et al. have demonstrated the calculation of a sheet probability index from diffusion MRI data, which suggested the presence of sheet-like features in the CC"

      My understanding was that this was observed in crossing fibre regions, such as where fibres projecting with the CC cross the CST, but not the main body of the CC itself. Tax defines sheet structure as "composed of two tracts that cross each other on the same surface in certain regions along their trajectories." Is this a different phenomenon to the laminar structures observed here (where we observe fibres within a single tract being locally organised into laminar structures)?

      R1.21: Thank you for drawing our attention to this. We have corrected the section in the Discussion (page 23), so it now states: “Additionally, Tax et al. have demonstrated the calculation of a grid-crossing sheet probability index from diffusion MRI data, which suggested the presence of sheet-like features in a crossing fibre region (Tax et al. 2016), which is in line with our findings in the synchrotron data. Note that the method by Tax et al. only detects sheet-like structures crossing on a grid and does not reveal laminar structures with lower inclination angles, as we observed in the monkey CC.”

      Discussion: "We found that FODs were consistent across image resolutions and modalities, but only given that the FOV is the same." See above.

      R1.22: As part of our response to R1.6, we quantified the FODs using the ODI and DA indices, which should help support our statement. Nevertheless, we have toned down the statement and reformulated the text as follows: “We found that FODs were comparable across image resolutions and modalities. The observed discrepancies can be attributed to the fact that the FOVs are not exactly matched.”

      Discussion: "microscopic FA were highly correlated across modalities."

      The data shows FA is considerably lower in DESY to ESRF; within modality FA is quite consistent irrespective of tissue region; and differences between the CC and CG shown in ESRF data in mice are not repeated in DESY. It is unclear from the current data if this would lead to a high correlation across modalities. Some evidence would be helpful.

      R1.23: This is a fair point; we have not performed a correlation analysis. However, the pattern we observe for the synchrotron samples is as follows: When the anatomical length scale increases (becomes more macroscopic), the FA distribution shifts to lower values. This reflects the scale of information captured with the ST analysis (see also R1.9). Therefore, the most interesting comparison of FA statistics occurs when the resolution and anatomical length scale are approximately the same. The sentence in question has been reformulated to the following: “Estimates of structure tensor derived microscopic FA show a clear pattern across modalities.”

      Discussion: "If so, the (inclination angle) information might serve to form rules for low-resolution diffusion MRI based tractography about how best to project through bottleneck regions, which is currently a source of false-positives trajectories (6)."

      This is an interesting idea but it is unclear to me how this inclination information would help track through bottlenecks where, by definition, fibres are passing through with the same orientation. Some further explanation would be helpful.

      R1.24: We have elaborated on the section in the Discussion (page 23), explaining how this can be used to improve tractography tracing through complex regions: “The reason is that standard tractography methods do not "remember" or follow anatomical organisation rules as they trace through complex regions. Our findings on pathway lamination and inclination angles—low for parallel-like trajectories and high for crossing-like trajectories—can help incorporate trajectory memory into these methods, reducing the risk of false trajectories”.

      Reviewer #2 (Recommendations For The Authors):

      Below I report comments that if addressed I believe would improve the clarity and readability of the manuscript.

      -  Figures 1 and 2 would be more meaningful if combined into one figure. This would allow for a direct visual comparison of the two modalities. If space is needed, I believe the second row of Figure 1 (coronal views of CC) does not add much information. It is often hard to navigate the different orientations of the tissue in the images; thus any effort in trying to help the reader visually clarify would improve readability.

      R2.4: We considered the reviewer’s suggestion to merge Figures 2 and 3. However, this made both the figures and the main text additionally complex, so we chose to retain the original figure layout. Secondly, Figure 3 utilises a non-standard directional colormap. Keeping the colormap consistent within each figure is a feature we wish to preserve. In response to R1.11, the figures have been updated to have more consistent orientations for the monkey samples.

      In Figure 2, the second row, showing a coronal view of the CC, is essential for comparison with human data in Figure S1. It highlights where we observed the columnar laminar organisation and their inclination angle, as also detected by DTI.

      -  Figure 4 shows synchrotron data revealing an anterior-posterior component within the centrum semiovale that is not necessarily seen in the dMRI data. Could the authors comment on this?

      R2.5: Thank you for pointing this out. We have now addressed this in the Results section (page 10), where we describe the observation in detail: “Interestingly, visual inspection of the colour-coded structure tensor directions in Fig. 4E shows the existence of voxels whose primary direction is along the A-P axis. However, this represents a small enough portion of the volume that it does not appear as a distinct peak on the FOD.”

      -  The authors claim they observed several purple axons crossing orthogonally in Figure 5c. However, that is not necessarily clear in the figure.

      R2.6: We appreciate the feedback. We have now coloured the streamlines of the crossing fasciculi in Figure 5C in red.

      -  Figure 5 would benefit from adding the color encoding scheme for Figure 5d, as sometimes this is not necessarily consistent.

      R2.7: We appreciate the feedback. We have added an indication of the standard directional colour coding to Figure 5D.

      -  Figure 5d shows interesting data from the complex region. However, it is hard to visualize and it looks like there are not many streamlines traveling entirely I-S? Maybe a different orientation of the sample would help visualization.

      R2.8: A similar point was raised by Reviewer 1 (see R1.2). We have added an animation of the scene to assist in the interpretation of the 3D organisation within this complex sample.

      -  The concept of axon fasciculi is not necessarily immediately clear. Adding an explanation for what the authors refer to when using this term would improve clarity.

      R2.9: In the introduction, we now state our conceptual definition of an axon fasciculus as a number of axons that follow each other (see also R2.1).

      -  The methods do not provide details on how structure tensor FA is measured.

      R2.10: Thank you for pointing this out. We have restructured and expanded the structure tensor description in the Methods section (see also R1.9 and R2.1), which now includes the definition of FA.
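
      For readers unfamiliar with the quantity, the sketch below shows the standard fractional anisotropy formula from diffusion tensor imaging applied to three eigenvalues. This is only an assumption about the general form: the definition added to the Methods is not reproduced in this response, and a structure tensor FA may remap or reorder the eigenvalues before applying such a formula. The function name is hypothetical.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """Standard DTI-style FA computed from three tensor eigenvalues.

    Assumption: the same expression is applied to the eigenvalues of interest;
    any remapping of structure tensor eigenvalues is outside this sketch.
    """
    numerator = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denominator == 0:
        return 0.0  # all-zero tensor: treat as isotropic
    return math.sqrt(0.5 * numerator / denominator)


# Equal eigenvalues give 0; a single dominant eigenvalue gives 1.
print(fractional_anisotropy(1.0, 1.0, 1.0))
print(fractional_anisotropy(1.0, 0.0, 0.0))
```

      The expression is symmetric in its arguments, so the order in which the eigenvalues are passed does not change the result.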

      -  Why didn't the authors select the same cc region for both mice and monkeys? It seems this would have increased the strength of the comparison.

      R2.11: We agree. The reason lies in the chronology of experiments and the fact that we cannot control where demyelination takes place. We have added a clarifying description in the Methods section (page 31): “Note that several separate beamline experiments were conducted to collect the volumes listed in Table 1. In the first two experiments, samples from the monkey brain were scanned at ESRF and DESY, respectively. The samples from the mouse brain were imaged in two subsequent experiments. Consequently, the location of the identified demyelinating lesion in the cuprizone mice, which cannot be precisely controlled, did not match the location of the CC biopsies in the monkey.”

      -  While it is mentioned in the results, the methods do not explain how vessel segmentations or cell segmentation in mice was performed and for which datasets it was performed.

      R2.12: For the small ROI shown in Figure 6, the labelling was a manual process using the software ITK-SNAP, which has now been clarified in the corresponding figure caption. The generation of ROI masks and blood vessel segmentations involved a combination of intensity thresholding, morphological operations, and manual labelling in ITK-SNAP. This has been clarified in the restructured and expanded description of structure tensor analysis in the Methods section (starting on page 32).

      -  From the methods it is hard to understand (1) how many mice were used; (2) why dMRI was done on a different sample; (3) whether the same splenium region was selected for both healthy and CPZ animals; (4) how the registration across samples was performed.

      R2.13: We appreciate the feedback and have inserted clarifying statements in the relevant parts of the Methods section. (1) The total number of mice included was three: one normal, one cuprizone, and one normal for MRI scanning. (2) The quality of the collected dMRI on the mouse was too poor to use, and it could not be redone as the brain had already been sliced and prepared for synchrotron experiments. (3) The same splenium section was selected for both healthy and cuprizone mice. (4) A paragraph on image registration has been added.

      -  Diffusion MRI method sections would benefit from additional details on the protocols used.

      R2.14: Thank you for pointing this out. We have added more details about the diffusion MRI protocols, including the b-value, gradient strength, and other relevant parameters.

    1. Reviewer #2 (Public review):

      Summary:

      Here the authors address the idea that postural and movement control are differentially impacted with stroke. Specifically, they examined whether resting postural forces influenced several metrics of sensorimotor control (e.g., initial reach angle, maximum lateral hand deviation following a perturbation, etc.) during movement or posture. The authors found that resting postural forces influenced control only following the posture perturbation for the paretic arm of stroke patients, but not during movement. They also found that resting postural forces were greater when the arm was unsupported, which correlated with abnormal synergies (as assessed by the Fugl-Meyer). The authors suggest that these findings can be explained by the idea that the neural circuitry associated with posture is relatively more impacted by stroke than the neural circuitry associated with movement. They also propose a conceptual model that differentially weights the reticulospinal tract (RST) and corticospinal tract (CST) to explain greater relative impairments with posture control relative to movement control, due to abnormal synergies, in those with stroke.

      Comments on revisions:

      The authors should be commended for being very responsive to comments and providing several further requested analyses, which have improved the paper. However, there are still some outstanding issues that make it difficult to fully support the provided interpretation.

      The authors say within the response, "We would also like to stress that these perturbations were not designed so that responses are directly compared to each other ***(though of course there is an *indirect* comparison in the sense that we show influence of biases in one type of perturbation but not the other)***." They then state in the first paragraph of the discussion that "Remarkably, these resting postural force biases did not seem to have a detectable effect upon any component of active reaching but only emerged during the control of holding still after the movement ended. The results suggest a dissociation between the control of movement and posture." The main issue here is relying on indirect comparisons (i.e., significant in one situation but not the other), instead of relying on direct comparisons. Using a well-known example, just because one group / condition might display a significant linear relationship (i.e., slope_1 > 0) and another group / condition does not (slope_2 = 0), it does not necessarily follow that the two groups / conditions are statistically different from one another [see Figure 1 in Makin, T. R., & Orban de Xivry, J. J. (2019). Ten common statistical mistakes to watch out for when writing or reviewing a manuscript. eLife, 8, e48175.].
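
      The distinction between an indirect and a direct comparison can be illustrated with a minimal sketch. It assumes two independent groups with a simple linear relationship in each, and it tests the difference between the two slopes using a normal approximation. The function names `ols_slope` and `compare_slopes` are hypothetical and are not taken from the manuscript or the review; with small samples, an interaction term in a single regression model would usually be preferable to this approximation.

```python
import math


def ols_slope(x, y):
    """Return (slope, standard error of the slope) for simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used to estimate the error variance.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)
    return slope, se


def compare_slopes(x1, y1, x2, y2):
    """Directly test slope_1 - slope_2 (normal approximation, two-sided)."""
    b1, se1 = ols_slope(x1, y1)
    b2, se2 = ols_slope(x2, y2)
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under N(0, 1)
    return b1, b2, z, p
```

      The point of the sketch is that the quantity under test is the difference `b1 - b2` itself, rather than two separate tests of each slope against zero.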

      The authors have provided a reasonable rationale for why they chose different perturbation waveforms. Yet it still holds that these different waveforms would likely yield very different muscular responses, making it difficult to interpret the results, and this remains a limitation. From the paper, it is unknown how these different perturbations would differentially influence a variety of classic neuromuscular responses, including short-range stiffness and stretch reflexes, which would be at play here.

      Many of the results can be interpreted when one considers classic neuromuscular physiology. In Experiment 1, differences in resting postural bias in supported versus unsupported conditions can readily be explained since there is greater muscle activity in the unsupported condition that leads to greater muscle stiffness to resist mechanical perturbations (Rack, P. M., & Westbury, D. R. (1974). The short-range stiffness of active mammalian muscle and its effect on mechanical properties. The Journal of physiology, 240(2), 331-350.). Likewise, muscle stiffness would scale with changes in muscle contraction with synergies. Importantly for Experiment 2, muscle stiffness is reduced during movement (Rack and Westbury, 1974), which may explain why resting postural biases do not seem to be impacting movement. Likewise, muscle spindle activity is shown to scale with extrafusal muscle fiber activity and forces acting through the tendon (Blum, K. P., Campbell, K. S., Horslen, B. C., Nardelli, P., Housley, S. N., Cope, T. C., & Ting, L. H. (2020). Diverse and complex muscle spindle afferent firing properties emerge from multiscale muscle mechanics. eLife, 9, e55177.). The concern here is that the authors have not sufficiently considered muscle neurophysiology, how that might relate to their findings, and how that might impact their interpretation. Given the differences in perturbations and muscle states at different phases, the concern is that it is not possible to disentangle whether the results are due to classic neurophysiology, the hypothesis they propose, or both. Can the authors please comment?

      The authors should provide a limitations paragraph. They should address 1) how they used different perturbation force profiles, 2) the muscles were in different states which would change neuromuscular responses between trial phase / condition, 3) discuss a lack of direct statistical comparisons that support their hypothesis, and 4) provide a couple of paragraphs on classic neurophysiology, such as muscle stiffness and stretch reflexes, and how these various factors could influence the findings (i.e., whether they can disentangle whether the reported results are due to classic neurophysiology, the hypothesis they propose, or both).

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This is a manuscript from Batra et al. entitled "A FUCCI sensor reveals complex cell cycle organization of Toxoplasma endodyogeny". It describes the characterization of PCNA1 as a cell cycle marker in the parasite Toxoplasma gondii. Tachyzoite endodyogeny is a simplified division process that is crucial for the proliferation of the parasite. Some studies have used fluorescent markers to describe the segregation of organelles and the nuclear division during endodyogeny, but the production of more tools to dissect the cell cycle and better characterize mutants is timely. Most of the experiments are based on the characterization of a PCNA1 mutant and the use of a strain expressing a PCNA1-mNG construct. Unfortunately, there are a number of concerns in this study that need to be addressed.

      Major concerns:

      • The authors choose to describe PCNA1 and IMC3 as FUCCI markers. The efficiency of this system in mammalian cells is based on the proof that the markers are regulated through a rapid proteolysis process. However, the data available for these markers point toward a transcriptional regulation of these markers (Toxodb and (1)). In contrast, the authors do not provide any data indicating that these proteins are true FUCCI markers. Consequently, they should not use the term FUCCI throughout the paper unless they prove that the cell cycle expression depends on proteolysis. For example, the authors could express these genes with a promoter that is not cell cycle regulated.
      • The authors show that the localization of PCNA1 changes during the cell cycle and indicate that the PCNA1 aggregates observed are replication forks. They do not provide data supporting this. They should co-localize these aggregates with other markers such as ORC, MCM proteins or DNA polymerase to better characterize these aggregates. There are a number of techniques that could be used to localize the origin(s) of replication. Similarly, ExM should be used to characterize the colocalization between PCNA1 aggregates and the centromeres. As such, the images provided are of poor quality and do not support the authors' conclusions. The few PCNA1 aggregates toward the end of the S phase are also not characterized. Are they telomeres?
      • The authors characterized the proteins associated with PCNA1. All the proteins found to potentially interact are chromatin-bound and are not naturally found in other localizations (2). It is unclear why the authors insist on the fact that there are two PCNA1 complexes (one chromatin-bound and one non-chromatin-bound). More concerning is the lack of verification of this dataset, for example through reciprocal IP.
      • Quantification of some of the phenotypes is lacking. For example, the DNA content analyses are shown but the changes in numbers are not. Similarly, there is no quantification of the PCNA1 mutant phenotypes observed by ExM. Quantification of the bell shape observed by video-microscopy in figure 4 should also be provided.
      • The PCNA1 mutant phenotypes are not sufficiently explored by ExM. What happens to the mitotic spindle? What happens to the kinetochore (CenH3 is a centromere protein and does not represent kinetochores)? Many markers for these structures have been described, see (3).
      • The TgPCNA1NG strain raises a number of concerns. The localization to the daughter cell conoids seems artificial since it is not observed in the HA-AID mutant, and the expression pattern also seems different from that of the previous mutant, suggesting that the mNG tag is affecting the localization and expression dynamics. Did the authors explore other fluorescent proteins to verify that these discrepancies were not due to this tag?
      • Cytokinesis seems to be defined only by the presence of IMC3. The marker appears early during the budding process and is not normally considered a cytokinesis marker. The authors should revise the text to reflect this.
      • Throughout the manuscript, the authors seem to ignore an essential characteristic of the tachyzoite cell cycle: the nuclear cycle and the budding cycle are independently regulated. It is therefore normal that they overlap, as has been shown by the authors themselves in previous studies. This should be better described and discussed in the paper to understand the peculiarities of the parasite cell cycle.

      Minor

      • l196: "The surface of the growing buds": could the authors rephrase?
      • L217: proximal end of the nucleus rather than "parasite".

      • Behnke,M.S., Wootton,J.C., Lehmann,M.M., Radke,J.B., Lucas,O., Nawas,J., Sibley,L.D. and White,M.W. (2010) Coordinated progression through two subtranscriptomes underlies the tachyzoite cycle of Toxoplasma gondii. PloS One, 5, e12354.

      • Barylyuk,K., Koreny,L., Ke,H., Butterworth,S., Crook,O.M., Lassadi,I., Gupta,V., Tromer,E., Mourier,T., Stevens,T.J., et al. (2020) A Comprehensive Subcellular Atlas of the Toxoplasma Proteome via hyperLOPIT Provides Spatial Context for Protein Functions. Cell Host Microbe, 28, 752-766.e9.
      • Brusini,L., Dos Santos Pacheco,N., Tromer,E.C., Soldati-Favre,D. and Brochet,M. (2022) Composition and organization of kinetochores show plasticity in apicomplexan chromosome segregation. J. Cell Biol., 221.

      Significance

      This study provides the characterization of a new cell cycle marker to decipher the tachyzoite cell cycle of the apicomplexan parasite Toxoplasma gondii. A better understanding of the cell cycle is needed to prevent the proliferation of this parasite. This study builds on previous works characterizing organellar segregation in T. gondii. It provides data about the overlap of each cell cycle phase and the synchronicity of the cell cycle in a single vacuole. However, it is limited by the use of a single marker and more data are needed to support the conclusions of this study. This study can be of interest to a broad audience.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      I have reviewed, with interest, the manuscript "Psychological stress disturbs bone metabolism via miR-335-3p/Fos signaling in osteoclast". The described findings are relevant and useful for daily practice in periodontology. The paper is concise, professionally written, and easy to read. In this study, Jiayao et al. revealed the role of miR-335-3p in psychological stress-induced osteoporosis. CUMS mice were constructed to observe the femur phenotype, osteoclasts were identified as the primary research object, and miRNA-seq was used to find the key miRNAs linking the brain and peripheral tissues. This study showed that the expression of miR-335-3p was simultaneously reduced in mice's NAC, serum, and bone under psychological stress. The miR-335-3p/Fos/NFATC1 signaling pathway was validated in osteoclasts to reveal the potential mechanism of enhanced osteoclast activity under psychological stress. From a new perspective of miRNAs, this study indicates a possible cause of disturbed bone metabolism due to psychological stress and may suggest a new approach to treating osteoporosis.

      We thank this reviewer for the instructive suggestions and encouragement.

      Reviewer #2 (Public Review):

      Zhang et al. established a chronic unpredictable mild stress (CUMS) mouse model, which displayed an osteoporosis phenotype, suggesting a potential correlation between psychological stress and bone metabolism. They found that the miRNA candidate miR-335-3p is downregulated in the long bone of CUMS mice through microRNA sequencing and qRT-PCR experiments. They further demonstrated that miR-335-3p attenuates osteoclast activity via inhibiting Fos signaling, which can induce NFATC1 expression and regulate osteoclast activity.

      Strengths:

      The authors established a CUMS mouse model and confirmed the osteoporosis phenotype through careful characterization of bone and analysis of osteoclast activity. They performed microRNA sequencing to identify the miRNA candidate regulating the bone loss in the CUMS mouse model. They also validated the expression of miR-335-3p and interfered with the function of miR-335-3p through an in vitro assay. Overall, the findings from this study provide important hints for the correlation between psychological stress and bone metabolism.

      We thank this reviewer for the comprehensive summary and positive comment on our work.

      Weakness:

      The data provided by the authors are preliminary, especially the mechanistic insight, which needs to be enhanced. The authors have shown that miR-335-3p expression was altered in the CUMS mouse model and the change of its expression regulated osteoclast activity. The validation should be conducted in vivo, and the mechanism behind this should be investigated further.

      We thank the reviewer for the important insight on the need for further in vivo validation of the role of miR-335-3p. We therefore designed and produced Antagomir-335-3p (antagonist) and Agomir-335-3p (agonist), injected them through the tail vein for about 2 months, and observed the bone phenotype in each group of mice. The results suggested that the decrease of miR-335-3p in vivo could lead to bone loss, which was consistent with our in vitro validation results (Figure 5H-I).

      Reviewing Editor:

      Method

      (1) Bone histomorphometric analysis of bone formation and bone resorption following ASBMR's guidelines: The authors should follow ASBMR's guidelines for bone histomorphometry (PMCID: PMC3672237 and PMID: 3455637) to perform standard analyses of histomorphometry, rather than analyses of selected areas. They should also clearly describe the software used and define the areas analyzed.

      We carefully re-analyzed bone histomorphometry according to ASBMR guidelines and combined this with our own understanding. At the same time, we improved the description of micro-CT and histological analysis in the Methods. If there is still any lack of standardization, we would be grateful for any constructive suggestions to improve this.

      (2) Osteoclast cultures require nuclear staining to demonstrate multinucleated Trap positive cells.

      We used RAW264.7, a mouse macrophage-like cell line, for in vitro culture and induced its differentiation towards osteoclasts. Successfully induced osteoclasts showed enlarged cytoplasm and multinucleated fusion. Tartrate-resistant acid phosphatase (Trap) is the signature enzyme of osteoclasts. It can bind to the chromogen to exhibit a mauve color, based on the principle of azo-coupled immunohistochemistry. At the same time, the small, rounded, fused nuclei show a lighter color (author response image 1, yellow arrows). We attempted to stain the nuclei with hematoxylin based on this. However, this did not allow the contours of the nuclei to be distinguished clearly because the color was similar to that of the Trap-positive signals. Besides, many other scholars have assessed osteoclast activity in in vitro experiments based solely on the results of Trap staining (area and number) (Cheng et al., 2022; Li et al., 2019; Ma et al., 2021; Zhong et al., 2023). Nevertheless, in the immunofluorescence staining of osteoclasts, the nuclei were labeled using Hoechst stain to reflect the multinucleated fusion of osteoclasts (Figure 5G).

      (3) Osteoclast pit assays should be carried out to necessarily demonstrate the change of osteoclast resorption ability caused by miR-335-3p.

      We added osteoclast pit assays to validate the role of miR-335-3p on osteoclast resorptive capacity (Figure 5D-E).

      (4) Serum ELISA assay should be done to examine the global change of bone remodeling in the CUMS mice to assess bone formation and bone resorption that will support their claim.

      We performed additional tests on serum concentrations of bone γ-carboxyglutamic acid protein (BGP), TRAP, cathepsin K (CTSK), parathyroid hormone (PTH), calcium (Ca), and phosphate (P) in control and CUMS mice, which better reflect the global change of bone remodeling in the CUMS mice (Figure 3—figure supplement 1).

      (5) miRNA-seq: A labeled volcano plot should be used to replace the present one to show significant changes in differential gene expression.

      We appreciate this great suggestion. We replaced the volcano plot that showed significant changes in differential gene expression (Figure 4B). We also uploaded the raw data to the GEO database (GSE253504), making the results clearer and more accessible.

      Discussion

      The authors should discuss previous works on the influences of hormones from the brain on chronic stress-induced bone loss and an association of these influences with their findings.

      The discussion on the relationship between the bone metabolism regulation of both hormones and miR-335-3p in psychological stress was added in the second and fifth paragraphs of the discussion. To conclude, on the one hand, brain-derived and blood-transported miR-335-3p regulate bone metabolism synergistically. On the other hand, it exerted a more direct influence on bone under psychological stress.

      Language

      The language of the MS should be improved.

      The manuscript has been carefully edited by a professional proofreader.

      Reviewer #1 (Recommendations For The Authors):

      (1) Figure 1F: The exact meaning of the Waveform Graph shown at left needs to be clarified for the not-so-experienced reader.

      We added the more detailed meaning of the Waveform Graph in figure legends (Figure legend 1F).

      (2) Is the concomitant increase in osteogenic and osteoblastic activity in this study consistent with that seen in similar disease studies? This could be added to the discussion.

      In the fifth paragraph of the discussion section, we present the alterations of osteogenic and osteoblastic activity observed in other studies that are similar to ours. We also had a detailed discussion based on these observations.

      (3) Figure 6A: Please highlight the key information to visualize the potential linkage among miR-335-3p, Fos, and osteoclast.

      We highlighted the crucial linkage among miR-335-3p, Fos, and osteoclast with red arrows (Figure 6A).

      (4) Figure 6E: The specific area of the selected comparison needs to be clarified. Please add white dotted lines and lettering T (trabecular bone) and GP (growth plate) for the not-so-experienced reader. This will provide some orientation.

      We used white dotted lines as well as letters to label the tissue in immunofluorescence staining images (Figure 6E).

      (5) Line 350: "NAC derived and blood-trans, Ported miR-335-3p". There is a grammatical error. Please conduct general proofreading of the text and writing style.

      Thank you for pointing this out. We have corrected this grammatical error, and we also checked the full text to correct similar errors.

      Reviewer #2 (Recommendations For The Authors):

      (1) miR-335-3p was downregulated in the femur in the CUMS mice. The possible mechanism for this outcome should be further discussed. In Figure 4B, the Volcano plot showed that only a few miRNA were differentially expressed between the control and CUMS mice. How do the authors explain this?

      The chronic unpredictable mild stress (CUMS) model was constructed using normal mice. As the name of the model suggests, the stimulus is mild and does not cause developmental damage or teratogenic effects in mice. Rather, CUMS has the potential to result in chronic pathological conditions. Besides, in miRNA sequencing results from other tissues in models similar to ours, the number of differential miRNAs is also around a few dozen (Ma et al., 2019).

      (2) The authors have demonstrated that miR-335-3p inhibits osteoclast differentiation based on an in vitro assay in Figure 5; however, an in vivo experiment is required to provide more solid evidence.

      We strongly agree that in vivo experimental validation would bring more convincing results to this study. Therefore, we designed and produced Antagomir-335-3p (antagonist) and Agomir-335-3p (agonist), which were injected into mice via the tail vein every five days. Samples were collected at one and two months following the injection. We found that sustained two-month injections of antagomir could significantly lead to bone loss in mice (Figure 5H-I), which is consistent with our in vitro validation results.

      However, the Agomir-miR-335-3p group did not exhibit a notable enhancement of bone mass. This may be attributed to the fact that the 11-week-old normal mice selected for this study were in their prime and did not have strong osteoclastic activity in vivo. Therefore, the osteoclastic inhibition of Agomir-335-3p could not be demonstrated.

      In addition, no significant difference was seen one month after the injection. The main reason may be that this period was too short. On the one hand, the drug we injected was an RNA preparation, which lacks stability, resulting in poor delivery efficiency and a delayed effect. On the other hand, bone remodeling is also a time-consuming process.

      (3) FOS and NFATC1 should be expressed in the nuclei of the cells, therefore, the quality of the images needs to be improved.

      NFATC1 (nuclear factor of activated T cells 1) is a transcription factor that is activated in the nucleus to regulate the transcription of a variety of osteoclast-related genes, including ACP5, MMP9, etc. FOS can bind and interact with NFATC1, resulting in nuclear translocation and transcriptional activation. This can promote the differentiation and maturation of osteoclasts. Both proteins are synthesized and processed in the cytoplasm and eventually enter the nucleus to perform their functions. Therefore, they are expressed in both the nucleus and the cytoplasm (Deng et al., 2022; Hounoki et al., 2008; Li et al., 2022).

      In Figure 5G, we labeled cell nuclei with Hoechst stain (blue fluorescence), and more co-localized signals of nuclei (blue), FOS (red), and NFATC1 (green) were seen in the Inhibitor-miR-335-3p group, whereas the opposite result was observed in the Mimic-miR-335-3p group. These results indicated that inhibition of miR-335-3p could promote osteoclast differentiation in vitro.

      (4) The expression of FOS was elevated in CUMS group in Figure 6E; however, its mRNA level was unchanged, as shown in Figure 6 supplement; what is the explanation for this? How do the authors claim FOS is the downstream target if its mRNA expression is not impacted by CUMS?

      The results demonstrated that the targeted binding of miR-335-3p to Fos mRNA did not result in mRNA degradation. Instead, this binding interferes with the protein translation process, which ultimately leads to the reduction of FOS protein.

      (5) What would be the bone phenotype if a FOS inhibitor was injected into the control and CUMS mice? It is important to examine FOS function through an in vivo context.

      The regulatory role of FOS in osteoclasts has been validated in numerous articles, both in vivo and in vitro (Aikawa et al., 2008; Cao et al., 2023; Cheng et al., 2022). For example, Aikawa et al. designed a small-molecule inhibitor of c-Fos/activator protein-1 (AP-1) using three-dimensional (3D) pharmacophore modeling, which helped verify the effect of FOS on osteoclasts in vivo (Aikawa et al., 2008).

      We also strongly agree that in vivo injection of inhibitors of FOS, especially in CUMS mice, could further substantiate the role of miR-335-3p in osteoclasts under psychological stress. However, the study was constrained by the unavailability of commercially viable, efficacious small-molecule inhibitors of FOS. In the future, we plan to design more precise therapeutic targets for psychological stress-induced osteoporosis based on existing research ideas.

      Reference

      Aikawa, Y., Morimoto, K., Yamamoto, T., Chaki, H., Hashiramoto, A., Narita, H., Hirono, S., & Shiozawa, S. (2008). Treatment of arthritis with a selective inhibitor of c-Fos/activator protein-1. Nature Biotechnology, 26(7), 817-823. https://doi.org/10.1038/nbt1412

      Cao, Z., Niu, X. B., Wang, M. H., Yu, S. W., Wang, M. K., Mu, S. L., Liu, C., & Wang, Y. X. (2023). Anemoside B4 attenuates RANKL-induced osteoclastogenesis by upregulating Nrf2 and dampens ovariectomy-induced bone loss. Biomedicine & Pharmacotherapy, 167, 115454. https://doi.org/10.1016/j.biopha.2023.115454

      Cheng, X., Yin, C., Deng, Y., & Li, Z. (2022). Exogenous adenosine activates A2A adenosine receptor to inhibit RANKL-induced osteoclastogenesis via AP-1 pathway to facilitate bone repair. Molecular Biology Reports, 49(3), 2003-2014. https://doi.org/10.1007/s11033-021-07017-1

      Deng, W., Ding, Z., Wang, Y., Zou, B., Zheng, J., Tan, Y., Yang, Q., Ke, M., Chen, Y., Wang, S., & Li, X. (2022). Dendrobine attenuates osteoclast differentiation through modulating ROS/NFATc1/MMP9 pathway and prevents inflammatory bone destruction. Phytomedicine: International Journal of Phytotherapy and Phytopharmacology, 96, 153838. https://doi.org/10.1016/j.phymed.2021.153838

      Hounoki, H., Sugiyama, E., Mohamed, S. G.-K., Shinoda, K., Taki, H., Abdel-Aziz, H. O., Maruyama, M., Kobayashi, M., & Miyahara, T. (2008). Activation of peroxisome proliferator-activated receptor gamma inhibits TNF-alpha-mediated osteoclast differentiation in human peripheral monocytes in part via suppression of monocyte chemoattractant protein-1 expression. Bone, 42(4), 765-774. https://doi.org/10.1016/j.bone.2007.11.016

      Li, Y., Yang, C., Jia, K., Wang, J., Wang, J., Ming, R., Xu, T., Su, X., Jing, Y., Miao, Y., Liu, C., & Lin, N. (2022). Fengshi Qutong capsule ameliorates bone destruction of experimental rheumatoid arthritis by inhibiting osteoclastogenesis. Journal of Ethnopharmacology, 282, 114602. https://doi.org/10.1016/j.jep.2021.114602

      Li, Z., Huang, J., Wang, F., Li, W., Wu, X., Zhao, C., Zhao, J., Wei, H., Wu, Z., Qian, M., Sun, P., He, L., Jin, Y., Tang, J., Qiu, W., Siwko, S., Liu, M., Luo, J., & Xiao, J. (2019). Dual Targeting of Bile Acid Receptor-1 (TGR5) and Farnesoid X Receptor (FXR) Prevents Estrogen-Dependent Bone Loss in Mice. Journal of Bone and Mineral Research : the Official Journal of the American Society For Bone and Mineral Research, 34(4), 765-776. https://doi.org/10.1002/jbmr.3652

      Ma, K., Zhang, H., Wei, G., Dong, Z., Zhao, H., Han, X., Song, X., Zhang, H., Zong, X., Baloch, Z., & Wang, S. (2019). Identification of key genes, pathways, and miRNA/mRNA regulatory networks of CUMS-induced depression in nucleus accumbens by integrated bioinformatics analysis. Neuropsychiatric Disease and Treatment, 15, 685-700. https://doi.org/10.2147/NDT.S200264

      Ma, Q., Liang, M., Wu, Y., Luo, F., Ma, Z., Dong, S., Xu, J., & Dou, C. (2021). Osteoclast-derived apoptotic bodies couple bone resorption and formation in bone remodeling. Bone Research, 9(1), 5. https://doi.org/10.1038/s41413-020-00121-1

      Zhong, L., Lu, J., Fang, J., Yao, L., Yu, W., Gui, T., Duffy, M., Holdreith, N., Bautista, C. A., Huang, X., Bandyopadhyay, S., Tan, K., Chen, C., Choi, Y., Jiang, J. X., Yang, S., Tong, W., Dyment, N., & Qin, L. (2023). Csf1 from marrow adipogenic precursors is required for osteoclast formation and hematopoiesis in bone. eLife, 12. https://doi.org/10.7554/eLife.82112

    1. References

      Have you considered adding some of Nicole Letourneau's work? Here are some examples:

      Letourneau, N. L., de Koning, A. J., Sekhon, B., Ntanda, H. N., Kobor, M., Deane, A. J., ... & APrON Study Team. (2020). Parenting interacts with plasticity genes in predicting behavioral outcomes in preschoolers. Canadian Journal of Nursing Research, 52(4), 290-307.

      Ross, K. M., Cole, S., Sanghera, H., Anis, L., Hart, M., & Letourneau, N. (2021). The ATTACH™ program and immune cell gene expression profiles in mothers and children: A pilot randomized controlled trial. Brain, Behavior, & Immunity - Health, 18, 100358.

      Yu, Z., Cole, S., Ross, K., Hart, M., Anis, L., & Letourneau, N. (2024). Childhood Adversities and the ATTACH™ Program’s Influence on Immune Cell Gene Expression. International Journal of Environmental Research and Public Health, 21(6), 776.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      "Neural noise", here operationalized as an imbalance between excitatory and inhibitory neural activity, has been posited as a core cause of developmental dyslexia, a prevalent learning disability that impacts reading accuracy and fluency. This study is the first to systematically evaluate the neural noise hypothesis of dyslexia. Neural noise was measured using neurophysiological (electroencephalography [EEG]) and neurochemical (magnetic resonance spectroscopy [MRS]) in adolescents and young adults with and without dyslexia. The authors did not find evidence of elevated neural noise in the dyslexia group from EEG or MRS measures, and Bayes factors generally informed against including the grouping factor in the models. Although the comparisons between groups with and without dyslexia did not support the neural noise hypothesis, a mediation model that quantified phonological processing and reading abilities continuously revealed that EEG beta power in the left superior temporal sulcus was positively associated with reading ability via phonological awareness. This finding lends support for analysis of associations between neural excitatory/inhibitory factors and reading ability along a continuum, rather than as with a case/control approach, and indicates the relevance of phonological awareness as an intermediate trait that may provide a more proximal link between neurobiology and reading ability. Further research is needed across developmental stages and over a broader set of brain regions to more comprehensively assess the neural noise hypothesis of dyslexia, and alternative neurobiological mechanisms of this disorder should be explored.

      Strengths:

      The inclusion of multiple methods of assessing neural noise (neurophysiological and neurochemical) is a major advantage of this paper. MRS at 7T confers an advantage of more accurately distinguishing and quantifying glutamate, which is a primary target of this study. In addition, the subject-specific functional localization of the MRS acquisition is an innovative approach. MRS acquisition and processing details are noted in the supplementary materials according to the experts' consensus-recommended checklist (https://doi.org/10.1002/nbm.4484). Commenting on the rigor of the EEG methods is beyond my expertise as a reviewer.

      Participants recruited for this study included those with a clinical diagnosis of dyslexia, which strengthens confidence in the accuracy of the diagnosis. The assessment of reading and language abilities during the study further confirms the persistently poorer performance of the dyslexia group compared to the control group.

      The correlational analysis and mediation analysis provide complementary information to the main case-control analyses, and the examination of associations between EEG and MRS measures of neural noise is novel and interesting.

      The authors follow good practice for open science, including data and code sharing. They also apply statistical rigor, using Bayes Factors to support conclusions of null evidence rather than relying only on non-significant findings. In the discussion, they acknowledge the limitations and generalizability of the evidence and provide directions for future research on this topic.

      Weaknesses:

      Though the methods employed in the paper are generally strong, there are certain aspects that are not clearly described in the Materials & Methods section, such as a description of the statistical analyses used for hypothesis testing.

      Thank you for pointing this out. A description of the statistical models used in the analyses of EEG biomarkers has been added to the Materials and Methods:

      “First, exponent and offset values were averaged across all electrodes and analyzed using a 2x2 repeated measures ANOVA with group (dyslexic, control) as a between-subjects factor and condition (resting state, language task) as a within-subjects factor. Age was included in the analyses as a covariate due to the correlation between variables. Next, exponent and offset values were averaged across electrodes corresponding to the left (F7, FT7, FC5) and right inferior frontal gyrus (F8, FT8, FC6), and to the left (T7, TP7, TP9) and right superior temporal sulcus (T8, TP8, TP10). The electrodes were selected based on the analyses outlined by Giacometti and colleagues (2014) and Scrivener and Reader (2022). For these analyses, a 2x2x2x2 repeated measures ANOVA with age as a covariate was conducted with group (dyslexic, control) as a between-subjects factor and condition (resting state, language task), hemisphere (left, right), and region (frontal, temporal) as within-subjects factors. Results for the alpha and beta bands were calculated for the same clusters of frontal and temporal electrodes and analyzed with a similar 2x2x2x2 repeated measures ANOVA; however, for these analyses, age was not included as a covariate due to a lack of significant correlations.”
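
      The cluster averaging described above can be sketched as follows. This is only an illustration under stated assumptions: the electrode groupings are taken from the quoted text, while the function name `cluster_means` and the example values are hypothetical and are not drawn from the authors' code or data.

```python
import numpy as np

# Electrode clusters as listed in the quoted Methods text.
CLUSTERS = {
    ("left", "frontal"): ["F7", "FT7", "FC5"],
    ("right", "frontal"): ["F8", "FT8", "FC6"],
    ("left", "temporal"): ["T7", "TP7", "TP9"],
    ("right", "temporal"): ["T8", "TP8", "TP10"],
}


def cluster_means(values_by_electrode):
    """Average a per-electrode value (e.g. the exponent) within each cluster."""
    return {
        key: float(np.mean([values_by_electrode[ch] for ch in channels]))
        for key, channels in CLUSTERS.items()
    }


# Hypothetical exponent values for one participant in one condition.
example = {ch: 1.0 for chans in CLUSTERS.values() for ch in chans}
example.update({"F7": 1.2, "FT7": 1.5, "FC5": 0.9})
means = cluster_means(example)
```

      Each participant and condition would yield one value per hemisphere-by-region cell, which is the layout a repeated measures ANOVA with hemisphere and region as within-subjects factors expects.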

      We also expanded the description of the statistical models used in the analyses of MRS biomarkers:

      “To analyze the metabolite results, separate univariate ANCOVAs were conducted for Glu, GABA+, Glu/GABA+ ratio and Glu/GABA+ imbalance measures with group (control, dyslexic) as a between-subjects factor and voxel gray matter volume (GMV) as a covariate. Additionally, for the Glu analysis, age was included as a covariate due to a correlation between variables. Both frequentist and Bayesian statistics were calculated. Glu/GABA+ imbalance measure was calculated as the square root of the absolute residual value of a linear relationship between Glu and GABA+ (McKeon et al., 2024).”
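
      The imbalance measure described above can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares line with Glu regressed on GABA+ across participants. The quoted text does not state the direction of the regression, so that choice, like the function name `glu_gaba_imbalance`, is an assumption rather than the authors' implementation.

```python
import numpy as np


def glu_gaba_imbalance(glu, gaba):
    """Square root of the absolute residual from a linear Glu ~ GABA+ fit.

    Assumption: Glu is the dependent variable; the source only mentions
    "a linear relationship between Glu and GABA+".
    """
    glu = np.asarray(glu, dtype=float)
    gaba = np.asarray(gaba, dtype=float)
    slope, intercept = np.polyfit(gaba, glu, deg=1)  # least-squares line
    residuals = glu - (slope * gaba + intercept)
    return np.sqrt(np.abs(residuals))
```

      Participants lying close to the fitted line receive values near zero, while larger values indicate a greater departure from the group-level Glu and GABA+ relationship in either direction.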

      With regard to metabolite quantification, it is unclear why the authors chose to analyze and report metabolite values in terms of creatine ratios rather than quantification based on a water reference given that the MRS acquisition appears to support using a water reference.

      We have decided to use the ratio of Glu and GABA to total creatine (tCr), as this is still a common practice in MRS studies at 7T (e.g., Nandi et al., 2022; Smith et al., 2021). This approach normalizes the signal, reducing the impact of intensity variations across different regions and tissue compositions. Additionally, total creatine concentration is considered relatively stable across different brain regions, which is particularly important in our study, where a functional localizer was used to establish the left STS region individually. Our decision was further influenced by previous studies on dyslexia (Del Tufo et al., 2018; Pugh et al., 2014) which have reported creatine ratios and included GM volume as a covariate in their models, thus providing comparability. It is now indicated in the Results:

      “For comparability with previous studies in dyslexia (Del Tufo et al., 2018; Pugh et al., 2014) we report Glu and GABA as a ratio to total creatine (tCr).”

      and in the Method sections:

      “Glu and GABA+ concentrations were expressed as a ratio to total-creatine (tCr; Creatine + Phosphocreatine) following previous MRS studies in dyslexia (Del Tufo et al., 2018; Pugh et al., 2014).”

      We did not estimate absolute concentrations using water signals as a reference, as this would require accounting for water relaxation times, which may vary across our age range. Nevertheless, our dataset has been made publicly available for future researchers to calculate and compare absolute values.

      Del Tufo, S. N., Frost, S. J., Hoeft, F., Cutting, L. E., Molfese, P. J., Mason, G. F., Rothman, D. L., Fulbright, R. K., & Pugh, K. R. (2018). Neurochemistry Predicts Convergence of Written and Spoken Language: A Proton Magnetic Resonance Spectroscopy Study of Cross-Modal Language Integration. Frontiers in Psychology, 9, 1507. https://doi.org/10.3389/fpsyg.2018.01507

      Nandi, T., Puonti, O., Clarke, W. T., Nettekoven, C., Barron, H. C., Kolasinski, J., Hanayik, T., Hinson, E. L., Berrington, A., Bachtiar, V., Johnstone, A., Winkler, A. M., Thielscher, A., Johansen-Berg, H., & Stagg, C. J. (2022). tDCS induced GABA change is associated with the simulated electric field in M1, an effect mediated by grey matter volume in the MRS voxel. Brain Stimulation, 15(5), 1153–1162. https://doi.org/10.1016/j.brs.2022.07.049

      Pugh, K. R., Frost, S. J., Rothman, D. L., Hoeft, F., Del Tufo, S. N., Mason, G. F., Molfese, P. J., Mencl, W. E., Grigorenko, E. L., Landi, N., Preston, J. L., Jacobsen, L., Seidenberg, M. S., & Fulbright, R. K. (2014). Glutamate and choline levels predict individual differences in reading ability in emergent readers. Journal of Neuroscience, 34(11), 4082–4089. https://doi.org/10.1523/JNEUROSCI.3907-13.2014

      Smith, G. S., Oeltzschner, G., Gould, N. F., Leoutsakos, J. S., Nassery, N., Joo, J. H., Kraut, M. A., Edden, R. A. E., Barker, P. B., Wijtenburg, S. A., Rowland, L. M., & Workman, C. I. (2021). Neurotransmitters and Neurometabolites in Late-Life Depression: A Preliminary Magnetic Resonance Spectroscopy Study at 7T. Journal of Affective Disorders, 279, 417–425. https://doi.org/10.1016/j.jad.2020.10.011

      GABA is typically quantified using J-editing sequences at lower field strengths (~3T), and there is some evidence that the GABA signal can be reliably measured at 7T without editing. However, the authors should discuss potential limitations, such as the reliability of Glu and GABA measurements with short-TE semi-LASER at 7T.

      In addition, MRS measurements of GABA are known to be influenced by macromolecules, and GABA is often denoted as GABA+ to indicate that other compounds contribute to the measured signal, especially at a short TE and in the absence of symmetric spectral editing.

      A general discussion of the strengths and limitations of unedited Glu and GABA quantification at 7T is warranted given the interest of this work to researchers who may not be experts in MRS.

      While we agree with the Reviewer that at 3T, it is recommended to use J-edited MRS to measure GABA (Mullins et al., 2014), the better spectral resolution at 7T allows for more reliable results for both metabolites using moderate echo-time, non-edited MRS (Finkelman et al., 2022). In this study, we used a short echo time (TE), which is optimal for Glu but not ideal for GABA, as it interferes with other signals. We are grateful to the Reviewer for suggesting the addition of a short paragraph to the Discussion, describing the practicalities of 3T and 7T MRS and changing the abbreviation to GABA+ to inform readers of possible macromolecule contamination:

      “We chose ultra-high-field MRS to improve data quality (Özütemiz et al., 2023), as the increased sensitivity and spectral resolution at 7T allows for better separation of overlapping metabolites compared to lower field strengths. Additionally, 7T provides a higher signal-to-noise ratio (SNR), improving the reliability of metabolite measurements and enabling the detection of small changes in Glu and GABA concentrations. Despite these theoretical advantages, several practical obstacles should be considered, such as susceptibility artifacts and inhomogeneities at higher field strengths that can impact data quality. Interestingly, actual methodological comparisons (Pradhan et al., 2015; Terpstra et al., 2016) show only a slight practical advantage of 7T single-voxel MRS compared to optimized 3T acquisition. For example, fitting quality yielded reduced estimates of variance in concentration of Glu in 7T (CRLB) and slightly improved reproducibility levels for Glu and GABA (at both fields below 5%). Choosing the appropriate MRS sequence involves a trade-off between the accuracy of Glu and GABA measurements, as different sequences are recommended for each metabolite. J-edited MRS is recommended for measuring GABA, particularly with 3T scanners (Mullins et al., 2014). However, at 7T, more reliable results can be obtained using moderate echo-time, non-edited MRS (Finkelman et al., 2022). We have opted for a short-echo-time sequence, which is optimal for measuring Glu. However, this approach results in macromolecule contamination of the GABA signal (referred to as GABA+).”

      Finkelman, T., Furman-Haran, E., Paz, R., & Tal, A. (2022). Quantifying the excitatory-inhibitory balance: A comparison of SemiLASER and MEGA-SemiLASER for simultaneously measuring GABA and glutamate at 7T. NeuroImage, 247, 118810. https://doi.org/10.1016/j.neuroimage.2021.118810

      Mullins, P. G., McGonigle, D. J., O'Gorman, R. L., Puts, N. A., Vidyasagar, R., Evans, C. J., Cardiff Symposium on MRS of GABA, & Edden, R. A. (2014). Current practice in the use of MEGA-PRESS spectroscopy for the detection of GABA. NeuroImage, 86, 43–52. https://doi.org/10.1016/j.neuroimage.2012.12.004

      Özütemiz, C., White, M., Elvendahl, W., Eryaman, Y., Marjańska, M., Metzger, G. J., Patriat, R., Kulesa, J., Harel, N., Watanabe, Y., Grant, A., Genovese, G., & Cayci, Z. (2023). Use of a Commercial 7-T MRI Scanner for Clinical Brain Imaging: Indications, Protocols, Challenges, and Solutions-A Single-Center Experience. AJR. American Journal of Roentgenology, 221(6), 788–804. https://doi.org/10.2214/AJR.23.29342

      Pradhan, S., Bonekamp, S., Gillen, J. S., Rowland, L. M., Wijtenburg, S. A., Edden, R. A., & Barker, P. B. (2015). Comparison of single voxel brain MRS AT 3T and 7T using 32-channel head coils. Magnetic Resonance Imaging, 33(8), 1013–1018. https://doi.org/10.1016/j.mri.2015.06.003

      Terpstra, M., Cheong, I., Lyu, T., Deelchand, D. K., Emir, U. E., Bednařík, P., Eberly, L. E., & Öz, G. (2016). Test-retest reproducibility of neurochemical profiles with short-echo, single-voxel MR spectroscopy at 3T and 7T. Magnetic Resonance in Medicine, 76(4), 1083–1091. https://doi.org/10.1002/mrm.26022

      Further, the single MRS voxel location is a limitation of the study as neurochemistry can vary regionally within individuals, and the putative excitatory/inhibitory imbalance in dyslexia may appear in regions outside the left temporal cortex (e.g., network-wide or in frontal regions involved in top-down executive processes). While the functional localization of the MRS voxel is a novelty and a potential advantage, it is unclear whether voxel placement based on left-lateralized reading-related neural activity may bias the experiment to be more sensitive to small, activity-related fluctuations in neurotransmitters in the CON group vs. the DYS group who may have developed an altered, compensatory reading strategy.

      We agree that including only one region of interest for the MRS measurements is a potential limitation of our study, and we have now added this information to the Discussion:

      “Moreover, since the MRS data was collected only from the left STS, it is plausible that other areas might be associated with differences in Glu or GABA concentrations in dyslexia.”

      However, differences in Glu and GABA concentrations in this region were directly predicted by the neural noise hypothesis of dyslexia. We acknowledge that this information was missing in the previous version of the manuscript. It is now included in the Results:

      “Moreover, the neural noise hypothesis of dyslexia identifies perisylvian areas as being affected by increased glutamatergic signaling, and directly predicts associations between Glu and GABA levels in the superior temporal regions and phonological skills (Hancock et al., 2017).”

      as well as in the Discussion:

      “Nevertheless, the neural noise hypothesis predicted increased glutamatergic signaling in perisylvian regions, specifically in the left superior temporal cortex (Hancock et al., 2017).”

      Figure 1 contains a lot of information, and it may be helpful to split it into 2 figures (EEG vs. MRS) so that the plots could be made larger and the reader could more easily digest the information.

      (a) I would also recommend displaying separate metabolite fit plots for each group, since the current presentation in panel F makes it appear that the MRS data is examined by testing differences between groups across the full spectrum (where the lines diverge), which really isn't the case.

      (b) The GABA peak is not visible in the spectrum, and Glutamate and GABA both have multiple peaks that should be shown on the spectrum. This may be best achieved by displaying the individual metabolite sub-spectra below the full spectrum

      Thank you for these suggestions. We have split the information into two Figures following the Reviewer’s recommendations.

      It is not clear why the 3T structural images were used for segmentation and calculation of tissue fraction if 7T structural images were also acquired (which would presumably have higher resolution).

      Generally, T1-weighted images from the 7T scanner exhibit more artifacts than those from the 3T scanner due to higher magnetic field inhomogeneity. These artifacts are especially pronounced in regions near air-tissue interfaces, such as the temporal lobes. Therefore, we chose the 3T structural images for segmentation and tissue fraction calculations and clarified this in the Method section:

      “Voxel segmentation was performed on structural images from a 3T scanner, coregistered to 7T structural images in SPM12, as the latter exhibited excessive artifacts and intensity bias in the temporal regions”.

      The basis set includes a large number of metabolites (27), including many low-concentration metabolites/compounds (e.g., bHG, bHB, Citrate, Threonine, ethanol) that are typically only included in studies targeting specific metabolites in disease/pathology. Please justify the inclusion of this maximal set of metabolites in the basis set, given that the inclusion of overlapping low-concentration metabolites may influence metabolite measurements of interest (https://doi.org/10.1002/mrm.10246).

      There is still no consensus in the MR community on which metabolites should be included in the model of human cerebral 1H-MR spectra. Typically, only major contributors such as NAA, Cr, Cho, Lac, mI, and possibly Glx are evaluated. Some studies also include additional metabolites like Ace, Ala, Asp, GABA, Glc, Gly, sI, NAAG, and Tau. In this study, as in a few others, further metabolites such as PCh, GPC, PCr, GSH, PE, and Thr were introduced and this approach seems suitable for high-field spectra (Hofmann et al., 2002).

      Hofmann, L., Slotboom, J., Jung, B., Maloca, P., Boesch, C., & Kreis, R. (2002). Quantitative 1H-magnetic resonance spectroscopy of human brain: Influence of composition and parameterization of the basis set in linear combination model-fitting. Magnetic Resonance in Medicine, 48(3), 440–453. https://doi.org/10.1002/mrm.10246

      Please provide a figure indicating the localization of the MRS voxel for a sample subject.

      A figure indicating the localization of the MRS voxel for a sample subject was added to the MRS checklist.

      It would be helpful to include Table S1 in the main article.

      Table S1 from the Supplementary Material has now been added to the main manuscript as Table 1 in the Results section.

      Please report descriptive statistics for EEG and MRS measures in Table S1.

      We have added a new Table S1 in the Supplementary Material, providing descriptive statistics for EEG and MRS E/I balance measures, presented separately for the dyslexic and control groups.

      I recommend avoiding using the terms "direct" and "indirect" to contrast MRS and EEG measures of E/I balance. Both of these measures are imperfect and it is misleading to say that MRS is a "direct" measure of neurotransmitters. There is also ambiguity in what is meant by "direct": in contrast to EEG, MRS does not measure neural activity and does not provide high-resolution temporal information, so in a sense, it is less direct.

      Thank you for this suggestion. We have replaced the terms 'direct' and 'indirect' biomarkers with 'MRS' and 'EEG' biomarkers throughout the text.

      There are many cases throughout the results in which Bayes and frequentist stats seem to contradict each other in terms of significance and what should be included in the models, especially with regard to the interaction effects (the Bayes factors appear to favor non-significant interactions). I think this is worth considering and describing to offer more clarity for the readers.

      We agree that a discussion of the divergent results between Bayesian and frequentist models was missing in the previous version of the manuscript. To provide greater clarity for the readers, we have conducted follow-up Bayesian t-tests in every case where the results indicated the inclusion of non-significant interactions with the effect of group in the model. These additional analyses have been performed for the exponent, offset, as well as for beta bandwidth in the Supplementary Material. We have also added a paragraph addressing these discrepancies in the Discussion:

      “Remarkably, in some models, results from Bayesian and frequentist statistics yielded divergent conclusions regarding the inclusion of non-significant effects. This was observed in more complex ANOVA models, whereas no such discrepancies appeared in t-tests or correlations. Given reports of high variability in Bayesian ANOVA estimates across repeated runs of the same analysis (Pfister, 2021), these results should be interpreted with caution. Therefore, following the recommendation to simplify complex models into Bayesian t-tests for more reliable estimates (Pfister, 2021), we conducted follow-up Bayesian t-tests in every case that favored the inclusion of non-significant interactions with the group factor. These analyses provided further evidence for the lack of differences between the dyslexic and control groups. Another source of discrepancy between the two methods may stem from the inclusion of interactions between covariates and within-subject effects in frequentist ANOVA, which were not included in Bayesian ANOVA to adhere to the recommendation for simpler Bayesian models (Pfister, 2021).”

      Pfister, R. (2021). Variability of Bayes factor estimates in Bayesian analysis of variance. The Quantitative Methods for Psychology, 17(1), 40-45. doi:10.20982/tqmp.17.1.p040

      It would be helpful to indicate whether participants in the DYS group had a history of reading intervention/remediation. In addition to showing that the DYS group performed lower than the CON group on reading assessments as a whole and given their age, was the performance on the reading assessments at an individual level considered for inclusion in the study? (i.e., were participants' persistent poor reading abilities confirmed with the research assessments?)

      We were unable to assess individual reading skills due to the lack of standardized diagnostic norms for adult dyslexia in Poland. Therefore, participants in the dyslexic group were recruited based on a previous clinical diagnosis of dyslexia, and reading and reading-related tasks were used for group-level comparisons only. This information has been added to the Methods section:

      “Since there are no standardized diagnostic norms for dyslexia in adults in Poland, individuals were assigned to the dyslexic group based on a past diagnosis of dyslexia.”

      Unfortunately, we did not collect information about participants' history of reading intervention or remediation. In this context, we acknowledge that including a sample of adult participants is a potential limitation of our study; however, this was already mentioned in the Discussion.

      Regarding the fMRI task, please indicate whether the participants whose threshold and/or contrast was changed for localization were from the DYS or CON group.

      This information is now added to the Method section:

      “For 6 participants (DYS n = 2, CON n = 4), the threshold was lowered to p < .05 uncorrected, while for another 6 participants (DYS n = 3, CON n = 3) the contrast from the auditory run was changed to auditory words versus fixation cross due to a lack of activation for other contrasts.”

      Reviewer #2 (Public Review):

      Summary:

      This study utilized two complementary techniques (EEG and 7T MRI/MRS) to directly test a theory of dyslexia: the neural noise hypothesis. The authors report finding no evidence to support an excitatory/inhibitory balance, as quantified by beta in EEG and Glutamate/GABA ratio in MRS. This is important work and speaks to one potential mechanism by which increased neural noise may occur in dyslexia.

      Strengths:

      This is a well-conceived study with in-depth analyses and publicly available data for independent review. The authors provide transparency with their statistics and display the raw data points along with the averages in figures for review and interpretation. The data suggest that an E/I balance issue may not underlie deficits in dyslexia and is a meaningful and needed test of a possible mechanism for increased neural noise.

      Weaknesses:

      The researchers did not include a visual print task in the EEG task, which limits analysis of reading-specific regions such as the visual word form area, which is a commonly hypoactivated region in dyslexia. This region is a common one of interest in dyslexia, yet the researchers measured the I/E balance in only one region of interest, specific to the language network.

      We agree with the Reviewer that including different tasks for the EEG biomarkers assessment would be valuable. However, this limitation was already addressed in the Discussion:

      “Importantly, our study focused on adolescents and young adults, and the EEG recordings were conducted during rest and a spoken language task. These factors may limit the generalizability of our results. Future research should include younger populations and incorporate a broader array of tasks, such as reading and phonological processing, to provide a more comprehensive evaluation of the E/I balance hypothesis.”

      Further, this work does not consider prior studies reporting neural inconsistency; a potential consequence of increased neural noise, which has been reported in several studies and linked with candidate-dyslexia gene variants (e.g., Centanni et al., 2018, 2022; Hornickel & Kraus, 2013; Neef et al., 2017). While E/I imbalance may not be a cause of increased neural noise, other potential mechanisms remain and should be discussed.

      Thank you for referring us to other works reporting neural variability in dyslexia. We agree that a broader context regarding sources of reduced neural synchronization, beyond E/I imbalance, was missing in the previous version of the manuscript. We have now included these references in the Discussion:

      “Furthermore, although our results do not support the idea of E/I balance alterations as a source of neural noise in dyslexia, they do not preclude other mechanisms leading to less synchronous neural firing posited by the hypothesis. In this context, there is evidence showing increased trial-to-trial inconsistency of neural responses in individuals with dyslexia (Centanni et al., 2022) or poor readers (Hornickel and Kraus, 2013) and its associations with specific dyslexia risk genes (Centanni et al., 2018; Neef et al., 2017). At the same time, the observed trial-to-trial inconsistency was either present only in a subset of participants (Centanni et al., 2018), limited to some experimental conditions (Centanni et al., 2022), or specific brain regions – e.g., brainstem in Hornickel and Kraus (2013), left auditory cortex in Centanni et al. (2018), or left supramarginal gyrus in Centanni et al. (2022).”

      A better description of the exponent and offset components is needed at the beginning of the results, given that the methods are presented in detail at the end. I also do not see a clear description of these components in the methods.

      A description of the aperiodic components is now included in the Results:

      “In the initial step of the analysis, we analyzed the aperiodic (exponent and offset) components of the EEG spectrum. The exponent reflects the steepness of the EEG power spectrum, with a higher exponent indicating a steeper signal; while the offset represents a uniform shift in power across frequencies, with a higher offset indicating greater power across the entire EEG spectrum (Donoghue et al., 2020).”

      as well as in the Materials and Methods:

      “Two broadband aperiodic parameters were extracted: the exponent, which quantifies the steepness of the EEG power spectrum, and the offset, which indicates the signal’s power across the entire frequency spectrum.”
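
      To make the two parameters concrete, the sketch below fits a straight line to a power spectrum in log-log space, where the intercept plays the role of the offset and the negated slope the role of the exponent. It is a simplified illustration only: it does not separate oscillatory peaks from the aperiodic component, as a full spectral parameterization would, and the function name `fit_aperiodic` and the synthetic spectrum are assumptions, not the authors' pipeline.

```python
import numpy as np


def fit_aperiodic(freqs, power):
    """Fit log10(power) = offset - exponent * log10(freq) by least squares."""
    log_f = np.log10(np.asarray(freqs, dtype=float))
    log_p = np.log10(np.asarray(power, dtype=float))
    slope, intercept = np.polyfit(log_f, log_p, deg=1)
    return intercept, -slope  # (offset, exponent)


# Synthetic, peak-free spectrum with a known offset (1.5) and exponent (2.0).
freqs = np.arange(1.0, 41.0)
power = 10.0 ** 1.5 / freqs ** 2.0
offset, exponent = fit_aperiodic(freqs, power)
```

      A steeper spectrum gives a larger exponent, while shifting the whole spectrum up or down changes only the offset.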

      Reviewer #3 (Public Review):

      Summary:

      This study by Glica and colleagues utilized EEG (i.e., Beta power, Gamma power, and aperiodic activity) and 7T MRS (i.e., MRS IE ratio, IE balance) to reevaluate the neural noise hypothesis in Dyslexia. Supported by Bayesian statistics, their results show solid 'no evidence' of EI balance differences between groups, challenging the neural noise hypothesis. The work will be of broad interest to neuroscientists, and educational and clinical psychologists.

      Strengths:

      Combining EEG and 7T MRS, this study utilized both the indirect (i.e., Beta power, Gamma power, and aperiodic activity) and direct (i.e., MRS IE ratio, IE balance) measures to reevaluate the neural noise hypothesis in Dyslexia.

      Weaknesses:

      The authors may need to provide more data to assess the quality of the MRS data.

      We have addressed the following specific recommendations of the Reviewer providing more data about the quality of the MRS data.

      The authors may need to explain how the number of subjects is determined in the MRS section.

      We have clarified the MRS sample description in the Results section:

      “Due to financial and logistical constraints, 59 out of the 120 recruited subjects, selected progressively as the study unfolded, were examined with MRS. Subjects were matched by age and sex between the dyslexic and control groups. Due to technical issues and to prevent delays and discomfort for the participants, we collected 54 complete sessions. Additionally, four datasets were excluded based on our quality control criteria, and three GABA+ estimates exceeded the selected CRLB threshold. Ultimately, we report 50 estimates for Glu (21 participants with dyslexia) and 47 for GABA+ and Glu/GABA+ ratios (20 participants with dyslexia).”
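
      The sample accounting in the passage above can be checked with simple arithmetic. The counts are taken from the quoted text; the variable names are illustrative only.

```python
# Counts as stated in the quoted Results text.
mrs_examined = 59              # subjects examined with MRS
complete_sessions = 54         # complete sessions collected
excluded_quality_control = 4   # datasets excluded by quality control
gaba_over_crlb_threshold = 3   # GABA+ estimates above the CRLB threshold

incomplete_sessions = mrs_examined - complete_sessions
glu_estimates = complete_sessions - excluded_quality_control
gaba_estimates = glu_estimates - gaba_over_crlb_threshold
```

      These figures reproduce the reported 50 estimates for Glu and 47 for GABA+ and the Glu/GABA+ ratio.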

      Is there a reason why theta and gamma peaks were not observed in the majority of participants? What are the possible reasons that likely caused the discrepancy between this study and previously reported relevant studies?

      We have now added a discussion about the absence of oscillatory peaks in the theta and gamma bands to the Discussion section:

      “We could not perform analyses for the gamma oscillations since in the majority of participants the gamma peak was not detected above the aperiodic component. Due to the 1/f properties of the EEG spectrum, both aperiodic and periodic components should be disentangled to analyze ‘true’ gamma oscillations; however, this approach is not typically recognized in electrophysiology research (Hudson and Jones, 2022). Indeed, previous studies that analyzed gamma activity in dyslexia (Babiloni et al., 2012; Lasnick et al., 2023; Rufener and Zaehle, 2021) did not separate the background aperiodic activity. For the same reason, we could not analyze results for the theta band, which often does not meet the criteria for an oscillatory component manifested as a peak in the power spectrum (Klimesch, 1999). Moreover, results from a study investigating developmental changes in both periodic and aperiodic components suggest that theta oscillations in older participants are mostly observed in frontal midline electrodes (Cellier et al., 2021), which were not analyzed in the current study.”

      Hudson, M. R., & Jones, N. C. (2022). Deciphering the code: Identifying true gamma neural oscillations. Experimental Neurology, 357, 114205. https://doi.org/10.1016/j.expneurol.2022.114205

      Klimesch, W. (1999). EEG alpha and theta oscillations reflect cognitive and memory performance: A review and analysis. Brain Research Reviews, 29(2-3), 169-195. https://doi.org/10.1016/S0165-0173(98)00056-3

      Based on Figure 1F, the quality of the MRS data may be contaminated by the lipid signal, especially for the DYS group. To better evaluate the MRS data, especially the GABA measurements, the authors need to show:

      (a) the placement of the MRS voxel on the anatomical images;

      Averaged MRS voxel placement was already presented in Figure 1 (now Figure 2) in the manuscript. Now, we have also added exemplary single-subject images to the MRS checklist in the Supplement.

      (b) Glu and GABA model functions

      We have now provided more meaningful Glu and GABA indications in Figure 2.

      (c) CRLB for GABA

      We have added respective estimates to the Supplement:

      %CRLB of Glu: mean 2.96, SD = 0.79

      %CRLB of GABA: mean 10.59, SD = 2.76

      %CRLB of NAA: mean 1.76, SD = 0.46

      Further, the authors added voxel's gray matter volume as a covariate when performing separate ANCOVAs. The authors may need to use alpha correction or 1-fCSF correction to corroborate these results.

      We chose to use the ratio of Glu and GABA to total creatine (tCr), as this remains a common practice in MRS studies at 7T (e.g., Nandi et al., 2022; Smith et al., 2021). This decision was also influenced by previous dyslexia studies (Del Tufo et al., 2018; Pugh et al., 2014) and is now clarified in the Results and Methods sections.

      Regarding alpha correction, a recent paper (García-Pérez, 2023) recommends: 'In general, avoid corrections for multiple testing if statistical claims are to be made for each individual test, in the absence of an omnibus null hypothesis.' Since we report null findings, further alpha correction would not significantly impact the results.

      García-Pérez, M. A. (2023). Use and misuse of corrections for multiple testing. Methods in Psychology, 8, 100120. https://doi.org/10.1016/j.metip.2023.100120

    1. Reviewer #2 (Public review):

      Summary:

      This interesting paper examines the earliest steps in progesterone-induced frog oocyte maturation, an example of non-genomic steroid hormone signaling that has been studied for decades but is still very incompletely understood. In fish and frog oocytes it seems clear that mPR proteins are involved, but exactly how they relay signals is less clear. In human sperm, the lipid hydrolase ABHD2 has been identified as a receptor for progesterone, and so the authors here examine whether ABHD2 might contribute to progesterone-induced oocyte maturation as well. The main results are:

      (1) Knocking down ABHD2 makes oocytes less responsive to progesterone, and ectopically expressing ABHD2.S (but not the shorter ABHD2.L gene product) partially rescues responsiveness. The rescue depends upon the presence of critical residues in the protein's conserved lipid hydrolase domain, but not upon the presence of critical residues in its acyltransferase domain.

      (2) Treatment of oocytes with progesterone causes a decrease in sphingolipid and glycerophospholipid content within 5 min. This is accompanied by an increase in LPA content and arachidonic acid metabolites. These species may contribute to signaling through GPCRs. Perhaps surprisingly, there was no detectable increase in sphingosine-1-phosphate, which might have been expected given the apparent substantial hydrolysis of sphingolipids. The authors speculate that S1P is formed and contributes to signaling but diffuses away.

      (3) Pharmacological inhibitors of lipid-metabolizing enzymes support, for the most part, the inferences from the lipidomics studies, although there are some puzzling findings. The puzzling findings may be due to uncertainty about whether the inhibitors are working as advertised.

      (4) Pharmacological inhibitors of G-protein signaling support a role for G-proteins and GPCRs in this signaling, although again there are some puzzling findings.

      (5) Reticulocyte expression supports the idea that mPRβ and ABHD2 function together to generate a progesterone-regulated PLA2 activity.

      (6) Knocking down or inhibiting ABHD2 inhibited progesterone-induced mPRβ internalization, and knocking down ABHD2 inhibited SNAP25∆20-induced maturation.

      Strengths:

      All in all, this could be a very interesting paper and a nice contribution. The data add a lot to our understanding of the process, and, given how ubiquitous mPR and AdipoQ receptor signaling appear to be, something like this may be happening in many other physiological contexts.

      Weaknesses:

      I have several suggestions for how to make the main points more convincing.

      Main criticisms:

      (1) The ABHD2 knockdown and rescue, presented in Fig 1, is one of the most important findings. It can and should be presented in more detail to allow the reader to understand the experiments better. E.g.: the antisense oligos hybridize to both ABHD2.S and ABHD2.L, and they knock down both (ectopically expressed) proteins. Do they hybridize to either or both of the rescue constructs? If so, wouldn't you expect that both rescue constructs would rescue the phenotype, since they both should sequester the AS oligo? Maybe I'm missing something here.

      In addition, it is critical to know whether the partial rescue (Fig 1E, I, and K) is accomplished by expressing reasonable levels of the ABHD2 protein, or only by greatly overexpressing the protein. The authors' antibodies do not appear to be sensitive enough to detect the endogenous levels of ABHD2.S or .L, but they do detect the overexpressed proteins (Fig 1D). The authors could thus start by microinjecting enough of the rescue mRNAs to get detectable protein levels, and then titer down, assessing how low one can go and still get rescue. And/or compare the mRNA levels achieved with the rescue construct to the endogenous mRNAs.

      Finally, please make it clear what is meant by n = 7 or n = 3 for these experiments. Does n = 7 mean 7 independently lysed oocytes from the same frog? Or 7 groups of, say, 10 oocytes from the same frog? Or different frogs on different days? I could not tell from the figure legends, the methods, or the supplementary methods. Ideally one wants to be sure that the knockdown and rescue can be demonstrated in different batches of oocytes, and that the experimental variability is substantially smaller than the effect size.

      (2) The lipidomics results should be presented more clearly. First, please drop the heat map presentations (Fig 2A-C) and instead show individual time course results, like those shown in Fig 2E, which make it easy to see the magnitude of the change and the experiment-to-experiment variability. As it stands, the lipidomics data really cannot be critically assessed.

      [Even as heat map data go, panels A-C are hard to understand. The labels are too small, especially on the heat map on the right side of panel B. And the 25 rows in panel C are not defined (the legend makes me think the panel is data from 10 individual oocytes, so are the 25 rows 25 metabolites? If so, are the individual oocyte data being collapsed into an average? Doesn't that defeat the purpose of assessing individual oocytes?) And those readers with red-green colorblindness (8% of men) will not be able to tell an increase from a decrease. But please don't bother improving the heat maps; they should just be replaced with more-informative bar graphs or scatter plots.]

      (3) The reticulocyte lysate co-expression data are quite important, and are both intriguing and puzzling. My impression had been that to express functional membrane proteins, one needed to add some membrane source, like microsomes, to the standard kits. Yet it seems like co-expression of mPR and ABHD2 proteins in a standard kit is sufficient to yield progesterone-regulated PLA2 activity. I could be wrong here (I'm not a protein expression expert), but I was surprised by this result, and I think it is critical that the authors make absolutely certain that it is correct. Do you get much greater activities if microsomes are added? Are the specific activities of the putative mPR-ABHD2 complexes reasonable?

      Comments on revisions:

      The authors have satisfied my concerns with their response letter and revisions.

    1. Reviewer #3 (Public review):

      The author presents a novel theory and computational model suggesting that grid cells do not encode space, but rather encode non-spatial attributes. Place cells in turn encode memories of where those specific attributes occurred. The theory accounts for many experimental results and generates useful predictions for future studies. The model's simplicity and potential explanatory power will interest others in the field. There are, however, a few weaknesses outlined below which undermine the theory.

      Main criticisms:

      (1) A crucial assumption of the model is that grid cells express grid-like firing patterns if and only if the content of experience is constant in space. It is difficult to imagine a real world example that satisfies this assumption. Odors and sounds are used as examples. While they are often more spatially diffuse than an object on the ground, odors and sounds have sources that are readily detectable and thus are not constant in space. Animals can easily navigate to a food source or to a vocalizing conspecific. This assumption is especially problematic because it predicts that all grid cells should become silent when their preferred non-spatial attribute (e.g. a specific odor) is missing. I'm not aware of any experimental data showing that grid cells become silent. On the contrary, grid cells are known to remain active across all contexts that have been tested, including across sleep/wake states. Unlike place cells, grid cells have never been shown to turn off. Since grid cells are active in all contexts, their preferred attribute must also be present in all contexts, and therefore they would not convey any information about the specific content of an experience. The author lists many attributes that could in theory be constant in a laboratory setting, but there is no data I'm aware of that shows this is true in practice. As it stands, this crucial assumption of the model remains mere speculation.

      (2) The proposed novelty of this theory is that other models all assume that grid cells encode space. This is not quite true of models based on continuous attractor networks, the discussion of which is essentially absent. More specifically, attractor models focus on the importance of intrinsic dynamics within entorhinal cortex in generating the grid pattern. While this firing pattern is aligned to space during navigation and therefore can be used as a representation of that space, the neural dynamics are preserved even during sleep. Similarly, it is because the grid pattern does not strictly encode physical space that grid-like signals are also observed in relation to other two-dimensional continuous variables.

      (3) The use of border cells or boundary vector cells as the main (or only) source of spatial information in the hippocampus is not well supported by experimental data. Border cells in entorhinal cortex are not active in the center of an environment. Boundary-vector cells can fire farther away from the walls, but are not found in entorhinal cortex. They are located in the subiculum, a major output of the hippocampus. While the entorhinal-hippocampal circuit is a loop, the route from boundary-vector cells to place cells is much less clear than from grid cells. Moreover, both border cells and boundary-vector cells (which are conflated in this paper) comprise a small population of neurons compared to grid cells.

      Minor comments:

      (1) There is substantial theoretical and experimental work supporting the idea that grid cell modules instantiate continuous attractor networks, yet this class of models is largely ignored:

      p. 7 "In contrast, most grid cell models (Bellmund et al., 2016; Bush et al., 2015; Castro & Aguiar, 2014; Hasselmo, 2009; Mhatre et al., 2012; Solstad et al., 2006; Sorscher et al., 2023; Stepanyuk, 2015; Widloski & Fiete, 2014) are domain specific models of spatial navigation"

      The following references should be added:

      McNaughton, B. L., Battaglia, F. P., Jensen, O., Moser, E. I. & Moser, M.-B. Path integration and the neural basis of the 'cognitive map'. Nat. Rev. Neurosci. 7, 663-678 (2006).

      Fuhs, M. C. & Touretzky, D. S. A spin glass model of path integration in rat medial entorhinal cortex. J. Neurosci. 26, 4266-4276 (2006).

      Burak, Y. & Fiete, I. R. Accurate path integration in continuous attractor network models of grid cells. PLoS Comput. Biol. 5, e1000291 (2009).

      Guanella, A., Kiper, D. & Verschure, P. A model of grid cells based on a twisted torus topology. Int. J. Neural Syst. 17, 231-240 (2007).

      Couey, J. J. et al. Recurrent inhibitory circuitry as a mechanism for grid formation. Nat. Neurosci. 16, 318-324 (2013).

      (Note: the Bellmund et al. (2016) citation is likely a mistake and was intended to be Bellmund et al. (2018).)

      (2) The author claims in two places that this model is the first to explain that grid cell population activity lies on a torus. While it may be the first explicit computational account of why grid cell activity is mapped onto a torus, these claims should be moderated for clarity, for example by adding "but see McNaughton et al. (2006) and others".

      Box 1. Results Uniquely Explained by this Memory Model - the population code of grid cells lies on a torus

      p.11 "In addition, this simplifying assumption allows the model to capture the finding that the population of grid cells lies on a torus (Gardner et al., 2022), although I note that the model was developed before this result was known."

      (3) Lateral entorhinal cortex is largely ignored in this memory model. It should be considered that the predominance of spatial representations reported in the literature is due to historical reasons. Namely, the discovery of hippocampal place cells spurred interest in looking upstream for the source of spatial information, which was later abundantly clear in medial entorhinal cortex. Lateral entorhinal cortex is relatively understudied, but is known to encode odors, objects, and time in a way that medial entorhinal cortex does not. It is therefore confusing to assume that these attributes are instead encoded by grid cells.

    1. Author response:

      Public Review:

      In this work, the authors develop a new computational tool, DeepTX, for studying transcriptional bursting through the analysis of single-cell RNA sequencing (scRNA-seq) data using deep learning techniques. This tool aims to describe and predict the transcriptional bursting mechanism, including key model parameters and the steady-state distribution associated with the predicted parameters. By leveraging scRNA-seq data, DeepTX provides high-resolution transcriptional information at the single-cell level, despite the presence of noise that can cause gene expression variation. The authors apply DeepTX to DNA damage experiments, revealing distinct cellular responses based on transcriptional burst kinetics. Specifically, IdU treatment in mouse stem cells increases burst size, promoting differentiation, while 5FU affects burst frequency in human cancer cells, leading to apoptosis or, depending on the dose, to survival and potential drug resistance. These findings underscore the fundamental role of transcriptional burst regulation in cellular responses to DNA damage, including cell differentiation, apoptosis, and survival. Although the insights provided by this tool are mostly well supported by the authors' methods, certain aspects would benefit from further clarification.

      The strengths of this paper lie in its methodological advancements and potential broad applicability. By employing the DeepTXSolver neural network, the authors efficiently approximate stationary distributions of mRNA count through a mixture of negative binomial distributions, establishing a simple yet accurate mapping between the kinetic parameters of the mechanistic model and the resulting steady-state distributions. This innovative use of neural networks allows for efficient inference of kinetic parameters with DeepTXInferrer, reducing computational costs significantly for complex, multi-gene models. The approach advances parameter estimation for high-dimensional datasets, leveraging the power of deep learning to overcome the computational expense typically associated with stochastic mechanistic models. Beyond its current application to DNA damage responses, the tool can be adapted to explore transcriptional changes due to various biological factors, making it valuable to the systems biology, bioinformatics, and mechanistic modelling communities. Additionally, this work contributes to the integration of mechanistic modelling and -omics data, a vital area in achieving deeper insights into biological systems at the cellular and molecular levels.

      We thank the reviewers for their positive opinion of our manuscript. As reflected in our detailed responses to the reviewers’ comments, we will make significant changes to address their concerns comprehensively.

      This work also presents some weaknesses, particularly concerning specific technical aspects. The tool was validated using synthetic data, and while it can predict parameters and steady-state distributions that explain gene expression behaviour across many genes, it requires substantial data for training. The authors account for measurement noise in the parameter inference process, which is commendable, yet they do not specify the exact number of samples required to achieve reliable predictions. Moreover, the tool has limitations arising from assumptions made in its design, such as assuming that gene expression counts for the same cell type follow a consistent distribution. This assumption may not hold in cases where RNA measurement timing introduces variability in expression profiles.

      Thank you for your detailed and constructive feedback on our work. We will address the key concerns raised in the following points:

      (1) Clarification on the required sample size: We tested the robustness of our inference method on simulated datasets by varying the number of single-cell samples. Our results indicated that the predictions of burst kinetics parameters become accurate when the number of cells reaches 500 (Supplementary Figure S3d, e). This sample size is smaller than that typically obtained with current single-cell RNA sequencing (scRNA-seq) technologies, such as 10x Genomics and Smart-seq3 (Zheng GX et al., 2017; Hagemann-Jensen M et al., 2020). Therefore, we believe that our algorithm is well suited for inferring burst kinetics from existing scRNA-seq datasets, where the sample size is sufficient for reliable predictions. We will clarify this point in the main text to make it easier for readers to use the tool.

      (2) Assumption-related limitations: One of the fundamental assumptions in our study is that the expression counts of each gene are independently and identically distributed (i.i.d.) among cells, which is a commonly adopted assumption in many related works (Larsson AJM et al., 2019; Ochiai H et al., 2020; Luo S et al., 2023). However, we acknowledge the limitations of this assumption. The expression counts of the same gene may follow distinct distributions in different cells, even within the same cell type, and dependencies between genes could exist in realistic biological processes. We recognize this and will discuss the limitations arising from these assumptions in depth, presenting them as an important direction for future research.

      The authors present a deep learning pipeline to predict the steady-state distribution, model parameters, and statistical measures solely from scRNA-seq data. Results across three datasets appear robust, indicating that the tool successfully identifies genes associated with expression variability and generates consistent distributions based on its parameters. However, it remains unclear whether these results are sufficient to fully characterize the transcriptional bursting parameter space. The parameters identified by the tool pertain only to the steady-state distribution of the observed data, without ensuring that this distribution specifically originates from transcriptional bursting dynamics.

      We appreciate your insightful comments and the opportunity to clarify our study’s contributions and limitations. We agree that it is challenging to assess whether the results from these three realistic datasets fully characterize the transcriptional burst parameter space, as this depends on the properties of the data and the biological conditions. Nevertheless, we firmly believe that DeepTX has the capacity to characterize the full parameter space. This belief stems from the extensive parameters and samples we input during model training and inference across a sufficiently large parameter range (Method 1.3). Furthermore, the training of the model is both flexible and scalable, allowing for the expansion of the transcriptional burst parameter space as needed. We will clarify this in the text to enable readers to use DeepTX more flexibly.

      On the other hand, we agree that parameter identification is based on the steady-state distribution of the observed data (static data), which loses information about the fine dynamic process of the burst kinetics. In principle, tracking the gene expression of living cells can provide the most complete information about real-time transcriptional dynamics across various timescales (Rodriguez J et al., 2019). However, it is typically limited to only a small number of genes and cells, which precludes investigating general principles of transcriptional burst kinetics on a genome-wide scale. Therefore, leveraging both the steady-state distribution of scRNA-seq data and mathematical dynamic modelling to infer genome-wide transcriptional bursting dynamics represents a critical and emerging frontier in this field. For example, the statistical inference framework based on the Markovian telegraph model, as demonstrated in (Larsson AJM et al., 2019), offers a valuable paradigm for understanding underlying transcriptional bursting mechanisms. Building on this, our study considered a more generalized non-Markovian model that better captures transcriptional kinetics under conditions such as DNA damage, and employed deep learning methods for inference. This provided a powerful framework for comparative analyses of how DNA damage induces alterations in transcriptional bursting kinetics across the genome. We will highlight the limitations of current inference using steady-state distributions in the text and look ahead to future research directions for inference using time series data across the genome.

      A primary concern with the TXmodel is its reliance on four independent parameters to describe gene state-switching dynamics. Although this general model can capture specific cases, such as the refractory and telegraph models, accurately estimating the parameters of the refractory model using only steady-state distributions and typical cell counts proves challenging in the absence of time-dependent data.

      We thank you for highlighting this critical concern regarding the TXmodel's reliance on four independent parameters to describe gene state-switching dynamics. We acknowledge that estimating the parameters of the TXmodel using only steady-state distributions and typical single-cell RNA sequencing (scRNA-seq) data poses significant challenges, particularly in the absence of time-resolved measurements.

      As described in our response to the previous point, while time-resolved data can provide richer information than static scRNA-seq data, it is currently limited to a small number of genes and cells, whereas static scRNA-seq data typically capture genome-wide expression. Our framework leverages deep learning methods to link mechanistic models with static scRNA-seq data, enabling the inference of genome-wide dynamic behaviors of genes. This provides a potential pathway for comparative analyses of transcriptional bursting kinetics across the entire genome.

      Nonetheless, the refractory model and the telegraph model are important models for studying transcriptional bursts. We will discuss and compare them in terms of the accuracy of inferred parameters. Certainly, we agree that inferring the molecular mechanisms underlying transcriptional burst kinetics using time-resolved data remains a critical future direction. We will include a brief discussion on the role and importance of time-resolved data in addressing these challenges in the discussion section of the revised manuscript.

      The claim that the GO analysis pertains specifically to DNA damage response signal transduction and cell cycle G2/M phase transition is not fully accurate. In reality, the GO analysis yielded stronger p-values for pathways related to the mitotic cell cycle checkpoint signalling. As presented, the GO analysis serves more as a preliminary starting point for further bioinformatics investigation that could substantiate these conclusions. Additionally, while GSEA analysis was performed following the GO analysis, the involvement of the cardiac muscle cell differentiation pathway remains unclear, as it was not among the GO terms identified in the initial GO analysis.

      We thank the reviewer for this valuable feedback and for pointing out the need for clarification regarding the GO and GSEA analyses. We agree that the connection between the cardiac muscle cell differentiation pathway identified in the GSEA analysis and the GO terms from the initial analysis requires further clarification. This discrepancy arises because GSEA examines broader sets of pathways and may capture biological processes not highlighted by GO analysis due to differences in the statistical methods and pathway definitions used. We will revise the manuscript to address this point, explicitly discussing the distinct yet complementary nature of GO and GSEA analyses and providing a clearer interpretation of the results.

      As the advancement is primarily methodological, it lacks a comprehensive comparison with traditional methods that serve similar functions. Consequently, the overall evaluation of the method, including aspects such as inference accuracy, computational efficiency, and memory cost, remains unclear. The paper would benefit from being contextualised alongside other computational tools aimed at integrating mechanistic modelling with single-cell RNA sequencing data. Additional context regarding the advantages of deep learning methods, the challenges of analysing large, high-dimensional datasets, and the complexities of parameter estimation for intricate models would strengthen the work.

      We greatly appreciate your insightful feedback, which highlights important considerations for evaluating and contextualizing our methodological advancements. Below, we emphasize our advantages from both the modeling perspective and the inference perspective compared with previous models. As our work is rooted in a model-based approach to describe the transcriptional bursting process underlying gene expression, the classic telegraph model (Markovian) and non-Markovian models, which are commonly employed, are suitable for this purpose:

      Classic telegraph model: The classic telegraph model allows for the derivation of approximate analytical solutions through numerical integration, enabling efficient parameter point estimation via maximum likelihood methods, e.g., as explored in (Larsson AJM et al., 2019). Although exact analytical solutions for the telegraph model are not available, certain moments of its distribution can be explicitly derived. This allows for an alternative approach to parameter inference using moment-based estimation methods, e.g., as explored in (Ochiai H et al., 2020). However, it is important to note that higher-order sample moments can be unstable, potentially leading to significant estimation bias.

      Non-Markovian Models: For non-Markovian models, analytical or approximate analytical solutions remain elusive. Previous work has employed pseudo-likelihood approaches, leveraging statistical properties of the model’s solutions to estimate parameters, e.g., as explored in (Luo S et al., 2023). However, the method may suffer from low inference efficiency.
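
      As a rough illustration of the moment-based route mentioned above for the classic telegraph model, the sketch below assumes the bursty limit, in which steady-state counts are approximately negative binomial with mean = burst frequency × burst size and Fano factor = 1 + burst size (rates expressed in units of the mRNA degradation rate). The function name and the toy counts are hypothetical, and this is not the estimator used in the cited studies.

```python
from statistics import mean, variance


def burst_moments(counts):
    """Method-of-moments estimate of (burst_frequency, burst_size).

    Assumes the bursty limit of the telegraph model, where
        mean = burst_frequency * burst_size
        Fano = variance / mean = 1 + burst_size
    Needs at least two counts. Returns None when the data are not
    over-dispersed (Fano <= 1), since the estimate is then undefined.
    """
    m = mean(counts)
    if m <= 0:
        return None
    fano = variance(counts) / m  # sample variance over sample mean
    if fano <= 1:
        return None
    burst_size = fano - 1.0
    burst_frequency = m / burst_size
    return burst_frequency, burst_size


# Hypothetical mRNA counts for one gene across eight cells.
toy_counts = [0, 0, 1, 2, 3, 5, 8, 13]
print(burst_moments(toy_counts))
```

      Because the estimate depends on the sample variance, a few outlying cells can shift it noticeably, which is consistent with the instability of sample moments noted above.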

      In our current work, we leverage deep learning to estimate the parameters of the TXmodel, which is a non-Markovian model. First, we represent the model's solution as a mixture of negative binomial distributions, which is obtained by the deep learning method. Second, through integration with the deep learning architecture, the model parameters can be optimized using automatic differentiation, significantly improving inference efficiency. Furthermore, by employing a Bayesian framework, our method provides posterior distributions for the estimated dynamic parameters, offering a comprehensive characterization of uncertainty. Compared to traditional methods such as moment-based estimation or pseudo-likelihood approaches, we believe our approach not only achieves higher inference efficiency but also delivers posterior distributions for kinetic parameters, enhancing the interpretability and robustness of the results. We will present and emphasize the computational efficiency and memory cost of our methods in the revised version.
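
      To make the mixture representation concrete, here is a minimal sketch of the log-likelihood of count data under a mixture of negative binomial distributions. In the approach described above, the mixture weights and component parameters would be produced by the neural network from the kinetic parameters; here they are fixed, hypothetical values, and the function names are illustrative rather than part of DeepTX.

```python
import math


def nb_logpmf(k, r, p):
    """Log-probability of count k under NB(r, p), with
    P(k) = Gamma(k + r) / (k! * Gamma(r)) * p**r * (1 - p)**k."""
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log1p(-p))


def mixture_logpmf(k, weights, rs, ps):
    """Log-probability of k under a weighted NB mixture (log-sum-exp)."""
    terms = [math.log(w) + nb_logpmf(k, r, p)
             for w, r, p in zip(weights, rs, ps)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def log_likelihood(counts, weights, rs, ps):
    """Sum of per-cell log-probabilities, assuming i.i.d. counts."""
    return sum(mixture_logpmf(k, weights, rs, ps) for k in counts)


# Hypothetical two-component mixture and toy counts.
weights, rs, ps = [0.3, 0.7], [0.5, 3.0], [0.2, 0.5]
toy_counts = [0, 1, 1, 2, 4, 7]
print(log_likelihood(toy_counts, weights, rs, ps))
```

      Since every operation here is differentiable in the weights and component parameters, the same expression can be written in an automatic-differentiation framework and optimized by gradient-based methods.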

      References

      Zheng, G.X., Terry, J.M., Belgrader, P., Ryvkin, P., Bent, Z.W., Wilson, R., Ziraldo, S.B., Wheeler, T.D., McDermott, G.P., Zhu, J., Gregory, M.T., Shuga, J., Montesclaros, L., Underwood, J.G., Masquelier, D.A., Nishimura, S.Y., Schnall-Levin, M., Wyatt, P.W., Hindson, C.M., Bharadwaj, R., Wong, A., Ness, K.D., Beppu, L.W., Deeg, H.J., McFarland, C., Loeb, K.R., Valente, W.J., Ericson, N.G., Stevens, E.A., Radich, J.P., Mikkelsen, T.S., Hindson, B.J., Bielas, J.H. 2017. Massively parallel digital transcriptional profiling of single cells. Nature Communications 8: 14049. DOI: https://dx.doi.org/10.1038/ncomms14049, PMID: 28091601

      Hagemann-Jensen, M., Ziegenhain, C., Chen, P., Ramsköld, D., Hendriks, G.J., Larsson, A.J.M., Faridani, O.R., Sandberg, R. 2020. Single-cell RNA counting at allele and isoform resolution using Smart-seq3. Nat Biotechnol 38: 708-714. DOI: https://dx.doi.org/10.1038/s41587-020-0497-0, PMID: 32518404

      Larsson, A.J.M., Johnsson, P., Hagemann-Jensen, M., Hartmanis, L., Faridani, O.R., Reinius, B., Segerstolpe, A., Rivera, C.M., Ren, B., Sandberg, R. 2019. Genomic encoding of transcriptional burst kinetics. Nature 565: 251-254. DOI: https://dx.doi.org/10.1038/s41586-018-0836-1, PMID: 30602787

      Ochiai, H., Hayashi, T., Umeda, M., Yoshimura, M., Harada, A., Shimizu, Y., Nakano, K., Saitoh, N., Liu, Z., Yamamoto, T., Okamura, T., Ohkawa, Y., Kimura, H., Nikaido, I. 2020. Genome-wide kinetic properties of transcriptional bursting in mouse embryonic stem cells. Science Advances 6: eaaz6699. DOI: https://dx.doi.org/10.1126/sciadv.aaz6699, PMID: 32596448

      Luo, S., Wang, Z., Zhang, Z., Zhou, T., Zhang, J. 2023. Genome-wide inference reveals that feedback regulations constrain promoter-dependent transcriptional burst kinetics. Nucleic Acids Research 51: 68-83. DOI: https://dx.doi.org/10.1093/nar/gkac1204, PMID: 36583343

      Rodriguez, J., Ren, G., Day, C.R., Zhao, K., Chow, C.C., Larson, D.R. 2019. Intrinsic dynamics of a human gene reveal the basis of expression heterogeneity. Cell 176: 213-226.e218. DOI: https://dx.doi.org/10.1016/j.cell.2018.11.026, PMID: 30554876

      Luo, S., Zhang, Z., Wang, Z., Yang, X., Chen, X., Zhou, T., Zhang, J. 2023. Inferring transcriptional bursting kinetics from single-cell snapshot data using a generalized telegraph model. Royal Society Open Science 10: 221057. DOI: https://dx.doi.org/10.1098/rsos.221057, PMID: 37035293

    1. Author response:

      Reviewer #1 (Public review):

      The significance of the target molecule and mechanisms may help in understanding the molecular mechanisms of metformin.

      We greatly appreciate the reviewer’s insightful comment regarding the significance of the target molecule and its mechanisms in understanding the molecular actions of metformin. ATP5I is responsible for the dimerization of the F<sub>1</sub>F<sub>0</sub>-ATPase(1-3). Hence, we propose conducting BN-PAGE followed by a western blot against the β-subunit of the F<sub>1</sub> domain of F<sub>1</sub>F<sub>0</sub>-ATP synthase to investigate whether metformin affects its dimerization. This will provide more direct evidence of the on-target action of metformin on ATP5I. Due to the high abundance of F<sub>1</sub>F<sub>0</sub>-ATP synthase in cells and the slow entry of metformin into mitochondria, we plan to perform long-term treatments (3 and 6 days) with high concentrations of metformin (10 mM) to enhance the likelihood of detecting subtle yet biologically relevant shifts in the monomer and dimer populations. Prolonged exposure is expected to reveal the cumulative effects of metformin on the F<sub>1</sub>F<sub>0</sub>-ATP synthase dimer/monomer ratio. We do not expect metformin to fully mimic the effect on dimerization seen in ATP5I KO cells, but we think it will be important to report to what extent this ratio is affected.

      Reviewer #2 (Public review):

      (1) The interpretation of the cellular co-localization of the biotin-biguanide conjugate with TOMM20 (Figure 1-D) as mitochondrial "accumulation" of the conjugate is overstated because it cannot exclude binding of the conjugate to the mitochondrial membrane. It would have been more convincing if additional incubations with the biotin-biguanide conjugate in combination with metformin had shown that metformin is competitive with the biotin-conjugate.

      We appreciate the reviewer’s insightful comment and agree that the resolution provided by fluorescence microscopy makes it challenging to pinpoint the specific mitochondrial compartment where the biotin-biguanide conjugate localizes, even with additional markers such as TOMM20 antibodies for the inner mitochondrial membrane. While it remains a possibility that the conjugate binds to the mitochondrial surface, another plausible explanation is that the biotin moiety may facilitate entry into mitochondria through a biotin-specific transporter, adding further mechanistic intricacies. Furthermore, while a competition assay with metformin might help investigate interactions with mitochondrial targets and transporters (OCT family), it would not compete for biotin-mediated transport. Thus, while we acknowledge the reviewer’s suggestion, we believe such an experiment may not provide conclusive evidence regarding the conjugate’s mitochondrial localization or mechanism of entry. Instead, we will revise the manuscript to more accurately describe the findings as "mitochondrial association" rather than "mitochondrial accumulation," ensuring that our interpretation remains consistent with the resolution and limitations of the data presented.

      (2) The manuscript reports the identification of 69 proteins by mass spectrometry of the pull-down assay of which 31 proteins were eluted by metformin. However, no Mass Spectrometry data is presented of the peptides identified. The methodology does not state the minimum number of peptides (1, 2?) that were used for the identification of the 31/69 proteins.

      Concerning the mass spectrometry results, our intention was to provide a comprehensive table summarizing these findings in a separate data sheet, as part of the data availability section. To address the reviewer’s comment and ensure full transparency, we will include this table as supplementary material in the revised manuscript. Additionally, we will update the methodology section to explicitly state these criteria and ensure clarity regarding the identification process.

      (3) The validation of ATP5I was based on the use of recombinant protein (which was 90% pure) for the SPR and the use of a single antibody to ATP5I. The validity of the immunoblotting rests on the assumption that there is no "non-specific" immunoactivity in the relevant mol wt range. Information on the validation of the antibody would be helpful.

      Regarding the recombinant protein used for SPR, its purity was evaluated using a Coomassie-stained gel. For the antibody used in immunoblotting, its specificity was validated through knockout cell lines, ensuring minimal concerns about non-specific immunoactivity within the relevant molecular weight range. Unfortunately, the KO data appear in the paper only after the first immunoblots are presented. In the revised manuscript, we will clearly outline these validation steps in the methods section and provide additional manufacturer documentation for the antibody we used.

      (4) Knock-out of ATP5I markedly compromised the NAD/NADH ratio (Fig.3A) and cell proliferation (Figure 3D). These effects may be associated with decreased mitochondrial membrane potential which could explain the low efficacy of metformin (and most of the data in Figures 3-5). This possibility should be discussed. Effects of [metformin] on the NAD/NADH ratio in control cells and ATP5I-KO would have been helpful because the metformin data on cell growth is normalized as fold change relative to control, whereas the NAD/NADH ratio would represent a direct absolute measurement enabling comparison of the absolute effect in control cells with ATP5I KO.

      The mitochondrial membrane potential depends on a functional electron transport chain, which drives proton pumping from the matrix to the intermembrane space. Metformin can decrease the mitochondrial membrane potential, and this is usually explained as a consequence of complex I inhibition(4). It has been reported that metformin requires this membrane potential to accumulate in mitochondria, so the actions of metformin are self-limiting due to this requirement. The reviewer is right that ATP5I KO cells could be resistant to metformin because they may have a lower membrane potential. We do not believe this to be the case because the response to phenformin, another biguanide that can enter mitochondria through the membrane without the need of the OCT transporters(5), is also affected in ATP5I KO cells. Of note, compensatory mechanisms such as enhanced glycolysis, as observed in ATP5I-KO cells (elevated ECAR and increased sensitivity to 2-deoxy-D-glucose), and the ATPase activity of F<sub>1</sub>F<sub>0</sub>-ATP synthase could potentially help maintain membrane potential, suggesting that this might not be an issue in the ATP5I KO cells. We will discuss these possibilities in the revised manuscript.

      Nevertheless, to experimentally address this point, we propose measuring mitochondrial membrane potential using tetramethylrhodamine methyl ester (TMRE) and ATP levels using luciferase-based assays (CellTiter-Glo) in ATP5I-KO cells.

      Measuring the NAD+/NADH ratio in both control and KO cells may not be very helpful, because this ratio can be corrected by LDH, which is induced as part of the glycolytic adaptation that occurs after inhibition of respiration. Since our KO cells have already been propagated for several passages, the extent of this adaptation is likely different from that in metformin-treated cells. As we mentioned in answering Reviewer 1, we will provide a more direct measurement of metformin acting on ATP5I: the levels of F<sub>1</sub>F<sub>0</sub>-ATPase dimers and monomers.

      (5) Figure-6 CRISPR/Cas9 KO at 16mM metformin in comparison with 70nM rotenone and 2 micromolar oligomycin (in serum-containing medium). The rationale for the use of such a high concentration of metformin has not been explained. In liver cells metformin concentrations above 1mM cause severe ATP depletion, whereas therapeutic (micromolar) concentrations have minimal effects on cellular ATP status. The 16mM concentration is ~2 orders of magnitude higher than therapeutic concentrations and likely linked to compromised energy status. The stronger inhibition of cell proliferation by 16mM metformin compared with rotenone or oligomycin raises the issue of whether the changes in gene expression may be linked to the greater inhibition of mitochondrial metabolism. Validation of the cellular ATP status and NAD/NADH with metformin as compared with the two inhibitors could help the interpretation of this data.

      To address the reviewer’s final comment, we would like to clarify the rationale behind our experimental approach. NALM-6 cells are very glycolytic, have low respiration rates, and show weak dependence on ATP5I (DepMap score: -0.47)(6). The concentration of 16 mM metformin was chosen based on the IC50 for this cell line. This approach aligns with our focus on the anticancer mechanism of action rather than the antidiabetic effects of metformin. Both ATP status and NAD+/NADH ratios will depend on the extent of the compensatory glycolysis. On the other hand, our genetic screening evaluates cell proliferation as an integration of all metabolic activities required for the process. This unbiased screening revealed a common pathway affected by metformin and oligomycin that is different from the pathway affected by rotenone, which is consistent with the finding that metformin acts on the F<sub>1</sub>F<sub>0</sub>-ATPase.

      Reviewer #3 (Public review):

      (1) Most of the data are based on measurements of the oxygen consumption rate (OCR) and extracellular acidification rate (ECAR) measured by the Seahorse analyser in control and ATP5I KO cells. However, these measurements are conducted by a single injection of a biguanide, followed over time and presented as fold change. By doing so, the individual information on the effect of metformin and its derivatives on control and KO cells is lost. In addition, the usual measurement of OCR is coupled with certain inhibitors and uncouplers, such as oligomycin, FCCP, and antimycin A/rotenone, to understand the contribution of individual complexes to respiration. Since biguanides and ATP5I KO affect protein levels of components of complex I and IV, it would be informative to measure their individual contributions/effects in the Seahorse. To further strengthen the data, it would be helpful to obtain measurements of actual ATP levels in these cells, as this would explain the activation of AMPK.

      We appreciate the reviewer’s observations regarding the Seahorse measurements and acknowledge the potential limitations of presenting the data as fold change. Due to experimental challenges in maintaining KP-4 and ATP5I-KO cells with sufficient nutrients, caused by their rapid glucose uptake and subsequent lactate production, it was more practical to present the Seahorse results in this format. Using inhibitors at each time point during the Seahorse experiment was not feasible, as the delay between inhibitor injections and the corresponding changes in oxygen consumption rate (OCR) and extracellular acidification rate (ECAR) would introduce variability and complicate the interpretation of dynamic responses. Nevertheless, we recognize the importance of understanding the contributions of specific respiratory complexes to OCR and ECAR. To address this, we will include a representative figure showcasing a typical Seahorse analysis, highlighting ATP turnover and proton leak after oligomycin addition, maximal respiration with FCCP, and disruption with rotenone and antimycin A. While these experiments are inherently complex due to the metabolic demands of ATP5I-KO cells, this approach will provide a clearer breakdown of mitochondrial activity. Furthermore, as mentioned in our response to Reviewer 2, we will measure ATP levels using a luciferase-based assay (CellTiter-Glo) in both control and ATP5I-KO cells to better explain AMPK activation. This will provide additional context to strengthen the interpretation of mitochondrial function and metabolic compensation mechanisms in these cells.

      (2) The authors report on alterations in mitochondrial morphology upon ATP5I KO, which is measured by subjective quantifications of filamentous versus puncta structures. Fiji offers great tools to quantify the mitochondrial network unbiasedly and with more accuracy using deconvolution and skeletonization of the mitochondria, providing the opportunity to measure length, shape, and number quantitatively. This will help to better understand whether mitochondria are really fragmented upon ATP5I KO and rescued by its re-introduction.

      Concerning the analysis of mitochondrial morphology, we acknowledge the potential benefits of using Fiji and additional plugins such as MiNA for more accurate and unbiased quantification. Indeed, this approach could provide stronger evidence for mitochondrial fragmentation upon ATP5I-KO and its potential rescue by ATP5I reintroduction. We will consider integrating this methodology into our analysis to enhance the precision and robustness of our findings.

      (3) Finally, the authors report in the last part of the paper a genetic CRISPR/Cas9 KO screen in NALM-6 cells cultured with high amounts of metformin to identify potential new mediators of metformin action. It is difficult to connect that to the rest of the paper because a) different concentrations of metformin are used and b) the metabolic effects on energy consumption are not defined. They argue about the molecular function of the obtained hits based on literature and on a comparison of the pattern of genetic alterations based on treatments with known inhibitors such as oligomycin and rotenone. However, a direct connection is not provided, thus the interpretation at the end of the results that "the OMA1-DELE1-HRI pathway mediates the antiproliferative activity of both biguanides and the F1ATPase inhibitor oligomycin" while increasing glycolysis, needs to be toned down. This is an interesting observation, but no causality is provided. In general, this part stands alone and needs to be better connected to the rest of the paper.

      NALM-6 cells are very glycolytic, have low respiration rates, and show weak dependence on ATP5I(6), forcing us to use higher concentrations of metformin to inhibit their growth. Recent results show that metformin targets PEN2 in the cytosol to increase AMPK activity, controlling both the glucose-lowering and the life span extension abilities of metformin(7). This work raises the question of whether the antiproliferative and anticancer effects of metformin are due to a mitochondrial activity or are controlled by this new pathway of AMPK activation. Hence, the genetic screening was performed to find, in an unbiased manner, how metformin works. The results provide compelling evidence for mitochondria, and in particular the ATP synthase, as potential targets of metformin and a foundation for future studies. We will revise the text and abstract to better reflect the exploratory nature of this finding and ensure clarity.

      (1) Paumard, P. et al. Two ATP synthases can be linked through subunits i in the inner mitochondrial membrane of Saccharomyces cerevisiae. Biochemistry 41, 10390-10396 (2002). https://doi.org/10.1021/bi025923g

      (2) Paumard, P. et al. The ATP synthase is involved in generating mitochondrial cristae morphology. EMBO J 21, 221-230 (2002). https://doi.org/10.1093/emboj/21.3.221

      (3) Habersetzer, J. et al. ATP synthase oligomerization: from the enzyme models to the mitochondrial morphology. Int J Biochem Cell Biol 45, 99-105 (2013). https://doi.org/10.1016/j.biocel.2012.05.017

      (4) Xian, H. et al. Metformin inhibition of mitochondrial ATP and DNA synthesis abrogates NLRP3 inflammasome activation and pulmonary inflammation. Immunity 54, 1463-1477 e1411 (2021). https://doi.org/10.1016/j.immuni.2021.05.004

      (5) Hawley, S. A. et al. Use of cells expressing gamma subunit variants to identify diverse mechanisms of AMPK activation. Cell metabolism 11, 554-565 (2010). https://doi.org/10.1016/j.cmet.2010.04.001

      (6) Hlozkova, K. et al. Metabolic profile of leukemia cells influences treatment efficacy of L-asparaginase. BMC Cancer 20, 526 (2020). https://doi.org/10.1186/s12885-020-07020-y

      (7) Ma, T. et al. Low-dose metformin targets the lysosomal AMPK pathway through PEN2. Nature 603, 159-165 (2022). https://doi.org/10.1038/s41586-022-04431-8

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      In this revision, the authors significantly improved the manuscript. They now address some of my concerns. Specifically, they show the contribution of end-effects on spreading the inputs between dendrites. This analysis reveals greater applicability of their findings to cortical cells, with long, unbranching dendrites than other neuronal types, such as Purkinje cells in the cerebellum.

      They now explain better the interactions between calcium and voltage signals, which I believe improve the take-away message of their manuscript. They modified and added new figures that helped to provide more information about their simulations.

      However, some of my points remain valid. Figure 6 shows depolarization of ~5mV from -75. This weak depolarization would not effectively recruit nonlinear activation of NMDARs. In their paper, Branco and Hausser (2010) showed depolarizations of ~10-15mV.

      More importantly, the signature of NMDAR activation is the prolonged plateau potential and activation at more depolarized resting membrane potentials (their Figure 4). Thus, despite including NMDARs in the simulation, the authors do not model functional recruitment of these channels. Their simulation is thus equivalent to AMPA only drive, which can indeed summate somewhat nonlinearly.

      In the current study, we used short sequences of 5 inputs, since the convergence of longer sequences is extremely unlikely in the network configurations we have examined. This resulted in smaller EPSP amplitudes of ~5mV (Figure 6 - Supplement 2A, B). Longer sequences containing 9 inputs resulted in larger somatic depolarizations of ~10mV (Figure 6 - Supplement 2E, F). Although we had modified the (Branco, Clark, and Häusser 2010) model to remove the jitter in the timing of arrival of inputs and made slight modifications to the location of stimulus delivery on the dendrite, we saw similar amplitudes when we tested a 9-length sequence using (Branco, Clark, and Häusser 2010)’s published code (Figure 6 - Supplement 2I, J). In all the cases we tested (5 input sequence, 9 input sequence, 9 input sequence with (Branco, Clark, and Häusser 2010) code repository), removal of NMDA synapses lowered both the somatic EPSPs (Figure 6 - Supplement 2C,D,G,H,K,L) as well as the selectivity (measured as the difference between the EPSPs generated for inward and outward stimulus delivery) (Figure 6 Supplement 2M,N,O). Further, monitoring the voltage along the dendrite for a sequence of 5 inputs showed dendritic EPSPs in the range of 20-45 mV (Figure 6 - Supplement 2P, Q), which came down notably (10-25mV) when NMDA synapses were abolished (Figure 6 - Supplement 2R, S). Thus, even sequences containing as few as 5 inputs were capable of engaging the NMDA-mediated nonlinearity to show sequence selectivity, although the selectivity was not as strong as in the case of 9 inputs.

      Reviewer #1 (Recommendations for the authors):

      Minor points:

      Figure 8, what does the scale in A represent? I assume it is voltage, but there are no units. Figure 8, C, E, G, these are unconventional units for synaptic weights, usually, these are given in nS / per input.

      We have corrected these. The scalebar in 8A represents membrane potential in mV. The units of 8C,E,G are now in nS.

      Reviewer #2 (Public Review):

      Summary:

      If synaptic input is functionally clustered on dendrites, nonlinear integration could increase the computational power of neural networks. But this requires the right synapses to be located in the right places. This paper aims to address the question of whether such synaptic arrangements could arise by chance (i.e. without special rules for axon guidance or structural plasticity), and could therefore be exploited even in randomly connected networks. This is important, particularly for the dendrites and biological computation communities, where there is a pressing need to integrate decades of work at the single-neuron level with contemporary ideas about network function.

      Using an abstract model where ensembles of neurons project randomly to a postsynaptic population, back-of-envelope calculations are presented that predict the probability of finding clustered synapses and spatiotemporal sequences. Using data-constrained parameters, the authors conclude that clustering and sequences are indeed likely to occur by chance (for large enough ensembles), but require strong dendritic nonlinearities and low background noise to be useful.

      Strengths:

      (1) The back-of-envelope reasoning presented can provide fast and valuable intuition. The authors have also made the effort to connect the model parameters with measured values. Even an approximate understanding of cluster probability can direct theory and experiments towards promising directions, or away from lost causes.

      (2) I found the general approach to be refreshingly transparent and objective. Assumptions are stated clearly about the model and statistics of different circuits. Along with some positive results, many of the computed cluster probabilities are vanishingly small, and noise is found to be quite detrimental in several cases. This is important to know, and I was happy to see the authors take a balanced look at conditions that help/hinder clustering, rather than to just focus on a particular regime that works.

      (3) This paper is also a timely reminder that synaptic clusters and sequences can exist on multiple spatial and temporal scales. The authors present results pertaining to the standard `electrical' regime (~50-100 µm, <50 ms), as well as two modes of chemical signaling (~10 µm, 100-1000 ms). The senior author is indeed an authority on the latter, and the simulations in Figure 5, extending those from Bhalla (2017), are unique in this area. In my view, the role of chemical signaling in neural computation is understudied theoretically, but research will be increasingly important as experimental technologies continue to develop.

      Weaknesses:

      (1) The paper is mostly let down by the presentation. In the current form, some patience is needed to grasp the main questions and results, and it is hard to keep track of the many abbreviations and definitions. A paper like this can be impactful, but the writing needs to be crisp, and the logic of the derivation accessible to non-experts. See, for instance, Stepanyants, Hof & Chklovskii (2002) for a relevant example.

      It would be good to see a restructure that communicates the main points clearly and concisely, perhaps leaving other observations to an optional appendix. For the interested but time-pressed reader, I recommend starting with the last paragraph of the introduction, working through the main derivation on page 7, and writing out the full expression with key parameters exposed. Next, look at Table 1 and Figure 2J to see where different circuits and mechanisms fit in this scheme. Beyond this, the sequence derivation on page 15 and biophysical simulations in Figures 5 and 6 are also highlights.

      We appreciate the reviewers' suggestions. We have tightened the flow of the introduction. We understand that the abbreviations and definitions are challenging and have therefore provided intuitions and summaries of the equations discussed in the main text.

      Clusters calculations

      Our approach is to ask how likely it is that a given set of inputs lands on a short segment of dendrite, and then scale it up to all segments on the entire dendritic length of the cell.

      Thus, the probability of occurrence of groups that receive connections from each of the M ensembles (PcFMG) is a function of the connection probability (p) between the two layers, the number of neurons in an ensemble (N), the relative zone-length with respect to the total dendritic arbor (Z/L) and the number of ensembles (M).
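
      The dependence on p, N, Z/L and M described above can be sketched numerically. The following is a minimal illustration, not the manuscript's exact expression: it assumes that synapses from each ensemble land independently and uniformly along the dendrite (Poisson statistics), and that a dendrite of length L is split into L/Z non-overlapping zones. The function names are hypothetical.

```python
import math

def zone_group_probability(p, N, Z, L, M):
    """Probability that one zone of length Z receives at least one
    synapse from each of M ensembles (assumed Poisson placement)."""
    # Expected number of synapses from one ensemble landing in the zone.
    expected_hits = p * N * Z / L
    # Probability of at least one hit from a single ensemble.
    p_one_ensemble = 1.0 - math.exp(-expected_hits)
    # Ensembles are assumed independent, so probabilities multiply.
    return p_one_ensemble ** M

def neuron_group_probability(p, N, Z, L, M):
    """Probability that at least one of the L/Z zones on a neuron
    holds a group (zones assumed non-overlapping and independent)."""
    n_zones = L / Z
    p_zone = zone_group_probability(p, N, Z, L, M)
    return 1.0 - (1.0 - p_zone) ** n_zones
```

      Under these assumptions the probability rises with ensemble size N and connection probability p, and falls steeply as the number of required ensembles M increases, which matches the qualitative argument in the text.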

      Sequence calculations

      Here we estimate the likelihood of the first ensemble input arriving anywhere on the dendrite, and ask how likely it is that succeeding inputs of the sequence would arrive within a set spacing.

      Thus, the probability of occurrence of sequences that receive sequential connections (PcPOSS) from each of the M ensembles is a function of the connection probability (p) between the two layers, the number of neurons in an ensemble (N), the relative window size with respect to the total dendritic arbor (Δ/L) and the number of ensembles (M).
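
      The sequence argument can be sketched in the same spirit. This is again a hedged illustration under stated assumptions rather than the manuscript's formula: each of the expected p·N synapses from the first ensemble is treated as a possible starting point, every later ensemble must contribute at least one synapse within a window Δ of the previous input (assumed Poisson), and starting points are treated as independent. The function names are hypothetical.

```python
import math

def sequence_probability_from_start(p, N, delta, L, M):
    """Probability that a given first-ensemble synapse is followed by
    inputs from the remaining M - 1 ensembles, each within a window
    of length delta after the previous one (assumed Poisson)."""
    expected_hits = p * N * delta / L
    p_next = 1.0 - math.exp(-expected_hits)
    return p_next ** (M - 1)

def neuron_sequence_probability(p, N, delta, L, M):
    """Probability that a neuron carries at least one such sequence,
    treating the expected p * N starting synapses as independent."""
    expected_starts = p * N
    q = sequence_probability_from_start(p, N, delta, L, M)
    # Poisson approximation for "at least one successful start".
    return 1.0 - math.exp(-expected_starts * q)
```

      With M = 1 the expression reduces to the chance of receiving any input from the first ensemble, and each additional ensemble multiplies in another window-hit probability, so longer sequences become rapidly less likely.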

      (2) I wonder if the authors are being overly conservative at times. The result highlighted in the abstract is that 10/100000 postsynaptic neurons are expected to exhibit synaptic clustering. This seems like a very small number, especially if circuits are to rely on such a mechanism. However, this figure assumes the convergence of 3-5 distinct ensembles. Convergence of inputs from just 2 ensembles would be much more prevalent, but still advantageous computationally. There has been excitement in the field about experiments showing the clustering of synapses encoding even a single feature.

      We agree that short clusters of two inputs would be far more likely. We focused our analysis on clusters with three or more ensembles for the following reasons:

      (1) The signal to noise in these clusters was very poor as the likelihood of noise clusters is high.

      (2) It is difficult to trigger nonlinearities with very few synaptic inputs.

      (3) At the ensemble sizes we considered (100 for clusters, 1000 for sequences), clusters arising from just two ensembles would result in a high probability of occurrence on all neurons in a network (~50% in cortex, see PcFMG in figures below). These dense neural representations make it difficult for downstream networks to decode (Foldiak 2003).

      However, in the presence of ensembles containing fewer neurons or when the connection probability between the layers is low, short clusters can result in sparse representations (Figure 2 - Supplement 2). Arguments 1 and 2 hold for short sequences as well.

      (3) The analysis supporting the claim that strong nonlinearities are needed for cluster/sequence detection is unconvincing. In the analysis, different synapse distributions on a single long dendrite are convolved with a sigmoid function and then the sum is taken to reflect the somatic response. In reality, dendritic nonlinearities influence the soma in a complex and dynamic manner. It may be that the abstract approach the authors use captures some of this, but it needs to be validated with simulations to be trusted (in line with previous work, e.g. Poirazi, Brannon & Mel, (2003)).

      We agree that multiple factors might affect the influence of nonlinearities on the soma. The key goal of our study was to understand the role played by random connectivity in giving rise to clustered computation. Since simulating a wide range of connectivity and activity patterns in a detailed biophysical model was computationally expensive, we analyzed the exemplar detailed models for nonlinearity separately (Figures 5, 6, and new Figure 8), and then used our abstract models as a proxy for understanding population dynamics. A complete analysis of the role played by morphology, channel kinetics and the effect of branching requires an in-depth study of its own, and some of these questions have already been tackled by (Poirazi, Brannon, and Mel 2003; Branco, Clark, and Häusser 2010; Bhalla 2017). However, in the revision, we have implemented a single model which incorporates the range of ion-channel, synaptic and biochemical signaling nonlinearities which we discuss in the paper (Figure 8, and Figure 8 - Supplement 1, 2, 3). We use this to demonstrate all three forms of sequence and grouped computation we use in the study, where the only difference is in the stimulus pattern and the separation of time-scales inherent in the stimuli.
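
      The kind of abstract proxy discussed in this exchange can be illustrated with a short sketch. This is one possible reading of the reviewer's description, not the manuscript's implementation: active synapses are counted in a sliding window along a one-dimensional dendrite, each local count is passed through a sigmoid, and the results are summed as a stand-in for the somatic response. The window size, threshold, steepness and function names are all illustrative assumptions.

```python
import math

def sigmoid(x, threshold, steepness):
    """Logistic nonlinearity centred on `threshold`."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - threshold)))

def somatic_proxy(synapse_positions, dendrite_length, zone,
                  threshold, steepness, step=1.0):
    """Sum of sigmoid-transformed local synapse counts along a dendrite.

    A window of width `zone` slides along the dendrite in increments
    of `step`; the number of active synapses inside each window is
    passed through the sigmoid, and the outputs are added up.
    """
    total = 0.0
    start = 0.0
    while start + zone <= dendrite_length:
        count = sum(1 for x in synapse_positions
                    if start <= x < start + zone)
        total += sigmoid(count, threshold, steepness)
        start += step
    return total

# Five clustered inputs versus five dispersed inputs on a 100 um dendrite.
clustered = [50.0, 51.0, 52.0, 53.0, 54.0]
dispersed = [10.0, 30.0, 50.0, 70.0, 90.0]
clustered_response = somatic_proxy(clustered, 100.0, 10.0, 3.0, 5.0)
dispersed_response = somatic_proxy(dispersed, 100.0, 10.0, 3.0, 5.0)
```

      With a steep sigmoid and a threshold of three inputs per window, the clustered arrangement gives a much larger summed response than the dispersed one. A shallow sigmoid narrows that gap, which is the intuition behind requiring strong nonlinearities for cluster detection.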

      (4) It is unclear whether some of the conclusions would hold in the presence of learning. In the signal-to-noise analysis, all synaptic strengths are assumed equal. But if synapses involved in salient clusters or sequences were potentiated, presumably detection would become easier? Similarly, if presynaptic tuning and/or timing were reorganized through learning, the conditions for synaptic arrangements to be useful could be relaxed. Answering these questions is beyond the scope of the study, but there is a caveat there nonetheless.

      We agree with the reviewer. If synapses receiving connectivity from ensembles had stronger weights, this would make detection easier. Dendritic spikes arising from clustered inputs have been implicated in local cooperative plasticity (Golding, Staff, and Spruston 2002; Losonczy, Makara, and Magee 2008). Further, plasticity related proteins synthesized at a synapse undergoing L-LTP can diffuse to neighboring weakly co-active synapses, and thereby mediate cooperative plasticity (Harvey et al. 2008; Govindarajan, Kelleher, and Tonegawa 2006; Govindarajan et al. 2011). Thus if clusters of synapses were likely to be co-active, they could further engage these local plasticity mechanisms which could potentiate them while not potentiating synapses that are activated by background activity. This would depend on the activity correlation between synapses receiving ensemble inputs within a cluster vs those activated by background activity. We have mentioned some of these ideas in a published opinion paper (Pulikkottil, Somashekar, and Bhalla 2021). In the current study, we wanted to understand whether even in the absence of specialized connection rules, interesting computations could still emerge. Thus, we focused on asking whether clustered or sequential convergence could arise even in a purely randomly connected network, with the most basic set of assumptions. We agree that an analysis of how selectivity evolves with learning would be an interesting topic for further work.

      References

      • Bhalla, Upinder S. 2017. “Synaptic Input Sequence Discrimination on Behavioral Timescales Mediated by Reaction-Diffusion Chemistry in Dendrites.” Edited by Frances K Skinner. eLife 6 (April):e25827. https://doi.org/10.7554/eLife.25827.

      • Branco, Tiago, Beverley A. Clark, and Michael Häusser. 2010. “Dendritic Discrimination of Temporal Input Sequences in Cortical Neurons.” Science (New York, N.Y.) 329 (5999): 1671–75. https://doi.org/10.1126/science.1189664.

      • Foldiak, Peter. 2003. “Sparse Coding in the Primate Cortex.” The Handbook of Brain Theory and Neural Networks. https://research-repository.st-andrews.ac.uk/bitstream/handle/10023/2994/FoldiakSparseHBTNN2e02.pdf?sequence=1.

      • Golding, Nace L., Nathan P. Staff, and Nelson Spruston. 2002. “Dendritic Spikes as a Mechanism for Cooperative Long-Term Potentiation.” Nature 418 (6895): 326–31. https://doi.org/10.1038/nature00854.

      • Govindarajan, Arvind, Inbal Israely, Shu-Ying Huang, and Susumu Tonegawa. 2011. “The Dendritic Branch Is the Preferred Integrative Unit for Protein Synthesis-Dependent LTP.” Neuron 69 (1): 132–46. https://doi.org/10.1016/j.neuron.2010.12.008.

      • Govindarajan, Arvind, Raymond J. Kelleher, and Susumu Tonegawa. 2006. “A Clustered Plasticity Model of Long-Term Memory Engrams.” Nature Reviews Neuroscience 7 (7): 575–83. https://doi.org/10.1038/nrn1937.

      • Harvey, Christopher D., Ryohei Yasuda, Haining Zhong, and Karel Svoboda. 2008. “The Spread of Ras Activity Triggered by Activation of a Single Dendritic Spine.” Science (New York, N.Y.) 321 (5885): 136–40. https://doi.org/10.1126/science.1159675.

      • Losonczy, Attila, Judit K. Makara, and Jeffrey C. Magee. 2008. “Compartmentalized Dendritic Plasticity and Input Feature Storage in Neurons.” Nature 452 (7186): 436–41. https://doi.org/10.1038/nature06725.

      • Poirazi, Panayiota, Terrence Brannon, and Bartlett W. Mel. 2003. “Pyramidal Neuron as Two-Layer Neural Network.” Neuron 37 (6): 989–99. https://doi.org/10.1016/S0896-6273(03)00149-1.

      • Pulikkottil, Vinu Varghese, Bhanu Priya Somashekar, and Upinder S. Bhalla. 2021. “Computation, Wiring, and Plasticity in Synaptic Clusters.” Current Opinion in Neurobiology, Computational Neuroscience, 70 (October):101–12. https://doi.org/10.1016/j.conb.2021.08.001.

    2. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      In this revision, the authors significantly improved the manuscript. They now address some of my concerns. Specifically, they show the contribution of end-effects on spreading the inputs between dendrites. This analysis reveals greater applicability of their findings to cortical cells, with long, unbranching dendrites than other neuronal types, such as Purkinje cells in the cerebellum.

      They now explain better the interactions between calcium and voltage signals, which I believe improve the take-away message of their manuscript. They modified and added new figures that helped to provide more information about their simulations.

      However, some of my points remain valid. Figure 6 shows depolarization of ~5mV from -75. This weak depolarization would not effectively recruit nonlinear activation of NMDARs. In their paper, Branco and Hausser (2010) showed depolarizations of ~10-15mV.

      More importantly, the signature of NMDAR activation is the prolonged plateau potential and activation at more depolarized resting membrane potentials (their Figure 4). Thus, despite including NMDARs in the simulation, the authors do not model functional recruitment of these channels. Their simulation is thus equivalent to AMPA only drive, which can indeed summate somewhat nonlinearly.

      In the current study, we used short sequences of 5 inputs, since the convergence of longer sequences is extremely unlikely in the network configurations we have examined. This resulted in smaller EPSP amplitudes of ~5mV (Figure 6 - Supplement 2A, B). Longer sequences containing 9 inputs resulted in larger somatic depolarizations of ~10mV (Figure 6 - Supplement 2E, F). Although we had modified the (Branco, Clark, and Häusser 2010) model to remove the jitter in the timing of arrival of inputs and made slight modifications to the location of stimulus delivery on the dendrite, we saw similar amplitudes when we tested a 9-length sequence using (Branco, Clark, and Häusser 2010)’s published code (Figure 6 - Supplement 2I, J). In all the cases we tested (5 input sequence, 9 input sequence, 9 input sequence with (Branco, Clark, and Häusser 2010) code repository), removal of NMDA synapses lowered both the somatic EPSPs (Figure 6 - Supplement 2C,D,G,H,K,L) as well as the selectivity (measured as the difference between the EPSPs generated for inward and outward stimulus delivery) (Figure 6 Supplement 2M,N,O). Further, monitoring the voltage along the dendrite for a sequence of 5 inputs showed dendritic EPSPs in the range of 20-45 mV (Figure 6 - Supplement 2P, Q), which came down notably (10-25mV) when NMDA synapses were abolished (Figure 6 - Supplement 2R, S). Thus, even sequences containing as few as 5 inputs were capable of engaging the NMDA-mediated nonlinearity to show sequence selectivity, although the selectivity was not as strong as in the case of 9 inputs.

      Reviewer #1 (Recommendations for the authors):

      Minor points:

      Figure 8, what does the scale in A represent? I assume it is voltage, but there are no units. Figure 8, C, E, G, these are unconventional units for synaptic weights, usually, these are given in nS / per input.

      We have corrected these. The scalebar in 8A represents membrane potential in mV. The units of 8C,E,G are now in nS.

      Reviewer #2 (Public Review):

      Summary:

      If synaptic input is functionally clustered on dendrites, nonlinear integration could increase the computational power of neural networks. But this requires the right synapses to be located in the right places. This paper aims to address the question of whether such synaptic arrangements could arise by chance (i.e. without special rules for axon guidance or structural plasticity), and could therefore be exploited even in randomly connected networks. This is important, particularly for the dendrites and biological computation communities, where there is a pressing need to integrate decades of work at the single-neuron level with contemporary ideas about network function.

      Using an abstract model where ensembles of neurons project randomly to a postsynaptic population, back-of-envelope calculations are presented that predict the probability of finding clustered synapses and spatiotemporal sequences. Using data-constrained parameters, the authors conclude that clustering and sequences are indeed likely to occur by chance (for large enough ensembles), but require strong dendritic nonlinearities and low background noise to be useful.

      Strengths:

      (1) The back-of-envelope reasoning presented can provide fast and valuable intuition. The authors have also made the effort to connect the model parameters with measured values. Even an approximate understanding of cluster probability can direct theory and experiments towards promising directions, or away from lost causes.

      (2) I found the general approach to be refreshingly transparent and objective. Assumptions are stated clearly about the model and statistics of different circuits. Along with some positive results, many of the computed cluster probabilities are vanishingly small, and noise is found to be quite detrimental in several cases. This is important to know, and I was happy to see the authors take a balanced look at conditions that help/hinder clustering, rather than to just focus on a particular regime that works.

      (3) This paper is also a timely reminder that synaptic clusters and sequences can exist on multiple spatial and temporal scales. The authors present results pertaining to the standard `electrical' regime (~50-100 µm, <50 ms), as well as two modes of chemical signaling (~10 µm, 100-1000 ms). The senior author is indeed an authority on the latter, and the simulations in Figure 5, extending those from Bhalla (2017), are unique in this area. In my view, the role of chemical signaling in neural computation is understudied theoretically, but research will be increasingly important as experimental technologies continue to develop.

      Weaknesses:

      (1) The paper is mostly let down by the presentation. In the current form, some patience is needed to grasp the main questions and results, and it is hard to keep track of the many abbreviations and definitions. A paper like this can be impactful, but the writing needs to be crisp, and the logic of the derivation accessible to non-experts. See, for instance, Stepanyants, Hof & Chklovskii (2002) for a relevant example.

      It would be good to see a restructure that communicates the main points clearly and concisely, perhaps leaving other observations to an optional appendix. For the interested but time-pressed reader, I recommend starting with the last paragraph of the introduction, working through the main derivation on page 7, and writing out the full expression with key parameters exposed. Next, look at Table 1 and Figure 2J to see where different circuits and mechanisms fit in this scheme. Beyond this, the sequence derivation on page 15 and biophysical simulations in Figures 5 and 6 are also highlights.

      We appreciate the reviewers' suggestions. We have tightened the flow of the introduction. We understand that the abbreviations and definitions are challenging and have therefore provided intuitions and summaries of the equations discussed in the main text.

      Clusters calculations

      Our approach is to ask how likely it is that a given set of inputs lands on a short segment of dendrite, and then scale it up to all segments on the entire dendritic length of the cell.

      Thus, the probability of occurrence of groups that receive connections from each of the M ensembles (PcFMG) is a function of the connection probability (p) between the two layers, the number of neurons in an ensemble (N), the relative zone-length with respect to the total dendritic arbor (Z/L) and the number of ensembles (M).
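
      As a rough illustration of how these four quantities could combine, the sketch below assumes synapses from each ensemble land independently and uniformly along the dendrite, so that the count in any zone is approximately Poisson. The function names and the example parameter values are hypothetical, and the expression is a back-of-envelope form under that stated assumption, not necessarily the exact expression used in the manuscript.

```python
import math

def zone_hit_probability(p, N, Z, L):
    """P(at least one synapse from a single ensemble lands in one zone).

    Assumes a Poisson count with mean p * N * Z / L (illustrative assumption).
    """
    expected = p * N * Z / L
    return 1.0 - math.exp(-expected)

def full_group_probability(p, N, Z, L, M):
    """P(a neuron has at least one zone receiving input from all M ensembles).

    Treats the L / Z zones on the dendritic arbor as independent.
    """
    per_zone = zone_hit_probability(p, N, Z, L) ** M
    n_zones = L / Z
    return 1.0 - (1.0 - per_zone) ** n_zones

# Illustrative values only: p = 0.05, N = 100, Z = 10 um, L = 10 mm, M = 4
example = full_group_probability(0.05, 100, 10e-6, 10e-3, 4)
```

      Under these assumptions the probability rises steeply with ensemble size N and falls with the number of ensembles M, which matches the qualitative dependence described above.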

      Sequence calculations

      Here we estimate the likelihood of the first ensemble input arriving anywhere on the dendrite, and ask how likely it is that succeeding inputs of the sequence would arrive within a set spacing.

      Thus, the probability of occurrence of sequences that receive sequential connections (PcPOSS) from each of the M ensembles is a function of the connection probability (p) between the two layers, the number of neurons in an ensemble (N), the relative window size with respect to the total dendritic arbor (Δ/L) and the number of ensembles (M).
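
      The same style of estimate can be sketched for sequences. The code below assumes the first ensemble contributes a Poisson number of synapses with mean pN anywhere on the dendrite, and that each later ensemble must place at least one synapse within its own window of length Δ. The function names and parameter values are hypothetical, and this is an approximation under those stated assumptions rather than the manuscript's own formula.

```python
import math

def window_hit_probability(p, N, delta, L):
    """P(at least one synapse from one ensemble falls in a window of length delta)."""
    return 1.0 - math.exp(-p * N * delta / L)

def sequence_probability(p, N, delta, L, M):
    """P(a neuron receives at least one ordered sequence from M ensembles).

    expected_starts: mean number of first-ensemble synapses on the dendrite.
    follow: chance that each of the remaining M - 1 ensembles lands in its window.
    """
    expected_starts = p * N
    follow = window_hit_probability(p, N, delta, L) ** (M - 1)
    return 1.0 - math.exp(-expected_starts * follow)

# Illustrative values only: p = 0.05, N = 1000, delta = 10 um, L = 10 mm, M = 5
example_seq = sequence_probability(0.05, 1000, 10e-6, 10e-3, 5)
```

      With M = 1 the expression reduces to the probability that the first ensemble makes any connection at all, which is a quick sanity check on the sketch.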

      (2) I wonder if the authors are being overly conservative at times. The result highlighted in the abstract is that 10/100000 postsynaptic neurons are expected to exhibit synaptic clustering. This seems like a very small number, especially if circuits are to rely on such a mechanism. However, this figure assumes the convergence of 3-5 distinct ensembles. Convergence of inputs from just 2 ensembles would be much more prevalent, but still advantageous computationally. There has been excitement in the field about experiments showing the clustering of synapses encoding even a single feature.

      We agree that short clusters of two inputs would be far more likely. We focused our analysis on clusters with three or more ensembles for the following reasons:

      (1) The signal to noise in these clusters was very poor as the likelihood of noise clusters is high.

      (2) It is difficult to trigger nonlinearities with very few synaptic inputs.

      (3) At the ensemble sizes we considered (100 for clusters, 1000 for sequences), clusters arising from just two ensembles would result in high probability of occurrence on all neurons in a network (~50% in cortex, see p_CMFG in figures below.). These dense neural representations make it difficult for downstream networks to decode (Foldiak 2003).

      However, in the presence of ensembles containing fewer neurons or when the connection probability between the layers is low, short clusters can result in sparse representations (Figure 2 - Supplement 2). Arguments 1 and 2 hold for short sequences as well.

      (3) The analysis supporting the claim that strong nonlinearities are needed for cluster/sequence detection is unconvincing. In the analysis, different synapse distributions on a single long dendrite are convolved with a sigmoid function and then the sum is taken to reflect the somatic response. In reality, dendritic nonlinearities influence the soma in a complex and dynamic manner. It may be that the abstract approach the authors use captures some of this, but it needs to be validated with simulations to be trusted (in line with previous work, e.g. Poirazi, Brannon & Mel, (2003)).

      We agree that multiple factors might affect the influence of nonlinearities on the soma. The key goal of our study was to understand the role played by random connectivity in giving rise to clustered computation. Since simulating a wide range of connectivity and activity patterns in a detailed biophysical model was computationally expensive, we analyzed the exemplar detailed models for nonlinearity separately (Figures 5, 6, and new figure 8), and then used our abstract models as a proxy for understanding population dynamics. A complete analysis of the role played by morphology, channel kinetics and the effect of branching requires an in-depth study of its own, and some of these questions have already been tackled by (Poirazi, Brannon, and Mel 2003; Branco, Clark, and Häusser 2010; Bhalla 2017). However, in the revision, we have implemented a single model which incorporates the range of ion-channel, synaptic and biochemical signaling nonlinearities which we discuss in the paper (Figure 8, and Figure 8 Supplement 1, 2,3). We use this to demonstrate all three forms of sequence and grouped computation we use in the study, where the only difference is in the stimulus pattern and the separation of time-scales inherent in the stimuli.

      (4) It is unclear whether some of the conclusions would hold in the presence of learning. In the signal-to-noise analysis, all synaptic strengths are assumed equal. But if synapses involved in salient clusters or sequences were potentiated, presumably detection would become easier? Similarly, if presynaptic tuning and/or timing were reorganized through learning, the conditions for synaptic arrangements to be useful could be relaxed. Answering these questions is beyond the scope of the study, but there is a caveat there nonetheless.

      We agree with the reviewer. If synapses receiving connectivity from ensembles had stronger weights, this would make detection easier. Dendritic spikes arising from clustered inputs have been implicated in local cooperative plasticity (Golding, Staff, and Spruston 2002; Losonczy, Makara, and Magee 2008). Further, plasticity related proteins synthesized at a synapse undergoing L-LTP can diffuse to neighboring weakly co-active synapses, and thereby mediate cooperative plasticity (Harvey et al. 2008; Govindarajan, Kelleher, and Tonegawa 2006; Govindarajan et al. 2011). Thus if clusters of synapses were likely to be co-active, they could further engage these local plasticity mechanisms which could potentiate them while not potentiating synapses that are activated by background activity. This would depend on the activity correlation between synapses receiving ensemble inputs within a cluster vs those activated by background activity. We have mentioned some of these ideas in a published opinion paper (Pulikkottil, Somashekar, and Bhalla 2021). In the current study, we wanted to understand whether even in the absence of specialized connection rules, interesting computations could still emerge. Thus, we focused on asking whether clustered or sequential convergence could arise even in a purely randomly connected network, with the most basic set of assumptions. We agree that an analysis of how selectivity evolves with learning would be an interesting topic for further work.

      References

      Bhalla, Upinder S. 2017. “Synaptic Input Sequence Discrimination on Behavioral Timescales Mediated by Reaction-Diffusion Chemistry in Dendrites.” Edited by Frances K Skinner. eLife 6 (April):e25827. https://doi.org/10.7554/eLife.25827.

      Branco, Tiago, Beverley A. Clark, and Michael Häusser. 2010. “Dendritic Discrimination of Temporal Input Sequences in Cortical Neurons.” Science (New York, N.Y.) 329 (5999): 1671–75. https://doi.org/10.1126/science.1189664.

      Foldiak, Peter. 2003. “Sparse Coding in the Primate Cortex.” The Handbook of Brain Theory and Neural Networks. https://research-repository.st-andrews.ac.uk/bitstream/handle/10023/2994/FoldiakSparse HBTNN2e02.pdf?sequence=1.

      Golding, Nace L., Nathan P. Staff, and Nelson Spruston. 2002. “Dendritic Spikes as a Mechanism for Cooperative Long-Term Potentiation.” Nature 418 (6895): 326–31. https://doi.org/10.1038/nature00854.

      Govindarajan, Arvind, Inbal Israely, Shu-Ying Huang, and Susumu Tonegawa. 2011. “The Dendritic Branch Is the Preferred Integrative Unit for Protein Synthesis-Dependent LTP.” Neuron 69 (1): 132–46. https://doi.org/10.1016/j.neuron.2010.12.008.

      Govindarajan, Arvind, Raymond J. Kelleher, and Susumu Tonegawa. 2006. “A Clustered Plasticity Model of Long-Term Memory Engrams.” Nature Reviews Neuroscience 7 (7): 575–83. https://doi.org/10.1038/nrn1937.

      Harvey, Christopher D., Ryohei Yasuda, Haining Zhong, and Karel Svoboda. 2008. “The Spread of Ras Activity Triggered by Activation of a Single Dendritic Spine.” Science (New York, N.Y.) 321 (5885): 136–40. https://doi.org/10.1126/science.1159675.

      Losonczy, Attila, Judit K. Makara, and Jeffrey C. Magee. 2008. “Compartmentalized Dendritic Plasticity and Input Feature Storage in Neurons.” Nature 452 (7186): 436–41. https://doi.org/10.1038/nature06725.

      Poirazi, Panayiota, Terrence Brannon, and Bartlett W. Mel. 2003. “Pyramidal Neuron as Two-Layer Neural Network.” Neuron 37 (6): 989–99. https://doi.org/10.1016/S0896-6273(03)00149-1.

      Pulikkottil, Vinu Varghese, Bhanu Priya Somashekar, and Upinder S. Bhalla. 2021. “Computation, Wiring, and Plasticity in Synaptic Clusters.” Current Opinion in Neurobiology, Computational Neuroscience, 70 (October):101–12. https://doi.org/10.1016/j.conb.2021.08.001.

    1. Figure 9(b) presents the results of PaLM 2-L as the scorer LLM with the following options of initial instructions: (1) “Let’s solve the problem.”; (2) the empty string; or (3) “Let’s think step by step.”. We notice that the performance differs much more with different initial instructions, especially at the beginning of the optimization. Specifically, starting from (1) leads to better generated instructions than (2) in the first 30 steps, while the instructions optimized from both (1) and (2) are worse than (3) throughout. A similar observation holds when using PaLM 2-L as scorer and gpt-3.5-turbo as optimizer for BBH tasks, by comparing the results starting from the empty string (Appendix E.2) and from “Let’s solve the problem.” (Appendix E.3). Taking a closer look into the optimization process of (2), we find that although both “solve the problem” and “step by step” show up in generated instructions at Step 5, it takes the optimizer LLM more steps to get rid of worse instructions presented in the meta-prompt when starting from instructions with lower accuracies. Therefore, one direction for future work is to accelerate convergence from weaker starting points.

      Figure 9b shows the results of PaLM 2-L as the scorer LLM with the following initial prompts: - "Let's solve the problem" - Empty prompt - "Let's think step by step"

    1. Primer Validation

      More detail is needed here, including how you determined the limit of detection of your assay. State that you used standard curves to estimate limit of detection (LOD), but see Klymus et al. (2020). Given that your assays are for qualitative purposes, the limit of quantification (LOQ) is likely not relevant in your case. Please verify this to clarify in the main text why the qPCR efficiency may be irrelevant for your assays, but the LOD is.

      Depending on who you will get as an examiner, it may be worthwhile to also mention that you did the testing according to the MIQE guidelines, which I think were incorporated into this paper (see their Appendix S1 for the checklist):

      • Thalinger, B., Deiner, K., Harper, L. R., Rees, H. C., Blackman, R. C., Sint, D., ... & Bruce, K. (2021). A validation scale to determine the readiness of environmental DNA assays for routine species monitoring. Environmental DNA, 3(4), 823-836.

      • Bustin, S. A. (2024). Improving the quality of quantitative polymerase chain reaction experiments: 15 years of MIQE. Molecular aspects of medicine, 96, 101249.

      • Klymus, K. E., Merkes, C. M., Allison, M. J., Goldberg, C. S., Helbing, C. C., Hunter, M. E., Jackson, C. A., Lance, R. F., Mangan, A. M., Monroe, E. M., Piaggio, A. J., Stokdyk, J. P., Wilson, C. C., & Richter, C. A. (2020). Reporting the limits of detection and quantification for environmental DNA assays. Environmental DNA, 2, 271–282. https://doi.org/10.1002/edn3.29

    Annotators

    1. Primer Validation

      More detail is needed here, include how you determined the limit of detection of your assay. State that you used standard curves to estimate limit of detection (LOD), but see Klymus et al. (2020). Given that your assays are for qualitative purposes, the limit of quantification (LOQ) is likely not relevant in your case. Please verify this to clarify in the main text why the qPCR efficiency may be irrelevant for your assays, but the LOD is.

      Depending on who you will get as an examiner, it may be worthwhile to also mention that you did the testing according to the MIQE guidelines, which I think were incorporated into this paper (see thier Appendix S1 for the checklist):

      • Thalinger, B., Deiner, K., Harper, L. R., Rees, H. C., Blackman, R. C., Sint, D., ... & Bruce, K. (2021). A validation scale to determine the readiness of environmental DNA assays for routine species monitoring. Environmental DNA, 3(4), 823-836.

      • Bustin, S. A. (2024). Improving the quality of quantitative polymerase chain reaction experiments: 15 years of MIQE. Molecular aspects of medicine, 96, 101249.

      • Klymus, K. E., Merkes, C. M., Allison, M. J., Goldberg, C. S., Helbing, C. C., Hunter, M. E., Jackson, C. A., Lance, R. F., Mangan, A. M., Monroe, E. M., Piaggio, A. J., Stokdyk, J. P., Wilson, C. C., & Richter, C. A. (2020). Reporting the limits of detection and quantification for environmental DNA assays. Environmental DNA, 2, 271–282. https://doi.org/10.1002/edn3.29

    Annotators

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The pituitary gonadotropins, FSH and LH, are critical regulators of reproduction. In mammals, synthesis and secretion of FSH and LH by gonadotrope cells are controlled by the hypothalamic peptide, GnRH. As FSH and LH are made in the same cells in mammals, variation in the nature of GnRH secretion is thought to contribute to the differential regulation of the two hormones. In contrast, in fish, FSH and LH are produced in distinct gonadotrope populations and may be less (or differently) dependent on GnRH than in mammals. In the present manuscript, the authors endeavored to determine whether FSH may be independently controlled by a distinct peptide, cholecystokinin (CCK), in zebrafish.

      Strengths:

      The authors demonstrated that the CCK receptor is enriched in FSH-producing relative to LH-producing gonadotropes, and that genetic deletion of the receptor leads to dramatic decreases in gonadotropin production and gonadal development in zebrafish. Also, using innovative in vivo and ex vivo calcium imaging approaches, they show that LH- and FSH-producing gonadotropes preferentially respond to GnRH and CCK, respectively. Exogenous CCK also preferentially stimulated FSH secretion ex vivo and in vivo.

      Weaknesses:

      The concept that there may be a distinct FSH-releasing hormone (FSHRH) has been debated for decades. As the authors suggest that CCK is the long-sought FSHRH (at least in fish), they must provide data that convincingly leads to such a conclusion. In my estimation, they have not yet met this burden. In particular, they show that CCK is sufficient to activate FSH-producing cells, but have not yet demonstrated its necessity. Their one attempt to do so was using fish in which they inactivated the CCK receptor using CRISPR-Cas9. While this manipulation led to a reduction in FSH, LH was affected to a similar extent. As a result, they have not shown that CCK is a selective regulator of FSH.

      Our conclusion regarding the necessity of CCK signaling for FSH secretion is based on the following evidence:

      (1) CCK-like receptors are expressed in the pituitary gland predominantly on FSH cells.

      (2) Application of CCK to pituitaries elicits FSH cell activation and, to a much lesser degree, activation of LH cells (calcium imaging assays).

      (3) Application of CCK to pituitaries and by injection in vivo significantly increased only FSH release.

      (4) Mutating the FSH-specific CCK receptor in a different species of fish (medaka) also causes a complete shutdown of FSH production and phenocopies a fsh-mutant phenotype (Uehara, Nishiike et al. 2023).

      Taken together, we believe that these data strongly support the conclusion that CCK is necessary for FSH production and release from the fish pituitary. Admittedly, the overlapping effects of CCK on both FSH and LH cells in zebrafish (evident in both our calcium imaging experiments and especially in the KO phenotype) complicate the interpretation of the phenotype. We speculate that the effect of CCK on LH cells in zebrafish can be caused either by paracrine signaling within the gland or by the effects of CCK on GnRH neurons, which were shown to express CCK receptors.

      In the current version, we emphasize that CCK also induces LH secretion. Although it does not affect LH to the same extent as FSH, an overlap does exist. This is mentioned in the abstract and discussion.

      Moreover, they do not yet demonstrate that the effects observed reflect the loss of the receptor's function in gonadotropes, as opposed to other cell types.

      Although there is evidence for the expression of the CCK receptor in other tissues, we show a direct decrease of FSH and LH expression in the gonadotrophs of the pituitary of the mutant fish. Taken together with the receptor's significantly higher expression in FSH cells than in the rest of the pituitary cells in the cell-specific transcriptome, a loss of receptor function in gonadotrophs is the most reasonable explanation for the mutant phenotype.

      Unfortunately, unlike in mice, technologies for conditional knockout of genes in specific cell types are not yet available for our model and cell types. The tissue distribution of the three CCK receptor types was added in supplementary figure 1. This distribution shows that in the pituitary only CCKBRA (our identified CCK receptor) is expressed, while in other tissues it is either not expressed or expressed together with the additional CCK receptors, which can compensate for its activity.

      It also is not clear whether the phenotypes of the fish reflect perturbations in pituitary development vs. a loss of CCK receptor function in the pituitary later in life. Ideally, the authors would attempt to block CCK signaling in adult fish that develop normally. For example, if CCK receptor antagonists are available, they could be used to treat fish and see whether and how this affects FSH vs. LH secretion.

      While the observed gonadal phenotype of the KO (sex-inversed fish) should have a developmental origin, since it requires a long time to manifest, the effect of the KO on FSH and LH cells is probably more acute. Unfortunately, a specific antagonist that affects only CCKRBA and not the other CCK receptors has not yet been identified.

      In the Discussion, the authors suggest that CCK, as a satiety factor, may provide a link between metabolism and reproduction. This is an interesting idea, but it is not supported by the data presented. That is, none of the results shown link metabolic state to CCK regulation of FSH and fertility. Absent such data, the lengthy Discussion of the link is speculative and not fully merited.

      In the revised manuscript, we provided data to link CCK with metabolic status in supplementary figure 1 and modified the discussion to tone down the link between metabolic status and reproductive state.

      Also in the Discussion, the authors argue that "CCK directly controls FSH cells by innervating the pituitary gland and binding to specific receptors that are particularly abundant in FSH gonadotrophs." However, their imaging does not demonstrate innervation of FSH cells by CCK terminals (e.g., at the EM level).

      Innervation of the fish pituitary does not imply a synaptic-like connection between axon terminals and endocrine cells. In fact, such connections are extremely rare, and their functionality is unclear. Instead, the mode of regulation between hypothalamic terminals and endocrine cells in the fish pituitary is more similar to "volume transmission" in the CNS, i.e. peptides are released into the tissue and carried to their endocrine cell targets by the circulation or via diffusion. A short explanation was added in lines 395-398 in the discussion.

      Moreover, they have not demonstrated the binding of CCK to these cells. Indeed, no CCK receptor protein data are shown.

      Our revised manuscript includes detailed experiments showing the activation of the receptor by its homologous ligand. Supplementary Figure 1 includes a transactivation assay of CCK on its receptor and the effect of the different mutations on the activation of the receptor. Unfortunately, no antibody is available against this fish-specific receptor (one of the caveats of working with fish models); therefore, we cannot present receptor protein data.

      The calcium responses of FSH cells to exogenous CCK certainly suggest the presence of functional CCK receptors therein; but, the nature of the preparations (with all pituitary cell types present) does not demonstrate that CCK is acting directly in these cells.

      We agree with the reviewer that there are some disadvantages in choosing to work with a whole-tissue preparation. However, we believe that the advantages of working in a more physiological context far outweigh the drawbacks as it reflects the natural dynamics more precisely. Since our transcriptome data, as well as our ISH staining, show that the CCK receptor is exclusively expressed in FSH cells, it is improbable that the observed calcium response is mediated via a different pituitary cell type.

      Indeed, the asynchrony in responses of individual FSH cells to CCK (Figure 4) suggests that not all cells may be activated in the same way. Contrast the response of LH cells to GnRH, where the onset of calcium signaling is similar across cells (Figure 3).

      The difference between the synchronization levels of LH and FSH cell activity stems from the gap-junction-mediated coupling between LH cells, which does not exist between FSH cells (Golan, Martin et al. 2016). Therefore, the onset of the calcium response in FSH cells depends on the irregular diffusion rate of the peptide within the preparation, whereas the tight homotypic coupling between LH cells generates a strong and synchronized calcium rise that propagates quickly throughout the entire population.

      The differences in connectivity between LH and FSH cells are mentioned in lines 194-195.

      Finally, as the authors note in the Discussion, the data presented do not enable them to conclude that the endogenous CCK regulating FSH (assuming it does) is from the brain as opposed to other sources (e.g., the gut).

      We agree with the reviewer that, for now, we are unable to determine whether hypothalamic or peripheral CCK are the main drivers of FSH cells. While the strong innervation of the gland by CCK-secreting hypothalamic neurons strengthens the notion of a hypothalamic-releasing hormone and also fits with the dogma of the neural control of the pituitary gland in fish (Ball 1981), more experiments are required to resolve this question.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript builds on previous work suggesting that the CCK peptide is the releasing hormone for FSH in fishes, which is different than that observed in mammals where both LH and FSH release are under the control of GnRH. Based on data using calcium imaging as a readout for stimulation of the gonadotrophs, the researchers present data supporting the hypothesis that CCK stimulates FSH-containing cells in the pituitary. In contrast, LH-containing cells show a weak and variable response to CCK but are highly responsive to GnRH. Data are presented that support the role of CCK in the release of FSH. Researchers also state that functional overlap exists in the potency of GnRH to activate FSH cells, thus the two signalling pathways are not separate. The results are of interest to the field because for many years the assumption has been that fishes use the same signalling mechanism. These data present an intriguing variation where a hormone involved in satiation acts in the control of reproduction.

      Strengths:

      The strengths of the manuscript are that researchers have shed light on different pathways controlling reproduction in fishes.

      Weaknesses:

      Weaknesses are that it is not clear if multiple ligand/receptors are involved (more than one CCK and more than one receptor?). The imaging of the CCK terminals and CCK receptors needs to be reinforced.

      Reviewer consultation summary: 

      The data presented establish sufficiency, but not necessity of CCK in FSH regulation. The paper did not show that CCK endogenously regulates FSH in fish. This has not been established yet.

      This is a very important comment, also raised by reviewer 1. To avoid repetition, please see our detailed response to the comment above.

      The paper presents the pharmacological effects of CCK on ex vivo preparations but does not establish the in vivo physiological function of the peptide. The current evidence for a novel physiological regulatory mechanism is incomplete and would require further physiological experiments. These could include the use of a CCK receptor antagonist in adult fish to see the effects on FSH and LH release, the generation of a CCK knockout, or cell-specific genetic manipulations.

      As detailed in the responses to the first reviewer, we cannot conduct conditional, cell-specific gene knockout in our model. However, we did conduct a KO and show the direct effect on FSH and LH secretion, together with a physiological characterisation of the mutant.

      Zebrafish have two CCK ligands: ccka, cckb and also multiple receptors: cckar, cckbra and cckbrb. There is ambiguity about which CCK receptor and ligand are expressed and which gene was knocked out.

      In the revised manuscript, we clarified which of the receptors is expressed (CCKRBA) and which receptor is targeted. We also provided data showing the specificity of the receptors (both WT and mutant) to the ligands. Supplementary Figure 1 shows receptor cross-activation. The Methods also specify the exact NCBI ID numbers of the targeted receptor and the antibody used for the immunostaining.

      Blocking CCK action in fish (with receptor KO) affects FSH and LH. Therefore, the work did not demonstrate a selective role for CCK in FSH regulation in vivo and any claims to have discovered FSHRH need to be more conservative.

      We agree with the reviewer that the overlap in the effect of CCK measured in the calcium activation of cells and in the KO model does not allow us to conclude selectivity. In this context, it is crucial to highlight that CCKRBA exhibits high expression on FSH cells but not on LH cells. Therefore, the effect of CCK on LH cells is likely paracrine or through GnRH neurons that were shown to express CCK receptors. In the current version, we emphasize that CCK also induces LH secretion. Although it does not affect LH to the same extent as FSH, an overlap does exist. This is mentioned in the abstract and discussion.

      The labelling of the terminals with anti-CCK looks a lot like the background and the authors did not show a specificity control (e.g. anti-CCK antibody pre-absorbed with the peptide or anti-CCK in morphant/KO animals).

      Figure colours have been updated to better visualise the specific staining of the antibody. In addition, the same antibody was previously used to mark CCK-positive cells in the gut of the red drum fish (Webb, Khan et al. 2010), where a control experiment (antibody pre-absorbed with the peptide) was conducted.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Abstract:

      The authors have not yet established that CCK is the primary regulator of FSH in vivo.

      In the new version, we highlight the leading effect of CCK on the reproductive axis, which includes FSH and LH.

      Introduction:

      The authors need to make clear earlier in the Introduction that fish have two types of gonadotropes. This information comes too late (last paragraph) currently.

      Added in line 42

      They should discuss relevant data on the differential regulation of FSH and LH in fish, as a rationale for looking for different releasing factors.

      This has been discussed in the first paragraph of the introduction

      In the last sentence of the penultimate paragraph, the authors assume that it must be a hypothalamic factor that regulates FSH. Why is this necessarily the case? Are there data indicating that a hypothalamic factor is required for FSH production in fish?

      This is mentioned in the discussion. We do not deny that circulating CCK or CCK from other brain areas might affect FSH secretion in the pituitary (lines 402-404). However, as the hypothalamus serves as the main gateway from the brain to the pituitary and contains hypophysiotropic CCK neurons, it is the most reasonable assumption.

      Results:

      In the first paragraph, the authors reference three types of CCK receptors, only one of which is expressed in the pituitary. The specific receptor should be named here.

      The receptor name and NCBI ID have been added to this paragraph.

      Figure 1: What specificity controls were used for the ISH in Figure 1?

      HCR, the method used to identify RNA expression, was developed by Molecular Instruments (https://www.molecularinstruments.com/hcr-rnafish-protocols) and does not require the specificity controls used with older ISH methods. The use of multiple short probes ensures specificity to the target RNA. Moreover, the expression is specific to the targeted cells.

      In Figure 1D, the red square is missing in the KO fish (at low magnification).

      This was fixed in the updated version.

      In Figure 1G, the number of dots does not correspond to the number of animals described in the figure legend. Does each point represent an animal?

      Each dot represents a fish. The order of the numbers in the legend did not match the order in the graph; this has been fixed in the latest version.

      Figure 2A: It is not clear that all FSH (GFP) cells are double-labeled. Should all double-labeled cells appear white? Many appear as green. Some quantification of the proportion of co-labeling is needed. Also, the scale bars are too small to read. Perhaps add the size of the scale bars to the legend.

      They are all double-labeled, as can be seen in the single-color images. Since GFP fluorescence is stronger than RCaMP fluorescence, double-labeled cells may appear green. A scale bar was added.

      Figure 2C: Is the synchronous activity of LH cells here dependent on endogenous GnRH? Can these events be blocked with a GnRH receptor antagonist?

      We currently do not have enough data to support this hypothesis, and the in vivo two-photon system is not optimal for answering these questions, since these are spontaneous events that are difficult to predict. This is the main reason we moved to an ex vivo system. The similar response we observe when applying GnRH in the ex vivo system supports the idea that these events reflect GnRH activation.

      Figure 4C: As some LH cells respond to CCK, can the authors really claim that CCK is a selective regulator of FSH? What explains the heterogeneity in the response of LH cells to CCK?

      In this version, we highlight that CCK directly activates FSH cells but also affects LH cells to some extent. However, it is clear that the effect on FSH cells is more significant.

      Figures 5A and B: With larger Ns, some of the trends might be significant (e.g., GnRH stimulated FSH release and CCK stimulated LH release).

      Although there is a trend, the values on the Y axis reveal that the response of FSH to GnRH and of LH to CCK is lower than the distribution of the basal response (the "before" values) in all of the graphs. Hence, we do not believe that a larger N would affect those results. We added the range of the secreted hormone concentrations to the description of the results to emphasize the difference in values.

      Figures 5C and D: What explains the lack of an increase in LH secretion following GnRH treatment?

      We did not measure LH secretion in the plasma, as we did not have enough blood; we do see an increase in LH transcription (see supplementary figure 5 – figure supplement 1).

      Also, as mRNA levels were measured (in C), reference should be made to expression rather than transcription. Not all changes in mRNA levels reflect changes in transcription. Also, remove transcription from the legend. Reference to supplementary Figure 4 in the legend should be supplementary Figure 6. Finally, in C and D, distinguish males from females (as in 5A and B).

      Modifications were made according to the reviewer's suggestions.

      Figure legends:

      The figure legends are very long. One way to shorten them is to remove descriptions of the results. The legends should indicate what is in each figure, not the results of the experiments.

      Modifications were made according to the reviewer's suggestions.

      Sample sizes should be spelled out in the legends, as they are not in the M&M.

      We made sure all sample sizes are mentioned in the legend

      Materials and Methods:

      Section 1.1 can be removed as it repeats content presented elsewhere.

      This section was removed

      Section 1.5: It is unclear what this means: "blinding was not applied to ensure tractability" Please clarify.

      This section was removed

      Reviewer #2 (Recommendations For The Authors):

      It appears that zebrafish have two ligands: ccka, cckb. Also multiple receptors: cckar, cckbra and cckbrb. Authors need to discuss this and clearly state which ligand and which receptor they are referring to in the manuscript.

      We discussed the receptor type in the first paragraph of the results, the exact synthetic peptide used is described in the methods. The 8 amino acids of the mature CCK peptide are the same between CCKa and CCKb. A sentence regarding the specificity of the antibody to the mature CCK peptide was added in line 101.

      "to GnRH puff application (300 μl of 30 μg/μl)"; (250 μl of 30 μg/ml CCK)

      Please give the final concentration to make it easy on the readers of the data.

      The molarity of the final concentration was added.
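
      As an illustration of the conversion requested above, a minimal sketch follows. It assumes the peptide is sulfated CCK-8 with a molecular weight of about 1143.3 g/mol; that value and the function name are assumptions for illustration, not taken from the manuscript, and the datasheet of the actual synthetic peptide should be used instead.

```python
def mass_conc_to_micromolar(conc_ug_per_ml: float, mw_g_per_mol: float) -> float:
    """Convert a mass concentration in ug/ml to a molar concentration in uM."""
    # ug/ml equals mg/L; mg/L divided by g/mol gives mmol/L; x1000 gives umol/L.
    return conc_ug_per_ml / mw_g_per_mol * 1000.0


# Assumed molecular weight for sulfated CCK-8 (hypothetical; check the peptide datasheet).
CCK8_MW_G_PER_MOL = 1143.3

puff_uM = mass_conc_to_micromolar(30.0, CCK8_MW_G_PER_MOL)
print(f"30 ug/ml CCK is roughly {puff_uM:.1f} uM")
```

      Note that this gives the concentration of the applied solution; the concentration reaching the tissue also depends on the puff volume and the bath volume.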

      (2.4) Differential calcium response underlies differential hormone. This section is a bit confusing to read, for example:

      "For that, we collected the medium perfused through our ex vivo system (Fig. 2a) and measured LH and FSH levels using a specific ELISA validated for zebrafish [31] while monitoring the calcium activity of the cells."

      So the authors did the ELISA while monitoring the activity (?). This sentence does not make sense: please rewrite it.

      We modified this sentence in lines 308-311.

      To functionally validate the importance of CCK signalling we used CRISPR-cas9 to generate loss-of-function (LOF) mutations in the pituitary- CCK receptor gene.

      The authors need to clearly state WHICH gene they inactivated: Zebrafish have three CCK-receptors, so "the pituitary receptor gene" needs to be defined.

      This was added again in line 107 and is mentioned in the methods.

      Figure 3 is a crucial figure!

      Figure 3B: The data are not very convincing. Please state how thick the sections are in the figure legend (assuming these are adult pituitaries),

      Slice thickness and the use of adult fish were added to the legend (Figure 1C in the new version).

      Please show at least the merged image a high magnification view of the co-localization of the receptor with the cells.

      This is Figure 1 in the new revision; a magnified image was added.

      Please give the scale bar size for 3B.

      Scales for all images were added

      Figure 3C: the co-localization of the terminals of the CCK and FSH cells shows very few cells expressing close to terminals.

      Important: Because the labelling of the terminals with anti-CCK looks a lot like the background, it is very important to show the control (anti-CCK antibody pre-absorbed with the peptide). The authors should have these data. The photo needs to have been taken at the same gain (contrast) and the photo showing the terminals.

      This is a commercial antibody that has been previously validated for CCK in fish. The co-localization pattern resembles GnRH innervation in the pituitary. In fish, when hypothalamic neurons innervate the pituitary they do not innervate all the cells; as this is an endocrine system, the peptide can travel to neighbouring cells via diffusion or aided by blood flow (Golan, Zelinger et al. 2015). The images reveal the direct CCK innervation of the pituitary and its proximity to FSH cells.

      Figure 4c, on right. The text seems to be stretched as if the photo was adjusted without locking the aspect ratio. Please check the original images.

      This has been fixed

      Can the authors use different pseudo colours? Differentiating a double label of white versus yellow is very difficult, and thus the photo is not very convincing.

      This has been changed to green and magenta.

      What is meant by "CCK-AB" antibody? Perhaps anti-CCK would be a better label

      This has been fixed

      Figure 5A: increase the magnification of the insets; the structure of the gonads is very difficult to see with clarity in these low mag images. The most obvious way to improve this figure is to reduce or eliminate the pie graph (not really necessary) and show a high magnification (and larger) image of the gonadal structure.

      This is figure 1 in the new version, with magnification of the gonad next to each body section.

      Discussion:

      "Moreover, in the zebrafish, as well as in other species, the functional overlap in gonadotropin signalling pathways is not limited to the pituitary but is also present in the gonad, through the promiscuity of the two gonadotropin receptors"

      The reasoning of this sentence is not clear: zebrafish do not use GnRH to control reproduction: they lack GnRH1 through genomic rearrangement (see Whitlock, Postlethwait and Ewer 2019) and KO of GnRH2/GnRH3 does not affect reproduction.

      While GnRH KO models indicate redundancy of GnRH in this axis in zebrafish, there is also ample evidence for its importance in regulating reproduction, such as its effect on gonadotropins (Golan, Martin et al. 2016) and its use in spawning induction in fish (Mizrahi and Levavi-Sivan 2023). We believe it is currently too soon to conclude that GnRH signalling is completely irrelevant to reproduction in cyprinids.

      Reviewing Editor (Recommendations For The Authors):

      It would be interesting to see calcium imaging experiments in the CCKR receptor mutants to establish a more direct connection between peptide action and activity.

      We added a receptor assay showing that the mutated receptors are not activated by CCK (supplementary figure 1), and compared them to the wild type, which is activated. This shows that: 1) CCK directly activates the receptor we identified in FSH cells; 2) the mutated receptors are inactive.

      "all homozygous fish (CCKR+12/+7/-1/ CCKR+12/+7/-1, n=12)"

      It may be better to write the genotype of fish separately as CCKR+12/+12, CCKR+7/+7 and CCKR-1/-1, n=12) otherwise it seems as if all alleles occurred together in the same fish.

      Modified according to the reviewer request

      In Figure 1 scale bar legends are very small. 

      Descriptions of the scale bars were added to all the legends.

      Figure 1 legend "On the top right of each panel is the gender distribution" - fish have no gender but sex.

      Modified according to the reviewer request

      The authors should endeavour to improve the presentation of the figures. They should use a sans-serif font and check that text is not cut at the edge of figure panels, that scale bars are uniform and clearly labelled and fonts are of similar size and clearly legible. E.g. labels of the fish brain of Fig3A are very small.

      We modified all the figures to adapt the font and the scales, we increased the size of the image in Figure 3a to make the labels clearer.

      Please use the elife format to name supplementary figures, as Figure X - Figure Supplement Y (each supplement associated with one of the main figures).

      Fixed

      Peptide concentrations in the ex vivo experiments should also be given as molar concentrations not only as '250 μl of 30 μg/ml CCK'.

      Fixed

      "In contrast, FSH cells responded with a very low calcium rise in hormonal secretion in response to GnRH" - a very low rise in hormonal secretion

      Fixed

      Please clarify why you used a GnRH synthetic agonist and not the native peptide.

      It is commonly used for spawning induction in fish (line 245); it has also been shown to directly affect the secretion of LH and FSH (Biran, Golan et al. 2014, Biran, Golan et al. 2014, Mizrahi, Gilon et al. 2019). This was added to line 245.

      References

      Ball, J. (1981). "Hypothalamic control of the pars distalis in fishes, amphibians, and reptiles." General and comparative endocrinology 44(2): 135-170.

      Biran, J., M. Golan, N. Mizrahi, S. Ogawa, I. S. Parhar and B. Levavi-Sivan (2014). "Direct regulation of gonadotropin release by neurokinin B in tilapia (Oreochromis niloticus)." Endocrinology 155(12): 4831-4842.

      Biran, J., M. Golan, N. Mizrahi, S. Ogawa, I. S. Parhar and B. Levavi-Sivan (2014). "LPXRFa, the Piscine Ortholog of GnIH, and LPXRF Receptor Positively Regulate Gonadotropin Secretion in Tilapia (Oreochromis niloticus)." Endocrinology 155(11): 4391-4401.

      Golan, M., A. O. Martin, P. Mollard and B. Levavi-Sivan (2016). "Anatomical and functional gonadotrope networks in the teleost pituitary." Scientific Reports 6: 23777.

      Golan, M., E. Zelinger, Y. Zohar and B. Levavi-Sivan (2015). "Architecture of GnRH-Gonadotrope-Vasculature Reveals a Dual Mode of Gonadotropin Regulation in Fish." Endocrinology 156(11): 4163-4173.

      Mizrahi, N., C. Gilon, I. Atre, S. Ogawa, I. S. Parhar and B. Levavi-Sivan (2019). "Deciphering Direct and Indirect Effects of Neurokinin B and GnRH in the Brain-Pituitary Axis of Tilapia." Front Endocrinol (Lausanne) 10: 469.

      Mizrahi, N. and B. Levavi-Sivan (2023). "A novel agent for induced spawning using a combination of GnRH analog and an FDA-approved dopamine receptor antagonist." Aquaculture 565: 739095.

      Uehara, S. K., Y. Nishiike, K. Maeda, T. Karigo, S. Kuraku, K. Okubo and S. Kanda (2023). "Cholecystokinin is the follicle-stimulating hormone (FSH)-releasing hormone." bioRxiv: 2023.2005.2026.542428.

      Webb, K. A., Jr., I. A. Khan, B. S. Nunez, I. Rønnestad and G. J. Holt (2010). "Cholecystokinin: molecular cloning and immunohistochemical localization in the gastrointestinal tract of larval red drum, Sciaenops ocellatus (L.)." Gen Comp Endocrinol 166(1): 152-159.

    1. Primer Validation

      More detail is needed here; include how you determined the limit of detection of your assay. State that you used standard curves to estimate the limit of detection (LOD), but see Klymus et al. (2020). Given that your assays are for qualitative purposes, the limit of quantification (LOQ) is likely not relevant in your case. Please verify this, and clarify in the main text why the qPCR efficiency may be irrelevant for your assays while the LOD is relevant.

      Depending on who you will get as an examiner, it may be worthwhile to also mention that you did the testing according to the MIQE guidelines, which I think were incorporated into this paper (see their Appendix S1 for the checklist):

      • Thalinger, B., Deiner, K., Harper, L. R., Rees, H. C., Blackman, R. C., Sint, D., ... & Bruce, K. (2021). A validation scale to determine the readiness of environmental DNA assays for routine species monitoring. Environmental DNA, 3(4), 823-836.

      • Bustin, S. A. (2024). Improving the quality of quantitative polymerase chain reaction experiments: 15 years of MIQE. Molecular aspects of medicine, 96, 101249.

      • Klymus, K. E., Merkes, C. M., Allison, M. J., Goldberg, C. S., Helbing, C. C., Hunter, M. E., Jackson, C. A., Lance, R. F., Mangan, A. M., Monroe, E. M., Piaggio, A. J., Stokdyk, J. P., Wilson, C. C., & Richter, C. A. (2020). Reporting the limits of detection and quantification for environmental DNA assays. Environmental DNA, 2, 271–282. https://doi.org/10.1002/edn3.29
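
      One way to report an LOD from standard-curve replicates is sketched below. It assumes the common discrete definition (the lowest standard concentration at which at least 95% of replicates amplify, with every higher standard also meeting that rate); the function names and the example dilution series are hypothetical, and the definition actually adopted should be checked against Klymus et al. (2020).

```python
from collections import defaultdict


def detection_rates(replicates):
    """Map each standard concentration to the fraction of replicates that amplified."""
    counts = defaultdict(lambda: [0, 0])  # concentration -> [detected, total]
    for conc, detected in replicates:
        counts[conc][0] += int(detected)
        counts[conc][1] += 1
    return {conc: hit / total for conc, (hit, total) in counts.items()}


def estimate_lod(replicates, min_rate=0.95):
    """Lowest concentration meeting min_rate, walking down from the highest standard."""
    rates = detection_rates(replicates)
    lod = None
    for conc in sorted(rates, reverse=True):
        if rates[conc] < min_rate:
            break  # stop at the first standard that falls below the threshold
        lod = conc
    return lod


# Hypothetical dilution series: (copies per reaction, amplified?) with 8 replicates each.
example = (
    [(1000, True)] * 8
    + [(100, True)] * 8
    + [(10, True)] * 8
    + [(1, True)] * 3
    + [(1, False)] * 5
)
lod = estimate_lod(example)
print(f"Estimated LOD: {lod} copies per reaction")
```

      A discrete estimate like this depends only on detection frequency, not on amplification efficiency, which is one way to frame why the LOD matters for a presence/absence assay.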


    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      (1) The technology requires a halo-tagged derivative of the active compound, and the linked position will have a huge impact on the potential "target hits" of the molecules. Given the fact that most of the active molecules lack structure-activity relationship information, it is very challenging to identify the optimal position of the halo tag linkage.

      We appreciate your insightful comment. While finding the optimal position to attach a chemical linker to a small molecule of interest is indeed a challenging but necessary step, this is a common difficulty across all target-ID methods, except for those that are modification-free, as we described in Discussion. However, modification-free approaches such as DARTS, CETSA, and TPP have their own limitations, such as low sensitivity and a high false-positive rate. Additionally, DARTS and SPROX are limited to use with cell lysates. Please refer to the introduction in our manuscript for more details on these approaches. On the other hand, synthesizing HTL derivatives is relatively straightforward compared to other modifications, and we provide helpful guidelines for chemical linker design, provided the optimal chemical moiety has been identified, which is crucial for target identification. We selected dasatinib and HCQ/CQ as model compounds because previous studies offered insights into their derivative synthesis. Our data also show that DH5 retains strong kinase inhibitory activity (Figure 4—figure supplement 2), and DC661-H1 demonstrates potent inhibition of autophagy (Figure 6—figure supplement 1). For novel compounds, conducting a thorough structure-activity relationship (SAR) study is essential to determine the optimal position for HTL derivative synthesis.

      (2) Although POST-IT works in zebrafish embryos, there is still a long way to go for the broad application of the technology in other animal models.

      Thank you for your constructive comment. Yes, there is still a long way to go in developing the POST-IT system for broader applications in other animal models, especially in mice. However, we hope that our study provides valuable insights and inspiration to scientists and experts for applying the POST-IT system in various models. We are also committed to further improving its applicability.

      (3) The authors identified SEPHS2 as a new potential target of dasatinib and further validated the direct binding of dasatinib with this protein. However, considering the super strong activity of dasatinib against c-Src (sub nanomolar IC50 value), it is hard to conclude the contribution of SEPHS2 binding (micromolar potency) to its antitumor activity.

      Thank you for your insightful comment. We agree that the anticancer activity of dasatinib primarily results from inhibiting tyrosine kinases such as SRC and ABL. However, SEPHS2 contains an “opal" termination codon, UGA, at the 60th amino acid residue, which codes for selenocysteine. Due to the technical challenge of expressing selenoproteins in E. coli, we mutated it to cysteine for expression in E. coli to avoid premature translation termination, as described in the Materials and Methods section. Although the purified recombinant SEPHS2 shows a Kd of about 10 µM for dasatinib, the binding affinity to endogenous SEPHS2 may be higher since selenocysteine is larger and more electronegative than cysteine. This presents an interesting area for future investigation. Furthermore, our study of dasatinib’s binding to SEPHS2 could help facilitate the development of new SEPHS2 inhibitors, potentially targeting the active site of SEPHS2.
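
      To give a sense of scale for the affinities discussed above, the following sketch computes fractional occupancy under a simple one-site equilibrium binding model, assuming the free drug concentration is approximately equal to the total. The 10 µM Kd is the value reported above for recombinant SEPHS2; the 1 nM Kd for a high-affinity kinase and the 0.1 µM drug concentration are illustrative assumptions only, not measurements from this study.

```python
def fractional_occupancy(ligand_uM: float, kd_uM: float) -> float:
    """Fraction of target bound at equilibrium for one-site binding: [L] / ([L] + Kd)."""
    return ligand_uM / (ligand_uM + kd_uM)


SEPHS2_KD_UM = 10.0    # Kd reported for the recombinant (Sec-to-Cys) SEPHS2
KINASE_KD_UM = 0.001   # hypothetical 1 nM Kd for a high-affinity kinase target
DRUG_UM = 0.1          # hypothetical free drug concentration

for name, kd in [("SEPHS2", SEPHS2_KD_UM), ("kinase", KINASE_KD_UM)]:
    print(f"{name}: {fractional_occupancy(DRUG_UM, kd):.1%} occupied at {DRUG_UM} uM")
```

      Under these assumptions a low-affinity target is occupied only marginally at concentrations that saturate a high-affinity one, which is why the affinity of endogenous, selenocysteine-containing SEPHS2 would need to be measured directly.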

      Reviewer #3 (Public review):

      (1) Target Specificity: It is crucial for the authors to differentiate between the primary targets of the POST-IT system and those identified as side effects. This distinction is essential for assessing the specificity and utility of the technology.

      Thank you for your insightful comment. Drugs inevitably bind to various proteins with differing affinities, which can contribute to both side effects and beneficial outcomes. Typically, the primary targets exhibit high affinities. In this manuscript, we ranked the identified protein targets of DH5 based on affinity from mass spectrometry and p-values (Fig. 5A), and for DC661-H1, we used the SILAC ratio (Fig. 6A). We also individually assessed many drug-protein binding affinities using the MST assay, as well as in vitro and in cellulo assays, demonstrating their specificity. Moreover, we believe it is essential to identify as many protein targets as possible at physiological drug concentrations to better understand the drug’s side effects. Of course, further investigation is required to assess the roles and effects of these target proteins.

      (2) In Vivo Target Identification: The manuscript lacks detailed clarity on which specific targets were successfully identified in the in vivo experiments. Expanding on this information would provide a clearer view of the system's effectiveness and scope in complex biological settings.

      Thank you for your insightful comment regarding in vivo target identification. In this manuscript, we utilized a cell line as the primary method for in vivo target identification and validation after optimizing our system in test tubes. We successfully validated many of the targets identified using our POST-IT system (Figure 6—figure supplement 3). To demonstrate the proof of principle for in vivo application, we employed zebrafish embryos as an in vivo model, showing that endogenous SRC can be effectively pulled down by DH5 treatment (Fig. 7). While we could have explored the entire proteome to identify endogenous target proteins in zebrafish that bind to DH5 or dasatinib, we felt this would extend beyond our original scope, given that we have already demonstrated POST-IT’s ability to identify target proteins for dasatinib. Specific target identification and validation are crucial when using zebrafish for drug discovery. Additionally, we acknowledge that drugs likely interact with a range of protein targets in living organisms and may undergo metabolism and interactions within the circulatory system, which we address in our discussion.

      (3) Reproducibility and Scalability: Discussion on the reproducibility of the POST-IT system across various experimental setups and biological models, as well as its scalability for larger-scale drug discovery programs, would be beneficial.

      Thank you for the suggestion. While our system has shown high reproducibility in our experiments, further improving both reproducibility and scalability would be advantageous. One potential approach to address this is through the generation of stable-expressing cell lines and transgenic zebrafish lines, which we have discussed in the revised manuscript. Establishing stable cell lines with robust POST-IT expression could enhance scalability for drug discovery applications.

      (4) Quantitative Analysis: A more detailed quantitative analysis of the protein interactions identified by POST-IT, including statistical significance and comparative data against other technologies, would enhance the manuscript.

      Thank you for your suggestion. In our assessment of drug-protein affinity, we included Kd values as quantitative measures using MST assays. The protein targets of dasatinib identified through mass spectrometry are also accompanied by p-values for quantitative analysis (Fig. 5A), and the detailed procedures are described in the Materials and Methods section. While it is challenging to provide direct comparative data against other technologies, our system successfully identified many known target proteins for dasatinib, as well as SEPHS2 and VPS37C as new targets for dasatinib and for HCQ/CQ, respectively, which were not detected by other methods.

      (5) Technological Limitations: The authors should discuss any limitations or potential pitfalls of the POST-IT system, which would be crucial for future users and for guiding subsequent improvements.

      Thank you for your insightful suggestion. We agree that clearly defining the technological limitations is important. Therefore, we have expanded our original discussion on the limitations of our POST-IT system (Discussion section, paragraph 6).

      (6) Long-Term Stability and Activity: Information on the long-term stability and activity of the POST-IT components in different biological environments would ensure the reliability of the system in prolonged experiments.

      Yes, this is an important question. We did not notice any stability or toxicity issues with Halo-PafA and Pup substrates in HEK293T cells or zebrafish, which is an important factor for stable cell lines and transgenic zebrafish lines. However, HTL derivatives of the drug could be toxic or unstable due to the nature of the drug or its metabolism, which needs to be taken into account when designing experiments, and we have included this in the Discussion.

      (7) Comparison with Existing Technologies: A detailed comparison with existing proximity tagging and target identification technologies would help position POST-IT within the current landscape, highlighting its unique advantages and potential drawbacks.

      We appreciate your valuable feedback and agree that such comparisons are crucial. We have included a detailed overview and comparison of existing proximity-tagging systems and their related target identification technologies in the Introduction (lines 78-100) and Discussion (lines 391-412), highlighting their respective pros and cons. Additionally, we have expanded the discussion to further compare these technologies with our POST-IT system, addressing its advantages and limitations (lines 378-390, lines 448-467). We hope this provides sufficient context and information to effectively position POST-IT among the landscape of proximity-tagging target identification technologies.

      (8) Concerns Regarding Overexposed Bands: Several figures in the manuscript, specifically Figure 3A, 3B, 3C, 3F, 3G, Figure 4D, and the second panels in Figure 7C as well as some figures in the supplementary file, exhibit overexposed bands.

      We appreciate your astute observation regarding the overexposed bands and apologize for any confusion. The “overexposed” bands represent the unpupylated proteins, while the bands above them correspond to the pupylated proteins. We intended to clearly show both pupylated and unpupylated bands, although the latter are generally much weaker. We are currently working on further improving our POST-IT system to enhance pupylation efficiency.

      (9) Innovation Concern: There is a previous paper describing a similar approach: Liu Q, Zheng J, Sun W, Huo Y, Zhang L, Hao P, Wang H, Zhuang M. A proximity-tagging system to identify membrane protein-protein interactions. Nat Methods. 2018 Sep;15(9):715-722. doi: 10.1038/s41592-018-0100-5. Epub 2018 Aug 13. PMID: 30104635. It is crucial to explicitly address the novel aspects of POST-IT in contrast to this earlier work.

      Thank you for bringing this to our attention. Proximity-tagging systems like BioID, TurboID, NEDDylator, and PafA (Liu Q et al., Nat Methods 2018) were initially developed to study protein-protein interactions or identify protein interactomes, as these applications are of broader interest and generally easier to implement. However, applying proximity-tagging systems for small molecule target identification requires significant optimization. As described in the introduction (lines 78-100), target protein identification systems have since been developed using TurboID and NEDDylator (Tao AJ et al., Nat Commun 2023; Hill ZB et al., J Am Chem Soc 2016). It is conceivable that a PafA-based proximity-tagging system could also be adapted for target-ID, and other groups may pursue this approach in the future. Although the PafA-Pup system shows great promise for target-ID applications, extensive optimization was needed to enable its use for this purpose. Finally, we demonstrate that POST-IT offers distinct advantages over other proximity-tagging-based target-ID systems. For more details, please refer to the introduction and discussion sections.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) Figure 1- Figure Supplement 1A: The Pup substrate "HB-Pup" is mentioned, but the main text or figure legend provides no introduction or description.

      We appreciate your astute observation. We have added a description in the main text and figure legend as follows: “…and used HB-Pup as a control, which contains 6×His and BCCP at the N terminus of Pup” in the main text (line 142) and “HB, TS, and SBP refer to 6×His and BCCP, twin-STII (Strep-tag II), and streptavidin binding peptide, respectively.” in the legend of Figure 1-figure supplement 1A.

      (2) Figure 1 - Figure Supplement 3B: The authors used TS-sPupK61R as a substrate but did not explain why. The main text mentions that mutating sPup alone did not affect polypupylation, raising the question of why TS-sPupK61R was used in this figure. Furthermore, while the authors state that polypupylation becomes evident after 1 hour of incubation (more pronounced after 2 or 3 hours), the reactions here were conducted for only 30 minutes.

      Thank you for your question. The experiment in Figure 1 - Figure Supplement 3B was conducted to test self-pupylation levels in the different Halo-PafA derivatives. For this purpose, any Pup substrate could be used, such as SBP-sPup and SBPK4R-sPupK61R instead of TS-sPup and TS-sPupK61R, as they do not show any differences in pupylation activity. We chose TS-sPup and TS-sPupK61R simply because any Pup substrate would have served this purpose. Similarly, we did not need to incubate the reaction for a longer time to detect polypupylation, as our intention was to test “self-pupylation”. We demonstrated in Figure 1 – figure supplement 2 that polypupylation is dependent on the number or position of lysine residues in the Pup substrate or tags. The results clearly showed that self-pupylation was almost completely abolished by the Halo8KR mutation. To clarify this, we added the following description in lines 168-169: “TS-sPup and TS-sPupK61R were chosen as sPup substrates for this experiment, although any Pup substrates could have been used. The levels of self-pupylation were assessed.”

      (3) Line 156: The statement that "the TS-tag completely abolished polypupylation in TS-sPup" is inaccurate. Using TSK8R-sPupK61R as the substrate, several bands appear, which likely represent Halo-PafA with varying degrees of polypupylation. Some bands also appear to correspond to those seen when using TS-sPup as a substrate. The authors should clarify how they distinguish between multipupylation and polypupylation in this case.

      We sincerely appreciate your insight into clarifying the distinction between multipupylation and polypupylation. Polypupylation refers to the addition of a new Pup onto a previously linked Pup on the target protein, akin to polyubiquitination. In contrast, multipupylation involves multiple single pupylations at different positions on the target proteins. Since pupylation occurs exclusively at lysine residues in tag-Pup substrates, mutating all lysine residues to arginine, as in TSK8R-sPupK61R, prevents the mutant tag-Pup from linking to another Pup. This means that only single pupylation can proceed with this type of mutant Pup substrate. If multiple pupylated bands are observed with this mutant substrate, it indicates “multipupylation” rather than “polypupylation”, as shown in Figure 1-figure supplement 2D. The same applies to the pupylation bands in Figure 1-figure supplement 2E and F, as sSBP-sPupK61R and SBPK4R-sPupK61R lack lysine residues. By comparing these multipupylation bands, it is also possible to distinguish them from polypupylation bands, which are marked by yellow arrows. However, after 2-3 pupylation bands, higher-order bands become increasingly difficult to distinguish.

      To clarify the mutation in the TS-tag, we revised the sentence in line 156 from “However, further mutations within the TS-tag completely abolished polypupylation in TS-sPup” to “However, further mutations of two lysine residues within the TS-tag, creating TSK8R-sPupK61R, completely abolished polypupylation in TS-sPup”. Additionally, we have inserted sentences in line 152 to define polypupylation and multipupylation, as described here.

      (4) Line 160: Similar to the above concern about line 156, the claim that SBPK4R and sSBP completely prevented polypupylation is unconvincing and requires more supporting evidence.

      Thank you for raising this concern. As mentioned above, both SBPK4R and sSBP lack lysine residues required for pupylation. As a result, these mutants can only undergo multiple single pupylations on the lysine residues of the target protein, which leads to “multipupylation”. In Figure 1-figure supplement 2E and F, pupylation bands by sSBP-sPupK61R or SBPK4R-sPupK61R do not display doublet bands (one from multipupylation and the other from polypupylation), as seen with SBP-sPup, marked by yellow arrows. Notably, Halo-PafA containing polypupylated branches migrates more slowly than one with an equal number of multipupylation events. To clarify this point, we have added the phrase “as shown in sSBP-sPupK61R and SBPK4R-sPupK61R” at the end of the sentence in line 160.

      (5) Lines 176-177: The authors claim that PafAS126A exhibited reduced polypupylation compared to PafA, but given that PafAS126A may reduce depupylase activity, how could it reduce polypupylation levels? Moreover, it is hard to find any data supporting this conclusion in Figure 1 - Figure Supplement 3B.

      We appreciate your insightful comment. At this point, we do not fully understand how the mutation that reduces depupylase activity also decreases polypupylation. It is possible that PafAS126A has a lower preference for pupylated Pup as a prey, which is required for polypupylation, since depupylase activity depends on recognizing pupylated Pup as a prey to remove it. Nonetheless, Halo-PafAS126A shows reduced levels of higher molecular weight bands compared to Halo-PafA, as shown in Figure 1-figure supplement 3B, while exhibiting increased pupylation in lower molecular weight bands, which represent either multipupylation or low-degree polypupylation. Since higher molecular weight bands (> 150 kD) are likely due to polypupylation, this result suggests reduced polypupylation and increased multipupylation in Halo-PafAS126A. To clarify this in the main text, we have added the following description in line 177: “as evidenced by the decreased levels of high molecular weight bands and an increase in low molecular weight bands”

      (6) POST-IT system in cellulo validation: The system was developed using the Halo-tag, yet the in-cell validation uses FRB and FKBP instead, without explaining this switch. This inconsistency makes the logic of the experiment unclear.

      We appreciate your insightful comment. The interaction between rapamycin and FRB or FKBP is known to be highly specific and robust, making this system useful in various biological contexts. Due to this property, rapamycin can induce interaction between two proteins when one is fused with FRB and the other with FKBP. Before testing or optimizing the POST-IT system in cells, we hypothesized that using the rapamycin-induced interaction between FRB and FKBP could introduce pupylation of the target protein, provided that PafA is fused with FRB or FKBP and the target protein is fused with the other. The results demonstrate that PafA can introduce pupylation of the target protein in a proximity-dependent manner via this chemically induced interaction. To further clarify this in the main text, we modified the original sentence in lines 214-216 as follows: “To mimic drug-target interaction-induced pupylation in live cells and assess the potential of PafA as a proximity-tagging system for target-ID, we incorporated the rapamycin-induced interaction between FRB and FKBP into our PL system, as this interaction between a small molecule and a protein is known to be highly specific and robust (Figure 3—figure supplement 1A).”

      (7) Line 209: The authors decided to use the SBP-tag for further studies due to better performance, but in Figure 3 - Figure supplement 1, they still used the unintroduced HB-Pup as the substrate, which is confusing and lacks explanation.

      Thank you for raising your question. The SBP-tag is not superior to the TS-tag in terms of pupylation activity. However, the TSK8R mutant cannot bind to Strep-Tactin beads, while the SBP mutants, SBPK4R and sSBP, can bind to streptavidin. Therefore, we chose the SBP-tag instead of the TS-tag for further studies as a Pup substrate in the POST-IT system, as we needed to pull down the target proteins. HB-Pup is consistently used as a control throughout various experiments, as it is the original Pup substrate. In Figure 3-figure supplement 1B and C, HB-Pup was used to test chemically induced pupylation by PafA. In these cases, the choice of Pup substrate was not critical. Furthermore, we compared HB-Pup and different SBP-sPup substrates in Figure 3-figure supplement 1D, where HB-Pup was used as a control or for comparison. Although pupylation bands with HB-Pup appear more robust, this substrate contains multiple lysine residues, leading to high levels of polypupylation. To make this clear, we modified the sentence in line 209 to “Therefore, we decided to use the SBP-tag as a Pup substrate in the POST-IT system for further studies.”

      (8) Line 220: Both SBP-sPup and SBPK4R-sPupK61R are described as exhibiting efficient pupylation, but the data show mostly self-pupylation and little to no pupylation of the target protein.

      Thank you for your concern. However, pupylation of the target protein is actually quite substantial, as the intensities of the free form and pupylated proteins are relatively similar, as shown in the upper panel of Figure 3-figure supplement 1D. Self-pupylation is always much higher than target pupylation, because PafA constantly pupylates itself, whereas pupylation of the target protein occurs only through interaction. Furthermore, V5-FRB-mKate2-PafA contains many lysine residues, which increases the levels of self-pupylation.

      (9) Lines 222-224: The authors chose SBPK4R-sPupK61R to avoid polypupylation, although SBP-sPup did not cause detectable polypupylation. Neither substrate caused pupylation of the target protein, so the rationale behind this choice is unclear.

      Thank you for raising your question. Similar to the above comment (#8), please refer to the pupylation bands of the target protein, as shown in the upper panel of Figure 3-figure supplement 1D. The pupylation band of the target protein is quite remarkable, as the intensities of the free form and pupylated proteins are comparable. Additionally, there are no multiple pupylation bands in either case, except for one additional weak multipupylation band, indicating no polypupylation by SBP-sPup, which does not have K-to-R mutations. Of course, SBPK4R-sPupK61R can only undergo single pupylation, as it does not contain lysine residues. Although we did not observe polypupylation by SBP-sPup in this experimental condition, it is possible that SBP-sPup may cause polypupylation under different experimental conditions or with other target proteins. Since SBPK4R-sPupK61R pupylates the target protein to a degree comparable to SBP-sPup, at least in this experimental setting, we selected SBPK4R-sPupK61R as the Pup substrate for the POST-IT system to avoid any potential polypupylation that could be caused by SBP-sPup in other cases. We believe that polypupylation can introduce bias into the analysis and hinder the comprehensive discovery of additional target proteins for small molecules.

      (10) Line 224: The authors conclude that rapamycin greatly reduced self-pupylation, but the supporting data are unclear.

      Thank you for your constructive comments on our manuscript. Please refer to the lower panel of Figure 3-figure supplement 1D. When using either SBPK4R-sPupK61R or SBP-sPup, rapamycin treatment results in reduced levels of self-pupylation compared to the no-treatment control. However, we did not observe this reduction with HB-Pup and do not know the reason. To clarify this in the main text, we added the following description to the end of the sentence: “when using either SBPK4R-sPupK61R or SBP-sPup, as shown in the lower panel of Figure 3—figure supplement 1D”

      (11) Line 234: The authors selected an 18-amino acid linker, but given that linkers longer than 10 amino acids enhance labeling, this choice should be explained.

      Thank you for raising your question. In fact, a linker of 10 amino acids (aa) or longer is likely to behave similarly. We chose an 18 aa linker instead of a 40 aa linker primarily for the convenience of cloning and to reduce the potential for DNA sequence recombination associated with longer repeats. Additionally, a longer, flexible linker may behave like an intrinsically disordered protein (Harmon et al., 2017), which can lead to unwanted protein-protein interactions or phase separation. To elaborate on this, we added the following sentences after the sentence in line 233-235: “We chose the 18-amino acid linker instead of the 40-amino acid linker for easier cloning and to lower the risk of DNA recombination from longer repeats. Additionally, a longer, flexible linker may behave like an intrinsically disordered protein (Harmon et al., 2017), an unwanted feature for target-ID.”

      (12) S126A and K172R mutations: The authors claim that these mutations additively enhanced pupylation under cellular conditions, but in Figure 3B, the band intensities appear similar for the wild-type and mutant versions.

      Thank you for raising your concern. Although a single pupylation band appears similar among the three different Halo-PafA proteins, multipupylation bands are slightly but noticeably increased by the S126A and K172R mutations compared to Halo8KR-PafA. Since we used SBPK4R-sPupK61R as a Pup substrate, all higher molecular weight bands result from multipupylation rather than polypupylation. This illustrates why it is preferable to use SBPK4R-sPupK61R over SBP-sPup, as the pupylation bands with SBP-sPup are mixtures of poly- and multipupylation, making it difficult to assess levels of target labeling. To clarify this in the main text, we added the following description after the sentence in line 236: “as the higher molecular weight multipupylation bands are slightly but noticeably increased with these mutations compared to Halo8KR-PafA”

      (13) Line 263: The authors selected DH5 for further experiments due to its efficiency, but the data suggest that the performance of DH1 to DH5 is similar.

      We appreciate your question about the different dasatinib HTL derivatives. However, our data clearly show that DH2-5 derivatives bind significantly more effectively to Halo-PafA in vitro and in live cells compared to DH1 (Figure 4A and B). Additionally, the DH2-5 derivatives result in dramatically increased pupylation of the target protein in vitro and noticeable enhancement in live cells (Figure 4C and D). Among DH2 to DH5, there is no obvious difference in binding to Halo-PafA or pupylation of the target protein. Therefore, we chose DH5, as we believe that the longer linker in DH5 may facilitate the binding of a more diverse range of target proteins to dasatinib, enabling the discovery of additional target proteins.

      (14) Line 309: The authors introduce HCQ and CQ as important drugs but then investigate the mechanism using DC661 without introducing or justifying the choice of this compound.

      Thank you for your point. In line 310, we explained our reason for choosing DC661, a dimeric form of CQ, instead of CQ for the synthesis of an HTL derivative: “assuming that a dimer would enhance binding affinity as previously described.” Dimeric forms of drugs or small molecules, such as testosterone dimers, estrogen dimers, and numerous anticancer drug dimers, have often been developed to enhance drug effects (Paquin A et al., Molecules 2021). Similarly, dimer forms of HCQ/CQ have been introduced and shown to be more potent (Hrycyna CA et al., ACS Chem Biol 2014; Rebecca VW et al., Cancer Discovery 2019). We expected that using a dimer form might offer a higher probability of identifying target proteins for HCQ/CQ.

      (15) The authors suggest that multipupylation levels were enhanced but do not explain whether this might benefit the system or introduce other issues. Clarifying this point would provide valuable insight for potential users of this system.

      Thank you for your thoughtful suggestion. Polypupylation likely leads to biased enrichment of a limited set of target proteins, and its levels may not correlate with the binding affinity of target proteins to the small molecule of interest, features that can negatively impact target-ID. In contrast, multipupylation may be correlated with binding affinity or interaction frequency, as we observed increased levels of multipupylation with higher Pup concentrations and longer incubation times. This suggests that target proteins with multiple lysines in proximity to PafA can be sequentially pupylated, starting with the most accessible lysine. However, if a target protein has only one accessible lysine, pupylation will occur only once, regardless of the protein’s affinity to the small molecule. In summary, while polypupylation may be a drawback for target-ID, multipupylation could be useful for both target-ID and understanding binding mode. To elaborate on this, we added the following additional explanation after the sentence in line 152: “, whereas multipupylation is more likely correlated with binding affinity or interaction frequency.”

      (16) The author should address whether the Halotag ligand modification of the drug alters the binding properties between the drug and targets. That may be causing artifact binding of the drug and other proteins.

      Thank you for your insightful comment. Yes, it is true that chemical modifications of the small molecule of interest, such as linker derivatization (e.g., HTL) or photo-affinity labeling, generally lead to reduced activity or affinity compared to the original molecule. Synthesizing a derivative is a common challenge across all target-ID methods, except for modification-free approaches, as we mentioned in the Discussion. However, modification-free methods like DARTS, CETSA, and TPP have their own limitations, including low sensitivity or high false positive rates. Identifying the optimal position for chemical modification on the small molecule of interest is critical. We chose dasatinib and HCQ/CQ as model compounds, because previous studies provided insights into their derivative synthesis. In addition, our data show that DH5 retains robust kinase inhibitory activity (Figure 4-figure supplement 2), and DC661-H1 exhibits potent autophagy inhibition (Figure 6-figure supplement 1). For novel compounds, a thorough structure-activity relationship study is essential to identify the optimal position for HTL derivative synthesis.

      (17) The author stated there is no observable toxicity in zebrafish without providing a detailed analysis or enough data. Further analysis of the expression of Halo-PafA and its substrate sPup influence on toxicity or side effects to the living cells or animals would be needed. It is important for in vivo applications.

      Thank you for your constructive suggestion. We have now included additional experimental data in Figure 7-figure supplement 1, showing no toxicity in zebrafish embryos expressing the POST-IT system. We assessed toxicity in two ways: by injecting the POST-IT DNA plasmid into one-cell-stage embryos for acute expression, and by using embryos from transgenic zebrafish expressing POST-IT under a heat-shock inducible promoter. Neither the injection nor the heat-shock activation of POST-IT expression resulted in any noticeable toxicity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      This manuscript from Schwintek and coworkers describes a system in which gas flow across a small channel (10^-4-10^-3 m scale) enables the accumulation of reactants and convective flow. The authors go on to show that this can be used to perform PCR as a model of prebiotic replication.

      Strengths:

      The manuscript nicely extends the authors' prior work in thermophoresis and convection to gas flows. The demonstration of nucleic acid replication is an exciting one, and an enzyme-catalyzed proof-of-concept is a great first step towards a novel geochemical scenario for prebiotic replication reactions and other prebiotic chemistry.

      The manuscript nicely combines theory and experiment, which generally agree well with one another, and it convincingly shows that accumulation can be achieved with gas flows and that it can also be utilized in the same system for what one hopes is a precursor to a model prebiotic reaction. This continues efforts from Braun and Mast over the last 10-15 years extending a phenomenon that was appreciated by physicists and perhaps underappreciated in prebiotic chemistry to increasingly chemically relevant systems and, here, a pilot experiment with a simple biochemical system as a prebiotic model.

      I think this is exciting work and will be of broad interest to the prebiotic chemistry community.

      Weaknesses:

      The manuscript states: "The micro scale gas-water evaporation interface consisted of a 1.5 mm wide and 250 µm thick channel that carried an upward pure water flow of 4 nl/s ≈ 10 µm/s perpendicular to an air flow of about 250 ml/min ≈ 10 m/s." This was a bit confusing on first read because Figure 2 appears to show a larger channel - based on the scale bar, it appears to be about 2 mm across on the short axis and 5 mm across on the long axis. From reading the methods, one understands the thickness is associated with the Teflon, but the 1.5 mm dimension is still a bit confusing (and what is the dimension in the long axis?) It is a little hard to tell which portion (perhaps all?) of the image is the channel. This is because discontinuities are present on the left and right sides of the experimental panels (consistent with the image showing material beyond the channel), but not the simulated panels. Based on the authors' description of the apparatus (sapphire/CNC machined Teflon/sapphire) it sounds like the geometry is well-known to them. Clarifying what is going on here (and perhaps supplying the source images for the machined Teflon) would be helpful.

      We understand. We will update the figures to better show dimensions of the experimental chamber. We will also add a more complete Figure in the supplementary information. Part of the complexity of the chamber however stems from the fact that the same chamber design has also been used to create defined temperature gradients which are not necessary and thus the chamber is much more complex than necessary.

      We added the scheme of the whole PTFE chip to Figure 2 in the top left corner, indicating the ROI shown in the fluorescence micrographs. Additionally, the channel walls are now clearly indicated by white dotted lines. The dimensions of the setup are now shown more clearly: the figure gives the total width of the channel, its height up to the gas flux channel, and its depth. We changed the caption of the figure accordingly, and it now reads: “[…] The PTFE chip cutout in the top left corner shows the ROI used for the micrographs. The color scale is equal for both simulation and experiment and channel dimensions are 4 x 1.5 x 0.25 mm as indicated. Dotted lines visualize the location of the channel walls. […]”
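      As a quick consistency check on the numbers quoted from the manuscript, the conversion from volumetric flow to mean velocity can be sketched as below. This is only an illustrative calculation: it uses the 1.5 mm x 0.25 mm cross-section stated above for the water channel, and it assumes (this is not stated in the text) that the gas channel has a comparable cross-section. The function name is hypothetical.

```python
def mean_velocity(flow_m3_per_s: float, width_m: float, depth_m: float) -> float:
    """Mean flow velocity (m/s) through a rectangular cross-section."""
    return flow_m3_per_s / (width_m * depth_m)


# Channel cross-section quoted in the text: 1.5 mm wide, 250 um thick.
WIDTH_M = 1.5e-3
DEPTH_M = 250e-6

# Volumetric flows quoted in the text, converted to SI units.
water_flow = 4e-12        # 4 nl/s  -> m^3/s
air_flow = 250e-6 / 60.0  # 250 ml/min -> m^3/s

water_velocity = mean_velocity(water_flow, WIDTH_M, DEPTH_M)

# ASSUMPTION: the gas channel is taken to have the same cross-section
# as the water channel; the text does not give its dimensions.
air_velocity = mean_velocity(air_flow, WIDTH_M, DEPTH_M)

print(f"water: {water_velocity * 1e6:.1f} um/s")
print(f"air:   {air_velocity:.1f} m/s")
```

      Under these assumptions both values come out close to the approximate figures quoted in the manuscript (about 10 µm/s for water and about 10 m/s for air).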

      The data shown in Figure 2d nicely shows nonrandom residuals (for experimental values vs. simulated) that are most pronounced at t~12 m and t~40-60m. It seems like this is (1) because some symmetry-breaking occurs that isn't accounted for by the model, and perhaps (2) because of the fact that these data are n=1. I think discussing what's going on with (1) would greatly improve the paper, and performing additional replicates to address (2) would be very informative and enhance the paper. Perhaps the negative and positive residuals would change sign in some, but not all, additional replicates?

      To address this, we will show two more replicates of the experiment and include them in Figure 2.

      We are seeing two effects when we compare fluorescence measurements of the experiments.

      Firstly, degassing of water causes the formation of air-bubbles, which are then transported upwards to the interface, disrupting fluorescence measurements. This, however, mostly occurs in experiments with elevated temperatures for PCR reactions, such as displayed in Figure 4.

      Secondly, due to the high surface tension of water, the interface is quite flexible. As the inflow and evaporation work to balance each other, the shape of the interface adjusts, leading to alterations in the circular flow fields below.

      Thus the conditions, while overall being in steady state, show some fluctuations. The strong dependence on interface shape is also seen in the simulation. However, modeling a dynamic interface shape is not so easy to accomplish, so we had to stick to one geometry setting. Again here, the added movies of two more experiments should clarify this issue.

      We performed three more replicates of the experiment and included the averaged data points together with their respective standard deviation as error bars in Figure 2d. Additionally, the videos of each individual repeat are now added to the supplementary files for the reader to better understand where the strong fluctuations around half an hour come from. The Figure caption was adjusted to “[…] The maximum relative concentration of DNA increased within an hour to ~30 X the initial concentration, with the trend following the simulation. Error bars are the standard deviation from four independent measurements. […]”.
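      The averaging behind such error bars can be sketched as follows. This is a minimal illustration, not the analysis script used for Figure 2d: the function name is hypothetical, the numbers are placeholder values rather than measured data, and the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def summarize_replicates(replicates):
    """Return per-time-point means and sample standard deviations.

    `replicates` is a list of equal-length lists, one list per
    independent run, holding relative concentration at each time point.
    """
    means, stds = [], []
    for values_at_t in zip(*replicates):  # iterate over time points
        means.append(mean(values_at_t))
        stds.append(stdev(values_at_t))   # ASSUMPTION: sample std (n - 1)
    return means, stds


# Placeholder values for four hypothetical runs at three time points.
# These are NOT measured data; they only exercise the function.
runs = [
    [1.0, 3.0, 10.0],
    [1.0, 2.0, 14.0],
    [1.0, 4.0, 12.0],
    [1.0, 3.0, 12.0],
]

means, stds = summarize_replicates(runs)
for t, (m, s) in enumerate(zip(means, stds)):
    print(f"time point {t}: mean = {m:.2f}, std = {s:.2f}")
```

      With only four runs the standard deviation is itself a rough estimate, which is one reason to show the individual repeats alongside the averaged curve.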

      The main text was also changed to better explain how the fluctuations impact the measurements: “[…] Water continuously evaporated at the interface, but nucleic acids remained in the aqueous phase accumulating near the interface. They could only escape downward either by diffusion or by the vortex induced by the gas flowing across the interface, pushing the molecules back deeper into the bulk (See the flow lines in Fig. 2(b) taken from the simulation). As the gas flow continuously removed excess vapor, the evaporation rate remained constant. Thus, except for fluctuations, a stable interface shape should be expected. However, due to the high surface tension of water, the interface is very flexible. As the inflow and evaporation work to balance each other, the shape of the interface adjusts, likely in response to small fluctuations in gas pressure and spatial variations in water surface tension. This leads to alterations in the circular flow fields below (Supplementary Movie 2).

      As these fluctuations are difficult to simulate, we decided to stick with one interface shape, matching evaporation and inflow speeds. The evaporation rate at the interface was therefore set to be proportional to the vapor concentration gradient and varied spatially along the interface between 5 and 10.5 µm/s (See Suppl. Fig. VI.1(d)). Using the known diffusion coefficient of 95 µm²/s for the 63mer [9], the simulation closely matched the experimental results. In both cases, DNA accumulated in regions with circular flow patterns driven by the gas flux (Fig. 2(b), right panel).

      5 minutes after starting the experiment, the maximum DNA accumulation was 3-fold, while after one hour of evaporation, around 30-fold accumulation was observed. Due to molecules residing in very shallow volumes when directly at the interface, the fluorescence signal can vary drastically compared to measurements deeper in the bulk. This can be seen in the fluctuations between independent measurements (See Supplementary Movies 2b,2b,2c), especially around 0.5 h shown in Figure 2(d). The simulated maximum accumulation followed the experimental results and starts saturating after about one hour (Fig. 2(d)). […]”

      The authors will most likely be familiar with the work of Victor Ugaz and colleagues, in which they demonstrated Rayleigh-Bénard-driven PCR in convection cells (10.1126/science.298.5594.793, 10.1002/anie.200700306). Not including some discussion of this work is an unfortunate oversight, and addressing it would significantly improve the manuscript and provide some valuable context to readers. Something of particular interest would be their observation that wide circular cells gave chaotic temperature profiles relative to narrow ones and that these improved PCR amplification (10.1002/anie.201004217). I think contextualizing the results shown here in light of this paper would be helpful.

      Thanks for pointing this out and reminding us. We apologize. We agree that the chaotic trajectories within Rayleigh-Bénard convection cells lead to temperature oscillations similar to the salt variations in our gas-flux system. Although the convection-driven PCR in Rayleigh-Bénard cells is not isothermal like our system, it provides a useful point of comparison and context for understanding environments that can support full replication cycles. We will add a section comparing the approaches, outlining the history of convective PCR and how it relates to the new isothermal implementation.

      We added a main text paragraph after the last paragraph in section “Strand Separation Dynamics”: “[…] Rayleigh-Bénard convection cells generate similar patterns to those seen in Fig. 3(c). The oscillations in salt concentration resemble the temperature fluctuations observed in convection-based PCR reactions from earlier studies [32,33], which showed that chaotic temperature variations, compared to periodic ones, enhanced the efficiency of the PCR reaction. […]”

      Again, it appears n=1 is shown for Figure 4a-c - the source of the title claim of the paper - and showing some replicates and perhaps discussing them in the context of prior work would enhance the manuscript.

      We appreciate the reviewer for bringing this to our attention. We will now include the two additional repeats for the data shown in Figure 4c, while the repeats of the PAGE measurements are already displayed in Supplementary Fig. IX.2. Initially, we chose not to show the repeats in Figure 4c due to the dynamic and variable nature of the system. These variations are primarily caused by differences at the water-air interface, attributed to the high surface tension of water. Additionally, the stochastic formation of air bubbles in the inflow—despite our best efforts to avoid them—led to fluctuations in the fluorescence measurements across experiments. These bubbles cause a significant drop in fluorescence in a region of interest (ROI) until the area is refilled with the sample.

      Unlike our RNA-focused experiments, PCR requires high temperatures and degassing a PCR master mix effectively is challenging in this context. While we believe our chamber design is sufficiently gas-tight to prevent air from diffusing in, the high surface-to-volume ratio in microfluidics makes degassing highly effective, particularly at elevated temperatures. We anticipate that switching to RNA experiments at lower temperatures will mitigate this issue, which is also relevant in a prebiotic context.

      The reviewer’s comments are valid and prompt us to fully display these aspects of the system. We will now include these repeats in Figure 4c to give readers a deeper understanding of the experiment's dynamics. Additionally, we will provide videos of all three repeats, allowing readers to better grasp the nature of the fluctuations in SYBR Green fluorescence depicted in Figure 4c.

      The data from the triplicates are now added to Figure 4c, showing how air bubbles, forming through degassing at the high temperatures required for Taq polymerase, disrupt the measurement, as they momentarily dry off the channel and stop the reaction until the channel fills again. Figure caption has been adapted and now reads: “[…] Dotted lines show the data from independent repeats. Air bubbles formed through degassing can momentarily disrupt the reaction. […]”

      We additionally changed the main text to explain the experimental difficulties to the reader: “[…] In other repetitions of the reaction, this increase was sometimes observed even earlier, around the one-hour mark (dotted lines). However, air bubbles nucleated by degassing events rise and temporarily dry out the channel, interrupting the reaction until the liquid refills the channel (Supplementary Movies 4, 4b, 4c & 5). Despite our best efforts, we were unable to fully prevent this, especially given the high temperatures required for Taq polymerase activity. In an identical setting in which the gas and water fluxes were switched off, no fluorescence increase was found (see Fig. 4(c), red lines). Fluorescence variations are additionally caused by fluctuations in the position of the gas-water interface, as discussed earlier. […]”

      I think some caution is warranted in interpreting the PCR results because a primer-dimer would be of essentially the same length as the product. It appears as though the experiment has worked as described, but it's very difficult to be certain of this given this limitation. Doing the PCR with a significantly longer amplicon would be ideal, or alternately discussing this possible limitation would be helpful to the readers in managing expectations.

      This is a good point and should be discussed more in the manuscript. Our gel electrophoresis is capable of distinguishing between the replicated product and primer dimers. We know this because we optimized the primer and template sequences to minimize primer dimers, which also made them distinguishable from the desired 61mer product. Moreover, none of the experiments performed without an added template strand showed any band in the vicinity of the product band after 4 h of reaction, in contrast to the experiments with template, which is a strong argument against the presence of primer dimers.

      We added a main text section explaining this to the reader: “[…]Suppl. Fig. IX.2 shows all independent repeats of the corresponding experiments. No product was detected in any of these cases, ruling out reaction limitations such as primer dimer formation. Primer dimers would form even in the absence of a template strand and would be identifiable through gel electrophoresis. As Taq polymerase requires a significant overlap between the two dimers to bind, this would result in a shorter product compared to the 61mer used here.  […]”

      Reviewer #2 (Public review):

      Schwintek et al. investigated whether a geological setting of a rock pore with water inflow on one end and gas passing over the opening of the pore on the other end could create a non-equilibrium system that sustains nucleic acid reactions under mild conditions. The evaporation of water as the gas passes over it concentrates the solutes at the boundary of evaporation, while the gas flux induces momentum transfer that creates currents in the water that push the concentrated molecules back into the bulk solution. This leads to the creation of steady-state regions of differential salt and macromolecule concentrations that can be used to manipulate nucleic acids. First, the authors showed that fluorescent bead behavior in this system closely matched their fluid dynamic simulations. With that validation in hand, the authors next showed that fluorescently labeled DNA behaved according to their theory as well. Using these insights, the authors performed a FRET experiment that clearly demonstrated the hybridization of two DNA strands as they passed through the high Mg++ concentration zone, and, conversely, the dissociation of the strands as they passed through the low Mg++ concentration zone. This isothermal hybridization and dissociation of DNA strands allowed the authors to perform an isothermal DNA amplification using a DNA polymerase enzyme. Crucially, the isothermal DNA amplification required the presence of the gas flux and could not be recapitulated using a system that was at equilibrium. These experiments advance our understanding of the geological settings that could support nucleic acid reactions that were key to the origin of life.

      The presented data compellingly supports the conclusions made by the authors. To increase the relevance of the work for the origin of life field, the following experiments are suggested:

      (1) While the central premise of this work is that RNA degradation presents a risk for strand separation strategies relying on elevated temperatures, all of the work is performed using DNA as the nucleic acid model. I understand the convenience of using DNA, especially in the latter replication experiment, but I think that at least the FRET experiments could be performed using RNA instead of DNA.

      We understand the request only partially. The modification introduced by the two dye molecules in the FRET probe, which lets us probe salt concentrations by melting, is of course much larger than the change of the backbone from RNA to DNA. This is why we used the much more stable DNA construct, which can also be manufactured at lower cost and in much higher purity, including with the modifications. We think the melting temperature characteristics of RNA and DNA in this range are sufficiently well known that we can use DNA instead of RNA to probe the salt concentration in our flow cycling.

      Only at extreme conditions of pH and salt, especially under alkaline conditions, is RNA degradation through transesterification at least several orders of magnitude faster than the spontaneous degradative mechanisms acting upon DNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.]. The work presented in this article is, however, focussed on the hybridization dynamics of nucleic acids. Here, RNA and DNA share similar properties regarding the formation of double strands and their respective melting temperatures. While RNA has been shown to form more stable duplex structures exhibiting higher melting temperatures compared to DNA [Dimitrov, R. A., & Zuker, M. (2004). Prediction of hybridization and melting for double-stranded nucleic acids. Biophysical Journal, 87(1), 215-226.], the general impact of changes in salt, temperature and pH [Mariani, A., Bonfio, C., Johnson, C. M., & Sutherland, J. D. (2018). pH-Driven RNA strand separation under prebiotically plausible conditions. Biochemistry, 57(45), 6382-6386.] on respective melting temperatures follows the same trend for both nucleic acid types. The diffusive properties of RNA and DNA are also very similar [Baaske, P., Weinert, F. M., Duhr, S., Lemke, K. H., Russell, M. J., & Braun, D. (2007). Extreme accumulation of nucleotides in simulated hydrothermal pore systems. Proceedings of the National Academy of Sciences, 104(22), 9346-9351.].

      Since this work is a proof of principle for the discussed environment being able to host nucleic acid replication, we aimed to avoid second order effects such as degradation by hydrolysis by using DNA as a proxy polymer. This enabled us to focus on the physical effects of the environment on local salt and nucleic acid concentration. The experiments performed with FRET are used to visualize local salt concentration changes and their impact on the melting temperature of dissolved nucleic acids.  While performing these experiments with RNA would without doubt cover a broader application within the field of origin of life, we aimed at a step-by-step / proof of principle approach, especially since the environmental phenomena studied here have not been previously investigated in the OOL context. Incorporating RNA-related complexity into this system should however be addressed in future studies. This will likely require modifications to the experimental boundary conditions, such as adjusting pH, temperature, and salt concentration, to account for the greater duplex stability of RNA. For instance, lowering the pH would reduce the RNA melting temperature [Ianeselli, A., Atienza, M., Kudella, P. W., Gerland, U., Mast, C. B., & Braun, D. (2022). Water cycles in a Hadean CO2 atmosphere drive the evolution of long DNA. Nature Physics, 18(5), 579-585.].

      (2) Additionally, showing that RNA does not degrade under the conditions employed by the authors (I am particularly worried about the high Mg++ zones created by the flux) would further strengthen the already very strong and compelling work.

      Based on literature values for hydrolysis rates of RNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.], we estimate RNA to have a half-life of multiple months under the conditions deployed in the FRET experiment (high-concentration zones contain <1 mM of Mg2+). Additionally, dsRNA is multiple orders of magnitude more stable than ssRNA with regard to degradation through hydrolysis [Zhang, K., Hodge, J., Chatterjee, A., Moon, T. S., & Parker, K. M. (2021). Duplex structure of double-stranded RNA provides stability against hydrolysis relative to single-stranded RNA. Environmental Science & Technology, 55(12), 8045-8053.], improving RNA stability especially in zones of high FRET signal. Furthermore, at the neutral pH deployed in this work, RNA does not readily degrade. In previous work from our lab [Salditt, A., Karr, L., Salibi, E., Le Vay, K., Braun, D., & Mutschler, H. (2023). Ribozyme-mediated RNA synthesis and replication in a model Hadean microenvironment. Nature Communications, 14(1), 1495.], we showed that the lifetime of RNA under conditions reaching 40 mM Mg2+ at the air-water interface at 45 °C was sufficient to support ribozyme-mediated ligation reactions in experiments lasting multiple hours.

      With that in mind, gaining insight into the median Mg2+ concentration across multiple averaged nucleic acid trajectories in our system (see Fig. 3c&d) and numerically convoluting this with hydrolysis dynamics from literature would be highly valuable. We anticipate that longer residence times in trajectories distant from the interface will improve RNA stability compared to a system with uniformly high Mg2+ concentrations.

      We added a new Supplementary section for this. We used the trace from Figure 3(c) and calculated the hydrolysis rate for each timestep using literature values for RNA [Li, Y., & Breaker, R. R. (1999). Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. Journal of the American Chemical Society, 121(23), 5364-5372.]. We conclude that the conditions deployed for the experiment are not harsh on RNA, with hydrolysis rates on the order of 10⁻⁶ min⁻¹. The figure below (also now in the supplementary information) shows the hydrolysis of RNA under the conditions of the experiment in Figure 3. RNA is not expected to hydrolyze appreciably under these conditions on the timescales over which a replication reaction would occur. With a half-life of around 83 days, even a prebiotically plausible, very slow replication reaction would not be constrained by hydrolysis in this scenario.

      We referenced this section of the supplementary information in the main text: […] In the experimental conditions used here, RNA would also not readily degrade, even if the strand enters the high salt regimes (See Suppl. Sec. IX). Using literature values for hydrolysis rates under the deployed conditions, we estimate dissolved RNA to have a half-life of around 83 days. […]
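
      The relation between a first-order hydrolysis rate and the quoted half-life can be sketched as follows. This is a minimal illustration, not the authors' supplementary calculation: the rate constant of 5.8e-6 per minute is an assumption, back-calculated here from the stated half-life of about 83 days, and the function names are hypothetical. Averaging per-timestep rates assumes equally spaced timesteps along the trajectory.

```python
import math

MINUTES_PER_DAY = 60 * 24


def half_life_days(rate_per_min):
    """Half-life of a first-order decay, t_1/2 = ln(2) / k, in days."""
    return math.log(2) / rate_per_min / MINUTES_PER_DAY


def fraction_intact(rate_per_min, minutes):
    """Fraction of strands not yet hydrolyzed after `minutes`."""
    return math.exp(-rate_per_min * minutes)


def effective_rate(rates_per_min):
    """Time-averaged rate along a trajectory sampled at equal timesteps.

    For first-order decay the surviving fraction is exp(-integral of k dt),
    so with equal timesteps the mean rate acts as the effective rate.
    """
    return sum(rates_per_min) / len(rates_per_min)


# Assumed rate, chosen to be consistent with the ~83 day half-life in the text.
k_assumed = 5.8e-6  # 1/min
print(round(half_life_days(k_assumed), 1))      # half-life in days
print(fraction_intact(k_assumed, 4 * 60))       # fraction intact after a 4 h run
```

      Under this assumed rate, only a small fraction of a percent of the RNA would be hydrolyzed over a multi-hour experiment, which is the sense in which hydrolysis does not constrain the reaction.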

      (3) Finally, I am curious whether the authors have considered designing a simulation or experiment that uses the imidazole- or 2′,3′-cyclic phosphate-activated ribonucleotides. For instance, a fully paired RNA duplex and a fluorescently-labeled primer could be incubated in the presence of activated ribonucleotides +/- flux and subsequently analyzed by gel electrophoresis to determine how much primer extension has occurred. The reason for this suggestion is that, due to the slow kinetics of chemical primer extension, the reannealing of the fully complementary strands as they pass through the high Mg++ zone, which is required for primer extension, may outcompete the primer extension reaction. In the case of the DNA polymerase, the enzymatic catalysis likely outcompetes the reannealing, but this may not recapitulate the uncatalyzed chemical reaction.

      This is certainly on our to-do list for future experiments in this setting. Our current focus is on templated ligation rather than templated polymerization, and we are working hard to implement an RNA-only, enzyme-free ligation chain reaction based on more optimized parameters for templated ligation from 2′,3′-cyclic phosphate activation that were recently published [High-Fidelity RNA Copying via 2′,3′-Cyclic Phosphate Ligation, Adriana C. Serrão, Sreekar Wunnava, Avinash V. Dass, Lennard Ufer, Philipp Schwintek, Christof B. Mast, and Dieter Braun, JACS doi.org/10.1021/jacs.3c10813 (2024)]. However, we would first try this at an air-water interface in a temperature gradient, which was shown to work with RNA [Ribozyme-mediated RNA synthesis and replication in a model Hadean microenvironment, Annalena Salditt, Leonie Karr, Elia Salibi, Kristian Le Vay, Dieter Braun & Hannes Mutschler, Nature Communications doi.org/10.1038/s41467-023-37206-4 (2023)], before making the jump to the isothermal setting we describe here. We understand the question, but it has proven good practice in the past to first get to know a setting with PCR and then move to RNA.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) Could the authors comment on the likelihood of the geological environments where the water inflow velocity equals the evaporation velocity?

      This is an important point to mention in the manuscript, thank you for pointing that out. To produce a defined experiment, we pushed the water out with a syringe pump, regulated such that the flow rate matched the evaporation rate. We imagine that a real system will self-regulate the inflow of the water column through a more complex geometry of the gas flow, automatically matching the evaporation with the reflow of water. The interface would move closer to the gas flux or recede, depending on whether the inflow exceeds or falls short of the evaporation rate. As the interface moves closer, evaporation speeds up, while moving away slows it down. This dynamic process stabilizes the system, with surface tension ultimately fixing the interface in place.

      We have already seen a bit of this dynamic in the experiments, but have so far not found a good geometry within our 2-dimensional, constant-thickness design to make it work for a longer time. A 3-dimensional reservoir of water with lower frictional forces would very likely be able to do this, but it would require a full redesign as a multi-thickness microfluidic chamber. The more we think about it, the more we envisage implementing the next version of the experiment with a real porous volcanic rock inside a humidity chamber that simulates a full 6 h prebiotic day. We would then lose much of the reproducibility of the experiment, but would likely gain a mechanism in which recondensation of water as dew on a cold morning refills the water reservoirs in the rock. We apologize for digressing towards future experiments.

      We added a paragraph after the second paragraph in Results and Discussion.

      It now reads: […] For a real early Earth environment we envision a system that self-regulates the water column's inflow by automatically balancing evaporation with capillary flows. The interface adjusts its position relative to the gas flux, moving closer if the inflow exceeds the evaporation rate, or receding if it falls short of it. When the interface nears the gas flux, evaporation accelerates, while moving it away slows evaporation. This dynamic process stabilizes the system, with surface tension ultimately fixing the interface's position. […]
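
      The self-regulation argument can be illustrated with a toy model. The sketch below is not taken from the manuscript: it assumes simple mass balance (the interface advances towards the gas flux when inflow exceeds evaporation and recedes otherwise) and a hypothetical linear evaporation law in arbitrary units, in which evaporation grows as the interface approaches the gas flux. All names and parameter values are illustrative.

```python
def fixed_point(inflow, evap_at_zero, evap_gain):
    """Interface position at which evaporation exactly balances inflow."""
    return (inflow - evap_at_zero) / evap_gain


def simulate_interface(inflow, evap_at_zero, evap_gain,
                       x0=0.0, dt=0.01, steps=5000):
    """Euler integration of dx/dt = inflow - evaporation(x).

    x is the interface position, increasing towards the gas flux.
    evaporation(x) = evap_at_zero + evap_gain * x is an assumed linear law.
    """
    x = x0
    for _ in range(steps):
        evaporation = evap_at_zero + evap_gain * x
        x += dt * (inflow - evaporation)
    return x


# Starting below or above the balance point, the interface relaxes to it.
target = fixed_point(inflow=2.0, evap_at_zero=1.0, evap_gain=1.0)
print(target)
print(simulate_interface(2.0, 1.0, 1.0, x0=0.0))
print(simulate_interface(2.0, 1.0, 1.0, x0=3.0))
```

      Because evaporation rises as the interface advances, any excess inflow is eventually consumed and any deficit is compensated, which is the negative feedback described in the paragraph above. A real pore would add capillary and surface tension effects that this sketch ignores.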

      (2) Could the authors speculate on using gases other than ambient air to provide the flux and possibly even chemical energy? For example, using carbonyl sulfide or vaporized methyl isocyanide could drive amino acid and nucleotide activation, respectively, at the gas-water interface.

      This is an interesting prospect for future work with this system. We thought also about introducing ammonia for pH control and possible reactions. We were amazed in the past that having CO2 instead of air had a profound impact on the replication and the strand separation [Water cycles in a Hadean CO2 atmosphere drive the evolution of long DNA, Alan Ianeselli, Miguel Atienza, Patrick Kudella, Ulrich Gerland, Christof Mast & Dieter Braun, Nature Physics doi.org/10.1038/s41567-022-01516-z (2022)]. So going more in this direction absolutely makes sense and as it acts mostly on the length-selectively accumulated molecules at the interface, only the selected molecules will be affected, which adds to the selection pressure of early evolutionary scenarios.

      Of course, in the manuscript, we use ambient air as a proxy for any gas, focusing primarily on the energy introduced through momentum transfer and evaporation. We speculate that soluble gases could establish chemical gradients, such as pH or redox potential, from the bulk solution to the interface, similar to the Mg2+ accumulation shown in Figure 3c. The nature of these gradients would depend on each gas's solubility and diffusivity. We have already observed such effects in thermal gradients [Keil, L. M., Möller, F. M., Kieß, M., Kudella, P. W., & Mast, C. B. (2017). Proton gradients and pH oscillations emerge from heat flow at the microscale. Nature communications, 8(1), 1897.] and finding similar behavior in an isothermal environment would be a significant discovery.

      Added a paragraph in the Conclusion to showcase this: [… ] Furthermore we expect that other gases, such as CO2, could establish chemical gradients in this environment. Such gradients have been observed in thermal gradients before [23] and finding similar behaviour in an isothermal environment would be a significant discovery.[…]

      (3) Line 162: Instead of "risk," I suggest using "rate".

      Thanks for pointing this out! Will be changed.

      Fixed.

      (4) Using FRET of a DNA duplex as an indicator of salt concentration is a decent proxy, but a more direct measurement of salt concentration would provide further merit to the explicit statement that it is the salt concentration that is changing in the system and not another hidden parameter.

      Directly observing salt concentration using microscopy is a difficult task. While there are dyes that change their fluorescence depending on the local Na+ or Mg2+ concentration, they do not operate differentially, i.e. by allowing a ratio between two color channels to be formed. Only a ratiometric readout would avoid artifacts from the dye molecules themselves being accumulated by the non-equilibrium setting. We were able to do this for pH in the past, but did not find comparable optical salt sensors. This is the reason we ended up with a FRET pair, with the advantage that we actually probe the strand separation that we are interested in anyway. Using such a ratiometric dye in future work would, however, without a doubt enhance the understanding of not only this system, but also our thermal gradient environments.

      (5) Figure 3a: Could the authors add information on "Dried DNA" to the caption? I am assuming this is the DNA that dried off on the sides of the vessel but cannot be sure.

      Thanks to the reviewer for pointing this out. This is correct and we will describe this better in the revised manuscript.

      Added a sentence in the caption to address this: […] Fluctuations in interface position can dry and redissolve DNA repeatedly (see “Dried DNA” in right panel). […]

      (6) Figure 4b and c: How reproducible is this data? Have the authors performed this reaction multiple independent times? If so, this data should be added to the manuscript.

      The gel electrophoresis was performed in triplicate and the data are shown in full in the supplementary information. The data in panel c are hard to reproduce, as the interface is not static and ROI measurements are therefore difficult to average over repeats. Including the data from the independent repeats will, however, give the reader insight into some of the experimental difficulties, such as air bubbles that form from degassing as the liquid heats up and travel upwards to the interface, disrupting the ongoing fluorescence measurements.

      This was also pointed out by reviewer 1 and addressed there.

      (7) Line 256: "shielding from harmful UV" statement only applies to RNA oligomers as UV light may actually be beneficial for earlier steps during ribonucleoside synthesis. I suggest rephrasing to "shielding nucleic acid oligomers from UV damage.".

      Will be adjusted as mentioned.

      Fixed.

      (8) The final paragraph in the Results and Discussion section would flow better if placed in the Conclusion section.

      This is a good point and we will merge results and discussion closer together.

      Fixed.

      (9) Line 262, "...of early Life" is slightly overstating the conclusions of the study. I suggest rephrasing to "...of nucleic acids that could have supported early life."

      This is a fair comment. We thank the reviewer for their detailed analysis of the manuscript!

      Changed the phrase to: […]In this work we investigated a prebiotically plausible and abundant geological environment to support the replication of nucleic acids. […]

      (10) In references, some of the journal names are in sentence case while others are in title case (see references 23 and 26 for example).

      Thanks - this will be fixed.

      Fixed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This useful study reports on the discovery of an antimicrobial agent that kills Neisseria gonorrhoeae. Sensitivity is attributed to a combination of DedA-assisted uptake of oxydifficidin into the cytoplasm and the presence of an oxydifficidin-sensitive RplL ribosomal protein. Due to the narrow scope, the broader antibacterial spectrum remains unclear and therefore the evidence supporting the conclusions is incomplete with key methods and data lacking. This work will be of interest to microbiologists and synthetic biologists.

      General comment about narrow scope: The broader antibacterial spectrum of oxydifficidin has been reported previously (S B Zimmerman et al., 1987). The main focus of this study is on its previously unreported potent anti-gonococcal activity and mode of action. While it is true that broad-spectrum antibiotics have historically played a role in effectively controlling a wide range of infections, we and others believe that narrow-spectrum antibiotics have an overlooked importance in addressing bacterial infections. Their advantage lies in their ability to target specific pathogens without markedly disrupting the human microbiota.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Kan et al. report the serendipitous discovery of a Bacillus amyloliquefaciens strain that kills N. gonorrhoeae. They use TnSeq to identify that the anti-gonococcal agent is oxydifficidin and show that it acts at the ribosome and that one of the dedA gene products in N. gonorrhoeae MS11 is important for moving the oxydifficidin across the membrane.

      Strengths:

      This is an impressive amount of work, moving from a serendipitous observation through TnSeq to characterize the mechanism by which Oxydifficidin works.

      Weaknesses:

      (1) There are important gaps in the manuscript's methods.

      The requested additions to the methods section describing bacterial sequencing and anti-gonococcal activity screening will be made. However, we do not think the absence of these generic methods reduces the significance of our findings.

      (2) The work should evaluate antibiotics relevant to N. gonorrhoeae.

      (1) It is not clear to us why reevaluating the activity of well characterized antibiotics against known N. gonorrhoeae clinical strains would add value to this manuscript. The activity of clinically relevant antibiotics against antibiotic-resistant N. gonorrhoeae clinical isolates is well described in the literature. Our use of antibiotics in this study was intended to aid in the identification of oxydifficidin’s mode of action. This is true for both Tables 1 and 2.

      (2) If the reviewer insists, we would be happy to include MIC data for the following clinically relevant antibiotics: ceftriaxone (cephalosporin/beta-lactam), gentamicin (aminoglycoside), azithromycin (macrolide), and ciprofloxacin (fluoroquinolone).

      (3) The genetic diversity of dedA and rplL in N. gonorrhoeae is not clear, neither is it clear whether oxydifficidin is active against more relevant strains and species than tested so far.

      (1) We thank the reviewer for this suggestion. We aligned the DedA sequence from strain MS11 with DedA proteins from 220 N. gonorrhoeae strains that have high-quality assemblies in NCBI. The result showed that there are no amino acid changes in this protein. Using the same method, we observed several single amino acid changes in RplL. This included changes at A64, G25 and S82 in 4 strains with one change per strain. These sites differ from R76 and K84, where we identified changes that provide resistance to oxydifficidin. Notably, in a similar search of representative Escherichia, Chlamydia, Vibrio, and Pseudomonas NCBI deposited genomes, we did not identify changes in RplL at position R76 or K84.

      (2) While the usefulness of screening more clinically relevant antibiotics against clinical isolates as suggested in comment 2 was not clear to us, we agree that screening these strains for oxydifficidin activity would be beneficial. We have ordered Neisseria gonorrhoeae strains AR1280 and AR1281 (CDC), and Neisseria meningitidis ATCC 13090. They will be tested when they arrive.
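
      The variant check described in point (1) above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes the protein sequences have already been aligned to the reference (equal length, with "-" for gaps), and the short sequences and function names are hypothetical placeholders rather than real DedA or RplL data.

```python
def substitutions(reference, query):
    """List amino acid substitutions between two pre-aligned sequences.

    Positions are 1-based and counted along the ungapped reference.
    Returns strings such as "R76K" (reference residue, position, new residue).
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be pre-aligned to equal length")
    changes = []
    position = 0
    for ref_res, qry_res in zip(reference, query):
        if ref_res != "-":
            position += 1
        # Report only true substitutions, not insertions or deletions.
        if ref_res != qry_res and ref_res != "-" and qry_res != "-":
            changes.append(f"{ref_res}{position}{qry_res}")
    return changes


def hits_sites(changes, sites):
    """Keep only the substitutions that fall on positions of interest."""
    return [c for c in changes if int(c[1:-1]) in sites]


# Hypothetical toy sequences; a real screen would loop over all strains.
toy_reference = "MAKRS"
toy_strain = "MAKKS"
found = substitutions(toy_reference, toy_strain)
print(found)
print(hits_sites(found, {76, 84}))  # resistance positions named in the text
```

      Filtering on a set of positions keeps the question asked in the response explicit: whether any naturally occurring change falls on the residues where resistance mutations were identified.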

      Reviewer #2 (Public Review):

      Summary:

      Kan et al. present the discovery of oxydifficidin as a potential antimicrobial against N. gonorrhoeae, including multi-drug resistant strains. The authors show the role of DedA flippase-assisted uptake and the specificity of RplL in the mechanism of action for oxydifficidin. This novel mode of action could potentially offer a new therapeutic avenue, providing a critical addition to the limited arsenal of antibiotics effective against gonorrhea.

      Strengths:

      This study underscores the potential of revisiting natural products for antibiotic discovery of modern-day-concerning pathogens and highlights a new target mechanism that could inform future drug development. Indeed there is a recent growing body of research utilizing AI and predictive computational informatics to revisit potential antimicrobial agents and metabolites from cultured bacterial species. The discovery of oxydifficidin interaction with RplL and its DedA-assisted uptake mechanism opens new research directions in understanding and combating antibiotic-resistant N. gonorrhoeae. Methodologically, the study is rigorous employing various experimental techniques such as genome sequencing, bioassay-guided fractionation, LCMS, NMR, and Tn-mutagenesis.

      Weaknesses:

      The scope is somewhat narrow, focusing primarily on N. gonorrhoeae. This limits the generalizability of the findings and leaves questions about its broader antibacterial spectrum. Moreover, while the study demonstrates the in vitro effectiveness of oxydifficidin, there is a lack of in vivo validation (i.e., animal models) for assessing pre-clinical potential of oxydifficidin. Potential SNPs within dedA or RplL raise concerns about how quickly resistance could emerge in clinical settings.

      (1) Spectrum/narrow scope: The broader antibacterial spectrum of oxydifficidin has been reported previously (S B Zimmerman et al., 1987). The focus of this study is on its previously unreported potent anti-gonococcal activity and its mode of action. While it is true that broad-spectrum antibiotics have historically played a role in effectively controlling a wide range of infections, we and others believe that narrow-spectrum antibiotics have an overlooked importance in addressing bacterial infections. Their advantage lies in their ability to target specific pathogens without markedly disrupting the human microbiota.

      (2) Animal models: We acknowledge the reviewer’s insight regarding the importance of in vivo validation to enhance oxydifficidin’s pre-clinical potential. However, due to the labor-intensive process needed to isolate oxydifficidin, obtaining a sufficient quantity for animal studies is beyond the scope of this study. Our future work will focus on optimizing the yield of oxydifficidin and developing a topical mouse model for subsequent investigations.

      (3) Potential SNPs: Please see our response to Reviewer #1’s comment 3. We acknowledge that potential SNPs within dedA and rplL raise concerns regarding clinical resistance, which is a common issue for protein-targeting antibiotics. Yet, as pointed out in the manuscript, obtaining mutants in the lab was a very low yield endeavor.

      Reviewer #3 (Public Review):

      Summary:

      The authors have shown that oxydifficidin is a potent inhibitor of Neisseria gonorrhoeae. They were able to identify the target of action to rplL and showed that resistance could occur via mutation in the DedA flippase and RplL.

      Strengths:

      This was a very thorough and clearly argued set of experiments that supported their conclusions.

      Weaknesses:

      There was no obvious weakness in the experimental design. Although it is promising that the DedA mutations resulted in attenuation of fitness, it remains an open question whether secondary rounds of mutation could overcome this selective disadvantage which was untried in this study.

      We thank the reviewer for the positive comment. We agree that investigating factors that could compensate for the fitness attenuation caused by DedA mutation would enhance our understanding of the role of DedA.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The term "N. gonorrhoeae wildtype" should not be used. It is uninformative, as the species contains a large amount of diversity. Instead, please name the strain. From Figure 1, it looks like the authors used MS11. Since MS11 is a longstanding lab strain and likely does not reflect circulating N. gonorrhoeae, and since H041 is no longer in circulation, the authors should ideally test the compound against more representative strains of N. gonorrhoeae. This includes panels of isolates available through the CDC, for example (https://www.cdc.gov/drugresistance/resistance-bank/index.html). I encourage the authors to include FC428 or another recently identified isolate with the penA 60 allele to demonstrate oxydifficidin's activity against contemporary concerning isolates/lineages.

      (1) “N. gonorrhoeae MS11” is now used instead of “N. gonorrhoeae WT” in this manuscript.

      (2) In our revised manuscript, we have added MIC data for recently identified Neisseria gonorrhoeae isolates AR#1280 and AR#1281 which contain the penA 60 allele (Table 1). The data shows oxydifficidin maintains its potent activity against these multidrug-resistant strains. We also added a description of this data to the results section as shown below.

      Original text: “Oxydifficidin was more potent against N. gonorrhoeae MS11 than almost all other antibiotics we tested. In fact, it was only slightly less active than the highly optimized third-generation cephalosporin, ceftazidime.([18]) However, unlike third-generation cephalosporins, oxydifficidin retained activity against the multidrug resistant H041 clinical isolate (Table 1).([4]) H041 is resistant to the “standard of care” cephalosporin ceftriaxone (2 µg/mL) as well as a number of other antibiotics that are normally active against N. gonorrhoeae (penicillin G, 4 µg/mL; cefixime, 8 µg/mL; levofloxacin, 32 µg/mL).”

      Changed to: “Oxydifficidin was more potent against N. gonorrhoeae MS11 than most other antibiotics we tested. Notably, unlike clinically used antibiotics such as ceftriaxone, azithromycin, and ciprofloxacin, oxydifficidin retained activity against all multidrug-resistant clinical isolates we examined (Table 1).” (Line 77-79)

      (2) Does oxydifficidin have activity against N. meningitidis? It is the species most closely related to N. gonorrhoeae and the other pathogenic Neisseria.

      Oxydifficidin has potent activity against N. meningitidis ATCC 13090. In our revised manuscript, we have included its MIC data in Figure 1c.

      (3) Given claims that oxydifficidin activity in N. gonorrhoeae as compared to other Neisseria reflects N. gonorrhoeae's dedA and sensitive rplL, it would be good to assess the allelic diversity of these genes in N. gonorrhoeae. There are over 20,000 genomes from clinical isolates of N. gonorrhoeae in databases. It should be straightforward to check whether dedA and rplL allelic variants already exist in the population. Should variants be observed, oxydifficidin should be tested against the associated strains of N. gonorrhoeae.

      Response: We thank the reviewer for this suggestion. We aligned the DedA sequence from strain MS11 with DedA proteins from 220 N. gonorrhoeae strains that have high-quality assemblies in NCBI. The result showed that there are no amino acid changes in this protein. Using the same method, we observed several single amino acid changes in RplL. This included changes at A64, G25 and S82 in 4 strains with one change per strain. These sites differ from R76 and K84, where we identified changes that provide resistance to oxydifficidin. Notably, in a similar search of representative Escherichia, Chlamydia, Vibrio, and Pseudomonas NCBI deposited genomes, we did not identify changes in RplL at position R76 or K84.

      New text: “A survey of 220 N. gonorrhoeae strains with high-quality assemblies in NCBI found no mutations in the DedA protein.” (Line 104-105)

      “These two mutations were not found in the survey of the same collection of N. gonorrhoeae strains used to look for DedA mutations.” (Line 143-144)
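      The survey logic described above can be illustrated with a minimal sketch. It assumes the protein sequences have already been aligned to the MS11 reference and have equal length (no gaps); the sequences, strain names, and the position flagged as resistance-associated are toy placeholders rather than real RplL or DedA data, and the function names are hypothetical.

```python
# Minimal sketch: list amino acid substitutions relative to a reference,
# assuming pre-aligned, equal-length protein sequences (toy data only).

def substitutions(reference, other):
    """Return substitutions as strings like 'A4S' (1-based positions)."""
    if len(reference) != len(other):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        f"{ref_aa}{pos}{alt_aa}"
        for pos, (ref_aa, alt_aa) in enumerate(zip(reference, other), start=1)
        if ref_aa != alt_aa
    ]

def survey(reference, strains):
    """Map each strain name to its substitutions, omitting identical strains."""
    report = {}
    for name, seq in strains.items():
        subs = substitutions(reference, seq)
        if subs:
            report[name] = subs
    return report

def hits_at_sites(report, sites):
    """Return strains carrying a substitution at any position in `sites`."""
    return sorted(
        name
        for name, subs in report.items()
        if any(int(s[1:-1]) in sites for s in subs)
    )

# Hypothetical toy sequences; these are NOT real protein sequences.
reference = "MKRAG"
strains = {
    "strain_a": "MKRAG",   # identical to the reference
    "strain_b": "MKRSG",   # substitution at position 4
    "strain_c": "MKKAG",   # substitution at position 3
}
report = survey(reference, strains)
flagged = hits_at_sites(report, sites={3})  # position 3 is a placeholder site
print(report)
print(flagged)
```

      In a real survey, the reference and strain sequences would come from an alignment of the NCBI assemblies, and the sites of interest would be the positions where resistance-conferring changes were identified.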

      (4) Clinically relevant antibiotics for N. gonorrhoeae are penicillin, tetracycline, spectinomycin, gentamicin, ciprofloxacin, azithromycin, ceftriaxone; moreover, zoliflodacin and gepotidacin have reportedly successfully completed phase 3 trials. The authors should redo their MIC testing with these antibiotics (e.g., for Figures 1 and 2 and Tables 1 and 2), both because this will enable direct comparison with the many clinical isolates that have undergone testing and because these are the drugs most pertinent to clinical practice. Ampicillin, ceftazidime, chloramphenicol, bacitracin, and daptomycin are not relevant. Could the authors explain why they tested vancomycin, polymyxin B, irgasan, melittin, avilamycin, and thiostrepton?

      Our use of antibiotics with diverse modes of action (e.g. vancomycin, polymyxin B, irgasan, melittin, avilamycin, and thiostrepton) in this study was intended to aid in the identification of oxydifficidin’s mode of action. This is true for both Tables 1 and 2.

      To address the reviewer’s concern, in our revised manuscript, we have added MIC data for the following clinically relevant antibiotics: ceftriaxone (cephalosporin/beta-lactam), gentamicin (aminoglycoside), azithromycin (macrolide), and ciprofloxacin (fluoroquinolone) to Table 1.

      (5) Please describe the characteristics of the transposon library (finding four transposons in a single strain does seem unexpected, given how most transposon libraries aim for one transposon insertion per strain).

      We understand that one transposon insertion per strain is ideal for transposon libraries. This Bacillus strain proved to be recalcitrant to genetic manipulation. In the rare cases where we obtained resistant colonies upon electroporation with the transposon, all colonies contained multiple (≥ 4) transposon insertions. This made it impractical to build a library with one transposon insertion per library member.

      We assumed that the anti-N. gonorrhoeae activity most likely originated from a natural product BGC, which typically range from 10-100 kb in size.

      Based on the average of 50 kb per BGC, ~80 transposon insertions would be required to fully search the 4.2 Mb genome of Bacillus amyloliquefaciens BK for a BGC. At 4 mutations per transformant, 1x coverage of the genome would require only 20 library members.
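      The estimate above can be reproduced with a short calculation. This is only a sketch under the idealized assumption that insertions are spread evenly, one per average-sized BGC window; the variable and function names are illustrative. The exact values it gives (84 insertions, 21 members) correspond to the rounded figures of ~80 and 20 quoted above.

```python
import math

# Values taken from the text above.
GENOME_SIZE_BP = 4_200_000      # 4.2 Mb genome
AVG_BGC_SIZE_BP = 50_000        # assumed average BGC size (50 kb)
INSERTIONS_PER_MEMBER = 4       # transposon insertions per transformant

def insertions_for_1x(genome_bp, window_bp):
    """Insertions needed to place one insertion in every window of the genome,
    assuming (idealized) evenly spaced insertions."""
    return math.ceil(genome_bp / window_bp)

def members_for_1x(genome_bp, window_bp, insertions_per_member):
    """Library members needed when each member carries several insertions."""
    total = insertions_for_1x(genome_bp, window_bp)
    return math.ceil(total / insertions_per_member)

needed_insertions = insertions_for_1x(GENOME_SIZE_BP, AVG_BGC_SIZE_BP)
needed_members = members_for_1x(GENOME_SIZE_BP, AVG_BGC_SIZE_BP, INSERTIONS_PER_MEMBER)
print(needed_insertions, needed_members)
```

      Because real insertions are not evenly spaced, this figure is a lower bound for 1x coverage rather than a guarantee that every BGC is hit.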

      After extensive electroporation of transposon into Bacillus amyloliquefaciens BK, we were able to obtain a library of 50 members, including one mutant (Tn5-3) that lacked anti-N. gonorrhoeae activity.

      New text added to the methods section:

      “A library containing 50 transposon mutants was obtained. In the mutants examined, each strain contained ≥4 transposon insertions” (Line 337-339)

      (6) Please describe in the methods how you sequenced and annotated the genome of Bacillus amyloliquefaciens BK.

      The sequencing method is now described in the “Genomic Sequencing and annotation of Bacillus amyloliquefaciens” section. The genome of Bacillus amyloliquefaciens BK was not fully annotated. Mutations were identified as described in the updated methods section below.

      New text:

      “Genomic Sequencing and annotation of Bacillus amyloliquefaciens

      Genomic DNA from Bacillus amyloliquefaciens BK WT and transposon mutant Tn5-3 was isolated using PureLink Microbiome DNA purification kit (Invitrogen) according to the manufacturer’s instructions.

      The Bacillus amyloliquefaciens BK WT genome was assembled by mapping its sequencing data onto the annotated genome of Bacillus amyloliquefaciens FZB42 using Geneious Prime. Differences in the mutant strain Tn5-3 were identified by mapping its sequencing data onto the assembled Bacillus amyloliquefaciens BK WT genome. The mutated genes were then annotated using NCBI BLAST. The oxydifficidin BGC was annotated using the antiSMASH online server.” (Line 253-260)

      (7) Please describe in the methods how you screened the library for strains that lacked anti-gonococcal activity.

      The method has been added to our revised manuscript as the section “Screening of Bacillus Strains Lacking Anti-N. gonorrhoeae Activity”.

      New text:

      “Screening of Bacillus Strains Lacking Anti-N. gonorrhoeae Activity

      The transposon mutants of Bacillus amyloliquefaciens BK were grown overnight in LB medium at 30 °C. Each overnight culture was then diluted 1:5000, and 1 μl of the diluted culture was spotted onto a GCB agar plate swabbed with N. gonorrhoeae cells. The plate was then incubated overnight at 37 °C with 5% CO2. The mutant strain (Tn5-3) lacking anti-N. gonorrhoeae activity was identified due to its failure to produce a zone of growth inhibition in the resulting N. gonorrhoeae lawn.” (Line 341-346)

      (8) Was only one strain found that was a 'non-producer' of anti-N. gonorrhoeae activity? Line 68 suggests that this was only one of multiple non-producers. Is that correct? If so, did you work up the others, and did they also have disruptions in the same biosynthetic gene cluster?

      Only one strain was identified as a “non-producer” of anti-N. gonorrhoeae activity. We have modified the text to clarify this point.

      Original text: “The sequencing of one non-producer strain revealed that it surprisingly contained four transposon insertions and one frame shift mutation.”

      Changed to: “The sequencing of the non-producer strain revealed that it surprisingly contained four transposon insertions and one frame shift mutation.” (Line 53-54)

      (9) All sequences (including Bacillus amyloliquefaciens BK) must be deposited in a public database (e.g., NCBI) and the accession numbers reported in the manuscript.

      Genomic sequence data of Bacillus amyloliquefaciens BK have been deposited in GenBank, and the accession number (GCA_019093835.1) now appears in the legend of Figure S1a.

      Figure S1a legend:

      “Genome-based phylogenetic tree containing Bacillus amyloliquefaciens BK and closely related Bacillus spp. The tree was built by Genome Clustering of MicroScope using neighbor-joining method. The NCBI accession numbers of Bacillus strains used in the tree are GCA_000196735.1, GCA_000204275.1, GCA_000015785.2, GCA_019093835.1, GCA_000009045.1, GCA_000011645.1, GCA_000172815.1, GCA_000008005.1, and GCA_000007845.1 (from top to bottom).”

      Minor

      (10) Statements in the article would benefit from fact-checking. For example:

      - gonorrhea is not the second most prevalent sexually transmitted infection worldwide; it is the second most reported bacterial sexually transmitted infection.

      - Treatment is ceftriaxone 500mg IM x1 in the US, but 1g IM x1 in the UK and Europe. The UK guidelines also permit ciprofloxacin, should sequencing indicate gyrA 91S. I suggest reviewing / specifying which treatment guidelines you're referring to.

      We appreciate the reviewer’s corrections. The word “prevalent” is now changed to “reported”.

      Original text: “Gonorrhea, which is caused by Neisseria gonorrhoeae, is the second most prevalent sexually transmitted infection worldwide.”

      Changed to: “Gonorrhea, which is caused by Neisseria gonorrhoeae, is the second most reported sexually transmitted infection worldwide.” (Line 2-3)

      Original text: “Gonorrhea is the second most prevalent sexually transmitted infection worldwide, its causative agent is the bacterium Neisseria gonorrhoeae.”

      Changed to: “Gonorrhea is the second most reported sexually transmitted infection worldwide, its causative agent is the bacterium Neisseria gonorrhoeae.” (Line 18-19)

      “In the USA” is now added to the sentence stating gonorrhea treatment.

      Original text: “The high dose (500 mg) of the cephalosporin ceftriaxone is currently the only recommended therapy for treating gonorrhea infections.”

      Changed to: “The high dose (500 mg) of the cephalosporin ceftriaxone is currently the only recommended therapy for treating gonorrhea infections in the USA.” (Line 20-22)

      (11) Please make sure all results are in the results section. The report of cell morphology, for example, should be in the results, not the discussion.

      In our revised manuscript, we have included the cell morphology data in the results section with the text changes below.

      Original text: “Interestingly, not only was dedA deficient N. gonorrhoeae less susceptible to oxydifficidin, oxydifficidin also kills this mutant more slowly (Figure 2b) than WT N. gonorrhoeae MS11.”

      Changed to: “Interestingly, not only was dedA deficient N. gonorrhoeae less susceptible to oxydifficidin, oxydifficidin also kills this mutant more slowly (Figure 2b) than WT N. gonorrhoeae MS11. The dedA deletion mutant also showed an altered cell morphology with reduced membrane integrity and lower formation of micro-colonies (Figure S4).” (Line 100-104)

      Original text: “The dedA deletion mutant also showed an altered cell morphology with reduced membrane integrity and lower formation of micro-colonies (Figure S4), indicating that it should show reduced pathogenesis and fitness, and, as a result, not accumulate in a clinical setting, which adds to the therapeutic appeal of oxydifficidin.”

      Changed to: “The dedA deletion mutant exhibited altered cell morphology, characterized by diminished membrane integrity and reduced micro-colony formation, indicating that it should show reduced pathogenesis and fitness, and, as a result, not accumulate in a clinical setting, which adds to the therapeutic appeal of oxydifficidin” (Line 206-210)

      (12) Tables 1 and 2 should be combined and should address the most relevant antibiotics

      The MIC data of additional relevant antibiotics are now included in Table 1. However, we still believe that keeping Tables 1 and 2 separate enhances the clarity of the manuscript. Table 2 specifically focuses on diverse ribosomal targeting antibiotics, which highlights the unique binding site of oxydifficidin.

      (13) Supplemental Figure 1a. The tree could be better resolved, and there are four entries with the identical listing of "Bacillus amyloliquefaciens subsp. plantarum" on different branches. In the methods or the legend, please indicate the accession numbers for these genomes. Also please specify how this tree was made-is it a maximum likelihood tree? Something else?

      The tree is now better resolved and includes new entries. The requested information regarding accession numbers and tree construction method has been included in the figure legend.

      New supplemental Figure 1a legend:

      “a. Genome-based phylogenetic tree containing Bacillus amyloliquefaciens BK and closely related Bacillus spp. The tree was built by Genome Clustering of MicroScope using neighbor-joining method. The NCBI accession numbers of Bacillus strains used in the tree are GCA_000196735.1, GCA_000204275.1, GCA_000015785.2, GCA_019093835.1, GCA_000009045.1, GCA_000011645.1, GCA_000172815.1, GCA_000008005.1, and GCA_000007845.1 (from top to bottom).”

      Reviewer #2 (Recommendations For The Authors):

      The conclusions drawn in the manuscript are well-supported by the experimental data presented.

      I have the below minor comments:

      (1) "serendipitously identified" - I feel this wording should be avoided throughout the manuscript. The point of a research paper is to communicate methodology and experimental detail, and this language portrays the opposite.

      While we agree that methodology and experimental procedures are paramount in scientific reporting, we believe it is equally important to convey, particularly to younger generations, that a part of the scientific process is often unplanned and can benefit from chance observations. Therefore, we would like to keep this wording.

      (2) The introduction should include the biological roles/function of DedA proteins in bacteria.

      DedA proteins perform a wide array of biological roles and functions in bacteria. In the results section (Line 107-116), we have described the most well-established of these functions, particularly the flippase activity, which appears to be directly related to oxydifficidin sensitivity. We believe that introducing this information in the results section enhances the manuscript’s clarity and flow.

      (3) "When we screened this contaminant for antibacterial activity against lawns of other Gram-negative bacteria it did not produce a zone of growth of inhibition against any of the bacteria we tested (e.g., Escherichia coli, Vibrio cholerae, Caulobacter crescentus)." Can these data Figures be included in the Supplements?

      This result was recorded in the lead author’s notebook, but no image was saved.

      (4) Line 52: Was any base analyses performed on the Tn-mutants i.e., how many insertion-sites? Depth of mutants? Was a library constructed in this study or previously? Why were only BGC assessed?

      Please see our response to Reviewer #1’s comment (5). We focused on BGCs because we believed the anti-N. gonorrhoeae activity most likely resulted from a molecule encoded by a natural product BGC.

      (5) Line 98: Do the other 2 predicted DedA-like proteins also have a role in uptake of oxydifficidin? Is there some redundancy in uptake?

      We generated knockout mutants for two other predicted DedA-like proteins in N. gonorrhoeae MS11, and the MIC of oxydifficidin for these mutants remained the same as for the N. gonorrhoeae MS11 wild type strain. Therefore, we believe that the DedA protein discussed in this manuscript is the primary transporter of oxydifficidin. However, we cannot completely rule out the possibility of redundancy in oxydifficidin uptake by other DedA-like proteins.

      New text: “We also generated deletion mutants for two other predicted dedA-like genes, and the MIC of oxydifficidin for these mutants remained the same as for the N. gonorrhoeae MS11 wild type strain.” (Line 98-100)

      Reviewer #3 (Recommendations For The Authors):

      This is a well presented manuscript and I could not immediately see any issues with it.

      We appreciate the reviewer’s positive feedback.

    1. Author response:

      We are submitting a revised manuscript with major additions that address the main concerns in the initial reviews. At the highest level, this revision provides i) orthogonal biochemical measurements that yield concrete evidence of lysosomal protein aggregates, and ii) a plausible mechanism linking lysosomal lipid handling and protein aggregation through disruption of ESCRT function. We believe these additions significantly improve the completeness of this study and the conclusions that can be drawn from the data.

      Below are more specific highlights of the additions in this revision:

      -       We included orthogonal techniques (thioflavin-T staining and Lyso-IP followed by differential extraction) and confirmed the accumulation of RIPA-insoluble protein aggregates at the lysosomes in cells under lipid perturbation (Figure 3).

      -       We performed TMT-Proteomics and identified accumulation of insoluble ESCRT components at the lysosomes under lipid perturbation (Figure 4). Two new authors involved in this effort have been added to the manuscript.

      -       The ESCRT result prompted us to revisit lysosomal membrane integrity. With improved imaging conditions and analysis we were able to see increased membrane permeabilization under lipid perturbation. VPS4A overexpression partially rescued this phenotype, suggesting that lipid accumulation impairs ESCRT disassembly (Figure 5).

      -       Together, the results suggest that lipid perturbation impairs ESCRT function, compromising both lysosomal membrane repair and microautophagy, resulting in the accumulation of endogenous protein aggregates at the lysosomes (Graphical Abstract).

      Reviewer #1 (Recommendations For The Authors):

      (1) Perhaps the most prominent limitation of this work is the unilateral focus on native cells (i.e. cells under no endogenous or exogenous stress) as the model for protein aggregate formation. Furthermore, although the ProteoStat stain has been utilized by many investigators before, the sole reliance on this stain as the read-out for their assays is concerning. To compound the concern, the ProteoStat-positive puncta co-localize with lysosomal markers, which was surprising even to the authors. All in all, it behooves the authors to test proteostasis in multiple parallel ways to actually define what they are studying. How is it possible that protein aggregates under native conditions are only co-localized with lysosomes? Are we really studying protein aggregates which should predominantly be cytoplasmic insoluble aggregates?

      (a) They need to get away from a simple stain like ProteoStat and conduct co-stainings with other markers such as poly-ubiquitin antibodies and other chaperones to define what and where else exactly are these aggregates.

      Co-staining with poly-ubiquitin was included in the original manuscript. We added orthogonal staining with another widely used amyloid dye, Thioflavin-T, and provided fine-grained quantification of lysosomal vs cytosolic localization of various signals (Figures S4A-C & 3A-B).

      (b) They need to do Immunoblots with and without triton insolubility to see if these aggregates are insoluble as most would predict. They can do lysosomal isolation vs cytoplasmic to see if the insoluble aggregates are really lysosomal.

      We performed Lyso-IP followed by differential detergent extraction to confirm the accumulation of insoluble proteins at the lysosomes (Figure 3C). Proteomic analysis identified some of these insoluble proteins as ESCRT subunits (Figure 4).

      (c) They should compare aggregate formation in the native state versus cells with lysosomal inhibition via Bafilomycin or chloroquine versus cells with proteasomal inhibition. The lysosomal inhibition experiments are particularly informative given the lysosomal relevance they have uncovered.

      We included other small-molecule inhibitors at different time points to compare the effects of different modes of proteostasis challenge (Figure S4A-D). Together with the ESCRT finding, our results suggest a role for microautophagy in our system, and provide a model of how ProteoStat- and/or ubiquitin-positive substrates become partitioned between the cytoplasm and lysosomes under different perturbations.

      (d) Many protein aggregates which are too bulky for proteasome degradation will traditionally be dealt with by aggrephagy. Why is this not observed?

      Knockdown of core macroautophagy components did not impact ProteoStat intensity in our CRISPRi screen, suggesting that basal macroautophagy plays a negligible role in clearing endogenous amyloid-like structures in our experimental system. We provide an alternative model in which these aggregates instead arrive at the lysosomes via microautophagy.

      (2) After addressing #1, they can validate if the genes they identified by CRISPR screens are also important in modulation of protein aggregate burden in other systems. For example, if they inhibit lysosomes by Bafilo or Chloroquine to obtain protein aggregates and then Knockdown the identified genes in the CRISPR screens, will they get the same results?

      We addressed the effect of different modes of proteostasis challenge as recommended above. Deacidifying the lysosomes alone causes intense protein aggregation (Figure S4A-D) and eventually cell death, and was thus not combined with other perturbations.

      (3) They identify lysosomal lipid metabolism genes/pathways as the culprit for inducing proteostasis. In particular sphingolipid and cholesteryl ester species appear to be operational here. However, there are no specific lipids species or specific lipid metabolism gene that is causative. Rather, you have to knockdown entire processes to have an effect. This suggests that the focus on lysosome health (i.e. permeability, proteolysis, etc) is rudimentary. When you have to knockdown entire classes of lipids, this would indicate more broad effects on cellular lipids (including membrane lipids beyond the lysosome) and related cellular health?

      We included data on the effect of knocking down MYLIP, PSAP, and as a comparison PSMD2 on the growth rate of K562 cells (Figure S5A). MYLIP and PSAP KDs, which cause predominantly an accumulation of lipids, do not impede cell growth. Increasing lipid uptake by MYLIP KD increases cell proliferation under our culture conditions, suggesting a general negative impact on cell health was not required for the association between lipid levels and protein aggregates.

      (a) They conduct a superficial methyl-beta-cyclodextrin experiment with equivocal results. The use of MBCD for different time-courses to deplete various membrane cholesterol pools including the plasma membrane pool is important to ascertain what aspect of the cellular cholesterol is affecting proteostasis. MBCD +/- cholesterol reintroduction time-courses for rescue will also be key to determine the culprit cellular cholesterol pool.

      The MBCD / Filipin experiment helped us determine that ProteoStat doesn’t directly stain cholesterol, nor any major plasma membrane components. Free cholesterol was implicated in neither the screen nor the lipidomics and was not the subject of targeted experiments.

      (b) The same concept can be applied to sphingolipids. There are sphingolipids in abundance in multiple membrane compartments. Which ones are causal here? More nuanced evaluation of this with sphingolipid staining/tracking can be conducted.

      We attempted experiments where sphingolipids were added back to cells grown in FBS-depleted media. Nevertheless, we were not able to consistently deliver these lipid species, and doing so while ensuring the correct subcellular localization at physiologically relevant levels would require substantial methods development.

      (c) As part of this, are lipid rafts and/or caveolae being affected by the perturbations in cholesterol and sphingolipids? Lipid rafts are highly enriched in these 2 lipids which could link to their proteostasis observation.

      Indeed, ceramides released from SM hydrolysis are proposed to self-assemble into microdomains with negative curvature that can promote the formation of intralumenal vesicles (Alonso and Goñi, 2018; Niekamp et al., 2022). We propose that SM accumulation may hinder this process by counteracting the negative membrane curvature, thereby impeding microautophagy.

      (d) How about ER membrane lipids? The UPR and subsequent effects on proteostasis are intricately involved with ER lipid bilayer composition.

      We did not perform lipidomics on ER membranes in this study, though we note that at steady state, sphingolipids and cholesterol esters are not expected to be enriched at the ER (Ikonen and Zhou, 2021). We checked whether lipid-related genetic perturbations induced the UPR in published perturb-seq data in K562 cells. Neither MYLIP nor PSAP knockdown induced a UPR.

      In conclusion, the manuscript is interesting but the excitement over a link between lysosome-related lipid metabolism and proteostasis needs to be tamped until a more robust experimental approach is employed to generate supportive and corroborating results.

      Reviewer #2 (Recommendations For The Authors):

      - The paper has a number of grammatically awkward sentences. Editing these would enhance clarity.

      - It is important to show the co-localization of aggregates with the lysosome. This is shown in supplements but should be in a main figure. Here the authors cite previous work indicating that ProteoStat puncta co-localize with ubiquitinated proteins and state that they do not see this, then essentially just move on. Is there an explanation for this discrepancy and can it be resolved? What do they think is really going on? What happens to levels of ubiquitinated proteins when lipid metabolism is perturbed as in these experiments?

      We have included the lipid-induced lysosomal protein aggregation data in the main text (Figure 3A-B), and provided fine-grained quantification of the cytosolic-vs-lysosomal ProteoStat / Ub / ThT signals under different aggregate-inducing conditions (Figure S4A-D). We discuss these results in the main text and propose a model involving ESCRT-mediated microautophagy. This is supported further by the LysoIP-proteomics and LMP analysis.

      - Please add an indicator of amino acid numbers to Fig. 3C.

      These annotations are now included (now Figure S3C).

      - The legend for 3D is mislabelled.

      We have corrected the legend (now Figure S3D).

      Reviewer #3 (Recommendations For The Authors):

      Protein homeostasis and lipid homeostasis are both important for maintaining cellular functions. However, the crosstalk between them remains largely unknown. The manuscript entitled "Impairment of lipid homoeostasis causes accumulation of protein aggregates in the lysosome" deals with this interesting topic. An important link between lysosomal protein aggregation and sphingolipid/cholesterol ester metabolism was discovered. The topic, which belongs to the Cell Biology domain, also falls within the aims and scope of eLife. Here are the revisions I recommend:

      (1) From lipidomics analysis, a remarkable correlation between levels of sphingomyelin and cholesterol ester and ProteoStat staining was found. Could the authors explain how sphingomyelin and cholesterol ester are quantified? The two lipids are not included as internal standards from the lipidomics experiment.

      Sphingomyelin and cholesterol ester internal standards are included in the Avanti 330707 SPLASH® LIPIDOMIX® Mass Spec Standard, which was supplied at 3% v/v to the MeOH/H2O cell lysis buffer. We have amended the Methods section to clarify this.

      (2) Could the authors perhaps delete Figure 1B and show it in Figure 2A only? There is no need to show the same figure two times. The thresholds for both False Discovery Rate and Median Enrichment need to be added. In Figure 2A, the lysosomal hydrolases (GBA, LIPA, GALC) seem to be located in a statistically insignificant region. Based on previous studies, GBA could have an effect on sphingolipid levels; how, then, can the high correlation between sphingomyelin and ProteoStat staining be explained?

      We have combined the two volcano plots into a single figure (now Figure 1D), and added a line to help visualize the gene effects while considering the combined contribution of FDR and enrichment. Individual lysosomal hydrolases indeed have insignificant effects on ProteoStat, which we discuss in the main text as reflecting their relatively constrained impacts on the general lipidome. For example, while GBA and GALC KDs can lead to accumulation of their immediate substrates (glucosylceramide and galactosylceramide, respectively), they do not directly impinge on sphingomyelin.

      (3) The authors show the correlation between ProteoStat staining and different lipids/lipid classes in Figure 3B and Figure S3A. It is not necessary to show the correlation with individual lipids (such as sphingomyelin(d18:1/24:0) and cholesterol ester(18:2)). The correlation with the full collection of lipid classes, which is only listed in Figure 3B and Figure S3A, would be more representative. It is suggested to add how many individual lipids in each class are used for the correlation analysis. It is also suggested to replace Figure 3A with Figure S3A and to move Figure 3A to the supplement.

      We decided to retain the correlation of two individual lipids (a sphingomyelin and a cholesterol ester species) with ProteoStat as examples to illustrate with clarity how we obtained the class-wide comparison. The number of individual lipids included in each class for correlation analysis is now included in Figures 2F and S3A.

      (4) The authors state that lipid uptake and metabolism modulate proteostasis. However, only cholesterol and LDL were tested. It would be more precise to state that cholesterol uptake and metabolism modulate proteostasis. In addition, sphingolipids and cholesterol esters accumulate with increased lysosomal protein aggregation. It would be interesting to see the effects of sphingolipid uptake, since sphingolipids correlate with proteostasis better than cholesterol does.

      We attempted to add back specific sphingolipids to assess sufficiency. However, we found it challenging to ensure that these lipids were distributed to the correct subcellular locations at physiologically relevant levels. Without this crucial information, it was difficult to draw any conclusions about the sufficiency of the sphingolipids we tested to impair proteostasis.

      Alonso A, Goñi FM. 2018. The Physical Properties of Ceramides in Membranes. Annu Rev Biophys 47:633–654. doi:10.1146/annurev-biophys-070317-033309

      Ikonen E, Zhou X. 2021. Cholesterol transport between cellular membranes: A balancing act between interconnected lipid fluxes. Dev Cell 56:1430–1436. doi:10.1016/j.devcel.2021.04.025

      Niekamp P, Scharte F, Sokoya T, Vittadello L, Kim Y, Deng Y, Südhoff E, Hilderink A, Imlau M, Clarke CJ, Hensel M, Burd CG, Holthuis JCM. 2022. Ca2+-activated sphingomyelin scrambling and turnover mediate ESCRT-independent lysosomal repair. Nat Commun 13:1875. doi:10.1038/s41467-022-29481-4

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2024-02546

      Corresponding author: Woo Jae, Kim

      1. General Statements

      This is the second revision of the manuscript.

      After thoroughly reviewing the comments provided by the EMBO Journal reviewers, we found their feedback to be highly constructive and valuable for enhancing our manuscript without the need for additional experiments. For example, Reviewer 1 acknowledged that our "data are intriguing and some of the experiments are quite convincing," but suggested that the manuscript contained excessive data that required simplification. This sentiment was echoed by Reviewer 2. In response, we have completely reformatted our manuscript to eliminate unnecessary imaging quantification data and CrzR-related screening data. The reviewers noted the density of our experimental data, which has led us to focus on the SIFa to Crz-CrzR circuit mechanisms related to heart function and interval timing in future projects.

      Reviewer 2's comments were generally more moderate, and we successfully addressed all five of their points with detailed explanations and modifications to our manuscript. They positively remarked that "Overall, this highly interesting study advances our knowledge about the behavioral roles of SIFamide and contributes to an understanding of how motivated behavior such as mating is orchestrated by modulatory peptides." Additionally, Reviewer 3 accepted our manuscript without any further comments.

      In summary, we believe we have effectively addressed all concerns raised by Reviewers 1 and 2, resulting in a clearer manuscript that is more accessible to a broader audience.

      2. Point-by-point description of the revisions

      Reviewer #1

      General Comments: In this revision of their manuscript, Zhang et al have attempted to address most of the points raised by the reviewers, however, they have not assuaged my most important concerns. The manuscript contains a ton of information, but at times this is to the detriment of the narrative flow. I had a lot of trouble following the rationale of each experiment, and the throughline from one experiment to the next is not always obvious. The data are intriguing, and some of the experiments are quite convincing, but other experiments are either superfluous or have methodological issues. I will summarize the most acute issues below.

      Answer: Thank you for your thoughtful feedback and for acknowledging our efforts to address your previous comments. We appreciate your recognition of the intriguing nature of our data and the convincing aspects of our experiments. In this second revision, we have taken your concerns regarding the narrative flow and data overload to heart. We have completely reshaped our manuscript, significantly reducing unnecessary data, including the NP5270 data and overlapping quantification results that did not contribute meaningfully to the storytelling. Our goal was to streamline the presentation of our findings to enhance clarity and coherence, ensuring that each experiment clearly supports the overarching narrative. We believe these revisions will not only improve the readability of our manuscript but also allow readers to follow the rationale behind each experiment more easily. We are confident that this refined approach will make our contributions clearer and more impactful. Thank you once again for your constructive insights, which have been invaluable in guiding us toward a more focused and compelling presentation of our work.

      Comment 1. The authors argue that genetic controls are unnecessary because they have been conducted in previously published papers. I am concerned with this argument, as it is good practice to repeat controls with each experiment. However, I am overall convinced by the basic phenotype indicating that panneuronal SIFaR knockdown eliminates the changes in mating duration associated with previous experience. As for the more restricted 24F06-GAL4, the phenotype is odd: the flies do actually change their mating duration, just in the opposite direction of controls. Doesn't this imply that these flies are still capable of "interval timing", and of changing their mating strategy following exposure to rivals or following sexual experience?

      Answer: We appreciate the reviewer's critical comments regarding genetic control and the intriguing phenotypes we observed in specific genetic combinations. We fully agree with the reviewer and have repeated all genetic control experiments for this revision, confirming that our genetic controls consistently demonstrate intact LMD and SMD behaviors, as previously reported. These genetic control experiments have been included in Supplementary Information 1-2. We are grateful to the reviewer for the opportunity to reaffirm that LMD and SMD represent stable behavioral phenotypes suitable for genetically studying interval timing, supported by reproducible data.

      We acknowledge the reviewer's insightful comments about the phenotype observed when SIFaR is knocked down: both singly reared and sexually experienced males show a lengthened mating duration, in contrast to normal LMD and SMD behaviors. We have observed a similar phenotype when specific neural circuits are disrupted, for example when sNPF peptidergic signaling is disrupted in a restricted neuronal population [4]. We are now investigating this phenotype under the hypothesis that it reflects disinhibition. We have explained this phenotype and the concept of disinhibition in the main text as follows:

      In the spatial, the targeted reduction of SIFaR expression in the GAL424F06 neuronal subset resulted in a notable alteration of mating behavior. Both singly reared and sexually experienced flies exhibited an extended mating duration relative to naïve flies, contrary to the expected reduction. This observation indicates a deficit in the neural mechanism responsible for modulating mating duration, suggesting a disinhibition-like effect within the neural circuitry governing mating behavior. We have also previously observed a similar phenotype when sNPF peptidergic signaling is inhibited in specific neuronal circuits [62]. Disinhibition, characterized by the alleviation of inhibitory constraints, permits the activation of neural circuits that are ordinarily repressed. This process is instrumental in sculpting behavioral patterns and facilitating the sequential progression of behaviors. Through the orchestrated promotion of select neuronal activation and concurrent inhibition of competing neural routes, disinhibition empowers the brain with the ability to dynamically ascertain and preserve the requisite behavioral state, concurrently smoothing the transition to ensuing behavioral phases [63]. It is known that Drosophila neural circuits also exhibit disinhibition phenotypes in light preference and ethanol sensitization [64,65]. Further investigation is needed to uncover the underlying mechanisms of this disinhibition-like phenotype observed in LMD and SMD behaviors.

      This reversed phenotype strongly suggests a disruption in interval timing, as one would expect that if interval timing were normal and intact, male flies would decrease their mating duration in response to appropriate environmental changes. For instance, research has shown that patients with Parkinson's disease exhibit heterogeneity in temporal processing, leading to disrupted interval timing phenotypes [5]. Therefore, if male flies subjected to social isolation or sexual experience do not show a reduction in mating duration compared to control conditions, it indicates a potential disruption in their interval timing mechanisms. We appreciate the reviewer's encouragement to further explore this intriguing disinhibition-like phenotype, and we plan to investigate this aspect in our future projects.

      Comment 2. I am glad to see the addition of data assessing the extent of SIFaR and CrzR RNAi knockdown; however, this has not completely addressed my concerns about interpretation of behavioral phenotypes. In both cases, the knockdown was assessed by qPCR using the very strong tub-GAL4 driver. mRNA levels are decreased but not nearly eliminated. Thus, when in lines 177-178 the authors assert: "Consequently, we infer that the knockdown of SIFaR using the HMS00299 line nearly completely diminishes the levels of the SIFaR protein," the statement is not supported by the data. The qPCR results showed a knockdown at the mRNA level of ~50%. No assays were conducted to measure protein levels. The conclusions should be tempered to align with the data. Furthermore, it is not clear that knockdown is as successful with other drivers, which means that negative behavioral data must be interpreted with caution. For example, the lack of phenotype with repo-GAL4 driving SIFaR RNAi or elav-GAL4 driving CrzR RNAi could be due to a lack of efficient knockdown. This should be acknowledged.

      Answer: We appreciate the reviewer's critical observation regarding the efficiency of SIFaR knockdown. We fully agree that it is essential to confirm both for ourselves and our readers that the SIFaR knockdown phenotype is robust and convincing. At the outset of this project, we tested all available SIFaR-RNAi strains following established protocols within the fly community to ensure consistency in our findings. When we employed strong drivers such as tub-GAL4 and nSyb-GAL4 for SIFaR-RNAi knockdown, we observed that the flies failed to eclose and exhibited a lethal phenotype during the larval stage, which closely resembles the homozygous lethal phenotype seen in SIFaR mutants. This suggests that, in most cases, the effects of SIFaR knockdown can effectively mimic those of SIFaR mutations. To share our methodology and reinforce our findings, we have added clarifying statements in the main text as follows:

      "Employment of broad drivers, including the tub-GAL4 and the strong neuronal driver nSyb-GAL4, with HMS00299 line consistently results in 100% embryonic lethality (data not shown). This phenotype mirrors the homozygous lethality observed in the SIFaRB322 mutant."

      Due to the significant lethality phenotype observed, we conducted PCR analyses using a combination of tub-GAL80ts and SIFaR-RNAi. As detailed in Fig. 1E, we reared the flies at 22 °C to suppress RNAi expression and then shifted the temperature to 29 °C for just three days prior to performing PCR. While our PCR results indicate a 50% reduction in SIFaR levels, we believe that experiments conducted without the tub-GAL80ts system would likely demonstrate an even greater reduction in SIFaR expression. To clarify this point and provide additional context, we have included the following description in the main text:

      "The silencing of SIFaR mRNA was achieved at approximately 50% using the HMS00299 knockdown line in combination with tub-GAL80ts, with RNAi induction lasting for three days (bottom diagram in Fig. 1E). Notably, the same tub-GAL4 driver, when used without the tub-GAL80ts combination, resulted in embryonic lethality while still reducing SIFaR mRNA levels by 50% after three days of RNAi induction. This finding suggests that SIFaR knockdown using the HMS00299 line with GAL4 drivers is likely sufficient to elicit the observed LMD and SMD behaviors. This rationale underscores the effectiveness of our experimental approach and its potential implications for understanding the role of SIFaR in mating behaviors."

      We also concur with the reviewer that the absence of a behavioral phenotype associated with CrzR-RNAi may be due to inefficient RNAi knockdown. Consequently, we have included a description of this issue in the main text as follows:

      "It is important to consider that the 50% knockdown of SIFaR and CrzR may be sufficient to disrupt LMD and/or SMD behavior. However, the lack of phenotype with repo-GAL4 or elav-GAL4 could be due to a less efficient knockdown. This possibility highlights the need for cautious interpretation of negative behavioral data."

      Comment 3. Regarding the issue of outcrossing, I am confused by the authors' statement: "To reduce the variation from genetic background, all flies were backcrossed for at least 3 generations to CS strain. For the generation of outcrosses, all GAL4, UAS, and RNAi lines employed as the virgin female stock were backcrossed to the CS genetic background for a minimum of ten generations. Notably, the majority of these lines, which were utilized for LMD assays, have been maintained in a CS backcrossed state for long-term generations subsequent to the initial outcrossing process, exceeding ten backcrosses." It's not clear what this means. Perhaps the authors could definitively state how many times each line was outcrossed. The genetic background is important because of 1) the lack of all controls, and 2) the variability of the behavioral phenotype. Often, the presence or absence of LMD or SMD appears to depend on the behavior of the control flies. When these flies show low mating duration, there is typically not a reduction following sexual experience or group raising. Could these differences derive from genetic background or transgenic insertion effects?

      Answer: We appreciate the reviewer's concern regarding the potential for confusion stemming from our descriptions of the genetic background. As the reviewer noted, we have published multiple papers on LMD and SMD behaviors, and we have conducted our experiments with careful attention to controlling the genetic background [1-3,6-8]. In response to the reviewer's comments about the importance of genetic control and background, we have completed all necessary genetic control experiments and confirmed that all our flies have been backcrossed for more than ten generations to the Canton-S (CS) strain. We believe that we have adequately addressed the reviewer's concerns regarding potential differences arising from genetic background or transgenic insertion effects. To provide readers with more detailed information about our genetic background, we have added a paragraph in the MATERIALS AND METHODS section as follows:

      "The CS background was selected as the experimental background due to its well-characterized and consistent LMD and SMD behaviors. To ensure that genetic variation did not confound our results, all GAL4, UAS, and RNAi lines employed in our assays were rigorously backcrossed into the CS strain, often exceeding ten generations of backcrossing. This approach was undertaken to isolate the effects of our genetic manipulations from those of genetic background. We assert that the extensive backcrossing to the CS background, in concert with the internal control in LMD and SMD, provides a stable platform for the accurate interpretation of the LMD and SMD phenotypes observed in our experiments."

      Comment 4. I continue to have substantial concerns about the thresholding method used across many experiments to quantify overlap, and then to claim that this indicates that synaptic connections are being made between different neuronal populations. The degree of overlap will depend on factors including the settings during imaging (was care taken to prevent pixel saturation?). It is also not clear to me from the methods whether analysis was done on single confocal images or on projections. The images shown in the figures look like maximum projections of a confocal stack. Overlap would have to be assessed on individual confocal sections; it is possible that this is what was done for analysis, but this is not clear from the description in the methods. Furthermore, a lot of figure space is dedicated to superfluous information. For example, in Figure 1F-J, there is a massive amount of space dedicated to assessing the degree of overlap between red stinger and CD4GFP, each driven from the same SIFaR2A driver, and further assessing what percentage of the CD4GFP signal overlaps with nc82, with the apparent goal of showing that a lot of the SIFaR signal is at active zones. This information does little to drive the narrative forward, and is quite confusing to read. Finally, the confocal images are generally too small to actually assess.

      Answer: We appreciate the reviewer's concerns regarding our imaging quantification methods. We recognize the importance of providing a clear and transparent methodology for both readers and the broader scientific community. Instead of using maximum projection of confocal images, we employed a projection method that incorporates the standard deviation function available in ImageJ. Based on our experience, this approach yields more reliable quantification results, allowing for a more accurate assessment of our data. To ensure clarity and reproducibility, we have detailed our methods in the MATERIALS AND METHODS section as follows:

      "The quantification of the overlap was performed using confocal images with projection by standard deviation function provided by ImageJ to ensure precise measurements and avoid pixel saturation artifacts."

      We appreciate the reviewer's suggestion regarding the inclusion of image quantification data for overlapping regions, which may not be essential to the logical flow of our narrative and could lead to confusion for readers. In response, we have removed nearly all of the quantification data related to overlapping regions, retaining only those that we consider critical for the paper. Currently, only Fig. S3B-E remains, as it is important for illustrating how SIFa neuronal arborization interacts with SIFaR neurons in the central nervous system.
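      The standard-deviation projection and overlap measurement discussed in this response can be illustrated with a minimal NumPy sketch. It is not the authors' ImageJ workflow: the function names, the (z, y, x) stack layout, the use of the sample standard deviation (`ddof=1`), and the fixed thresholds are all assumptions made for illustration, since the response does not specify how thresholds were chosen.

```python
import numpy as np


def sd_projection(stack: np.ndarray) -> np.ndarray:
    """Collapse a (z, y, x) confocal stack into a 2D image by taking the
    per-pixel standard deviation along z.

    ddof=1 (sample standard deviation) is an assumption about the
    projection convention, not something stated in the response.
    """
    return np.asarray(stack, dtype=np.float64).std(axis=0, ddof=1)


def overlap_fraction(proj_a: np.ndarray, proj_b: np.ndarray,
                     thr_a: float, thr_b: float) -> float:
    """Fraction of above-threshold pixels in channel A that are also
    above threshold in channel B. Thresholds are hypothetical inputs."""
    mask_a = proj_a > thr_a
    mask_b = proj_b > thr_b
    n_a = int(mask_a.sum())
    if n_a == 0:
        return 0.0  # no signal in channel A, so no overlap can be defined
    return float((mask_a & mask_b).sum()) / n_a
```

      Unlike a maximum projection, which keeps only the brightest voxel at each position, the standard-deviation projection reflects how much the signal varies across optical sections. Neither projection replaces a section-by-section assessment of overlap, which is the concern the reviewer raises.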

      Additionally, we fully agree with the reviewer that the overall size of the confocal images was too small for effective assessment. To address this concern, we have enlarged all confocal images and increased the spacing in the figures. We believe these improvements will enhance the clarity of our manuscript and facilitate a better understanding of our findings.

      Comment 5. In general, the figures are still very cluttered, with panels too close together, and the labels are hard to read.

      Answer: We thank the reviewer for their valuable feedback regarding the clarity of our figures. In response to their concern, we have enlarged the figures to enhance readability and ensure that the panels are more distinct. We believe these adjustments will significantly improve the viewer's ability to interpret the data. We appreciate the reviewer's attention to detail, which has helped us to refine the presentation of our findings.

      Comment 6. There are no methodological details on how the VFB was used. The authors have not addressed my concern that they are showing only the neuronal skeleton (rather than the actual site of synapses). They are simply identifying all locations where the neuronal skeleton overlaps an entire brain region, and suggesting that these represent synapses. Many papers use the VFB to denote the actual location of synapses, which should be done in Figures 3B and S4A.

      Answer: We appreciate the reviewer's constructive comments regarding the methodological details of using VFB data. We fully agree that we cannot draw definitive conclusions about SIFa projections to specific regions based solely on neuronal skeleton data, which do not indicate the actual locations of synapses. To address this concern, we have made it clear to readers that the VFB skeleton data serves only as a preliminary indication of potential SIFa projections to GA, FB, and AL.

      To confirm the presence of actual synapses from SIFa neurons, we conducted a thorough analysis using FlyWire data, which validated our findings from VFB. By integrating insights from VFB with the detailed synaptic mapping provided by FlyWire, we can confidently assert the functional relevance of these connections within the context of SIFa neuronal activity. This comprehensive approach not only bolsters our conclusions but also enhances our understanding of how SIFa neurons interact within the broader neural circuitry. We believe this rationale highlights the significance of our work in elucidating the complex relationships among these neuronal populations. We have detailed our findings in the main text as follows:

      "We utilized the "Virtual Fly Brain (VFB)" platform, an interactive tool designed for exploring neuronal connectivity, to gain insights into the connectivity of SIFa neurons with four other neurons, specifically GA, FB, and AL (Fig. 3B and Fig. S4B) [74]. While VFB provides valuable information, it does not offer precise locations of synapses originating from SIFa neurons. To address this limitation, we incorporated data from the FlyWire connectome, which allowed us to confirm that SIFa projections indeed form actual synapses with GA, AL, FB, and SMP (Fig. S3F and S3G) [75]. This multi-faceted approach enhances the robustness of our findings by integrating different data sources to validate neuronal connections."

      Comment 7. The changes in GRASP and CaLexA with experience are very interesting, and suggest a substantial rearrangement of synaptic connectivity associated with changes in mating duration following group rearing or female exposure. I am still concerned, however, that the nsyb and tGRASP images look so different. I wouldn't expect them to be identical, but it is puzzling that the nsyb-GRASP data show connections in a few discrete brain areas, while the tGRASP data show connections in a much larger overall brain area, but curiously not in the major regions seen with nsyb-GRASP (i.e., PI, FB and GA). Shouldn't the tGRASP signal appear in all the places that the nsyb-GRASP does? For CaLexA and GRASP data, the methods should indicate the timing of the dissections and staining relative to the group/sexual experience.

      Answer: We appreciate the reviewer's constructive comments regarding our GRASP data, which indeed reveal an intriguing neural plasticity phenotype, as the reviewer noted. In our previous response, we suggested that the observed differences may be attributed to the distinct SIFa-GAL4 strains utilized, as described in another manuscript focused on SIFa inputs [9]. In that manuscript, we classified the four SIFa neurons into two groups: SIFaDA (dorsal-lateral) and SIFaVP (ventral-posterior). The SIFa2A-GAL4 specifically labels only the SIFaVP neurons, while the SIFa-PT driver labels all four neurons. We acknowledge that we did not clearly communicate this distinction to the reviewer or our readers, and we apologize for any confusion this may have caused. To rectify this oversight, we have added a detailed explanation of these differences in the main text as follows:

      "The subtle differences in GRASP signals observed in Fig. 3A may stem from the distinct expression patterns of the SIFa2A-lexA and GAL4SIFa.PT drivers. We would like to emphasize that the SIFa2A driver labels only a subset of SIFa neurons in other regions (Kim 2024)."

      We recognize that a clear and transparent methodology is essential for generating reproducible data. In response to the reviewer's suggestion, we have revised our MATERIALS AND METHODS section to include more detailed descriptions of the dissection conditions. This enhancement aims to provide readers with the necessary information to replicate our experiments effectively.

      "To ascertain calcium levels and synaptic intensity from microscopic images, we dissected and imaged five-day-old flies of various social conditions and genotypes under uniform conditions. Group-reared (naïve) flies were reared in groups and dissected immediately after 5 days of rearing without any further treatment. Singly reared flies were reared in isolation and dissected at the same time as the group-reared flies, immediately after 5 days of rearing without any further treatment. Sexually experienced flies were reared in groups for 4 days and then given virgin females for one day to provide sexual experience; these flies were dissected at the same time as the group-reared and singly reared flies."

      Comment 8. The calcium imaging data are odd. In most cases, the experimental flies don't actually show an increase in calcium levels but rather a lack of a decrease that is present in the ATR- controls. Also, in the cases where they argue for an excitatory effect of SIF neuron stimulation, the baseline signal intensity appears higher in ATR- controls compared to ATR+ experimental flies (e.g., Fig 5L, 6O), while it is significantly higher in ATR+ flies compared to ATR- controls when the activation results in decreased calcium signals. Perhaps more details on how these experiments were conducted and whether data were normalized in some way would help to clarify this.

      Answer: Thank you for your valuable feedback. We appreciate your careful analysis of our calcium imaging data and have addressed your concerns below:

      In our experiments, we observed that ATR+ flies maintained relatively stable calcium levels, whereas ATR- controls exhibited a gradual decrease. Under confocal imaging, GFP signals typically decrease over time, which we observed in ATR- controls. However, ATR+ flies did not exhibit this decline. To better convey this observation, we have refined the language in the manuscript. Specifically, we now describe this as a tendency to sustain the activity of Crz neurons in the OL and AG regions (Fig. 6K-M, Fig. S6G-I). This is supported by the sustained intracellular calcium activity in ATR+ flies compared to the gradual decline to baseline levels observed in ATR- controls (Fig. 6K-M).

      Baseline signal intensity differences: You correctly noted that in some cases, the baseline signal intensity appears higher in ATR- controls compared to ATR+ flies. These differences are likely due to technical factors, such as variations in the distance between the imaged brain and the objective lens. Even minor positional shifts in the brain (forward or backward) can affect the observed signal intensity.

      Our analyses focus on relative changes in fluorescence intensity within the same sample, which we present as line graphs to highlight trends rather than absolute values. However, we acknowledge that showing the magnitude of relative values instead of absolute values may have caused some confusion. We have revised the images to better align with our conclusions, ensuring that the adjustments do not affect the observed relative changes.

      Normalization and experimental details: The calcium imaging data were normalized to ΔF/F to account for differences in baseline fluorescence intensity. However, we recognize that further clarification of the normalization process and experimental setup is essential. We have expanded the methods section to include detailed descriptions of data acquisition, normalization steps, and statistical analyses.
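      The response states that traces were normalized to ΔF/F but does not give the baseline window or the implementation. As a minimal sketch, assuming that the baseline F0 is the mean of the first few frames of each recording (the function name and the `n_baseline` parameter are hypothetical), the normalization could look like this:

```python
import numpy as np


def delta_f_over_f(trace, n_baseline: int) -> np.ndarray:
    """Normalize a raw fluorescence trace to dF/F = (F - F0) / F0.

    F0 is taken as the mean of the first `n_baseline` frames; this
    baseline window is an assumption, not the authors' stated choice.
    """
    trace = np.asarray(trace, dtype=np.float64)
    if n_baseline < 1 or n_baseline > trace.size:
        raise ValueError("n_baseline must be between 1 and len(trace)")
    f0 = trace[:n_baseline].mean()
    if f0 == 0:
        raise ValueError("baseline fluorescence is zero; dF/F is undefined")
    return (trace - f0) / f0
```

      Because each trace is divided by its own baseline, two recordings that differ only by a constant brightness factor (for example, from a slightly different distance between the brain and the objective) give the same ΔF/F trace. This is why relative changes can be compared even when the absolute baseline intensities differ between ATR+ and ATR- samples.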

      As the reviewer correctly noted, calcium signals in ATR+ flies are generally higher than those in ATR- flies. However, it appears that the calcium levels exhibit a maintained response rather than a dramatic increase compared to the control ATR- condition, particularly in the case shown in Fig. 6K, which illustrates SIFa-to-Crz signaling. We believe this observation may reflect the actual physiological conditions under which SIFa influences SIFaR neurons to sustain activity during activation. We have included our interpretation of these findings in the main text as follows:

      "Upon optogenetic stimulation of SIFa neurons, we observed a tendency to maintain the activity of Crz neurons in the OL and AG regions (Fig. 6K-M, Fig. S6H-J), evidenced by intracellular Ca2+ levels that remained high compared to the ATR- control condition, which showed a gradual decline to baseline levels (Fig. 6K-M). In contrast to the OL and AG regions, the cells in the upper region of the SIP consistently show a decrease in Ca2+ levels following stimulation of the SIFa neurons (Fig. 6N-P)."

      To enhance readers' understanding of our calcium imaging results, we have reformatted our GCaMP data for improved clarity and included additional details in the MATERIALS AND METHODS section regarding the quantification of GCaMP imaging methods. Furthermore, as the reviewer correctly noted, discrepancies in baseline activity were due to our error in presenting the baseline data. We have now corrected this oversight accordingly.

      Comment 9. The models in Fig 4 J and T show data from Song et al, though I could not find a citation for this. I would omit this part of the model since these data are not discussed at all in the manuscript.

      Answer: We appreciate the reviewer for correctly identifying our oversight in failing to properly cite Song et al.'s paper. This error occurred partly because the preprint was not available at the time we submitted our manuscript. We now have a preprint for Song et al.'s paper, which discusses the contributions of SIFa neurons to various energy balance behaviors, and we plan to submit this paper back-to-back with our current submission to PLOS Biology. We have briefly cited Song et al.'s work in the manuscript; however, we have removed references to it from Fig. 4J and T to avoid any potential confusion for readers.

      Comment 10. The graphs for the SCOPE data (e.g., Figure 8I-L) are still too small to make sense of.

      Answer: We enlarged the tSNE plot generated from the SCOPE data.

      Comment 11. The rationale behind including the data in Figure 9 is not well explained. I would omit this data to help streamline and focus the manuscript.

      Answer: We fully understand and agree with the reviewer's concerns, and we have removed all previous versions of Figure 9 from the manuscript to prevent any confusion regarding the storyline.

      Comment 12. The single control group is still being duplicated in two different graphs but with different names in each graph. The authors' updated figure caption hints at this but does not make it explicit. At the very least, these should be given the same name across all graphs, as is done, for example, in the CaLexA experiments in Figure 4B-C.

      Answer: We concur with the reviewer and have changed the label for all "group" conditions to "naïve" in all figures.

      Comment 13. Lines 640-641: "Moreover, the pacemaker function is essential for the generation of interval timing capabilities (Meck et al, 2012; Matell, 2014; Buhusi & Meck, 2005), with the heart being recognized as the primary pacemaker organ within the animal body". This is an intriguing idea; however, I attempted to look at the cited references and don't see any claim about the heart being involved in interval timing. I could not find a paper matching the citation of Matell 2014. Meck et al 2012 is an introduction to a Frontiers in Integrative Neuroscience Research Topic and does not mention the heart, nor does the Buhusi and Meck 2005 paper. Perhaps there is a more suitable reference to make the assertion that the fly's interval timer would be affected by changes in heart rate. My suggestion would be to simplify the manuscript, focusing on the most robust findings: the behavioral effect of SIFaR knockdown, the GRASP and CaLexA data showing differences following group rearing or female exposure, and the effect of Crz knockdown in SIFaR neurons. Other details could be included but would have to be verified with more rigorous experiments.

      Answer: We appreciate the reviewer's interest in our exploration of the role of heart function in interval timing. While we found that knocking down CrzR in the heart specifically disrupts LMD behavior, we agree that our manuscript needs to be streamlined for clarity. As a result, we have eliminated all CrzR-RNAi knockdown data except for the oenocyte, neuronal and glial knockdown data presented in Fig. S8C-H. This decision was made to ensure a more focused comparison with the SIFaR knockdown experiments shown in Fig. 1. We are dedicated to further investigating the role of Crz-CrzR in heart function and its influence on interval timing in a future project. This approach allows us to maintain clarity in our current manuscript while laying the groundwork for more comprehensive studies ahead.

      In line with the reviewer's suggestions, we have simplified our manuscript by eliminating unnecessary data, such as overlapping image quantification and CrzR-RNAi screening, allowing us to focus on SIFaR knockdown and GRASP, as well as CaLexA with GCaMP imaging. We are grateful to the reviewer for providing us with the opportunity to delineate the role of CrzR in heart function related to LMD as a significant future project. We believe that our manuscript has been greatly improved by the reviewer's constructive feedback.



      Reviewer #2

      General Comments: The authors investigate mating behavior in male fruit flies, Drosophila melanogaster, and test for a role of the SIFamide receptor (SIFaR) in this type of behavior, in particular mating duration in dependence of social isolation and prior mating experience. The anatomy of SIFamide-releasing neurons in comparison with SIFamide receptor-expressing neurons is characterized in a detail-rich manner. Isolating males or exposing them to mating experience modifies the anatomical organization of SIFamidergic axon termini projecting onto SIFamide receptor-expressing neurons. This structural synaptic plasticity is accompanied by changes in calcium influx. Lastly, it is reported that corazonin-releasing neurons are modulated by SIFamide-releasing neurons and impact the duration of mating behavior.

      Overall, this highly interesting study advances our knowledge about the behavioral roles of SIFamide, and contributes to an understanding of how motivated behavior such as mating is orchestrated by modulatory peptides. The manuscript has some points that are less convincing.

      Answer: We appreciate the reviewer's positive feedback regarding our investigation into the role of the SIFamide receptor (SIFaR) in mating behavior in male Drosophila melanogaster. We are pleased that the detailed characterization of SIFamide-releasing neurons and their anatomical changes in response to social isolation and mating experience has been recognized as a valuable contribution to the understanding of synaptic plasticity and its impact on behavior. We are also grateful that the reviewer described our manuscript as a "highly interesting study" that advances knowledge about the behavioral roles of SIFamide and contributes to the understanding of how motivated behaviors, such as mating, are orchestrated by modulatory peptides. We sincerely thank the reviewer for these encouraging comments about our work.

      We acknowledge the reviewer's concerns about certain aspects of our manuscript that may be less convincing. We are committed to addressing these points thoroughly to strengthen our arguments and enhance the clarity of our findings. In response to the feedback, we have made several revisions throughout the manuscript, including clarifying our methodology, enhancing the presentation of our data, and providing additional context where needed. We believe these changes will improve the overall quality of the manuscript and make our conclusions more compelling. Thank you for your thoughtful review, and we look forward to your further insights.

      Comment 1. It remains unclear why the authors link the differentially motivated duration of mating behavior with the psychological concept of interval timing. This distracts from the actually interesting neurobiology and is not necessary to make the study interesting. The study deals with the modulation of mating behavior by SIFamide. The abstraction that SIFamide plays a role in the neuronal calculation of time intervals for the perception of time sequences is not convincing in itself.

      Answer: We appreciate the reviewer's thoughtful comments regarding our conclusion that links SIFamide to interval timing in mating behavior. We recognize that our data primarily indicate that SIFamide is essential for normal mating duration and influences the motivation-dependent aspects of this behavior. We also acknowledge the need for more robust evidence to establish a clearer connection between these findings and interval timing. Recent research by Crickmore et al. has provided valuable insights into how mating duration in Drosophila serves as an effective model for examining changes in motivation over time as behavioral goals are achieved. For example, around six minutes into mating, sperm transfer occurs, resulting in a significant shift in the male's nervous system, where he no longer prioritizes continuing the mating at the expense of his own survival. This pivotal change is mediated by four male-specific neurons that release the neuropeptide Corazonin (Crz). When these Crz neurons are inhibited, sperm transfer does not take place, and as a result, the male fails to reduce his motivation, leading to matings that can extend for hours instead of the typical duration of approximately 23 minutes [10].

      Crickmore and colleagues have also secured NIH R01 funding (Mechanisms of Interval Timing, 1R01GM134222-01) to investigate mating duration and sperm transfer timing in Drosophila as a genetic model for understanding interval timing. Their work emphasizes how fluctuations in motivation over time can affect mating behavior, particularly noting that significant behavioral changes occur during mating. For instance, around six minutes into the mating process, sperm transfer takes place, which corresponds with a notable decrease in the male's motivation to continue mating [10]. These findings indicate that mating duration serves not only as an endpoint for behavior but may also reflect fundamental mechanisms associated with interval timing.

         We believe that by leveraging the robustness and experimental tractability of these findings, along with our own work on SIFamide's role in mating behavior, we can gain deeper insights into the molecular and circuit mechanisms underlying interval timing. We will revise our manuscript to clarify this relationship and emphasize how SIFamide may interact with other neuropeptides and neuronal circuits involved in motivation and timing.
      
         In addition to the efforts of Crickmore's group to connect mating duration with a straightforward genetic model for interval timing, we have previously published several papers demonstrating that LMD and SMD can serve as effective genetic models for interval timing within the fly research community. For instance, we have successfully connected SMD to an interval timing model in a recently published paper [3], as detailed below:
      

      "We hypothesize that SMD can serve as a straightforward genetic model system through which we can investigate "interval timing," the capacity of animals to distinguish between periods ranging from minutes to hours in duration.....

      In summary, we report a novel sensory pathway that controls mating investment related to sexual experiences in Drosophila. Since both LMD and SMD behaviors are involved in controlling male investment by varying the interval of mating, these two behavioral paradigms will provide a new avenue to study how the brain computes the 'interval timing' that allows an animal to subjectively experience the passage of physical time [11-16]."

         Lee, S. G., Sun, D., Miao, H., Wu, Z., Kang, C., Saad, B., ... & Kim, W. J. (2023). Taste and pheromonal inputs govern the regulation of time investment for mating by sexual experience in male Drosophila melanogaster. *PLoS Genetics*, *19*(5), e1010753.
      
         We have also successfully linked LMD behavior to an interval timing model and have published several papers on this topic recently [6-8].
      
         Sun, Y., Zhang, X., Wu, Z., Li, W., & Kim, W. J. (2024). Genetic Screening Reveals Cone Cell-Specific Factors as Common Genetic Targets Modulating Rival-Induced Prolonged Mating in male Drosophila melanogaster. *G3: Genes, Genomes, Genetics*, jkae255.
      
         Zhang, T., Zhang, X., Sun, D., & Kim, W. J. (2024). Exploring the Asymmetric Body's Influence on Interval Timing Behaviors of Drosophila melanogaster. *Behavior Genetics*, *54*(5), 416-425.
      
         Huang, Y., Kwan, A., & Kim, W. J. (2024). Y chromosome genes interplay with interval timing in regulating mating duration of male Drosophila melanogaster. *Gene Reports*, *36*, 101999.
      
         Finally, in this context, we have outlined in our INTRODUCTION section below how our LMD and SMD models are related to interval timing, aiming to persuade readers of their relevance. We hope that the reviewer and readers are convinced that mating duration and its associated motivational changes such as LMD and SMD provide a compelling model for studying the genetic basis of interval timing in *Drosophila*.
      

      "The dimension of time is the fundamental basis for an animal's survival. Being able to estimate and control the time between events is crucial for all everyday activities [25]. The perception of time in the seconds-to-hours range, referred to as 'interval timing', is involved in foraging, decision making, and learning via activation of cortico-striatal circuits in mammals [26]. Interval timing requires entirely different neural mechanisms from millisecond or circadian timing [27-29]. There is abundant psychological research on time perception because it is a universal cognitive dimension of experience and behavioral plasticity. Despite decades of research, the genetic and neural substrates of temporal information processing have not been well established except for the molecular bases of circadian timing [30,31]. Thus, a simple genetic model system to study interval timing is required. Considering that the mating duration in fruit flies, which averages approximately 20 minutes, is well within the range addressed by interval timing mechanisms, this behavioral parameter provides a relevant context for examining the neural circuits that modulate the Drosophila's perception of time intervals. Such an investigation necessitates an understanding of the extensive neural and behavioral plasticity underlying interval timing [32-37]."

      We would like to highlight that many researchers are currently working to bridge the gap between interval timing as a purely psychological concept and its neurobiological underpinnings, as illustrated in the following articles [15,17-20]. We appreciate the reviewer's concerns regarding the relationship between mating duration and interval timing. However, we believe that our LMD and SMD model can effectively bridge the gap between psychological concepts and neurobiological mechanisms using a straightforward genetic model organism. By employing Drosophila as our model, we aim to elucidate the underlying neural circuits that govern these behaviors, thereby contributing to a deeper understanding of how interval timing is represented in both psychological and biological contexts.

      Matell, M. S. Neurobiology of Interval Timing. Adv. Exp. Med. Biol. 209-234 (2014) doi:10.1007/978-1-4939-1782-2_12.

      Matell, M. S. & Meck, W. H. Cortico-striatal circuits and interval timing: coincidence detection of oscillatory processes. Cogn. Brain Res. 21, 139-170 (2004).

      Merchant, H. & Lafuente, V. de. Introduction to the neurobiology of interval timing. Adv Exp Med Biol 829, 1-13 (2014).

      Golombek, D. A., Bussi, I. L. & Agostino, P. V. Minutes, days and years: molecular interactions among different scales of biological timing. Philosophical Transactions Royal Soc B Biological Sci 369, 20120465 (2014).

      Balcı, F. & Toda, K. Editorial: Psychological and neurobiological mechanisms of time perception and temporal information processing: insight from novel technical approaches. Front. Behav. Neurosci. 17, 1208794 (2023).

      Comment 2. For all behavioral experiments, genetic controls should always be conducted. That is, both the heterozygous Gal4-line as well as the heterozygous UAS-line should be used as controls. This is laborious, but important and a common standard. The authors often report data only for offspring from genetic crosses in which UAS-lines and Gal4-lines are combined (e.g. figure S1). This is not sufficient.

      Answer: We are grateful for the reviewer's constructive suggestions regarding the genetic control experiments. In response to similar concerns raised by another reviewer, we have conducted all necessary genetic control experiments and included the results in Supplementary Information 1-2. We hope that this thorough effort will demonstrate to both the reviewer and readers that the LMD and SMD behaviors represent stable and reproducible phenotypes for investigating the genetic components of interval timing.

      Comment 3. There are quite a lot of citations of preprints, including preprints from the authors' own lab. It seems inappropriate to cite non-peer-reviewed preprints in order to present the basic principles of the study (interval timing in flies) as recognized knowledge. In general, it is unclear whether the information presented in these multiple preprints will turn out to be credible and acceptable.

      Answer: We concur with the reviewer and have removed most of the preprint material, retaining only one preprint that discusses SIFa function, which has been co-submitted with this manuscript.

      Comment 4. Anatomical images are often very small and not informative. For example, figure S1 O, R, S and U shows small images of fly brains and ventral nerve cords that do not convincingly describe the expression of fluorescent proteins. The choice of a threshold to quantify fluorescence seems arbitrary. It is also not clear what the quantification "83% of brain and 71% of VNC SIFaR+ neurons" actually tells us. This quantification does not rely on counting neurons (such as 83% of neurons), but only shows how fluorescence in these neurons overlaps with an immunostaining of a ubiquitous active zone protein. The same is true for figure S2 or S3: overlapping brain areas do not inform you about numbers of cells, as stated in the text.

      Answer: We appreciate the reviewer's concerns regarding our imaging quantification methods. In response to similar questions raised by another reviewer, we have thoroughly reformatted our methods section and eliminated much of the overlapping data that appeared unnecessary for this paper. We recognize the importance of providing a clear and transparent methodology for both readers and the broader scientific community. Instead of using maximum projection of confocal images, we employed a projection method that incorporates the standard deviation function available in ImageJ. Based on our experience, this approach yields more reliable quantification results, allowing for a more accurate assessment of our data. To ensure clarity and reproducibility, we have detailed our methods in the MATERIALS AND METHODS section as follows:


      "The quantification of the overlap was performed using confocal images with projection by standard deviation function provided by ImageJ to ensure precise measurements and avoid pixel saturation artifacts."

      We appreciate the reviewer's suggestion regarding the inclusion of image quantification data for overlapping regions, which may not be essential to the logical flow of our narrative and could lead to confusion for readers. In response, we have removed nearly all of the quantification data related to overlapping regions, retaining only those that we consider critical for the paper. Currently, only Fig. S3B-E remains, as it is important for illustrating how SIFa neuronal arborization interacts with SIFaR neurons in the central nervous system.

      Additionally, we fully agree with the reviewer that the overall size of the confocal images was too small for effective assessment. To address this concern, we have enlarged all confocal images and increased the spacing in the figures. We believe these improvements will enhance the clarity of our manuscript and facilitate a better understanding of our findings.
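The difference between a maximum-intensity projection and the standard-deviation projection mentioned in this response can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names `z_project` and `overlap_fraction`, the thresholds, and the use of the sample standard deviation (`ddof=1`) are assumptions made for illustration only, and ImageJ's exact conventions should be checked against its documentation.

```python
import numpy as np


def z_project(stack, method="sd"):
    """Collapse a (z, y, x) confocal stack into a single 2D image.

    method="max": brightest value along z for each pixel.
    method="sd":  standard deviation along z for each pixel
                  (sample SD, ddof=1; an assumption of this sketch).
    """
    stack = np.asarray(stack, dtype=np.float64)
    if stack.ndim != 3:
        raise ValueError("expected a (z, y, x) stack")
    if method == "max":
        return stack.max(axis=0)
    if method == "sd":
        return stack.std(axis=0, ddof=1)
    raise ValueError(f"unknown method: {method!r}")


def overlap_fraction(proj_a, proj_b, thr_a, thr_b):
    """Fraction of above-threshold pixels in channel A that are also
    above threshold in channel B. Thresholds are hypothetical inputs."""
    mask_a = np.asarray(proj_a) > thr_a
    mask_b = np.asarray(proj_b) > thr_b
    if not mask_a.any():
        return 0.0
    return float((mask_a & mask_b).sum() / mask_a.sum())
```

A pixel that is uniformly bright through every z-slice scores high in a maximum projection but zero in a standard-deviation projection, so the two methods can yield different masks and therefore different overlap values. Note that this quantifies pixel overlap, not cell numbers.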

      Comment 5. The authors have consistently confused the extensive overlap of neuronal processes (dendrites and presynaptic regions) across large brain areas with synaptic connections. One cannot infer functional synaptic connectivity from the overlap of these fluorescent signals.

      Answer: We appreciate the reviewer's feedback and, in light of similar comments from another reviewer, we have removed most of the DenMark and syt.eGFP data, retaining only Fig. 3A. We are grateful for the constructive suggestions, which have significantly enhanced our manuscript. We believe that these revisions have clarified the narrative for readers, allowing for a more focused exploration of SIFaR's role in synaptic plasticity and neuronal orchestration.

      Reviewer #3

      General Comments: In this revised manuscript, the authors have fully and satisfactorily addressed my comments on the previous version. I recommend publication of this manuscript.

      Answer: We would like to extend our heartfelt thanks for the careful consideration and positive assessment of our revised manuscript. Your insightful feedback has been instrumental in shaping the final version of our work, and we are delighted to hear that our revisions have met your expectations.

      Your dedication to ensuring the quality and rigor of the scientific literature is truly commendable, and we are immensely grateful for the time and effort you have devoted to reviewing our paper. Your support for publication is a significant encouragement to us and validates the hard work we have put into addressing the issues you raised.

      Please accept our sincere appreciation for your professional and constructive approach throughout the review process. We look forward to the possibility of contributing to the scientific community through the dissemination of our research.

      REFERENCES

      1. Kim WJ, Jan LY, Jan YN. Contribution of visual and circadian neural circuits to memory for prolonged mating induced by rivals. Nat Neurosci. 2012;15: 876-883. doi:10.1038/nn.3104
      2. Kim WJ, Jan LY, Jan YN. A PDF/NPF Neuropeptide Signaling Circuitry of Male Drosophila melanogaster Controls Rival-Induced Prolonged Mating. Neuron. 2013;80: 1190-1205. doi:10.1016/j.neuron.2013.09.034
      3. Lee SG, Sun D, Miao H, Wu Z, Kang C, Saad B, et al. Taste and pheromonal inputs govern the regulation of time investment for mating by sexual experience in male Drosophila melanogaster. PLOS Genet. 2023;19: e1010753. doi:10.1371/journal.pgen.1010753
      4. Zhang X, Miao H, Kang D, Sun D, Kim WJ. Male-specific sNPF peptidergic circuits control energy balance for mating duration through neuron-glia interactions. bioRxiv. 2024; 2024.10.17.618859. doi:10.1101/2024.10.17.618859
      5. Merchant H, Luciana M, Hooper C, Majestic S, Tuite P. Interval timing and Parkinson's disease: heterogeneity in temporal performance. Exp Brain Res. 2008;184: 233-248. doi:10.1007/s00221-007-1097-7
      6. Sun Y, Zhang X, Wu Z, Li W, Kim WJ. Genetic Screening Reveals Cone Cell-Specific Factors as Common Genetic Targets Modulating Rival-Induced Prolonged Mating in male Drosophila melanogaster. G3: Genes, Genomes, Genet. 2024; jkae255. doi:10.1093/g3journal/jkae255
      7. Zhang T, Zhang X, Sun D, Kim WJ. Exploring the Asymmetric Body's Influence on Interval Timing Behaviors of Drosophila melanogaster. Behav Genet. 2024; 1-10. doi:10.1007/s10519-024-10193-y
      8. Huang Y, Kwan A, Kim WJ. Y chromosome genes interplay with interval timing in regulating mating duration of male Drosophila melanogaster. Gene Rep. 2024; 101999. doi:10.1016/j.genrep.2024.101999
      9. Kim WJ, Song Y, Zhang T, Zhang X, Ryu TH, Wong KC, et al. Peptidergic neurons with extensive branching orchestrate the internal states and energy balance of male Drosophila melanogaster. bioRxiv. 2024; 2024.06.04.597277. doi:10.1101/2024.06.04.597277
      10. Thornquist SC, Langer K, Zhang SX, Rogulja D, Crickmore MA. CaMKII Measures the Passage of Time to Coordinate Behavior and Motivational State. Neuron. 2020;105: 334-345.e9. doi:10.1016/j.neuron.2019.10.018
      11. Buhusi CV, Meck WH. What makes us tick? Functional and neural mechanisms of interval timing. Nat Rev Neurosci. 2005;6: 755-765. doi:10.1038/nrn1764
      12. Merchant H, Harrington DL, Meck WH. Neural Basis of the Perception and Estimation of Time. Annu Rev Neurosci. 2012;36: 313-336. doi:10.1146/annurev-neuro-062012-170349
      13. Allman MJ, Teki S, Griffiths TD, Meck WH. Properties of the Internal Clock: First- and Second-Order Principles of Subjective Time. Annu Rev Psychol. 2013;65: 743-771. doi:10.1146/annurev-psych-010213-115117
      14. Rammsayer TH, Troche SJ. Neurobiology of Interval Timing. Adv Exp Med Biol. 2014; 33-47. doi:10.1007/978-1-4939-1782-2_3
      15. Golombek DA, Bussi IL, Agostino PV. Minutes, days and years: molecular interactions among different scales of biological timing. Philosophical Transactions Royal Soc B Biological Sci. 2014;369: 20120465. doi:10.1098/rstb.2012.0465
      16. Jazayeri M, Shadlen MN. A Neural Mechanism for Sensing and Reproducing a Time Interval. Curr Biol. 2015;25: 2599-2609. doi:10.1016/j.cub.2015.08.038
      17. Balcı F, Toda K. Editorial: Psychological and neurobiological mechanisms of time perception and temporal information processing: insight from novel technical approaches. Front Behav Neurosci. 2023;17: 1208794. doi:10.3389/fnbeh.2023.1208794
      18. Gür E, Duyan YA, Arkan S, Karson A, Balcı F. Interval timing deficits and their neurobiological correlates in aging mice. Neurobiol Aging. 2020;90: 33-42. doi:10.1016/j.neurobiolaging.2020.02.021
      19. Merchant H, Lafuente V de. Introduction to the neurobiology of interval timing. Adv Exp Med Biol. 2014;829: 1-13. doi:10.1007/978-1-4939-1782-2_1
      20. Matell MS. Neurobiology of Interval Timing. Adv Exp Med Biol. 2014; 209-234. doi:10.1007/978-1-4939-1782-2_12
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      We want to thank both reviewers for their thorough and constructive review of our manuscript. Below, we have reiterated their comments, each followed by an explanation of how we have revised the manuscript to address it.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript presented by Segeren et al. applied an interesting HRASG12V inducible cell model to study the mechanism of cellular resistance to replication stress inducing agents. They also employed a novel reversible fixation technique which allows them to FAC sort cells according to their replication stress levels before applying single cell sequencing analysis to the same cell populations. By comparing cells with low levels of replication stress to cells with high levels of replication stress, they found that reduction in gene expression of FOXM1 target genes potentially protects cells against replication stress induced by CHK1i plus gemcitabine combination. Overall, this is a very interesting study. However, the following points should be addressed prior to publication:

      Major: 1. Figure 3E and 3F showed two lists of differentially expressed genes in γH2Ax low cells. However, instead of arbitrarily extracting the FOXM1 target genes and TP53 targeted genes, it would be appreciated if the author could perform an unbiased and unsupervised gene set enrichment analysis such as Enrichr.

      As recommended, we performed an enrichment analysis using Enrichr to identify transcriptional programs associated with the genes that were downregulated in the γH2AX-low cells. FOXM1 appeared as a prominent hit in different databases (both experimental and computational). We have included the lists of differentially expressed genes as an additional supplemental table (Table S1) and have included the Enrichr results as Table S3 (i.e. CHEA and ENCODE). We have described our results in lines 198-200 of the revised manuscript.

      2. At the experiment design stage, the authors also included HRASG12V status as a test condition because they previously found that HRASG12V mutation induces basal level replication stress and they would like to include this condition to study the adaptation to replication stress (line 110). However, the difference in HRASG12V negative and HRASG12V positive cells was not followed up in the later part of the paper. Can they show lists of differentially expressed genes identified under HRASG12V negative conditions as well (in the same format of Figure 3E and 3F) and comment on the differences as well?

      In the original manuscript, we included heatmaps of differentially expressed genes in the control cells in Figure S2. For improved clarity, we have modified this figure so that the heatmaps are labeled "Control cells". In the revised manuscript, we have also included Table S2, which lists the differentially expressed genes between γH2AX-low and γH2AX-high control cells, and Table S3, which lists the Enrichr results obtained based on these gene lists.

      We observed FOXM1 target genes in both the control and HRASG12V cells. Thus, the mechanism we identify does not appear to be specific to oncogenic Ras expression. We discuss this in lines 221-225. Because there were no other notable differences between the gene sets, we do not focus on this in the manuscript.

      3. In line 194 and in Figure S2B, the authors claimed that ANLN, HMGB2, CENPE, MKI67, and UBE2C demonstrated co-expression, but other genes displaying similar correlation scores were not commented on (such as F3, CYR61, CTGF, etc). To avoid being biased at the analysis stage, the authors should define clearly what the cut-off of correlation score is and why only co-expression of ANLN, HMGB2, CENPE, MKI67, and UBE2C was mentioned.

      As suggested, we now explain in the revised manuscript that we focused on gene clusters consisting of at least 3 genes, each of which had a correlation coefficient greater than or equal to 0.4 with at least one other gene within the cluster. This cutoff is typically defined as representing a "moderate to good" correlation in biological data (Overholser, Sowinski, 2008). To make clear which clusters of correlating genes passed these criteria, we have also highlighted these genes in Figure S3B. This returned the cluster we had already identified as FOXM1 targets and, as the reviewer rightly spotted, a larger cluster which included F3, CYR61, CTGF, SERPINE1, ANKRD1, KRTAP2-3, UGCG, and AMOTL. Our Enrichr analysis did not identify any putative transcription factors linking the genes in this larger cluster. We remain interested in identifying the putative transcriptional regulation mechanism linking these genes in future studies, but this is beyond the scope of the current manuscript. We have described these observations in lines 211-218.
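One way to read the selection rule stated in this response (clusters of at least 3 genes, each gene correlating at r ≥ 0.4 with at least one other member) is as connected components of a thresholded correlation graph. The sketch below follows that reading. It is an assumption about how such a rule could be implemented, not the authors' code. The function name `correlation_clusters`, the choice of Pearson correlation, and the input layout (cells in rows, genes in columns) are hypothetical.

```python
import numpy as np


def correlation_clusters(expr, genes, r_min=0.4, min_size=3):
    """Group genes into clusters linked by pairwise correlation >= r_min.

    expr:  2D array, rows = cells, columns = genes (assumed layout).
    genes: list of gene names, one per column.
    Returns clusters (sorted name lists) with at least min_size genes.
    Genes with constant expression give NaN correlations and stay unlinked.
    """
    corr = np.corrcoef(np.asarray(expr, dtype=float), rowvar=False)
    n = len(genes)
    # Edge between two different genes when their correlation passes the cutoff.
    adj = (corr >= r_min) & ~np.eye(n, dtype=bool)

    seen = set()
    clusters = []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        seen.add(start)
        todo = [start]
        component = []
        while todo:
            i = todo.pop()
            component.append(i)
            for j in np.flatnonzero(adj[i]):
                j = int(j)
                if j not in seen:
                    seen.add(j)
                    todo.append(j)
        if len(component) >= min_size:
            clusters.append(sorted(genes[i] for i in component))
    return clusters
```

Under this reading, a gene joins a cluster through a single strong partner, so large clusters can contain gene pairs that are only weakly correlated with each other directly. A stricter rule (for example, requiring every pair to pass the cutoff) would return smaller clusters.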

      4. In line 215, instead of validating CENPE, UBE2C, HMGB2, ANLN, and MKI67 individually, the authors decided to validate FOXM1 instead, because they believe all the aforementioned genes are targets of FOXM1, therefore, validating FOXM1 alone would suffice. Again, this makes the validation process also biased. CENPE, UBE2C, HMGB2, ANLN, and MKI67 should be validated individually because they might sensitize cells to replication stress via different mechanisms. Besides, if all these genes were identified together because they are FOXM1 target genes, why did the authors not identify FOXM1 itself as a differentially expressed gene from the single cell sequencing? The sequencing only analyzed the S/G2/M cells, expression of FOXM1 should be detected easily.

      We agree with the reviewer that omitting the individual FOXM1 target genes from the validation process gives a biased impression. Therefore, we ordered siRNAs against CENPE, UBE2C, HMGB2, ANLN, and MKI67. As with the other DE genes in the original mini-screen, we first knocked down these genes using siRNA Smartpools (pools of 4 individual siRNAs against each gene). Here, we observed a decrease in γH2AX signal in drug-treated cells transfected with all 5 Smartpools compared to drug-treated cells transfected with control siRNA. We next moved on to the deconvolution step of the screen, where we transfected cells with 4 individual siRNAs against each gene. Here, we observed inconsistent effects for ANLN, CENPE, and HMGB2 when comparing the individual siRNAs, all of which produced efficient knockdown of their target genes. Interestingly, for both MKI67 and UBE2C, each of the 4 individual siRNAs similarly decreased γH2AX signal, though the decrease was not as strong as that observed when FOXM1 is knocked down. Understanding the exact mechanism by which MKI67 and UBE2C reduce replication stress is beyond the scope of this paper, but we hypothesize that, as with FOXM1, it is likely linked to their role in promoting progression through the cell cycle. These results are shown in Figure S5, and we mention these remarkable findings in the revised abstract and discuss them in light of the recent literature in the Discussion section (lines 275-286).

      We also addressed the comment about FOXM1 not being changed in the single-cell RNA-seq analysis. We could indeed readily detect FOXM1 expression in our single-cell RNA sequencing data. However, FOXM1 expression did not differ significantly between cells sorted according to γH2AX level (Figure 4C). Because FOXM1 is highly regulated post-translationally, we hypothesized that an increase in the (active) protein, rather than in transcript levels, is correlated with increased replication stress. This was indeed the case, and we further explain our experiment to test this hypothesis in response to Point #6 (results are displayed in Figure 4D and described in lines 201-209).

      5. As pointed out by the authors in the Discussion, single cell sequencing is not good at differentiating the causes from the consequences. The authors tried to validate many of the differentially expressed genes in γH2Ax low cells. However, the fact that only FOXM1 knockdown passed the validation and deconvolution indicates that the great majority of the identified genes are not the cause of the sensitivity change to replication stress inducing agents but likely the consequences. Therefore, in Figure S2C and S2D, it would be better if the authors simply named the genes 'downregulated genes' in Figure S2C and 'upregulated genes' in Figure S2D. Taking into consideration that the expression changes in the great majority of these genes are just consequences of sensitivity change to replication stress, defining them as 'potentially sensitizing' genes and 'potentially conferring resistance' genes is rather misleading.

      We agree that the way we originally labeled these plots may have been misleading. We have renamed them to "Downregulated in γH2AXlow" and "Upregulated in γH2AXlow", as recommended by the reviewer.

      6. To better prove that FOXM1 is the leading cause of the sensitivity to CHK1i+Gemcitabine induced replication stress, can the authors show the FOXM1 expression status in the tolerant cell population identified in Figure 1B (lowest panel)? Alternatively, can they plot FOXM1 expression level in the same tSNE plots shown in Figure 3B to 3D to see whether some of the γH2Ax low populations also show reduced FOXM1 expression?

      FOXM1 expression levels were not increased in gH2AXhigh versus gH2AXlow HRASG12V cells in the single cell RNA-sequencing data (Figure 4C in revised manuscript). However, as mentioned in our answer to point #4, we performed an additional experiment, which showed a strong positive correlation between phospho-FOXM1 and γH2AX (as measured by flow cytometry) in S-phase cells (Figure 4D). This indicates that the active form of FOXM1 indeed increases as yH2AX levels increase, consistent with the observed increase in FOXM1 target genes. These results are described in lines 201-209.

      1. Clonogenic survival assay in Figure 4D was not quantified properly in Figure 4E. To rule out the siFOXM1 mediated growth/survival defects and to only focus on the siFOXM1 mediated resistance to CHK1i+Gemcitabine, the survival rate (intensity percent in this case) of CHK1i+Gemcitabine treated condition should be normalized against the survival rate of the Vehicle condition. E.g., the intensity percent of the siSCRAMBLE after treatment should be divided by the intensity percent of the untreated siSCRAMBLE; the intensity percent of the si#1 after treatment should be divided by the intensity percent of the untreated si#1, and so on. If the authors would like to show siFOXM1 induced growth/survival defects, they can still present the left part of the Figure 4E (the Vehicle group).

      Originally, we chose to show the absolute IntensityPercent for all groups, without normalizing to the untreated group, because we wanted to also highlight the FOXM1-mediated changes in growth. We agree that normalizing the IntensityPercent of the drug-treated group to the vehicle group better highlights the siFOXM1-mediated resistance. We have therefore re-analyzed the data and presented it this way in Figure 5E (described in lines 293-295). We moved our original Figure 4E to a new supplemental figure (Figure S4B) to still point out the effects of siFOXM1 on cell growth in untreated cells.
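
      The normalization described above can be sketched as follows. This is a minimal illustration only: the function name, the condition labels, and the IntensityPercent values are hypothetical placeholders and are not data from the study.

```python
def normalize_to_vehicle(treated, vehicle):
    """Divide each drug-treated IntensityPercent by the vehicle value
    of the same siRNA condition, giving a relative survival ratio."""
    ratios = {}
    for condition, treated_value in treated.items():
        vehicle_value = vehicle[condition]
        if vehicle_value <= 0:
            raise ValueError(f"vehicle value for {condition} must be positive")
        ratios[condition] = treated_value / vehicle_value
    return ratios


# Hypothetical IntensityPercent values, for illustration only.
vehicle = {"siSCRAMBLE": 80.0, "siFOXM1_1": 40.0}
treated = {"siSCRAMBLE": 20.0, "siFOXM1_1": 20.0}

relative_survival = normalize_to_vehicle(treated, vehicle)
```

      In this made-up example both conditions have the same absolute intensity after treatment, but the knockdown condition retains a larger fraction of its own vehicle signal, which is the effect the normalization is meant to isolate from the growth defect.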

      Minor:

      1. In line 176, the author claimed that 'Interestingly, rare cells treated with CHK1i + gemcitabine are located within the untreated cell cluster (Fig. 3C)'. However, it is not as obvious where these cells are in the plot, especially to people who are new to tSNE plots. It would be appreciated if the authors could label these cells by circling them with red lines and make the point stronger.

      Rather than circling these points (we thought this would make the plot too "busy"), we have created an inset that zooms in on the region where we see the CHK1i + gemcitabine-treated cells within the untreated cell cluster. Within the inset, we use arrows to point out the cells we are referring to. This can be seen in our updated Figure 3C.

      1. In Figure S2B, it will be ideal to label clearly which genes are upregulated and which are downregulated.

      On the x-axis of the heatmap, we have drawn lines to separate the downregulated and upregulated genes.

      1. In line 50, the word 'multifaced' needs to be corrected to 'multifaceted'.

      Thank you for catching this, we have fixed it.

      1. It is unclear what 'underly drug resistance' means in line 150.

      We have reworded this sentence so that it is clearer. It is now written as follows: "we aimed to identify gene-expression programs that mediate the low level of RS in a subset of cells, which could potentially mediate drug resistance". This change is in line 155.

      1. It is advised that the phrase 'cell cycle position' could be changed to 'cell cycle phase' or 'cell cycle stage'.

      We purposefully used the phrase "cell cycle position" because we wanted to emphasize the gradient-like progress through the cell cycle rather than a discrete distinction from one phase to the next. We have reworded the text slightly to now say "position within S-phase" (lines 163, 187, 191, 208), since all the cells we are interested in are in S phase, but some are further through S phase than others.

      1. In line 185, the word 'in' after 'within' can be removed.

      Thank you for catching this, we have fixed it.

      1. In line 194, 'Among genes downregulated in γH2AXlow cells, the expression of ANLN, HMGB2, CENPE, MKI67 and UBE2C correlated' is missing an 'are' in front of the word 'correlated'.

      Thank you for catching this, we have fixed it.

      1. In line 239, Fig.SC3 should be Fig. S3C.

      Thank you for catching this, we have fixed it.

      1. FOXM1 is known as a crucial gene for G2/M transition. Therefore, FOXM1 knockdown cells are expected to be mostly arrested at the G2/M interface. Therefore, in line 244, it is incorrect to say stronger FOXM1 knockdown induced a 'lower proportion of cells in G2 phase'. In fact, as shown in Figure 4C, cells are accumulating in G2 phase (peaking around 11M on the DAPI axis) and depleted from G1 phase (peaking around 7M).

      We have reworded this to say that there is "a higher proportion of cells in S-phase and a less distinct G2 peak" (lines 270-271). The DAPI profiles of the scrambled, siFOXM1 #1, and siFOXM1 #2 conditions all show an S-phase "valley" between a G1 and G2 peak (the valley sits at about 8M-9M). In the siFOXM1 #3 and siFOXM1 #4 conditions, we no longer see this valley, and we therefore interpret this as cells still being in S-phase. If they had progressed from S-phase into G2 phase, we would expect to again see this "valley" to the left of a clear G2 peak. In the figure below, we overlaid DNA content histograms of the different FOXM1-targeting siRNAs with the scrambled siRNA to demonstrate this point more clearly.

      Reviewer #1 (Significance (Required)):

      Advance: The study reported a novel reversible fixation technique which can lead to potentially good citations. However, the findings from the single cell sequencing alone fell short in novelty to reach high impact because FOXM1 has been reported to impact on cellular sensitivity to CHK1 inhibition mediated replication stress (PMC7970065). Moreover, the study did not provide mechanistic explanation to the observed phenotype but only validated the finding from the sequencing, and the gene of focus (FOXM1) was not originally identified from the sequencing, slightly undermining the paper's foundation. To make it a better paper, the authors need to be less biased when it comes to data analysis and interpretation.

      Audience: People who are interested in basic research in cell cycle, DNA damage, cancer, chemotherapy would be interested.

      My expertise: Cancer, DNA damage, cell cycle

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Replication stress activates ATR and CHEK1 kinases as part of the intra-S-phase DNA damage response. CHEK1 kinase inhibitors (CHK1i) have been shown to induce an accumulation of unresolved replication stress and widespread DNA damage and cell death caused by replication catastrophe, and are therefore under clinical evaluation. At the same time, CHEK1 inhibition results in the activation of CDK1 and FOXM1 and premature expression of G2/M genes (Saldivar et al., 2018 Science). FOXM1-driven premature mitosis has been shown to be required for the replication catastrophe and CHK1i sensitivity (Branigan et al., 2021 Cell Rep.). In this study, Segeren and colleagues set out to investigate the mechanisms of replication stress tolerance. They used CHK1 inhibitors in combination with the DNA-damaging chemotherapeutic agent Gemcitabine and oncogenic HRASG12V expression to increase replication stress. The authors utilized an intriguing setup of combined immunofluorescence staining followed by single cell RNA-seq analysis to overcome limitations of bulk cell analyses. In particular, the authors sought to identify genes that are differentially regulated in replication stress-tolerant cells compared to sensitive cells. However, even single cell analyses can be confounded by differences in cell cycle distribution. To mitigate this, the authors selected mid S-phase cells for their analysis. While this may not have completely eliminated minor differences in cell cycle progression, the authors identified FOXM1-regulated G2/M cell cycle genes, among others, that were down-regulated in the tolerant cells. When the authors followed up on the effect of these genes on replication stress tolerance, they identified FOXM1 knockdown as the only robust mediator of replication stress tolerance.

      Major comments:

      The authors observed that cell cycle distribution could be a major confounding factor in their single cell analysis and attempted to reduce this variation by selecting mid S-phase cells based on the DAPI signal. The authors then chose to compare gH2AXlow and gH2AXhigh subpopulations of RPE-HRASG12V cells because their "DAPI signal was comparable" (line 181-184). However, their data show that these subpopulations also show differences in their DAPI signal distribution, with gH2AXlow cells tending to have lower DAPI signals than gH2AXhigh cells (Supplementary Figure 2A). Thus, the major confounding factor that the authors sought to remove seems to have prevailed and it remains possible that the difference in cell cycle gene expression is merely due to differences in cell cycle progression of the individual cells. Given that DAPI information seem to be readily available for the individual cells, the authors should normalize their analysis to the DAPI signal to remove this potential confounding effect or clearly state this potential limitation.

      We agree that it is very challenging to fully disentangle the influence of cell cycle distribution on our analysis. Indeed, the γH2AXlow HRASG12V cells have a slightly reduced median DNA content compared to γH2AXmid and γH2AXhigh cells. However, this was not the case in the RPE control cells, and we still found that FOXM1 target genes were strongly enriched in the γH2AXhigh cells (Fig S2C and Table S4). Therefore, it is highly unlikely that bias in S-phase position distributions explains our results. Nevertheless, to be transparent about this, we write the following in the Results on lines 192-193: "The other groups all showed similar DAPI intensities, although gH2AXlow RPE-HRASG12V cells showed a slight but statistically significant reduction compared to their gH2AXhigh counterparts (Fig. S2A)".

      In our subsequent experiments to assess the relationship between phospho-FOXM1 (representing the transcriptionally active protein) and γH2AX, we observed that though there was a strong correlation between pFOXM1 and γH2AX, there was no correlation between phospho-FOXM1 and DAPI (Figure 4D-E). We therefore would like to point out that although our readout for replication stress inevitably increases as cells progress through DNA replication, heterogeneity in phospho-FOXM1 levels cannot be explained by position in S-phase. These results are described in lines 203-209.

      Finally, we do not think it would be statistically appropriate to use the DAPI signal (generated by fluorescence intensity as measured by the flow cytometer) as a normalization factor for our gene expression data.

      Minor comments:

      The findings of Saldivar et al., 2018 Science and Branigan et al., 2021 Cell Rep. should be mentioned in the introduction.

      As recommended, we mentioned both these papers in the introduction. In line 62, we cite the Branigan paper as showing that modulation of cell cycle regulators is a strategy used by cancer cells to resist replication stress. In lines 63-65, we reference them as follows: "The RS response is tightly linked with cell cycle progression, as multiple intra S-phase checkpoint kinases play a role in curtailing proteins involved in the S-G2 transition (Branigan et al., 2021, Saldivar et al., 2018)."

      The authors conclude that "cell cycle position can be a major confounding factor when evaluating the transcriptomic response to RS." It should be noted that stochastic differences in the cell cycle distribution of bulk cells are perhaps the best-known confounder in single cell analyses (see, for example, Buettner et al., 2015 Nat. Biotechnol.).

      We chose to reference the Buettner paper to justify our decision to select only cycling cells in our scRNA seq approach. Our reference to the paper, and to the fact that cell cycle distribution is a major confounder in single cell analysis, is in lines 138-140.

      Supplementary Figure 2A: The median should be added to the violin plots.

      As suggested, we have added medians to the violin plots. In addition, we added details on statistical analysis.

      The statement "Differential expression analysis revealed 19 genes that were significantly downregulated in gH2AXlow RPE-HRASG12V cells, suggesting that elevated levels of these genes are correlated with sensitivity to RS-inducing drugs" refers to Figure 3E and Table S1. However, Table S1 lists the "key resources" and does not seem to be related to this statement. A table showing log2fold-changes and FDR values should be added and referenced here.

      We have generated tables with the fold change values of differentially expressed genes between the yH2AX low and yH2AX high cells. These are found in Table S1 (for HRAS G12V cells) and Table S2 (for Control cells) in the supplementary file of the revised manuscript. The "key resources" has been moved to Table S5.

      The statement "Remarkably, Braningan and co-workers observed no effect of full FOXM1 deletion on cell cycle progression" seems somewhat inconsistent with what has been stated and assessed in that study. The authors may want to replace "progression" with "distribution". A reduction in proliferation is commonly observed when FOXM1 levels are reduced.

      In addition, the authors may want to consider that their addition of HRASG12V and Gemcitabine may contribute to a more substantial S phase checkpoint response.

      We agree with the reviewer that a reduction in proliferation is commonly observed when FOXM1 levels are reduced (Barger et al., 2021, Cheng et al., 2022, Yang et al., 2015, Wu et al., 2010), but in Branigan et al., they see no decrease in proliferation with knockout of FOXM1. They state "There were no apparent differences in the growth rate of the LIN54 and FOXM1 KO versus EV cells over 10 days (Figure 1G)". Though they do not elaborate on why they see this unexpected response, we suspect a permanent full knockout of FOXM1 could cause compensatory adaptation in their cell lines. In our experiments, we perform transient knockdowns, so cells may not have the time to adapt to the loss of FOXM1 and obtain compensatory mechanisms that would allow them to continue cycling as rapidly as control cells treated with non-targeting siRNA.

      However, we decided to remove this from the Discussion section, as it seemed to interrupt the discussion about the potential mechanisms underlying protection against DNA damage by FOXM1 depletion.

      The statement that "the mechanism by which high FOXM1 activity is a prerequisite to accumulate DNA damage in S-phase during CHK1 inhibition remains to be uncovered" seems to neglect that premature mitosis has been suggested as a mechanistic cause (Branigan et al., 2021 Cell Rep.). It would be helpful if the authors could elaborate on this.

      In our discussion, we do already emphasize the described role of FOXM1 in promoting premature mitosis (lines 330-337), but we argue that in our experimental conditions we are observing another, previously undescribed, role for FOXM1 in promoting replication stress during S phase. We previously observed with live cell imaging that CHK1i + gemcitabine does not cause premature mitosis in RPE-HRASG12V cells (published in Segeren et al. Oncogene 2022, Figure 5). Instead, these cells typically showed a cell cycle exit from G2. This makes it highly unlikely that premature mitosis is the reason why these cells would accumulate excessive DNA damage. We realize now that it was an important omission not to elaborate on this and have added this clarification to the Discussion (lines 341-345 in revised manuscript). In addition, we have removed a few lines of less important text (about the lack of direct effect of FOXM1 KO in the Branigan paper; see answer to previous point) to improve clarity and readability.

      Reviewer #2 (Significance (Required)):

      General assessment: The strength of the study is the intriguing methodology of combined immunofluorescence followed by single cell RNA-seq. The limitations are that this methodology does not seem to fully solve the stated problems. In addition, the study is essentially limited to confirming previous findings.

      Advance: The study strengthens current knowledge but provides essentially no advance. The authors confirm existing knowledge with an additional approach. While this is not an advance in itself, it is important to the community.

      Audience: I felt that the study would appeal to a basic science audience. In particular, the CHK1i and intra S-phase checkpoint areas, with limited interest beyond that.

      My relevant expertise lies in transcriptomics, gene regulation and the cell cycle.

      Reference list

      Barger, C.J., Chee, L., Albahrani, M., Munoz-Trujillo, C., Boghean, L., Branick, C., Odunsi, K., Drapkin, R., Zou, L. & Karpf, A.R. 2021, "Co-regulation and function of FOXM1/RHNO1 bidirectional genes in cancer", eLife, vol. 10, pp. 10.7554/eLife.55070.

      Branigan, T.B., Kozono, D., Schade, A.E., Deraska, P., Rivas, H.G., Sambel, L., Reavis, H.D., Shapiro, G.I., D'Andrea, A.D. & DeCaprio, J.A. 2021, "MMB-FOXM1-driven premature mitosis is required for CHK1 inhibitor sensitivity", Cell reports, vol. 34, no. 9, pp. 108808.

      Cheng, Y., Sun, F., Thornton, K., Jing, X., Dong, J., Yun, G., Pisano, M., Zhan, F., Kim, S.H., Katzenellenbogen, J.A., Katzenellenbogen, B.S., Hari, P. & Janz, S. 2022, "FOXM1 regulates glycolysis and energy production in multiple myeloma", Oncogene, vol. 41, no. 32, pp. 3899-3911.

      Overholser, B.R. & Sowinski, K.M. 2008, "Biostatistics primer: part 2", Nutrition in clinical practice : official publication of the American Society for Parenteral and Enteral Nutrition, vol. 23, no. 1, pp. 76-84.

      Saldivar, J.C., Hamperl, S., Bocek, M.J., Chung, M., Bass, T.E., Cisneros-Soberanis, F., Samejima, K., Xie, L., Paulson, J.R., Earnshaw, W.C., Cortez, D., Meyer, T. & Cimprich, K.A. 2018, "An intrinsic S/G(2) checkpoint enforced by ATR", Science (New York, N.Y.), vol. 361, no. 6404, pp. 806-810.

      Segeren, H.A., van Liere, E.A., Riemers, F.M., de Bruin, A. & Westendorp, B. 2022, "Oncogenic RAS sensitizes cells to drug-induced replication stress via transcriptional silencing of P53", Oncogene, vol. 41, no. 19, pp. 2719-2733.

      Wu, Q., Liu, C., Tai, M., Liu, D., Lei, L., Wang, R., Tian, M. & Lu, Y. 2010, "Knockdown of FoxM1 by siRNA interference decreases cell proliferation, induces cell cycle arrest and inhibits cell invasion in MHCC-97H cells in vitro", Acta Pharmacologica Sinica, vol. 31, no. 3, pp. 361-366.

      Yang, K., Jiang, L., Hu, Y., Yu, J., Chen, H., Yao, Y. & Zhu, X. 2015, "Short hairpin RNA- mediated gene knockdown of FOXM1 inhibits the proliferation and metastasis of human colon cancer cells through reversal of epithelial-to-mesenchymal transformation", Journal of experimental & clinical cancer research : CR, vol. 34, no. 1, pp. 40-1.

      We want to thank both reviewers for their thorough and constructive review of our manuscript. Below, we have re-iterated their comments, each followed by an explanation of how we have revised the manuscript to address it.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Summary:

      In this manuscript, the molecular mechanism of interaction of daptomycin (DAP) with bacterial membrane phospholipids has been explored by fluorescence and CD spectroscopy, mass spectrometry, and RP-HPLC. The mechanism of binding was found to be a two-step process. A fast reversible step of binding to the surface and a slow irreversible step of membrane insertion. Fluorescence-based titrations were performed and analysed to infer that daptomycin bound simultaneously two molecules of PG with nanomolar affinity in the presence of calcium. Conformational change but not membrane insertion was observed for DAP in the presence of cardiolipin and calcium.

      Strengths:

      The strength of the study is skillful execution of biophysical experiments, especially stopped-flow kinetics that capture the first surface binding event, and careful delineation of the stoichiometry.

      Weaknesses:

      The weakness of the study is that it does not add substantially to the previously known information and fails to provide additional molecular details. The current study provides incremental information on DAP-PG-calcium association but fails to capture the complex in mass spectrometry. The ITC and NMR studies with G3P are inconclusive. There are no structural models presented. Another aspect missing from the study is the reconciliation between PG in the monomer, micellar, and membrane forms.

      Besides the two-stage process, another important finding in the current work is the stable complex that plays a critical role in the drug uptake both in vitro and in B. subtilis. This complex has been shown to be a stable species in HPLC and its binding stoichiometry and affinity have been quantitatively characterized. The complex may not be stable enough in gas phase to be detected in the MS analysis, which was designed to detect the phospholipid and Dap components, not the complex itself. The structural model of this complex is clearly proposed and presented in Figure 6. 

      The NMR and ITC studies have a very clear conclusion that Dap has a weak interaction with the PG headgroup alone, which is unable to account for the Dap-PG interaction observed in the fluorescence studies. Thus, the whole PG molecule has to be involved in the interaction, leading to the discovery of the stable complex.  

      Reviewer #2 (Recommendations For The Authors):

      (1) I appreciate and agree with the comment that there are stages of daptomycin insertion, and these might involve the formation of different complexes with different binding partners (e.g. pre-insertion vs quaternary vs bactericidal). However, it seems like lipid II is an apparent participant in daptomycin membrane dynamics (Grein et al. Nature Communications 2020). It's not clear why this was excluded from analysis by the authors, or what basis there is for the discussion statement that the quaternary complex can shift into the bactericidal complex by exchanging 1 PG for lipid II. 

      We agree that lipid II and other isoprenyl lipids may be involved in the uptake and insertion of daptomycin into membrane according to the results of the Nat. Comm. paper. However, these isoprenyl lipids are very small components of the membrane in comparison to PG and their contribution to the drug uptake is thus expected to be much less significant. Nonetheless, we included farnesyl pyrophosphate (FPP) as an analog of bactoprenol pyrophosphate (C55PP), which was reported to have the same promoting effect as lipid II in the previous study, in our study but found no promoting effect in the fluorescence assay (Fig. 2B). In addition, no complex was formed when FPP replaced PG in our preparation and analysis of the drug-lipid complex. In consideration of these negative results and the expected small contribution, other isoprenyl lipids or their analogs were not included in the study.

      The statement about forming the proposed bactericidal complex from the identified complex is speculation, and such a shift is possible only if lipid II has a higher affinity for Dap than a PG ligand. To avoid confusion, we deleted the sentence in the revision.

      (2) The detailed examination of daptomycin dynamics, particularly on the millisecond scale, in this paper is ideal for characterizing the effect of lipid II on daptomycin insertion. It would be helpful to either include lipid II in some analyses (micelle binding, fluorescence shifts, CD) or at least address why it was excluded from the scope of this work.

      As mentioned in the response to the first comment, we did not exclude isoprenyl lipids in our study but used some of their analogs in the fluorescence assay. Besides FPP mentioned above, we also tested geranyl pyrophosphate and geranyl monophosphate but obtained the same negative results. Lipid II was not directly used because it is one of the three isoprenyl lipids reported to have the same promoting effects in the Nat. Comm. paper and also because its preparation is not easy. Even if lipid II were different from other isoprenyl lipids in promoting membrane binding, its contribution is likely negligible at the reversible stage compared to the phospholipids because of its minuscule content in bacterial membrane. This is the main reason we did not use the isoprenyl lipids in the fast kinetic study (this stage only involves reversible binding, not insertion). 

      (3) Grein et al. 2020 saw that PG did not have a strong effect on daptomycin interaction with membranes. I believe this discrepancy is more likely due to the complex physical parameters of supported bilayers versus micelles/vesicles or some other methodological variable, but if the authors have more insight on this, it would be valuable commentary in the discussion.

      We totally agree that the discrepancy is likely due to the different conditions in the assays. It is hard to tell exactly what causes the difference. Thus, we did not attempt to comment on the cause of this difference in the discussion.

      (4) Isolation of the daptomycin complex from B. subtilis cells clearly had different traces from the in vitro complex; is it possible that lipid II is present in the B. subtilis complex? If not, a time-course extraction could be useful to support the model that different complexes have different activities. Isolates from early-stage incubation with daptomycin may lack lipid II but isolates from longer incubations may have lipid II present as the complex shifts from insertion to bactericidal.

      From the day we isolated the complex from B. subtilis, we have been looking for evidence for the previously proposed lipid complexes containing lipid II or other isoprenyl lipids but have not been successful. We did not see any sign of lipid II or other isoprenyl lipids in the MALDI or ESI mass spectrometric data. In a separate LC-MS analysis, the minute peaks in the HPLC traces did not correspond to the expected complexes. However, this does not mean that such complexes are not present in the isolated PG-containing complex because: (1) the amount of such complexes may be too small to be detected due to the low content of the isoprenyl lipids; (2) the isoprenyl lipids, particularly lipid II, are not easily ionizable for detection in mass spectrometry due to their size and unique structure.

      We don’t think the drug treatment time is the reason for the failure in detecting lipid II or other isoprenyl lipids. In our reported experiment, the cells were treated with a very high dose of Dap for 2 hours before extraction. In a separate experiment done recently, we treated B. subtilis at 1/3 of the used dose under the same conditions and found all treated cells were dead after 1 hour in a titration assay, consistent with the results from reported time-killing assays in the literature. From this result, the proposed bactericidal lipid-containing complex should have been formed in the treated cells used in our extraction and isolated along with the PG-containing complex. It was not detected, likely due to the reasons discussed above. To avoid the interference of the PG-containing complex, a large amount of bacterial cells might have to be treated at a low dose to isolate a sufficient amount of the lipid II-containing complex for identification. However, isolation or identification of the lipid II-containing complex is outside the scope of the current investigation and is therefore not pursued.

      (5) Part of the daptomycin mechanism of interacting with bacterial membranes involves the flipping of daptomycin from one leaflet to another. There was some mentioned work on the consistency of results between micelles and vesicles, but the dynamics or existence of a flipping complex in the bilayer system wasn't addressed at all in this paper.

      The current investigation makes no attempt to solve all problems in the daptomycin mode of action and is limited to the uptake of the drug, up to the point when Dap is inserted into the membrane. Within this scope, flipping of the complex is not yet involved and is thus irrelevant to the study. How the complex is flipped and used to kill the bacteria is what should be investigated next.  

      (6) The authors mention data with phosphatidylethanolamine in the text, but I could not find the data in the main or supplemental figures. I recommend including it in at least one of the figures.

      We much appreciate that this error was identified. The POPE data was lost when the graphic (Fig. 2B) was assembled in Adobe to create Figure 2. We have redrawn the graphic and reassembled the figure to solve this problem. Fig. 2B has also been modified to use micromolar for the concentration of the lipids.

      (7) Readability point: I'd suggest some consistency in the concentrations mentioned. Making the concentrations either all molar-based or all percentage-based would make comparison across figures easier.

      As suggested, we have changed the % into micromolar concentrations in Fig. 2B and also in Fig. 3A. 

      (8) The model figure is quite difficult to interpret, particularly the final stage of the tail unfolding. I recommend the authors use a zoomed-in inset for this stage, or at least simplify the diagram by removing the non-participating lipid structures. The figure legend for the model figure should also have a brief description of the events and what the arrows mean, particularly the POPS PG arrow in the final panel of the figure. I am assuming here the authors are implying that daptomycin can transiently interact with one lipid species and move to another, but the arrow here suggests that daptomycin is moving through the lipid headgroup space.

      We really appreciate the suggestions. As suggested, we have added an inset to show the pre-insertion complex more clearly. In addition, we have removed the green arrows originally intended to show the re-organization/movement of the phospholipids. Moreover, the legend is changed to ‘Proposed mechanism for the two-phased uptake of Dap into bacterial membrane. In the first phase, Dap reversibly binds to negative phospholipids with a hidden tail in the headgroup region, where it combines with two PG molecules to form a pre-insertion complex. In the second phase, the hidden tail unfolds and irreversibly inserts into the membrane. The inset shows the headgroup of the pre-insertion complex with the broad arrow showing the direction for the unfolding of the hidden tail. The red dots denote Ca2+.’

      (9) The authors listed the Kd for daptomycin and 2 PG as 7.2 × 10⁻¹⁵ M². Is this correct? This is an affinity in the femtomolar range.

      Please note that this Kd is for the simultaneous binding of two PG molecules, not for the binding of a single ligand, which is what a Kd usually refers to. Assuming that each PG contributes equally to this interaction, the binding affinity for each ligand is then the square root of 7.2 × 10⁻¹⁵ M², which equals 8.5 × 10⁻⁸ M. This is equivalent to a nanomolar affinity for PG and is a reasonably high affinity.
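
      The arithmetic behind this conversion can be checked with a short sketch. The helper name is hypothetical, and the equal-contribution assumption is the one stated above rather than a measured property.

```python
def per_ligand_kd(overall_kd, n_ligands):
    """Per-ligand dissociation constant as the n-th root of the overall
    constant, assuming every ligand contributes equally to binding."""
    return overall_kd ** (1.0 / n_ligands)


overall_kd_m2 = 7.2e-15                 # overall Kd for Dap + 2 PG, in M^2
kd_per_pg_m = per_ligand_kd(overall_kd_m2, 2)   # per-PG Kd, in M
kd_per_pg_nm = kd_per_pg_m * 1e9                # same value, in nM

print(f"{kd_per_pg_m:.2e} M  ({kd_per_pg_nm:.0f} nM)")
```

      The result is about 85 nM per PG, which is why an overall constant that looks femtomolar when read as a single-ligand Kd corresponds to nanomolar affinity per ligand.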

      Reviewer #3 (Recommendations For The Authors):

      (1) The authors reported an increase in daptomycin intensity with the increasing amount of negatively charged DMPG. A similar observation has been reported for GUVs, however, the authors did not refer to this paper in their manuscript: E. Krok, M. Stephan, R. Dimova, L. Piatkowski, Tunable biomimetic bacterial membranes from binary and ternary lipid mixtures and their application in antimicrobial testing, Biochim. Biophys. Acta - Biomembr. 1865 (2023) [1]. This paper is also consistent with the authors' observation that there is negligible fluorescence detected for the membranes composed of PC lipids upon exposure to the Dap treatment.

      As suggested, this paper is cited as ref. 29 in the revision by adding the following sentence at the end of the section ‘Dependence of Dap uptake on phosphatidylglycerol.’: ‘PG-dependent increase of the steady-state fluorescence was also observed in giant unilamellar vesicles (GUVs).29’. The numbering is changed accordingly for the remaining references.  

      (2) Please include the plot of the steady-state Kyn fluorescence vs the content of POPA (Figure 2C shows traces for DMPG, CL, and POPS). Both POPA and POPS lipids are negatively charged, however, POPS seems to interact with Dap, while POPA does not. In my opinion, this observation is really interesting and might deserve a more thorough discussion. The authors might want to describe what could be the mechanism behind this lipid-specific mode of binding.

      As suggested, a plot is now added for POPA in Fig. 2C, which is basically a flat line without a significant increase in the Kyn fluorescence. Indeed, the different effects of the negative phospholipids are very interesting, indicating that the reversible binding of Dap to the lipid surface depends not only on the Ca2+-mediated ionic interaction but also on the structure of the headgroup. In other words, Dap recognizes the phospholipids at the surface binding stage. Considering this headgroup specificity, the last sentence in the second paragraph in ‘Discussion’ is changed from ‘In addition, due to the low lipid specificity, this reversible binding likely involves Ca2+-mediated ionic interaction between Dap and the phosphoryl moiety of the headgroups.’ to ‘In addition, due to the specificity for negative phospholipids (Fig. 2B and 2C), this reversible binding of Dap likely involves both a nonspecific Ca2+-mediated ionic interaction and a specific interaction with the remaining part of the headgroups.’

      (3) The authors write that they propose a novel mechanism for the Ca2+-dependent insertion of Dap to the bacterial membrane, however, they rather ignored the already published findings and hypotheses regarding this process. In fact the role of Ca2+, as well as the proposed conformational changes of Dap, which allow its deeper insertion into the membrane are well known:

      The role of Ca2+ ions in the mechanism of binding is actually three-fold: (i) neutralization of daptomycin charge [2], (ii) creating the connection between lipids and daptomycin and (iii) inducing two daptomycin conformational changes. It should be noted that the interactions between calcium ions and daptomycin are 2-3 orders of magnitude stronger than between daptomycin and PG lipids [3,4]. Thus, upon the addition of CaCl2 to the solution, the divalent cations of calcium bind preferentially to the daptomycin, rather than to the negatively charged PG lipids, which results in the decrease of daptomycin net negative charge but also leads to its first conformational change [4]. Upon binding between calcium ions and two aspartate residues, the area of the hydrophobic surface increases, which allows the daptomycin to interact with the negatively charged membrane. In the next step, Ca2+ acts as a bridge connecting daptomycin with the anionic lipids. This event leads to the second conformational change, which enables deeper insertion of daptomycin into the lipid membrane and enables its fluorescence [4]. The overall mechanism has a sequential character, where the binding of daptomycin-Ca2+ complex to the negatively charged PG (or CA) occurs at the end.

      The authors should focus on emphasizing the novelty of their manuscript, keeping in mind the already published paper.

      We agree with the comments on the three general roles of calcium ion in the Dap interaction with membrane. The current investigation does not ignore the previous findings, which involve many more works than mentioned above, but takes these findings as common knowledge. Actually, the role of calcium ion is not the focus of current work. Instead, the current work focuses on how the drug is taken up and inserted into the membrane in the presence of the ion and how its structure changes in this process. With the known roles of calcium ion in mind, we propose an uptake mechanism (Fig. 6) that shows no conflict with the common knowledge.

      We would like to point out that the ‘deeper insertion into the membrane’ in the comment is different from the membrane insertion referred to in our manuscript. This ‘deeper insertion’ still remains in the reversible stage of binding to the membrane surface because all negative phospholipids can do this (causing a conformational change and fluorescence increase, as quantified in Fig.2C) but now we know that only PG can enable irreversible membrane insertion because of our work. In addition, the comment that calcium binding to daptomycin causes first conformational change is not supported by our finding that no conformational change is found for Dap in the presence of calcium in a lipid-free environment (Fig. 3B). One important aspect of novelty and contribution of our work is to clear up some of these ambiguities in the literature. Another contribution of our work is to demonstrate the formation of a stable complex between Dap and PG with a defined stoichiometry and its crucial role in the drug uptake. 

      (4) One paragraph in the section "Ca2+- dependent interaction between Dap and DMPG" is devoted to a discussion of the formation of precipitate upon extraction of DMPG-containing micelles, exposed to Dap in the calcium-rich environment. In contrast, in the absence of Dap, no precipitate was detected. The authors did not provide any visual proof for their statement. Please include proper photographs in the supplementary information.

      The precipitate formed upon extraction of the DMPG-containing micelles was too little to be visually identifiable but could be collected by centrifugation and detected by fluorescence or HPLC after dissolving in DMSO. For visualization, we show below the precipitate formed using higher amounts of Dap and DMPG. The Dap-DMPG-Ca2+ complex (left tube) was formed by mixing 1 mM Dap, 2 mM DMPG and 1 mM Ca2+ and the control (right tube) was a mixture of 2 mM DMPG and 1 mM Ca2+. This is now added as Fig. S7 in the supplementary information (the index is modified accordingly) and cited in the main text.

      (5) The authors wrote that it is not clear how many calcium ions are bound to Dap-2PG complex (page 11, Discussion section). There are already reports discussing this issue. I recommend citing the paper discussing that exactly two Ca2+ ions bind to a single Dap molecule: R. Taylor, K. Butt, B. Scott, T. Zhang, J.K. Muraih, E. Mintzer, S. Taylor, M. Palmer, Two successive calcium-dependent transitions mediate membrane binding and oligomerization of daptomycin and the related antibiotic A54145, Biochim. Biophys. Acta - Biomembr. 1858, (2016) 1999-2005 [5]

      We were aware of the cited work that shows binding of two Ca2+ but also noted that there are more works showing one Ca2+ in the binding, such as the paper in [Ho, S. W., Jung, D., Calhoun, J. R., Lear, J. D., Okon, M., Scott, W. R. P., Hancock, R. E. W., & Straus, S. K. (2008), Effect of divalent cations on the structure of the antibiotic daptomycin. European Biophysics Journal, 37(4), 421–433.]. That was the reason we said ‘it is not clear how many calcium ions are bound to Dap-2PG complex’. Now, both papers are cited (as Ref. #33, 34) to support this statement.

      (6) The authors wrote two contradictory statements:

      -  PG cannot be found in mammalian cell membranes:

      "Moreover, the complete dependence of the membrane insertion on PG also explains why Dap selectively attacks Gram-positive bacteria without affecting mammalian cells, because PG is present only in bacterial membrane but not in mammalian membrane. " (Page 10, Discussion section, last sentence of the first paragraph)

      "However, Dap absorbed on bacterial surface is continuously inserted into the acyl layer via formation of complex with PG in a time scale of minutes, whereas no irreversible insertion of Dap occurs on mammalian membrane due to the absence of PG while the bound Dap is continuously released to the circulation as the drug is depleted by the bacteria." (Page 13, Discussion section)

      -  PG in trace amounts is present in mammalian membranes:

      "The proposed requirement of the pre-insertion quaternary complex increases the threshold of PG content for the membrane insertion to happen and thus makes it impossible on the surface of mammalian cells even if their plasma membrane contains a trace amount of PG." (Page 13, Discussion section).

      In fact, phosphatidylglycerol comprises 1-2 mol% of the mammalian cell membranes. Please, correct this information, which in this form is misleading to the readers.

      We appreciate the comments about the PG content in mammalian cells. Changes are made as listed below:

      (1) p10, the sentence is changed to ‘Moreover, the complete dependence of the membrane insertion on PG also explains why Dap selectively attacks Gram-positive bacteria without affecting mammalian cells, because PG is a major phospholipid in bacterial membrane but is a minor component in mammalian membrane.’ 

      (2) p13, the sentence is changed to ‘However, Dap absorbed on bacterial surface is continuously inserted into the acyl layer via formation of complex with PG in a time scale of minutes, whereas little irreversible insertion of Dap occurs on mammalian membrane due to the low content of PG while the bound Dap is continuously released to the circulation as the drug is depleted by the bacteria.’

      (3) p13, another sentence is modified to ‘The proposed requirement of the pre-insertion quaternary complex increases the threshold of PG content for the membrane insertion to happen and thus makes it less likely on the surface of mammalian cells that contain PG at a low level in the membrane.’ 

      (7) Please include information that Dap is effective only against Gram-positive bacteria and does not show antimicrobial properties against Gram-negative strains. The authors focused on emphasizing that Dap does not affect mammalian membranes, most likely due to the low PG content, however even membranes of Gram-negative bacteria are not susceptible to the Dap, despite the relatively high content of negatively charged PG in the inner membrane (e.g. inner cell membrane of E. coli has ~20% PG).

      The requested information is already included in ‘Introduction’. In this part, Dap is introduced as being active only against Gram-positive bacteria, implying that it is not active against Gram-negative bacteria. Dap is inactive against E. coli and other Gram-negative bacteria because the outer membrane prevents the antibiotic from accessing the PG in the inner membrane to cause any harm. When the outer membrane is removed, Dap will also attack the plasma membrane of Gram-negative bacteria.

      Literature cited in the comments:

      (1) E. Krok, M. Stephan, R. Dimova, L. Piatkowski, Tunable biomimetic bacterial membranes from binary and ternary lipid mixtures and their application in antimicrobial testing, Biochim. Biophys. Acta - Biomembr. 1865 (2023). https://doi.org/10.1101/2023.02.12.528174.

      (2) S.W. Ho, D. Jung, J.R. Calhoun, J.D. Lear, M. Okon, W.R.P. Scott, R.E.W. Hancock, S.K. Straus, Effect of divalent cations on the structure of the antibiotic daptomycin, Eur. Biophys. J. 37 (2008) 421-433. https://doi.org/10.1007/S00249-007-0227-2.

      (3) A. Pokorny, P.F. Almeida, The Antibiotic Peptide Daptomycin Functions by Reorganizing the Membrane, J. Membr. Biol. 254 (2021) 97-108. https://doi.org/10.1007/s00232-021-00175-0.

      (4) L. Robbel, M.A. Marahiel, Daptomycin, a bacterial lipopeptide synthesized by a nonribosomal machinery, J. Biol. Chem. 285 (2010) 27501-27508. https://doi.org/10.1074/JBC.R110.128181.

      (5) R. Taylor, K. Butt, B. Scott, T. Zhang, J.K. Muraih, E. Mintzer, S. Taylor, M. Palmer, Two successive calcium-dependent transitions mediate membrane binding and oligomerization of daptomycin and the related antibiotic A54145, Biochim. Biophys. Acta - Biomembr. 1858 (2016) 1999-2005. https://doi.org/10.1016/J.BBAMEM.2016.05.020.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02545

      Corresponding author(s): Woo Jae, Kim

      1. General Statements

      We sincerely appreciate the positive and constructive feedback provided by all three reviewers. Their insightful comments have been invaluable in guiding our revisions. In response, we have made every effort to address their suggestions through additional experiments and by restructuring our manuscript to improve clarity and coherence.

      In this revision, we have streamlined the presentation of our data to enhance the narrative flow, ensuring that it is more accessible to a general readership. We believe that these changes not only strengthen our manuscript but also align with the reviewers' recommendations for improvement.

      We are hopeful that the revisions we have implemented meet the expectations of the reviewers and contribute to a clearer understanding of our findings. Thank you once again for your thoughtful critiques, which have greatly aided us in refining our work.


      2. Point-by-point description of the revisions

      Reviewer #1

      General comment: This manuscript by Song et al. investigates the molecular mechanisms underlying changes in mating duration in Drosophila induced by previous experience. As they have shown previously, they find that male flies reared in isolation have shorter mating duration than those reared in groups, and also that male flies with previous mating experience have shorter mating duration than sexually naïve males. They have conducted a myriad of experiments to demonstrate that the neuropeptide SIFa is required for these changes in mating duration. They have further provided evidence that SIFa-expressing neurons undergo changes in synaptic connectivity and neuronal firing as a result of previous mating experience. Finally, they argue that SIFa neurons form reciprocal connections with sNPF-expressing neurons, and that communication within the SIFa-sNPF circuit is required for experience-dependent changes in mating duration. These results are used to assert that SIFa neurons track the internal state of the flies to modulate behavioral choice.

         __Answer:__ We appreciate the reviewer's thoughtful comments and commendations regarding our manuscript. The recognition of our investigation into the molecular mechanisms influencing mating duration in *Drosophila* is greatly valued. In particular, we are grateful for the reviewer's positive remarks about our comprehensive experimental approach to demonstrate the role of the neuropeptide SIFa in these changes. The evidence we provided indicating that SIFa-expressing neurons undergo alterations in synaptic connectivity and neuronal firing due to previous social experiences is crucial for elucidating the underlying neural circuitry involved in experience-dependent behaviors. Finally, we are thankful for the recognition of our assertion that SIFa neurons form reciprocal connections with sNPF-expressing neurons, emphasizing the importance of this circuit in modulating behavioral choices based on internal states. To provide stronger evidence for the interactions between SIFa and sNPF, we conducted detailed GCaMP experiments, which revealed intriguing neural connections between these two neuropeptides. We have included this new data in our main figure. We believe these insights contribute significantly to the existing literature on neuropeptidergic signaling and its implications for understanding complex behaviors in *Drosophila*. We look forward to addressing any further comments and enhancing our manuscript based on your invaluable feedback. Thank you once again for your constructive critique and support.
      

      Major concerns:

      Comment 1. The authors are to be commended for the sheer quantity of data they have generated, but I was often overwhelmed by the figures, which try to pack too much into the space provided. As a result, it is often unclear what components belong to each panel. Providing more space between each panel would really help.

         __Answer:__ We sincerely appreciate the reviewer’s commendation regarding the extensive data we have generated in our study. It is gratifying to know that our efforts to provide a comprehensive analysis of the molecular mechanisms underlying changes in mating duration have been recognized. We understand the concern regarding the density of information presented in our figures. We aimed to convey a wealth of data to support our findings, but we acknowledge that this may have led to some confusion regarding the organization and clarity of the panels. We are grateful for your constructive feedback on this matter. In response, we have significantly reduced the density of the main figures and decreased the size of the graphs to improve clarity. We have also increased the spacing between panels to ensure that each component is more easily distinguishable. Further details will be provided in our responses to each comment below.
      

      Comment 2. This is a rare instance where I would recommend paring down the paper to focus on the more novel, clear and relevant results. For example, all of Figure 2 shows the projection pattern of SIFa+ neuron dendrites and axons, which have been reported by multiple previous papers. Figure 7G and J show trans-tango data and SIFaR-GAL4 expression patterns, which were previously reported by Dreyer et al., 2019. These parts could be removed to supplemental figures. Figure 5 details experiments that knock down expression of different neurotransmitter receptors within the SIFa-expressing cells. The results here are less definitive than the SIFa knockdown results, and the SCope data supporting the idea that these receptors are expressed in SIFa-expressing neurons is equivocal. I would recommend removing these data (perhaps they could serve as the basis for another manuscript) or focusing solely on the CCHa1R results, which is the only manipulation that affects both LMD and SMD.

         __Answer:__ We sincerely appreciate the reviewer’s positive feedback regarding the extensive data generated in our study. We also fully agree with the reviewer that the sheer volume of our data made it challenging to support our hypothesis that SIFa neurons serve as a hub for integrating multiple neuropeptide inputs and orchestrating various behaviors related to energy balance, as highlighted in our new Figure 5N.
      
         In response to the reviewer's suggestions, we have streamlined our manuscript by removing excessive and redundant data to enhance clarity and simplicity. First, we have moved Figure 2 to the supplementary materials as the reviewer noted that the branching patterns of SIFa neurons are well-documented in previous literature. Second, we relocated the trans-tango data from Figure 7G to Figure S7, since this information is also well-established. We retained this data in the supplementary section to illustrate the connection of SIFa to our recent findings regarding SIFaR24F06 neuron connections. Additionally, we have completely removed the neuropeptide receptor input screening data previously included in Figure 5, as well as Figure S8, which presented fly SCope tSNE data. As suggested by the reviewer, we plan to utilize these data for a future paper focused on investigating the underlying mechanisms of SIFa inputs that modulate SIFa activity. Thanks to the reviewer’s constructive suggestions, we believe our manuscript is now more convincing and clearer for readers.
      

      Comment 3. Finally, I would like the authors to spend more time explaining how they think the results tie together. For example, how do the authors think the changes in branching and activity in SIFa-expressing neurons tie to the change in mating duration provoked by previous experience? It would benefit the manuscript to simplify and clarify the message about what the authors think is happening at the mechanistic level. The various schematics (eg. Fig 7N) describe the results but the different parts feel like separate findings rather than a single narrative. (MECHANISMS diagram and explanation)

         __Answer:__ We appreciate the reviewer’s constructive comments, which have significantly improved our manuscript and conclusions for our readers. As the reviewer will see, we have made substantial revisions in line with the suggestions provided. We dedicated additional time to clarify the electrical activities and synaptic plasticity of SIFa neurons in relation to internal states that orchestrate various behaviors. We have summarized our hypothesis regarding the mechanistic role of SIFa neurons in Figure 5N. In brief, we propose that SIFa neurons function as a hub that receives diverse neuropeptidergic signals, which subsequently alters their electrical activity and synaptic branching. This, in turn, leads to different internal states. The internal states of SIFa neurons can then be interpreted by SIFaR-expressing cells, which help orchestrate various behaviors and physiological responses. We aim to address these aspects further in another manuscript that has been co-submitted alongside this one [1].
      

      Comment 4. Most of the experiments lack traditional controls. For example, in experiments in Fig 1C-K, one would typically include genetic controls that contain either the GAL4 or UAS elements alone. The authors should explain their decision to omit these control experiments and provide an argument for why they are not necessary to correctly interpret the data. In this vein, the authors have stated in the methods that stocks were outcrossed at least 3x to Canton-S background, but 3 outcrosses is insufficient to fully control for genetic background.

         __Answer:__ We sincerely thank the reviewer for insightful comments regarding the absence of traditional genetic controls in our study of LMD and SMD behaviors. We acknowledge the importance of such controls and wish to clarify our rationale for not including them in the current investigation. The primary reason for not incorporating all genetic control lines is that we have previously assessed the LMD and SMD behaviors of GAL4/+ and UAS/+ strains in our earlier studies. Our past experiences have consistently shown that 100% of the genetic control flies for both GAL4 and UAS exhibit normal LMD and SMD behaviors. Given these findings, we deemed the inclusion of additional genetic controls to be non-essential for the present study, particularly in the context of extensive screening efforts. We understand the value of providing a clear rationale for our methodology choices. To this end, we have added a detailed explanation in the "MATERIALS AND METHODS" section and the figure legends of Figure 1. This clarification aims to assist readers in understanding our decision to omit traditional controls, as outlined below.
      

      "Mating Duration Assays for Successful Copulation

      The mating duration assay in this study has been reported[33,73,93]. To enhance the efficiency of the mating duration assay, we utilized the genetically modified fly line Df(1)Exel6234 (DF hereafter) in this study, which harbors a deletion of a specific genomic region that includes the sex peptide receptor (SPR)[94,95]. Previous studies have demonstrated that virgin females of this line exhibit increased receptivity to males[95]. We conducted a comparative analysis between the virgin females of this line and the CS virgin females and found that both groups induced SMD. Consequently, we have elected to employ virgin females from this modified line in all subsequent studies. For naïve males, 40 males from the same strain were placed into a vial with food for 5 days. For single reared males, males of the same strain were collected individually and placed into vials with food for 5 days. For experienced males, 40 males from the same strain were placed into a vial with food for 4 days, then 80 DF virgin females were introduced into the vials for the last day before the assay. 40 DF virgin females were collected from bottles and placed into a vial for 5 days. These females provide both sexually experienced partners and mating partners for mating duration assays. On the fifth day after eclosion, males of the appropriate strain and DF virgin females were mildly anaesthetized by CO2. After placing a single female into the mating chamber, we inserted a transparent film and then placed a single male on the other side of the film in each chamber. After allowing 1 h of recovery in the mating chamber in 25℃ incubators, we removed the transparent film and recorded the mating activities. Only those males that succeeded in mating within 1 h were included in the analyses. Initiation and completion of copulation were recorded with an accuracy of 10 sec, and total mating duration was calculated for each couple. All assays were performed from noon to 4pm. 
Genetic controls with GAL4/+ or UAS/+ lines were omitted from supplementary figures, as prior data confirm their consistent exhibition of normal LMD and SMD behaviors [33,73,93,96,97]. Hence, genetic controls for LMD and SMD behaviors were incorporated exclusively when assessing novel fly strains that had not previously been examined. In essence, internal controls were predominantly employed in the experiments, as LMD and SMD behaviors exhibit enhanced statistical significance when internally controlled. Within the LMD assay, both group and single conditions function reciprocally as internal controls. A significant distinction between the naïve and single conditions implies that the experimental manipulation does not affect LMD. Conversely, the lack of a significant discrepancy suggests that the manipulation does influence LMD. In the context of SMD experiments, the naïve condition (equivalent to the group condition in the LMD assay) and sexually experienced males act as mutual internal controls for one another. A statistically significant divergence between naïve and experienced males indicates that the experimental procedure does not alter SMD. Conversely, the absence of a statistically significant difference suggests that the manipulation does impact SMD. Hence, we incorporated supplementary genetic control experiments solely if they were deemed indispensable for testing. All assays were performed from noon to 4 PM. We conducted blinded studies for every test[98,99].
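
      The internal-control logic described above runs opposite to the usual reading of a significance test: a significant difference means the behavior is intact. A minimal Python sketch of that decision rule follows. The function name and the 0.05 threshold are assumptions made for illustration and are not taken from the manuscript; the p-value is assumed to come from the unpaired Student's t test between the two rearing conditions.

```python
def interpret_internal_control(p_value: float, alpha: float = 0.05) -> str:
    """Interpret an LMD/SMD comparison under the internal-control logic.

    p_value: result of comparing group (naive) males against single
             (or sexually experienced) males of the same genotype.
    alpha:   significance threshold (0.05 is an assumed default).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < alpha:
        # The conditions still differ, so the plasticity is preserved.
        return "behavior intact: manipulation does not affect LMD/SMD"
    # The conditions no longer differ, so the plasticity is lost.
    return "behavior lost: manipulation affects LMD/SMD"


print(interpret_internal_control(0.001))
print(interpret_internal_control(0.40))
```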

         While we have previously addressed this type of reviewer feedback in our published manuscript [2–7], we appreciate the reviewer’s suggestion to include traditional genetic control experiments. In response, we conducted all feasible combinations of genetic control experiments for LMD/SMD during the revision period. The results are presented in the supplementary figures and are described in the main text.
      
         We appreciate the reviewer's inquiry regarding the genetic background of our experimental lines. In response to the comments, we would like to clarify the following. All of our GAL4, UAS, or RNAi lines, which were utilized as the virgin female stock for outcrosses, have been backcrossed to the Canton-S (CS) genetic background for over ten generations. The majority of these lines, particularly those employed in LMD assays, have been maintained in a CS backcrossed status for several years, ensuring a consistent genetic background across multiple generations. Our experience has indicated that the genetic background, particularly that of the X chromosome inherited from the female parent, plays a pivotal role in the expression of certain behavioral traits. Therefore, we have consistently employed these fully outcrossed females as virgins for conducting experiments related to LMD and SMD behaviors. It is noteworthy that, in contrast to the significance of genetic background for LMD behaviors, we have previously established in our work [6] that the genetic background does not significantly influence SMD behaviors. This distinction is important for the interpretation of our findings. To provide a comprehensive understanding of our experimental design, we have detailed the genetic background considerations in the __"Materials and Methods"__ section, specifically in the subsection __"Fly Stocks and Husbandry"__ as outlined below.
      

      "To reduce the variation from genetic background, all flies were backcrossed for at least 3 generations to CS strain. For the generation of outcrosses, all GAL4, UAS, and RNAi lines employed as the virgin female stock were backcrossed to the CS genetic background for a minimum of ten generations. Notably, the majority of these lines, which were utilized for LMD assays, have been maintained in a CS backcrossed state for long-term generations subsequent to the initial outcrossing process, exceeding ten backcrosses. Based on our experimental observations, the genetic background of primary significance is that of the X chromosome inherited from the female parent. Consequently, we consistently utilized these fully outcrossed females as virgins for the execution of experiments pertaining to LMD and SMD behaviors. Contrary to the influence on LMD behaviors, we have previously demonstrated that the genetic background exerts negligible influence on SMD behaviors, as reported in our prior publication [6]. All mutants and transgenic lines used here have been described previously."

      Comment 5. Throughout the manuscript, the authors appear to use a single control condition (sexually naïve flies raised in groups) to compare to both males raised singly and males with previous sexual experience. These control conditions are duplicated in two separate graphs, one for long mating duration and one for short mating duration, but they are given different names (group vs naïve) depending on the graph. If these are actually the same flies, then this should be made clear, and they should be given a consistent name across the different "experiments".

         __Answer:__ We are grateful to the reviewer for highlighting the potential for confusion among readers regarding the visualization methods used in our figures. In response to this valuable feedback, we have now included a more detailed explanation of the graph visualization techniques in the legends of Figure 1, as detailed below. This additional information should enhance the clarity and understanding of the figure for all readers.
      

      In the mating duration (MD) assays, light grey data points denote males that were group-reared (or sexually naïve), whereas blue (or pink) data points signify males that were singly reared (or sexually experienced). The dot plots represent the MD of each male fly. The mean value and standard error are labeled within the dot plot (black lines). Asterisks represent significant differences, as revealed by the unpaired Student’s t test, and ns represents non-significant differences. M.D represents mating duration. DBMs represent the 'difference between means' for the evaluation of estimation statistics (See MATERIALS AND METHODS). Asterisks represent significant differences, as revealed by the Student’s t test (* p

      Comment 6. The authors use SCope data to provide evidence for co-expression of SIFa and other neurotransmitters or neuropeptide receptors. The graphs they show are hard to read and it is not clear to what extent the gene expression is actually overlapping. It would be more definitive to show graphs that indicate which percentage of SIFa-expressing cells co-express other neurotransmitter components, and what the actual level of expression of the genes is. The authors should also provide more information on how they identified the SIFa+ cells in the fly atlas dataset. These are important pieces of information to be able to interpret the effects of manipulation of these other neurotransmitter systems within SIFa-expressing cells on mating duration.

      __Answer:__ We appreciate the reviewer for pointing out the potential for confusion among readers regarding the visualization methods used in our figures, particularly concerning the tSNE plots of scRNA-seq data. As mentioned in our previous response, we have removed most of the tSNE plots related to co-expression data with SIFa and NPRs, which we believe will reduce any confusion for readers interpreting these plots. However, we have retained a few tSNE plots, specifically Figures 2N-O, to confirm the potential co-expression of the ple and Vglut genes in SIFa cells. We understand the reviewer’s concerns about the clarity of the presented data and the necessity for more detailed information regarding the extent of co-expression and the identification of SIFa-expressing cells. To address these concerns, we have included a comprehensive description of our methods in the __MATERIALS AND METHODS__ section below.

      "Single-nucleus RNA-sequencing analyses

      The snRNAseq dataset analyzed in this paper is published in [112]. The associated resources comprise the Nextflow pipelines (VSN, https://github.com/vib-singlecell-nf), raw and processed datasets for users to explore, and a crowd-annotation platform with voting, comments, and references through SCope (https://flycellatlas.org/scope), linked to an online analysis platform in ASAP (https://asap.epfl.ch/fca). For the generation of the tSNE plots, we utilized the Fly SCope website (https://scope.aertslab.org/#/FlyCellAtlas/*/welcome). Within the session interface, we selected the appropriate tissues and configured the parameters as follows: 'Log transform' enabled, 'CPM normalize' enabled, 'Expression-based plotting' enabled, 'Show labels' enabled, 'Dissociate viewers' enabled, and both 'Point size' and 'Point alpha level' set to maximum. For all tissues, we referred to the individual tissue sessions within the '10X Cross-tissue' RNAseq dataset. Each tSNE visualization depicts the coexpression patterns of genes, with each color corresponding to the genes listed on the left, right, and bottom of the plot. The tissue name, as referenced on the Fly SCope website, is indicated in the upper left corner of the tSNE plot. Dashed lines denote the significant overlap of cell populations annotated by the respective genes. Coexpression between genes or annotated tissues is visually represented by differentially colored cell populations. For instance, yellow cells indicate the coexpression of a gene (or annotated tissue) shown in red and another gene (or annotated tissue) shown in green. Cyan cells signify coexpression between green and blue, purple cells between red and blue, and white cells the coexpression of all three colors (red, green, and blue). Consistency in the tSNE plot visualization is preserved across all figures.

      Single-cell RNA sequencing (scRNA-seq) data from Drosophila melanogaster were obtained from the Fly Cell Atlas website (https://doi.org/10.1126/science.abk2432). Oenocyte gene expression analysis employed UMI (Unique Molecular Identifier) data extracted from the 10x VSN oenocyte (Stringent) loom and h5ad file, encompassing a total of 506,660 cells. The Seurat (v4.2.2) package (https://doi.org/10.1016/j.cell.2021.04.048) was utilized for data analysis. Violin plots were generated using the “VlnPlot” function, with the cell types split by FCA."
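      The color convention described in the tSNE methods text above (red plus green gives yellow, green plus blue gives cyan, red plus blue gives purple, all three give white) can be sketched as a small lookup, together with the kind of co-expression percentage requested in Comment 6. This is an illustrative sketch only: the function names, the expression threshold of zero, the "none" label for non-expressing cells, and the example values are assumptions, and SCope itself renders graded color intensities rather than binary calls.

```python
# Additive color convention for three genes plotted in red, green, and blue.
COLOR_NAMES = {
    (True, False, False): "red",
    (False, True, False): "green",
    (False, False, True): "blue",
    (True, True, False): "yellow",   # red + green
    (False, True, True): "cyan",     # green + blue
    (True, False, True): "purple",   # red + blue
    (True, True, True): "white",     # all three genes
    (False, False, False): "none",   # hypothetical label: no gene detected
}


def coexpression_color(red_expr, green_expr, blue_expr, threshold=0.0):
    """Name the color of one cell from its three expression values."""
    key = (red_expr > threshold, green_expr > threshold, blue_expr > threshold)
    return COLOR_NAMES[key]


def fraction_coexpressing(cells, threshold=0.0):
    """Fraction of red-positive cells that are also green-positive."""
    red_positive = [c for c in cells if c[0] > threshold]
    if not red_positive:
        return 0.0
    both = [c for c in red_positive if c[1] > threshold]
    return len(both) / len(red_positive)


# Hypothetical (red, green, blue) expression values for four cells.
cells = [(2.0, 1.5, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 1.0), (1.0, 2.0, 0.5)]
colors = [coexpression_color(*c) for c in cells]
fraction = fraction_coexpressing(cells)
```

      Here the red channel would stand for the marker gene of interest (for example SIFa) and the green channel for a candidate co-expressed gene.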

         We have also included detailed descriptions in the figure legends for the initial tSNE plot presented below to help readers clearly understand the significance of this visualization.
      

      "Each tSNE visualization depicts the coexpression patterns of genes, with each color corresponding to the genes listed on the left, right, and/or bottom of the plot. The tissue name, as referenced on the Fly SCope website, is indicated in the upper left corner of the tSNE plot. Consistency in the tSNE plot visualization is preserved across all figures."

      Comment 7. I would like to see more information on how the thresholding and normalization was done for immunohistochemistry experiments. Was thresholding applied equally across all datasets? Furthermore, "overlap" of DenMark and Syt-eGFP is taken as evidence for synaptic connectivity, but the latter requires more than just overlap in the location of the axon terminal and dendrite regions of the neuron.

      __Answer:__ Thank you for your continued engagement with our manuscript and for highlighting the need for further clarification on our methods. Your attention to the details of our immunohistochemistry experiments is commendable, and we agree that providing a clear explanation of our thresholding and normalization procedures is essential for the transparency and reproducibility of our results. We concur that the intensity of these signals is indeed correlated with the area measurements, which is a critical factor to consider. In response to the reviewer's valuable suggestion, we have revised our approach and now present our data based on intensity measurements. Additionally, we have updated the labeling of our Y-axis to "Norm. GFP Int.", which stands for "normalized GFP intensity". This change ensures clarity and consistency in the presentation of our data. We primarily adhered to the established methods outlined by Kayser et al. [8]. To address your first point, we have now included a more detailed description of our thresholding and normalization procedures in the __MATERIALS AND METHODS__ section as below.

      "Quantitative analysis of fluorescence intensity

      To ascertain calcium levels and synaptic intensity from microscopic images, we dissected and imaged five-day-old flies of various social conditions and genotypes under uniform conditions. The GFP signal in the brains and VNCs was amplified through immunostaining with chicken anti-GFP primary antibody. Image analysis was conducted using ImageJ software. For the quantification of fluorescence intensities, an investigator, blinded to the fly's genotype, thresholded the sum of all pixel intensities within a sub-stack to optimize the signal-to-noise ratio, following established methods [93]. The total fluorescent area or region of interest (ROI) was then quantified using ImageJ, as previously reported. For CaLexA or TRIC signal quantification, we adhered to protocols detailed by Kayser et al. [94], which involve measuring the ROI's GFP-labeled area by summing pixel values across the image stack. This method assumes that changes in the GFP-labeled area and intensity are indicative of alterations in the CaLexA and TRIC signal, reflecting synaptic activity. ROI intensities were background-corrected by measuring and subtracting the fluorescent intensity from a non-specific adjacent area, as per Kayser et al. [94]. For normalization, nc82 fluorescence is utilized for CaLexA, while RFP signal is employed for TRIC experiments, as the RFP signal from the TRIC reporter is independent of calcium signaling [76]. For the analysis of GRASP or tGRASP signals, a sub-stack encompassing all synaptic puncta was thresholded by a genotype-blinded investigator to achieve the optimal signal-to-noise ratio. The fluorescence area or ROI for each region was quantified using ImageJ, employing a similar approach to that used for CaLexA or TRIC quantification [93]. 'Norm. GFP Int.' refers to the normalized GFP intensity relative to the RFP signal."
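      The background correction and normalization steps described in the methods text above can be sketched as follows. The quantification itself was done in ImageJ; this Python sketch only illustrates the arithmetic. The function names, the use of mean pixel values, the background correction of the reference channel, and the example numbers are all assumptions, not details confirmed by the manuscript.

```python
from statistics import mean


def background_corrected(roi_pixels, background_pixels):
    """Mean ROI intensity minus the mean of an adjacent non-specific area."""
    return mean(roi_pixels) - mean(background_pixels)


def normalized_gfp_intensity(gfp_roi, gfp_background, ref_roi, ref_background):
    """GFP intensity relative to a reference channel (e.g. RFP for TRIC, nc82 for CaLexA)."""
    gfp = background_corrected(gfp_roi, gfp_background)
    ref = background_corrected(ref_roi, ref_background)
    if ref <= 0:
        raise ValueError("reference signal must exceed its background")
    return gfp / ref


# Hypothetical pixel values for one region of interest (not real data).
norm_gfp = normalized_gfp_intensity(
    gfp_roi=[120, 130, 110],
    gfp_background=[20, 20, 20],
    ref_roi=[220, 180, 200],
    ref_background=[0, 0, 0],
)
```

      Dividing by the reference channel makes values comparable across samples that differ in staining or imaging depth, which is the stated purpose of the 'Norm. GFP Int.' axis.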

      Comment 8. None of the RNAi experiments have been validated to demonstrate effective knockdown. In many cases, this would be difficult to do because of a lack of an antibody to quantify in a cell-specific manner; however, this fact should be acknowledged, especially in cases where there was found to be a lack of phenotype, which could result from lack of knockdown. The authors could also look for evidence in the literature of cases where RNAi lines they have used have been previously validated. For SIFa, knockdown can be easily confirmed with the SIFa antibody the authors have used elsewhere in the manuscript.

      __Answer:__ We appreciate the reviewer’s constructive and critical comments regarding the validation of our RNAi experiments through effective knockdown. We understand the reviewer’s concerns about achieving effective knockdown with RNAi; however, we have demonstrated in our unpublished preprint that the neuronal knockdown using independent SIFa-RNAi lines aligns with the SIFa mutant phenotype, which is consistent with our current findings on SIFa knockdown (Wong 2019). In most cases involving RNAi experiments, we have utilized independent RNAi strains to confirm consistent phenotypes and have compared these results with those from mutant phenotypes [1,9]. Therefore, we are confident that our observed SIFa phenotype results from effective RNAi knockdown. Nevertheless, we respect the reviewer’s comments and have conducted additional SIFa knockdown experiments using various GAL4 drivers, followed by immunostaining with SIFa antibodies. As shown in Figure S1B, both neuronal GAL4 drivers and SIFa-GAL4 effectively reduced SIFa immunoreactivity. We believe this indicates that our SIFa knockdown efficiently phenocopies the SIFa mutant phenotype. We also describe this result in the manuscript as follows:

      "Using the GAL4SIFa.PT driver and the elavc155 driver, we observed a significant decrease in SIFa immunoreactivity following SIFa-RNAi treatment, thereby confirming the effective knockdown of SIFa in these cells. In contrast, when SIFa-RNAi was expressed under the control of the repo-GAL4 driver, no significant change in SIFa immunoreactivity was detected (Fig. S1B). This control experiment highlights the specificity of the SIFa-RNAi effect and supports the conclusion that the behavioral changes observed in SMD and LMD are likely attributable to the targeted reduction of SIFa in the intended neuronal populations."

      Minor comments:

      Comment 1. There are quite a lot of citations to preprints, including preprints of the manuscripts under review. It seems inappropriate to cite a preprint of the manuscript you are submitting because it gives a false sense of strengthening the assertions being made in the manuscript.

         __Answer:__ We agree with the reviewer and have omitted all preprints that are currently under review, except for those that are deemed necessary, such as the Zhang et al. 2024 preprint, which is being submitted alongside this manuscript.
      

      Comment 2. It seems that labels are incorrect on a number of the immunohistochemistry figures. For example, in Fig 2N, it labels dendrites as green, but this is sytEGFP, which is the presynaptic terminal.

      __Answer:__ We thoroughly reviewed and corrected the errors in the labels.

      Comment 3. Fig 4N shows grasp between SIFa-LexA and sNPF-R-GAL4, but the authors have argued that these two components should both be expressed in SIFa-expressing cells. This would make grasp signal misleading, because it would appear in the SIFa-expressing cells even without synaptic contacts due to both split GFP molecules being expressed in these cells.

         __Answer:__ We appreciate the reviewer’s critical comments regarding the interpretation of our GRASP experiments. As the reviewer noted, we acknowledge that the GRASP results also indicate synaptic contacts between SIFa cells. We have elaborated on these results in the following sections.
      

      "This indicates that the synapses between SIFa cells expressing sNPF-R become stronger (S5K to S5M Fig)."

         However, we understand that readers may find the interpretation of this GRASP data confusing, so we have included additional explanations below to clarify.
      

      "This indicates that the synapses between SIFa cells expressing sNPF-R become stronger (S5K to S5M Fig) since we have found that SIFa cells express sNPF-R (Fig 3M, S5E and S5G)."

      Comment 4. For quantifying TRIC and CaLexA experiments (eg. Figure 6A-E), intensity of signal should be measured in addition to the area covered by the signal.

      __Answer:__ We concur with the reviewer. Since all of our analyses indicated that the area covered by the signal correlates with the signal intensity, we opted to use normalized intensity rather than area coverage.

      Conclusive Comments: This study will be most relevant to researchers interested in understanding neuronal control of behavior. It has provided novel information about the mechanisms underlying mating duration in flies, which is used to delineate how internal state influences behavioral outcomes. This represents a conceptual advance, particularly in identifying a cell type and molecule that influences mating duration decisions. The strength of the manuscript is the number of different assays used to investigate the central question from a number of angles. The limitation is that there is a lack of a big picture tying the different components of the manuscript together. Too much data is presented without providing a framework to understand how the data points fit together.

      • Answer: We sincerely appreciate the reviewer’s positive feedback regarding our study and the recognition of its relevance to researchers interested in understanding the neuronal control of behavior. We are grateful for the acknowledgment of our novel insights into the mechanisms underlying mating duration in Drosophila, particularly in how internal states influence behavioral outcomes. The identification of specific cell types and molecules that affect mating duration decisions indeed represents a significant conceptual advance. We also appreciate the reviewer’s commendation of the diverse array of assays employed in our investigation, which allowed us to approach our central question from multiple perspectives.

        In response to the reviewer’s constructive criticism regarding the lack of a cohesive framework tying the various components of our manuscript together, we have completely restructured our manuscript. We removed redundant data and incorporated additional convincing experiments, such as GCaMP analyses, to enhance clarity and coherence. Furthermore, we have provided a simplified yet comprehensive overview that describes the role of SIFa as a hub for neuropeptidergic signaling. This framework illustrates how SIFa orchestrates multiple behaviors related to energy balance through calcium signaling and synaptic plasticity via SIFaR-expressing cells.

        We believe these revisions address the reviewer’s concerns and provide a clearer understanding of how the different elements of our study fit together, ultimately strengthening the overall impact of our manuscript. Thank you for your valuable feedback, which has guided us in improving our work.

      Reviewer #2

      General Comments: In the present study, the authors employ mating behavior in male fruit flies, Drosophila melanogaster, to investigate the behavioral roles of the neuropeptide SIFamide. The duration of mating behavior in these animals varies depending on context, previous experience, and internal metabolic state. The authors use this variability to explore the neuronal mechanisms that control these influences. In an abstraction step, they compare the different mating durations to concepts of neuronal interval timing.

      The behavioral functions of the neuropeptide SIFamide have been thoroughly characterized in several studies, particularly in the contexts of circadian rhythm and sleep, courtship behavior, and food uptake. This study adds new data, demonstrating that SIFamide is essential for the proper control of mating behavior, highlighting the interconnection of various state- and motivation-dependent behaviors at the neuronal level. However, the hypothesis that mating behavior is related to interval timing is not convincingly supported.

      Experimentally, the authors show that RNAi-mediated downregulation of SIFamide affects mating duration in male flies. They use combinations of RNAi lines under the control of various Gal4 lines to identify additional neurotransmitters, neuropeptides, and receptors involved in this process. This approach is complemented by neuroanatomical staining and single-cell RNA sequencing.

      Overall, the study advances our knowledge about the behavioral roles of SIFamide, which is certainly important, interesting, and worthy of being reported. However, the manuscript also raises several serious caveats and includes points that remain speculative, are less convincing, or are simply incorrect.

      • Answer: We would like to thank the reviewer for their thoughtful and constructive comments regarding our study. We appreciate the recognition of our investigation into the behavioral roles of the neuropeptide SIFamide in male Drosophila melanogaster, particularly how we explored the variability in mating duration influenced by context, previous experience, and internal metabolic state. We are grateful for the acknowledgment that our study adds valuable data demonstrating the essential role of SIFamide in regulating mating behavior, highlighting the interconnectedness of various state- and motivation-dependent behaviors at the neuronal level.

        We also appreciate the reviewer's recognition of our experimental approach, which includes RNAi-mediated downregulation of SIFamide, the use of various Gal4 lines to identify additional neurotransmitters, neuropeptides, and receptors involved in this process, as well as our incorporation of neuroanatomical staining and single-cell RNA sequencing.

        In response to the reviewer’s concerns regarding the hypothesis that mating behavior is related to interval timing, we acknowledge that this aspect requires further clarification and support. We have revisited this hypothesis in our manuscript to strengthen its foundation and address any speculative elements. We aim to provide more robust evidence and clearer connections between mating behavior and neuronal interval timing.

        Furthermore, we have taken care to address any points that may have been perceived as less convincing or incorrect. We are committed to refining our manuscript to ensure that all claims are well-supported by our data. Thank you once again for your valuable feedback. We believe that these revisions will enhance the clarity and impact of our study while addressing the concerns raised.

      Major concerns:

      Comment 1. The authors conclude from their mating experiments that SIFamide controls interval timing. This conclusion is not supported by the data, which only indicate that SIFamide is required for normal mating duration and modulates the motivation-dependent component of this behavior. There is no clear evidence linking this to interval timing.

      __Answer:__ We appreciate the reviewer’s insightful comments regarding our conclusion linking SIFamide to interval timing in mating behavior. We acknowledge that our data primarily demonstrate that SIFamide is required for normal mating duration and modulates the motivation-dependent aspects of this behavior, and we recognize the need for clearer evidence connecting these observations to interval timing. Current research by Crickmore et al. has shed light on how mating duration in Drosophila serves as a powerful model for exploring changes in motivation over time as behavioral goals are achieved. For instance, at approximately six minutes into mating, sperm transfer occurs, leading to a significant shift in the male's nervous system: he no longer prioritizes sustaining the mating at the expense of his own survival. This change is driven by the output of four male-specific neurons that produce the neuropeptide Corazonin (Crz). When these Crz neurons are inhibited, sperm transfer does not occur, and the male fails to downregulate his motivation, resulting in matings that can last for hours instead of the typical ~23 minutes [10].

         Recent research by Crickmore et al. has received NIH R01 funding (Mechanisms of Interval Timing, 1R01GM134222-01) to explore mating duration in *Drosophila* as a genetic model for interval timing. Their work highlights how changes in motivation over time can influence mating behavior, particularly noting that significant behavioral shifts occur during mating, such as the transfer of sperm at approximately six minutes, which correlates with a decrease in the male's motivation to continue mating [10]. These findings suggest that mating duration is not only a behavioral endpoint but may also reflect underlying mechanisms related to interval timing.
      
         We believe that by leveraging the robustness and experimental tractability of these findings, along with our own work on SIFamide's role in mating behavior, we can gain deeper insights into the molecular and circuit mechanisms underlying interval timing. We will revise our manuscript to clarify this relationship and emphasize how SIFamide may interact with other neuropeptides and neuronal circuits involved in motivation and timing.
      
         In addition to the efforts of Crickmore's group to connect mating duration with a straightforward genetic model for interval timing, we have previously published several papers demonstrating that LMD and SMD can serve as effective genetic models for interval timing within the fly research community. For instance, we have successfully connected SMD to an interval timing model in a recently published paper [6], as detailed below:
      

      "We hypothesize that SMD can serve as a straightforward genetic model system through which we can investigate "interval timing," the capacity of animals to distinguish between periods ranging from minutes to hours in duration.....

      In summary, we report a novel sensory pathway that controls mating investment related to sexual experiences in Drosophila. Since both LMD and SMD behaviors are involved in controlling male investment by varying the interval of mating, these two behavioral paradigms will provide a new avenue to study how the brain computes the ‘interval timing’ that allows an animal to subjectively experience the passage of physical time [11–16]."

         Lee, S. G., Sun, D., Miao, H., Wu, Z., Kang, C., Saad, B., ... & Kim, W. J. (2023). Taste and pheromonal inputs govern the regulation of time investment for mating by sexual experience in male Drosophila melanogaster. *PLoS Genetics*, *19*(5), e1010753.
      
         We have also successfully linked LMD behavior to an interval timing model and have published several papers on this topic recently [4,5,7].
      
         Sun, Y., Zhang, X., Wu, Z., Li, W., & Kim, W. J. (2024). Genetic Screening Reveals Cone Cell-Specific Factors as Common Genetic Targets Modulating Rival-Induced Prolonged Mating in male Drosophila melanogaster. *G3: Genes, Genomes, Genetics*, jkae255.
      
         Zhang, T., Zhang, X., Sun, D., & Kim, W. J. (2024). Exploring the Asymmetric Body’s Influence on Interval Timing Behaviors of Drosophila melanogaster. *Behavior Genetics*, *54*(5), 416-425.
      
         Huang, Y., Kwan, A., & Kim, W. J. (2024). Y chromosome genes interplay with interval timing in regulating mating duration of male Drosophila melanogaster. *Gene Reports*, *36*, 101999.
      
         Finally, in this context, we have outlined in our INTRODUCTION section below how our LMD and SMD models are related to interval timing, aiming to persuade readers of their relevance. We hope that the reviewer and readers are convinced that mating duration and its associated motivational changes such as LMD and SMD provide a compelling model for studying the genetic basis of interval timing in *Drosophila*.
      

      "The mating duration of male fruit flies is a suitable model for studying interval timing and it could change based on internal states and environmental context. Previous studies by our group[27–30] and others[31,32] have established several frameworks for investigating the mating duration using sophisticated genetic techniques that can analyze and uncover the neural circuits’ principles governing interval timing. In particular, males exhibit LMD behavior when they are exposed to an environment with rivals, which means they prolong their mating duration. Conversely, they display SMD behavior when they are in a sexually saturated condition, meaning they reduce their mating duration[33,34]."

      Comment 2. On line 160, the authors state, "The connection between the dendrites and axons of the SIFamide neuronal processes is unknown." This is not entirely correct. State-of-the-art connectome analyses can determine synaptic connectivities between SIFamidergic neurons and pre-/postsynaptic neurons. The authors also overlook the thorough connectivity analysis by Martelli et al. (2017), which includes functional analyses and detailed anatomical descriptions that the current study confirms.

      __Answer:__ We appreciate the reviewer for acknowledging the efforts of Martelli et al. in elucidating the neuronal architecture of SIFa neurons. We recognize that it was an oversight on our part to state that "the connection between the dendrites and axons of SIFa neurons is unknown." This error arose because our manuscript has been in preparation for over ten years, predating the publication of Martelli et al.'s work. That statement likely reflects an outdated section of the manuscript.

      We fully acknowledge the findings from previous publications and have removed that sentence entirely from our manuscript. In its place, we have added the following statement:

      "The established connections and architecture of SIFa neurons have been described by Martelli et al., which enhances our understanding of their functional roles within the neuronal circuitry [51]. To identify the dendritic and axonal components of SIFa-neuronal processes, we employed a similar approach to that reported by Martelli [51]."

      Thank you for your valuable feedback, which has helped us improve the clarity and accuracy of our manuscript.

      Comment 3. The mating experiments are overall okay, with sufficiently high sample sizes and appropriate statistical tests. However, many experiments lack genetic controls for the heterozygous parental strains, such as Gal4-lines and UAS-lines. This is of course important and a common standard.

      __Answer:__ While we have previously addressed this type of reviewer feedback in our published manuscript [2–7] as well as this manuscript by Reviewer #1, we appreciate the reviewer’s suggestion to include traditional genetic control experiments. In response, we conducted all feasible combinations of genetic control experiments for LMD/SMD during the revision period. The results are presented in the supplementary figures and are described in the main text.

      Comment 4. Using a battery of RNAi lines, the authors aim to uncover which neurotransmitters might be co-released from SIFamide neurons to influence mating behavior. However, a behavioral effect of an RNAi construct expressed in SIFamidergic neurons does not demonstrate that the respective transmitter is actually released from these neurons. Alternative methods are needed to show whether glutamate, dopamine, serotonin, octopamine, etc., are present and released from SIFamide neurons. It is particularly challenging to prove that a certain substance acts as a transmitter released by a specific neuron. For example, anti-Tdc2 staining does not actually cover SIFamide neurons, and dopamine has not been described as present in SIFamide neurons.

      __Answer:__ We appreciate the reviewer’s constructive comments regarding the need to demonstrate the presence of the responsible neurotransmitters in SIFa neurons. While many studies utilize neurotransmitter-synthesizing enzymes such as TH, VGlut, Gad1, and Trhn to assess neurotransmitter effects, we recognize the importance of conclusively establishing that glutamate and dopamine play significant roles in modulating energy balance within SIFa neurons.

         First, the enrichment of tyramine (TA), octopamine (OA), and dopamine (DA) in SIFa neurons was suggested in the study by Croset et al. (2018) [17]. Although we tested Tdc2-RNAi and observed interesting phenotypes, we chose not to publish these findings, as our data on glutamate and dopamine provide a more compelling explanation for how SIFa cotransmission with these neurotransmitters can independently influence various behaviors, including sleep and mating duration.
      
         To confirm the expression of DA in SIFa neurons, we employed a well-established genetic toolkit for dissecting dopamine circuit function in *Drosophila* [18]. Our findings indicate that TH-C-GAL4 specifically labels SIFa neurons, which have been confirmed as dopaminergic (S4M Fig). Our genetic intersection data, along with Xie et al.'s findings from 2018, confirm that a subset of SIFa neurons is indeed dopaminergic. We have described these new results in the main text as follows:
      

      To further verify the presence of DA neurons within the SIFa neuron population, we utilized a well-established genetic toolkit for dissecting DA circuits and confirmed part of SIFa neurons are dopaminergic (S4M Fig) [58].

          To confirm the glutamatergic characteristics of SIFa neurons, we conducted several experiments that established glutamate as the most critical neurotransmitter for generating interval timing in both SIFa and SIFaR neurons. First, to demonstrate the presence of glutamatergic synaptic vesicles in SIFa neurons, we utilized a conditional glutamatergic synaptic vesicle marker for *Drosophila*, developed by Certel et al. [19]. Our results confirmed that SIFa neurons exhibit strong expression of glutamatergic synaptic vesicles (Fig. 2P and Fig. S4N as a genetic control). We have described these new results in the main text as follows:
      

      “To further verify the presence of DA neurons within the SIFa neuron population, we utilized a well-established genetic toolkit for dissecting DA circuits and confirmed part of SIFa neurons are dopaminergic (S4M Fig) [58]. We also employed a conditional glutamatergic synaptic vesicle marker to confirm the presence of glutamatergic SIFa neurons (Fig 2P and Fig S4N) [59].”

         To further confirm that glutamate release from SIFa neurons influences the function of SIFaR neurons, we tested several RNAi strains targeting glutamate receptors. Our results showed that the knockdown of glutamate receptors in SIFaR-expressing neurons produced phenotypes similar to those observed with VGlut-RNAi knockdown in SIFa neurons (Fig. G-L). We believe that this series of experiments demonstrates that glutamate and dopamine work in conjunction with SIFa to modulate interval timing and other behaviors related to energy balance. We have described these new results in the main text as follows:
      

      "To further substantiate the role of glutamate in SIFa-mediated behaviors, we knocked down glutamate receptors in SIFaR-expressing neurons. Strikingly, the knockdown of glutamate receptors in these neurons also disrupted SMD behavior, mirroring the phenotype observed upon direct suppression of glutamatergic signaling in SIFa neurons (S4G to S4L Fig). This suggests that glutamate is an essential neurotransmitter for modulating interval timing in SIFa neurons."

      Comment 5. Single-cell RNA sequencing data alone is insufficient to claim multiple transmitter co-release from SIFamide neurons. Figures illustrating single-cell RNA sequencing, such as Figure 3P-R, are not intuitively understandable, and the figure legends lack sufficient information to clarify these panels. As a side note, Tdc2 is not only present in octopaminergic neurons, but also in tyraminergic neurons.

      __Answer:__ We agree with the reviewer that scRNA-seq data alone is insufficient to support claims of multiple transmitter co-release in SIFa neurons. We also appreciate the reviewer for highlighting the potential for confusion among readers regarding the visualization methods used in our figures, particularly the tSNE plots of the scRNA-seq data. As noted in our previous response to Reviewer #1, we have removed most of the tSNE plots related to co-expression data involving SIFa and NPRs, which we believe will help clarify the interpretation for readers. However, we have retained a few tSNE plots, specifically Figures 2N-O, to illustrate the potential co-expression of the ple and Vglut genes in SIFa cells.

         We understand the reviewer’s concerns regarding the clarity of the presented data and the need for more detailed information about the extent of co-expression and the identification of SIFa-expressing cells. To address these concerns, we have provided a comprehensive description of our methods in the MATERIALS AND METHODS section below.
      

      "Single-nucleus RNA-sequencing analyses

      The snRNAseq dataset analyzed in this paper is published in [20]. The resources accompanying that dataset include the Nextflow pipelines (VSN, https://github.com/vib-singlecell-nf), raw and processed datasets for users to explore, and a crowd-annotation platform with voting, comments, and references through SCope (https://flycellatlas.org/scope), linked to an online analysis platform in ASAP (https://asap.epfl.ch/fca). For the generation of the tSNE plots, we utilized the Fly SCope website (https://scope.aertslab.org/#/FlyCellAtlas/*/welcome). Within the session interface, we selected the appropriate tissues and configured the parameters as follows: 'Log transform' enabled, 'CPM normalize' enabled, 'Expression-based plotting' enabled, 'Show labels' enabled, 'Dissociate viewers' enabled, and both 'Point size' and 'Point alpha level' set to maximum. For all tissues, we referred to the individual tissue sessions within the '10X Cross-tissue' RNAseq dataset. Each tSNE visualization depicts the coexpression patterns of genes, with each color corresponding to the genes listed on the left, right, and bottom of the plot. The tissue name, as referenced on the Fly SCope website, is indicated in the upper left corner of the tSNE plot. Dashed lines denote the significant overlap of cell populations annotated by the respective genes. Coexpression between genes or annotated tissues is visually represented by differentially colored cell populations. For instance, yellow cells indicate the coexpression of a gene (or annotated tissue) shown in red with another gene (or annotated tissue) shown in green. Cyan cells signify coexpression between green and blue, purple cells between red and blue, and white cells the coexpression of all three colors (red, green, and blue). Consistency in the tSNE plot visualization is preserved across all figures.

      Single-cell RNA sequencing (scRNA-seq) data from Drosophila melanogaster were obtained from the Fly Cell Atlas website (https://doi.org/10.1126/science.abk2432). Oenocyte gene expression analysis employed UMI (Unique Molecular Identifier) data extracted from the 10x VSN oenocyte (Stringent) loom and h5ad file, encompassing a total of 506,660 cells. The Seurat (v4.2.2) package (https://doi.org/10.1016/j.cell.2021.04.048) was utilized for data analysis. Violin plots were generated using the "VlnPlot" function, with the cell types split by FCA."
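
      The 'CPM normalize' and 'Log transform' options named in the quoted methods can be illustrated with a minimal sketch. It assumes the common definitions (counts scaled to one million per cell, followed by log(1 + x)); the exact transformation applied by SCope is not stated in the text, and the gene counts below are hypothetical placeholders, not data from the study.

```python
import math


def cpm_normalize(counts):
    """Scale one cell's raw UMI counts so that they sum to one million."""
    total = sum(counts.values())
    if total == 0:
        # An empty cell has no signal to scale.
        return {gene: 0.0 for gene in counts}
    return {gene: value * 1_000_000 / total for gene, value in counts.items()}


def log_transform(values):
    """Apply log(1 + x), which keeps zero counts at zero (assumed convention)."""
    return {gene: math.log1p(value) for gene, value in values.items()}


# Hypothetical UMI counts for a single cell (illustrative only).
cell = {"SIFa": 30, "ple": 5, "VGlut": 15, "other": 950}
cpm = cpm_normalize(cell)
log_cpm = log_transform(cpm)
```

      Normalizing before the log step makes cells with different sequencing depth comparable, which is why the two options are usually enabled together.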

         We have also included detailed descriptions in the figure legends for the initial tSNE plot presented below to help readers clearly understand the significance of this visualization.
      

      "Each tSNE visualization depicts the coexpression patterns of genes, with each color corresponding to the genes listed on the left, right, and/or bottom of the plot. The tissue name, as referenced on the Fly SCope website is indicated in the upper left corner of the tSNE plot. Consistency in the tSNE plot visualization is preserved across all figures."

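      The color convention used in these tSNE plots follows additive RGB mixing. The sketch below is a simplified illustration, assuming each gene is treated as either "on" or "off" in a cell relative to a hypothetical threshold; the actual plots blend continuous expression values, so this only reproduces the naming of the mixed colors given in the legend.

```python
# Additive mixing of the three channels, using the color names from the legend.
COLOR_NAMES = {
    (0, 0, 0): "black (no expression)",
    (1, 0, 0): "red",
    (0, 1, 0): "green",
    (0, 0, 1): "blue",
    (1, 1, 0): "yellow",
    (0, 1, 1): "cyan",
    (1, 0, 1): "purple",
    (1, 1, 1): "white",
}


def coexpression_color(red_expr, green_expr, blue_expr, threshold=0.0):
    """Name the color of a cell given the expression of the three plotted genes.

    `threshold` is a hypothetical cutoff above which a gene counts as expressed.
    """
    key = (
        int(red_expr > threshold),
        int(green_expr > threshold),
        int(blue_expr > threshold),
    )
    return COLOR_NAMES[key]
```

      For example, a cell expressing the red-channel and green-channel genes but not the blue-channel gene is reported as yellow by `coexpression_color(2.0, 1.5, 0.0)`.
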
         We appreciate the reviewer for acknowledging that Tdc2 is present in both TA and OA neurons. As we mentioned earlier, we have completely removed the Tdc2-related results from this manuscript, as we believe that more detailed experiments are necessary to confirm the roles of TA and OA in SIFa neurons.
      

      Comment 6. The same argument applies to the expression of sNPF receptors in SIFamide neurons. The rather small anatomical stainings shown in figure 4M do not convincingly and unambiguously show that actually sNPF receptors are located on SIFamide neurons.

      Answer: We thank the reviewer for pointing out that the co-expression of sNPF-R and SIFa needs further verification, and we agree with this assessment. To confirm the co-expression of SIFa with sNPF-R, we conducted a mini-screen of various sNPF-R driver lines and found that the chemoconnectome (CCT) sNPF-R2A driver, which represents the physiological expression pattern of sNPF-R, consistently labels SIFa neurons [21].

         To further establish the functional connection between the SIFa and sNPF systems, we performed GCaMP experiments using SIFa-driven GCaMP in conjunction with sNPF-R neurons expressing P2X2, which can be activated by ATP treatment. As shown in Figures 3N-P, we demonstrated that activation of sNPF-R neurons by ATP significantly increases calcium levels in SIFa neurons. Our results strongly suggest that the sNPF-sNPF-R/SIFa system is functionally present and plays a role in modulating interval timing behaviors.
      

      Comment 7. The authors use the GRASP technique (figure 4N) to determine whether synaptic connections are subject to modulation as a result from the animals' individual experience. The overall extremely bright fluorescence at the dorsal areas of both brain hemispheres (figure 4 N, middle panel) raises doubts whether this signal is actually a specific GRASP fluorescence between two small populations of neurons.

      Answer: We thank the reviewer for critically highlighting the inadequacies in our presentation of the GRASP data. We agree that one of our previous panels contained excessive background noise, making it difficult for reviewers and readers to discern the different neuronal connections. To address this issue, we have replaced it with a more representative image that clearly illustrates the strengthening of synaptic connections from SIFa to sNPF-R in several neurons, including SIFa cells (Fig. S5J). We hope that this updated image will help convince both the reviewer and readers of the validity of our GRASP data.

      Comment 8. The authors cite Martelli et al. (2017) with the hypothesis that sNPF-releasing neurons provide input signals to SIFamide neurons to modulate feeding behavior. However, the cited manuscript does not contain such a hypothesis. The authors should review the reference in more detail.

      Answer: We thank the reviewer for correctly pointing out our misunderstanding of the reference. We agree that Martelli et al.'s paper does not state that sNPF signaling transmits hunger and satiety information to SIFa neurons. We have removed this sentence and replaced it with the statement below, which correctly indicates that sNPF signaling is related to feeding behavior but that its connection to SIFa neurons is not known. We greatly appreciate the reviewer's help in citing previous articles accurately in support of our rationale and ideas.

      "Short neuropeptide F (sNPF) signaling plays a crucial role in regulating feeding behavior in Drosophila melanogaster, influencing food intake and body size [60,66,67]. However, there is currently no direct evidence reported linking sNPF signaling to SIFa neurons."

      Comment 9. In lines 281 ff., the authors state that SIFamide neurons receive inputs from peptidergic neurons but simultaneously claim that "this speculation is based on morphological observations." This is incorrect. The functional co-activation/imaging analyses provided in Martelli et al. (2017) should not be ignored.

      Answer: We fully agree with the reviewer that we misinterpreted Martelli et al.'s analysis. We have removed "this speculation is based on morphological observations." from the following sentence and finalized it as below:

      "The SIFa neurons receive inputs from many peptidergic pathways including Crz, dilp2, Dsk, sNPF, MIP, and hugin"

      Comment 10. Figure 6: A transcriptional calcium sensor (TRIC) was used to quantify the accumulation GFP induced by calcium influx in SIFamide neurons. However, I could not find any description of the method in the materials and methods section, nor any explanation how the data were acquired or analyzed. What is the RFP expression good for? How exactly are thresholds determined, and why are areas rather than fluorescence intensities quantified? Overall, this part of the manuscript is rather confusing and needs more explanation.

      Answer: Thank you for your continued engagement with our manuscript and for highlighting the need for further clarification on our methods. Your attention to the details of our immunohistochemistry experiments is commendable, and we agree that providing a clear explanation of our thresholding and normalization procedures is essential for the transparency and reproducibility of our results. We primarily adhered to the established methods outlined by Kayser et al. [8]. To address your first point, we have now included a more detailed description of our thresholding and normalization procedures in the MATERIALS AND METHODS section as below.

      "Quantitative analysis of fluorescence intensity

      To ascertain calcium levels and synaptic intensity from microscopic images, we dissected and imaged five-day-old flies of various social conditions and genotypes under uniform conditions. The GFP signal in the brains and VNCs was amplified through immunostaining with chicken anti-GFP, rabbit anti-DsRed, and mouse anti-nc82 primary antibodies. Image analysis was conducted using ImageJ software. For the quantification of fluorescence intensities, an investigator, blinded to the fly's genotype, thresholded the sum of all pixel intensities within a sub-stack to optimize the signal-to-noise ratio, following established methods [100]. The total fluorescent area or region of interest (ROI) was then quantified using ImageJ, as previously reported. For CaLexA or TRIC signal quantification, we adhered to protocols detailed by Kayser et al. [101], which involve measuring the ROI's GFP-labeled area by summing pixel values across the image stack. This method assumes that changes in the GFP-labeled area and intensity are indicative of alterations in the CaLexA and TRIC signal, reflecting synaptic activity. ROI intensities were background-corrected by measuring and subtracting the fluorescent intensity from a non-specific adjacent area, as per Kayser et al. [101]. For normalization, nc82 fluorescence was used for CaLexA, while the RFP signal was used for TRIC experiments, as the RFP signal from the TRIC reporter is independent of calcium signaling [72]. For the analysis of GRASP or tGRASP signals, a sub-stack encompassing all synaptic puncta was thresholded by a genotype-blinded investigator to achieve the optimal signal-to-noise ratio. The fluorescence area or ROI for each region was quantified using ImageJ, employing a similar approach to that used for CaLexA or TRIC quantification [100]. 'Norm. GFP Int.' refers to the normalized GFP intensity relative to the RFP signal."
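
      The background correction and normalization steps described above can be summarized in a short sketch. This is an illustration under stated assumptions rather than the ImageJ workflow itself: the function names are hypothetical, intensities are taken as already-summed pixel values per ROI, and clamping negative corrected values to zero is our own simplification.

```python
def background_corrected(roi_intensity, background_intensity):
    """Subtract the intensity of a non-specific adjacent area from the ROI.

    Negative results are clamped to zero (an assumption of this sketch).
    """
    return max(roi_intensity - background_intensity, 0.0)


def normalized_gfp_intensity(gfp_roi, gfp_background, ref_roi, ref_background):
    """Return 'Norm. GFP Int.': corrected GFP divided by the corrected reference.

    The reference channel is RFP for TRIC experiments and nc82 for CaLexA.
    """
    gfp = background_corrected(gfp_roi, gfp_background)
    ref = background_corrected(ref_roi, ref_background)
    if ref == 0:
        raise ValueError("reference signal is zero after background correction")
    return gfp / ref
```

      Dividing by a calcium-independent reference channel removes brain-to-brain differences in reporter expression and staining efficiency, so that the remaining variation reflects the GFP signal of interest.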


      Comment 11. Similarly, it remains unclear how exactly syteGFP fluorescence and DenMark fluorescence were quantified. Why are areas indicated and not fluorescence intensity values? In fact, it appears worrisome that isolation of males should lead to a drastic decline in synaptic terminals (as measured through a vesicle-associated protein) by ~ 30%, or, conversely, that keeping animals in groups leads to a respective increase (figure 7D). The technical information on how exactly this was quantified is not sufficient.

      Answer: Thank you for your ongoing engagement with our manuscript and for emphasizing the need for clarification on our methods. We appreciate your attention to the details of our immunohistochemistry experiments and agree that a clear explanation of our thresholding and normalization procedures is vital for transparency and reproducibility. We acknowledge that signal intensity correlates with area measurements, which is an important consideration. In response to your valuable suggestion, we have revised our approach to present data based on intensity measurements and updated the Y-axis labeling to "Norm. GFP Int." (normalized GFP intensity) for clarity. We primarily followed the established methods from Kayser et al. (2014) [8]. Additionally, we have included a more detailed description of our thresholding and normalization procedures in the "Quantitative analysis of fluorescence intensity" subsection of the MATERIALS AND METHODS section, as quoted above.


      Minor concerns:

      Comment 1. Reference 29 and reference 33 are the same.

         Answer: We have removed reference 29.
      

      Comment 2. In figure legends, abbreviations should be explained when used first (e.g., figure 1 A "MD", is explained below for panel C-F), or "CS males".

      Answer: We have ensured that abbreviations are explained where they are first used in the figure legends.

      Comment 3. Indications for statistical significance must be shown in all figure legends at the end of each figure legend, not only in figure 1.

      Answer: We appreciate the reviewer’s advice. However, we have published all our other manuscripts using the same format for mating duration, stating, "The same notations for statistical significance are used in other figures," in the first figure where we describe our statistical significances. We intend to keep this approach for now and will follow the journal's policy where it requires otherwise.

      Comment 4. The figures appear overloaded. For example, why do you need two different axis designations (mating duration and differences between means)?

      Answer: We appreciate the reviewer's suggestion to refine our figures, and we have reformatted them for clearer presentation and improved readability. We have retained the two axis designations because our analysis encompasses not only traditional t-tests but also estimation statistics, which have been demonstrated to be effective for biological data analysis [22]. The differences between means (DBMs) are essential for the accurate interpretation of these estimation statistics. This is the primary place where we present two different axis designations.
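
      To make the role of the second axis concrete, the following sketch computes a difference between means together with a percentile bootstrap confidence interval, which is one common form of estimation statistics. It is an assumption-laden illustration: the resampling scheme, the number of resamples, and the mating durations (in minutes) are hypothetical and are not taken from the manuscript or from reference [22].

```python
import random
import statistics


def difference_between_means(control, test):
    """Effect size plotted on the second axis: mean(test) - mean(control)."""
    return statistics.mean(test) - statistics.mean(control)


def bootstrap_ci(control, test, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the difference between means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        c = rng.choices(control, k=len(control))
        t = rng.choices(test, k=len(test))
        diffs.append(statistics.mean(t) - statistics.mean(c))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical mating durations in minutes (illustrative only).
naive = [20.1, 19.5, 21.0, 20.4, 19.8, 20.7]
single = [17.2, 16.8, 17.9, 17.5, 16.9, 17.4]
dbm = difference_between_means(naive, single)
ci_low, ci_high = bootstrap_ci(naive, single)
```

      An interval that excludes zero conveys both the direction and the magnitude of the effect, which a p-value alone does not.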

      Comment 5. Line: 1154: Typo: gluttaminergic should be glutamatergic.

         Answer: We have corrected all instances.
      

      Comment 6. The authors frequently write "system" when referring to transmitter types, e.g., "glutaminergic system", "octopaminergic system", etc. It is not clear what the term "system" actually refers to. If the authors claim that SIFamide neurons release these transmitters in addition to SIFamide, they should state that precisely and then add experiments to show that this is the case.

         Answer: We agree with the reviewer and have removed the word 'system' following neurotransmitter names.
      

      Comment 7. Figure S6: It is not explained in the figure legend what fly strain "UAS-ctrl" actually is. Does "ctrl" mean control? And what genotype is that control?

      Answer: It was a wild-type strain. We have relabeled it as "+".

      Comment 8. Figure legend S6, line 1371: The authors indicate experiments using UAS-OrkDeltaC. I could not find these data in the figure.

      Answer: These data are now shown in Fig. S6U-W.

      Comment 9. Line 470: "...reduced branching of SIFa axons at the postsynaptic level" should perhaps be "presynaptic level"?

      Answer: The reviewer is correct. We have fixed it.

      Conclusive Comments: Overall, the study advances our knowledge about the behavioral roles of SIFamide, which is certainly important, interesting, and worthy of being reported. However, the manuscript also raises several serious caveats and includes points that remain speculative and are less convincing.

      Overall, the neuronal basis of action selection based on motivational factors (metabolic state, mating experience, sleep/wake status, etc.) is not well understood. The analysis of SIFamide function in insects might provide a way to address the question how different motivational signals are integrated to orchestrate behavior.

      Answer: Thank you for your thoughtful review and for recognizing the significance of our study in advancing knowledge about the behavioral roles of SIFamide. We appreciate your acknowledgment that our work is important, interesting, and worthy of publication.

      We understand your concerns regarding the caveats and speculative points raised in the manuscript. We agree that the neuronal basis of action selection influenced by motivational factors—such as metabolic state, mating experience, and sleep/wake status—remains poorly understood. We believe that our analysis of SIFamide function in insects offers valuable insights into how various motivational signals are integrated to orchestrate behavior.

      In response to your comments, we have made revisions to clarify our findings and address the concerns raised. We aim to strengthen the arguments presented in the manuscript and provide a more robust discussion of the implications of our results. Thank you once again for your constructive feedback, which has been instrumental in improving the clarity and impact of our work.


      Reviewer #3

      General Comments: The manuscript "Peptidergic neurons with extensive branching orchestrate the internal states and energy balance of male Drosophila melanogaster" by Yuton Song and colleagues addresses the question how SIFamidergic neurons coordinate behavioral responses in a context-dependent manner. In this context the authors investigate how SIFa neurons receive information about the physiological state of the animal and integrate this information into the processing of external stimuli. The authors show that SIFamidergic neurons and sNPF expressing neurons form a feedback loop in the ventral nerve cord that modulate long mating (LMD) and shorter mating duration (SMD).

      The manuscript is well written and very detailed and provides an enormous amount of data corroborating the claims of the authors. However, before publication the authors may want to address some points of concern that warrant some deeper explanation.

      Answer: Thank you for your positive feedback on our manuscript. We appreciate your recognition of the importance of our study in investigating how SIFa neurons integrate information about the physiological state of the animal with external stimuli, as well as your acknowledgment of the substantial data we provide to support our claims. We understand your concerns regarding certain points that require deeper explanation, and we are committed to addressing these issues to enhance the clarity and robustness of our findings. Your insights into the neuronal basis of action selection influenced by motivational factors are invaluable, and we believe that our exploration of SIFamide function in insects contributes significantly to understanding how various motivational signals orchestrate behavior. Thank you once again for your constructive comments, which will help us improve our manuscript before publication.

      Major concerns:

      Comment 1. On page 6 line 110 the authors describe that knocking-down SIFamide in glia cell does not change LMD or SMD and say that SIFa expression in glia does not contribute to interval timing behavior. However, the authors do not provide any information why they investigate the role of SIFa expression in glia. Is there any SIFa-expression in glia? The authors should somehow demonstrate using antibody labelling against SIFamide whether any glia specific expression of this peptide is to be expected. If they cannot provide this data - the take home message of the experiment cannot be that glia knockdown of SIFamide does not affect the behavior because you cannot knockdown anything that is not there.

      In the latter case the experiment could be considered as a nice negative control for the elav-Gal4 pan-neuronal knockdown of SIFamide. The authors provide some Figure supplement where they use repo-Gal80 to partially answer this question. However, the authors should keep in mind that Gal4-drivers are not always complete in the expression pattern. Accordingly, the result should be corroborated with immunolabelling against SIFamide directly.

      Answer: We appreciate the reviewer's constructive and critical comments regarding the use of our glial cell drivers. As the reviewer rightly pointed out, we believe that glial control is not essential for our manuscript, given that the expression of SIFa is well established in only four neurons. Therefore, we have removed the data related to glial drivers from this manuscript.

      Comment 2. At this point I would like to directly comment on the figure quality. The figures are so crowded that the described anatomical details are hardly visible. In my opinion the manuscript would profit from less data in the main part and more stringent description of the core of the biological problem the authors want to address. The authors may want to reduce data from the main text and provide additional data that are not directly related to the main story as supplementary information.

      Answer: We agree with the reviewer. As another reviewer also suggested that we streamline our figures and data, we have completely restructured our figures and their presentation. In response, we have significantly reduced the density of the main figures and decreased the size of the graphs to enhance clarity. Additionally, we have increased the spacing between panels to ensure that each component is more easily distinguishable. Further details will be provided in our responses to each comment below.

      Comment 3. On page 8 starting with line 140 the authors describe the architecture of SIFamidergic neurons using several anatomical markers e.g., Denmark and further state that they have discovered that the dendrites of SIFa neurons span just the central brain area. Seeing that these data have been published in Martelli et al., 2017 the authors should tune down the claim that this was discovered in their work but rather corroborated earlier results.

      Answer: We acknowledge this error, as another reviewer also raised this issue. We have corrected our manuscript as follows:

      "The established connections and architecture of SIFa neurons have been described by Martelli et al., which enhances our understanding of their functional roles within the neuronal circuitry [51]. To identify the dendritic and axonal components of SIFa-neuronal processes, we employed a similar approach to that reported by Martelli et al. [51]."

      Comment 4. In the next chapter, the authors aim at identifying the presynaptic inputs from SIFa positive neurons that may influence interval timing behavior and make a broad RNAi knock-down screen targeting a majority of neuromodulators. The authors claim that glutaminergic and dopaminergic signaling is necessary for interval timing behavior. I guess the authors mean "glutamatergic" instead of "glutaminergic" as glutamine is the precursor but not the neurotransmitter.

      Answer: The reviewer is correct. We have corrected this error and changed all instances to "glutamatergic."

      Comment 5. Furthermore, the authors show that the knock down of Tdc2 with RNAi has effects on SMD comparable to those of glutamate and dopamine but appear to not further discuss this in the main text. To me it is not clear why the authors exclude Tdc2 from their resume. The authors should explain this in detail.

         Answer: We appreciate the reviewer’s constructive comments regarding the need for a more detailed demonstration of the role of Tdc2. While we did test Tdc2-RNAi and observed interesting phenotypes, we decided not to include these findings in our publication, as our data on glutamate and dopamine offer a more compelling explanation for how SIFa cotransmission with these neurotransmitters can independently influence various behaviors, such as sleep and mating duration. Consequently, we have removed all data related to Tdc2. We believe that further evaluation is necessary to better understand the roles of tyramine and octopamine in SIFa neurons.
      

      Comment 6. The authors base their assumptions that the tested neurotransmitters are expressed in SIFamidergic neurons on Scope database analysis. But a transcript does not necessarily mean that it will be translated too. To my knowledge there is no available data in the literature showing that tyrosine hydroxylase is expressed in SIFamidergic neurons (see e.g., Mao and Davis, 2010). To show that ple or Tdc2 are indeed expressed and translated into functional enzymes in SIFamidergic neurons the authors should provide the according antibody labelling corroborating the result from the transcriptome analysis.

      Answer: We appreciate the reviewer’s constructive comments regarding the role of neurotransmitters in conjunction with SIFa in modulating interval timing behaviors. To confirm the expression of dopamine (DA) in SIFa neurons, we utilized a well-established genetic toolkit for dissecting dopamine circuit function in Drosophila [18]. Our findings demonstrate that TH-C-GAL4 specifically labels SIFa neurons, which have been confirmed to be dopaminergic (Fig. S4M). This aligns with the genetic intersection data and the findings from Xie et al. (2018), confirming that a subset of SIFa neurons is indeed dopaminergic. We have included these new results in the main text as follows:

      "To further verify the presence of DA neurons within the SIFa neuron population, we utilized a well-established genetic toolkit for dissecting DA circuits and confirmed that a subset of SIFa neurons is dopaminergic (S4M Fig) [58]."

         To confirm the glutamatergic characteristics of SIFa neurons, we conducted several experiments that established glutamate as the most critical neurotransmitter for generating interval timing in both SIFa and SIFaR neurons. First, to demonstrate the presence of glutamatergic synaptic vesicles in SIFa neurons, we utilized a conditional glutamatergic synaptic vesicle marker for *Drosophila*, developed by Certel et al. [19]. Our results confirmed that SIFa neurons exhibit strong expression of glutamatergic synaptic vesicles (Fig. 2P and Fig. S4N as a genetic control). We have described these new results in the main text as follows:
      

      "To further substantiate the role of glutamate in SIFa-mediated behaviors, we targeted the expression of VGlut receptor in neurons that carry the SIFaR. Strikingly, the knockdown of VGlut receptor in these neurons also disrupted SMD behavior, mirroring the phenotype observed upon direct suppression of glutamatergic signaling in SIFa neurons (S4O-L Fig)."

         To further confirm that glutamate release from SIFa neurons influences the function of SIFaR neurons, we tested several RNAi strains targeting glutamate receptors. Our results showed that the knockdown of glutamate receptors in SIFaR-expressing neurons produced phenotypes similar to those observed with VGlut-RNAi knockdown in SIFa neurons (Fig. S4I-N). We believe that this series of experiments demonstrates that glutamate and dopamine work in conjunction with SIFa to modulate interval timing and other behaviors related to energy balance. We have described these new results in the main text as follows:
      

      "We also further verified that the knockdown of glutamate receptors in SIFaR-expressing neurons produces phenotypes similar to those resulting from VGlut knockdown in SIFa neurons (S4G to S4L Fig). This suggests that glutamate is an essential neurotransmitter for modulating interval timing in SIFa neurons."

      Comment 7. The authors compare the LMD and SMD behavior of the animals with reduced expression with "heterozygous control animals" the authors should describe in detail what these are - are these controls the driver lines or the effector lines or a mix of both? The authors should provide the data for heterozygous driver line controls as well as heterozygous effector line controls to exclude any genetic background influence on the measured behavior. Accordingly, the authors should provide the data for the same controls for the sleep experiment in figure 3O and all the other behavioral experiments in the following parts of the manuscript.

      Answer: We sincerely thank the reviewer for the insightful comments regarding the absence of traditional genetic controls in our study of LMD and SMD behaviors. We acknowledge the importance of such controls and wish to clarify our rationale for not including them in the current investigation. The primary reason for not incorporating all genetic control lines is that we have previously assessed the LMD and SMD behaviors of GAL4/+ and UAS/+ strains in our earlier studies. Our past experiences have consistently shown that 100% of the genetic control flies for both GAL4 and UAS exhibit normal LMD and SMD behaviors. Given these findings, we deemed the inclusion of additional genetic controls to be non-essential for the present study, particularly in the context of extensive screening efforts. We understand the value of providing a clear rationale for our methodology choices. To this end, we have added a detailed explanation in the "MATERIALS AND METHODS" section and the figure legends of Figure 1. This clarification aims to assist readers in understanding our decision to omit traditional controls, as outlined below.

      "Mating Duration Assays for Successful Copulation

      The mating duration assay in this study has been reported [33,73,93]. To enhance the efficiency of the mating duration assay, we utilized the Df (1) Exel6234 (DF hereafter) genetically modified fly line in this study, which harbors a deletion of a specific genomic region that includes the sex peptide receptor (SPR) [94,95]. Previous studies have demonstrated that virgin females of this line exhibit increased receptivity to males [95]. We conducted a comparative analysis between the virgin females of this line and the CS virgin females and found that both groups induced SMD. Consequently, we have elected to employ virgin females from this modified line in all subsequent studies. For naïve males, 40 males from the same strain were placed into a vial with food for 5 days. For single reared males, males of the same strain were collected individually and placed into vials with food for 5 days. For experienced males, 40 males from the same strain were placed into a vial with food for 4 days, then 80 DF virgin females were introduced into the vials for the last day before the assay. 40 DF virgin females were collected from bottles and placed into a vial for 5 days. These females provide both sexually experienced partners and mating partners for mating duration assays. On the fifth day after eclosion, males of the appropriate strain and DF virgin females were mildly anaesthetized by CO2. After placing a single female into the mating chamber, we inserted a transparent film and then placed a single male on the other side of the film in each chamber. After allowing for 1 h of recovery in the mating chamber in 25℃ incubators, we removed the transparent film and recorded the mating activities. Only those males that succeeded in mating within 1 h were included for analyses. Initiation and completion of copulation were recorded with an accuracy of 10 sec, and total mating duration was calculated for each couple. All assays were performed from noon to 4 PM.
      Genetic controls with GAL4/+ or UAS/+ lines were omitted from supplementary figures, as prior data confirm their consistent exhibition of normal LMD and SMD behaviors [33,73,93,96,97]. Hence, genetic controls for LMD and SMD behaviors were incorporated exclusively when assessing novel fly strains that had not previously been examined. In essence, internal controls were predominantly employed in the experiments, as LMD and SMD behaviors exhibit enhanced statistical significance when internally controlled. Within the LMD assay, both group and single conditions function reciprocally as internal controls. A significant distinction between the naïve and single conditions implies that the experimental manipulation does not affect LMD. Conversely, the lack of a significant discrepancy suggests that the manipulation does influence LMD. In the context of SMD experiments, the naïve condition (equivalent to the group condition in the LMD assay) and sexually experienced males act as mutual internal controls for one another. A statistically significant divergence between naïve and experienced males indicates that the experimental procedure does not alter SMD. Conversely, the absence of a statistically significant difference suggests that the manipulation does impact SMD. Hence, we incorporated supplementary genetic control experiments solely when they were deemed indispensable for testing. We conducted blinded studies for every test [98,99]."
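
      The inclusion rule and the internal-control reasoning in the quoted methods can be expressed as a short sketch. It is illustrative only: the `Trial` record, the function names, and the significance level of 0.05 are hypothetical choices, since the quoted passage does not specify the statistical test or threshold.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    start_s: float  # copulation initiation, in seconds after film removal
    end_s: float    # copulation completion, in seconds after film removal


def mating_durations(trials, max_latency_s=3600):
    """Durations for pairs that started mating within 1 h; others are excluded."""
    return [t.end_s - t.start_s for t in trials if t.start_s <= max_latency_s]


def interpret_internal_control(p_value, alpha=0.05):
    """Read out an LMD or SMD experiment from its two internally controlled groups.

    A significant difference between the conditions means the behavior is
    intact, so the manipulation did not affect it; no difference means the
    behavior is lost, so the manipulation did affect it.
    """
    if p_value < alpha:
        return "behavior intact: manipulation does not affect it"
    return "behavior lost: manipulation affects it"
```

      Note that this readout is inverted relative to a conventional treatment-versus-control test: here a significant result is the expected outcome for an unaffected genotype.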

         While we have previously addressed this type of reviewer feedback in our published manuscripts [2–7], we appreciate the reviewer’s suggestion to include traditional genetic control experiments. In response, we conducted all feasible combinations of genetic control experiments for LMD/SMD during the revision period. The results are presented in the supplementary figures and are described in the main text.
      

      Comment 8. On page 11 line 231 to page 12 line 233 the authors claim that "sNPF signaling transmits hunger and satiety information to SIFa neurons in order to control food search and feeding" and cite Martelli et al., 2017. Could the authors explain in more detail how the Martelli paper proposes this idea? I do not find the link between sNPF signaling hunger and SIFamide in this precise paper.

      Answer: We thank the reviewer for accurately pointing out our misreading of the reference. We agree that the paper by Martelli et al. does not state that sNPF signaling transmits hunger and satiety information to SIFa neurons. Consequently, we have removed the relevant sentence and replaced it with a statement correctly indicating that while sNPF signaling is related to feeding behavior, its connection to SIFa neurons remains unknown. We are grateful to the reviewer for acknowledging our efforts to accurately cite previous articles that support our rationale and ideas.

      " Short neuropeptide F (sNPF) signaling plays a crucial role in regulating feeding behavior in Drosophila melanogaster, influencing food intake and body size [60,66,67] . However, there is currently no direct evidence reported linking sNPF signaling to SIFa neurons."


      Comment 9. On page 15 line 302 - 303 the authors write that "except for PK2-R2, all other genes coexpress with SIFa in SCope data, indicating that hugin inputs to SIFa may not be transmitted through peptidergic signaling" - if SIFamidergic neurons do not express hugin-receptors how do the authors explain the inverted effect of PK2-R2-RNAi on single housed male courtship index when compared to heterozygous SIFaPT Gal4 control that show a reduction under comparable conditions.

      Answer: We appreciate the reviewer’s constructive comments. In line with another reviewer’s suggestion, we have completely removed the results on other neuropeptidergic inputs, focusing instead on how sNPF inputs modulate SIFa-mediated behavioral modulation using more advanced techniques such as GCaMP (Fig 3N). Consequently, the phenotypes resulting from various knockdowns of neuropeptide receptors are currently under investigation for a separate manuscript that we are preparing. We hope to successfully address how different neuropeptidergic inputs regulate SIFa neuron activity through various strategies.

      Comment 10. On page 17 line 350 - 351 the authors write that "Stimulation of SIFa neurons resulted in an elevation in food consumption". Further, the authors write that "deactivation of SIFa neurons leads to a decrease in food consumption in male flies". From the way this is formulated it is not apparent that the role of SIFamide in feeding control was published by Martelli and colleagues before. As the authors do not discuss the finding further in their discussion but cite the paper concerned in other respects, it appears as if the authors intentionally want to withhold this information from the reader. The authors may add a note that this has been shown before for female flies by Martelli and colleagues.

      Answer: We appreciate the reviewer's concern that the previous results of Martelli et al. on the modulation of female feeding behavior by SIFa neuron activity be properly mentioned. We agree with the reviewer and have added the following sentence to the main text.

      "Nevertheless, the temporary deactivation of SIFa neurons leads to a decrease in food consumption in male flies (Fig 4N and S6F to S6H) as previously described by Martelli et al.'s report in female flies [43]."

      Comment 11. SIFamide receptor and GnIHR are discussed as descendants from a common ancestor and the authors nicely demonstrate that SIFamide does not only control homeostatic behavior as shown by Martelli and colleagues but also controls reproductive behavior. The evolution of such behavior control mechanisms may be integrated in the discussion too.

      Answer: We appreciate the reviewer’s constructive comments, which enhance the evolutionary significance of our study. We agree with the reviewer and have added the following paragraph to the DISCUSSION section:

      "The relationship between SIFamide receptors (SIFaR) and gonadotropin inhibitory hormone receptors (GnIHR) [89] highlights an intriguing evolutionary connection, as both are believed to have descended from a common ancestor [90,91]. This study expands on previous findings by Martelli et al., demonstrating that SIFamide not only regulates homeostatic behaviors but also plays a significant role in reproductive behavior [43]. GnIHR regulates food intake and reproductive behavior in opposing directions, thereby prioritizing feeding behavior over other behavioral tasks during times of metabolic need [92]. The evolution of these behavioral control mechanisms suggests a complex interplay between neuropeptides that modulate both physiological states and reproductive strategies. As SIFamide influences various behaviors, including feeding and sexual activity, it may be integral to understanding how organisms adapt their reproductive strategies in response to environmental and internal cues. This integration of behavioral modulation underscores the evolutionary significance of SIFamide signaling in coordinating essential life functions in Drosophila melanogaster and potentially other species, revealing pathways through which neuropeptides can shape behavior across different contexts."

      Conclusive Comments: The manuscript by Song and colleagues is very interesting and may attract a broad readership. However, the authors fail to make clear what was already known and published on the role of SIFamide in homeostatic behavior control before their own study. Given that the receptors for SIFamide and GnIH derive from a common ancestor and that GnIH and SIFamide apparently share similar roles in behavioral control, this might indeed suggest that the basic function of this SIFaR/GnIHR signaling pathway is conserved. This broader evolutionary aspect is missing from the discussion of the manuscript.

      Answer: We wholeheartedly agree with the reviewer regarding the evolutionary significance of SIFaR's function in relation to GnIHR, and we have expanded the DISCUSSION section to emphasize this important aspect.

      "The relationship between SIFamide receptors (SIFaR) and gonadotropin inhibitory hormone receptors (GnIHR) [89] highlights an intriguing evolutionary connection, as both are believed to have descended from a common ancestor [90,91]. This study expands on previous findings by Martelli et al., demonstrating that SIFamide not only regulates homeostatic behaviors but also plays a significant role in reproductive behavior [43]. GnIHR regulates food intake and reproductive behavior in opposing directions, thereby prioritizing feeding behavior over other behavioral tasks during times of metabolic need [92]. The evolution of these behavioral control mechanisms suggests a complex interplay between neuropeptides that modulate both physiological states and reproductive strategies. As SIFamide influences various behaviors, including feeding and sexual activity, it may be integral to understanding how organisms adapt their reproductive strategies in response to environmental and internal cues. This integration of behavioral modulation underscores the evolutionary significance of SIFamide signaling in coordinating essential life functions in Drosophila melanogaster and potentially other species, revealing pathways through which neuropeptides can shape behavior across different contexts."





      References

      1. Zhang T, Wu Z, Song Y, Li W, Sun Y, Zhang X, et al. Long-range neuropeptide relay as a central-peripheral communication mechanism for the context-dependent modulation of interval timing behaviors. bioRxiv. 2024; 2024.06.03.597273. doi:10.1101/2024.06.03.597273
      2. Kim WJ, Jan LY, Jan YN. A PDF/NPF Neuropeptide Signaling Circuitry of Male Drosophila melanogaster Controls Rival-Induced Prolonged Mating. Neuron. 2013;80: 1190–1205. doi:10.1016/j.neuron.2013.09.034
      3. Kim WJ, Jan LY, Jan YN. Contribution of visual and circadian neural circuits to memory for prolonged mating induced by rivals. Nat Neurosci. 2012;15: 876–883. doi:10.1038/nn.3104
      4. Zhang T, Zhang X, Sun D, Kim WJ. Exploring the Asymmetric Body’s Influence on Interval Timing Behaviors of Drosophila melanogaster. Behav Genet. 2024; 1–10. doi:10.1007/s10519-024-10193-y
      5. Sun Y, Zhang X, Wu Z, Li W, Kim WJ. Genetic Screening Reveals Cone Cell-Specific Factors as Common Genetic Targets Modulating Rival-Induced Prolonged Mating in male Drosophila melanogaster. G3: Genes, Genomes, Genet. 2024; jkae255. doi:10.1093/g3journal/jkae255
      6. Lee SG, Sun D, Miao H, Wu Z, Kang C, Saad B, et al. Taste and pheromonal inputs govern the regulation of time investment for mating by sexual experience in male Drosophila melanogaster. PLOS Genet. 2023;19: e1010753. doi:10.1371/journal.pgen.1010753
      7. Huang Y, Kwan A, Kim WJ. Y chromosome genes interplay with interval timing in regulating mating duration of male Drosophila melanogaster. Gene Rep. 2024; 101999. doi:10.1016/j.genrep.2024.101999
      8. Kayser MS, Yue Z, Sehgal A. A Critical Period of Sleep for Development of Courtship Circuitry and Behavior in Drosophila. Science. 2014;344: 269–274. doi:10.1126/science.1250553
      9. Wong K, Schweizer J, Nguyen K-NH, Atieh S, Kim WJ. Neuropeptide relay between SIFa signaling controls the experience-dependent mating duration of male Drosophila. bioRxiv. 2019; 819045. doi:10.1101/819045
      10. Thornquist SC, Langer K, Zhang SX, Rogulja D, Crickmore MA. CaMKII Measures the Passage of Time to Coordinate Behavior and Motivational State. Neuron. 2020;105: 334-345.e9. doi:10.1016/j.neuron.2019.10.018
      11. Buhusi CV, Meck WH. What makes us tick? Functional and neural mechanisms of interval timing. Nat Rev Neurosci. 2005;6: 755–765. doi:10.1038/nrn1764
      12. Merchant H, Harrington DL, Meck WH. Neural Basis of the Perception and Estimation of Time. Annu Rev Neurosci. 2012;36: 313–336. doi:10.1146/annurev-neuro-062012-170349
      13. Allman MJ, Teki S, Griffiths TD, Meck WH. Properties of the Internal Clock: First- and Second-Order Principles of Subjective Time. Annu Rev Psychol. 2013;65: 743–771. doi:10.1146/annurev-psych-010213-115117
      14. Rammsayer TH, Troche SJ. Neurobiology of Interval Timing. Adv Exp Med Biol. 2014; 33–47. doi:10.1007/978-1-4939-1782-2_3
      15. Golombek DA, Bussi IL, Agostino PV. Minutes, days and years: molecular interactions among different scales of biological timing. Philosophical Transactions Royal Soc B Biological Sci. 2014;369: 20120465. doi:10.1098/rstb.2012.0465
      16. Jazayeri M, Shadlen MN. A Neural Mechanism for Sensing and Reproducing a Time Interval. Curr Biol. 2015;25: 2599–2609. doi:10.1016/j.cub.2015.08.038
      17. Croset V, Treiber CD, Waddell S. Cellular diversity in the Drosophila midbrain revealed by single-cell transcriptomics. eLife. 2018;7: e34550. doi:10.7554/elife.34550
      18. Xie T, Ho MCW, Liu Q, Horiuchi W, Lin C-C, Task D, et al. A Genetic Toolkit for Dissecting Dopamine Circuit Function in Drosophila. Cell Reports. 2018;23: 652–665. doi:10.1016/j.celrep.2018.03.068
      19. Certel SJ, Ruchti E, McCabe BD, Stowers RS. A conditional glutamatergic synaptic vesicle marker for Drosophila. G3. 2022;12: jkab453. doi:10.1093/g3journal/jkab453
      20. Li H, Janssens J, Waegeneer MD, Kolluru SS, Davie K, Gardeux V, et al. Fly Cell Atlas: A single-nucleus transcriptomic atlas of the adult fruit fly. Science. 2022;375: eabk2432. doi:10.1126/science.abk2432
      21. Deng B, Li Q, Liu X, Cao Y, Li B, Qian Y, et al. Chemoconnectomics: Mapping Chemical Transmission in Drosophila. Neuron. 2019;101: 876-893.e4. doi:10.1016/j.neuron.2019.01.045
      22. Claridge-Chang A, Assam PN. Estimation statistics should replace significance testing. Nat Methods. 2016;13: 108–109. doi:10.1038/nmeth.3729

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review):

      Summary: 

      In the manuscript entitled "Magnesium modulates phospholipid metabolism to promote bacterial phenotypic resistance to antibiotics", Li et al demonstrated the role of magnesium in promoting phenotypic resistance in V. alginolyticus. Using standard microbiological and metabolomic techniques, the authors have shown the significance of fatty acid biosynthesis pathway behind the resistance mechanism. This study is significant as it sheds light on the role of an exogenous factor in altering membrane composition, polarization, and fluidity which ultimately leads to antimicrobial resistance. 

      Strengths: 

      (1) The experiments were carried out methodically and logically. 

      (2) An adequate number of replicates were used for the experiments. 

      Weaknesses: 

      (1) The introduction section needs to be more informative and to the point.  

      Thank you so much for your suggestion. We have revised the Introduction to make it more informative and to the point, as follows:

      “Non-inheritable antibiotic or phenotypic resistance represents a serious challenge for treating bacterial infections. Phenotypic resistance does not involve genetic mutations and is transient, allowing bacteria to resume normal growth. Biofilm and bacterial persisters are two phenotypic resistance types that have been extensively studied (Brandis et al., 2023; Corona & Martinez, 2013). Biofilms have complex structures, containing elements that impede antibiotic diffusion, sequestering and inhibiting their activity (Ciofu et al., 2022). Biofilm-forming bacteria and persisters also have distinct metabolic states that significantly reduce their antibiotic susceptibility (Yan & Bassler, 2019). These two types of phenotypic resistance share a common feature: growth is slowed or even halted in the presence of antibiotics (Corona & Martinez, 2013). However, specific factors that promote phenotypic resistance and allow bacteria to proliferate in the presence of antibiotics remain poorly defined.

      Metal ions have a diverse impact on the chemical, physical, and physiological processes of antibiotic resistance (Booth et al., 2011; Lu et al., 2020; Poole, 2017). This includes genetic elements that confer resistance to metals and antibiotics (Poole, 2017) and metal cations that directly hinder (or enhance) the activity of specific antibiotic drugs (Zhang et al., 2014). The metabolic environment can also impact the sensitivity of bacteria to antibiotics (Jiang et al., 2023; Lee & Collins, 2012; Peng et al., 2015; Zhang et al., 2020; Zhao et al., 2021). Light metal ions, such as magnesium, sodium, and potassium, can behave as cofactors for different enzymes (Du et al., 2016) and influence drug efficacy. Heavy metal ions, including Cu2+ and Zn2+, confer resistance to antibiotics (Yazdankhah et al., 2014; Zhang et al., 2018). Recent reports suggest that sodium negatively regulates redox states to promote the antibiotic resistance of Vibrio alginolyticus (Yang et al., 2018), while actively growing Bacillus subtilis cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most abundant cation in seawater after sodium. In seawater at a salinity of 3.5%, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater it can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations.”

      (2) The weakest point of this paper is in the logistics through the results section. The way authors represented the figures and interpreted them in the results section (or the figure legends) does not match. The figures are difficult to interpret and are not at all self-explanatory. 

      Thank you so much for your suggestion. We have checked that the Results text matches the figures, and both have been revised accordingly.

      (3) There are too many mislabeling of the figure panels in the main text which makes it difficult to find out which figures the authors are explaining. There should be more explanation on why and how they did the experiments and how the results were interpreted. 

      Thank you so much for your suggestion. We have checked the figures and main text to ensure that every figure panel is referred to clearly and correctly.

      Reviewer #2 (Public Review): 

      Summary: 

      In this study, the authors aimed to identify if and how magnesium affects the ability of two particular bacteria species to resist the action of antibiotics. In my view, the authors succeeded in their goals and presented a compelling study that will have important implications for the antibiotic resistance research community. Since metals like magnesium are present in all lab media compositions and are present in the host, the data presented in this study certainly will inspire additional research by the community. These could include research into whether other types of metals also induce multi-drug resistance, whether this phenomenon can be observed in other bacterial species, especially pathogenic species that cause clinical disease, and whether the underlying molecular determinants (i.e. enzymes) of metal-induced phenotypic resistance could be new antimicrobial drug targets themselves. 

      Strengths: 

      This study's strengths include that the authors used a variety of methodologies, all of which point to a clear effect of exogenous Mg2+ on drug resistance in the targeted species. I also commend the authors for carrying out a comprehensive study, spanning evaluation of whole cell phenotypes, metabolic pathways, genetic manipulation, to enzyme activity level evaluation. The fact that the authors uncovered a molecular mechanism underlying Mg2+-induced phenotypic resistance is particularly important as the key proteins should be studied further.

      Weaknesses: 

      I believe there are weaknesses in the manuscript, however. The authors take for granted that the reader is familiar with all the assays utilized, and do not properly explain some experiments, and thus I highly suggest that the authors add a brief statement in each situation describing the rationale for each selected methodology (more details are in the private review to the authors). The Results section is also quite long and bogs down at times, and I suggest that the authors reduce its length by 10 to 20%. In contrast, the Introduction is sparse and lacks key aspects, for example, there should be mention of the study's main purpose and approaches, plus an introduction to the authors' choice of species and their known drug resistance properties, as well as the drug of choice (balofloxacin). Another notable weakness is that the authors evaluated Mg2+-induced phenotypic resistance only against two closely related species, and thus the generalizability of this mechanism of drug resistance is not known. The paper would be strengthened if the authors could demonstrate this type of phenotypic resistance in at least one more Gram-negative species and at least one Gram-positive species (antimicrobial susceptibility evaluations would suffice), each of which should be pathogenic to humans. Demonstrating magnesium-induced phenotypic drug resistance in the WHO Priority Bacterial Pathogens would be particularly important. 

      In general, the conclusions drawn by the authors are justified by the data, except for the interpretation of some experiments. Importantly, this paper has discovered new antimicrobial resistance mechanisms and has also pointed to potential new targets for antimicrobials. 

      Thank you so much for your suggestions! We have followed them and revised the manuscript as follows:

      (1) We added brief statements explaining the results and the rationale for each methodology, as suggested in the private review.

      (2) To make the story flow more logically, we moved the entire second Results section to the supplementary text and supplementary figures.

      (3) We revised the Introduction, adding information to make it more informative and to the point, as follows:

      “Non-inheritable antibiotic or phenotypic resistance represents a serious challenge for treating bacterial infections. Phenotypic resistance does not involve genetic mutations and is transient, allowing bacteria to resume normal growth. Biofilm and bacterial persisters are two phenotypic resistance types that have been extensively studied (Brandis et al., 2023; Corona & Martinez, 2013). Biofilms have complex structures, containing elements that impede antibiotic diffusion, sequestering and inhibiting their activity (Ciofu et al., 2022). Biofilm-forming bacteria and persisters also have distinct metabolic states that significantly reduce their antibiotic susceptibility (Yan & Bassler, 2019). These two types of phenotypic resistance share a common feature: growth is slowed or even halted in the presence of antibiotics (Corona & Martinez, 2013). However, specific factors that promote phenotypic resistance and allow bacteria to proliferate in the presence of antibiotics remain poorly defined.

      Metal ions have a diverse impact on the chemical, physical, and physiological processes of antibiotic resistance (Booth et al., 2011; Lu et al., 2020; Poole, 2017). This includes genetic elements that confer resistance to metals and antibiotics (Poole, 2017) and metal cations that directly hinder (or enhance) the activity of specific antibiotic drugs (Zhang et al., 2014). The metabolic environment can also impact the sensitivity of bacteria to antibiotics (Jiang et al., 2023; Lee & Collins, 2012; Peng et al., 2015; Zhang et al., 2020; Zhao et al., 2021). Light metal ions, such as magnesium, sodium, and potassium, can behave as cofactors for different enzymes (Du et al., 2016) and influence drug efficacy. Heavy metal ions, including Cu2+ and Zn2+, confer resistance to antibiotics (Yazdankhah et al., 2014; Zhang et al., 2018). Recent reports suggest that sodium negatively regulates redox states to promote the antibiotic resistance of Vibrio alginolyticus (Yang et al., 2018), while actively growing Bacillus subtilis cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most abundant cation in seawater after sodium. In seawater at a salinity of 3.5%, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater it can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations.”

      (4) We examined the effect of magnesium on WHO-listed priority pathogens, which confirmed the results, as follows:

      “Importantly, exogenous MgCl2 also increased the MICs of balofloxacin for clinical isolates of carbapenem-resistant Escherichia coli, carbapenem-resistant Klebsiella pneumoniae, carbapenem-resistant Pseudomonas aeruginosa, and carbapenem-resistant Acinetobacter baumannii (Fig 1G).”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      (1) There are many grammatical mistakes to point out. The manuscript needs proofreading and editing.

      We appreciate this comment! The manuscript has been revised by a native speaker.

      (2) The introduction could be more informative. A little more description of magnesium - such as what it does to antibiotics and how it's known to affect the microbiome - might be helpful for general readers. The question remains why, out of all the metal ions that might affect antibiotic resistance (many of them less explored), the authors decided to work on the effect of magnesium in particular. The introduction should cover the rationale for their hypothesis. Also, the authors might want to briefly introduce the model organisms (V. alginolyticus and V. parahaemolyticus), describing how threatening they are and how they are becoming resistant to antibiotics.

      We appreciate this comment! We revised the Introduction by providing additional information, as follows:

      “In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most abundant cation in seawater after sodium. In seawater at a salinity of 3.5%, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater it can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations.”

      (3) Figure 1C is mislabeled as 1B (line 100). Line 101: The sentence is not clear and very confusing. What is meant by 15.6mM - 62.4 mM? Are they talking about the concentration of BLFX (though in the figure the concentration was shown in µg)? Please rewrite the sentence in a simplified way. Also, the zone of inhibition was decreased with increasing MgCl2, not increased. 

      We appreciate this comment! These points have been revised, and the reference to Fig 1B has been corrected to Fig 1C. Line 101 is now Line 122, and the sentence was revised as follows:

      “At balofloxacin doses of 1.56, 3.125, 6.25, and 12.5 µg, the zone of inhibition decreased with increasing MgCl2 (Fig 1D)”

      (4) In the western blot images, it would be nice to indicate the MW of the protein bands shown. The loading control used for the experiments should be clearly mentioned in the figure legends. 

      We appreciate this comment! MWs are now indicated on the western blot images throughout the manuscript.

      The loading control is clearly stated in the figure legend, as follows:

      “Whole cell lysates resolved by SDS-PAGE were stained with Coomassie brilliant blue as the loading control.”

      (5) Figures 2 B and C: the figure legend does not explain what the authors wanted to show. It's not clear how they plotted the inhibitory curve, or the binding efficacy. These panels need an explanation of how the analysis was done.

      We appreciate this comment! Figure 2 has been moved to Suppl. Fig 2, and its description has been moved to the Suppl. Text. The revised description of the result, now in the Suppl. Text, reads as follows:

      “Prior studies suggest that the chelation of antibiotics by magnesium ions inhibits antibiotic uptake (Deitchman et al., 2018; Lunestad and Goksøyr, 1990). To investigate whether magnesium binds to balofloxacin, balofloxacin was pre-incubated with magnesium, and zone of inhibition (ZOI) analysis was conducted. Six different concentrations of balofloxacin were separately incubated with six different concentrations of MgCl2, and then spotted on filter paper so that a defined amount of balofloxacin could be used for ZOI. While lower concentrations of MgCl2 (0.78, 3.125, or 12.5 mM) did not alter the ZOI, higher concentrations, including 50 and 200 mM MgCl2, decreased the ZOI (Suppl. Fig 2A), suggesting that even high doses of magnesium had only a partial effect on balofloxacin through direct binding. For example, at 200 mM MgCl2 and 5 or 10 μg/mL balofloxacin, the balofloxacin ZOI was 53.2 and 70.3% of the ZOI at 0 mM MgCl2, suggesting that more than 50% of the antibiotics were still functional. Intracellular BLFX also decreased with increasing MgCl2 (Suppl. Fig 2B), while exogenous Mg2+ increased intracellular Mg2+ levels in a dose-dependent manner. For example, exogenous 50 and 200 mM MgCl2 increased intracellular Mg2+ levels to 1.21 and 1.31 mM, respectively (Suppl. Fig 2C). The relationship between TolC, an efflux pump that transports quinolones from bacterial cells, and Mg2+ was also assessed (Kobylka et al., 2020; Song et al., 2020). The expression of TolC/tolC was unaffected by Mg2+ (Suppl. Fig 2D). Magnesium is critical for LPS stability. LPS levels increased at 200 mM Mg2+ (Suppl. Fig 2E); however, the loss of waaF, lpxA, and lpxC, three key genes involved in LPS biosynthesis, did not influence balofloxacin sensitivity/resistance in the presence of Mg2+ (Suppl. Fig 2F). These findings suggest that magnesium-induced LPS biosynthesis does not contribute directly to BLFX resistance and demonstrate that Mg2+ influx is involved in balofloxacin resistance.”

      (6) For the metabolomics results, it will help immensely if the authors provide a volcano plot of the identified metabolites and plot the heat map according to the -log2 metabolite intensities. In Figure 3A, it's not clear what information is conveyed through Euclidean distance calculations of the heat map. In Figure 3 B, the authors mentioned that the OPLS-DA test was conducted, although the figure shows a PCA plot, so it's not clear how these two are connected. Figure 3 E: the figure legend says scattered plot, but the panel represents color-coded numerical values, not a scattered plot. Also, it's not clear how they got those values. 

      We appreciate this comment! We agree that a volcano plot is a useful way to show differential metabolites. However, we did not adopt volcano plots in this study because the metabolomic dataset is magnesium concentration-dependent and comprises six parallel groups, so volcano plots would give an overly complex view of the comparisons among groups. We also tried plotting the heat map according to the -log2 metabolite intensities. Although this analysis clustered the 200 mM and 50 mM groups better, the data for the low magnesium concentrations were not consistent, which may be due to the minor metabolic changes at low magnesium concentrations. Thank you for your understanding.

      For the Euclidean distance calculations, we explain in the figure legend as follows:

      “Euclidean distance calculations were used to generate a heatmap that shows clustering of the biological and technical replicates of each treatment.” 
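
      As a minimal illustration of what such a Euclidean distance calculation does, the sketch below computes pairwise distances between replicate metabolite profiles and finds each sample's nearest neighbor. Replicates of the same treatment are expected to lie closest to each other, which is what the clustering in a heatmap reflects. The sample names and profile values are hypothetical and are not taken from the manuscript data; the actual analysis software used by the authors is not stated here.

```python
import math
from itertools import combinations

# Hypothetical normalized metabolite profiles (one list per replicate).
profiles = {
    "0mM_rep1": [1.0, 2.0, 0.5],
    "0mM_rep2": [1.1, 1.9, 0.6],
    "200mM_rep1": [3.0, 0.5, 2.0],
    "200mM_rep2": [2.9, 0.6, 2.1],
}


def pairwise_euclidean(samples):
    """Return {(a, b): distance} for every unordered pair of samples."""
    return {
        (a, b): math.dist(samples[a], samples[b])
        for a, b in combinations(samples, 2)
    }


distances = pairwise_euclidean(profiles)


def nearest_neighbor(name):
    """Name of the sample with the smallest distance to `name`."""
    candidates = [(d, pair) for pair, d in distances.items() if name in pair]
    _, pair = min(candidates)
    return pair[0] if pair[1] == name else pair[1]


for sample in profiles:
    print(sample, "->", nearest_neighbor(sample))
```

      In this toy dataset each replicate pairs with the other replicate of the same treatment, which is the pattern a replicate-clustering heatmap is meant to make visible.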

      Figure 2B (Figure 3B in the previous version) has been replaced with the OPLS-DA analysis in the revised version.

      The legend of Figure 2E (Figure 3E in the previous version) has been revised as follows:

      “E. Areas of the peaks of palmitic acid and stearic acid generated by GC-MS analysis.” 

      (7) In Figure 4, the figure legends (as well as the in the text) are not properly referred to. Please make sure to refer to the correct panel. 

      We appreciate this comment! The figure legends have been corrected to match the panel and text. 

      Figure 4F: how was the synergy analysis done? In the methods section, the authors described the antibiotic bactericidal assay protocol, but there was no clear indication of how they generated the isobologram. 

      We appreciate this comment! We provide additional information in the legend of Figure 3F (Figure 4F in the previous version) as follows:

      “Synergy analysis for BLFX with palmitic acid for V. alginolyticus. Synergy was performed by comparing the dose needed for 50% inhibition of the synergistic agents (white) and non-synergistic (i.e., additive) agents (purple).”
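
      The legend does not state how the isobologram was calculated. One common convention, assumed here purely for illustration, is Loewe additivity: for a combination of doses (d1, d2) that achieves 50% inhibition, the combination index CI = d1/D1 + d2/D2 is compared with 1, where D1 and D2 are the doses of each agent alone that give 50% inhibition. The function names, the tolerance, and all dose values below are hypothetical and are not taken from the manuscript.

```python
def combination_index(d1, d2, d1_alone, d2_alone):
    """Loewe combination index for a dose pair giving 50% inhibition.

    d1, d2:             doses of agents 1 and 2 used together
    d1_alone, d2_alone: doses of each agent alone giving 50% inhibition
    """
    if d1_alone <= 0 or d2_alone <= 0:
        raise ValueError("single-agent doses must be positive")
    return d1 / d1_alone + d2 / d2_alone


def classify(ci, tol=0.1):
    """Label a combination index; `tol` is an arbitrary illustrative margin."""
    if ci < 1 - tol:
        return "synergy"
    if ci > 1 + tol:
        return "antagonism"
    return "additive"


# Hypothetical example: each agent alone needs 8 and 4 units for 50% inhibition,
# but together only 2 and 1 units are needed.
ci = combination_index(2.0, 1.0, 8.0, 4.0)
print(ci, classify(ci))
```

      Points on an isobologram that fall below the straight additive line correspond to CI below 1, and points above it correspond to CI above 1.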

      (8) Figure 5 A: the scatter plot is plotted according to the area along the Y axis: which "area" is represented here? There is absolutely no explanation, neither in the results nor in the figure legends. Using box plots might be a better option than using a scattered plot.

      We appreciate this comment! “Area” has been defined in the revised manuscript as follows:

      “The area indicates the area of the peak of the metabolite in total ion chromatography of GC-MS.” 

      (9) In Figure 6 A, the heat map is plotted according to the column Z scores. What is meant by "column Z score"? The corresponding figure legend says, "heat map showing differential abundance of lipid". Z scores do not represent an abundance of a variable, so the conclusion might not be appropriate here. 

      We appreciate this comment! In Figure 5A (Figure 6A in the previous version), the column Z score represents the scaled abundance of each metabolite analyzed; it is generated automatically during the heat map analysis to indicate the relative levels of the metabolites tested. The legend has been revised as follows:

      “Heatmap showing changes in differential lipid levels at the indicated concentration of MgCl2.”  
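
      For readers unfamiliar with Z scores, the sketch below shows the standard calculation, (x - mean) / SD, applied to one metabolite across samples. It illustrates that a Z score reports the change relative to that metabolite's own mean rather than its absolute abundance. The peak areas are hypothetical, and the use of the population standard deviation is an assumption; heat map tools differ in this detail.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one metabolite's values across samples: (x - mean) / SD."""
    mu = mean(values)
    sd = pstdev(values)  # population SD; some tools use the sample SD instead
    if sd == 0:
        return [0.0 for _ in values]
    return [(x - mu) / sd for x in values]


# Hypothetical peak areas for two lipids at 0, 50, and 200 mM MgCl2.
abundant_lipid = [1000.0, 1100.0, 1200.0]
scarce_lipid = [10.0, 11.0, 12.0]

print(z_scores(abundant_lipid))
print(z_scores(scarce_lipid))
```

      Both lipids give the same Z scores even though their absolute peak areas differ 100-fold, so the heat map colors show the direction and relative size of a change, not abundance itself.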

      (10) Line 313-314: it should be Figure EV6C.  

      We appreciate this comment! The citation has been corrected.

      (11) The authors have shown that Mg+2 does not alter the LPS transport system, however, there was some significant increase in LPS expression at 200mM MgCl2. It would be interesting if the authors could also check if Mg+2 has any effect on the outer membrane protein (OMP) integrity (by checking OMP components BamA and LptD).  

      We appreciate this comment! We have carefully examined membrane permeability in Figure 7. We therefore did not perform additional experiments to examine changes in BamA and LptD. Thank you very much for your understanding.

      (12) I wonder if the authors could check the effect of extracellular Mg+2 during the co-treatment of palmitic acid, linoleic acid, and balofloxacin. Will there still be the antagonistic effect or the presence of Mg+2 could change the phenotype? 

      We appreciate this comment! An additional experiment was performed, and the result is described as follows:

      “Furthermore, magnesium had a minimal effect on the antagonistic effect of palmitic acid, linolenic acid, and balofloxacin (Fig 4G), suggesting that this mineral functions through lipid metabolism.” 

      Reviewer #2 (Recommendations For The Authors)

      (1) As mentioned in the Public Review, I strongly believe that the impact of this study will be more significant if magnesium-induced phenotypic drug resistance could be demonstrated in at least one other Gram-negative and one other Gram-positive species, both of which should be human pathogens. The full suite of experiments would not be necessary for this suggestion; evaluation of the effect of Mg concentration in growth media on the drug resistance of other species, testing the different antibiotic types used in this study, would be sufficient. 

      We appreciate this comment! Additional experiments have been performed to test this idea. Mg2+ has a similar effect on carbapenem-resistant Escherichia coli, carbapenem-resistant Klebsiella pneumoniae, carbapenem-resistant Pseudomonas aeruginosa, and carbapenem-resistant Acinetobacter baumannii as on the Vibrio species, as shown in Figure 1G. These results are described as follows:

      “Importantly, exogenous MgCl2 also increased MICs of clinical isolates, carbapenem-resistant Escherichia coli, carbapenem-resistant Klebsiella pneumoniae, carbapenem-resistant Pseudomonas aeruginosa and carbapenem-resistant Acinetobacter baumannii, to balofloxacin (Fig 1G).”

      (2) I recommend that the Introduction section be expanded. I recommend one or two sentences introducing the two Vibrio species selected for study. I.e. why did the authors choose these two species? What is known about their phenotypic drug resistance in the literature? Why did the authors select balofloxacin for their studies, is it a common antimicrobial used vs Vibrios? As well, the end of the Introduction section ends abruptly with no transition to the present study itself. The end of the introduction should include one or two sentences introducing the main purpose of the study, its approach, and the techniques undertaken. For example, "In this study, we evaluated whether magnesium induces phenotypic resistance in Vibrio species and the molecular/genetic basis for such resistance. We used genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility evaluations." 

      We appreciate this comment! We have revised the introduction by providing additional information as follows:

      “In Gram-negative bacteria, by contrast, zinc enhances antibiotic efficacy by potentiating carbapenem, fluoroquinolone, and β-lactam-mediated killing (Isaei et al., 2016; Zhang et al., 2014). Magnesium influences bacterial structure, cell motility, enzyme function, cell signaling, and pathogenesis (Wang et al., 2019). This mineral also modulates microbiota to harvest energy from the diet (Garcia-Legorreta et al., 2020), allowing Bacillus subtilis to cope with ribosome-targeting antibiotics by modulating ion flux (Lee et al., 2019). However, the role of magnesium in promoting phenotypic resistance is less well understood.

      Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. In seawater at a salinity of 3.5%, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater it can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.

      The current study evaluated whether magnesium induces phenotypic resistance in Vibrio species and defined the molecular/genetic basis for this resistance. Genetic approaches, GC-MS analysis of metabolite and membrane remodeling upon antibiotic exposure, membrane physiology, and extensive antimicrobial susceptibility testing were used for the evaluations.”

      (3) The authors introduce the acronym AWST but never use it again in the paper, instead they use SWT. The authors should introduce SWT only for consistency. 

      We appreciate this comment! We have corrected every instance of “SWT” to “ASWT”.

      (4) Line 76 is not clear: what is meant by "some of which could influence drug efficacy" - the enzymes that utilize light metal ions are co-factors? Or the metals directly?  

      We appreciate this comment! The point we wanted to convey is that light metal ions can serve as cofactors that catalyze biochemical reactions, and such reactions can alter drug efficacy; for example, Fe-S clusters are metallocofactors of proteins that regulate redox chemistry, including antibiotic-induced redox changes. However, this information is not appropriate for this manuscript, so we have deleted the sentence.

      (5) Line 90: add a reference corroborating that this chemical composition is a mimic of marine water. The NaCl concentration used in particular looks quite low. 

      We appreciate this comment! This was a typographical error. The NaCl concentration was 210 mM, as shown in Suppl. Table 1. We also provide details of the chemical composition of the marine water as follows:

      “Marine environments and agriculture, where antibiotics are commonly used, are rich in magnesium. To investigate whether this mineral impacts antibiotic activity, the minimal inhibitory concentrations (MICs) of balofloxacin (BLFX) for V. alginolyticus ATCC33787 and V. parahaemolyticus VP01 (hereafter ATCC33787 and VP01), both isolated from marine aquaculture, were assessed in Luria-Bertani medium (LB medium) plus 3% NaCl (LBS medium) and in “artificial seawater” (ASWT) medium that included the major ion species in marine water (Wilson, 1975) (LB medium plus 210 mM NaCl, 35 mM Mg2SO4, 7 mM KCl, and 7 mM CaCl2).”

      (6) Line 98 and Figure 1B. M9 is indicated in the text but does not appear in the figure, the figure only shows SWT. This should be checked. Line 99: based on Figure 1C, the authors are adding MgCl2 to SWT, SWT should be mentioned in this line. Line 100: I believe this is referring to Figure 1C, which should be checked. 

      We appreciate this comment! 

      Line 98, which is now Line 118: We have corrected M9 to ASWT as follows:

      “However, the MIC for BLFX was higher in ASWT medium supplemented with Mg2SO4 or MgCl2 than in LB medium (Fig 1B).”

      Line 99, which is now Line 133: the sentence has been corrected as follows:

      “The MIC for BLFX increased at higher concentrations of MgCl2 in ASWT”

      Line 100, which is now Line 135: we have corrected Fig 1B to Fig 1C.

      (7) Line 101: text and Figure 1D are not consistent, as Figure 1D does not show this level of precision in added MgCl2 as indicated in the text (15.6 - 62.4 mM).  

      We appreciate this comment! The sentence has been corrected as follows: “At balofloxacin doses of 1.56, 3.125, 6.25, and 12.5 µg, the zone of inhibition decreased with increasing MgCl2 (Fig 1D)”.

      (8) MgCl2 clearly induces increasing levels of BLFX resistance, and to high levels, but not for every antibiotic. For example, the level of increased resistance to β-lactams is low (ceftriaxone) and plateaus (ceftazidime). As well, resistance to gentamicin plateaus at a lower level than the other aminoglycosides. These observations do not take away from the conclusion that Mg induces multi-drug resistance, but since the behaviour of the MICs for these drugs is different than the other drugs, they should be mentioned. Also, Figure 1F - tetracyclines (plural) is used for vertical axis label - does this refer to the tetracycline itself or the class itself, and if the class, which one was tested? 

      We appreciate this comment! We have revised the description as follows: “Notably, magnesium had a smaller effect on ceftriaxone and gentamicin than on other antibiotics.”

      The “tetracyclines” axis label has been changed to “Oxytetracycline” in the revised manuscript.

      - The magnesium chelation experiments presented in Figure 2 are not clear. The authors should briefly mention how this was done around line 128, and what data underlies the values in Figure 2C. Figure 2B is also not clear to me at all. Similarly, how the authors measured intracellular balofloxacin and Mg2+ is not clear and should be mentioned briefly around lines 130-132. 

      We appreciate this comment! These passages have been rewritten as follows: “To investigate whether magnesium binds to balofloxacin, balofloxacin was pre-incubated with magnesium, and zone of inhibition (ZOI) analysis was conducted. Six different concentrations of balofloxacin were separately incubated with six different concentrations of MgCl2, and then spotted on filter paper so that a defined amount of balofloxacin could be used for ZOI. While lower concentrations of MgCl2 (0.78, 3.125, or 12.5 mM) did not alter the ZOI, higher concentrations, including 50 and 200 mM MgCl2, decreased the ZOI (Suppl. Fig 2A), suggesting that even high doses of magnesium had only a partial effect on balofloxacin through direct binding. For example, at 200 mM MgCl2 and 5 or 10 μg/mL balofloxacin, the balofloxacin ZOI was 53.2 and 70.3% of the ZOI at 0 mM MgCl2, suggesting that more than 50% of the antibiotics were still functional. Intracellular BLFX also decreased with increasing MgCl2 (Suppl. Fig 2B), while exogenous Mg2+ increased intracellular Mg2+ levels in a dose-dependent manner. For example, exogenous 50 and 200 mM MgCl2 increased intracellular Mg2+ levels to 1.21 and 1.31 mM, respectively (Suppl. Fig 2C). The relationship between TolC, an efflux pump that transports quinolones from bacterial cells, and Mg2+ was also assessed (Kobylka et al., 2020; Song et al., 2020). The expression of TolC/tolC was unaffected by Mg2+ (Suppl. Fig 2D). Magnesium is critical for LPS stability. LPS levels increased at 200 mM Mg2+ (Suppl. Fig 2E); however, the loss of waaF, lpxA, and lpxC, three key genes involved in LPS biosynthesis, did not influence balofloxacin sensitivity/resistance in the presence of Mg2+ (Suppl. Fig 2F). These findings suggest that magnesium-induced LPS biosynthesis does not contribute directly to BLFX resistance and demonstrate that Mg2+ influx is involved in balofloxacin resistance.”

      - Line 135: LPS cannot be "expressed", as the authors word it here. This should be corrected. Also, the inspection of Figure 2G actually shows the levels of LPS increase with increased Mg2+. The authors should re-evaluate these results and change their description around this area of the Results. 

      We appreciate this comment! We have moved Figure 2 in its entirety to the Supplementary Text and Supplementary Figure 2, and we have rewritten this part as follows: “The relationship between TolC, an efflux pump that transports quinolones from bacterial cells, and Mg2+ was also assessed (Kobylka et al., 2020; Song et al., 2020). The expression of TolC/tolC was unaffected by Mg2+ (Suppl. Fig 2D). Magnesium is critical for LPS stability. LPS levels increased at 200 mM Mg2+ (Suppl. Fig 2E); however, the loss of waaF, lpxA, and lpxC, three key genes involved in LPS biosynthesis, did not influence balofloxacin sensitivity/resistance in the presence of Mg2+ (Suppl. Fig 2F). These findings suggest that magnesium-induced LPS biosynthesis does not contribute directly to BLFX resistance and demonstrate that Mg2+ influx is involved in balofloxacin resistance.”

      - Section: MgCl2 affects bacterial metabolism. Authors switched to M9 medium - why? This contrasts with other sections using SWT and should be explained. Also, I cannot evaluate whether the statistical analysis of the data here was performed correctly and was appropriate for this type of experiment. I advise the authors to move the details in lines 166-169 to the Materials and Methods and replace this section instead with a more accessible description of the statistical analysis that a non-expert would be able to appreciate. Furthermore, analysis of Figure 3A indicates that the levels of asparagine, 4-hydroxybutyric acid, uracil, cystathionine, fumaric acid, and aminoethanol have significantly changed at high MgCl2, but these are not mentioned in the text. I suggest the authors mention these if they are relevant to the 12 enriched pathways, especially the biosynthesis of fatty acids. 

      We appreciate this comment! 

      To explain why M9 medium was used, we added the following phrase:

      “To better understand how magnesium affects bacterial metabolism”

      The information in lines 166-169 has been moved to the Materials and Methods.

      We have carefully examined the abundance of the metabolites and the enriched pathways. Among the listed metabolites, only fumarate is within the enriched pathways. We mention this point in our revised manuscript as follows:

      “The increase in fatty acid biosynthesis could be partially explained by an imbalanced pyruvate cycle/TCA cycle, in which fumarate levels increased at higher Mg2+ while succinate levels increased at lower Mg2+ (Suppl. Fig 5B). These findings indicated that glycolysis fluxes into fatty acid biosynthesis rather than the pyruvate cycle/TCA cycle. The relevance of fatty acids and BLFX was demonstrated by the observation that exogenous palmitic acid increased bacterial resistance to balofloxacin (Fig 2F). These results suggest that fatty acid metabolism may be critical to magnesium-based phenotypic resistance.”

      - Line 211 appears to refer to Figure 4F and should be checked. Similarly in line 216 - appears this should be Figure 4H, and line 218 should be Figure 4H. Line 226: add a reference to Fig 4I (after arcA was decreased). Line 227: what are genes N646_1004 and N646_1885? Based on Fig 4J these are crp - authors should add to line 227. Line 228 appears to refer to Figure 4J, not Figure 4I. Line 229 - should be Figure 4K, not Figure 4I. Line 231 - should be 4L, not 4K. Line 239 - should be 4M.

      We appreciate this comment! The text and the figure citations now match.

      - Line 312: the descriptions of "11 lipids, 32 lipids, and 53", and then "26 lipids, 52 lipids, and 107 lipids" are not clear at all and should be corrected. 

      We appreciate this comment! The sentence has been revised as follows:

      “The abundance of 11, 32, and 53 lipids was increased in 3.125, 50, and 200 mM MgCl2-treated bacteria, respectively, while the abundance of 26, 52, and 107 lipids was decreased in 3.125, 50, and 200 mM MgCl2-treated bacteria, respectively (Suppl. Fig 7C)”

      - Line 340. What is the assay the authors are using to measure the levels of the PGS and PSS enzymes? This is not mentioned or clear in this part of the Results.  

      We appreciate this comment! We provide the information in the manuscript as follows:

      “Levels of PGS and PSS were quantified by ELISA kits according to the manufacturer’s instructions (Shanghai Fusheng Industrial Co., Ltd., China)”

      - Line 372: What is the assay for measuring membrane depolarization? This is not mentioned and I suggest it should be. Line 374: Figure 7B does not show time dependence, only dose dependence, this should be corrected, it is assumed the authors are referring to Fig 7C for the time dependence data. 

      We appreciate this comment! We provide the information in the Results as follows:

      “The voltage-sensitive dye DiBAC4(3) showed that 12.5–200 mM MgCl2 promoted membrane depolarization in a dose-dependent manner (Fig 6A)”

      We also explain how DiBAC4(3) can be used to measure membrane depolarization in the Materials and Methods section as follows:

      “DiBAC4(3) is a voltage-sensitive probe that penetrates depolarized cells and binds to intracellular proteins or membranes, exhibiting enhanced fluorescence and a red spectral shift.”

      To make clear which figure is meant, we have revised the sentence as follows:

      “Meanwhile, MgCl2 had a dose-dependent (Fig 6B) and time-dependent (Fig 6C) effect on proton motive force (PMF).”

      - Line 384: mention how FM5-95 measures membrane permeability. The authors should also clarify how this reagent is used to measure membrane fluidity, and it is not clear if the data for this is presented in Figure 7 - please clarify. Regarding SYTO9 dye experiment: the authors should briefly explain the experimental design - how SYTO9 dye operates and why FACS was chosen. What is labeled with FITC?  

      We appreciate this comment! We clarify the reason for using FM5-95 in the Materials and Methods section as follows:

      “Measurement of fluidity by fluorescence microscopy

      Measurement of membrane fluidity was performed as previously described (Wen et al., 2022). Briefly, ATCC33787 cells were cultured in medium with the indicated concentrations of MgCl2, collected, and adjusted to an OD of 0.6. A 100 μL aliquot of bacterial cells from each sample was diluted to 1 mL, and 10 μL of FM5-95 (10 mg/mL; Thermo Fisher Scientific, USA) was added. FM5-95 is a lipophilic styryl dye that inserts into the outer leaflet of the bacterial membrane and becomes fluorescent. This dye preferentially binds to microdomains with high membrane fluidity (Wen et al., 2022). After incubation for 20 min at 30 ℃ with shaking in the dark, the sample was centrifuged for 10 min at 12,000 rpm. The pellets were resuspended in 20 μL of 3% NaCl. A 2 μL aliquot of the sample was dropped onto an agarose slide and photographed under an inverted fluorescence microscope.”

      This data is presented as micrographs in Fig. 6D, which shows the decreased FM5-95 staining with increasing concentrations of MgCl2. We make this description clear with the following revision:

      “FM5-95 staining decreased with increasing concentrations of Mg2+, and no staining was observed in the presence of 200 mM Mg2+ (Fig 6D).”

      We explain the reason for using SYTO9 as follows:

      “SYTO9, a green fluorescent dye that binds to nucleic acid, enters and stains bacteria cells when there is an increase in membrane permeability (Lehtinen et al., 2004; McGoverin et al., 2020). Staining decreased with increasing MgCl2, indicating that bacterial membrane permeability declined in an Mg2+ dose-dependent manner (Fig 6E).”

      We did not perform FACS in this study; we used the flow cytometer only to analyze the fluorescence distribution. To make this clear, we have revised the sentence as follows:

      “After incubation for 15 min at 30 ℃ with shaking in the dark, the mixtures were filtered and measured by flow cytometry (BD FACSCalibur, USA).”

      - Lines 391-397. The statement that palmitic acid shifts the peaks in Figure 7F is not supported by the data. There is essentially no change in the major peak position within each MgCl2 concentration set with increasing palmitic acid. For the linolenic acid data, it is clear that linolenic acid increases permeability only at 50 mM MgCl2; this should be mentioned in the text. 

      We appreciate this comment! We have revised the sentence as follows:

      “Exogenous palmitic acid also shifted the fluorescence signal peaks to the left in an MgCl2-dependent manner while palmitic acid only slightly shifted the peaks (Fig 6F). In contrast, exogenous linolenic acid shifted the peak to the right in a dose-dependent manner at 50 mM MgCl2 (Fig 6G).” 

      - Line 404-405 - as mentioned earlier, the assay for the uptake of BLFX should be mentioned (if it is done so earlier in the text, then it does not need to be here).  

      We appreciate this comment! It has been mentioned in the introduction.  

      - Discussion: CpxA/R-OmprF pathway is mentioned here for the first time. Is this one of the pathways modified by MgCl2 as determined during the course of the study? If so, this should be reworded to mention that. If not, the relevance of this particular pathway as it relates to light metals and phenotypic resistance should be discussed.

      We appreciate this comment! Since this pathway is not relevant to the discussion of Mg2+ and fatty acid biosynthesis, we have deleted this sentence in the revised manuscript.

      -The following grammatical errors should be corrected:

      -line 55 change to: "genetic mutations; instead, this type of resistance is transient, and bacteria resume normal growth"

      -line 57: change to "resistance types are biofilm" 

      -line 61: change to "states that significantly" 

      -line 63: change to "resistance share the common feature in they retard or even cease in the presence" 

      -line 65: change to "resistance that allow bacteria to proliferate" 

      -line 81: change "But whether" to "Whether" 

      -line 178: change to "may be critical to the Mg-based phenotypic resistance"

      -line 86: change to "Marine environments and agriculture are rich in magnesium, where..." 

      -line 93: change in to vs

      -line 154: insert space after metabolism 

      -line 158: change 'identified" to "focused on the levels of" 

      -line 160: change "The levels of forty-one metabolites" 

      -line 198: change shared to share 

      -line 310: increased is duplicated, delete one 

      -line 451: add "the" before ratio 

      -line 453: gram should be capitalized 

      -line 462: "the regulation" should be reworded to "More importantly, the effect of exogenous MgCl targets the..." 

      -line 469: add dash between Mg2+ and limited

      -line 478: change "the crucial" to "a crucial" 

      -there are numerous locations in the manuscript where the word "magnetism" is used when clearly the word is supposed to be magnesium - this should be corrected

      We appreciate this comment! These have been corrected or revised. 

      Editors comments:

      Page 2 line 27; Page 25 line number 426; page 27 line number 481: In the abstract and discussion, only Vibrio alginolyticus was mentioned, even though two Vibrio species were used in the study. It would be helpful to understand the rationale behind the focus on this particular species.

      We appreciate this comment! We have revised the introduction to provide additional information as follows:

      “Vibrios inhabit seawater, estuaries, bays, and coastal waters, regions full of metal ions such as magnesium (Kumarage et al., 2022). Magnesium is the second most dissolved element in seawater after sodium. In seawater at a salinity of 3.5%, the magnesium concentration is about 54 mM (Potis, 1968), and in deep seawater it can be as high as 2,500 mM (Wang et al., 2024). Vibrio parahaemolyticus and V. alginolyticus are two representative Vibrio pathogens that infect humans and aquatic animals, resulting in illness and economic loss, respectively (Grimes, 2020). (Fluoro)quinolones such as balofloxacin are used to treat Vibrio infection; however, resistance has emerged due to overuse (Suyamud et al., 2024). Indeed, (fluoro)quinolones are one of China's two primary residual chemicals associated with aquaculture (Liu et al., 2017). Vibrio can develop quinolone resistance through mutations in the DNA gyrase gene or through plasmid-mediated mechanisms (Dutta et al., 2021). Thus, the use of V. parahaemolyticus and V. alginolyticus as bacterial representatives, and balofloxacin as a quinolone-based antibacterial representative, can help to define novel magnesium-dependent phenotypic resistance mechanisms of pathogenic Vibrio species.”

      On Page 2, line 34: The abstract contains some undefined abbreviations, such as 'PE' and 'PG', which should be explained. 

      We appreciate this comment! We define PE and PG in the revised abstract as follows:

      “phosphatidylethanolamine (PE) biosynthesis is reduced and phosphatidylglycerol (PG)”

      On Page 2, line 31-32: For the statement "Exogenous supplementation of fatty acids confirm the role of fatty acids in antibiotic resistance…" it would be beneficial to specify whether the fatty acids were saturated or unsaturated. 

      Response: We appreciate this comment! We revised the sentence as follows:

      “Exogenous supplementation of unsaturated and saturated fatty acids increased and decreased bacterial susceptibility to antibiotics, respectively, confirming the role of fatty acids in antibiotic resistance.”

      The potential effects of the specific ions (SO4 and Cl2) present in the Mg2SO4 and MgCl2 compounds used in the study were not discussed. It would be useful to understand if these ions had any influence on the observed outcomes.

      We appreciate this comment! We revised the sentence as follows:

      “However, the MIC for BLFX was higher in ASWT medium supplemented with Mg2SO4 or MgCl2 than in LB medium (Fig 1B). Moreover, Mg2SO4 and MgCl2 did not differ in their effect on the MIC, suggesting that Mg2+, rather than the other ions, contributes to the MIC change.”

      On Page 8, line 141: The heading of Figure 2, "Mg2+ elevates intracellular Mg2+," seems redundant and could be revised for clarity or modified. 

      We appreciate this comment! Figure 2 has now been moved to the supplementary figures as Suppl. Fig 2. The title is revised as follows:

      “Figure 2. Mg2+ decreases balofloxacin uptake.”

      On Page 4, line 91: some terms/abbreviations, such as 'LB' and 'M9,' require expansion or definition to ensure the reader's understanding.

      We appreciate this comment! We include the expansions for LB and M9 in the revised manuscript as follows:

      “Luria-Bertani medium (LB medium)” and “M9 minimal medium (M9 medium)”

      Page 4, line 92: The real seawater composition used in the experiments should be supported by a reference.

      We appreciate this comment! We provide the reference in the revised manuscript as follows:

      ““artificial seawater” (ASWT) medium that included the major ion species in marine water (Wilson, 1975) (LB medium plus 210 mM NaCl, 35 mM Mg2SO4, 7 mM KCl, and 7 mM CaCl2)”

      Page 4, line 93: the full names of the bacterial strains (e.g., ATCC33787 and VP01) should be provided instead of just the strain numbers.

      We appreciate this comment! We revised the sentence as follows:

      “To investigate whether this mineral impacts antibiotic activity, the minimal inhibitory concentration (MIC) of V. alginolyticus ATCC33787 and V. parahaemolyticus VP01, which we refer to as ATCC33787 and VP01 hereafter,”

      Finally, there appears to be a potential contradiction between the statements on page 12, lines 211-212 and 214-216, regarding the effects of Mg2+ on the synthesis of unsaturated fatty acids. Further explanation may be needed to reconcile these seemingly contradictory points.

      We appreciate this comment! Lines 221-226 (previously lines 211-212) concern gene expression for fatty acid biosynthesis, whereas lines 228 and 233 (previously lines 214-216) concern gene expression for fatty acid degradation. We agree that the previous description was somewhat confusing. We revised the sentence to emphasize that we focus on fatty acid degradation so that readers can tell the two apart.

      In the text, we revised it as follows:

      “In addition, we also quantified gene expression during fatty acid degradation to determine whether Mg2+ affects this process”

      In the figure legend, we also indicate that:

      “H. qRT-PCR for the expression of genes encoding fatty acid degradation in the absence or presence of the indicated concentrations of MgCl2”

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2024-02648

      Corresponding author(s): Kevin Berthenet (kevin.berthenet@lyon.unicancer.fr) and Gabriel Ichim (gabriel.ichim@lyon.unicancer.fr)

      1. General Statements

      We thank all the reviewers for their time and their constructive criticism, based on which we propose the revision plan detailed below. All our responses are indicated in italic font. Where applicable, the figures for the reviewers are included just below the answer. They have been included in the manuscript only where indicated. The line numbers indicated here refer to those in the original manuscript.

      The two reviews are listed in full at the end of the document.

      2. Description of the planned revisions

      Reviewer #1

      In this manuscript, the authors report a non-apoptotic role for caspase 3 in promoting cell migration. RNA sequencing revealed a "gene signature" associated with caspase 3 knockdown in a melanoma cell line, although there is no investigation of the connection between caspase 3 expression and the regulation of gene expression. Mass spectrometry-based experiments (AP-MS and BioID) identified numerous interacting proteins, with coronin 1B being the most extensively characterized. Data provided indicates that there is a direct interaction between caspase 3 and coronin 1B, and that caspase 3 influences coronin 1B phosphorylation basally and following ligand stimulation. Both proteins are required for efficient cell migration in scratch wound assays. Data is provided indicating that the actions of caspase 3 are independent of proteolytic activity, although the pharmacological inhibition of caspase activity is not complete, nor is the knockdown of BAX/BAK, making these conclusions poorly substantiated. Evaluation of pathways regulating caspase 3 expression implicates the SP1 transcription factor.

      Response: We thank the reviewer for their supportive comment. Regarding specific pharmacological inhibition of caspase-3, work is under way to complement the results obtained with a pan-caspase inhibitor (qVD-OPh). We will use specific effector caspase inhibitors, complemented by several other approaches: complete KO of the BAX and BAK proteins to prevent any possible mitochondrial permeabilization and low-level effector caspase activation, overexpression (OE) of the anti-apoptotic protein BCL-xL to also prevent residual mitochondrial permeabilization, and OE of XIAP, a potent caspase inhibitor. Promising preliminary data obtained with two effector caspase-specific inhibitors (Ac-DEVD-CHO and Ac-DNLD-CHO) in two different melanoma cell lines during wound healing migration are shown below; the inhibitors had no effect on melanoma cell migration.

      Line 129 - The data in Sup. Fig. 1H-L are technical, but where are the mass spectrometry results from the BioID2 experiments? These technical figures are really only relevant if the BioID2 system has been used for protein pull-downs, not for the IF analysis in Fig. 2B.

      Response: We apologize for the lack of precision in the logical flow of the article; we will now incorporate the MS data from the BioID2 experiment earlier in the manuscript.

      Line 143 - Figure 2C - it is not entirely convincing that caspase 7 is not associated with the cytoskeleton, there is a visible band in lysates from both cell lines, in contrast with GAPDH which is convincingly cytoplasmic. This is particularly true in the WM852 cell lines, in which the Caspase 3 band is almost the same as Caspase 7. These results would also be more convincing if there was IF of Caspase 7 and actin to show whether it is or is not enriched in regions of higher F-actin levels.

      Response: Indeed, our data point towards an enrichment of caspase-3 at the cell cortex. Since caspase-7 protein levels are generally lower, we detected less of it in the cytosolic fraction. As suggested, we have now performed more sensitive IF colocalization confocal imaging of caspase-7 and F-actin and found that it is also partially localized to the cortical cytoskeleton (see below). However, this effector caspase is not involved in melanoma cell migration (see wound healing assay below, with two different siRNAs for CASP7 and the positive control of siRNA CASP3).

      Figure 2D - knockdowns with only a single siRNA are insufficient, this should be replicated with additional siRNAs. In addition to the effect on actin anisotropy, it appears as though cells are smaller, is this and any other morphological changes reproducible?

      Response: We plan to strengthen the data shown in Fig.2D with additional siRNAs, as shown below. In addition, high-content screening (HCS) microscopy will provide several other cell morphology descriptors.

      Figure 2D-E. Is it cytochalasin B or D used in these experiments? The text and figures don't agree with each other.

      Figure 2F-G, same comments above for 2D-E (i.e. comments 3 & 4).

      Response: The experimental conditions will be better detailed in the revised manuscript.

      Figure 2F-G, it appears as though the fewer focal adhesions in the Caspase 3 knockdown cells are bigger per focal adhesion, is this a consistent result? If so, what is the explanation?

      Response: In addition to number, we also plan to quantify the size of focal adhesions.

      Figure 2H - it's not clear how this RNAseq data is relevant to the manuscript. There are some genes in the heat map, but it's not clear which ones are changed in their expression in the caspase 3 knockdown cells, nor is it clear how this is relevant to the proposed mechanisms of Caspase 3 interacting with and influencing the phosphorylation of coronin 1B. If there is no connection, then these data can be removed.

      Response: As suggested by the reviewer, the RNAseq data presented in Figure 2H will be removed from the revised manuscript since it is not very relevant.

      Supp. Figure 3 - given that there is data from multiple siRNAs for the incucyte migration data, it should be in the primary figures. And since there are multiple siRNAs and CRISPR/Cas9 KO cells, there should be nothing limiting the replication of the other data presented from only a single siRNA.

      Response: Several siRNAs are now used to replicate key results, as shown above.

      Figure 3A - how was cell adhesion measured? The methods section says "cell adhesion was determined through cell shape analysis and scoring" But this is very vague.

      Response: We thank the reviewer for spotting this ambiguity; in the revised manuscript we will be more precise in the Materials and Methods section.

      Figure 3L - was the Casp7 knockdown experiments done with multiple siRNAs? Both melanoma cell lines? Why is this figure only shown out to 24 hours, whereas the other Incucyte experiment run out to 48 hours? Where is the western blot confirming the caspase 7 knockdown? This is important to establish a clear lack of effect.

      Response: We apologize for the lack of detail; we now provide several siRNAs for caspase-7, all showing no or minimal effect on melanoma cell migration (see answer to point 2).

      Line 190 - it is not true to say that in the presence of QVD there is no longer any caspase activity induced by actinomycin D/ABT263 in supplemental Figures 3J-K. The way that the Y axis has been broken diminishes the difference between untreated and treated cells. In fact, there is apparently over 3-4 times more caspase activity in the actinomycin D/ABT263 treated cells in the presence of QVD relative to basal caspase activity. As a result, it cannot be concluded that there is no residual caspase activity.

      Response: We were not precise enough in describing the data in S3J-K. In the revised manuscript we will clearly say that since treatment with a pan-caspase inhibitor does not have the effect of lowering any basal caspase activity (column 1 versus 2), we conclude that in melanoma cells (WM793 and WM852) there is no basal caspase activation that could drive cell motility. The ActD/ABT263 treatment was used as positive control for bona fide induction of effector caspase activation. These results will be complemented by BAX/BAK DKO and BCLxL OE.

      Line 192 - Does the knockout of BAX/BAK (which apparently reduced but did not eliminate BAX/BAK protein levels in Supp. Fig. 3L) actually "completely block" caspase activity via the mitochondrial pathway? This has not been demonstrated.

      Response: We now provide a fluorometric effector caspases assay showing abrogation of caspase activity in BAX/BAK DKO cells (see below, caspase activating treatment is ActinomycinD plus ABT263). In addition, we will improve the DKO efficacy.

      Line 217 - coronin 1B was a hit from which assays? IP-MS and/or BioID2? I see that this is shown in Figure 5A but not referenced in this sentence.

      Line 218 - the reference to Figure 5A should be in the previous sentence.

      Line 220 - Can it really be said that the interaction is specific since there is a coronin 1B band in the GFP "negative" control?

      Response: The revised manuscript will address these inadequacies.

      Line 222 - it is a good control to show that siRNA-knockdown of Caspase 3 reduced the PLA signal in Figure 5C, but the reciprocal experiment of looking at what happens with Coronin 1B knockdown should be included. How does the PLA signal relate to phalloidin-stained F-actin?

      Response: The proximity ligation assay (PLA) is now complemented by KD of Coronin 1B (see below) and we will try to also add the phalloidin staining for F-actin, if compatible with the PLA protocol.

      Line 224 - looking at the line scans, is the lack of recruitment of coronin 1B to the F-actin at the edge of the protrusion in the Caspase 3 knockdown cells reproducible? Is the point that caspase 3 recruits Coronin 1B? There is an obvious difference in the F-actin at the cell edge, but if the F-actin were as dense in the Caspase 3 knockdown cells as they are for the control, would the same lack of coronin 1B be apparent?

      Response: This aspect will be better addressed/discussed in the revised manuscript.

      Line 227 - where is the western blot showing the effectiveness of the coronin 1B knockdown to accompany Figure 5F?

      Response: The efficacy of coronin 1B KD will be added in the revised manuscript.

      Figure 5G - the blots indicate that there is no change in phospho-PKCalpha in the caspase 3 knockdown cells, although phospho-coronin 1B does decrease. This has not been commented upon in the text. Is the implication that there is a non-PKCalpha mediated mechanism for coronin 1B phosphorylation that is dependent on caspase 3?

      Figure 5H - following from the previous point, there is no phospho-PKCalpha blot that would be a positive control for the effect of PDGF stimulation on PKC activation, in control and caspase 3 knockdown cells, to evaluate whether the effect on coronin 1B phosphorylation was upstream or downstream of PKCalpha. This is also true for Supp. Fig. 4H.

      Response: Since there are several PKC isoforms that might be co-expressed in melanoma cells, it is possible that PKCalpha is not the one responsible for phosphorylating Coronin 1B. We will be more precise in our investigations by using a pan-phospho-PKC antibody.

      Does phosphorylation of coronin 1B affect its interaction with caspase 3?

      Response: We will assess by Co-IP the interaction of caspase-3 with both non-phosphorylated and phosphorylated Coronin 1B.

      Figure 6 - as before, only a single siRNA to knockdown SP1 is insufficient to robustly support the conclusions.

      Response: As shown below, we addressed this helpful comment by using several siRNAs to assess the role of SP1 in melanoma cell motility, in two different melanoma cell lines.

      Reviewer #2

      In this manuscript, the authors provide substantial amounts of experimental evidence that caspase-3, more precisely pro-caspase-3, might be involved in promoting melanoma cell migration and invasion. As such, this function, which might stem from scaffolding roles independent of proteolytic activity (yet not shown entirely convincingly), could possibly be similar to those attributed to other caspases, yet the latter omitted experiments testing for the necessity of enzyme activity. The data are novel and interesting and obviously deserve publication. Yet, a number of criticisms need to be listed.

      Response: We thank the reviewer for acknowledging the novelty of our study. As also rightly pointed out by R1, we will strive in a revised manuscript to show definitively that caspase-3 participates in melanoma cell motility independently of its pro-apoptotic protease role: we will use two effector caspase-specific inhibitors (Ac-DEVD-CHO and Ac-DNLD-CHO, as shown above), complemented by several other approaches: complete KO of the BAX and BAK proteins to prevent any possible mitochondrial permeabilization and low-level effector caspase activation, OE of the anti-apoptotic protein BCL-xL to also prevent residual mitochondrial permeabilization, and OE of XIAP, a potent caspase inhibitor.

      First and foremost, I don't seem to find ethical approval information on the animal experiments. While I do not work with zebrafish myself, I am also somewhat concerned by the size of tumours seen in some of the depicted fish. It is highly important that appropriate information in this direction, including possible endpoints, is provided.

      Response: We completely agree with the reviewer, yet the ethical approval is already provided in the manuscript (line 588) and will be complemented by adding the endpoints.

      The second major issue lies in figure 1. The figure as a whole seems to be very much forced to support or motivate later experimental findings. The authors lack sufficient clarity on some of the approaches and seem to judge on the data to a good bit as they see fit. (…)

      I'd suggest to largely take out Fig1 in its current form, spend time on properly describing any analysis of public data, carefully interpret these and move them probably to the end of the results. Currently, it just leaves the impression that the data were pushed as hard as possible to promote the good work that follows.

      Response: We will carefully consider the reviewer’s comments and rework the bioinformatics analysis presented in Figure 1 (and the associated supplementary figure), making sure to present certain data as correlation (and not causation) and going into more detail on the physio-pathological features of melanoma patients with low/high caspase-3 expression.

      The text on line 129ff seems to have omitted any outcomes from the Suppl. Fig1H-L. What was found and what are we supposed to learn from this?

      Response: We apologize for the lack of precision in the logical flow of the article; we will now incorporate the MS data from the BioID2 experiment earlier in the manuscript.

      Lines 146/147 state similar effects upon CASP3 depletion and cytochalasin D. I cannot make that out from Fig.2D. Can you be more specific or visualize this better?

      Response: We will fix this by including zoomed and detailed images of individual cells.

      Is it possible to state whether effects such as in Fig.3B are general rather than showing just 1 cell?

      Response: The defects in cell adhesion for caspase-3-depleted cells are quantified in Figure 3A. Moreover, we will add representative images.

      It is unclear how the genes in Fig.2H were defined and why would all of these differ (unless this was an inclusion criterion for the panel). Are these considered to be downstream of CASP3 somehow? I don't fully get the message here. Is this panel even required here?

      Response: As it brings little information, panel 2H will be excluded from the revised manuscript.

      To fully prove independence of caspase-3 activity, it would be appropriate to k/o caspase-3 to then reconstitute the cells with inactive caspase-3.

      Response: We will do our best to address this comment in the revised manuscript.

      Fig.4C and associated text: Statements on changes in tumor size cannot be made from data on tumor free survival.

      Response: We apologize for the misleading data interpretation; this will be toned down in a revised manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, the authors report a non-apoptotic role for caspase 3 in promoting cell migration. RNA sequencing revealed a "gene signature" associated with caspase 3 knockdown in a melanoma cell line, although there is no investigation of the connection between caspase 3 expression and the regulation of gene expression. Mass spectrometry-based experiments (AP-MS and BioID) identified numerous interacting proteins, with coronin 1B being the most extensively characterized. Data provided indicates that there is a direct interaction between caspase 3 and coronin 1B, and that caspase 3 influences coronin 1B phosphorylation basally and following ligand stimulation. Both proteins are required for efficient cell migration in scratch wound assays. Data is provided indicating that the actions of caspase 3 are independent of proteolytic activity, although the pharmacological inhibition of caspase activity is not complete, nor is the knockdown of BAX/BAK, making these conclusions poorly substantiated. Evaluation of pathways regulating caspase 3 expression implicates the SP1 transcription factor.

      Major comments:

      1. Line 129 - The data in Sup. Fig. 1H-L are technical, but where are the mass spectrometry results from the BioID2 experiments? These technical figures are really only relevant if the BioID2 system has been used for protein pull-downs, not for the IF analysis in Fig. 2B.
      2. Line 143 - Figure 2C - it is not entirely convincing that caspase 7 is not associated with the cytoskeleton, there is a visible band in lysates from both cell lines, in contrast with GAPDH which is convincingly cytoplasmic. This is particularly true in the WM852 cell lines, in which the Caspase 3 band is almost the same as Caspase 7. These results would also be more convincing if there was IF of Caspase 7 and actin to show whether it is or is not enriched in regions of higher F-actin levels.
      3. Figure 2D - knockdowns with only a single siRNA are insufficient, this should be replicated with additional siRNAs. In addition to the effect on actin anisotropy, it appears as though cells are smaller, is this and any other morphological changes reproducible?
      4. Figure 2D-E. Is it cytochalasin B or D used in these experiments? The text and figures don't agree with each other.
      5. Figure 2F-G, same comments above for 2D-E (i.e. comments 3 & 4).
      6. Figure 2F-G, it appears as though the fewer focal adhesions in the Caspase 3 knockdown cells are bigger per focal adhesion, is this a consistent result? If so, what is the explanation?
      7. Figure 2H - it's not clear how this RNAseq data is relevant to the manuscript. There are some genes in the heat map, but it's not clear which ones are changed in their expression in the caspase 3 knockdown cells, nor is it clear how this is relevant to the proposed mechanisms of Caspase 3 interacting with and influencing the phosphorylation of coronin 1B. If there is no connection, then these data can be removed.
      8. Supp. Figure 3 - given that there is data from multiple siRNAs for the incucyte migration data, it should be in the primary figures. And since there are multiple siRNAs and CRISPR/Cas9 KO cells, there should be nothing limiting the replication of the other data presented from only a single siRNA.
      9. Figure 3A - how was cell adhesion measured? The methods section says "cell adhesion was determined through cell shape analysis and scoring" But this is very vague.
      10. Figure 3L - was the Casp7 knockdown experiments done with multiple siRNAs? Both melanoma cell lines? Why is this figure only shown out to 24 hours, whereas the other Incucyte experiment run out to 48 hours? Where is the western blot confirming the caspase 7 knockdown? This is important to establish a clear lack of effect.
      11. Line 190 - it is not true to say that in the presence of QVD there is no longer any caspase activity induced by actinomycin D/ABT263 in supplemental Figures 3J-K. The way that the Y axis has been broken diminishes the difference between untreated and treated cells. In fact, there is apparently over 3-4 times more caspase activity in the actinomycin D/ABT263 treated cells in the presence of QVD relative to basal caspase activity. As a result, it cannot be concluded that there is no residual caspase activity.
      12. Line 192 - Does the knockout of BAX/BAK (which apparently reduced but did not eliminate BAX/BAK protein levels in Supp. Fig. 3L) actually "completely block" caspase activity via the mitochondrial pathway? This has not been demonstrated.
      13. Line 217 - coronin 1B was a hit from which assays? IP-MS and/or BioID2? I see that this is shown in Figure 5A but not referenced in this sentence.
      14. Line 218 - the reference to Figure 5A should be in the previous sentence.
      15. Line 220 - Can it really be said that the interaction is specific since there is a coronin 1B band in the GFP "negative" control?
      16. Line 222 - it is a good control to show that siRNA-knockdown of Caspase 3 reduced the PLA signal in Figure 5C, but the reciprocal experiment of looking at what happens with Coronin 1B knockdown should be included. How does the PLA signal relate to phalloidin-stained F-actin?
      17. Line 224 - looking at the line scans, is the lack of recruitment of coronin 1B to the F-actin at the edge of the protrusion in the Caspase 3 knockdown cells reproducible? Is the point that caspase 3 recruits Coronin 1B? There is an obvious difference in the F-actin at the cell edge, but if the F-actin were as dense in the Caspase 3 knockdown cells as they are for the control, would the same lack of coronin 1B be apparent?
      18. Line 227 - where is the western blot showing the effectiveness of the coronin 1B knockdown to accompany Figure 5F?
      19. Figure 5G - the blots indicate that there is no change in phospho-PKCalpha in the caspase 3 knockdown cells, although phospho-coronin 1B does decrease. This has not been commented upon in the text. Is the implication that there is a non-PKCalpha mediated mechanism for coronin 1B phosphorylation that is dependent on caspase 3?
      20. Figure 5H - following from the previous point, there is no phospho-PKCalpha blot that would be a positive control for the effect of PDGF stimulation on PKC activation, in control and caspase 3 knockdown cells, to evaluate whether the effect on coronin 1B phosphorylation was upstream or downstream of PKCalpha. This is also true for Supp. Fig. 4H.
      21. Does phosphorylation of coronin 1B affect its interaction with caspase 3?
      22. Figure 6 - as before, only a single siRNA to knockdown SP1 is insufficient to robustly support the conclusions.

      Minor comments:

      1. Figure 2C - all caps for CASP7
      2. Figures 2D,F - Cytochalsin
      3. Figure 2H, the labelling of gene names is too small to read.
      4. Supplemental Fig 1A - why is A375 here? Why plot a graph and not just write a percentage protein remaining under the figure? There are no errors indicated, so presumably this is N = 1.
      5. Line 127 - smal

      Significance

      The manuscript is interesting and novel, making it relevant for a broad basic research audience. The role of caspase 3 in non-apoptotic biological processes is not extensively characterized, making this study an advance in the field. The methods are appropriate and well-executed. The statistical methods are mostly appropriate, although some assays (e.g. wound healing assays) do not have associated statistical analysis. Most of the conclusions are adequately substantiated by the results, but as indicated above and in the points below, this is not entirely consistent. There is an issue with only a single siRNA being used in several experiments that should be addressed.

    1. He goes on to stress that "religious or magical behavior or thinking must not be set apart from the range of everyday purposive conduct, particularly since even the ends of the religious and magical actions are predominantly economic." Bourdieu (1990h:4) argues that by insisting on the "this-worldly" character of behavior motivated by religious factors Weber provides a "way of linking the contents of mythical discourse (and even its syntax) to the religious interests of those who produce it, diffuse it, and receive it."

      acknowledge distinction of value-rational motives

    Annotators

    URL

    doc-04-1s-prod-02-apps-viewer.googleusercontent.com/viewer2/prod-02/pdf/4dasas8bgjb06ih8nrlhchpg5b9h0kqq/mhkbct7ob9vh5bibbseqq67ou9uq7mrm/1733791800000/3/106465141034196260524/APznzaZbwTm12FnERi3bLDiovCCFPTjSHfX3Kusk_hAgt5UsEXhiBQ5ahVs3cH9be_qtuiAUmi4YuYXLxvqurmWu0vbwCIyep31aH8zcUDzPd8LxIq33vd3kgMg67sFZiYBhSQN6Efb4SXWPtlJvdukLvqZzNBH7nvSG_TuNtCmxyJvkW7380sMLszxrDs-1MmUgF9SHmEIpFfLSqelIZTICl-vi6UqS7c0_xXpge77DSUSLnga5ZYB-OhYumtm924IxC9siv9OjEASyVhuKv094yAcmxgQjjrVVAy65bKOBJPDuDyRVJCRDFndeTOlVFP7HnX6Im-wFZ1F9m_OvoTml7gjiMXQAVpFCCrVUzvFVO6B1_qqgVagbadFDya5GeNK9FwW49tF13M3fYwDsSVA_gZZ7gtqXZxxVoqKrIjJRqnIZdrltvw1FPAy-ygWz-RURo9PKaCHWFaloYtJXtvknhtDVaWPqJCJE7aqAmdya61_7DGi4s3Qk_ZIGxeA9GaGh8UDpZ8pOYdDMlaCl8_L27sSQuUexyDgnLBd0lid3oD2zeZi4_AppjwWv7nRG8MlbqRCI_RPKCCUQhLqBBMvWTIFcM7Vk_SOk8sM-bqIvRxf-3pru91h2Y9SBumrzhe6S3IWX8A3wGHMWg0lGdS8DN9xAOm8BD6IRs4eNyQNHKaKEKUFOMqJbcJm16QSJhw916YAvHmTzh8Y9Fpz1EAGbOKszpWC0UByJiKVkmwjVtWas6RGk9f8YldzeS5Ppl7Q7S29i-2WjBGQD9cNK7HE3oard7RD-ifI2WhH08ZH48c_NZ93ezlvuUZXVhujMbQe6ZMiLSOhZFkQnNZAfJL1mR-aVkubBTLvGlgaOpsHvGeaJIggyON1lkXPw34NAFXsT9AGMzZ8i-To7BPpMVtU5_WK08zlVX27btzO1W8t-EG8E6uOxos5v9D76xYHyeXENf2EeXIUGY44OdXTgI56u8W3w_-kT0XhtKOZQalZ-xEODsxchzr9naqx_DBXBYc5N7DN6FSGS0W6agXvBhy2rfhqWdlOopR1TUCmP0H4Z4umbzybuo0iIc1yIX-VIcyAvjAwuqdlrPwBeQQ61si2c3ZmhH0iOQP0gIjaz7LSaq3DDcjf14Ky3jJgQL-RYwvlXGNK3Eo9K-2a5FVvK04XOYPjb9uvKKRltek733twWagFDbmNfxFGu-ru2rwoMBlawdtMiY14YSuEKYCI7nzSQ2_wgiUamEtUSvVPolho02AVTNxQd2XvQHzbhKZrzMdrUhZ8zwTkrYDlPsBRsMmnoj6fUu1QNWnG1I9smKC2kikqNAsyqzxx28hpTC2lizUi_Zp4zdcgKm1-kX_hbpvCdCyG978S_UNS81krKsd2rKBzzrdeR-lhE8JYzMgLV48lFyXzbneVP9Z66lbMGCo2FVWJuEPf6vFXdDodBY9dT4dZ9M6XV8FrFZSRwYx2N3jKeX-KibTPhUj3f7ly7y6Aq-xmJcy3DVrs0FR4TIqDkSqb3n8gjdHnyQ75zmPOVOomL22QttMhPsWWOBOZa3NCde-8XBRFI5FRcIGHNxFsE6y8k1kZ_9B9XPr2QwnntZvHaUTwcoZ2aO1YkAu-6uK1-P9-bHVxOns1B9JgaVyFH9hQdCtdR9Z59MO5TvNoeNKSrqJtDVNviUfzJMaC3o-KVZimwqQ77KY5FTUyR_541lNdE3sgBI7F_oOasEgKjkRgFlh6HXKGlke4ZRy7izVGmap9CzqYgKHrWvbu9dM7QiL6Wkz6ivY7cso_4R4iSeCevP5swaOk4GSQgde_5jCjH4z1h
-6qq9ArLOs3dtMmrI1UeqxssCNrugJLKvBup_7Tiqb0iEvFyV7djzacMZA7p6RlUn0rhlT7WxzId5XRNeUPbIVszzydw1YE8Ku6s6qIilItBI25V
    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      Summary: 

      Recent years have seen spectacular and controversial claims that loss of function of the RNA splicing factor Ptbp1 can efficiently reprogram astrocytes into functional neurons that can rescue motor defects seen in 6-hydroxydopamine (6-OHDA)-induced mouse models of Parkinson's disease (PD). This latest study is one of a series that fails to reproduce these observations, but remarkably also reports that neuronal-specific loss of function of Ptbp1 both induces expression of dopaminergic neuronal markers in striatal neurons and rescues motor defects seen in 6-OHDA-treated mice. The claims, if replicated, are remarkable and identify a straightforward and potentially translationally relevant mechanism for treating motor defects seen in PD models. However, while the reported behavioral effects are strong and were collected without sample exclusion, other claims made here are less convincing. In particular, no evidence that Ptbp1 loss of function actually occurs in striatal neurons is provided, and the immunostaining data used to claim that dopaminergic markers are induced in striatal neurons is not convincing. Furthermore, no characterization of the molecular identity of Ptbp1-deficient striatal neurons is provided using single-cell RNA-Seq or spatial transcriptomics, making it difficult to conclude that these cells are indeed adopting a dopaminergic phenotype. 

      Overall, while the claims of behavioral rescue of 6-OHDA-treated mice appear compelling, it is essential that these be independently replicated as soon as possible before further studies on this topic are carried out. Insights into the molecular mechanisms by which neuronal-specific loss of function of Ptbp1 induces behavioral rescue are lacking, however. Moreover, the claims of induction of neuronal identity in striatal neurons by Ptbp1 require considerable additional work to be convincing.

      We thank the reviewer for the detailed analysis of our study. Please find our answers to the points raised by the reviewer below in blue.

      Strengths of the study: 

      (1) The effect size of the behavioral rescue in the stepping and cylinder tests is strong and significant, essentially restoring 6-OHDA-lesioned mice to control levels.

      (2) Since the neurotoxic effects of 6-OHDA treatment are highly variable, the fact that all behavioral data was collected blinded and that no samples were excluded from analysis increases confidence in the accuracy of the results reported here. 

      We appreciate the reviewer’s feedback and acknowledgement of the strengths of our study. We undertook several optimization steps in the surgery, post-operative care, and handling of the animals for behavior experiments to ensure high reproducibility of our experiments.

      Weaknesses of the study:  

      (1) Neurons express relatively little Ptbp1. Indeed, cellular expression levels as measured by scRNA-Seq are substantially below those of astrocytes and other non-neuronal cell types, and Ptbp1 immunoreactivity has not been observed in either striatal or midbrain neurons (e.g. Hoang, et al. Nature 2023). This raises the question of whether any recovery of Th expression is indeed mediated by the loss of function of Ptbp1 rather than by off-target effects. AAV-mediated rescue of Ptbp1 expression could help clarify this.

      In the original manuscript, we delivered control vectors that only express the ABE to 6-OHDA-lesioned mice (labeled as AAV-ctrl) and did not detect TH positive cells in the midbrain or striatum of control mice or rescue of spontaneous motor skills. We can therefore exclude that the delivery procedure, AAV-PHP.eB capsid, or ABE expression caused adverse effects leading to induction of TH expression and functional rescue of spontaneous motor behaviors in PD mice. To further exclude that these effects were caused by off-target editing, we experimentally determined off-target binding sites of our sgRNA (sgRNA-ex3) using GUIDE-seq and subsequently analyzed these sites in treated animals by NGS (Figure 3 – supplement 3). While two off-target sites were identified, it is unlikely that base editing at these sites caused the observed phenotypes. One off-target site was identified in the myopalladin (Mypn) gene, which encodes a muscle-specific protein that plays a role in regulating the structure and growth of skeletal and cardiac muscle (Filomena et al., 2021, 2020). The other site is not located in a coding region, but in an intron of the ankyrin-1 (Ank1) gene, which encodes an adaptor protein linking membrane proteins to the underlying cytoskeleton (Cunha and Mohler, 2009). Even though this gene is also expressed in neurons, base editing within this intronic region did not lead to changes in transcript levels (Figure 3 – supplement 3). Thus, the induction of TH expression upon adenine base editing with sgRNA-ex3 is likely a direct consequence of PTBP1 downregulation.

      Further supporting this conclusion, in the revised manuscript we additionally show PTBP1 downregulation at the RNA and protein level in the SNc and striatum after base editor treatment (Figure 2 – figure supplement 5; figure 3 – supplement 2).

      (2) It is not clear why dopaminergic neurons, which are not normally found in the striatum, are observed following Ptbp1 knockout. This is very similar to the now-debunked claims made in Zhou, et al. Cell 2020, but here performed using the hSyn rather than GFAP mini promoter to control AAV expression. While this is the most dramatic and potentially translationally relevant claim of the study, this claim is extremely surprising and lacks any clear mechanistic explanation for why it might happen in the first place.  

      We agree with the reviewer that our study does not provide mechanistic insights into how Ptbp1 downregulation in neurons leads to the induction of dopaminergic markers in the striatum. As we believe that this is not within the scope of a revision, we discuss potential follow-up experiments in the discussion section of the revised manuscript.

      This observation is even more surprising in light of reports that antisense oligonucleotide-mediated knockdown of Ptbp1, which should have affected both neuronal and glial Ptbp1 expression, failed to induce expression of dopaminergic neuronal markers in the striatum (Chen, et al. eLife 2022). Selective loss of function of Ptbp1 in striatal and midbrain astrocytes likewise results in only modest changes in gene expression.

      Using 6-OHDA-lesioned Aldh1l1-CreERT2;Rpl22lsl-HA mice, the Chen et al. study (eLife 2022) assessed potential astrocyte-to-neuron conversion by quantifying the presence of HA-labeled neurons after ASO-mediated knockdown of Ptbp1. Even though they did not detect HA-positive neurons in the SNc, suggesting absence of astrocyte-to-neuron conversion, the images in Figure 4D reveal TH positive cells in the lesioned hemisphere, similar to our observations in Figure 2B-D. While it cannot be excluded that these TH positive cells are remnants from an incomplete 6-OHDA lesion, they could also be endogenous neurons with induced expression of dopaminergic markers after ASO-mediated knockdown of Ptbp1. Furthermore, Chen et al. performed the apomorphine test to assess changes in motor skills, which did not reveal an improvement in our study either.

      It is critically important that this claim be independently replicated, and that additional data be provided to conclusively show that striatal neurons are indeed expressing dopaminergic markers.

      Our behavior and immunofluorescence experiments involving mice injected into the striatum were performed with two independently generated cohorts of 6-OHDA mice. In detail, the 6-OHDA mice were generated by two independent surgeons from different labs (>6 months between experiments of these cohorts), leading to comparable behavioral outcomes before and after treatment. Subsequent behavior and immunofluorescence experiments with each cohort were performed and analyzed by two independent and blinded researchers, showing comparable results.

      (3) More generally, since multiple spectacular and irreproducible claims of single-step glial-to-neuron reprogramming have appeared in high-profile journals in recent years, a consensus has emerged that it is essential to comprehensively characterize the identity of "transformed" cells using either single-cell RNA-Seq or spatial transcriptomics (e.g. Qian, et al. FEBS J 2021; Wang and Zhang, Dev Neurobiol 2022). These concerns apply equally to claims of neuronal subtype conversion such as those advanced here, and it is essential to provide these same datasets.

      In the revised version, we have analyzed the expression of additional neuronal markers in TH positive cells of the striatum using 4i imaging. Briefly, our results showed that the vast majority of TH-expressing cells also expressed the markers DAT and NEUN, further corroborating the neuronal and dopaminergic identity of these cells. Additional analysis revealed that this TH/DAT/NEUN-expressing cell population expressed markers of GABAergic neurons, either of medium spiny neurons (~50%) or of various types of interneurons (~50%). While our 4i analysis has allowed us to broadly classify these TH-expressing populations, we agree that detailed transcriptional analysis at the single cell level is required to understand the molecular mechanisms underlying the generation of TH positive cells. These analyses are, however, not within the scope of a revision and would require a thorough dedicated study. We have added these results and discussion points to the revised manuscript.

      (4) Low-power images are generally lacking for immunohistochemical data shown in Figures 3 and 4, which makes interpretation difficult. DAPI images in Figure 3C do not appear nuclear. Immunostaining for Th, DAT, and Dcx in Figure 4 shows a high background and is difficult to interpret. 

      We thank the reviewer for closely evaluating these images and suggestions for improvement. In the revised manuscript, we provide low power images and higher magnification insets as requested to allow for easier interpretation.

      (5) Insights into the mechanism by which neuronal-specific loss of Ptbp1 function induces either functional recovery, or dopaminergic markers in striatal neurons, is lacking.

      In the revised manuscript, we provide a more detailed discussion of mechanisms that could potentially be involved in the functional recovery or expression of dopaminergic markers. However, deciphering the exact molecular mechanisms underlying these observations requires thorough transcriptional analysis at the single cell level, which is out of scope of this revision.

      Reviewer #2 (Public Review):

      Summary: 

      The manuscript by Bock and colleagues describes the generation of an AAV-delivered adenine base editing strategy to knockdown PTBP1 and the behavioral and neurorestorative effects of specifically knocking down striatal or nigral PTBP1 in astrocytes or neurons in a mouse model of Parkinson's disease. The authors found that knocking down PTBP1 in neurons, but not astrocytes, and in striatum, but not nigra, results in the phenotypic reorganization of neurons to TH+ cells sufficient to rescue motor phenotypes, though insufficient to normalize responses to dopaminomimetic drugs.

      Strengths: 

      The manuscript is generally well-written and adds to the growing literature challenging previous findings by Qian et al., 2020 and Zhou et al., 2020 indicating that astrocytic downregulation of PTBP1 can induce conversion to dopaminergic neurons in the midbrain and improve parkinsonian symptoms. The base editing approach is interesting and potentially more therapeutically relevant than previous approaches.

      Weaknesses: 

      The manuscript has several weaknesses in approach and interpretation. In terms of approach, the animal model utilized, the 6-OHDA model, though useful to examine dopaminergic cell loss, exhibits accelerated neurodegeneration and none of the typical pathological hallmarks (synucleinopathy, Lewy bodies, etc.) compared to the typical etiology of Parkinson's disease, limiting its translational interpretation. 

      We thank the reviewer for the detailed assessment of our study and pinpointing its current weaknesses. Please find our answers to all comments below in blue.

      We agree with the reviewer that the 6-OHDA model lacks the typical pathological hallmarks of PD. Nevertheless, we chose this model for two reasons:

      i) The 6-OHDA model was used by both Qian et al. (2020) and Zhou et al. (2020). To allow comparison of our results to these studies, it was crucial to use the same model. Notably, the 6-OHDA model was also used by Chen et al. (2022) and Hoang et al. (2023) for comparison to the two studies from 2020.

      ii) The 6-OHDA model is straightforward to generate and displays robust motor impairments for evaluation of potential therapeutic effects of neuroregeneration treatment approaches. We therefore believe that the model is well-suited to analyze the cellular and behavioral effects (specifically motor skills) of PTBP1 downregulation. 

      In future studies, it would be critical to include models that also display typical pathological hallmarks of the disease to further evaluate the therapeutic effect of this base editing approach. These experiments are, however, not within the scope of this study, which focused on the cellular and behavioral effects of PTBP1 downregulation.

      In addition, there is no confirmation of a neuronal or astrocytic knockdown of PTBP1 in vivo; all base editing validation experiments were completed in cell lines. 

      In the revised manuscript, we assess in vivo base editing efficiencies at the Ptbp1 target site in the SNc (AAV-hsyn, 15.6%) and striatum (AAV-hsyn, 21.1%). Furthermore, we assessed in vivo Ptbp1 downregulation at the RNA and protein level to complement our in vitro data (Figure 2 – figure supplement 5; figure 3 – supplement 2).

      Finally, it is unclear why the base editing approach was used to induce loss-of-function rather than a cell-type specific knockout, if the goal is to assess the effects of PTBP1 loss in specific neurons. 

      We expressed base editors under cell-type specific promoters to induce a reliable loss-of-function mutation at the Ptbp1 exon-intron junction in neurons or astrocytes. Performing these mutations with Cas9 nucleases instead would have had potential limitations and risks, including i) indel mutations do not always lead to a frameshift and loss-of-function despite high indel formation at the targeted site, ii) nucleases induce DNA double strand breaks, which can have serious side effects (e.g. chromosomal rearrangements or translocations), and iii) mosaicism, as edited cells contain different indel mutations, which may result in different effects and thus complicate analysis of the downstream effects. We discuss these points in the revised manuscript.

      In terms of interpretation, the conclusion by the authors that PTBP1 knockdown has little likelihood to be therapeutically relevant seems overstated, particularly since they did observe a beneficial effect on motor behavior. We know that in PD, patients often display negligible symptoms until 50-70% of dopaminergic input to the striatum is lost, due to compensatory activity of remaining dopaminergic cells. Presumably, a small recovery of dopaminergic neurons would have an outsized effect on motor ability and may improve the efficacy of dopaminergic drugs, particularly levodopa, at lower doses, averting many problematic side effects. Since striatal dopamine was assessed by whole-tissue analysis, which is not necessarily reflective of synaptic dopamine availability, it is difficult to assess whether the ~10% increase in TH+ cells in the striatum was sufficient to improve dopamine function. However, the improvement in motor activity suggests that it was.

      As pointed out by the reviewer, it is difficult to estimate the therapeutic effect and importance of a ~10% increase in TH+ cells for PD patients. Guided by the reviewer’s suggestion, we have included a more in-depth discussion of our results and their potential therapeutic value as well as outstanding questions for future studies in the revised manuscript.

      Reviewer #3 (Public Review):

      This study explores the use of an adenine base editing strategy to knock down PTBP1 in astrocytes and neurons of a Parkinson's disease mouse model, as a potential AAV-BE therapy. The results indicate that editing Ptbp1 in neurons, but not astrocytes, leads to the formation of tyrosine hydroxylase (TH)+ cells, rescuing some motor symptoms.

      Several aspects of the manuscript stand out positively. Firstly, the clarity of the presentation. The authors communicate their ideas and findings in a clear and understandable manner, making it easier for readers to follow. 

      The Materials and methods section is well-elaborated, providing sufficient detail for reproducibility. 

      The logical flow of the manuscript makes sense, with each section building upon the previous one coherently.

      The ABE strategy employed by the authors appears sound, and the manuscript presents a coherent and well-supported argument.

      Positively, some of the data in this study effectively counteracts previous work in line with more recent publications, demonstrating the authors' ability to contribute to the ongoing conversation in the field.

      We thank the reviewer for appreciating the effort we have put into this study. Please find below a point-by-point reply to the weaknesses raised by the reviewer. 

      However, while the in vitro data yields promising results, it may have been overly optimistic to assume that the efficiencies observed in dividing cells will directly translate to in vivo conditions. This consideration is important given the added complexities of vector optimization, different cell types targeted in vitro versus in vivo, as well as unknown intrinsic limitations of the base editing technology. 

      We agree with the reviewer that in vitro base editing efficiencies might not directly translate to in vivo editing outcomes. We therefore assessed in vivo base editing efficiencies at the Ptbp1 locus and PTBP1 downregulation in the striatum and midbrain. Our data revealed that in vivo base editing activity was lower than in our in vitro setting (in vitro: Figure 1; figure 1 – figure supplement 2; in vivo: figure 2 – figure supplement 5; figure 3 – supplement 2). However, we believe that these rates are slightly underestimated since we sequenced DNA isolated from the whole tissue (striatum or SNc) and not from purified astrocytes or neurons. Moreover, we could demonstrate that editing led to a reduction of Ptbp1 transcript and PTBP1 protein level (Figure 2 – figure supplement 5; figure 3 – supplement 2).
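      The dilution argument above can be illustrated with a short calculation. This is a minimal sketch, not part of the original analysis: it assumes that edits occur only in the targeted cell type and that every cell contributes equally to the sequenced DNA pool. The function name and the 50% target-cell fraction are hypothetical values chosen for illustration; only the bulk rates (15.6% in the SNc, 21.1% in the striatum) come from the response.

```python
def cell_type_editing_rate(bulk_rate: float, target_fraction: float) -> float:
    """Estimate the editing rate inside the targeted cell type from a bulk-tissue rate.

    Assumptions (illustrative only):
      - edits occur exclusively in the targeted cell type,
      - every cell contributes equally to the sequenced DNA pool.
    Under these assumptions: bulk_rate = target_fraction * cell_type_rate.
    """
    if not 0.0 <= bulk_rate <= 1.0:
        raise ValueError("bulk_rate must lie in [0, 1]")
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target_fraction must lie in (0, 1]")
    # Cap at 1.0: a rate above 100% would mean the assumptions are violated.
    return min(bulk_rate / target_fraction, 1.0)


if __name__ == "__main__":
    # Bulk rates reported in the response; the 0.5 fraction is a hypothetical placeholder.
    for tissue, bulk in (("SNc", 0.156), ("striatum", 0.211)):
        rate = cell_type_editing_rate(bulk, target_fraction=0.5)
        print(f"{tissue}: bulk {bulk:.1%} -> cell-type estimate {rate:.1%}")
```

      The smaller the share of targeted cells in the dissected tissue, the more the bulk measurement understates the rate within those cells; the real fraction would need to be measured for each region.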

      In addition, certain aspects of the manuscript would benefit from a more in-depth and comprehensive discussion rather than being only briefly touched upon. Such a discussion would enhance the relevance of the obtained results and provide the foundation for improvement when using similar approaches.

      Following the reviewer’s suggestion, we included a more in-depth discussion of our results in the revised manuscript.

      Recommendations for the authors:

      Reviewing Editor (Recommendations for the Authors):

      A summary of key recommendations that might improve the eLife assessment in a subsequent submission are provided below, as a guide to help the authors focus on changes that might enhance the strength of evidence (e.g., from "incomplete" to "solid").

      (1) Provide further explanation of the mechanistic relationship between the downregulation of Ptbp1 and TH+ dopaminergic neuron reprogramming. Additional discussion of this topic should also be included.

      (2) Demonstrate proof of editing in the intended targeted cells in vitro and/or in vivo.

      (3) Show evidence of successful Base Editor delivery in vivo.

      (4) Perform a deeper characterization of TH+ cells in vivo and provide a more thorough discussion of the identity of the targeted cells. This may include an exploration of whether TH+ cells detected are TH+ interneurons and/or establish their identity based on transcriptomics or a similar approach.

      (5) Provide better-quality representative images supporting the quantitative data.

      (6) Please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05 in the main manuscript.

      In the revised manuscript, we provided 1) suggestions of the mechanistic relationship between Ptbp1 knockdown, dopamine synthesis, and the functional rescue of spontaneous behaviors, 2) proof of in vivo base editing and successful base editor delivery, 3) deeper characterization of TH-expressing cells in vivo using 4i imaging, 4) better quality images, and 5) full statistical reporting.  

      Individual Reviewer recommendations for the authors are included below.

      Reviewer #1 (Recommendations For The Authors):

      Confirm loss of Ptbp1 function in infected striatal neurons. Single-cell RNA-Seq or spatial transcriptomic analysis must be performed to characterize the identity of the edited striatal neurons. The quality of the immunostaining in Figures 3 and 4 needs to be improved, and low-power images provided. Were eLife a conventional journal, I would have insisted on all these being included prior to publication. Please also arrange for independent replication of the behavioral rescue and induction of dopaminergic marker gene expression in the striatum.

      In the revised manuscript, we confirmed Ptbp1 downregulation at the tissue level in the SNc and striatum by RT-qPCR and western blot and included low-power images for easier interpretation. Additionally, we assessed expression of additional neuronal markers on striatal sections using 4i imaging and found that TH/DAT/NEUN positive populations either expressed markers of medium spiny neurons or interneurons. We have included these results in the revised manuscript.

      Our behavioral and imaging experiments involving mice injected into the striatum were in fact performed with two independently generated cohorts of 6-OHDA mice. In detail, the 6-OHDA mice were generated by two independent surgeons from different labs (>6 months between experiments of these two cohorts), leading to comparable behavioral outcomes before and after treatment. The experiments with each cohort were performed and analyzed by two independent and blinded researchers, yielding comparable results.

      Reviewer #2 (Recommendations For The Authors):

      (1) In the introduction, lines 43-45: This statement is inaccurate. Current treatment strategies do not focus on slowing or halting disease progression. There is currently no accepted therapy that does this. Dopaminergic therapies and deep brain stimulation can compensate for circuitry dysfunction as a result of dopamine cell loss but do not slow the disease. The referenced paper used is older and does not refer to new treatments for PD and is a summary article for a special issue of the Disease Models and Mechanisms journal. Please ensure that all references used are appropriate for the statement they are attached to.

      We thank the reviewer for pointing this out. We have rephrased this statement accordingly and provided an appropriate reference describing current treatment strategies.

      (2) The number of TH+ cells in the intact nigra seems low compared to published data. Suggest a stereological approach may be better than the Abercrombie method.

      Following the reviewer’s suggestion, we re-quantified the number of TH positive cells using a stereological approach (Nv:Vref method). We have included these results in the revised manuscript. 
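      For readers unfamiliar with the Nv:Vref approach, the estimate is the product of a numerical density (cells counted per sampled volume) and a reference volume. The sketch below is a generic illustration under stated assumptions, not the authors' pipeline: the reference volume is estimated with the Cavalieri principle (summed section areas multiplied by the spacing between sampled sections), and the function name and all numbers are hypothetical.

```python
from typing import Sequence


def estimate_total_cells(
    cells_counted: int,
    sampled_volume_mm3: float,
    section_areas_mm2: Sequence[float],
    section_spacing_mm: float,
) -> float:
    """Estimate the total cell number as N = Nv * Vref.

    Nv   : numerical density, cells counted divided by the volume actually sampled.
    Vref : reference volume, here a Cavalieri estimate
           (sum of section areas times the distance between sampled sections).
    """
    if sampled_volume_mm3 <= 0 or section_spacing_mm <= 0:
        raise ValueError("volumes and spacing must be positive")
    nv = cells_counted / sampled_volume_mm3            # cells per mm^3
    vref = sum(section_areas_mm2) * section_spacing_mm  # mm^3
    return nv * vref


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    total = estimate_total_cells(
        cells_counted=120,
        sampled_volume_mm3=0.02,
        section_areas_mm2=[0.5, 0.6, 0.4],
        section_spacing_mm=0.12,
    )
    print(f"Estimated total cells: {total:.0f}")
```

      Unlike a profile count with Abercrombie correction, this estimate does not depend on an assumed mean cell diameter, but it is sensitive to tissue shrinkage because Nv and Vref must refer to the same tissue state.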

      (3) Have the authors considered that the striatal TH+ cells could be TH+ striatal interneurons? 

      In the revised manuscript, we performed additional 4i imaging experiments to further analyze the identity of the TH positive cells in the striatum. Briefly, we found that TH/DAT/NEUN positive populations either expressed markers of GABAergic medium spiny neurons or interneurons. We have added these results to the revised manuscript (Figure 4). 

      (4) The Western blot shown in Figure 1 C for C8-D1A has some abnormalities and makes it difficult to judge the bands. Also, for 1B, the legends are difficult to see.

      In the revised manuscript, we have repeated the respective western blot to make interpretation of the bands easier, and adapted the legends in Figure 1B for better visibility.

      (5) Figure 2: Please show representative images for the GFAP-targeted editing.

      Representative images of the GFAP-targeted groups can be found in Figure 2 – figure supplement 3.

      (6) Figure 2, Supplement 3: Please include quantification.

      The quantifications for these images can be found in Figure 2D and 2F. 

      (7) Figure 1, Supplement 2: The gene name in A is misspelled.

      Thank you for pointing this out. In the revised manuscript, we added the correct gene name.

      (8) Line 267-276: As previously indicated, the statement here is overstated based on the data provided. In addition, the citation provided to justify this claim (Kannari et al., 2000) is an odd choice as the dosage of L-DOPA utilized was not therapeutically relevant (50 mg/kg). A better indication of efficacy would be the return to basal, unaffected levels rather than the fold increase in dopamine levels. A better comparison would be Lindgren et al., 2010 who showed that L-DOPA-treated animals with a physiologically relevant dose (6 mg/kg) that did not induce dyskinesia, showed a return to basal, non-lesioned dopamine levels in the striatum after L-DOPA by microdialysis. To really support this claim, the authors would need to use an approach that could measure synaptic dopamine availability, rather than whole-tissue dopamine levels, such as microdialysis, fiber photometry, or an equivalent.

      Following the reviewer’s suggestions, we replaced this reference with Lindgren et al. (2010) and provide a more detailed interpretation of our results and remaining questions for future studies.  

      Reviewer #3 (Recommendations For The Authors):

      Major and minor issues are discussed below by section.

      INTRODUCTION and AIM - Lines 36-73

      - The authors effectively contextualize the aim of their study by providing comprehensive background information on previous research regarding cell 'reprogramming' into dopaminergic neurons in the SNc. However, the introduction lacks contextualization of TH+ cells and PD. For readers who may not be well-versed in the Parkinson's field, understanding the importance of TH (Tyrosine Hydroxylase) may be challenging, since the term "TH+ cells" is mentioned only once by the end of the introduction (line 71), to then become a key element in the entire study.

      - Providing a brief explanation of the role of Tyrosine Hydroxylase in the synthesis of L-DOPA would facilitate the reader's comprehension of why the presence of TH+ cells following Base Editing treatment is relevant.

      - Further elaboration on the relationship between the downregulation of the general RNA binding protein, PTBP1, and the specific dopaminergic-related readout, TH, would improve coherence and strengthen the linkage between the introductory section and the results.

      We thank the reviewer for the constructive suggestions. In the introduction of the revised manuscript, we describe the meaning and importance of TH in the context of dopamine synthesis and PD. Likewise, we briefly outlined the importance of the PTBP1/nPTBP regulatory loops during neuronal differentiation and maturation. 

      RESULTS 

      Result Section 1 - Line 75-109

      - Thorough screening of sgRNAs targeting splice junctions across the Ptbp1 gene in HEPA cells, shows the achievement of high levels of editing (80-90%) with sgRNA-ex3 and sgRNA-ex7.

      - The data also indicates that editing translates into significant reductions in ptbp1 expression, along with an increase in the expression of genes repressed by PTBP1.

      - Despite obtaining lower percentages of editing events in N2a neuroblastoma cells and the C8-D1A astroglial cell line, the differential expression levels of ptbp1 and the readout genes remain significant. However, the gRNA screening assay is performed in immortalized, dividing cells. 

      - Providing proof that Adenosine Base Editing of Ptbp1 is successful in non-dividing cells (such as SNc and/or striatal primary neurons) would strengthen the case for the potential therapy in the intended cell type.

      Following the reviewer’s comment, we show in vivo base editing rates in the SNc and striatum of treated PD mice in the revised manuscript (Figure 2 – figure supplement 5; figure 3 – supplement 2).

      - Moreover, assessing the expression levels of tyrosine hydroxylase by qPCR after Ptbp1 base editing in vitro could help contextualize the use of TH+ detection as an in vivo readout and may help explain why the total number of TH+ cells is low after ABE treatment in vivo - as shown in following sections.

      In the revised manuscript, we now provide quantifications of in vivo base editing efficiencies in the SNc (~15%) and striatum (~20%). As expected from these lower in vivo base editing rates, downregulation of Ptbp1 at the transcript and protein level was less pronounced compared to our in vitro experiments. It seems likely that higher base editing efficiency and more pronounced downregulation of Ptbp1 could lead to a larger population of TH expressing cells. We have added these results and interpretations to the revised manuscript.

      - Furthermore, although ABEs are less prone to generating bystander and other nucleotide changes compared to CBEs, it is still possible. Figures 1 (line 811) and 1-supplement 2 (line 842) only show a brief window of the Sanger sequencing trace. Updating these figures to display a wider view of the sequencing trace would enhance transparency. If unwanted edits are detected, while they may not significantly alter the relevance, impact, or structure of the paper, they may become an important aspect of the discussion. 

      Indeed, ABEs can induce bystander edits and we also detected such edits at the Ptbp1 target site. However, since our base editing strategy was designed to yield a loss of Ptbp1 function, bystander editing at the splice site was not a primary focus in our analysis. Nevertheless, we included CRISPResso output images showing the specific editing outcomes in a wider analysis window in the revised manuscript (Figure 3 – figure supplement 2). 

      Result Section 2 - Lines 110-159

      A split intein system is used in vivo with sgRNA-ex3, after updating the promoter to make it cell-specific: hSyn to restrict expression to neurons and GFAP to restrict expression to astrocytes. 

      However, no other assay is performed to assess whether a) the promoter change and/or b) splitting Cas9 may affect the editing efficiency compared to their initial in vitro approach.

      In the revised manuscript, we assessed the performance of the in vivo AAV vectors encoding the split intein ABE with sgRNA-ex3 in vitro in N2a and C8-D1A cells. Our results show that all vectors are functional and result in base editing at the target locus.

      -  Addressing whether this is the case may explain the low number of TH+ cells observed in vivo. 

      - The authors could also consider staining for Cas9 to address whether the low number of TH+ cells could be attributed to a poor Cas9 delivery.

      To confirm successful in vivo base editor delivery, we quantified in vivo base editing efficiencies in the SNc and striatum of PD mice. Our analysis revealed in vivo base editing at both tissue sites, confirming that base editors were successfully delivered. Editing efficiencies were, however, substantially lower (Figure 2 – figure supplement 5; figure 3 – supplement 2) than in our in vitro cell line setting (Figure 1; figure 1 – figure supplement 2). Even though tissue editing rates likely underestimate the cell type-specific editing rates in astrocytes or neurons, higher base editing rates would have likely resulted in a higher number of TH positive cells. We have added these results and their implications to the revised manuscript.

      -  Moreover, despite the presence of TH, in Figure 2 E,F authors examine the striatal innervation from newly generated TH+ cells in the SNc by Fluorescence Intensity (FI) to conclude that the edited cells do not form projections towards the striatum. Considering the low levels of TH+ positive cells obtained, the accumulation of gross FI might not be the most accurate way to assess the presence or absence of cell projections.

      - Using another marker that stains the projections rather than the cell soma, and that is a marker of dopaminergic neurons, might be a better way to address this.

      To address the reviewer’s comment, we analyzed the presence of potential dopaminergic fibers in the mfb, where projections are more concentrated (around the injection coordinates of 6-OHDA), using the dopaminergic marker DAT. In line with our previous observations in the striatum, we did not detect an increase in DAT fluorescence intensity upon treatment on the lesioned hemisphere (Figure 2 – figure supplement 4).  

      Result Section 3 - Line 160-182

      Minor issue

      - The same dual split intein system is used in the striatum. However, in Figure 3 - Figure Supplement 1 - line 958 and in Figure 3 - Figure Supplement 4 - line 1000, the authors show the injection of 2x the viral genomes indicated throughout the manuscript. In previous experiments in the SNc, 2x10^8 vg/animal was used, whereas this figure shows 4x10^8 vg/animal injected in the striatum.

      - The authors should clarify if the vg injected in the striatum was different from what they previously indicated.

      Compared to injection in the SNc, the volume of vector injected in the striatum was doubled since the region is significantly larger. We clarified that the injected vector genomes were different between striatum and SNc in the revised manuscript.

      Result Section 4- Line 183-220

      In this section, the authors thoroughly examine the neuronal nature of TH+ cells through NeuN co-staining and iterative immunofluorescence imaging (4i). BrdU experiments are conducted to determine the origin of these cells, leading to the conclusion that TH+ cells derive from nondividing cells and express the neuronal marker DAT, characteristic of dopamine-producing neurons (DANs). The cell shape of the TH+ cells in the striatum and SNc is also evaluated by measuring their Feret's diameter and their cell surface. The authors conclude there's heterogeneity in the TH+ cell population due to the presence of TH+/NeuN- cells as well as differences in cell shape.

      However, their explanation of this heterogeneity is solely attributed to differences in the microenvironment and lacks further elaboration. Similarly, their observation that almost half the number of TH+ striatal cells after treatment express CTIP2 (Line 213 and Figure 4B), a marker for GABAergic medium spiny neurons, which they state as "interesting" (line 213) is not developed further. Delving deeper into these topics could strengthen the discussion.

      In the revised manuscript, we provided a more in-depth discussion of the 4i imaging results and potential therapeutic implications. Additionally, we suggest follow-up experiments to analyze the identity, function, and molecular mechanisms underlying the expression of TH upon PTBP1 downregulation in future studies. 

      Result Section 5- Line 221-243

      Two drug-free and two drug-induced behavioral tests are conducted in control and treated animals to evaluate the restoration of motor functions following treatment. Consistent with their previous findings, only the treatment targeted to neurons resulted in the restoration of motor functions in drug-free behavioral tests. The rationale behind each test and its evaluation is clearly explained.

      DISCUSSION 

      - In the discussion section, the authors effectively re-examine their results contextualizing their data with previous studies in the field. However, it would be helpful at this point in the manuscript to reconsider the use of the term 'cell reprogramming,' as this study does not involve actual cell reprogramming. The concept "reprograming" entails the process of transforming adult cells into a stem cell-like state, to then differentiate them into a different cell type. As proven in section 4 by a BrdU proliferation assay, the targeted cells are differentiated neurons. Considering BrdU is administered 5 days after ABE treatment, if true cell reprogramming was taking place, there should be evidence of BrdU incorporation. Cell reprogramming or reprograming is mentioned 4 times in the manuscript (line 34, line 54, line 265, line 277). Therefore, using another terminology would be more accurate.

      Following the reviewer’s suggestion, we removed the term “cell reprograming” from the manuscript and rather describe it as induction of TH expression in endogenous neurons.

      - As noted in the comments of section 4, a more thorough discussion about the various possibilities for heterogeneity would enhance the manuscript's contribution to the PD field.

      In the revised manuscript, we provided a more in-depth discussion of the 4i imaging results and potential therapeutic implications. 

      - Despite observing low numbers of TH+ cells, no significant rescue of drug-induced behaviors, and low levels of released dopamine, the authors merely state that these results make the therapy non-viable, but there is no further exploration or discussion. Whether the limitations lie in the ABE strategy itself, such as its efficiency in targeting and editing of differentiated neurons; or if the issues lie on the injection and delivery, is never discussed. A deeper argumentation on the possible underlying reasons for these challenges would greatly enhance the manuscript and contribute to the advancement of ABE therapies in the brain.

      We believe that the efficacy of our base editing approach could be significantly enhanced by optimizing the delivery. Currently, we are using a dual AAV approach to deliver intein-split ABEs. Since this approach relies on the delivery of higher AAV doses to achieve cotransduction of a cell by two different AAVs, the efficiency could be significantly enhanced by using smaller Cas9 orthologues that can be delivered as a single AAV. Furthermore, in this study we performed a single injection into the dorsal striatum to deliver ABE-expressing AAVs. Performing multiple injections into the rostral, medial, and caudal regions of the striatum might allow us to transduce more cells and induce TH expression in a larger population of striatal neurons. We have included these points in the revised manuscript.

      - While drug-induced behaviors are not recovered, the data demonstrates a rescue of spontaneous behaviors. Further discussion on the potential differences in circuitry underlying these variations in behavioral rescue would also enrich the manuscript's discussion.

      In the revised manuscript, we provide suggestions for potential mechanisms involved in the rescue of spontaneous behavior vs. absence of rescue of drug-induced behaviors. 

      FIGURES AND FIGURE SUPPLEMENTS

      General minor issue - low magnification images in the following figures, make it difficult to visualize positive cells in tissue sections: Figure 2; Figure 2- supplement 1; Figure 2 - supplement 3, Figure 3- supplement 1. Adding a higher magnification imaging of positive cells in tissue sections of SNc and striatum might help with the visualization. 

      As suggested by the reviewer, we included higher magnification images in the corresponding figures to improve interpretation of our results.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      Drawbacks: -While the population-specific approach is a strength, it also limits the direct applicability of findings to other populations.

      We thank the Reviewer for highlighting this important question. While we acknowledge the mentioned limitation, we would like to emphasize the benefits of adopting a population-specific approach, especially given that human gut microbiome diversity remains underexplored in many populations worldwide. By studying the microbiome of the Estonian population, we contribute to the broader global collection of gut microbial species, helping to address this gap.

      Moreover, new microbial species and strains identified in the Estonian population may be relevant for populations with similar environmental and lifestyle factors, such as the Finnish, Baltic, and Nordic populations. These findings can enhance understanding of regionally relevant microbiome characteristics and may serve as a useful reference for studies in these related populations. As more population-based microbiome research is published, it will build a valuable resource for cross-population comparative studies, shedding light on global microbiome diversity and its implications for health.

      Lastly, as part of the Estonian Biobank, our primary objective is to advance personalized medicine for the Estonian population. This requires a highly accurate reference for our specific population. We believe our approach not only benefits Estonian healthcare but also provides insights and methodologies that other population biobanks may find valuable as they embark on similar paths toward personalized medicine.

      -The study primarily focuses on taxonomic composition at the genus or species level, but a more in-depth functional analysis of the novel species could provide additional insights.

      We thank the Reviewer for this valuable addition. Functional analysis plays a crucial role in understanding the mechanisms that link the microbiome to human health, making it essential. This becomes even more critical when studying newly discovered species. However, before embarking on functional analysis, we believe it is important to emphasize that, while high-quality metagenome-assembled genomes (MAGs) provide valuable insights, they do not fully represent the genomic completeness and accuracy of genomes reconstructed from pure bacterial cultures. Acknowledging this distinction was one of the reasons we decided not to include functional analysis in the original article. With these considerations in mind, we analyzed the strain structure of four known species of the Butyricimonas genus. While the primary interest lies in species associated with diseases, the disease-associated species currently lack a sufficient number of high-quality MAGs. To gain deeper insights, we prioritized including new species within the analyzed genus, so that we could compare the new species with well-defined strains of known species and obtain a more comprehensive understanding. Among the 758 different genera present in our MAG collection, we selected the Butyricimonas genus for the following reasons: (1) it is a well-described genus of gut bacteria, represented by 300 high-quality MAGs in our dataset, (2) it contains four known species along with two newly identified species clusters, and (3) the newly discovered species were shown to be prevalent in the human gut microbiome, being detected in more than 50% of samples through mapping.

      The following section was integrated in the new paragraph “Genome level analysis of species of interest” on page 6 in the revised version of the manuscript:

      “Species-level association studies can help identify candidates for genome-level analysis by exploring strain structure and functional differences. However, such analyses require a large number of high-quality MAGs from the same species, which is only feasible within large cohorts with deep sequencing data. While we currently need more samples to obtain sufficient MAGs for the new disease-associated species, we perform an analysis with the Butyricimonas genus species as an example. We show that the assembled MAGs of Butyricimonas species such as B. faecihominis, B. virosa, B. paravirosa and B. faecalis make up different strains (Figure 4a, Figure 4b, Supplementary results, Supplementary Table S5). After selecting a representative for each strain, we conducted a pan-genome analysis of species and strain-representative MAGs, including the two new species. The analysis revealed unique gene clusters consistently present in the new species but absent in all other analyzed species and strains (Figure 4c, Supplementary results, Supplementary Table S6).

      Figure 4. Strain-level structure of the Butyricimonas genus and comparative functional analysis of new species and strains of known species. a. The strain structure of known Butyricimonas species assembled in the Estonian population: B. paravirosa, B. faecalis, B. virosa, and B. faecihominis (based on ANI comparison). b. Butyricimonas genus structure. Comparisons include all known species from the Butyricimonas genus (species assembled in the Estonian population and publicly available species) and all 4 newly assembled MAGs belonging to the new species. Publicly available Butyricimonas species (B. synergistica, "Candidatus B. faecavium", "Candidatus B. hominis", "Candidatus B. phoceensis", and "Candidatus B. vaginalis") are each represented by a single genome of the type strain (the strain defining the species according to ISCP). Species assembled from our data are represented by both the type strain and all strain-representative MAGs. ANI values below 95% (indicating that the MAGs belong to different species) are not colored; ANI values of 95–100% are colored in 1% steps. c. Pan-genome analysis of the Butyricimonas genus. The analysis included the same genomes and MAGs as the analysis of the Butyricimonas genus structure and showed a core gene set as well as species-specific gene sets. The two new species clusters (highlighted in green) also exhibit unique species-specific gene sets.

      We have also added Supplementary Results to our paper, providing a more detailed description of the strain structure analysis of Butyricimonas species and the functional analysis of both known and new species. We chose not to include this in the main text to avoid shifting the focus of the paper.

      Supplementary results

      Butyricimonas genus species strain-level and functional analysis

      Beyond taxonomic characterisation, it is crucial to understand the functional differences of newly detected species, as this insight is key to fully understanding the mechanisms that link the microbiome to human health. Reconstructing MAGs from a large cohort provides multiple genomes of the same species, particularly for prevalent species. During our study, we assembled MAGs from 758 different genera, including 358 genera with more than 10 extracted MAGs. Conducting a detailed strain-level and functional analysis of all these genera requires substantial effort. Therefore, we conducted an in-depth strain-level and functional analysis using the genus Butyricimonas as an example. This genus was chosen for the following reasons: (1) it is a well-characterized genus of gut bacteria, represented by 300 high-quality MAGs in our dataset, (2) it includes four known species and two newly identified species clusters, and (3) the newly discovered species have been shown to be prevalent in the human gut microbiome.

      *Known Butyricimonas species exhibit a clear strain-level structure based on pairwise ANI comparisons (ANI > 99.0), as calculated using ANIclustermap [19] (Figure 4a). From a total of 300 high-quality MAGs selected for strain and functional analysis within the Butyricimonas genus, the species Butyricimonas paravirosa is represented by 23 MAGs and forms 5 distinct strain clusters. While one big cluster (cluster_id: B30) includes 7 highly similar genomes with ANI values close to 100%, other clusters (B31, B32, B34) exhibit more genomic diversity, with genomes showing ANI values between 99.0% and 99.6%. The final cluster (B33) contains a single MAG, suggesting unique genomic variation. Butyricimonas faecihominis is represented by 65 MAGs and forms 8 distinct strain clusters, exhibiting high genome similarity within each cluster. Butyricimonas virosa is represented by 67 MAGs and forms 14 distinct strain clusters. These strain clusters can be divided into two strain cluster groups, with low similarity between the groups (ANI values between strain cluster groups range from 95.0% to 96%, approaching the species boundary). Within each group, the strain clusters also exhibit genomic diversity, indicating a substantial level of variation even within closely related strains. Finally, Butyricimonas faecalis has the highest number of MAGs (141 MAGs) and shows a clean picture of 5 strain clusters with high similarity within each strain cluster (Figure SR1).*

      Figure SR1. The strain structure of known Butyricimonas species assembled in the Estonian population - B. paravirosa, B. faecalis, B. virosa, and B. faecihominis (ANI index comparison histogram).
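The thresholds described above (ANI > 99% for strain clusters, about 95% for the species boundary) can be illustrated with a minimal sketch. This is not the ANIclustermap workflow used in the study; it is a simple single-linkage grouping, and the input format, the function name `cluster_by_ani`, the use of `>=` at the cutoff, and all ANI values are hypothetical choices made for illustration only.

```python
from itertools import combinations


def cluster_by_ani(genomes, ani, threshold=99.0):
    """Single-linkage grouping: two genomes are joined when ANI >= threshold.

    `ani` maps frozenset({a, b}) to an ANI percentage (hypothetical format).
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Walk up to the cluster root, shortening the path on the way.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        if ani.get(frozenset((a, b)), 0.0) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


# Toy values for illustration; not taken from the study.
toy_genomes = ["mag1", "mag2", "mag3", "mag4"]
toy_ani = {
    frozenset(("mag1", "mag2")): 99.8,
    frozenset(("mag1", "mag3")): 99.2,
    frozenset(("mag2", "mag3")): 98.9,
    frozenset(("mag1", "mag4")): 96.0,
    frozenset(("mag2", "mag4")): 95.8,
    frozenset(("mag3", "mag4")): 95.5,
}
strain_clusters = cluster_by_ani(toy_genomes, toy_ani, threshold=99.0)
species_clusters = cluster_by_ani(toy_genomes, toy_ani, threshold=95.0)
```

With single linkage, `mag2` and `mag3` end up in the same strain cluster through `mag1` even though their direct ANI is below the cutoff; a stricter linkage rule would split them, which is one reason borderline values near a threshold need careful interpretation.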

      In addition to the four known species, we assembled two new species within the Butyricimonas genus. The first new species cluster (id: Bn1) is represented by a single MAG (H0366_Butyricimonas_undS), which serves as the representative genome for this species. The second new species cluster (id: Bn2) comprises three MAGs, with H1068_Butyricimonas_undS designated as the representative genome, selected using dRep. To determine the placement of these new species within the genus, we conducted pairwise genome comparisons based on the Average Nucleotide Identity (ANI) index between the MAGs of the new species and other species within the Butyricimonas genus. For the known species identified in our population, we selected representative genomes for each strain. These comparisons were made between all new species MAGs, strain-level representative MAGs of the four known species, and type strain genomes (the strain that defines the species according to ISCP) from other species of the Butyricimonas genus that were not present in our cohort, such as Butyricimonas synergistica, "Candidatus Butyricimonas faecavium", "Candidatus Butyricimonas hominis", "Candidatus Butyricimonas phoceensis", and "Candidatus Butyricimonas vaginalis" (Figure 4b). The MAGs from the second new species cluster (Bn2) form a distinct and cohesive group, showing a closer relationship to Butyricimonas paravirosa and Butyricimonas faecihominis. In contrast, the first new species (Bn1), represented by a single MAG, is positioned closer to Butyricimonas virosa. Interestingly, while the ANI index between the type strain of Butyricimonas virosa and the Bn1 MAG is less than 95%, certain strains of B. virosa (e.g., strains 3, 6, 7, 9, 10, and 12) show ANI values slightly above 95%, which technically classifies them as the same species.

      To explore functional differences between the new species clusters and other known species, we performed a pangenomic analysis using the analysis and visualization platform for ‘omics data (Anvi’o) workflow for microbial pangenomics [20]. As the first new species cluster (id: Bn1) is represented by a single MAG, it is challenging to draw definitive conclusions, even though this MAG contains unique genes not found in any other analyzed genome. The other new species cluster (id: Bn2), consisting of three MAGs, provides clearer insights. All three MAGs within this new species cluster share 183 unique genes that are consistently present across the species cluster but absent in all other analyzed species and strains (Figure 4c). The majority of these genes (142 genes, 73.96%) have unknown functions. Among the genes with defined functions, the functions are distributed across various COG categories (Suppl. Table S5, Suppl. Figure SR2), with the top three categories being “Cell wall/membrane/envelope biogenesis”, “General function prediction only”, and “Posttranslational modification, protein turnover, and chaperones”.

      Figure SR2. COG categories for 183 unique genes that are consistently present across the new species MAGs from Butyricimonas genus (cluster id:Bn2) but absent in all other analyzed species and strains.
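The notion of genes "consistently present across the species cluster but absent in all other analyzed species and strains" amounts to a set operation, sketched below under stated assumptions. This is not the Anvi'o pangenomics workflow itself; the function name `cluster_specific_genes`, the genome labels, and the gene-cluster identifiers are all hypothetical.

```python
def cluster_specific_genes(gene_sets, target_mags):
    """Gene clusters found in every target MAG and in no other genome.

    `gene_sets` maps a genome name to the set of gene-cluster IDs it carries
    (hypothetical input; a real pangenome tool would produce this table).
    """
    targets = [gene_sets[name] for name in target_mags]
    others = [genes for name, genes in gene_sets.items() if name not in target_mags]
    shared = set.intersection(*targets)   # present in all target MAGs
    elsewhere = set().union(*others)      # present in any other genome
    return shared - elsewhere


# Toy presence data for illustration; not taken from the study.
toy_gene_sets = {
    "bn2_a": {"g1", "g2", "g3", "g9"},
    "bn2_b": {"g1", "g2", "g3"},
    "bn2_c": {"g1", "g2", "g4"},
    "known_x": {"g2", "g5"},
    "known_y": {"g4", "g6"},
}
unique_to_bn2 = cluster_specific_genes(toy_gene_sets, ["bn2_a", "bn2_b", "bn2_c"])
```

For a cluster represented by a single MAG, the intersection step is trivial, so every gene not seen elsewhere counts as unique; that is consistent with the caution expressed above about drawing conclusions from one genome.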

      Undoubtedly, further research is needed to understand the role of newly identified species in the human microbiome and to determine whether strain-level differences influence bacterial interactions with the gut and their overall impact. However, our current analysis has already significantly expanded our knowledge of the diversity within this genus. It has added two new species to the ten previously described and revealed the strain structure of known species within the Estonian population.

      -Is it possible for this large dataset to distill information and have plots for strain diversity of abundant and prevalent species, including low abundance species per donor or between donors? Can authors add such a plot or discuss this?

      We thank the Reviewer for this insightful question. Strain-level analysis holds significant potential and is one of the key reasons to use the genome assembly approach, rather than relying on microbiome community profiling using existing human gut species databases. To demonstrate how this can be applied in large datasets like ours, we focused on the same Butyricimonas genus selected for functional analysis. We believe that combining both strain-level and functional analyses provides a more comprehensive understanding when used together.

      The following section has been incorporated into a new paragraph, “Genome-Level Analysis of Species of Interest,” on page 6 of the revised manuscript, and in-depth analysis has been included in the Supplementary Results. As this section has already been cited in a previous response (due to its logical connection with the functional analysis of the new species), we will not cite it again here. Please refer to the previous answer for further details.

      -While associations between microbes and diseases were found, the study design cannot establish causal relationships. Are the authors planning to test some of the associations experimentally and see whether these observations work in vitro or in vivo?

      We agree that elaboration of causal relationships is crucial. However, this was beyond the scope of the current study, which is intended as a foundational step for future investigations. Nevertheless, the samples are stored in the Estonian Biobank in a way that allows culturomic studies and follow-up experiments, as done by Krigul et al. [1].

      Krigul KL, Feeney RH, Wongkuna S, Aasmets O, Holmberg SM, Andreson R, Puértolas-Balint F, Pantiukh K, Sootak L, Org T, Tenson T, Org E, Schroeder BO. A history of repeated antibiotic usage leads to microbiota-dependent mucus defects. Gut Microbes. 2024 Jan-Dec;16(1):2377570. doi: 10.1080/19490976.2024.2377570.

      Minor comments:

      • The authors could provide more context on how their findings compare to similar studies in other populations. What are the differences and similarities, and how does this work at the next level and set new directions?

      We thank the Reviewer for this suggestion. We provided a summary of other population cohorts in the Introduction (Lines 79–90). Since MAG recovery from large cohorts is a relatively new approach, there are limited opportunities for direct comparison. However, we did note a decreasing number of newly recovered species in our study compared to previous studies (Lines 274–290).

      • Figures' quality and readability can be improved easily; all of them are low resolution, and the axes are hardly visible, particularly Figure 2, which could benefit from additional labeling or explanations in the legend to improve clarity.

      We apologize for the quality issues with the figures. We completely revised Figure 2 to improve clarity and placed a new higher-resolution version of Figure 2 to improve readability, ensuring that axes and details are clearly visible.

      Summary of performed changes: (1) We introduced a new Figure 2a to showcase the phylogenetic diversity of the recovered species and highlight the position of the newly assembled species identified for the first time in this study. (2) We updated Figure 2b. The initial figure presented a single line; to enhance the visualization and emphasize the trend, we now plot five lines, each obtained by altering the order of the samples. Since the order of the samples is not significant, this modification gives a clearer representation of the overall trend of accumulation of new species. (3) We added a new Figure 2c to address the question about the range of diversity of detected species. (4) We moved the original Figures 2a and 2d to the Supplementary Figures to enhance clarity and relevance (Figure S4 and Figure S6, respectively).
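The idea of plotting several accumulation lines by altering the sample order can be sketched as follows. This is only an illustration of the general technique, assuming each sample is reduced to a set of species identifiers; the function names, the fixed seed, and the toy data are hypothetical and do not reproduce the analysis behind Figure 2b.

```python
import random


def accumulation_curve(samples, order):
    """Cumulative number of distinct species as samples are added in `order`."""
    seen = set()
    curve = []
    for idx in order:
        seen.update(samples[idx])
        curve.append(len(seen))
    return curve


def shuffled_curves(samples, n_curves=5, seed=0):
    """Build `n_curves` accumulation curves, each from a random sample order."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    curves = []
    for _ in range(n_curves):
        order = list(range(len(samples)))
        rng.shuffle(order)
        curves.append(accumulation_curve(samples, order))
    return curves


# Toy samples for illustration; not taken from the study.
toy_samples = [{"sp1", "sp2"}, {"sp2"}, {"sp3"}, {"sp1", "sp4"}]
curves = shuffled_curves(toy_samples)
```

Every curve ends at the same total regardless of order; only the path differs, which is why overlaying several orderings shows the trend more clearly than a single line.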

      “Figure 2. Overview of species from the EstMB MAG collection. a. Phylogenetic tree of the Estonian species representative MAGs. The inner circle displays a phylogenetic tree of species cluster representative MAGs, with branches colored according to their assigned phylum in the Genome Taxonomy Database (GTDB) (see color text). The surrounding ring highlights MAGs that represent novel species assembled in the current study, using the same colors as in the inner circle to indicate the phylum to which each new species belongs (see color text). b. The relationship between the number of samples analyzed and the cumulative number of new species identified. c. Distribution of number of species detected by mapping per sample “species hits” (yellow color violinplot) and number of recovered MAGs per sample (blue color violinplot) from Estonian representative MAGs number. d. Number of recovered species (blue color dots) and species detected by mapping the reads against the EstMB MAG collection (yellow color dots) for each sample. Samples are sorted from those with the highest to the lowest number of recovered MAGs. e. The prevalence and number of recovered MAGs per species. The top 10 species with the highest number of recovered MAGs are shown. Blue bars represent the number of samples where MAGs of the species were recovered, while gray bars show the species prevalence in EstMB. f. The prevalence and number of recovered MAGs per new species. The top 10 new species with the highest number of recovered MAGs are shown. Green bars represent the number of samples where MAGs of the new species were recovered, while gray bars show the new species prevalence.”

      -A brief discussion on the potential clinical implications of the new species-disease associations would enhance the relevance. Why is discovering new species interesting and relevant for the microbiome field? Can the authors add this somewhere, for example in the discussion?

      We thank the Reviewer for this suggestion. As such, the following section was integrated in the Discussion on page 8 in the revised version of the manuscript:

      “Reconstruction of new species and new strains is critical for many aspects of personalized medicine. We can identify three primary applications of the microbiome in personalized medicine: disease risk assessment and prevention, disease diagnosis, and disease treatment. The latter includes approaches such as microbial supplementation, suppression, or metabolite modulation [Karina Ratiner, 2024]. Both disease prevention and diagnosis rely on identifying bacterial biomarkers associated with prevalent or incident disease cases. In our study, an average of 4% of reads belonged to the newly identified species, with a maximum of 34.76%, demonstrating that excluding these species would lead to a significant loss of community diversity. This omission could potentially exclude biomarkers critical for disease prediction and diagnosis. Notably, one-third of the associations between bacterial species and diseases in our analysis involved the newly identified species, further emphasizing their potential importance as biomarkers. For disease treatment, it is crucial to understand the complete microbial diversity to distinguish between beneficial and harmful species. Equally important is knowing the genomic structure of species and strains to develop effective strategies for microbiome modulation. Without genome assembly, we are limited to assumptions based on previously described genomes of related bacteria. However, given the substantial genomic diversity within species, such assumptions may be highly inaccurate, underscoring the importance of genome assembly in advancing microbiome-based interventions.”

      • In lines 265-266, the authors discuss detected species per sample, on average, 389 species. Can the authors indicate which plot is linked to it and whether it is possible to show the distribution and median number of species per sample to get an overall idea about the range of diversity this type of analysis can capture now? Maybe this will improve in the future; it is worth mentioning here.

      We thank the Reviewer for highlighting the need for clarification. The original Figure 2c displayed the number of species detected through mapping (species hits) and the number of assembled MAGs for each individual sample. To provide a broader characterization of the distribution, we calculated the minimum, mean, median, and maximum values across all samples. As such, the new Figure 2c and the following section were integrated in the paragraph “Estimation of species prevalence using population-specific reference” on page 5 in the revised version of the manuscript:

      “Distribution of the number of species detected by mapping per sample exhibits a wide range of values, with a maximum of 842 and a minimum of 7, while the mean and median are 399 and 405, respectively. The distribution of numbers of recovered MAGs per sample shows a narrower range, with a maximum of 155 and a minimum of 1, alongside a mean of 45 and a median of 41 (Figure 2c).”

      Figure 2c. *Distribution of number of species detected by mapping per sample “species hits” (yellow color violinplot) and number of recovered MAGs per sample (blue color violinplot).*
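The summary described above (minimum, mean, median, and maximum of a per-sample count) can be computed with the standard library, as in this minimal sketch. The function name `summarize` and the toy counts are hypothetical and chosen for illustration; they are not the study's data.

```python
import statistics


def summarize(counts):
    """Minimum, maximum, mean, and median of per-sample counts."""
    return {
        "min": min(counts),
        "max": max(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
    }


# Toy per-sample counts for illustration; not taken from the study.
toy_counts = [3, 10, 12, 20, 55]
summary = summarize(toy_counts)
```

Reporting the mean and median together is useful for skewed distributions such as this toy example, where one large value pulls the mean well above the median.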

      Other comments:

      -The key conclusions are generally convincing. The authors have successfully assembled a large number of MAGs from the Estonian population, identified potentially novel species, and established associations between microbial abundance and diseases.

      We appreciate the Reviewer's positive feedback on our findings. We are pleased that the significance of our MAG assembly, novel species identification, and disease associations is well-received.

      -The data presented appear to support the claims well. However, the authors should emphasize and clarify that the disease associations are correlational, not causal, and further validation is required.

      We agree that this is an important point to emphasize. We revised the manuscript to clarify that the disease associations are correlational and emphasize the need for further validation by adding the following section in Discussion on page 8 in the revised version of the manuscript:

      “While association does not imply causation, analyzing the association between bacterial species and diseases is a crucial first step in identifying potential biomarkers. This can be followed by meta-analyses across different cohorts and laboratory experiments to validate and confirm the observed effects.”

      -Even though I am not an expert in metagenomics analysis, the current experimental design and analysis are sound to support the main claims.

      We thank the Reviewer for recognizing the robustness of our experimental design and analysis.

      -The methods section can be improved by providing more details about how samples were collected and stored and how long after storage gDNA was extracted and processed for sequencing, allowing for reproducibility. The authors provide information on the bioinformatics pipelines, including software versions and parameters, but this can again be improved by adding details about the steps between sample processing and raw data processing.

      We thank the Reviewer for this suggestion and we agree that this is important information. All these details were thoroughly described in our previous paper, which focuses on our cohort description (Aasmets, O., Krigul, K.L., Lüll, K., Metspalu, A., and Org, E. (2022). Gut metagenome associations with extensive digital health data in a volunteer-based Estonian microbiome cohort. Nat. Commun. 13, 869. https://doi.org/10.1038/s41467-022-28464-9).

      However, to improve accessibility of this information, the following paragraph was integrated in the Methods on page 17 in the revised version of the manuscript:

      “Microbiome sample collection and DNA extraction

      The participants collected a fresh stool sample immediately after defecation with a sterile Pasteur pipette and placed it inside a polypropylene conical 15 mL tube. The participants were instructed to time their sample collection as close as possible to the visiting time in the study centre. The samples were stored at −80 °C until DNA extraction. The median time between sampling and arrival at the freezer in the core facility was 3 h 25 min (mean 4 h 34 min) and the transport time wasn’t significantly associated with alpha (Spearman correlation, p-value 0.949 for observed richness and 0.464 for Shannon index) nor beta diversity (p-value 0.061, R-squared 0.0005). Microbial DNA extraction was performed after all samples were collected using a QIAamp DNA Stool Mini Kit (Qiagen, Germany). For the extraction, approximately 200 mg of stool was used as a starting material for the DNA extraction kit, according to the manufacturer’s instructions. DNA was quantified from all samples using a Qubit 2.0 Fluorometer with a dsDNA Assay Kit (Thermo Fisher Scientific).”

      -The study includes a large cohort (1,878 samples), which provides statistical power. The statistical analyses, including linear regression models adjusted for BMI, gender, and age, seem appropriate for the type of data presented. I suggest adding a separate paragraph about how the data is processed and statistically analyzed.

      Authors should include:

      • Appropriateness of the statistical tests used for the data types and experimental designs

      • Adequate description and justification of the statistical models and test and assumptions

      • Proper handling of replicates, controls, and data normalization

      • Reporting of effect sizes, sample size, confidence intervals, and statistical power

      • Data processing and analysis workflows.

      We thank the Reviewer for this recommendation. To highlight the statistical analysis carried out, we have created a separate paragraph for statistical analysis in the Methods section (lines 617-628). We note that data processing and normalization were already described. Because this study is exploratory, power calculations are not applicable, but its results can serve as input for the power calculations of future studies that test statistical hypotheses. However, we agree that reporting the sample sizes for each phenotype and the beta estimates would support our results. We have now added them to Table 1.
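
      To illustrate what a covariate-adjusted linear model reports as a beta estimate, the following is a minimal sketch in Python. It is not the analysis code used in the study: the choice of species abundance as the outcome and disease status as the predictor, the toy data, and the helper names `ols` and `solve_linear_system` are all assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients for y ~ intercept + columns of rows."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data: disease status (0/1), BMI, gender (0/1), age.
covariates = [
    (0, 22.0, 0, 30), (1, 27.5, 1, 45), (0, 24.1, 1, 52), (1, 31.0, 0, 61),
    (0, 19.8, 0, 28), (1, 26.3, 1, 39), (0, 29.4, 0, 70), (1, 23.7, 1, 33),
]
# Synthetic outcome built from known coefficients, so the fit recovers them.
abundance = [2.0 + 0.5 * d + 0.1 * bmi - 0.3 * g + 0.02 * age
             for d, bmi, g, age in covariates]

beta = ols(covariates, abundance)
disease_beta = beta[1]  # effect of disease status, adjusted for BMI, gender, age
```

      In this sketch `disease_beta` plays the role of the per-phenotype beta estimate; a real analysis would also report its standard error and the sample size behind it.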

      Reviewer #1 (Significance (Required)):


      -This study represents an advance in the context of population-specific studies. Creating a comprehensive Estonian population-specific MAG reference and identifying new species contribute to our understanding of microbiome diversity.

      -The work builds upon previous large-scale microbiome projects, such as those that established the Unified Human Gastrointestinal Genome (UHGG) collection but focuses on a specific population.

      -The associations between microbial species (including novel ones) and common diseases provide potential avenues for future research into microbiome-based diagnostics or therapeutics.

      -The findings would interest microbiome researchers, bioinformaticians, and clinicians interested in the role of the gut microbiome in health and disease.

      We thank the Reviewer for the thoughtful feedback and recognition of our study's contributions to microbiome research. By creating an Estonian population-specific MAG reference and identifying new species, we advance population-specific studies and enhance global microbiome diversity. Building on projects like UHGG, we integrate local data into the global context and highlight potential applications in microbiome-based diagnostics and therapeutics. To address your suggestions, we expanded the results section with an example from the Butyricimonas genus. We hope our publicly available data will support future research and further advance understanding of the gut microbiome in health and disease.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):


      The manuscript by Pantiukh et al. presents the collection of MAGs assembled from the Estonian Biobank, with a specific focus on the novel species clusters the authors defined and found associations with some of the diseases as collected among the samples available in their biobank. The manuscript is well organized. However, it lacks a bit in terms of novelty and also some statements that can mislead the readers to overinterpret some parts.

      Majors

      • The last paragraph of the introduction (lines 91-98) anticipates some results but lacks some methodological details. Please consider whether to move it to the results section or add very brief specifications, like (1) "sequence with deep coverage" is vague, how deep is deep? (2) "84,762 MAGs representing 2,257 species" are the 84k MAGs already quality-controlled? (3) "353 MAGs (15,6%) of the EstMB MAGs collection to represent potentially novel species." 353 are MAGs or species? As species clusters are defined later at 95% ANI, are all these 353 defining their own species clusters?

      We thank the Reviewer for insightful questions and suggestions. To address these points, we have added the following clarifications to the text:

      We specified the sequencing depth by providing the average number of reads per sample, 56 million (Line 92). We clarified that, among the 84,762 assembled MAGs, 42,049 (49.60%) were high-quality (HQ) MAGs (Lines 93-94). We revised the statement about the 353 MAGs, explicitly noting that they represent potentially novel species. Additionally, we clarified that all 2,257 representative MAGs, including these 353 new-species MAGs, represent separate species clusters based on the 95% ANI threshold mentioned later in the text (Lines 94-98).

      In the paper, we included only the figure showing the quality group distribution for species cluster representative MAGs to avoid potential confusion between two similar figures: one for all assembled MAGs (n=84,762) and another for cluster representative MAGs (n=2,257). However, in response to this query, we have added a new Supplementary Figure S1 that illustrates the quality group distribution for all assembled MAGs to provide a more comprehensive view.

      Figure S1. Quality estimation for the assembled MAGs (n=84,762). High-quality MAGs (HQ) – 42,049; Medium-quality MAGs (MQ) – 26,806; Low-quality MAGs (LQ) – 15,907.

      • lines 109 and 265, "11.73 +/- 3.9 Gb data per sample and 56.13 +/- 19.37 million reads per sample", numbers don't match... 11.73 Gbp is about 78M reads at 150nt read length, plus later the average depth is not 56.13 but 53.04, please double check these numbers

      We apologize for any misunderstanding. The numbers mentioned in the paper refer to the number of reads and the file size of each compressed *.fasta.gz file. This file size does not directly represent the total base pairs (Gb) for the current metagenome. Instead, it reflects the disk space occupied by the compressed sequencing data, including additional information such as sequence headers. We selected this parameter to provide an easy point of comparison with file sizes from other metagenome sequencing datasets, as *.fasta.gz is a commonly used format for storing sequence data. To clarify further, here is an example of the relationship between these parameters for one sample:

      | Sample XX | Value | Meaning | Program |
      |---|---|---|---|
      | Compressed file size | 4.2 GB | Represents disk space occupied by the compressed sequencing data. This applies to forward reads only; for a rough estimation of the disk space for both forward and reverse reads, it should be multiplied by 2 or calculated separately for both files. | du -sh V00HXZ.fq1.gz |
      | The total number of reads | 41,062,933 reads (avg. read len = 147.7 bp) | Represents number of forward reads. This applies to forward reads only; for a rough estimation of both forward and reverse reads, it should be multiplied by 2 or calculated separately for both files. | seqkit stats V00HXZ.fq1.gz -a -T |
      | Total base pairs (Gb) | 6,066,493,002 bp (6.07 Gb) | Represents total base pairs (Gb) for the current sample. This applies to forward reads only; for a rough estimation of both forward and reverse reads, it should be multiplied by 2 or calculated separately for both files. | seqkit stats V00HXZ.fq1.gz -a -T |

      We now realize this may have caused confusion. To address this, we calculated the total base pairs (Gb) for both forward and reverse reads and replaced the Compressed file size value with Total base pairs in the following sentence of the paragraph “Cohort overview and study design” on page 3 of the revised manuscript:

      “The EstMB-deep samples were resequenced at deep coverage, generating an average of 16.49 ± 6.2 Gb of total base pairs per sample, or 56.13 ± 19.37 million paired reads per sample, with an average forward read length of 146.85 bp and an average reverse read length of 147.01 bp.”
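
      The relationship between read count, read length, and total base pairs discussed in this exchange can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the helper name `total_gb` is hypothetical, the inputs are the figures quoted above, and the cohort calculation assumes that "56.13 million paired reads" means 56.13 million read pairs.

```python
def total_gb(n_reads, mean_len_bp):
    """Total sequenced bases in gigabases (1 Gb = 1e9 bp)."""
    return n_reads * mean_len_bp / 1e9


# Reviewer's estimate: number of reads implied by 11.73 Gb at 150 nt per read.
implied_reads_millions = 11.73e9 / 150 / 1e6

# Forward reads of the example sample: read count times average read length.
sample_forward_gb = total_gb(41_062_933, 147.7)

# Revised cohort average: read pairs times (forward + reverse read length).
cohort_gb = total_gb(56.13e6, 146.85 + 147.01)

print(round(implied_reads_millions, 1), round(sample_forward_gb, 3), round(cohort_gb, 2))
```

      Under these assumptions the cohort figure agrees with the reported 16.49 Gb, and the per-sample estimate lands within about 0.002 Gb of the reported 6,066,493,002 bp; the small gap comes from rounding the average read length to 147.7 bp.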

      • line 118, "completeness > 90% and contamination

      We thank the Reviewer for this comment. We used CheckM v2 to evaluate MAG completeness and contamination, and we have incorporated the requested information into the manuscript (Line 128).

      • line 120, "84,762 MAGs were clustered at the species level with an average nucleotide identity (ANI) threshold of 95%.", as for my previous comment, either specify the Methods or quickly mention the tool used for the ANI analysis.

      We used dRep with default parameters for clustering and have incorporated the requested information into the manuscript (Line 130).
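
      To make the idea of species clusters at a 95% ANI threshold concrete, the following is a simplified sketch in Python. It does not reproduce dRep's procedure: it uses plain single-linkage grouping via union-find, and the genome names and ANI values are hypothetical.

```python
def cluster_by_ani(genomes, pairwise_ani, threshold=95.0):
    """Group genomes so that any pair with ANI >= threshold shares a cluster."""
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster root, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), ani in pairwise_ani.items():
        if ani >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise ANI values (percent) between four MAGs.
toy_ani = {("A", "B"): 98.2, ("B", "C"): 96.1, ("A", "C"): 94.0, ("C", "D"): 80.0}
species_clusters = cluster_by_ani(["A", "B", "C", "D"], toy_ani)
```

      Note that single linkage places A and C in the same cluster through B even though their direct ANI is below 95%, which is one reason dedicated tools apply more careful linkage rules.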

      • lines 135-138, "The bacterial species most represented in our MAGs collection were Odoribacter splanchnicus (MAG recovered from 70.93% samples), Barnesiella intestinihominis (62.83%), Parabacteroides distasonis (60,38%), Alistipes putredinis (54,53%) and Agathobacter rectalis (51.92%) (Figure S2, Table S2).", it will be interesting to compare (some of) these species with other populations, to see if these species are globally prevalent in the human gut microbiome or specific to the Estonian population.

      We thank the Reviewer for this question. As highlighted in Figures 4e and 2d, the number of MAGs recovered for a given species often differs significantly from its prevalence in the population. Due to the complexities of MAG assembly, species prevalence is generally much higher, and these values do not correlate linearly, as shown in Supplementary Figure S5. Keeping in mind that the species with the most assembled MAGs are not necessarily the species with the highest prevalence, we compared our top assembled species with the UHGG collection, the most comprehensive up-to-date collection of gut bacteria, and integrated the following section in the paragraph “Population-specific Metagenome-Assembled Genomes (MAGs) reference” on page 4 in the revised version of the manuscript:

      “... All these species are also well-represented in other cohorts. For example, Parabacteroides distasonis, Alistipes putredinis, and Agathobacter rectalis rank among the top 6 species in the UHGG by the number of genomes. Additionally, Barnesiella intestinihominis and Odoribacter splanchnicus rank among the top 40 species out of a total of 4,644 species in the UHGG database.”

      • lines 143-144, "MAGs, 353 MAGs (15,64%) represent a new species according to the GTDB criteria.", these 353 MAGs might define fewer species clusters, I think the 'species' word in this sentence is misleading and can lead to an overinterpretation of the diversity, it will be more correct to report how many species clusters these MAGs defined.

      We apologize for not providing sufficient clarification. In our case each cluster represented a new distinct species. We added clarification in lines 152-153.

      • lines 163-168, the paragraph could be an overinterpretation, as it is unlikely that there is 'infinite' diversity, so it could be that by doubling the samples, there is already a plateau in terms of novel species clusters identified. I think this paragraph should be reconsidered.

      We thank the Reviewer for this question. We have updated Figure 2b. Instead of presenting a single version of the cumulative sum of new species discoveries, we reordered the samples five times to provide a more accurate approximation of new species accumulation as the number of samples increases. Additionally, we integrated the following section in the paragraph “Novel species and comparison of the population-specific reference with global reference UHGG” on page 4 in the revised version of the manuscript:

      “Our analysis so far shows a clear linear trend without indication of a plateau (although we cannot exclude that a plateau has been reached exactly at the current sample size, which may not yet be evident).”

      Figure 4b. The relationship between the number of samples analyzed and the cumulative number of new species identified.
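
      The reordering approach described above can be sketched as follows. This is a minimal illustration in Python, not the code used for the figure: the function names, the fixed random seed, and the toy per-sample species sets are assumptions.

```python
import random


def accumulation_curve(samples, order):
    """Cumulative number of distinct species after each sample, in the given order."""
    seen = set()
    curve = []
    for idx in order:
        seen |= samples[idx]
        curve.append(len(seen))
    return curve


def permuted_curves(samples, n_permutations=5, seed=0):
    """Accumulation curves for several random sample orderings."""
    rng = random.Random(seed)
    curves = []
    for _ in range(n_permutations):
        order = list(range(len(samples)))
        rng.shuffle(order)
        curves.append(accumulation_curve(samples, order))
    return curves


# Toy input: each sample is the set of new species clusters recovered from it.
toy_samples = [{"sp1"}, {"sp1", "sp2"}, set(), {"sp3"}, {"sp2", "sp4"}]
curves = permuted_curves(toy_samples)
```

      Every ordering ends at the same total, so only the shape of the curves differs between permutations; a plateau would appear as the curves flattening well before the last sample.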

      • lines 182-184, "Even species which have been recovered from a large number of samples can be found in significantly more samples after mapping (Figure 2e, Table S2).", this is not novel as assembly requires higher coverage than calling a species present via mapping, please, rephrase this part.

      We thank the Reviewer for this thoughtful suggestion. We included this point in the article not because of its novelty but to emphasize that even a small number of recovered MAGs per species can still hold significant value. Despite a small number of assembled genomes, the prevalence of the same species, as detected through mapping, can still be substantial, which makes it possible to use these MAGs in, for example, association studies. We added this perspective based on our personal experience of initial disappointment with the small number of MAGs recovered for many new species clusters. Our intention is to prevent similar discouragement among other researchers who may begin recovering MAGs from their large population cohorts.

      • lines 185-188, "which are usually extracted from a small number of samples, show a prevalence exceeding 80% for some species. For example, Bacteroides faecalis has a prevalence of 97.23%, although only 1 MAG was assembled, and Bacteroides intestinigallinarum has a prevalence of 95.85% although only 2 MAGs were assembled.", this should be much better contextualized and discussed in terms of relative abundance and not only on the ability to reconstruct (which is highly impacted by coverage, which is a proxy for abundance) with its prevalence, it is known in the field that there are very highly prevalent species at very low abundance values, which are not that often reconstructed via metagenomic assembly.

      We agree that understanding the causes of assembly complications is important in the field, with abundance playing a key role. Moreover, other factors such as the presence of closely related species with similar genomes or multiple strains of the same species within a sample can significantly impact assembly, even for species with high abundance. However, since this paper focuses on the potential applications of MAG assembly in large population cohorts rather than the technical aspects of assembly, our main goal was to emphasize that MAGs assembled from the samples should not be used to estimate species prevalence.

      • Data availability, it appears that the provided accession number does not exist, please double-check this.

      We apologize for this issue. The data are now available under the provided accession number PRJEB76860.

      Minors

      • line 106, "includes 1,308 women (69.64 %) and 570 men (30.35 %)", these sums up to 99.99%, the ratio for women is 1308/1878=0.69648, so can be rounded up to 69.65%.

      We thank the Reviewer for this correction. We corrected the value from 69.64% to 69.65% (Line 114).
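
      The rounding in question can be verified directly. A minimal check in Python, using only the counts quoted by the Reviewer:

```python
women, men = 1308, 570
total = women + men

# Percentages rounded to two decimal places, as reported in the manuscript.
pct_women = round(100 * women / total, 2)
pct_men = round(100 * men / total, 2)

print(total, pct_women, pct_men)
```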

      • line 293, "ones[Philip Hugenholtz, 2008].", citation to fix.

      Thank you for the correction. We corrected the links (Line 414).

      • Fig. 1g, why completeness is up to 25%, from the text it seemed the MAGs were screened for completeness

      We apologize for not providing sufficient clarification. Indeed, as noted in Lines 124-126, *"We successfully reconstructed 84,762 metagenome-assembled genomes (MAGs), an average of 45 MAGs per sample. Among these, 42,048 according to CheckM, MAGs (49.6%) have completeness > 90% and contamination 90% and contamination 50% and contamination (Lines 131-132).

      • Fig. 2f says "Blue bars represent", but I believe it should be green instead of blue.

      Thank you for the correction. We corrected the color (Line 520).

  5. Nov 2024
    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      Urination requires precise coordination between the bladder and external urethral sphincter (EUS), while the neural substrates controlling this coordination remain poorly understood. In this study, Li et al. identify estrogen receptor 1-expressing neurons (ESR1+) in Barrington's nucleus as key regulators that faithfully initiate or suspend urination. Results from peripheral nerve lesions suggest that BarEsr1 neurons play independent roles in controlling bladder contraction and relaxation of the EUS. Finally, the authors performed region-specific retrograde tracing, claiming that distinct populations of BarEsr1 neurons target specific spinal nuclei involved in regulating the bladder and EUS, respectively.

      Strength:

      Overall, the work is of high quality. The authors integrate several cutting-edge technologies and sophisticated, thorough analyses, including opto-tagged single unit recordings, combined optogenetics, and urodynamics, particularly those following distinct peripheral nerve lesions.

      Weakness:

      (1) My major concern is the novelty of this study. Keller et al. 2018 have shown that BarEsr1 neurons are active during urination and play an essential role in relaxing the external urethral sphincter (EUS). Minimally, substantial content that merely confirms previous findings (e.g. Figures 1A-E; Figures 3A-E) should be moved to the supplementary datasets.

      Indeed, we are aware of and have carefully studied the work of Keller et al. Our manuscript presents novel experiments beyond the scope of that paper. In response to this comment, we will substantially revise our manuscript to enhance the visibility of the novel data and move the data that confirm previous findings to the supplementary material.

      (2) I also have concerns regarding the results showing that the inactivation of BarEsr1 neurons led to the cessation of EUS muscle firing (Figures 2G and S5C). As shown in the cartoon illustration of Figure 8, spinal projections of BarEsr1 neurons contact interneurons (presumably inhibitory) that innervate motor neurons, which in turn excite the EUS. I would therefore expect that the inactivation of BarEsr1 should shift the EUS firing pattern from phasic (as relaxation) to tonic (removal of relaxation), rather than stopping their firing entirely. Could the authors comment on this and provide potential reasons or mechanisms for this finding?

      We agree with this point. We meant that the phasic bursting pattern of the EUS stopped rapidly upon BarEsr1 photoinhibition, not that all firing stopped instantaneously. According to previous studies (Chang et al., 2007, de Groat, 2009, de Groat and Yoshimura, 2015, Kadekawa et al., 2016), the voiding physiology of rodents probably differs from that of humans: in rodents, urine is pumped out stepwise in the gaps between multiple consecutive epochs of EUS phasic bursting, whereas in humans urine is expelled continuously once EUS firing is almost fully inhibited for a period of time. Namely, in mice the EUS displays sustained tonic activity following phasic bursting, whereas in humans the EUS fires tonically until the moment of voiding onset (complete inhibition, muscle relaxed). Despite these prominent differences in basic physiological properties, we assume that the logic of the circuits from the brainstem to the urethra in this pathway is evolutionarily conserved across the two species; thus, the logic of brainstem coordination of voiding could also be the same, which is the main interest of our study (using an animal model to address concerns of human health). To make our data accessible to a broader audience, we used a simplified and inaccurate expression. We apologize for the inaccuracy and will correct the description in the revised manuscript.

      (3) Current evidence is insufficient to support the claim that the majority of BarEsr1 neurons innervate the SPN but not DGC. The current spinal images are uninformative, as the fluorescence reflects the distribution of Esr1- or Crh-expressing neurons in the spinal cord, along with descending BarEsr1 or BarCrh axons. Given the close anatomical proximity of these two nuclei, a more thorough histological analysis is required to demonstrate that the spinal injections were accurately confined to either the SPN or the DGC.

      We agree that current evidence is insufficient to support the current claim. To address this concern and strengthen our claim, we will repeat the retrograde viral tracing experiments, combined with CTB647 injections to label the injection site, to validate specific targeting of SPN or DGC populations. We will also add higher-magnification imaging to distinguish BarESR1 axonal projections targeting SPN versus DGC. Results from these ongoing experiments will be incorporated into the revised manuscript.

      Reviewer #2 (Public review):

      Summary:

      The authors have performed a rigorous study to assess the role of ESR1+ neurons in the PMC to control the coordination of bladder and sphincter muscles during urination. This is an important extension of previous work defining the role of these brainstem neurons, and convincingly adds to the understanding of their role as master regulators of urination. This is a thorough, well-done study that clarifies how the Pontine micturition center coordinates different muscle groups for efficient urination, but there are some questions and considerations that remain.

      Strengths:

      These data are thorough and convincing in showing that ESR1+PMC neurons exert coordinated control over both the bladder and sphincter activity, which is essential for efficient urination. The anatomical distinctions in pelvic versus pudendal control are clear, and it's an advance to understand how this coordination occurs. This work offers a clearer picture of how micturition is driven.

      Weaknesses:

      The dynamics of how this population of ESR1+ neurons is engaged in natural urination events remains unclear. Not all ESR1+neurons are always engaged, and it is not measured whether this is simply variation in population activity, or if more neurons are engaged during more intense starting bladder pressures, for instance. In particular, the response dynamics of single and doubly-projecting neurons are not defined. Additionally, the model for how these neurons coordinate with CRH+ neuron activity in the PMC is not addressed, although these cell types seem to be engaged at the same time. Lastly, it would be interesting to know how sensory input can likely modulate the activity of these neurons, but this is perhaps a future direction.

      In response to the reviewer’s comments, we will attempt to perform the following revisions for this round:

      (1) Engagement of ESR1+ neurons in natural urination events:

      We agree that probably not all ESR1+ neurons are consistently engaged during urination. To address this, we will perform a detailed analysis of the opto-tagged single unit recordings data.

      (2) Response dynamics of single- and doubly-projecting neurons:

      (a) We will use retrograde labelling combined with Ca2+ photometry recordings to differentiate the response dynamics of SPN- and DGC-projecting neurons during urination.

      (b) We will perform functional validations to assess the specific roles of single- and doubly-projecting neurons in coordinating bladder and EUS activity.

      (3) Coordination with CRH+ neurons in the PMC:

      We appreciate the suggestion to include CRH+ neurons in our model. We will expand our model to incorporate CRH+ neurons and their potential interactions with ESR1+ neurons.

      (4) Sensory modulation of ESR1+ neurons:

      The reviewer raises an excellent point regarding sensory input modulation of ESR1+ neuron activity. Although this is beyond the scope of our current study, we recognize its importance and propose to include this as a future direction.

      Reviewer #3 (Public review):

      Summary:

      The paper by Li et al explored the role of Estrogen receptor 1 (Esr1) expressing neurons in the pontine micturition center (PMC), a brainstem region also known as Barrington's nucleus (Hou et al 2016, Keller et al 2018). First, the authors conducted bulk Ca2+ imaging/unit recording from PMCESR1 to investigate the correlations of PMCESR1 neural activity with voiding behavior in conscious mice and with bladder pressure/external urethral muscle activity in urethane-anesthetized mice. Next, the authors conducted optogenetic inactivation/activation of PMCESR1 to confirm its contribution to voiding behavior, and also conducted peripheral nerve transection together with optogenetic activation to confirm the independent control of bladder pressure and urethral sphincter muscle.

      Weaknesses:

      (1) The study demonstrates that pelvic nerve transection reduces urinary volume triggered by PMCESR1+ cell photoactivation in freely moving mice. Could the role of pudendal nerve transection also be examined in awake mice to provide a more comprehensive understanding of neural involvement?

      Thank you for the suggestion. Pudendal nerve transection in awake mice is indeed a challenging experiment that we had not performed. We will attempt it for the revision.

      (2) While the paper primarily focuses on PMCESR1+ cells in bladder-sphincter coordination, the analysis of PMCESR1+-DGC/SPN neural circuits - given their distinct anatomical projections in the sacral spinal cord - feels underexplored. How do these circuits influence bladder and sphincter function when activated or inhibited? Also, do you have any tracing data to confirm whether bladder-sphincter innervation comes from distinct spinal nuclei?

      Thank you for this great comment. The projection-specific analysis of neuronal function, as also suggested by Reviewer 2 in a similar comment (#8), is missing from our first submission. These experiments are so challenging that we did not complete them in the first round of tests, but we have decided to pursue this goal again. Namely, we will perform photometry recordings of PMC neurons projecting to the DGC/SPN while measuring bladder pressure and urethral sphincter EMG activity. Additionally, while our study does not include direct tracing data to confirm distinct spinal nuclei for bladder and sphincter innervation, this has been well documented in the classic literature (Yao et al., 2018, Karnup and De Groat, 2020, Karnup, 2021). Specifically, anatomical studies have shown that the SPN primarily innervates the bladder, while the DGC is associated with the innervation of the urethral sphincter. We will cite these references to provide context and support for our interpretations.

      (3) Although the paper successfully identifies the physiological role of PMCESR1+ cells in bladder-sphincter coordination, the study falls short in examining the electrophysiological properties of PMCESR1+-DGC/SPN cells. A deeper investigation here would strengthen the findings.

      While our study primarily focuses on the functional role of PMCESR1+ neurons in bladder-sphincter coordination, we acknowledge that understanding their intrinsic electrophysiological characteristics could further strengthen our findings. However, this aspect falls beyond the scope of the current study. Nevertheless, we recognize the significance of this direction and are excited to pursue it in future research. We appreciate the reviewer’s suggestion, as it highlights an important avenue for expanding upon our current findings.

      (4) The parameters for photoactivation (blue light pulses delivered at 25 Hz for 15 ms, every 30 s) and photoinhibition (pulses at 50 Hz for 20 ms) vary. What drove the selection of these specific parameters? Moreover, for photoactivation experiments, the change in pressure (ΔP = P5 sec - P0 sec) is calculated differently from photoinhibition (Δpressure = Ppeak - Pmin). Can you clarify the reasoning behind these differing approaches?

      We sincerely thank the reviewer for raising these important points and for the opportunity to clarify our experimental design and data analysis methods.

      Photoactivation versus photoinhibition parameters: The differences in photoactivation (25 Hz, 15 ms pulses) and photoinhibition (50 Hz, 20 ms pulses) protocols are based on the distinct physiological and technical requirements for activating versus inhibiting PMCESR1+ neurons. For photoactivation, 25 Hz stimulation aligns with the natural firing patterns of central neurons, allowing for intermittent activation without exceeding the neuronal refractory period. The shorter pulse duration (15 ms) minimizes phototoxicity and avoids overstimulation, as performed in previous studies (Keller et al., 2018). In contrast, photoinhibition requires sustained suppression of neuronal activity, achieved through higher frequencies (50 Hz) and longer pulses (20 ms) to ensure continuous coverage of neuronal activity.
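
      The difference between the two protocols can be expressed as a duty cycle, the fraction of each stimulation period during which the light is on. The following is a minimal sketch in Python; it assumes that the stated durations (15 ms and 20 ms) are pulse widths repeated at the stated frequencies, and the function name `duty_cycle` is hypothetical.

```python
def duty_cycle(frequency_hz, pulse_width_ms):
    """Fraction of each stimulation period during which the light is on."""
    period_ms = 1000.0 / frequency_hz
    return pulse_width_ms / period_ms


activation = duty_cycle(25, 15)  # 15 ms pulse within a 40 ms period
inhibition = duty_cycle(50, 20)  # 20 ms pulse within a 20 ms period

print(activation, inhibition)
```

      Under this reading the photoinhibition pulses fill the whole 20 ms period, so the light is effectively continuous, which is consistent with the stated aim of continuous coverage; the photoactivation protocol leaves the light off for most of each 40 ms period.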

      Calculation of pressure changes (ΔP) for photoactivation and photoinhibition: The differing methods for calculating pressure changes reflect the distinct physiological effects we aimed to capture. In photoactivation experiments (ΔP = P5 sec - P0 sec), the pressures before (P0 sec) and 5 seconds after (P5 sec) light delivery were compared to capture the immediate effect of light activation on bladder pressure, focusing on the onset and early dynamics of activation. In contrast, photoinhibition experiments assessed the immediate impact of light-induced suppression on bladder pressure during an ongoing voiding event. Here, Δpressure was calculated as Ppeak – Pmin to measure the rapid drop in pressure directly attributable to neuronal inhibition.
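
      The two pressure-change measures can be written out as follows. This is a minimal sketch in Python rather than the analysis code of the study: the nearest-sample lookup, the choice of analysis window for photoinhibition, the function names, and the toy pressure trace (arbitrary units) are all assumptions.

```python
def value_at(times_s, pressures, t):
    """Pressure at the recorded sample closest in time to t."""
    i = min(range(len(times_s)), key=lambda k: abs(times_s[k] - t))
    return pressures[i]


def delta_p_activation(times_s, pressures, light_on_s, window_s=5.0):
    """Photoactivation measure: P(5 s after light onset) - P(light onset)."""
    after = value_at(times_s, pressures, light_on_s + window_s)
    before = value_at(times_s, pressures, light_on_s)
    return after - before


def delta_p_inhibition(times_s, pressures, start_s, end_s):
    """Photoinhibition measure: Ppeak - Pmin within the analysis window."""
    segment = [p for t, p in zip(times_s, pressures) if start_s <= t <= end_s]
    return max(segment) - min(segment)


# Toy bladder-pressure trace sampled once per second.
times = list(range(11))
trace = [5, 5, 6, 9, 14, 20, 22, 21, 12, 8, 7]

dp_act = delta_p_activation(times, trace, light_on_s=1)           # rise after onset
dp_inh = delta_p_inhibition(times, trace, start_s=6, end_s=10)    # drop during voiding
```

      The first measure compares two fixed time points and therefore captures the early rise, whereas the second takes the extremes within a window and therefore captures the size of the drop regardless of exactly when it occurs.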

      We will expand these details in the methods section of the revised manuscript to provide greater transparency.

      (5) The discussion could further emphasize how PMCESR1+ cells coordinate bladder contraction and sphincter relaxation to control urination, highlighting their central role in the initiation and suspension of this process.

      We fully agree with this point. Additionally, in response to your and other reviewers’ suggestions, we are preparing a new round of experiments with projection-specific recording, and thus our discussion and conclusion will also be updated according to the newly obtained data.

      (6) In Figure 8, the authors analyze the temporal sequence of bladder pressure and EUS bursting during natural voiding and PMC activation-induced voiding. It would be acceptable to consider the existence of a lower spinal reflex circuit; however, the interpretation of the data contains speculation. It is hard to say that bladder pressure measurement reflects efferent pelvic nerve activity in real time. (As a biological system, bladder contraction is mediated by smooth muscle, and does not reflect real-time efferent pelvic nerve activity. As an experimental set-up, bladder pressure measurement has some delay in reflecting bladder pressure because of tubing, but EUS bursting has no delay.) Especially for the inactivation experiment, these factors would contribute to the interpretation of data. This reviewer recommends a rewrite of the section considering these limitations. Most of the section is suitable for the results.

      Thank you for mentioning the possibility of a delay in the bladder pressure measurement. We would prefer to perform a physical control test to quantify how much delay this measurement has under our experimental conditions. We will use a small balloon to mimic the bladder and two identical pressure sensors, one with a very short tube inserted into the balloon and one with an extended tube identical to that used in our animal experiments. We will then mimic both contraction initiation and halting, and quantify the delay between the two sensors.
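
      One possible way to quantify the delay between the two sensors is to locate the peak of their cross-correlation. The following is a minimal sketch under stated assumptions, not the planned analysis: the function name `estimate_delay`, the mean subtraction, and the synthetic test signal are hypothetical, and both traces are assumed to be sampled simultaneously at the same rate.

```python
import numpy as np

def estimate_delay(reference, delayed, fs):
    """Estimate how far `delayed` lags `reference`, in seconds.

    reference: trace from the short-tube sensor (assumed near-instantaneous)
    delayed:   trace from the long-tube sensor
    fs:        sampling rate in Hz (same for both traces)
    """
    ref = np.asarray(reference, dtype=float)
    dly = np.asarray(delayed, dtype=float)
    ref = ref - ref.mean()  # remove the baseline offset
    dly = dly - dly.mean()
    # c[k] = sum_n dly[n + k] * ref[n]; the peak lag is the estimated delay.
    xcorr = np.correlate(dly, ref, mode="full")
    lags = np.arange(-(len(ref) - 1), len(dly))
    return lags[np.argmax(xcorr)] / fs

# Synthetic check: a pressure pulse and a copy shifted by 30 samples.
fs = 1000.0
t = np.arange(0.0, 2.0, 1.0 / fs)
pulse = np.exp(-((t - 0.5) / 0.02) ** 2)
shifted = np.roll(pulse, 30)
delay_s = estimate_delay(pulse, shifted, fs)
```

      Cross-correlation yields a single average lag for the whole trace. If contraction initiation and halting are expected to show different delays, the same function could be applied separately to short segments around each event.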

      References

      • Chang HY, Cheng CL, Chen JJJ, de Groat WC. 2007. Serotonergic drugs and spinal cord transections indicate that different spinal circuits are involved in external urethral sphincter activity in rats. American Journal of Physiology-Renal Physiology 292: F1044-F1053. DOI: 10.1152/ajprenal.00175.2006

      • de Groat WC. 2009. Integrative control of the lower urinary tract: preclinical perspective. British Journal of Pharmacology 147. DOI: 10.1038/sj.bjp.0706604

      • de Groat WC, Yoshimura N. 2015. Anatomy and physiology of the lower urinary tract. Handb Clin Neurol 130: 61-108. DOI: 10.1016/B978-0-444-63247-0.00005-5

      • Kadekawa K, Yoshimura N, Majima T, Wada N, Shimizu T, Birder LA, Kanai AJ, de Groat WC, Sugaya K, Yoshiyama M. 2016. Characterization of bladder and external urethral activity in mice with or without spinal cord injury—a comparison study with rats. American Journal of Physiology-Regulatory, Integrative and Comparative Physiology 310: R752-R758. DOI: 10.1152/ajpregu.00450.2015

      • Karnup S. 2021. Spinal interneurons of the lower urinary tract circuits. Autonomic Neuroscience 235. DOI: 10.1016/j.autneu.2021.102861

      • Karnup SV, De Groat WC. 2020. Mapping of spinal interneurons involved in regulation of the lower urinary tract in juvenile male rats. IBRO Rep 9: 115-131. DOI: 10.1016/j.ibror.2020.07.002

      • Keller JA, Chen J, Simpson S, Wang EH-J, Lilascharoen V, George O, Lim BK, Stowers L. 2018. Voluntary urination control by brainstem neurons that relax the urethral sphincter. Nature Neuroscience 21: 1229-1238. DOI: 10.1038/s41593-018-0204-3             

      • Yao J, Zhang Q, Liao X, Li Q, Liang S, Li X, Zhang Y, Li X, Wang H, Qin H, Wang M, Li J, Zhang J, He W, Zhang W, Li T, Xu F, Gong H, Jia H, Xu X, Yan J, Chen X. 2018. A corticopontine circuit for initiation of urination. Nature Neuroscience 21: 1541-1550. DOI: 10.1038/s41593-018-0256-4

    1. Add your references here. It is recommended to have them as a list.

      Guillem, A., Gros, A., Reby, K., Abergel, V. and DeLuca, L. (2023). RCC8 for CIDOC CRM: Semantic Modeling of Mereological and Topological Spatial Relations in Notre-Dame de Paris. In A. Bikakis, R. Ferrario, S. Jean, B. Markhoff, A. Mosca and M. Nicolosi Asmundo (eds.), SWODCH’23: International Workshop on Semantic Web and Ontology Design for Cultural Heritage. https://hal.science/hal-04275714

      Rasmussen, M. H., Lefrançois, M., Schneider, G. F. and Pauwels, P. (2021a). BOT: The building topology ontology of the W3C linked building data group. Semantic Web, 12(1), 143‑161. https://doi.org/10.3233/SW-200385

      Rasmussen, M. H., Pauwels, P., Lefrançois, M. and Schneider, G. F. (2021b, June 28). Building Topology Ontology [Draft Community Group Report]. https://w3c-lbd-cg.github.io/bot/

      Renaudie, Z. (2019). Le monde de Feux pâles, l’exposition à l’épreuve de la conservation-restauration, tome I [Master II thesis, École supérieure d’art d’Avignon]. https://www.academia.edu/40627194/Renaudie_Zoë_Le_monde_de_Feux_pâles lexposition_à_l’épreuve_de_la_conservation_restauration_TOME_I

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Qin and colleagues analysed data from the Human Connectome Project on four right-handed subgroups with different gyrification patterns in Heschl's gyrus. Based on these groups, the authors highlight the structure-function relationship of planum temporale asymmetry in lateralised language processing at the group level and next at the individual level. In particular, the authors propose that especially microstructural asymmetries are related to functional auditory language asymmetries in the planum temporale.

      Strengths:

      The study is interesting because of an ongoing and long-standing debate about the relationship between structural and functional brain asymmetries, and in particular whether structural brain asymmetries can be seen as markers of functional language brain lateralisation.

      In this debate, the relationship between Heschl's gyrus asymmetry and planum temporale asymmetry is rarely examined, which makes this analysis valuable. A large sample size and inter-rater reliability support the findings.

      Weaknesses:

      In this case of multiple brain measures, it would be important to provide the reader with some sort of effect size (e.g. Cohen's d) to help interpret the results.

      Thank you for pointing this out. In the revised version, the effect size, i.e., Cohen's d, has been incorporated into the results (page 8, line 159-160; page 9, line 181-186, supplementary page 14, Table S14).
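
      For readers unfamiliar with the measure, Cohen's d for two independent groups can be sketched as below. This is only an illustration: the pooled-standard-deviation form is assumed here because the response does not state which variant was used, and the function name and example values are hypothetical.

```python
import numpy as np

def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled SD.

    Assumption: independent-groups form; a paired design would need a
    different denominator.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)

# Toy values: means differ by 1 and the pooled SD is 2.
d_example = cohens_d([2, 4, 6], [1, 3, 5])
```

      By common convention, absolute values around 0.2, 0.5, and 0.8 are read as small, medium, and large effects, respectively.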

      In addition, the authors highlight the microstructural results in spite of the macrostructural results. However, the macrostructural surface results are also strong. I would suggest either reducing the emphasis on micro vs macrostructural results or adding information to justify the microstructural importance.

      In the original manuscript, we highlighted the results of microstructural measures because the correlations between PT microstructural and functional measures were more pronounced, both within the hemispheres and in terms of asymmetry, than the significant results for surface area. Following your comments, we have now toned down the emphasis on the microstructural results (page 2, line 40; page 14, line 267) and added relevant discussion of the macrostructural results in the revised version (page 18, line 363-370; as copied below):

      “As for macrostructural measures, the asymmetric PT surface area was also associated with speech comprehension AI. Given that the within-hemispheric coupling tendency between surface and speech comprehension existed only in the left PT, it was possible that the larger surface area of the left PT led to less recruitment of its right homolog, and therefore the lateralization of functional activity would be more pronounced. Additionally, opposite tendencies were found for the correlations of speech perception and comprehension with surface area, potentially implying the segregation of different speech processes in the PT area.”

      Recommendations for the authors:

      I have only some comments that I wish to be addressed by the authors:

      (1) Please always specify "structural" or "functional" asymmetry or lateralisation, as the reader may be confused.

      This has been done in relevant places.

      (2) Please state that the scale is not the same between the results in Figure 3.

      This has been specified, as suggested (see below).

      “Notably, we did not standardize these structural measures, so the scales differed between indicators.”

      (3) It may be of interest to the reader to learn more about interpretations of how Heschl's gyrus and planum temporale asymmetries are related.

      Thank you for this comment. Given that the asymmetry of Heschl's gyrus was not analyzed in the present study, we do not have direct data or results to support such an interpretation. We also reviewed the literature but found no relevant results on how Heschl's gyrus and planum temporale asymmetries are related. A specific investigation targeting this topic is needed to address the question. This has now been added to the discussion (page 20, line 415-417).

      (4) As this manuscript builds somewhat on the Science Advances article by Ocklenburg et al. (2018), it would be important to discuss how this more liberal planum temporale definition might (or might not) affect the results compared to the more conservative planum temporale definition described here.

      Yes, the definition of the planum temporale varies across studies. Our current manual definition is relatively more conservative than that of Ocklenburg et al. (2018), in which the planum temporale was automatically derived from the Destrieux atlas. We believe that the definition of the planum temporale likely has a non-trivial impact on the results, and that our current manual definition, which takes HG duplication into account, should be more reliable and accurate and is therefore favored over the alternatives. This has been briefly discussed in the revision (page 15-16, line 300-304).

      (5) I would like the authors to briefly but critically discuss what exactly the MRI NODDI model measures and how this is interpreted as measuring microstructural properties of tissue.

      We have now provided relevant information regarding the NODDI measures (page 26, line 552-558; as copied below).

      “NODDI is a highly effective method for detecting key features of neurite morphology, which employs a tissue model that detects three microstructural environments: the intracellular, extracellular and cerebrospinal fluid compartments (Zhang et al., 2012). In the grey matter of the cerebral cortex, the neurite density index (NDI) is an estimated volume fraction of the intracellular microstructural environment, with higher NDIs indicating greater neurite density (Jespersen et al., 2010; Zhang et al., 2012). The orientation dispersion index (ODI) is a measure of the alignment or dispersion of neurites, with higher ODIs indicating more dispersed neurites and lower ODIs indicating more aligned neurites (Jespersen et al., 2012; Zhang et al., 2012).”

      (6) While not mandatory, I would be interested to read the authors' thoughts on the evolution of such a functional/(micro)structural lateralisation link of the planum temporale, in light of the literature on planum temporale asymmetries in (newborn) non-human primate species.

      Thank you for this inspiring suggestion. We have incorporated relevant discussion into the revised version (page 15, line 281-288; as copied below).

      “Moreover, there exists evolutionary evidence supporting the role of the PT as an anatomical substrate for language lateralization. For example, the leftward structural asymmetry of the PT has been observed in multiple non-human primates, including chimpanzees, macaques, and baboons (Becker et al., 2024; Gannon et al., 1998; Xia et al., 2019). In particular, recent studies on baboons further demonstrated that PT structural leftward asymmetry in newborn baboons could predict future development of communicative gestures, implying a key role of PT structural asymmetry in the lateralized communication system for human and non-human brain evolution (Becker et al., 2024, 2021).”

      Reference

      Becker Y, Phelipon R, Marie D, Bouziane S, Marchetti R, Sein J, Velly L, Renaud L, Cermolacce A, Anton J-L, Nazarian B, Coulon O, Meguerditchian A. 2024. Planum temporale asymmetry in newborn monkeys predicts the future development of gestural communication’s handedness. Nat Commun 15:4791. doi:10.1038/s41467-024-47277-6

      Becker Y, Sein J, Velly L, Giacomino L, Renaud L, Lacoste R, Anton J-L, Nazarian B, Berne C, Meguerditchian A. 2021. Early Left-Planum Temporale Asymmetry in newborn monkeys (Papio anubis): A longitudinal structural MRI study at two stages of development. NeuroImage 227:117575. doi:10.1016/j.neuroimage.2020.117575

      Gannon PJ, Holloway RL, Broadfield DC, Braun AR. 1998. Asymmetry of Chimpanzee Planum Temporale: Humanlike Pattern of Wernicke’s Brain Language Area Homolog. Science 279:220–222. doi:10.1126/science.279.5348.220

      Jespersen SN, Bjarkam CR, Nyengaard JR, Chakravarty MM, Hansen B, Vosegaard T, Østergaard L, Yablonskiy D, Nielsen NChr, Vestergaard-Poulsen P. 2010. Neurite density from magnetic resonance diffusion measurements at ultrahigh field: Comparison with light microscopy and electron microscopy. NeuroImage 49:205–216. doi:10.1016/j.neuroimage.2009.08.053

      Jespersen SN, Leigland LA, Cornea A, Kroenke CD. 2012. Determination of Axonal and Dendritic Orientation Distributions Within the Developing Cerebral Cortex by Diffusion Tensor Imaging. IEEE Trans Med Imaging 31:16–32. doi:10.1109/TMI.2011.2162099

      Xia J, Wang F, Wu Z, Wang L, Zhang C, Shen D, Li G. 2019. Mapping hemispheric asymmetries of the macaque cerebral cortex during early brain development. Hum Brain Mapp. doi:10.1002/hbm.24789

      Zhang H, Schneider T, Wheeler-Kingshott CA, Alexander DC. 2012. NODDI: Practical in vivo neurite orientation dispersion and density imaging of the human brain. NeuroImage 61:1000–1016. doi:10.1016/j.neuroimage.2012.03.072

      Reviewer #2 (Public Review):

      Summary:

      The authors assessed the link between structural and functional lateralization in area PT, one of the brain areas that is known to exhibit strong structural lateralization, and which is known to be implicated in speech processing. Importantly, they included the sulcal configuration of Heschl's gyrus (HG), presenting either as a single or duplicated HG, in their analysis. They found several significant associations between microstructural indices and task-based functional lateralization, some of which depended on the sulcal configuration.

      Strengths:

      A clear strength is the large sample size (n=907), an openly available database, and the fact that HG morphology was manually classified in each individual. This allows for robust statistical testing of the effects across morphological categories, which is not often seen in the literature.

      Weaknesses:

      - Unfortunately, no left-handers were included in the study. It would have been a valuable addition to the literature, to study the effect of handedness on the observed associations, as many previous studies on this topic were not adequately powered. The fact that only right-handers were studied should be pointed out clearly in the introduction or even the abstract.

      Thank you for pointing this out. We have explicitly specified this in the Abstract and Introduction.

      - The tasks to quantify functional lateralization were not specifically designed to pick up lateralization. In the interest of the sample size, it is understandable that the authors used the available HCP-task-battery results, however, it would have been feasible to access another dataset for validation. A targeted subset of results, concerning for example the relationship between sulcal morphology and task-based functional lateralization, could be re-assessed using other open-access fMRI datasets.

      Yes, the fMRI task was not specifically designed to evaluate PT functional lateralization, which has been acknowledged in the discussion (page 17, line 330-342). Given the small effect size observed for our current structural-functional relationship, reproducing similar results with other datasets would require a cohort with a large sample size. This would be quite labor-intensive given our current manual protocol for outlining the PT and HG in every participant. The lack of validation with an independent dataset has been discussed as a limitation in the revised version. We will try to conduct such a validation in future work, likely after developing an automatic pipeline for accurately extracting the PT and HG in the individual space (like the manual outlining protocol).

      - The study is mainly descriptive and the general discussion of the findings in the larger context of brain lateralization comes a bit short. For example, are the observed effects in line with what we know from other 'language-relevant' areas? What could be the putative mechanisms that give rise to functional lateralization based on the microstructural markers observed? And which mechanisms might be underlying the formation of a duplicated HG?

      Thank you for these insightful comments. As suggested, we strengthened the discussion as below:

      “Another possible explanation could be that higher myelin content and larger surface area in the left PT potentially indicate more white matter connections with other language-related regions such as Broca’s area, and therefore the left PT is more involved in language tasks than its right homolog (Allendorfer et al., 2016; Catani et al., 2005; Giampiccolo and Duffau, 2022).

      The distinct roles of left and right PT in speech processing have been well-documented. A number of studies substantiated that PT of the left hemisphere responded more strongly to lexical-semantic and syntactic aspects of sentence processing, whereas the right hemisphere demonstrated a greater involvement in the speech melody (Albouy et al., 2020; Meyer et al., 2002).

      These findings are consistent with those reported for the arcuate fasciculus (AF). The left AF has been identified as a crucial structure for language function (Giampiccolo and Duffau, 2022; Zhang et al., 2021). Disruption to this pathway has been linked to multimodal phonological and semantic deficits (Agosta et al., 2010), while injuries in the right AF did not affect language function (Zeineh et al., 2015).”

      Regarding the mechanism underlying the formation of a duplicated HG, we could not identify a well-supported explanation after a careful literature review. We also feel that this question is beyond the scope of the present study and have therefore not added further discussion on this topic.

      Recommendations for the authors:

      (1) The data availability statement makes no explicit mention of the manual labels of HG configuration. Would the authors consider making available a list of HCP-subject-ID with a morphological group (L1/R1, L1/R2, etc.) for replicability and for re-use by other researchers?

      The list of HCP subject IDs with their morphological group (L1/R1, L1/R2, etc.) is now available in Supplementary Material 2. We have specified this in the revised version.

      (2) It would be helpful to state again the statistical tests associated with the p-value in the figure/table caption, e.g. Table 2.

      As suggested, we now specified the statistical method in the figure/table caption.

      (3) Sometimes, the y-axis labels are missing or not clear, for example in Figure S2.

      Sorry about these. We double-checked all the figures, and corrected the missing or unclear labels for Figure S2 and S3 in the revised version.

      (4) In a few instances the font sizes vary within a figure caption.

      This has been corrected in the revision.

      Reference

      Agosta F, Henry RG, Migliaccio R, Neuhaus J, Miller BL, Dronkers NF, Brambati SM, Filippi M, Ogar JM, Wilson SM, Gorno-Tempini ML. 2010. Language networks in semantic dementia. Brain J Neurol 133:286–299. doi:10.1093/brain/awp233

      Albouy P, Benjamin L, Morillon B, Zatorre RJ. 2020. Distinct sensitivity to spectrotemporal modulation supports brain asymmetry for speech and melody. Science 367:1043–1047. doi:10.1126/science.aaz3468

      Allendorfer JB, Hernando KA, Hossain S, Nenert R, Holland SK, Szaflarski JP. 2016. Arcuate fasciculus asymmetry has a hand in language function but not handedness. Hum Brain Mapp 37:3297–3309. doi:10.1002/hbm.23241

      Catani M, Jones DK, Ffytche DH. 2005. Perisylvian language networks of the human brain. Ann Neurol 57:8–16. doi:10.1002/ana.20319

      Giampiccolo D, Duffau H. 2022. Controversy over the temporal cortical terminations of the left arcuate fasciculus: a reappraisal. Brain J Neurol 145:1242–1256. doi:10.1093/brain/awac057

      Meyer M, Alter K, Friederici AD, Lohmann G, von Cramon DY. 2002. FMRI reveals brain regions mediating slow prosodic modulations in spoken sentences. Hum Brain Mapp 17:73–88. doi:10.1002/hbm.10042

      Zeineh MM, Kang J, Atlas SW, Raman MM, Reiss AL, Norris JL, Valencia I, Montoya JG. 2015. Right arcuate fasciculus abnormality in chronic fatigue syndrome. Radiology 274:517–526. doi:10.1148/radiol.14141079

      Zhang H, Schneider T, Wheeler-Kingshott CA, Alexander DC. 2012. NODDI: Practical in vivo neurite orientation dispersion and density imaging of the human brain. NeuroImage 61:1000–1016. doi:10.1016/j.neuroimage.2012.03.072

      Zhang J, Zhong S, Zhou L, Yu Yamei, Tan X, Wu M, Sun P, Zhang W, Li J, Cheng R, Wu Y, Yu Yanmei, Ye X, Luo B. 2021. Correlations between Dual-Pathway White Matter Alterations and Language Impairment in Patients with Aphasia: A Systematic Review and Meta-analysis. Neuropsychol Rev 31:402–418. doi:10.1007/s11065-021-09482-8

      Reviewing Editor:

      I encourage the authors to incorporate the suggestions of the reviewers, such as:

      (1) to provide more in-depth interpretations about how and why structural and functional lateralization relate,

      Done.

      (2) to provide statistical effect sizes,

      Done.

      (3) to make their sulcal-morphology classification openly available,

      Done.

      (4) to provide statistical effect sizes,

      Done.

      (5) to discuss the possible impact of diverging PT definitions with regard to previous studies,

      Done.

      (6) to provide more in-depth interpretations about how and why structural and functional lateralization relate.

      Done.

      Detailed comments:

      In an impressive cohort of 907 human participants, the present paper presents a very interesting set of data on PT asymmetries not only at the macro-structural but also at the microstructural levels in order to investigate their potential correlates with PT functional asymmetry in relation to perceptual acoustic language tasks.

      I believe this is a key paper for the following reasons:

      (1) it provides critical data and results for addressing a controversial but important question: the relevance of measures of anatomical asymmetry for inferring its language-related functional hemispheric specialization;

      (2) to do so, the authors made a very impressive effort to manually trace the anatomical delineation of the planum temporale at different levels in every participant, the best (but crazy time-consuming) approach so far to document interindividual variability of the PT and to address such a question;

      (3) the contribution is particularly relevant regarding the statistical power of the study, the study and measures having been done in 907 participants!

      (4) I also found the study well designed and well written with great relevance of the findings for the field.

      As for the results, the authors reported measures of microstructural asymmetry (including intracortical myelin content, neurite density, and neurite orientation) as well as macrostructural asymmetries in relation to functional lateralization for language.

      Comments:

      I have only 2 additional minor comments of my own:

      (1) In agreement with reviewer 2, I don't understand why the authors seem to downplay the links they found between gross PT asymmetry and functional lateralization. I recommend that the authors highlight and discuss this important result, just as they do the microstructural PT asymmetries and their functional links.

      This has been done (page 18, line 363-370).

      (2) PT structural asymmetry (both micro & macro) has been well documented in nonhuman primates (and their functional link with manual lateralization for gestural communication). Without detailing this literature, I recommend the authors at least mention this literature as a comparative perspective in the introduction and/or discussion in order to make the question of PT asymmetry less anthropocentric.

      This has been done (page 15, line 281-288).

    1. Author response:

      The following is the authors’ response to the original reviews.

      This study presents a valuable finding on sperm flagellum and HTCA stabilization. The evidence supporting the authors' claims is incomplete. The work will be of broad interest to cell and reproductive biologists working on cilium and sperm biology.

      We thank the Editor and the two reviewers for their time and thorough evaluation of our manuscript. We greatly appreciate their valuable guidance on improving our study. In the revised manuscript, we have conducted additional experiments and provided quantitative data in response to the reviewers' comments. Furthermore, we have refined the manuscript and added further context to elucidate the significance of our findings for the readers.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this paper, Wu et al. investigated the physiological roles of CCDC113 in sperm flagellum and HTCA stabilization by using CRISPR/Cas knockout mouse models, co-IP, and single sperm imaging. They find that CCDC113 localizes in the linker region among radial spokes (RS), the nexin-dynein regulatory complex (N-DRC), and doublet microtubules (DMTs), and interacts with axoneme-associated proteins CFAP57 and CFAP91, acting as an adaptor protein that facilitates the linkage between RS, N-DRC, and DMTs within the sperm axoneme. They show that the disruption of CCDC113 produced spermatozoa with disorganized sperm flagella, and that CFAP91 and DRC2 could not colocalize with DMTs in Ccdc113-/- spermatozoa. Interestingly, the data also indicate that CCDC113 could localize to the HTCA region and interact with HTCA-associated proteins. The knockout of Ccdc113 could also produce acephalic spermatozoa. By using Sun5 and Centlein knockout mouse models, the authors further find that SUN5 and CENTLEIN are indispensable for the docking of CCDC113 to the implantation site on the sperm head. Overall, the experiments were designed properly and performed well to support the authors' observations in each part. Furthermore, the study's findings offer valuable insights into the physiological and developmental roles of CCDC113 in the male germ line, which can provide insight into impaired sperm development and male infertility. The conclusions of this paper are mostly well supported by data, but some points need to be clarified and discussed.

      We thank Reviewer #1 for his or her critical reading and the positive assessment.

      (1) In Figure 1, a sperm flagellum protein, which is far away from CCDC113, should be selected as a negative control to exclude artificial effects in co-IP experiments.

      We greatly appreciate Reviewer #1’s insightful suggestion. In response, we selected two sperm outer dense fiber proteins, ODF1 and ODF2, which are located distant from the sperm axoneme, as negative controls in the co-IP experiments. As shown in Figure 1- figure supplement 1A and B, neither ODF1 nor ODF2 bound to CCDC113, indicating the interaction observed in Figure 1 is not an artifact.

      (2) Whether the detachment of sperm head and tail in Ccdc113-/- mice is a secondary effect of the sperm flagellum defects? The author should discuss this point.

      Good question. Considering that CCDC113 is localized in the sperm neck region and interacts with SUN5 and CENTLEIN, it may play a direct role in connecting the sperm head and tail. Indeed, PAS staining revealed that Ccdc113–/– sperm heads exhibit abnormal orientation in stages V–VIII of the seminiferous epithelia (Figure 6C-D). Furthermore, transmission electron microscopy (TEM) analysis indicated that the absence of CCDC113 caused detachment of the damaged coupling apparatus from the sperm head in step 9–11 spermatids (Figure 6E). These results suggest that the detachment of the sperm head and tail in Ccdc113–/– mice may not be a secondary effect of sperm flagellum defects. We have discussed this point further below:

      “CCDC113 can interact with SUN5 and CENTLEIN, but not PMFBP1 (Figure 7A-C), and is left on the tip of the decapitated tail in Sun5–/– and Centlein–/– spermatozoa (Figure 7K and L). Furthermore, CCDC113 colocalizes with SUN5 in the HTCA region, and immunofluorescence staining in spermatozoa shows that SUN5 is positioned closer to the sperm nucleus than CCDC113 (Figure 7G and H). Therefore, SUN5 and CENTLEIN may be closer to the sperm nucleus than CCDC113. PAS staining revealed that Ccdc113–/– sperm heads are abnormally oriented in stages V–VIII seminiferous epithelia (Figure 6C and D), and TEM analysis further demonstrated that the disruption of CCDC113 causes the detachment of the destroyed coupling apparatus from the sperm head in step 9–11 spermatids (Figure 6E). All these results suggest that the detachment of sperm head and tail in Ccdc113–/– mice may not be a secondary effect of sperm flagellum defects.”

      (3) Given that some cytoplasm materials could be observed in Ccdc113-/- spermatozoa (Fig. 5A), whether CCDC113 is also essential for cytoplasmic removal?

      Good question. Unremoved cytoplasm could be detected in spermatozoa by using transmission electron microscopy (TEM) analysis, including disrupted mitochondria, damaged axonemes, and large vacuoles. These observations indicate defects in cytoplasmic removal in Ccdc113–/– mice. We have discussed this point as below:

      “Moreover, TEM analysis detected excess residual cytoplasm in spermatozoa, including disrupted mitochondria, damaged axonemes, and large vacuoles, indicating defects in cytoplasmic removal in Ccdc113–/– mice (Figure 5A).”

      (4) Although CCDC113 could not bind to PMFBP1, the localization of CCDC113 in Pmfbp1-/- spermatozoa should be also detected to clarify the relationship between CCDC113 and SUN5-CENTLEIN-PMFBP1.

      We appreciate Reviewer #1’s suggestion. We have analyzed the localization of CCDC113 in Pmfbp1-/- spermatozoa and found that CCDC113 was located at the tip of the decapitated tail in Pmfbp1-/- spermatozoa (Figure 7K and L). This finding has been incorporated into the revised manuscript as below:

      “To further elucidate the functional relationships among CCDC113, SUN5, CENTLEIN, and PMFBP1 at the sperm HTCA, we examined the localization of CCDC113 in Sun5-/-, Centlein–/–, and Pmfbp1–/– spermatozoa. Compared to the control group, CCDC113 was predominantly localized on the decapitated flagellum in Sun5-/-, Centlein–/–, and Pmfbp1–/– spermatozoa (Figure 7K and L), indicating SUN5, CENTLEIN, and PMFBP1 are crucial for the proper docking of CCDC113 to the implantation site on the sperm head. Taken together, these data demonstrate that CCDC113 cooperates with SUN5 and CENTLEIN to stabilize the sperm HTCA and anchor the sperm head to the tail.”

      Reviewer #2 (Public Review):

      Summary:

      In the present study, the authors select the coiled-coil protein CCDC113 and reveal its expression in the stages of spermatogenesis in the testis as well as in the different steps of spermiogenesis, with expression also mapped in the different parts of the epididymis. Gene deletion led to male infertility in CRISPR-Cas9 KO mice, and PAS staining showed defects mapped in the different stages of the seminiferous cycle and through the different steps of spermiogenesis. EM and IF with several markers of testis germ cells and spermatozoa in the epididymis indicated defects in flagella and head-to-tail coupling for flagella as well as acephaly. The authors' co-IP experiments of expressed CCDC113 in HEK293T cells indicated an association with CFAP91 and DRC2 as well as SUN5 and CENTLEIN.

      The authors propose that CCDC113 connects CFAP91 and DRC2 to doublet microtubules of the axoneme and that CCDC113's association with SUN5 and CENTLEIN stabilizes the sperm flagellum head-to-tail coupling apparatus. Extensive experiments mapping CCDC113 during postnatal development are reported as well as negative co-IP experiments and studies with SUN5 KO mice as well as CENTLEIN KO mice.

      Strengths:

      The authors provide compelling observations to indicate the relevance of CCDC113 to flagellum formation with potential protein partners. The data are relevant to sperm flagella formation and its coupling to the sperm head.

      We are grateful to Reviewer #2 for his or her recognition of the strength of this study.

      Weaknesses:

      The authors' observations are consistent with the model proposed but the authors' conclusions for the mechanism may require direct demonstration in sperm flagella. The Walton et al paper shows human CCDC96/113 in cilia of human respiratory epithelia. An application of such methodology to the proteins indicated by Wu et al for the sperm axoneme and head-tail coupling apparatus is eagerly awaited as a follow-up study.

      We thank Reviewer 2 for his or her kind help in improving the manuscript. We now understand that directly determining the precise localization of CCDC113 in the sperm axoneme and head-tail coupling apparatus (HTCA) using cryo-electron microscopy (cryo-EM) could powerfully strengthen our model. Recent advances in cryo-EM have indeed advanced our understanding of axonemal structures and determined the structures of native axonemal DMTs from mouse, bovine, and human sperm (Leung et al., 2023; Zhou et al., 2023). However, high-resolution structures of the sperm axoneme and HTCA regions, including those involving CCDC113, have yet to be fully characterized. Thus, we now discuss this point and consider it a valuable direction for future research.

      “Given that the cryo-EM of sperm axoneme and HTCA could powerfully strengthen the role of CCDC113 in stabilizing sperm axoneme and head-tail coupling apparatus, it is a valuable direction for future research.”

      References:

      Bazan, R., Schröfel, A., Joachimiak, E., Poprzeczko, M., Pigino, G., & Wloga, D. (2021). Ccdc113/Ccdc96 complex, a novel regulator of ciliary beating that connects radial spoke 3 to dynein g and the nexin link. PLoS Genet, 17(3), e1009388.

      Ghanaeian, A., Majhi, S., McCafferty, C. L., Nami, B., Black, C. S., Yang, S. K., Legal, T., Papoulas, O., Janowska, M., Valente-Paterno, M., Marcotte, E. M., Wloga, D., & Bui, K. H. (2023). Integrated modeling of the Nexin-dynein regulatory complex reveals its regulatory mechanism. Nat Commun, 14(1), 5741.

      Leung, M. R., Zeng, J., Wang, X., Roelofs, M. C., Huang, W., Zenezini Chiozzi, R., Hevler, J. F., Heck, A. J. R., Dutcher, S. K., Brown, A., Zhang, R., & Zeev-Ben-Mordehai, T. (2023). Structural specializations of the sperm tail. Cell, 186(13), 2880-2896.e2817.

      Walton, T., Gui, M., Velkova, S., Fassad, M. R., Hirst, R. A., Haarman, E., O'Callaghan, C., Bottier, M., Burgoyne, T., Mitchison, H. M., & Brown, A. (2023). Axonemal structures reveal mechanoregulatory and disease mechanisms. Nature, 618(7965), 625-633.

      Zhou, L., Liu, H., Liu, S., Yang, X., Dong, Y., Pan, Y., Xiao, Z., Zheng, B., Sun, Y., Huang, P., Zhang, X., Hu, J., Sun, R., Feng, S., Zhu, Y., Liu, M., Gui, M., & Wu, J. (2023). Structures of sperm flagellar doublet microtubules expand the genetic spectrum of male infertility. Cell, 186(13), 2897-2910.e2819.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Please provide full gel for the Figure 2C experiment (could be as a supplementary file).

      Thanks for your insightful suggestions. We have replaced Figure 2C and provided the full gel in Figure 2-figure supplement 1A.

      (2) The authors write on Line 163 "In contrast, the flagellum staining appeared reduced in Ccdc113-/- seminiferous tubules (Fig. 2J, red asterisk)." However, the magnification of the pictures is not sufficient to distinguish anything in the panel mentioned, please provide others.

      Many thanks for pointing this out. We have provided a representative figure showing the flagellar defect in seminiferous tubules.

      (3) Please add statistical p-values for figures.

      Thanks for your valuable advice. We have added statistical p-values to the figures in the revised manuscript.

      (4) Line 128: Should "speculate" be "speculated"?

      Thank you for pointing out this problem. We have corrected it in the revised manuscript, as shown below:

      “Given that CFAP91 has been reported to stabilize RS on the DMTs (Bicka et al., 2022; Dymek et al., 2011; Gui et al., 2021) and cryo-EM analysis shows that CCDC113 is close to DMTs, we speculated that CCDC113 may connect RS to DMTs by binding to CFAP91 and microtubules.”

      (5) In lines 384-385, an extra "-" has been typed.

      Thank you for pointing out this problem. We have corrected it in the revised manuscript, as shown below:

      “Furthermore, CCDC113 colocalizes with SUN5 in the HTCA region, and immunofluorescence staining in spermatozoa shows that SUN5 is closer to the sperm nucleus than CCDC113 (Figure 7G and H). Therefore, SUN5 and CENTLEIN may be closer to the sperm nucleus than CCDC113.”

      (6) In general, the article has many typos and should be professionally proofread.

      Many thanks for pointing this out. We have thoroughly revised the manuscript with the assistance of professional proofreading.

      Reviewer #2 (Recommendations For The Authors):

      Can the authors indicate in the Materials and Methods if n=3 biological replicates were done for all co-IP, EM, LM, and IF studies? The statistical analysis section indicates this but quantification is missing for most figures including co-IP, most IF, PAS staining, EM, etc.

      We thank Reviewer 2 for the insightful comments and guidance to improve our data quality. All the experiments in this study were repeated at least three times to ensure reproducibility. We have quantified the co-IP experiments in Figures 1C-H and 7A-F, the IF data in Figures 2K, 5C, and 5D, as well as the PAS staining in Figure 6C. Since electron microscopy samples require very little testicular tissue and the sections obtained are very thin, the likelihood of capturing sections specifically at the sperm head-tail junction is low. This makes it difficult to perform quantitative analysis and statistical evaluation in the TEM experiment. To address this limitation, we have quantified the percentage of Ccdc113-/- sperm heads with abnormal orientation in stages V–VIII of the seminiferous epithelium to indicate impaired head-to-tail anchorage.

      Figure S2 is compelling and might be indicated as a major figure instead of a supplementary figure.

      We appreciate the positive comment. We have included it as a major figure in Figure 3F.

      Figure 4A may be incomplete. Data sets for RNA expression suggest high expression in the ovary and other organs in males and females including the brain and are not indicated by the authors. Figure 4A may be considered for removal with a more complete study for another paper.

      Thank you for pointing out this issue. We reviewed RNA expression data from various tissues using RNA-Seq data from Mouse ENCODE (https://www.ncbi.nlm.nih.gov/gene/244608) and found that CCDC113 is highly expressed in the testis, but not significantly in the ovary and brain (Figure 4-figure supplement 1A). Additionally, we re-evaluated CCDC113 protein levels in the spleen, lung, kidney, testis, intestine, stomach, brain, and ovary, confirming that it is highly expressed in the testes, with negligible expression in the ovary and brain (Figure 4-figure supplement 1B). In line with Reviewer 2's suggestion, we have removed Figure 4A in the revised manuscript.

      There are grammatical errors throughout the manuscript and Figure 7 is truncated.

      Thank you for pointing out this problem. We have thoroughly revised the manuscript with the assistance of professional proofreading.

      The Introduction and Discussion parts of the paper may need some clarification for the general reader. The material in the "Additional Context " section of the critique below may be a helpful place to introduce what a stage is, and the steps in germ cell development in the testis with the latter of course where and when the flagellum develops.

      We appreciate your valuable suggestions. We have referred to the material in the “Additional Context” section to introduce the stages of spermatogenesis and the steps in germ cell development in the testis in the introduction and results.

      “Male fertility relies on the continuous production of spermatozoa through a complex developmental process known as spermatogenesis. Spermatogenesis involves three primary stages: spermatogonia mitosis, spermatocyte meiosis, and spermiogenesis. During spermiogenesis, spermatids undergo complex differentiation processes to develop into spermatozoa, which include nuclear elongation, chromatin remodeling, acrosome formation, cytoplasm elimination, and flagellum development (Hermo et al., 2010).”

      Hermo, L., Pelletier, R. M., Cyr, D. G., & Smith, C. E. (2010). Surfing the wave, cycle, life history, and genes/proteins expressed by testicular germ cells. Part 1: background to spermatogenesis, spermatogonia, and spermatocytes. Microscopy research and technique, 73(4), 241–278. https://doi.org/10.1002/jemt.20783

      “Pioneering work in the mid-1950s used the PAS stain in histologic sections of mouse testis to visualize glycoproteins of the acrosome and Golgi in seminiferous tubules (Oakberg, 1956). The pioneers discovered in cross-sectioned seminiferous tubules the association of differentiating germ cells with successive layers to define different stages that in mice are twelve, indicated as Roman numerals (XII). For each stage, different associations of maturing germ cells were always the same with early cells in differentiation at the periphery and more mature cells near the lumen. In this way, progressive differentiation from stem cells to mitotic, meiotic, acrosome-forming, and post-acrosome maturing spermatocytes was mapped to define spermatogenesis with the XII stages in mice representing the seminiferous cycle. The maturation process from acrosome-forming cells to mature spermatocytes is defined as spermiogenesis with 16 different steps that are morphologically distinct spermatids (O'Donnell, 2015).”

      Oakberg, E. F. (1956). A description of spermiogenesis in the mouse and its use in analysis of the cycle of the seminiferous epithelium and germ cell renewal. The American journal of anatomy, 99(3), 391-413. https://doi.org/10.1002/aja.1000990303

      O'Donnell L. (2015). Mechanisms of spermiogenesis and spermiation and how they are disturbed. Spermatogenesis, 4(2), e979623. https://doi.org/10.4161/21565562.2014.979623

      For the Discussion, the authors indicate that the function of CCDC113 in mammals is unknown yet the authors point to the work of Walton et al on human respiratory epithelia that points to a function for CCDC96/113. The work in the manuscript here does indicate a role in sperm flagella and the head-to-tail coupling apparatus but remains descriptive until the methodology of Walton et al is applied. Hopefully, the authors will consider it for a follow-up study.

      Thank you for pointing out this problem. We have revised this part and highlighted the work of Walton et al. in the Discussion.

      “CCDC113 is a highly evolutionarily conserved component of motile cilia/flagella. Studies in the model organism Tetrahymena thermophila have revealed that CCDC113 connects RS3 to dynein g and the N-DRC, which plays an essential role in cilia motility (Bazan et al., 2021; Ghanaeian et al., 2023). Recent studies have also identified the localization of CCDC113 within the 96-nm repeat structure of the human respiratory epithelial axoneme, where it localizes to the linker region among RS, N-DRC, and DMTs (Walton et al., 2023). In this study, we reveal that CCDC113 is indispensable for male fertility, as Ccdc113 knockout mice produce spermatozoa with flagellar defects and head-tail linkage detachment (Figure 3D).”

      “Overall, we identified CCDC113 as a structural component of both the flagellar axoneme and the HTCA, where it performs dual roles in stabilizing the sperm axonemal structure and maintaining the structural integrity of HTCA. Given that the cryo-EM of sperm axoneme and HTCA could powerfully strengthen the role of CCDC113 in stabilizing sperm axoneme and head-tail coupling apparatus, it is a valuable direction for future research.”

      The Discussion may be focused on the key aspects of CCDC113 related to sperm flagella and the head-to-tail coupling apparatus that represent a genuine advance. The more speculative parts of the Discussion that have not been addressed by experimentation in the Results section may be considered for removal in the Discussion section.

      Thank you for pointing this out. We have removed the speculative parts of the Discussion that have not been addressed by experimentation in the Results section.

      Additional Context to help readers understand the significance of the work:

      Pioneering work in the mid-1950s used the periodic acid Schiff (PAS) stain in histologic sections of rodent testis to visualize glycoproteins of the acrosome and Golgi in seminiferous tubules. The pioneers discovered in cross-sectioned seminiferous tubules the association of differentiating germ cells with successive layers to define different stages that in mice are twelve, indicated as Roman numerals (XII). For each stage, different associations of maturing germ cells were always the same with early cells in differentiation at the periphery and more mature cells near the lumen. In this way, progressive differentiation from stem cells to mitotic, meiotic, acrosome-forming, and post-acrosome maturing spermatocytes was mapped to define spermatogenesis with the XII stages in mice representing the seminiferous cycle. The maturation process from acrosome-forming cells to mature spermatocytes is defined as spermiogenesis with 19 different steps that are morphologically distinct spermatids. It is from steps 8-19 of spermiogenesis that the formation of the flagellum takes place. Final maturation occurs in the epididymis as sperm move through the caput, corpus, and cauda of the organ with motile spermatozoa generated.

      Thank you very much!

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Syngnathid fishes (seahorses, pipefishes, and seadragons) present very particular and elaborated features among teleosts and a major challenge is to understand the cellular and molecular mechanisms that permitted such innovations and adaptations. The study provides a valuable new resource to investigate the morphogenetic basis of four main traits characterizing syngnathids, including the elongated snout, toothlessness, dermal armor, and male pregnancy. More particularly, the authors have focused on a late stage of pipefish organogenesis to perform single-cell RNA-sequencing (scRNA-seq) completed by in situ hybridization analyses to identify molecular pathways implicated in the formation of the different specific traits. 

      The first set of data explores the scRNA-seq atlas composed of 35,785 cells from two samples of gulf pipefish embryos that authors have been able to classify into major cell types characterizing vertebrate organogenesis, including epithelial, connective, neural, and muscle progenitors. To affirm identities and discover potential properties of clusters, authors primarily use KEGG analysis that reveals enriched genetic pathways in each cell type. While the analysis is informative and could be useful for the community, some interpretations appear superficial and data must be completed to confirm identities and properties. Notably, supplementary information should be provided to show quality control data corresponding to the final cell atlas including the UMAP showing the sample source of the cells, violin plots of gene count, UMI count, and mitochondrial fraction for the overall dataset and by cluster, and expression profiles on UMAP of selected markers characterizing cluster identities. 

      We thank the reviewer for these suggestions, and have added several figures and supplemental files in response. We added a supplemental UMAP showing the sample from which each cell originated (S1). We also added supplemental violin plots for each sample showing the gene count, unique molecular identifier (UMI) count, mitochondrial fraction, and the doublet scores (S2). We added feature plots of zebrafish marker genes for these major cell types and marker genes identified from our dataset to the supplement (S3:S57). We also provided two supplemental files with marker genes. These changes should clarify the work that went into labeling the clusters. Although some of the cluster labels are general, we decided it would be unwise to label clusters with speculative specific annotations. We only gave specific annotations to clusters with concrete markers and/or in situ hybridization (ISH) results that cemented an annotation. As shown in the new supplemental figures and files, certain clusters had clear, specific markers while others did not. Therefore, we used caution when we annotated clusters without distinct markers. 

      The second set of data aims to correlate the scRNA-seq analysis with in situ hybridizations (ISH) in two different pipefish (gulf and bay) species to identify and characterize markers spatially, and validate cell types and signaling pathways active in them. While the approach is rational, the authors must complete the data and optimize labeling protocols to support their statements. One major concern is the quality of ISH stainings and images; embryos show a high degree of pigmentation that could hide part of the expression profile, and only subparts and hardly detectable tissues/stainings are presented. The authors should provide clear and good-quality images of ISH labeling on whole-mount specimens, highlighting the magnification regions and all other organs/structures (positive controls) expressing the marker of interest along the axis. Moreover, ISH probes have been designed and produced on gulf pipefish genome and cDNA respectively, while ISH labeling has been performed indifferently on bay or gulf pipefish embryos and larvae. The authors should specify stages and species on figure panels and should ensure sequence alignment of the probe-targeted sequences in the two species to validate ISH stainings in the bay pipefish. Moreover, spatiotemporal gene expression being a very dynamic process during embryogenesis, interpretations based on undefined embryonic and larval stages of pipefish development and compared to 3dpf zebrafish are insufficient to hypothesize on developmental specificities of pipefish features, such as on the absence of tooth primordia that could represent a very discrete and transient cell population. The ISH analyses would require a clean and precise spatiotemporal expression comparison of markers at the level of the entire pipefish and zebrafish specimens at well-defined stages, otherwise, the arguments proposed on teleost innovations and adaptations turn out to be very speculative. 

      We are appreciative of the reviewer’s feedback. We primarily used the in situ hybridization (ISH) data as supplementary to the scRNAseq library and we are aware that further evidence is necessary to identify the origins of syngnathids’ evolutionary novelties. Our goal was to provide clues for the developmental genetic basis of syngnathid derived features. We hope that our study will inspire future investigations and are excited for the prospect that future research could include this reviewer’s ideas. 

      All of the developmental stages and species information for the embryos used were in the figure captions as well as in supplemental file 6. Because we primarily used wild-caught embryos, we did not have specific ages of most embryos. Syngnathid species are challenging to culture in the laboratory, and extracting embryos requires euthanizing the father, which makes it difficult to obtain enough embryos for ISH. In addition, embryos do not survive long when removed from the brood pouch prematurely. We supplemented our ISH with bay pipefish caught off the Oregon coast because these fish have large broods. Wild-caught pregnant male bay pipefish were immediately euthanized, and their broods were fixed. Because we did not have their age, we classified them based on developmental markers such as presence of somites and the extent of craniofacial elongation. Although these classification methods are not ideal, they are consistent with the syngnathid literature (Sommer et al. 2012). Since the embryos used for the ISH were primarily wild caught, we had a few different developmental stages represented in our ISH data. For our tooth primordia search, we used embryos from the same brood (and therefore the same stage).

      We understand the concern for the degree of pigmentation in the samples. We completed numerous bleach trials before embarking on the in situ hybridization experiments. After completing a bleach trial with a probe created from the gene tnmd for ISH, we noticed that the bleached embryos were missing expression domains found in the unbleached embryos. We were, therefore, concerned that using bleached embryos for our experiments would result in incorrect conclusions about the expression domains of these genes. We used bleaching sparingly, at older stages (hatched larvae) where it was fundamentally necessary to see staining. As stated above, the primary goal of this manuscript was to generate and annotate the first scRNA-seq atlas in a syngnathid, and the ISHs were utilized to support inferred cluster annotations only through a positive identification of marker gene expression in expected tissues/cells. Therefore, the obscuring of gene expression by pigmentation would have resulted in the absence of evidence for a possible cluster annotation, not an incorrect annotation.

      For the ease of viewing the ISHs, we improved annotations and clarity. We increased the brightness and contrast of images. In the original submission, we had to lower the image resolution to make the submission file smaller. We hope that these improvements, together with the true image quality, improve the clarity of the ISH results. We also included alignments in our supplementary files of bay pipefish sequences to the Gulf pipefish probes to showcase the high degree of sequence similarity. 

      Sommer, S., Whittington, C. M., & Wilson, A. B. (2012). Standardised classification of pre-release development in male-brooding pipefish, seahorses, and seadragons (Family Syngnathidae). BMC Developmental Biology, 12, 12–15. 

      To conclude, whereas the scRNA-seq dataset in this unconventional model organism will be useful for the community, the spatiotemporal and comparative expression analyses have to be thoroughly pushed forward to support the claims. Addressing these points is absolutely necessary to validate the data and to give new insights to understand the extraordinary evolution of the Syngnathidae family. 

      We really appreciate the reviewer’s enthusiasm for syngnathid research, and hope that the additional files and explanation of the supporting role of the ISHs have adequately addressed their concerns. We share the reviewer’s enthusiasm and are excited for future work that can extend this study. 

      Reviewer #2 (Public Review):

      Summary: 

      The authors present the first single-cell atlas for syngnathid fishes, providing a resource for future evolution & development studies in this group. 

      Strengths: 

      The concept here is simple and I find the manuscript to be well written. I like the in situ hybridization of marker genes - this is really nice. I also appreciate the gene co-expression analysis to identify modules of expression. There are no explicit hypotheses tested in the manuscript, but the discovery of these cell types should have value in this organism and in the determination of morphological novelties in seahorses and their relatives.  

      We are grateful for this reviewer’s appreciation of the huge amount of work that went into this study, and we agree that the in situ hybridizations (ISHs) support the scRNAseq study as we intended. We appreciate that the reviewer thinks that this work will add value to the syngnathid field.

      Weaknesses: 

      I think there are a few computational analyses that might improve the generality of the results. 

      (1) The cell types: The authors use marker gene analysis and KEGG pathways to identify cell types. I'd suggest a tool like SAMap (https://elifesciences.org/articles/66747) which compares single-cell data sets from distinct organisms to identify 'homologous' cell types - I imagine the zebrafish developmental atlases could serve as a reasonable comparative reference. 

      We appreciate the reviewer’s request, and in fact we would have loved to integrate our dataset with zebrafish. However, syngnathids’ unique craniofacial development makes it challenging to determine the appropriate stage for comparison. While 3 days post fertilization (dpf) zebrafish data were appropriate for comparisons of certain cell types (e.g. epidermal cells), they would have been problematic for other cell types (e.g. osteoblasts) that are not easily detectable until older zebrafish stages. Therefore, determining equivalent stages between these species is difficult and prone to error. Future research should focus on trying to better match stages across syngnathids and zebrafish (and other fish species such as stickleback). Studies of this nature promise to uncover the role of heterochrony in the evo-devo of syngnathids’ unique snouts.

      (2) Trajectory analyses: The authors suggest that their analyses might identify progenitor cell states and perhaps related differentiated states. They might explore cytoTRACE and/or pseudotime-based trajectory analyses to more fully delineate these ideas.

      We thank the reviewer for this suggestion! We added a trajectory analysis using cytoTRACE to the manuscript. It complemented our KEGG analysis well (L172-175; S73) and has improved the manuscript.

      (3) Cell-cell communication: I think it's very difficult to identify 'tooth primordium' cell types, because cell types won't be defined by an organ in this way. For instance, dental glia will cluster with other glia, and dental mesenchyme will likely cluster with other mesenchymal cell types. So the histology and ISH is most convincing in this regard. Having said this, given the known signaling interactions in the developing tooth (and in development generally) the authors might explore cell-cell communication analysis (e.g., CellChat) to identify cell types that may be interacting. 

      We agree! It would have been a wonderful addition to the paper to include a cell-cell communication analysis. One limitation of CellChat is that it only includes mouse and human orthologs. Given reviewer #3’s concerns about mouse-syngnathid comparisons, we decided not to pursue CellChat for this study. We are looking forward to future cell communication resources that include teleost fishes.

      Reviewer #3 (Public Review): 

      Summary: 

      This study established a single-cell RNA sequencing atlas of pipefish embryos. The results obtained identified unique gene expression patterns for pipefish-specific characteristics, such as fgf22 in the tip of the palatoquadrate and Meckel's cartilage, broadly informing the genetic mechanisms underlying morphological novelty in teleost fishes. The data obtained are unique and novel, potentially important in understanding fish diversity. Thus, I would enthusiastically support this manuscript if the authors improve it to generate stronger and more convincing conclusions than the current forms. 

      Thank you, we appreciate the reviewer’s enthusiasm!

      Weaknesses: 

      Regarding the expression of sfrp1a and bmp4 dorsal to the elongating ethmoid plate and surrounding the ceratohyal: are their expression patterns spatially extended or broader compared to the pipefish ancestor? Is there a much closer species available to compare gene expression patterns with pipefish? Did the authors consider using other species closely related to pipefish for ISH? Sfrp1a and bmp4 may be expressed in the same regions of much more closely related species without face elongation. I understand that embryos of such species are not always accessible, but it is also hard to argue responsible genes for a specific phenotype by only comparing gene expression patterns between distantly related species (e.g., pipefish vs. zebrafish). Due to the same reason, I would not directly compare/argue gene expression patterns between pipefish and mice, although I should admit that mice gene expression patterns are sometimes helpful to make a hypothesis of fish evolution. Alternatively, can the authors conduct ISH in other species of pipefish? If the expression patterns of sfrp1a and bmp4 are common among fishes with face elongation, the conclusion would become more solid. If these embryos are not available, is it possible to reduce the amount of Wnt and BMP signal using Crispr/Cas, MO, or chemical inhibitor? I do think that there are several ways to test the Wnt and/or BMP hypothesis in face elongation. 

      We appreciate the reviewer’s suggestion, and their recognition of the challenges within this system. In response to this comment, we completed further in situ hybridization experiments in threespine stickleback, a short-snouted fish that is much more closely related to syngnathids than is zebrafish, to make comparisons with pipefish craniofacial expression patterns (S76-S79). We added ISH data for the signaling genes (fgf22, bmp4, and sfrp1a) as well as prdm16. Based on these additional ISH results, we speculate that craniofacial expression of bmp4, sfrp1a, and prdm16 is conserved across species. However, compared to the specific ceratohyal/ethmoid staining seen in pipefish, stickleback had broad staining throughout the jaws and gills. These data suggest that pipefish have co-opted existing developmental gene networks in the development of their derived snouts. We added this interpretation to the results and discussion of the manuscript (L244-L248; L262-277; L444-470).

      Recommendations for the authors:  

      Reviewing Editor (Recommendations for the Authors)

      We hope that the eLife assessment, as well as the revisions specified here, prove helpful to you for further revisions of your manuscript. 

      Revisions considered essential: 

      (1) Marker genes and single-cell dataset analyses. While these analyses have been performed to a good standard in broad terms, there is a majority view here that cell type annotations and trajectory analyses can be improved. In particular, there is a question about the choice of marker genes for the current annotation. For one, it can depend on the use of single marker genes (see tnnti1 example for clusters 17 and 31). Here, we recommend incorporating results from SAMap and trajectory analysis (e.g., cytoTRACE or standard pseudotime).

      Because of the reviewer comments, we became aware that we had insufficiently communicated how cell clusters were annotated. We did mention in the manuscript that we did not use single marker genes to annotate clusters, but instead used multiple marker genes for each cluster. We used both marker genes derived from our dataset and marker genes identified from zebrafish resources for cluster annotation. We chose single marker genes for each cluster for visualization purposes and for in situ hybridizations. However, it is clear from the reviewers’ comments that we needed to explain more clearly how the annotations were performed. To this end, our revision includes two new supplementary files: one with Seurat-derived marker genes and one with marker genes derived from our DotPlot method. We also included extensive supplementary figures highlighting different markers. Using Daniocell, we identified 6 zebrafish markers per major cell type and showed their expression patterns in our atlas with FeaturePlots. We also included feature plots of the top 6 marker genes for each cluster. We hope that the addition of these 40+ plots (S3:S57) to the supplement fully addresses these concerns. 

      We appreciated the suggestion of cytoTRACE from reviewer #2! We ran cytoTRACE on three major cell lineages (neural, muscle, and connective; S73), which complemented our KEGG analysis in suggesting an undifferentiated fate for clusters 8, 10, and 16. We chose not to run SAMap because it is a scRNA-seq library integration tool. Although we compared our lectin epidermal findings to 3 dpf zebrafish scRNA-seq data, we did not integrate the datasets out of concern that we could draw erroneous conclusions for other cell types. Future work that explores this technical challenge may uncover the role of heterochrony in syngnathid craniofacial development. We detail these changes more fully in our responses to reviewers.

      (2) The claims regarding evolutionary novelty and/or the genes involved are considered speculative. In part, this comes from relying too heavily on comparisons against zebrafish, as opposed to more closely related species. For example, the discussion regarding C-type lectin expression in the epidermis and KEGG enrichment (lines 358 - 364) seems confusing. Another good example here is the discussion on sfrp1a (lines 258 - 261). Here, the text seems to suggest craniofacial sfrp1a expression (or specifically ethmoid expression?) is connected to the development of the elongated snout in pipefish. However, craniofacial expression of sfrp1a is also reported in the arctic charr, which the authors grouped into fishes with derived craniofacial structures. Separately, sfrp2 expression was also reported in stickleback fish, for example. Do these different discussions truly support the notion that sfrp1a expression is all that unique in pipefish, rather than that pipefish and zebrafish are only distantly related and that sfrp1a was a marker gene first, and co-opted gene second? The authors should respond to the comments in the public review related to this aspect, and include more informative comparison and discussion. 

      A much more nuanced discussion with appropriate comparisons and caveats would be strongly recommended here.  

      We appreciate this insight and used it as a motivator to complete and add select comparative ISH data to this manuscript. We added in situ hybridization experiments from stickleback fish for craniofacial development genes (sfrp1a, prdm16, bmp4, and fgf22; S76-S79). After adding stickleback ISH to the manuscript, we were able to make comparisons between pipefish and stickleback patterns and draw more informed conclusions (L244-L248; L262-277; L444-470). We added additional nuance to the discussion of the head, tooth (L485-489), and male pregnancy (L358-L391) sections to address concerns of study limitations. We describe in more detail these additional data in response to reviewers.

      (3) In situ hybridization results: as already included above, there is generally weak labeling of species, developmental stages, and other markings that can provide context. The collective feeling here is that as it is currently presented, the ISH results do not go too far beyond simply illustrative purposes. To take these results further, more detailed comparison may be needed. At a minimum, far better labeling can help avoid making the wrong impression. 

      Based on the reviewers’ comments, we made changes to improve ISH clarity and add select comparative ISH findings. ISH was used to further interpretation of the scRNAseq atlas. All the developmental stages and species information for the embryos used were in the figure captions as well as in supplemental file 4. Since we primarily used wild caught embryos, we did not have specific ages of most embryos. The technical challenges of acquiring and staging Syngnathus embryos are detailed above. Because we did not have their age, we classified them based on developmental markers (such as presence of somites and the extent of craniofacial elongation). Although these classification methods are not ideal, they are consistent with the syngnathid literature (Sommer et al. 2012).  

      We followed reviewer #1’s recommendations by adding an annotated graphic of a pipefish head, aligning bay and Gulf pipefish sequences for the probe regions, expanding out our supplemental figures for ISH into a figure for each probe, and improving labeling. These changes improved the description of the ISH experiments and have increased the quality of the manuscript.

      We would have loved to complete detailed comparative studies as suggested, but doing such a complete analysis was not feasible for this study. Therefore, we completed an additional focused analysis. We followed reviewer #3’s idea and added ISHs from threespine stickleback, a short snouted fish, for 4 genes (sfrp1a, prdm16, fgf22, and bmp4). While more extensive ISHs tracking all marker genes through a variety of developmental stages in pipefish and stickleback would have provided crucial insights, we feel that it is beyond the scope of this study and would require a significant amount of additional work. We, thus, primarily interpreted the ISH results as illustrative data points in our discussion. As we state in the response to reviewer 1, the generation and annotation of the first scRNA-seq atlas in a syngnathid is the primary goal of this manuscript.  The ISHs were utilized primarily to support inferred cluster annotations if a positive identification of marker gene expression in expected tissues/cells occurred. 

      Reviewer #1 (Recommendations For The Authors): 

      While the scRNA-seq dataset offers a valuable resource for evo-devo analyses in fish and the hypotheses are of interest, critical aspects should be strengthened to support the claims of the study. 

      Concerning the scRNA-seq dataset, the major points to be addressed are listed below: 

      - Supplementary file 3 reports the single markers used to validate cluster annotations. To confirm cluster identities, more markers specific to each cluster should be highlighted and presented on the UMAP. 

      We recognize the reviewer’s concern and had in reality used numerous markers to annotate the clusters. Based upon the reviewer’s comment we decided to make this clear by creating feature plots for every cluster with the top 6 marker genes. These plots showcase gene specificity in UMAP space. We also added feature plots for zebrafish marker genes for key cell types. Through these changes and the addition of 54 supplementary figures (S3:S57), we hope that it is clear that numerous markers validated cluster identity.

      For example, as clusters 17 and 37 share the same tnnti1 marker, which other markers permit to differentiate their respective identity. 

      This is a fair point. Clusters 17 and 37 are both marked by a tnni1 ortholog.

      Different paralogous co-orthologs mark each cluster (cluster 17: LOC125989146; cluster 37: LOC125970863). In our revision to the above comment, additional (6) markers per cluster were highlighted which should remedy this concern. 

      - L146: the low number of identified cartilaginous cells (only 2% of total connective tissue cells) appears aberrant compared to bone cell number, while Figure 1 presents a well-developed cartilaginous skeleton with poor or no signs of ossification. Please discuss this point. 

      We also found this to be interesting and added a brief discussion on this subject to the results section (L147-L149). Single cell dissociations can have variable success for certain cell types. It is possible that the cartilaginous cells were more difficult to dissociate than the osteoblast cells.

      - L162: pax3a/b are not specific to muscle progenitors as the genes are also expressed in the neural tube and neural crest derivatives during organogenesis. Please confirm cluster 10 identity.  

      Thank you for the reminder, we added numerous feature plots that explored zebrafish (from Daniocell) and pipefish markers (identified in our dataset). Examining zebrafish satellite muscle markers (myog, pabpc4, and jam2a) shows a strong correspondence with cluster #10.

      - L198: please specify in the text the pigment cell cluster number. 

      We completed this change.

      - L199: it is not clear why considering module 38 correlated to cluster 20 while modules 2/24 appear more correlated according to the p-value color code. 

      We thank the reviewer for pointing this confusing element out! Although the t-statistic value for module 38 (3.75) is lower than the t-statistics for modules 2 and 24 (5.6 and 5.2, respectively), we chose to highlight module 38 for its ‘connectivity dependence’ score. In our connectivity test, we examined whether removing cells from a specific cell cluster reduced the connectivity of a gene network. We found that removing cluster 20 led to a decrease in module 38’s connectivity (-.13, p=0) while it led to an increase in modules 2 and 24’s connectivity (.145, p=1; .145, p=9.14; our original supplemental files 9-10). Therefore, the connectivity analysis showed that module 38’s structure was more dependent on cluster 20 than were the structures of modules 2 and 24. Although you highlighted an interesting quandary, we decided that this is tangential to the paper and did not add this discussion to the manuscript. 
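
      The logic of such a connectivity test can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: it assumes that module connectivity is the mean absolute pairwise Pearson correlation among the module's genes, which the response does not state, and it omits any permutation step for p-values. The function names and the toy data layout (`expr`, `cluster_of`) are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def module_connectivity(expr, genes, cells):
    """Assumed connectivity measure: mean |r| over all gene pairs in the module.

    expr maps gene -> {cell: expression}; only the listed cells are used.
    """
    pairs = list(combinations(genes, 2))
    total = 0.0
    for g1, g2 in pairs:
        x = [expr[g1][c] for c in cells]
        y = [expr[g2][c] for c in cells]
        total += abs(pearson(x, y))
    return total / len(pairs)


def connectivity_change(expr, genes, cluster_of, cluster):
    """Connectivity without the cluster's cells minus connectivity with all cells.

    A negative value means the module loses connectivity when the cluster is
    removed, i.e. its structure depends on that cluster.
    """
    all_cells = list(cluster_of)
    remaining = [c for c in all_cells if cluster_of[c] != cluster]
    return (module_connectivity(expr, genes, remaining)
            - module_connectivity(expr, genes, all_cells))
```

      Under this reading, a negative change (as reported for module 38) indicates dependence on the removed cluster, whereas a positive change (as reported for modules 2 and 24) indicates that the cluster's cells were not driving the module's co-expression structure.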

      - Please describe in the text Figure 4A. 

      Completed, we thank the reviewer for catching this! 

      Concerning embryo stainings, the major points to be addressed are listed below: 

      - Figure 1: please enhance the light/contrast of figures to highlight or show the absence of alcian/alizarin staining. Mineralized structures are hardly detectable in the head and slight differences can be seen between the two samples. The developmental stage should be added. Please homogenize the scale bar format (remove the unit on panels E and G as the information is already in the text legend). It would be useful to illustrate the data with a schematic view of the structures presented in panels B and E, and please annotate structures in the other panels.  

      We thank the reviewer for these suggestions to improve our figure. We increased the brightness and contrast for all our images. We also added an illustration of the head with labels of elements. As discussed, we used wild caught pregnant males and, therefore, do not know the exact age of the specimens. However, we described the developmental stage based on morphological observations. Slight differences in morphology between samples are expected. We and others have noticed that developmental rate varies, even within the same brood pouch, for syngnathid embryos. We observed several mineralization zones in the embryos, including the upper and lower jaws, the mes(ethmoid), and the pectoral fin. We recognize the cartilage staining is more apparent than the bone staining, though increasing image brightness and contrast did improve the visibility of the mineralization front.

      - All ISH stainings and images presented in Figures 4-6/ Figures S2-3 should be revised according to comments provided in the public review. 

      We thank the reviewer for providing thorough comments, we provided an in-depth response to the public review. We made several improvements to the manuscript to address their concerns. 

      - Figure 4: Figure 4B should be described before 4C in the text or inverse panels / L222 the Meckel's cartilage is not shown on Figure 4C. The schematic views in H should be annotated and the color code described / the ISH data must be completed to correlate spatially clusters to head structures. 

      We thank the reviewer for pointing this out, we fixed the issues with this figure and added annotations to the head schematics.

      - Figure 5: typo on panels 'alician' = alcian. 

      We completed this change. 

      - Figures S2-3: data must be better presented, polished / typo in captions 'relavant'= relevant. 

      Thank you for this critique, we created new supplementary figures to enhance interpretation of the data (S59-S71). In these new figures, we included a feature plot for each gene and respective ISHs.

      - Figure S3: soat2 = no evidence of muscle marker neither by ISH presented nor in the literature. 

      We realized this staining was not clear with the previous S2/S3 figures. Our new changes in these supplementary figures based on the reviewer’s ideas made these ISH results clearer. We observed soat2 staining in the sternohyoideus muscle (panel B in S71).

      Other points: 

      - The cartilage/bone developmental state (Alcian/alizarin staining) and/or ISH for classical markers of muscle development (such as pax3/myf5) could be used to clarify the This could permit the completion of a comparative analysis between the two species and the interpretation of novel and adaptative characters.  

      We appreciate this idea! We thought deeply about a well characterized comparative analysis between pipefish and zebrafish for this study. We discussed our concerns in our public response to reviewer 2. We found that it was challenging to stage match all cell types, and were concerned that we could make erroneous conclusions. For example, our pipefish samples were still inside the male brood pouch and possessed yolk sacs. However, we found osteoblast cells in our scRNAseq atlas, and in alizarin staining. Although zebrafish literature notes that the first zebrafish bone appears at 3 dpf (Kimmel et al. 1995), osteoblasts were not recognized until 5 dpf in two scRNAseq datasets (Fabian et al. 2022; Lange et al. 2023). A 5dpf zebrafish is considered larval and has begun hunting. Therefore, we chose to not integrate our data out of concern that osteoblast development may occur at different timelines between the fishes. 

      Fabian, P., Tseng, K.-C., Thiruppathy, M., Arata, C., Chen, H.-J., Smeeton, J., Nelson, N., & Crump, J. G. (2022). Lifelong single-cell profiling of cranial neural crest diversification in zebrafish. Nature Communications 2022 13:1, 13(1), 1–13. 

      Lange, M., Granados, A., VijayKumar, S., Bragantini, J., Ancheta, S., Santhosh, S., Borja, M., Kobayashi, H., McGeever, E., Solak, A. C., Yang, B., Zhao, X., Liu, Y., Detweiler, A. M., Paul, S., Mekonen, H., Lao, T., Banks, R., Kim, Y.-J., … Royer, L. A. (2023). Zebrahub – Multimodal Zebrafish Developmental Atlas Reveals the State-Transition Dynamics of Late-Vertebrate Pluripotent Axial Progenitors. BioRxiv, 2023.03.06.531398. 

      Kimmel, C., Ballard, S., Kimmel, S., Ullmann, B., Schilling, T. (1995). Stages of Embryonic Development of the Zebrafish. Developmental Dynamics 203:253-310.

      'in situs' in the text should be replaced by 'in situ experiments'.  

      We made this change (L395, L663, L666, L762).

      - Lines 562-565: information on samples should be added at the start of the result section to better apprehend the following scRNA-seq data.

      We thank the reviewer for pointing out this issue. Although we had a few sentences on the samples in the first paragraph of the result section, we understand that it was missing some critical pieces of information. Therefore, we added these additional details to the beginning of the results section (L126-L132). 

      - Lines 629-665: PCR with primers designed on gulf pipefish genome could be performed in parallel on bay and gulf cDNA libraries, and amplification products could be sequenced to analyze alignment and validate the use of gulf pipefish ISH probes in bay pipefish embryos. Probe production could also be performed using gulf primers on bay pipefish cDNA pools. 

      After the submission of this manuscript, a bay pipefish genome was prepared by our laboratory. We used this genome to align our probes, these alignments demonstrate strong sequence conservation between the species. We included these alignments in our supplemental files.

      - L663: the bleaching step must be optimized on pipefish embryos. 

      We understand this concern and had completed several bleach optimization experiments prior to publication. Although we found that bleaching improved visibility of staining, we noticed with the probe tnmd that bleached embryos did not have complete staining of tendons and ligaments. The unbleached embryos had more extensive staining than the bleached embryos. We were concerned that bleaching would lead to failures to detect expression domains (false negatives) important for our analysis. Therefore, we did not use bleaching in our in situ experiments (except with hatched fish with a high degree of pigmentation). 

      - Indicate the number of specimens analyzed for each labeling condition.  

      We thank the reviewer for noticing this issue. We added this information to the methods (L766-767).

      - Describe the fixation and pre-treatment methods previous to ISH and skeleton stainings

      We thank the reviewer for pointing out this issue, we added these descriptions (L765-766; L772-774). 

      Reviewer #3 (Recommendations For The Authors): 

      (1) If sfrp1a expression is observed also in other fish species with derived craniofacial structures, it's important to discuss this more in the Discussion. This could be a common mechanism to modify craniofacial structures, although functional tests are ultimately required (but not in this paper, for sure). Can lines 421-428 involve the statement "a prolonged period of chondrocyte differentiation" underlies craniofacial diversity?

      This is a great idea, and we added a sentence that captures this ethos (L451-452).

      (2) Lines 334-346 need to be rephrased. It's hard to understand which genes are expressed or not in pipefish and zebrafish. Did "23 endocytosis genes" show significant enrichment in zebrafish epidermis, or are they expressed in zebrafish epidermis? 

      We thank the reviewer for this comment, we re-phrased this section for clarity (L365-368).

      (3) Figure 4 is missing the "D" panel and two "E" panels. 

      We thank the reviewer for noticing this, we fixed this figure.

      (4) Line 302: "whole-mount" or "whole mount"

      We thank the reviewer for the catch!

    1. Author response:

      Reviewer #1 (Public review):

      Comment 1: In the Results section, the rationale behind selecting the beta band for the central (C3, CP3, Cz, CP4, C4) regions and the theta band for the fronto-central (Fz, FCz, Cz) regions is not clearly explained in the main text. This information is only mentioned in the figure captions. Additionally, why was the beta band chosen for the S-ROI central region and the theta band for the S-ROI fronto-central region? Was this choice influenced by the MVPA results?

      We thank the reviewer for the question regarding the rationale for the S-ROI selection in our study. The beta band was chosen for the central region due to its established relevance in motor control (Engel & Fries, 2010), movement planning (Little et al., 2019) and motor inhibition (Duque et al., 2017). The fronto-central theta band (or frontal midline theta) was a widely recognized indicator in cognitive control research (Cavanagh & Frank, 2014), associated with conflict detection and resolution processes. Moreover, recent empirical evidence suggested that the fronto-central theta reflected the coordination and integration between stimuli and responses (Senoussi et al., 2022). Although we have described the cognitive processes linked to these different frequencies in the introduction and discussion sections, along with the potential patterns of results observed in Stroop-related studies, we did not specify the involved cortical areas. Therefore, we have specified these areas in the introduction to enhance the clarity of the revised version (in the fourth paragraph of the Introduction section).

      Regarding whether the selection of S-ROIs was influenced by the MVPA results, we would like to clarify here that we selected the S-ROIs based on prior research and then conducted the decoding analysis. Specifically, we first extracted the data representing different frequency indicators (three F-ROIs and three S-ROIs) as features, followed by decoding to obtain the MVPA results. Subsequently, the time-frequency analysis, combined with the specific time windows during which each frequency was decoded, provided detailed interaction patterns among the variables for each indicator. The specifics of feature selection are described in the revised version (in the first paragraph of the Multivariate Pattern Analysis section).

      Comment 2: In the Data Analysis section, line 424 states: “Only trials that were correct in both the memory task and the Stroop task were included in all subsequent analyses. In addition, trials in which response times (RTs) deviated by more than three standard deviations from the condition mean were excluded from behavioral analyses.” The percentage of excluded trials should be reported. Also, for the EEG-related analyses, were the same trials excluded, or were different criteria applied?

      We thank the reviewer for this suggestion. Beyond the behavioral exclusion criteria, trials with EEG artifacts were also excluded from the data for the EEG-related analyses. We have now reported the percentage of excluded trials for both behavioral and EEG data analyses in the revised version (in the second paragraph of the EEG Recording and Preprocessing section and the first paragraph of the Behavioral Analysis section).
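
      The behavioral exclusion rule quoted in the comment, and the percentage the reviewer asks for, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the analysis code of the study: the trial fields (`condition`, `rt`, `memory_correct`, `stroop_correct`) and the function name are hypothetical, the condition mean and SD are computed over correct trials only, and the percentage is expressed relative to all recorded trials.

```python
from statistics import mean, stdev


def exclude_trials(trials, n_sd=3.0):
    """Apply the accuracy criterion, then the per-condition RT criterion.

    trials: list of dicts with hypothetical keys 'condition', 'rt' (ms),
    'memory_correct' and 'stroop_correct'.
    Returns (kept_trials, stats), where stats reports how many trials each
    criterion removed and the overall percentage excluded.
    """
    # Criterion 1: keep only trials correct in both tasks.
    correct = [t for t in trials if t["memory_correct"] and t["stroop_correct"]]

    # Criterion 2: compute mean +/- n_sd * SD of RT within each condition.
    rts_by_condition = {}
    for t in correct:
        rts_by_condition.setdefault(t["condition"], []).append(t["rt"])
    bounds = {}
    for condition, rts in rts_by_condition.items():
        m = mean(rts)
        sd = stdev(rts) if len(rts) > 1 else 0.0
        bounds[condition] = (m - n_sd * sd, m + n_sd * sd)

    kept = [t for t in correct
            if bounds[t["condition"]][0] <= t["rt"] <= bounds[t["condition"]][1]]

    n_total = len(trials)
    stats = {
        "n_total": n_total,
        "n_error": n_total - len(correct),
        "n_outlier": len(correct) - len(kept),
        "pct_excluded": 100.0 * (n_total - len(kept)) / n_total if n_total else 0.0,
    }
    return kept, stats
```

      For the EEG analyses, trials flagged for artifacts would be removed in addition to these two criteria, so the EEG percentage is reported separately from the behavioral one.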

      Comment 3: In the Methods section, line 493 mentions: “A 400-200 ms pre-stimulus time window was selected as the baseline time window.” What is the justification in the literature for choosing the 400-200 ms pre-stimulus window as the baseline? Why was the 200-0 ms pre-stimulus period not considered?

      We thank the reviewer for this question and would like to provide the following justification. First, although a baseline ending at 0 ms is common in ERP analyses, it may not be suitable for time-frequency analysis. Due to the inherent temporal smoothing characteristic of wavelet convolution in time-frequency decomposition, task-related early activities can leak into the pre-stimulus period (before 0 ms) (Cohen, 2014). This means that extending the baseline to 0 ms will include some post-stimulus activity in the baseline window, thereby increasing baseline power and compromising the accuracy of the results. Second, an ideal baseline duration is recommended to be around 10-20% of the entire trial of interest (Morales & Bowers, 2022). In our study, the epoch duration was 2000 ms, making 200-400 ms an appropriate baseline length. Third, given that the minimum duration of the fixation point before the stimulus in our experiment was 400 ms, we started the baseline at 400 ms before the stimulus so that the window fell entirely within the fixation period. In summary, considering edge effects, duration requirements, and the need to exclude other influences, we selected a baseline correction window of -400 to -200 ms. To enhance the clarity of the revised version, we have provided the rationale for the selected time windows along with relevant references (in the first paragraph of the Time-frequency analysis section).
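
      The arithmetic behind the window length, and the way such a window is typically applied, can be sketched as follows. This is a minimal illustration and not the code used in the study: the decibel conversion is one common normalization described by Cohen (2014) and is an assumption here, since the response does not state which normalization was applied, and the function names are hypothetical.

```python
import math


def baseline_length_range(epoch_ms, lo_frac=0.10, hi_frac=0.20):
    """Recommended baseline duration: 10-20% of the epoch length."""
    return epoch_ms * lo_frac, epoch_ms * hi_frac


def db_baseline_correct(power, times_ms, base_start=-400, base_end=-200):
    """Express power at one frequency in dB relative to the mean baseline power.

    power: power values over time for a single frequency and channel.
    times_ms: time stamps (ms, stimulus onset = 0) matching `power`.
    The window defaults to -400 to -200 ms, so it stops well before 0 ms
    and avoids post-stimulus activity smeared backwards by the wavelet.
    """
    baseline = [p for p, t in zip(power, times_ms) if base_start <= t <= base_end]
    if not baseline:
        raise ValueError("no samples fall inside the baseline window")
    baseline_mean = sum(baseline) / len(baseline)
    return [10.0 * math.log10(p / baseline_mean) for p in power]
```

      With a 2000 ms epoch, `baseline_length_range(2000)` gives a recommended duration of 200 to 400 ms, and the chosen window (-400 to -200 ms) is 200 ms long, at the lower end of that range.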

      Comment 4: Is the primary innovation of this study limited to the methodology, such as employing MVPA and RSA to establish the relationship between late theta activity and behavior?

      We thank the reviewer for this insightful question and would like to clarify that our research extends beyond mere methodological innovation; rather, it utilized new methods to explore novel theoretical perspectives. Specifically, our research presents three levels of innovation: methodological, empirical, and theoretical. First, methodologically, MVPA overcame the drawbacks of traditional EEG analyses based on specific averaged voltage intensities, providing new perspectives on how the brain dynamically encoded particular neural representations over time. Furthermore, RSA aimed to identify which indicators among the decoded were directly related to behavioral representation patterns. Second, in terms of empirical results, using these two methods, we have identified for the first time three EEG markers that modulate the Stroop effect under verbal working memory load: SP, late theta, and beta, with late theta being directly linked to the elimination of the behavioral Stroop effect. Lastly, from a theoretical perspective, we proposed the novel idea that working memory played a crucial role in the late stages of conflict processing, specifically in the stimulus-response mapping stage (the specific theoretical contributions are detailed in the second-to-last paragraph of the Discussion section).

      Comment 5: On page 14, lines 280-287, the authors discuss a specific pattern observed in the alpha band. However, the manuscript does not provide the corresponding results to substantiate this discussion. It is recommended to include these results as supplementary material.

      We thank the reviewer for this suggestion. We added a new figure along with the corresponding statistical results that displayed the specific result patterns for the alpha band (Supplementary Figure 1).

      Comment 6: On page 16, lines 323-328, the authors provide a generalized explanation of the findings. According to load theory, stimuli compete for resources only when represented in the same form. Since the pre-memorized Chinese characters are represented semantically in working memory, this explanation lacks a critical premise: that semantic-response mapping is also represented semantically during processing.

      We thank the reviewer for this insightful suggestion. We fully agree with the reviewer’s perspective. As stated in our revised version, load theory suggests that cognitive resources are limited and dependent on a specific type (in the second paragraph of the Discussion section). The previously memorized Chinese characters are stored in working memory in the form of semantic representations; meanwhile the stimulus-response mapping should also be represented semantically, leading to resource occupancy. We have included this logical premise in the revised version (in the third-to-last paragraph of the Discussion section).

      Comment 7: The classic Stroop task includes both a manual and a vocal version. Since stimulus-response mapping in the vocal version is more automatic than in the manual version, it is unclear whether the findings of this study would generalize to the impact of working memory load on the Stroop effect in the vocal version.

      We fully agree with the reviewer’s point that the verbal version of the Stroop task differs from the manual version in terms of the degree of automation in the stimulus-response mapping. Specifically, the verbal version relies on mappings that are established through daily language use, while the manual version involves arbitrary mappings created in the laboratory. Therefore, the stimulus-response mapping in the verbal response version is more automated and less likely to be suppressed. However, our previous research indicated that the degree of automation in the stimulus-response mapping was influenced by practice (Chen et al., 2013). After approximately 128 practice trials, semantic conflict almost disappears, suggesting that the level of automation in stimulus-response mapping for the verbal Stroop task is comparable to that of the manual version (Chen et al., 2010). Given that participants in our study completed 144 practice trials (in the Procedure section), we believe these findings can be generalized to the verbal version.

      Comment 8: While the discussion section provides a comprehensive analysis of the study’s results, the authors could further elaborate on the theoretical and practical contributions of this work.

      We thank the reviewer for the constructive suggestions. We recognize that the theoretical and practical contributions of the study were not thoroughly elaborated in the original manuscript. Therefore, we have now provided a more detailed discussion. Specifically, the theoretical contributions focus on advancing load theory and highlighting the critical role of working memory in conflict processing. The practical contributions emphasize the application of load theory and the development of intervention strategies for enhancing inhibitory control. A more detailed discussion can be found in the revised version (in the second-to-last paragraph of the Discussion section).

      Reviewer #2 (Public review):

      Comment 1: As the researchers mentioned, a previous study reported a diminished Stroop effect with concurrent working memory tasks to memorize meaningless visual shapes rather than memorize Chinese characters as in the study. My main concern is that lower-level graphic processing when memorizing visual shapes also influences the Stroop effect. The stage of Stroop conflict processing affected by the working memory load may depend on the specific content of the concurrent working memory task. If that’s the case, I sense that the generalization of this finding may be limited.

      We thank the reviewer for this insightful concern. As mentioned in the manuscript, this may be attributed to the inherent characteristics of Chinese characters. In contrast to English words, the processing of Chinese characters relies more on graphemic encoding and memory (Chen, 1993). Therefore, the processing of line patterns essentially occupies some of the resources needed for character processing, which aligns with our study’s hypothesis based on dimensional overlap. Additionally, regarding the results, even though the previous study presented lower-level line patterns, the results still showed that the working memory load modulated the later theta band. We hypothesize that, regardless of the specific content of the pre-presented working memory load, once the stimulus disappears from view, these loads are maintained as representations in the working memory platform. Therefore, they do not influence early perceptual processing, and resource competition only occurs once the distractors reach the working memory platform. Lastly, a previous study has shown that spatial loads, which do not overlap with either the target or distractor dimensions, do not influence the conflict effect (Zhao et al., 2010). Taken together, we believe that regardless of the specific content of the concurrent working memory tasks, as long as they occupy resources related to irrelevant stimulus dimensions, they can influence the late-stage processing of the conflict effect. Perhaps our original manuscript did not convey this clearly, so we have rephrased it in a more straightforward manner (in the second paragraph of the Discussion section).

      Comment 2: The P1 and N450 components are sensitive to congruency in previous studies as mentioned by the researchers, but the results in the present study did not replicate them. This raised concerns about data quality and needs to be explained.

      We thank the reviewer for this insightful concern. For P1, we aimed to convey that the early perceptual processing represented by P1 is part of conflict processing. Therefore, we included it in our analysis. Additionally, as mentioned in the discussion, most studies find P1 to be insensitive to congruency. However, we inappropriately cited a study in the introduction that suggested P1 shows differences in congruency, which is among the few studies that hold this perspective. To prevent confusion for readers, we have removed this citation from the introduction.

      As for N450, most studies have indeed found it to be influenced by congruency. In our manuscript, we did not observe a congruency effect at our chosen electrodes and time window. However, significant congruency effects were detected at other central-parietal electrodes (CP3, CP4, P5, P6) during the 350-500 ms interval. The interaction between task type and consistency remained non-significant, consistent with previous results. Furthermore, with respect to the location of the electrodes chosen, existing studies on N450 vary widely, including central-parietal electrodes and frontal-central electrodes (for a review, see Heidlmayr et al., 2020). We speculate that this phenomenon may be related to the extent of practice. With fewer total trials, the task may involve more stimulus conflicts, engaging more frontal brain areas. On the other hand, with more total trials, the task may involve more response conflicts, engaging more central-parietal brain areas (Chen et al., 2013; van Veen & Carter, 2005). Due to the extensive practice required in our study, we identified a congruency N450 effect in the central-parietal region. We apologize for not thoroughly exploring other potential electrodes in the previous manuscript, and we have revised the results and interpretations regarding N450 accordingly in the revised version (in the N450 section of the ERP results and the third paragraph of the Discussion section).

      Reference

      Cavanagh, J. F., & Frank, M. J. (2014). Frontal theta as a mechanism for cognitive control. Trends in Cognitive Sciences, 18(8), 414–421. https://doi.org/10.1016/j.tics.2014.04.012

      Chen, M. J. (1993). A Comparison of Chinese and English Language Processing. In Advances in Psychology (Vol. 103, pp. 97–117). North-Holland. https://doi.org/10.1016/S0166-4115(08)61659-3

      Chen, X. F., Jiang, J., Zhao, X., & Chen, A. (2010). Effects of practice on semantic conflict and response conflict in the Stroop task. Psychol. Sci., 33, 869–871.

      Chen, Z., Lei, X., Ding, C., Li, H., & Chen, A. (2013). The neural mechanisms of semantic and response conflicts: An fMRI study of practice-related effects in the Stroop task. NeuroImage, 66, 577–584. https://doi.org/10.1016/j.neuroimage.2012.10.028

      Cohen, M. X. (2014). Analyzing Neural Time Series Data: Theory and Practice. The MIT Press. https://doi.org/10.7551/mitpress/9609.001.0001

      Duprez, J., Gulbinaite, R., & Cohen, M. X. (2020). Midfrontal theta phase coordinates behaviorally relevant brain computations during cognitive control. NeuroImage, 207, 116340. https://doi.org/10.1016/j.neuroimage.2019.116340

      Duque, J., Greenhouse, I., Labruna, L., & Ivry, R. B. (2017). Physiological Markers of Motor Inhibition during Human Behavior. Trends in Neurosciences, 40(4), 219–236. https://doi.org/10.1016/j.tins.2017.02.006

      Engel, A. K., & Fries, P. (2010). Beta-band oscillations—Signalling the status quo? Current Opinion in Neurobiology, 20(2), 156–165. https://doi.org/10.1016/j.conb.2010.02.015

      Heidlmayr, K., Kihlstedt, M., & Isel, F. (2020). A review on the electroencephalography markers of Stroop executive control processes. Brain and Cognition, 146, 105637. https://doi.org/10.1016/j.bandc.2020.105637

      Little, S., Bonaiuto, J., Barnes, G., & Bestmann, S. (2019). Human motor cortical beta bursts relate to movement planning and response errors. PLOS Biology, 17(10), e3000479. https://doi.org/10.1371/journal.pbio.3000479

      Morales, S., & Bowers, M. E. (2022). Time-frequency analysis methods and their application in developmental EEG data. Developmental Cognitive Neuroscience, 54, 101067. https://doi.org/10.1016/j.dcn.2022.101067

      Senoussi, M., Verbeke, P., Desender, K., De Loof, E., Talsma, D., & Verguts, T. (2022). Theta oscillations shift towards optimal frequency for cognitive control. Nature Human Behaviour, 6(7), Article 7. https://doi.org/10.1038/s41562-022-01335-5

      van Veen, V., & Carter, C. S. (2005). Separating semantic conflict and response conflict in the Stroop task: A functional MRI study. NeuroImage, 27(3), 497–504. https://doi.org/10.1016/j.neuroimage.2005.04.042

      Zhao, X., Chen, A., & West, R. (2010). The influence of working memory load on the Simon effect. Psychonomic Bulletin & Review, 17(5), 687–692. https://doi.org/10.3758/PBR.17.5.687

    1. In all of the runs, one observes the self-organization of structured developmental trajectories, where the robot explores objects and actions in a progressively more complex stage-like manner while acquiring autonomously diverse affordances and skills that can be reused later on and that change the learning progress in more complicated tasks. The following developmental sequence was typically observed: 1. In a first phase, the learner achieves unorganized body babbling. 2. In a second phase, after learning a first rough model and meta-model, the robot stops combining motor primitives, exploring them one by one, but each primitive is explored itself in a random manner. P.-Y. Oudeyer, L. B. Smith / Topics in Cognitive Science (2016) 5

      In a third phase, the learner begins to experiment with actions toward zones of its environment where the external observer knows there are objects (the robot is not provided with a representation of the concept of “object”), but in a non-affordant manner (e.g., it vocalizes at the non-responding elephant or tries to bash the teacher robot which is too far to be touched). 4. In a fourth phase, the learner now explores the affordances of different objects in the environment: typically focusing first on grasping movements with the elephant, then shifting to bashing movements with the hanging toy, and finally shifting to explorations of vocalizing toward the imitating teacher. 5. In the end, the learner has learned sensorimotor affordances with several objects, as well as social affordances, and has mastered multiple skills. None of these specific objectives were pre-programmed. Instead, they self-organized through the dynamic interaction between curiosity-driven exploration, statistical inference, the properties of the body, and the properties of the environment

    2. Fig. 1. The playground experiment (Oudeyer & Kaplan, 2006; Oudeyer et al., 2007). (A) The learning context. (B) The computational architecture for curiosity-driven exploration in which the robot learner probabilistically selects experiences according to their potential for reducing uncertainty, that is, for learning progress. (C) Illustration of a self-organized developmental sequence where the robot automatically identifies, categorizes, and shifts from simple to more complex learning experiences. Figure adapted with permission from Gottlieb et al. (2013).

      diagram with rough setup

    3. The learner is equipped with a repertoire of motor primitives parameterized by several continuous numbers that control movements of its legs, head, and a simulated vocal tract. Each motor primitive is a dynamical system controlling various forms of actions: (a) turning the head in different directions; (b) opening and closing the mouth while crouching with varying strengths and timing; (c) rocking the leg with varying angles and speed; (d) vocalizing with varying pitches and lengths. These primitives can be combined to form a large continuous space of possible actions. Similarly, sensory primitives allow the robot to detect visual movement, salient visual properties, proprioceptive touch in the mouth, and pitch and length of perceived sounds. For the robot, these motor and sensory primitives are initially black boxes and he has no knowledge about their semantics, effects, or relations.

      Basic primitives which are just numbers to the robot.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Comment 1: In the Results section, the rationale behind selecting the beta band for the central (C3, CP3, Cz, CP4, C4) regions and the theta band for the fronto-central (Fz, FCz, Cz) regions is not clearly explained in the main text. This information is only mentioned in the figure captions. Additionally, why was the beta band chosen for the S-ROI central region and the theta band for the S-ROI fronto-central region? Was this choice influenced by the MVPA results?

      We thank the reviewer for the question regarding the rationale for the S-ROI selection in our study. The beta band was chosen for the central region due to its established relevance in motor control (Engel & Fries, 2010), movement planning (Little et al., 2019) and motor inhibition (Duque et al., 2017). The fronto-central theta band (or frontal midline theta) was a widely recognized indicator in cognitive control research (Cavanagh & Frank, 2014), associated with conflict detection and resolution processes. Moreover, recent empirical evidence suggested that the fronto-central theta reflected the coordination and integration between stimuli and responses (Senoussi et al., 2022). Although we have described the cognitive processes linked to these different frequencies in the introduction and discussion sections, along with the potential patterns of results observed in Stroop-related studies, we did not specify the involved cortical areas. Therefore, we have specified these areas in the introduction to enhance the clarity of the revised version (in the fourth paragraph of the Introduction section).

      Regarding whether the selection of S-ROIs was influenced by the MVPA results, we would like to clarify here that we selected the S-ROIs based on prior research and then conducted the decoding analysis. Specifically, we first extracted the data representing different frequency indicators (three F-ROIs and three S-ROIs) as features, followed by decoding to obtain the MVPA results. Subsequently, the time-frequency analysis, combined with the specific time windows during which each frequency was decoded, provided detailed interaction patterns among the variables for each indicator. The specifics of feature selection are described in the revised version (in the first paragraph of the Multivariate Pattern Analysis section).

      Comment 2: In the Data Analysis section, line 424 states: “Only trials that were correct in both the memory task and the Stroop task were included in all subsequent analyses. In addition, trials in which response times (RTs) deviated by more than three standard deviations from the condition mean were excluded from behavioral analyses.” The percentage of excluded trials should be reported. Also, for the EEG-related analyses, were the same trials excluded, or were different criteria applied?

      We thank the reviewer for this suggestion. Beyond the behavioral exclusion criteria, trials with EEG artifacts were also excluded from the data for the EEG-related analyses. We have now reported the percentage of excluded trials for both behavioral and EEG data analyses in the revised version (in the second paragraph of the EEG Recording and Preprocessing section and the first paragraph of the Behavioral Analysis section).
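      The exclusion rule quoted in Comment 2 (correct trials only, then RTs within three standard deviations of the condition mean) can be illustrated with a minimal sketch. It is not the authors' code: the function names `keep_mask` and `percent_excluded` are hypothetical, and computing the mean and SD from correct trials only is an assumption, since the response does not say which trials enter those statistics.

```python
import numpy as np

def keep_mask(rt, correct, condition, n_sd=3.0):
    """Return a boolean mask of trials kept for behavioral analysis.

    A trial is kept if it is correct and its RT lies within n_sd
    standard deviations of the mean RT of its condition.
    Assumption: mean and SD are computed over correct trials only.
    """
    rt = np.asarray(rt, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    condition = np.asarray(condition)
    keep = correct.copy()
    for cond in np.unique(condition):
        idx = correct & (condition == cond)
        if idx.sum() < 2:
            continue  # SD is undefined for fewer than two trials
        mean = rt[idx].mean()
        sd = rt[idx].std(ddof=1)
        keep &= ~(idx & (np.abs(rt - mean) > n_sd * sd))
    return keep

def percent_excluded(keep):
    """Percentage of trials removed by the mask."""
    return 100.0 * (1.0 - np.asarray(keep, dtype=bool).mean())
```

      Reporting `percent_excluded(keep_mask(...))` per participant, or pooled, gives the kind of figure the reviewer asks for. EEG artifact rejection would be a separate mask combined with this one.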

      Comment 3: In the Methods section, line 493 mentions: “A 400-200 ms pre-stimulus time window was selected as the baseline time window.” What is the justification in the literature for choosing the 400-200 ms pre-stimulus window as the baseline? Why was the 200-0 ms pre-stimulus period not considered?

      We thank the reviewer for this question and would like to provide the following justification. First, although a baseline ending at 0 ms is common in ERP analyses, it may not be suitable for time-frequency analysis. Due to the inherent temporal smoothing characteristic of wavelet convolution in time-frequency decomposition, task-related early activities can leak into the pre-stimulus period (before 0 ms) (Cohen, 2014). This means that extending the baseline to 0 ms will include some post-stimulus activity in the baseline window, thereby increasing baseline power and compromising the accuracy of the results. Second, an ideal baseline duration is recommended to be around 10-20% of the entire trial of interest (Morales & Bowers, 2022). In our study, the epoch duration was 2000 ms, making 200-400 ms an appropriate baseline length. Third, given that the minimum duration of the fixation point before the stimulus in our experiment was 400 ms, we chose the 400 ms before the stimulus as the baseline point to ensure its purity. In summary, considering edge effects, duration requirements, and the need to exclude other influences, we selected a baseline correction window of -400 to -200 ms. To enhance the clarity of the revised version, we have provided the rationale for the selected time windows along with relevant references (in the first paragraph of the Time-frequency analysis section).
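      The baseline choice can be made concrete with a short sketch. The arithmetic behind the duration argument is that 10-20% of a 2000 ms epoch is 200-400 ms, and the window from -400 to -200 ms is 200 ms long. The code below normalizes power to the mean of that window. The decibel conversion and the function name `baseline_db` are assumptions made for illustration; the response does not state which normalization the authors used.

```python
import numpy as np

def baseline_db(power, times_ms, window=(-400.0, -200.0)):
    """Express time-frequency power as dB change from a pre-stimulus baseline.

    power    : array of shape (n_freqs, n_times), trial-averaged power
    times_ms : array of shape (n_times,), time relative to stimulus onset
    window   : baseline limits in ms; it ends before 0 ms so that
               post-stimulus activity smeared backward by the wavelet
               does not inflate the baseline
    """
    power = np.asarray(power, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)
    in_window = (times_ms >= window[0]) & (times_ms <= window[1])
    if not in_window.any():
        raise ValueError("baseline window contains no samples")
    # One baseline value per frequency, broadcast across time
    base = power[:, in_window].mean(axis=1, keepdims=True)
    return 10.0 * np.log10(power / base)
```

      Passing `window=(-200.0, 0.0)` to the same function would show how much the result changes when the baseline extends to stimulus onset, which is the comparison the reviewer raises.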

      Comment 4: Is the primary innovation of this study limited to the methodology, such as employing MVPA and RSA to establish the relationship between late theta activity and behavior?

      We thank the reviewer for this insightful question and would like to clarify that our research extends beyond mere methodological innovation; rather, it utilized new methods to explore novel theoretical perspectives. Specifically, our research presents three levels of innovation: methodological, empirical, and theoretical. First, methodologically, MVPA overcame the drawbacks of traditional EEG analyses based on specific averaged voltage intensities, providing new perspectives on how the brain dynamically encoded particular neural representations over time. Furthermore, RSA aimed to identify which indicators among the decoded were directly related to behavioral representation patterns. Second, in terms of empirical results, using these two methods, we have identified for the first time three EEG markers that modulate the Stroop effect under verbal working memory load: SP, late theta, and beta, with late theta being directly linked to the elimination of the behavioral Stroop effect. Lastly, from a theoretical perspective, we proposed the novel idea that working memory played a crucial role in the late stages of conflict processing, specifically in the stimulus-response mapping stage (the specific theoretical contributions are detailed in the second-to-last paragraph of the Discussion section).

      Comment 5: On page 14, lines 280-287, the authors discuss a specific pattern observed in the alpha band. However, the manuscript does not provide the corresponding results to substantiate this discussion. It is recommended to include these results as supplementary material.

      We thank the reviewer for this suggestion. We added a new figure along with the corresponding statistical results that displayed the specific result patterns for the alpha band (Supplementary Figure 1).

      Comment 6: On page 16, lines 323-328, the authors provide a generalized explanation of the findings. According to load theory, stimuli compete for resources only when represented in the same form. Since the pre-memorized Chinese characters are represented semantically in working memory, this explanation lacks a critical premise: that semantic-response mapping is also represented semantically during processing.

      We thank the reviewer for this insightful suggestion. We fully agree with the reviewer’s perspective. As stated in our revised version, load theory suggests that cognitive resources are limited and dependent on a specific type (in the second paragraph of the Discussion section). The previously memorized Chinese characters are stored in working memory in the form of semantic representations; meanwhile the stimulus-response mapping should also be represented semantically, leading to resource occupancy. We have included this logical premise in the revised version (in the third-to-last paragraph of the Discussion section).

      Comment 7: The classic Stroop task includes both a manual and a vocal version. Since stimulus-response mapping in the vocal version is more automatic than in the manual version, it is unclear whether the findings of this study would generalize to the impact of working memory load on the Stroop effect in the vocal version.

      We fully agree with the reviewer’s point that the verbal version of the Stroop task differs from the manual version in terms of the degree of automation in the stimulus-response mapping. Specifically, the verbal version relies on mappings that are established through daily language use, while the manual version involves arbitrary mappings created in the laboratory. Therefore, the stimulus-response mapping in the verbal response version is more automated and less likely to be suppressed. However, our previous research indicated that the degree of automation in the stimulus-response mapping was influenced by practice (Chen et al., 2013). After approximately 128 practice trials, semantic conflict almost disappears, suggesting that the level of automation in stimulus-response mapping for the verbal Stroop task is comparable to that of the manual version (Chen et al., 2010). Given that participants in our study completed 144 practice trials (in the Procedure section), we believe these findings can be generalized to the verbal version.

      Comment 8: While the discussion section provides a comprehensive analysis of the study’s results, the authors could further elaborate on the theoretical and practical contributions of this work.

      We thank the reviewer for the constructive suggestions. We recognize that the theoretical and practical contributions of the study were not thoroughly elaborated in the original manuscript. Therefore, we have now provided a more detailed discussion. Specifically, the theoretical contributions focus on advancing load theory and highlighting the critical role of working memory in conflict processing. The practical contributions emphasize the application of load theory and the development of intervention strategies for enhancing inhibitory control. A more detailed discussion can be found in the revised version (in the second-to-last paragraph of the Discussion section).

      Reviewer #2 (Public review):

      Comment 1: As the researchers mentioned, a previous study reported a diminished Stroop effect with concurrent working memory tasks to memorize meaningless visual shapes rather than memorize Chinese characters as in the study. My main concern is that lower-level graphic processing when memorizing visual shapes also influences the Stroop effect. The stage of Stroop conflict processing affected by the working memory load may depend on the specific content of the concurrent working memory task. If that’s the case, I sense that the generalization of this finding may be limited.

      We thank the reviewer for this insightful concern. As mentioned in the manuscript, this may be attributed to the inherent characteristics of Chinese characters. In contrast to English words, the processing of Chinese characters relies more on graphemic encoding and memory (Chen, 1993). Therefore, the processing of line patterns essentially occupies some of the resources needed for character processing, which aligns with our study’s hypothesis based on dimensional overlap. Additionally, regarding the results, even though the previous study presented lower-level line patterns, its results still showed that the working memory load modulated the later theta band. We hypothesize that, regardless of the specific content of the pre-presented working memory load, once the stimulus disappears from view, these loads are maintained as representations in the working memory platform. Therefore, they do not influence early perceptual processing, and resource competition only occurs once the distractors reach the working memory platform. Lastly, a previous study has shown that spatial loads, which do not overlap with either the target or distractor dimensions, do not influence the conflict effect (Zhao et al., 2010). Taken together, we believe that regardless of the specific content of the concurrent working memory tasks, as long as they occupy resources related to irrelevant stimulus dimensions, they can influence the late-stage processing of the conflict effect. Perhaps our original manuscript did not convey this clearly, so we have rephrased it in a more straightforward manner (in the second paragraph of the Discussion section).

      Comment 2: The P1 and N450 components are sensitive to congruency in previous studies as mentioned by the researchers, but the results in the present study did not replicate them. This raised concerns about data quality and needs to be explained.

      We thank the reviewer for this insightful concern. For P1, we aimed to convey that the early perceptual processing represented by P1 is part of conflict processing. Therefore, we included it in our analysis. Additionally, as mentioned in the discussion, most studies find P1 to be insensitive to congruency. However, in the introduction we inappropriately cited a study suggesting that P1 differs with congruency, which is among the few studies that hold this perspective. To prevent confusion for readers, we have removed this citation from the introduction.

      As for N450, most studies have indeed found it to be influenced by congruency. In our manuscript, we did not observe a congruency effect at our chosen electrodes and time window. However, significant congruency effects were detected at other central-parietal electrodes (CP3, CP4, P5, P6) during the 350-500 ms interval. The interaction between task type and consistency remained non-significant, consistent with previous results. Furthermore, with respect to the location of the electrodes chosen, existing studies on N450 vary widely, including central-parietal electrodes and frontal-central electrodes (for a review, see Heidlmayr et al., 2020). We speculate that this phenomenon may be related to the extent of practice. With fewer total trials, the task may involve more stimulus conflicts, engaging more frontal brain areas. On the other hand, with more total trials, the task may involve more response conflicts, engaging more central-parietal brain areas (Chen et al., 2013; van Veen & Carter, 2005). Due to the extensive practice required in our study, we identified a congruency N450 effect in the central-parietal region. We apologize for not thoroughly exploring other potential electrodes in the previous manuscript, and we have revised the results and interpretations regarding N450 accordingly in the revised version (in the N450 section of the ERP results and the third paragraph of the Discussion section).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Comment 1: In the Introduction, line 108 states: “Second, alpha oscillations (8-13 Hz) can serve as a neural inverse index of mental activity or alertness, while a decrease in alpha power reflects increased alertness or enhanced attentional inhibition of distractors (Arakaki et al., 2022; Tafuro et al., 2019; Zhou et al., 2023; Zhu et al., 2023).” Please clarify which specific psychological process related to conflict processing is reflected by alpha oscillations.

      We appreciate your suggestion and we have clearly highlighted the role of alpha oscillations in attentional engagement during conflict processing in the revised version (in the third-to-last paragraph of the introduction).

      Comment 2: In Figures 3C and 3E, a space is needed between “amplitude” and the preceding parenthesis. Similar adjustments are required in Figures 4A, 4B, 4C, 5C, and 6C. Additionally, in Figures 3B and 3D, a space should be added between the numbers and “ms.” This issue also appears in Figure 8. Please review all figures for these formatting inconsistencies.

      We apologize for the inconsistency in formatting and have corrected them throughout the revised version.

      Comment 3: There are some clerical errors in the manuscript that need correction. For instance, on page 19, line 403: “Participants were asked to answer by pressing one of two response buttons (“S with the left ring finger and “L” with the left ring finger).” This should be corrected to: “L” with the right ring finger. I recommend that the authors carefully proofread the manuscript to identify and correct such errors.

      We sincerely apologize for the errors present in the manuscript and have now carefully proofread it (in the Procedure section).

      Comment 4: On page 13, line 254, the elimination of the Stroop effect should not be interpreted as an improvement in processing.

      We greatly appreciate your suggestion. We agree that the elimination of the Stroop effect should not be confused with improvements in processing. We have corrected this in the revised version (the second paragraph of the Discussion section).

      Reviewer #3 (Recommendations for the authors):

      Comment 1: In the introduction section, the N450 was introduced as “a frontal-central negative deflection”, but in the methods part the N450 was computed using central-parietal electrodes. This inconsistency is confusing and needs to be clarified.

      We apologize for this confusion. We have provided a detailed explanation regarding the differences in electrodes and the rationale behind choosing central-parietal electrodes in our response to Reviewer 2’s second comment. To clarify, we have updated the introduction to consistently label them as central-parietal deflections (in the third paragraph of the Introduction section).

      Comment 2: I speculate the “beta” was mistakenly written as “theta” in line 212.

      We sincerely apologize for this mistake. We have corrected this error (in the RSA results section).

      Comment 3: The speculation that “changes in beta bands may be influenced by theta bands, thereby indirectly influencing the behavioral Stroop effect” needs to be rationalized.

      We appreciate your suggestion. What we intended to convey is that we found an interaction effect in the beta bands; however, the RSA results did not show a correlation with the behavioral interaction effect. We speculate that beta activity might be influenced by the theta bands. On the one hand, we realize that the idea of beta bands indirectly influencing the behavioral Stroop effect was inappropriate, and we have removed this point in the revised version. On the other hand, we have provided rational evidence for the idea that beta bands may be influenced by theta bands. This is based on the biological properties of theta oscillations, which support communication between different cortical neural signals, and their functional role in integrating and transmitting task-relevant information to response execution (in the third-to-last paragraph of the Discussion section).

      Comment 4: Typo in line 479: [10,10].

      We sincerely apologize for this mistake. We have corrected this error: [-10,10] (in the Multivariate pattern analysis section).

      Reference

      Cavanagh, J. F., & Frank, M. J. (2014). Frontal theta as a mechanism for cognitive control. Trends in Cognitive Sciences, 18(8), 414–421. https://doi.org/10.1016/j.tics.2014.04.012

      Chen, M. J. (1993). A Comparison of Chinese and English Language Processing. In Advances in Psychology (Vol. 103, pp. 97–117). North-Holland. https://doi.org/10.1016/S0166-4115(08)61659-3

      Chen, X. F., Jiang, J., Zhao, X., & Chen, A. (2010). Effects of practice on semantic conflict and response conflict in the Stroop task. Psychol. Sci., 33, 869–871.

      Chen, Z., Lei, X., Ding, C., Li, H., & Chen, A. (2013). The neural mechanisms of semantic and response conflicts: An fMRI study of practice-related effects in the Stroop task. NeuroImage, 66, 577–584. https://doi.org/10.1016/j.neuroimage.2012.10.028

      Cohen, M. X. (2014). Analyzing Neural Time Series Data: Theory and Practice. The MIT Press. https://doi.org/10.7551/mitpress/9609.001.0001

      Duprez, J., Gulbinaite, R., & Cohen, M. X. (2020). Midfrontal theta phase coordinates behaviorally relevant brain computations during cognitive control. NeuroImage, 207, 116340. https://doi.org/10.1016/j.neuroimage.2019.116340

      Duque, J., Greenhouse, I., Labruna, L., & Ivry, R. B. (2017). Physiological Markers of Motor Inhibition during Human Behavior. Trends in Neurosciences, 40(4), 219–236. https://doi.org/10.1016/j.tins.2017.02.006

      Engel, A. K., & Fries, P. (2010). Beta-band oscillations—Signalling the status quo? Current Opinion in Neurobiology, 20(2), 156–165. https://doi.org/10.1016/j.conb.2010.02.015

      Heidlmayr, K., Kihlstedt, M., & Isel, F. (2020). A review on the electroencephalography markers of Stroop executive control processes. Brain and Cognition, 146, 105637. https://doi.org/10.1016/j.bandc.2020.105637

      Little, S., Bonaiuto, J., Barnes, G., & Bestmann, S. (2019). Human motor cortical beta bursts relate to movement planning and response errors. PLOS Biology, 17(10), e3000479. https://doi.org/10.1371/journal.pbio.3000479

      Morales, S., & Bowers, M. E. (2022). Time-frequency analysis methods and their application in developmental EEG data. Developmental Cognitive Neuroscience, 54, 101067. https://doi.org/10.1016/j.dcn.2022.101067

      Senoussi, M., Verbeke, P., Desender, K., De Loof, E., Talsma, D., & Verguts, T. (2022). Theta oscillations shift towards optimal frequency for cognitive control. Nature Human Behaviour, 6(7), Article 7. https://doi.org/10.1038/s41562-022-01335-5

      van Veen, V., & Carter, C. S. (2005). Separating semantic conflict and response conflict in the Stroop task: A functional MRI study. NeuroImage, 27(3), 497–504. https://doi.org/10.1016/j.neuroimage.2005.04.042

      Zhao, X., Chen, A., & West, R. (2010). The influence of working memory load on the Simon effect. Psychonomic Bulletin & Review, 17(5), 687–692. https://doi.org/10.3758/PBR.17.5.687

    1. Reviewer #2 (Public review):

      Summary:

      This study proposes visual homogeneity as a novel visual property that enables observers to perform several seemingly disparate visual tasks, such as finding an odd item, deciding if two items are the same, or judging if an object is symmetric. In Exp 1, the reaction times on several objects were measured in human subjects. In Exp 2, the visual homogeneity of each object was calculated based on the reaction time data. The visual homogeneity scores predicted reaction times. This value was also correlated with the BOLD signals in a specific region anterior to LO. Similar methods were used to analyze reaction time and fMRI data in a symmetry detection task. It is concluded that visual homogeneity is an important feature that enables observers to solve these two tasks.

      Strengths:

      (1) The writing is very clear. The presentation of the study is informative.

      (2) This study includes several behavioral and fMRI experiments. I appreciate the scientific rigor of the authors.

      Weaknesses:

      Before addressing the manuscript itself, I would like to comment on the review process first. Having read the latest revised manuscript, I share many of the concerns raised by the two reviewers in the last two rounds of review. It appears that the authors have disagreed with the majority of comments made by the two reviewers. If so, I strongly recommend that the authors proceed to make this revision a Version of Record and conclude this review process. According to eLife's policy, the authors have the right to make a Version of Record at any time during the review process, and I fully respect that right. However, I also ask that the authors respect the reviewer's right to retain the comments regarding this paper.

      Besides that, I still have several further questions about this study.

      (1) My main concern with this paper is the way visual homogeneity is computed. On page 10, lines 188-192, it says: "we then asked if there is any point in this multidimensional representation such that distances from this point to the target-present and target-absent response vectors can accurately predict the target-present and target-absent response times with a positive and negative correlation respectively (see Methods)". This is also true for the symmetry detection task. If I understand correctly, the reference point in this perceptual space was found by deliberately satisfying the negative and positive correlations in response times. And then on page 10, lines 200-205, it shows that the positive and negative correlations actually exist. This logic is confusing. The positive and negative correlations emerge only because this method is optimized to do so. It seems more reasonable to identify the reference point of this perceptual space independently, without using the reaction time data. Otherwise, the inference process sounds circular. A simple way is to just use the mean point of all objects in Exp 1, without any optimization towards reaction time data. I raised this question in my initial review. However, the authors did not address whether the positive and negative correlations still hold if the mean point is defined as the reference point without any optimization. The authors also argue that it is similar to a case of fitting a straight line. It is fine that the authors insist on the straight line (e.g., correlation). However, I would not call "straight line correlations" a "quantitative model" in a high-profile journal like eLife. Please remove all related arguments of a novel quantitative model.
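      The mean-point check requested in point (1) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis: the object coordinates, the array names, and the helper functions `distance_to_reference` and `pearson_r` are all hypothetical, and Euclidean distance is assumed as the distance measure. The key property is that the reference point is fixed from the object coordinates alone, so response times play no part in choosing it.

```python
import numpy as np

def distance_to_reference(points, reference=None):
    """Euclidean distance of each row in `points` to a reference point.

    points    : array of shape (n_items, n_dims), coordinates in the
                perceptual space (hypothetical input)
    reference : array of shape (n_dims,); if None, the mean of `points`
                is used, so no response-time data enter the choice
    """
    points = np.asarray(points, dtype=float)
    if reference is None:
        reference = points.mean(axis=0)
    return np.linalg.norm(points - np.asarray(reference, dtype=float), axis=1)

def pearson_r(x, y):
    """Pearson correlation between two equal-length 1-D sequences."""
    return float(np.corrcoef(x, y)[0, 1])
```

      With the reference computed once from all objects in Exp 1, one would pass it explicitly when measuring distances for the target-present and target-absent response vectors, then correlate each set of distances with the corresponding response times. If the positive and negative correlations survive, they cannot be attributed to the optimization step.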

      (2) Visual homogeneity (at least in its current form) is an unnecessary term. It is similar to distractor heterogeneity/distractor variability/distractor saliency in the literature. However, the authors attempt to claim it as a novel concept. Both R1 and I raised this question in the very first review. However, the authors refused to revise the manuscript. In the last review, I mentioned this and provided some example sentences claiming novelty. The authors only revised the last sentence of the abstract, and did not even bother to revise the last sentence of the significance statement: "we show that these tasks can be solved using a simple property WE DEFINE as visual homogeneity". Also, line 851 still shows "we have defined a NOVEL image property, visual homogeneity...". I am confused about whether the authors agree or disagree that "visual homogeneity is an unnecessary term". If the authors agree, they should completely remove the related phrases throughout the paper. If not, they should keep all of these and state the reasons. I don't think this is a correct approach to revising a manuscript.

      (3) If the authors agree that visual homogeneity is not new, I suggest a complete rewrite of the title, abstract, significance, and introduction. Let me ask a simple question, can we remove "visual homogeneity" and use some more well-established term like "image feature similarity"? If yes, visual homogeneity is unnecessary.

      (4) If I understand it correctly, one of the key findings of this paper is "the response times for target-present searches were positively correlated with visual homogeneity. By contrast, the response times for target-absent searches were negatively correlated with visual homogeneity" (lines 204-207). I think the authors have already acknowledged that this positive correlation is not surprising at all because it reflects the classic target-distractor similarity effect. If this is the case, please completely remove the positive correlation as a novel prediction and finding.

      (5) In my last review, I mentioned that the seminal paper by Duncan and Humphreys (1989) clearly stated that "difficulty increases with increased similarity of targets to nontargets and decreased similarity between nontargets" (the sentence in their abstract). Here, "similarity between nontargets" is the same as the visual homogeneity defined here. Similar effects have been shown in Duncan (1989) and Nagy, Neriani, and Young (2005). See also the inconsistent results in Nagy & Thomas (2003) and Vincent, Baddeley, Troscianko & Gilchrist (2009). More recently, Wei Ji Ma has systematically investigated the effects of heterogeneous distractors in visual search. I think the introduction of Wei Ji Ma's paper (2020) provides a nice summary of this line of research.

      Thanks to the authors' revision, I now better understand the negative correlation. The between-distractor similarity mentioned above describes the heterogeneity of distractors WITHIN an image. However, if I understand it correctly, this study aims to address the negative correlation of reaction time and target-absent stimuli ACROSS images. In other words, why do humans show a shorter reaction time to an image of four pigeons than to an image of four dogs (as shown in Figure 2C)? Simply because the latter image is closer to the reference point of the image space. In this sense, this negative correlation is indeed not the same as distractor heterogeneity. However, this is known as the saliency effect or oddball effect. For example, it seems quite natural to me that humans respond faster to a fish image if the image set contains many images of four-legged dogs that look very different from fish. If this is indeed a saliency effect, why should we define a new term "visual homogeneity"?

      (6) The section "key predictions" is quite straightforward. I understand the logic of positive and negative correlations. However, what is the physical meaning of "decision boundary" (Fig. 1G) here? How does the "decision boundary" map on the image space?

      (7) In my opinion, one of the advantages of this study is the fMRI dataset, which is valuable because previous studies did not collect fMRI data. The key contribution may be the novel brain region associated with display heterogeneity. If this is the case, I would suggest using a more parametric way to measure this region. For example, one can use Gabor stimuli and systematically manipulate the variations of multiple Gabor stimuli; the same logic also applies to motion direction. If this study used static Gabors, random dot motion, and object images that span from low-level to high-level visual stimuli, and consistently showed that stimulus heterogeneity is encoded in one brain region, I would say this finding is valuable. But this sounds like another experiment. In other words, it is insufficient to claim a new brain region given the current form of the manuscript.

      References:

      * Duncan, J., & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96(3), 433-458. doi: 10.1037/0033-295x.96.3.433
      * Duncan, J. (1989). Boundary conditions on parallel processing in human vision. Perception, 18(4), 457-469. doi: 10.1068/p180457
      * Nagy, A. L., Neriani, K. E., & Young, T. L. (2005). Effects of target and distractor heterogeneity on search for a color target. Vision Research, 45(14), 1885-1899. doi: 10.1016/j.visres.2005.01.007
      * Nagy, A. L., & Thomas, G. (2003). Distractor heterogeneity, attention, and color in visual search. Vision Research, 43(14), 1541-1552. doi: 10.1016/s0042-6989(03)00234-7
      * Vincent, B., Baddeley, R., Troscianko, T., & Gilchrist, I. (2009). Optimal feature integration in visual search. Journal of Vision, 9(5), 15-15. doi: 10.1167/9.5.15
      * Singh, A., Mihali, A., Chou, W. C., & Ma, W. J. (2023). A Computational Approach to Search in Visual Working Memory.
      * Mihali, A., & Ma, W. J. (2020). The psychophysics of visual search with heterogeneous distractors. BioRxiv, 2020-08.
      * Calder-Travis, J., & Ma, W. J. (2020). Explaining the effects of distractor statistics in visual search. Journal of Vision, 20(13), 11-11.

    2. Author response:

      The following is the authors’ response to the previous reviews.

      We are grateful to the editors and reviewers for their careful reading and constructive comments. We have now done our best to respond to them fully through additional analyses and text revisions. In the sections below, the original reviewer comments are in black, and our responses are in red.

      To summarize, the major changes in this round of review are as follows:

      (1) We have included a new introductory figure (Figure 1) to explain the distinction between feature-based tasks and property-based tasks.

      (2) We have included a section on “key predictions” and a section on “overview of this study” in the Introduction to clearly delineate our key predictions and provide an overview of our study.

      (3) We have included additional analyses to address the reviewers’ concerns about circularity in Experiments 1 & 2. We show that distance-to-center or visual homogeneity computations performed on object representations obtained from deep networks (instead of the perceptual dissimilarities from Experiment 1) also yield comparable predictions of target-present and target-absent responses in Experiment 2.

      (4) We have extensively reworked the manuscript wherever possible to address the specific concerns raised by the reviewers.

      We hope that the revised manuscript adequately addresses the concerns raised in this round of review, and we look forward to a positive assessment.

      eLife Assessment

      This study uses carefully designed experiments to generate a useful behavioural and neuroimaging dataset on visual cognition. The results provide solid evidence for the involvement of higher-order visual cortex in processing visual oddballs and asymmetry. However, the evidence provided for the very strong claims of homogeneity as a novel concept in vision science, separable from existing concepts such as target saliency, is inadequate.

      Thank you for your positive assessment. We agree that visual homogeneity is similar to existing concepts such as target saliency, memorability etc. We have proposed it as a separate concept because visual homogeneity has an independent empirical measure (the reciprocal of target-absent search time in oddball search, or the reciprocal of same response time in a same-different task, etc) that may or may not be the same as other empirical measures such as saliency and memorability. Investigating these possibilities is beyond the scope of our study but would be interesting for future work. We have now clarified this in the revised manuscript (Discussion, p. 42).

      However, we’d like to emphasize that the question of whether visual homogeneity is novel or related to existing concepts misses entirely the key contribution of our study.

      Our key contribution is a quantitative, falsifiable model for how the brain could be solving property-based tasks like same-different, oddball or symmetry. Most theories of decision making consider feature-based tasks where there is a well-defined feature space and decision variable. Property-based tasks pose a significant challenge to standard theories since it is not clear how these tasks could be solved. In fact, oddball search, same-different and symmetry tasks have been considered so different that they are rarely even mentioned in the same study. Our study represents a unifying framework showing that all three tasks can be understood as solving the same underlying fundamental problem, and presents evidence in favor of this solution.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors define a new metric for visual displays, derived from psychophysical response times, called visual homogeneity (VH). They attempt to show that VH is explanatory of response times across multiple visual tasks. They use fMRI to find visual cortex regions with VH-correlated activity. On this basis, they declare a new visual region in human brain, area VH, whose purpose is to represent VH for the purpose of visual search and symmetry tasks.

      Thank you for your accurate and positive assessment.

      Strengths:

      The authors present carefully designed experiments, combining multiple types of visual judgments and multiple types of visual stimuli with concurrent fMRI measurements. This is a rich dataset with many possibilities for analysis and interpretation.

      Thank you for your accurate and positive assessment.

      Weaknesses:

      The datasets presented here should provide a rich basis for analysis. However, in this version of the manuscript, I believe that there are major problems with the logic underlying the authors' new theory of visual homogeneity (VH), with the specific methods they used to calculate VH, and with their interpretation of psychophysical results using these methods. These problems with the coherency of VH as a theoretical construct and metric value make it hard to interpret the fMRI results based on searchlight analysis of neural activity correlated with VH.

      We respectfully disagree with your concerns, and have done our best to respond to them fully below.

      In addition, the large regions of VH correlations identified in Experiments 1 and 2 vs. Experiments 3 and 4 are barely overlapping. This undermines the claim that VH is a universal quantity, represented in a newly discovered area of visual cortex, that underlies a wide variety of visual tasks and functions.

      We respectfully disagree with your assertion. First of all, there is partial overlap between the VH regions, for which there are several other obvious explanations that must be considered first before dismissing VH outright as a flawed construct. We acknowledge these alternatives in the Results (p. 27), and the relevant text is reproduced below.

      “We note that it is not straightforward to interpret the overlap between the VH regions identified in Experiments 2 & 4. The lack of overlap could be due to stimulus differences (natural images in Experiment 2 vs silhouettes in Experiment 4), visual field differences (items in the periphery in Experiment 2 vs items at the fovea in Experiment 4) and even due to different participants in the two experiments. There is evidence supporting all these possibilities: stimulus differences (Yue et al., 2014), visual field differences (Kravitz et al., 2013) as well as individual differences can all change the locus of neural activations in object-selective cortex (Weiner and Grill-Spector, 2012a; Glezer and Riesenhuber, 2013). We speculate that testing the same participants on search and symmetry tasks using similar stimuli and display properties would reveal even larger overlap in the VH regions that drive behavior.”

      Maybe I have missed something, or there is some flaw in my logic. But, absent that, I think the authors should radically reconsider their theory, analyses, and interpretations, in light of detailed comments below, in order to make the best use of their extensive and valuable datasets combining behavior and fMRI. I think doing so could lead to a much more coherent and convincing paper, albeit possibly supporting less novel conclusions.

      We respectfully disagree with your assessment, and we hope that our detailed responses below will convince you of the merit of our claims.

      THEORY AND ANALYSIS OF VH

      (1) VH is an unnecessary, complex proxy for response time and target-distractor similarity.

      VH is defined as a novel visual quality, calculable for both arrays of objects (as studied in Experiments 1-3) and individual objects (as studied in Experiment 4). It is derived from a distance-to-center calculation in a perceptual space. That space in turn is derived from multi-dimensional scaling of response times for target-distractor pairs in an oddball detection task (Experiments 1 and 2) or in a same-different task (Experiments 3 and 4). Proximity of objects in the space is inversely proportional to response times for arrays in which they were paired. These response times are higher for more similar objects. Hence, proximity is proportional to similarity. This is visible in Fig. 2B as the close clustering of complex, confusable animal shapes.

      VH, i.e. distance-to-center, for target-present arrays is calculated as shown in Fig. 1C, based on a point on the line connecting target and distractors. The authors justify this idea with previous findings that responses to multiple stimuli are an average of responses to the constituent individual stimuli. The distance of the connecting line to the center is inversely proportional to the distance between the two stimuli in the pair, as shown in Fig. 2D. As a result, VH is inversely proportional to distance between the stimuli and thus to stimulus similarity and response times. But this just makes VH a highly derived, unnecessarily complex proxy for target-distractor similarity and response time. The original response times on which the perceptual space is based are far more simple and direct measures of similarity for predicting response times.

      Thank you for carefully thinking through our logic. We agree that a distance-to-centre calculation is entirely unnecessary as an explanation for target-present visual search. The difficulty of target-present search is already known to be directly proportional to the similarity between target and distractor, so there is nothing new to explain here.

      However, this is a narrow and selective interpretation of our findings because you are focusing only on our results on target-present searches, which are only half of all our data. The other half is the target-absent responses which previously have had no clear explanation. You are also missing the fact that we are explaining same-different and symmetry tasks as well using the same visual homogeneity computation.

      We urge you to think more deeply about the problem of how to decide whether an oddball is present or not in the first place. How do we actually solve this task? There must be some underlying representation and decision process. Our study shows that a distance-to-centre computation can actually serve as a decision variable to solve disparate property-based visual tasks. These tasks pose a major challenge to standard models of decision making, because the underlying representation and decision variable have been unclear. Our study resolves this challenge by proposing a novel computation that can be used by the brain to solve all these disparate tasks, and bring these tasks into the ambit of standard theories of decision making.  

      Our results also explain several interesting puzzles in the literature. If oddball search was driven only by target-distractor similarity, the time taken to respond when a target is absent should not vary at all, and should actually be longer than for all target-present searches. But in fact, systematic variations in target-absent times have always been observed in the literature, but have never been explained using any theoretical model. Our results explain why target-absent times vary systematically – it is due to visual homogeneity.

      Similarly, in same-different tasks, participants are known to take longer to make a “different” response when the two items differ only slightly. By this logic, they should take the longest to make a “same” response, but in fact, paradoxically, participants are actually faster to make “same” responses. This fast-same effect has been noted several times, but never explained using any models. Our results provide an explanation of why “same” responses to an image vary systematically – it is due to visual homogeneity. 

      Finally, in symmetry tasks, symmetric objects evoke fast responses, and this has always been taken as evidence for special symmetry computations in the brain. But we show that the same distance-to-center computation can explain both responses to symmetric and asymmetric objects. Thus there is no need for a special symmetry computation in the brain.

      (2) The use of VH derived from Experiment 1 to predict response times in Experiment 2 is circular and does not validate the VH theory.

      The use of VH, a response time proxy, to predict response times in other, similar tasks, using the same stimuli, is circular. In effect, response times are being used to predict response times across two similar experiments using the same stimuli. Experiment 1 and the target present condition of Experiment 2 involve the same essential task of oddball detection. The results of Experiment 1 are converted into VH values as described above, and these are used to predict response times in Experiment 2 (Fig. 2F). Since VH is a derived proxy for response values in Experiment 1, this prediction is circular, and the observed correlation shows only consistency between two oddball detection tasks in two experiments using the same stimuli.

      You are indeed correct in noting that both Experiment 1 & 2 involve oddball search, and so at the superficial level, it looks circular that the oddball search data of Experiment 1 is being used to explain the oddball search data of Experiment 2.

      However, closer scrutiny reveals more fundamental differences: Experiment 1 consisted of only oddball search with the target appearing on the left or right, whereas Experiment 2 consisted of oddball search with the target either present or completely absent. In fact, we were merely using the search dissimilarities from Experiment 1 to reconstruct the underlying object representation, because it is well known that neural dissimilarities are predicted well by search dissimilarities (Sripati & Olson, 2009; Zhivago et al, 2014).

      To thoroughly refute any lingering concern about circularity, we reasoned that the model predictions for Experiment 2 could have been obtained by a distance-to-center computation on any brain-like object representation. To this end, we used object representations from deep neural networks pretrained on object categorization, whose representations are known to match well with the brain, and asked if a distance-to-center computation on these representations could predict the search data in Experiment 2. This was indeed the case, and these results are now included as an additional section in Supplementary Material (Section S1).

      (3) The negative correlation of target-absent response times with VH as it is defined for target-absent arrays, based on distance of a single stimulus from center, is uninterpretable without understanding the effects of center-fitting. Most likely, center-fitting and the different VH metric for target-absent trials produce an inverse correlation of VH with target-distractor similarity.

      Unfortunately, as we have mentioned above, target-distractor similarity cannot explain how target-absent searches behave, since there is no distractor in such searches.

      We do understand your broader concern about the center-fitting algorithm itself. We performed a number of additional analyses to confirm the generality of our results and reject alternate explanations – these are summarized in a new section titled “Confirming the generality of visual homogeneity” (p. 12), and the section is reproduced below for your convenience.   

      “Confirming the generality of visual homogeneity

      We performed several additional analyses to confirm the generality of our results, and to reject alternate explanations.

      First, it could be argued that our results are circular because they involve taking oddball search times from Experiment 1 and using them to explain search response times in Experiment 2. This is a superficial concern since we are using the search dissimilarities from Experiment 1 only as a proxy for the underlying neural representation, based on previous reports that neural dissimilarities closely match oddball search dissimilarities (Sripati and Olson, 2010; Zhivago and Arun, 2014). Nonetheless, to thoroughly refute this possibility, we reasoned that we would get similar predictions of the target present/absent responses in Experiment 2 using any other brain-like object representation. To confirm this, we replaced the object representations derived from Experiment 1 with object representations derived from deep neural networks pretrained for object categorization, and asked if distance-to-center computations could predict the target present/absent responses in Experiment 2. This was indeed the case (Section S1).

      Second, we wondered whether the nonlinear optimization process of finding the best-fitting center could be yielding disparate optimal centres each time. To investigate this, we repeated the optimization procedure with many randomly initialized starting points, and obtained the same best-fitting center each time (see Methods).

      Third, to confirm that the above model fits are not due to overfitting, we performed a leave-one-out cross validation analysis. We left out all target-present and target-absent searches involving a particular image, and then predicted these searches by calculating visual homogeneity estimated from all other images. This too yielded similar positive and negative correlations (r = 0.63, p < 0.0001 for target-present, r = -0.63, p < 0.001  for target-absent).

      Fourth, if heterogeneous displays indeed elicit similar neural responses due to mixing, then their average distance to other objects must be related to their visual homogeneity. We confirmed that this was indeed the case, suggesting that the average distance of an object from all other objects in visual search can predict visual homogeneity (Section S1).

      Fifth, the above results are based on taking the neural response to oddball arrays to be the average of the target and distractor responses. To confirm that averaging was indeed the optimal choice, we repeated the above analysis by assuming a range of relative weights between the target and distractor. The best correlation was obtained for almost equal weights in the lateral occipital (LO) region, consistent with averaging and its role in the underlying perceptual representation (Section S1).

      Finally, we performed several additional experiments on a larger set of natural objects as well as on silhouette shapes. In all cases, present/absent responses were explained using visual homogeneity (Section S2).”
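
      The fifth check in the quoted section (varying the relative weight given to target and distractor) can be illustrated with a small sweep. This is only a sketch on synthetic data: the function name `weight_sweep`, the choice of response time as the quantity being correlated, and the fixed `center` are assumptions made for illustration, and the quoted text does not specify how the authors' own sweep was implemented.

```python
import numpy as np

def weight_sweep(items, pairs, center, rt_present, weights):
    """For each target weight w, model the array response as
    w * target + (1 - w) * distractor and correlate its distance
    to the center with the target-present response times."""
    targets = items[pairs[:, 0]]
    distractors = items[pairs[:, 1]]
    correlations = []
    for w in weights:
        array_response = w * targets + (1.0 - w) * distractors
        distance = np.linalg.norm(array_response - center, axis=1)
        correlations.append(np.corrcoef(distance, rt_present)[0, 1])
    return np.array(correlations)

# Synthetic stand-in data (hypothetical): RTs generated under equal weighting.
rng = np.random.default_rng(1)
items = rng.normal(size=(10, 2))
pairs = np.array([(i, j) for i in range(10) for j in range(10) if i != j])
center = np.array([0.3, -0.1])
mids = 0.5 * (items[pairs[:, 0]] + items[pairs[:, 1]])
rt_present = 1.0 + 0.5 * np.linalg.norm(mids - center, axis=1)

weights = np.linspace(0.0, 1.0, 11)
correlations = weight_sweep(items, pairs, center, rt_present, weights)
best_weight = weights[np.argmax(correlations)]
print(best_weight)
```

      Since the synthetic response times were generated under equal weighting, the sweep recovers a weight of 0.5. On real data, a peak near 0.5 would be consistent with averaging, while a peak elsewhere would argue for unequal mixing.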

      The construction of the VH perceptual space also involves fitting a "center" point such that distances to center predict response times as closely as possible. The effect of this fitting process on distance-to-center values for individual objects or clusters of objects is unknowable from what is presented here. These effects would depend on the residual errors after fitting response times with the connecting line distances. The center point location and its effects on distance-to-center of single objects and object clusters are not discussed or reported here.

      While it is true that the optimal center needs to be found by fitting to the data, there is no particular mystery to the algorithm: we are simply performing standard gradient descent to maximize the fit to the data. We have described the algorithm clearly and are making our code public. We find that the algorithm yields stable optimal centers despite many randomly initialized starting points. We find that the optimal center can predict responses to entirely novel images that were excluded during model training. We are making no assumption about the location of the center with respect to individual points. Therefore, we see no cause for concern regarding the center-finding algorithm.
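
      For readers who want a concrete picture of such a procedure, the sketch below fits a center by gradient ascent from several random starting points. It is not the authors' released code. The objective (correlation of target-present response times with distance-to-center, minus the same correlation for target-absent response times), the finite-difference gradient, and all names such as `vh_objective` and `fit_center` are assumptions chosen for illustration.

```python
import numpy as np

def vh_objective(center, items, pairs, rt_present, rt_absent):
    """Assumed objective: corr(present RT, VH) minus corr(absent RT, VH)."""
    mids = 0.5 * (items[pairs[:, 0]] + items[pairs[:, 1]])  # averaged array response
    d_present = np.linalg.norm(mids - center, axis=1)
    d_absent = np.linalg.norm(items - center, axis=1)
    r_present = np.corrcoef(d_present, rt_present)[0, 1]
    r_absent = np.corrcoef(d_absent, rt_absent)[0, 1]
    return r_present - r_absent

def fit_center(start, args, n_steps=200, step=0.1, eps=1e-5):
    """Gradient ascent with a finite-difference gradient; a step is kept
    only if it improves the objective, otherwise the step size is halved."""
    center = np.asarray(start, dtype=float).copy()
    best = vh_objective(center, *args)
    for _ in range(n_steps):
        grad = np.zeros_like(center)
        for k in range(center.size):
            delta = np.zeros_like(center)
            delta[k] = eps
            grad[k] = (vh_objective(center + delta, *args)
                       - vh_objective(center - delta, *args)) / (2 * eps)
        candidate = center + step * grad
        value = vh_objective(candidate, *args)
        if value > best:
            center, best = candidate, value
        else:
            step *= 0.5
    return center, best

# Synthetic stand-in data (hypothetical) with a known generating center.
rng = np.random.default_rng(0)
items = rng.normal(size=(20, 2))
pairs = np.array([(i, j) for i in range(20) for j in range(i + 1, 20)])
true_center = np.array([0.4, -0.2])
mids = 0.5 * (items[pairs[:, 0]] + items[pairs[:, 1]])
rt_present = 1.0 + 0.5 * np.linalg.norm(mids - true_center, axis=1)
rt_absent = 2.0 - 0.3 * np.linalg.norm(items - true_center, axis=1)
args = (items, pairs, rt_present, rt_absent)

starts = rng.normal(size=(5, 2))  # random initializations
fits = [fit_center(s, args) for s in starts]
for s, (c, v) in zip(starts, fits):
    print(np.round(s, 2), "->", np.round(c, 2), "objective", round(float(v), 3))
```

      Comparing the fitted centers and objective values across starting points is one simple way to report the stability that the response describes. The sketch does not address the separate question of whether a center obtained this way is independent of the data it is later used to explain.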

      Yet, this uninterpretable distance-to-center of single objects is chosen as the metric for VH of target-absent displays (VHabsent). This is justified by the idea that arrays of a single stimulus will produce an average response equal to one stimulus of the same kind. But it is not logically clear why response strength to a stimulus should be a metric for homogeneity of arrays constructed from that stimulus, or even what homogeneity could mean for a single stimulus from this set. And it is not clear how this VHabsent metric based on single stimuli can be equated to the connecting line VH metric for stimulus pairs, i.e. VHpresent, or how both could be plotted on a single continuum.

      Most visual tasks, such as finding an animal, are thought to involve building a decision boundary on some underlying neural representation. Even visual search has been portrayed as a signal-detection problem where a particular target is to be discriminated from a distractor. However, none of these formulations work in the case of property-based visual tasks, where there is no unique feature to look for.

      We are proposing that, when we view a search array, the neural response to the search array can be deduced from the neural responses to the individual elements using well known rules, and that decisions about an oddball target being present or absent can be made by computing the distance of this neural response from some canonical mean firing rate of a population of neurons. This distance to center computation is what we denote as visual homogeneity. We have revised our manuscript throughout to make this clearer and we hope that this helps you understand the logic better. 
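
      As a toy illustration of the proposal described above, the sketch below turns a distance-to-center value into a present/absent decision. The averaging rule, the placement of the boundary, the direction of the rule (arrays far from the center are called "absent"), and the function name `vh_decision` are assumptions made for this example and are not taken from the manuscript's Methods.

```python
import numpy as np

def vh_decision(target, distractor, center, boundary):
    """Return (VH, choice, evidence) for a two-item-type search array.

    Assumptions for this sketch: the array response is the average of the
    two item responses, VH is its Euclidean distance to `center`, and
    arrays with VH above `boundary` are judged target-absent.
    """
    array_response = 0.5 * (np.asarray(target, float) + np.asarray(distractor, float))
    vh = float(np.linalg.norm(array_response - np.asarray(center, float)))
    choice = "absent" if vh > boundary else "present"
    evidence = abs(vh - boundary)  # larger evidence would imply faster responses
    return vh, choice, evidence

center = np.zeros(2)
a = np.array([3.0, 0.0])
b = np.array([-3.0, 0.0])

homogeneous = vh_decision(a, a, center, boundary=1.5)  # all items identical
oddball = vh_decision(a, b, center, boundary=1.5)      # target b among distractors a
print(homogeneous, oddball)
```

      In this toy case the homogeneous array sits far from the center and the mixed array sits at it, so the two fall on opposite sides of the boundary. Under the same assumptions, response time would shrink as the evidence term grows, which is what would produce opposite-signed correlations for present and absent trials.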

      It is clear, however, what *should* be correlated with difficulty and response time in the target-absent trials, and that is the complexity of the stimuli and the numerosity of similar distractors in the overall stimulus set. Complexity of the target, similarity with potential distractors, and number of such similar distractors all make ruling out distractor presence more difficult. The correlation seen in Fig. 2G must reflect these kinds of effects, with higher response times for complex animal shapes with lots of similar distractors and lower response times for simpler round shapes with fewer similar distractors.

      You are absolutely correct that the stimulus complexity should matter, but there are no good empirically derived measures for stimulus complexity, other than subjective ratings which are complex on their own and could be based on any number of other cognitive and semantic factors. But considering what factors are correlated with target-absent response times is entirely different from asking what decision variable or template is being used by participants to solve the task.

      The example points in Fig. 2G seem to bear this out, with higher response times for the deer stimulus (complex, many close distractors in the Fig. 2B perceptual space) and lower response times for the coffee cup (simple, few close distractors in the perceptual space). While the meaning of the VH scale in Fig. 2G, and its relationship to the scale in Fig. 2F, are unknown, it seems like the Fig. 2G scale has an inverse relationship to stimulus complexity, in contrast to the expected positive relationship for Fig. 2F. This is presumably what creates the observed negative correlation in Fig. 2G.

      Taken together, points 1-3 suggest that VHpresent and VHabsent are complex, unnecessary, and disconnected metrics for understanding target detection response times. The standard, simple explanation should stand. Task difficulty and response time in target detection tasks, in both present and absent trials, are positively correlated with target-distractor similarity.

      We strongly disagree. Your assessment seems to be based on only considering target-present searches, which are of course driven by target-distractor similarity. Your argument is flawed because systematic variations in target-absent trials cannot be linked to any target-distractor similarity since there are no targets in the first place in such trials.

      We have shown that target-absent response times are in fact, independent of experimental context, which means that they index an image property that is independent of any reference target (Results, p. 15; Section S4). This property is what we define as visual homogeneity.

      I think my interpretations apply to Experiments 3 and 4 as well, although I find the analysis in Fig. 4 especially hard to understand. The VH space in this case is based on Experiment 3 oddball detection in a stimulus set that included both symmetric and asymmetric objects. But the response times for a very different task in Experiment 4, a symmetric/asymmetric judgment, are plotted against the axes derived from Experiment 3 (Fig. 4F and 4G). It is not clear to me why a measure based on oddball detection that requires no use of symmetry information should be predictive of within-stimulus symmetry detection response times. If it is, that requires a theoretical explanation not provided here.

      We were simply using an oddball detection task to construct the underlying object representation, on the basis of observations that search dissimilarities are strongly correlated with neural dissimilarities. In Section S1, we show that similar results could have been obtained using other object representations such as deep networks, as long as the representation is brain-like.

      (4) Contrary to the VH theory, same/different tasks are unlikely to depend on a decision boundary in the middle of a similarity or homogeneity continuum.

      We have provided empirical proof for our claims, by showing that target-present response times in a visual search task are correlated with “different” responses in the same-different task, and that target-absent response times in the visual search task are correlated with “same” responses in the same-different task (Section S4).

      The authors interpret the inverse relationship of response times with VHpresent and VHabsent, described above, as evidence for their theory. They hypothesize, in Fig. 1G, that VHpresent and VHabsent occupy a single scale, with maximum VHpresent falling at the same point as minimum VHabsent. This is not borne out by their analysis, since the VHpresent and VHabsent value scales are mainly overlapping, not only in Experiments 1 and 2 but also in Experiments 3 and 4. The authors dismiss this problem by saying that their analyses are a first pass that will require future refinement. Instead, the failure to conform to this basic part of the theory should be a red flag calling for revision of the theory.

      Again, the opposite correlations of target-present and target-absent search times with VH are the crucial empirical validation of our claim that a distance-to-center calculation explains how we perform these property-based tasks. The VH predictions do not fully explain the data. We have explicitly acknowledged this shortcoming, so we are hardly dismissing it as a problem.

      The reason for this single scale is that the authors think of target detection as a boundary decision task, along a single scale, with a decision boundary somewhere in the middle, separating present and absent. This model makes sense for decision dimensions or spaces where there are two categories (right/left motion; cats vs. dogs), separated by an inherent boundary (equal left/right motion; training-defined cat/dog boundary). In these cases, there is less information near the boundary, leading to reduced speed/accuracy and producing a pattern like that shown in Fig. 1G.

      Finding an oddball, deciding if two items are same or different and symmetry tasks are disparate visual tasks that do not fit neatly into standard models of decision making. The key conceptual advance of our study is that we propose a plausible neural representation and decision variable that allow all three property-based visual tasks to be reconciled with standard models of decision making.

      This logic does not hold for target detection tasks. There is no inherent middle point boundary between target present and target absent. Instead, in both types of trial, maximum information is present when target and distractors are most dissimilar, and minimum information is present when target and distractors are most similar. The point of greatest similarity occurs at the limit of any metric for similarity. Correspondingly, there is no middle point dip in information that would produce greater difficulty and higher response times. Instead, task difficulty and response times increase monotonically with similarity between targets and distractors, for both target present and target absent decisions. Thus, in Figs. 2F and 2G, response times appear to be highest for animals, which share the largest numbers of closely similar distractors.

      Your alternative explanation rests on vague factors like “maximum information” which cannot be quantified. By contrast, we are proposing a concrete, falsifiable model for three property-based tasks – same/different, oddball present/absent and object symmetry. Any argument based solely on item similarity to explain visual search or symmetry responses cannot explain the systematic variations observed for target-absent arrays and for symmetric objects, for the reasons explained earlier.

      DEFINITION OF AREA VH USING fMRI

      (1) The area VH boundaries from different experiments are nearly completely non-overlapping.

      In line with their theory that VH is a single continuum with a decision boundary somewhere in the middle, the authors use fMRI searchlight to find an area whose responses positively correlate with homogeneity, as calculated across all of their target present and target absent arrays. They report VH-correlated activity in regions anterior to LO. However, the VH defined by symmetry Experiments 3 and 4 (VHsymmetry) is substantially anterior to LO, while the VH defined by target detection Experiments 1 and 2 (VHdetection) is almost immediately adjacent to LO. Fig. S13 shows that VHsymmetry and VHdetection are nearly non-overlapping. This is a fundamental problem with the claim of discovering a new area that represents a new quantity that explains response times across multiple visual tasks. In addition, it is hard to understand why VHsymmetry does not show up in a straightforward subtraction between symmetric and asymmetric objects, which should show a clear difference in homogeneity.

      We respectfully disagree. The partial overlap between the VH regions identified in Experiments 1 & 2 can hardly be taken as evidence against the quantity VH itself, because there are several other obvious alternate explanations for this partial overlap, as summarized earlier as well. The VH region does show up in a straightforward subtraction between symmetric and asymmetric objects (Section S7), so we are not sure what the Reviewer is referring to here.

      (2) It is hard to understand how neural responses can be correlated with both VHpresent and VHabsent.

      The main paper results for VHdetection are based on both target-present and target-absent trials, considered together. It is hard to interpret the observed correlations, since the VHpresent and VHabsent metrics are calculated in such different ways and have opposite correlations with target similarity, task difficulty, and response times (see above). It may be that one or the other dominates the observed correlations. It would be clarifying to analyze correlations for target-present and target-absent trials separately, to see if they are both positive and correlated with each other.

      Thanks for raising this point. We have now confirmed that the positive correlation between VH and the neural response in the VH region holds even when the analysis is done separately for target-present searches (n = 32, r = 0.66, p < 0.0005) and target-absent searches (n = 32, r = 0.56, p < 0.005).

      (3) Definition of the boundaries and purpose of a new visual area in the brain requires circumspection, abundant and convergent evidence, and careful controls.

      Even if the VH metric, as defined and calculated by the authors here, is a meaningful quantity, it is a bold claim that a large cortical area just anterior to LO is devoted to calculating this metric as its major task. Vision involves much more than target detection and symmetry detection. Cortex anterior to LO is bound to perform a much wider range of visual functionalities. If the reported correlations can be clarified and supported, it would be more circumspect to treat them as one byproduct of unknown visual processing in cortex anterior to LO, rather than treating them as the defining purpose for a large area of visual cortex.

      We totally agree with you that reporting a new brain region would require careful interpretation and abundant and converging evidence. However, this requires many studies' worth of work, and historically category-selective regions like the FFA have achieved consensus only after they were replicated and confirmed across many studies. We believe our proposal for the computation of a quantity like visual homogeneity is conceptually novel, and our study represents a first step that provides some converging evidence (through replicable results across different experiments) for such a region. We have reworked our manuscript to make this point clearer (Discussion, p. 32).

      Reviewer #3 (Public Review):

      Summary:

      This study proposes visual homogeneity as a novel visual property that enables observers to perform several seemingly disparate visual tasks, such as finding an odd item, deciding if two items are the same, or judging if an object is symmetric. In Exp 1, the reaction times on several objects were measured in human subjects. In Exp 2, the visual homogeneity of each object was calculated based on the reaction time data. The visual homogeneity scores predicted reaction times. This value was also correlated with the BOLD signals in a specific region anterior to LO. Similar methods were used to analyze reaction time and fMRI data in a symmetry detection task. It is concluded that visual homogeneity is an important feature that enables observers to solve these two tasks.

      Thank you for your accurate and positive assessment.

      Strengths:

      (1) The writing is very clear. The presentation of the study is informative.

      (2) This study includes several behavioral and fMRI experiments. I appreciate the scientific rigor of the authors.

      We are grateful to you for your balanced assessment and constructive comments.

      Weaknesses:

      (1) My main concern with this paper is the way visual homogeneity is computed. On page 10, lines 188-192, it says: "we then asked if there is any point in this multidimensional representation such that distances from this point to the target-present and target-absent response vectors can accurately predict the target-present and target-absent response times with a positive and negative correlation respectively (see Methods)". This is also true for the symmetry detection task. If I understand correctly, the reference point in this perceptual space was found by deliberately satisfying the negative and positive correlations in response times. And then on page 10, lines 200-205, it shows that the positive and negative correlations actually exist. This logic is confusing. The positive and negative correlations emerge only because this method is optimized to do so. It seems more reasonable to identify the reference point of this perceptual space independently, without using the reaction time data. Otherwise, the inference process sounds circular. A simple way is to just use the mean point of all objects in Exp 1, without any optimization towards reaction time data.

      We disagree with you since the same logic applies to any curve-fitting procedure. When we fit data to a straight line, we are finding the slope and intercept that minimizes the error between the data and the straight line, but we would hardly consider the process circular when a good fit is achieved – in fact we take it as a confirmation that the data can be fit linearly. In the same vein, we would not have observed a good fit to the data, if there did not exist any good reference point relative to which the distances of the target-present and target-absent search arrays predicted these response times.
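
      The fitting logic under discussion can be made concrete with a minimal sketch. It assumes a 2-D toy embedding, invented coordinates and response times generated from a known center, and a brute-force grid search; none of these are the study's data or its actual optimization procedure (see the manuscript's Methods for that). All function and variable names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def distances_to(center, points):
    """Distance of every array vector to a candidate reference point."""
    return [math.dist(center, p) for p in points]

def fit_center(present_pts, present_rt, absent_pts, absent_rt, candidates):
    """Pick the candidate whose distances correlate most positively with
    target-present RTs and most negatively with target-absent RTs."""
    def score(c):
        return (pearson(distances_to(c, present_pts), present_rt)
                - pearson(distances_to(c, absent_pts), absent_rt))
    return max(candidates, key=score)

# Toy data (invented for illustration): RTs are generated from a known center.
TRUE_CENTER = (0.0, 0.0)
present_pts = [(1.0, 0.0), (0.0, 2.0), (-3.0, 0.0), (0.0, -4.0)]
absent_pts = [(5.0, 0.0), (0.0, 6.0), (-7.0, 0.0), (0.0, -8.0)]
present_rt = [0.5 + 0.1 * d for d in distances_to(TRUE_CENTER, present_pts)]
absent_rt = [2.0 - 0.1 * d for d in distances_to(TRUE_CENTER, absent_pts)]

# Brute-force search over a small grid of candidate reference points.
candidates = [(float(x), float(y)) for x in range(-1, 2) for y in range(-1, 2)]
best = fit_center(present_pts, present_rt, absent_pts, absent_rt, candidates)
r_present = pearson(distances_to(best, present_pts), present_rt)
r_absent = pearson(distances_to(best, absent_pts), absent_rt)
```

      In this toy case the search recovers the generating center because the response times really do depend on distance to it. A fit of this kind is only informative on real data when it also predicts response times that were held out of the fit.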

      In Section S2, we show that the visual homogeneity estimate for each object is strongly correlated with the average distance of that object to all other objects (r = 0.84, p < 0.0005, Figure S1).

      We have performed several additional analyses to confirm the generality of our results and to reject alternate explanations (see Results, p. 12, Section titled “Confirming the generality of visual homogeneity”). In particular, to confirm that the results we obtained are not due to overfitting, we performed a cross-validation analysis, where we removed all searches involving a particular image and predicted these response times using visual homogeneity. This too revealed a significant model correlation confirming that our results are not due to overfitting.

      (2) Visual homogeneity (at least given the current form) is an unnecessary term. It is similar to distractor heterogeneity/distractor variability/distractor statistics in the literature. However, the authors attempt to claim it as a novel concept. The title is "visual homogeneity computations in the brain enable solving generic visual tasks". The last sentence of the abstract is "a NOVEL IMAGE PROPERTY, visual homogeneity, is encoded in a localized brain region, to solve generic visual tasks". In the significance, it is mentioned that "we show that these tasks can be solved using a simple property WE DEFINE as visual homogeneity". If the authors agree that visual homogeneity is not new, I suggest a complete rewrite of the title, abstract, significance, and introduction.

      We respectfully disagree that visual homogeneity is an unnecessary term. Please see our comments to Reviewer 1 above. Just like saliency and memorability can be measured empirically, we propose that visual homogeneity can be empirically measured as the reciprocal of the target-absent search time in a search task, or as the reciprocal of the “same” response time in a same-different task. Understanding how these three quantities interact will require measuring them empirically for an identical set of images, which is beyond the scope of this study but an interesting possibility for future work.
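
      The empirical measure proposed above can be sketched as follows. This is a minimal illustration that assumes response times are averaged across trials before taking the reciprocal (the order of averaging is an assumption made here); the image names and response times are hypothetical, not data from the study.

```python
from statistics import fmean

def empirical_visual_homogeneity(absent_rts_by_image):
    """Map each image to 1 / (mean target-absent response time, in seconds).

    The result is in units of 1/s: arrays that are judged target-absent
    quickly receive a high visual homogeneity score.
    """
    return {image: 1.0 / fmean(rts) for image, rts in absent_rts_by_image.items()}

# Hypothetical target-absent response times (seconds) for two images.
toy_rts = {
    "image_a": [0.4, 0.6],   # mean 0.5 s
    "image_b": [1.9, 2.1],   # mean 2.0 s
}
vh = empirical_visual_homogeneity(toy_rts)
```

      The same function would apply to “same” response times from a same-different task, which is the other empirical route the response mentions.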

      (3) Also, "solving generic tasks" is another overstatement. The oddball search tasks, same-different tasks, and symmetric tasks are only a small subset of many visual tasks. Can this "quantitative model" solve motion direction judgment tasks, visual working memory tasks? Perhaps so, but at least this manuscript provides no such evidence. On line 291, it says "we have proposed that visual homogeneity can be used to solve any task that requires discriminating between homogeneous and heterogeneous displays". I think this is a good statement. A title that says "XXXX enable solving discrimination tasks with multi-component displays" is more acceptable. The phrase "generic tasks" is certainly an exaggeration.

      Thank you for your suggestion. We have now replaced the term “generic tasks” with the term property-based tasks, which we feel is more appropriate and reflect the fact that oddball search, same-different and symmetry tasks all involve looking for a specific image property.

      (4) If I understand it correctly, one of the key findings of this paper is "the response times for target-present searches were positively correlated with visual homogeneity. By contrast, the response times for target-absent searches were negatively correlated with visual homogeneity" (lines 204-207). I think the authors have already acknowledged that the positive correlation is not surprising at all because it reflects the classic target-distractor similarity effect. But the authors claim that the negative correlation in target-absent searches is the true novel finding.

      (5) I would like to make it clear that this negative correlation is not new either. The seminal paper by Duncan and Humphreys (1989) has clearly stated that "difficulty increases with increased similarity of targets to nontargets and decreased similarity between nontargets" (the sentence in their abstract). Here, "similarity between nontargets" is the same as the visual homogeneity defined here. Similar effects have been shown in Duncan (1989) and Nagy, Neriani, and Young (2005). See also the inconsistent results in Nagy & Thomas (2003) and Vincent, Baddeley, Troscianko & Gilchrist (2009). More recently, Wei Ji Ma has systematically investigated the effects of heterogeneous distractors in visual search. I think the introduction part of Wei Ji Ma's paper (2020) provides a nice summary of this line of research. I am surprised that these references are not mentioned at all in this manuscript (except Duncan and Humphreys, 1989).

      You are right in noting that Duncan and Humphreys (1989) propose that searches are more difficult when nontargets are dissimilar. However, since our searches have identical distractors, the similarity between nontargets is always constant across target-absent searches, and therefore this cannot predict any systematic variation in target-absent search that is observed in our data. By contrast, our results explain both target-absent searches and target-present searches.

      Thank you for pointing us to previous work. These studies show that it is not just the average distractor similarity but the statistics of the distractor similarity that drive visual search. However, these studies do not explain why target-absent searches should vary systematically.

      (6) If the key contribution is the quantitative model, the study should be organized in a different way. Although the findings of positive and negative correlations are not novel, it is still good to propose new models to explain classic phenomena. I would like to mention the three studies by Wei Ji Ma (see below). In these studies, Bayesian observer models were established to account for trial-by-trial behavioral responses. These computational models can also account for the set-size effect, behavior in both localization and detection tasks. I see much more scientific rigor in their studies. Going back to the quantitative model in this paper, I am wondering whether the model can provide any qualitative prediction beyond the positive and negative correlations? Can the model make qualitative predictions that differ from those of Wei Ji's model? If not, can the authors show that the model can quantitatively better account for the data than existing Bayesian models? We should evaluate a model either qualitatively or quantitatively.

      Thank you for pointing us to prior work by Wei Ji Ma. These studies systematically examined visual search for a target among heterogeneous distractors using simple parametric stimuli and a Bayesian modeling framework. By contrast, our experiments involve searching for single oddball targets among multiple identical distractors, so it is not clear to us that the Wei Ji Ma models can be easily used to generate predictions about these searches used in our study. 

      We are not sure what you mean by offering quantitative predictions beyond positive and negative correlations. We have tried to explain systematic variation in target-present and target-absent response times using a model of how these decisions are being made. Our model explains a lot of systematic variation in the data for both types of decisions.

      (7) In my opinion, one of the advantages of this study is the fMRI dataset, which is valuable because previous studies did not collect fMRI data. The key contribution may be the novel brain region associated with display heterogeneity. If this is the case, I would suggest using a more parametric way to measure this region. For example, one can use Gabor stimuli and systematically manipulate the variations of multiple Gabor stimuli, the same logic also applies to motion direction. If this study uses static Gabor, random dot motion, object images that span from low-level to high-level visual stimuli, and consistently shows that the stimulus heterogeneity is encoded in one brain region, I would say this finding is valuable. But this sounds like another experiment. In other words, it is insufficient to claim a new brain region given the current form of the manuscript.

      We agree that parametric stimulus manipulations are important for studying early visual areas where stimulus dimensions are known (e.g. orientation, spatial frequency). Using parametric stimulus manipulations for more complex stimuli is fraught with issues because the underlying representation may not be encoding the dimensions being manipulated. This is the reason why we attempted to recover the underlying neural representation using dissimilarities measured using visual search, and then asked whether a decision making process operating on this underlying representation can explain how decisions are made. Therefore we disagree that parametric stimulus manipulations are the only way to obtain insight into such tasks.

      We have proposed a quantitative model that explains how decisions about target present and absent can be made through distance-to-center computations on an underlying object representation. We feel that the behavioural and the brain imaging results strongly point to a novel computation that is being performed in a localized region in the brain. These results represent an important first step in understanding how complex, property-based tasks are performed by the brain. We have revised our manuscript to make this point clearer.

      REFERENCES

      - Duncan, J., & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96(3), 433-458. doi: 10.1037/0033-295x.96.3.433

      - Duncan, J. (1989). Boundary conditions on parallel processing in human vision. Perception, 18(4), 457-469. doi: 10.1068/p180457

      - Nagy, A. L., Neriani, K. E., & Young, T. L. (2005). Effects of target and distractor heterogeneity on search for a color target. Vision Research, 45(14), 1885-1899. doi: 10.1016/j.visres.2005.01.007

      - Nagy, A. L., & Thomas, G. (2003). Distractor heterogeneity, attention, and color in visual search. Vision Research, 43(14), 1541-1552. doi: 10.1016/s0042-6989(03)00234-7

      - Vincent, B., Baddeley, R., Troscianko, T., & Gilchrist, I. (2009). Optimal feature integration in visual search. Journal of Vision, 9(5), 15-15. doi: 10.1167/9.5.15

      - Singh, A., Mihali, A., Chou, W. C., & Ma, W. J. (2023). A Computational Approach to Search in Visual Working Memory.

      - Mihali, A., & Ma, W. J. (2020). The psychophysics of visual search with heterogeneous distractors. BioRxiv, 2020-08.

      - Calder-Travis, J., & Ma, W. J. (2020). Explaining the effects of distractor statistics in visual search. Journal of Vision, 20(13), 11-11.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors have not made substantive changes to address my major concerns. Instead, they have responded with arguments about why their original manuscript was good as written. I did not find these arguments persuasive. Given that, I've left my public review the same, since it still represents my opinions about the paper. Readers can judge which viewpoints are more persuasive.

      We respectfully disagree: we have tried our best to address your concerns with additional analysis wherever feasible, and by acknowledging any limitations.

      Reviewer #3 (Recommendations For The Authors):

      (1) As I mentioned above, please consider rewriting title, abstract, introduction, and significance. Please remove the word "visual homogeneity" and instead use distractor heterogeneity/distractor variability/distractor statistics as often used in literature.

      To clarify, visual homogeneity is NOT the same as distractor homogeneity. Visual homogeneity refers to a distance-to-center computation and represents an image-computable property that can vary systematically even when all distractors are identical. By contrast, distractor heterogeneity varies only when distractors are different from each other.
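
      The distinction can be illustrated with a small sketch. It assumes that an array is represented by the mean of its item vectors and that distractor heterogeneity is the mean pairwise distance between items; both are simplifying assumptions made for this example, and the coordinates and reference point are invented rather than taken from the study.

```python
import math
from itertools import combinations

def distractor_heterogeneity(items):
    """Mean pairwise distance between the items of an array."""
    pairs = list(combinations(items, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def array_vector(items):
    """Represent an array by the mean of its item vectors (an assumption)."""
    n = len(items)
    return tuple(sum(coords) / n for coords in zip(*items))

def visual_homogeneity(items, center):
    """Distance of the array vector from a fixed reference point."""
    return math.dist(array_vector(items), center)

CENTER = (0.0, 0.0)              # hypothetical reference point
array_a = [(1.0, 0.0)] * 4       # four identical items of one object
array_b = [(3.0, 4.0)] * 4       # four identical items of another object

het_a = distractor_heterogeneity(array_a)
het_b = distractor_heterogeneity(array_b)
vh_a = visual_homogeneity(array_a, CENTER)
vh_b = visual_homogeneity(array_b, CENTER)
```

      Both arrays have zero heterogeneity because their items are identical, yet they sit at different distances from the reference point, so the distance-to-center measure still separates them.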

      (2) Better to remove the phrase "generic tasks".

      Thanks for your suggestions. We now refer to these tasks as property-based tasks. 

      (3) Better to explicitly specify the predictions made by the quantitative model beyond positive and negative correlations.

      The predictions of the quantitative model are to explain systematic variation in the response times. We are not sure what else is there to predict in the response times.

      (4) If the quantitative model is the key contribution, better to highlight the details and algorithmic contribution of the model, and show the advantage of this model either qualitatively or quantitatively.

      Please see our responses above. Our quantitative model explains behavior and brain imaging data on three disparate tasks – the same/different, oddball visual search and symmetry tasks. 

      (5) If the new brain region is the key contribution, better to downplay the quantitative model.

      Please see our responses above. Our quantitative model explains behavior and brain imaging data on three disparate tasks – the same/different, oddball visual search and symmetry tasks.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      This paper contains what could be described as a "classic" approach towards evaluating a novel taste stimulus in an animal model, including standard behavioral tests (some with nerve transections), taste nerve physiology, and immunocytochemistry of the tongue. The stimulus being tested is ornithine, from a class of stimuli called "kokumi", which are stimuli that enhance other canonical tastes, essentially increasing the hedonic attributes of these other stimuli; the mechanism for ornithine detection is thought to be GPRC6A receptors expressed in taste cells. The authors showed evidence for this in an earlier paper with mice; this paper evaluates ornithine taste in a rat model.

      Strengths:

      The data show the effects of ornithine on taste: in two-bottle and briefer intake tests, adding ornithine results in a higher intake of most, but not all, stimuli tested. Bilateral nerve cuts or the addition of GPRC6A antagonists decrease this effect. Small effects of ornithine are shown in whole-nerve recordings.

      Weaknesses:

      The conclusion seems to be that the authors have found evidence for ornithine acting as a taste modifier through the GPRC6A receptor expressed on the anterior tongue. It is hard to separate their conclusions from the possibility that any effects are additive rather than modulatory. Animals did prefer ornithine to water when presented by itself. Additionally, the authors refer to evidence that ornithine is activating the T1R1-T1R3 amino acid taste receptor, possibly at higher concentrations than they use for most of the study, although this seems speculative. It is striking that the largest effects on taste are found with the other amino acid (umami) stimuli, leading to the possibility that these are largely synergistic effects taking place at the tas1r receptor heterodimer.

      We would like to thank Reviewer #1 for the valuable comments. Our basis for considering ornithine as a taste modifier stems from our observation that a low concentration of ornithine (1 mM), which does not elicit a preference on its own, enhances the preference for umami substances, sucrose, and soybean oil through the activation of the GPRC6A receptor. Notably, this receptor is not typically considered a taste receptor. The reviewer suggested that the enhancement of umami taste might be due to potentiation occurring at the TAS1R receptor heterodimer. However, we propose that a different mechanism may be at play, as an antagonist of GPRC6A almost completely abolished this enhancement. In the revised manuscript, we will endeavor to provide additional information on the role of ornithine as a taste modifier acting through the GPRC6A receptor.

      Reviewer #2 (Public review):

      Summary:

      The authors used rats to determine the receptor for a food-related perception (kokumi) that has been characterized in humans. They employ a combination of behavioral, electrophysiological, and immunohistochemical results to support their conclusion that ornithine-mediated kokumi effects are mediated by the GPRC6A receptor. They complemented the rat data with some human psychophysical data. I find the results intriguing, but believe that the authors overinterpret their data.

      Strengths:

      The authors examined a new and exciting taste enhancer (ornithine). They used a variety of experimental approaches in rats to document the impact of ornithine on taste preference and peripheral taste nerve recordings. Further, they provided evidence pointing to a potential receptor for ornithine.

      Weaknesses:

      The authors have not established that the rat is an appropriate model system for studying kokumi. Their measurements do not provide insight into any of the established effects of kokumi on human flavor perception. The small study on humans is difficult to compare to the rat study because the authors made completely different types of measurements. Thus, I think that the authors need to substantially scale back the scope of their interpretations. These weaknesses diminish the likely impact of the work on the field of flavor perception.

      We would like to thank Reviewer #2 for the valuable comments and suggestions. Regarding the question of whether the rat is an appropriate model system for studying kokumi, we have chosen this species for several reasons: it is readily available as a conventional experimental model for gustatory research; the calcium-sensing receptor (CaSR), known as the kokumi receptor, is expressed in taste bud cells; and prior research has demonstrated the use of rats in kokumi studies involving gamma Glu-Val-Gly (Yamamoto and Mizuta, Chem. Senses, 2022). We acknowledge that fundamentally different types of measurements were conducted in the human psychophysical study and the rat study. Kokumi can indeed be assessed and expressed in humans; however, we do not currently have the means to confirm that animals experience kokumi in the same way that humans do. Therefore, human studies are necessary to evaluate kokumi, a conceptual term denoting enhanced flavor, while animal studies are needed to explore the potential underlying mechanisms of kokumi. We believe that a combination of both human and animal studies is essential, as is the case with research on sugars. While sugars are known to elicit sweetness, it is unclear whether animals perceive sweetness identically to humans, even though they exhibit a strong preference for sugars. In the revised manuscript, we will incorporate additional information to address the comments raised by the reviewer. We will also carefully review and revise our previous statements to ensure accuracy and clarity.

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors set out to investigate whether GPRC6A mediates kokumi taste initiated by the amino acid L-ornithine. They used Wistar rats, a standard laboratory strain, as the primary model and also performed an informative taste test in humans, in which miso soup was supplemented with various concentrations of L-ornithine. The findings are valuable and overall the evidence is solid. L-Ornithine should be considered to be a useful test substance in future studies of kokumi taste and the class C G protein-coupled receptor known as GPRC6A (C6A) along with its homolog, the calcium-sensing receptor (CaSR) should be considered candidate mediators of kokumi taste.

      Strengths:

      The overall experimental design is solid based on two-bottle preference tests in rats. After determining the optimal concentration for L-Ornithine (1 mM) in the presence of MSG, it was added to various tastants, including inosine 5'-monophosphate (IMP); monosodium glutamate (MSG); mono-potassium glutamate (MPG); intralipos (a soybean oil emulsion); sucrose; sodium chloride (NaCl); citric acid and quinine hydrochloride. Robust effects of ornithine were observed in the cases of IMP, MSG, MPG, and sucrose, and little or no effects were observed in the cases of sodium chloride, citric acid, and quinine HCl. The researchers then focused on the preference for Ornithine-containing MSG solutions. The inclusion of the C6A inhibitors Calindol (0.3 mM but not 0.06 mM) or the gallate derivative EGCG (0.1 mM but not 0.03 mM) eliminated the preference for solutions that contained Ornithine in addition to MSG. The researchers next performed transections of the chorda tympani nerves (with sham operation controls) in anesthetized rats to identify the role of the chorda tympani branches of the facial nerves (cranial nerve VII) in the preference for Ornithine-containing MSG solutions. This finding implicates the anterior half to two-thirds of the tongue in ornithine-induced kokumi taste. They then used electrical recordings from intact chorda tympani nerves in anesthetized rats to demonstrate that ornithine enhanced MSG-induced responses following the application of tastants to the anterior surface of the tongue. They went on to show that this enhanced response was insensitive to amiloride, selected to inhibit 'salt tastant' responses mediated by the epithelial Na+ channel, but eliminated by Calindol. Finally, they performed immunohistochemistry on sections of rat tongue demonstrating C6A-positive spindle-shaped cells in fungiform papillae that partially overlapped in their distribution with the IP3 type-3 receptor, used as a marker of Type-II cells, but not with (i) gustducin, the G protein partner of Tas1 receptors (T1Rs), used as a marker of a subset of Type-II cells; or (ii) 5-HT (serotonin) and Synaptosome-associated protein 25 kDa (SNAP-25), used as markers of Type-III cells.

      Weaknesses:

      The researchers undertook what turned out to be largely confirmatory studies in rats with respect to their previously published work on Ornithine and C6A in mice (Mizuta et al Nutrients 2021).

      The authors point out that animal models pose some difficulties of interpretation in studies of taste and raise the possibility in the Discussion that umami substances may enhance the taste response to ornithine (Line 271, Page 9). They miss an opportunity to outline the experimental results from the study that favor their preferred interpretation that ornithine is a taste enhancer rather than a tastant.

      At least two other receptors in addition to C6A might mediate taste responses to ornithine: (i) the CaSR, which binds and responds to multiple L-amino acids (Conigrave et al, PNAS 2000), and which has been previously reported to mediate kokumi taste (Ohsu et al., JBC 2010) as well as responses to Ornithine (Shin et al., Cell Signaling 2020); and (ii) T1R1/T1R3 heterodimers which also respond to L-amino acids and exhibit enhanced responses to IMP (Nelson et al., Nature 2001). While the experimental results as a whole favor the authors' interpretation that C6A mediates the Ornithine responses, they do not make clear either the nature of the 'receptor identification problem' in the Introduction or the way in which they approached that problem in the Results and Discussion sections. It would be helpful to show that a specific inhibitor of the CaSR failed to block the ornithine response. In addition, while they showed that C6A-positive cells were clearly distinct from gustducin-positive, and thus T1R-positive cells, they missed an opportunity to clearly differentiate C6A-expressing taste cells and CaSR-expressing taste cells in the rat tongue sections.

      It would have been helpful to include a positive control kokumi substance in the two-bottle preference experiment (e.g., one of the known gamma-glutamyl peptides such as gamma-glu-Val-Gly or glutathione), to compare the relative potencies of the control kokumi compound and Ornithine, and to compare the sensitivities of the two responses to C6A and CaSR inhibitors.

      The results demonstrate that enhancement of the chorda tympani nerve response to MSG occurs at substantially greater Ornithine concentrations (10 and 30 mM) than were required to observe differences in the two-bottle preference experiments (1.0 mM; Figure 2). The discrepancy requires careful discussion and, if necessary, further experiments using the two-bottle preference format.

      We would like to thank Reviewer #3 for the valuable comments and helpful suggestions. We propose that ornithine has two stimulatory actions: one acting on GPRC6A, particularly at lower concentrations, and another on amino acid receptors such as T1R1/T1R3 at higher concentrations. Consequently, ornithine is not preferable at lower concentrations but becomes preferable at higher concentrations. For our study on kokumi, we used a low concentration (1 mM) of ornithine. The possibility mentioned in the Discussion that 'the umami substances may enhance the taste response to ornithine' is entirely speculative. We will reconsider including this description in the revised version. As the reviewer suggested, in addition to GPRC6A, ornithine may bind to CaSR and/or T1R1/T1R3 heterodimers. However, we believe that ornithine mainly binds to GPRC6A, as a specific inhibitor of this receptor almost completely abolished the enhanced response to umami substances, and our immunohistochemical study indicated that GPRC6A-expressing taste cells are distinct from CaSR-expressing taste cells (see Supplemental Fig. 3). We conducted essentially the same experiments using gamma-Glu-Val-Gly in Wistar rats (Yamamoto and Mizuta, Chem. Senses, 2022) and compared the results in the Discussion. The reviewer may have misunderstood the chorda tympani results: we added the same concentration (1 mM) used in the two-bottle preference test to MSG (Fig. 5-B). Fig. 5-A shows nerve responses to five concentrations of plain ornithine. In the revised manuscript, we will strive to provide more precise information reflecting the reviewer’s comments.

    1. Welcome back to stage 4 of this advanced demo series.

      Now in stage 4, we're going to perform the last step before we can make this a truly elastic and scalable design.

      And we're going to migrate the wp-content folder which stores these priceless animal images from the EC2 instance onto EFS which is the elastic file system.

      This is a shared network file system that we can use to store images or other content in a resilient way outside of the life cycle of these individual EC2 instances.

      So to do that, we need to move back to the AWS console, click on the services drop down and type EFS.

      Right click and open the EFS console in a new tab.

      Once that's opened, click on create file system.

      Now we're going to step through the full configuration options so rather than using this simplified user interface, go ahead and click on customize.

      So the first step is to create the file system itself.

      So for name, go ahead and call this a4l-wordpress-content.

      Leave the storage class as standard.

      These cat images are critical data and so we are going to leave automatic backups enabled.

      And we're also going to leave life cycle management set to be the default, so 30 days since the last access. For throughput mode, pick bursting, which links the throughput to the size of the storage.

      Then expand additional settings.

      You've got two performance modes, general purpose and max IO.

      For this demonstration, go ahead and select general purpose.

      Max IO is for very specific high performance scenarios.

      For 99% of use cases, you should select general purpose.

      Now also go ahead and untick enable encryption of data at rest.

      If this were a production scenario, you would leave this on.

      But for this demo, which is focusing on architecture evolution, it simplifies the implementation if we disable it.

      So go ahead and make sure that encryption is disabled.

      Once you've done that, that's all of the file system specific options that we need to configure.

      So go ahead and click on next.

      In this part, you're configuring the EFS mount targets, which are the network interfaces in the VPC, which your instances will connect with.

      So in the virtual private cloud drop down, select it and then pick the A4L VPC.

      So this is the VPC that these mount targets are going to go into.

      Now, each of the mount targets is secured by a security group.

      The first thing we need to do is to strip off the default security group for the VPC.

      So click in the crosses next to each of these security groups.

      Now, you should have three rows, one for each availability zone.

      So in my case, US East one A, one B and one C, and make sure that you've got the same selected.

      So one row for each availability zone, A, B and C.

      Now in the subnet drop down for availability zone one A, I want you to go ahead and pick SN-AP-A.

      So this should be 10.16.32.0/20.

      For the US East one B row, I want you to go ahead and pick SN-AP-B.

      This should be 10.16.96.0/20.

      And then finally for US East one C, I want you to go ahead and pick SN-AP-C, which should be 10.16.160.0/20.

      Now for all three rows within the security groups drop down, I want you to go ahead and select A4LVPC-SGEFS.

      Again, for each of these, it will have some randomness after it, but just make sure you pick the right one.

      A4LVPC-SGEFS.

      And you need to pick that for each of the three rows.

      Make sure you pick the right one because if you don't, it will impact your ability to connect.

      So that's the mount targets configured, and they'll be allocated an IP address in each of these subnets automatically, which will allow you to connect to them.

      At this point, go ahead and click on Next.

      You can configure some additional file system policies.

      This is entirely optional.

      We won't be using that.

      So just go ahead and click on Next.

      And then on the review screen, scroll all the way down to the bottom and just click on Create.

      Now the file system itself will initially show as being in the creating state and it will then change to available.

      Go ahead and click on the file system itself.

      Click on the Network tab and then just scroll down and these are the mount targets which are being created.

      Now in order to configure our EC2 instance, we will need all of these mount targets to be in the available state.

      But what we can do to save some time is we can note down the file system ID of this EFS file system.

      So this is this value.

      You can see it at the top header here or you can see it in this row at the top.

      Just note that down and copy that into your clipboard because we need to configure another parameter to point at this file system ID.

      Because remember when we're scaling things automatically, it's always best practice to use the parameter store to store configuration information.

      So click on Services, type Sys which are the first few letters of Systems Manager and open that in a new tab.

      Once you're at the Systems Manager console, go ahead and click on Parameter Store and then you need to click Create Parameter to create a new parameter.

      We're going to call this parameter forward slash A4L forward slash WordPress forward slash and then EFS for Elastic File System, FS for File System and then ID.

      So EFS File System ID.

      For description, put File System ID for WordPress content and then in brackets WP-Content and that will help us know exactly what this parameter is for.

      As before, we'll be picking the standard tier, the type will be string, the data type will be text and then into the value, just go ahead and paste that file system ID.

      And once you've done all that, you can go ahead and click on Create Parameter.
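
      The console steps above also have a command-line equivalent. What follows is a minimal sketch, assuming the AWS CLI is installed and has permission to write to Parameter Store. The helper name put_efs_parameter is hypothetical, and the parameter name /A4L/Wordpress/EFSFSID is a transcription of the name spelled out above, so its exact capitalisation is an assumption.

```shell
# Hypothetical helper: store an EFS file system ID in Parameter Store.
# Assumption: the parameter name below is transcribed from the spoken name.
put_efs_parameter() {
  fs_id="$1"  # the fs-... value copied from the EFS console
  aws ssm put-parameter \
    --name "/A4L/Wordpress/EFSFSID" \
    --description "File System ID for WordPress content (wp-content)" \
    --tier Standard \
    --type String \
    --data-type text \
    --value "$fs_id"
}
```

      Calling put_efs_parameter with the file system ID you noted down should create the same standard-tier string parameter that the console form does.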

      Once that's done, go back to the EFS console and if required, just hit refresh and make sure that all of these mount targets are in the available state.

      This is what it should look like with all three showing a green tick and available.

      Once that's the case, go to the EC2 console because now we're going to configure our EC2 instance to connect to this file system.

      So go to Running Instances, locate the WordPress-LT instance, right click, select Connect, choose Session Manager and then click on Connect.

      And this will open Session Manager console to the EC2 instance.

      As always, type sudo bash, press Enter, cd and press Enter and then type clear and press Enter again, just to clear the screen making it easier to see.

      Now, even though EFS is based on NFS, which is a standard, in order to get EC2 instances to connect to EFS, we need to install an additional tools package.

      And to do that, we use this command.

      So type or paste that in and press Enter to install the EFS support package.

      Once that's installed again, I'm going to clear the screen to make it easier to see.

      Then I'm going to move to the web root folder by typing cd /var/www/html.

      And what I'm going to do is to move the entire wp-content folder somewhere else.

      So if I just go inside this folder to illustrate exactly what it looks like and then do a list, you'll see that inside there are plugins, themes and uploads.

      And inside those folders are any media assets used by WordPress.

      So I'm just going to type cd .. to move back up a level out of this folder.

      And then I'm going to move this entire folder to the /tmp folder, which is a temporary folder.

      So mv wp-content/ /tmp and that moves that entire folder to the temporary folder.

      Then we're going to create a new folder.

      So sudo mkdir wp-content.

      This will be the mount point for the EFS file system.

      So I'm making an empty directory.
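
      The move-and-recreate step just described can be sketched as a small function. This is an illustration only: the helper name stage_wp_content is hypothetical, and the paths are passed in as arguments so the same logic can be tried on a scratch directory before touching the real web root.

```shell
# Hypothetical helper: move wp-content out of the web root and leave an
# empty directory behind to act as the EFS mount point.
stage_wp_content() {
  webroot="$1"   # /var/www/html in the demo
  staging="$2"   # /tmp in the demo
  mv "$webroot/wp-content" "$staging/" &&
    mkdir "$webroot/wp-content"
}
```

      On the instance, as root, the equivalent call would be stage_wp_content /var/www/html /tmp.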

      Then I'm going to clear the screen and then paste in the next two commands from the lesson instructions.

      And this populates an environment variable called EFSFSID with the value from the parameter you just created in the parameter store.

      So this is now the file system ID of the EFS file system.
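
      The two commands from the lesson instructions are not read out in the video. As a stand-in, here is a single-command sketch of the same idea, assuming the AWS CLI is available on the instance and the instance role can read the parameter. The helper name get_efs_fsid is hypothetical, the parameter name is the one transcribed earlier, and the us-east-1 default reflects the region used in this demo.

```shell
# Hypothetical helper: print the file system ID stored in Parameter Store.
# Assumptions: parameter name as transcribed from the video; region
# defaults to us-east-1 unless AWS_REGION is set.
get_efs_fsid() {
  aws ssm get-parameter \
    --region "${AWS_REGION:-us-east-1}" \
    --name "/A4L/Wordpress/EFSFSID" \
    --query "Parameter.Value" \
    --output text
}
```

      With that defined, EFSFSID="$(get_efs_fsid)" would populate the variable. Using --output text avoids the surrounding quotes that the default JSON output adds.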

      Now there's a file called fstab which exists in the /etc folder.

      And inside there it's called fstab and this contains a list of file systems which are mounted on this EC2 instance.

      Initially this only has the single line for the boot volume.

      What we're going to do is add an additional line to this fstab file.

      And this line is going to configure the EC2 instance so that it mounts our EFS file system on boot every single time.

      And this is this command.

      So it echoes this line.

      So the file system ID from the environment variable.

      We're going to mount it to the folder that we just created.

      So the wp-content folder and these are all of the file system options.

      So we're going to put that into the fstab file.

      So if we now cat this file it's got this extra line.

      And this means this file system will be mounted whenever the operating system starts.
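
      To make the shape of that fstab line concrete, here is a hedged sketch. The helper name efs_fstab_entry is hypothetical, and the mount options shown follow the general form documented for amazon-efs-utils; the lesson's own options are not read out in the video, so treat these as an assumption and prefer the line from the lesson instructions.

```shell
# Hypothetical helper: print an /etc/fstab entry for an EFS file system.
# Assumption: mount options follow the general amazon-efs-utils form and
# may differ from the ones used in the lesson.
efs_fstab_entry() {
  fs_id="$1"        # EFS file system ID, for example the value of $EFSFSID
  mount_point="$2"  # /var/www/html/wp-content in the demo
  printf '%s:/ %s efs _netdev,noresvport,tls 0 0\n' "$fs_id" "$mount_point"
}
```

      As root, efs_fstab_entry "$EFSFSID" /var/www/html/wp-content >> /etc/fstab would append the entry, and cat /etc/fstab would then show it below the boot volume line.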

      And we can force this just for now by running mount -a -t efs defaults.

      And this will mount the EFS file system onto this EC2 instance.

      We can verify that by doing a df -k.

      And the bottom line should show us that we've now got this EFS file system mounted as the wp-content folder.

      So this is the folder that WordPress expects its media to be inside.

      Now all that remains is for us to migrate the existing data that we moved to the temporary folder back in to wp-content.

      And to do that we use this command.

      So we're using the mv command to move /tmp/wp-content/*.

      So any files and folders, and then we're moving them back into /var/www/html/wp-content.

      So this is the EFS file system.

      So run that and that will copy the data back to EFS, which remember is now mounted where WordPress expects it to be.
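
      The move back onto EFS can be sketched the same way. The helper name restore_wp_content is hypothetical, and the paths are arguments so the logic can be checked on a scratch directory first.

```shell
# Hypothetical helper: move staged content back into the wp-content
# directory, which by this point is backed by EFS.
restore_wp_content() {
  staging="$1"   # /tmp in the demo
  webroot="$2"   # /var/www/html in the demo
  # The glob is deliberately outside the quotes so the shell expands it.
  mv "$staging"/wp-content/* "$webroot/wp-content/"
}
```

      On the instance, as root, this would be restore_wp_content /tmp /var/www/html. Note that the * glob does not match hidden files, so any dotfiles directly under the staged wp-content folder would be left behind.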

      Now that might take a few moments to complete.

      Once it's done, we just need to fix up the permissions.

      So run this command, chown -R ec2-user:apache /var/www.

      So this just reestablishes permissions and ownership of everything in this particular part of the file system.

      It just makes sure we won't have any problems going forward.

      Now at this point we're going to use the reboot command to restart this instance.

      And if everything goes well, the instance should start, the EFS file system should be loaded and WordPress should have access to all of this wp-content, which is now running from a network file system.

      So go ahead type reboot and press enter.

      Press enter again just to make sure that you are disconnected, and I am.

      So that's good.

      So now I need to wait a few minutes for this EC2 instance or at least its operating system to restart.

      So I'll go ahead and close down this session manager tab.

      Go back to the EC2 console.

      After waiting a few minutes, I'll right click, select connect, check session manager and click on connect.

      Assuming the instance has restarted, I'll be back at the prompt.

      And if I do a df -k, if everything's working as expected, the EFS file system will still be mounted into the directory that we configured.

      If I go back to the EC2 console and just copy down the instance's public IPv4 address, either refresh the tab if you've still got it open or paste in the IP address and reload that page.

      And if everything's working as expected, all of these high quality critical cat pictures should still load from the WordPress blog.

      So now at this point when we're interacting with the application, the database and the wp-content both exist away from the EC2 instance.

      And this means we're now in a position where we can scale the EC2 instance without worrying about the data or the media for any of the posts.

      And this means we can now further evolve this architecture to be fully elastic.

      Now there is one more thing that we need to do before moving on to the next stage of the demo and implementing this final step towards a fully elastic architecture.

      And that's that we need to update the launch template to include this updated configuration so that it uses EFS.

      To do that, go back to the EC2 console, go to launch templates, select the launch template.

      So check the box, click on the actions drop down, select modify template, create new version.

      For template version description, use app only, uses EFS file system defined in and then the parameter store value that contains the file system ID.

      So this is just the description.

      Now again, because we're creating a new version, it will populate all of the configuration with the previous template version.

      But I'll need you to scroll all the way down to the bottom, expand advanced details and scroll all the way down.

      Again, we're going to make some edits to the user data.

      So expand this box a little bit to make it easier to read.

      What I'll need you to do is to put your cursor after the end of this top line and just press enter twice to make some space and then paste in this set of configuration.

      And again, this is stored within the instructions for this stage of the demo series that will just populate an environment variable with the file system ID that it will get from the parameter store.

      Scroll down and next you're looking for a software installation line.

      You're looking for this line, the line that performs the installation of the MariaDB server, the Apache web server and the wget utility.

      Position your cursor after the word stress and then press space.

      And then I'll want you to add this text followed by a space, which is amazon-efs-utils.

      Next, scroll down a little bit further and you're looking for the line that says systemctl start httpd.

      Click on the end to position your cursor at the end of that line and then press enter twice to add some space and then paste in this next block also contained within this lesson's instructions.

      What this does is to make a wp-content folder before we install WordPress, configure the ownership of the entire folder tree and then add the line for EFS to the fstab file and then mount this EFS file system into /var/www/html/wp-content.

      And this means that when we're automatically provisioning this instance before we install WordPress, we're creating and mounting this EFS file system.

      And then we go on to installing WordPress, configuring the database and performing the final fix of all of the permissions at that folder structure.

      Next, scroll down.

      We're done with all of the launch template user data configuration.

      Just go ahead and click on create template version.

      We need to make this new version the default.

      So click on launch templates, select the WordPress launch template, click on actions, scroll down, select set default version, click in the dropdown.

      Version two should currently be the default.

      Change that to version three and click set as default version.

      So at this point, you've further evolved the architecture.

      Now we have both the database for WordPress stored in RDS and the wp-content data stored within the Elastic file system.

      So we've solved many of the application's limitations.

      We can scale the database independently of the application.

      We've stored the media files separate from the instance.

      So now we can scale the instance freely out or in without risking the media or the database.

      We do still have two final limitations which we'll be fixing together in the next stage of this demo series.

      One is that customers still connect to the instance directly so we don't have any health checks.

      We don't have any auto healing capabilities and we're limited in how we can scale.

      And then finally, the IP address of the instance is still hard coded into the database.

      And so even if we did provision additional instances, WordPress would expect all of the data to be loaded from that one single original instance.

      And to allow us to scale, we have to resolve both of those problems.

      At this point though, you've done everything required in stage four.

      So go ahead, complete this video.

      And when you're ready, I look forward to you joining me in stage five of this advanced demo series.

    1. Abstract: With emerging of Spatial Transcriptomics (ST) technology, a powerful algorithmic framework to quantitatively evaluate the active cell-cell interactions in the bio-function associated iTME unit will pave the ways to understand the mechanism underlying tumor biology. This study provides the StereoSiTE incorporating open source bioinformatics tools with the self-developed algorithm, SCII, to dissect a cellular neighborhood (CN) organized iTME based on cellular compositions, and to accurately infer the functional cell-cell communications with quantitatively defined interaction intensity in ST data. We applied StereoSiTE to deeply decode ST data of the xenograft models receiving immunoagonist. Results demonstrated that the neutrophils dominated CN5 might attribute to iTME remodeling after treatment. To be noted, SCII analyzed the spatially resolved interaction intensity inferring a neutrophil leading communication network which was proved to actively function by analysis of Transcriptional Factor Regulon and Protein-Protein Interaction. Altogether, StereoSiTE is a promising framework for ST data to spatially reveal tumor biology mechanisms.

      This work has been published in GigaScience Journal under a CC-BY 4.0 license (https://doi.org/10.1093/gigascience/giae078), and published as part of our Spatial Omics Methods series. The peer-reviews are as follows.

      Reviewer 1. Lihong Peng

      In this manuscript, the authors developed a computational framework named StereoSiTE to spatially and quantitatively profile the cellular neighborhood organized iTME by incorporating open source bioinformatics tools with their self-proposed algorithm named SCII. This study is very meaningful. However, several problems remain.

      Major comments: 1. The authors incorporated several open sources bioinformatics tools. However, how to ensure their combination is the optimal to the spatially resolved cell-cell communication inference performance? For example, cell2location was used to deconvolute cellular composition and construct cellular neighborhood. Why to use cell2location for deconvoluting spatial transcriptomics data? why not use the newest deconvolution algorithms, for example, SpaDecon, Celloscope, POLARIS, GraphST, SPASCER, and EnDecon? No model can adapt to all data. The authors should first verify that cell2location is the best appropriate cell type annotation tool corresponding to iTME. If not, the subsequent analyses will be not appropriate.

      1. The authors claimed that they computed the decomposition losses of different combinations of the number of CN modules and CT modules. Which combinations? The authors should list them.

      2. When measuring spatial cell interaction intensity, the authors only simply summed up the ligand and receptor gene expression information of the sender and receiver cells. Why not consider existing classical intercellular communication intensity methods? The authors should compare other intercellular communication intensity measurement methods. Please refer to the following two cites: Cell-cell communication inference and analysis in the tumour microenvironments from single-cell transcriptomics: data resources and computational strategies, briefings in bioinformatics. CellDialog: A Computational Framework for Ligand-receptor-mediated Cell-cell Communication Analysis, IEEE Journal of Biomedical and Health Informatics. Deciphering ligand-receptor-mediated intercellular communication based on ensemble deep learning and the joint scoring strategy from single-cell transcriptomic data, Computers in Biology and Medicine.

      3. For protein-protein interaction analysis, the authors queried 628 significant up regulated genes in CN5 area of treatment samples from STRING. Can all obtained proteins be ligands or receptors? In addition, they labeled hub genes and key protein-protein interaction networks, what were these hub genes and key networks used for?

      4. Which ligand-receptor pairs could mediate intercellular communication within immune tumor microenvironment? Among these L-R pairs, which L-R pairs are known in existing databases and which L-R pairs are the predicted ones?

      5. "The enrichment analysis of individual CN showed that each CN had a dominant cell type with a spatial aggregation (Fig 2F), which was increasingly obvious than that in whole slide (Fig 2E)." What's a dominant cell type? How to define it?

      6. "To reduce the variance among open-sourced L-R databases, we unified L-R database in SCII by choosing L-R dataset in CellChatDB, which assigned each L-R with an interaction distance associated classification as secreted signaling, ECM receptor and cell-cell contact." How to unify L-R database? Did it allow for user-specified LR databases and/or add user-specified LR databases?

      7. In figure 3, how to confirm which L-R pairs mediate intercellular communication?

      8. StereoSiTE is composed of multiple modules, is it scalable? Can some of these modules (such as clustering and cell type annotation) be replaced with other more powerful modules?

      9. The authors claimed that "CellPhoneDB detected many false positive interactions". How to find these false positive LRIs? How to validate the LRIs be false positives? Please list the found false positive LRIs.

      10. In Figure 3, the authors should add comparison experiments between StereoSiTE and classical intercellular communication analysis tools.

      Minor comments: 1. The text in subfigure A, B, and C in Supplementary Figure 2 is obscure. The authors should revise Supplementary Figure 2. 2. In Section "Abstract", iTME should use full name when it first appears. 3. Which cites of "13 Li, M. et al. (2023)." is in the reference list?

      Re-review:

      In the revised manuscript, the authors conducted lots of revisions. However, it still remains many problems to solve:

      1. The authors have compared the performance of Cell2location with other cell type identification methods, Celloscope[10], GraphST[11], and POLARIS[12] on both STARmap and stereo-seq dataset of liver cancer. How about its performance on other unlabeled datasets? Please compare it with "STGNNks: Identifying cell types in spatial transcriptomics data based on graph neural network, denoising auto-encoder, and 𝑘-sums clustering".

      2. Cell-cell communication is usually mediated by LRIs. The construction of high-quality LRI databases is very important to cell-cell communication. The authors should introduce these LRI data resources and potential LRI prediction methods and cite them, for example, PMID: 37976192, 37364528, 38367445.

      3. In Figure 4B, 4C, 4D, and 4F, Figure 5A and 5B, Figure 6B and 6C, the fonts are too small. Please enlarge the fonts.

      4. The organization and structure of this manuscript must be carefully revised. For example, The structure in Discussion is obscure. In the first paragraph in this section, the authors have introduced their proposed method, next, they described it in details. But the third paragraph elucidated the reason why to develop this reason. In addition, "Figure 3 highlights that the analysis without distance threshold may lead to false positive results, and SCII showed more superior performance than other methods." why to Figure 3? Did not the other results support their conclusion? The final paragraph in Discussion introduced their method again. It HAS NO logic.

      5. Where is the conclusion of this manuscript?

      6. The authors should analyze the limitations of this work for further work in the future.

      7. English is VERY POOR. This manuscript must be carefully revised. For example,

      "prove that spatial proximity is a must to guarantee an effective investigation.", is a must to do?

      Re-re-review: The authors have solved my issues.

    1. Author response:

      eLife Assessment

      “The work presented is important for our understanding of the development of the cardiac conduction system and its regulation by T-box transcription factors. The conclusions are supported by convincing data. Overall, this is an excellent study that advances our understanding of cardiac biology and has implications beyond the immediate field of study.”

      We appreciate the positive assessment of this work and the recognition of its importance in advancing our understanding of the cardiac conduction system, its regulation by T-box transcription factors, and contribution beyond the immediate field.

      Reviewer #1 (Public review):

      Summary:

      In a heroic effort, Ozanna Burnicka-Turek et al. have generated conduction system-specific Tbx3-Tbx5 deficient mice and investigated their cardiac phenotype. Perhaps according to expectations, given the body of literature on the function of the two T-box transcription factors in the heart/conduction system, the cardiomyocytes of the ventricular conduction system seemed to convert to "ordinary" ventricular working myocytes. As a consequence, loss of VCS-specific conduction system propagation was observed in the compound KO mice, associated with PR and QRS prolongation and elevated susceptibility to ventricular tachycardia.

      Strengths:

      Great genetic model. Phenotypic consequences at the organ and organismal levels are well investigated. The requirement of both Tbx3 and Tbx5 for maintaining VCS cell state has been demonstrated.

      We thank Reviewer #1 for acknowledging the effort involved in generating and characterizing the Tbx3/Tbx5 double conditional knockout mouse model and for highlighting the significance of this work in elucidating the role of these transcription factors in maintaining the functional and transcriptional identity of the ventricular conduction system.

      Weaknesses:

      The actual cell state of the Tbx3/Tbx5 deficient conducting cells was not investigated in detail, and therefore, these cells could well only partially convert to working cardiomyocytes, and may, in reality, acquire a unique state.

      We agree with Reviewer #1 that the Tbx3/Tbx5 double mutant ventricular conduction myocardial cells may only partially convert to working cardiomyocytes or may acquire a unique state.  The transcriptional state of the double mutant VCS cells was investigated by bulk profiling of key genes associated with specific conduction and non-conduction cardiac regions, including fast conduction, slow conduction, or working myocardium. Neither the bulk transcriptional approaches nor the optical mapping approaches we employed capture single-cell data; in both cases, the data represents aggregated signals from multiple cells (1, 2). Single cell approaches for transcriptional profiling and cellular electrophysiology would clarify this concern and are appropriate for future studies.

      (1) O’Shea C, Nashitha Kabri S, Holmes AP, Lei M, Fabritz L, Rajpoot K, Pavlovic D (2020) Cardiac optical mapping – State-of-the-art and future challenges. The International Journal of Biochemistry & Cell Biology 126:105804. doi: 10.1016/j.biocel.2020.105804.

      (2) Efimov IR, Nikolski VP, and Salama G (2004) Optical Imaging of the Heart. Circulation Research 95:21-33. doi: 10.1161/01.RES.0000130529.18016.35.

      Reviewer #2 (Public review):

      Summary:

      The goal of this work is to define the functions of T-box transcription factors Tbx3 and Tbx5 in the adult mouse ventricular cardiac conduction system (VCS) using a novel conditional mouse allele in which both genes are targeted in cis. A series of studies over the past 2 decades by this group and others have shown that Tbx3 is a transcriptional repressor that patterns the conduction system by repressing genes associated with working myocardium, while Tbx5 is a potent transcriptional activator of "fast" conduction system genes in the VCS. In a previous work, the authors of the present study further demonstrated that Tbx3 and Tbx5 exhibit an epistatic relationship whereby the relief of Tbx3-mediated repression through VCS conditional haploinsufficiency allows better toleration of Tbx5 VCS haploinsufficiency. Conversely, excess Tbx3-mediated repression through overexpression results in disruption of the fast-conduction gene network despite normal levels of Tbx5. Based on these data the authors proposed a model in which repressive functions of Tbx3 drive the adoption of conduction system fate, followed by segregation into a fast-conducting VCS and slow-conduction AVN through modulation of the Tbx5/Tbx3 ratio in these respective tissue compartments.

      The question motivating the present work is: If Tbx5/Tbx3 ratio is important for slow versus fast VCS identity, what happens when both genes are completely deleted from the VCS? Is conduction system identity completely lost without both factors and if so, does the VCS network transform into a working myocardium-like state? To address this question, the authors have generated a novel mouse line in which both Tbx5 and Tbx3 are floxed on the same allele, allowing complete conditional deletion of both factors using the VCS-specific MinK-CreERT2 line, convincingly validated in previous work. The goal is to use these double conditional knockout mice to further explore the model of Tbx3/Tbx5 co-dependent gene networks and VCS patterning. First, the authors demonstrate that the double conditional knockout allele results in the expected loss of Tbx3 and Tbx5 specifically in the VCS when crossed with Mink-CreERT2 and induced with tamoxifen. The double conditional knockout also results in premature mortality. Detailed electrophysiological phenotyping demonstrated prolonged PR and QRS intervals, inducible ventricular tachycardia, and evidence of abnormal impulse propagation along the septal aspect of the right ventricle. In addition, the mutants exhibit downregulation of VCS genes responsible for both fast conduction AND slow conduction phenotypes with upregulation of 2 working myocardial genes including connexin-43. The authors conclude that loss of both Tbx3 and Tbx5 results in "reversion" or "transformation" of the VCS network to a working myocardial phenotype, which they further claim is a prediction of their model and establishes that Tbx3 and Tbx5 "coordinate" transcriptional control of VCS identity.

      We appreciate Reviewer #2’s detailed summary of the study’s aims, methodologies, and findings, as well as their thoughtful suggestions for further analysis. We are grateful for their recognition of our genetic model’s novelty and robustness.

      Overall Appraisal:

      As noted above, the present study does not further explore the Tbx5/Tbx3 ratio concept since both genes are completely knocked out in the VCS. Instead, the main claims are that the absence of both factors results in a transcriptional shift of conduction tissue towards a working myocardial phenotype, and that this shift indicates that Tbx5 and Tbx3 "coordinate" to control VCS identity and function.

      We agree with this reviewer’s assessment of the assertions in our manuscript. The novel combined Tbx5/Tbx3 double mutant model does not further explore the TBX5/TBX3 ratio concept, which we previously examined in detail (1). Instead, as the Reviewer notes, this manuscript focuses on testing the model that the coordinated activity of Tbx3 and Tbx5 defines specialized ventricular conduction identity.

      (1) Burnicka-Turek O, Broman MT, Steimle JD, Boukens BJ, Petrenko NB, Ikegami K, Nadadur RD, Qiao Y, Arnolds DE, Yang XH, Patel VV, Nobrega MA, Efimov IR, Moskowitz IP (2020) Transcriptional Patterning of the Ventricular Cardiac Conduction System. Circulation Research 127:e94-e106. doi:10.1161/CIRCRESAHA.118.314460. 

      Strengths:

      (1) Successful generation of a novel Tbx3-Tbx5 double conditional mouse model.

      (2) Successful VCS-specific deletion of Tbx3 and Tbx5 using a VCS-specific inducible Cre driver line.

      (3) Well-powered and convincing assessments of mortality and physiological phenotypes.

      (4) Isolation of genetically modified VCS cells using flow.

      We thank Reviewer #2 for acknowledging the listed strengths of our study.

      Weaknesses:

      (1) In general, the data is consistent with a long-standing and well-supported model in which Tbx3 represses working myocardial genes and Tbx5 activates the expression of VCS genes, which seem like distinct roles in VCS patterning. However, the authors move between different descriptions of the functional relationship and epistatic relationship between these factors, including terms like "cooperative", "coordinated", and "distinct" at various points. In a similar vein, sometimes terms like "reversion" are used to describe how VCS cells change after Tbx3/Tbx5 conditional knockout, and other times "transcriptional shift" and at other times "reprogramming". But these are all different concepts. The lack of a clear and consistent terminology for describing the phenomena observed makes the overarching claims of the manuscript more difficult to evaluate.

      We distinguish prior work on the “long-standing and well-supported model”, which investigated the roles of Tbx5 and Tbx3 independently, from this work, which examines their coordinated role. Prior work demonstrated that Tbx3 represses working myocardial genes and Tbx5 activates expression of VCS genes, consistent with the reviewer’s suggestion of their distinct roles in VCS patterning. However, the current study is the first to evaluate the combined role of Tbx3 and Tbx5 in distinguishing specialized conduction identity from working myocardium.

      We appreciate Reviewer #2’s feedback regarding the need for consistent terminology when describing the impact of the double Tbx3 and Tbx5 mutant. We will edit the manuscript to replace terms like “reversion” with “transcriptional shift” or “transformation” when describing the observed phenotype, and we will use “coordination” to describe the combined role of Tbx5 and Tbx3 in maintaining VCS-specific identity.

      (2) A more direct quantitative comparison of Tbx5 Adult VCS KO with Tbx5/Tbx3 Adult VCS double KO would be helpful to ascertain whether deletion of Tbx3 on top of Tbx5 deletion changes the underlying phenotype in some discernable way beyond mRNA expression of a few genes. Superficially, the phenotypes look quite similar at the EKG and arrhythmia inducibility level and no optical mapping data from a single Tbx5 KO is presented for comparison to the double KO.

      We thank Reviewer #2 for the suggestion that a direct comparison between the Tbx5 single conditional knockout and Tbx3/Tbx5 double conditional knockout models may help isolate the specific contribution of Tbx3 deletion in addition to Tbx5 deletion.

      Previous studies have assessed the effect of single Tbx5 CKO in the VCS of murine hearts (1, 3, 5). Arnolds et al. demonstrated that the removal of Tbx5 from the adult ventricular conduction system results in VCS slowing, including prolonged PR and QRS intervals, prolongation of the His duration and His-ventricular (HV) interval (3). Furthermore, Burnicka-Turek et al. demonstrated that the single conditional knockout of Tbx5 in the adult VCS caused a shift toward a pacemaker cell state, with ectopic beats and inappropriate automaticity (1). Whole-cell patch clamping of VCS-specific Tbx5-deficient cells revealed action potentials characterized by a slower upstroke (phase 0), prolonged plateau (phase 2), delayed repolarization (phase 3), and enhanced phase 4 depolarization - features characteristic of nodal action potentials rather than typical VCS action potentials (3). These observations were interpreted as uncovering nodal potential of the VCS in the absence of Tbx5. Based on the role of Tbx3 in CCS specification (2), we hypothesized that the nodal state of the VCS uncovered in the absence of Tbx5 was enabled by maintained Tbx3 expression. This motivated us to generate the double Tbx5 / Tbx3 knockout model to examine the state of the VCS in the absence of both T-box TFs.

      In the current study, we demonstrate that the VCS-specific deletion of Tbx3 and Tbx5 results in the loss of fast electrical impulse propagation in the VCS, similar to that observed in the single Tbx5 mutant. However, unlike the Tbx5 single mutant, the Tbx3/Tbx5 double deletion does not cause a gain of pacemaker cell state in the VCS. Instead, the physiological data suggest a transition toward non-conduction working myocardial physiology. This conclusion is supported by the presence of only a single upstroke in the optical action potential (OAP) recorded from the His bundle region and VCS cells in Tbx3/Tbx5 double conditional knockout mice. The electrical properties of VCS cells in the double knockout are functionally indistinguishable from those of ventricular working myocardial cells. As a result, ventricular impulse propagation is significantly slowed, resembling activation through exogenous pacing rather than the rapid conduction typically associated with the VCS. We will edit the text of the manuscript to distinguish the observations between these models more carefully, as suggested.

      (1) Burnicka-Turek O, Broman MT, Steimle JD, Boukens BJ, Petrenko NB, Ikegami K, Nadadur RD, Qiao Y, Arnolds DE, Yang XH, Patel VV, Nobrega MA, Efimov IR, Moskowitz IP (2020) Transcriptional Patterning of the Ventricular Cardiac Conduction System. Circulation Research 127:e94-e106. doi:10.1161/CIRCRESAHA.118.314460. 

      (2) Mohan RA, Bosada FM, van Weerd JH, van Duijvenboden K, Wang J, Mommersteeg MTM, Hooijkaas IB, Wakker V, de Gier-de Vries C, Coronel R, Boink GJJ, Bakkers J, Barnett P, Boukens BJ, Christoffels VM (2020) T-box transcription factor 3 governs a transcriptional program for the function of the mouse atrioventricular conduction system. Proc Natl Acad Sci U S A. 117:18617-18626. doi: 10.1073/pnas.1919379117.

      (3) Arnolds DE, Liu F, Fahrenbach JP, Kim GH, Schillinger KJ, Smemo S, McNally EM, Nobrega MA, Patel VV, Moskowitz IP (2012) TBX5 drives Scn5a expression to regulate cardiac conduction system function. The Journal of Clinical Investigation 122:2509–2518. doi: 10.1172/JCI62617.

      (4) Frank DU, Carter KL, Thomas KR, Burr RM, Bakker ML, Coetzee WA, Tristani-Firouzi M, Bamshad MJ, Christoffels VM, Moon AM (2012) Lethal arrhythmias in Tbx3-deficient mice reveal extreme dosage sensitivity of cardiac conduction system function and homeostasis. Proc Natl Acad Sci U S A. 109:E154-63. doi: 10.1073/pnas.1115165109.

      (5) Moskowitz IP, Pizard A, Patel VV, Bruneau BG, Kim JB, Kupershmidt S, Roden D, Berul CI, Seidman CE, Seidman JG (2004) The T-Box transcription factor Tbx5 is required for the patterning and maturation of the murine cardiac conduction system. Development 131:4107-4116. doi: 10.1242/dev.01265. PMID: 15289437.

      (3) The authors claim that double knockout VCS cells transform to working myocardial fate, but there is no comparison of gene expression levels between actual working myocardial cells and the Tbx3/Tbx5 DKO VCS cells so it's hard to know if the data reflect an actual cell state change or a more non-specific phenomenon with global dysregulation of gene expression or perhaps dedifferentiation. I understand that the upregulation of Gja1 and Smpx is intended to address this, but it's only two genes and it seems relevant to understand their degree of expression relative to actual working myocardium. In addition, the gene panel is somewhat limited and does not include other key transcriptional regulators in the VCS such as Irx3 and Nkx2-5. RNA-seq in these populations would provide a clearer comparison among the groups.

      And

      the main claims are that the absence of both factors results in a transcriptional shift of conduction tissue towards a working myocardial phenotype, and that this shift indicates that Tbx5 and Tbx3 "coordinate" to control VCS identity and function. However, only limited data are presented to support the claim of transcriptional reprogramming since the knockout cells are not directly compared to working myocardial cells at the transcriptional level and only a small number of key genes are assessed (versus genome-wide assessment).

      We appreciate Reviewer #2’s suggestion to expand the gene expression analysis in Tbx3/Tbx5-deficient VCS cells by including other specific genes and comparisons with “native”/actual working ventricular myocardial cells and broadening the gene panel. In this study, we evaluated core cardiac conduction system markers, revealing a loss of conduction system-specific gene expression in the double mutant VCS. Furthermore, we evaluated key working myocardial markers normally excluded from the conduction system, Gja1 and Smpx, revealing a shift towards a working myocardial state in the double mutant VCS (Figure 4). We agree that a more comprehensive analysis, such as transcriptome-wide approaches, would offer greater clarity on the extent and specificity of the observed shift from conduction to non-conduction identity. These approaches are appropriate directions for future studies.

      (4) From the optical mapping data, it is difficult to distinguish between the presence of (a) a focal proximal right bundle branch block due to dysregulation of gene expression in the VCS but overall preservation of the right bundle and its distal ramifications; from (b) actual loss of the VCS with reversion of VCS cells to a working myocardial fate. Related to this, the authors claim that this experiment allows for direct visualization of His bundle activation, but can the authors confirm or provide evidence that the tissue penetration of their imaging modality allows for imaging of a deep structure like the AV bundle as opposed to the right bundle branch which is more superficial? Does the timing of the separation of the sharp deflection from the subsequent local activation suggest visualization of more distal components of the VCS rather than the AV bundle itself? Additional clarification would be helpful.

      And

      In addition, the optical mapping dataset is incomplete and has alternative interpretations that are not excluded or thoroughly discussed.

      We agree with Reviewer #2 that the resolution of the optical mapping experiment may be insufficient to precisely localize the conduction block due to the limited signal strength from the VCS. It is possible that the region defined as the His Bundle also includes portions of the right bundle branch. Our control mice show VCS OAP upstrokes consistent with those reported by Tamaddon et al. (2000) using Di-4-ANEPPS (1). We appreciate the Reviewer’s attention to alternative interpretations, and we will incorporate these caveats into the manuscript text.

      (1) Tamaddon HS, Vaidya D, Simon AM, Paul DL, Jalife J, Morley GE (2000) High-resolution optical mapping of the right bundle branch in connexin40 knockout mice reveals slow conduction in the specialized conduction system. Circulation Research 87:929-36. doi: 10.1161/01.res.87.10.929. 

      Impact:

      The present study contributes a novel and elegantly constructed mouse model to the field. The data presented generally corroborate existing models of transcriptional regulation in the VCS but do not, as presented, constitute a decisive advance.

      And

      In sum, while this study adds an elegantly constructed genetic model to the field, the data presented fit well within the existing paradigm of established functions of Tbx3 and Tbx5 in the VCS and in that sense do not decisively advance the field. Moreover, the authors' claims about the implications of the data are not always strongly supported by the data presented and do not fully explore alternative possibilities.

      We appreciate Reviewer #2’s acknowledgment of the elegance and novelty of the mouse model we generated. However, we respectfully disagree with their assessment that this work merely corroborates existing models without providing a decisive advance. Previous studies have investigated single Tbx5 or Tbx3 gene knockouts in depth and established the T-box ratio model for distinguishing fast VCS from slow nodal conduction identity (1), to which the reviewer alludes in earlier comments. In contrast, this study aimed to explore a different model: that the combined effects of Tbx5 and Tbx3 distinguish adult VCS identity from non-conduction working myocardium. The coordinated role of Tbx3 and Tbx5 in conduction system identity remained untested due to the lack of a mouse model that allowed their simultaneous removal. The very model the reviewer recognizes as “novel and elegantly constructed” has allowed the examination of the coordinated role of Tbx5 and Tbx3 for the first time. While we acknowledge the opportunity for additional depth of investigation of this model in future studies, the data we present provide consistent experimental support for the coordinated requirement of both Tbx5 and Tbx3 for ventricular cardiac conduction system identity.

      (1) Burnicka-Turek O, Broman MT, Steimle JD, Boukens BJ, Petrenko NB, Ikegami K, Nadadur RD, Qiao Y, Arnolds DE, Yang XH, Patel VV, Nobrega MA, Efimov IR, Moskowitz IP (2020) Transcriptional Patterning of the Ventricular Cardiac Conduction System. Circulation Research 127:e94-e106. doi:10.1161/CIRCRESAHA.118.314460. 

      Reviewer #3 (Public review):

      Summary:

      In the study presented by Burnicka-Turek et al., the authors generated for the first time a mouse model to cause the combined conditional deletion of Tbx3 and Tbx5 genes. This has been impossible to achieve to date due to the proximity of these genes in chromosome 5, preventing the generation of loss of function strategies to delete simultaneously both genes. It is known that both Tbx3 and Tbx5 are required for the development of the cardiac conduction system by transcription factor-specific but also overlapping roles as seen in the common and diverse cardiac defects found in patients with mutations for these genes. After validating the deletion efficiency and specificity of the line, the authors characterized the cardiac phenotype associated with the cardiac conduction system (CCS)-specific combined deletion of Tbx5 and Tbx3 in the adult by inducing the activation of the CCS-specific tamoxifen-inducible Cre recombination (MinK-creERT) at 6 weeks after birth. Their analysis of 8-9-week-old animals did not identify any major morphological cardiac defects. However, the authors found conduction defects including prolonged PR and QRS intervals and ventricular tachycardia causing the death of the double mutants, which do not survive more than 3 months after tamoxifen induction. Molecular and optical mapping analysis of the ventricular conduction system (VCS) of these mutants concluded that, in the absence of Tbx5 and Tbx3 function, the cells forming the ventricular conduction system (VCS) become working myocardium and lose the specific contractile features characterizing VCS cells. Altogether, the study identified the critical combined role of Tbx3 and Tbx5 in the maintenance of the VCS in adulthood.

      Strengths:

      The study generated a new animal model to study the combined deletion of Tbx5 and Tbx3 in the cardiac conduction system. This unique model has provided the authors with the perfect tool to answer their biological questions. The study includes top-class methodologies to assess the functional defects present in the different mutants analyzed, and gathered very robust functional data on the conduction defects present in these mutants. They also applied optical action potential (OAP) methods to demonstrate the loss of conduction action potential and the acquisition of working myocardium action potentials in the affected cells because of Tbx5/Tbx3 loss of function. The study used simpler molecular and morphological analysis to demonstrate that there are no major morphological defects in these mutants and that indeed, the conduction defects found are due to the acquisition of working myocardium features by the VCS cells. Altogether, this study identified the critical role of these transcription factors in the maintenance of the VCS in the adult heart.

      We appreciate the Reviewer’s comments regarding the originality and utility of our model and the strengths of our methodological approach. The Reviewer’s appreciation of the molecular and morphological analyses as well as their constructive feedback is highly valuable.

      Weaknesses:

      In the opinion of this reviewer, the weakness in the study lies in the morphological and molecular characterization. The morphological analysis simply described the absence of general cardiac defects in the adult heart, however, whether the CCS tissues are present or not was not investigated. Lineage tracing analysis using the reporter lines included in the crosses described in the study will determine if there are changes in CCS tissue composition in the different mutants studied. Similarly, combining this reporter analysis with the molecular markers found to be dysregulated by qPCR and western blot, will demonstrate that indeed the cells that were specified as VCS in the adult heart, become working myocardium in the absence of Tbx3 and Tbx5 function.

      We appreciate the reviewer’s concern regarding the morphology of the cardiac conduction system in the Tbx3/Tbx5 double conditional knockout model. We did not observe any structural abnormalities, as the Reviewer notes. We agree with their suggestion for using Genetic Inducible Fate Mapping to mark cardiac conduction cells expressing MinKCre. In fact, we utilized this approach to isolate VCS cells for transcriptional profiling. Specifically, we combined the tamoxifen-inducible MinKCreERT allele with the Cre-dependent R26Eyfp reporter allele to label MinKCre-expressing cells in both control VCS and VCS-specific double Tbx3/Tbx5 knockouts. EYFP-positive cells were isolated for transcriptional studies, ensuring that our analysis exclusively targeted conduction system-lineage marked cells. The ability to isolate MinKCre-marked cells from both controls and Tbx5/Tbx3 double mutants indicates that VCS cells persisted in the double knockout. Nonetheless, the suggestion for in-vivo marking by Genetic Inducible Fate Mapping and morphologic analysis is a valuable recommendation for future studies.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Mutations in CDHR1, the human gene encoding an atypical cadherin-related protein expressed in photoreceptors, are thought to cause cone-rod dystrophy (CRD). However, the pathogenesis leading to this disease is unknown. Previous work has led to the hypothesis that CDHR1 is part of a cadherin-based junction that facilitates the development of new membranous discs at the base of the photoreceptor outer segments, without which photoreceptors malfunction and ultimately degenerate. CDHR1 is hypothesized to bind to a transmembrane partner to accomplish this function, but the putative partner protein has yet to be identified.

      The manuscript by Patel et al. makes an important contribution toward improving our understanding of the cellular and molecular basis of CDHR1-associated CRD. Using gene editing, they generate a loss of function mutation in the zebrafish cdhr1a gene, an ortholog of human CDHR1, and show that this novel mutant model has a retinal dystrophy phenotype, specifically related to defective growth and organization of photoreceptor outer segments (OS) and calyceal processes (CP). This phenotype seems to be progressive with age. Importantly, Patel et al, present intriguing evidence that pcdh15b, also known for causing retinal dystrophy in previous Xenopus and zebrafish loss of function studies, is the putative cdhr1a partner protein mediating the function of the junctional complex that regulates photoreceptor OS growth and stability.

      This research is significant in that it:

      (1) provides evidence for a progressive, dystrophic photoreceptor phenotype in the cdhr1a mutant and, therefore, effectively models human CRD; and

      (2) identifies pcdh15b as the putative, and long sought after, binding partner for cdhr1a, further supporting the theory of a cadherin-based junction complex that facilitates OS disc biogenesis.

      Nonetheless, the study has several shortcomings in methodology, analysis, and conceptual insight, which limits its overall impact.

      Below I outline several issues that the authors should address to strengthen their findings.

      Major comments:

      (1) Co-localization of cdhr1a and pcdh15b proteins

      The model proposed by the authors is that the interaction of cdhr1a and pcdh15b occurs in trans as a heterodimer. In cochlear hair cells, PCDH15 and CDH23 are proposed to interact first as dimers in cis and then as heteromeric complexes in trans. This was not shown here for cdhr1a and pcdh15b, but it is a plausible configuration, as are single heteromeric dimers or homodimers. Regardless, this model depends on the differential compartmental expression of the cdhr1a and pcdh15b proteins. Data in Figure 1 show convincing evidence that these two proteins can, at least in some cases, be distributed along the length of photoreceptor membranes that are juxtaposed, as would be the case for OS and CP. If pcdh15b is predominantly expressed in CPs, whereas cdhr1a is predominantly expressed in OS, then this should be confirmed with actin double labeling with cdhr1a and pcdh15b since the apicobasal oriented (vertical) CPs would express actin in this same orientation but not in the OS. This would help to clarify whether cdhr1a and pcdh15b can be trafficked to both OS and CP compartments or whether they are mutually exclusive.

      First, let me thank the reviewer for taking the time to comprehensively evaluate our work and provide constructive criticism, which will improve the quality of our final version.

      To address this issue, we are undertaking imaging of actin/cdhr1a and actin/pcdh15b using SIM in both transverse and axial sections. Additionally, we have recently established an immuno-gold-TEM protocol and are going to provide data showcasing co-labeling of cdhr1a and pcdh15b at TEM resolution.

      Photoreceptor heterogeneity goes beyond the cone versus rod subtypes discussed here and it is known that in zebrafish, CP morphology is distinct in different cone subtypes as well as cone versus rod. It would be important to know which specific photoreceptor subtypes are shown in zebrafish (Figures 1A-C) and the non-fish species depicted in Figures 1E-L. Also, a larger field of view of the staining patterns for Figures 1E-L would be a helpful comparison (could be added as a supplementary figure).

      The revised manuscript will include clear labeling of the different cone cell types as well as lower magnification images to be included as supplemental figures.

      (2) Cdhr1a function in cell culture

      The authors should explain the multiple bands in the anti-FLAG blots. Also, it would be interesting to confirm that the cdhr1a D173 mutant prevents the IP interaction with pcdh15b as well as the additive effects in aggregate assays of Figure 2.

      We believe that the D173 mutation results in no cdhr1a polypeptide, based on the lack of in situ signal in our WISH studies (figures showing absence of cdhr1a mRNA will be provided in a new supplemental figure). However, we will clone the D173 mutant and attempt co-IP with pcdh15b in our cell culture system as well as the aggregation assay using K562 cells.

      Is it possible that the cultured cells undergo proliferation in the aggregation assays shown in Figure 2? Cells might differentially proliferate as clusters form in rotating cultures. A simple assay for cell proliferation under the different transfection conditions showing no differences would address this issue and lend further support to the proposed specific changes to cell adhesion as a readout of this assay.

      This is a possibility; however, we did not use rotating cultures but a monolayer culture. We did not observe any differences in total cell number between the different transfections. As such, we do not feel that proliferation explains the aggregation of K562 cells.

      Also, the authors report that the number of clusters was normalized to the field of view, but this was not defined. Were the n values different fields of view from one transfection experiment, or were they different fields of view from separate transfection experiments? More details and clarification are needed.

      This will be clarified in the revised manuscript. In short, we replicated this experiment 3 times, quantifying 5 different fields of view in each replicate.
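
      A minimal sketch of how cluster counts from such a design (3 replicate experiments, 5 fields of view each) could be summarized per replicate. The counts and the names `counts` and `clusters_per_field` are hypothetical and serve only to illustrate the normalization; they are not data from the study.

```python
from statistics import mean

# Hypothetical cluster counts: 3 replicate experiments x 5 fields of view each.
# These numbers are illustrative only, not measurements from the study.
counts = {
    "replicate_1": [4, 6, 5, 7, 3],
    "replicate_2": [5, 5, 6, 4, 5],
    "replicate_3": [3, 4, 6, 5, 7],
}

def clusters_per_field(counts_by_replicate):
    """Return the mean number of clusters per field of view for each replicate."""
    return {rep: mean(fields) for rep, fields in counts_by_replicate.items()}

per_replicate = clusters_per_field(counts)

# Total number of fields scored, and the number of independent replicates.
n_fields = sum(len(fields) for fields in counts.values())
n_replicates = len(counts)
```

      Reporting both `n_fields` and `n_replicates` makes explicit whether the n in a figure refers to fields of view or to independent experiments.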

      (3) Methodological issues in quantification and statistical analyses

      Were all the OS and CP lengths counted in the observation region or just a sample within the region? If the latter, what were the sampling criteria? For CPs, it seems that the length was an average estimate based on all CPs observed surrounding one cone or one-rod cell. Is this correct? Again, if sampled, how was this implemented? In Fig 4M', the cdhr1a-/- ROS mostly looks curvilinear. Did the measurements account for this, or were they straight linear dimension measurements from base to tip of the OS as depicted in Fig 5A-E? A clearer explanation of the OS and CP length quantification methodology is required.

      The revised manuscript will clearly outline measurement methods. In short, we measured every CP/OS in the imaged regions. We did not average CPs per cell; we simply included all CP measurements in our analysis. All our CP measurements (actin, cdhr1a, or pcdh15) were done in the presence of a counterstain (WGA, prph2, gnb1, or PNA) to provide a landmark for proper measurement and to ensure association with the proper cell type.

      For consistency, all measurements were taken, as far as possible, as straight linear dimensions.

      How were cone and rod photoreceptor cell counts performed? The legend in Figure 4 states that they again counted cells in the observation region, but no details were provided. For example, were cones and rods counted as an absolute number of cells in the observation region (e.g., number of cones per defined area) or relative to total (DAPI+) cell nuclei in the region? Changes in cell density in the mutant (smaller eye or thinner ONL) might affect this quantification so it would be important to know how cell quantification was normalized.

      The revised manuscript will clearly outline measurement methods. In short, rod and cone cell counts were based on the number of outer segments that were observed in the imaging region and previously measured for length. We did not observe any eye size differences in our mutant fish.

      In Figure 6I, K, measuring the length of the signal seems problematic. The dimension of staining is not always in the apicobasal (vertical) orientation. It might be more accurate to measure the cdhr1a expression domain relative to the OS (since the length of the OS is already reduced in the mutants). Another possible approach could be to measure the intensity of cdhr1 staining relative to the intensity within a Prph2 expression domain in each group. The authors should provide complementary evidence to support their conclusion.

      The revised manuscript will clearly outline measurement methods. In short, all of our CP measurements (actin, cdhr1a, or pcdh15) were done in the presence of a counterstain (WGA, prph2, gnb1, or PNA) to ensure proper measurements and association with the proper cell type.

      A better description of the statistical methodology is required. For example, the authors state that "each of the data points has an n of 5+ individuals." This is confusing and could indicate that in Figure 4F alone there were ~5000 individuals assayed (~100 data points per treatment group x n=5 individuals per data point x 10 treatment groups). I don't think that is what the authors intended. It would be clearer if the authors stated how many OS, CP, or cells were counted in their observation region averaged per individual, and then provided the n value of individuals used per treatment group (controls and mutants), on which the statistical analyses should be based.

      This will be addressed in the revised manuscript. In short, we analyzed n=5 individual fish for each genotype/time point. We will also include the numbers of OS/CP quantified in the observation regions.

      There are hundreds of data points in the separate treatment groups shown in several of the graphs. It would not be correct to perform the ANOVA on the separate OS or CP length measurements alone as this will bias the estimates since they are not all independent samples. For example, in Figure 6H, 5dpf pcdh15b+/- have shorter CPs compared to WT but pcdh15b-/- have longer compared to WT. This could be an artifact of the analysis. Moreover, the authors should clarify in the Methods section which ANOVA post hoc tests were used to control for multiple pairwise comparisons.

      This will be clarified in the revised manuscript.
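
      The reviewer's concern about non-independent measurements can be illustrated with a minimal sketch. It is not the authors' analysis: the records are hypothetical numbers, and the helper names `per_fish_means` and `one_way_anova_f` are illustrative assumptions. The idea is to collapse the many OS/CP lengths from each fish to one mean per fish, and then compute the one-way ANOVA F statistic on those fish-level means, so that the individual fish is the unit of replication.

```python
from collections import defaultdict
from statistics import mean

def per_fish_means(measurements):
    """Collapse (group, fish_id, length) records to one mean length per fish."""
    by_fish = defaultdict(list)
    for group, fish_id, length in measurements:
        by_fish[(group, fish_id)].append(length)
    by_group = defaultdict(list)
    for (group, _fish_id), lengths in sorted(by_fish.items()):
        by_group[group].append(mean(lengths))
    return dict(by_group)

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict mapping group -> values."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within

# Hypothetical lengths (arbitrary units): two fish per genotype, two cells per fish.
records = [
    ("WT", "fish1", 10), ("WT", "fish1", 12),
    ("WT", "fish2", 12), ("WT", "fish2", 14),
    ("mut", "fish3", 6), ("mut", "fish3", 8),
    ("mut", "fish4", 8), ("mut", "fish4", 10),
]
fish_level = per_fish_means(records)
f_stat, df_b, df_w = one_way_anova_f(fish_level)
```

      With this layout the within-group degrees of freedom reflect the number of fish rather than the number of measured cells, which is the point the reviewer raises. A post hoc test that controls for multiple comparisons would still be needed for pairwise contrasts.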

      (4) Cdhr1a function in photoreceptors

      The cdhr1a IHC staining in 5dpf WT larvae in Figure 3E appears different from the cdhr1a IHC staining in 5dpf WT larvae in Figure 1A or Figure 6I. Perhaps this is just the choice of image. Can the authors comment or provide a more representative image?

      The image in Figure 3E was captured using an earlier protocol without antigen retrieval, which limits the resolution of the cdhr1a signal along the CP. In the revised manuscript we will include an image that better represents cdhr1a staining in the WT and mutant.

      The authors show that pcdh15b localization after 5dpf mirrored the disorganization of the CP observed with actin staining. They also show in Figure 5O that at 180dpf, very little pcdh15b signal remains. They suggest based on this data that total degradation of CPs has occurred in the cdhr1a-/- photoreceptors by this time. However, although reduced in length, COS and cone CPs are still present at 180dpf (Figure 5E, E'). Thus, contrary to the authors' general conclusion, it is possible that the localization, trafficking, and/or turnover of pcdh15b is maintained through a cdhr1a-dependent mechanism, irrespective of the degree to which CPs are maintained. The experiments presented here do not clearly distinguish between a requirement for maintenance of localization versus a secondary loss of localization due to defective CPs.

      We agree, this point will be addressed in our revised manuscript.

      (5) Conceptual insights

      The authors claim that cdhr1a and pcdh15b double mutants have synergistic OS and CP phenotypes. I think this interpretation should be revisited.

      First, assuming the model of cdhr1a-pcdh15b interaction in trans is correct, the authors have not adequately explained the logic of why disrupting one side of this interaction in a single mutant would not give the same severity of phenotype as disrupting both sides of this interaction in a double mutant.

      Second, and perhaps more critically, at 10dpf the OS and CP lengths in cdhr1a-/- mutants (Figure 7J, T) are significantly increased compared to WT. In contrast, there are no significant differences in these measurements in the pcdh15b-/- mutants. Yet in double homozygous mutants, there is a significant reduction of ~50% in these measurements compared to WT. A synergistic phenotype would imply that each mutant causes a change in the same direction and that the magnitude of this change is beyond additive in the double mutants (but still in the same direction). Instead, I would argue that the data presented in Figure 7 suggest that there might be a functionally antagonistic interaction between cdhr1a and pcdh15b with respect to OS and CP growth at 10dpf.

      If these proteins physically interacted in vivo, it would appear that the interaction is complex and that this interaction underlies both OS growth-promoting and growth-restraining (stabilizing) mechanisms working in concert. Perhaps separate homodimers or heterodimers subserve distinct CP-OS functional interactions. This might explain the age-dependent differences in mutant CP and OS length phenotypes if these mechanisms are temporally dynamic or exhibit distinct OS growth versus maintenance phases. Regardless of my speculations, the model presented by the authors appears to be too simplistic to explain the data.

      We agree with the reviewer and will address this conclusion in our revised manuscript. To do so, we will revise our final model to allow more flexibility in the proposed mechanisms.

      Reviewer #2 (Public review):

      Summary:

      The goal of this study was to develop a model for CDHR1-based cone-rod dystrophy and study the role of this cadherin in cone photoreceptors. Using genetic manipulation, a cell binding assay, and high-resolution microscopy the authors find that like rods, cones localize CDHR1 to the lateral edge of outer segment (OS) discs and closely oppose PCDH15b which is known to localize to calyceal processes (CPs). Ectopic expression of CDHR1 and PCDH15b in K652 cells indicates these cadherins promote cell aggregation as heterophilic interactants, but not through homophilic binding. This data suggests a model where CDHR1 and PCDH15b link OS and CPs and potentially stabilize cone photoreceptor structure. Mutation analysis of each cadherin results in cone structural defects at late larval stages. While pcdh15b homozygous mutants are lethal, cdhr1 mutants are viable and subsequently show photoreceptor degeneration by 3-6 months.

      Strengths:

      A major strength of this research is the development of an animal model to study the cone-specific phenotypes associated with CDHR1-based CRD. The data supporting CDHR1 (OS) and PCDH15 (CP) binding is also a strength, although this interaction could be better characterized in future studies. The quality of the high-resolution imaging (at the light and EM levels) is outstanding. In general, the results support the conclusions of the authors.

      Weaknesses:

      While the cellular phenotyping is strong, the functional consequences of CDHR1 disruption are not addressed. While this is not the focus of the investigation, such analysis would raise the impact of the study overall. This is particularly important given some of the small changes observed in OS and CP structure. While statistically significant, are the subtle changes biologically significant? Examples include cone OS length (Figures 4F, 6E) as well as other morphometric data (Figure 7I in particular). Related, for quantitative data and analysis throughout the manuscript, more information regarding the number of fish/eyes analyzed as well as cells per sample would provide confidence in the rigor. The authors should also note whether the analysis was done in an automated and/or masked manner.

      First, let me thank the reviewer for taking the time to comprehensively evaluate our work and provide constructive criticism, which will improve the quality of our final version.

      The revised manuscript will clearly outline both the methods and the statistics used for quantitation of our data (please see our responses to reviewer 1). While we do not include direct evidence of the mechanism of CDHR1 function, we do propose that its role is important in anchoring the CP and the OS, particularly in the cones, while in rods it may serve to regulate the release of newly formed disks (as previously proposed in mice). We do plan to test both of these hypotheses directly; however, that will be the basis of our future studies.

      Reviewer #3 (Public review):

      Summary:

      The manuscript by Patel et al investigates the hypothesis that CDHR1a on photoreceptor outer segments is the binding partner for PCDH15 on the calyceal processes, and the absence of either adhesion molecule results in separation between the two structures, eventually leading to degeneration. PCDH15 mutations cause Usher syndrome, a disease of combined hearing and vision loss. In the ear, PCDH15 binds CDH23 to form tip links between stereocilia. The vision loss is less understood. Previous work suggested PCDH15 is localized to the calyceal processes, but the expression of CDH23 is inconsistent between species. Patel et al suggest that CDHR1a (formerly PCDH21) fulfills the role of CDH23 in the retina.

      The experiments are mainly performed using the zebrafish model system. Expression of Pcdh15b and Cdhr1a protein is shown in the photoreceptor layer through standard confocal and structured illumination microscopy. The two proteins co-IP and can induce aggregation in vitro. Loss of either Cdhr1a or Pcdh15, or both, results in degeneration of photoreceptor outer segments over time, with cones affected primarily.

      The idea of the study is logical given the photoreceptor diseases caused by mutations in either gene, the comparisons to stereocilia tip links, and the protein localization near the outer segments. The work here demonstrates that the two proteins interact in vitro and are both required for ongoing outer segment maintenance. The major novelty of this paper would be the demonstration that Pcdh15 localized to calyceal processes interacts with Cdhr1a on the outer segment, thereby connecting the two structures. Unfortunately, the data presented are inadequate proof of this model.

      Strengths:

      The in vitro data to support the ability of Pcdh15b and Cdhr1a to bind is well done. The use of pcdh15b and cdhr1a single and double mutants is also a strength of the study, especially being that this would be the first characterization of a zebrafish cdhr1a mutant.

      Weaknesses:

      (1) The imaging data in Figure 1 is insufficient to show the specific localization of Pcdh15 to calyceal processes or Cdhr1a to the outer segment membrane. The addition of actin co-labelling with Pcdh15/Cdhr1a would be a good start, as would axial sections. The division into rod and cone-specific imaging panels is confusing because the two cell types are in close physical proximity at 5 dpf, but the cone Cdhr1a expression is somehow missing in the rod images. The SIM data appear to be disrupted by chromatic aberration but also have no context. In the zebrafish image, the lines of Pcdh15/Cdhr1a expression would be 40-50 um in length if the scale bar is correct, which is much longer than the outer segments at this stage and therefore hard to explain.

      First, let me thank the reviewer for taking the time to comprehensively evaluate our work and provide constructive criticism, which will improve the quality of our final version.

      To address this issue, we are undertaking imaging of actin/cdhr1a and actin/pcdh15b using SIM in both transverse and axial sections. Additionally, we have recently established an immuno-gold-TEM protocol and are going to provide data showcasing co-labeling of cdhr1a and pcdh15b at TEM resolution. We are also going to include lower magnification images to complement the SIM images presented in figure 1.

      (2) Figure 3E staining of Cdhr1a looks very different from the staining in Figure 1. It is unclear what the authors are proposing as to the localization of Cdhr1a. In the lab's previous paper, they describe Cdhr1a as being associated with the connecting cilium and nascent OS discs, and fail to address how that reconciles with the new model of mediating CP-OS interaction. And whether Cdhr1a localizes to discrete domains on the disc edges, where it interacts with Pcdh15 on individual calyceal processes.

      The image in Figure 3E was captured using an earlier protocol without antigen retrieval, which limits the resolution of the cdhr1a signal along the CP. In the revised manuscript we will include an image that better represents cdhr1a staining in the WT and mutant.

      (3) The authors state "In PRCs, Pcdh15 has been unequivocally shown to be localized in the CPs". However, the immunostaining here does not match the pattern seen in the Miles et al 2021 paper, which used a different antibody. Both showed loss of staining in pcdh15b mutants so unclear how to reconcile the two patterns.

      We agree that our staining appears different, but we attribute this to our antigen retrieval protocol which differed from the Miles et al paper. We also point to the fact that pcdh15b localization has been shown to be similar to our images in other species (monkey and frog). As such, we believe our protocol reveals the proper localization pattern which might be lost/hampered in the procedure used in Miles et al 2021.

      (4) The explanation for the CRISPR targets for cdhr1a and the diagram in Figure 3 does not fit with crRNA sequences or the mutation as shown. The mutation spans from the latter part of exon 5 to the initial portion of exon 6, removing intron 5-6. It should nevertheless be a frameshift mutation but requires proper documentation.

      This was an error that we overlooked when preparing the figure. We apologize and will correct it in the revised manuscript.

      (5) There are complications with the quantification of data. First, the number of fish analyzed for each experiment is not provided, nor is the justification for performing statistics on individual cell measurements rather than using averages for individual fish. Second, all cone subtypes are lumped together for analysis despite their variable sizes. Third, t-tests are inappropriately used for post-hoc analysis of ANOVA calculations.

      As we discussed for reviewers 1 and 2, all methods and quantification/statistics will be clearly described in the revised manuscript.

      (6) Unclear how calyceal process length is being measured. The cone measurements are shown as starting at the external limiting membrane, which is not equivalent to the origin of calyceal processes, and it is uncertain what defines the apical limit given the multiple subtypes of cones. In Figure 5, the lines demonstrating the measurements seem inconsistently placed.

      As we discussed for reviewers 1 and 2, all methods and quantification/statistics will be clearly described in the revised manuscript.

      (7) The number of fish analyzed by TEM and the prevalence of the phenotype across cells are not provided. A lower magnification view would provide context. Also, the authors should explain whether or not overgrowth of basal discs was observed, as seen previously in cdhr1-null frogs (Carr et al., 2021).

      The revised manuscript will include the aforementioned statistics and lower magnification images. We will also compare our results directly to Carr et al., 2021.

      (8) The statement describing the separation between calyceal processes and the outer segment in the mutants is not backed up by the data. TEM or co-labelling of the structures in SIM could be done to provide evidence.

      We will work to include more TEM and co-labeling data for the revised manuscript (see our responses to reviewer 1).

      (9) "Based on work in the murine model and our own observations of rod CPs, we hypothesize that zebrafish rod CPs only extend along the newly forming OS discs and do not provide structural support to the ROS." Unclear how murine work would support that conclusion given the lack of CPs in mice, or what data in the manuscript supports this conclusion.

      In the revised manuscript we will improve our discussion of murine CPs, in that we still detect the juxtaposition of cdhr1 and pcdh15 along a potential remnant of the CP, as previously described in SEM studies. Our findings do not indicate that mice or rats have CPs; we simply wanted to point out that the behavior of cdhr1 and pcdh15 remains conserved despite the absence of long, traditional CPs.

      (10) The authors state "from the fact that rod CPs are inherently much smaller than cone CPs" without providing a reference. In the manuscript, the measurements do show rod CPs to be shorter, but there are errors in the cone measurements, and it is possible that the RPE pigment is interfering with the rod measurements.

      We will include a reference where rod CPs have been found to be shorter (monkey and frog data). We have no doubt that in zebrafish the rod CPs are significantly shorter. All our CP measurements are done with a counter stain for rods and cones to be sure that we are measuring the correct cell type.

      (11) The discussion should include a better comparison of the results with ocular phenotypes in previously generated pcdh15 and cdhr1 mutant animals.

      In the revised manuscript we will include this in our discussion.

      (12) The images in panels B-F of the Supplemental Figure are uncannily similar, possibly even of the same fish at different focal planes.

      We assure the reviewer that each of the images in Supplemental Figure 1 is distinct and represents a different in situ experiment.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors report compound heterozygous deleterious variants in the kinase domains of the non-receptor tyrosine kinases (NRTK) TNK2/ACK1 in familial SLE. They suggest that ACK1 and BRK deficiencies are associated with human SLE and impair efferocytosis.

      Strengths: 

      The identification of similar mutations in non-receptor tyrosine kinases (NRTKs) in two different families with familial SLE is a significant finding in human disease. Furthermore, the paper provides a detailed analysis of the molecular mechanisms behind the impairment of efferocytosis caused by mutations in ACK1 and BRK.

      Weaknesses: 

      A critical point in this paper is whether the loss of function of ACK1 or BRK contributes to the onset of familial SLE. The authors emphasize that inhibitors of ACK1/BRK worsened IgG deposition in the kidneys in a pristane-induced SLE model, which contributes not to the onset but to the exacerbation of SLE, thus only partially supporting their claim.

      The evidence supporting that the loss of function of ACK1 or BRK contributes to the onset of SLE in the patients from the 2 families mostly relies on the genetic analysis. As the reviewer states, the observation that inhibitors of ACK1/BRK worsened IgG deposition in the kidneys in a pristane-induced SLE model supports the genetic evidence.

      To further address the possible role of ACK1 or BRK variants in the onset of autoimmunity in vivo, we treated wild-type (WT) BALB/cByJ female mice with inhibitors in the absence of pristane.

      The results indicated that mice that had received a weekly injection of ACK1 or BRK inhibitors developed a large array of serum anti-nuclear IgG antibodies, including but not limited to autoantibodies associated with SLE such as anti-histone, anti-chromatin, anti-U1-snRNP, anti-SSA, and anti-Ku, in comparison to the control group (Revised Fig 3A). However, they did not develop glomerular IgG deposits after 12 weeks of treatment, in contrast to mice that had received pristane (Revised Fig. 3B,C, Figure 3-figure supplement 1).

      These additional data suggest that inhibition of ACK1 and BRK stimulates the production of serum autoantibodies, which strengthens the claim that ACK1 and BRK kinase deficiencies contribute to autoimmunity in BALB/cByJ mice.

      Reviewer #2 (Public Review):

      Summary: 

      In this manuscript, the authors revealed that genetic deficiencies of ACK1 and BRK are associated with human SLE. First, the authors found that compound heterozygous deleterious variants in the kinase domains of the non-receptor tyrosine kinases (NRTK) TNK2/ACK1 in one multiplex family and PTK6/BRK in another family. Then, by an experimental blockade of ACK1 or BRK in a mouse SLE model, they found an increase in glomerular IgG deposits and circulating autoantibodies. Furthermore, they reported that ACK and BRK variants from the SLE patients impaired the MERTK-mediated anti-inflammatory response to apoptotic cells in human induced pluripotent stem cells (hiPSC)-derived macrophages. This work identified new SLE-associated ACK and BRK variants and a role for the NRTK TNK2/ACK1 and PTK6/BRK in efferocytosis, providing a new molecular and cellular mechanism of SLE pathogenesis.

      Strengths: 

      This work identified new SLE-associated ACK and BRK variants and a role for the NRTK TNK2/ACK1 and PTK6/BRK in efferocytosis, providing a new molecular and cellular mechanism of SLE pathogenesis.

      Weaknesses: 

      Although the manuscript is well-organized and clearly stated, there are some points below that should be considered:

      In this study, the authors used forward genetic analyses to identify novel gene mutations that may cause SLE, combined with GWAS studies of SLE. To further explore the importance of these variants, haplotype analysis of two candidate genes could be performed, to observe the evolution and selection relationship of candidate genes in the population (UK 1000 biobank, for example). 

      To investigate whether ACK1/TNK2 or BRK/PTK6 were subject to selection, we gathered data using different metrics quantifying negative selection in the human genome. We collected the f parameter from SnIPRE (1), LoFtool (2), and EvoTol (3), as well as intraspecies metrics from RVIS (4), LOEUF (5), and pLI (6) (including pRec). We also used our in-house CoNeS metric (7). None of these indicators suggest that the genes are under strong negative selection (Revised Figure 2-figure supplement 2). This is consistent with the deficiency being recessive. We also tested the variants with a MAF greater than 0.005. We found them to be neutral. We therefore did not test whether they were associated with any phenotype in the UK Biobank.

      Although the authors focused on SLE and macrophage efferocytosis in their studies, direct evidence of how macrophage efferocytosis significantly affects SLE is lacking. This point should at least be explicitly introduced and discussed by citing appropriate literature.

      We provide a more detailed description of the role of macrophage efferocytosis in autoimmunity and SLE in the revised manuscript. Specifically, we state (in the results section, paragraph: ACK1 and BRK kinase domain variants may lose the ability to link MERTK to RAC1, AKT and STAT3 activation for efferocytosis): “NRTKs such as ACK1 8 and PTK2/FAK 9 are also downstream targets of the TAM family receptor MERTK which is expressed on macrophages and controls the anti-inflammatory engulfment of apoptotic cells, a process known as efferocytosis 10-12. Efferocytosis allows for the clearance of apoptotic cells before they undergo necrosis and release intracellular inflammatory molecules, and simultaneously leads to increased production of anti-inflammatory molecules (TGFb, IL-10, and PGE2) and a decreased secretion of proinflammatory cytokines (TNF-alpha, IL-1b, IL-6) 10-14. In line with these findings, mice deficient in molecular components used by macrophages to efficiently perform efferocytosis, such as MFG-E8, MERTK, TIM4, and C1q, develop phenotypes associated with autoimmunity 10,11,14-27. Furthermore, defects in efferocytosis are also observed in patients with SLE and glomerulonephritis 14,28-31.”

      It is still not clear how the target molecules identified in this paper may influence macrophage efferocytosis. More direct evidence should be established. 

      Our studies show that wild-type ACK1 and BRK, but not the patient variants, are activated by MERTK, a key receptor that mediates the recognition of apoptotic cells. Our studies also show that the wild-type kinases, but not the variants, activate RAC1, which is necessary for engulfment, and phosphorylate AKT and STAT3, which are involved in the anti-inflammatory response to PtdSer recognition.

      The TAM family receptor MERTK mediates recognition of PtdSer on apoptotic cells via GAS6 and Protein S 10,15,32 leading to their engulfment, which involves activation of RAC1 for actin reorganization and the formation of a phagocytic cup 9,33. Using IP kinase assays we show that MERTK and GAS6 can activate the kinase activity of wild-type ACK1 8 or BRK but not of the patient’s ACK1 or BRK variant alleles (Figure 4D). To further support the role of ACK1 and BRK downstream from PtdSer recognition and uptake of apoptotic cells, we show that reference ACK1 and BRK alleles, in contrast to the patient variant alleles, can activate RAC1 to generate RAC-GTP which is necessary for engulfment 9,33 (Figure 4C).

      PtdSer recognition also typically stimulates an anti-inflammatory process mediated in part via AKT 34 and STAT3 and their target genes such as SOCS3 35-41 and results in the inhibition of LPS-mediated production of inflammatory mediators such as TNF and IL-1b, and the production of cytokines such as IL-10, TGFb 11,25-27,42. Consistent with this literature and the findings of the paper, we show that reference ACK1 and BRK, unlike the patient’s variant alleles, can phosphorylate AKT and STAT3 (Figure 4A, B). The role of ACK1 and BRK in these signaling pathways is further supported by our transcriptomics data comparing the response of controls, patients, and inhibitor-treated iPSC-derived macrophages to apoptotic thymocytes by RNA-seq. Specifically, we show that transcriptional repressors, including the AKT targets ATF3, TGIF1, NFIL3, and KLF4, the STAT3 targets SOCS3 and DUSP5, as well as CEBPD and the inhibitor of E-box DNA binding ID3, were among the top ten genes whose expression is induced by apoptotic cells in WT macrophages, but this regulation was lost in mutant and inhibitor-treated macrophages (Figure 4F).

      For some transcriptional repressors mentioned in their studies, the authors should check whether there is clear experimental evidence. If not, it is recommended to supplement the experimental verifications for clarity.

      Transcriptional repressors, including the AKT targets ATF3, TGIF1, NFIL3, and KLF4, the STAT3 targets SOCS3 and DUSP5, as well as CEBPD and the inhibitor of E-box DNA binding ID3, were among the top ten genes whose expression is induced by apoptotic cells in WT macrophages, but this regulation was lost in mutant and inhibitor-treated macrophages (Figure 4F).

      In the manuscript we cited published evidence, to the best of our knowledge, for the role of these genes in the regulation of inflammatory responses. Specifically we state: “ATF3, TGIF1, NFIL3, and KLF4 are involved in the negative regulation of inflammation in macrophages 35-38, SOCS3 is an inhibitor of the macrophage inflammatory response and DUSP5 is a negative regulator of ERK activation 39,40,43. These data suggest that the kinase domain of ACK1 and BRK contribute to the macrophage anti-inflammatory gene expression program driven by apoptotic cells.”

      In Figures 4C and 4D, it is seen that the usage of inhibitors causes cytoskeletal changes, however this reviewer would not have expected such large change. Did the authors check whether the cells die after heavy treatment by the inhibitors?

      We carefully examined the viability of isogenic WT, BRK mutant, and ACK1 mutant macrophages (left panel) and of WT macrophages treated with ACK1 or BRK inhibitors, and we did not observe changes in viability (Figure 4-figure supplement 2).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A crucial step in the development of SLE is the production of autoantibodies. It is shown in Figure 2F that inhibitors of ACK1/BRK enhanced the production of autoantibodies against histones and SSA in a pristane-induced SLE model, which is a significant result that could support the authors' claim. Strangely, this autoantigen panel does not include double-stranded DNA, RNP, or Sm, which should be presented regarding antibody production.

      We thank the reviewer for this comment. In the revised manuscript (Revised Figure 3 – Supplement 1) we added the remainder of the autoantibody panel, which includes double-stranded DNA, RNP, and Sm autoantibody levels. We also added the results for serum IgG autoantibody levels in BALB/cByJ mice treated for three months with DMSO, ACK1, or BRK inhibitors but did not receive a pristane injection (Revised Figure 3A). This data shows that mice which received ACK1 or BRK inhibitors had increased serum IgG autoantibodies in comparison to DMSO treated controls.

      Additionally, if there is information that inhibitors of ACK1/BRK promote the differentiation of follicular helper T cells, memory B cells, and plasma cells in a pristane-induced SLE model, it could be considered indirect evidence supporting the authors' claims.

      These are not available at present to the best of our knowledge.

      Reviewer #2 (Recommendations For The Authors):

      Minor points:

      * In the literature, unpaired t-tests and ordinary one-way ANOVA (Tukey's multiple comparisons test) were used for statistical analysis, which requires data to be normally distributed. This part of the proposal is reflected in the text, and the non-conforming results need to be statistically analyzed using the non-parametric tests of GraphPad Prism.

      We would like to thank the reviewer for pointing out this oversight. In the revised manuscript, for all applicable datasets, we tested whether the data were normally distributed using a Shapiro-Wilk normality test. For datasets that were normally distributed, statistical significance was determined by a Student's t test or an ordinary one-way ANOVA with Tukey’s multiple comparisons test, depending on the number of conditions being compared and the experimental setup. In contrast, for datasets that were not normally distributed, statistical significance was determined using a Mann-Whitney test, a Kruskal-Wallis test with multiple comparisons, or a Wilcoxon matched-pairs signed rank test, depending on the experimental setup. P values below 0.05 were considered significant for all statistical tests.
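
      The selection logic described above can be summarized in a short sketch. This is an illustration under stated assumptions, not the authors' code: the function name `choose_test` is hypothetical, the mapping of each test to a group count or pairing is our reading of "depending on the number of conditions being compared and the experimental setup", and the Shapiro-Wilk p-values are assumed to have been computed beforehand (for example with `scipy.stats.shapiro`, or in GraphPad Prism).

```python
ALPHA = 0.05  # threshold stated in the response; applying it to the normality test is an assumption

def choose_test(normality_pvalues, paired=False):
    """Pick a test name from per-group Shapiro-Wilk p-values.

    normality_pvalues: one Shapiro-Wilk p-value per condition being compared.
    paired: True for matched-pairs designs (only used for two non-normal groups,
            because the response does not name a separate paired parametric test).
    """
    n_groups = len(normality_pvalues)
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    # Treat the dataset as normal only if no group rejects normality.
    all_normal = all(p >= ALPHA for p in normality_pvalues)
    if all_normal:
        if n_groups == 2:
            return "Student's t test"
        return "one-way ANOVA with Tukey's multiple comparisons test"
    if n_groups == 2:
        if paired:
            return "Wilcoxon matched-pairs signed rank test"
        return "Mann-Whitney test"
    return "Kruskal-Wallis test"
```

      For instance, `choose_test([0.40, 0.01])` falls through to the Mann-Whitney test because one of the two groups rejects normality, whereas three groups with p-values all above 0.05 are routed to the one-way ANOVA.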

      The authors used different methods to represent the level of significant difference. Therefore, it is suggested that the significance level should be expressed by letters. 

      As suggested by the reviewer, in the revised manuscript we have designated the significance level throughout all figures using letters (p, or q values).

      For RNA-seq, more information should be provided in the paper. For example, the correlation between sample biological replicates, the total number of differentially expressed genes, and randomly selected genes for qRT-PCR results verification.

      We would like to thank the reviewer for pointing out this oversight. In the revised manuscript we provided more information regarding the RNA-seq dataset, including a Principal Component Analysis (PCA) showing correlation between sample replicates (Revised Figure 4-figure supplement 1A), as well as a table indicating the number of upregulated and downregulated genes between relevant datasets (Revised Figure 4-figure supplement 1B).

      The results of the RNA-seq analysis indicated that ACK1 and BRK contribute to the macrophage anti-inflammatory gene expression program driven by apoptotic cells. MERTK-dependent anti-inflammatory program elicited by apoptotic cells on macrophages is best evidenced by the reduction of LPS-mediated production of inflammatory mediators such as TNF or IL1b 25-27,34,44. Therefore, to validate the RNA-seq results in a functional manner we tested the decrease of LPS-induced production of TNF and IL1b by apoptotic cells in isogenic WT, ACK1 deficient, and BRK deficient macrophages. Consistent with the RNA-seq data, the functional assays indicated that ACK1 and BRK kinase activities are required for the decrease of TNF and IL1b production induced by LPS in response to apoptotic cells (Revised Figure 4H,I).

      The raw data files for the RNA-seq analysis have been deposited in the NCBI Gene Expression Omnibus under accession number GEO: GSE118730.

      The authors did not have the formats for some of the citations correct. This should be fixed. 

      References were reformatted.

      (1) Eilertson, K. E., Booth, J. G. & Bustamante, C. D. SnIPRE: selection inference using a Poisson random effects model. PLoS Comput Biol 8, e1002806 (2012). https://doi.org:10.1371/journal.pcbi.1002806

      (2) Fadista, J., Oskolkov, N., Hansson, O. & Groop, L. LoFtool: a gene intolerance score based on loss-of-function variants in 60 706 individuals. Bioinformatics 33, 471-474 (2017). https://doi.org:10.1093/bioinformatics/btv602

      (3) Rackham, O. J., Shihab, H. A., Johnson, M. R. & Petretto, E. EvoTol: a protein-sequence based evolutionary intolerance framework for disease-gene prioritization. Nucleic Acids Res 43, e33 (2015). https://doi.org:10.1093/nar/gku1322

      (4) Petrovski, S., Wang, Q., Heinzen, E. L., Allen, A. S. & Goldstein, D. B. Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genet 9, e1003709 (2013). https://doi.org:10.1371/journal.pgen.1003709

      (5) Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434-443 (2020). https://doi.org:10.1038/s41586-020-2308-7

      (6) Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285-291 (2016). https://doi.org:10.1038/nature19057

      (7) Rapaport, F. et al. Negative selection on human genes underlying inborn errors depends on disease outcome and both the mode and mechanism of inheritance. Proc Natl Acad Sci U S A 118 (2021). https://doi.org:10.1073/pnas.2001248118

      (8) Mahajan, N. P., Whang, Y. E., Mohler, J. L. & Earp, H. S. Activated tyrosine kinase Ack1 promotes prostate tumorigenesis: role of Ack1 in polyubiquitination of tumor suppressor Wwox. Cancer Res 65, 10514-10523 (2005). https://doi.org:10.1158/0008-5472.CAN-05-1127

      (9) Wu, Y., Singh, S., Georgescu, M. M. & Birge, R. B. A role for Mer tyrosine kinase in alphavbeta5 integrin-mediated phagocytosis of apoptotic cells. J Cell Sci 118, 539-553 (2005). https://doi.org:10.1242/jcs.01632

      (10) Scott, R. S. et al. Phagocytosis and clearance of apoptotic cells is mediated by MER. Nature 411, 207-211 (2001). https://doi.org:10.1038/35075603

      (11) Henson, P. M. & Bratton, D. L. Antiinflammatory effects of apoptotic cells. J Clin Invest 123, 2773-2774 (2013). https://doi.org:10.1172/JCI69344

      (12) Henson, P. M. Cell Removal: Efferocytosis. Annu Rev Cell Dev Biol 33, 127-144 (2017). https://doi.org:10.1146/annurev-cellbio-111315-125315

      (13) deCathelineau, A. M. & Henson, P. M. The final step in programmed cell death: phagocytes carry apoptotic cells to the grave. Essays Biochem 39, 105-117 (2003). https://doi.org:10.1042/bse0390105

      (14) Nagata, S. Apoptosis and Clearance of Apoptotic Cells. Annu Rev Immunol 36, 489-517 (2018). https://doi.org:10.1146/annurev-immunol-042617-053010

      (15) Cohen, P. L. et al. Delayed apoptotic cell clearance and lupus-like autoimmunity in mice lacking the c-mer membrane tyrosine kinase. J Exp Med 196, 135-140 (2002). https://doi.org:10.1084/jem.20012094

      (16) Hanayama, R. et al. Autoimmune disease and impaired uptake of apoptotic cells in MFG-E8-deficient mice. Science 304, 1147-1150 (2004). https://doi.org:10.1126/science.1094359

      (17) Miyanishi, M., Segawa, K. & Nagata, S. Synergistic effect of Tim4 and MFG-E8 null mutations on the development of autoimmunity. Int Immunol 24, 551-559 (2012). https://doi.org:10.1093/intimm/dxs064

      (18) Colonna, L., Parry, G. C., Panicker, S. & Elkon, K. B. Uncoupling complement C1s activation from C1q binding in apoptotic cell phagocytosis and immunosuppressive capacity. Clin Immunol 163, 84-90 (2016). https://doi.org:10.1016/j.clim.2015.12.017

      (19) Nagata, S., Hanayama, R. & Kawane, K. Autoimmunity and the clearance of dead cells. Cell 140, 619-630 (2010). https://doi.org:10.1016/j.cell.2010.02.014

      (20) Kimani, S. G. et al. Contribution of Defective PS Recognition and Efferocytosis to Chronic Inflammation and Autoimmunity. Front Immunol 5, 566 (2014). https://doi.org:10.3389/fimmu.2014.00566

      (21) Hanayama, R., Tanaka, M., Miwa, K., Shinohara, A., Iwamatsu, A. & Nagata, S. Identification of a factor that links apoptotic cells to phagocytes. Nature 417, 182-187 (2002). https://doi.org:10.1038/417182a

      (22) Kawano, M. & Nagata, S. Lupus-like autoimmune disease caused by a lack of Xkr8, a caspase-dependent phospholipid scramblase. Proc Natl Acad Sci U S A 115, 2132-2137 (2018). https://doi.org:10.1073/pnas.1720732115

      (23) Watanabe-Fukunaga, R., Brannan, C. I., Copeland, N. G., Jenkins, N. A. & Nagata, S. Lymphoproliferation disorder in mice explained by defects in Fas antigen that mediates apoptosis. Nature 356, 314-317 (1992). https://doi.org:10.1038/356314a0

      (24) Singer, G. G., Carrera, A. C., Marshak-Rothstein, A., Martinez, C. & Abbas, A. K. Apoptosis, Fas and systemic autoimmunity: the MRL-lpr/lpr model. Current opinion in immunology 6, 913-920 (1994).

      (25) Cvetanovic, M. & Ucker, D. S. Innate immune discrimination of apoptotic cells: repression of proinflammatory macrophage transcription is coupled directly to specific recognition. J Immunol 172, 880-889 (2004). https://doi.org:10.4049/jimmunol.172.2.880

      (26) Fadok, V. A., Bratton, D. L., Konowal, A., Freed, P. W., Westcott, J. Y. & Henson, P. M. Macrophages that have ingested apoptotic cells in vitro inhibit proinflammatory cytokine production through autocrine/paracrine mechanisms involving TGF-beta, PGE2, and PAF. J Clin Invest 101, 890-898 (1998). https://doi.org:10.1172/JCI1112

      (27) Voll, R. E., Herrmann, M., Roth, E. A., Stach, C., Kalden, J. R. & Girkontaite, I. Immunosuppressive effects of apoptotic cells. Nature 390, 350-351 (1997). https://doi.org:10.1038/37022

      (28) Herrmann, M., Voll, R. E., Zoller, O. M., Hagenhofer, M., Ponner, B. B. & Kalden, J. R. Impaired phagocytosis of apoptotic cell material by monocyte-derived macrophages from patients with systemic lupus erythematosus. Arthritis Rheum 41, 1241-1250 (1998). https://doi.org:10.1002/1529-0131(199807)41:7<1241::AID-ART15>3.0.CO;2-H

      (29) Baumann, I. et al. Impaired uptake of apoptotic cells into tingible body macrophages in germinal centers of patients with systemic lupus erythematosus. Arthritis Rheum 46, 191-201 (2002). https://doi.org:10.1002/1529-0131(200201)46:1<191::AID-ART10027>3.0.CO;2-K

      (30) Schrijvers, D. M., De Meyer, G. R. Y., Kockx, M. M., Herman, A. G. & Martinet, W. Phagocytosis of apoptotic cells by macrophages is impaired in atherosclerosis. Arterioscl Throm Vas 25, 1256-1261 (2005). https://doi.org:10.1161/01.ATV.0000166517.18801.a7

      (31) Morioka, S., Maueroder, C. & Ravichandran, K. S. Living on the Edge: Efferocytosis at the Interface of Homeostasis and Pathology. Immunity 50, 1149-1162 (2019). https://doi.org:10.1016/j.immuni.2019.04.018

      (32) Seitz, H. M., Camenisch, T. D., Lemke, G., Earp, H. S. & Matsushima, G. K. Macrophages and dendritic cells use different Axl/Mertk/Tyro3 receptors in clearance of apoptotic cells. J Immunol 178, 5635-5642 (2007). https://doi.org:10.4049/jimmunol.178.9.5635

      (33) Mao, Y. & Finnemann, S. C. Regulation of phagocytosis by Rho GTPases. Small GTPases 6, 89-99 (2015). https://doi.org:10.4161/21541248.2014.989785

      (34) Sen, P. et al. Apoptotic cells induce Mer tyrosine kinase-dependent blockade of NF-kappaB activation in dendritic cells. Blood 109, 653-660 (2007). https://doi.org:10.1182/blood-2006-04-017368

      (35) Vergadi, E., Ieronymaki, E., Lyroni, K., Vaporidi, K. & Tsatsanis, C. Akt Signaling Pathway in Macrophage Activation and M1/M2 Polarization. J Immunol 198, 1006-1014 (2017). https://doi.org:10.4049/jimmunol.1601515

      (36) Byles, V. et al. The TSC-mTOR pathway regulates macrophage polarization. Nat Commun 4, 2834 (2013). https://doi.org:10.1038/ncomms3834

      (37) Liao, X. et al. Kruppel-like factor 4 regulates macrophage polarization. J Clin Invest 121, 2736-2749 (2011). https://doi.org:10.1172/JCI45444

      (38) Roberts, A. W., Lee, B. L., Deguine, J., John, S., Shlomchik, M. J. & Barton, G. M. Tissue-Resident Macrophages Are Locally Programmed for Silent Clearance of Apoptotic Cells. Immunity 47, 913-927 e916 (2017). https://doi.org:10.1016/j.immuni.2017.10.006

      (39) Matsukawa, A. et al. Stat3 in resident macrophages as a repressor protein of inflammatory response. J Immunol 175, 3354-3359 (2005).

      (40) Sica, A. & Mantovani, A. Macrophage plasticity and polarization: in vivo veritas. J Clin Invest 122, 787-795 (2012). https://doi.org:10.1172/JCI59643

      (41) Yi, Z., Li, L., Matsushima, G. K., Earp, H. S., Wang, B. & Tisch, R. A novel role for c-Src and STAT3 in apoptotic cell-mediated MerTK-dependent immunoregulation of dendritic cells. Blood 114, 3191-3198 (2009). https://doi.org:10.1182/blood-2009-03-207522

      (42) Rothlin, C. V., Carrera-Silva, E. A., Bosurgi, L. & Ghosh, S. TAM receptor signaling in immune homeostasis. Annu Rev Immunol 33, 355-391 (2015). https://doi.org:10.1146/annurev-immunol-032414-112103

      (43) Seo, H. et al. Dual-specificity phosphatase 5 acts as an anti-inflammatory regulator by inhibiting the ERK and NF-kappaB signaling pathways. Sci Rep 7, 17348 (2017). https://doi.org:10.1038/s41598-017-17591-9

      (44) Camenisch, T. D., Koller, B. H., Earp, H. S. & Matsushima, G. K. A novel receptor tyrosine kinase, Mer, inhibits TNF-alpha production and lipopolysaccharide-induced endotoxic shock. J Immunol 162, 3498-3503 (1999).

    1. Author response:

      ANALYTICAL

      (1) Figure 3 shows that the relationship between learning rate and informativeness for our rats was very similar to that shown with pigeons by Gibbon and Balsam (1981). We used multiple criteria to establish the number of trials to learn in our data, with the goal of demonstrating that the correspondence between the data sets was robust. To establish that they are effectively the same does require using an equivalent decision criterion for our data as was used for Gibbon and Balsam’s data. However, the criterion they used (at least one peck at the response key on at least 3 out of 4 consecutive trials) cannot be sensibly applied to our magazine entry data because rats make magazine entries during the inter-trial interval (whereas pigeons do not peck at the response key in the inter-trial interval). Therefore, evidence for conditioning in our paradigm must involve comparison between the response rate during the CS and the baseline response rate. There are two ways one could adapt the Gibbon and Balsam criterion to our data. One way is to use a non-parametric signed rank test for evidence that the CS response rate exceeds the pre-CS response rate, and to adopt a statistical criterion equivalent to Gibbon and Balsam’s 3-out-of-4 consecutive trials (p<.3125). The second method estimates the nDkl for the criterion used by Gibbon and Balsam. This could be done by assuming there are no responses in the inter-trial interval and a response probability of at least 0.75 during the CS (their criterion). This would correspond to an nDkl of 2.2 (odds ratio 27:1). The obtained nDkl could then be applied to our data to identify when the distribution of CS response rates has diverged by an equivalent amount from the distribution of pre-CS response rates.
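
      As an illustration only, the value .3125 quoted above is what one obtains for the chance probability of at least 3 "successes" in 4 trials when each trial is a fair coin flip. The sketch below assumes that reading (a one-tailed binomial tail with chance probability 0.5); the function name `binom_tail` is hypothetical and is not taken from the manuscript's analysis code.

```python
from math import comb

def binom_tail(k_min, n, p=0.5):
    """Probability of at least k_min successes in n independent trials,
    each succeeding with probability p (upper binomial tail)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Assumed reading of the 3-out-of-4 criterion: chance level p = 0.5.
print(binom_tail(3, 4))  # 0.3125
```

      Under this assumption, 4 of the 16 equally likely outcomes have exactly three successes and 1 has four, giving 5/16 = .3125.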

      (2) A single regression line, as shown in Figure 6, is the simplest possible model of the relationship between response rate and reinforcement rate and it explains approximately 80% of the variance in response rate. Fixing the log-log slope at 1 yields the maximally simple model. (This regression is done in the logarithmic domain to satisfy the homoscedasticity assumption.) When transformed into the linear domain, this model assumes a truly scalar relation (linear, intercept at the origin) and assumes the same scale factor and the same scalar variability in response rates for both sets of data (ITI and CS). Our plot supports such a model. Its simplicity is its own motivation (Occam’s razor).
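
      To make the one-parameter model concrete, here is a minimal sketch under stated assumptions. With the log-log slope fixed at 1, the only free parameter is the intercept (the log of the scale factor), and its least-squares estimate is the mean of log(response rate) minus log(reinforcement rate). The function name and the example numbers are hypothetical; they are not the data in Figure 6.

```python
import math

def fit_scalar_model(reinf_rates, resp_rates):
    """Fit log(resp) = log(k) + 1 * log(reinf) by least squares.

    The slope is fixed at 1, so only the scale factor k is estimated.
    Returns (k, r_squared), with r_squared computed in the log domain.
    """
    log_x = [math.log(x) for x in reinf_rates]
    log_y = [math.log(y) for y in resp_rates]
    n = len(log_x)
    # With the slope fixed at 1, the best intercept is the mean log ratio.
    log_k = sum(ly - lx for lx, ly in zip(log_x, log_y)) / n
    mean_y = sum(log_y) / n
    ss_res = sum((ly - (log_k + lx)) ** 2 for lx, ly in zip(log_x, log_y))
    ss_tot = sum((ly - mean_y) ** 2 for ly in log_y)
    return math.exp(log_k), 1.0 - ss_res / ss_tot

# Hypothetical reinforcement rates and response rates, for illustration only.
reinf = [0.01, 0.02, 0.05, 0.1, 0.2]
resp = [0.11, 0.19, 0.52, 0.95, 2.1]
k, r2 = fit_scalar_model(reinf, resp)
print(f"scale factor k = {k:.2f}, R^2 (log domain) = {r2:.3f}")
```

      Back in the linear domain, this fit corresponds to response rate = k × reinforcement rate, a line through the origin.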

      If regression lines are fitted to the CS and ITI data separately, there is a small increase in explained variance (R2 = 0.82). We leave it to further research to determine whether such a complex model, with 4 parameters, is required. However, we do not think the present data warrant comparing the simplest possible model, with one parameter, to any more complex model for the following reasons:

      · When a brain—or any other machine—maps an observed (input) rate to a rate it produces (output rate), there is always an implicit scalar. In the special case where the produced rate equals the observed rate, the implicit scalar has value 1. Thus, there cannot be a simpler model than the one we propose, which is, in and of itself, interesting.

      · The present case is an intuitively accessible example of why the MDL (Minimum Description Length) approach to model complexity (Barron, Rissanen, & Yu, 1998; Grünwald, Myung, & Pitt, 2005; Rissanen, 1999) can yield a very different conclusion from the conclusion reached using the Bayesian Information Criterion (BIC) approach. The MDL approach measures the complexity of a model when given N data specified with precision of B bits per datum by computing (or approximating) the sum of the maximum likelihoods of the model’s fits to all possible sets of N data with B precision per datum. The greater the sum over the maximum likelihoods, the more complex the model, that is, the greater its measured wiggle room, its capacity to fit data. Recall that von Neumann remarked to Fermi that with 4 parameters he could fit an elephant. His deeper point was that multi-parameter models bring neither insight nor predictive power; they explain only post hoc, after one has adjusted their parameters in the light of the data. For realistic data sets like ours, the sums of maximum likelihoods are finite but astronomical. However, just as the Stirling approximation allows one to work with astronomical factorials, it has proved possible to develop readily computable approximations to these sums, which can be used to take model complexity into account when comparing models. Proponents of the MDL approach point out that the BIC is inadequate because models with the same number of parameters can have very different amounts of wiggle room. A standard illustration of this point is the contrast between the logarithmic model and the power-function model. Log regressions must be concave, whereas power-function regressions can be concave, linear, or convex, yet they have the same number of parameters (one or two, depending on whether one counts the scale parameter that is always implicit). The MDL approach captures this difference in complexity because it measures wiggle room; the BIC approach does not, because it only counts parameters.

      · In the present case, one is comparing a model with no pivot and no vertical displacement at the boundary between the black dots and the red dots (the 1-parameter unilinear model) to a bilinear model that allows both a change in slope and a vertical displacement for both lines. The 4-parameter model is superior if we use the BIC to take model complexity into account. However, the 4-parameter model has ludicrously more wiggle room. It will provide excellent fits (high maximum likelihood) to data sets in which the red points have slope > 1, slope 0, or slope < 0 and in which it is also true that the intercept for the red points lies well below or well above the black points (non-overlap in the marginal distribution of the red and black data). The 1-parameter model, on the other hand, will provide terrible fits to all such data (very low maximum likelihoods). Thus, we believe the BIC does not properly capture the immense actual difference in complexity between the 1-parameter model (unilinear with slope 1) and the 4-parameter model (bilinear with neither the slope nor the intercept fixed in the linear domain).

      · In any event, because the pivot (change in slope between black and red data sets), if any, is small and likewise for the displacement (vertical change), it suffices for now to know that the variance captured by the 1-parameter model is only marginally improved by adding three more parameters. Researchers using the properly corrected measured rate of head poking to measure the rate of reinforcement a subject expects can therefore assume that they have an approximately scalar measure of the subject’s expectation. Given our data, they won’t be far wrong even near the extremes of the values commonly used for rates of reinforcement. That is a major advance in current thinking, with strong implications for formal models of associative learning. It implies that the performance function that maps from the neurobiological realization of the subject’s expectation is not an unknown function. On the contrary, it’s the simplest possible function, the scalar function. That is a powerful constraint on brain-behavior linkage hypotheses, such as the many hypothesized relations between mesolimbic dopamine activity and the expectation that drives responding in Pavlovian conditioning (Berridge, 2012; Jeong et al., 2022; Y.  Niv, Daw, Joel, & Dayan, 2007; Y. Niv & Schoenbaum, 2008).
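
      The "sum of maximum likelihoods over all possible data sets" described in the bullets above can be computed exactly for a toy case. The sketch below is an illustration under stated assumptions, not the regression models discussed here: it uses binary data of length n and compares a model with one free parameter (a Bernoulli probability fitted to each data set) against a model with no free parameter (probability fixed at 0.5). Function names are hypothetical.

```python
from math import comb

def fitted_model_complexity(n):
    """Sum, over all binary sequences of length n, of the likelihood
    achieved by the best-fitting Bernoulli probability (k / n)."""
    total = 0.0
    for k in range(n + 1):
        p_hat = k / n
        max_lik = (p_hat ** k) * ((1 - p_hat) ** (n - k))
        total += comb(n, k) * max_lik  # comb(n, k) sequences share this k
    return total

def fixed_model_complexity(n, p=0.5):
    """Same sum for a model with no adjustable parameter: the likelihoods
    of all possible sequences simply add up to 1."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

for n in (4, 16, 64):
    print(n, fixed_model_complexity(n), round(fitted_model_complexity(n), 3))
```

      The fixed model always sums to 1, whereas the fitted model's sum grows with n; that excess is the wiggle room that a parameter count alone does not measure.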

      The data in Figure 6 are taken from the last 5 sessions of training. The exact number of sessions was somewhat arbitrary but was chosen to meet two goals: (1) to capture asymptotic responding, which is why we restricted this to the end of the training, and (2) to obtain a sufficiently large sample of data to estimate reliably each rat’s response rate. We have checked what the data look like using the last 10 sessions, and can confirm it makes very little difference to the results.

      Finally, as noted by the reviewers, the relationship between the contextual rate of reinforcement and ITI responding should also be evident if we had measured context responding prior to introducing the CS. However, there was no period in our experiment when rats were given unsignalled reinforcement (such as is done during “magazine training” in some experiments). Therefore, we could not measure responding based on contextual conditioning prior to the introduction of the CS. This is a question for future experiments that use an extended period of magazine training or “poor positive” protocols in which there are reinforcements during the ITIs as well as during the CSs. The learning rate equation has been shown to predict reinforcements to acquisition in the poor-positive case (Balsam, Fairhurst, & Gallistel, 2006).

      (3) One of us (CRG) has earlier suggested that responding appears abruptly when the accumulated evidence that the CS reinforcement rate is greater than the contextual rate exceeds a decision threshold (C.R. Gallistel, Balsam, & Fairhurst, 2004). The new more extensive data require a more nuanced view. Evidence about the manner in which responding changes over the course of training is to some extent dependent on the analytic method used to track those changes. We presented two different approaches. The approach shown in Figures 7 and 8, extending that developed by Harris (2022), assumes a monotonic increase in response rate and uses the slope of the cumulative response rate to identify when responding exceeds particular milestones (percentiles of the asymptotic response rate). This analysis suggests a steady rise in responding over trials. Within our theoretical model, this might reflect an increase in the animal’s certainty about the CS reinforcement rate with accumulated evidence from each trial. While this method should be able to distinguish between a gradual change and a single abrupt change in responding (Harris, 2022), it may not distinguish between a gradual change and multiple step-like changes in responding and cannot account for decreases in response rate.

      The other analytic method we used relies on the information-theoretic measure of divergence, the nDkl (Gallistel & Latham, 2023), to identify each point of change (up or down) in the response record. With that method, we discern three trends. First, the onset tends to be abrupt in that the initial step up is often large (an increase in response rate by 50% or more of the difference between its initial value and its terminal value is common and there are instances where the initial step is to the terminal rate or higher). Second, there is marked within-subject variability in the response rate, characterised by large steps up and down in the parsed response rates following the initial step up, but this variability tends to decrease with further training (there tend to be fewer and smaller steps in both the ITI response rates and the CS response rate as training progresses). Third, the overall trend, seen most clearly when one averages across subjects within groups, is toward a moderately higher rate of responding later in training than after the initial rise. We think that the first tendency reflects an underlying decision process whose latency is controlled by diminishing uncertainty about the two reinforcement rates and hence about their ratio. We think that decreasing uncertainty about the true values of the estimated rates of reinforcement is also likely to be an important part of the explanation for the second tendency (decreasing within-subject variation in response rates). It is less clear whether diminishing uncertainty can explain the trend toward a somewhat greater difference in the two response rates as conditioning progresses. It is perhaps worth noting that the distribution of the estimates of the informativeness ratio is likely to be heavy tailed and have peculiar properties (as witness, for example, the distribution of the ratio of two gamma distributions with arbitrary shape and scale parameters) but we are unable at this time to propound an explanation of the third trend.

      (4) There is an error in the description provided in the text. The pre-CS period used to measure the ITI responding was 10 s rather than 20 s. There was always at least a 5-s gap between the end of the previous trial and the start of the pre-CS period.

      (5) Details about model fitting will be added in a revision. The question about fitting a single model or multiple models to the data in Figure 6 is addressed in response 2 above. In Figure 6, each rat provides 2 behavioural data points (ITI response rate and CS response rate) and 2 values for reinforcement rate (1/C and 1/T). There is a weak but significant correlation between the ITI and CS response rates (r = 0.28, p < 0.01; log transformed to correct for heteroscedasticity). By design, there is no correlation between the log reinforcement rates (r = 0.06, p = .404).

      CONCEPTUAL

      (1) It is important for the field to realize that the RW model cannot be used to explain the results of Rescorla’s (Rescorla, 1966; Rescorla, 1968, 1969) contingency-not-pairing experiments, despite what was claimed by Rescorla and Wagner (Rescorla & Wagner, 1972; Wagner & Rescorla, 1972) and has subsequently been claimed in many modelling papers and in most textbooks and reviews (Dayan & Niv, 2008; Y. Niv & Montague, 2008). Rescorla programmed reinforcements with a Poisson process. The defining property of a Poisson process is its flat hazard function; the reinforcements were equally likely at every moment in time when the process was running. This makes it impossible to say when non-reinforcements occurred and, a fortiori, to count them. The non-reinforcements are causal events in the RW algorithm and subsequent versions of it. Their effects on associative strength are essential to the explanations proffered by these models. Non-reinforcements (failures to occur, updates when reinforcement is set to 0, hence also the lambda parameter) can have causal efficacy only when the successes may be predicted to occur at specified times (during “trials”). When reinforcements are programmed by a Poisson process, there are no such times. Attempts to apply the RW formula to reinforcement learning soon foundered on this problem (Gibbon, 1981; Gibbon, Berryman, & Thompson, 1974; Hallam, Grahame, & Miller, 1992; L.J. Hammond, 1980; L. J. Hammond & Paynter, 1983; Scott & Platt, 1985). The enduring popularity of the delta-rule updating equation in reinforcement learning depends on “big-concept” papers that don’t fit models to real data and discretize time into states while claiming to be real-time models (Y. Niv, 2009; Y. Niv, Daw, & Dayan, 2005).
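
      The dependence of the delta rule on discrete trials can be seen in a minimal sketch of the standard Rescorla-Wagner update for a single cue, V ← V + αβ(λ − V), where λ is set to 0 on a non-reinforced trial. The learning-rate value and the function name are assumptions chosen for illustration.

```python
def rescorla_wagner(trials, alpha_beta=0.2, lam=1.0):
    """Apply the single-cue Rescorla-Wagner update once per trial.

    trials: sequence of booleans, True if the trial was reinforced.
    alpha_beta: combined learning-rate parameter (hypothetical value).
    lam: asymptote on reinforced trials; the target is 0 otherwise.
    Returns the associative strength after each trial.
    """
    v = 0.0
    history = []
    for reinforced in trials:
        target = lam if reinforced else 0.0  # a non-reinforcement is an event
        v += alpha_beta * (target - v)
        history.append(v)
    return history

# The input must already be carved into trials, each scored as
# reinforced or not reinforced.
print(rescorla_wagner([True, True, False, True, False]))
```

      The function cannot be called without a list of scored trials, and that is the difficulty: when reinforcement is scheduled by a Poisson process in continuous time, the data contain no non-arbitrary way to construct the `False` entries.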

      The information-theoretic approach to associative learning, which sometimes historically travels as RET (rate estimation theory), is unabashedly and inescapably representational. It assumes a temporal map and arithmetic machinery capable in principle of implementing any implementable computation. In short, it assumes a Turing-complete brain. It assumes that whatever the material basis of memory may be, it must make sense to ask of it how many bits can be stored in a given volume of material. This question is seldom posed in associative models of learning, nor by neurobiologists committed to the hypothesis that the Hebbian synapse is the material basis of memory. Many, including the new Nobelist, Geoffrey Hinton, would agree that the question makes no sense. When you assume that brains learn by rewiring themselves rather than by acquiring and storing information, it makes no sense.

      When a subject learns a rate of reinforcement, it bases its behavior on that expectation, and it alters its behavior when that expectation is disappointed. Subjects also learn probabilities when they are defined. They base some aspects of their behavior on those expectations, making computationally sophisticated use of their representation of the uncertainties (Balci, Freestone, & Gallistel, 2009; Chan & Harris, 2019; J. A. Harris, 2019; J.A. Harris & Andrew, 2017; J. A. Harris & Bouton, 2020; J. A. Harris, Kwok, & Gottlieb, 2019; Kheifets, Freestone, & Gallistel, 2017; Kheifets & Gallistel, 2012; Mallea, Schulhof, Gallistel, & Balsam, 2024 in press).

      (2) Rate estimation theory is oblivious to the temporal order in which experience with different predictors occurs. The matrix computation finds the additive solution, if it exists, to the data so far observed, on the assumption that predicted rates have remained the same. This is the stationarity assumption, which is implicit in a rate computation and was made explicit in the formulation of RET (C.R. Gallistel, 1990). When the additive solution does not exist, the RET algorithm treats the compound of two predictors as a third predictor, and computes the additive solution to the 3-predictor problem. Because it is oblivious to the order in which the data have been acquired, it predicts one-trial overshadowing and retroactive blocking and unblocking (C.R. Gallistel, 1990 pp 439 & 452-455).
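
      The idea of an additive solution can be illustrated with a two-predictor toy case. The sketch below is written in the spirit of the matrix computation and is not the published RET algorithm: it assumes a context that is present throughout the session and a CS that is present for part of it, and it solves the two simultaneous equations that say the reinforcements counted in the presence of each predictor equal the sum of the attributed rates times the relevant exposure times. All names and numbers are hypothetical.

```python
def solve_two_predictor_rates(t_ctx, t_cs, n_ctx, n_cs):
    """Solve for the rates attributed to the context and to the CS.

    Assumed additive model (CS occurs only within the context):
        n_ctx = r_ctx * t_ctx + r_cs * t_cs
        n_cs  = r_ctx * t_cs  + r_cs * t_cs
    t_ctx, t_cs: cumulative time (s) the context and the CS were present.
    n_ctx, n_cs: reinforcements counted while each was present.
    """
    det = t_ctx * t_cs - t_cs * t_cs
    if det == 0:
        raise ValueError("CS and context are never separated; no unique solution")
    r_ctx = (n_ctx * t_cs - t_cs * n_cs) / det   # Cramer's rule
    r_cs = (t_ctx * n_cs - t_cs * n_ctx) / det
    return r_ctx, r_cs

# All 10 reinforcements fall within the CS: credit goes to the CS.
print(solve_two_predictor_rates(1000, 100, 10, 10))
# Reinforcements at the same rate with and without the CS: credit goes to the context.
print(solve_two_predictor_rates(1000, 100, 10, 1))
```

      Only cumulative times and counts enter the computation, so the order in which the experience was acquired cannot affect the result, which is the property the paragraph above relies on.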

      The RET algorithm is but one component of the information-theoretic model of associative learning (aka TATAL, The Analytic Theory of Associative Learning; Wilkes & Gallistel, 2016). It solves the assignment-of-credit problem, not the change-detection problem. Because rates of reinforcement do sometimes change, the stationarity assumption, which is essential to the RET algorithm, must be tested when each new reinforcement occurs and when the interval since the last reinforcement has become longer than would be expected or the number of reinforcements has become significantly fewer than would be expected given the current estimate of the probability of reinforcement (C. R. Gallistel, Krishan, Liu, Miller, & Latham, 2014). In the information-theoretic approach to associative learning, detecting non-stationarity is done by an information-theoretic change-detecting algorithm. The algorithm correctly predicts that omitted reinforcements to extinction will be a constant (C.R. Gallistel, 2024 under review; Gibbon, Farrell, Locurto, Duncan, & Terrace, 1980). To put the prediction another way, unreinforced trials to extinction will increase in proportion to the trials/reinforcement during training (C.R. Gallistel, 2012; Wilkes & Gallistel, 2016). In other words, it predicts the best and most systematic data on the partial reinforcement extinction effect (PREE) known to us. The profound challenge to neo-Hullian delta-rule updating models that is posed by the PREE has been recognized for the better part of a century. To the best of our knowledge, no other formalized model of associative learning has overcome this challenge (Dayan & Niv, 2008; Mellgren, 2012). Explaining extinction algorithmically is straightforward when one adopts an information-theoretic perspective, because computing reinforcement-by-reinforcement the Kullback-Leibler divergence in a sequence of earlier rate (or probability!) estimates from the most recent estimate and multiplying the vector of divergences by the vector of effective sample sizes (C. R. Gallistel & Latham, 2022) detects and localizes changes in rates and probabilities of reinforcement (C.R. Gallistel, 2024 under review). The computation presupposes the existence of a temporal map, a time-stamped record of past events. This supposition is strongly resisted by neuroscience-oriented reinforcement-learning modelers, who try to substitute the assumption of decaying eligibility traces.
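
      As a rough sketch of the kind of quantity involved, and not the change-detection algorithm itself, the code below computes the Kullback-Leibler divergence between two exponential inter-reinforcement distributions and scales each divergence by an effective sample size. The choice of the exponential form, the direction of the divergence, and all names and numbers are assumptions made for illustration.

```python
import math

def kl_exponential(rate_p, rate_q):
    """D_KL( Exp(rate_p) || Exp(rate_q) ) in nats."""
    return math.log(rate_p / rate_q) + rate_q / rate_p - 1.0

def ndkl_profile(rate_estimates, effective_n):
    """Divergence of each earlier rate estimate from the most recent one,
    multiplied element-wise by the matching effective sample size."""
    latest = rate_estimates[-1]
    earlier = rate_estimates[:-1]
    return [n * kl_exponential(r, latest) for r, n in zip(earlier, effective_n)]

# Hypothetical sequence of rate estimates (reinforcements per second): the
# rate is stable near 0.10 and then the most recent estimate has dropped.
rates = [0.10, 0.11, 0.10, 0.04]
n_eff = [5, 10, 15]  # hypothetical effective sample sizes
print([round(x, 3) for x in ndkl_profile(rates, n_eff)])
```

      A large value at some point in the profile indicates that the estimate made at that point is hard to reconcile with the current one, which is the sense in which the computation localizes a change.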

      The very interesting Pearce-Ganesan findings (Ganesan & Pearce, 1988) are not predicted by RET, but nor do they run counter to its predictions. RET has nothing to say about how subjects categorize appetitive reinforcements; nor, at this time, does the information-theoretic approach to an understanding of associative learning have anything to say about that.

      The same is not true for the Betts, Brandon & Wagner results (Betts, Brandon, & Wagner, 1996). They pretrained a blocking cue that predicted a painful paraorbital shock to one eye of a rabbit. This cue elicited an anticipatory blink in the threatened eye. It also potentiated the startle reflex made to a loud noise in one ear. A new cue was then introduced, which always occurred in compound with the pretrained blocking cue. In one group, the painful shock continued to be delivered to the same eye as before; in another group, it was delivered to the skin around the other eye. In the group that continued to receive the shock to the same eye, the old cue effectively blocked conditioning of the new cue for both the eyeblink and the potentiated startle response. However, in the group for which the location of the shock changed to the other eye, the old cue did not block conditioning of the eyeblink response to the new cue but did block conditioning of the startle response to the new cue. The information-theoretic analysis of associative learning focusses on the encoding of measurable predictive temporal relationships, rather than on general and, to our mind, vague notions like CS processing and US processing. A painful shock elicits fear in a rabbit no matter where on the body surface it is experienced, because fear is a reaction to a very broad category of dangers, and fear potentiates the startle reflex regardless of the threat that causes fear. Once the prediction of such a threat is encoded, redundant cues will not be encoded in that same way because the RET algorithm blocks the encoding of redundant predictions. A painful shock near an eye elicits a blink of the threatened eye as well as the fear that potentiates the startle. An appropriate encoding for the eye blink must specify the location of the threat. RET will attribute prediction of the threat to the new eye to the new cue (and not to the old cue, the pretrained blocker) while continuing to attribute to the old cue the prediction of a fear-causing threat, because the change in location does not alter that prediction. Therefore, the new cue will be encoded as predicting the new location of the threat to the eye, but not as predicting the large category of non-specific threats that elicit fear and the potentiation of the startle, because that prediction remains valid. Changing that prediction would violate the stationarity assumption; predictive relations do not change unless the data imply that they must have changed. Unless we have made a slip in our logic, this would seem to explain Betts et al.’s (1996) results. It does so with no free parameters, unlike AESOP, which has a notoriously large number of free parameters.

      Balci, F., Freestone, D., & Gallistel, C. R. (2009). Risk assessment in man and mouse. Proceedings of the National Academy of Sciences USA, 106(7), 2459-2463. doi:10.1073/pnas.0812709106

      Balsam, P. D., Fairhurst, S., & Gallistel, C. R. (2006). Pavlovian contingencies and temporal information. Journal of Experimental Psychology: Animal Behavior Processes, 32, 284-294.

      Barron, A., Rissanen, J., & Yu, B. (1998). The minimum description length principle in coding and modeling. IEEE Transactions on Information Theory, 44(6), 2743-2760.

      Berridge, K. C. (2012). From prediction error to incentive salience: Mesolimbic computation of reward motivation. European Journal of Neuroscience.

      Betts, S. L., Brandon, S. E., & Wagner, A. R. (1996). Dissociation of the blocking of conditioned eyeblink and conditioned fear following a shift in US locus. Animal Learning and Behavior, 24(4), 459-470.

      Chan, C. K. J., & Harris, J. A. (2019). The partial reinforcement extinction effect: The proportion of trials reinforced during conditioning predicts the number of trials to extinction. Journal of Experimental Psychology: Animal Learning and Cognition, 45(1). doi:10.1037/xan0000190

      Dayan, P., & Niv, Y. (2008). Reinforcement learning: The good, the bad and the ugly. Current Opinion in Neurobiology, 18(2), 185-196.

      Gallistel, C. R. (1990). The organization of learning. Cambridge, MA: Bradford Books/MIT Press.

      Gallistel, C. R. (2012). Extinction from a rationalist perspective. Behavioural Processes, 90, 66-88. doi:10.1016/j.beproc.2012.02.008

      Gallistel, C. R. (2024 under review). Reconceptualized associative learning. Perspectives on Behavioral Science (Special Issue for SQAB 2024).

      Gallistel, C. R., Balsam, P. D., & Fairhurst, S. (2004). The learning curve: Implications of a quantitative analysis. Proceedings of the National Academy of Sciences, 101(36), 13124-13131.

      Gallistel, C. R., Krishan, M., Liu, Y., Miller, R. R., & Latham, P. E. (2014). The perception of probability. Psychological Review, 121, 96-123. doi:10.1037/a0035232

      Gallistel, C. R., & Latham, P. E. (2022). Bringing Bayes and Shannon to the Study of Behavioral and Neurobiological Timing. Timing & Time Perception, 1-61. doi:10.1163/22134468-bja10069

      Ganesan, R., & Pearce, J. M. (1988). Effect of changing the unconditioned stimulus on appetitive blocking. Journal of Experimental Psychology: Animal Behavior Processes, 14, 280-291.

      Gibbon, J. (1981). The contingency problem in autoshaping. In C. M. Locurto, H. S. Terrace, & J. Gibbon (Eds.), Autoshaping and conditioning theory (pp. 285-308). New York: Academic.

      Gibbon, J., & Balsam, P. (1981). Spreading association in time. In C. M. Locurto, H. S. Terrace, & J. Gibbon (Eds.), Autoshaping and conditioning theory (pp. 219-253). New York: Academic Press.

      Gibbon, J., Berryman, R., & Thompson, R. L. (1974). Contingency spaces and measures in classical and instrumental conditioning. Journal of the Experimental Analysis of Behavior, 21(3), 585-605. doi:10.1901/jeab.1974.21-585

      Gibbon, J., Farrell, L., Locurto, C. M., Duncan, H. J., & Terrace, H. S. (1980). Partial reinforcement in autoshaping with pigeons. Animal Learning and Behavior, 8, 45–59. doi:10.3758/BF03209729

      Grünwald, P. D., Myung, I. J., & Pitt, M. A. (2005). Advances in minimum description length: theory and applications. Cambridge, MA: MIT Press.

      Hallam, S. C., Grahame, N. J., & Miller, R. R. (1992). Exploring the edges of Pavlovian contingency space: An assessment of contingency theory and its various metrics. Learning and Motivation, 23, 225-249.

      Hammond, L. J. (1980). The effect of contingency upon the appetitive conditioning of free operant behavior. Journal of the Experimental Analysis of Behavior, 34, 297-304. doi:10.1901/jeab.1980.34-297

      Hammond, L. J., & Paynter, W. E. (1983). Probabilistic contingency theories of animal conditioning: A critical analysis. Learning and Motivation, 14, 527-550. doi:10.1016/0023-9690(83)90031-0

      Harris, J. A. (2019). The importance of trials. Journal of Experimental Psychology: Animal Learning and Cognition, 45(4).

      Harris, J. A. (2022). The learning curve, revisited. Journal of Experimental Psychology: Animal Learning and Cognition, 48, 265-280.

      Harris, J. A., & Andrew, B. J. (2017). Time, Trials and Extinction. Journal of Experimental Psychology: Animal Learning and Cognition, 43(1), 15-29.

      Harris, J. A., & Bouton, M. E. (2020). Pavlovian conditioning under partial reinforcement: The effects of non-reinforced trials versus cumulative CS duration. Journal of Experimental Psychology: Animal Learning and Cognition, 46, 256-272.

      Harris, J. A., Kwok, D. W. S., & Gottlieb, D. A. (2019). The partial reinforcement extinction effect depends on learning about nonreinforced trials rather than reinforcement rate. Journal of Experimental Psychology: Animal Learning and Cognition, 45(4). doi:10.1037/xan0000220

      Jeong, H., Taylor, A., Floeder, J. R., Lohmann, M., Mihalas, S., Wu, B., . . . Namboodiri, V. M. K. (2022). Mesolimbic dopamine release conveys causal associations. Science. doi:10.1126/science.abq6740

      Kheifets, A., Freestone, D., & Gallistel, C. R. (2017). Theoretical Implications of Quantitative Properties of Interval Timing and Probability Estimation in Mouse and Rat. Journal of the Experimental Analysis of Behavior, 108(1), 39-72. doi:10.1002/jeab.261

      Kheifets, A., & Gallistel, C. R. (2012). Mice take calculated risks. Proceedings of the National Academy of Sciences, 109, 8776-8779. doi:10.1073/pnas.1205131109

      Mallea, J., Schulhof, A., Gallistel, C. R., & Balsam, P. D. (2024 in press). Both probability and rate of reinforcement can affect the acquisition and maintenance of conditioned responses. Journal of Experimental Psychology: Animal Learning and Cognition.

      Mellgren, R. (2012). Partial reinforcement extinction effect. In N. M. Seel (Ed.), Encyclopedia of the Sciences of Learning. Boston, MA: Springer.

      Niv, Y. (2009). Reinforcement learning in the brain. Journal of Mathematical Psychology, 53, 139-154.

      Niv, Y., Daw, N. D., & Dayan, P. (2005). How fast to work: response vigor, motivation and tonic dopamine. In Y. Weiss, B. Schölkopf, & J. R. Platt (Eds.), NIPS 18 (pp. 1019–1026). Cambridge, MA: MIT Press.

      Niv, Y., Daw, N. D., Joel, D., & Dayan, P. (2007). Tonic dopamine: Opportunity costs and the control of response vigor. Psychopharmacology, 191(3), 507-520.

      Niv, Y., & Montague, P. R. (2008). Theoretical and empirical studies of learning. In P. W. Glimcher et al. (Eds.), Neuroeconomics: Decision-Making and the Brain (pp. 329–349). New York: Academic Press.

      Niv, Y., & Schoenbaum, G. (2008). Dialogues on prediction errors. Trends in Cognitive Sciences, 12(7), 265-272. doi:10.1016/j.tics.2008.03.006

      Rescorla, R. A. (1966). Predictability and the number of pairings in Pavlovian fear conditioning. Psychonomic Science, 4, 383-384.

      Rescorla, R. A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology, 66(1), 1-5. doi:10.1037/h0025984

      Rescorla, R. A. (1969). Conditioned inhibition of fear resulting from negative CS-US contingencies. Journal of Comparative and Physiological Psychology, 67, 504-509.

      Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II (pp. 64-99). New York: Appleton-Century-Crofts.

      Rissanen, J. (1999). Hypothesis selection and testing by the MDL principle. The Computer Journal, 42, 260–269. doi:10.1093/comjnl/42.4.260

      Scott, G. K., & Platt, J. R. (1985). Model of response-reinforcement contingency. Journal of Experimental Psychology: Animal Behavior Processes, 11(2), 152-171.

      Wagner, A. R., & Rescorla, R. A. (1972). Inhibition in Pavlovian conditioning: Application of a theory. In R. A. Boakes & S. Halliday (Eds.), Inhibition and learning. New York: Academic.

      Wilkes, J. T., & Gallistel, C. R. (2016). Information Theory, Memory, Prediction, and Timing in Associative Learning (original long version).

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In their manuscript, Gomez-Frittelli and colleagues characterize the expression of cadherin6 (and -8) in colonic IPANs of mice. Moreover, they found that these cdh6-expressing IPANs are capable of initiating colonic motor complexes in the distal colon, but not proximal and midcolon. They support their claim by morphological, electrophysiological, optogenetic, and pharmacological experiments.

      Strengths:

      The work is very impressive and involves several genetic models and state-of-the-art physiological setups including respective controls. It is a very well-written manuscript that truly contributes to our understanding of GI-motility and its anatomical and physiological basis. The authors were able to convincingly answer their research questions with a wide range of methods without overselling their results.

      We greatly appreciate the reviewer’s time, careful reading and support of our study.

      Weaknesses:

      The authors put quite some emphasis on stating that cdh6 is a synaptic protein (in the title and throughout the text), which interacts in a homophilic fashion. They deduce that cdh6 might be involved in IPAN-IPAN synapses (line 247ff.). However, Cdh6 does not only interact in synapses and is expressed by non-neuronal cells as well (see e.g., expression in the proximal tubuli of the kidney). Moreover, cdh6 does not only build homodimers, but also heterodimers with Cdh9 as well as Cdh7, -10, and -14 (see e.g., Shimoyama et al. 2000, DOI: 10.1042/0264-6021:3490159). It would therefore be interesting to assess the expression pattern of cdh6-proteins using immunostainings in combination with synaptic markers to substantiate the authors' claim or at least add the possibility of cell-cell-interactions other than synapses to the discussion. Additionally, an immunostaining of cdh6 would confirm if the expression of tdTomato in smooth muscle cells of the cdh6-creERT model is valid or a leaky expression (false positive).

      We agree with the reviewer that Cdh6 could be mediating some other cell-cell interaction besides synapses between IPANs, and will include more on this in the discussion. Cdh6 primarily forms homodimers but, as the reviewer points out, has been known to also form heterodimers with some other cadherins. We performed RNAscope in the colonic myenteric plexus with Cdh7 and found no expression (data not shown). Cdh10 is suggested to have very low expression (Drokhlyansky et al., 2020), possibly in putative secretomotor vasodilator neurons, and Cdh14 has not been assayed in any RNAseq screens. We attempted to visualize Cdh6 protein via antibody staining (Duan et al., 2018) but our efforts did not result in sufficient signal or resolution to identify synapses in the ENS, which remain broadly challenging to assay. Similarly, neither immunostaining with the Cdh6 antibody nor RNAscope was able to confirm Cdh6 expression in tdT-expressing muscle cells. We will address these caveats in the discussion section.

      (1) E. Drokhlyansky, C. S. Smillie, N. V. Wittenberghe, M. Ericsson, G. K. Griffin, G. Eraslan, D. Dionne, M. S. Cuoco, M. N. Goder-Reiser, T. Sharova, O. Kuksenko, A. J. Aguirre, G. M. Boland, D. Graham, O. Rozenblatt-Rosen, R. J. Xavier, A. Regev, The Human and Mouse Enteric Nervous System at Single-Cell Resolution. Cell 182, 1606-1622.e23 (2020).

      (2) X. Duan, A. Krishnaswamy, M. A. Laboulaye, J. Liu, Y.-R. Peng, M. Yamagata, K. Toma, J. R. Sanes, Cadherin Combinations Recruit Dendrites of Distinct Retinal Neurons to a Shared Interneuronal Scaffold. Neuron 99, 1145-1154.e6 (2018).

      Reviewer #2 (Public review):

      Summary:

      Intrinsic primary afferent neurons are an interesting population of enteric neurons that transduce stimuli from the mucosa, initiate reflexive neurocircuitry involved in motor and secretory functions, and modulate gut immune responses. The morphology, neurochemical coding, and electrophysiological properties of these cells have been relatively well described in a long literature dating back to the late 1800s, but questions remain regarding their roles in enteric neurocircuitry, potential subsets with unique functions, and contributions to disease. Here, the authors provide RNAscope, immunolabeling, electrophysiological, and organ function data characterizing IPANs in mice and suggest that Cdh6 is an additional marker of these cells.

      Strengths:

      This paper would likely be of interest to a focused enteric neuroscience audience and increase information regarding the properties of IPANs in mice. These data are useful and suggest that prior data from studies of IPANs in other species are likely translatable to mice.

      We appreciate the reviewer’s support of our study and insightful critiques for its improvement.

      Weaknesses:

      The advance presented here beyond what is already known is minimal. Some of the core conclusions are overstated and there are multiple other major issues that limit enthusiasm. Key control experiments are lacking and data do not specifically address the properties of the proposed Cdh6+ population.

      Major weaknesses:

      (1) The novelty of this study is relatively low. The main point of novelty suggests an additional marker of IPANs (Cdh6) that would add to the known list of markers for these cells. How useful this would be is unclear. Other main findings basically confirm that IPANs in mice display the same classical characteristics that have been known for many years from studies in guinea pigs, rats, mice and humans.

      We appreciate the already existing markers for IPANs in the ENS and the existing literature characterizing these neurons. The primary intent of this study was to use these well established characteristics of IPANs in both mice and other species to characterize Cdh6-expressing neurons in the mouse myenteric plexus and confirm their classification as IPANs.

      (2) Some of the main conclusions of this study are overstated and claims of priority are made that are not true. For example, the authors state in lines 27-28 of the abstract that their findings provide the "first demonstration of selective activation of a single neurochemical and functional class of enteric neurons". This is certainly not true since Gould et al (AJP-GIL 2019) expressed ChR2 in nitrergic enteric neurons and showed that activating those cells disrupted CMC activity. In fact, prior work by the authors themselves (Hibberd et al., Gastro 2018) showed that activating calretinin neurons with ChR2 evoked motor responses. Work by other groups has used chemogenetics and optogenetics to show the effects of activating multiple other classes of neurons in the gut.

      We believe our phrasing in this sentence was misleading. Whilst single neurochemical classes of enteric neurons have been manipulated to alter gut functions, none of these instances to date represents manipulation of a single functional class of enteric neurons. In the given examples, NOS and calretinin are each expressed to varying degrees across putative motor neurons, interneurons and IPANs. In contrast, Cdh6 is restricted to IPANs, and therefore this study is the first optogenetic investigation of enteric neurons from a single putative functional class. We will alter this segment in the revised manuscript to emphasize this point and differentiate this study from previous ones.

      (3) Critical controls are needed to support the optogenetic experiments. Control experiments are needed to show that ChR2 expression a) does not change the baseline properties of the neurons, b) that stimulation with the chosen intensity of light elicits physiologically relevant responses in those neurons, and c) that stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions focused on here.

      We completely agree that controls are essential. However, our paper is not the first to express ChR2 in enteric neurons. Authors of our paper have shown in Hibberd et al. 2018 that expression of ChR2 in a heterogeneous population of myenteric neurons did not change network properties of the myenteric plexus. This was demonstrated by the lack of change in control CMC characteristics in mice expressing ChR2 under basal conditions (without blue light exposure). Regarding question (b), whether stimulation with the chosen intensity of light elicits physiologically relevant responses in those neurons: we show the restricted expression of ChR2 in IPANs and that motor responses (to blue light) are blocked by selective nerve conduction blockade.

      Regarding question (c), whether stimulation via ChR2 elicits comparable responses in IPANs in the different gut regions: we would not expect each region of the gut to behave comparably. This is because the different gut regions (i.e. proximal, mid, distal) are very different anatomically, as is the anatomy of the myenteric plexus and myenteric ganglia between regions, including the density of IPANs within each ganglion, in addition to the presence of different patterns of electrical and mechanical activity [Spencer et al., 2020]. Hence, it is difficult to expect that stimulation of ChR2 should induce similar physiological responses across regions. The motor output we record in our study (CMCs) is a unified motor program that involves the temporal coordination of hundreds of thousands of enteric neurons and a complex neural circuit that we have previously characterized [Spencer et al., 2018]. However, no study until now has been able to selectively stimulate a single functional class of enteric neurons (with light) and so avoid indiscriminate activation of other classes of neurons.

      (1) T. J. Hibberd, J. Feng, J. Luo, P. Yang, V. K. Samineni, R. W. Gereau, N. Kelley, H. Hu, N. J. Spencer, Optogenetic Induction of Colonic Motility in Mice. Gastroenterology 155, 514-528.e6 (2018).

      (2) N. J. Spencer, L. Travis, L. Wiklendt, T. J. Hibberd, M. Costa, P. Dinning, H. Hu, Diversity of neurogenic smooth muscle electrical rhythmicity in mouse proximal colon. American Journal of Physiology-Gastrointestinal and Liver Physiology 318, G244–G253 (2020).

      (3) N. J. Spencer, T. J. Hibberd, L. Travis, L. Wiklendt, M. Costa, H. Hu, S. J. Brookes, D. A. Wattchow, P. G. Dinning, D. J. Keating, J. Sorensen, Identification of a Rhythmic Firing Pattern in the Enteric Nervous System That Generates Rhythmic Electrical Activity in Smooth Muscle. J. Neurosci. 38, 5507–5522 (2018).

      (4) The electrophysiological characterization of mouse IPANs is useful but this is a basic characterization of any IPAN and really says nothing specifically about Cdh6+ neurons. The electrophysiological characterization was also only done in a small fraction of colonic IPANs, and it is not clear if these represent cell properties in the distal colon or proximal colon, and whether these properties might be extrapolated to IPANs in the different regions. Similarly, blocking IH with ZD7288 affects all IPANs and does not add specific information regarding the role of the proposed Cdh6+ subtype.

      Our electrophysiological characterization was guided to be within a subset of Cdh6+ neurons by Hb9:GFP expression. As in the prior comment (1) above, we used these experiments to confirm classification of Cdh6+ (Hb9:GFP+) neurons in the distal colon as IPANs. We will clarify that these experiments were performed in the distal colon and agree that we cannot extrapolate that these properties are also representative of IPANs in the proximal colon. We apologize that this was confusing. Finally, we agree with the reviewer that ZD7288 affects all IPANs in the ENS and will clarify this in the text.

      (5) Why SMP IPANs were not included in the analysis of Cdh6 expression is a little puzzling. IPANs are present in the SMP of the small intestine and colon, and it would be useful to know if this proposed marker is also present in these cells.

      We agree with the reviewer. In addition to characterizing Cdh6 in the myenteric plexus, it would be interesting to query if sensory neurons located within the SMP also express Cdh6. Our preliminary data (n=2) show ~6-12% tdT/Hu neurons in Cdh6-tdT ileum and colon (data not shown). We will add a sentence to the discussion.

      (6) The emphasis on IH being a rhythmicity indicator seems a bit premature. There is no evidence to suggest that IH and IT are rhythm-generating currents in the ENS.

      Regarding the statement that there is no evidence to suggest that IH and IT are rhythm-generating currents in the ENS, we agree with the reviewer that rhythm generation by IH and IT in the ENS has not been explicitly confirmed. We are confident the reviewer agrees that an absence of evidence is not evidence of absence, although the presence of IH has been well described in enteric neurons. We will modify the text in the results to indicate more clearly that IH and IT are known to participate in rhythm generation in thalamocortical circuits, though their roles in the ENS remain unknown. Our discussion of the potential role of IH or IT in rhythm generation or oscillatory firing of the ENS is confined to speculation in the discussion section of the text.

      (7) As the authors point out in the introduction and discuss later on, Type II Cadherins such as Cdh6 bind homophillically to the same cadherin at both pre- and post-synapse. The apparent enrichment of Cdh6 in IPANs would suggest extensive expression in synaptic terminals that would also suggest extensive IPAN-IPAN connections unless other subtypes of neurons express this protein. Such synaptic connections are not typical of IPANs and raise the question of whether or not IPANs actually express the functional protein and if so, what might be its role. Not having this information limits the usefulness of this as a proposed marker.

      We agree with the reviewer that the proposed IPAN-IPAN connection is novel although it has been proposed before (Kunze et al., 1993). As detailed in our response to Reviewer #1, we attempted to confirm Cdh6 protein expression, but were unsuccessful, due to insufficient signal and resolution. We therefore discuss potential IPAN interconnectivity in the discussion, in the context of contrasting literature.

      (1) W. A. A. Kunze, J. B. Furness, J. C. Bornstein, Simultaneous intracellular recordings from enteric neurons reveal that myenteric AH neurons transmit via slow excitatory postsynaptic potentials. Neuroscience 55, 685–694 (1993).

      (8) Experiments shown in Figures 6J and K use a tethered pellet to drive motor responses. By definition, these are not CMCs as stated by the authors.

      The reviewer makes a valid criticism as to the terminology, since tethered pellet experiments do not record propagation. We believe the periodic bouts of propulsive force on the pellet are triggered by the same activity underlying the CMC. In our experience, these activities have similar periodicity and force, and identical pharmacological properties. Consistent with this, we also tested full colons (n = 2) set up for typical CMC recordings by multiple force transducers, finding that CMCs were abolished by ZD7288, similar to fixed pellet recordings (data not shown).

      (9) The data from the optogenetic experiments are difficult to understand. How would stimulating IPANs in the distal colon generate retrograde CMCs and stimulating IPANs in the proximal colon do nothing? Additional characterization of the Cdh6+ population of cells is needed to understand the mechanisms underlying these effects.

      We agree that the different optogenetic responses in the proximal and distal colon are challenging to interpret, but perhaps not surprising in the wider context. It is possible that the different optogenetic responses in this study reflect not only regional differences in the Cdh6+ neuronal populations, but also differences in neural circuits within these gut regions. A study some time ago by the authors showed that electrical stimulation of the proximal mouse colon was unable to evoke a retrograde (aborally) propagating CMC (Spencer & Bywater, 2002), but stimulation of the distal colon was readily able to. We concluded that at the oral lesion site there is a preferential bias of descending inhibitory nerve projections, since the ascending excitatory pathways have been cut off. In contrast, stimulation of the distal colon was readily able to activate an ascending excitatory neural pathway, and hence induce the complex CMC circuits required to generate an orally propagating CMC. Indeed, other recent studies have added to a growing body of evidence for significant differences in the behaviors and neural circuits of the two regions (Li et al., 2019, Costa et al., 2021a, Costa et al., 2021b, Nestor-Kalinoski et al., 2022). We will expand this discussion.

      (1) N. J. Spencer, R. A. Bywater, Enteric nerve stimulation evokes a premature colonic migrating motor complex in mouse. Neurogastroenterology & Motility 14, 657–665 (2002).

      (2) Li Z, Hao MM, Van den Haute C, Baekelandt V, Boesmans W, Vanden Berghe P (2019) Regional complexity in enteric neuron wiring reflects diversity of motility patterns in the mouse large intestine. Elife 8.

      (3) Costa M, Keightley LJ, Hibberd TJ, Wiklendt L, Dinning PG, Brookes SJ, Spencer NJ (2021a) Motor patterns in the proximal and distal mouse colon which underlie formation and propulsion of feces. Neurogastroenterol Motil e14098.

      (4) Costa M, Keightley LJ, Hibberd TJ, Wiklendt L, Smolilo DJ, Dinning PG, Brookes SJ, Spencer NJ (2021b) Characterization of alternating neurogenic motor patterns in mouse colon. Neurogastroenterol Motil 33:e14047.

      (5) Nestor-Kalinoski A, Smith-Edwards KM, Meerschaert K, Margiotta JF, Rajwa B, Davis BM, Howard MJ (2022) Unique Neural Circuit Connectivity of Mouse Proximal, Middle, and Distal Colon Defines Regional Colonic Motor Patterns. Cell Mol Gastroenterol Hepatol 13:309-337.e303.

    1. References

      Update v1.1

      The following references have been added:

      1. FDA approves pembrolizumab for cutaneous squamous cell carcinoma. https://www.fda.gov/drugs/drug-approvals-and-databases/fda-approves-pembrolizumab-cutaneous-squamous-cell-carcinoma

      2. FDA approves toripalimab-tpzi for nasopharyngeal carcinoma. https://www.fda.gov/drugs/resources-information-approved-drugs/fda-approves-toripalimab-tpzi-nasopharyngeal-carcinoma

      3. Food and Drug Administration, Coherus BioSciences. LOQTORZI (toripalimab-tpzi) prescribing information. Available: https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm?event=overview.process&ApplNo=761240 Accessed 6/27/24.

      4. Gillison ML, Blumenschein Jr G, Fayette J, Guigay J, Colevas AD, Licitra L, Harrington KJ, Kasper S, Vokes EE, Even C, Worden F, Saba NF, Iglesias Docampo LC, Haddad R, Rordorf T, Kiyota N, Tahara M, Monga M, Lynch M, Li L, Ferris RL. CheckMate 141: 1‐Year Update and Subgroup Analysis of Nivolumab as First‐Line Therapy in Patients with Recurrent/Metastatic Head and Neck Cancer. The Oncologist, 2018 Sept. https://doi.org/10.1634/theoncologist.2017-0674

      5. Dzienis MR, Cundom JE, Fuentes CS, Hansen AR, Nordlinger MJ, Pastor AV, Oppelt P, Neki A, Gregg RW, Lima IPF, Franke FA, daCunha Junior GF, Tsent JE, Loree T, Joshi AJ, Mccarthy JS, Naicker N, Sidi Y, Gumuscu B, De Castro Jr G. 651O Pembrolizumab (pembro) + carboplatin (carbo) + paclitaxel (pacli) as first-line (1L) therapy in recurrent/metastatic (R/M) head and neck squamous cell carcinoma (HNSCC): Phase IV KEYNOTE-B10 study. Annals of Oncology, 2022 Sept. https://doi.org/10.1016/j.annonc.2022.07.775

      6. Fayette J, Cropet C, Gautier J, Toullec C, Burgy M, Bruyas A, Sire C, Lagrange A, Clatot F, Calderon B, Vinches M, Iacob M, Martin L, Neidhardt Berard EM, Kaminsky MC, Vansteene D, Salas S, Champagnac A, Pérol D, Bourhis J. Results of the multicenter phase II FRAIL-IMMUNE trial evaluating the efficacy and safety of durvalumab combined with weekly paclitaxel carboplatin in first-line in patients (pts) with recurrent/metastatic squamous cell carcinoma of the head and neck (R/M SCCHN) not eligible for cisplatin-based therapies. J Clin Oncol 41, 2023 (suppl 16; abstr 6003).

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #3 (Public review):

      Summary:

      Juan Liu et al. investigated the interplay between habitat fragmentation and climate-driven thermophilization in birds in an island system in China. They used extensive bird monitoring data (9 surveys per year per island) across 36 islands of varying size and isolation from the mainland, covering 10 years. The authors use extensive modeling frameworks to test for a general increase in the occurrence and abundance of warm-dwelling species, and vice versa for cold-dwelling species, using the widely used Community Temperature Index (CTI), as well as the relationship between island fragmentation, in terms of island area and isolation from the mainland, and the extinction and colonization rates of cold- and warm-adapted species. They found that thermophilization did indeed occur during the last 10 years, which was more pronounced for the CTI based on abundances and less clear for the occurrence-based metric. Generally, the authors show that this is driven by an increased colonization rate of warm-dwelling species and an increased extinction rate of cold-dwelling species. Interestingly, they unravel some of the mechanisms behind this dynamic by showing that warm-adapted species increased while cold-dwelling species decreased more strongly on smaller islands, which is - according to the authors - due to lowered thermal buffering on smaller islands (which was supported by air temperature monitoring done during the study period on small and large islands). They argue that the increased extinction rate of cold-adapted species could also be due to lowered habitat heterogeneity on smaller islands. With regard to island isolation, they also show that both thermophilization processes (increase of warm-adapted and decrease of cold-adapted species) were stronger on islands closer to the mainland, due to closer source populations of either group on the mainland, as compared to limited dispersal (i.e. range shift potential) in more isolated islands.
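      For readers unfamiliar with the metric, the two CTI variants mentioned in this summary can be sketched as follows. This is a minimal illustration of the standard definition (the mean species temperature index, STI, of the species present, optionally weighted by abundance), not the authors' actual pipeline; the function name, species labels and numbers are hypothetical.

```python
def community_temperature_index(sti, counts, use_abundance=True):
    """Mean species temperature index (STI) of the species present.

    sti:    dict mapping species -> STI (e.g. mean temperature of its range)
    counts: dict mapping species -> number of individuals recorded
    use_abundance: weight by abundance if True, by presence/absence if False
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for species, n in counts.items():
        if n <= 0:
            continue  # species absent from this community
        weight = float(n) if use_abundance else 1.0
        weighted_sum += weight * sti[species]
        total_weight += weight
    if total_weight == 0:
        raise ValueError("community contains no species")
    return weighted_sum / total_weight


# Hypothetical community: one cold-dwelling and one warm-dwelling species.
sti = {"cold_sp": 10.0, "warm_sp": 20.0}
counts = {"cold_sp": 1, "warm_sp": 3}

cti_abundance = community_temperature_index(sti, counts)
cti_occurrence = community_temperature_index(sti, counts, use_abundance=False)
```

      In this toy community the abundance-based CTI is pulled toward the more numerous warm-dwelling species, whereas the occurrence-based CTI treats both species equally, which is one way the two metrics can diverge.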

      The conclusions drawn in this study are sound, and mostly well supported by the results. Only few aspects leave open questions and could quite likely be further supported by the authors themselves thanks to their apparent extensive understanding of the study system.

      Strengths:

      The study questions and hypotheses are very well aligned with the methods used, ranging from field surveys to extensive modeling frameworks, as well as with the conclusions drawn from the results. The study addresses a complex question on the interplay between habitat fragmentation and climate-driven thermophilization, which can naturally be affected by a multitude of factors beyond the ones included here. Nevertheless, the authors use a well-balanced method of simplifying this to the most important factors in question (CTI change, extinction, colonization, together with habitat fragmentation metrics of isolation and island area). The interpretation of the results presents interesting mechanisms without overstating the findings and provides important links to the existing literature as well as to additional data and analyses presented in the appendix.

      Weaknesses:

      The metric of island isolation based on distance to the mainland seems oversimplified, as in real life the study system rather represents an island network where islands of different sizes lie at varying distances from each other, such that smaller islands can potentially draw from the species pools of nearby larger islands too, rather than just from the mainland. Although the authors do explain the reason for this metric, backed up by earlier research, a network approach could be worth exploring in future research done in this system. The fact that the authors did find a signal of island isolation does support their method, but the variation in responses to this metric could hint at a more complex pattern in real life than was assumed for this study.

      Thank you again for this suggestion. Building on the previous revision, we have expanded the discussion of the importance of incorporating the island network into future research. The paragraph is now on Lines 294-304:

      “As a caveat, we only consider the distance to the nearest mainland as a measure of fragmentation, consistent with previous work in this system (Si et al., 2014), but we acknowledge that other distance-based metrics of isolation that incorporate inter-island connections and island size could hint on a more complex pattern going on in real-life than was assumed for this study, thus reveal additional insights on fragmentation effects. For instance, smaller islands may also potentially utilize species pools from nearby larger islands, rather than being limited solely to those from the mainland. The spatial arrangement of islands, like the arrangement of habitat, can influence niche tracking of species (Fourcade et al., 2021). Future studies should use a network approach to take these metrics into account to thoroughly understand the influence of isolation and spatial arrangement of patches in mediating the effect of climate warming on species.”

      Recommendations for the authors:

      Reviewer #3 (Recommendations for the authors):

      Great job on the revision! The new version reads well and in my opinion all comments were addressed appropriately. A few additional comments are as follows:

      Thank you very much for your further review and recognition. We have carefully modified the manuscript according to all recommendations.

      (1) L 62: replace shifts with process

      Done. We also added the word “transforming” to match this revision. The new sentence is now on Lines 61-63:

      “Habitat fragmentation, usually defined as the process of transforming continuous habitat into spatially isolated and small patches”

      (2) L 363: Your metric for habitat fragmentation is isolation and habitat area and I think this could be introduced already in the introduction, where you somewhat define fragmentation (although it could be clearer still). You could also discuss this in the discussion more, that other measures of fragmentation may be interesting to look at.

      Thank you for this suggestion. We have now introduced the metrics of habitat fragmentation in the Introduction, immediately after habitat fragmentation is defined. The sentence is now on Lines 64-66:

      “Among the various ways in which habitat fragmentation is conceptualized and measured, patch area and isolation are two of the most used measures (Fahrig, 2003).”

      (3) L 384: replace for with because of

      Done.

      (4) L 388: "Following this filtering, 60 ...."

      Done.

      (5) Figure 1: In panels b-d you use different terms (fragmented, small, isolated) but aiming to describe the same thing. I would highly recommend to either use fragmented islands or isolated islands for all panels. Although I see that in your study fragmentation includes both, habitat loss and isolation. So make this clear in the figure caption too...

      Thank you very much for this suggestion. It is important to use “fragmentation” consistently. We changed “fragmented, small, isolated” to “Fragmented patches” in the caption of panels b-d. The modified caption is now on Line 771.

      (6) L 783: replace background with habitat (or landscape) and exhibit with exemplify

      Done. The new sentence is now on Lines 782-784:

      “The three distinct patches signify a fragmented landscape and the community in the middle of the three patches was selected to exemplify colonization-extinction dynamics in fragmented habitats.”

      (7) One bigger thing is the definition of fragmentation in your study for which you used habitat area (from habitat loss process) and isolation. This could still be clarified a bit more, especially in the figures. In Fig. 1 the smaller panels b-d could all be titled fragmented islands as this is what the different terms describe in your study (small, isolated) and thus the figure would become even clearer. Otherwise I'm happy with the changes made.

      Thank you for raising this important question. Yes, “habitat fragmentation” in our research includes both habitat loss and fragmentation per se. We have clarified the caption of b-d in Figure 1 as suggested by Recommendation (5). We believe this can make it clearer to the readers.

    1. In fact there is a sense in which the category of impostor, previously referred to, can be defined as a person who makes it impossible for his audience to be tactful about observed misrepresentation

      Works as a service to the audience- opportunity to be tactful about misrepresentation - helps the audience help the performer retain their own performance

    2. For example, it was suggested that tactful outsiders in a physical position to overhear an interaction may offer a show of inattention. In order to assist in this tactful withdrawal, the participants who feel it is physically possible for them to be overheard may omit from their conversation and activity anything that would tax this tactful resolve of the outsiders, and at the same time include enough semi-confidential facts to show that they do not distrust the show of withdrawal presented by the outsiders

      Example: two people are having a private convo with others around. The random, unrelated people offer the courtesy, or tact, of not paying attention. The two people won't completely cease their conversation or make a dramatic show of secrecy, but they will refrain from discussing everything outright, maintaining some level of secrecy while respecting or acknowledging the others' "tact".

    3. And then, in turn, it becomes possible for the performers to learn that the audience knows that the performers know they are being protected. Now when such states of information exist, a moment in the performance may come when the separateness of the teams will break down and be momentarily replaced by a communion of glances through which each team openly admits to the other its state of information. At such moments the whole dramaturgical structure of social interaction is suddenly and poignantly laid bare, and the line separating the teams momentarily disappears. Whether this close view of things brings shame or laughter, the teams are likely to draw rapidly back into their appointed characters

      THIISSSSS

      If I say that I know that you know I know you know- the whole performance reality breaks - shame or laughter

    4. The games introduced by the nurses were on a very childish level; many of the patients felt silly playing them and were glad when the party was over and they could go back to activities of their own choosing

      audience allows nurse to fulfill role by participating in something they didn't want to do

    5. And when outsiders find they are about to enter such a region, they often give those already present some warning, in the form of a message, or a knock, or a cough, so that the intrusion can be put off if necessary or the setting hurriedly put in order and proper expressions fixed on the faces of those present. This kind of tact can become nicely elaborated. Thus, in presenting oneself to a stranger by means of a letter of introduction, it is thought proper to convey the letter to the addressee before actually coming into his immediate presence; the addressee then has time to decide what kind of greeting the individual is to receive, and time to assemble the expressive manner appropriate to such a greeting

      outsiders make performers aware of their own presence

    6. In addition, it will be useful if the members of the team exercise foresight and design in determining in advance how best to stage a show.

      Dramaturgical circumspection- investment and therefore foresight in how the performance will play out- anticipation of it

    7. I refer to the fact that while the performer is ostensibly immersed and given over to the activity he is performing, and is apparently engrossed in his actions in a spontaneous, uncalculating way, he must none the less be affectively dissociated from his presentation in a way that leaves him free to cope with dramaturgical contingencies as they arise.

      dramaturgical discipline- disassociation from own role as a performance- real or fake intellectual and emotional involvement in activity at stake

    8. but clerks can frequently be found who not only appear to take the role of the customer in giving buying-advice but actually do so

      example- dramaturgical loyalty threatened when clerks say candidly what products are actually worth buying

    9. o confronts the performers with facts or expressive acts which each team knows will be unacceptable

      sometimes audience can't believe impression and confronts performers about it

    10. However there are situations, often called 'scenes,' in which an individual acts in such a way as to destroy or seriously threaten the polite appearance of consensus, and while he may not act simply in order to create such dissonance, he acts with the knowledge that this kind of dissonance is likely to result. The common-sense phrase, 'creating a scene,' is apt because, in effect, a new scene is created by such disruptions. The previous and expected interplay between the teams is suddenly forced aside and a new drama forcibly takes its place

      sometimes this threat to polite appearances is for a certain purpose

    11. The past life and current round of activity of a given performer typically contain at least a few facts which, if introduced during the performance, would discredit or at least weaken the claims about self that the performer was attempting

      facts about performers (of past and present relevance) can pose a threat to the illusion

    12. Questions are raised about the condition of sign equipment; stands, lines, and positions are tentatively brought forth and 'cleared' by the assembled membership; the merits and demerits of available front regions are analyzed; the size and character of possible audiences for the performance are considered; past performance disruptions and likely disruptions are talked about; news about the teams of one’s colleagues is transmitted; the reception given one’s last performance is mulled over in what are sometimes called 'post mortems;' wounds are licked and morale is strengthened for the next performance

      staging talk- talk regarding performance itself

    13. In conversational circles of five or six, basic alignments as between one conjugal pair and another, or between hosts and guests, or between men and women, may be light-heartedly set aside, and the participants will stand ready to shift and reshift team alignments with little provocation, jokingly joining their previous audience against their previous

      banter = consistent team realignments

    14. Attempts are made to establish a special relationship with the doctor. Patients often attempt to cultivate the illusion of a secret understanding with the doctor by, for example, trying to catch his eye

      this is just a great example