  1. Feb 2025
    1. Reviewer #1 (Public review):

      Wang et al. investigated how sexual failure influences sweet taste perception in male Drosophila. The study revealed that courtship failure leads to decreased sweet sensitivity and feeding behavior via dopaminergic signaling. Specifically, the authors identified a group of dopaminergic neurons projecting to the subesophageal zone that interacts with sweet-sensing Gr5a+ neurons. These dopaminergic neurons positively regulate the sweet sensitivity of Gr5a+ neurons via DopR1 and Dop2R receptors. Sexual failure diminishes the activity of these dopaminergic neurons, leading to reduced sweet-taste sensitivity and sugar-feeding behavior in male flies. These findings highlight the role of dopaminergic neurons in integrating reproductive experiences to modulate appetitive sensory responses.

      Previous studies have explored the dopaminergic-to-Gr5a+ neuronal pathways in regulating sugar feeding under hunger conditions. Starvation has been shown to increase dopamine release from a subset of TH-GAL4 labeled neurons, known as TH-VUM, in the subesophageal zone. This enhanced dopamine release activates dopamine receptors in Gr5a+ neurons, heightening their sensitivity to sugar and promoting sucrose acceptance in flies. Since the function of the dopaminergic-to-Gr5a+ circuit motif has been well established, the primary contribution of Wang et al. is to show that mating failure in male flies can also engage this circuit to modulate sugar-feeding behavior. This contribution is valuable because it highlights the role of dopaminergic neurons in integrating diverse internal state signals to inform behavioral decisions.

      An intriguing discrepancy between Wang et al. and earlier studies lies in the involvement of dopamine receptors in Gr5a+ neurons. Prior research has shown that Dop2R and DopEcR, but not DopR1, mediate starvation-induced enhancement of sugar sensitivity in Gr5a+ neurons. In contrast, Wang et al. found that DopR1 and Dop2R, but not DopEcR, are involved in the sexual failure-induced decrease in sugar sensitivity in these neurons. I wish the authors had further explored or discussed this discrepancy, as it is unclear how dopamine release selectively engages different receptors to modulate neuronal sensitivity in a context-dependent manner.

      The data presented by Wang et al. are solid and effectively support their conclusions. However, certain aspects of their experimental design, data analysis, and interpretation warrant further review, as outlined below.

      (1) The authors did not explicitly indicate the feeding status of the flies, but it appears they were not starved. However, the naive and satisfied flies in this study displayed high feeding and PER baselines, similar to those observed in starved flies in other studies. This raises the concern that sexually failed flies may have consumed additional food during the 4.5-hour conditioning period, potentially lowering their baseline hunger levels and subsequently reducing PER responses. This alternative explanation is worth considering, as an earlier study demonstrated that sexually deprived males consumed more alcohol, and both alcohol and food are known rewards for flies. To address this concern, the authors could remove food during the conditioning phase to rule out its influence on the results.

      (2) Figure 1B reveals that approximately half of the males in the Failed group did not consume sucrose, yet Figure 1-S1A suggests that the total volume consumed remained unchanged. Were the flies that did not consume sucrose omitted from the dataset presented in Figure 1-S1A? If so, does this imply that only half of the male flies experience sexual failure, or that sexual failure affects only half of males while the others remain unaffected? The authors should clarify this point.

      (3) The evidence linking TH-GAL4 labeled dopaminergic neurons to reduced sugar sensitivity in Gr5a+ neurons in sexually failed males could be further strengthened. Ideally, the authors would have activated TH-GAL4 neurons and observed whether this restored GCaMP responses in Gr5a+ neurons in sexually failed males. Instead, the authors performed a less direct experiment, shown in Figures 3-S1C and D. The manuscript does not describe the condition of the flies used in this experiment, but it appears that they were not sexually conditioned. I have two concerns with this experiment. First, no statistical analysis was provided to support the enhancement of sucrose responses following activation of TH-GAL4 neurons. Second, without performing this experiment in sexually failed males, the authors lack direct evidence to confirm that the dampened response of Gr5a+ neurons to sucrose results from decreased activity in TH-GAL4 neurons.

      (4) The statistical methods used in this study are poorly described, making it unclear which method was used for each experiment. I suggest that the authors include a clear description of the statistical methods used for each experiment in the figure legends. Furthermore, as I have pointed out, there is a lack of statistical comparisons in Figures 3-S1C and D; a similar problem exists for Figures 6E and F.

      (5) The experiments in Figure 5 lack specificity. The target neurons in this study are Gr5a+ neurons, which are directly involved in sugar sensing. However, the authors used the less specific Dop1R1- and Dop2R-GAL4 lines for their manipulations. Using Gr5a-GAL4 to specifically target Gr5a+ neurons would provide greater precision and ensure that the observed effects are directly attributable to the modulation of Gr5a+ neurons, rather than being influenced by potential off-target effects from other neuronal populations expressing these dopamine receptors.

      (6) I found the results presented in Fig. 6F puzzling. The knockdown of Dop2R in Gr5a+ neurons would be expected to decrease sucrose responses in naive and satisfied flies, given the role of Dop2R in enhancing sweet sensitivity. However, the figure shows an apparent increase in responses across all three groups, which contradicts this expectation. The authors may want to provide an explanation for this unexpected result.

      (7) In several instances in the manuscript, the authors described the effects of silencing dopamine signaling pathways or knocking down dopamine receptors in Gr5a neurons with phrases such as 'no longer exhibited reduced sweet sensitivity' (e.g., L269 and L288), 'prevent the reduction of sweet sensitivity' (e.g., L292), or 'this suppression was reversed' (e.g. L299). I found these descriptions misleading, as they suggest that sweet sensitivity in naive and satisfied groups remains normal while the reduction in failed flies is specifically prevented or reversed. However, this is not the case. The data indicate that these manipulations result in an overall decrease in sweet sensitivity across all groups, such that a further reduction in failed flies is not observed. I recommend revising these descriptions to accurately reflect the observed phenotypes and avoid any confusion regarding the effects of these manipulations.

    2. Author response:

      Reviewer #1 (Public review):

      Wang et al. investigated how sexual failure influences sweet taste perception in male Drosophila. The study revealed that courtship failure leads to decreased sweet sensitivity and feeding behavior via dopaminergic signaling. Specifically, the authors identified a group of dopaminergic neurons projecting to the suboesophageal zone that interacts with sweet-sensing Gr5a+ neurons. These dopaminergic neurons positively regulate the sweet sensitivity of Gr5a+ neurons via DopR1 and Dop2R receptors. Sexual failure diminishes the activity of these dopaminergic neurons, leading to reduced sweet-taste sensitivity and sugar-feeding behavior in male flies. These findings highlight the role of dopaminergic neurons in integrating reproductive experiences to modulate appetitive sensory responses.

      Previous studies have explored the dopaminergic-to-Gr5a+ neuronal pathways in regulating sugar feeding under hunger conditions. Starvation has been shown to increase dopamine release from a subset of TH-GAL4 labeled neurons, known as TH-VUM, in the suboesophageal zone. This enhanced dopamine release activates dopamine receptors in Gr5a+ neurons, heightening their sensitivity to sugar and promoting sucrose acceptance in flies. Since the function of the dopaminergic-to-Gr5a+ circuit motif has been well established, the primary contribution of Wang et al. is to show that mating failure in male flies can also engage this circuit to modulate sugar-feeding behavior. This contribution is valuable because it highlights the role of dopaminergic neurons in integrating diverse internal state signals to inform behavioral decisions.

      An intriguing discrepancy between Wang et al. and earlier studies lies in the involvement of dopamine receptors in Gr5a+ neurons. Prior research has shown that Dop2R and DopEcR, but not DopR1, mediate starvation-induced enhancement of sugar sensitivity in Gr5a+ neurons. In contrast, Wang et al. found that DopR1 and Dop2R, but not DopEcR, are involved in the sexual failure-induced decrease in sugar sensitivity in these neurons. I wish the authors had further explored or discussed this discrepancy, as it is unclear how dopamine release selectively engages different receptors to modulate neuronal sensitivity in a context-dependent manner.

      Our immunostaining experiments showed that three dopamine receptors, DopR1, Dop2R, and DopEcR, were expressed in Gr5a<sup>+</sup> neurons in the proboscis, consistent with previous RT-PCR findings (Inagaki et al 2012). As the reviewer pointed out, we found that DopR1 and Dop2R were required for courtship failure-induced suppression of sugar sensitivity, whereas Marella et al 2012 and Inagaki et al 2012 found that Dop2R and DopEcR were required for starvation-induced enhancement of sugar sensitivity. These results may suggest that different internal states (courtship failure vs. starvation) modulate the peripheral sensory system via different signaling pathways (e.g. different subsets of dopaminergic neurons, different dopamine release mechanisms, and different dopamine receptors). We will further discuss these possibilities in the revised manuscript.

      The data presented by Wang et al. are solid and effectively support their conclusions. However, certain aspects of their experimental design, data analysis, and interpretation warrant further review, as outlined below.

      (1) The authors did not explicitly indicate the feeding status of the flies, but it appears they were not starved. However, the naive and satisfied flies in this study displayed high feeding and PER baselines, similar to those observed in starved flies in other studies. This raises the concern that sexually failed flies may have consumed additional food during the 4.5-hour conditioning period, potentially lowering their baseline hunger levels and subsequently reducing PER responses. This alternative explanation is worth considering, as an earlier study demonstrated that sexually deprived males consumed more alcohol, and both alcohol and food are known rewards for flies. To address this concern, the authors could remove food during the conditioning phase to rule out its influence on the results.

      We think this is a valid concern. We will conduct courtship conditioning in the absence of food and test if courtship failure can still suppress sugar sensitivity in the revised manuscript.

      (2) Figure 1B reveals that approximately half of the males in the Failed group did not consume sucrose, yet Figure 1-S1A suggests that the total volume consumed remained unchanged. Were the flies that did not consume sucrose omitted from the dataset presented in Figure 1-S1A? If so, does this imply that only half of the male flies experience sexual failure, or that sexual failure affects only half of males while the others remain unaffected? The authors should clarify this point.

      Here is a brief clarification of our experimental design and we will further clarify the details in the revised manuscript:

      After the behavioral conditioning, male flies were divided for two assays. On the one hand, we quantified PER responses of individual flies. As shown in Figure 1C, Failed males exhibited decreased sweet sensitivity (as demonstrated by the right shift of the response curve).

      On the other hand, we sought to quantify food consumption of individual flies by using the MAFE assay (Qi et al 2005). When presented with 400 mM sucrose, approximately 100% of the flies in the Naïve and Satisfied groups, and 50% of the flies in the Failed group, extended their proboscis and started feeding (Figure 1B). For these flies, we could quantify the consumed volumes and found there was no change (Figure 1, S1A). We should also note the consistency of these two experiments, e.g. in Figure 1C, only 50-60% of Failed males responded to 400 mM stimulation.  

      These two experiments in combination suggest that sexual failure suppressed sweet sensitivity of the Failed males. Meanwhile, as long as they still initiated feeding, the volume of food consumption remained unchanged. These results led us to focus on the modulatory effect of sexual failure on the sensory system, the main topic of this present study.

      In addition, to further clarify the potential misunderstanding, we plan to examine food consumption by using 800 mM sucrose in the revised manuscript. As shown in Figure 1C, 800 mM sucrose was adequate to induce feeding in ~100% of the flies.

      (3) The evidence linking TH-GAL4 labeled dopaminergic neurons to reduced sugar sensitivity in Gr5a+ neurons in sexually failed males could be further strengthened. Ideally, the authors would have activated TH-GAL4 neurons and observed whether this restored GCaMP responses in Gr5a+ neurons in sexually failed males. Instead, the authors performed a less direct experiment, shown in Figures 3-S1C and D. The manuscript does not describe the condition of the flies used in this experiment, but it appears that they were not sexually conditioned. I have two concerns with this experiment. First, no statistical analysis was provided to support the enhancement of sucrose responses following activation of TH-GAL4 neurons. Second, without performing this experiment in sexually failed males, the authors lack direct evidence to confirm that the dampened response of Gr5a+ neurons to sucrose results from decreased activity in TH-GAL4 neurons.

      We think this is also a valid suggestion. We will directly examine whether activating TH<sup>+</sup> neurons in sexually failed males enhances the sugar responses of their Gr5a<sup>+</sup> neurons. We will also add statistical analysis.

      Nevertheless, we would still argue that our current experiments using Naive males (Figure 3, S1C-D) are adequate to show a functional link between TH<sup>+</sup> neurons and Gr5a<sup>+</sup> neurons. Combined with the results that these neurons form active synapses (Figure 3, S1B) and that the activity of TH<sup>+</sup> neurons was dampened in sexually failed males (Figure 3G-I), our current data support the notion that sexual failure suppresses sweet sensitivity via the TH-Gr5a circuitry.

      (4) The statistical methods used in this study are poorly described, making it unclear which method was used for each experiment. I suggest that the authors include a clear description of the statistical methods used for each experiment in the figure legends. Furthermore, as I have pointed out, there is a lack of statistical comparisons in Figures 3-S1C and D; a similar problem exists for Figures 6E and F.

      We will add detailed information on the statistical analysis to each figure legend.

      (5) The experiments in Figure 5 lack specificity. The target neurons in this study are Gr5a+ neurons, which are directly involved in sugar sensing. However, the authors used the less specific Dop1R1- and Dop2R-GAL4 lines for their manipulations. Using Gr5a-GAL4 to specifically target Gr5a+ neurons would provide greater precision and ensure that the observed effects are directly attributable to the modulation of Gr5a+ neurons, rather than being influenced by potential off-target effects from other neuronal populations expressing these dopamine receptors.

      We agree with the reviewer that manipulating Dop1R1 and Dop2R genes (Figure 4) and the neurons expressing them (Figure 5) might have broader impacts. In fact, we have also tested the role of Dop1R1 and Dop2R in Gr5a<sup>+</sup> neurons by RNAi experiments (Figure 6). As shown by both behavioral and calcium imaging experiments, knocking down Dop1R1 and Dop2R in Gr5a<sup>+</sup> neurons both eliminated the effect of sexual failure to dampen sweet sensitivity, further confirming the role of these two receptors in Gr5a<sup>+</sup> neurons.

      (6) I found the results presented in Fig. 6F puzzling. The knockdown of Dop2R in Gr5a+ neurons would be expected to decrease sucrose responses in naive and satisfied flies, given the role of Dop2R in enhancing sweet sensitivity. However, the figure shows an apparent increase in responses across all three groups, which contradicts this expectation. The authors may want to provide an explanation for this unexpected result.

      We agree that there might be some potential discrepancies. However, our current data are not adequate to clarify this, given that the experiments shown in Figure 6E-F and the apparent control (Figure 3C) were not conducted under identical settings at the same time (which is why we did not directly compare these results). One way to address the issue is to repeat these calcium imaging experiments with a head-to-head comparison with the control group (Gr5a-GCaMP, +/- Dop1R1 and Dop2R RNAi). We will conduct the experiments and present the data in the revised manuscript.

      (7) In several instances in the manuscript, the authors described the effects of silencing dopamine signaling pathways or knocking down dopamine receptors in Gr5a neurons with phrases such as 'no longer exhibited reduced sweet sensitivity' (e.g., L269 and L288), 'prevent the reduction of sweet sensitivity' (e.g., L292), or 'this suppression was reversed' (e.g. L299). I found these descriptions misleading, as they suggest that sweet sensitivity in naive and satisfied groups remains normal while the reduction in failed flies is specifically prevented or reversed. However, this is not the case. The data indicate that these manipulations result in an overall decrease in sweet sensitivity across all groups, such that a further reduction in failed flies is not observed. I recommend revising these descriptions to accurately reflect the observed phenotypes and avoid any confusion regarding the effects of these manipulations.

      We will change our expressions in the revised manuscript. In brief, we think that these manipulations (suppressing Dop1R1<sup>+</sup> and Dop2R<sup>+</sup> neurons) have two consequences: suppressing the overall sweet sensitivity and eliminating the effect of sexual failure.

      Reviewer #2 (Public review):

      Summary:

      The authors exposed naïve male flies to different groups of females, either mated or virgin. Male flies can successfully copulate with virgin females; however, they are rejected by mated females. This rejection reduces sugar preference and sensitivity in males. Investigating the underlying neural circuits, the authors show that dopamine signaling onto GR5a sensory neurons is required for reduced sugar preference. GR5a sensory neurons respond less to sugar exposure when they lack dopamine receptors.

      Strengths:

      The findings add another strong phenotype to the existing dataset about brain-wide neuromodulatory effects of mating. The authors use several state-of-the-art methods, such as activity-dependent GRASP to decipher the underlying neural circuitry. They further perform rigorous behavioral tests and provide convincing evidence for the local labellar circuit.

      Weaknesses:

      The authors focus on the circuit connection between dopamine and gustatory sensory neurons in the male SEZ. Therefore, it is still unknown how mating modulates dopamine signaling and what possible implications on other behaviors might result from a reduced sugar preference.

      We agree with the reviewer that in the current study, we did not examine how mating experience suppressed the activity of dopaminergic neurons in the SEZ. The current study mainly focused on the behavioral characterization (sexual failure suppresses sweet sensitivity) and the downstream mechanism (TH-Gr5a pathway). We think that examining the upstream modulatory mechanism may be more suitable for a separate future study.

      We believe that a sustained reduction in sweet sensitivity upon sexual failure (not limited to sucrose but extending to other sweet compounds, Figure 1, S1B-C) suggests a generalized and sustained consequence on reward-related behaviors. Sexual failure may thus resemble a state of “primitive emotion” in fruit flies. We will further discuss this possibility in the revised manuscript.

      Reviewer #3 (Public review):

      Summary

      In this work, the authors asked how mating experience impacts reward perception and processing. For this, they employ fruit flies as a model, with a combination of behavioral, immunostaining, and live calcium imaging approaches.

      Their study allowed them to demonstrate that courtship failure decreases the fraction of flies motivated to eat sweet compounds, revealing a link between reproductive stress and reward-related behaviors. This effect is mediated by a small group of dopaminergic neurons projecting to the SEZ. After courtship failure, these dopaminergic neurons exhibit reduced activity, leading to decreased Gr5a+ neuron activity via Dop1R1 and Dop2R signaling, and leading to reduced sweet sensitivity. The authors therefore showed how mating failure influences broader behavioral outputs through suppression of the dopamine-mediated reward system and underscores the interactions between reproductive and reward pathways.

      Concern

      My main concern regarding this study lies in the way the authors chose to present their results. If I understood correctly, they provided evidence that mating failure induces a decrease in the fraction of flies exhibiting PER. However, they also showed that food consumption was not affected (Fig. 1, supplement), suggesting that individuals who did eat consumed more. This raises questions about the analysis and interpretation of the results. Should we consider the group as a whole, with a reduced sensitivity to sweetness, or should we focus on individuals, with each one eating more? I am also concerned about how this could influence the results obtained using live imaging approaches, as the flies being imaged might or might not have been motivated to eat during the feeding assays. I would like the authors to clarify their choice of analysis and discuss this critical point, as the interpretation of the results could potentially be the opposite of what is presented in the manuscript.

      Here is a brief clarification of our experimental design and we will further clarify the details in the revised manuscript:

      After the behavioral conditioning, male flies were divided for two assays. On the one hand, we quantified PER responses of individual flies. As shown in Figure 1C, Failed males exhibited decreased sweet sensitivity (as demonstrated by the right shift of the response curve).

      On the other hand, we sought to quantify food consumption of individual flies by using the MAFE assay (Qi et al 2005). When presented with 400 mM sucrose, approximately 100% of the flies in the Naïve and Satisfied groups, and 50% of the flies in the Failed group, extended their proboscis and started feeding (Figure 1B). For these flies, we could quantify the consumed volumes and found there was no change (Figure 1, S1A). We should also note the consistency of these two experiments, e.g. in Figure 1C, only 50-60% of Failed males responded to 400 mM stimulation.  

      These two experiments in combination suggest that sexual failure suppressed sweet sensitivity of the Failed males. Meanwhile, as long as they still initiated feeding, the volume of food consumption remained unchanged. These results led us to focus on the modulatory effect of sexual failure on the sensory system, the main topic of this present study.

      In addition, to further clarify the potential misunderstanding, we plan to examine food consumption by using 800 mM sucrose instead. As shown in Figure 1C, 800 mM sucrose was adequate to induce feeding in ~100% of the flies.
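The distinction drawn in this response, between the fraction of flies that initiate feeding and the volume consumed by those that do, can be illustrated with a minimal sketch. The per-fly volumes below are hypothetical placeholders chosen for easy arithmetic, not values from the manuscript, and `summarize` is an illustrative helper rather than part of the authors' analysis.

```python
from statistics import mean

def summarize(volumes):
    """Return (fraction feeding, mean volume among feeders, mean volume over all flies)."""
    feeders = [v for v in volumes if v > 0]  # flies that extended the proboscis and fed
    fraction = len(feeders) / len(volumes)
    mean_feeders = mean(feeders) if feeders else 0.0
    return fraction, mean_feeders, mean(volumes)

# Hypothetical per-fly volumes (arbitrary units); 0.0 marks a fly that did not feed.
naive = [0.25, 0.25, 0.25, 0.25]   # every fly feeds
failed = [0.25, 0.0, 0.25, 0.0]    # half of the flies feed

naive_summary = summarize(naive)
failed_summary = summarize(failed)
print(naive_summary)
print(failed_summary)
```

Under these assumed numbers the mean volume among feeders is identical in both groups, while the fraction feeding and the mean over all flies are halved in the Failed group. An unchanged per-feeder volume therefore does not imply that the flies that fed consumed more.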

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Krishnan et al devised three paradigms to perform contextual fear conditioning in head-fixed mice. Each of the paradigms relied on head-fixed mice running on a treadmill through virtual reality arenas. The authors tested the validity of three versions of the paradigms by using various parameters. As described below, I think there are several issues with the way the paradigms are designed and how the data are interpreted. Moreover, as Paradigm 3 was published previously in a study by the same group, it is unclear to me what this manuscript offers beyond the validations of parameters used for the previous publication. Below, I list my concerns point-by-point, which I believe need to be addressed to strengthen the manuscript.

      Major comments

      (1) In the analysis using the LME model (Tables 1 and 2), I am left wondering why the mice had increased freezing across recall days as well as increased generalization (increased freezing to the familiar context, where shock was never delivered). Would the authors expect freezing to decrease across recall days, since repeated exposure to the shock context should drive some extinction? This is complicated by the analysis showing that freezing was increased only on retrieval day 1 when analyzing data from the first lap only. Since reward (e.g., motivation to run) is removed during the conditioning and retrieval tests, I wonder if what the authors are observing is related to decreased motivation to perform the task (mice will just sit, immobile, not necessarily freezing per se). I think that these aspects need to be teased out.

      (2) Related to point 1, the authors actually point out that these changes could be due to the loss of the water reward. So, in line 304, is it appropriate to call this freezing? I think it will be very important for the authors to exactly define and delineate what they consider as freezing in this task, versus mice just simply sitting around, immobile, and taking a break from performing the task when they realize there is no reward at the end.

      (3) In the second paradigm, mice are exposed to both novel and (at the time before conditioning) neutral environments just before fear conditioning. There is a big chance that the mice are 'linking' the memories (Cai et al 2016) of the two contexts such that there is no difference in freezing in the shock context compared to the neutral context, which is what the authors observe (Lines 333-335). The experiment should be repeated such that exposure to the contexts does not occur on the conditioning day.

      (4) On lines 360-361, the authors conclude that extinction happens rapidly, within the first lap of the VR trial. To my understanding, that would mean that extinction would happen within the first 5-10 seconds of the test (according to Figure S1E). That seems far too fast for extinction to occur, as this never occurs in freely behaving mice this quickly.

      (5) Throughout the different paradigms, the authors are using different shock intensities. This can lead to differences in fear memory encoding as well as in levels of fear memory generalization. I don't think that comparisons can be made across the different paradigms as too many variables (including shock intensity - 0.5/0.6mA can be very different from 1.0 mA) are different. How can the authors pinpoint which works best? Indeed, they find Paradigm 3 'works' better than Paradigm 2 because mice discriminate better between the neutral and shock contexts. This can definitely be driven by decreased generalization from using a 0.6mA shock in Paradigm 3 compared to 1.0 mA shock in Paradigm 2.

      (6) There are some differences in the calcium imaging dataset compared to other studies, and the authors should perform additional testing to determine why. This will be integral to validating their head-fixed paradigm(s) and showing they are useful for modeling circuit dynamics/behaviors observed in freely behaving mice. Moreover, the sample size (number of mice) seems low.

      (7) It appears that the authors have already published a paper using Paradigm 3 (Ratigan et al 2023). If they already found a paradigm that is published and works, it is unclear to me what the current manuscript offers beyond that initial manuscript.

      (8) As written, the manuscript is really difficult to follow because averages and standard errors are reported inconsistently in the text: sometimes they are given and other times they are not. Cleaning this reporting up throughout the paper would greatly improve the flow of the text and the qualitative description of the results.

    1. Reviewer #2 (Public review):

      Summary:

      The authors developed a deep learning model based on a DenseNet CNN architecture to predict two cognitive functions: working memory and episodic memory, from functional connectivity matrices. These matrices were recorded under three conditions: during rest, a working memory task, and a movie, and were treated as images for the CNN algorithm. They tested their model's performance across different conditions and a separate dataset with a different age distribution (using the same MRI scanner, scanning configurations, and cognitive tests). They also calculated the "brain cognition gap" based on the model trained on resting functional connectivity to predict working memory. Extending from the commonly used index "brain age," the brain cognition gap was defined as the difference between the working memory score predicted by their model (predicted working memory) and the working memory score based on the working memory test itself (observed working memory). This brain cognition gap was found to be associated with physical activity, education, and cardiovascular risk. The authors also conducted additional mediation tests to examine whether regional functional variability mediated the relationship between PET-derived measures of dopamine and the brain cognition gap.

      Strengths:

      The major strength of this manuscript is the extensive effort the authors have put into creating a new 'biomarker' that links deep learning with fMRI, PET, physical activity, education, and cardiovascular risk across two studies. This effort is impressive.

      Weaknesses:

      There are several weaknesses in the current methods and results, making many of the claims unconvincing. These weaknesses include:

      (1) The lack of baseline models to benchmark the predictive performance of their DenseNet models.

      (2) The inappropriate calculation of the brain cognition gap due to the lack of control for regression-toward-the-mean and the influence of the working memory itself (a common practice in brain age studies).

      (3) The lack of benchmarking of the brain cognition gap against the 'corrected' brain age gap and the direct prediction of physical activity, education, and cardiovascular risk.

      (4) Minimal justification for their PET mediation analysis.

      Regarding the impact of the work on the field and the utility of the methods and data to the community, I see its potential. However, addressing all the weaknesses listed above is crucial and likely to change the conclusions of the results.

      It is important to note that many statements in the manuscript are overstated, making the contribution of the manuscript seem exaggerated.

      For instance, the abstract claims "there is a lack of objective biomarkers to accurately predict cognitive function," and the discussion states, "across various studies, the correlation between predicted and actual fluid intelligence typically hovers around 0.25 (98-100)." However, a meta-analysis by Vieira and colleagues (2022 https://doi.org/10.1016/j.intell.2022.101654) found over 37 studies up to 2020 predicting cognitive abilities from fMRI with machine learning, with 24 studies published in 2019-20 alone. Since 2020, with the rise of machine learning and AI, even more studies have likely been published on this topic, all claiming to show objective biomarkers to accurately predict cognitive function. Vieira and colleagues also found an average performance of these objective biomarkers in predicting general cognition at r = .42, similar to what was found in this manuscript. Based on this alone, it is unclear how novel or superior their method is without a proper systematic benchmark.

      Similarly, the authors claim superior performance of deep learning and mischaracterize machine learning algorithms: "In particular, deep neural networks (DNN) methods have been successfully applied to behavioral and disease prediction (24-26), and have been found to outperform other machine learning approaches (27-29)," and "Deep learning approaches overcome the limitation of predictive techniques that solely rely on linear associations between connectivity and behavioral phenotypes (17)." However, the superiority of deep learning is debatable. Studies show comparable performance between machine learning (such as kernel regression) and deep learning (such as fully-connected neural networks, BrainNetCNN, Graph CNN (GCNN), and temporal CNN), e.g., He and colleagues (2019; https://doi.org/10.1016/j.neuroimage.2019.116276) and Vieira and colleagues (2024; https://doi.org/10.1101/2024.03.07.583858).

      Moreover, many non-deep learning predictive techniques are non-linear, e.g., XGBoost, CatBoost, random forest, kernel ridge, and support vector regression with non-linear kernels (such as RBF and polynomial). Thus, stating that machine learning can only model linear relationships is incorrect. Moreover, for the small amount of data the authors had, some might argue that a linear algorithm might be more appropriate to balance the bias-variance trade-off in prediction. Again, without a proper systematic benchmark, it is unclear how well their DenseNet algorithm performs compared to other algorithms.

      Regarding the Brain Age literature, the authors also misinterpreted recent findings: "However, a recent study suggests that brain age predictions contribute minimally compared to chronological age for explaining cognitive decline (65), implying that cognitive predictions are more reliable." In this study, Tetereva and colleagues (2024) (https://doi.org/10.7554/eLife.87297.4) showed that non-deep-learning machine learning can make good predictions from MRI on both chronological age (with r up to .88) and fluid cognition (with r up to .627). Using the combination of functional connectivity matrices across rest and tasks to predict fluid cognition, they found performance at r = .565, comparable to what was found in the current manuscript with deep learning. Nonetheless, while brain age predicted chronological age well (and brain cognition predicted fluid cognition well), it was problematic to predict fluid cognition from brain age. They showed that, because brain age, by design, shared so much common variance with chronological age, brain age and chronological age captured the same variance of fluid cognition. When chronological age was controlled for in the prediction of fluid cognition, brain age no longer had high predictive ability. In the case of the current manuscript, the brain cognition gap is not appropriately controlled for cognition (to be more precise, a working memory score). I expect the performance in predicting physical activity, education, and cardiovascular risk will drop dramatically once cognition is controlled for. There are at least two ways to control cognition according to Tetereva and colleagues' study (see more in the recommendations).
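The regression-toward-the-mean concern can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the authors' or the reviewer's procedure: the shrinkage factor, noise level, and sample size are arbitrary assumptions, and residualising the gap on the observed score is only one standard way of controlling for it (the reviewer's own recommended procedures are not reproduced here).

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residualize(ys, xs):
    """Residuals of ys after ordinary least-squares regression on xs."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic "observed" cognition scores (assumption: standard normal).
observed = [rng.gauss(0.0, 1.0) for _ in range(500)]

# An imperfect model: predictions are shrunk toward the mean, plus noise.
predicted = [0.5 * o + rng.gauss(0.0, 0.5) for o in observed]

# Uncorrected gap, defined as predicted minus observed.
raw_gap = [p - o for p, o in zip(predicted, observed)]

# "Corrected" gap: what is left after regressing the gap on the observed score.
corrected_gap = residualize(raw_gap, observed)

r_raw = pearson(raw_gap, observed)
r_corrected = pearson(corrected_gap, observed)
print(f"raw gap vs observed:       r = {r_raw:.3f}")
print(f"corrected gap vs observed: r = {r_corrected:.3f}")
```

In this toy setting the uncorrected gap is strongly negatively correlated with the observed score purely by construction, so any variable related to cognition will appear related to the gap; the residualised gap removes that dependence.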

      The authors mentioned, "The third aim of the current study is to uncover the contribution of dopamine (DA) integrity to brain-cognition gaps." However, I fail to see how mediation analysis would test this. The authors also mentioned, "Insufficient DA modulation can affect neurocognitive functions detrimentally (69, 74, 76-78)." They should test whether DA levels are related to working memory scores in their study and, if so, whether the relationship is mediated by the "corrected" brain-cognition gaps. See the recommendations for more on the calculation of the "corrected" brain-cognition gaps.

    2. Reviewer #3 (Public review):

      Summary:

      This paper by Esmaeili and co-authors presents a connectome prediction study to predict episodic memory and relate prediction errors to other phenotypic variables.

      Strengths:

      (1) A primary and external validation dataset.

      (2) Novel use of prediction errors (i.e., brain-cognitive gap).

      (3) A wide range of data was investigated.

      Weaknesses:

      (1) Lack of comparisons to other methods for prediction.

      (2) Several different points are being investigated that don't allow any particular one to shine through.

      (3) Some choices of analysis are not well-motivated.

      (4) How do the n-back connectomes perform for prediction if the authors do not regress task activations from the n-back task?

      (5) I am a little concerned about overfitting with the convolutional neural net. For example, the drop-off in prediction performance in the external sample is stark. How does the deep learning approach used here compare to something simpler, like a connectome-based predictive model or ridge regression?

      (6) It may be nice to try the other models in the validation dataset. This would also provide a sense of how much overfitting may be going on.

      (7) While predictive models increase the power over association studies, they still require large samples to prevent overfitting. Do the authors have a sense of the power their main and external validation sample sizes provide?

      (8) I am not sure that the Mann-Whitney is the correct test for comparing the distributions of prediction performances. The distributions are dependent on each other as they are each predicting the same outcomes. Using the typical degrees of freedom formula would overestimate the degrees of freedom.

      (9) The brain cognition gap is interesting. It is very similar conceptually to the brain age gap. When associating the brain age gap with other phenotypes, typically age is regressed from the brain age gap and the other phenotype. In other words, age is typically associated with a brain age gap as individuals at the tail ages often show the largest gaps. Is the brain cognition gap correlated with episodic memory and do the group differences hold if episodic memory is controlled for?

      (10) I have the same question for the dopamine results. Particularly, in the correlations that are divided by brain cognition gap sign. I could see these types of patterns arise due to a correlation with a third variable.
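Point (8) above concerns comparing prediction performances that are paired rather than independent. A minimal sketch of one test that respects the pairing, a sign-flip permutation test on paired differences, is given below. This test is not named by the reviewer; it is swapped in here as an illustration, and the performance values are made up. It handles pairing only: repeated cross-validation estimates also share training data, which this sketch does not correct for.

```python
import random

def paired_sign_flip_test(perf_a, perf_b, n_perm=10000, seed=0):
    """Two-sided sign-flip permutation test on paired differences.

    perf_a and perf_b hold performances (e.g. correlations) of two models
    evaluated on the same splits, so element i of each list is a pair.
    Returns a permutation p-value for the mean difference.
    """
    if len(perf_a) != len(perf_b):
        raise ValueError("paired test needs equally long inputs")
    diffs = [a - b for a, b in zip(perf_a, perf_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        # Under the null, each paired difference is equally likely to be + or -.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / n) >= observed:
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return (extreme + 1) / (n_perm + 1)

# Hypothetical performances of two models on the same ten splits.
model_a = [0.41, 0.39, 0.44, 0.40, 0.42, 0.43, 0.38, 0.45, 0.41, 0.40]
model_b = [0.36, 0.35, 0.40, 0.37, 0.36, 0.40, 0.35, 0.39, 0.38, 0.36]

p_value = paired_sign_flip_test(model_a, model_b)
print(f"paired sign-flip p-value: {p_value:.4f}")
```

Because the test works on within-pair differences, it does not rely on a degrees-of-freedom formula that assumes independent samples.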

    1. To structure a summary document for a briefing, it is essential to understand how the school form (forme scolaire) shapes pupils' well-being, while incorporating critical perspectives and pedagogical alternatives.

      Henri Louis Go's analysis in "La fabrique des écoliers, ou l'enfance enfermée" offers a relevant framework for examining the evolution of the school and its impact on childhood.

      The school as a space of confinement and alienation

      Historically, the school form took shape as a space for the "regimentation" of children, subjecting them to a "social orthopedics".

      The republican school, heir to Christian educational models, established strict discipline and a separation between school and life.

      This separation, defended by figures such as Jules Ferry and illustrated by the "monastic simplicity" advocated by the philosopher Alain, aimed to focus pupils' attention on schoolwork, often seen as an end in itself.

      This approach led to a "school confinement" in which the outside world is excluded from the classroom, creating an artificial and alienating environment for pupils.

      Foucault stresses that the school, like other disciplinary institutions, exercises power over bodies and minds, normalizing behavior and stifling individuality.

      The immobilization of bodies, the subjugation of speech, and the alienation of pupils' time contribute to a "great confinement" of youth, producing widespread malaise.

      The Arendtian critique and its contradictions

      Hannah Arendt criticizes modern education for having created a "world of childhood" separate from the adult world.

      She holds that the school must prepare children to understand the world and to take on their civic responsibilities, but warns against any attempt to politicize the school.

      For Arendt, the school as an institution is meant to stand between the family and the world to provide a transition; it belongs to the prepolitical sphere, where children neither can nor should take part in the world.

      However, Go points out the contradictions in Arendt's thinking, notably the opposition between an illusory "child's world" and the exclusion of the child from the adult world.

      Arendt disputes the existence of a "child's world" while deploring the exclusion of children from the adult world, which creates a tension in her argument.

      Pedagogical alternatives and rebuilding the school form

      In response to the limits of the classical school form, pedagogical alternatives have emerged, such as Freinet pedagogy.

      The École Freinet sees itself as a "conservatory for childhood", a place where children can flourish and develop their potential.

      Unlike the traditional school, it seeks to open the school to life by encouraging practical activities, exchanges with the surrounding environment, and children's free expression.

      Élise and Célestin Freinet insist on the importance of taking the "world of childhood" into account, preserving children's sensitivity and creativity.

      They seek to treat problems dialectically, taking into account both the need to protect childhood and the need to develop pupils' critical thinking.

      The École Freinet is neither a closed space shielded from outside noise nor a merely intermediate space that the world would enter only in represented form.

      Freinet pays particular attention to the material conditions of pupils' lives, holding that the layout of the premises, access to nature, and the quality of food are essential to children's well-being and flourishing.

      Courses of action for the briefing

      To structure your briefing, you could address the following points:

      1. Historical evolution of the school form:

        • Analyze how the school form has evolved since the 19th century, highlighting continuities and breaks.
        • Examine how material conditions and disciplinary practices have shaped pupils' lived experience.
      2. Critiques of the classical school form:

        • Present the critiques that Foucault, Arendt, and other authors have made of the classical school form.
        • Analyze the notions of confinement, alienation, and normalization, and their effects on pupils' well-being.
      3. Pedagogical alternatives and paths to reconstruction:

        • Explore alternatives to the classical school form, such as Freinet pedagogy and the New Education movement (éducation nouvelle).
        • Highlight the key principles of these pedagogies, such as openness to life, cooperation, free expression, and respect for childhood.
      4. Material conditions and well-being:

        • Stress the importance of the material conditions of pupils' lives for their well-being and flourishing.
        • Propose concrete courses of action to improve the layout of the premises, access to nature, and the quality of food.
      5. Rebuilding the school as an emancipatory place to live:

        • How can the school be rethought as an emancipatory place to live, where pupils can develop a desire to learn and open up to the future?
        • Foster a pedagogical approach that takes children's needs and aspirations into account while preparing them to become active and responsible citizens.
        • Encourage experimentation with new forms of school organization that are more open, cooperative, and respectful of childhood.

      By exploring these questions, your briefing will make it possible to nuance the ideology of well-being at school and to propose concrete courses of action for improving pupils' school life.

    1. Here are the key elements to consider, organized by theme to make them easier to understand and present.

      Key concepts and definitions

      The distinction between methodology and methods, and the notion of creative research.

      • Methodology: A broad, coherent framework, guided by the researcher's values and beliefs, that influences choices throughout the research.
      • Methods: Tools used at different levels of the research (design, collection, analysis, writing, presentation). The same tool, such as the interview, can be used with different methodologies.
      • Creative research: Work that aims to build new knowledge by adapting conventions and grounding itself in the social world.

      Creativity is not synonymous with innovation and is not reserved for art or the social sciences.

      Research-creation (which starts from artistic practices) must also be distinguished from creative research methodology (which uses a creative practice at different stages of the research).

      Types of creative research methods

      There is a wide range of approaches, often arts-based, that aim to disrupt habits, prompt change, seek empathy, and explore participants' identity. Hélène Kara (2020) identifies five main types of methods:

      1. Arts-based methods: Including the visual, performing, and textile arts.
      2. Embodied research: Involving the whole body, particularly for themes related to touch or listening.
      3. Technology-based research: Using social media, apps, games, and videos.
      4. Multimodal approaches: Combining different qualitative and quantitative methods.
      5. Transformative approaches: Aiming to question or change a status quo, in connection with participatory, action, or community-based research.

      Benefits and challenges of creative approaches

      These methods offer several advantages:

      • Making it easier to hear multiple points of view.
      • Improving researchers' reflexivity.
      • A holistic view of experiences.
      • Promoting participants' empowerment.
      • Positioning the participant as an expert.
      • Disrupting power relations.

      However, they also raise significant challenges:

      • Managing the researcher-participant relationship.
      • Producing and analyzing data in different forms (2D, 3D, audio, etc.).
      • Making the results accessible to participants and non-participants.
      • Mobilizing knowledge while preserving its complexity.

      Concrete examples

      The presentation illustrates the use of these methods through a study of the transition to school of young children with special educational needs. Two examples are described in particular detail:

      1. Mosaic approach with the children:

        • A participatory method in which children are treated as experts.
        • Based on six core components: participation, reflexivity, adaptability, experience, integration into practice, and multi-method design.
        • Use of varied tools: puppet, microphone, camera, creative materials.
        • Data collected through observations, dialogues, photos, and creations.
      2. Creative focus groups with professionals:

        • Use of the journey metaphor to facilitate participation.
        • Provision of varied art materials (paper, colors, scissors, glue, etc.).
        • Data analysis in four stages: description of the creations, familiarization with the presentations, coding, and development of analytic narratives.
        • Attention to the visual, textual, and sensory aspects of the creations.

      Strengths and questions

      The researchers highlight three main strengths in their use of creative approaches:

      • Promoting meaningful and ethical participation.
      • Creating holistic and empathetic portraits.
      • The importance of visual representations in every phase of the research.

      They also mention important questions that stay with them:

      • How can ethical dilemmas be recognized and confronted?
      • To what extent do the materials provided help participants make meaning?
      • How can the relevance of co-created tools to the different stakeholders be ensured?

      By structuring your briefing around these themes, you will offer a clear and thorough overview of the creative research methods presented, while highlighting their advantages, their challenges, and the questions they raise.

    1. Table 3. Stimuli Selections.

       Stimuli pair          Total selections   Square selections   Kanizsa selections   Control selections   Significance at p = .05
       Kanizsa vs. Control   5                  –                   5                    0                    p < .05*
       Control vs. Square    4                  3                   –                    1                    p = .40
       Kanizsa vs. Square    7                  5                   2                    –                    p = .29

       Note. Summed across the nine cats, each stimuli pair was presented 18 times. Asterisk (*) indicates significance at the p = .05 level.
       G.E. Smith et al.

      This figure is essential to previewing the text because it shows what the authors used for their data: in this case, a table with straightforward rows and columns. It shows the number of selections.

    1. static inline bool
       should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
                            enum compact_result compact_result,
                            enum compact_priority *compact_priority,
                            int *compaction_retries)
       {
           int max_retries = MAX_COMPACT_RETRIES;
           int min_priority;
           bool ret = false;
           int retries = *compaction_retries;
           enum compact_priority priority = *compact_priority;

           if (!order)
               return false;

           if (fatal_signal_pending(current))
               return false;

           /*
            * Compaction was skipped due to a lack of free order-0
            * migration targets. Continue if reclaim can help.
            */
           if (compact_result == COMPACT_SKIPPED) {
               ret = compaction_zonelist_suitable(ac, order, alloc_flags);
               goto out;
           }

           /*
            * Compaction managed to coalesce some page blocks, but the
            * allocation failed presumably due to a race. Retry some.
            */
           if (compact_result == COMPACT_SUCCESS) {
               /*
                * !costly requests are much more important than
                * __GFP_RETRY_MAYFAIL costly ones because they are de
                * facto nofail and invoke OOM killer to move on while
                * costly can fail and users are ready to cope with
                * that. 1/4 retries is rather arbitrary but we would
                * need much more detailed feedback from compaction to
                * make a better decision.
                */
               if (order > PAGE_ALLOC_COSTLY_ORDER)
                   max_retries /= 4;

               if (++(*compaction_retries) <= max_retries) {
                   ret = true;
                   goto out;
               }
           }

           /*
            * Compaction failed. Retry with increasing priority.
            */
           min_priority = (order > PAGE_ALLOC_COSTLY_ORDER) ?
                   MIN_COMPACT_COSTLY_PRIORITY : MIN_COMPACT_PRIORITY;

           if (*compact_priority > min_priority) {
               (*compact_priority)--;
               *compaction_retries = 0;
               ret = true;
           }
       out:
           trace_compact_retry(order, priority, compact_result, retries,
                               max_retries, ret);
           return ret;
       }

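      This function decides whether the allocator should retry memory compaction during page allocation.
      1. If the allocation order is 0, or a fatal signal is pending, do not retry.
      2. If compaction was skipped for lack of free order-0 pages, retry only if memory reclaim can help.
      3. If compaction succeeded but the allocation still failed, retry up to MAX_COMPACT_RETRIES times, or a quarter of that for costly orders (order > PAGE_ALLOC_COSTLY_ORDER).
      4. Otherwise compaction failed (or the retries ran out): step to the next priority level and reset the retry counter, unless the minimum priority for this order has already been reached.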
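The decision logic can be restated as a small model to make the branch order easier to follow. This is a sketch under stated assumptions, not kernel code: the numeric constants are illustrative values chosen for the example, `reclaim_can_help` is a hypothetical stand-in for the result of `compaction_zonelist_suitable()`, and the fatal-signal check and tracing are left out.

```python
from enum import Enum

class CompactResult(Enum):
    SKIPPED = "skipped"
    SUCCESS = "success"
    FAILED = "failed"  # stands in for every other compaction result

# Illustrative values, assumed for this sketch (not taken from the quoted code).
MAX_COMPACT_RETRIES = 16
PAGE_ALLOC_COSTLY_ORDER = 3
MIN_COMPACT_PRIORITY = 0
MIN_COMPACT_COSTLY_PRIORITY = 1

def should_compact_retry(order, result, priority, retries, reclaim_can_help=False):
    """Model of the retry decision.

    Returns (retry, new_priority, new_retries). A numerically lower
    priority value means a more aggressive compaction mode.
    """
    if order == 0:
        return False, priority, retries

    # Compaction was skipped: only worth retrying if reclaim can help.
    if result is CompactResult.SKIPPED:
        return reclaim_can_help, priority, retries

    costly = order > PAGE_ALLOC_COSTLY_ORDER

    # Compaction succeeded but the allocation still failed: bounded retries.
    if result is CompactResult.SUCCESS:
        max_retries = MAX_COMPACT_RETRIES // 4 if costly else MAX_COMPACT_RETRIES
        retries += 1
        if retries <= max_retries:
            return True, priority, retries
        # Retries exhausted: fall through to priority escalation.

    # Compaction failed: escalate priority until the floor for this order.
    min_priority = MIN_COMPACT_COSTLY_PRIORITY if costly else MIN_COMPACT_PRIORITY
    if priority > min_priority:
        return True, priority - 1, 0
    return False, priority, retries

# A costly request that has used up its reduced retry budget escalates instead.
print(should_compact_retry(4, CompactResult.SUCCESS, priority=2, retries=4))
```

Returning the updated priority and retry count, rather than mutating them through pointers as the C code does, keeps the model side-effect free and easy to test.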

    1. pure

      whether it has a (yuh) sound 1. yes 2. yes 3. no 4. yes 5. no 6. yes 7. yes 8. no 9. no 10. yes 11. yes 12. no 13. no 14. yes

      yes: p, k, b, g, m, f, v, h
      no: t, d, n, th, s
      differences: place of articulation; coronals don't get a (yuh)

    Annotators

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      I would like to thank the reviewers for their comments and interest in the manuscript and the study.

      Reviewer #1

      1) I would assume that there are RNA-seq and/or ChIP-seq data out there produced after knockdown of one or more of these DBPs that show directional positioning.

      As the reviewer pointed out, wet-lab validation of the results would give more biological researchers a reason to take an interest in the study. As a next step, I plan to pursue wet-lab experiments in collaboration with experimental biologists. The same analysis can be performed in immortalized cells suitable for CRISPR experiments (e.g. Guo Y et al. Cell 2015).

      2) Figure 6 should be expanded to incorporate analysis of DBPs not overlapping CTCF/cohesin in chromatin interaction data that is important and potentially more interesting than the simple DBPs enrichment reported in the present form of the figure.

      Following the reviewer's advice, I performed the same analysis with the DNA-binding sites that do not overlap with the DNA-binding sites of CTCF and cohesin (RAD21 and SMC3) (Fig. 6 and Supplementary Fig. 4). The result showed the same tendency in the distribution of DNA-binding sites. For some DNA-binding proteins, the peak on the graph became lower after removing the DNA-binding sites that overlapped with those of CTCF and cohesin. I have added the following sentence on lines 427 and 817: For the insulator-associated DBPs other than CTCF, RAD21, and SMC3, the DNA-binding sites that do not overlap with those of CTCF, RAD21, and SMC3 were used to examine their distribution around interaction sites.

      3) Critically, I would like to see use of Micro-C/Hi-C data and ChIP-seq from these factors, where insulation scores around their directionally-bound sites show some sort of an effect like that presumed by the authors - and many such datasets are publicly-available and can be put to good use here.

      As suggested by the reviewer, I have added the insulation scores and boundary sites from the 4D Nucleome data portal as tracks in the UCSC genome browser. The insulation scores seem to correspond to some extent to the H3K27me3 histone marks from ChIP-seq (Fig. 4a and Supplementary Fig. 3). The direction of DNA-binding sites on the genome can be shown with different colors (e.g. red and green), but the directionality is an overall tendency, and it may be difficult to see it from individual binding sites.

      I found that the CTCF binding sites examined by a wet-lab experiment in the previous study may not always overlap with the boundary sites of chromatin interactions from the Micro-C assay (Guo Y et al. Cell 2015). The chromatin interaction data do not include all interactions due to the high sequencing cost of the assay. The number of boundary sites may be smaller than that of CTCF binding sites acting as insulators, and/or some of the CTCF binding sites may not be located at the boundary sites. It may be difficult for the boundary location algorithm to identify a short boundary location. Due to these limitations of the chromatin interaction data, I planned to search for insulator-associated DNA-binding proteins without using chromatin interaction data in this study. I have added the statistical summary of the analysis in lines 364-387 as follows: Overall, among 20,837 DNA-binding sites of the 97 insulator-associated proteins found at insulator sites identified by H3K27me3 histone modification marks (type 1 insulator sites), 1,315 (6%) overlapped with 264 of 17,126 5-kb long boundary sites, and 6,137 (29%) overlapped with 784 of 17,126 25-kb long boundary sites in HFF cells. Among 5,205 DNA-binding sites of the 97 insulator-associated DNA-binding proteins found at insulator sites identified by H3K27me3 histone modification marks and transcribed regions (type 2 insulator sites), 383 (7%) overlapped with 74 of 17,126 5-kb long boundary sites, and 1,901 (37%) overlapped with 306 of 17,126 25-kb long boundary sites. Although CTCF-binding sites separate active and repressive domains, only a limited number of the DNA-binding sites of insulator-associated proteins found at type 1 and 2 insulator sites overlapped boundary sites identified by chromatin interaction data. 
Furthermore, by analyzing the regulatory regions of genes, the DNA-binding sites of the 97 insulator-associated DNA-binding proteins were found (1) at the type 1 insulator sites (based on H3K27me3 marks) in the regulatory regions of 3,170 genes, (2) at the type 2 insulator sites (based on H3K27me3 marks and gene expression levels) in the regulatory regions of 1,044 genes, and (3) at insulator sites as boundary sites identified by chromatin interaction data in the regulatory regions of 6,275 genes. The boundary sites showed the highest number of overlaps with the DNA-binding sites. Comparing the insulator sites identified by (1) and (3), 1,212 (38%) genes have both types of insulator sites. Comparing the insulator sites between (2) and (3), 389 (37%) genes have both types of insulator sites. From the comparison of insulator and boundary sites, we found that (1) or (2) types of insulator sites overlapped or were close to boundary sites identified by chromatin interaction data.
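The overlap counts reported above combine two tallies: how many binding sites touch at least one boundary site, and how many distinct boundary sites are touched. A minimal sketch of that bookkeeping follows. It is not the author's pipeline: the half-open `(start, end)` interval convention, the single-chromosome simplification, and the toy coordinates are all assumptions made for illustration.

```python
def overlaps(a, b):
    """True if half-open intervals a = (start, end) and b = (start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]

def overlap_summary(sites, boundaries):
    """Summarize overlaps between binding sites and boundary sites.

    Returns (sites_overlapping, boundaries_hit, percent_of_sites), where
    percent_of_sites is rounded to the nearest whole percent.
    """
    sites_overlapping = 0
    boundaries_hit = set()
    for site in sites:
        matched = [i for i, b in enumerate(boundaries) if overlaps(site, b)]
        if matched:
            sites_overlapping += 1
            boundaries_hit.update(matched)
    percent = round(100 * sites_overlapping / len(sites)) if sites else 0
    return sites_overlapping, len(boundaries_hit), percent

# Toy coordinates on one chromosome (hypothetical values).
binding_sites = [(100, 120), (5000, 5020), (9000, 9020), (9990, 10010)]
boundary_sites = [(0, 5000), (10000, 15000)]  # two 5-kb boundary sites

summary = overlap_summary(binding_sites, boundary_sites)
print(summary)

# The same rounding applied to the counts reported in the reply.
reported_percent = round(100 * 1315 / 20837)
print(reported_percent)
```

The nested loop is quadratic and is written for clarity; genome-scale data would normally use sorted intervals or an interval tree instead.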

      4) The suggested alternative transcripts function, also highlighted in the manuscripts abstract, is only supported by visual inspection of a few cases for several putative DBPs. I believe this is insufficient to support what looks like one of the major claims of the paper when reading the abstract, and a more quantitative and genome-wide analysis must be adopted, although the authors mention it as just an 'observation'.

      Following the reviewer's comment, I performed a genome-wide analysis of alternative transcripts in which the DNA-binding sites of insulator-associated proteins are located near splice sites. The DNA-binding sites of insulator-associated DNA-binding proteins were found within 200 bp centered on splice sites significantly more often than those of the other DNA-binding proteins (Fig. 4e and Table 2). I have added the following sentences on lines 397 - 404: We performed a statistical test to estimate the enrichment of insulator-associated DNA-binding sites compared to the other DNA-binding proteins, and found that the insulator-associated DNA-binding sites were significantly more abundant at splice sites than the DNA-binding sites of the other proteins (Fig 4e and Table 2; Mann‒Whitney U test, p value < 0.05). The comparison between the splice sites at both ends of first and last introns and those of other introns showed similar statistical significance of enrichment and a similar number of splice sites with the insulator-associated DNA-binding proteins (Table 2 and Table S9).

      5) Figure 1 serves no purpose in my opinion and can be removed, while figures can generally be improved (e.g., the browser screenshots in Figs 4 and 5) for interpretability from readers outside the immediate research field.

      I believe that Figure 1 will help researchers in other fields, who are not familiar with the biological phenomena and functions, to understand the study. More explanation has been added to Figs. 4 and 5 and their legends to help readers outside the immediate research field understand the figures.

      6) Similarly, the text is rather convoluted at places and should be re-approached with more clarity for less specialized readers in mind.

      Reviewer #2's comments would be related to this comment. I have introduced a more detailed explanation of the method in the Results section, as shown in the responses to Reviewer #2’s comments.

      Reviewer #2

      1) Introduction, line 95: CTCF appears two times, it seems redundant.

      On lines 91-93, I deleted the latter CTCF from the sentence "We examine the directional bias of DNA-binding sites of CTCF and insulator-associated DBPs, including those of known DBPs such as RAD21 and SMC3".

      2) Introduction, lines 99-103: Please stress better the novelty of the work. What is the main focus? The new identified DBPs or their binding sites? What are the "novel structural and functional roles of DBPs" mentioned?

      Although CTCF is known to be the main insulator protein in vertebrates, we found that 97 DNA-binding proteins including CTCF and cohesin are associated with insulator sites by modifying and developing a machine learning method to search for insulator-associated DNA-binding proteins. Most of the insulator-associated DNA-binding proteins showed the directional bias of DNA-binding motifs, suggesting that the directional bias is associated with the insulator.

      I have added the sentence in lines 96-99 as follows: Furthermore, statistical testing of the contribution scores between the directional and non-directional DNA-binding sites of insulator-associated DBPs revealed that the directional sites contributed more significantly to the prediction of gene expression levels than the non-directional sites. I have revised the statement in lines 101-110 as follows: To validate these findings, we demonstrate that the DNA-binding sites of the identified insulator-associated DBPs are located within potential insulator sites, and some of the DNA-binding sites in the insulator site are found without the nearby DNA-binding sites of CTCF and cohesin. Homologous and heterologous insulator-insulator pairing interactions are orientation-dependent, as suggested by the insulator-pairing model based on experimental analysis in flies. Our method and analyses contribute to the identification of insulator- and chromatin-associated DNA-binding sites that influence EPIs and reveal novel functional roles and molecular mechanisms of DBPs associated with transcriptional condensation, phase separation and transcriptional regulation.

      3) Results, line 111: How do the SNPs come into the procedure? From the figures it seems the input is ChIP-seq peaks of DNBPs around the TSS.

      On lines 121-124, to explain the procedure for the SNP of an eQTL, I have added the sentence in the Methods: "If a DNA-binding site was located within a 100-bp region around a single-nucleotide polymorphism (SNP) of an eQTL, we assumed that the DNA-binding proteins regulated the expression of the transcript corresponding to the eQTL".
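The linking rule quoted above can be sketched as follows. This is an illustrative reading, not the author's implementation: it assumes the "100-bp region around" a SNP means a window centered on the SNP (50 bp on each side), uses half-open coordinates, and all record layouts and transcript identifiers are hypothetical.

```python
def link_sites_to_transcripts(binding_sites, eqtl_snps, window=100):
    """Link DNA-binding proteins to transcripts through eQTL SNPs.

    binding_sites: iterable of (protein, chrom, start, end), half-open.
    eqtl_snps:     iterable of (chrom, position, transcript_id).
    A protein is linked to a transcript when one of its binding sites
    overlaps the window of `window` bp centered on the transcript's SNP.
    Returns a set of (protein, transcript_id) pairs.
    """
    half = window // 2
    links = set()
    for protein, chrom, start, end in binding_sites:
        for snp_chrom, pos, transcript in eqtl_snps:
            # Window is [pos - half, pos + half); test interval overlap.
            if chrom == snp_chrom and start < pos + half and pos - half < end:
                links.add((protein, transcript))
    return links

# Hypothetical records for illustration only.
sites = [("CTCF", "chr1", 1000, 1020)]
snps = [
    ("chr1", 1060, "transcript_A"),  # window [1010, 1110) overlaps the site
    ("chr1", 1100, "transcript_B"),  # window [1050, 1150) does not
    ("chr2", 1010, "transcript_C"),  # different chromosome
]

links = link_sites_to_transcripts(sites, snps)
print(links)
```

If the intended meaning is 100 bp on each side of the SNP instead, calling the function with `window=200` gives that variant.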

      4) Again, are those SNPs coming from the different cell lines? Or are they from individuals w.r.t some reference genome? I suggest a general restructuring of this part to let the reader understand more easily. One option could be simplifying the details here or alternatively including all the necessary details.

      On line 119, I have included the explanation of the eQTL dataset of GTEx v8 as follows: "The eQTL data were derived from the GTEx v8 dataset, after quality control, consisting of 838 donors and 17,382 samples from 52 tissues and two cell lines". On lines 681 and 865, I have added the filename of the eQTL data "(GTEx_Analysis_v8_eQTL.tar)".

      5) Figure 1: panel a and b are misleading. Is the matrix in panel a equivalent to the matrix in panel b? If not please clarify why. Maybe in b it is included the info about the SNPs? And if yes, again, what is then difference with a.

      The reviewer presumably means Figure 2, not Figure 1. If so, the matrices in panels a and b in Figure 2 are equivalent. I have shown it in the figure: The same figure in panel a is rotated 90 degrees to the right. The green boxes in the matrix show the regions with the ChIP-seq peak of a DNA-binding protein overlapping with a SNP of an eQTL. I used eQTL data to associate a gene with a ChIP-seq peak that was more than 2 kb upstream and 1 kb downstream of a transcriptional start site of a gene. For each gene, the matrix was produced and the gene expression levels in cells were learned and predicted using the deep learning method. I have added the following sentences to explain the method in lines 133 - 139: Through the training, the tool learned to select the binding sites of DNA-binding proteins from ChIP-seq assays that were suitable for predicting gene expression levels in the cell types. The binding sites of a DNA-binding protein tend to be observed in common across multiple cell and tissue types. Therefore, ChIP-seq data and eQTL data in different cell and tissue types were used as input data for learning, and then the tool selected the data suitable for predicting gene expression levels in the cell types, even if the data were not obtained from the same cell types.

      6) Line 386-388: could the author investigate this observation in more detail? Does it mean that loops are driven by other DBPs independent of the known CTCF/Cohesin? Could the author provide examples of chromatin structural data, e.g. MicroC?

      As suggested by the reviewer, to help readers understand the observation, I have added Supplementary Fig. S4c to show the distribution of DNA-binding sites of "CTCF, RAD21, and SMC3" and "BACH2, FOS, ATF3, NFE2, and MAFK" around chromatin interaction sites. I have modified the following sentence to indicate the figure on line 493: Although a DNA-binding-site distribution pattern around chromatin interaction sites similar to those of CTCF, RAD21, and SMC3 was observed for DBPs such as BACH2, FOS, ATF3, NFE2, and MAFK, less than 1% of the DNA-binding sites of the latter set of DBPs colocalized with CTCF, RAD21, or SMC3 in a single bin (Fig. S4c).
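      The "colocalized in a single bin" criterion can be made concrete with a small sketch. It is not the authors' implementation: the bin size of 500 bp, the function name `colocalized_fraction`, and the coordinates are hypothetical, and the response does not state the bin size that was actually used.

```python
def colocalized_fraction(sites_a, sites_b, bin_size=500):
    """Fraction of sites in `sites_b` that fall in a bin also holding a
    site from `sites_a`. Sites are (chromosome, position) pairs.

    bin_size is a placeholder value, not taken from the manuscript.
    """
    if not sites_b:
        return 0.0
    bins_a = {(chrom, pos // bin_size) for chrom, pos in sites_a}
    shared = sum((chrom, pos // bin_size) in bins_a for chrom, pos in sites_b)
    return shared / len(sites_b)


# Made-up positions: one reference-set site, four sites from another set.
reference_sites = [("chr1", 1200)]
other_sites = [("chr1", 1300), ("chr1", 2600), ("chr2", 1300), ("chr1", 9000)]
fraction = colocalized_fraction(reference_sites, other_sites)
print(f"{fraction:.2%} of sites share a bin with the reference set")
```

      A value below 0.01 from such a calculation would correspond to the "less than 1%" statement, assuming the same definition of a bin.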

      Aljahani A et al. (Nature Communications 2022) found that depletion of cohesin causes a subtle reduction in longer-range enhancer-promoter interactions and that CTCF depletion can cause rewiring of regulatory contacts. Together, their data show that loop extrusion is not essential for enhancer-promoter interactions, but contributes to their robustness and specificity and to precise regulation of gene expression. Goel VY et al. Nature Genetics 2023 mentioned in the abstract: Microcompartments frequently connect enhancers and promoters and though loss of loop extrusion and inhibition of transcription disrupts some microcompartments, most are largely unaffected. These results suggested that chromatin loops can be driven by other DBPs independent of the known CTCF/Cohesin.

      I added the following sentences on lines 561-569: The depletion of cohesin causes a subtle reduction in longer-range enhancer-promoter interactions, and CTCF depletion can cause rewiring of regulatory contacts. Another group reported that enhancer-promoter interactions and transcription are largely maintained upon depletion of CTCF, cohesin, WAPL or YY1. Instead, cohesin depletion decreased transcription factor binding to chromatin. Thus, cohesin may allow transcription factors to find and bind their targets more efficiently. Furthermore, loop extrusion is not essential for enhancer-promoter interactions, but contributes to their robustness and specificity and to precise regulation of gene expression.

      FOXA1 pioneer factor functions as an initial chromatin-binding and chromatin-remodeling factor and has been reported to form biomolecular condensates (Ji D et al. Molecular Cell 2024). CTCF has also been found to form transcriptional condensates and to undergo phase separation (Lee R et al. Nucleic Acids Research 2022). FOS was found to be an insulator-associated DNA-binding protein in this study and is potentially involved in chromatin remodeling, transcription condensation, and phase separation with other factors such as BACH2, ATF3, NFE2 and MAFK. I have added the following sentence on line 548: FOXA1 pioneer factor functions as an initial chromatin-binding and chromatin-remodeling factor and has been reported to form biomolecular condensates.

      7) In general, how are the presented results related to models of chromatin architecture, e.g. loop extrusion, which integrates convergent CTCF binding sites?

      Goel VY et al. Nature Genetics 2023 identified highly nested and focal interactions through region capture Micro-C, which resemble fine-scale compartmental interactions and are termed microcompartments. In the section titled "Most microcompartments are robust to loss of loop extrusion," the researchers noted that a small proportion of interactions between CTCF and cohesin-bound sites exhibited significant reductions in strength when cohesin was depleted. In contrast, the majority of microcompartmental interactions remained largely unchanged under cohesin depletion. Our findings indicate that most P-P and E-P interactions, aside from a few CTCF and cohesin-bound enhancers and promoters, are likely facilitated by a compartmentalization mechanism that differs from loop extrusion. We suggest that nested, multiway, and focal microcompartments correspond to small, discrete A-compartments that arise through a compartmentalization process, potentially influenced by factors upstream of RNA Pol II initiation, such as transcription factors, co-factors, or active chromatin states. It follows that if active chromatin regions at microcompartment anchors exhibit selective "stickiness" with one another, they will tend to co-segregate, leading to the development of nested, focal interactions. This microphase separation, driven by preferential interactions among active loci within a block copolymer, may account for the striking interaction patterns we observe.

      The authors of the paper proposed several mechanisms potentially involved in microcompartments. These mechanisms may be involved in looping with insulator function. Another group reported that enhancer-promoter interactions and transcription are largely maintained upon depletion of CTCF, cohesin, WAPL or YY1. Instead, cohesin depletion decreased transcription factor binding to chromatin. Thus, cohesin may allow transcription factors to find and bind their targets more efficiently (Hsieh TS et al. Nature Genetics 2022). Among the identified insulator-associated DNA-binding proteins, Maz and MyoD1 form loops without CTCF (Xiao T et al. Proc Natl Acad Sci USA 2021; Ortabozkoyun H et al. Nature Genetics 2022; Wang R et al. Nature Communications 2022). I have added the following sentences on lines 563-567: Another group reported that enhancer-promoter interactions and transcription are largely maintained upon depletion of CTCF, cohesin, WAPL or YY1. Instead, cohesin depletion decreased transcription factor binding to chromatin. Thus, cohesin may allow transcription factors to find and bind their targets more efficiently. I have included the following explanation on lines 574-576: Maz and MyoD1 among the identified insulator-associated DNA-binding proteins form loops without CTCF.

      As for the directionality of CTCF, if chromatin loop anchors have some structural conformation, as shown in the paper entitled "The structural basis for cohesin-CTCF-anchored loops" (Li Y et al. Nature 2020), directional DNA binding would occur similarly to CTCF binding sites. Moreover, cohesin complexes that interact with convergent CTCF sites, that is, the N-terminus of CTCF, might be protected from WAPL, but those that interact with divergent CTCF sites, that is, the C-terminus of CTCF, might not be protected from WAPL, which could release cohesin from chromatin and thus disrupt cohesin-mediated chromatin loops (Davidson IF et al. Nature Reviews Molecular Cell Biology 2021). Regarding loop extrusion, the ‘loop extrusion’ hypothesis is motivated by in vitro observations. An experiment in yeast using cohesin variants that are unable to extrude DNA loops but retain the ability to topologically entrap DNA suggested that in vivo chromatin loops are formed independently of loop extrusion. Instead, transcription promotes loop formation and acts as an extrinsic motor that extends these loops and defines their final positions (Guerin TM et al. EMBO Journal 2024). I have added the following sentences on lines 535-539: Cohesin complexes that interact with convergent CTCF sites, that is, the N-terminus of CTCF, might be protected from WAPL, but those that interact with divergent CTCF sites, that is, the C-terminus of CTCF, might not be protected from WAPL, which could release cohesin from chromatin and thus disrupt cohesin-mediated chromatin loops. I have included the following sentences on lines 569-574: The ‘loop extrusion’ hypothesis is motivated by in vitro observations. An experiment in yeast using cohesin variants that are unable to extrude DNA loops but retain the ability to topologically entrap DNA suggested that in vivo chromatin loops are formed independently of loop extrusion. Instead, transcription promotes loop formation and acts as an extrinsic motor that extends these loops and defines their final positions.

      Another model for the regulation of gene expression by insulators is the boundary-pairing (insulator-pairing) model (Bing X et al. eLife 2024; Ke W et al. eLife 2024; Fujioka M et al. PLoS Genetics 2016). Molecules bound to insulators physically pair with their partners, either head-to-head or head-to-tail, with different degrees of specificity at the termini of TADs in flies. Although the experiments do not reveal how partners find each other, the mechanism is unlikely to require loop extrusion. Homologous and heterologous insulator-insulator pairing interactions are central to the architectural functions of insulators. The manner of insulator-insulator interactions is orientation-dependent. I have summarized the model on lines 551-559: Other types of chromatin regulation are also expected to be related to the structural interactions of molecules. In the boundary-pairing (insulator-pairing) model, molecules bound to insulators physically pair with their partners, either head-to-head or head-to-tail, with different degrees of specificity at the termini of TADs in flies (Fig. 7). Although the experiments do not reveal how partners find each other, the mechanism is unlikely to require loop extrusion. Homologous and heterologous insulator-insulator pairing interactions are central to the architectural functions of insulators. The manner of insulator-insulator interactions is orientation-dependent.

      8) Do the authors think that the identified DBPs could work in that way as well?

      The boundary-pairing (insulator-pairing) model could apply to the insulator-associated DNA-binding proteins other than CTCF and cohesin, which are involved in the loop extrusion mechanism (Bing X et al. eLife 2024; Ke W et al. eLife 2024; Fujioka M et al. PLoS Genetics 2016).

      Liquid-liquid phase separation was shown to occur through CTCF-mediated chromatin loops and to act as an insulator (Lee R et al. Nucleic Acids Research 2022). Among the identified insulator-associated DNA-binding proteins, CEBPA has been found to form hubs that colocalize with transcriptional co-activators in a native cell context, which is associated with transcriptional condensates and phase separation (Christou-Kent M et al. Cell Reports 2023). The proposed microcompartment mechanisms are also associated with phase separation. Thus, the same or similar mechanisms are potentially associated with the insulator function of the identified DNA-binding proteins. I have included the following information on line 546: CEBPA in the identified insulator-associated DNA-binding proteins was also reported to be involved in transcriptional condensates and phase separation.

      9) Also, can the authors comment on whether those newly identified DBPs mediate contacts by active processes or by equilibrium processes?

      Snead WT et al. Molecular Cell 2019 mentioned that protein post-translational modifications (PTMs) facilitate the control of molecular valency and strength of protein-protein interactions. O-GlcNAcylation as a PTM inhibits CTCF binding to chromatin (Tang X et al. Nature Communications 2024). I found that the identified insulator-associated DNA-binding proteins tend to form a cluster at potential insulator sites (Supplementary Fig. 2d). These proteins may interact and actively regulate chromatin interactions, transcriptional condensation, and phase separation by PTMs. I have added the following explanation on lines 576-582: Furthermore, protein post-translational modifications (PTMs) facilitate control over the molecular valency and strength of protein-protein interactions. O-GlcNAcylation as a PTM inhibits CTCF binding to chromatin. We found that the identified insulator-associated DNA-binding proteins tend to form a cluster at potential insulator sites (Fig. 4f and Supplementary Fig. 3c). These proteins may interact and actively regulate chromatin interactions, transcriptional condensation, and phase separation through PTMs.

      10) Can the author provide some real examples along with published structural data (e.g. the mentioned micro-C data) to show the link between protein co-presence, directional bias and contact formation?

      A structural molecular model of cohesin-CTCF-anchored loops has been published by Li Y et al. Nature 2020. The structural conformation of CTCF and cohesin in the loops would be the cause of the directional bias of CTCF binding sites, which I mentioned in lines 531 – 535 as follows: These results suggest that the directional bias of DNA-binding sites of insulator-associated DBPs may be involved in insulator function and chromatin regulation through structural interactions among DBPs, other proteins, DNAs, and RNAs. For example, the N-terminal amino acids of CTCF have been shown to interact with RAD21 in chromatin loops.

      To investigate the principles underlying the architectural functions of insulator-insulator pairing interactions, two insulators, Homie and Nhomie, flanking the Drosophila even skipped locus were analyzed. Pairing interactions between the transgene Homie and the eve locus are directional. The head-to-head pairing between the transgene and endogenous Homie matches the pattern of activation (Fujioka M et al. PLoS Genetics 2016).

      Reviewer #3

      1. Some of these TFs do not have specific direct binding to DNA (P300, Cohesin). Since the authors are using binding motifs in their analysis workflow, I would remove those from the analysis.

      When a protein complex binds to DNA, one protein of the complex binds to the DNA directly, and the other proteins may not bind to DNA. However, the DNA motif sequence bound by the protein may be registered as the DNA-binding motif of all the proteins in the complex. The molecular structure of the complex of CTCF and Cohesin showed that both CTCF and Cohesin bind to DNA (Li Y et al. Nature 2020). I think there is a possibility that if the molecular structure of a protein complex becomes available, the previous recognition of the DNA-binding ability of a protein may be changed. Therefore, I searched the Pfam database for 99 insulator-associated DNA-binding proteins identified in this study. I found that 97 are registered as DNA-binding proteins and/or have a known DNA-binding domain, and EP300 and SIN3A do not directly bind to DNA, which was also checked by Google search. I have added the following explanation in line 249 to indicate direct and indirect DNA-binding proteins: Among 99 insulator-associated DBPs, EP300 and SIN3A do not directly interact with DNA, and thus 97 insulator-associated DBPs directly bind to DNA. I have updated the sentence in line 22 of the Abstract as follows: We discovered 97 directional and minor nondirectional motifs in human fibroblast cells that corresponded to 23 DBPs related to insulator function, CTCF, and/or other types of chromosomal transcriptional regulation reported in previous studies.

      2. I am not sure if I understood correctly, but why do the authors consider enhancers spanning 2Mb (200 bins of 10Kb around eSNPs)? This seems wrong. Enhancers are relatively small regions (100bp to 1Kb) and only a very small subset form super enhancers.

      As the reviewer mentioned, I recognize enhancers are relatively small regions. In the paper, I intended to examine further upstream and downstream of promoter regions where enhancers are found. Therefore, I have modified the sentence in lines 917 – 919 of the Fig. 2 legend as follows: Enhancer-gene regulatory interaction regions consist of 200 bins of 10 kbp between -1 Mbp and 1 Mbp region from TSS, not including promoter.
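      The binning described in the revised legend (200 bins of 10 kbp covering -1 Mbp to +1 Mbp around the TSS) can be illustrated with a short sketch. The function name `bin_index`, the strand handling, and the half-open interval convention are assumptions made here for illustration; the exclusion of the promoter region mentioned in the legend is not modelled.

```python
BIN_SIZE = 10_000    # 10 kbp per bin, as in the revised legend
SPAN = 1_000_000     # 1 Mbp on each side of the TSS
N_BINS = 2 * SPAN // BIN_SIZE  # 200 bins in total


def bin_index(tss, pos, strand="+"):
    """Map a genomic position to one of 200 bins relative to a TSS.

    Returns None when the position lies outside [-1 Mbp, +1 Mbp).
    Bin 0 is the most upstream bin; bin 100 starts at the TSS.
    """
    # Orient the offset so that negative values are always upstream.
    offset = pos - tss if strand == "+" else tss - pos
    if not -SPAN <= offset < SPAN:
        return None
    return (offset + SPAN) // BIN_SIZE


print(N_BINS)
print(bin_index(5_000_000, 4_000_000))  # most upstream position in range
print(bin_index(5_000_000, 5_000_000))  # the TSS itself
```

      In this layout the bins describe a search region for distal regulatory sites, which is consistent with the point that the 2-Mbp span is not itself the size of an enhancer.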

      3. I think the H3K27me3 analysis was very good, but I would have liked to see also constitutive heterochromatin as well, so maybe repeat the analysis for H3K9me3.

      Following the reviewer's advice, I have added the ChIP-seq data of H3K9me3 as a track in the UCSC Genome Browser. The distribution of H3K9me3 signal was different from that of H3K27me3 in some regions. I also found the insulator-associated DNA-binding sites close to the edges of H3K9me3 regions and took some screenshots of the UCSC Genome Browser of the regions around the sites in Supplementary Fig. 3b. I have modified the following sentence on lines 962 – 964 in the legend of Fig. 4: a Distribution of histone modification marks H3K27me3 (green color) and H3K9me3 (turquoise color) and transcript levels (pink color) in upstream and downstream regions of a potential insulator site (light orange color). I have also added the following result on lines 348 – 352: The same analysis was performed using H3K9me3 marks, instead of H3K27me3 (Fig. S3b). We found that the distribution of H3K9me3 signal was different from that of H3K27me3 in some regions, and discovered the insulator-associated DNA-binding sites close to the edges of H3K9me3 regions (Fig. S3b).

      4. I was not sure I understood the analysis in Figure 6. The binding site is with 500bp of the interaction site, but micro-C interactions are at best at 1Kb resolution. They say they chose the centre of the interaction site, but we don't know exactly where there is the actual interaction. Also, it is not clear what they measure. Is it the number of binding sites of a specific or multiple DBP insulator proteins at a specific distance from this midpoint that they recover in all chromatin loops? Maybe I am missing something. This analysis was not very clear.

      The resolution of the Micro-C assay is considered to be 100 bp and above, as the human nucleosome core particle contains 145 bp (and 193 bp with linker) of DNA. However, internucleosomal DNA is cleaved by endonuclease into fragments of multiples of 10 nucleotides (Pospelov VA et al. Nucleic Acids Research 1979). Highly nested focal interactions were observed (Goel VY et al. Nature Genetics 2023). Base pair resolution was reported using Micro Capture-C (Hua P et al. Nature 2021). Sub-kilobase (20 bp resolution) chromatin topology was reported using an MNase-based chromosome conformation capture (3C) approach (Aljahani A et al. Nature Communications 2022). On the other hand, Hi-C data was analyzed at 1 kb resolution (Gu H et al. bioRxiv 2021). If the resolution of Micro-C interactions is at best at 1 kb, the binding sites of a DNA-binding protein will not show a peak around the center of the genomic locations of interaction edges. Each panel shows the number of binding sites of a specific DNA-binding protein at a specific distance from the midpoint of all chromatin interaction edges. I have modified and added the following sentences in lines 585-589: High-resolution chromatin interaction data from a Micro-C assay indicated that most of the predicted insulator-associated DBPs showed DNA-binding-site distribution peaks around chromatin interaction sites, suggesting that these DBPs are involved in chromatin interactions and that the chromatin interaction data has a high degree of resolution. Base pair resolution was reported using Micro Capture-C.
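      To make explicit what each panel counts, the following is a minimal sketch of a distance histogram around interaction midpoints. It assumes all coordinates lie on one chromosome, and the function name `offset_histogram`, the 500-bp range, the 100-bp step, and the example positions are hypothetical choices rather than the parameters of the actual analysis.

```python
from collections import Counter


def offset_histogram(midpoints, binding_sites, max_dist=500, step=100):
    """Count binding sites by signed distance from interaction midpoints.

    Distances are floored to the start of a `step`-wide bin, so the key
    -100 covers offsets in [-100, 0) and the key 0 covers [0, 100).
    """
    counts = Counter()
    for mid in midpoints:
        for site in binding_sites:
            dist = site - mid
            if -max_dist <= dist < max_dist:
                counts[(dist // step) * step] += 1
    return dict(sorted(counts.items()))


# Made-up midpoints of interaction edges and binding-site positions.
midpoints = [1000, 5000]
binding_sites = [990, 1010, 1450, 5020, 9000]
histogram = offset_histogram(midpoints, binding_sites)
print(histogram)
```

      A protein whose sites concentrate in the bins nearest zero would show the kind of peak described in the response, whereas a flat profile would be expected if the interaction coordinates were only accurate to about 1 kb.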

      1. PIQ does not consider TF concentration. Other methods do that and show that TF concentration improves predictions (e.g., https://www.biorxiv.org/content/10.1101/2023.07.15.549134v2 or https://pubmed.ncbi.nlm.nih.gov/37486787/). The authors should discuss how that would impact their results.

      The directional bias of CTCF binding sites was identified by ChIA-PET interactions of CTCF binding sites. The analysis of the contribution scores of DNA-binding sites of proteins considering the binding sites of CTCF as an insulator showed the same tendency of directional bias of CTCF binding sites. In the analysis, to remove the false-positive prediction of DNA-binding sites, I used the binding sites that overlapped with a ChIP-seq peak of the DNA-binding protein. This result suggests that the DNA-binding sites of CTCF obtained by the current analysis have sufficient quality. Therefore, if the accuracy of prediction of DNA-binding sites is improved, although the number of DNA-binding sites may be different, the overall tendency of the directionality of DNA-binding sites will not change and the results of this study will not change significantly.

      As for the first reference in the reviewer's comment, chromatin interaction data from Micro-C assay does not include all chromatin interactions in a cell or tissue, because it is expensive to cover all interactions. Therefore, it would be difficult to predict all chromatin interactions based on machine learning. As for the second reference in the reviewer's comment, pioneer factors such as FOXA are known to bind to closed chromatin regions, but transcription factors and DNA-binding proteins involved in chromatin interactions and insulators generally bind to open chromatin regions. The search for the DNA-binding motifs is not required in closed chromatin regions.

      2. DeepLIFT is a good approach to interpret complex structures of CNN, but is not truly explainable AI. I think the authors should acknowledge this.

      In the DeepLIFT paper, the authors explain that DeepLIFT is a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons in the network to every feature of the input (Shrikumar A et al. ICML 2017). DeepLIFT compares the activation of each neuron to its 'reference activation' and assigns contribution scores according to the difference. DeepLIFT calculates a metric to measure the difference between an input and the reference of the input.
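      The idea of scoring each input against a reference can be shown on the simplest possible case, a single linear unit, where the contribution of an input is its weight times the difference from the reference and the contributions sum to the change in output. This is a toy sketch with made-up weights and an all-zero reference; it is not the DeepLIFT library and does not reproduce the network used in the manuscript.

```python
def predict(weights, bias, x):
    """Output of a single linear unit."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def linear_contributions(weights, x, reference):
    """Contribution of each input relative to its reference value.

    For a linear unit this is weight * (input - reference).
    """
    return [w * (xi - ri) for w, xi, ri in zip(weights, x, reference)]


# Hypothetical weights and inputs (e.g. presence of three binding sites).
weights = [0.5, -1.0, 2.0]
bias = 0.1
x = [1.0, 0.0, 1.0]
reference = [0.0, 0.0, 0.0]   # assumed background: no binding sites

contribs = linear_contributions(weights, x, reference)
delta = predict(weights, bias, x) - predict(weights, bias, reference)

# The contributions add up to the difference from the reference output.
print(contribs, sum(contribs), delta)
```

      In a deep network the same bookkeeping is propagated layer by layer, so the scores describe how much each input shifts the prediction from the reference, not why the relationship holds.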

      Truly explainable AI would be able to find cause and reason, and to make choices and decisions like humans. DeepLIFT does not perform causal inferences. I did not use the term "Explainable AI" in our manuscript, but I briefly explained it in Discussion. I have added the following explanation in lines 615-620: AI (Artificial Intelligence) is considered as a black box, since the reason and cause of prediction are difficult to know. To solve this issue, tools and methods have been developed to know the reason and cause. These technologies are called Explainable AI. DeepLIFT is considered to be a tool for Explainable AI. However, DeepLIFT does not answer the reason and cause for a prediction. It calculates scores representing the contribution of the input data to the prediction.

      Furthermore, to improve the readability of the manuscript, I have included the following explanation in lines 159-165: we computed DeepLIFT scores of the input data (i.e., each binding site of the ChIP-seq data of DNA-binding proteins) in the deep learning analysis on gene expression levels. DeepLIFT compares the importance of each input for predicting gene expression levels to its 'reference or background level' and assigns contribution scores according to the difference. DeepLIFT calculates a metric to measure the difference between an input and the reference of the input.

    1. Reviewer #1 (Public review):

      Summary:

      The authors investigate the relationship between 3D chromatin architecture and innate immune gene regulation in monocytes from patients with alcohol-associated hepatitis (AH). Using Hi-C technology, they attempt to identify structural changes in the genome that correlate with altered gene expression. Their central claim is that genome restructuring contributes to the hyper-inflammatory phenotype associated with AH.

      Strengths:

      (1) The manuscript employs Hi-C technology, which, in principle, is a powerful approach for studying genome organization.

      (2) The focus on disease-relevant genes, particularly innate immune loci, provides a contextually important angle for understanding AH.

      Weaknesses:

      (1) Sample Size: The study relies on an exceptionally small cohort (4 AH patients and 4 healthy controls), rendering the results statistically underpowered and highly susceptible to variability.

      (2) Hi-C Resolution unpaired to RNA seq: The data are presented at a resolution of 100kb, which is insufficient to uncover meaningful chromatin interactions at the level of individual genes. This data is unpaired.

      (3) Functional Validation: The manuscript lacks experiments to directly link changes in chromatin architecture with gene expression or monocyte function, leaving the claims speculative.

      (4) Data Integration: The lack of Hi-C with ATAC and RNA-seq data handicaps the analysis and really makes it superficial. In short, it does not convincingly demonstrate a functional relationship.

      (5) Confounding Factors: The manuscript neglects critical confounding variables such as comorbidities, medications, and lifestyle factors, which could influence chromatin structure and gene expression independently of AH.

      Appraisal of the Aims and Results:

      The manuscript sets out to establish a connection between chromatin architecture and AH pathology. However, the study fails to achieve its stated aims due to inadequate methods and insufficient data. The conclusions drawn from the Hi-C analyses alone are poorly supported, and the lack of functional validation undermines the credibility of the proposed mechanisms. Overall, the results do not provide compelling evidence to substantiate the authors' claims.

      Impact on the Field and Utility to the Community:

      The work, in its current form, is unlikely to have a meaningful impact on the field. The limited scope, methodological shortcomings, and lack of robust data significantly diminish its potential utility. Without addressing these critical gaps, the study does not offer new insights into the role of genome architecture in AH or provide useful methodologies or datasets for the community.

      Additional Context:

      The manuscript would benefit from a more comprehensive analysis of potential mechanisms underlying the observed changes, including the interplay between chromatin architecture and epigenetic modifications. Furthermore, longitudinal studies or therapeutic interventions could provide insights into the dynamic aspects of genome restructuring in AH. These considerations are entirely absent from the current study.

      Conclusion:

      The manuscript does not achieve its stated goals and does not present sufficient evidence to support its conclusions. The limitations in sample size, resolution, and experimental rigor severely hinder its contribution to the field. Addressing these fundamental flaws will be essential for the work to be considered a meaningful addition to the literature.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This valuable study combined whole-head magnetoencephalography (MEG) and subthalamic (STN) local field potential (LFP) recordings in patients with Parkinson's disease undergoing deep brain stimulation surgery. The paper provides solid evidence that cortical and STN beta oscillations are sensitive to movement context and may play a role in the coordination of movement redirection.

      We are grateful for the expert assessment by the editor and the reviewers. Below we provide point-by-point replies to both public and private reviews. We have tried to keep the answers in the public section short and concise, not citing the changed passages unless the point does not re-appear in the recommendations. There, we did include all of the changes to the manuscript, such that the reviewers need not go back and forth between replies and manuscript.

      The reviewer comments have not only led to numerous improvements of the text, but also to new analyses, such as Granger causality analysis, and to methodological improvements e.g. including numerous covariates in the statistical analyses. We believe that the article improved substantially through the feedback, and we thank the reviewers and the editor for their effort.

      Public Reviews

      Reviewer #1 (Public review):

      Summary:

      Winkler et al. present brain activity patterns related to complex motor behaviour by combining whole-head magnetoencephalography (MEG) with subthalamic local field potential (LFP) recordings from people with Parkinson's disease. The motor task involved repetitive circular movements with stops or reversals associated with either predictable or unpredictable cues. Beta and gamma frequency oscillations are described, and the authors found complex interactions between recording sites and task conditions. For example, they observed stronger modulation of connectivity in unpredictable conditions. Moreover, STN power varied across patients during reversals, which differed from stopping movements. The authors conclude that cortex-STN beta modulation is sensitive to movement context, with potential relevance for movement redirection.

      Strengths:

      This study employs a unique methodology, leveraging the rare opportunity to simultaneously record both invasive and non-invasive brain activity to explore oscillatory networks.

      Weaknesses:

      It is difficult to interpret the role of the STN in the context of reversals because no consistent activity pattern emerged.

      We thank the reviewer for the valuable feedback to our study. We agree that the interpretation of the role of the STN during reversals is rather difficult, because reversal-related STN activity was highly variable across patients. Although there seem to be consistent patterns in sub-groups of the current cohort, with some patients showing event-related increases (Fig. 3b) and others showing decreases, the current dataset is not large enough to substantiate or even explain the existence of such clusters. Thus, we limit ourselves to acknowledging this limitation and discussing potential reasons for the high variability, namely variability in electrode placement and insufficient spatial resolution for the separation of specialized cell ensembles within the STN (see Discussion, section Limitations and future directions).

      Reviewer #2 (Public review):

      Summary:

      This study examines the role of beta oscillations in motor control, particularly during rapid changes in movement direction among patients with Parkinson's disease. The researchers utilized magnetoencephalography (MEG) and local field potential (LFP) recordings from the subthalamic nucleus to investigate variations in beta band activity within the cortex and STN during the initiation, cessation, and reversal of movements, as well as the impact of external cue predictability on these dynamics. The primary finding indicates that beta oscillations more effectively signify the start and end of motor sequences than transitions within those sequences. The article is well-written, clear, and concise.

      Strengths:

      The use of a continuous motion paradigm with rapid reversals extends the understanding of beta oscillations in motor control beyond simple tasks. It offers a comprehensive perspective on subthalamocortical interactions by combining MEG and LFP.

      Weaknesses:

      (1) The small and clinically diverse sample may limit the robustness and generalizability of the findings. Additionally, the limited exploration of causal mechanisms reduces the depth of its conclusions, and focusing solely on Parkinson's disease patients might restrict the applicability of the results to broader populations.

      We thank the reviewer for the insightful feedback. We address these issues one by one in our responses to points 2, 4 and 6, respectively.

      (2) The small sample size and variability in clinical characteristics among patients may limit the robustness of the study's conclusions. It would be beneficial for the authors to acknowledge this limitation and propose strategies for addressing it in future research. Additionally, incorporating patient-specific factors as covariates in the ANOVA could help mitigate the confounding effects of heterogeneity.

      Thank you for this comment. The challenges associated with recording brain activity peri-operatively can be a limiting factor when it comes to sample size and cohort stratification. We now acknowledge this in the revised discussion (section Limitations and future directions). Furthermore, we suggest using sensing-capable devices in the future as a measure to increase sample sizes (Discussion, section Limitations and future directions). Lastly, we appreciate the idea of adding patient-specific factors as covariates to the ANOVAs and have thus included age, disease duration and pre-surgical UPDRS score into our models. This did not lead to any qualitative changes of statistical effects.

      (3) The author may consider using standardized statistics, such as effect size, that would provide a clearer picture of the observed effect magnitude and improve comparability.

      Thanks for the suggestion. As measures of effect size, we have added partial eta squared (η<sub>p</sub><sup>2</sup>) to the results of all ANOVAs and Cohen’s d to all follow-up t-tests.

      (4) Although the study identifies relevance between beta activity and motor events, it lacks causal analysis and discussion of potential causal mechanisms. Given the valuable datasets collected, exploring or discussing causal mechanisms would enhance the depth of the study.

      We appreciate this idea and have conducted Granger causality analyses in response to this comment. This new analysis reveals that there is a strong cortical drive to the STN for all movements of interest and predictability conditions in the beta band. The detailed results can be viewed on p. 16 in the section on Granger causality. For statistical testing, we conducted an rmANCOVA, similar to those for power and coherence (see p. 46-48 and 54-56 for the corresponding tables), as well as t-tests assessing directionality (Figure 6-figure supplement 2 on p. 35). In the discussion section, we connect these results with prior findings suggesting that the frontal cortex drives the STN in the beta band, likely through hyperdirect pathway fibers (p. 17).

      (5) The study cohort focused on senior adults, who may exhibit age-related cortical responses during movement planning in neural mechanisms. These aspects were not discussed in the study.

      We appreciate the comment and agree that age may have impacted neural oscillatory activity of patients in the present study. We now acknowledge this in the limitations section, and point out that our approach to handling these effects was including age as a covariate in the statistical analyses.

      (6) Including a control group of patients with other movement disorders who also undergo DBS surgery would be beneficial, because we cannot otherwise determine whether the observed findings are specific to PD or can be generalized. Additionally, the current title and the article, which are oriented toward understanding human motor control, may not be appropriate.

      We thank the reviewer for this comment and fully agree that it cannot be ruled out that the present findings are, in part, specific to PD. We acknowledge this limitation in the Limitations and future directions section (p. 20-21). Indeed, including a control group of patients with other disorders would be ideal, but the scarcity of patients with diseases other than PD who receive STN DBS in our centre makes this an unfeasible option in practical terms. We do suggest that future research may address this issue by extending our approach to different disorders or healthy participants on the cortical level (p. 21). Lastly, we appreciate the idea to adjust the title of the present article. The adjusted title is: “Context-Dependent Modulations of Subthalamo-Cortical Synchronization during Rapid Reversals of Movement Direction in Parkinson’s Disease”.

      That being said, we do believe that our findings at least approximate healthy functioning and are not solely related to PD. For one, patients were on their usual dopaminergic medication and dopamine has been found to normalize pathological alterations of beta activity. Further, the general pattern of movement-related beta and gamma oscillations reported here has been observed in numerous diseases and brain structures, including cortical beta oscillations measured non-invasively in healthy participants.

      Reviewer #3 (Public review):

      Summary:

      The study highlights how the initiation, reversal, and cessation of movements are linked to changes in beta synchronization within the basal ganglia-cortex loops. It was observed that different movement phases, such as starting, stopping briefly, and stopping completely, affect beta oscillations in the motor system.

      It was found that unpredictable cues lead to stronger changes in STN-cortex beta coherence. Additionally, specific patterns of beta and gamma oscillations related to different movement actions and contexts were observed. Stopping movements was associated with a lack of the expected beta rebound during brief pauses within a movement sequence.

      Overall, the results underline the complex and context-dependent nature of motor-control and emphasize the role of beta oscillations in managing movement according to changing external cues.

      Strengths:

      The paper is very well written, clear, and appears methodologically sound.

      The use of continuous movement (turning) with reversals is more naturalistic than many previous button-push paradigms.

      Weaknesses:

      The generalizability of the findings is somewhat curtailed by the fact that this was performed perioperatively during the period of the microlesion effect. Given the availability of sensing-enabled DBS devices now and HD-EEG, does MEG offer a significant enough gain in spatial localizability to offset the fact that it has to be done shortly postoperatively with externalized leads, with an attendant stun effect? Specifically, for paradigms that are not asking very spatially localized questions as a primary hypothesis?

      We appreciate the reviewer’s feedback and acknowledge the valid point raised on the timing of our measurements. Indeed, sensing-enabled devices offer a valid alternative to peri-operative recordings, circumventing the stun effect. We acknowledge this in the revised discussion, section Limitations and future directions (p. 23): “Additionally, future research could capitalize on sensing-capable devices to circumvent the necessity to record brain activity peri-operatively, facilitating larger sample sizes and circumventing the stun effect, an immediate improvement in motor symptoms arising as a consequence of electrode implantation (Mann et al., 2009).” This alternative strategy, however, was not an option here because we did not have a sufficient number of patients implanted with sensing-enabled devices at the time when the data collection was initiated.

      That being said, we would like to highlight that in the present study, our goal was not to study pathology related to Parkinson’s disease. Rather, we aimed to learn about motor control in general. The stun effect may have facilitated motor performance in our patients, which is actually beneficial to the research goals at hand.

      Further investigation of the gamma signal seems warranted, even though it has a slightly lower proportional change in amplitude than beta. Given that the changes in gamma here are relatively wide band, this could represent a marker of neural firing that could be interestingly contrasted against the rhythm account presented.

      We appreciate the reviewer’s interest and we have extended the investigation of gamma oscillations. We now provide statistics regarding the influence of predictability on gamma power and gamma coherence (no significant effects) and explore Granger causality in the gamma (and beta) band (see comment 4 of reviewer 2). Unfortunately, we cannot measure spiking via the DBS electrode, and therefore we cannot investigate correlations between gamma oscillatory activity and action potentials. We do agree with the reviewer, however, that action potentials rather than oscillations form the basis of motor control in the brain. This view of ours is now reflected in the revised discussion, section Limitations and future directions (p. 21): “Lastly, given the present study’s focus on understanding movement-related rhythms, particularly in the beta range, future research could further explore the role of gamma oscillations in continuous movement and their relation to action potentials in motor areas (Fischer et al., 2020; Igarashi, Isomura, Arai, Harukuni, & Fukai, 2013), which form the basis of movement encoding in the brain.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      This is a well-conducted study and overall the results are clear. I only have one minor suggestion for improvement of the manuscript. I found the order of appearance of the results somewhat confusing, switching from predictability-related behavioral effects to primarily stopping and reversal-related neurophysiological effects, back to predictability but starting with coherence. I would suggest that the authors try to follow a systematic order focused on the questions at hand. E.g. perhaps readability could be improved if the results section is split into reversal vs. stopping related effects, reporting behavior, power, and coherence in this order, followed by a predictability section, again reporting behavior, power, and coherence. Obviously, this is an optional suggestion. Apart from that, I just missed a more direct message related to the absence of statistical significance related to STN power changes during reversal. I think this could be made more clear in the text.

      We thank the reviewer for the feedback on our study. To ease reading, we modified the order and added further sub-titles to the results section. We start with Behavior (p. 4) and then move on to Power (general movement effects on power – movement effects on STN power – movement effects on cortical power – predictability effects on power). Next, we move on to Connectivity (movement effects on connectivity – predictability effects on connectivity – Granger causality). We hope that these adaptations will help guide the reader.

      Additionally, we thank the reviewer for noting that we did not explicitly mention the lack of statistical significance of reversal-related beta power modulations in the STN. We have adapted the section on modulation of STN beta power associated with reversals (p. 8) to: “In the STN, reversals were associated with a brief modulation of beta power, which was weak in the group-average spectrum and did not reach significance (Fig. 3a).”

      Reviewer #2 (Recommendations for the authors):

      (1) The small sample size and variability in clinical characteristics among patients may limit the robustness of the study's conclusions. It would be beneficial for the authors to acknowledge this limitation and propose strategies for addressing it in future research. Additionally, incorporating patient-specific factors as covariates in the ANOVA could help mitigate the confounding effects of heterogeneity.

      Thank you for this comment. The challenges associated with recording brain activity peri-operatively can be a limiting factor when it comes to sample size. We now acknowledge this in the revised discussion, section Limitations and future directions (p. 20):

      “Invasive measurements of STN activity are only possible in patients who are undergoing or have undergone brain surgery. Studies drawing from this limited pool of candidate participants are typically limited in terms of sample size and cohort stratification, particularly when carried out in a peri-operative setting. Here, we had a sample size of 20, which is rather high for a peri-operative study, but still low in terms of absolute numbers.”

      Furthermore, we suggest using sensing-capable devices in the future as a measure to increase sample sizes (p. 21):

      “Additionally, future research could capitalize on sensing-capable devices to circumvent the necessity to record brain activity peri-operatively, facilitating larger sample sizes and circumventing the stun effect, an immediate improvement in motor symptoms arising as a consequence of electrode implantation (Mann et al., 2009).”

      Lastly, we appreciate the idea of adding patient-specific factors as covariates to the ANOVAs and have thus included age, disease duration and pre-surgical UPDRS score into our models. This did not lead to any qualitative changes of statistical effects.

      Revised article

      Methods, Statistical analysis:

      “To account for their potential influence on brain activity, we added age, pre-operative UPDRS score, and disease duration as covariates to all ANOVAs. Covariates were standardized by means of z-scoring.”
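
      As an illustration of the standardization step, a minimal sketch of z-scoring a covariate is given below. The ages are hypothetical values, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption; the text does not state which convention was applied.

```python
from statistics import mean, stdev

def zscore(values):
    """Centre on the mean and scale by the sample standard deviation (assumed convention)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical ages in years; not taken from the study cohort.
ages = [58.0, 63.0, 67.0, 71.0, 66.0]
age_z = zscore(ages)
```

      After this step each covariate has mean 0 and unit standard deviation, so age, UPDRS score, and disease duration enter the model on a common scale.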

      (2) The author may consider using standardized statistics, such as effect size, that would provide a clearer picture of the observed effect magnitude and improve comparability.

      Thanks for this useful suggestion. As measures of effect size, we have added partial eta squared (η<sub>p</sub><sup>2</sup>) to the results of all ANOVAs and Cohen’s d to all follow-up t-tests.

      (3) Although the study identifies relevance between beta activity and motor events, it lacks causal analysis and discussion of potential causal mechanisms. Given the valuable datasets collected, exploring or discussing causal mechanisms would enhance the depth of the study.

      We appreciate this idea and have conducted Granger causality analyses in response to this comment. This new analysis reveals that there is a strong cortical drive to the STN for all movements of interest and predictability conditions in the beta band, but no directed interactions in the gamma band. For statistical testing, we conducted an rmANCOVA, similar to the analysis of power and coherence (see p. 46-48 and 54-56 for the corresponding tables), as well as t-tests assessing directionality (Figure 6-figure supplement 2 on p. 35). In the discussion section, we connect these results with prior findings suggesting that the frontal cortex drives the STN in the beta band, likely through hyperdirect pathway fibers (p. 17).

      Revised article

      Methods Section, Granger Causality Analysis

      “We computed beta and gamma band non-parametric Granger causality (Dhamala, Rangarajan, & Ding, 2008) between cortical ROIs and the STN in the hemisphere contralateral to movement for the post-event time windows (0 – 2 s with respect to start, reversal, and stop). Because estimates of Granger causality are often biased, we compared the original data to time-reversed data to suppress non-causal interactions. True directional influence is reflected by a higher causality measure in the original data than in its time-reversed version, resulting in a positive difference between the two, the opposite being the case for a signal that is “Granger-caused” by the other. Directionality is thus reflected by the sign of the estimate (Haufe, Nikulin, Müller, & Nolte, 2013). Because rmANCOVA results indicated no significant effects for predictability and movement type, and post-hoc tests did not detect significant differences between hemispheres, we averaged Granger causality estimates over movement types, hemispheres and predictability conditions in Figure 6-figure supplement 2.”
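
      The time-reversal contrast described above can be sketched in a few lines. Note that this is a simplified stand-in: it uses a time-domain, parametric (least-squares autoregressive) Granger estimate rather than the non-parametric spectral estimate of Dhamala et al. (2008), and the signals, coupling strengths, and model order are hypothetical choices made for illustration only.

```python
import numpy as np

def lagged(sig, order):
    """Stack sig[t-1], ..., sig[t-order] as columns for t = order .. n-1."""
    n = len(sig)
    return np.column_stack([sig[order - k:n - k] for k in range(1, order + 1)])

def granger(source, target, order=2):
    """Time-domain Granger causality source -> target (log ratio of residual variances)."""
    y = target[order:]
    own = lagged(target, order)
    full = np.column_stack([own, lagged(source, order)])

    def residual_variance(predictors):
        design = np.column_stack([np.ones(len(y)), predictors])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        return np.var(y - design @ coef)

    return np.log(residual_variance(own) / residual_variance(full))

def net_granger(source, target, order=2):
    """Original minus time-reversed estimate; a positive value suggests source drives target."""
    return granger(source, target, order) - granger(source[::-1], target[::-1], order)

# Hypothetical toy signals: "cortex" drives "stn" with a one-sample delay.
rng = np.random.default_rng(0)
n = 5000
cortex = np.zeros(n)
stn = np.zeros(n)
for t in range(1, n):
    cortex[t] = 0.5 * cortex[t - 1] + rng.standard_normal()
    stn[t] = 0.5 * stn[t - 1] + 0.8 * cortex[t - 1] + rng.standard_normal()

cortex_to_stn = net_granger(cortex, stn)   # expected to be positive
stn_to_cortex = net_granger(stn, cortex)   # expected to be negative
```

      In this toy case the sign of the original-minus-reversed difference recovers the simulated direction of influence, which is the logic behind reading directionality from the sign of the estimate.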

      Results, Granger causality

      “In general, cortex appeared to drive the STN in the beta band, regardless of the movement type and predictability condition. This was reflected in a main effect of ROI on Granger causality estimates (F<sub>ROI</sub>(7,9) = 3.443, p<sub>ROI</sub> = 0.044, η<sub>p</sub><sup>2</sup> = 0.728; refer to Supplementary File 4 for the full results of the ANOVA). In the hemisphere contralateral to movement, follow-up t-tests revealed significantly higher Granger causality estimates from M1 to the STN (t = 3.609, one-sided p < 0.001, d = 0.807) and from MSMC to the STN (t = 2.051, one-sided p < 0.027, d = 0.459) than the other way around. The same picture emerged in the hemisphere ipsilateral to movement (M1 to STN: t = 3.082, one-sided p = 0.003, d = 0.689; MSMC to STN: t = 1.833, one-sided p < 0.041, d = 0.410). In the gamma band, we did not detect a significant drive from one area to the other (F<sub>ROI</sub>(7,9) = 0.338, p<sub>ROI</sub> = 0.917, η<sub>p</sub><sup>2</sup> = 0.208, Supplementary File 6). Figure 6-figure supplement 2 demonstrates the differences in Granger causality between original and time-reversed data for the beta and gamma band.”
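
      The effect sizes quoted above can be related to the test statistics with two standard conversions. The sketch below assumes partial eta squared is obtained from F and its degrees of freedom, and that Cohen's d for the follow-up tests equals t divided by the square root of the sample size (n = 20, as stated in the limitations section). These formulas are an assumption about how the values were derived, not a description taken from the manuscript, although they reproduce the reported numbers.

```python
import math

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)

def cohens_d_from_t(t_value, n):
    """Cohen's d for a one-sample or paired t-test with n observations (assumed design)."""
    return t_value / math.sqrt(n)

# Values taken from the beta-band Granger causality results quoted above.
eta_roi = partial_eta_squared(3.443, 7, 9)   # main effect of ROI, F(7,9) = 3.443
d_m1_to_stn = cohens_d_from_t(3.609, 20)     # M1 to STN, contralateral hemisphere
```

      Rounded to three decimals, these give 0.728 and 0.807, matching the reported η<sub>p</sub><sup>2</sup> and d.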

      Discussion, The dynamics of STN-cortex coherence

      “Considering the timing of the increase observed here, the STN’s role in movement inhibition (Benis et al., 2014; Ray et al., 2012) and the fact that frontal and prefrontal cortical areas are believed to drive subthalamic beta activity via the hyperdirect pathway (Chen et al., 2020; Oswal et al., 2021), it seems plausible that the increase of beta coherence reflects feedback of sensorimotor cortex to the STN in the course of post-movement processing. In line with this idea, we observed a cortical drive of subthalamic activity in the beta band.”

      (4) The study cohort focused on senior adults, who may exhibit age-related cortical responses during movement planning in neural mechanisms. These aspects were not discussed in the study.

      We appreciate the comment and agree that age may have impacted neural oscillatory activity of patients in the present study. We now acknowledge this in the limitations section, and point out that our approach to handling these effects was including age as a covariate in the statistical analyses.

      Revised article

      Discussion, Limitations and Future Directions

      “Further, most of our participants were older than 60 years. To diminish any confounding effects of age on movement-related modulations of neural oscillations, such as beta suppression and rebound (Bardouille & Bailey, 2019; Espenhahn et al., 2019), we included age as a covariate in the statistical analyses.”

      (5) Including a control group of patients with other movement disorders who also undergo DBS surgery would be beneficial, because we cannot otherwise determine whether the observed findings are specific to PD or can be generalized. Additionally, the current title and the article, which are oriented toward understanding human motor control, may not be appropriate.

      We thank the reviewer for this comment and fully agree that it cannot be ruled out that the present findings are, in part, specific to PD. We acknowledge this limitation in the Limitations and future directions section (p. 20-21). Indeed, including a control group of patients with other disorders would be ideal, but the scarcity of patients with diseases other than PD who receive STN DBS makes this an unfeasible option. We do suggest that future research may address this issue by extending our approach to different disorders or healthy participants on the cortical level (p. 21). Lastly, we appreciate the idea to adjust the title of the present article. The adjusted title is: “Context-Dependent Modulations of Subthalamo-Cortical Synchronization during Rapid Reversals of Movement Direction in Parkinson’s Disease”.

      That being said, we do believe that our findings at least approximate healthy functioning and are not solely related to PD. For one, patients were on their usual dopaminergic medication for the study and dopamine has been found to normalize pathological alterations of beta activity. More importantly, the general pattern of movement-related beta and gamma oscillations has been observed in numerous diseases and brain structures, including cortical beta oscillations measured non-invasively in healthy participants. Thus, it is not unlikely that the new aspects discovered here are also general features of motor processing.

      Revised article

      Discussion, Limitations and future directions

      “Furthermore, we cannot be sure to what extent the present study’s findings relate to PD pathology rather than general motor processing. We suggest that our approach at least approximates healthy brain functioning as patients were on their usual dopaminergic medication. Dopaminergic medication has been demonstrated to normalize power within the STN and globus pallidus internus, as well as STN-globus pallidus internus and STN-cortex coherence (Brown et al., 2001; Hirschmann et al., 2013). Additionally, several of our findings match observations made in other patient populations and healthy participants, who exhibit the same beta power dynamics at movement start and stop (Alegre et al., 2004) that we observed here. Notably, our finding of enhanced cortical involvement in the face of uncertainty aligns well with established theories of cognitive processing, given the cortex's prominent role in managing higher cognitive functions (Altamura et al., 2010). Yet, transferring our approach and task to patients with different disorders, e.g. obsessive-compulsive disorder, or examining young and healthy participants solely at the cortical level, could contribute to elucidating whether the synchronization dynamics reported here are indeed independent of PD and age.”

      Reviewer #3 (Recommendations for the authors):

      Despite the strengths of the "rhythm" account of cognitive processes, the paper could possibly be improved by making it less skewed to rhythms explaining all of the movement encoding.

      Thank you for this comment - the point is well taken. There is a large body of literature relating neural oscillations to spiking in larger neural populations, which itself is likely the most relevant signal with respect to motor control. In our eyes, it is this link that justifies the rhythm account, i.e. we agree with the reviewer that action potentials are the basis of movement encoding in the brain, not oscillations. Unfortunately, we cannot measure spiking with the method at hand.

      To better integrate this view into the current manuscript, we make the following suggestion for future research in the Limitations and future directions section (p. 21): “Lastly, given the present study’s focus on understanding movement-related rhythms, particularly in the beta range, future research could further explore the role of gamma oscillations in continuous movement and their relation to action potentials in motor areas (Fischer et al., 2020; Igarashi, Isomura, Arai, Harukuni, & Fukai, 2013), which form the basis of movement encoding in the brain.”

      In Figure 5 - is the legend correct? Is it really just a 0.2% change in power only? That would be a very surprisingly small effect size.

      We thank the reviewer for noting this. Indeed, the numbers on the scale quantify relative change (post - pre)/pre and should be multiplied by 100 to obtain %-change. We have adjusted the color bars accordingly.
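
      The scaling at issue is simple arithmetic: a relative change of 0.2 on the original color bar corresponds to a 20% change, not 0.2%. A minimal sketch, using hypothetical pre- and post-event power values in arbitrary units:

```python
def relative_change(pre, post):
    """Relative change (post - pre) / pre, as plotted on the original color bar."""
    return (post - pre) / pre

def percent_change(pre, post):
    """Relative change expressed in percent."""
    return 100.0 * relative_change(pre, post)

# Hypothetical power values in arbitrary units.
pre_power, post_power = 1.0, 1.2
rel = relative_change(pre_power, post_power)
pct = percent_change(pre_power, post_power)
```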

      The dissociation between the effects of unpredictable cues in coherence versus raw power is interesting and could potentially be directly contrasted further in the discussion (here they are presented separately with separate discussions, but this seems like a pretty important and novel finding as beta coherence and power usually go in the same direction).

      We appreciate the reviewer’s interest in our findings on the predictability of movement instructions. In case of coherence, the difference between pre- and post-event was generally more positive in the unpredictable condition, meaning that suppressions (negative pre-post difference) were diminished whereas increases (positive pre-post difference) were enhanced. With respect to power, we also observed less suppression in the unpredictable condition at movement start. Therefore, the direction of change is in fact the same. We made this clearer in the revised version by adapting the corresponding sections of the abstract, results and discussion (see below).

      The only instance of coherence and power diverging (on a qualitative level) was observed during reversals: here, we noted post-event increases in coherence and post-event decreases in M1 power in the group-average spectra. However, when comparing the pre- and post-event epochs statistically by means of permutation testing, the coherence increase did not reach significance. Hence, we did not highlight this aspect.
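
      For readers unfamiliar with this kind of test, the idea behind a paired pre/post permutation comparison can be sketched as follows. This is a deliberately simplified, single-value version based on random sign flips of per-patient differences; the cluster-based machinery needed for time-frequency data is omitted, and all numbers are hypothetical.

```python
import random
from statistics import mean

def sign_flip_p_value(differences, n_permutations=5000, seed=0):
    """Two-sided paired permutation test on the mean of post-minus-pre differences."""
    rng = random.Random(seed)
    observed = abs(mean(differences))
    hits = 0
    for _ in range(n_permutations):
        # Under the null hypothesis, the sign of each paired difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in differences]
        if abs(mean(flipped)) >= observed:
            hits += 1
    # Add-one correction: the observed labelling counts as one permutation.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical per-patient differences (post minus pre), arbitrary units.
consistent = [0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.2, 0.9, 1.1, 1.0]
mixed = [0.4, -0.4, 0.9, -0.9, 0.2, -0.2]

p_consistent = sign_flip_p_value(consistent)
p_mixed = sign_flip_p_value(mixed)
```

      A change that is consistent across patients yields a small p-value, whereas increases and decreases that cancel across patients do not, even when individual changes are sizeable.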

      Revised version

      Abstract

      “… Event-related increases of STN-cortex beta coherence were generally stronger in the unpredictable than in the predictable condition. … “

      Results, Effects of predictability on beta power  

      “With respect to the effect of predictability of movement instructions on beta power dynamics (research aim 2), we observed an interaction between movement type and condition (F<sub>cond*mov</sub> (2,14) = 4.206, p<sub>cond*mov</sub> = 0.037, η<sub>p</sub><sup>2</sup> = 0.375), such that the beta power suppression at movement start was generally stronger in the predictable (M = -0.170, SD = 0.065) than in the unpredictable (M = -0.154, SD = 0.070) condition across ROIs (t = -1.888, one-sided p = 0.037, d = -0.422). We did not observe any modulation of gamma power by the predictability of movement instructions (F<sub>cond</sub> (1,15) = 0.792, p<sub>cond</sub> = 0.388, η<sub>p</sub><sup>2</sup> = 0.050, Supplementary File 5).”

      Effects of predictability on STN-cortex coherence

      “With respect to the effect of predictability of movement instructions on beta coherence (research aim 2), we found that the pre-post event differences were generally more positive in the unpredictable condition (main effect of predictability condition; F<sub>cond</sub>(1,15) = 8.684, p<sub>cond</sub> = 0.010, η<sub>p</sub><sup>2</sup> = 0.367; Supplementary File 3), meaning that the suppression following movement start was diminished and the increases following stop and reversal were enhanced in the unpredictable condition (Fig. 6a). This effect was most pronounced in the MSMC (Fig. 6b). When comparing region-average TFRs between the unpredictable and the predictable condition, we observed a significant difference only for stopping (t<sub>clustersum</sub> = 142.8, p = 0.023), suggesting that the predictability effect was mostly carried by increased beta coherence following stops. When repeating the rmANCOVA for pre-event coherence, we did not observe an effect of predictability (F<sub>cond</sub>(1,15) = 0.163, p<sub>cond</sub> = 0.692, η<sub>p</sub><sup>2</sup> = 0.011), i.e. the effect was most likely not due to a shift of baseline levels. The increased tendency for upward modulations and decreased tendency for downward modulations rather suggests that the inability to predict the next cue prompted intensified event-related interaction between STN and cortex. STN-cortex gamma coherence was not modulated by predictability (F<sub>cond</sub>(1,15) = 0.005, p<sub>cond</sub> = 0.944, η<sub>p</sub><sup>2</sup> = 0.000, Supplementary File 5).”

      Discussion, Beta coherence and beta power are modulated by predictability

      “In the present paradigm, patients were presented with cues that were either temporally predictable or unpredictable. We found that unpredictable movement prompts were associated with stronger upward modulations and weaker downward modulations of STN-cortex beta coherence, likely reflecting the patients adopting a more cautious approach, paying greater attention to instructive cues. Enhanced STN-cortex interactions might thus indicate the recruitment of additional neural resources, which might have allowed patients to maintain the same movement speed in both conditions. […]”

      With respect to power, we observed reduced beta suppression in the unpredictable condition at movement start, consistent with the effect on coherence, likely demonstrating a lower level of motor preparation.

      Given that you have a nice continuous data task here (the turning of the wheel), it might be interesting to cross-correlate the circular position (and, separately, the velocity) of the turning with the envelope of the beta signal. It would be a nice finding if you could also show that beta is modulated continuously by the continuous movements. In the natural world, we rarely perform a continuous movement with a sudden reversal or stop; most of the time we are in continuous movement. Looking at this might also be a strength of your dataset.

      We could not agree more. In fact, having a continuous behavioral output was a major motivation for choosing this particular task. We are very interested in state space models such as preferential subspace identification (Sani et al., 2021), for example. These models relate continuous brain signals to continuous behavioral target variables and should be of great help for questions such as: do oscillations relate to moment-by-moment adaptations of continuous movement? Which frequency bands and brain areas are important? Is angular position encoded by different brain areas/frequency bands than angular speed? These analyses are in fact ongoing. This project, however, is too large to fit into the current article.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study is an important follow-up to their prior work - Wong et al. (2019), starting with clear questions and hypotheses, followed by a series of thoughtful and organized experiments. The method and results are convincing. Experiment 1 demonstrated the sensory preconditioned fear with few (8) or many (32) sound-light pairings. Experiments 2A and 2B showed the role of PRh NMDA receptors during conditioning for online integration, revealing that this contribution is present only after a few sound-light pairings, not after many sound-light pairings. Experiments 3A and 3B showed the contribution of PRh-BLA communication to online integration, again only after a few but not after many. Contrary to Experiments 3A and 3B, Experiments 4A and 4B showed the contribution of PRh-BLA communication to integration at test only after many but not few sound-light pairings.

      Strengths:

      Throughout the manuscript, the methods and results are clearly organized and described, and the use of statistics is solid, all contributing to the overall clarity of the research. The discussion section was also well-written, effectively comparing the current research with the prior work and offering insightful interpretations and potential future directions for this line of research. I have only a limited number of concerns about some results and some details of experiments/statistics.

      We thank the reviewer for their positive assessment.

      Weaknesses:

      Could you provide further interpretation regarding line 171: the observation that sensory preconditioned fear increased with the number of sound-light pairings? Was this increase due to better sound-light association learning during Stage 1? Additionally, were there any experimental differences between Experiment 1 and the other experiments that might explain why freezing was higher in the P32 group compared to the P8 group? This pattern seemed to be absent in the other experiments. If we consider the hypothesis that the online integration mechanism is more active with fewer pairings and the chaining mechanism at the test is more prominent with many pairings, we wouldn't expect a difference between the P8 and P32 groups. Given the relatively small sample size in Experiment 1, the authors might consider conducting a cross-experiment analysis or something similar to investigate this further.

      We appreciate the reviewer’s point and thank them for the question. The heightened level of sensory preconditioned fear among rats that received many sound-light pairings in the initial control experiment (Group P32) may reflect the combined effects of both mediated learning and chaining at test. We are, however, reluctant to offer a strong interpretation of this result as it was not replicated in the subsequent experiments: i.e., the levels of freezing to the sensory preconditioned stimulus at test were almost identical among vehicle-injected controls that received either few (8) or many (32) sound-light pairings in Experiments 2A and 2B; and this was also true in Experiments 3A and 3B, and again in Experiments 4A and 4B. A key difference between the initial and subsequent experiments is that, in contrast to the initial experiment, rats in subsequent experiments underwent surgery for one reason or another (implantation of cannulas, lesion of the perirhinal cortex). The implication is that surgical interventions in the perirhinal cortex and/or basolateral amygdala might affect the way that rats integrate the sound-light and light-shock associations in sensory preconditioning: i.e., they may force rats to rely on one type of integration strategy or the other. This is, of course, purely speculative – it will be addressed in future research.

      Reviewer #2 (Public review):

      This manuscript builds on the authors' earlier work, most recently Wong et al. 2019, in which they showed the importance of the perirhinal cortex (PRh) during the first-order conditioning stage of sensory preconditioning. Sensory preconditioning requires learning between two neutral stimuli (S2-S1) and subsequent development of a conditioned response to one of the neutral stimuli after pairing of the other stimulus with a motivationally relevant unconditioned stimulus (S1-US). One highly debated question regarding the mechanisms of learning of sensory preconditioning has been whether conditioned responses evoked by the indirectly trained stimulus (S2) occur through a mediated representation at the time of the first-order US training, or whether the conditioned responses develop through a chained evoked representation (S2--> S1 --> US) at the time of test. The authors' prior findings provided strong evidence for PRh being involved in mediated learning during the first-order training. They showed that protein synthesis was required during the first-order S1-US learning to support the conditioned response to the indirectly trained stimulus (S2) at the test.

      One question remaining following the previous paper was whether certain conditions may promote a chaining mechanism over mediated learning, as there is some evidence for chained representations at the time of the test. In this paper, the authors directly address this important question and find unambiguous results that the extent of training during the preconditioning stage impacts the involvement of PRh during the first-order conditioning or stage 2. They show that putative blockade of synaptic changes in PRh, using an NMDA antagonist, disrupts responding to the preconditioned cue at test during shorter duration preconditioning training (8 trials), but not during extended training (32 trials). They also show that this is the case for communication between the PRh and BLA during the same stage of training using a contralateral inactivation approach. This confirms their previous findings in 2019 of connectivity between these regions for the short-duration training, while they observe here for the first time that this is not the case for extended training. Finally, they show that with extended training, communication between BLA and the PRh is required at the final test of the preconditioned stimulus, but not for the short duration training.

      The results are clear and extremely consistent across experiments within this paper as well as with earlier work. The experiments here are thorough, and well-conceived, and address an important and highly debated question in the field regarding the neural and psychological mechanisms underlying sensory preconditioning. This work is highly impactful for the field as the debate over mediated versus chaining mechanisms has been an important topic for more than 70 years.

      We thank the reviewer for their kind assessment.

      Reviewer #3 (Public review):

      The authors tested whether the number of stimulus-stimulus pairings alters whether preconditioned fear depends on online integration during the formation of the stimulus-outcome memory or during the probe test/mobilization phase, when the original stimulus, which was never paired with aversive events, elicits fear via chaining of stimulus-stimulus and stimulus-outcome memories. They found that sensory preconditioning was successful with either 8 or 32 stimulus-stimulus pairings. Perirhinal cortex NMDA receptor blockade during stimulus-outcome learning impaired preconditioning following 8 but not 32 pairings during preconditioning. Therefore, perirhinal cortex NMDA activity is required for online integration or mediated learning. Perirhinal-basolateral amygdala had nearly identical effects with the same interpretation: these areas communicate during stimulus-outcome learning, and this online communication is required for later expressing preconditioned fear. Disconnection prior to the probe test, when chaining might occur, had different effects: it impaired the expression of preconditioned fear in rats that received 32, but not 8, pairings during preconditioning. The study has several strengths and provides a thoughtful discussion of future experiments. The study is highly impactful and significant; the authors were successful in describing the behavioral and neurobiological mechanisms of mediated learning versus chaining in sensory preconditioning, which is often debated in the learning field. Therefore this study will have a significant impact on the behavioral neurobiology and learning fields.

      Strengths:

      Careful, rigorous experimental design and statistics.

      The discussion leaves open questions that are very much worth exploring. For example - why did perirhinal-amygdala disconnection prior to the probe have no effect in the 8-pairing group, when bilateral perirhinal inactivation did (in Wong et al, 2019)? The authors propose that perirhinal cortex outputs bypass the amygdala during the probe test, which is an excellent hypothesis to test.

      The authors provide evidence that both mediated learning and chaining occur.

      Thank you for the positive assessment – we fully intend to identify the circuitry that regulates retrieval/expression of sensory preconditioned fear when it is based on mediated learning in stage 2.

      Weaknesses:

      This is inherent to all neural interference and behavioral experiments: biological/psychological functions do not typically operate binarily. There is no single clear number or parameter at which mediated learning or chaining happens, and both probably happen to some extent. Addressing this is even more difficult given behavioral variability across subjects, implant sites, etc. Thus, this is not so much a weakness particular to this study as much as an existential problem, which the authors were able to work around with careful experimental design and appropriate controls.

      We completely agree with the point raised here and thank the reviewer for their assessment.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) It appears that the method description for Sensory Preconditioning was copied from their previous Wong et al. (2019) paper, which is fine, but in the current research, the authors use 8 or 32 presentations, which is not reflected in the description.

      Thank you for bringing this to our attention. This is now addressed in the method section on page 27 (beginning at line 655):

      “Rats received either eight presentations of the sound and eight of the light in a single session, or 32 presentations of the sound and 32 of the light across four daily sessions. On Day 3, all rats received eight presentations of the sound and eight of the light. Each presentation of the sound was 30 s in duration and each presentation of the light was 10 s in duration. The first stimulus presentation occurred five min after rats were placed into the chambers. The offset of one stimulus co-occurred with the onset of the other stimulus for groups that received paired presentations of the sound and the light, while these stimuli were presented separately for groups that received explicitly unpaired presentations. The interval between each paired presentation was five min while the interval between each separately presented stimulus was 150 s. After the last stimulus presentation, rats remained in the chambers for an additional one min. They were then returned to their home cages. This training was repeated on Days 4-6 for rats that received 32 presentations of the sound and 32 of the light. All rats proceeded to first-order conditioning (details below) the day after their final session of sound and light exposures, which was Day 4 for rats exposed to eight presentations of the sound and light and Day 7 for rats exposed to 32 presentations of the sound and light.”

      (2) Line 148: Could the authors clarify how the "significant linear increase" was assessed? From similar descriptions in later experiments, it seems it was based on a comparison of freezing across the four presentations, but the F(1,26) statistic suggests there seemed to be a half-split test. The same questions exist in all the experiments. Please clarify.

      Conditioning data were analysed using contrasts with repeated measures in ANOVA. The repeated measures (or within-subject) factor was “trial” as all rats were exposed to four light-shock pairings in this stage of training. We examined whether there was a significant linear increase in freezing across trials using a standard within-subject contrast. The specific coefficients for this contrast, given the four trials, were -3, -1, 1, and 3. The reason that the degrees of freedom remain 1 and 26 in this analysis is that the within-subject contrast is part of a set of planned orthogonal contrasts. That is, in any planned analysis of the sort conducted here, the df1 will always be 1, indicating the very nature of the analysis. There was no splitting of the data, or comparisons between the split halves.
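
      The logic of a single-degree-of-freedom linear trend contrast can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code: the function names are hypothetical, and it assumes that the error term is the variance of the per-rat contrast scores pooled within groups, so that a design with N rats in G groups yields df = (1, N - G).

```python
LINEAR_COEFFS = (-3, -1, 1, 3)  # linear trend weights for four equally spaced trials


def contrast_score(trials, coeffs=LINEAR_COEFFS):
    """Collapse one rat's four trial scores into a single linear-trend score."""
    return sum(c * x for c, x in zip(coeffs, trials))


def linear_trend_f(groups):
    """Test whether the mean linear-trend score differs from zero.

    `groups` is a list of groups; each group is a list of rats; each rat is a
    sequence of four freezing scores (one per conditioning trial).
    Returns (F, df1, df2), where df1 is always 1 for a single contrast.
    """
    scores = [[contrast_score(rat) for rat in group] for group in groups]
    n_total = sum(len(g) for g in scores)
    grand_mean = sum(sum(g) for g in scores) / n_total

    # Error term: variability of contrast scores around their own group mean.
    ss_within = 0.0
    for g in scores:
        group_mean = sum(g) / len(g)
        ss_within += sum((s - group_mean) ** 2 for s in g)
    df_error = n_total - len(scores)
    ms_within = ss_within / df_error

    f_value = n_total * grand_mean ** 2 / ms_within
    return f_value, 1, df_error
```

      Because each rat contributes exactly one contrast score, the numerator has one degree of freedom regardless of the number of trials, and no split of the data into halves is involved.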

      (3) Line 154: Could the authors clarify what is meant by "other main effects and their interactions"? It is not clearly inferable from the context.

      Apologies for the confusion here. “Other main effects” refer to the two between-subject factors in isolation: i.e., the overall comparison of freezing to the light (averaged across the four trials) between groups that received either paired or unpaired stimulus presentations in stage 1 (factor 1 → main effect 1), and between groups that received either eight or 32 sound and light exposures in stage 1 (factor 2 → main effect 2). “Their interaction” refers to the assessment of whether the overall difference in freezing to the light (averaged across the four trials) between Groups P8 and U8 differs from the overall difference in freezing to the light (averaged across the four trials) between Groups P32 and U32. We have edited the text near line 153 to indicate that:

      “The overall comparisons of freezing to the light (averaged across the four conditioning trials) between groups that received either paired or unpaired stimulus presentations in stage 1 (factor 1), and between groups that received either eight or 32 sound and light exposures in stage 1 (factor 2), were not significant (Fs < .45, p > .508). The interaction between these two between-subject factors was also not significant (F < .45, p > .508).”

      (4) The use of sound and light as preconditioned and conditioned cues is counterbalanced. Was there any difference in the increase of freezing during conditioning depending on the type of conditioned cues? Was there any difference in the preconditioned fear? While it is hard to assess statistical significance due to the sample size limit, even observing a trend could be interesting.

      We examined whether the levels of freezing to the conditioned and preconditioned stimuli depend on their physical identity. In general, there was a slight trend towards more freezing to the preconditioned stimulus when it was a tone, and less freezing to the conditioned stimulus when it was a tone. These are, however, simply indications. None of the statistical comparisons between rats for which the preconditioned stimulus was the tone (and, thereby, conditioned stimulus was the light) and rats for which the preconditioned stimulus was the light (and, thereby, conditioned stimulus was the tone) reached the conventional level of significance.

      (5) General suggestion on reporting non-significant statistics: the authors reported a small F statistic value a few times to suggest non-significance. But without clearly specifying degrees of freedom, it is hard to get a sense of statistical significance (e.g. Line 227, largest F<3.10). I recommend adding p values alongside the F statistics and reporting exact statistics whenever possible.

      Apologies for the omission. The p values have now been included alongside all non-significant F statistics.

      (6) Another general suggestion is to use non-parametric statistical testing with such small sample sizes. I recommend using the Kruskal-Wallis H test (the non-parametric equivalent of the F-statistic) to replace the ANOVA result. Also, given that many tests only involve comparing two independent groups, using the Mann-Whitney U test (the non-parametric equivalent of the independent t-test) would be sufficient.

      We understand that small sample sizes can occasionally lead to unequal variances between groups, which necessitates the use of non-parametric statistics. However, as non-parametric statistics raise a different set of issues for data analysis (e.g., power) and interpretation, our general view for the type of data collected in this study is that parametric analyses are appropriate and should be retained (particularly in the absence of unequal variances between groups). We hold this view for two reasons. First, the hypotheses tested in the present series were derived from past work in which parametric analyses revealed meaningful patterns of results at the same level of statistical power. Second, the application of these analyses then yielded results consistent with our hypotheses: for the most part, we observed between-group differences where we expected there to be such differences and did not observe between-group differences where we did not expect there to be such differences. As such, we have not switched from a parametric to non-parametric analysis strategy. We do, however, appreciate the suggestion and will apply a non-parametric approach where it is warranted in our future work.
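
      For readers unfamiliar with the rank-based alternative raised in this point, the Mann-Whitney U statistic can be sketched as below. This is a generic textbook computation on hypothetical inputs, not an analysis of the data in this study, and it returns only the two U values (no p value); the function names are illustrative.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent groups of scores."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b
```

      The test discards the magnitude of between-group differences and keeps only their rank order, which is one source of the power considerations mentioned in the response.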

      Reviewer #2 (Recommendations for the authors):

      I have a few very minor comments for the authors regarding the discussion and interpretation of the very nice experimental results.

      (1) In Figures 4 and 5, the authors provide a schematic of the experiment. It's very clearly indicated whether the BLA inactivation is ipsi- or contralateral, but the unilateral PRh lesion isn't mentioned. I'd recommend including that here so that someone reading through the figures can more easily understand the experiment. The hypothesis is clear and the experiment is so well designed that a read through of the figures can relay most information to an experienced reader.

      Thank you for this suggestion – we have included information about the unilateral PRh lesion in the schematic for Figures 4 and 5.

      (2) The authors have an extended description of backward conditioning in the discussion. It seems like the authors are suggesting this as an important future direction, but they never explicitly say this, resulting in a bit of confusion as to what this section refers to. Also, Ward-Robinson and Hall 1996 showed backward sensory preconditioning using a serial auditory-visual association and argued for a mediated solution based on their results. It may be worth citing that paper here.

      Apologies for the lack of clarity. We have revised this point in the discussion (page 18, beginning line 434) and referenced Ward-Robinson and Hall (1996):

      “Why does increasing the number of sound-light pairings change the way that rats integrate the sound-light and light-shock memories? One possibility is that increasing the number of sound-light pairings in stage 1 reduces the ability of each stimulus to activate the memory of the other. This is consistent with findings by Holland (1998), who showed that the likelihood of mediated learning in rats decreases with the amount of training (see also Holland, 2005); but inconsistent with our findings that, after extended training, rats continue to integrate the sound-light and light-shock associations through chaining at the time of testing (as chaining is predicated on the sound activating the memory of the light after extended training). Instead, we propose that the change in integration occurs because the increased number of sound-light pairings allows the rats to learn about the order in which the sound and light are presented (Figure 1; for evidence that rats acquire order information in sensory preconditioning, see Barnet et al., 1997; Hart et al., 2022; Leising et al., 2007; Miller & Barnet, 1993). This order hypothesis is consistent with evidence showing that the way in which animals represent an audio-visual compound changes across repeated compound exposures (e.g., Bellingham & Gillette, 1981; Holmes & Harris, 2009). It can be tested using a so-called “backward” sensory preconditioning protocol, which reverses the order of stimulus presentations in stage 1 (e.g., Ward-Robinson & Hall, 1996). That is, rather than rats being exposed to the “forward” sound-light pairings used here and by Wong et al. (2019), rats in a backward protocol are exposed to light-sound pairings. Increasing the number of light-sound pairings in this protocol should result in rats learning that the light is followed by the sound (light→sound) and that the sound is followed by nothing (sound→nothing). 
      Hence, during the session of light-shock pairings in stage 2, the light should continue to activate the memory of the sound, resulting in formation of the mediated sound-shock association (e.g., Ward-Robinson & Hall, 1996). That is, if our order hypothesis is correct, increasing the number of light-sound pairings in the backward protocol should preserve the likelihood of mediated learning in stage 2 and, if anything, diminish the likelihood of chaining at test in stage 3 (as the sound is never followed by a light). Hence, PRh manipulations that fail to affect fear of the sound when administered after many sound-light pairings (e.g., infusion of DAP5) should disrupt that fear when administered after many light-sound pairings in the backward protocol. This will be assessed in future work.”

      (3) Line 467 in the discussion suggests that the results are surprising that PRh-BLA communication is not needed at test when learning putatively occurs through a mediated mechanism during first-order conditioning. I was a bit surprised by this comment since I was under the assumption that only BLA was required at this point after consolidation of the mediated learning. Holmes et al., 2013 showed that BLA is required for extinction to S2 after first-order conditioning. In that experiment they inactivated BLA during S2- presentations (typically considered the extinction test), and showed that reduction to S2 did not occur the subsequent day, indicating the memory was stored in BLA and may not necessarily require PRh-BLA communication.

      The result noted here was somewhat surprising as our past studies showed that silencing activity in the PRh prior to testing attenuates freezing to a sensory preconditioned stimulus (i.e., an S2). We took this to mean that the PRh is necessary for retrieval/expression of fear to S2 and supposed that this retrieval/expression would be achieved through communication between the PRh and BLA. However, the results of the PRh-BLA disconnection at test show that this communication is not required, leaving us to speculate that retrieval/expression of fear to S2 may be achieved through communication between the PRh and CeA.

      We have edited the opening of the relevant paragraph to clarify why the result noted here was surprising (page 20, beginning line 485):

      “While the PRh and BLA clearly communicate to support mediated learning about the sound, this communication is not required for retrieval/expression of the mediated sound-shock association at the time of testing. This result is somewhat surprising as activity in the PRh is needed for expression of fear to the sound (Holmes et al., 2013; Wong et al., 2019) and raises the question: how does the PRh-dependent sound-shock association come to be expressed in fear responses?”

      (4) The authors reference Holland 1981 and 1998, yet there's not much discussion of these findings. I think there should be a bit more emphasis on these studies since they show how mediated learning greatly depends on the extent of training. Also, it may be worth considering Holland's theory of why mediated conditioning is more effective with shorter training. His theory may be consistent with the authors, but I believe he suggests that early in training a stronger mediated representation is evoked which tends to dissipate with time. I think this is a valid hypothesis to consider in this paper.

      The Holland papers show that rats form mediated associations (Holland, 1981) and that the likelihood of them doing so decreases with the amount of training (Holland, 1998). These findings are paralleled by those reported in the present series of experiments. However, the protocols used by Holland were very different to those used in the present study; and the explanation for his 1998 findings (which is the more relevant of the two papers) simply does not apply to the case of sensory preconditioning.

      To be clear: Holland (1998) exposed rats to either “few” or “many” tone-food pairings in stage 1, tone-lithium chloride pairings in stage 2 and, finally, tested rats with the food alone in stage 3. He predicted and showed that those exposed to few tone-food pairings showed an aversion to the food at test (i.e., they consumed less of the food than controls) whereas those exposed to many tone-food pairings showed no such aversion (i.e., they consumed the same amount of food as the controls). This was taken to mean that, across the series of tone-lithium pairings, the tone activated the memory of food among rats in the few condition, resulting in a mediated food-lithium association; but failed to do so among rats in the many condition, resulting in no food-lithium association. According to Holland, the tone failed to activate the memory of food in the many condition because, by the end of training in stage 1, it was not needed for them to know what to do when the tone was presented: they simply had to run to the magazine to collect the food when delivered. That is, the tone eventually associated with the responses that rats emitted in the training situation, thereby obviating any need for activation of the food memory.

      While this explanation is both elegant and interesting, it cannot be applied to the results obtained in the present study where the initial stage of training involved few or many sound-light pairings. That is, unlike in the Holland study where rats in the many condition eventually learned a stimulus-“run to magazine” association that maintained performance in the absence of any mental image of food, in the present study, any stimulus-response association acquired in stage 1 (e.g., orienting responses towards the sources of the auditory and visual stimuli) cannot have contributed to the expression of sensory preconditioned fear at test. Hence, stimulus-response learning in the many condition cannot be invoked to explain the pattern of results in the present study, even if it adequately explains what-appears-to-be a similar finding in the Holland study.

      Nonetheless, we have included a reference to the general style of explanation that was considered and rejected by Holland in his 1998 and 2005 papers. This appears on page 18 (beginning line 434) and reads:

      “Why does increasing the number of sound-light pairings change the way that rats integrate the sound-light and light-shock memories? One possibility is that increasing the number of sound-light pairings in stage 1 reduces the ability of each stimulus to activate the memory of the other. This is consistent with findings by Holland (1998), who showed that the likelihood of mediated learning in rats decreases with the amount of training (see also Holland, 2005); but inconsistent with our findings that, after extended training, rats continue to integrate the sound-light and light-shock associations through chaining at the time of testing (as chaining is predicated on the sound activating the memory of the light after extended training). Instead, we propose that the change in integration occurs because the increased number of sound-light pairings allows the rats to learn about the order in which the sound and light are presented (Figure 1; for evidence that rats acquire order information in sensory preconditioning, see Barnet et al., 1997; Hart et al., 2022; Leising et al., 2007; Miller & Barnet, 1993)…”

      (5) There is also a Holland 2005 paper in which he tests whether extended training of the initial stimulus associations may result in a reduced associability of those stimuli. This would potentially result in lower mediated learning due to a decreased associability of the mediated representation, thereby explaining why extended training reductions in mediated learning occur. Using a probabilistic design, Holland shows that this reduction in mediated learning is likely not due to a change in associability.

      We appreciate the note re Holland (2005) and have included a reference to it in our General Discussion. We agree with Holland that the reduction in mediated learning across extended training is not due to reduced associability of the retrieved stimulus representation. If this were the case, it would remain to explain why stimulus representations continue to be activated at test, which must occur for successful chaining of the sound-light and light-shock associations upon presentations of the sound alone. This is included in the modified text on page 18 (beginning line 434), which is part of our response to point 4.

      Reviewer #3 (Recommendations for the authors):

      (1) I think the 4th intro paragraph is essentially saying that more pairings during preconditioning encourage chaining as opposed to mediated learning - I might recommend clarifying this a bit. It took me a while to put it together.

      Apologies for the confusion. We have clarified the argument at this point in the Introduction with the following insertion on page 4 (beginning line 84):

      “That is, increasing the number of sound-light pairings may allow rats to encode information about stimulus order in stage 1 and, thereby, shift the locus of integration from mediated conditioning in stage 2 to chaining at test in stage 3 (Holmes et al., 2022).”

      (2) In analyzing test data I am assuming percent freezing is the average of the entire 30s or 10s CS period - could this be clarified?

      This is correct and has been clarified in the section for ‘Scoring and Statistics’ on page 29 (beginning line 708):

      “Freezing data were collected using a time-sampling procedure in which each rat was scored as either ‘freezing’ or ‘not freezing’ every two seconds by an observer blind to the rat’s group allocation. A percentage score was then calculated by dividing the number of samples scored as freezing by the total number of samples. The baseline level of freezing was established by scoring the first two min at the start of each experimental session: i.e., we divided the total number of samples scored as freezing by the total number of observed samples, which was 60. The levels of freezing to the 10 s conditioned stimulus and 30 s preconditioned stimulus were established in a similar manner: we scored the entire period of each stimulus presentation and divided the number of samples scored as freezing by the total number of observed samples, which was 5 for each presentation of the conditioned stimulus and 15 for each presentation of the preconditioned stimulus.”
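
      The scoring arithmetic described in this passage can be checked with a short sketch. The 2 s sampling interval and the period durations (2 min baseline, 10 s conditioned stimulus, 30 s preconditioned stimulus) come from the text above; the function names and the example observations are hypothetical.

```python
SAMPLE_INTERVAL_S = 2  # one freezing / not-freezing judgement every two seconds


def n_samples(duration_s, interval_s=SAMPLE_INTERVAL_S):
    """Number of time-sampled observations in a scoring period."""
    return duration_s // interval_s


def percent_freezing(samples):
    """Percentage of observations scored as freezing.

    `samples` is a sequence of booleans, True meaning 'freezing'.
    """
    samples = list(samples)
    return 100.0 * sum(samples) / len(samples)


baseline_n = n_samples(120)       # first two minutes of the session
conditioned_n = n_samples(10)     # 10 s conditioned stimulus
preconditioned_n = n_samples(30)  # 30 s preconditioned stimulus
```

      For example, a rat scored as freezing on three of the five observations during one conditioned stimulus presentation would receive `percent_freezing([True, True, False, False, True])`, i.e. 60%.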

      (3) Complementary to the above - during the probe test is there a difference during the first/last 2s of the CS? This would be interesting with respect to understanding the associative structure encoded.

      We have previously examined whether freezing responses change across the duration of a 30 s preconditioned stimulus and a 10 s conditioned stimulus. We have never seen any such changes: in our past work and in the present series of experiments, the expression of freezing is largely uniform across each presentation of a preconditioned or conditioned stimulus.

      (4) It is sort of unclear to me why more CS-CS pairings produced stronger preconditioned fear - is it that both mediated learning and chaining occur and giving 32 pairings permits both processes more than 8 pairings?

      This is a very reasonable explanation for the heightened level of sensory preconditioned fear among rats that received many sound-light pairings in the initial control experiment. We are, however, reluctant to offer a strong interpretation of this result as it was not replicated across subsequent experiments in the series: i.e., the levels of freezing to the sensory preconditioned stimulus at test were largely the same among vehicle-injected controls that received either few (8) or many (32) sound-light pairings in Experiments 2A and 2B, and again in Experiments 3A and 3B as well as Experiments 4A and 4B.

      (5) I would suggest individual data points overlaid on the bars, violin plots, or box and whisker plots to provide a better visualization of the data.

      We appreciate the suggestion. Individual data points are now overlaid on the bars in each histogram.

      (6) There are other citations that would strengthen arguments for the idea that unidirectional/temporal associative structure can be acquired during (appetitive) sensory preconditioning: Leising 2007 Learning and Behavior, Hart 2022 Current Biology, for example.

      Thank you for these citations. We have included references to the Leising et al. (2007) and Hart et al. (2022) papers in our discussion on pages 18-19 (beginning line 442):

      “Instead, we propose that the change in integration occurs because the increased number of sound-light pairings allows the rats to learn about the order in which the sound and light are presented (Figure 1; for evidence that rats acquire order information in sensory preconditioning, see Barnet et al., 1997; Hart et al., 2022; Leising et al., 2007; Miller & Barnet, 1993)…”

      Editor's note:

      We agree with the suggestions about full statistical reporting for non-significant results and about putting individual data points, perhaps coded to identify sex, on top of the bar graphs. Both will increase the transparency of the rigor of the work for readers.

      We thank the editors and authors for their suggestions. We have included full statistical reporting for non-significant results and overlaid individual data points on the bars in each histogram.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Joint Public Review:

      Summary:

      The behavioral switch between foraging and mating is important for resource allocation in insects. This study investigated the role of the neuropeptide, sulfakinin, and of its receptor, the sulfakinin receptor 1 (SkR1), in mediating this switch in the oriental fruit fly, Bactrocera dorsalis. The authors use genetic disruption of sulfakinin and of SkR1 to provide strong evidence that changes in sulfakinin signaling alter odorant receptor expression profiles and antennal responses and that these changes mediate the behavioral switch. The combination of molecular and physiological data is a strength of the study. Additional work would be needed to determine whether the physiological and molecular changes observed account for the behavioral changes observed.

      Strengths:

      (1) The authors show that sulfakinin signaling in the olfactory organ mediates the switch between foraging and mating, thereby providing evidence that peripheral sensory inputs contribute to this important change in behavior.

      (2) The authors' development of an assay to investigate the behavioral switch and their use of different approaches to demonstrate the role of sulfakinin and SkR1 in this process provides strong support for their hypothesis.

      (3) The manuscript is overall well-organized and documented.

      Weaknesses:

      (1) The authors claim that sulfakinin acts directly on SkR1-positive neurons to modulate the foraging and mating behaviors in B. dorsalis. The authors also indicated in the schematic that satiation suppresses SkR1 expression. Additional experiments and a more detailed discussion of the results would help support these claims.

      (2) The findings reported could be strengthened with additional experimental details regarding time of day versus duration of starvation effects and additional genetic controls, amongst others.

      Recommendations for the authors:

      Major issues

      (1) As written the introduction is somewhat fragmented and does not lay out a clear rationale for the current study in the species used by the authors. Others, including Guo et al. (2021) and Wang et al. (2022), have previously shown that sulfakinin signaling pathways are important for feeding and receptivity regulation in D. melanogaster. Thus, the novelty of this study should be more clearly articulated.

      The introduction has been substantially revised to better describe the rationale for the study (lines 60-66 in the revision).

      (2) In addition, the Introduction should provide more specific background information on the pheromonal activity of oriental fruit fly body extract, the odor-preferences, and the sex pheromone of this species compared to that of model insects such as Drosophila melanogaster.

      The revised introduction includes a paragraph on the chemical ecology of the oriental fruit fly as it relates to this study (lines 67-75).

      (3) It isn't clear what the first image in Figure 1C represents - is this a schematic of the area or does it represent data?

      Fig. 1C and its caption have been revised. We changed the track colors to make the trajectories easier to see. The caption now reads: “Representative foraging trajectories in the 100 mm diameter arenas within a 15-min observation period of flies starved for different durations.”

      (4) The authors should include examples of the EAG recordings following the stimulation with food volatiles or pheromones, not only the results of their analyses. This could be included in the main figures or even in supporting information.

      As suggested, we added examples of the EAG recordings following stimulation with food odors and body extracts to Figure 1 and Figure 3.

      (5) The demonstration that removal of the antennae severely impairs mating is dispensable because the antennae are required for other functions in addition to olfaction.

      We agree that the roles of the antennae are likely more than the olfactory function. As suggested, we removed the data.

      (6) It is currently difficult to understand how the authors measured successful rates of foraging. Please provide more details.

      In the revision, we added a sentence describing the method for measuring in detail. See line 269-273.

      (7) The expression of sulfakinin does not change significantly in the antennae following starvation (Figure 2A). Do the authors know whether they change in the central nervous system under these conditions? Have the authors (or has anyone else) checked the expression pattern of sulfakinin in the antennae? This information would help determine whether the sulfakinin signal that acts on SkR1 is released from neurons in the central nervous system (Figure S4C) or whether it is also released from the neurons in the olfactory organs. Based on the immunochemistry results shown in Figure S4C, it would also be interesting to determine whether the intensity of anti-sulfakinin immunoreactivity changes before versus after starvation. This could help establish whether sulfakinin is released during starvation.

      We added expression data showing that the mRNA level of Sk in the head is higher after refeeding (Fig. S3). The change in Sk expression is also described in the text (lines 107-110). We were unable to identify Sk neurons in the antennae, which suggests the possibility that humoral Sk acts directly on the antennae.

      (8) In Figure 2A, the authors show that the expression levels of some neuropeptides system components change during starvation. However, it would be helpful if the authors could include more detailed information on how the results are shown in the figure legends (e.g., the expression level of each candidate in fed flies was set as 1, etc).

      We revised the caption of Figure 2 to explain how the expression values are presented.
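
      One common convention, and the one the reviewer gives as an example, is to scale each candidate's expression so that the fed group equals 1. The sketch below illustrates that convention only; the values and names are hypothetical, and the quantification actually used for Figure 2 may differ.

```python
from statistics import mean


def relative_to_control(control, treated):
    """Scale both groups so that the control (fed) group has a mean of 1."""
    reference = mean(control)
    control_rel = [value / reference for value in control]
    treated_rel = [value / reference for value in treated]
    return control_rel, treated_rel


# Hypothetical expression values for one candidate gene (arbitrary units).
fed = [2.0, 2.5, 3.0]
starved = [4.5, 5.0, 5.5]

fed_rel, starved_rel = relative_to_control(fed, starved)
fold_change = mean(starved_rel)  # expression in starved flies relative to fed flies
```

      Stating the reference group in the legend in this way lets a reader interpret a plotted value of 2 as a doubling relative to fed flies.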

      (9) In Figure 2D, null mutant males of sulfakinin and SkR1 consume more food at all times compared to the wild type. However, the corresponding mutant females consume more food only at night. Is this because the wild-type female flies eat more food during the day? In a related issue, Figure 2D shows differences in food consumption measured at different times of day, however, this is not directly addressed in the text, which instead mentions that "the amount of excess food consumed by the mutants was dependent on the duration of the starvation period in both sexes".

      Thank you for the important suggestions. We speculate that females show a difference in feeding amount only at night because their high basal feeding rate during the daytime masks the increase in feeding caused by knocking out Sk signaling. As suggested, we have added a description of the difference in food consumption. In addition, we changed the Y-axis scale in the figure to allow a fair comparison between males and females. See line 123-128.

      (10) It isn't clear how the time of day relates to the duration of starvation. This suggests that mutant females only consume more at 21:00 (presumably at night) whereas males consume more throughout the day. Does this suggest an interaction with the circadian system? What is the duration of starvation in Figure 3A? In a related issue, in Figure 4 it would be useful to know what time of day the EAG analysis was done because the data shown in Figure 2D suggests that the time of day significantly impacts behavioral responses. And does the red versus blue color scheme of the OR subunits represent up/downregulated levels in wild-type animals? Please define this for the reader.

      Please also see our response to point 9 regarding the amount of food consumed by females. As the reviewer noted, there was indeed a diurnal difference in the amount of food consumed by B. dorsalis. However, we have not studied in depth whether this is related to circadian rhythms. When measuring food intake at these three times of day, we ensured that the duration of starvation was the same (12 h) in all cases. The duration of starvation in Figure 3A is 12 h. We have mentioned this in the manuscript. See line 267-268.

      EAG responses to sex pheromones and body surface extracts were measured from 21:00 to 23:00, and responses to food odor were measured from 9:00 to 11:00. The times of the experiments are described in the revision. See line 309-311.

      We also revised the figure caption to explain the colored fonts. Red indicates a set of ORs related to foraging, and blue indicates a set of ORs related to mating. The ORs shown in red were upregulated in starved wild-type flies, and the ORs shown in blue were downregulated in starved wild-type flies. We have defined this in the revised manuscript. See line 672-673.

      (11) The authors convincingly show that SKR1 is present in the antennae and is co-expressed with orco. It would be useful to discuss whether this receptor is also expressed in other tissues where there may be additional sites of action of this pathway.

      Indeed, SkR1 is also expressed in the Drosophila brain. We added a discussion of the expression of SkR1 and its additional sites of action within the central nervous system. See line 200-205.

      (12) It isn't clear what the dotted arrows in the model shown in Figure 5 represent.

      Dashed arrows represent additional possible pathways that were not tested in this study but are not excluded by the model. Please see the discussion for details of additional possible factors modulating odorant sensitivity relevant to satiety. See line 210-229.

      (13) In Figure 5, the authors indicate that satiation suppresses SkR1 expression. It would be helpful if the authors tested the expression level of SkR1 in re-fed flies (by feeding the flies after 12h starvation) to see whether levels of expression are rapidly restored to the levels seen in satiated animals. Such a result could further support the claims made by the authors.

      Thank you for your suggestion. Indeed, refeeding after 12 h of starvation significantly decreased SkR1 expression. We added the result to the supporting information (Fig. S3). See line 713. For the results, see line 107-110.

      (14) The authors show that locomotor activity is unaffected in the mutants but body size comparison would be more useful here since this could also contribute to baseline differences in meal size.

      In the revision, we provide a body size comparison between WT and Sk-/- flies in the supplementary data. The results show that mutant flies have the same body size as WT flies (Fig. S7). See line 742. For the results, see line 120-121.

      (15) Have the authors tested the behavioral phenotypes of heterozygotes mutant of both Sk and SkR1 flies? This may reveal whether a reduced expression of Sk-SkR1 will also cause significant changes in the foraging and mating behaviors seen during starvation.

      We tested the behavioral phenotypes of heterozygous Sk mutant flies. The results showed that the foraging and mating behaviors of Sk heterozygous mutants were unaffected during starvation, suggesting that the mutation is completely recessive. We have added the results to the supporting information (Fig. S8). See line 746. For the results, see line 132-135.

      (16) It would be useful to provide information about which SK peptide is detected by the antibody used in Figure S4C. In Figures S4C and S5D, it would be useful to include a counterstain to show that the general morphology is unaffected in the mutants.

      As suggested, we added a detailed description of the rabbit anti-BdSk antibody. See line 362-363. We have improved the background of the images so that the general structure is visible; counterstaining should therefore not be essential.

      (17) The figure legends for supporting figures need to be improved as they are currently difficult to understand. For example, in S2: what is the meaning of "different removal of antennae"? In S3: it isn't clear how the authors evaluated the responses in EAG experiments; in S4A: there are several DNA sequences that do not appear in the main text of the manuscript; in S4C: the meaning of the boxes and the dots is unclear, as is the figure to the left; in S5D, the authors explain only the suppression of SKR1, yet the figure indicates some images for SKR IHC. These are only a few examples; we ask that the authors revise and improve the legends for supporting figures.

      For S2, we removed the data as suggested. For S3, we added a sentence describing the measurement method in detail. See line 707-709. For S4, the figure has been substantially changed in the revision and a detailed description has been added to the legend (lines 717-724 in the revision). For S5, we have improved our description. See line 731-734. In addition, we have checked all the figure legends of our manuscript; the changes are shown in the tracked-changes version.

      Minor issues

      (1) It isn't clear what the meaning of "the complexity of sulfakinin pathways" is. Please explain.

      We have rewritten the sentence in the revised manuscript by adding the description as “…complexity of Sk pathways, special and temporal dynamics and multiple ligands and receptors, is…”. See line 61-65.

      (2) Please double-check the calls to the various figures in the text.

      We have double-checked the calls to all the figures in the text to make sure they were correct.

      (3) L125: What is the meaning of "olfactory reprogramming"? Please explain.

      We rephrased it to “alteration of olfactory sensitivities”. See line 145.

      (4) L135: After mentioning qRT-PCR the authors should include a call to a figure that shows these results.

      Thank you for your suggestion. The qRT-PCR results are shown in Figure 4B, and we have added a call to this figure as suggested. See line 154.

      (5) L270: Details are provided for the extraction of the pheromone. However, more details are needed on how the EAG and other functional assays were done.

      We have described the assay procedures in detail in the Materials and Methods section. See line 298-311.

      (6) Figure 2B. Please remove the period(".") at the C-terminal end of WT sk.

      We are sorry for our mistake. We have corrected it.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This important study identifies the "H-state" as a potential conformational marker distinguishing amyloidogenic from non-amyloidogenic light chains, addressing a critical problem in protein misfolding and amyloidosis. By combining advanced techniques such as small-angle X-ray scattering, molecular dynamics simulations, and H-D exchange mass spectrometry, the authors provide convincing evidence for their novel findings. However, incomplete experimental descriptions, limitations in SAXS data interpretation, and the way HDX MS data is presented affect the strength and generalizability of the conclusions. Strengthening these aspects would enhance the impact of this work for researchers in amyloidosis and protein misfolding.

      We thank eLife editors and reviewers for their constructive feedback. The manuscript has been improved to provide a more complete description of the experiments and to strengthen the interpretation and presentation of all data. Updated Figures (Figure 2 and Figure 5) and a new Table (Table 2) in the main text provide a more complete and clearer comparison of the SAXS data with MD simulations as well as a clearer representation of the HDX MS data. Additional figures have been added in SI. The text has been extended accordingly and complete materials and methods are now included in the main text. Abstract, introduction and discussion have been revised to improve the overall readability of the manuscript.

      Public Reviews:

      Reviewer #1 (Public review):

      The study investigates light chains (LCs) using three distinct approaches, with a focus on identifying a conformational fingerprint to differentiate amyloidogenic light chains from multiple myeloma light chains. The study's major contribution is identifying a low-populated "H state," which the authors propose as a unique marker for AL-LCs. While this finding is promising, the review highlights several strengths and weaknesses. Strengths include the valuable contribution of identifying the H state and using multiple approaches, which provide a comprehensive understanding of LC structural dynamics. However, the study suffers from weaknesses, particularly in interpreting SAXS data, lack of clarity in presentation, and methodological inconsistencies. Critical concerns include high error margins between SAXS profiles and MD fits, unclear validation of oligomeric species in SAXS measurements, and insufficient quantitative cross-validation between experimental (HDX) and computational data (MD). This reviewer calls for major revisions including clearer definitions, improved methodology, and additional validation, to strengthen the conclusions.

      We thank the reviewer for the supportive comments. In the revised version of the manuscript we have focused on improving the clarity and completeness of our work. For example, we are sorry that we did not previously make clear enough that the comparison of SAXS with MD simulations was not the one shown in the main text in Figure 1 and Table 1 (which is the comparison with single structures) but the one reported in the SI (previously Figure S1 and Table S2, showing very good fits). These data have been moved to the main text in the reworked Figure 2 and new Table 2. We have also improved the presentation of the HDX MS data in Figure 5 and in the text, and added further analysis to the SI. Materials and methods are now entirely in the main text. We also revised the manuscript throughout for clarity.

      Reviewer #2 (Public review):

      Summary:

      This well-written manuscript addresses an important but recalcitrant problem - the molecular mechanism of protein misfolding in Ig light chain (LC) amyloidosis (AL), a major life-threatening form of systemic human amyloidosis. The authors use expertly recorded and analyzed small-angle X-ray scattering (SAXS) data as a restraint for molecular dynamics simulations (called M&M) and to explore six patient-based LC proteins. The authors report that a highly populated "H-state" determined computationally, wherein the two domains in an LC molecule acquire a straight rather than bent conformation, is what distinguishes AL from non-AL LCs. They then use H-D exchange mass spectrometry to verify this conclusion. If confirmed, this is a novel and interesting finding with potentially important translational implications.

      We thank the reviewer for the supportive comments.

      Strengths:

      Expertly recorded and analyzed SAXS data combined with clever M&M simulations lead to a novel and interesting conclusion. Regardless of whether or not the CL-CL domain interface is destabilized in AL LCs explored in this (Figure 6) and other studies, stabilization of this interface is an excellent idea that may help protect at least a subset of AL LCs from misfolding in amyloid. This idea increases the potential impact of this interesting study.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      The HDX analysis could be strengthened.

      We have extended the analysis and improved the presentation of the HDX data. Figure 5 has been reworked, the text has been improved accordingly, and additional analyses are reported in the SI.

      Reviewer #3 (Public review):

      Summary:

      This study identifies conformational fingerprints of amyloidogenic light chains, that set them apart from the non-amyloidogenic ones.

      We thank the reviewer for the supportive comments.

      Strengths:

      The research employs a comprehensive combination of structural and dynamic analysis techniques, providing evidence that conformational dynamics at the VL-CL interface and structural expansion are distinguished features of amyloidogenic LCs.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      The sample size is limited, which may affect the generalizability of the findings. Additionally, the study could benefit from deeper analysis of specific mutations driving this unique conformation to further strengthen therapeutic relevance.

      We agree. We tried to maximise the size of the sample, and this was the best we could do. With respect to the analysis of the mutations, we tried to discuss some of them, also in view of previous work. However, because our set covers multiple germlines rather than focusing on a single one, our ability to discuss single point mutations systematically is limited. At the same time, single point mutations have been the focus of many recent works, while our approach provides a different point of view.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      This study provides an investigation of light chains (LCs) using three distinct approaches, focusing primarily on identifying a conformational fingerprint to distinguish amyloidogenic light chains (AL-LCs) from multiple myeloma light chains (MM-LCs). The authors propose that the presence of a low-populated "H state," characterized by an extended quaternary structure and a perturbed CL-CL interface, is unique to AL-LCs. This finding is validated through hydrogen-deuterium exchange mass spectrometry (HDX-MS). The study makes a valuable contribution to understanding the structural dynamics of light chains, particularly with the identification of the H state in AL-LCs. However, significant concerns regarding the interpretation of the SAXS data, clarity in presentation, and methodological rigor must be addressed. I recommend major revisions and resubmission of the work.

      Major concerns:

      (1) A critical concern is how the authors ensure that the SAXS profiles represent only dimeric species, given the high propensity of LCs to aggregate. If higher-order aggregates or monomers were present, this would significantly impact the SAXS data and SAXS-MD integration. Some measurements are bulk SAXS, while others are SEC-SAXS, making the study questionable. The authors need to clarify how only dimeric species were measured for the SEC-SAXS analysis, and all assessments of the dimeric state should be shown in the SI. Additionally, complementary techniques such as DLS or SEC-MALS should be used to verify the oligomeric state of the samples. Without this validation, the SAXS profiles may not be reliable.

      We added SEC-MALS and SEC-SAXS data to the SI (Figures S20 and S21), as well as the SAXS curves shown as log-log plots (Figure S1), which display a flat trend at low q that excludes aggregation. SAXS is very sensitive to oligomers and aggregates, and our data do not indicate the presence of those species. When we had indications of possible aggregation in a sample, we used SEC-SAXS.

      (2) A major problem with the paper is that the claim of the "H state," which is the novelty of the study and serves as a marker of aggregation, is derived from samples where the error between the SAXS profiles and MD fits is extremely high. This casts doubt on whether the structure is indeed resolved by MD. The main conclusion of the paper is derived from weak consistency between experiment and simulation. In AL55, the error between experiment and simulation is greater than 5; for H7, it is higher than 2.8. The residuals show significant error at mid-q values, suggesting that long-range distance correlations (20-10 Å, CL, VL positioning) are not consistent between simulation and experiment. Furthermore, the FES plots of two independent replicas show deviation in the existence of the H state. One shows a minimum in that region, while the other does not. So, how robust is this conclusion? What is the chi-squared value if each replica is used independently? A separate experimental cross-validation is necessary to claim the existence of the H state.

      We apologise for the misunderstanding underlying this reviewer comment. The poor agreement mentioned is not between the SAXS data and the MD simulations but between the SAXS data and the individual structures. This disagreement led us to perform MD simulations, which are in much better agreement with the data (previously Fig. S1 and Table S2). To avoid this misunderstanding, which would indeed weaken our work, we have now moved both the figure and the table into the main text as the updated Figure 2 and the new Table 2.

      Regarding the robustness of the sampling, we believe that Table 3 (previously Table 2) clearly shows the statistical convergence of the data; differences in the presentation of the free energy are purely interpolation issues. The chi-squares of each replicate are reported in Table 2 (previously Table S2).
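
      For readers unfamiliar with the statistic, a chi-square between an experimental SAXS profile and a computed one can be sketched as below. This is an illustration under stated assumptions: the computed curve is fitted with a single least-squares scale factor, the sum is normalised by the number of points, and all names and numbers are hypothetical. It is not the procedure used in the manuscript, and dedicated SAXS software may fit additional parameters.

```python
def scaled_chi_square(i_exp, i_calc, sigma):
    """Return (scale, chi2) for a computed profile fitted to experimental data.

    i_exp  : experimental intensities I(q)
    i_calc : computed intensities on the same q grid
    sigma  : experimental errors on i_exp
    """
    weights = [1.0 / s ** 2 for s in sigma]
    # Scale factor that minimises the weighted squared residuals.
    numerator = sum(w * e * c for w, e, c in zip(weights, i_exp, i_calc))
    denominator = sum(w * c * c for w, c in zip(weights, i_calc))
    scale = numerator / denominator
    # Mean squared residual in units of the experimental error.
    chi2 = sum(((e - scale * c) / s) ** 2
               for e, c, s in zip(i_exp, i_calc, sigma)) / len(i_exp)
    return scale, chi2


# Hypothetical profiles: the first computed curve has the right shape,
# the second decays too slowly at higher q.
experimental = [20.0, 10.0, 4.0, 2.0]
errors = [0.5, 0.5, 0.2, 0.2]
good_model = [10.0, 5.0, 2.0, 1.0]
poor_model = [10.0, 7.0, 4.0, 3.0]

good_scale, good_chi2 = scaled_chi_square(experimental, good_model, errors)
poor_scale, poor_chi2 = scaled_chi_square(experimental, poor_model, errors)
```

      In this toy case the first model differs from the data only by a scale factor, so its chi-square is essentially zero, whereas the second model leaves residuals that cannot be removed by rescaling.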

      (3) There is insufficient discussion about SAXS computations from MD trajectories. The accuracy of these calculations is crucial to deriving the existing conclusions, and the study's reliance on the PLUMED plugin, which is known to give inaccurate results for SAXS computations, raises concerns. How the solvent is treated in the SAXS computations needs to be explained. Alternative methods like WAXSiS or Crysol should be explored to check whether the SAXS profiles derived from the MD trajectory are consistent across other SAXS computation methods for the major conformers of the proteins.

      We have now clarified that, while the SAXS calculations used to perform Metainference MD were done with PLUMED (which, to our knowledge, is as accurate as Crysol), the SAXS curves used for analysis were calculated with Crysol.

      (4) The HDX and MD results do not seem to correlate well, and there is a disconnect between Figure 2 (SAXS profiles) and Figure 5 (HDX structural interpretation). The authors should quantitatively assess residue-level dynamics by comparing HDX signals with MD-derived HDX signals for each protein. This would provide a cross-validation between the experimental and computational data.

      In our opinion, our SAXS, MD and HDX MS data provide a consistent picture. Our HDX-MS data do not provide per-residue information, which puts a quantitative comparison out of scope. RMSF data do not necessarily need to correlate with the deuterium uptake.

      (5) MD simulations are only used to refine the structure of AlphaFold predictions, but the trajectories could help explain why these structures differ, what stabilizes the dimer, or what leads to the conformational transition of the H state. A lack of analysis regarding the physical mechanism behind these structural changes is a weakness of the study. The authors should dedicate more effort to analyzing their data and provide physical insights into why these changes are observed.

      Our aim was to identify a property that could discriminate between AL and MM LCs. We used MD simulations, not to refine structures, but to explore the conformational dynamics of LCs (starting from either X-ray structures, homology or AlphaFold models), because SAXS data suggested that conformational dynamics could discriminate between AL- and MM-LCs. Simulations allowed us to propose a hypothesis, which we tested by HDX MS. While more insight is always welcome, we believe that we have achieved our goal for now. In the discussion, we present additional analysis of the simulations to connect with previous literature. We agree that more analysis can be done and, partly for this reason, all our data are publicly available.

      Minor concerns

      (6) The abstract leans heavily on describing the problem and methods but lacks a clear presentation of key results. Providing a concise summary of the main findings (e.g., the identification of the H state) would better balance the abstract.

      We agree with the reviewer and we rewrote the abstract.

      (7) In the abstract, the term "experimental structure" is used ambiguously. Since SAXS also provides an experimental structure, it is unclear what the authors are referring to. This should be clarified.

      We agree with the reviewer and we rewrote the abstract.

      (8) Abbreviations such as VL (variable domain) and CL (constant domain) are not defined, making it harder for readers unfamiliar with the field to follow. Abbreviations should be defined when first mentioned.

      We agree with the reviewer and we rewrote the abstract.

      (9) The introduction provides a good general context but fails to explicitly define the knowledge gap. Specifically, the structural and dynamic determinants of LC amyloidogenicity are not well established, and this study could be framed as addressing that gap.

      We thank the reviewer and we agree this could be better framed, we improved the introduction accordingly.

      (10) The introduction does not present the novel discovery of the H state early enough. The unique contribution of identifying this state as a marker for AL-LCs should be mentioned upfront to guide the reader through the significance of the study.

      We thank the reviewer and we have now made more explicit what we found.

      (11) The therapeutic implications of this research should be highlighted more clearly in the discussion. Examples of how these findings could be utilized in drug design or therapeutic approaches would enhance the study's impact.

      We thank the reviewer, but while we think that the H-state could be targeted for drug design, since we do not have data yet we do not want to stress this point more than what we are already doing.

      (12) There is an overwhelming use of abbreviations such as H3, H7, H18, M7, and M10 without proper introduction. This makes it difficult for readers to follow the results, and the average reader may become lost in the details. An introductory figure summarizing the sequences under study, along with a schematic of the dimeric structure defining VL and CL domains, would significantly aid comprehension.

      We agree and we tried to better introduce the systems and simplify the language without adding a figure that we think would be redundant.

      (13) In Figure 1, add labels to each SAXS curve to indicate which protein they correspond to. Also, what does online SEC-SAXS mean?

      Done

      (14) The caption of Figure 3 is unclear, particularly with abbreviations like Lb, Ls, G, and H, which are not mentioned in the captions. The authors should define these terms for clarity.

      Done

      (15) The study claims that the dominant structure of the dimer changes between different LCs. However, Figure 5 shows identical structures for all proteins, raising questions about the consistency between the SAXS and HDX data. This inconsistency is a general problem between the MD and HDX sections, where cross-communication and comparisons are not properly addressed.

      We do not claim that the dominant structure of the dimer changes between different LCs; this would also contradict the current literature. We claim a difference in a low-populated state. From this point of view, always using the same structure is consistent and should simplify the representation of the results. We agree that the manuscript may not always be easy to follow, and we thank the reviewer for helping us improve it.

      (16) The authors show I(q) vs q and residuals for each protein. The Kratky plots are not sufficient to compare the SAXS computations with the measured profile.

      Showing Kratky and residuals is a standard and complementary way to present and compare SAXS data to structures. Chi-square values are also reported. Log-log plots have been added to SI in response to previous comments.

      (17) The authors need to explain how they estimate the Rg values (from simulation or SAXS profiles). If they are using simulations, they should compute the Rg values from the simulations for comparison.

      Rg values reported in Table 1 are derived from SAXS. Rg values from simulations have been added to Table 2.

      (18) The evolution of the sampling is unclear. The authors need to show the initial starting conformation in each case and the most likely conformation after M&M in the SI, to demonstrate that their approach indeed caused changes in the initial predictions.

      Our approach is not structure refinement, and as such the proposed analysis would be misleading. Metainference is meant to generate a statistical ensemble representing the equilibrium conformations that, as a whole, reproduce the data. Differences (or not) between initial and selected configurations will not be particularly informative in this context.

      (19) The authors should also provide a running average of chi-squared values over time to demonstrate that the conformational ensemble converged toward the SAXS profile.

      Our simulations are not driven to improve the agreement with SAXS over time; this is not structure refinement. Metainference is meant to generate a statistical ensemble representing the equilibrium conformations that, as a whole, reproduce the data. The suggested analysis would be a misinterpretation of our simulations. The comparison with SAXS is provided in Figure 2 and Table 2 as mentioned above.

      (20) The aggregate simulation time of 120 microseconds is misleading, as each replica was only run for 2-3 microseconds. This should be clarified.

      The number reported in the text is accurate and represents the aggregated sampling. The number of replicas for each metainference simulation and their length are reported in Table 2, now moved for clarity from the SI to the main text.

      (21) It is not clear how the replicas were weighted to compute the SAXS profiles and FES. There are two independent runs in each case, and each run has about 30 replicas. How these replicas are weighted needs to be discussed in the SI.

      Done

      (22) The methods section is unevenly distributed, with detailed explanations of LC production and purification, while other key methodologies like SAXS+MD integration and HDX are not even mentioned in the main text (they are in the Supporting Information). The authors should provide a brief overview of all methodologies in the main text or move everything to the SI for consistency.

      We agree with the reviewer; all methods are now in the main text.

      Reviewer #2 (Recommendations for the authors):

      (1) Computational M&M evidence is strong (Figure 3) and is supported by SAXS (used as restraints). However, Kratky plots reported in the main MS Figure 1 show significant differences between the data and the structural model only for one protein, AL-55. It is hard for the general reader to see how these SAXS data support a clear difference between AL and non-AL proteins. If possible, please strengthen the evidence; if not, soften the conclusions.

      We thank the reviewer for the comments. The chi-square (Table 1) and the residuals (Figure 1) are a strong indication of the difference. To strengthen the evidence, also following the comment from reviewer 3, we calculated the p-value (<10⁻⁵) on the significance of the radius of gyration to discriminate AL and MM LCs. We agree that SAXS alone was not enough, and this is indeed what prompted us to perform MD simulations.

      (2) HDX MS results are cursory and not very convincing as presented. The butterfly plots in Figure 5 are too small to read and are unlabeled so it is unclear which protein is which.

      Figure 5 has been reworked for readability. More data have been added in SI.

      (3) What labeling time was selected to construct these plots and why?

      The deuterium uptakes at 30 min HDX time showed the most pronounced differences between the proteins, so this time point was chosen to illustrate the key structural features in the main figure panel (Figure 5).

      How different are the results at other labeling times? Showing uptake curves (with errors) for more than just two peptides in the supplement Figure S12 might be helpful.

      We found a continuous increase in deuterium uptake as we increased the exchange time from 0.5 to 240 min, which reached saturation at 120 min. Therefore, the exchange follows the same pattern at all time points. Butterfly plots at different HDX times from 0.5 to 240 min are shown in a gradient from light blue to dark blue, which clearly shows the pattern of deuterium uptake at increasing incubation times (Figure 5). The HDX uptake kinetics of selected peptides with corresponding error bars are shown in Figure S12.

      How redundant are the data, i.e. how good is the peptide coverage/resolution in key regions at the domain-domain interface that the authors deem important? Mapping the maximal deuterium uptake on the structures in Figure 5 is not very helpful. Perhaps mapping the whole range of uptake using a gradient color scheme would be more informative.

      Overall coverage and redundancy for all four proteins are >90% and >4.0, respectively, and the average error margin in fractional uptake among all peptides is 0.04-0.05 Da, which suggests that our data are reliable (Table S3). We modified the main panel figures to show the gradient of deuterium uptake in blue-white-red for 0 to 30% deuterium uptake on chain A of the dimeric LCs.

      (3) Is the conformational heterogeneity depicted in M&M simulations consistent with HDX results? The authors may want to address this by looking at the EX1/EX2 exchange kinetics for AL vs. non-AL proteins. Do AL proteins show more EX1?

      No, we do not see any EX1 exchange kinetics in our analysis. This is compatible with the prediction that the H-state is a native-like state and not an unfolded or partially folded state.

      (4) Perhaps the main conclusion could be softened given the small number of proteins (six), esp. since only four (3 AL and 1 non-AL) could be explored by HDX. Are other HDX MS data of AL LCs from the same Lambda6 family (e.g. PMID: 34678302) consistent with the conclusions that a particular domain-domain interface is weakened in AL vs. non-AL LCs?

      We thank the reviewer for this suggestion. A difference in HDX MS data is indeed visible between AL and MM proteins for peptide 33-47 in the suggested paper (Figures 4, S5 and S8). The difference is reduced by the mutation identified in the paper as driving the aggregation in that specific case. We now mention this in the discussion.

      (5) Please clarify if the H* state is the same for a covalent vs. non-covalent LC dimer.

      We do not know, because our data are only for covalent dimers. Interestingly, however, the state is very similar to what was observed for a model kappa light chain in Weber et al.; we have better highlighted this point in the discussion.

      (6) Please try and better explain why a smaller distance between CL domains in H7 protein and a larger distance in other AL proteins both promote protein misfolding.

      We do not have elements to discuss this point in more detail.

      (7) Please comment on the Kratky plots data vs. model agreement (see comments above).

      Done.

      (8) Please find a better way to display, describe, and interpret the HD exchange MS data.

      We have generated new main text (new Figure 5) and SI figures that we think allow the reader to better appreciate our observations. The corresponding results sections have also been improved.

      Minor points:

      (9) Is the population of the H-state with perturbed CL-CL domain interface, which was obtained in M&M simulations, sufficient to be observable by HDX MS?

      While populations alone are not enough to determine what is observable by HDX MS, a 10% population corresponds roughly to 6 kJ/mol of ΔG and is compatible with EX2 kinetics. Previous works suggested that HDX-MS data should be sensitive to subpopulations of the order of 10% (https://doi.org/10.1016/j.bpj.2020.02.005, https://doi.org/10.1021/jacs.2c06148).
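
      The population-to-free-energy conversion quoted in this response can be checked with a short calculation. The sketch below is illustrative only and is not taken from the manuscript: it assumes a simple two-state Boltzmann model at 298.15 K, and the function names are hypothetical. Under these assumptions a 10% minor state lies about 5.4 kJ/mol above the major state (about 5.7 kJ/mol if the population is referenced to the whole ensemble), which is of the same order as the roughly 6 kJ/mol quoted.

```python
import math

# Gas constant in kJ/(mol*K).
R_KJ_PER_MOL_K = 8.314462618e-3


def delta_g_two_state(p_minor: float, temperature_k: float = 298.15) -> float:
    """Free energy (kJ/mol) of a minor state relative to the major state.

    Assumes a two-state model, so the major-state population is 1 - p_minor.
    """
    if not 0.0 < p_minor < 1.0:
        raise ValueError("p_minor must lie strictly between 0 and 1")
    return -R_KJ_PER_MOL_K * temperature_k * math.log(p_minor / (1.0 - p_minor))


def delta_g_vs_ensemble(p_minor: float, temperature_k: float = 298.15) -> float:
    """Free energy (kJ/mol) of a state relative to the full ensemble, -RT ln(p)."""
    if not 0.0 < p_minor <= 1.0:
        raise ValueError("p_minor must lie in (0, 1]")
    return -R_KJ_PER_MOL_K * temperature_k * math.log(p_minor)


if __name__ == "__main__":
    # A 10% population, as discussed for the H-state.
    print(f"two-state:   {delta_g_two_state(0.10):.2f} kJ/mol")
    print(f"vs ensemble: {delta_g_vs_ensemble(0.10):.2f} kJ/mol")
```

      The two conventions differ only in the reference state; which one the authors used is not stated in the response.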

      (10) Typically, an excited intermediate in protein unfolding is a monomer, while here it is an LC dimer. Is this unusual?

      This is a good point. We think that intermediates have mostly been studied in monomeric proteins because these are more commonly used as model systems, but we prefer not to discuss this point in the manuscript.

      (11) Low deuterium uptake is consistent with a rigid structure but may also reflect buried structure and/or structure that moves on a time scale greater than the labeling time.

      We agree.

      Reviewer #3 (Recommendations for the authors):

      (1) The p-value (statistical significance) of Rg difference should be computed.

      We thank the reviewer for the suggestion. We calculated the p-value, which proved to be highly significant.

      (2) The significance of mutations (SHM?) at the interface, such as A40G should be compared with previous observations. (Garrofalo et al., 2021).

      We thank the reviewer for the suggestion; a sentence has been added to the discussion.

    1. Reviewer #3 (Public review):

      Summary:

      Ishii et al. used molecular genetics, behavioral analyses, in vivo neural activity imaging, and neural activity manipulations in mice to study the functional role of a subset of medial preoptic area (MPOA) neurons in the regulation of female sexual drive. They first employed a self-paced mating assay during which a female could control the amount of interaction time with a male to assess female sexual drive after completion of mating. The authors observed that after mating completion (i.e., male ejaculation) females spend significantly less time interacting with males, indicating that their sexual drive is reduced. Next, the authors performed a brain-wide analysis of neurons activated following male ejaculation and identified the MPOA as a strong candidate region. One caveat is that the activity labeling was not exclusive to neurons activated following male ejaculation but included all neurons activated before, during, and after the mating encounter. However, in this revised version of the manuscript, the authors have included a key control group that labels all neurons activated up to but not including male ejaculation. Comparison of the number of activated neurons in these two groups revealed a significant additional set of neurons in the female MPOA following ejaculation. Importantly, the authors also provided in vivo calcium imaging data showing that a subset of MPOA neurons responds significantly and specifically to male ejaculation and not other behaviors during the social encounter. The authors performed these studies in both excitatory and inhibitory populations of the MPOA. Their analysis identified a subpopulation of inhibitory neurons that exhibit sustained increased activity for 90 sec following male ejaculation. Finally, the authors used chemogenetics to activate MPOA neurons during home cage mating, conditioned place preference, pup retrieval, and the self-paced mating assay. They found that activation of female MPOA neurons that were previously activated following male ejaculation significantly reduces mating behaviors and time spent interacting with a male during the self-paced mating assay. In contrast, activation of female MPOA neurons that were previously activated during consummatory behaviors but not male ejaculation does not alter mating behaviors and time spent interacting with a male. Therefore, MPOA neurons activated following ejaculation are sufficient to suppress female sexual motivation.

      The authors' experimental execution is rigorous and well performed. Their data identify inhibitory neurons in the female MPOA as a neural locus that is activated following male ejaculation and whose prolonged activity plays a key role in the regulation of female sexual motivation. The addition of some key control groups to this revised version of the manuscript greatly strengthens the interpretation of the authors' findings.

      Strengths:

      (1) The use of the self-paced mating assay in combination with neural imaging and manipulation to assess female sexual drive is innovative. The authors correctly assert that relatively little is known about how male ejaculation affects sexual motivation in females as compared to males. Therefore, the data collected from these studies is important and valuable.

      (2) The authors provide convincing histological data and analyses to verify and validate their brain-wide activity labeling, neural imaging, and chemogenetic studies.

      (3) The single cell in vivo calcium imaging data are well performed and analyzed. They provide key insights into the activity profiles of both excitatory and inhibitory neurons in the female MPOA during mating encounters. The authors' identification of an inhibitory subpopulation of female MPOA neurons that is selectively activated following completion of mating is fundamental for future experiments, which could potentially find a molecular marker for this population and specifically manipulate these neurons to understand their role in female sexual motivation in greater detail.

      (4) The authors provide convincing evidence that activation of female MPOA neurons activated following male ejaculation is sufficient to suppress female sexual motivation. Importantly, the authors' addition of the consummatory-hM3Dq group demonstrates that activation of female MPOA neurons activated during mating behaviors prior to male ejaculation is not sufficient to suppress female sexual motivation.

      Weaknesses:

      In this revised version of the manuscript, the authors have added important controls as well as additional clarifying text that adequately address the weaknesses that were present in the original version of the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, Corso-Diaz et al. focus on the NRL transcription factor (TF), which is critical for retinal rod photoreceptor development and function. The authors profile NRL's protein interactome, revealing several RNA-binding proteins (RBPs) among its components. Notably, many of these RBPs are associated with R-loop biology, including DHX9 helicase, which is the primary focus of this study. R-loops are three-stranded nucleic acid structures that frequently form during transcription. The authors demonstrate that R-loop levels increase during photoreceptor maturation and establish an interaction between NRL TF and DHX9 helicase. The association between NRL and RBPs like DHX9 suggests a cooperative regulation of gene expression in a cell-type-specific manner, an intriguing discovery relevant to photoreceptor health. Since DHX9 is a key regulator of R-loop homeostasis, the study proposes a potential mechanism where a cell-type-specific TF controls the expression of certain genes by modulating R-loop homeostasis. This study also presents the first data on R-loop mapping in mammalian retinas and shows the enrichment of R-loops over intergenic regions as well as genes encoding neuronal function factors. While the research topic is very important, there is some concern regarding the data presented: there are substantial data supporting the interaction between NRL and DHX9, including pull-down experiments and proximity ligation assay (PLA); however, the data showing an interaction between NRL and DDX5, another R-loop-associated helicase, are inadequate. Importantly, the data supporting the claim that NRL interacts with R-loops are absolutely insufficient and at best, correlative. The next concerns are regarding the R-loop mapping data analysis and visualization.

      Strengths:

      There is compelling evidence that the NRL transcription factor interacts with several RNA binding proteins, and specifically, sufficient data supporting the interaction of NRL with DHX9 helicase.

      A major strength is the use of the single-stranded R-loop mapping method in the mouse retina.

      Weaknesses:

      (1) Figure S1A: There is a strong band in GST-IP (control IP) for either HNRNPUL1 or HNRNPU, although the authors state in their results that there is a strong interaction of these two RBPs with NRL.

      Under our experimental conditions, most RNA-binding proteins displayed higher binding to glutathione beads (Fig. S1A). However, GST-NRL purifications showed much stronger signals for the respective RBPs. In the case of HNRNPU and HNRNPUL1, white bands that are indicative of substrate depletion due to higher protein levels are observed in GST-NRL lanes. Additionally, in Figures 1B and 1C, there is a clear enrichment of HNRNPU and HNRNPUL1 above the background signal. We added this to the text. See page 5.

      Both DHX9 and DDX5 samples have a faint band in the GST-IP.

      RNA-binding proteins may display some background as observed in other studies (e.g. PMID: 32704541). We think that showing the raw data without decreasing the exposure time is useful and that there is a clear enrichment compared to controls.  In addition, we tested the interaction in multiple systems.

      There is an extremely faint band for HNRNPA2B1 in the GST-NRL IP lane. Given this is a pull-down with added benzonase treatment to remove all nucleic acids, these data suggest that previously observed NRL interactions with these particular RBPs are mediated via nucleic acids. Similarly, there is a loss of band signal for HNRNPM in this assay, although it was identified as an NRL-interacting protein in three assays, which again suggests that nucleic acids mediate the interaction.

      Thank you for highlighting this point. We mention in the manuscript that the interaction between HNRNPM and A1 depends on nucleic acids, as noted by the reviewer, since there is no obvious band after the pull-down. We have now added that the interaction of NRL with HNRNPA2B1 is likely dependent on nucleic acids as well, given its weak signal. See page 5.

      (2) The data supporting NRL-DDX5 interaction in rod photoreceptor nuclei is very weak. In Figure 2D, the PLA signal for DDX5-NRL is very weak in the adult mouse retina and is absent in the human retina, as shown in Figure 2H.

      We agree with the reviewer. We think that the signal for DDX5 is weak, and we addressed this in the text. We noted on page 7: “Taken together, these findings suggest a strong interaction between NRL and DHX9 throughout the nuclear compartment in the retina and that a transient and/or more regulated interaction of NRL with DDX5 may require additional protein partners.”  We have modified this sentence to add that the data also suggest transient interaction or the requirement of additional protein partners for stable interaction. See page 7.

      Given that there is no NRL-KO available for the human PLA assay, the control experiments using single-protein antibodies should be included in the assay. Similarly, the single-protein antibody control PLA experiments should be included in the experimental data presented in Figure 2J.

      Thank you for the suggestion. We performed PLAs using both DHX9 and IgG in the human retina and observed no specific amplification signal. Some background is observed outside the nucleus and in the extracellular space. We added these results to the text and to the supplementary information. See page 7 and Fig.S2B.

      (3) The EMSA experiment using a probe containing NRL binding motif within the DHX9 promoter should include incubation with retina nuclear extracts depleted for NRL as a control.

      In EMSA experiments, we used bovine retina to obtain sufficient quantities of protein. As suggested by the reviewer, using NRL-depleted extract would increase the specificity of the observed gel shift and complement our pre-immune serum as a negative control. However, removal of all the NRL protein using the available antibodies was not feasible. In the future, we will use enough mice to obtain large quantities of protein for this experiment and will collect retinas from Nrl-knockout mice as a negative control.

      (4) There is a reduced amount of DHX9 pulled down in NRL-IP in HEK293 cells, but there is no statistically significant difference in the reciprocal IP (DHX9-IP and blotting for NRL) (Figure 4C).

      We believe the reviewer is referring to the data in Figure 4C showing that RNase H treatment led to significantly reduced pulldown of DHX9 as compared to control, but the reciprocal IP in Figure 4D showed no statistical significance between control and RNase H treatment. In Figure 4D, we hypothesize that NRL may account for only a small proportion of DHX9’s interactome, so the change in NRL levels could not be detected due to the sensitivity of our assay. DHX9 likely constitutes a large proportion of NRL’s interactome in HEK293 cells, hence the change in DHX9 level was more obvious when pulling down with NRL. We added this information to the results. See page 8.

      (5) The only data supporting the claim that NRL interacts with R-loops are presented in Figure 5A.

      Additional evidence that NRL interacts with R-loops comes from DRIP-Seq experiments where signals from R-loops overlap with NRL ChIP-Seq signals (Figure 7A). This shows that R-loops and NRL co-occur on multiple genomic regions. In addition, indirect evidence of NRL and R-loops’ interaction is shown in pull down experiments and PLA assays where R-loops influence DHX9 and NRL binding. We clarified this in the discussion. See page 14.

      This is a co-IP of R-loops and then blotting for NRL, DHX9, and DDX5. Here, there is no signal for DDX5, quantification of DHX9 signal shows no statistically significant difference between RNase H treated and untreated samples, while NRL shows a signal in RNase H treated sample. These data are not sufficient to make the statement regarding the interaction of NRL with R-loops.

      Thank you for this comment. We respectfully disagree, as we observe statistically significant enrichment for both NRL and DHX9 in these experiments (see Fig. 5A). Some NRL continues to bind to DNA that is pulled down nonspecifically, which may be expected since NRL is a transcription factor. See, for example, R-loop binding by the transcription factor Sox2 (PMID: 32704541). However, binding to R-loops is evidenced by an enrichment compared to the RNase H-treated sample. We clarified this in the Results section (see page 9).

      (6) Regarding R-loop mapping, the data analysis is quite confusing. The authors perform two different types of analyses: either overall narrow and broad peak analysis or strand-specific analysis. Given that the authors used ssDRIP-seq, which is a method designed to map R-loops strand specifically, it is confusing to perform different types of analyses.

      Thank you for highlighting this point. This has enhanced the clarity of the methods and enriched the discussion. We aimed to identify R-loops as accurately as possible. We conducted two types of analyses to capture different aspects of R-loops: one that looks at overall patterns (narrow and broad peaks) and another that focuses on specific strands of DNA.

      Using ssDRIP-seq, which is designed to map R-loops on specific strands, allowed us to examine R-loops formed in only one strand and those formed on both strands. To identify strand-specific R-loops, we filtered our RNase-H enriched peaks for those enriched on one strand compared to the opposite strand. We clarified the analysis in the results section, and Figure 6B. See page 10 and methods section page 25.

      Next, the peak analysis is usually performed based on the RNase H treated R-loop mapping; what does it mean then to have a pool of "Not R-loops", see Figure 6B?

      The “Not R-loop” group refers to peaks called using the opposite strand that are not observed when calling peaks using RNase H as control. We modified this figure for clarity (Figure 6B).

      In that regard, what does the term "unstranded" R-loops mean? Based on the authors' definition, these are R-loops that do not fall within the group of strand-specific R-loops. The authors should explain the reasons behind these types of analyses and explain what the biological relevance of these different types of R-loops is.

      Thank you for helping us clarify this point. Unstranded R-loops are DNA regions containing DNA:RNA hybrids on both plus and minus strands and possibly representing bidirectional transcription by Pol II. We observed that unstranded R-loops are enriched only in intergenic regions, H3K9me3 regions, and downstream of the transcriptional termination site (TTS). We added to the discussion the possible implications of these enrichments, including regulation of Pol II termination and transcription of long genes.  See Page 13.

      (7) It would be more useful to show the percent distribution of R-loops over the different genomic regions, instead of showing p-value enrichment, see Figure 6C.

      Since most of the genome is non-coding, plotting the distribution as a proportion was not informative because the vast majority of the data falls in intergenic regions. However, we created a new figure showing the observed vs. expected ratio, which seems to be more informative, and moved the current p-value figure to the supplement in the revised version. See Figure 6C and S6D.
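
      For readers unfamiliar with this type of normalization, an observed vs. expected ratio for peaks in a genomic annotation class can be sketched as below. This is a generic illustration under stated assumptions, not the authors' pipeline: the function name is hypothetical, the numbers in the usage example are invented, and the expectation is taken to be the fraction of the genome covered by the annotation class.

```python
def observed_expected_ratio(
    peaks_in_region: int,
    total_peaks: int,
    region_bp: float,
    genome_bp: float,
) -> float:
    """Ratio of the observed peak fraction to the fraction expected by genome coverage.

    A value above 1 indicates enrichment in the annotation class;
    a value below 1 indicates depletion.
    """
    if total_peaks <= 0 or region_bp <= 0 or genome_bp <= 0:
        raise ValueError("total_peaks, region_bp and genome_bp must be positive")
    observed = peaks_in_region / total_peaks  # fraction of peaks in the class
    expected = region_bp / genome_bp          # fraction of the genome in the class
    return observed / expected


if __name__ == "__main__":
    # Invented numbers: 200 of 1000 peaks fall in a class covering 10% of the genome.
    ratio = observed_expected_ratio(200, 1000, 1e8, 1e9)
    print(f"observed/expected = {ratio:.2f}")
```

      Unlike a raw proportion, this ratio is not dominated by large classes such as intergenic regions, which is the concern raised in the response.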

      (8) Based on the model presented, NRL regulates R-loop biology via interaction with RBPs, such as DHX9, a known R-loop resolution helicase. Given that the gene targets of NRL TF are known, it would be useful to then analyze the R-loop mapping data across this gene set.

      Thank you for this suggestion. We performed an analysis of R-loops on NRL-regulated genes. Interestingly, NRL target genes have an enrichment of stranded R-loops at the promoter/TSS and unstranded R-loops on the gene body compared to all Ensembl genes (Figure S7B). We added a table containing all NRL-regulated genes we used for this analysis (table S5) and a figure showing this result (Fig. S7B).

      Reviewer #2 (Public review):

      Summary:

      The authors utilize biochemical approaches to determine and validate NRL protein-protein interactions to further understand the mechanisms by which the NRL transcription factor controls rod photoreceptor gene regulatory networks. Observations that NRL displays numerous protein-protein interactions with RNA-binding proteins, many of which are involved in R-loop biology, led the authors to investigate the role of RNA and R-loops in mediating protein-protein interactions and profile the co-localization of R-loops with NRL genomic occupancy.

      Strengths:

      Overall, the manuscript is very well written, providing succinct explanations of the observed results and potential implications. Additionally, the authors use multiple orthogonal techniques and tissue samples to reproduce and validate that NRL interacts with DHX9 and DDX5. Experiments also utilize specific assays to understand the influence of RNA and R-loops on protein-protein interactions. The authors also use state-of-the-art techniques to profile R-loop localization within the retina and integrate multiple previously established datasets to correlate R-loop presence with transcription factor binding and chromatin marks in an attempt to understand the significance of R-loops in the retina.

      Weaknesses:

      In general, the authors provide superficial interpretations of the data that fit a narrative but fail to provide alternative explanations or address caveats of the results. Specifically, many bands are present in interaction studies either in control lanes (GST controls) of Westerns or large amounts of background in PLA experiments.

      We have added additional information to the text regarding the presence of background signals in pull downs. We wish to note that experimental samples always exceeded background signals.  We believe that reporting these raw findings (rather than showing shorter exposures) is valuable for the scientific community. We did not observe any background in the proximity ligation assay (PLA) that exceeded what is typically expected, and the signals were clearly discernible. Cases where signals are weaker, such as with DDX5, have been highlighted. In addition, we added a DHX9-IgG negative control for the human PLA experiment. See page 5 and Fig. S2B.

      Additionally, the lack of experiments testing the functional significance of Nrl interactions or R-loops within the developing retina fails to provide novel biological insights into the regulation of gene regulatory networks other than, 'This could be a potentially important new mechanism'.

      We agree that functional experiments are necessary to understand the molecular mechanisms behind R-loop regulation in the retina; however, we believe it goes beyond the scope of this initial characterization (as this is the first report on R-loops in the retina). We are currently pursuing these studies.

      We performed a new analysis on NRL-regulated genes as suggested by reviewer 1. We show that NRL target genes have an enrichment of stranded R-loops at the promoter/TSS and unstranded R-loops on the gene body compared to all Ensembl genes (Figure S7B), providing further evidence of the functional interaction between NRL and R-loops. See table S5 and Fig. S7B, and discussion.

      Additionally, the authors test the necessity of RNA for NRL/DHX9 interactions but don't show RNA binding of NRL or DHX9 or the sufficiency of RNA to interfere/mediate protein-protein interactions. Recent work has highlighted the prevalence of RNA binding by transcription factors through Arginine Rich Motifs that are located near the DNA binding domains of transcription factors.

      We agree that the role of RNA in these complexes is very exciting, and we are currently pursuing these studies. However, we believe that they fall outside the scope of this initial report on R-loops in the retina.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      There are a couple of minor comments:

      (1) Unfinished sentence; page 11, the end of the first paragraph.

      Thank you for catching this error. We removed the unfinished text.

      (2) Page 6: Figure S2A should be Figure S2.

      In general, the manuscript would benefit from a deeper explanation of the biological relevance of R-loop formation and the connection to NRL TF and the expression of genes regulated by NRL. In this regard, a more substantial description of the model would be useful.

      We have modified the discussion for clarity and included new ideas on possible roles of R-loops in gene regulation of photoreceptors.

      Reviewer #2 (Recommendations for the authors):

      (1) The specificity of interactions needs to be addressed:

      - Figure 1B - HNRNPUI1 bands present in GST control.

      - Figure 1C - Bands present in the Empty Vector control IP for HNRNPU and DHX9.

      - Supplemental Figure 1A - most proteins are present in GST control suggesting prevalent binding to GST and lack of specificity for other interactions.

      Thank you for your comment. RNA-binding proteins can have more background as observed in other studies (e.g. PMID: 32704541) but there is always a higher signal in experimental samples compared to controls. While we agree that we can enhance the conditions for immunoprecipitation (IP) by optimizing washing buffers, exposure and other parameters, we believe the current methods tell the story. We have added additional text explaining this. See page 5.

      (2) Use of the term 'Strongest' interaction - IPs don't directly address the strength of interaction, but depend on levels of expression AND affinity. The strength of interaction should be tested using techniques like an OCTET or SPR assay. One can also quantify the effect that RNA would have in such an assay.

      Thank you for your suggestion. We replaced the term 'stronger' with 'higher signal' and 'robust' in most places. The source of protein lysates is the same for experiments and controls; thus the amount of protein is consistent in both conditions and does not depend on the level of gene expression.

      (3) In supplemental tables, please use the proper gene names, not the UniProt peptide name. For example, there are no genes named ELAV1-ELAV4. These should be ELAVL1-ELAVL4. A short glance identifies >10 gene name errors.

      Thank you for the suggestion. We updated all tables to use current gene names.

      (4) Please provide the rationale for the choice of DNA sequence for the DHX9 nucleotide sequence used for EMSA assays. In the human DHX9 locus, the NRL ChIP-seq peak looks to be contained in Intron1 whereas the NRL ChIP-seq peak in mouse DHX9 looks to be in the proximal upstream promoter. Did the authors choose an evolutionarily conserved sequence in the promoter region that contained the NRL motif or does the probe sequence arise from the sequence that has known NRL binding as assayed by NRL ChIP-seq? A zoomed-in image of the NRL ChIP-seq pile-ups in the DHX9 locus in each species would be beneficial.

      Thank you for this suggestion. The probe was chosen by scanning for NRL binding motifs within the ChIP-Seq peak at the human DHX9 promoter. We added a zoomed-in image of the ChIP-Seq or CUT&RUN reads for NRL in both human and mouse retinas. Figure 3D shows NRL binding in both species in regions containing the homologous motif. The sequence is partially conserved and is shown in the figure.

      (5) Normalization in RNaseH/RNaseA Co-IP experiments. Why does RNAseH treatment result in increased NRL IP (increased NRL expression?) or does RNaseA treatment cause reduced IP of DHX9? These differences seem to cause a 'denominator' effect, leading the Authors to conclude decreased co-IP of DHX9 with NRL when R-loops are inhibited or increased co-IP of NRL with DHX9 when RNA is degraded. An alternate interpretation would be that inhibiting the R-loop binding of NRL unmasks the epitope for antibody recognition. The authors should test NRL binding to RNA and determine if RNA binding affects the co-IP of NRL with DHX9.

      We agree that removing total RNA with RNase A or R-loops with RNase H may alter the accessibility of the epitopes to our antibodies, resulting in differences in the level of total protein pulled down. However, we quantified the level of the associating protein relative to the total protein and confirmed, in reciprocal assays, that RNase A treatment led to increased interaction between NRL and DHX9. The quantification was not consistent between the reciprocal IPs upon RNase H treatment. We reason that in Figure 4D, as NRL may account for only a small proportion of DHX9’s interactome, the change in NRL level could not be detected given the sensitivity of our assay. Reciprocally, DHX9 can constitute a larger proportion of NRL’s interactome in HEK293 cells, hence the change in DHX9 level was more obvious. We added this information to the text. See page 8.
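
      The normalization at issue in this exchange (associating protein relative to pulled-down protein) can be sketched as follows. This is only an illustration of the arithmetic, not the authors' quantification pipeline: the function names are hypothetical and the band intensities are made-up numbers in arbitrary densitometry units.

```python
def relative_coip(prey_intensity, bait_intensity):
    """Co-IP signal of the associating protein (prey) relative to the
    immunoprecipitated protein (bait)."""
    if bait_intensity <= 0:
        raise ValueError("bait intensity must be positive")
    return prey_intensity / bait_intensity


def fold_change(treated, control):
    """Ratio of normalized co-IP signals; each argument is (prey, bait)."""
    return relative_coip(*treated) / relative_coip(*control)


# Made-up example values, (prey, bait), NOT data from the manuscript.
control = (50.0, 100.0)   # normalized signal 0.5
treated = (60.0, 60.0)    # normalized signal 1.0

print(fold_change(treated, control))  # 2.0
```

      In this toy case the raw prey signal rises only 1.2-fold while the normalized signal doubles, because the bait signal fell. This is the kind of 'denominator' effect the reviewer raises, and it is why both the raw and the normalized values are worth reporting.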

      (6) Figure 7 - Malat1 - there doesn't seem to be an overlap of NRL with Stranded R-loop peaks in this image. Nrl seems to flank the region of R-loops.

      We replaced Malat1 with Mplkip, which shows a direct overlap of Nrl binding and R-loops. See Figure 7C.

      (7) Results end with 'A Model'. Seems like some concluding remarks and references to Figure 8 were mistakenly left out.

      Thank you for catching this typo. We removed the misplaced text.

      (8) Model and Discussion - authors should show raw data for RHO with respect to NRL binding and R-loops. No evidence was provided regarding R-loops (or lack thereof) in the Rhodopsin locus. Additionally, conclusions stating that "R-loops... are specifically depleted from genes, such as Rhodopsin, with high expression levels" go against Figures 7B and 7C. Malat1 is one of the highest expressed genes in the retina and contains R-loops.

      Thank you for helping us clarify our hypothesis. We added a genome browser view of Rhodopsin showing the absence of R-loops (Fig. S8). We hypothesize that R-loops could interfere with achieving higher rates of transcription; however, we did not mean to say that all highly expressed genes lack R-loops. We have rephrased the discussion to clarify this point.

      (9) Neuronal genes, particularly those involved in synaptic transmission are known to be, on average, longer than most genes (Gabel, 2015; PMID: 25762136). Is it possible that R-loops are detected at genes involved in synaptic function/structure solely because of transcript length, as it takes longer for transcription termination to resolve in genes that are longer? A plot showing R-loop enrichment and transcript length would address this.

      We added a plot showing gene length in relation to R-loops and expression levels. We observed that R-loops are more common over long genes regardless of their expression levels. We also observed that the concomitant presence of stranded and unstranded R-loops is restricted to the longest genes in most cases. We added this to Figure 7D.

    1. ⊢ (((P➝Q)➝(P➝R))➝(P➝(Q➝R)))

      1. show (((P➝Q)➝(P➝R))➝(P➝(Q➝R)))
      2. (P➝Q)➝(P➝R) ass cd
      3. show (P➝(Q➝R))
      4. P ass cd
      5. show (Q➝R)
      6. Q
      7. show (P➝Q)
      8. P
      9. Q
      10.
      11. P➝R 1, 2 mp
      12. R 4, 11 mp
      13.
      14.

    2. ((S∨R)➝Q), ~(P∨~S) ⊢ ~(P⟷Q)
      1. show ~(P⟷Q)
      2. P⟷Q ass id
      3. P➝Q 2 bc
      4. Q➝P 2 bc
      5. ((S∨R)➝Q)
      6. ~(P∨~S)
      7. show ~P
      8. P ass id
      9. Q
      10. P∨~S
      11.
      12. ~Q
      13. ~(S∨R) 3, nt
      14. show S
      15. ~S ass id
      16. P∨~S
      17.
      18. S∨R
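
    Both sequents above are valid, and one way to check this is to state them in a proof assistant. The following Lean 4 sketch is an independent check, not a completion of the derivations above; the theorem names `seq1` and `seq2` are hypothetical, and the proofs use direct term construction rather than the show/ass/mp rule set of the exercises.

```lean
-- Sequent 1: ⊢ ((P → Q) → (P → R)) → (P → (Q → R))
-- Given h, p and q, build a proof of P → Q that ignores its argument,
-- feed it to h to obtain P → R, then apply that to p.
theorem seq1 (P Q R : Prop) : ((P → Q) → (P → R)) → (P → (Q → R)) :=
  fun h p q => h (fun _ => q) p

-- Sequent 2: (S ∨ R) → Q, ¬(P ∨ ¬S) ⊢ ¬(P ↔ Q)
-- Assume P ↔ Q. If S held, then S ∨ R, hence Q, hence P, hence P ∨ ¬S,
-- contradicting h2. So ¬S holds, which again gives P ∨ ¬S, contradicting h2.
theorem seq2 (P Q R S : Prop) (h1 : S ∨ R → Q) (h2 : ¬(P ∨ ¬S)) : ¬(P ↔ Q) :=
  fun hpq => h2 (Or.inr (fun s => h2 (Or.inl (hpq.mpr (h1 (Or.inl s))))))
```

    The second proof mirrors the strategy visible in the partial derivation: use the biconditional to pass from Q back to P, and use the second premise twice to rule out both P and ~S.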


    1. Reviewer #2 (Public review):

      Summary:

      Generating biophysically detailed computational models that capture the characteristic physiological properties of biological neurons for diverse cell types is an important and difficult problem in computational neuroscience. One major challenge lies in determining the large number of parameters of such models, which are notoriously difficult to fit to experimental data. Thereby, the computational and energy costs can be significant. The study 'ElectroPhysiomeGAN: Generation of Biophysical Neuron Model Parameters from Recorded Electrophysiological Responses' by Kim et al. describes a computationally efficient approach for predicting model parameters of Hodgkin-Huxley neuron models using Generative Adversarial Networks (GANs) trained on simulation data. The method is applied to generate models for 9 non-spiking neurons in C. elegans based on electrophysiological recordings. While the generated models capture the responses of these neurons to some degree, they generally show significant deviations from the empirically observed responses in important features. Although EP-GAN shows clear benefits under limited compute, the results do not yet demonstrate the quality needed to match other state-of-the-art methods. Future work examining extended training, larger datasets, or hybrid approaches would help clarify whether EP-GAN can generate models of high quality. If so, this would indeed be a major step forward; if not, the computationally more expensive methods will remain essential.

      Strengths:

      The authors work on an important and difficult problem. A noteworthy strength of their approach is that once trained, the GANs can generate models from new empirical data with very little computational effort. The generated models reproduce the response to current injections reasonably well.

      Weaknesses:

      Major 1: Models do not faithfully capture empirical responses. While the models generated with EP-GAN reproduce the average voltage during current injections reasonably well, the dynamics of the response are generally not well captured. For example, for the neuron labeled RIM (Figure 2), the most depolarized voltage traces show an initial 'overshoot' of depolarization, i.e. they depolarize strongly within the first few hundred milliseconds but then fall back to a less depolarized membrane potential. In contrast, the empirical recording shows no such overshoot. Similarly, for the neuron labeled AFD, all empirically recorded traces slowly ramp up over time. In contrast, the simulated traces are mostly flat. Furthermore, all empirical traces return to the pre-stimulus membrane potential, but many of the simulated voltage traces remain significantly depolarized, far outside of the ranges of empirically observed membrane potentials. The authors trained an additional GAN (EP-GAN Extended) to improve the fit to the resting membrane potential. Interestingly, for one neuron (AWB), this improved the response during stimulation, which now reproduced the slowly rising membrane potentials observed empirically; however, the neuron still does not reliably return to its resting membrane potential. For the other two neurons, the authors report a decrease in accuracy in comparison to EP-GAN. While such deviations may appear small in the root mean square error (RMSE), they likely indicate a large mismatch between the model and the electrophysiological properties of the biological neuron. The authors added a second metric during the revision: the percentage of predicted membrane potential trajectories within the empirical range. I appreciate this additional analysis. As the empirical ranges across neurons are far larger than the magnitude of the dynamical properties of the response ('slow ramps', etc.), this metric does not seem well suited to quantify the degree to which these dynamical properties are captured by the models.

      Major 2: Comparison with other approaches is potentially misleading. Throughout the manuscript, the authors claim that their approach outperforms the other approaches tested. But compare the responses of the models in the present manuscript (neurons RIM, AFD, AIY) to the ones provided for the same neurons in Naudin et al. 2022 (https://doi.org/10.1371/journal.pone.0268380). Naudin et al. present models that seem to match empirical data far more accurately than any model presented in the current study. Naudin et al. achieved this using DEMO, an algorithm that in the present manuscript is consistently shown to be among the worst of all algorithms tested. I therefore strongly disagree with the authors' claim that a "Comparison of EP-GAN with existing estimation methods shows EP-GAN advantage in the accuracy of estimated parameters". This may be true in the context of the benchmark performed in the study (i.e., a condition of very limited compute resources - 18 generations with a population size of 600, compare that to 2000 generations recommended in Naudin et al.), but while EP-GAN wins under these specific conditions (and yes, here the authors convincingly show that their EP-GAN produces by far the best results!), other approaches seem to win with respect to the quality of the models they can ultimately generate.

      Major 3: As long as the quality of the models generated by the EP-GAN cannot be significantly improved, I am doubtful that it indeed can contribute to the 'ElectroPhysiome', as it seems likely that dynamics that are currently poorly captured, like slow ramps, or the ability of the neuron to return to its resting membrane potential, will critically affect network computations. If the authors want to motivate their study based on this very ambitious goal, they should illustrate that single neuron model generation with their approach is robust enough to warrant well-constrained network dynamics. Based on the currently presented results, I find the framing of the manuscript far too bold.

      Major 4: The conclusion of the ablation study 'In addition the architecture of EP-GAN permits inference of parameters even when partial membrane potential and steady-state currents profile are given as inputs' does not seem to be justified given the voltage traces shown in Figure 3. For example, for RIM, the resting membrane potential stays around 0 mV, but all empirical traces are around -40mV. For AFD, all simulated traces have a negative slope during the depolarizing stimuli, but a positive slope in all empirically observed traces. For AIY, the shape of hyperpolarized traces is off. While it may be that by their metric neurons in the 25% category are classified as 'preserving baseline accuracy', this doesn't seem justified given the voltage traces presented in the manuscript. It appears the metric is not strict enough.

    2. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their valuable feedback and comments. Based on the feedback, we revised the manuscript and believe that we have addressed most of the points the reviewers raised. Below we include a summary of key revisions and point-by-point responses to the reviewers' comments.

      Abstract/Introduction

      We further emphasized EP-GAN's strength in inferring the parameters of detailed neuron models vs specialized models with reduced parameters.

      Results

      We further elaborated on the method of training EP-GAN on synthetic neurons and validating on both synthetic and experimental neurons.

      We added a new section Statistical Analysis and Loss Extension which includes:

      - Statistical evaluation of baseline EP-GAN and other methods on neurons with multi-recording membrane potential response/steady-state current data: AWB, URX, HSN

      - Evaluation of EP-GAN with added resting potential loss + longer simulations to ensure stability of membrane potential (EP-GAN-E)

      Methods

      We added a detailed explanation of the "inverse gradient process".

      We added detailed current/voltage-clamp protocols for both synthetic and experimental validation and prediction scenarios (Table 6).

      Supplementary

      We added error distribution and representative samples for synthetic neuron validations (Fig S1)

      We added membrane potential response statistical analysis plots for existing methods for AWB, URX, HSN (Fig S6)

      We added steady-state currents statistical analysis plots on EP-GAN + existing methods for AWB, URX, HSN (Fig S7)

      We added mean membrane potential errors for AWB, URX, HSN normalized by empirical standard deviations for all methods (Table S4)

      Please see our point-by-point responses to specific feedback and comments below.

      Reviewer 1:

      First, at the methodological level, the authors should explain the inverse gradient operation in more detail, as the reconstructed voltage will not only depend on the evaluation of the right-hand side of the HH-equations, as they write but also on the initial state of the system. Why did the authors not simply simulate the responses?

      We thank the reviewer for the feedback regarding the need for further explanation. We have revised the Methods section to provide a more detailed description of the inverse gradient process. The process uses a discrete integration method, similar to Euler’s method, which takes the system’s initial conditions into account. For the EP-GAN baseline, the initial states were picked soon after the start of the stimulus to reconstruct the voltage during the stimulation period. For EP-GAN with extended loss (EP-GAN-E), introduced in this revision in the sub-section Statistical Analysis and Loss Extension, initial states before/after stimulation were also taken into account to incorporate resting voltage states into the target loss.

      Since EP-GAN is a neural network and we want the inverse gradient process to be part of the training process (i.e., making EP-GAN a “model informed network”), the process needs to be implemented as a differentiable function of the generated parameters p. This enables the derivatives from reconstructed voltages to be traced back to all network components via the back-propagation algorithm.

      Computationally, this requires the implementation of the process as a combination of discrete array operations with “auto-differentiation”, which allows automatic computation of derivatives for each operation. While explicit simulation of the responses using ODE solvers provides more accurate solutions, the algorithms used by these solvers typically do not support such specialized arrays nor are they compatible with neural network training. We thus utilized PyTorch tensors [54], which support both auto-differentiation and vectorization to implement the process.
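
      To make the idea of a differentiable, Euler-like voltage reconstruction concrete, here is a minimal pure-Python sketch. It is not the authors' implementation, which uses PyTorch tensors and auto-differentiation on the full HH-model. As assumptions for illustration, it substitutes a passive single-conductance membrane, C dV/dt = -g (V - E) + I, for the HH right-hand side, and it propagates the derivative dV/dg by hand by differentiating each Euler update, which is the quantity auto-differentiation would return for this recursion. All names and values are hypothetical.

```python
def euler_voltage_and_sensitivity(g, E, C, I, v0, dt, n_steps):
    """Euler reconstruction of V(t) for C dV/dt = -g (V - E) + I.

    Returns the voltage trace and the trace of s = dV/dg, obtained by
    differentiating each Euler update with respect to the parameter g.
    """
    v, s = v0, 0.0          # the initial state v0 does not depend on g
    voltages, sensitivities = [v], [s]
    for _ in range(n_steps):
        dv = (-g * (v - E) + I) / C      # right-hand side of the model
        ds = (-(v - E) - g * s) / C      # d(dv)/dg, using the old v and s
        v, s = v + dt * dv, s + dt * ds  # simultaneous Euler updates
        voltages.append(v)
        sensitivities.append(s)
    return voltages, sensitivities


# Hypothetical parameter values in arbitrary but consistent units.
params = dict(E=-60.0, C=1.0, I=1.0, v0=-60.0, dt=0.1, n_steps=100)
vs, ss = euler_voltage_and_sensitivity(g=0.1, **params)
print(vs[-1], ss[-1])
```

      Because the reconstruction is a chain of elementary array operations, a loss on the reconstructed voltage can be differentiated with respect to the parameters and passed back to the generator. Note that the initial state enters explicitly as `v0`, which is the dependence the reviewer asks about.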

      The authors did not allow the models time to equilibrate before starting their reconstruction simulations, as testified by the large transients observed before stimulation onset in their plots. To get a sense of whether the models reproduce the equilibria of the measured responses to a reasonable degree, the authors should allow sufficient time for the models to equilibrate before starting their stimulation protocol.

      In the added Statistical Analysis and Loss Extension sub-section under Results, we added results for EP-GAN-E, where we simulate the voltage responses with a 5-second stabilization period added at the beginning of the simulations. The added period mitigates the voltage fluctuations observed during the initial simulation phase, and we observe that the simulated voltage responses indeed reach a stable equilibrium both prior to stimulation and for the zero-stimulus current-clamp protocol (Figure 5 bottom, Column 3).

      In fact, why did the authors not explicitly include the equilibrium voltage as a target loss in their set of loss functions? This would be an important quantity that determines the opening level of all the ion channels and therefore would influence the associated parameter values.

      The EP-GAN baseline does include the equilibrium voltage as a target loss, since all current-clamp protocols used in the study (both synthetic and experimental) include a membrane potential trace where the stimulus amplitude is zero throughout the entire recording duration (see the added Table 6 for current-clamp protocols). This forces EP-GAN to optimize the resting membrane potential alongside the other, non-zero-stimulus current-clamp scenarios.

      To further study EP-GAN’s accuracy in resting potential, we evaluated EP-GAN with supplemental resting potential target loss and evaluated its performance in the sub-section Statistical Analysis and Loss Extension. The added loss, combined with 5 seconds of additional stabilization period, improved accuracy in predicting resting potentials by mitigating voltage fluctuations during the early simulation phase and made significant improvements to predicting AWB membrane potential responses where EP-GAN baseline resulted in overshoot of the resting potential.

      The authors should provide a more detailed evaluation of the models. They should explicitly provide the IV curves (this should be easy enough, as they compute them anyway), and clearly describe the time-point at which they compute them, as their current figures suggest there might be strong transient changes in them.

      We included predicted IV-curve vs ground truth plots in addition to the voltages in the supplementary materials (Figure S2, S5) in the original submitted version of the manuscript. In this revision, we added additional IV-curve plots with statistical analysis for the neurons with multi-recording data (AWB, URX, HSN) in the supplementary materials (Figure S7).

      For the evaluation of predicted membrane potential responses, we added further details in Validation Scenarios (Synthetic) under the Results section to clearly explain the current-clamp protocols used for both synthetic and experimental neurons and the time interval over which the RMSE evaluations were performed.
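
      Since the RMSE depends on the interval over which it is evaluated, a small sketch of a windowed RMSE may help fix ideas. This is an illustration under assumptions, not the authors' evaluation code: the function name, the inclusive window convention, and the toy traces are all hypothetical.

```python
import math


def windowed_rmse(t, v_pred, v_true, t_start, t_end):
    """RMSE between two voltage traces, restricted to t_start <= t <= t_end."""
    idx = [i for i, ti in enumerate(t) if t_start <= ti <= t_end]
    if not idx:
        raise ValueError("no samples fall inside the evaluation window")
    sq_err = sum((v_pred[i] - v_true[i]) ** 2 for i in idx)
    return math.sqrt(sq_err / len(idx))


# Toy traces (mV) sampled at five time points; NOT data from the study.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
v_true = [-60.0, -60.0, -40.0, -40.0, -60.0]
v_pred = [-55.0, -60.0, -37.0, -44.0, -60.0]

# Evaluate only during the (hypothetical) stimulation period, 2 <= t <= 3.
print(windowed_rmse(t, v_pred, v_true, 2.0, 3.0))
```

      With these toy values, the 5 mV error at t = 0 lies outside the window and does not contribute, which shows why the evaluation interval needs to be reported alongside the RMSE.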

      In the sub-section Statistical Analysis and Loss Extension, we introduced new statistical metrics in addition to RMSE, applied to neurons AWB, URX, and HSN: the percentage of predicted voltages that fall within the empirical range (i.e., mean ± 2 std) and the voltage error normalized by empirical standard deviations (Table S4).
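
      The two metrics described here can be sketched as follows. This is one possible reading, not the authors' code: it assumes the empirical mean and standard deviation are computed per time point across repeated recordings, uses the sample standard deviation, and counts individual time points rather than whole traces. The function names and toy values are hypothetical.

```python
from statistics import mean, stdev


def empirical_stats(recordings):
    """Per-time-point mean and sample std across repeated recordings."""
    columns = list(zip(*recordings))  # one tuple of values per time point
    return [mean(c) for c in columns], [stdev(c) for c in columns]


def fraction_within_range(pred, recordings, k=2.0):
    """Fraction of predicted points inside empirical mean +/- k * std."""
    mu, sd = empirical_stats(recordings)
    hits = sum(abs(p - m) <= k * s for p, m, s in zip(pred, mu, sd))
    return hits / len(pred)


def normalized_error(pred, recordings):
    """Mean absolute error in units of the empirical std (std > 0 only)."""
    mu, sd = empirical_stats(recordings)
    errs = [abs(p - m) / s for p, m, s in zip(pred, mu, sd) if s > 0]
    return mean(errs)


# Three toy recordings (mV) at three time points; NOT data from the study.
recordings = [[-60, -40, -58], [-62, -44, -60], [-58, -42, -62]]
pred = [-61, -30, -63]

print(fraction_within_range(pred, recordings))  # 2 of 3 points inside
print(normalized_error(pred, recordings))
```

      As the reviewer notes, a wide empirical band can contain predictions that miss slow dynamical features, so this kind of metric complements rather than replaces feature-based comparisons.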

      The authors should assess the stability of the models. Some of the models exhibit responses that look as if they might be unstable if simulated for sufficiently long periods of time. Therefore, the authors should investigate whether all obtained parameter sets lead to stable models.

      In the sub-section Statistical Analysis and Loss Extension, we included individual voltage traces generated by both EP-GAN baseline and EP-GAN-E (extended) with longer simulation (+5 seconds) to ensure stability. EP-GAN-E is able to produce equilibrium voltages that are indeed stable and within empirical bounds throughout the simulations for the zero-stimulus current-clamp scenario (column 3) for the 3 tested neurons (AWB, URX, HSN).

      Minor:

      The authors should provide a description of the model and its trainable parameters. At the moment, it is unclear which parameters of the ion channels are actually trained by the methodology.

      The detailed description of the model and its ion channels can be found in [7]. The supplementary materials also include an Excel table (predicted parameters) that lists all EP-GAN fitted parameters for the 9 neurons included in the study (+3 new parameter sets for AWB, URX, HSN using EP-GAN-E), the labels for trainability, and the respective lower/upper bounds used during training data generation. In the revised manuscript, we further elaborated on the above information in the second paragraph of the Results section.

      Reviewer 2:

      Major 1: While the models generated with EP-GAN reproduce the average voltage during current injections reasonably well, the dynamics of the response are not well captured. For example, for the neuron labeled RIM (Figure 2), the most depolarized voltage traces show an initial 'overshoot' of depolarization, i.e. they depolarize strongly within the first few hundred milliseconds but then fall back to a less depolarized membrane potential. In contrast, the empirical recording shows no such overshoot. Similarly, for the neuron labeled AFD, all empirically recorded traces slowly ramp up over time. In contrast, the simulated traces are mostly flat. Furthermore, all empirical traces return to the pre-stimulus membrane potential, but many of the simulated voltage traces remain significantly depolarized, far outside of the ranges of empirically observed membrane potentials. While these deviations may appear small in the Root mean Square Error (RMSE), the only metric used in the study to assess the quality of the models, they likely indicate a large mismatch between the model and the electrophysiological properties of the biological neuron.

      EP-GAN’s main contribution is the inference of detailed neuron model parameters in a compute-efficient manner. This is a difficult problem to address even with current state-of-the-art fitting algorithms. While EP-GAN is not perfect in capturing the dynamics of the responses and RMSE does not fully reflect the quality of predicted electrophysiological properties, RMSE is a generic error metric for time series that is easily interpretable and applicable to all methods. Using this metric, our studies show that EP-GAN’s overall prediction quality exceeds that of existing methods when given identical optimization goals in a compute-normalized setup.

      In our revised manuscript, we included a new section, Statistical Analysis and Loss Extension, under the Results section, where we performed additional statistical evaluations (e.g., % of predicted responses within the empirical range) of EP-GAN’s predictions for neurons with multi-recording data. The results show that predicted voltage responses from the EP-GAN baseline (introduced in the original manuscript) are, in general, within the empirical range, with ~80% of its responses falling within ± 2 empirical standard deviations, which is higher than for existing methods: DEMO (57.9%), GDE3 (37.9%), NSDE (38%), NSGA2 (60.2%).

      Major 2: Other metrics than the RMSE should be incorporated to validate simulated responses against electrophysiological data. A common approach is to extract multiple biologically meaningful features from the voltage traces before, during and after the stimulus, and compare the simulated responses to the experimentally observed distribution of these features. Typically, a model is only accepted if all features fall within the empirically observed ranges (see e.g. https://doi.org/10.1371/journal.pcbi.1002107). However, based on the deviations in resting membrane potential and the return to the resting membrane potential alone, most if not all the models shown in this study would not be accepted.

      In our original manuscript, because each neuron had only a single set of recording data, RMSE was chosen as the most generic and interpretable error metric. We conducted additional electrophysiological recordings for 3 neurons in prediction scenarios (AWB, URX, HSN) and performed statistical analysis of the generated models in the sub-section Statistical Analysis and Loss Extension. Specifically, we evaluated the percentage of predicted voltage responses that fall within the empirical range (empirical mean ± 2 std, p ~ 0.05), which encompasses the responses before, during and after the stimulus (Figure 5, Table 5), and the mean membrane potential error normalized by empirical standard deviations (Table S4).

      The results show that the EP-GAN baseline achieves an average of ~80% of its predicted responses falling within the empirical range, which is higher than the other methods: DEMO (57.9%), GDE3 (37.9%), NSDE (38%), NSGA2 (60.2%). Supplementing EP-GAN with an additional resting potential loss (EP-GAN-E) increased the percentage to ~85%, with noticeable improvements in reproducing dynamical features for AWB (Figure 5). Evaluations of membrane potential errors normalized by empirical standard deviations showed similar results, where the EP-GAN baseline and EP-GAN-E have average errors of 1.0 std and 0.7 std respectively, outperforming DEMO (1.7 std), GDE3 (2.0 std), NSDE (3.0 std) and NSGA (1.5 std) (Table S4).

      Major 3: Abstract and introduction imply that the 'ElectroPhysiome' refers to models that incorporate both the connectome and individual neuron physiology. However, the work presented in this study does not make use of any connectomics data. To make the claim that ElectroPhysiomeGAN can jointly capture both 'network interaction and cellular dynamics', the generated models would need to be evaluated for network inputs, for example by exposing them to naturalistic stimuli of synaptic inputs. It seems likely that dynamics that are currently poorly captured, like slow ramps, or the ability of the neuron to return to its resting membrane potential, will critically affect network computations.

      In the paper, EP-GAN is introduced as a parameter estimation method that can aid the development of ElectroPhysiome, which is a network model - these are two different method types and we do not claim EP-GAN is a model that can capture network dynamics. To avoid possible confusion, we made further clarifications in the abstract/introduction that EP-GAN is a machine learning approach for neuron HH-parameter estimation.

      I find it hard to believe that the methods EP-GAN is compared to could not perform any better. For example, multi-objective optimization algorithms are often successful in generating models that match empirical observations very well, but features used as target of the optimization need to be carefully selected for the optimization to succeed. Likely, each method requires extensive trial and error to achieve the best performance for a given problem. It is therefore hard to do a fair comparison. Given these complications, I would like to encourage the authors to rethink the framing of the story as a benchmark of EP-GAN vs. other methods. Also, the number of parameters does not seem that relevant to me, as long as the resulting models faithfully reproduce empirical data. What I find most interesting is that EP-GAN learns general relationships between electrophysiological responses and biophysical parameters, and likely could also be used to inspect the distribution of parameters that are consistent with a given empirical observation.

      We thank the reviewer for providing this perspective. While it is indeed difficult to have a completely fair comparison between existing optimization methods vs EP-GAN due to the fundamental differences in their algorithms, we believe that the current comparisons with other methods are justified as they provide baseline performance metrics to test EP-GAN for its intended use cases.

      The main strength of EP-GAN, as previously mentioned, is in its ability to efficiently navigate large detailed HH-models with many parameters so that it can aid in the development of nervous system models such as ElectroPhysiome, potentially fitting hundreds of neurons in a time efficient manner.

      While EP-GAN’s ability to learn the general relationship between electrophysiological responses and parameter distributions is indeed interesting and warrants a more careful examination, it is not the main focus of the paper, since in this work we focus on introducing EP-GAN as a methodology for parameter inference.

      In this context, we believe the comparisons with other methods, conducted in a compute-normalized manner (i.e., each method is given the same number of simulations) and with identical optimization targets, provide an adequate framework for evaluating the aforementioned EP-GAN aim. Indeed, while EP-GAN excels with larger HH-models, it performs slightly worse than DE for smaller models such as the one used by [16], despite being more compute efficient (Table S2).

      To emphasize the EP-GAN aim, we revised the main manuscript description to focus on its intended use in parameter inference of detailed neuron parameters vs specialized models with reduced parameters.

      I could not find important aspects of the methods. What are the 176 parameters that were targeted as trainable parameters? What are the parameter bounds? What are the remaining parameters that have been excluded? What are the Hodgkin-Huxley models used? Which channels do they represent? What are the stimulus protocols?

      The detailed description and development of the HH-model that we use, along with its ion channel list, can be found in [7]. The supplementary materials also include an Excel table (predicted parameters) that lists all EP-GAN fitted parameters for the 9 neurons (+3 new parameter sets for AWB, URX, HSN using EP-GAN-E), the labels for trainability, and the parameter bounds used during the generation of training data.

      We also added a new Table which details the current/voltage clamp protocols used for 9 neurons including the ones used for evaluating EP-GAN-E, which was supplemented with longer simulation time to ensure voltage stability (please see Table 6).

      I could not assess the validation of the EP-GAN by modeling 200 synthetic neurons based on the data presented in the manuscript since the only reported metric is the RMSE (5.84mV and 5.81mV for neurons sampled from training data and testing data respectively) averaged over all 200 synthetic neurons. Please report the distribution of RMSEs, include other biologically more relevant metrics, and show representative examples. The responses should be carefully investigated for the types of mismatches that occur, and their biological relevance should be discussed. For example, is the EP-GAN biased to generate responses with certain characteristics, like the 'overshoot' discussed in Major 1? Is it generally poor at fitting the resting potential?

      We thank the reviewer for the feedback regarding the need for additional supporting data for synthetic neuron validations. In the revised supplementary materials Figure S1, we included the distribution of RMSE errors for both groups of synthetic neuron validations (validation/test set) and representative samples for both EP-GAN baseline and EP-GAN-E. Notably, the inaccuracies observed during the experimental neuron predictions (e.g., resting potential, voltage overshoot) do not necessarily generalize to synthetic neurons, indicating that such mismatches could stem from the differences between synthetic neurons used for training and experimental neurons for predictions. While synthetic neurons are generated according to empirically determined parameter bounds, some experimental neuron types are rarer than the others and may also involve other channels that have not been recorded or modeled in [7], which can affect the quality of predicted parameters (see 2nd and 4th paragraphs of Discussions section for more detail). Also, properties such as recording error/noise that are often present in experimental neurons are not fully accounted for in synthetic neurons.

      To further study how these mismatches can be mitigated, in the revision we added an extended version of EP-GAN where the target loss was supplemented with the resting potential and simulations included a 5-second stabilization period (EP-GAN-E, described in Statistical Analysis and Loss Extension). With these extensions, EP-GAN-E improved its accuracy on both resting potentials and dynamical features, with the most notable improvement on AWB, where the predicted voltage responses closely match the slowly rising voltage response during stimulation. EP-GAN-E is an example of how the loss function can be further extended to account for additional experimental features.

      Furthermore, the conclusion of the ablation study ('EP-GAN preserves reasonable accuracy up to a 25% reduction in membrane potential responses') does not seem to be justified given the voltage traces shown in Figure 3. For example, for RIM, the resting membrane potential stays around 0 mV, but all empirical traces are around -40mV. For AFD, all simulated traces have a negative slope during the depolarizing stimuli, but a positive slope in all empirically observed traces. For AIY, the shape of hyperpolarized traces is off.

      Since the EP-GAN baseline optimizes voltage responses during the stimulation period, RMSE was also evaluated with respect to this period. From these errors, we evaluated whether the predicted voltage error for each ablation scenario fell within 2 standard deviations of the mean error obtained from synthetic neuron test data (i.e., the baseline performance). We found that for input ablation of voltage responses, the error was within this range up to a 25% reduction, whereas for steady-state current input ablation, the 25%, 50% and 75% reductions all resulted in errors within the range.
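
      The criterion described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the use of the sample standard deviation, and the two-sided reading of "within 2 standard deviations" are all assumptions. A one-sided upper bound (error no larger than mean + 2 SD) would be an equally plausible reading.

```python
import math
import statistics

def rmse(predicted, target):
    """Root-mean-square error between two equal-length voltage traces (mV)."""
    if len(predicted) != len(target):
        raise ValueError("traces must have the same length")
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))

def within_baseline(ablation_error, baseline_errors, n_std=2.0):
    """Check whether an ablation error lies within n_std standard deviations
    of the mean baseline error (assumed two-sided, sample standard deviation)."""
    mean = statistics.mean(baseline_errors)
    std = statistics.stdev(baseline_errors)
    return abs(ablation_error - mean) <= n_std * std
```

      Here `baseline_errors` stands for the per-neuron RMSE values from the synthetic test set, and `ablation_error` for the RMSE of one ablation scenario, both restricted to the stimulation period.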

      We extended the “Ablation Studies” sub-section so that the above reasoning is better communicated to the readers.

      Additionally, I found a number of minor issues:

      Minor 1: Table 1 lists the number of HH simulations as '32k (11k · 3)'. Should it be 33k, since 11.000 times 3 is 33.000? Please specify the exact number of samples.

      Minor 2: x- and y-ticks are missing in Fig 2, Fig 3, Fig S1, Fig S2, Fig S3 and Fig S4.

      Minor 3: All files in the supplementary zip file should be listed and described.

      Minor 4: Code for training the GAN, generation of training datasets and for reproducing the figures should be provided.

      Minor 5: In the reference (Figure 3A, Table 1 Row 2): should this refer to Table 2?

      Minor 6: 'the ablation is done on stimulus space where a 50% reduction corresponds to removing half of the membrane potential responses traces each associated with a stimulus.' - which half is removed?

      We thank the reviewer for pointing out these errors in the original manuscript. The revised manuscript includes corrections for these items. We will publish the python code reproducing the results in the public repository in the near future.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Since multiple Reviewers requested that the results describing effects of TTX treatment on GluA2 receptor levels detected by immunofluorescence and confocal imaging be revised, we have made substantial changes, which are described below. We believe the changes have greatly improved the manuscript and thank the reviewers for their comments.

      Lack of significant increase in GluA2 receptor data is due to too few cultures sampled; anything could have happened [in one] particular dissociation. A concern that the TTX effect might vary greatly from culture to culture was why we felt it was important to match the receptor measurements on the same cultures that we recorded mEPSCs. We now present the culture means in Figure 5A (mEPSCs) and 5B (GluA2 receptor cluster size). These plots make it clear that the variability in the GluA2 receptor cluster size effect is not attributable to a failure of that culture to show a homeostatic effect. That is, the variability in GluA2 receptor effect is independent of the variability in mEPSC effect. To increase sample size, we examined 2 additional cultures for synaptic GluA2 receptor levels in control vs. TTX treatment. These cultures showed very modest increases (Figure 5C). When cell means from these experiments were pooled with those from the 3 matched cultures, the TTX effect was still not statistically significant (Figure 5G).

      Lack of significant increase in GluA2 receptor data is due to the choice to restrict our analysis to the primary dendrite, close to the cell body. We restricted our analysis to the primary dendrite because Figure 3 in Turrigiano et al, 1998, shows the increased response to exogenously applied glutamate after TTX treatment is greatest close to the cell body and wanes as the glutamate is applied further away (added to Results, new lines 388-389).

      Variability in GluA2 receptor data is due to the much smaller number of synapses sampled, compared to mEPSCs. We matched the sampling for mEPSC amplitude data to that of imaging data by taking only 20 samples from each electrophysiological recording. Each mEPSC represents one synapse; in a set of 20 mEPSCs some might come from the same synapse, so that we are sampling from ≤ 20 synapses. The effect of TTX on mEPSC amplitudes remained significant despite the reduced samples per cell (Figure 5A).
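
      A minimal sketch of this matched subsampling, under stated assumptions: the response does not say how the 20 events were chosen, so random selection without replacement is assumed here, and the function name and arguments are hypothetical.

```python
import random
import statistics

def subsampled_cell_mean(amplitudes, n_samples=20, rng=None):
    """Mean mEPSC amplitude for one recording, computed from n_samples events
    drawn at random without replacement (the selection rule is an assumption)."""
    rng = rng or random.Random()
    if len(amplitudes) < n_samples:
        raise ValueError("recording has fewer events than n_samples")
    return statistics.mean(rng.sample(list(amplitudes), n_samples))
```

      Each recording then contributes a single cell mean based on 20 events, which puts the electrophysiological data on roughly the same per-cell sampling footing as the imaging data.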

      Why do we fail to show a significant increase in receptors when this has been shown in many studies?

      We have added to our discussion the point that several studies, including Wang et al. 2019, use the number of puncta, rather than the number of cells, as the sample number. We ran an analysis of GluA2 receptor cluster size where we sampled multiple synapses per cell, and used the number of clusters as the sample n. We found that even with as few as 6 synapses randomly selected from each cell, the effect of TTX on GluA2 receptor cluster size became highly significant (p = 0.001 for data from 3 cultures and p = 0.005 for data from 5 cultures) (see new lines 400-406 in Discussion). In sum, our data are not very different from those of some previous studies. We are not arguing that receptors do not increase. Instead, our point is that the increase is more variable than the increase in mEPSC amplitude and thus takes a much bigger sample size to detect. That is, the mEPSC data consistently show a ~20-25% increase, whereas the receptor data do not always show an increase and sometimes the increase is only ~10%. Finally, we added two matched culture experiments examining synaptic GluA1 receptor cluster characteristics. GluA1 receptor cluster size decreased in one culture, and increased very modestly in the other (Supplemental Figure 1B), whereas mEPSC amplitude robustly increased (Supplemental Figure 1A; Results, new lines 265-268).
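
      The difference between the two choices of sample unit can be illustrated with a short sketch. The data layout (a mapping from cell identifier to a list of cluster sizes) and the function names are hypothetical, and no statistical test is run here; the point is only how the sample n changes.

```python
import random
import statistics

def cell_level_samples(clusters_by_cell):
    """One value per cell: the mean cluster size (sample n = number of cells)."""
    return [statistics.mean(sizes) for sizes in clusters_by_cell.values()]

def cluster_level_samples(clusters_by_cell, per_cell=6, rng=None):
    """Pool per_cell randomly chosen clusters from every cell
    (sample n = per_cell * number of cells).
    Raises ValueError if a cell has fewer than per_cell clusters."""
    rng = rng or random.Random()
    pooled = []
    for sizes in clusters_by_cell.values():
        pooled.extend(rng.sample(list(sizes), per_cell))
    return pooled
```

      With 6 clusters per cell the sample n is six times larger than in the cell-level analysis, which is one reason a test can reach significance at the cluster level while the cell-level comparison does not.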

      We conclude that these data support the idea that there is another contributor to the TTX-induced increase in quantal size.

      Other changes in presentation of GluA2 receptor results: Since the effects on intensity and integral are of lesser magnitude than that on cluster size, we have removed these results from the graphs, although they are presented in Table 1. We have removed Figure 6, the presentation of individual culture results, since these results are now conveyed in Figure 5A-C. We have removed graphs depicting GluA2 receptor cluster size in response to TTX in Rab3A-/- cultures, but these data are still presented in Table 1.

      We address other detailed comments below.

      Public Reviews:

      Reviewer #1 (Public review):

      (2) The effects of Rab3A on TTX-induced mini frequency modulation remains unclear, because TTX does not induce a change in mini frequency in the Rab3A+/Ebd control (Fig. 2). The respective conclusions should be revised accordingly (l. 427).

      The effects on mini frequency were added for completeness, but given the lack of consistently significant changes with TTX treatment or changes in the KO or Rab3A<sup>Ebd/Ebd</sup> cultures, we have removed comment on these results from the Discussion.

      (3) The model is still not supported by the data. In particular, data supporting a negative regulation of Rab3A by APs, Rab3A-dependent release of a tropic factor, or a Rab3A-dependent increase in GluA2 abundance are not presented.

      We have removed the model from the manuscript.

      (4) Data points are not overlapping and appear "quantal" in most box plots. How were the data rounded?

      The appearance of quantal variation in cell amplitude means is due to the binning that is part of the creation of the box plot. We have not remade the figures without binning, because the binning provides a visual depiction of the distribution of the data points. We have added the bin sizes to the appropriate figure legends.
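
      As an illustration of how binning can produce this appearance, the sketch below snaps each value to the center of a fixed-width bin. This is an assumption about the general mechanism, not a description of the plotting software actually used; the function name and bin width are hypothetical.

```python
import math

def snap_to_bin_centers(values, bin_width):
    """Replace each value with the center of the fixed-width bin containing it.
    Nearby values collapse onto the same plotted position, so the points
    look 'quantal' even though the underlying data are continuous."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    return [math.floor(v / bin_width) * bin_width + bin_width / 2 for v in values]
```

      For example, with a bin width of 0.5 pA, cell means of 13.3 and 13.4 pA would both be drawn at 13.25 pA.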

      Reviewer #2 (Public review):

      However, the authors still have not provided further investigation of the mechanisms behind the role of Rab3A in this form of plasticity, and the revision therefore has added little to the significance of the study. Moreover, the experimental design for the investigation of the mismatch between mEPSC amplitude and GluA2 cluster fluorescence remains questionable, making it difficult to draw any credible conclusions from groups of data that not only look similar to the eye but also show no significance statistically.

      To our knowledge, no other study has matched measurements of mEPSC amplitude in the same cultures where synaptic receptor levels were assessed. As stated above, we have revised the presentation of GluA2 receptor results, concluding from the lack of significant effects on receptor levels that the mEPSC amplitude increase cannot be fully explained by the receptor data (which is strengthened by addition of two more cultures analyzed for GluA2 immunofluorescence). This is an important addition to the significance of the study.

      In summary, this study establishes that neuronal Rab3A plays a role in homeostatic synaptic plasticity, but so do a number of other molecules that have been implicated in homeostatic synaptic plasticity in the past two decades (only will grow with the new techniques such as RNAseq). Without going beyond this finding and demonstrating how exactly Rab3A participates in the induction and/or expression of this form of plasticity, or maybe the potential Rab3A-mediated functional and behavioral defects in vivo, the contribution of the current study to the field is limited. However, given the presynaptic location of Rab3A, this finding could serve as a starting point for researchers interested in pre-postsynaptic cross-talk during homeostatic plasticity in general.

      We previously published a review in which we list 19 molecules known at that time to be important for homeostatic synaptic plasticity (see Table 2, Koesters et al., 2024), and they fall into two categories: molecules involved in glutamate receptor expression or trafficking, and signaling molecules. Rab3A is the first synaptic vesicle protein to be implicated in homeostatic plasticity of quantal size. We have added this point to the Discussion, new lines 473-476. By demonstrating that Rab3A is not acting in glia (which release TNF, which regulates receptor expression), and that GluA2 receptor levels do not explain the homeostatic mEPSC increase in our experimental conditions, we have ruled out two major mechanisms.

      Reviewer #3 (Public review):

      Other questions arise from the NASPM experiments, used to justify looking at GluA2 (and not GluA1) in the immunostaining. First, there is a frequency effect that is unclear in origin. One would expect NASPM to merely block some fraction of the post-synaptic current, and not affect pre-synaptic release or block whole synapses. However the change in frequency seems to argue (as the authors do) that some synapses only have CP-AMPARs, while the rest of the synapses have few or none. Another possibility is that there are pre-synaptic NASPM-sensitive receptors that influence release probability. Further, the amplitude data show a strong trend towards smaller amplitude following NASPM treatment (Fig 3B). The p value for both control and TTX neurons was 0.08 - it is very difficult to argue that there is no effect. The decrease on average is larger in the TTX neurons, and some cells show a strong effect. It is possible there is some heterogeneity between neurons on whether GluA1/A2 heteromers or GluA1 homomers are added during HSP. This would impact the weakly supported conclusions about the GluA2 imaging vs mEPSC amplitude data.

      We cannot rule out that the NASPM-induced decrease in mEPSC frequency is due to a loss of presynaptic glutamate receptor enhancement of release probability, and have added this statement to the Results, new lines 202-204. Regarding the p value of 0.08—we are not arguing that NASPM has no effect on mEPSC amplitude, only that it has no effect on the homeostatic increase in amplitude after TTX treatment. An increase in GluA1/A2 heteromers should have been detected in our imaging studies.

      Unaddressed issues that would greatly increase the impact of the paper:

      (1) Is Rab3A acting pre-synaptically, post-synaptically or both? The authors provide good evidence that Rab3A is acting within neurons and not astrocytes. But where it is acting (pre or post) would aid substantially in understanding its role. They could use sparse knockdown of Rab3A, or simply mix cultures from KO and WT mice (with appropriate tags/labels). The general view in the field has been that HSP is regulated post-synaptically via regulation of AMPAR trafficking, and considerable evidence supports this view. The more support for their suggestion of a pre-synaptic site of control, the better.

      We agree that doing co-cultures of Rab3A-/- and Rab3A+/+ neurons is the definitive experiment to determine the locus of action of Rab3A in homeostatic synaptic plasticity. We hope to examine this question in a future manuscript.

      (2) Rab3A is also found at inhibitory synapses. It would be very informative to know if HSP at inhibitory synapses is similarly affected. This is particularly relevant as at inhibitory synapses, one expects a removal of GABARs (ie the opposite of whatever is happening at excitatory synapses). If both processes are regulated by Rab3A, this might suggest a role for this protein more upstream in the signaling; an effect only at excitatory synapses would argue for a more specific role just at these synapses.

      We agree that it would be very interesting to determine if the homeostatic decrease in mIPSCs after activity blockade depends on Rab3A. We hope to address this question in the future.

      Recommendations for the authors:

      Reviewer #3 (Recommendations for the authors):

      Minor points:

      The abstract is a bit repetitive in places. Some editing would be advised.

      We did not identify anything repetitive in the abstract except the parallel construction referring to the previous findings at the NMJ and current findings in cortical neurons. However, we have eliminated a section in the introduction which went into detail about the receptor imaging results (previous lines 103-110).

      Line 77: 'shift toward early awakening' is unclear; do you mean shorter sleep/wake cycle? Other circadian issues? A more complete description is needed.

      We have moved the additional detail about the Earlybird mutation’s effect on circadian period from the Results to the Introduction, new lines 77 to 79.

      The results section has many passages that seem more like discussion, offering various interpretation and alternatives for the data. While some commentary is appropriate, to justify the next series of experiments and maintain a logical flow, this manuscript has rather a high amount of this. Some editing and shifting material to the discussion might be warranted.

      We have reduced the commentary in the Results section.

      Line 245: GluA2 homomers are really unlikely, as they won't pass current (unless unedited) and don't often if ever form. But GluA2/A3 heteromers are likely (and detected by their methods).

      GluA2 homomers do conduct current, albeit less than heteromers (Swanson et al., 1997; Oh and Derkach, 2005; Coombs et al., 2019). [The Oh and Derkach paper shows a GluA2 homomer current in Supplementary Figure 3]. We have modified the text to acknowledge that the GluA2 receptor imaging will detect heteromers and homomers (Results, new lines 214 to 215).

      Line 258: If the number of synaptic pairs analyzed was usually <20, what was the average and range of pairs? This gets into the sampling issue.

      We have added the average number of synaptic sites (20.4 ± 6.5) and range (11-38) to the text, Results, new line 229.

      Are the stats of the baseline mEPSC amplitude and frequency shifts (WT vs KO on WT feeder layer) given somewhere (lines 398-402)? If not, please add them.

      These stats have been added to the text: mEPSC amplitude (CON, WT on WT, 13.3 ± 0.5 pA; CON, KO on WT, 15.2 ± 1.1 pA, p = 0.23, Kruskal-Wallis test), new lines 325-326, and frequency (CON, WT on WT, 2.54 ± 0.57 sec<sup>-1</sup>; CON, KO on WT, 4.46 ± 1.21 sec<sup>-1</sup>, p = 0.23, Kruskal-Wallis test), new lines 329-330.

      25mM K+ is going to be much more than 'mildly' depolarizing (line 697). Should just skip that word.

      ‘mildly’ has been removed.

      The section on MiniAnalysis seems overly argumentative, and there is no need to discuss flaws in the Wu paper. The important thing (a bit buried at the end of this section) is that the manual mini selection was done blind to condition, which is the normal way of dealing with potential bias. It would be better to limit the methods to describing what was done.

      The bulk of the justification of manual analysis has been removed from the text.

      The discussion of potential conductance changes (lines 534-6) seems somewhat unwarranted.

      Modification of GluA1 phosphorylation in the GluA1/A2 heteromer would not be detected by NASPM (and the NASPM data being a bit inconclusive anyway). Further, auxiliary subunits (like TARPs) can alter conductance of any of the AMPARs. So I don't think they have enough data to exclude such a possibility.

      The discussion of contributions of conductance has been removed from the text.

      Coombs ID, Soto D, McGee TP, Gold MG, Farrant M, Cull-Candy SG (2019) Homomeric GluA2(R) AMPA receptors can conduct when desensitized. Nat Commun 10:4312.

      Oh MC, Derkach VA (2005) Dominant role of the GluR2 subunit in regulation of AMPA receptors by CaMKII. Nat Neurosci 8:853-854.

      Swanson GT, Kamboj SK, Cull-Candy SG (1997) Single-channel properties of recombinant AMPA receptors depend on RNA editing, splice variation, and subunit composition. J Neurosci 17:5869.

    1. Authors’ Response (27 January 2025)

      Revised preprint

      GENERAL ASSESSMENT

      The revised manuscript by Henze et al. presents a novel and significant contribution to the field, demonstrating that lysophospholipids (LPC) act as endogenous activators of pannexin channels. The study provides compelling evidence that LPC activation of PANX1 and PANX2 channels facilitates the release of signaling molecules critical for immune responses, particularly in the context of inflammation and inflammasome activation. The removal of the cryo-EM data, addressing concerns about the binding site and mechanism of LPC activation, is a thoughtful revision that enhances the manuscript's focus. While the authors have addressed many of the previous critiques, key mechanistic questions remain regarding channel permeability and the specificity of LPC-induced metabolite release.

      RECOMMENDATIONS

      Essential revisions:

      1. While the study suggests that LPC-induced PANX1 activation results in the release of metabolites via the pannexin channel pore, no direct evidence is provided to confirm this. It is possible that the release occurs through alternative pathways or that detected metabolites are by-products of other permeating substances. Additional experiments or a more thorough discussion of these possibilities would enhance the manuscript’s rigor.

      We thank the reviewers for pointing this out. While the overlap between our current study and the published secretomics studies by Medina et al. supports the direct release of these metabolites, we agree that our experiments do not rule out the possibility of an alternative pathway. We have updated the discussion to acknowledge this limitation.

      2. The manuscript lacks data on the ion selectivity of LPC-activated PANX1, an important aspect for understanding the channel’s permeability profile. A comparison of the selectivity of LPC-induced PANX1 currents to those activated by other stimuli, such as C-terminal cleavage or depolarization, would clarify whether LPC induces a unique or comparable open state. The reviewers appreciate the technical challenges of these experiments, however, thus a more thorough discussion of the uncertainties may be appropriate.

      We agree that characterizing the channel’s permeability profile is essential. However, as the reviewers noted, prolonged or repeated lysophospholipid (LPC) perfusion often destabilizes the patch, making it difficult to apply conventional reversal potential analysis. Additionally, the application of voltage ramps would inevitably activate Panx1 channels, further complicating the interpretation of the results.

      As a workaround, we analysed whole-cell current density before and after LPC treatment at -60 mV using different buffers containing various anions and cations. Both Panx1 and Panx2 channels produced significantly larger currents in NaCl or NMDG-Cl compared to NaGluconate, indicating that LPC-activated channels are more selective for anions under these conditions. Interestingly, currents in NMDG-Cl were slightly smaller than in NaCl, suggesting that NMDG may have an inhibitory effect on these channels activated by LPC. While we acknowledge that this analysis does not directly compare ion selectivity within the same patch, the almost negligible current observed in NaGluconate strongly suggests that small anion conductance through both Panx1 and Panx2 channels is greater than cation conductance. We have included these findings in the new Fig. 1.

      Another important aspect of Panx1 channels is their ability to allow the permeation of cationic molecules, such as YO-PRO-1, when the C-terminus is cleaved. In our experiments, we demonstrate that LPC-activated full-length Panx1 channels are permeable to both anions (e.g., ATP and Cl) and cations (e.g., YO-PRO-1). Although the precise mechanism underlying ion selectivity remains to be elucidated, our data support the conclusion that LPC can facilitate the release of multiple signalling molecules through full-length Panx1.

      3. The YO-PRO-1 uptake observed in the absence of LPC contradicts prior findings (e.g., Bayliss et al., eLife, 2021) that full-length PANX1 does not release large molecules like ATP or YO-PRO-1 under similar conditions. This discrepancy raises concerns about the interpretation of the proteoliposome experiments. Addressing this by ion selectivity experiments, or discussing differences in experimental conditions and reconstitution protocols, could provide clarity as well as strengthen the conclusions.

      We are also aware of the discrepancy and currently lack a clear understanding of the underlying mechanism. However, there are substantial differences between the two experimental setups that may account for the divergent results.

      First, we used human Panx1 tagged in the flexible intracellular loop, whereas the Bayliss group used frog Panx1 tagged with GFP at the C-terminus. This difference in tagging and species may have contributed to variations in basal activity. Second, the lipid compositions used for reconstitution were significantly different. In our experiments, we used 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphoethanolamine (POPE), 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphoglycerol (POPG), and sphingomyelin, while the Bayliss group employed a mixture of 70% brain phosphatidylcholine, 15% total brain lipid extract, 14% cholesterol, and 1% phosphatidylinositol 4,5-bisphosphate. Given that the function of many ion channels is heavily influenced by lipid composition, these differences could have contributed to the observed discrepancy. Regardless, our functional reconstitution experiments clearly demonstrate that LPC stimulates YO-PRO-1 uptake in a dose-dependent manner, which forms the foundation of our interpretation. We have included this discussion in the revised manuscript.

      Optional suggestions:

      1. The manuscript refers to experiments conducted in human monocytes, but the actual cell line used was THP-1, a human monocytic leukemia cell line. This should be made clearer in the text to avoid confusion.

      We are confused by this comment. Our manuscript states that we used "phorbol 12-myristate 13-acetate (PMA)-differentiated human THP-1 monocytes" which should clearly indicate which cells we used in the study.

      2. The behavior of LPC at concentrations exceeding its Critical Micelle Concentration (CMC) (4–8 μM for 16:0 Lyso-PC) should be considered. The authors should discuss whether micelle formation affects the observed channel activation and how this might influence the interpretation of the results.

      Our dose-response experiments, shown in Supplementary Fig. 3, suggest that CMC does not appear to affect their activity. While we are eager to understand the delivery method and action mechanisms of lysophospholipids, such studies are beyond the scope of the current work.

      (This is a response to peer review conducted by Biophysics Colab on version 2 of this preprint.)

    2. Consolidated Peer Review Report (15 January 2025)

      Revised preprint

      GENERAL ASSESSMENT

      The revised manuscript by Henze et al. presents a novel and significant contribution to the field, demonstrating that lysophospholipids (LPC) act as endogenous activators of pannexin channels. The study provides compelling evidence that LPC activation of PANX1 and PANX2 channels facilitates the release of signaling molecules critical for immune responses, particularly in the context of inflammation and inflammasome activation. The removal of the cryo-EM data, addressing concerns about the binding site and mechanism of LPC activation, is a thoughtful revision that enhances the manuscript's focus. While the authors have addressed many of the previous critiques, key mechanistic questions remain regarding channel permeability and the specificity of LPC-induced metabolite release.

      RECOMMENDATIONS

      Essential revisions:

      1. While the study suggests that LPC-induced PANX1 activation results in the release of metabolites via the pannexin channel pore, no direct evidence is provided to confirm this. It is possible that the release occurs through alternative pathways or that detected metabolites are by-products of other permeating substances. Additional experiments or a more thorough discussion of these possibilities would enhance the manuscript’s rigor.
      2. The manuscript lacks data on the ion selectivity of LPC-activated PANX1, an important aspect for understanding the channel’s permeability profile. A comparison of the selectivity of LPC-induced PANX1 currents to those activated by other stimuli, such as C-terminal cleavage or depolarization, would clarify whether LPC induces a unique or comparable open state. The reviewers appreciate the technical challenges of these experiments, however, thus a more thorough discussion of the uncertainties may be appropriate.
      3. The YO-PRO-1 uptake observed in the absence of LPC contradicts prior findings (e.g., Bayliss et al., eLife, 2021) that full-length PANX1 does not release large molecules like ATP or YO-PRO-1 under similar conditions. This discrepancy raises concerns about the interpretation of the proteoliposome experiments. Addressing this by ion selectivity experiments, or discussing differences in experimental conditions and reconstitution protocols, could provide clarity as well as strengthen the conclusions.

      Optional suggestions:

      1. The manuscript refers to experiments conducted in human monocytes, but the actual cell line used was THP-1, a human monocytic leukemia cell line. This should be made clearer in the text to avoid confusion.
      2. The behavior of LPC at concentrations exceeding its Critical Micelle Concentration (CMC) (4–8 μM for 16:0 Lyso-PC) should be considered. The authors should discuss whether micelle formation affects the observed channel activation and how this might influence the interpretation of the results.

      REVIEWING TEAM

      Reviewed by:

      Jorge Contreras, Professor, University of California, Davis, USA: electrophysiology and ion channel mechanisms

      Wei Lü, Professor, Northwestern University, USA: ion channel mechanisms, X-ray crystallography and cryo-EM

      Xiaofeng Tan, Research Fellow, NINDS, NIH, USA: structural biology (X-ray crystallography and cryo-electron microscopy) and ion channel mechanisms

      Curated by:

      Kenton J. Swartz, Senior Investigator, NINDS, NIH, USA

      (This consolidated report is a result of peer review conducted by Biophysics Colab on version 2 of this preprint. Comments concerning minor and presentational issues have been omitted for brevity.)

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors investigate the role of BEND2, a novel regulator of meiosis, in both male and female fertility. Huang et al. have created a mouse model where the full-length BEND2 transcript is depleted but the truncated BEND2 version remains. This mouse model is fertile, and the authors used it to study the role of BEND2 in both male and female meiosis. Overall, the full-length BEND2 appears dispensable for male meiosis. The more interesting phenotype was observed in females. Females exhibit a lower ovarian reserve, suggesting that full-length BEND2 is involved in the establishment of the primordial follicle pool.

      Strengths:

      The authors generated a mouse model that enabled them to study the role of BEND2 in meiosis. The role of BEND2 in female fertility is novel and enhances our knowledge of genes involved in the establishment of the primordial follicle pool.

      Weaknesses:

      The manuscript extensively explores the role of BEND2 in male meiosis; however, a more interesting result was obtained from the study of female mice. Only a few experiments were performed using female mice; therefore, more experiments should be performed to complete the story of the role of BEND2 in female fertility. In addition, the title and abstract of the manuscript do not align with the story, as female fertility is only a small portion of the data compared to the male fertility section.

      We appreciate the reviewer’s thoughtful summary, recognition of the strengths of our study, and constructive feedback. In the revised manuscript, we have performed additional experiments to enhance our understanding of the role of BEND2 in female gametogenesis. These new experiments provide further insights into the establishment of the ovarian reserve and the role of BEND2 in female fertility.

      Additionally, we have rewritten the title, abstract, and introduction to better align with the content of the manuscript and to reflect the balance between the male and female fertility results. We believe these changes address the reviewer’s concerns and improve the overall clarity and focus of the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      • I recommend that the authors re-organize their abstract and introduction to accurately reflect the manuscript's primary focus on male fertility. Right now, the title of the manuscript is misleading. The manuscript does not investigate reproductive aging; rather, it primarily describes the depletion of primordial follicle number. The mechanism behind this depletion and whether this phenotype accelerates reproductive aging are not explored. Clarifying these points will help align the title and content of the manuscript more accurately.

      We thank the reviewer for this suggestion. We agree that the original title and abstract did not fully capture the focus of the study. In response, we have rewritten the title, abstract, and introduction to better align with the results presented, focusing more clearly on the implications of the effects of the full-length BEND2 depletion for spermatogenesis and oogenesis. These revisions ensure that the title, the abstract, and the manuscript's introduction are now more accurately reflective of the work performed.

      • Figure 1: I couldn't find the validation of the polyclonal antibody against BEND2 that the authors generated.

      Regarding this query about the validation of the polyclonal antibody against BEND2, we apologize for any confusion. We would like to clarify that this validation is indeed presented in Figure 2 of our manuscript. To ensure this information is easily accessible, we have revised the text to explicitly mention the validation in Figure 2.

      • Figure 2A: Could you provide the actual numbers for the weight of the mouse testes?

      In response to this question regarding Figure 2A and the weights of the mouse testes, we have now included these data in a graph in Fig. 2A and in Table S1, and added this information to the Results section.

      • Figure 2C and D: I am confused by the fact that in the WB we can appreciate a high expression of the p75 protein, but the signal is very low in the IF (Figure 2D).

      We thank the reviewer for raising this point. We acknowledge the apparent discrepancy between the strong p75 signal observed in the Western blot (Fig. 2C) and the weaker signal seen in the immunofluorescence (Fig. 2D). We think several factors could contribute to this difference, such as differences in sensitivity and detection methods, epitope accessibility, protein localization or differences in sample preparation, antibody affinity, and experimental conditions between Western blot and IF.

      • In the same figure, the authors also mention that the p75 protein is functional. On what basis do they reach this conclusion?

      We acknowledge that we cannot definitively confirm the functionality of the p75 protein. Our assumption was based on the observed fertility of the male mice and existing literature indicating that BEND2 is essential for completing meiosis (Ma et al., 2022). However, we understand the importance of clarity in our claims. To avoid any potential confusion, we have revised the sentence to read: "The p75 BEND2 protein—likely corresponding to an exon 11-skipped transcript—is present and might be functional in our mutant testis, based on the observed phenotype (see below)."

      • The phenotype in females is very interesting. The authors conclude that BEND2 influences primordial follicle formation, oocyte quality, fertility, and reproductive aging by (1) performing follicle counts, (2) analyzing the litter size, and (3) analyzing meiotic progression. Given that the authors build their story around these experiments, I strongly encourage them to expand the section on female fertility, or reorganize the manuscript, or be more cautious with some of their conclusions. They might consider performing additional experiments such as:

      - Oocyte quality: To determine whether BEND2 impacts oocyte quality, mice should be stimulated with hormones and oocyte quality should be analyzed (GV, MI, MII progression, spindle morphology and/or fertilization, and embryo development). Does the decrease in primordial follicles correlate with the number of ovulated oocytes, or is the impact only on oocyte quality?

      We appreciate the reviewer's suggestion to assess the impact of BEND2 on oocyte quality. Following the reviewer’s recommendation, we stimulated three control and three mutant mice. We analyzed the number of ovulated oocytes, their fertilization rate, and the percentage of embryos that developed to the blastocyst stage. These new results are included in the revised manuscript (see Results section and new Table 1). Our analyses indicate that for all parameters assessed, control and mutant oocytes behaved similarly. Specifically, there were no significant differences in the number of ovulated oocytes, fertilization rates, or the ability of embryos to progress to the blastocyst stage between the control and mutant groups. These findings suggest that mutant oocyte quality is comparable to control mice of a similar age. We have incorporated these new results into the manuscript.

      - Reproductive aging: A fertility trial would provide more information on whether BEND2 depletion triggers an acceleration of reproductive aging. In addition, the oldest mice used by the authors are 9 months old, and at this point, fertility has not declined yet.

      We appreciate the reviewer's suggestion regarding the assessment of reproductive aging. However, we respectfully disagree with the assertion that fertility has not declined by 9 months of age. In our colony, we have observed a significant decline in fertility around 10 months of age. Specifically, out of 18 10-month-old female mice placed in breeding cages, we observed only three pregnancies within the first 30 days (N.N. and I.R., data not published). Based on these observations, we determined that fertility begins to decline around this age in our colony, which informed our decision to use 9-month-old mice as the oldest age group for our analysis. Thus, this age is appropriate for evaluating the potential effects of BEND2 depletion on reproductive aging in our specific mouse population.

      - The observation that the primordial follicle pool is already diminished in mice that are 1 week old is very interesting. Some experiments that the authors could perform to figure out the mechanism are: (1) Analyzing apoptosis. Are the primordial follicles dying during the pool's establishment, or is this an ongoing apoptotic process throughout the mice's lifespan? (2) If the authors still have ovaries from mice younger than 1 week of age (when the primordial pool is forming), they could perform DDX4 staining and quantify the number of oocytes in follicles and the total number of oocytes. These experiments would provide mechanistic insights into whether BEND2 impacts the formation of the primordial follicle pool or if the pool forms but is then depleted.

      We appreciate the reviewer's suggestion to further explore the mechanism behind the reduced primordial follicle pool. In response, we have analyzed the number of DDX4-positive cells (DDX4 labels oocytes) in newborn mutant and wild-type animals. Our results show that mutant ovaries contain significantly fewer oocytes compared to controls (see new Fig. 5). This finding supports the hypothesis that BEND2 is critical for the establishment of a normal ovarian reserve. We are grateful for this suggestion, as these additional data reinforce our conclusion that BEND2 is required to determine a normal ovarian reserve in mice.

      • What is the red signal in Supplementary Figure 1C?

      This image depicts the BEND2 staining pattern in 16 days post-coitum (dpc) wild-type mouse ovaries. To clarify this and prevent any confusion, we have updated the figure legend to explicitly state that the sample shown is from a wild-type mouse.

      • Please spell out the full term of all the acronyms.

      We apologize for the oversight in not fully spelling out some acronyms in the original manuscript. We have carefully reviewed the entire manuscript and have ensured that all acronyms are now spelled out in full upon their first use in the revised version. We want to thank the reviewer for bringing this to our attention.

      • Is LINE-1 also dysregulated in the ovary? This was one of the main findings from the male part. It would be interesting to perform the same analysis in the ovary since LINE-1 has a role in establishing the ovarian reserve (PMID: 31949138).

      We thank the reviewer for this insightful suggestion. We have analyzed the number of LINE-1- and SYCP3-positive cells in wild-type and mutant newborn ovaries (new Fig. S4). Our results show no significant difference between the two genotypes, suggesting that LINE-1 is not dysregulated in newborn Bend2 mutant oocytes. These findings indicate that, at least in the context of the newborn ovary, LINE-1 does not appear to be affected by BEND2 depletion.

      Reviewer #2 (Public Review):

      In their manuscript entitled "BEND2 is a crucial player in oogenesis and reproductive aging", the authors present their findings that full-length BEND2 is important for meiotic double-strand break repair in spermatocytes, regulation of LINE-1 elements in spermatocytes, and proper oocyte meiosis and folliculogenesis in females. The manuscript utilizes an elegant system to specifically ablate the full-length form of BEND2, which has been historically difficult to study due to its location on the X chromosome and the male sterility of global knockout animals.

      While the manuscript is an overall excellent addition to the field, it would significantly benefit from a few additional experiments, as well as some additional clarification/elaboration.

      The claim that BEND2 is required for ovarian reserve establishment is not supported, as the authors only look at folliculogenesis and oocyte abundance starting at one week of age, after the reserve is formed. Analysis of earlier time points would be much more convincing and would parse the role of BEND2 in the establishment vs. maintenance of this cell population. In spermatocytes, the authors demonstrate a loss of nuclear BEND2 in their mutant but do not comment on the change in localization (which is now cytoplasmic) of the remaining protein in these animals. This may have true biological significance and a discussion of this should be more thoroughly explored.

      We thank the reviewer for their thoughtful feedback and constructive suggestions to improve our manuscript.

      In response to the comment regarding the establishment of the ovarian reserve, we have now analyzed Bend2 mutant and control newborn ovaries. Our results show a significant reduction in the number of DDX4-positive cells in mutant ovaries compared to controls. These findings demonstrate that BEND2 is required for the establishment of the ovarian reserve, as the reduction is evident at birth.

      Regarding the cytoplasmic staining of BEND2 in mutant spermatocytes, we did perform secondary-antibody-only controls using goat anti-rabbit Cy3 to address the specificity of the signal. The staining observed in the Bend2 mutants closely resembles background staining, suggesting that the cytoplasmic signal is nonspecific. Therefore, we do not believe this represents a meaningful change in the localization of BEND2 protein in the mutants. We have clarified this in the revised manuscript to address this point.

      We hope these additional experiments and clarifications strengthen the manuscript and address the reviewer’s concerns.

      Reviewer #2 (Recommendations For The Authors):

      Major points:

      (1) The title of the manuscript does not accurately capture the content of the work. The vast majority of the data presented here is from the male, which is not reflected at all in the title - perhaps considering revising it?

      Thank you for your valuable suggestion. We agree that the original title did not fully reflect the focus of the manuscript. In response, we have revised the title, along with the abstract and introduction, to more accurately capture the content of the study and the emphasis on the male data. These changes ensure that the manuscript more clearly aligns with the results presented.

      (2) In Figure 2D, the authors demonstrate that WT BEND2 expression and localization are lost in the mutant, but staining is still apparent, just in the cytoplasm. Did the authors perform secondary-antibody-only controls to determine if this was background staining or real staining? If real, can they comment on the change in localization of the protein?

      We thank the reviewer for this insightful question. We have indeed performed secondary antibody-only controls using goat anti-rabbit Cy3. The staining observed in the Bend2 mutants closely resembles background staining, suggesting that the signal in the cytoplasm is not specific. Therefore, we do not believe this staining represents any real or meaningful expression of the BEND2 protein in the mutants.

      (3) In Figure S2A, the authors show Ku70 staining and describe that it is similar between the genotypes, but - to my eye - it looks quite distinctly different. It appears to stain in patches in WT SYCP3+ spermatocytes, versus staining in patches in the more mature, SYCP3- germ cells closer to the lumen in the mutant. Can the authors please clarify, or provide arrows to point which foci they are referring to?

      We apologize for the confusion caused by the image provided in the original submission. Upon review, we realized that the mutant image was not fully representative of the staining pattern observed in the majority of mutant samples. We have replaced this image with a new one in the revised manuscript, which more accurately reflects the similarity in Ku70 staining between wild-type and mutant testis. In this updated Figure S2, we have also included arrowheads to indicate the relevant foci, making it clearer to the reader. We have updated the figure legend to correspond with these changes as well.

      (4) The authors state that BEND2 is "required to establish the ovarian reserve during oogenesis" but this has not been demonstrated. The authors do show a reduced density of primordial follicles at one week of age. While this is compelling data, the ovarian reserve is established earlier in the mouse, around postnatal days 0-1, so it is not clear from this manuscript whether BEND2 is required for the maintenance of this population after PND1, leading to reduced numbers by 1 week of age, OR if it is required for the establishment of this population, which would result in reduced numbers of oocytes around the time of birth. This is a critical experiment that should be performed in order to determine which of these possibilities is likely the case. Ideally, looking at embryonic through early postnatal time points during ovarian development would be very helpful.

      We thank the reviewer for raising this important point. As mentioned earlier in response to Reviewer 1, we have performed the experiment suggested by Reviewer 2 and analyzed the number of DDX4-positive cells in newborn ovaries. Our results show that Bend2 mutant ovaries have fewer oocytes at birth than wild-type controls (Fig. 5H). This finding reinforces our conclusion that BEND2 is indeed required to establish the ovarian reserve, as the reduction in oocyte number is evident at the time of birth. We agree that this additional data strengthens our original claim, so we have included these results in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      Huang et al. investigated the phenotype of Bend2 mutant mice that express a truncated isoform. Mutant males showed increased apoptosis due to unrepaired double-strand breaks. However, the mutant males are fertile, which enabled the authors to analyze Bend2 function in females. They found that Bend2 mutant females have decreased follicle numbers, which leads to loss of ovarian reserve.

      Strengths:

      Since their Bend2 mutant males were fertile, the authors were able to analyze the function of Bend2 in females, and they revealed that loss of Bend2 reduces follicle formation.

      Weaknesses:

      It is not sufficiently clear why the phenotype of their mutant males differs from that reported in previous work (Ma et al.), although the authors discuss it.

      We appreciate the reviewer’s comment regarding the differences between our Bend2 mutant male phenotype and the phenotype previously reported by Ma et al., 2022. We believe this discrepancy is due to the fact that the Bend2 locus encodes two BEND2 isoforms: p140 and p80. In contrast to the previous study, where both proteins were ablated by the mutation employed (the deletion of exons 12 and 13), our exon 11 deletion specifically ablates p140 expression while allowing the expression of p80 in the testis.

      Based on the distinct phenotypes observed in the two Bend2 mutant mouse models, we hypothesize that p80 is sufficient to fulfill BEND2’s roles in meiosis, which could explain why our Bend2 mutant males remain fertile. We have rewritten the relevant sections in the results and discussion to better articulate this hypothesis and clarify the potential mechanisms behind the observed phenotypic differences.

      We hope these clarifications and additional details adequately address the reviewer’s concerns.

      Reviewer #3 (Recommendations For The Authors):

      (1) The authors showed that Bend2 mutant females had decreased fertility. This may be due to decreased ovarian reserve. Did the authors check if the mutant mice decreased or lost fertility faster than WT? If the authors have the data, please refer to it in the manuscript.

      We followed the breeding performance of a small number of control and Bend2 mutant females, and preliminary observations suggested no clear differences between the two groups. However, due to the limited sample size, we felt that these data were not conclusive enough to be included in the manuscript. We agree that a more thorough analysis of fertility decline over time would be valuable, and we plan to address this question in a future study.

      (2) In Figure 1 A, there is no exon1 in the upper figure.

      We thank the reviewer for pointing this out. We have revised Figure 1A to include exon 1 and ensure the schematic is accurate. The updated figure is included in the revised version of the manuscript.

      (3) Figure 3A, it would be nice to show several tubules of the testis section as well as an enlarged one.

      Following the reviewer's advice, we have revised Figure 3A to include new images showing several tubules and an enlarged view of one section of a tubule. These updates are included in the revised manuscript to better represent the testis sections.

      (4) Please be consistent with the format of the graph, especially Supplemental figures 2C and 4D.

      We have revised the figures, including Supplemental Figures 2C and 4D, to ensure consistency in the format throughout the manuscript. We have made modifications to the figures to align them more closely and improve the overall presentation.

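    1. Analysis of the Work

      This passage offers an in-depth analysis of Raoul Hausmann's sculpture "Der Geist unserer Zeit – Mechanischer Kopf" (meaning "The Spirit of Our Time: Mechanical Head") and reveals its complex philosophical and political meanings. Let us work through the passage step by step.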

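      1. Philosophical background and Hegel's thought

      The text notes that Hausmann's sculpture specifically evokes the thought of the German philosopher Georg Wilhelm Friedrich Hegel. Hegel's philosophical system holds that "everything is mind." In Hegel's philosophy, the evolution of history and society is driven by the "Absolute Spirit," and spirit is the ultimate governing force of reality and consciousness. His system emphasizes the central place of mind and thought in understanding the world.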

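      2. The relationship between Hegel and Marx

      Hegel's thought deeply influenced later philosophers, above all Karl Marx. Marx criticized Hegel's thought and, in his own writings, took a "materialist" position: historical development is driven by material conditions (such as the economy and modes of production), and "spirit" or "thought" alone does not steer the course of society. This view of Marx's had a direct influence on Hausmann's sculpture.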

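      3. The sculpture's Marxist inversion

      In this context, Hausmann's sculpture can be read as a "radical Marxist inversion" of Hegel's thought. Hegel held that thought and spirit are the core forces behind social and historical development, and Hausmann challenges this view with his "Mechanical Head." What the sculpture presents is not a free spirit governing thought, but a head whose "thought" is determined by external material forces. By fixing mechanical parts directly onto the head, the sculpture gives visual form to the theme that thought (the mind) is subject to external material forces.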

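      4. A deeper critique of Western culture

      However, Hausmann's sculpture is more than a reaction against Hegel's thought. It also touches on other deep-seated assumptions of Western culture, exposing the long-standing superstition in Western philosophy and culture that treats the "head" as the symbol of reason. In traditional Western culture, the head and the brain are regarded as the seat of reason and as the symbol of human wisdom and thought. This notion is also deeply embedded in European portraiture, which tends to emphasize the depiction of the sitter's head as a symbol of reason and of the individual's inner world.

      Hausmann's sculpture overturns this traditional notion. By presenting "a head penetrated and governed by material forces," he directly challenges the idea of the head as a symbol of reason and shows that thought does not exist independently of the material world. On the contrary, thought and reason are governed by external forces and material conditions.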

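      5. The sculpture's modernity and power

      The work is considered a "modern masterpiece" precisely because it challenges traditional philosophical ideas, especially the traditional understanding of reason and individual subjectivity. Hausmann's sculpture not only criticizes the primacy of reason in Western philosophy; through its materialized visual form, it also reveals a reality of modern society in which individual thinking is penetrated and controlled by external material forces, social mechanisms, and even the forces of mechanization.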

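      Summary

      Through this sculpture, Hausmann conveys a complex critical message: he not only challenges Hegel's idea that spirit governs everything, but also exposes a deep-rooted misunderstanding in Western culture about the relationship between reason, thought, and the material world. By showing a head controlled by external matter, "Mechanical Head" reflects on and overturns the notion of reason in traditional Western culture and proposes that thought and human existence cannot escape the control of the material world.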

    1. Here is a summary of the interview with Grégoire Borst, with the corresponding timestamps:

      • Introduction []
        • The host introduces Grégoire Borst, professor of developmental psychology and cognitive neuroscience and director of the Laboratory for the Psychology of Child Development and Education (Lapsid), a CNRS laboratory.
        • Borst completed his doctoral thesis in psychology in 2005, then spent 4 years as a postdoctoral researcher at Harvard before returning to France in 2010.
        • Lapsid is the first French laboratory of scientific psychology, founded 135 years ago.
      • The laboratory's research []
        • The laboratory studies the role of control mechanisms, automatisms, and the detection of conflict and doubt in the cognitive and socio-emotional development of children and adolescents, as well as in school learning, by combining behavioral approaches and neuroimaging.
        • The aim is to integrate different levels of explanation, from genetics to social and cultural contexts.
        • Borst explains that he is interested in differences between individuals and that adolescence is the period when heterogeneity is greatest.
        • He stresses the importance of combining approaches from psychology, neuroscience, linguistics, computer science, sociology, economics, and didactics to understand child development.
      • Outreach and publications []
        • Borst works with the education community and is a member of the International Bureau of Education at UNESCO.
        • He is the author of numerous scientific articles and books, including "Le cerveau et les apprentissages" and "Explore ton cerveau" with Olivier Houdé.
        • He also mentions "Mon cerveau questions-réponses" for children under 10 and "C'est pas moi, c'est mon cerveau" with Mathieu Cassoti for adolescents.
        • The latter book decodes how the adolescent brain works through 14 everyday situations, using games, quizzes, and stories.
      • Brain development in adolescence []
        • The brain begins developing very early, in the first days after fertilization, and continues to change long after birth.
        • The brain is not structured like an adult brain until 20 to 25 years of age.
        • Brain plasticity allows the brain to change throughout life.
        • A study on learning to juggle shows how acquiring new skills transforms the brain.
        • The human brain contains about 86 billion neurons linked by 1 million billion connections.
        • Adolescence is a period of high brain plasticity that lasts 10 to 12 years.
        • The onset of puberty marks the start of this period, with the reopening of the perineuronal net, which increases brain plasticity.
        • Brain development is asynchronous: the limbic system (emotions, reward) matures earlier than the prefrontal cortex (regulation).
        • This gap makes it hard for adolescents to regulate their emotions and impulsivity.
      • Delayed gratification and risk-taking []
        • The delayed gratification task (the marshmallow test) shows children's capacity for self-control.
        • The ability to defer pleasure predicts later educational success better than social background or IQ.
        • Adolescence is a period of higher mortality than childhood because of suicide risk and risky behavior.
        • Adolescents are more sensitive to rewards than children and adults, as an MRI experiment shows.
        • Adolescents weigh costs and benefits differently, giving more weight to reward, especially social reward.
        • Social ties with peers become paramount in adolescence.
        • Risky behaviors, such as the fire challenge, are a way of maximizing one's social standing.
        • Adolescence is a period when regulating emotions is difficult, like an "emotional pressure cooker" without a valve.
        • Adolescents are more oriented toward immediate rewards, especially in groups.
        • A simulated driving experiment shows that adolescents take more risks when they are observed by their peers.
        • Prevention campaigns must take adolescent psychology into account.
      • Social influence and altruism []
        • Adolescents are highly susceptible to the influence of others and more prone to social conformity.
        • Altruism becomes more strategic in adolescence, with resources shared mainly with friends.
        • The adolescent brain has a great capacity for learning but is also very vulnerable.
        • Alcohol consumption before age 15 has irreversible effects on brain development, as does cannabis, which can lead to a loss of IQ points.
        • After COVID, 40% of adolescents show depressive symptoms.
      • Screens and adolescents []
        • The idea that screens are responsible for declining IQ, attention disorders, concentration difficulties, psychological problems, and addiction is false.
        • There is no addiction to screens, nor a direct link between social networks and depression.
        • Screens can even have a positive effect on the development of intelligence between ages 8 and 10 and on the development of empathy.
        • The main negative effects of screens concern sleep and sedentary behavior.
        • The artificial light from screens disrupts the secretion of melatonin, the sleep hormone.
        • It is recommended to turn off screens at least one hour before going to bed.
        • School schedules are not adapted to the sleep rhythm of adolescents, who have a phase delay of 2 hours.
        • The first class of the day should be pushed back by one hour to respect their physiological rhythm.
        • An Élysée report on children and screens proposes 29 recommendations.
      • Questions and answers []
        • Hormones and physical changes partly explain adolescents' fatigue and clumsiness.
        • Studies on screens are not yet conclusive enough, and the question of content is complex.
        • The adolescence of the brain is universal, but cultural and social differences may exist.
        • The impact of screens on social relationships is nuanced.
        • Rewards should not be used to motivate teenagers; it is better to develop their intrinsic motivation.
        • The 3-6-9-12 recommendations are useful markers but are not based on solid scientific studies.
        • Above all, France needs a parenting program that explains how children and adolescents develop.
        • It is essential to develop psychosocial skills from childhood to prevent mental health problems.
    1. Creativity is often seen as something out of the ordinary or something that happens in one burst of innovation. However, creativity is a gradual and complex process that forces individuals to look upon themselves, taking patience, introspection, and liberating oneself from society’s norms. Authors who demonstrate these aspects are Sami Schalk, David Lynch, and Stevenson Johnson

      There are a number of ways to hook the reader: 1) Relate the discussion to something about creativity that is in the news. Are there current popular artists, artworks, or other creative works on the public radar that you can use as a means of starting the conversation? 2) You can attempt to define creativity in unusual or unorthodox terms: what is your personal definition, and how might it differ from common and conventional ones? 3) You can relate creativity to education or technology. 4) You can tell a personal story of creativity.

    Annotators

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewing Editor Comment: Please note that all three reviewers suggested this manuscript would best fit as a resource paper at eLife.

      Reviewer #1 (Public review):

      Summary:

      This impressive study presents a comprehensive scRNAseq atlas of the cranial region during neural induction, patterning, and morphogenesis. The authors collected a robust scRNAseq dataset covering six distinct developmental stages. The analysis focused on the neural tissue, resulting in a highly detailed temporal map of neural plate development. The findings demonstrate how different cell fates are organized in specific spatial patterns along the anterior-posterior and medial-lateral axes within the developing neural tissue. Additionally, the research utilized high-density single-cell RNA sequencing (scRNAseq) to reveal intricate spatial and temporal patterns independent of traditional spatial techniques.

      The investigation utilized diffusion component analysis to spatially order cells based on their positioning along the anterior-posterior axis (corresponding to the forebrain, midbrain, and hindbrain) and the medial-lateral axis. By cross-referencing with MGI expression data, the identification of cell types was validated, affirming the expression patterns of numerous known genes and implicating others as differentially expressed along these axes. These findings significantly advance our understanding of the spatially regulated genes in neural tissues during early developmental stages. The emphasis on transcription factors, cell surface, and secreted proteins provides valuable insights into the intricate gene regulatory networks underpinning neural tissue patterning. Analysis of a second scRNAseq dataset where Shh signaling was activated by culturing embryos in SAG identified known and previously unknown transcripts regulated by Shh, including the Wnt pathway.

      The data includes the neural plate and captures all major cell types in the head, including the mesoderm, endoderm, non-neural ectoderm, neural crest, notochord, and blood. With further analyses, this high-quality data promises to significantly advance our understanding of how these tissues develop in conjunction with the neural tissue, paving the way for future breakthroughs in developmental biology and genomics.

      Strengths:

      The data is well presented in the figures and thoroughly described in the text. The quality of the scRNAseq data and bioinformatic analysis is exceptional.

      Weaknesses:

      No weaknesses were identified by this reviewer.

      Reviewer #2 (Public review):

      Summary:

      Brooks et al. generate a gene expression atlas of the early embryonic cranial neural plate. They generate single-cell transcriptome data from early cranial neural plate cells at 6 consecutive stages between E7.5 and E9. Utilizing computational analysis they infer temporal gene expression dynamics and spatial gene expression patterns along the anterior-posterior and mediolateral axes of the neural plate. Subsequent comparison with known gene expression patterns revealed a good agreement with their inferred patterns, thus validating their approach. They then focus on Sonic Hedgehog (Shh) signalling, a key morphogen signal, whose activities partition the neural plate into distinct gene expression domains along the mediolateral axis. Single-cell transcriptome analysis of embryos in which the Shh pathway was pharmacologically activated throughout the neural plate revealed characteristic changes in gene expression along the mediolateral axis and the induction of distinct Shh-regulated gene expression programs in the developing fore-, mid-, and hindbrain.

      Strengths:

      This manuscript provides a comprehensive transcriptomic characterisation of the developing cranial neural plate, a part of the embryo that to my knowledge has not been extensively analysed by single-cell transcriptomic approaches. The single-cell sequencing data appears to be of high quality and will be a great resource for the wider scientific community. Moreover, the computational analysis is well executed and the validation of the sequencing data using published gene expression patterns is convincing. Taken together, this is a well-executed study that describes a relevant scientific resource for the wider scientific community.

      Weaknesses:

      Conceptually, the findings that gene expression patterns differ along the rostrocaudal, mediolateral, and temporal axes of the neural plate and that Shh signalling induces distinct target genes along the anterior-posterior axis of the nervous system are more expected than surprising. However, the strength of this manuscript is again the comprehensive characterization of the spatiotemporal gene expression patterns and how they change upon ectopic activation of the Shh pathway.

      Reviewer #3 (Public review):

      Summary:

      The authors performed a detailed single-cell analysis of the early embryonic cranial neural plate with unprecedented temporal resolution between embryonic days 7.5 and 8.75. They employed diffusion analysis to identify genes that correspond to different temporal and spatial locations within the embryo. Finally, they also examined the global response of cranial tissue to a Smoothened agonist.

      Strengths:

      Overall, this is an impressive resource, well-validated against sets of genes with known temporal and spatial patterns of expression. It will be of great value to investigators examining the early stages of neural plate patterning, neural progenitor diversity, and the roles of signaling molecules and gene regulatory networks controlling the regionalization and diversification of the neural plate.

      Weaknesses:

      The manuscript should be considered a resource. Experimental manipulation is limited to the analysis of neural plate cells that were cultured in vitro for 12 hours with SAG. Besides the identification of a significant set of previously unreported genes that are differentially expressed in the cranial neural plate, there is little new biological insight emerging from this study. Some additional analyses might help to highlight novel hypotheses arising from this remarkable resource.

      We thank all three reviewers for their thoughtful and constructive public reviews and believe they nicely capture the contributions of our study. We agree that this article represents a valuable resource for the community and agree with its designation as a Tools and Resources article.

      We also thank the reviewers for their useful suggestions for improving the manuscript. In addition to addressing most of their comments, described below, we note that we have changed midbrain-hindbrain boundary (MHB) to rhombomere 1 (r1) throughout the paper and in Tables S4, S7, S10, and S11, as this designation is more closely aligned with the literature on this region. In addition, we added the anterior-posterior and mediolateral cluster identities from our wild-type analysis for the genes that were differentially expressed in SAG-treated embryos in Table S11. Lastly, we have added a new figure (Figure 5—figure supplement 2), as suggested by Reviewer 2, in which we compare our results with the published expression of genes in neural progenitor domains along the dorsal-ventral axis of the spinal cord.

      Reviewer #1 (Recommendations for the authors):

      I have a few small suggestions for improving the presentation of the data.

      (1) It would be helpful to show illustrations and embryo images of all the stages utilized in the analysis in Figures 1A and B.

      (2) It was difficult to distinguish all the different colors in Figures 3B and 4B. Could you label them, as in Figure 4—figure supplement 1D, F?

      (3) I was confused by the position of the color code key for Figure 7D-J, thinking it belonged to panels B and C. Could you put it under the figure/heatmap key so that it is clearly linked to panels D-J?

      Thank you for these suggestions. We have incorporated the third suggestion to improve readability, but were not able to make the first two changes due to space limitations.

      Reviewer #2 (Recommendations for the authors):

      I only have a couple of minor additional suggestions/questions for the authors:

      (1) The authors state that nearly half of the transcripts they found as differentially regulated in SAG-treated embryos were also characterized as spatially regulated in the wild-type embryos. It would be great if the authors could provide more detail here. How many of the transcripts that are differentially regulated along the mediolateral axis of the wild-type are characterized as differentially regulated in the SAG-treated embryos? How does this further break down into where these genes are expressed along the mediolateral and the anterior-posterior axes? I am aware that the authors answer some of these questions already by providing examples, but a more systematic characterisation would be appreciated here.

      We have updated Table S11 to include the anterior-posterior and mediolateral cluster identities of differentially expressed genes in SAG-treated embryos, where applicable. In addition, we have added more discussion of the genes from our SAG analysis that were also found to be spatially patterned in wild-type embryos to the fourth paragraph of the last results section.

      (2) Related to the previous question, the authors nicely demonstrate that SAG treatment of embryos causes many transcriptional changes, including the expression/repression of several transcription factors well-known to mediate spatial patterning, raising the question of which of these effects are directly due to gene regulation by the Shh pathway and which effects are secondary consequences of transcriptional changes of other transcription factors. Similarly, the authors' results also suggest that some genes are only induced in specific parts along the neuraxis, raising the question of why. The authors could attempt some type of regulon-interference approaches to identify further candidates that may mediate these effects.

      This is an excellent suggestion for a future extension of this work, as we agree that validation of the predicted SHH targets, including which targets are direct, indirect, or region-specific, would be required to evaluate the predictions of this scRNA-seq analysis.

      (3) The authors report that they observed 'a previously unreported inhibition of Scube2' upon SAG treatment of the embryos. At least in the spinal cord Scube2 is well-known to be expressed at a distance from the source of Shh secretion (e.g. Kawakami et al. Curr. Biol. 2005), thus the direct or indirect repression by Shh signalling is strongly expected. Moreover, a recent preprint (Collins et al. bioRxiv, https://doi.org/10.1101/469239 ) suggests that the interaction between Shh and Scube2 can mediate the scale-invariance of Shh patterning. Of note, the authors of this preprint also state that 'upregulation of Shh represses scube2 expression while Shh downregulation increases scube2 expression thus establishing a negative feedback loop.'

      Thank you for this suggestion. We have added these references.

      (4) The authors partition genes based on different diffusion components as being differentially expressed along the mediolateral axis. However, starting from ~E8.5, neural progenitors in the neural tube can be partitioned based on the expression of well-characterised combinatorial sets of transcription factors into molecularly defined progenitor domains that subsequently give rise to functionally distinct types of neurons. How much of this patterning process can the authors capture with their diffusion component analysis and does their data also allow them to capture these finer-grained differences in gene expression along the mediolateral and prospective dorsal-ventral axis of the neural tube that are known to exist?

      This is a very interesting point. We have added a new figure showing UMAPs of the E8.5-9.0 cranial neural plate for a subset of 29 genes (described in Delile et al., 2019) that define distinct neural progenitor domains along the dorsal-ventral axis of the spinal cord (Figure 5—figure supplement 2). We observed that 18 of 20 genes that were detected in the midbrain/r1 region in our dataset were expressed in broad domains along the mediolateral axis of the cranial neural plate that were roughly consistent with their expression domains along the dorsal-ventral axis of the spinal cord. Of these 18 genes, 14 were patterned along both anterior-posterior and mediolateral axes, 2 were patterned only along the mediolateral axis, and 2 were patterned only along the anterior-posterior axis. These results suggest a general correspondence between mediolateral patterning in the cranial neural plate and dorsal-ventral patterning in the spinal cord. However, less refinement of these domains along the mediolateral axis was observed in the cranial neural plate, possibly because the relatively early, pre-closure stages captured by our dataset may be before the establishment of secondary feedback systems that lead to fine-scale patterning of mutually exclusive neural precursor domains. These results are described in the last paragraph of the results section titled “An integrated framework for analyzing cell identity in multiscale space.”

      (5) The authors state that they will not only make the raw sequencing data but also the processed intermediate data files available. This is greatly appreciated as it strongly facilitates the re-use of the data. However, it would be also appreciated if the authors made the computational code publicly available that was used to analyze the data and generate the figure panels in the manuscript.

      We have deposited the processed h5ad files in the GEO database, accession number GSE273804. Additionally, we have made interactive Python notebooks available with the code used to analyze gene expression and generate the figures in this study, as well as code used to automatically generate customizable links to gene expression images in the Mouse Genome Informatics Gene Expression database, on our lab GitHub page (https://github.com/ZallenLab). We have updated the Data availability section to reflect these changes.

      Reviewer #3 (Recommendations for the authors):

      (1) Considering that individual progenitor domains in the developing neural tube are typically sharply delineated with few cells exhibiting mixed identities, it is interesting that clustering of single-cell data results in a largely continuous “cloud” of cells. Is this because the early neural plate cells have not yet crystallized their identity, or would clustering based on a smaller set of genes that exhibit high variance across only neural plate cells result in improved granularity, allowing for better characterization and quantification of distinct progenitor subtypes?

      Thank you for raising this interesting point. The apparent continuity of gene expression in the cranial neural plate could reflect a gene signature shared by cranial neural plate cells, and cells may not yet be extensively regionalized into unique populations at these early stages. We now discuss these possibilities in the third paragraph of the discussion.

      (2) Can the authors clarify how neural plate cells were identified and how they were distinguished from the anterior epiblast?

      Cell typing was performed by supervised clustering based on known markers of fate. Cranial neural plate cells were identified by their expression of pan-neural factors (Sox2 and Sox3), early or late neural plate markers (Cdh1 or Cdh2), and the lack of markers associated with non-neural ectodermal cell fates (Grhl2, Krt18, Tfap2a) or other cell types (Ets1, T, Tbx6). Full gene sets used to identify all cell types in our analysis are provided in Supplementary Table 13.
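      The marker logic described above can be sketched as a simple rule on per-cell expression values. This is only an illustration under stated assumptions: the function name, the expression threshold, and the choice to require both Sox2 and Sox3 are hypothetical, and the authors describe supervised clustering with full gene sets listed in Supplementary Table 13 rather than a per-cell rule like this one.

```python
# Marker lists taken from the response text; everything else is assumed.
PAN_NEURAL = ("Sox2", "Sox3")            # pan-neural factors
NEURAL_PLATE = ("Cdh1", "Cdh2")          # early or late neural plate markers
EXCLUDED = ("Grhl2", "Krt18", "Tfap2a",  # non-neural ectoderm markers
            "Ets1", "T", "Tbx6")         # markers of other cell types


def is_cranial_neural_plate(expr, threshold=0.5):
    """Return True if a cell's expression profile matches the marker rule.

    expr: dict mapping gene name to a normalized expression value.
    threshold: hypothetical cutoff above which a gene counts as expressed.
    """
    def on(gene):
        return expr.get(gene, 0.0) > threshold

    return (
        all(on(g) for g in PAN_NEURAL)            # assumption: both required
        and any(on(g) for g in NEURAL_PLATE)      # either Cdh1 or Cdh2
        and not any(on(g) for g in EXCLUDED)      # no excluded marker expressed
    )
```

      In practice such a rule would more plausibly be applied to cluster-level averages than to single cells, since dropout in scRNA-seq data makes single-cell thresholds noisy.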

      (3) Did the study identify cells with cranial placode identity? Cranial placodes emerge during the same period, and it would be useful to highlight them in Figure 1.

      Thank you for highlighting this point. Examination of the early placode markers Six1 and Eya1 indicates that cranial placode cells are a subset of the cells in PhenoGraph cluster 17 in our full dataset (Figure 1—figure supplement 1). We now mention this along with other cell types of interest in the last paragraph of the discussion.

      (4) It could be interesting to provide more information about the novel genes identified as differentially expressed along the AP or mediolateral axes. Do they belong to gene families that were not previously implicated in neural patterning, or do they point to novel biological mechanisms controlling neural patterning?

      Diverse gene families are represented by the genes that are patterned along the anterior-posterior and mediolateral axes of the cranial neural plate at these stages, likely due to the large number of genes that are spatially patterned in this tissue. Further investigation of the biological mechanisms suggested by these patterns is an important direction for future work, both in terms of molecularly classifying the genes identified as well as directly investigating their roles in neural patterning using genetic analysis.

      (5) It would be helpful to discuss how the data presented here compare to other relevant single-cell analyses, such as PMC10901739. This would help to highlight aspects that are unique to this study.

      We have added this reference as well as an earlier study from these authors and we discuss how our study complements this work in the introduction.

      (6) The inclusion of single-cell data from control embryos that were cultured for 12 hours is of great interest. The authors should identify the set of genes that are deregulated in cultured cells and, taking advantage of their detailed temporal series, examine whether the maturation of cultured embryos progresses normally or whether there are genes that fail to mature correctly in vitro.

      We agree that an analysis of the impact of ex vivo culture on gene expression would be useful. However, the large difference in the number of cells in our wild-type and cultured embryo datasets, as well as the lack of time-course data for the cultured embryos, could make a comparison between our current cultured and non-cultured embryo datasets difficult to interpret.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors studied how hippocampal connectivity gradients change across the lifespan, and how these relate to memory function and neurotransmitter distributions. They observed that older age was associated with less distinct transitions, and they observed an association between gradient de-differentiation and cognitive decline.

      This is overall an innovative and interesting study to assess gradient alterations across the lifespan and their associations with cognition.

      The paper is well-written, and the methods appear sound and thoughtful. There are several strengths, including the inclusion of two independent cohorts, the use of gradient mapping and alignment techniques, and an overall sound statistical and analysis framework. There are several areas for potential improvements in the paper, and these are listed below:

      We thank the Reviewer for their positive assessment and summary of our work. We address each of the Reviewer’s comments below, and outline the revisions we have made to the manuscript based on the Reviewer’s suggestions.

      (1) The reported D1 associations appear a bit post-hoc in the current work and I was unclear why the authors specifically focussed on dopamine here, as other transmitter systems are similarly present at the level of the hippocampus and implicated in aging.

      Other neurotransmitter systems may indeed be relevant in the context of hippocampal function in aging. In this study, however, we included a specific research question about the DA D1 receptor (D1DR) based on previous research 1) emphasizing the role of DA neuromodulation in maintaining functional network segregation in aging to support cognition (Pedersen et al., 2023), 2) reporting heterogeneous distribution of DA markers across the hippocampus, supporting efficient modulation of distinct behaviors (Dubovyk & Manahan-Vaughan, 2019; Edelmann & Lessmann, 2018; Gasbarri et al., 1994; Kempadoo et al., 2016), and 3) demonstrating the spatial distribution of D1DRs as varying across neocortex along a unimodal-transmodal gradient (Pedersen et al., 2024). To what degree this variation might be reflected in cortico-hippocampal connectivity, however, remained to be investigated. As such, one of the study’s specific aims was to evaluate the spatial distribution of D1DRs as a molecular correlate of the hippocampus’ functional organization. Importantly, we were interested in mapping associations between individual differences in the organization of connectivity and D1DRs. This was uniquely enabled by utilizing the DyNAMiC sample, as it includes structural and functional MRI data in combination with D1DR PET in the same individuals across the adult lifespan (n=180). However, after observing significant spatial correspondence between functional organization and D1DR expressed by the second hippocampal gradient (G2), we did indeed perform complementary analyses with group-averaged data of additional dopamine markers (D2DR from a subsample of our participants, as well as DAT and FDOPA from open sources) to test the generalizability of the original finding. Taken together, the original analyses based on subject-level data and complementary group-level analyses provided support for the interpretation of G2 as a dopaminergic mode.

      We have updated the manuscript to clarify the focus on the D1 receptor and the contribution of including additional DA markers.

      Updated paragraph in the Introduction, pages 5-6:

      “Dopamine (DA) is one of the most important modulators of hippocampus-dependent function(47,48), and influences the brain’s functional architecture through enhancing specificity of neuronal signaling(49). Consistently, there is a DA-dependent aspect of maintained functional network segregation in aging which supports cognition(50). Animal models suggest heterogeneous patterns of DA innervation(51,52) and postsynaptic DA receptors(53), across both transverse and longitudinal hippocampal axes, likely allowing for separation between DA modulation of distinct hippocampus-dependent behaviors(47). Moreover, the human hippocampus has been linked to distinct DA circuits on the basis of long-axis variation in functional connectivity with midbrain and striatal regions(54,55). Taken together with recent findings revealing a unimodal-transmodal organization of the most abundantly expressed DA receptor subtype, D1 (D1DR), across cortex(56), we tested the hypothesis that the organization of hippocampal-neocortical connectivity partly reflects the underlying distribution of hippocampal DA receptors, predicting predominant spatial correspondence for any hippocampal gradient conveying a unimodal-transmodal pattern across cortex.”

      Updated sections in the Results, page 13-14:

      “Our next aim was to investigate to which extent the distribution of hippocampal DA D1 receptors (D1DRs), measured by [¹¹C]SCH23390 PET in the DyNAMiC(58) sample, may serve as a molecular correlate of the hippocampus’ functional organization.”

      “Complementary analyses were then conducted to further evaluate G2 as a dopaminergic hippocampal mode by utilizing additional DA markers at group-level.”

      Moreover, the authors may be aware that multiple PET tracers are somewhat challenged in the mesiotemporal region. Is this the case for the D1 receptor as well? The hippocampus is a small and complex structure, and PET more of a low res technique so one would want to highlight and discuss the limitations of the correlations with PET maps here and/or evaluate whether the analysis adds necessary findings to the study.

      We thank the Reviewer for raising this point. The lower resolution of PET is indeed a relevant aspect to consider when quantifying D1DR availability in the hippocampus, even though previous research indicates high test-retest reliability of [¹¹C]SCH23390 PET measurement in this region (Kaller et al., 2017). We have now elaborated on PET limitations in the Discussion of the revised manuscript.

      In our study, we made efforts to reduce potential partial volume effects (PVE) by correcting our PET data, and tested spatial associations between our functional gradients and D1DR maps using trend-surface modelling (TSM), rather than through voxel-wise comparisons. This allowed us to evaluate the spatial correspondence between functional connectivity and D1DRs at a level of spatial trends, estimated using TSM models computed at increasing levels of complexity. The results showed consistent spatial overlap between G2 and D1DRs across these models, that is, across spatial trends described at coarser-to-finer scales. Furthermore, this was replicated across several DA markers with PET and SPECT data from independent samples.
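      To make the idea of trend-surface modelling concrete, the following is a minimal sketch of how a map defined over voxel coordinates can be summarized by polynomial coefficients at a chosen model order. It is an illustration only: the function names are hypothetical, the basis (an intercept plus powers of x, y, and z without cross terms) is an assumption, and the response does not specify the exact basis or fitting procedure the authors used. The fit is ordinary least squares via the normal equations, written in plain Python to stay self-contained.

```python
def design_row(x, y, z, order):
    """Intercept plus x^k, y^k, z^k for k = 1..order (no cross terms; assumed basis)."""
    row = [1.0]
    for k in range(1, order + 1):
        row += [x ** k, y ** k, z ** k]
    return row


def solve(a, b):
    """Solve the square system a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back-substitution
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def fit_trend_surface(coords, values, order=1):
    """Least-squares polynomial coefficients describing the spatial trend of `values`.

    coords: list of (x, y, z) voxel coordinates.
    values: one map value (e.g. gradient or receptor estimate) per voxel.
    Returns [intercept, x, y, z, x^2, y^2, z^2, ...] up to the given order.
    """
    rows = [design_row(x, y, z, order) for x, y, z in coords]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(p)]
    return solve(xtx, xty)
```

      Under this sketch, raising `order` adds higher powers and therefore describes finer spatial trends, which is the sense in which correspondence can be examined from coarser to finer scales. The resulting coefficient vectors for two modalities could then be related statistically across participants.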

      Taken together, we agree with the Reviewer that the spatial correspondence observed between G2 and hippocampal D1DRs should be interpreted in the context of resolution-related limitations inherent to PET imaging. However, we strongly believe that our DA analyses offer valuable insight into the molecular underpinnings of hippocampal functional organization.

      Updated paragraph in the Discussion, pages 25-26:

      “We discovered that G2, specifically, manifested organizational principles shared among function, behavior, and neuromodulation. Meta-analytical decoding reproduced a unimodal-associative axis across G2 (Figure 3B), and analyses in relation to the distribution of D1DRs – which vary across cortex along a unimodal-transmodal axis(76,77) – demonstrated topographic correspondence both at the level of individual differences and across the group. It should, however, be acknowledged that PET imaging in the hippocampus is associated with resolution-related limitations, although previous research indicates high test-retest reliability of [¹¹C]SCH23390 PET to quantify D1DR availability in this region(78). As such, mapping the distribution of hippocampal D1DRs at a fine spatial scale remains challenging, and replication of our results in terms of overlap with G2 is needed in independent samples. Here, we evaluated the observed spatial overlap between G2 topography and D1DRs across multiple TSM model orders, showing correspondence between modalities from simple to more complex parameterizations of their spatial properties. Topographic correspondence was additionally observed between G2 and other DA markers from independent datasets (Figure 3B), suggesting that G2 may constitute a mode reflecting a dopaminergic phenotype, which contributes to the currently limited understanding of its biological underpinnings.”

      From my (perhaps somewhat biased) perspective, it might be valuable to instead or in addition look at measures of hippocampal microstructure and how these relate to the functional aging effects. This could be done, if available, using data from the same subjects (eg based on quantitative MRI contrasts and/or structural MRI) and/or using contextualization findings as implemented in eg hippomaps.readthedocs.io

      We thank the Reviewer for this suggestion. We performed additional analyses investigating the spatial overlap between our connectivity gradients and estimates of hippocampal microstructure, computed as the ratio of T1- over T2-weighted (T1w/T2w) images (Glasser & Von Essen, 2011; vos de Wael et al., 2018). Analyses of spatial correspondence then followed the TSM-based method used to test the spatial overlap between functional connectivity gradients and D1DR distribution. Applying TSM to the T1w/T2w image computed for each participant yielded subject-level model parameters describing microstructure topography, which were then entered as predictors of connectivity topography in multivariate GLMs (separate models for each gradient and hemisphere, 6 models in total).

      Analyses revealed that microstructure of the right hippocampus significantly predicted gradient topography of right-hemisphere G1 (F = 1.325, p = 0.034), while no other links between connectivity gradients and microstructure emerged as significant (F = 0.930-1.184, ps = 0.706-0.079).

      These results, suggesting an association along the anteroposterior axis, deviate from previous findings linking hippocampal microstructure to G3-like, medial-lateral, connectivity organization (vos de Wael et al., 2018). As we believe that comprehensive analyses of our gradients in relation to microstructure across the lifespan would be best addressed in future work, we have not included these analyses of microstructure in the revised manuscript.

      (2) Can the authors clarify why they did not replicate based on cohorts that are more widely used in the community and open access, such as CamCAN and/or HCP-Aging? It might connect their results with other studies if an attempt was made to also show that findings persist in either of these repositories.

      We agree with the Reviewer that replication in samples such as CamCAN and/or HCP-Aging would provide valuable opportunities to connect our findings with those of other studies using those datasets. Here, we included the Betula dataset (Nilsson et al., 2004) as our replication sample, as it was immediately available to us, included a large sample of adults of comparable age, and a word recall episodic memory task closely aligned with the one included in DyNAMiC. Importantly, leveraging the Betula dataset as our replication sample allows us to link our findings to a wide range of previous studies central to the understanding of neurocognitive aging in general, and hippocampal aging in particular (Nyberg, 2017; Nyberg et al., 2020). Betula is a large longitudinal project that has been tracking individuals since 1988, and is part of the National E-infrastructure for Aging Research (NEAR: www.near-aging.se), through which data from several Swedish studies are made available to both national and international researchers. While we acknowledge the value of extending replication efforts to datasets like CamCAN and HCP-Aging, we emphasize the significant contribution of having replicated our connectivity gradients in the Betula dataset.

      (3) The authors applied TSM and related these parameters to topographic changes in the gradients. I was wondering whether and how such an approach controls for autocorrelation present in both the PET map and gradients. Could the authors clarify?

      The Reviewer raises an important topic in spatial autocorrelation. The TSM approach used to parameterize the topography of the functional gradients and D1DR distribution, and to test the spatial correspondence between modalities, did not include any specific method to control for autocorrelation. Here, we highlight two aspects of our study in relation to this point. First, we demonstrated in the Supplementary information (S. Figure 4) that autocorrelation induced by spatial smoothing likely has limited effects on overall gradient topography and the ability of TSM parameters to capture meaningful inter-individual differences in terms of age. Second, in the case of spatial overlap effects being significantly impacted by autocorrelation, we would expect the association between right-hemisphere G2 and D1DR topography to similarly emerge for G2 in the left hemisphere. The absence of such an association may speak to a limited effect of spatial autocorrelation.
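      For readers less familiar with the concept, spatial autocorrelation can be quantified with a statistic such as Moran's I. The sketch below is not part of the authors' analysis; it is an illustrative, simplified one-dimensional version in which each position is treated as a neighbour of the positions immediately before and after it. The function name and the neighbourhood definition are assumptions made for this example.

```python
def morans_i_1d(values):
    """Moran's I for a 1D sequence with binary weights between adjacent positions.

    Values near +1 indicate that neighbouring positions carry similar values
    (smooth maps), values near -1 indicate alternation, and values near 0
    indicate no spatial structure.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two positions")
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("values are constant; Moran's I is undefined")
    # Each adjacent pair (i, i + 1) contributes twice: w[i][i+1] and w[i+1][i].
    cross = 2.0 * sum(dev[i] * dev[i + 1] for i in range(n - 1))
    total_weight = 2.0 * (n - 1)
    return (n / total_weight) * (cross / denom)
```

      A smooth ramp yields a clearly positive value while a strictly alternating sequence yields a value close to -1, which illustrates why two independently smooth maps can appear spatially similar by chance.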

      (4) The TSM approach quantifies the gradients in terms of x/y/z direction in a cartesian coordinate system. Wouldn't a shape intrinsic coordinate system in the hippocampus also be interesting, and perhaps even be more efficient to look at here (see eg DeKraker 2022 eLife or Paquola et al 2020 eLife)?

      This is a very relevant question and we appreciate the Reviewer’s suggestion. We recognize that there may be several benefits associated with adopting a shape-intrinsic coordinate system when characterizing effects in the hippocampus, given its curved/folded anatomy. Approaches like the ones adopted in DeKraker et al., 2022 and Paquola et al., 2020, utilize geodesic coordinate frameworks to represent the hippocampus in surface space, enabling mapping of connectivity onto the hippocampal surface while respecting its inherent curvature and topology. We anticipate that quantifying gradients within such a framework would especially benefit identification of connectivity change across the hippocampal surface relative to reference points such as subfield boundaries, while minimizing effects of interindividual differences in hippocampal shape and folding. In our study, hippocampal gradients and their associated cortical patterns were computed in volumetric space, with TSM subsequently used to parameterize the change in connectivity along these gradients. This indeed yields a description of connectivity change within a coordinate system less specific to hippocampal anatomy, but may favor generalizability and integration with previous gradient findings within and beyond the hippocampus (e.g., Przeździk et al., 2019; Tian et al., 2020; Katsumi et al., 2023; Navarro-Schröder et al., 2015), as well as connections with broader neuroimaging frameworks through techniques such as meta-analytical decoding. In our view, the different coordinate frameworks offer complementary insight into hippocampal organization, and while we have opted to not undertake novel analyses to explore our gradients within a geodesic coordinate system for the purposes of this paper, we recognize the importance of such evaluation of our gradients in future analyses. We have made updates to the Discussion in the revised manuscript on this topic (pages 23-24):

      “Greater anatomical specificity, with more precise characterization of connectivity in relation to subfield boundaries while minimizing effects of inter-individual differences in hippocampal shape and folding, might be achieved by adopting techniques implementing a geodesic coordinate system to represent effects within the hippocampus(68,69).”

      Reviewer #2 (Public Review):

      Summary:

      This paper derives the first three functional gradients in the left and right hippocampus across two datasets. These gradient maps are then compared to dopamine receptor maps obtained with PET, associated with age, and linked to memory. Results reveal links between dopamine maps and gradient 2, age with gradients 1 and 2, and memory performance.

      Strengths:

      This paper investigates how hippocampal gradients relate to aging, memory, and dopamine receptors, which are interesting and important questions. A strength of the paper is that some of the findings were replicated in a separate sample.

      Weaknesses:

      The paper would benefit from added clarification on the number of models/comparisons for each test. Furthermore, it would be helpful to clarify whether or not multiple comparison correction was performed and - if so - what type or - if not - to provide a justification. The manuscript would furthermore benefit from code sharing and clarifying which results did/did not replicate.

      We thank the Reviewer for their positive assessment and suggestions regarding further clarifications. We have addressed the Reviewer’s comments in a point-by-point manner under the “Recommendations for the authors” section.

      Reviewer #3 (Public Review):

      Summary:

      In this study, the authors analyzed the complex functional organization of the hippocampus using two separate adult lifespan datasets. They investigated how individual variations in the detailed connectivity patterns within the hippocampus relate to behavioral and molecular traits. The findings confirm three overlapping hippocampal gradients and reveal that each is linked to established functional patterns in the cortex, the arrangement of dopamine receptors within the hippocampus, and differences in memory abilities among individuals. By employing multivariate data analysis techniques, they identified older adults who display a hippocampal gradient pattern resembling that of younger individuals and exhibit better memory performance compared to their age-matched peers. This underscores the behavioral importance of maintaining a specific functional organization within the hippocampus as people age.

      Strengths:

      The evidence supporting the conclusions is overall compelling, based on a unique dataset, rich set of carefully unpacked results, and an in-depth data analysis. Possible confounds are carefully considered and ruled out.

      Weaknesses:

      No major weaknesses. The transparency of the statistical analyses could be improved by explicitly (1) stating what tests and corrections (if any) were performed, and (2) justifying the elected statistical approaches. Further, some of the findings related to the DA markers are borderline statistically significant and therefore perhaps less compelling but they line up nicely with results obtained using experimental animals and I expect the small effect sizes to be largely related to the quality and specificity of the PET data rather than the derived functional connectivity gradients.

      We thank the Reviewer for the thoughtful summary and positive assessment of our work. To increase transparency of the statistical analyses, we have in the revised manuscript added information regarding statistical tests and corrections for multiple comparisons. In the Results, p-values were reported at an uncorrected statistical threshold, and we have in the revised manuscript included the corresponding p-values adjusted for multiple comparisons using the Benjamini-Hochberg method to control the false discovery rate (FDR). Finally, in the revised manuscript, we have now elaborated on the potential limitations of our PET analyses and we include the updated paragraph below.

      Addition made to the Results section, page 13:

      “Individual maps of D1DR binding potential (BP) were also submitted to TSM, yielding a set of spatial model parameters describing the topographic characteristics of hippocampal D1DR distribution for each participant. D1DR parameters were subsequently used as predictors of gradient parameters in one multivariate GLM per gradient (in total 6 GLMs, controlled for age, sex, and mean FD). Results are reported with p-values at an uncorrected statistical threshold and p-values after adjustment for multiple comparisons using the Benjamini-Hochberg method to control the false discovery rate (FDR).”

      Addition made to the Results section, page 15:

      “Effects of age on gradient topography were assessed using multivariate GLMs including age as the predictor and gradient TSM parameters as dependent variables (controlling for sex and mean frame-wise displacement; FD). One model was fitted per gradient and hemisphere, each model including all TSM parameters belonging to a gradient (in total, 6 GLMs).”

      Addition made to the Results section, page 17:

      “Models were assessed separately for left and right hemispheres, across the full sample and within age groups, yielding eight hierarchical models in total. Results are reported with p-values at an uncorrected statistical threshold and p-values after FDR adjustment.”

      Updated paragraph in the Discussion, pages 25-26:

      “We discovered that G2, specifically, manifested organizational principles shared among function, behavior, and neuromodulation. Meta-analytical decoding reproduced a unimodal-associative axis across G2 (Figure 3B), and analyses in relation to the distribution of D1DRs – which vary across cortex along a unimodal-transmodal axis(76,77) – demonstrated topographic correspondence both at the level of individual differences and across the group. It should, however, be acknowledged that PET imaging in the hippocampus is associated with resolution-related limitations, although previous research indicates high test-retest reliability of [<sup>11</sup>C]SCH23390 PET to quantify D1DR availability in this region(78). As such, mapping the distribution of hippocampal D1DRs at a fine spatial scale remains challenging, and replication of our results in terms of overlap with G2 is needed in independent samples. Here, we evaluated the observed spatial overlap between G2 topography and D1DRs across multiple TSM model orders, showing correspondence between modalities from simple to more complex parameterizations of their spatial properties. Topographic correspondence was additionally observed between G2 and other DA markers from independent datasets (Figure 3B), suggesting that G2 may constitute a mode reflecting a dopaminergic phenotype, which contributes to the currently limited understanding of its biological underpinnings.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Please see the comments in the public review.

      We thank the Reviewer for their comments and recommendations, and have addressed them in the “Public review” section.

      Reviewer #2 (Recommendations For The Authors):

      (1) All statistical analyses are based on linear regressions using trend surface modeling (TSM) parameters that parameterize gradients at the subject level. These models resulted in 9 parameters for gradient 1 and 12 parameters each for gradients 2 and 3. The text states that 'Effects of age on gradient topography was assessed using multivariate GLMs including age as the predictor and gradient TSM parameters as dependent variables (controlling for sex and mean frame-wise displacement; FD)'. Please clarify whether these GLMs were fitted separately for each TSM parameter (i.e., 9+12+12=33 models for both left and right = 66 total models) or on the overall model?

      We appreciate the Reviewer’s request for clarification on this matter. These GLMs were fitted on the overall TSM model, that is, through one GLM per gradient (3) and hemisphere (2), each one including all TSM parameters belonging to a gradient (in total, 6 GLMs).

      In the revised manuscript, we have added more details to the Results section, page 15: “Effects of age on gradient topography were assessed using multivariate GLMs including age as the predictor and gradient TSM parameters as dependent variables (controlling for sex and mean frame-wise displacement; FD). One model was fitted per gradient and hemisphere, each model including all TSM parameters belonging to a gradient (in total, 6 GLMs).”

      (2) Similarly, for memory it appears that multiple models were performed (left and right, young, middle-aged, old, whole groups). Please clarify whether and how multiple comparison correction was performed in this case.

      In the revised manuscript, we have now specified the number of analyses conducted in relation to memory performance. We have also clarified that p-values were reported at an uncorrected statistical threshold, and we have in the revised manuscript included the corresponding p-values adjusted for multiple comparisons using the Benjamini-Hochberg method to control the FDR.

      Updated section in the Results, page 17:

      “Models were assessed separately for left and right hemispheres, across the full sample and within age groups, yielding eight hierarchical models in total. Results are reported with p-values at an uncorrected statistical threshold and p-values after FDR adjustment.”
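For readers unfamiliar with the Benjamini-Hochberg adjustment referred to in this response, the following is a minimal sketch in plain Python. It is not the authors' code: the function name `bh_adjust` and the eight example p-values are illustrative assumptions chosen only to mirror the "eight models" mentioned above.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical uncorrected p-values from eight models (made-up numbers).
raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
adj = bh_adjust(raw)
```

An effect would then be reported as surviving FDR correction when its adjusted value falls below the chosen threshold (commonly 0.05).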

      (3) Although I applaud the authors for their replication efforts, the results do not appear to replicate well. For example, memory was linked to gradient 2 in the whole group but to gradient 1 in the young group. Furthermore, dopamine was linked to gradient 2 in the right but not the left hemisphere. Although the overall group-level gradients were very stable between the two datasets, it is not clear whether the age findings replicated and the memory subgroup findings only replicated at trend level for memory and only partially replicated at the TSM parameter level.

      We thank the Reviewer for highlighting the inclusion of a replication dataset as a strength of our study, and we appreciate the recommendation to clarify to which extent results replicated. We provide a response to the Reviewer’s points below, and specify the revisions made to the manuscript in relation to this topic.

      The main aim of our study was to characterize the topographic organization of functional hippocampal-neocortical connectivity within the hippocampus across the adult lifespan, as previous studies have limited their focus to younger adults. Given the lack of previous studies for comparison, together with our identification of a novel secondary long-axis connectivity gradient (G2) taking precedence over the previously established medial-lateral G3, we included the Betula sample (Nilsson et al., 2004) for the purpose of replication. There was a high level of consistency between our main dataset and our replication dataset, with gradients 1-3 in left and right hemispheres identified in both samples.

      Further use of the replication dataset, beyond the identification of the connectivity gradients, was originally not planned. As such, not all subsequent analyses in the main dataset were conducted in the replication dataset. However, we found it critical to evaluate the observation that older individuals who maintained a youth-like gradient topography also exhibited higher levels of memory performance in an independent sample. This was possible given that the replication dataset included a comparable number of participants in similar ages and a word recall episodic memory task corresponding well to the one used in DyNAMiC. Overall, we conclude that these analyses replicated well across samples. Firstly, topography of left-hemisphere G1 informed the classification of older adults into youth-like and aged subgroups in both samples. Furthermore, in both samples, we observed that the older subgroups identified based on G1 topography also exhibited the youth-like vs. aged pattern in G2 topography. This pattern was, however, evident also in G3 only in the main sample, possibly suggesting a limited contribution of G3 topography in determining overall functional profiles in older age. In terms of the behavioral relevance of maintaining youth-like gradient topography in older age, we observed effects on word recall performance in both samples, although the Reviewer correctly points out that the difference between subgroups was significant only at trend level (p = 0.058) in the replication dataset. While this indeed underscores the importance of replication efforts in additional samples, we argue that the pattern observed in our replication dataset is overall consistent with, and conveys effects in the expected direction based on, the original observations in our main dataset.

      In revising the manuscript, we have performed additional analyses for replication purposes in terms of memory. Originally, we observed a significant association between G2 topography and episodic memory across the main sample. However, this effect did not remain significant after FDR adjustment for multiple comparisons. To evaluate this association further, we conducted a corresponding hierarchical multiple regression analysis in the replication dataset, which supported a role of G2 in memory (Adj. R<sup>2</sup> = 0.368, ΔR<sup>2</sup> = 0.081, F= 1.992, p = 0.028). Together, these analyses suggest that inter-individual differences in episodic memory performance may in part be explained by the spatial characteristics of G2 across the adult lifespan, although increased statistical power in relation to the large number of TSM parameters included in the hierarchical regression models may be needed to explore this association in smaller, age-stratified, groups. Relatedly, it is worth mentioning that higher levels of memory performance in older age were linked to the maintenance of youth-like G2 topography in both our main and replication datasets.

      In parallel, topographic parameters of G1 predicted memory performance in the younger adults, which successfully replicates TSM-based results previously reported in Przeździk et al., 2019. Although similar associations were not evident within the other age groups, a link between G1 topography and memory was demonstrated in older age based on a) the identification of individuals maintaining a youth-like G1 profile and higher levels of memory, within which b) memory performance was, as in young adults, significantly predicted by G1 topography.

      The spatial correspondence between G2 topography and distribution of hippocampal D1DRs was lateralized to the right and, as the Reviewer points out, as such did not replicate across hemispheres. To which extent replication across hemispheres should be expected in this case is, however, difficult to determine. Lateralization and/or hemispheric asymmetry is commonly observed in numerous hippocampal features, from the molecular level to its functional involvement in behavior (Nemati et al., 2023; Persson & Söderlund, 2015), including various dopaminergic markers tested in the animal literature (Afonso et al., 1993; Sadeghi et al., 2017). Yet, potential differences between hemispheres in D1DR availability and the spatial distribution of receptors along hippocampal axes remain less studied in humans. More data are therefore needed to determine the nature of this right-hemisphere lateralization.

      In sum, we argue that our results show a good level of replication across independent datasets and across analyses in our main dataset. Whereas this study did not attempt replication of all analyses conducted in the main dataset, it has through replication across independent samples provided support for its main findings – the organization of hippocampal-neocortical connectivity along three main hippocampal gradients across the adult lifespan, and the gradient topography-based identification of older individuals maintaining a youth-like hippocampal organization in older age.

      The revised manuscript includes edits made to incorporate the new analyses and clarifications of observations in relation to memory.

      In the Results, page 17:

      “Observing that the association between G2 and memory did not remain significant after FDR adjustment, we performed the same analysis in our replication dataset, which also included episodic memory testing. Consistent with the observation in our main dataset, G2 significantly predicted memory performance (Adj. R<sup>2</sup> = 0.368, ΔR<sup>2</sup> = 0.081, F= 1.992, p = 0.028) over and above covariates and topography of G1. Here, the analysis also showed that G1 topography predicted performance across the sample (Adj. R<sup>2</sup> = 0.325, ΔR<sup>2</sup> = 0.112, F= 3.431, p < 0.001).”

      In the Discussion, page 26:

      “Results linked both G1 and G2 to episodic memory, suggesting complementary contributions of these two overlapping long-axis modes. Considered together, analyses in the main and replication datasets indicated a role of G2 topography in memory across the adult lifespan, independent of age. A similar association with G1 was only evident across the entire sample in the replication dataset, whereas results in the main sample seemed to emphasize a role of youth-like G1 topography in memory performance. In line with previous research, memory was successfully predicted by G1 topography in young adults(30), and similarly predicted by G1 in older adults exhibiting a youth-like functional profile.”

      (4) Please share the data and code and add a description of data and code availability in the manuscript.

      We have now made our code available, and added a statement on data and code availability in the revised manuscript.

      On page 37: “Data from the DyNAMiC study are not publicly available. Access to the original data may be shared upon request from the Principal investigator, Dr. Alireza Salami. The Matlab, R, and FSL codes used for analyses included in this study are openly available at https://github.com/kristinnordin/hcgradients. Computation of gradients was done using the freely available toolbox ConGrads: https://github.com/koenhaak/congrads.”

      Reviewer #3 (Recommendations For The Authors):

      Please see the comments in the public review.

      We thank the Reviewer for their comments and recommendations, and have addressed them in the “Public review” section.

      References

      Afonso, D., Santana, C., & Rodriguez, M. (1993). Neonatal lateralization of behavior and brain dopaminergic asymmetry. Brain Research Bulletin, 32(1), 11–16. https://doi.org/10.1016/0361-9230(93)90312-Y

      DeKraker, J., Haast, R. A., Yousif, M. D., Karat, B., Lau, J. C., Köhler, S., & Khan, A. R. (2022). Automated hippocampal unfolding for morphometry and subfield segmentation with HippUnfold. eLife, 11, e77945. https://doi.org/10.7554/eLife.77945

      Dubovyk, V., & Manahan-Vaughan, D. (2019). Gradient of expression of dopamine D2 receptors along the dorso-ventral axis of the hippocampus. Frontiers in Synaptic Neuroscience, 11. https://doi.org/10.3389/fnsyn.2019.00028

      Edelmann, E., & Lessmann, V. (2018). Dopaminergic innervation and modulation of hippocampal networks. Cell and Tissue Research, 373(3), 711–727. https://doi.org/10.1007/s00441-018-2800-7

      Gasbarri, A., Verney, C., Innocenzi, R., Campana, E., & Pacitti, C. (1994). Mesolimbic dopaminergic neurons innervating the hippocampal formation in the rat: A combined retrograde tracing and immunohistochemical study. Brain Research, 668(1), 71–79. https://doi.org/10.1016/0006-8993(94)90512-6

      Glasser, M. F., & Essen, D. C. V. (2011). Mapping Human Cortical Areas In Vivo Based on Myelin Content as Revealed by T1- and T2-Weighted MRI. Journal of Neuroscience, 31(32), 11597–11616. https://doi.org/10.1523/JNEUROSCI.2180-11.2011

      Kaller, S., Rullmann, M., Patt, M., Becker, G.-A., Luthardt, J., Girbardt, J., Meyer, P. M., Werner, P., Barthel, H., Bresch, A., Fritz, T. H., Hesse, S., & Sabri, O. (2017). Test– retest measurements of dopamine D1-type receptors using simultaneous PET/MRI imaging. European Journal of Nuclear Medicine and Molecular Imaging, 44(6), 1025–1032. https://doi.org/10.1007/s00259-017-3645-0

      Katsumi, Y., Zhang, J., Chen, D., Kamona, N., Bunce, J. G., Hutchinson, J. B., Yarossi, M., Tunik, E., Dickerson, B. C., Quigley, K. S., & Barrett, L. F. (2023). Correspondence of functional connectivity gradients across human isocortex, cerebellum, and hippocampus. Communications Biology, 6(1), Article 1. https://doi.org/10.1038/s42003-023-04796-0

      Kempadoo, K. A., Mosharov, E. V., Choi, S. J., Sulzer, D., & Kandel, E. R. (2016). Dopamine release from the locus coeruleus to the dorsal hippocampus promotes spatial learning and memory. Proceedings of the National Academy of Sciences, 113(51), 14835–14840. https://doi.org/10.1073/pnas.1616515114

      Navarro Schröder, T., Haak, K. V., Zaragoza Jimenez, N. I., Beckmann, C. F., & Doeller, C. F. (2015). Functional topography of the human entorhinal cortex. eLife, 4, e06738. https://doi.org/10.7554/eLife.06738

      Nemati, S. S., Sadeghi, L., Dehghan, G., & Sheibani, N. (2023). Lateralization of the hippocampus: A review of molecular, functional, and physiological properties in health and disease. Behavioural Brain Research, 454, 114657. https://doi.org/10.1016/j.bbr.2023.114657

      Nilsson, L.-G., Adolfsson, R., Bäckman, L., Frias, C. M. de, Molander, B., & Nyberg, L. (2004). Betula: A Prospective Cohort Study on Memory, Health and Aging. Aging, Neuropsychology, and Cognition, 11(2–3), 134–148. https://doi.org/10.1080/13825580490511026

      Nyberg, L. (2017). Functional brain imaging of episodic memory decline in ageing. Journal of Internal Medicine, 281(1), 65–74. https://doi.org/10.1111/joim.12533

      Nyberg, L., Boraxbekk, C.-J., Sörman, D. E., Hansson, P., Herlitz, A., Kauppi, K., Ljungberg, J. K., Lövheim, H., Lundquist, A., Adolfsson, A. N., Oudin, A., Pudas, S., Rönnlund, M., Stiernstedt, M., Sundström, A., & Adolfsson, R. (2020). Biological and environmental predictors of heterogeneity in neurocognitive ageing: Evidence from Betula and other longitudinal studies. Ageing Research Reviews, 64, 101184. https://doi.org/10.1016/j.arr.2020.101184

      Paquola, C., Benkarim, O., DeKraker, J., Larivière, S., Frässle, S., Royer, J., Tavakol, S., Valk, S., Bernasconi, A., Bernasconi, N., Khan, A., Evans, A. C., Razi, A., Smallwood, J., & Bernhardt, B. C. (2020). Convergence of cortical types and functional motifs in the human mesiotemporal lobe. eLife, 9, e60673. https://doi.org/10.7554/eLife.60673

      Pedersen, R., Johansson, J., Nordin, K., Rieckmann, A., Wåhlin, A., Nyberg, L., Bäckman, L., & Salami, A. (2024). Dopamine D1-Receptor Organization Contributes to Functional Brain Architecture. Journal of Neuroscience, 44(11). https://doi.org/10.1523/JNEUROSCI.0621-23.2024

      Pedersen, R., Johansson, J., & Salami, A. (2023). Dopamine D1-signaling modulates maintenance of functional network segregation in aging. Aging Brain, 3, 100079. https://doi.org/10.1016/j.nbas.2023.100079

      Persson, J., & Söderlund, H. (2015). Hippocampal hemispheric and long-axis differentiation of stimulus content during episodic memory encoding and retrieval: An activation likelihood estimation meta-analysis. Hippocampus, 25(12), 1614–1631. https://doi.org/10.1002/hipo.22482

      Przeździk, I., Faber, M., Fernández, G., Beckmann, C. F., & Haak, K. V. (2019). The functional organisation of the hippocampus along its long axis is gradual and predicts recollection. Cortex, 119, 324–335. https://doi.org/10.1016/j.cortex.2019.04.015

      Sadeghi, L., Rizvanov, A. A., Salafutdinov, I. I., Dabirmanesh, B., Sayyah, M., Fathollahi, Y., & Khajeh, K. (2017). Hippocampal asymmetry: Differences in the left and right hippocampus proteome in the rat model of temporal lobe epilepsy. Journal of Proteomics, 154, 22–29. https://doi.org/10.1016/j.jprot.2016.11.023

      Tian, Y., Margulies, D. S., Breakspear, M., & Zalesky, A. (2020). Topographic organization of the human subcortex unveiled with functional connectivity gradients. Nature Neuroscience, 1–12. https://doi.org/10.1038/s41593-020-00711-6

      vos de Wael, R., Larivière, S., Caldairou, B., Hong, S.-J., Margulies, D. S., Jefferies, E., Bernasconi, A., Smallwood, J., Bernasconi, N., & Bernhardt, B. C. (2018). Anatomical and microstructural determinants of hippocampal subfield functional connectome embedding. Proceedings of the National Academy of Sciences, 115(40), 10154–10159. https://doi.org/10.1073/pnas.1803667115

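    1. Color consistency of printed material under different light sources

      Print and color matching is a critical part of the printing industry, especially where high-precision color reproduction is required (brand marks, artwork, photography books, and so on). The following explains print and color matching in detail, focusing on two topics: color consistency of printed material under different light sources, and optimization of RGB → CMYK color conversion.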


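      1. Color consistency of printed material under different light sources

      The color appearance of a print depends on the light source it is viewed under. When two colors that match under one source diverge under another, the effect is called metamerism. Keeping color consistent across light sources requires the measures below.

      (1) Metamerism

      • Definition: metamerism is the phenomenon in which two colors look identical under one light source but visibly differ under another.
      • Cause: light sources have different spectral power distributions, so the spectrum reflected from the printed surface changes with the source, which in turn changes how the eye perceives the color.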
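The mechanism described above can be illustrated with a deliberately simplified toy model. This is a sketch under stated assumptions, not a colorimetric calculation: the four wavelength bands, the three idealized sensors, the two "inks", and both illuminants are all made-up integer values chosen so the arithmetic is exact.

```python
# Toy model: 4 wavelength bands and 3 idealized sensors (not real cone curves).
SENSORS = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]


def response(illuminant, reflectance):
    """Sensor responses: sum over bands of sensor * illuminant * reflectance."""
    return [
        sum(s * e * r for s, e, r in zip(sensor, illuminant, reflectance))
        for sensor in SENSORS
    ]


# Two hypothetical inks with different reflectance spectra (percent per band).
ink_a = [50, 50, 50, 50]
ink_b = [70, 30, 70, 30]

flat_light = [2, 2, 2, 2]    # spectrally flat illuminant
skewed_light = [4, 2, 2, 1]  # illuminant with more energy in the first band

# The inks produce identical responses under the flat light only.
match_flat = response(flat_light, ink_a) == response(flat_light, ink_b)
match_skewed = response(skewed_light, ink_a) == response(skewed_light, ink_b)
```

Under the flat illuminant both inks give the same three sensor responses, so they would look identical; under the skewed illuminant the responses differ, which is the signature of a metameric pair.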

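      (2) Solutions

      • Standardized light sources: during design and proofing, evaluate color under a standard illuminant such as D50 or D65 (correlated color temperatures of roughly 5000 K and 6500 K). These illuminants approximate daylight and provide consistent viewing conditions.
      • Multi-illuminant testing: in practice, prints are viewed under many lighting conditions (daylight, fluorescent lamps, LED lamps). Testing the print under several light sources helps confirm that its color holds up across environments.
      • Special inks: some high-end print jobs use special inks (such as Pantone spot colors) whose spectral characteristics are more stable, which can reduce metamerism.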

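      2. Optimizing RGB → CMYK color conversion

      For printing, digital images must be converted from the RGB color mode to the CMYK color mode. Because the RGB and CMYK color spaces differ, a direct conversion can cause color shift. The methods below help optimize the conversion.

      (1) Differences between RGB and CMYK

      • RGB: used by display devices (monitors, phone screens); based on additive mixing of light; wider color range.
      • CMYK: used for printing; based on subtractive mixing of inks; narrower color range.

      (2) Causes of color shift

      • Gamut difference: the RGB gamut is wider than the CMYK gamut, so some RGB colors cannot be reproduced accurately in CMYK.
      • Conversion algorithm: different color management software or tools may use different conversion algorithms, which leads to inconsistent color.

      (3) Methods for optimizing RGB → CMYK conversion

      • Use color profiles (ICC profiles)
        • An ICC profile is a file that describes a device's color characteristics; it supports accurate conversion between RGB and CMYK.
        • When converting, choose the ICC profile that matches the press and the paper so that color reproduction is accurate.
      • Soft proofing
        • Enable soft proofing in the design software (for example, Adobe Photoshop) to simulate the printed color on screen.
        • Soft proofing makes it possible to find and correct color shift before printing.
      • Manual adjustment
        • After conversion, manually adjust color balance, saturation, and brightness to compensate for color detail lost in the conversion.
      • Use spot colors
        • For brand marks and other elements that need high-precision color, use Pantone spot colors instead of relying on CMYK mixes.
      • Test printing
        • Before the production run, print a small sample (a press proof or a digital proof) and check that the color matches expectations.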
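To make the gamut discussion concrete, here is a minimal sketch of the textbook device-independent RGB → CMYK formula. It is an illustration only: it uses no ICC profile and knows nothing about the press, ink, or paper, which is exactly why profile-based conversion is recommended above. The function name `rgb_to_cmyk` is an assumption for this example.

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB (0-255) to CMYK (0.0-1.0) conversion without any ICC profile."""
    if (r, g, b) == (0, 0, 0):
        # Pure black: avoid dividing by zero and use black ink only.
        return (0.0, 0.0, 0.0, 1.0)
    rp, gp, bp = r / 255, g / 255, b / 255
    # Black (K) is whatever the brightest channel leaves uncovered.
    k = 1 - max(rp, gp, bp)
    c = (1 - rp - k) / (1 - k)
    m = (1 - gp - k) / (1 - k)
    y = (1 - bp - k) / (1 - k)
    return (c, m, y, k)


pure_red = rgb_to_cmyk(255, 0, 0)
paper_white = rgb_to_cmyk(255, 255, 255)
```

A saturated screen red maps to full magenta plus full yellow here, but whether that ink mix looks like the original red depends on the actual inks and paper, which this formula ignores.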

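      3. Practical considerations

      • Design stage: work in CMYK mode where possible to avoid color loss from a late conversion.
      • Communication and collaboration: stay in close contact with the print shop and learn the characteristics of its equipment and processes so that color management carries through without gaps.
      • Environment control: when proofing and evaluating color, make sure the viewing light and background meet the standard (for example, a neutral gray background and a standard light source).

      4. Summary

      • Color consistency under different light sources: standardized light sources, multi-illuminant testing, and special inks reduce metamerism and keep printed color consistent across lighting conditions.
      • Optimizing RGB → CMYK conversion: ICC profiles, soft proofing, manual adjustment, and test printing limit color shift and keep printed color accurate.

      Combined, these techniques help designers and print shops achieve high-quality results and meet client requirements for color accuracy and consistency.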

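    2. Film color grading

      Color grading is a critical stage of film production. It is not only about making the image look better; it also has to keep color consistent across different lighting conditions. The concept is explained below.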


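      1. Basic goals of color grading

      The main goal of color grading is to adjust color, contrast, brightness, and related parameters in order to:

      • Strengthen emotional expression: use color to convey a specific mood or atmosphere (for example, cool tones for loneliness, warm tones for comfort).
      • Unify the visual style: keep the look consistent across the whole film and avoid color jumps between scenes.
      • Fix shooting problems: correct color deviations caused by lighting, equipment, and similar factors during the shoot.

      Beyond these aesthetic and technical goals, color grading must also solve a key problem: keeping color consistent across different lighting conditions.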


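      2. Challenges of different lighting conditions

      Lighting can change significantly during shooting and post-production, for example:

      • Natural versus artificial light: natural light (such as daylight) and artificial light (such as lamps) differ in color temperature and spectral distribution, which leads to inconsistent color.
      • Scene changes: the same scene shot at different times (day and night), or different scenes with different lighting, can produce color deviations.
      • Equipment differences: cameras and displays have different color response characteristics, which can lead to inconsistent color reproduction.

      These lighting changes directly affect the color of the image. Left uncorrected, they can make the film feel discontinuous or distorted to the audience.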


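      3. Why color consistency matters

      In color grading, consistent color matters for the following reasons:

      • Narrative continuity: color is an important storytelling element. If the color of the same scene or character varies between shots, it can break narrative continuity and confuse the audience.
      • Visual immersion: consistent color strengthens immersion and lets viewers focus on the story instead of being distracted by color jumps.
      • Brand and style: many directors and cinematographers have a distinctive visual style (such as Wes Anderson's highly saturated palettes or Christopher Nolan's desaturated dark tones). Consistent color reinforces that style and makes it recognizable.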


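      4. How to keep color consistent across lighting conditions

      To achieve consistent color, colorists typically take the following steps:

      (1) Color management

      • Use a color management system (such as ACES) to keep color consistent through the whole pipeline, from shooting to post-production.
      • Shoot a gray card or color chart on set for calibration, giving the grade a reference.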
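The gray-card step can be sketched as a simple per-channel gain calculation. This is a simplified illustration, not how any particular grading tool works: it assumes linear-light RGB values between 0 and 1, non-zero gray readings, and hypothetical function names (`gray_card_gains`, `apply_gains`) and sample values.

```python
def gray_card_gains(gray_rgb):
    """Per-channel gains that make the measured gray patch neutral."""
    target = sum(gray_rgb) / 3
    return tuple(target / channel for channel in gray_rgb)


def apply_gains(pixel, gains):
    """Scale a linear-light RGB pixel by the gains and clamp to [0, 1]."""
    return tuple(min(1.0, max(0.0, v * g)) for v, g in zip(pixel, gains))


# Hypothetical gray-card reading from a shot with a blue cast.
measured_gray = (0.40, 0.50, 0.60)
gains = gray_card_gains(measured_gray)
balanced_gray = apply_gains(measured_gray, gains)
```

Applying the same gains to every pixel of the shot removes the cast measured on the card; shots balanced this way start from a common neutral point before any creative grade is added.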

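      (2) Color matching

      • Match color between different shots of the same scene so that they appear to have been filmed under the same lighting.
      • Adjust color between scenes so that they remain visually coherent.

      (3) Dynamic range adjustment

      • For high dynamic range (HDR) content, the colorist needs to pay particular attention to color in the brightest and darkest regions so that it stays consistent across lighting conditions.

      (4) Reference displays

      • Grade on a calibrated professional display to ensure accurate color reproduction.
      • Test on different display types (cinema projection, television, phone) to confirm that color is consistent across platforms.

      (5) Lighting simulation

      • In post-production, the colorist can digitally simulate different lighting conditions and adjust color to stay consistent, for example simulating shadows in daylight or lamp light at night.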

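      5. Examples

      • Blade Runner 2049: the grade for this film paid particular attention to color consistency across lighting conditions. In exterior scenes, for example, the lighting is complex (haze, neon), yet careful grading keeps the image within a unified cool-toned style.
      • Mad Max: Fury Road: this film is known for its highly saturated color style. Through precise color management, the colorist kept color consistent across desert, night, and interior lighting.

      6. Summary

      The core of color consistency in grading is to use color management, color matching, and technical tools so that the image stays coherent and believable across lighting conditions. This improves not only how the image looks but also narrative continuity and audience immersion. The colorist plays a central role in this process, combining artistic judgment with technical skill to give the film a unified visual style and emotional expression.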

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment 

      This study presents valuable findings regarding the role of life history differences in determining population size and demography. The evidence for the claims is still partially incomplete, with concerns about generation times and population structure. Nonetheless, the work will be of considerable interest to biologists thinking about the evolutionary consequences of life history changes.

      Thank you. We have addressed the generation time and population structure issues in detail in our revision and hope that you, like us, find them to be of sufficiently low concern (i.e., they are not driving the results) that they do not overshadow the main findings and conclusions.

      The opportunity to make in-depth revisions also helped the manuscript in two ways unanticipated by both us and the reviewers. First, KW made a mistake in the original analysis of phylogenetic signal, and catching that error simplifies that aspect of the study (there is none in our measured variables). Second, in June 2024 Hilgers et al. (2024; https://doi.org/10.1101/2024.06.17.599025) posted an important manuscript to bioRxiv noting the possibility of false population size peaks in PSMC analyses using the standard default settings. Our results had three of those, which we have eliminated. Neither of these issues affects the overall conclusions, but their resolution improves the work.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This interesting study applies the PSMC model to a set of new genome sequences for migratory and nonmigratory thrushes and seeks to describe differences in the population size history among these groups. The authors create a set of summary statistics describing the PSMC traces - mean and standard deviation of N<sub>e</sub>, plus a set of metrics describing the shape of the oldest N<sub>e</sub> peak - and use these to compare across migratory and resident species (taking single samples sequenced here as representative of the species). The analyses are framed as supporting or refuting aspects of a biogeographic model describing colonization dynamics from tropical to temperate North and South America. 

      Strengths: 

      At a technical level, the sequencing and analysis up through PSMC looks good and the paper is engaging and interesting to read as an introduction to some verbal biogeographic models of avian evolution in the Pleistocene.

      The core findings - higher and more variable N<sub>e</sub> in migratory species - seem robust, and the biogeographic explanation is plausible.  

      Thanks. We thought so as well. Our analyses go beyond being simply descriptive and test some simple hypotheses, including a biogeographic+ecological expansion opportunity gained in some lineages through the adoption of a seasonal migration life-history strategy.  

      Weaknesses: 

      I did not find the analyses particularly persuasive in linking specific aspects of clade-level PSMC patterns causally to evolutionary driving forces. To their credit, the authors have anticipated my main criticism in the discussion. This is that variation in population size inferred by methods like PSMC is in "effective" terms, and the link between effective and census population size is a morass of bias introduced by population structure and selection, so robustly connecting specific aspects of PSMC traces to causal evolutionary forces is somewhere between extremely difficult and impossible.

      As R1 notes, we do not attempt to link effective population sizes and census sizes (though we do discuss this), and we are also careful to discuss correlated rather than causative factors when going beyond the overarching hypotheses regarding life-history strategy.

      Population structure is the most obvious force that can generate large N<sub>e</sub> changes mimicking the census-size-focused patterns the authors discuss. The authors argue in the discussion that since they focus on relatively deep time (>50kya at least, with most analyses focusing on the 5mya - 500kya range) population structure is "likely to become less important", and the resident species are usually more structured today (true) which might bias the findings against the observed higher N<sub>e</sub> in migrants.  

      To clarify, the patterns we discuss are entirely related to effective population size, not census size. But, yes, this is why we’ve given population structure its own section in the Discussion.

      But is structure really unimportant in driving PSMC results at these specific timescales? There is no numerical analysis presented to support the claim in this paper. The biogeographic model of increased temperate-latitude land area supporting higher populations could yield high N<sub>e</sub> via high census size, but shifts in population structure (for example, from one large panmictic population to a series of isolated refugial populations as a result of glaciation-linked climate changes) could plausibly create elevated and more variable N<sub>e</sub>. Is it more land area and ecological release leading to a bigger and faster initial N<sub>e</sub> bump, or is it changes in population connectivity over time at expanding range edges, or is the whole single-bump PSMC trace an artifact of the dataset size, or what? The authors have convinced me that the N<sub>e</sub> history of migratory thrushes is on average very different from nonmigrant thrushes, but beyond that it's unclear what exactly we've learned here about the underlying process.  

      We do not argue that population structure is unimportant, only that it is less important as one goes into deeper time. Further, we agree with the reviewer’s observation above that structure is more likely to bias nonmigrant estimates of N<sub>e</sub>. In other words, following Li & Durbin’s (2011) simulations, we interpret that an inflated N<sub>e</sub> due to structure should occur more often among residents. We have clarified this in the revision. We also agree that what we’ve learned about the underlying process is not entirely clear, but as we stated, population structure does not seem to be the main driver, and there is evidence that both biogeographic and ecological factors are involved. With this being the first time that these questions have been asked, we think we’ve made an important advance and that we’ve opened a number of avenues for future study.

      It is also important to consider the time scales involved and the sampling regime. Glacial-interglacial cycles averaged ~100 Kyr back to 0.74 Mya and then averaged ~41 Kyr from then back to 2.47 Mya; about 50-60 of these cycles occurred (Lisiecki & Raymo 2005: fig. 4). This probably caused a lot of population structuring and mixing in these lineages. In addition, in the PSMC output from one of our lineages, C. ustulatus swainsonii, we find that there are 54 time segments sampled for the Pleistocene, indicating the inadequacy of this method to reflect fine-scale changes and suggesting that each estimate is capturing a lot of both phenomena, structuring and mixing. We have added this to the revision.
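
      The arithmetic behind the cycle count can be checked with a short back-of-envelope sketch. It uses only the figures quoted above (average periods of ~100 Kyr and ~41 Kyr, the 0.74 Mya and 2.47 Mya boundaries, and 54 PSMC time segments); the function and variable names are illustrative assumptions, not part of the authors' analysis.

```python
# Back-of-envelope check of the glacial-interglacial cycle count,
# using only the averages quoted in the response above.
RECENT_PERIOD_KYR = 100.0   # average cycle length from the present back to 0.74 Mya
OLDER_PERIOD_KYR = 41.0     # average cycle length from 0.74 Mya back to 2.47 Mya
SWITCH_KYA = 740.0          # 0.74 Mya, expressed in thousands of years ago
OLDEST_KYA = 2470.0         # 2.47 Mya, expressed in thousands of years ago
PSMC_SEGMENTS = 54          # Pleistocene time segments reported for C. ustulatus swainsonii


def approx_cycle_counts():
    """Return (recent, older, total) approximate numbers of cycles."""
    recent = SWITCH_KYA / RECENT_PERIOD_KYR
    older = (OLDEST_KYA - SWITCH_KYA) / OLDER_PERIOD_KYR
    return recent, older, recent + older


recent, older, total = approx_cycle_counts()
segments_per_cycle = PSMC_SEGMENTS / total

print(f"cycles since 0.74 Mya: {recent:.1f}")
print(f"cycles 0.74-2.47 Mya: {older:.1f}")
print(f"total cycles: {total:.1f}")
print(f"PSMC segments per cycle: {segments_per_cycle:.2f}")
```

      With these averages the total comes out at roughly 50 cycles, the lower end of the quoted range, so the 54 time segments amount to about one PSMC estimate per glacial-interglacial cycle.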

      I generally agree with the authors that "at present there is no way to fully disentangle the effects of population structure and geographic space on our results". But given that, I think there are two options - either we can fully acknowledge that oversimplified demographic models like PSMC cannot be interpreted as supporting evidence of any particular mechanistic or biogeographic hypothesis and stop trying to use them to do that, or we have to do our best to understand specifically which models can be distinguished by the analyses we're employing. 

      Short of developing some novel theory deep in the PSMC model, I think readers would need to see simulations showing that the analyses employed in this paper are capable of supporting or refuting their biogeographic hypothesis before viewing them as strongly supporting a specific biogeographic model. Tools like msprime and stdpopsim can be used to simulate genome-scale data with fairly complex biogeographic models. Running simulations of a thrush-like population under different biogeographic scenarios and then using PSMC to differentiate those patterns would be a more convincing argument for the biogeographic aspects of this paper. The other benefit of this approach would be to nail down a specific quantitative version of the taxon cycles model referenced in the abstract, and it would allow the authors to better study and explain the motivation behind the specific summary statistics they develop for PSMC posthoc analysis.  

      These could very well be fruitful pursuits for future work, but they are beyond the scope of this paper. The impossibility of reconstructing ranges through deep time makes anything other than the very general biogeographic hypothesis we’ve posed an uncertain pursuit. Also, a purely biogeographic approach neglects the likelihood of ecological expansion also being involved. We get at the importance of the latter in the “Geography and evolutionary ecology” section of the Discussion. Below, the editor states that discussions among reviewers indicate that simulations are not warranted at this time. We agree that the complexities involved are substantial, to the point of making direct relevance to this empirical study uncertain (especially in such an among-lineage context). Regarding taxon cycles, we merely point out that that conceptual framework seems relevant given our findings. This was not even remotely anticipated at the outset of the study, so we are reluctant to do anything more than point out its possible relevance in several aspects of the results. Finally, the motivation for the study’s summary statistics was entirely driven by the hypotheses, as given in Methods, and due to an earlier error (noted above), there are no post-hoc analyses in the revision. Sorry for the needless confusion.

      Reviewer #2 (Public Review): 

      Summary: 

      Winker and Delmore present a study on the demographic consequences of migratory versus resident behavior by contrasting the evolutionary history of lineages within the same songbird group (thrushes of the genus Catharus). 

      Strengths: 

      I appreciate the test-of-hypothesis design of the study and the explicit formulation of three main expectations to test. The data analysis has been done with appropriate available tools. 

      Weaknesses: 

      The current version of the paper, with the case study chosen, the results, and the relative discussion, is not satisfying enough to support or reject the hypotheses here considered.  

      Given the stated strengths, the weaknesses noted seem a little incongruous, but we understand from the comments below that the reviewer would like to see the study redesigned and expanded.  

      The authors hypothesized that the wider realized breeding and ecological range characterising migrants versus resident lineages could be a major driver of increased effective population size and population expansion in migrants versus residents. I understand that this pattern (wider range in migrants) is a common characteristic across bird lineages and that it is viewed as a result of adapting to migration. A problem that I see in their dataset is that the breeding ranges of the two groups are located in very different geographic areas (mainly South versus North America). The authors could have expanded their dataset to include species whose breeding grounds are from the two areas, regardless of their migratory behaviour, as a comparison to disentangle whether ecological differences of these two areas can affect the population sizes or growth rates.

      Because the questions are about the migratory life history strategy and the best way to get at this is in a phylogenetic framework, we’re not sure how we could effectively add species “regardless of their migratory behavior.” Further, we know that migration causes lineages to experience variable ecological conditions that include breeding, migration, and wintering conditions. Obligate migrants are going to have different breeding ranges from their close relatives, and the more distantly related species are, the less likely it is that they respond to particular ecological conditions the same way. So we do not think that an approach that included miscellaneous species from northern and southern regions would strengthen this study. Here, the comparative framework of closely related lineages that possess or lack the trait of interest is a study design strength. We do agree, however, that future work is needed that does encompass more lineages (we would argue in a phylogenetic context), and that disentangling the effects of geography and ecology will also be an important future endeavor. 

      As I understand from previous literature, the time-scale to population growth and estimates of effective population sizes considered in the present paper for the resident versus migratory clades seem to widely predate the times to speciation for the same lineages, which were reported in previous work of the same authors (Everson et al 2019) and others (Termignoni-Garcia et al 2022). This piece of information makes the calculation of species-specific population size changes difficult to interpret in the light of lineages' comparison. It is unclear what the authors consider to be lineage-specific in these estimates, as the clades were likely undergoing substantial admixture during the time predating full isolation.  

      We do recognize that timing estimates vary among studies. Differences among studies in important variables like markers, methods, generation time, and mutation or substitution rates create much of this uncertainty. Also, we are not confident in prior dating efforts in this group, largely because of gene flow and its effects on bringing estimates closer to the present. As we point out (line 485), differences among studies on these issues do not detract from the strengths here for within-study, among-lineage contrasts. In short, the timing could be off in an among-study context (and likely is with prior work, given gene flow), but relative performance of among-lineage N<sub>e</sub> differences is less susceptible to these factors. This was shown fairly well in Li & Durbin’s initial use of the method among human populations. Regarding substantial admixture, PSMC curves often unite at their origins with sister lineages (when they were the same lineage). A good example is with the two C. guttatus E & W curves in Fig. S3, which still have substantial gene flow today (they are subspecies and in contact), yet they show remarkably different N<sub>e</sub> curves through their history. It is not possible to mark a cutoff point for each lineage that represents the cessation of admixture with another lineage (e.g., Everson et al. 2019 showed substantial admixture between three full species in this group); that period can be very long (Price et al. 2008), varies among lineages, and will not be available for deeper lineage divergences in the phylogeny. We therefore chose to use all of the time intervals retrievable from the genomic data in each lineage, considering that this uniform treatment is the best approach for our among-lineage comparison. And note that we were careful to label these as “the lineages’ PSMC inception” (line 190).  

      Regarding the methodological difficulties in interpreting the impact of population structure on the estimates of effective population sizes with the PSMC approach, I would think that performing simulations to compare different scenarios of different degrees of structured populations would have helped substantially understand some of the outcomes.  

      The complexities of such modeling in a system like this are daunting. The different degrees of structuring among all of these lineages across just a single glacial-interglacial cycle would necessitate a lot of guesswork; projecting that back across 50-60 such cycles just in the Pleistocene would probably end up being fiction. Disentangling the effects of structure versus changes in N<sub>e</sub> in a system like this would probably not be possible with that approach and these data. As noted above and below, there was agreement among reviewers and the editor that simulations in this case are not warranted for revision. We have added the nature of the glacial-interglacial cycles and the PSMC sampling time segments to help readers understand this better (see above in response to R1, and lines 272-278).

      Additionally, I have struggled to understand if migratory behaviour in birds is considered to be acquired to relieve species competition, or as a consequence of expanded range (i.e., birds expand their range but their feeding ground is kept where speciation occurred as to exploit a ground with higher quality and abundance of seasonal local resources).  

      The origins of migration have been a struggle for researchers since the subject was taken up. But how the trait was acquired among these species does not really matter for our study. Here, migratory lineages possess different biogeographic+ecological attributes than their close relatives that are sedentary. Our focus is on the presence and absence of this life-history trait.

      The points raised above could be considered to improve the current version of the paper. 

      Thank you. We appreciate the opportunity to guide our revision using your comments.  

      Reviewer #3 (Public Review): 

      Summary: 

      This paper applies PSMC and genomic data to test interesting questions about how life history changes impact long-term population sizes. 

      Strengths: 

      This is a creative use of PSMC to test explicit a priori hypotheses about seasonal migration and N<sub>e</sub>. The PSMC analyses seem well done and the authors acknowledge much of the complexity of interpretation in the discussion. 

      Weaknesses: 

      The authors use an average generation time for all taxa, but the citations imply generation time is known for at least some of them. Are there differences in generation time associated with migration? I am not a bird biologist, but quick googling suggests maybe this is the case (https://doi.org/10.1111/1365-2656.13983). I think it important the authors address this, as differences in generation time I believe should affect estimates of N<sub>e</sub> and growth.  

      Good point. The study cited by the reviewer encompasses a much higher degree of variation in body size and thus generation time. Differences in generation time in similarly sized close relatives, as in our study, should be small, and our approach has been to average those that are known. Unfortunately, generation times are not known for all of these species, but given their similarity in size we can have reasonable confidence in their being similar. We used data from the life-history research available (as cited) to obtain our average; there are not appropriate data for the residents, though. However, there is thought to be a generation time cost to seasonal migration in birds, and Bird et al. (2020) included this in their estimates to provide modeled values for all of the lineages we studied. We’re leery of using modeled values where good data for the nonmigrants in this group don’t exist (and the basis for quantifying this cost is tiny), but we recognize that this second approach is available and could leave some doubt in our results if not pursued. So we re-did everything with the modeled generation times of Bird et al. (2020). As expected, most of the differences are time-related. Importantly, our overall results are not different. We present them as Table S2 and have added the details on this to the Methods.
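
      Why a change in generation time mostly shifts the time axis can be illustrated with a minimal sketch of the conventional PSMC rescaling. This assumes a fixed per-generation mutation rate and a bin size of 100 bp; whether the authors' pipeline follows exactly this convention is an assumption, and every numeric value below is hypothetical rather than taken from the study.

```python
def scale_psmc(theta0, t_k, lambda_k, mu_per_gen, gen_time_years, bin_size=100):
    """Rescale raw PSMC output to years and effective population size.

    Conventional scaling: N0 = theta0 / (4 * mu * s); time in years is
    2 * N0 * t_k * g; and N_e in each segment is N0 * lambda_k.
    """
    n0 = theta0 / (4.0 * mu_per_gen * bin_size)
    years = [2.0 * n0 * t * gen_time_years for t in t_k]
    ne = [n0 * lam for lam in lambda_k]
    return years, ne


# Hypothetical raw PSMC values, chosen only for illustration.
theta0 = 0.02
t_k = [0.1, 0.5, 2.0]
lambda_k = [1.0, 3.0, 1.5]
mu = 5e-9  # hypothetical per-generation mutation rate

years_a, ne_a = scale_psmc(theta0, t_k, lambda_k, mu, gen_time_years=2.0)
years_b, ne_b = scale_psmc(theta0, t_k, lambda_k, mu, gen_time_years=3.0)

print("N_e identical under both generation times:", ne_a == ne_b)
print("time-axis stretch factors:", [round(b / a, 3) for a, b in zip(years_a, years_b)])
```

      Under this convention the N<sub>e</sub> values are unchanged and only the time axis stretches, in proportion to the generation time. If the mutation rate were instead specified per year and converted to a per-generation rate, N<sub>e</sub> would change as well.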

      The writing could be improved, both in the introduction for readers not familiar with the system and in the clarity and focus of the discussion.  

      We have added a phylogeny (new Fig. 1) to help readers better understand the system, and we’ve re-worked the Discussion to make it clearer what is clarified by our results and what remains unclear.  

      Recommendations for the authors:

      Reviewing Editor comment: 

      I note that discussion among the reviewers made clear that simulations are probably not the right answer given the complexity of the modeling required.  

      We appreciate this conclusion, with which we agree.  

      Reviewer #2 (Recommendations For The Authors): 

      Apologies for the delay with the review, which came at a very busy time. I hope you will find my comments helpful.

      Thanks. Your comments are helpful, and we fully understand how reviews (and our revisions!) have to wait until more pressing needs are addressed.

      I enjoyed reading the manuscript but I believe that the discussion sections could be heavily rewritten for better clarity. The discussion is sometimes redundant and lacks some flow/clarity. In a nutshell, I had the feeling that a bit of everything is thrown in the discussion but clear conclusions are not made.  

      Yes, the Discussion has been difficult to write, because more issues arose in the Results than we anticipated at the outset. We feel that discussing them is relevant, but we agree that much remains unclear. This coupling of paleodemographics with geography and ecology is a new area, which opens some important new (and relevant) areas to consider. So clarity is not possible in some areas. We’ve revised to point out where we do have clarity (e.g., in migrant lineages having different paleodemographic attributes than nonmigrants) and where only further study can provide clarity (e.g., in the roles of geography versus ecology). The journal format does not seem to have secondary subheaders, but we’ve used bold in one place to highlight ‘ecological mechanisms’ to offset that section, one of the more complex. We’ve also added a paragraph in the conclusions to clarify where we have clear takeaways and where uncertainties remain. 

      Reviewer #3 (Recommendations For The Authors): 

      The introduction should engage the reader with biology, not the use of demographic methods or genomics (both of which have been around for more than a decade). I would drop the first paragraph and considerably expand the second. What has previous research on ecology/behavior/genetics found regarding the demographic effects of seasonal migration?

      There are two important aspects to our study: 1) using paleodemographic methods to test hypotheses about adoption of a major life-history trait—an important biological question regardless of system, and so far (surprisingly) unaddressed; and 2) using this novel approach to study the effects of one such trait, seasonal migration. At these timescales, nothing exists on this subject, so there is really nothing to expand with. If there is relevant literature that we’ve missed, we’d be happy to add it.

      What is the missing bit of information or angle the current study addresses (other than just doing it larger and fancier with genomics)?  

      The effects of major life-history traits on paleodemographics have not been addressed before, to our knowledge. The whole context is new, so we’re not doing something “larger and fancier” with genomics. We are doing something that has not been done before: testing hypotheses about the effects of a major life-history trait on population sizes in evolutionary time. We’re not sure how this can be made clearer. To us this seems like a very engaging biological question with wide applicability. We hope that this study is just the first of many to come, in a diversity of biological systems.

      A figure showing the phylogenetic relationships of these taxa which are migratory would help the reader immensely. Although this is shown in Fig S3 I think it might be nice to have a map of the species and their ranges alongside a phylogeny as a main figure early on.  

      Thank you. This is a good suggestion. We can’t fit a phylogeny and all the distribution maps (Fig. S1) onto a page, but we can include a phylogeny as one of the main figures with nonmigrants highlighted. We’ve inserted this as a new Fig. 1. 

      If I understand correctly, the authors' arguments for why migratory species should show more growth hinge on large range size and geographic expansion. Yet they argue in the discussion that these forces are unlikely to be important (L226). I found the discussion on this confusing (e.g. L231 then says maybe it does matter). I think more clarity here would be helpful.

      Our argument and predictions are based both on geographic and ecological expansion. This was clearly stated as our third prediction “3) early population growth would be higher as seasonal migration opens novel ecological and geographic space…” We have gone back through and reiterated the coupling of these two factors. The line mentioned concludes the first paragraph in the section ‘Geography and evolutionary ecology,’ which focuses on the difficulty of decoupling these in this system. As the paragraph relates, geography alone does not seem to be driving our results (we do not argue that it is unimportant). 

      I also would have liked more time in the discussion addressing why variation in N<sub>e</sub> may be higher in migratory lineages.

      In addition to re-clarifying this in the Introduction, we have touched back on this now at line 221: “We attribute the higher variation in N<sub>e</sub> among migrants to be the result of the relative instability of northern biomes compared with tropical ones through glacial-interglacial cycles (e.g., Colinvaux et al., 2000; Pielou, 1991).”

      Minor comments: 

      L 62: Presumably PSMC is limited by the coalescent depth of the genealogy, which may be younger or older than population "origins" depending on the history of colonization, lineage splitting, gene flow, etc.  

      We were careful to phrase these as “the lineages’ PSMC inception” (line 190), and responded to this issue in more detail above in response to R2’s public review. 

      L 338: I think a few more details on PSMC would be helpful. Was no maskfile used?  

      We did not use a maskfile, choosing instead to generate data of decent coverage and aligning reads to a single closely related relative. 

      Did the consensus fasta include all species?  

      No, we used a single reference high-quality fasta of Catharus ustulatus, as reported (lines 434-37). We have added that “Identical treatment of all lineages in these respects should provide a strong foundation for a comparative study like this among close relatives.” 

      L 361: Fair to assume the authors used a weighted average of N<sub>e</sub> from the output, rather than just averaging the N<sub>e</sub> values from each time segment?  

      No – we used all the values of N<sub>e</sub> produced by PSMC output. The PSMC method uses nonoverlapping portions of the genome in its analyses (which we’ve added to make that clear), and portions in juxtaposition will often provide data for very different periods in the time segments. Further, time segments are uneven within and among taxa, so it is not clear how a uniform and comparable weighting scheme could be implemented. We consider a uniform approach to be of primary importance, including for future comparisons among studies. 
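
      The distinction at issue here, between averaging all segment values equally and weighting each by the length of its time segment, can be made concrete with a small sketch. The segment boundaries and N<sub>e</sub> values are hypothetical, and the function names are illustrative assumptions; the authors report using the unweighted form.

```python
from statistics import mean


def unweighted_mean_ne(ne_values):
    """Simple mean over all N_e values, one per time segment."""
    return mean(ne_values)


def duration_weighted_mean_ne(boundaries_years, ne_values):
    """Mean N_e weighted by the length in years of each time segment.

    boundaries_years has one more entry than ne_values: segment i spans
    boundaries_years[i] to boundaries_years[i + 1].
    """
    durations = [end - start for start, end in zip(boundaries_years, boundaries_years[1:])]
    weighted_sum = sum(d * n for d, n in zip(durations, ne_values))
    return weighted_sum / sum(durations)


# Hypothetical, unevenly spaced segments of the kind PSMC produces.
boundaries = [0, 10_000, 50_000, 250_000]
ne = [20_000, 60_000, 10_000]

plain = unweighted_mean_ne(ne)
weighted = duration_weighted_mean_ne(boundaries, ne)
print(f"unweighted mean N_e: {plain}")
print(f"duration-weighted mean N_e: {weighted}")
```

      Because the segments are uneven, the two summaries can differ substantially, which is why applying one scheme uniformly across lineages matters for comparability.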

      L 383 "delta" typo

      Thank you for catching this.

      L 93: I'd be tempted to present the questions (how does seasonal migration affect population size trajectory, means, and variation) and rationale before presenting the hypotheses. I found myself reading the hypotheses and wondering "why?"  

      We’ve tried this change in the revision. It makes the hypotheses a little harder to pull out (they are no longer numbered in a short sequence), but it is shorter and solves this concern.  

      L 337 read depth is usually expressed as X (e.g. "23X") rather than bp.

      Changed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This important study further validates DNAH12 as a causative gene for asthenoteratozoospermia and male infertility in humans and mice. The data supporting the notion that DNAH12 is required for proper axonemal development are generally convincing, although more experiments would solidify the conclusions. This work will interest reproductive biologists working on spermatogenesis and sperm biology, as well as andrologists working on male fertility.

      We thank the editor and the two reviewers for their time and careful evaluation of our manuscript. We sincerely appreciate their encouraging feedback and insightful guidance on improving our study. In the revised manuscript, we have performed additional experiments and provided quantitative data regarding the reviewers' comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Even though this is not the first report that a mutation in the DNAH12 gene causes asthenoteratozoospermia, the current study explores the sperm phenotype in depth. The authors show experimentally that the mutation disrupts the proper axonemal arrangement and the recruitment of DNALI1 and DNAH1, proteins of the inner dynein arms. Based on these results, the authors propose a functional model of DNAH12 in proper axonemal development. Lastly, the authors demonstrate that the male infertility caused by the studied mutation can be rescued by ICSI treatment, at least in the mouse. This study furthers our understanding of male infertility caused by a mutation of the axonemal protein DNAH12, and of how this type of infertility can be overcome using assisted reproductive therapy.

      Strengths:

      This is an in-depth functional study, employing multiple, complementary methodologies to support the proposed working model.

      Thank you for your recognition of the strength of this study. Your positive feedback motivates us to continue refining our research and methodological rigor in future studies.

      Weaknesses:

      The study's strength could be increased by including more controls, such as peptide blocking of the in-house raised mouse and rat DNAH12 antibodies, and mass spectrometry of a control IP with beads/IgG only to exclude non-specific binding. Objective quantifications of immunofluorescence images and WB seem to be missing. At least three technical replicates of western blotting of sperm and testis extracts could have been performed to demonstrate that the decrease of the signal intensity between WT and mutant was not caused by a methodological artifact.

      Thank you for your comments. To design the antibodies, we analyzed the sequence features of the DNAH12 protein and selected amino acids 1-200 as the antigen because of its favorable properties: (1) high immunogenicity; (2) high hydrophilicity; (3) good surface leakage groups; and (4) sequence homology analysis to avoid nonspecific recognition of other proteins. The two different anti-DNAH12 antibodies were developed with the help of the Dia-An Biotech company in 2022. We tried to acquire the polypeptide fragments of the target protein for peptide blocking, but the material had been discarded after the service. Fortunately, the target DNAH12 band was detected by western blotting but was not detected in the knockout mouse group, and the DNAH12 immunofluorescence signals were strong but absent in the knockout mouse group. In addition, we confirmed that the in-house raised rabbit antibody was suitable for IP experiments: it immunoprecipitated the DNAH12 band in Dnah12<sup>+/+</sup> mice but not in Dnah12<sup>-/-</sup> mice. Collectively, these data support the specificity of the raised DNAH12 antibodies. For the IP assay, we added an IgG group to the IP-mass spectrometry to exclude non-specific binding; the experimental design is described in Figure 6B. The raw data were deposited in the iProX partner repository (accession number: PXD051681), and we have coordinated with the repository manager to make the data publicly accessible (https://www.iprox.cn/page/subproject.html?id=IPX0008674001).

      In addition, we repeated the western blotting of sperm and testis extracts at least three times and added objective quantifications of the immunofluorescence signals and WB images. The blot quantifications are shown in the figures to help readers interpret these results.

      Reviewer #2 (Public Review):

      Summary:

      The authors first conducted whole exome sequencing for infertile male patients and families where they co-segregated the biallelic mutations in the Dynein Axonemal Heavy Chain 12 (DNAH12) gene.

      Sperm from patients with biallelic DNAH12 mutations exhibited a wide range of morphological abnormalities in both tails and heads, reminiscent of a prevalent cause of male infertility, asthenoteratozoospermia. To deepen the mechanistic understanding of DNAH12 in axonemal assembly, the authors generated two distinct DNAH12 knockout mouse lines via CRISPR/Cas9, both of which showed more severe phenotypes than observed in patients. Ultrastructural observations and biochemical studies revealed the requirement of DNAH12 in recruiting other axonemal proteins and that the lack of DNAH12 leads to aberrant stretching of the manchette structure as early as stage XI-XII. Finally, the authors proposed intracytoplasmic sperm injection as a potential measure to rescue patients with DNAH12 mutations, as the knockout sperm supported blastocyst formation at a ratio comparable to that of WT.

      Strengths:

      The authors convincingly showed the importance of DNAH12 in assembling cilia and flagella in both human and mouse sperm. This study is not a mere enumeration of the phenotypes, but a strong substantiation of DNAH12's essentiality in spermiogenesis, especially in axonemal assembly.

      The analyses conducted include basic sperm characterizations (concentration, motility), detailed morphological observations in both testes and sperm (electron microscopy, immunostaining, histology), and biochemical studies (co-immunoprecipitation, mass-spec, computational prediction). Molecular characterizations employing knockout animals and recombinant proteins beautifully proved the interactions with other axonemal proteins.

      Many proteins participate in properly organizing flagella, but the exact understanding of their coordination is still far from conclusive. The present study provides a starting point for untangling the direct relationships and order of manifestation of the players underpinning spermatogenesis. Furthermore, comparing flagella and trachea provides a unique angle that invites evolutionary perspectives.

      Thank you for your thoughtful and positive feedback. We are delighted that you found our study to be a strong substantiation of DNAH12's essential role in spermiogenesis, particularly in axonemal assembly. We believe that this study represents a meaningful step toward unraveling the intricate coordination of axonemal proteins during spermatogenesis, and your comments further inspire us to continue exploring these complex mechanisms in future work. Thank you once again for your valuable insights and summary of this work.

      Weaknesses:

      Seemingly minor, but the discrepancies between patients and genetically modified animals were not fully explained. For example, both knockout mouse lines showed vastly reduced sperm counts in the epididymis and reduced motility, while the phenotypes in patients were milder. Addressing the differences in the roles that the orthologs play in spermatogenesis would deepen the understanding of axonemal assembly.

      This is an interesting question. Although humans and mice share male infertility phenotypes when dynein proteins essential for sperm flagellar development are deficient, they differ in some ways. For instance, it has been reported that deficiency in DNAH17 (Clin Genet. 2021. PMID: 33070343) or DNAH8 (Am J Hum Genet. 2020. PMID: 32619401; PMCID: PMC7413861), two other members of the Dynein Axonemal Heavy Chain family, also causes a more severe phenotype in mice than in human patients carrying bi-allelic DNAH17 or DNAH8 loss-of-function mutations. In knockout mice, sperm counts are lower and the proportion of sperm with abnormal morphology is higher, whereas the phenotypes in human patients tend to be milder. These observations suggest that orthologs may influence spermatogenesis to slightly different extents in humans and mice. We plan to investigate the mechanisms underlying these discrepancies in future studies, which will provide deeper insights into axonemal assembly and the evolutionary aspects of spermatogenesis. Thank you again for bringing up this important issue.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This reviewer is impressed by the study's depth and the extent of the methodology used. The study is well-designed, and the results are very interesting. The reviewer's enthusiasm was reduced by the lack of some controls (provided that the reviewer did not miss them). Below are point-by-point suggestions that this reviewer believes will increase the merit of the present study.

      Title:

      (1) Why a "special" dynein? What makes it special when compared to other dyneins? I suggest removing the word special.

      Through phylogenetic and protein domain analyses of the DNAH family, we found that DNAH12 is the shortest member and the only one that lacks a typical microtubule-binding domain (MTBD), which is why we described it as a “special” dynein. Following your valuable suggestion, we have removed the word from the title.

      Abstract:

      (2) L23: same as above, why special?

      We identified DNAH12 as the shortest member of the DNAH family and uniquely lacking the typical microtubule-binding domain (MTBD). This distinct characteristic prompted us to describe it as a 'special' dynein in the abstract part.

      (3) L37: the reviewer did not find a figure (neither main nor supplementary) that would demonstrate the proper organization of microtubules in cilia. Figure S11 only shows the presence of cilia in DNAH12-/- mouse. A TEM image of cilia is required to confirm or reject the claim that DNAH12 does not play a crucial role in proper microtubule organization in cilia.

      We have now added TEM images of cilia in wild-type and Dnah12<sup>-/-</sup> mice. The ultrastructures of the ciliary axonemes were comparable in the wild-type and Dnah12<sup>-/-</sup> groups, suggesting that DNAH12 may not play a crucial role in proper microtubule organization. The results have now been added to Supplemental Figure 11F.

      (4) L122-6: Did the authors also confirm these structures by cryo-EM? If not, this needs to be pointed out as a shortcoming in the discussion, that the structures and interactions are predicted in silico only.

      Thank you for your comment. Due to resource limitations, we did not perform cryo-EM to confirm these structures. We will pursue the structural details at atomic resolution in further studies. We understand this point and have now addressed it as a shortcoming in the Discussion.

      (5) L134: Be more specific about what characteristics of DNAH12 were analyzed.

      Thank you for your comment. We have now updated this in the Methods section. The characteristics of DNAH12 that were analyzed include its regional immunogenicity, hydrophilicity, surface leakage groups, and sequence homology.

      (6) L137: Be more specific about how the antibodies were validated. Were the antibodies validated for both immunofluorescence and western blotting? I suggest doing peptide blocking of the antibody, for instance for ICC, preincubation of ab with immunizing peptide followed by primary ab incubation with studied cells/tissues.

      Thank you for your comments and suggestions. We validated the antibodies for both immunofluorescence and western blotting to ensure their effectiveness in our experiments. The two anti-DNAH12 antibodies were developed with the help of Dia-An Biotech in 2022. We attempted to obtain the polypeptide fragments of the target protein for peptide blocking, but the material had been disposed of after the service. Fortunately, the target DNAH12 band gave a strong signal in western blotting and was not detected in the knockout group; likewise, the DNAH12 immunofluorescence signals were strong but absent in the knockout group. In addition, the IP experiment showed that the rabbit antibody immunoprecipitated the DNAH12 band in Dnah12<sup>+/+</sup> mice but not in Dnah12<sup>-/-</sup> mice. Collectively, these data support the specificity of the DNAH12 antibodies. We sincerely appreciate your suggestion and will request the peptide material if we develop new antibodies.

      (7) L142: This reviewer is unfamiliar with using TRIzol for sperm protein extraction. Is there a specific reason for not using PAGE loading buffer for human sperm protein extraction?

      Thank you for your question. TRIzol reagent can be used for small samples (5×10<sup>6</sup> cells) as well as large samples (>10<sup>7</sup> cells), and it allows RNA and proteins to be extracted at the same time. Our lab has used this method in previous work (Hum Reprod Open. 2023; PMID: 37325547; PMCID: PMC10266965). It is very useful for processing small amounts of valuable samples. SDS sample buffer (PAGE loading buffer) was added to the extracted human sperm protein before SDS-PAGE separation. We have added this detail to the Methods section and apologize for the confusion.

      (8) L144: Were these the final concentrations of the SDS loading buffer? 1 × Laemmli buffer contains 62.5 mM TRIS, 2% (w/w) SDS, 10 % (w/v) glycerol, and 5% 2-mercaptoethanol. Please, amend accordingly.

      Thanks for your suggestions. We apologize for the incorrect labelling of the concentrations (the previous values were for 3× SDS loading buffer). We have now amended the SDS loading buffer to 1 × Laemmli buffer as suggested.
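      For readers who want to check the arithmetic behind this correction, the following is a minimal sketch of the 3× to 1× conversion. It assumes the 1× Laemmli composition quoted in the reviewer's comment (8); the function names and the 30 µL example volume are hypothetical and are not taken from the manuscript.

```python
# 1x Laemmli composition as quoted in the reviewer's comment (8).
LAEMMLI_1X = {
    "Tris (mM)": 62.5,
    "SDS (%)": 2.0,
    "glycerol (%)": 10.0,
    "2-mercaptoethanol (%)": 5.0,
}


def scale_buffer(composition, factor):
    """Return the composition of a stock concentrated `factor`-fold."""
    return {name: value * factor for name, value in composition.items()}


def stock_and_sample_volumes(final_volume_ul, stock_factor):
    """Volumes (stock, sample) in uL that give 1x buffer in the final mix."""
    stock = final_volume_ul / stock_factor
    return stock, final_volume_ul - stock


# Composition a 3x stock would need so that it is 1x after dilution.
stock_3x = scale_buffer(LAEMMLI_1X, 3)

# Hypothetical 30 uL loading volume: one part 3x stock, two parts sample.
stock_ul, sample_ul = stock_and_sample_volumes(30.0, 3)
```

      Under these assumptions, a 3× stock carries three times each 1× concentration, and one part stock is mixed with two parts sample.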

      (9) L151: Table S2 contains other homemade antibodies than DNAH12. Please, include references to the studies where the generation and validation of these antibodies is described.

      Thank you for your suggestions. We have developed a DNAH1 antibody for use in Western blot assays, with its generation and validation detailed in Frontiers in Endocrinology (Lausanne), 2021 (PMID: 34867808; PMCID: PMC8635859). Additionally, we have produced a DNAH17 antibody for both immunofluorescence (IF) and Western blot, as described in Journal of Experimental Medicine, 2020 (PMID: 31658987; PMCID: PMC7041708). These references have now been included.

      (10) L167: Please, spell out ICR at its first appearance.

      Done as suggested. Thank you. The full name of ICR is Institute of Cancer Research.

      (11) L169: This reviewer is confused. It seems that the mouse encodes DNAH12 on exons 5 and 18 simultaneously. Each mouse model has only one exon targeted for a knockout. Would not this mean that the expression of DNAH12 in both models is not completely knocked down? Please, give more background in this paragraph for those less familiar with CRISPR/Cas9.

      Thank you for your insightful comment. We appreciate your attention to detail. To clarify, while the mouse does indeed encode DNAH12 on exons 5 and 18 simultaneously, we specifically targeted the key exon 5 or exon 18 in each model to achieve different knockout strategies. This approach allows us to assess the functional implications of any remaining DNAH12 expression in both models. We checked DNAH12 expression in both models, and DNAH12 protein was undetectable in each, indicating that DNAH12 was completely knocked out in both. Additionally, we will revise the manuscript to include further details on the CRISPR/Cas9 methodology, ensuring accessibility for readers less familiar with this technique. Thank you again for your valuable feedback, which we believe will greatly enhance our manuscript.

      (12) L201: 50 % PBS? As in 0.5 x concentrated PBS? Please, rewrite for clarity.

      The term "50% PBS" refers to a 1:1 dilution of phosphate-buffered saline (PBS) with an appropriate diluent, resulting in a final concentration of 0.5x PBS. We will revise the text to explicitly clarify this, ensuring it is clear to all readers. Thank you for highlighting this point.

      (13) L224: Please, state what beads those were (magnetic/agarose, conjugated to protein A/G...) Include catalog # and manufacturer.

      Thank you for your suggestion. We have updated the manuscript to include this information. The beads used were Protein A/G Magnetic Beads (Catalog #B23202, Bimake, Texas, USA).

      (14) L227: What was the reason for adding a proteasomal inhibitor? What concentration was used? Please, add this information to the text.

      We added MG132 in the cell immunoprecipitation (IP) experiments to inhibit proteasomal activity, thereby preventing degradation of the target protein. This helps maintain the stability of the target protein during the experiment (Sci Adv. 2022. PMID: 35020426; PMCID: PMC8754306), enhancing its detectability in subsequent analyses. MG132 was used at 5 μM. We have added this information to the revised manuscript.

      (15) L233: in vivo IP of mouse testis lysate? This does not make sense. I suggest removing "in vivo".

      Thank you for your careful review and comments on our manuscript. We have modified as suggested.

      (16) L317: Supplemental Figure 6 precedes Supplemental Figure 5 in the text, which is neither logical nor orderly.

      Thank you for your suggestion. Since the N-terminal DNAH12 antibody is already described in the Methods section (L317), we propose removing Supplemental Figure 6 from the content to improve the logical flow and maintain an orderly presentation.

      (17) L345 and elsewhere: how did the authors quantify the decrement of the signal? This needs to be measured objectively.

      Thank you for your valuable suggestion. We quantified the signal intensity using Fiji (Nat Methods. 2012. PMID: 22743772; PMCID: PMC3855844), which allows for precise analysis of pixel intensity. The results are presented in the figures to effectively illustrate the decrement in signal intensity. We appreciate your suggestion, and we have provided a description of the method in our methodology section.
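      As an illustration of what such an intensity measurement involves, the sketch below computes a background-corrected mean intensity per region of interest and normalizes it to the control mean. This is an assumption about a typical workflow, not the authors' Fiji procedure; the function names and pixel values are hypothetical.

```python
from statistics import mean


def corrected_intensity(roi_pixels, background_pixels):
    """Mean ROI intensity minus mean background intensity, floored at 0."""
    return max(mean(roi_pixels) - mean(background_pixels), 0.0)


def relative_signal(sample, control):
    """Express each sample value as a fraction of the mean control value."""
    control_mean = mean(control)
    return [value / control_mean for value in sample]


# Hypothetical pixel values, for illustration only.
wt = [
    corrected_intensity([120, 130, 125], [20, 22, 18]),
    corrected_intensity([110, 115, 120], [20, 20, 20]),
]
ko = [corrected_intensity([40, 45, 50], [20, 22, 18])]

# Knockout signal as a fraction of the mean wild-type signal.
ko_relative = relative_signal(ko, wt)
```

      Reporting the signal relative to the control mean makes a reduction directly comparable across images acquired with the same settings.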

      (18) L371: I recommend: ...and elongated spermatids; the abnormal...

      Done as suggested. Thank you.

      (19) L412-4: Cilia in both Dnah12<sup>mut/mut</sup> and Dnah12<sup>-/-</sup> are developed, but are they motile or immotile? This needs to be investigated. Is the DNAH12 in cilia truncated while still fulfilling its function?

      Thanks for your comment. We checked ciliary motility using an inverted microscope, and no significant difference in ciliary motility was observed between the knockout and control groups. These results indicate that ciliary motility is not affected by DNAH12 deficiency. The N-terminal DNAH12 antibody was developed to detect whether a truncated protein is present in mouse tissues, and we did not detect DNAH12 signals by immunofluorescence on trachea sections of Dnah12<sup>-/-</sup> mice. These results indicate that DNAH12 may exert little influence on cilia, compared with its important function in flagella.

      (20) L414-6: The results do not support this claim as the authors do not show that cilia are motile.

      Thanks for your comment. Supplemental Videos 3-4, showing live trachea from Dnah12<sup>+/+</sup> and Dnah12<sup>-/-</sup> mice, have been uploaded to support this conclusion.

      (21) L421-3: Did the authors perform a negative test, where they let the testis lysate interact with beads/IgG only and performed the MS to identify non-specific binding? This is a crucial specificity test for this approach.

      We have performed a negative test. We included an IgG group in the IP-mass spectrometry experiment to exclude non-specific binding, and the experimental design is described in Figure 6B. The raw data were deposited in the iProX partner repository (PXD051681); we have asked the repository manager to make the dataset public soon so that it will be visible to readers.

      (22) L462: same as #18 the authors need to show that cilia are also motile. The mere presence of cilia in DNAH12-/- as shown in Fig S11C&D is not sufficient to conclude that the mice do not manifest PCD symptoms.

      Thanks for your comment. We did not observe obvious differences between the cilia of Dnah12<sup>+/+</sup> and Dnah12<sup>-/-</sup> mice. Supplemental Videos 3-4, showing live trachea from Dnah12<sup>+/+</sup> and Dnah12<sup>-/-</sup> mice, have been uploaded to show ciliary motility in the trachea.

      (23) L529: MTBD region instead of domain, as "domain" is already part of the abbreviation.

      Done as suggested

      (24) L875: Sperm is both the singular and plural form. Spermatozoon vs spermatozoa can be used where the distinction between singular and plural needs to be made.

      Thanks for your suggestion. We have checked and changed this usage.

      (25) Figure 3H: Is there a specific reason why P11 is not shown?

      Because only a limited number of smear slides from P11 were available, P11 was not previously stained with the DNAH17 antibody. We have now performed the experiment, which showed that DNAH17 expression was not affected in patient P11. We have added this result to Figure 3H.

      (26) Figure 8H: The authors in their MS do not describe what is happening to N-DRC proteins, yet they suggest in their model that it's unaffected in the mutant mouse/human. Please, address this in the MS and clearly state in the model that N-DRC needs further exploration in future studies.

      Thanks for your suggestion. We checked the MS data but did not observe enrichment of nexin-dynein regulatory complex (N-DRC) proteins; only one known N-DRC protein, DRC1, was present, with just 1 unique peptide. Instead, enrichment of inner dynein arm proteins and radial spoke proteins was observed. However, we cannot determine whether the N-DRC structures are affected. We have stated this in the Discussion and will pursue it with high-resolution techniques such as cryo-EM in the future.

      (27) Figure 5F: Is it possible to choose a different Dnah12<sup>-/-</sup> spermatozoon to see a reduced level of DNALI1 so that it corresponds with the WB detection in Fig 5B?

      Thanks for your suggestion. We have chosen a Dnah12<sup>-/-</sup> spermatozoon with faint remnants of the DNALI1 signal as the representative picture.

      (28) Figure S2 and elsewhere: How were the authors able to resolve and calibrate a 356 kDa protein using SDS-PAGE? Agarose gel electrophoresis is more suitable for resolving high-molecular-weight proteins. Most protein standards go no higher than 250 kDa.

      We have found that high-molecular-weight proteins (such as the 356 kDa DNAH12) can be resolved on 4-12% gradient polyacrylamide gels when appropriate voltages and longer run times are used to improve resolution. The DNAH12 protein was calibrated using a HiMark™ Pre-Stained High Molecular Weight Protein Standard (30-460 kDa). We have now updated the blot images to show the size of the DNAH12 protein (Fig S6B). The target band is clearly visible between 268 kDa and 460 kDa, which makes it easy to identify the DNAH12 band elsewhere. Thanks for your suggestion.

      (29) Figure S5: similar to #24: Why are P10 and P11 not shown?

      Because only a limited number of smear slides from P10 and P11 were available, we did not previously stain them with the ODF2 antibody. We have now performed the experiments, which showed that ODF2 expression was not affected in patient P10 or P11. We have added this result to Figure S5.

      (30) Figure S6B: The specificity of the anti-DNAH12 antibody against mouse DNAH12 seems to be questionable since the authors detect multiple bands on WB. I recommend doing peptide blocking to show that these are non-specific binding as opposed to off-target binding.

      Thank you for your comments. To design the antibody, we analyzed the sequence features of the DNAH12 protein and selected amino acids 1-200 as the ideal antigen based on four criteria: (1) high immunogenicity; (2) high hydrophilicity; (3) good surface leakage groups; and (4) sequence homology analysis to avoid unspecific recognition of other proteins. The two anti-DNAH12 antibodies were developed with the help of Dia-An Biotech in 2022. We attempted to obtain the polypeptide fragments of the target protein for peptide blocking, but the material had been disposed of after the service. Fortunately, the target DNAH12 band gave a strong signal in western blotting and was not detected in the knockout group; likewise, the DNAH12 immunofluorescence signals were strong but absent in the knockout group. In addition, we confirmed that the in-house rabbit antibody is suitable for IP: it immunoprecipitated the DNAH12 band in Dnah12<sup>+/+</sup> mice but not in Dnah12<sup>-/-</sup> mice. Collectively, these data support the specificity of the DNAH12 antibodies. We appreciate your suggestion and will request the peptide material if we develop new antibodies.

      Reviewer #2 (Recommendations For The Authors):

      Recruitment of DNAH1 and DNALI1 to the flagella is dependent on DNAH12 expression, according to the data. What would be the mechanism that locates DNAH12 which lacks MTBD to the flagella?

      Thank you for your insightful question. We are currently investigating the mechanisms that facilitate the loading of DNAH12 to the flagella. Based on existing data, we hypothesize that CCDC39 and/or CCDC40 may play a critical role in the recruitment of DNAH12 to sperm flagella during spermiogenesis (Nat Genet. 2011, PMID: 21131972; PMCID: PMC3509786; Nat Genet. 2011, PMID: 21131974; PMCID: PMC3132183). Furthermore, a structural study by Walton et al. showed that DNAH12 associates with CCDC39/CCDC40 proteins (Nature. 2023, PMID: 37258679; PMCID: PMC10266980). These findings suggest that CCDC39 and/or CCDC40 may play a role in facilitating the localization of DNAH12 to the flagella. Additional studies are needed to identify other potential factors involved in this process and to further elucidate the mechanisms underlying this complex biological phenomenon.

    1. Reviewer #1 (Public review):

      Summary:

      The paper demonstrated through a comprehensive multi-omics study of the oviduct that the transcriptomic and proteomic landscape of the oviduct at 4 different preimplantation periods was dynamic during natural fertilization, pseudopregnancy, and superovulation using three independent cell/tissue isolation and analytical techniques. This work is very important for understanding oviductal biology and physiology. In addition, the authors have made all the results available in a web search format, which will maximize the public's access and foster and accelerate research in the field.

      Strengths:

      (1) The manuscript addresses an important and interesting question in the field of reproduction: how does the oviduct at different regions adapt to the sperm and embryos for facilitating fertilization and preimplantation embryo development and transport?

      (2) Authors used cutting-edge techniques: Integrated multi-modal datasets followed with in vivo confirmation and machine learning prediction.

      (3) RNA-seq, scRNA-seq and proteomic results are immediately available to the scientific community in a web search format.

      (4) Substantiated results indicate the source of inflammatory responses was the secretory cell population in the IU region when compared to other cell types; sperm modulate inflammatory responses in the oviduct; the oviduct displays immuno-dynamism.

      In addition, the revised version has addressed weaknesses adequately.

      (1) The revised version provided a clear explanation and the rationale for using the superovulation model.

      (2) The revised version generated a graphic abstract/summary of their major findings.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2024-02788

      Corresponding author(s): Kazuhiro, Aoki and Yuhei, Goto

      1. General Statements [optional]

      We sincerely thank all reviewers for their insightful comments and constructive suggestions, which have substantially improved our manuscript. We provide point-by-point responses to each comment and have added detailed explanations in the preliminary revised manuscript. The reviewers' comments are shown in dark blue italics, followed by our responses.

      2. Description of the planned revisions

      Reviewer #1

      Major Concerns

      • Fig. 3G, Cdc2-miRFP670 levels appear to drop after cell division, which is a surprising observation because Cdc2 is generally considered stable. This could be an imaging artifact because the level recovers quickly after division. The authors should substantiate their findings with a western blot analysis of tagged vs untagged proteins. Additionally, the authors should test whether endogenously tagging Cdc2 and Cdc13 causes any cell cycle phenotypes.

      While Cdc2 protein levels are indeed stable in whole cells as you noted, we specifically measured nuclear Cdc2-miRFP670 levels. A previous study has shown that nuclear Cdc2 levels fluctuate throughout the cell cycle, increasing during interphase and decreasing during mitosis (Curran et al., 2022). This known behavior of nuclear Cdc2 is consistent with our observation.

      To address your concerns about potential artifacts from fluorescent protein tagging to endogenous Cdc2 and Cdc13, we will perform two additional experiments:

      1. Compare protein expression levels between wild-type and fluorescently tagged strains for Cdc2 and Cdc13 using western blot analysis.
      2. Examine whether the fluorescent tags affect cell cycle progression by measuring cell cycle duration in tagged versus untagged strains using time-lapse imaging.

      • The authors explore a panel of red-fluorescent proteins to identify those with the best photobleaching properties. Conducting a similar review with a panel of green fluorescent proteins would significantly enhance the manuscript. It would be particularly helpful to test the properties of the new StayGold fluorescent protein.

      Thank you for this valuable suggestion. We will expand our photobleaching analysis to include green fluorescent proteins, specifically mEGFP and the recently developed mStayGold as well as mNeonGreen. These measurements will be conducted under identical experimental conditions to our red fluorescent protein analysis, allowing for direct comparison of their photostability properties. This additional data will provide a more comprehensive evaluation of fluorescent protein options for FCCS.

      • In both yeast and mammalian experiments, the green fluorophore is consistently fused to the cyclin and the far-red fluorophore to Cdk1. The authors should include an FCCS control reversing the fluorophores in at least one experiment to verify whether comparable Kd values are obtained.

      We plan to conduct FCCS measurements with reversed fluorophore combinations in HeLa cells to validate our experiments. Specifically, we will compare Kd values between:

      1. cyclin D1-miRFP670 and CDK4-mNG pair versus cyclin D1-mNG and CDK4-miRFP670 pair
      2. cyclin D3-miRFP670 and CDK6-mNG pair versus cyclin D3-mNG and CDK6-miRFP670 pair

      We also plan to do this in fission yeast cells, comparing Kd values between the Cdc13-miRFP670 and Cdc2-mNG pair versus the Cdc13-mNG and Cdc2-miRFP670 pair.

      Reviewer #2

      Section A

      Major Comments

      (ii) For the characterisation of the cell cycle dependent expression of Cdc13 and its association with Cdc2, the level of Cdc13 EGFexpression is used to identify cell cycle stage. It would be appropriate to have an independent measure of cell cycle stage (?cell length). In using Cdc13 to identify cell cycle stage, please define the criteria used ie what level of Cdc13-mNG fluorescence intensity was used to define G1 vs S vs G2?

      We would like to thank you for raising these important comments and suggestions about cell cycle stage determination. We agree that using Cdc13-mNG levels alone as a cell cycle marker requires more rigorous validation. We will incorporate cell length measurements as an independent cell cycle stage indicator for FCCS measurements. However, it is important to note that traditional cell cycle stage classification is limited in fission yeast because of its unique cell cycle characteristics: a brief G1 phase, an S phase that proceeds during cell separation, and an extended G2 phase. Cdc13 expression remains at undetectable levels during G1 and S phases, which inevitably restricts our FCCS measurements to G2 and M phases. G2 and M phase cells can be distinguished by the characteristic relocalization of Cdc2 and Cdc13 to the mitotic spindle during M phase (Sugiyama et al., 2024). In the revised manuscript, we will present the FCCS data with both quantitative (cell length) and qualitative (G2 and M phase localization pattern) indicators for more precise cell cycle staging.

      (iii) Include a control experiment to compare the level of Cdc13 expression in untagged wild-type cells vs the Cdc13-mNG, CDK1- miRFP670 expressing cells to confirm that tagging does not affect Cdc13 expression, cell cycle duration or Cdc13 function.

      We agree with the reviewer's comment, which suggests validation of the functionality of tagged proteins. We will perform two key control experiments:

      1. Compare Cdc13 protein expression levels between wild-type cells and cells expressing Cdc13-mNG and Cdc2-miRFP670 using western blot analysis with anti-Cdc13 antibody.
      2. Measure cell cycle duration in both strains through time-lapse microscopy to assess any potential effect of the fluorescent tags on cell cycle progression.

      Major points

      (ii) Please provide the confidence interval for the data fit for each CDK-cyclin pair. In panel Figure 4I, the results are represented as a heat map to define the Kd for each CDK-cyclin pair. This panel suggests that the technique can sensitively distinguish alternative CDK-cyclin complexes where their Kd values differ in 1 uM increments. The heat map is presented with block colours, but the key to the color coding is a graded color scheme and it is not possible to move between the two. This disconnect has to be addressed. The accompanying text on pages 18 and 19 is a qualitative description of the results, a comparative and quantitative analysis of the data (Kd values with accompanying confidence intervals) has to be included to justify the apparent strength of the technique to discriminate different CDK-cyclin pairs that Figure 4 implies.

      Thank you for highlighting the need for more rigorous statistical analysis. We will calculate and add the confidence intervals for all Kd values of each cyclin-CDK pair.

      (iii) For "low affinity" interactions that are determined to be >10 uM, please define how this value was calculated. Would it be more appropriate to say a value could not be determined as the data could not be fitted?

      We appreciate the reviewer's valuable comment regarding the determination of low affinity interactions. As mentioned above, we are currently calculating confidence intervals for our curve fitting analyses across all measurements. Based on these statistical analyses, we will carefully evaluate the reliability of the >10 µM designations and revise our descriptions accordingly in the manuscript to ensure accurate representation of the binding parameters.
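      To illustrate how a confidence interval on a fitted Kd can be obtained, the sketch below fits Kd = [A][B] / [AB] by least squares through the origin and derives a percentile bootstrap interval by resampling cells. This is a generic approach under stated assumptions, not the authors' fitting procedure; the function names and per-cell concentrations are hypothetical.

```python
import random


def fit_kd(cells):
    """Least-squares slope through the origin of [A][B] against [AB].

    Each cell is (free_A, free_B, complex) in the same concentration unit,
    so the slope carries that unit and estimates Kd = [A][B] / [AB].
    """
    numerator = sum(a * b * c for a, b, c in cells)
    denominator = sum(c * c for _, _, c in cells)
    return numerator / denominator


def bootstrap_ci(cells, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the fitted Kd."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    estimates = sorted(
        fit_kd([rng.choice(cells) for _ in cells]) for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical per-cell concentrations in uM: (free A, free B, complex).
cells = [
    (0.5, 0.4, 0.21), (0.8, 0.6, 0.47), (0.3, 0.5, 0.16),
    (1.0, 0.9, 0.88), (0.6, 0.7, 0.43), (0.9, 0.5, 0.44),
]
kd = fit_kd(cells)
low, high = bootstrap_ci(cells)
```

      With an interval of this kind, a ">10 µM" designation could be checked against the width of the interval: if the interval is very wide or unbounded, reporting the value as not determinable may be more accurate.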

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1

      Major Concerns

      • The authors extensively characterize the Kd of cyclin/Cdk pairs using overexpressed proteins. This approach is problematic due to the heterogeneous expression levels associated with transient expression and competition between overexpressed proteins and endogenous proteins. Variable expression levels are a concern because of the limiting rate of T-loop phosphorylation on Cdks (Merrick et al., 2008), which is required to stabilise cyclin/Cdk complexes. While the authors acknowledge the competition between exogenous and endogenous proteins, they do not take into account the cell cycle-dependent fluctuation of cyclin levels. For instance, in cells with low levels of endogenous Cyclin B1 (S-phase), competition with overexpressed Cyclin B1 will have less impact on cross-correlation measurements compared to cells with high endogenous Cyclin B1 (G2-phase).

      These issues severely affect the relevance of this dataset. Indeed, the reported measurements differ by at least an order of magnitude from the Kd values obtained through biochemical methods or FCCS with endogenously tagged proteins. Moreover, the data partially diverge from the literature; for example, Cdk1 is known to form unconventional complexes with Cyclin Ds and Es.

      We acknowledge the important issues about the limitations of using overexpressed proteins for Kd measurement. Indeed, several factors affect the reliability of our measurements. First, competition between overexpressed and endogenous proteins varies throughout the cell cycle due to cell cycle-dependent fluctuations in endogenous cyclin levels. We previously analyzed the effects of overexpression on in vivo Kd measurements with FCCS (Sadaie, Mol Cell Biol, 2014), showing that not only endogenous proteins but also competitive binding proteins affect Kd values quantified in living cells. Second, variable expression levels from transient transfection may impact T-loop phosphorylation of CDKs, which is known to be rate-limiting (Merrick et al., 2008). We have expanded our discussion to address these limitations and their implications for interpreting the cyclin-CDK binding affinities (page 25, line 16-18). We also note that our overexpression experiments may not fully capture the formation of previously reported unconventional complexes, such as those between CDK1 and D- or E-type cyclins (Koff et al. 1992; Zhang et al. 1993) (page 26, line 8-10).

      • Fig. S3A, Cyclin E levels are shown to persist into mitosis, whereas endogenous Cyclin E is degraded in late S and G2 phases. This is likely to be caused by over-expression and the authors should comment on this.

      We agree that the observed persistence of Cyclin E into mitosis differs from the known behavior of endogenous Cyclin E, which is typically degraded during late S and G2 phases. This discrepancy is likely due to our overexpression system overwhelming the normal degradation machinery. In the revised manuscript, we have explicitly acknowledged this limitation and discussed how overexpression may alter the typical cell cycle-dependent regulation of cyclin proteins (page 26, line 12-16). This observation further highlights the importance of considering expression levels when interpreting protein-protein interaction data from overexpression systems.

      Minor Comments

      • The authors should reference relevant studies from Jan Ellenberg's lab on FCS (e.g., Wachsmuth et al., 2015; Cai et al., 2018).

      Thank you for your suggestion. We have cited these two papers in the Introduction (page 6, line 5-8).

      • The statement, "In order to perform FCCS in a reproducible manner, we are trying to find a better fluorescent protein pair that is bright, crosstalk-free, and highly resistant to photobleaching," would be improved by removing the word "better".

      We removed the word "better".

      • In Fig. 1C, F, G, and H, the colour codes are difficult to read and should be improved.

      We have changed the color codes to make them easy to distinguish.

      • The paragraph discussing Fig. 3 states: "We used a fission yeast strain that expressed SynPCB2.1 under the control of the adh promoter," raising the question of how emiRFP670 was imaged in earlier experiments.

      We apologize for the unclear description. All experiments involving miRFP670 imaging, including those in Figure 1, were performed using fission yeast cells expressing SynPCB2.1 under the control of the adh1 promoter. We have clarified these important experimental details in the revised manuscript under the section "miRFP670, a near-infrared fluorescent protein, is suitable for simultaneous imaging with mNeonGreen."

      • The authors estimate the volume of a mammalian cell as approximately 5 pL. This estimate requires a supporting reference or experimental data. Additionally, it would be helpful to specify which cell type was considered and at which cell cycle stage this estimate applies.

      Our cell volume estimate was based on HeLa cells, as reported in our previous work (Aoki, PNAS, 2011). In that study, total cell volume was determined using differential interference contrast microscopy, while nuclear volume was measured through Hoechst 33258 fluorescence imaging. While we reported average volumes from 20 cells, we acknowledge that the cell cycle stage was not specified in our measurement. We have added these experimental details to the revised manuscript (page 15, line 7-9), noting that cell volumes vary with cell cycle stage.

      • Including page and/or line numbers would facilitate future revisions.

      We have added page numbers and line numbers throughout the revised manuscript.

      Reviewer #2

      Section A

      Major Comments

      (i) Materials and Methods: Page 10 "The fitting process was constrained by initial estimates and bounded by physically reasonable limits." Please define "physically reasonable limits".

      We apologize for not providing sufficient details about the fitting constraints. In the revised Materials and Methods section (page 11, line 20-21, and page 13, line 8-9), we have specified the initial parameter estimates and the boundary conditions used in our fitting process. These include explicit numerical values for all parameters and the physical reasoning behind each constraint.

      Minor points

      (i) Figure 1. Panels C, F, G and H. Please improve color palette to distinguish the overlapping traces. It might be helpful to remove the edge grey and broaden the color spectrum for visual inclusion (eg straw/blue vs green/red). Could the statement "As expected, mNG exhibited tolerance to the photobleaching when excited at low laser power (< 5%) (Fig. 1C)." be supported by additional labelling on the figure panel.

      We have changed the color palette to make the traces easy to distinguish.

      Section B

      Major points

      (i) In analysing the data, the model assumes that the monomeric CDK and cyclin subunits are either bound to form a binary complex or not. Can the authors discuss whether this can be presumed to be the case when they present the results. Either the labelled proteins are overexpressed to such a level that it can be presumed in the data handling that they are behaving as monomeric proteins and the resulting derived Kds reflect binary CDK-cyclin interactions. However, within the cell, the situation is more complex, and both CDKs and cyclins will most likely (and dependent on identity) be variably associated with multiple alternative protein partners. Can such effects be discounted in the analysis presented here and what would be the experimental grounds to do so. The authors make note of this fact in the discussion when they note that the results presented in this manuscript differ by circa an order of magnitude for the CDK1-cyclin B1 pairing reported by Pines et al using endogenously labelled proteins. They suggest that the discrepancy might result in part from competition from endogenously unlabelled proteins. This discrepancy has to be addressed.

      We acknowledge this important point about the complexity of cyclin-CDK interactions in the cellular context. Our current analysis, which assumes simple binary interactions between overexpressed proteins, has several limitations, as the reviewer suggested:

      1. As demonstrated by the Pines laboratory's work with CDK1-cyclin B1 FCCS, the dissociation constant can vary throughout the cell cycle, suggesting regulation by additional factors.
      2. Both cyclins and CDKs interact with multiple binding partners in cells, which the binary-interaction analysis does not account for.
      3. Overexpression of exogenous proteins may alter the balance of these interactions.

      While our previous studies (Sadaie, MCB, 2014; Komatsubara, JBC, 2019) cited in the manuscript have addressed similar considerations, we agree that this aspect requires more thorough explanation. We have expanded our explanation in the results section (page 16, line 26 to page 17, line 8) and the discussion (page 26, line 7-23).

      (iv) Previous work from the Pines lab using FCS and FCCS to measure the binding of CDK1 to cyclin B1 in RPE-1 cells reported not only a higher affinity for the pair but also that their apparent affinity was dependent on cell cycle stage suggesting that their assembly might be multi-stepped. Both affinity and cell cycle dependency of CDK-cyclin pairings are of great interest to scientists working in the cell cycle field. It could be argued that measurements of the affinities of multiple CDK-cyclin pairs each "averaged out" over the cell cycle will have less impact on the field than a few well-chosen CDK-cyclin pairs characterised in greater depth.

      We acknowledge the limitations of the current approach that averages dissociation constants across the cell cycle. The Pines laboratory's work revealed cell cycle-dependent variations in the dissociation constant for Cyclin B1-CDK1, suggesting complex regulation beyond simple binary interactions. These variations likely reflect both changes in cyclin expression levels and the involvement of additional regulatory factors throughout the cell cycle. While our comprehensive survey of multiple cyclin-CDK pairs provides a useful overview of relative binding preferences, we agree that a more focused analysis of selected pairs across different cell cycle stages would offer deeper mechanistic insights. We have expanded our discussion to address the significance of cell cycle-dependent changes in binding affinities and the potential role of additional regulatory factors as well as the trade-offs between breadth and depth in studying cyclin-CDK interactions (page 26, line 7-23).

      Minor Points

      (i) For both Figures 3 and 4 address red/green color pair choice.

      We have modified the color codes in Figures 3 and 4.

      Referee cross-commenting

      I would like to thank the other reviewer for their comments about requirements and possible control experiments for the use of the fluorescent probes.

      We agree that the use of tagged proteins overexpressed in cells to measure Kd values has significant limitations:

      (i) Competition between tagged and endogenous proteins

      (ii) Limiting factors that affect CDK-cyclin complex stability (PTMs and contributions from binding and assembly factors mentioned).

      (iii) Cell cycle dependent protein expression

      Points (ii) and (iii) are not applicable to all protein-protein pairs but are significant when trying to determine CDK-cyclin affinities.

      As mentioned above, we have expanded our discussion to address these limitations and their implications for interpreting the cyclin-CDK binding affinities (page 26, line 7-23).

      Ideally it would be demonstrated that this approach can return the established values for a limited subset of CDK-cyclin pairs in mammalian cells and so extrapolate the results from yeast cells where endogenous labelling was carried out.

      We are sorry, but we could not fully understand what the reviewer wanted to ask.

      We also have shared concerns about the data presentation in Figure 4.

      Following the suggestion, we have modified Figure 4.

      4. Description of analyses that authors prefer not to carry out

      Reviewer #2

      Major Comments

      (iv) Could the authors consider exploiting the tractability of yeast cells to block and release and/or genetic means to establish synchronous populations to improve data acquisition? This approach could also be employed to assess whether CDK1-cyclin B1 affinity changes with cell cycle stage (as was shown by Pines et al in RPE-1 cells) and would demonstrate that their approach is equally suitable to sensitively distinguish CDK-cyclin pairs in yeast cells.

      We appreciate the suggestion to analyze cell cycle-dependent changes in dissociation constants using synchronized cells. However, we have deliberately chosen not to use cell synchronization methods in fission yeast for several important reasons. During cell cycle arrest, cells continue to grow and synthesize proteins, leading to cell elongation and abnormal accumulation of Cdc13. These unphysiological perturbations are evidenced by the unusually rapid progression through the subsequent cell cycle following release. Such conditions deviate significantly from normal cellular physiology. One of the key advantages of FCCS is its ability to measure protein-protein interactions in individual, asynchronous cells. While traditional biochemical analyses require cell synchronization to obtain population-averaged measurements, they inherently suffer from the artifacts mentioned above.

      Instead, as described in (ii), we will utilize cell length as a natural indicator of cell cycle progression in fission yeast, allowing us to examine the relationship between cell cycle stage and Kd values while maintaining normal cellular physiology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In the first part of the manuscript the authors present a thorough description of the background and theoretical basis to the identification of a fluorescent pair that permits both FCS and FCCS measurements at the single cell level to enable the determination of Kd values between labelled protein pairs (Figures 1 and 2). The generation of the reagents and subsequent experimental details are thorough and would permit the experiments to be repeated. The first two sections are well argued and appropriately controlled.

      They then tag the endogenous S. pombe cdk1 and cdc13 genes at their 3' ends with sequences that encode miRFP670 (a near infrared fluorescent protein) and mNG (mNeonGreen) respectively and from measurements collected on 13 cells derive a mean Kd value calculated for each of the 13 cells of 0.31 ± 0.22 μM. They note that this value agrees with that reported by the Pines lab following labelling of cyclin B1 and CDK1 with genome editing in RPE-1/hTERT cells.

      The final part of the manuscript then extends the technique to a pair-wise analysis of 9 cyclins and 4 CDKs in a human cell line.

      Major Comments

      (i) Materials and Methods: Page 10 "The fitting process was constrained by initial estimates and bounded by physically reasonable limits." Please define "physically reasonable limits".

      (ii) For the characterisation of the cell cycle dependent expression of Cdc13 and its association with Cdc2, the level of Cdc13 expression is used to identify cell cycle stage. It would be appropriate to have an independent measure of cell cycle stage (?cell length). In using Cdc13 to identify cell cycle stage, please define the criteria used ie what level of Cdc13-mNG fluorescence intensity was used to define G1 vs S vs G2?

      (iii) Include a control experiment to compare the level of Cdc13 expression in untagged wild-type cells vs the Cdc13-mNG, CDK1-miRFP670 expressing cells to confirm that tagging does not affect Cdc13 expression, cell cycle duration or Cdc13 function.

      (iv) Could the authors consider exploiting the tractability of yeast cells to block and release and/or genetic means to establish synchronous populations to improve data acquisition? This approach could also be employed to assess whether CDK1-cyclin B1 affinity changes with cell cycle stage (as was shown by Pines et al in RPE-1 cells) and would demonstrate that their approach is equally suitable to sensitively distinguish CDK-cyclin pairs in yeast cells.

      Minor points

      (i) Figure 1. Panels C, F, G and H. Please improve color palette to distinguish the overlapping traces. It might be helpful to remove the edge grey and broaden the color spectrum for visual inclusion (eg straw/blue vs green/red). Could the statement "As expected, mNG exhibited tolerance to the photobleaching when excited at low laser power (< 5%) (Fig. 1C)." be supported by additional labelling on the figure panel.

      The manuscript then goes on to describe the measurement of Kds for 36 CDK-cyclin pairs in HeLa cells by overexpression of labelled CDKs and cyclins following transient overexpression by plasmid co-transfection. This last section of the manuscript requires significant revision.

      Major points

      (i) In analysing the data, the model assumes that the monomeric CDK and cyclin subunits are either bound to form a binary complex or not. Can the authors discuss whether this can be presumed to be the case when they present the results. Either the labelled proteins are overexpressed to such a level that it can be presumed in the data handling that they are behaving as monomeric proteins and the resulting derived Kds reflect binary CDK-cyclin interactions. However, within the cell, the situation is more complex, and both CDKs and cyclins will most likely (and dependent on identity) be variably associated with multiple alternative protein partners. Can such effects be discounted in the analysis presented here and what would be the experimental grounds to do so. The authors make note of this fact in the discussion when they note that the results presented in this manuscript differ by circa an order of magnitude for the CDK1-cyclin B1 pairing reported by Pines et al using endogenously labelled proteins. They suggest that the discrepancy might result in part from competition from endogenously unlabelled proteins. This discrepancy has to be addressed.

      (ii) Please provide the confidence interval for the data fit for each CDK-cyclin pair. In panel Figure 4I, the results are represented as a heat map to define the Kd for each CDK-cyclin pair. This panel suggests that the technique can sensitively distinguish alternative CDK-cyclin complexes where their Kd values differ in 1 μM increments. The heat map is presented with block colours, but the key to the color coding is a graded color scheme and it is not possible to move between the two. This disconnect has to be addressed. The accompanying text on pages 18 and 19 is a qualitative description of the results, a comparative and quantitative analysis of the data (Kd values with accompanying confidence intervals) has to be included to justify the apparent strength of the technique to discriminate different CDK-cyclin pairs that Figure 4 implies.

      (iii) For "low affinity" interactions that are determined to be >10 μM. Please define how this value was calculated. Would it be more appropriate to say a value could not be determined as the data could not be fitted?

      (iv) Previous work from the Pines lab using FCS and FCCS to measure the binding of CDK1 to cyclin B1 in RPE-1 cells reported not only a higher affinity for the pair but also that their apparent affinity was dependent on cell cycle stage suggesting that their assembly might be multi-stepped. Both affinity and cell cycle dependency of CDK-cyclin pairings are of great interest to scientists working in the cell cycle field. It could be argued that measurements of the affinities of multiple CDK-cyclin pairs each "averaged out" over the cell cycle will have less impact on the field than a few well-chosen CDK-cyclin pairs characterised in greater depth.

      Minor Points

      (i) For both Figures 3 and 4 address red/green color pair choice.

      Referee cross-commenting

      I would like to thank the other reviewer for their comments about requirements and possible control experiments for the use of the fluorescent probes.

      We agree that the use of tagged proteins overexpressed in cells to measure Kd values has significant limitations:

      (i) Competition between tagged and endogenous proteins

      (ii) Limiting factors that affect CDK-cyclin complex stability (PTMs and contributions from binding and assembly factors mentioned).

      (iii) Cell cycle dependent protein expression

      Points (ii) and (iii) are not applicable to all protein-protein pairs but are significant when trying to determine CDK-cyclin affinities.

      Ideally it would be demonstrated that this approach can return the established values for a limited subset of CDK-cyclin pairs in mammalian cells and so extrapolate the results from yeast cells where endogenous labelling was carried out.

      We also have shared concerns about the data presentation in Figure 4.

      Significance

      Technology: The paper describes a technical advance in identifying a fluorescent probe pair suitable for FCCS in living cells.

      Cell cycle: The ability of CDKs and cyclins to discriminate each other and pair to form complexes that characterise different cell cycle stages and drive progression has long been appreciated. The formation of non-cognate pairings when the cell cycle is perturbed has also been noted and a greater understanding of the in-cell affinities of all possible CDK-cyclin complexes would be a significant advance in our understanding. However, this manuscript currently does not (i) provide statistically validated measures of apparent differences in affinity between different CDK-cyclin pairs and (ii) address whether the measurements are cell cycle dependent. (iii) Interpretation of the results has to take into consideration that both the CDK and cyclin components are transiently overexpressed in cells and therefore the values that are measured are difficult to interpret in terms of CDK and cyclin function. These considerations would dampen interest in the findings by cell cycle biologists.

      Expertise: CDKs, cyclin, cell cycle biology.

      Non-expert in technical aspects of fluorescence microscopy

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Major Concerns

      1. Fig. 3G, Cdc2-miRFP670 levels appear to drop after cell division, which is a surprising observation because Cdc2 is generally considered stable. This could be an imaging artifact because the level recovers quickly after division. The authors should substantiate their findings with a western blot analysis of tagged vs untagged proteins. Additionally, the authors should test whether endogenously tagging Cdc2 and Cdc13 causes any cell cycle phenotypes.
      2. The authors explore a panel of red-fluorescent proteins to identify those with the best photobleaching properties. Conducting a similar review with a panel of green fluorescent proteins would significantly enhance the manuscript. It would be particularly helpful to test the properties of the new StayGold fluorescent protein.
      3. In both yeast and mammalian experiments, the green fluorophore is consistently fused to the cyclin and the far-red fluorophore to Cdk1. The authors should include an FCCS control reversing the fluorophores in at least one experiment to verify whether comparable Kd values are obtained.
      4. The authors extensively characterize the Kd of cyclin/Cdk pairs using overexpressed proteins. This approach is problematic due to the heterogeneous expression levels associated with transient expression and competition between overexpressed proteins and endogenous proteins. Variable expression levels are a concern because of the limiting rate of T-loop phosphorylation on Cdks (Merrick et al., 2008), which is required to stabilise cyclin/Cdk complexes. While the authors acknowledge the competition between exogenous and endogenous proteins, they do not take into account the cell cycle-dependent fluctuation of cyclin levels. For instance, in cells with low levels of endogenous Cyclin B1 (S-phase), competition with overexpressed Cyclin B1 will have less impact on cross-correlation measurements compared to cells with high endogenous Cyclin B1 (G2-phase). These issues severely affect the relevance of this dataset. Indeed, the reported measurements differ by at least an order of magnitude from the Kd values obtained through biochemical methods or FCCS with endogenously tagged proteins. Moreover, the data partially diverge from the literature; for example, Cdk1 is known to form unconventional complexes with Cyclin Ds and Es.
      5. Fig. S3A, Cyclin E levels are shown to persist into mitosis, whereas endogenous Cyclin E is degraded in late S and G2 phases. This is likely to be caused by over-expression and the authors should comment on this.

      Minor Comments

      1. The authors should reference relevant studies from Jan Ellenberg's lab on FCS (e.g., Wachsmuth et al., 2015; Cai et al., 2018).
      2. The statement, "In order to perform FCCS in a reproducible manner, we are trying to find a better fluorescent protein pair that is bright, crosstalk-free, and highly resistant to photobleaching," would be improved by removing the word "better".
      3. In Fig. 1C, F, G, and H, the colour codes are difficult to read and should be improved.
      4. The paragraph discussing Fig. 3 states: "We used a fission yeast strain that expressed SynPCB2.1 under the control of the adh promoter," raising the question of how emiRFP670 was imaged in earlier experiments.
      5. The authors estimate the volume of a mammalian cell as approximately 5 pL. This estimate requires a supporting reference or experimental data. Additionally, it would be helpful to specify which cell type was considered and at which cell cycle stage this estimate applies.
      6. Including page and/or line numbers would facilitate future revisions.
      7. Fig. 4I would benefit from providing actual Kd values alongside the color-coded representation.

      Significance

      In this study, Toyama and colleagues characterize a novel low-bleaching fluorophore pair to detect protein-protein interactions through FCCS. They demonstrate that while red-fluorescent proteins bleach rapidly, NeonGreen and iRFP670 are relatively stable over time and applicable to both yeast and mammalian cells. Furthermore, they apply their system to cyclin-Cdk pairs and describe a clever approach to enhance the brightness of iRFP670 in mammalian cells. The data are clear and the identification of suitable fluors for FCCS will be of value to the field; however, there are several major concerns that need to be addressed before publication.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      We sincerely appreciate the insightful and constructive comments from the reviewers, which have significantly enhanced the clarity and rigor of our manuscript.

      Reviewer #1

      Evidence, reproducibility and clarity

      The manuscript by Egawa and colleagues investigates differences in nodal spacing in an avian auditory brain stem circuit. The results are clearly presented and data are of very high quality. The authors make two main conclusions:

      1) Node spacing, i.e. internodal length, is intrinsically specified by the oligodendrocytes in the region they are found in, rather than axonal properties (branching or diameter).

      2) Activity is necessary (we don't know what kind of signaling) for normal numbers of oligodendrocytes and therefore the extent of myelination.

      These are interesting observations, albeit phenomenological. I have only a few criticisms that should be addressed:

      1) The use of the term 'distribution' when describing the location of nodes is confusing. I think the authors mean rather than the patterns of nodal distribution, the pattern of nodal spacing. They have investigated spacing along the axon. I encourage the authors to substitute node spacing or internodal length for node distribution.

      Response:

      Thank you for the suggestion, which helps avoid confusion. We now use the phrase "nodal spacing" instead of "nodal distribution" throughout the revised manuscript.

      2) In Seidl et al. (J Neurosci 2010) it was reported that axon diameter and internodal length (nodal spacing) were different for regions of the circuit. Can the authors help me better understand the difference between the Seidl results and those presented here?

      Response:

      As a key distinction, our study focuses specifically on the main trunk of the contralateral projection of NM axons. This projection features a sequential branching structure known as the delay line, where collateral branches form terminal arbors and connect to the ventral dendritic layer of NL neurons. This structural organization plays a critical role in influencing the dynamic range of ITD detection by regulating conduction delays along the NM axon trunk.

      The study by Seidl et al. (2010) is a pioneering work that measured the diameter of NM axons using electron microscopy, providing highly reliable data. However, due to the technical limitations of electron microscopy, which does not allow for the continuous tracing of individual axons, it is not entirely clear whether the axons measured in the ventral NL region correspond to terminal arbors of collateral branches or the main trunk of NM axons (see Figure 9E, F in their paper). Instead, they categorized axon diameters based on their distance from the NL cell layer, showing that axon diameter increases distally (see Figure 9G in their paper). Notably, the diameters of ventral axons located more than 120 μm away from the NL cell layer are almost identical to those in the midline.

      As illustrated in our Figure 4D and Supplementary Video 2, the main trunk of the contralateral NM projection is predominantly located in these distal regions. Therefore, our findings complement those of Seidl et al. (2010) rather than contradicting them. We made this point as clear as possible in the text (page 7, line 7).

      3) The authors looked only in very young animals - are the results reported here applicable only to development, or does additional refinement take place with aging?

      Response:

      In this study, we examined chick embryos from E9 to just before hatching (E21) and post-hatch chicks up to P9. Chickens begin to perceive sound around E12 and possess sound localization abilities at the time of hatching (Grier et al., 1967) (added to page 4, line 12). Therefore, by E21, the sound localization circuit is largely established.

      On the other hand, additional refinement of the circuit with aging is certainly possible. A key cue for sound localization, interaural time difference (ITD), depends on the distance between the two ears, which increases as the animal grows. As shown in Figure 2G, internodal length increased by approximately 20% between E18 and P9 while maintaining regional differences. Given that NM axons are nearly fully myelinated by E21 (Figure 4D, 6C), this suggests that myelin extends in proportion to the overall growth of the head and brain volume.

      Thus, our study covers not only the early stages of myelination but also the post-functional maturation in the sound localization circuit.

      4) The fact that internodal length is specified by the oligodendrocyte suggests that activity may not modify the location of nodes of Ranvier - although again, the authors have only looked during early development. This is quite different than this reviewer's original thoughts - that activity altered internodal length and axon diameter. Thus, the results here argue against node plasticity. The authors may choose to highlight this point or argue for or against it based on results in adult birds?

      Response:

      In this study, we demonstrated that although vesicular release did not affect internodal length, it selectively promoted oligodendrogenesis, thereby supporting the full myelination and hence the pattern of nodal spacing along the NM axons. We believe that this finding falls within the broader scope of 'activity-dependent plasticity' involving oligodendrocytes and nodes.

      As summarized in the excellent review by Bonetto et al. (2021), activity-dependent plasticity in oligodendrocytes encompasses a wide range of phenomena, not limited to changes in internodal length but also including oligodendrogenesis. Moreover, the effects of neuronal activity are not uniform but likely depend on the diversity of both neurons and oligodendrocytes. For example, in the mouse visual cortex, activity-dependent myelination occurs in interneurons but not in excitatory neurons (Yang et al., 2020). Additionally, expression of TeNT in axons affected myelination heterogeneously in zebrafish; some axons were impaired in myelination and the others were not affected at all (Koudelka et al., 2016). In the mouse corpus callosum, neuronal activity influences oligodendrogenesis, which in turn facilitates adaptive myelination (Gibson et al., 2014).

      Thus, rather than refuting the role of activity-dependent plasticity in nodal spacing, our findings emphasize the diversity of underlying regulatory mechanisms. We described these explicitly in the text (page 10, line 18).

      Significance

      This paper may argue against node plasticity as a mechanism for tuning of neural circuits. Myelin plasticity is a very hot topic right now and node plasticity reflects myelin plasticity. This seems to be a circuit where perhaps plasticity is NOT occurring. That would be interesting to test directly. One limitation is that this is limited to development.

      Response:

      This paper does not argue against node plasticity, but rather demonstrates that oligodendrocytes in the NL region exhibit a form of plasticity; they proliferate in response to vesicular release from NM axons, yet do not undergo morphological changes, ensuring adequate oligodendrocyte density for the full myelination of the auditory circuit. Thus, activity-dependent plasticity involving oligodendrocytes would contribute in various ways to each neural circuit, which is presumably attributable to the fact that myelination is driven by complex multicellular interactions between diverse axons and oligodendrocytes. Oligodendrocytes are known to exhibit heterogeneity in morphology, function, responsiveness, and gene profiles (Foerster et al., 2019; Sherafat et al., 2021; Osanai et al., 2022; Valihrach et al., 2022), but the functional significance of this heterogeneity remains largely unclear. This paper also provides insight into how oligodendrocyte heterogeneity may contribute to the fine-tuning of neural circuit function, adding further value to our findings. Importantly, our study covers a wide range of development in the sound localization circuit, from the pre-myelination stage (E9) to the post-functional maturation stage (P9), revealing how the nodal spacing pattern along the axon in this circuit emerges and matures.


      Reviewer #2

      Evidence, reproducibility and clarity

      Egawa et al describe the developmental timeline of the assembly of nodes of Ranvier in the chick brainstem auditory circuit. In this unique system, the spacing between nodes varies significantly in different regions of the same axon from early stages, which the authors suggest is critical for accurate sound localization. Egawa et al set out to determine which factors regulate this differential node spacing. They do this by using immunohistological analyses to test the correlation of node spacing with morphological properties of the axons, and properties of oligodendrocytes, glial cells that wrap axons with the myelin sheaths that flank the nodes of Ranvier. They find that axonal structure does not vary significantly, but that oligodendrocyte density and morphology varies in the different regions traversed by these axons, which suggests this is a key determinant of the region-specific differences in node density and myelin sheath length. They also find that differential oligodendrocyte density is partly determined by secreted neuronal signals, as (presumed) blockage of vesicle fusion with tetanus toxin reduced oligodendrocyte density in the region where it is normally higher. Based on these findings, the authors propose that oligodendrocyte morphology, myelin sheath length, and consequently nodal distribution are primarily determined by intrinsic oligodendrocyte properties rather than neuronal factors such as activity.

      Major points, detailed below, need to be addressed to overcome some limitations of the study.

      Major comments:

      1) It is essential that the authors validate the efficiency of TeNT to prove that vesicular release is indeed inhibited, to be able to make any claims about the effect of vesicular release on oligodendrogenesis/myelination.

      Response:

      eTeNT is a widely used genetically encoded silencing tool and constructs similar to the one used in this study have been successfully applied in primates and rodents to suppress target behaviors via genetic dissection of specific pathways (Kinoshita et al., 2012; Sooksawate et al., 2013). However, precisely quantifying the extent of vesicular release inhibition from NM axons in the brainstem auditory circuit is technically problematic.

      One major limitation is that while A3V efficiently infects NM neurons, its transduction efficiency does not reach 100%. In electrophysiological evaluations, NL neurons receive inputs from multiple NM axons, meaning that responses may still include input from uninfected axons. Additionally, failure to evoke synaptic responses could either indicate successful silencing or failure to stimulate NM axons, making a clear distinction difficult. Furthermore, unlike in motor circuits, we cannot assess the effect of silencing by observing behavioral outputs.

      Thus, we instead opted to quantify the precise expression efficiency of GFP-tagged eTeNT in the cell bodies of NM neurons. The proportion of NM neurons expressing GFP-tagged eTeNT was 89.7 ± 1.6% (N = 6 chicks), which is consistent with previous reports evaluating A3V transduction efficiency in the brainstem auditory circuit (Matsui et al., 2012). These results strongly suggest that synaptic transmission from NM axons was globally silenced by eTeNT at the NL region. We have described this explicitly in the text (page 8, line 5).

      2) Related to 1, can the authors clarify if their TeNT expression system results in the whole tract being silenced? It appears from Fig. 6 that their approach leads to sparse expression of TeNT in individual neurons, which enables them to measure myelination parameters. Can the authors discuss how silencing a single axon can lead to a regional effect in oligodendrocyte number?

      Response:

      Figure 6D depicts a representative axon selected from a dense population of GFP-positive axons in a 200-μm-thick slice after A3V-eTeNT infection to bilateral NM. As shown in Supplementary Videos 1 and 2, densely labeled GFP-positive axons can be traced along the main trunk. To prevent any misinterpretation, we have revised the description of Figure 6 in the main text and Figure legend (page 31, line 9), and stated that the A3V-eTeNT infection efficiency was 89.7 ± 1.6% in NM neurons, as mentioned above. Based on this efficiency, we interpreted that the global occlusion of vesicular release from most of the NM axons altered the pericellular microenvironment of the NL region, which led to the regional effect on the oligodendrocyte density.

      On the other hand, your question regarding whether sparse expression of eTeNT still has an effect is highly relevant. As we also discussed in our reply to comment 4 by Reviewer #1, the relationship between neuronal activity and oligodendrocytes is highly diverse. In some types of axons, vesicular release is essential for normal myelination, and this process was disrupted by TeNT (Koudelka et al., 2016), suggesting that direct interaction with oligodendrocytes via vesicle release may actively promote myelination in these types of axons.

      To clarify whether the phenotype observed in Figure 6 arises from changes in the pericellular microenvironment at the NL region or from the direct suppression of axon-oligodendrocyte interactions, we plan to add a new Supplementary Figure. Specifically, we will evaluate the node formation on the axon sparsely expressing eTeNT by electroporation into the unilateral NM. Preliminary data indicate that, unlike the results in Figure 6D, sparse eTeNT expression did not contribute to an increase in heminodes and unmyelinated segments. This result would further support our argument that the increase in unmyelinated segments by A3V-eTeNT was due to a disruption of synaptic transmission between NM axons and NL neurons, which in turn altered the pericellular microenvironment at the NL region.

      3) The authors need to fully revise their statistical analyses throughout and supply additional information that is needed to assess if their analyses are adequate:

      Response:

      Thank you for your valuable suggestions to improve the rigor of our statistical analyses. We have reanalyzed all statistical tests using R software. In the revised Methods section and Figure Legends, we have clarified the rationale for selecting each statistical test, specified which test was used for each figure, and explicitly defined both n and N. After reevaluation with the Shapiro-Wilk test, we adjusted some analyses to non-parametric tests where appropriate. However, these adjustments did not alter the statistical significance of our results compared to the original analyses.

      3.1) the authors use a variety of statistical tests and it is not always obvious why they chose a particular test. For example, in Fig. 2G they chose a Kruskal-Wallis test instead of a two-way ANOVA or Mann-Whitney U test, which are much more common in the field. What is the rationale for the test choice?

      Response:

      We have revised the explanation of our statistical test choices to provide greater clarity and precision. For example, in Figure 2G, we first assessed the normality of the data in each of the four groups using the Shapiro-Wilk test, which revealed that some datasets did not follow a normal distribution. Given this, we selected the Kruskal-Wallis test, a commonly used non-parametric test for comparisons across three or more groups. Since the Kruskal-Wallis test indicated a significant difference, we conducted a post hoc Steel-Dwass test to determine which specific group comparisons were statistically significant.
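
      As an illustration only, the Kruskal-Wallis step of the workflow described above can be sketched in a few lines. This is a minimal Python sketch using only the standard library, not the R analysis used in the study; the function names and the data values are hypothetical, and the normality check and the post hoc Steel-Dwass comparison are not implemented here.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical measurements for four groups (not data from the study).
groups = [[1.2, 1.5, 1.1], [2.3, 2.8, 2.6], [3.9, 3.4, 3.7], [5.0, 4.6, 4.8]]
h_stat = kruskal_wallis_h(groups)
# 7.815 is the chi-square critical value for df = 3 at alpha = 0.05.
print(h_stat, h_stat > 7.815)
```

      The chi-square comparison is only an approximation and is rough for groups this small; it is used here purely to show where the omnibus decision is made before any post hoc test.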

      3.2) in some cases, the choice of test appears wholly inappropriate. For example, in Fig. 3H-K, an unpaired t-test is inappropriate if the two regions were analysed in the same samples. In Fig. 5, was a t-test used for comparisons between multiple groups in the same dataset? If so, an ANOVA may be more appropriate.

      Response:

      In the case of Figures 3H-K, we compared oligodendrocyte morphology between regions. However, since the number of sparsely labeled oligodendrocytes differs both between regions and across individual samples, there is no strict correspondence between paired measurements. On the other hand, in Figures 5B, C, and E, we compared the density of labeled cells between regions within the same slice, establishing a direct correspondence between paired data points. For these comparisons, we appropriately used a paired t-test.
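
      The distinction drawn above can be made concrete with a short sketch: when two regions are measured in the same slice, each slice yields one difference, and the paired t statistic is computed from those differences. The Python code below is illustrative only; the function name and all values are hypothetical and are not taken from the study.

```python
import math
import statistics


def paired_t_statistic(region_a, region_b):
    """t statistic for paired samples: mean difference over its standard error."""
    if len(region_a) != len(region_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(region_a, region_b)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Hypothetical cell densities measured in two regions of the same five slices.
tract = [10.0, 12.0, 11.0, 13.0, 12.5]
nl = [14.0, 15.5, 15.0, 18.0, 16.0]
t_stat = paired_t_statistic(nl, tract)
print(f"t = {t_stat:.2f} with {len(tract) - 1} degrees of freedom")
```

      When the measurements cannot be matched one to one, as with sparsely labeled cells whose numbers differ between regions and samples, no such list of differences exists and this statistic is not defined.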

      3.3) in some cases, the authors do not mention which test was used (Fig 3: E-G no test indicated, despite asterisks; G/L/M - which regression test that was used? What does r indicate?)

      Response:

      We have specified the statistical tests used for each figure in the Methods section and Figure Legends for better clarity. Additionally, we have revised the descriptions for Figure 4G, L, and M and their corresponding Figure Legends to explicitly indicate that Spearman's rank correlation coefficient (rₛ) was used for evaluation.
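
      For readers unfamiliar with the coefficient, Spearman's rₛ is the Pearson correlation computed on the ranks of the two variables. The following is a minimal Python sketch under that definition; the function names and the example values are hypothetical and do not reproduce the analysis in the manuscript.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have equal length")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rank variance is zero; correlation is undefined")
    return cov / math.sqrt(sxx * syy)


# Hypothetical paired measurements (for example, a diameter and a length).
diameter = [0.8, 1.1, 0.9, 1.4, 1.2]
length = [40.0, 38.0, 52.0, 45.0, 61.0]
print(round(spearman_rs(diameter, length), 3))
```

      Because only ranks enter the calculation, the coefficient is insensitive to monotonic transformations of either variable, which is why it suits data that are not normally distributed.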

      3.4) more concerningly, throughout the results, data may have been pseudo-replicated. t-tests and ANOVAs assume that each observation in a dataset is independent of the other observations. In figures 1-4 and 6 there is a very large "n" number, but the authors do not indicate what this corresponds to. This leaves it open to interpretation, and the large values suggest that the number of nodes, internodal segments, or cells may have been used. These are not independent experimental units, and should be averaged per independent biological replicate - i.e. per animal (N).

      Response:

      We have now clarified what "n" represents in each figure, as well as the number of animals (N) used in each experiment, in the Figure Legends.

      In this study, developmental stages of chick embryos were defined by HH stage (Hamburger and Hamilton, 1951), minimizing individual variability. Additionally, since our study focuses on the distribution of morphological characteristics of individual cells, averaging measurements per animal would obscure important cellular-level variability and potentially mislead interpretation of data. Furthermore, we employed a strategy of sparse genetic labeling in many experiments, which naturally results in variability in the number of measurable cells per animal. Given the clear distinctions in our data distributions, we believe that averaging per biological replicate is not essential in this case.

      To further ensure the robustness of our statistical analysis, data presented as boxplots were preliminarily assessed using PlotsOfDifferences, a web-based application that calculates and visualizes effect sizes and 95% confidence intervals based on bootstrapping (https://huygens.science.uva.nl/PlotsOfDifferences/; https://doi.org/10.1101/578575). Effect sizes can serve as a valuable alternative to p-values (Ho, 2018; https://www.nature.com/articles/s41592-019-0470-3). The significant differences reported in our study are also supported by clear differences in effect sizes, ensuring that our conclusions remain robust regardless of the statistical approach used.
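
      The idea of a bootstrapped effect size can be sketched as follows. This minimal Python example computes the difference in means between two groups and a percentile 95% confidence interval by resampling each group with replacement. It is an illustration under stated assumptions, not the PlotsOfDifferences implementation, whose exact resampling scheme and choice of statistic may differ; the function name, parameters, and data are hypothetical.

```python
import random
import statistics


def bootstrap_mean_difference(control, treated, n_boot=2000, seed=0):
    """Return (observed difference, lower, upper) for a percentile 95% CI."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistics.mean(treated) - statistics.mean(control)
    diffs = []
    for _ in range(n_boot):
        # Resample each group independently, with replacement.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treated, k=len(treated))
        diffs.append(statistics.mean(t) - statistics.mean(c))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return observed, lower, upper


# Hypothetical measurements for two groups (not data from the study).
control = [52.0, 48.5, 50.1, 49.7, 51.3, 47.9]
treated = [61.2, 58.4, 63.0, 59.9, 60.5, 62.1]
obs, lo, hi = bootstrap_mean_difference(control, treated)
print(f"difference = {obs:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

      An interval that excludes zero points in the same direction as a significant test, while also conveying the magnitude of the difference.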

      If requested, we would be happy to provide PlotsOfDifferences outputs as supplementary source data files, similar to those used in eLife publications, for each figure.

      3.5) related to the pseudo-replication issue, can the authors include individual datapoints in graphs for full transparency, per biological replicates, in addition or in alternative to bar-graphs (e.g. Fig. 5 and 6).

      Response:

      We have now incorporated individual data points into the bar graphs in Figures 5 and 6.

      4) The main finding of the study is that the density of nodes differs between two regions of the chicken auditory circuit, probably due to morphological differences in the respective oligodendrocytes. Can the authors discuss if this finding is likely to be specific to the bird auditory circuit?

      Response:

      The morphological differences of oligodendrocytes between white and gray matter are well established (i.e., shorter myelin sheaths in gray matter), but their correspondence with the nodal spacing pattern along the long axonal projections of cortical neurons is not well understood. Future research may find similarities with our findings. Additionally, as mentioned in the final section of the Discussion, the mammalian brainstem auditory circuit is functionally analogous to the avian ITD circuit. Regional differences in nodal spacing along axons have also been observed in the mammalian system, raising the important question of whether these differences are supported by regional heterogeneity in oligodendrocytes. Investigating this possibility will facilitate our understanding of the underlying logic and mechanisms for determining node spacing patterns along axons, as well as provide valuable insights into evolutionary convergence in auditory processing mechanisms. We have described this explicitly in the text (page 11, line 32).

      5) Provided the authors amend their statistical analyses, and assuming significant differences remain as shown, the study shows a correlation (but not causation) between node spacing and oligodendrocyte density, but the authors did not manipulate oligodendrocyte density per se (i.e. cell-autonomously). Therefore, the authors should either include such experiments, or revise some of their phrasing to soften their claims and conclusions. For example, the word "determine" in the title could be replaced by "correlate with" for a more accurate representation of the work. Similar sentences throughout the main text should be amended.

      Response:

      As you summarized in your comment, our results demonstrated that A3V-eTeNT suppressed oligodendrogenesis in the NL region, leading to a reduction in oligodendrocyte density (Figures 6L, M), which caused the emergence of unmyelinated segments. While this is an indirect manipulation of oligodendrocyte density, it nonetheless provides evidence supporting a causal relationship between oligodendrocyte density and nodal spacing.

      The emergence of unmyelinated segments at the NL region further suggests that the myelin extension capacity of oligodendrocytes differs between regions, highlighting regional differences in the intrinsic properties of oligodendrocytes as the most prominent determinant of nodal spacing variation. However, as you correctly pointed out, our findings do not establish direct causation.

      In the future, developing methods to artificially manipulate myelin length could provide a more definitive demonstration of causality. Given these considerations, we have modified the title to replace "determine" with "underlie", ensuring that our conclusions are presented with appropriate nuance.

      6) The authors fail to introduce, or discuss, very pertinent prior studies, in particular to contextualize their findings with:

      6.1) known neuron-autonomous modes of node formation prior to myelination, e.g. Zonta et al (PMID 18573915); Vagionitis et al (PMID 35172135); Freeman et al (PMID 25561543)

      6.2) known effects of vesicular fusion directly on myelinating capacity and oligodendrogenesis, e.g. Mensch et al (PMID 25849985)

      6.3) known correlation of myelin length and thickness with axonal diameter, e.g. Murray & Blakemore (PMID 7012280); Ibrahim et al (PMID 8583214); Hildebrand et al (PMID 8441812).

      6.4) regional heterogeneity in the oligodendrocyte transcriptome (page 9, studies summarized in PMID 36313617)

      Response:

      Thank you for your insightful suggestions. We have incorporated the relevant references you provided and revised the manuscript accordingly to contextualize our findings within the existing literature.

      Minor comments:

      7) Can the authors amend Fig. 1G with the correct units of measurement, not millimetres.

      Response:

      Thank you for your suggestion. We have corrected the units in Figure 1G to µm.

      8) The Olig2 staining in Fig 2C does not appear to be nuclear, as would be expected of a transcription factor and as is well established for Olig2, but rather appears to be excluded from the nucleus, as it is in a ring or donut shape. Can the authors comment on this?

      Response:

      Oligodendrocytes and OPCs have small cell bodies, often comparable in size to their nuclei. The central void in the ring-like Olig2 staining pattern appears too small to represent the nucleus. Additionally, a similar ring-like appearance is observed in BrdU labeling (Figure 5G), suggesting that this staining pattern may reflect nuclear morphology or other structural features.

      Significance

      In our view the study tackles a fundamental question likely to be of interest to a specialized audience of cellular neuroscientists. This descriptive study is suggestive that in the studied system, oligodendrocyte density determines the spacing between nodes of Ranvier, but further manipulations of oligodendrocyte density per se are needed to test this convincingly.

      Response:

      The main finding of our study is that the primary determinant of the biased nodal spacing pattern in the sound localization circuit is the regional heterogeneity in the morphology of oligodendrocytes due to their intrinsic properties (e.g., their ability to produce and extend myelin sheaths) rather than the density of the cells. This was based on our observations that a reduction of oligodendrocyte density by A3V-eTeNT expression caused unmyelinated segments but did not increase internodal length (Figure 6), further revealing the importance of oligodendrocyte density in ensuring full myelination for the axons with short internodes. Thus, we think that our study, which used experimental manipulation of oligodendrocyte density, points to the significance of oligodendrocyte heterogeneity for circuit function as well as for nodal spacing.


      Reviewer #3

      Evidence, reproducibility and clarity

      The authors have investigated the myelination pattern along the axons of chick avian cochlear nucleus. It has already been shown that there are regional differences in the internodal length of axons in the nucleus magnocellularis. In the tract region across the midline, internodes are longer than in the nucleus laminaris region. Here the authors suggest that the difference in internodal length is attributed to heterogeneity of oligodendrocytes. In the tract region oligodendrocytes would contribute longer myelin internodes, while oligodendrocytes in the nucleus laminaris region would synthesize shorter myelin internodes. Not only length of myelin internodes differs, but also along the same axon unmyelinated areas between two internodes may vary. This is an interesting contribution since all these differences contribute to differential conduction velocity regulating ipsilateral and contralateral innervation of coincidence detector neurons. However, the demonstration falls rather short of being convincing. I have some major concerns:

      1) The authors neglect the possibility that nodal cluster may be formed prior to myelin deposition. They have investigated stages E12 (no nodal clusters) and E15 (nodal cluster plus MAG+ myelin). Fig. 1D is of dubious quality. It would be important to investigate stages between E12 and E15 to observe the formation of pre-nodes, i.e., clustering of nodal components prior to myelin deposition.

      Response:

      Thank you for your insightful comment regarding the potential role of pre-nodal clusters in determining internodal length. Indeed, studies in zebrafish have suggested that pre-nodal clustering of node components prior to myelination may prefigure internodal length (Vagionitis et al., 2022). We have incorporated a discussion on whether such pre-nodal clusters could contribute to regional differences in nodal spacing in our manuscript (page 9, line 35).

      Whether pre-nodal clusters are detectable before myelination appears to depend on neuronal subpopulation (Freeman et al., 2015). To investigate the presence of pre-nodal clusters along NM axons in the brainstem auditory circuit, we previously attempted to visualize AnkG signals at E13 and E14. However, we did not observe clear structures indicative of pre-nodal clusters; instead, we only detected sparse fibrous AnkG signals with weak Nav clustering at their ends, consistent with hemi-node features. This result does not exclude the possibility of pre-nodal clusters on NM axons, as the detection limit of immunostaining cannot be ruled out. In brainstem slices, where axons are densely packed, nodal molecules are expressed at low levels across a wide area, leading to a high background signal in immunostaining, which may mask weak pre-nodal cluster signals prior to myelination. Regarding the comment on Figure 1D, we assume you are referring to Figure 2D based on the context. The lack of clarity in the high-magnification images in Figure 2D results from both the high background signal and the limited penetration of the MAG antibody. Furthermore, we are unable to verify Neurofascin accumulation at pre-nodal clusters, as there is currently no commercially available antibody suitable for use in chickens, despite our over 20 years of efforts to identify one for AIS research. Therefore, current methodologies pose significant challenges in visualizing pre-nodal clusters in our model. Future advancements, such as exogenous expression of fluorescently tagged Neurofascin at appropriate densities or knock-in tagging of endogenous molecules, may help overcome these limitations.

      However, a key issue to be discussed in this study is not merely the presence or absence of pre-nodal clusters, but rather whether pre-nodal clusters, if present, would determine regional differences in internodal length. To address this possibility, we have added new data in Figure 6I, measuring the length of unmyelinated segments that emerged following A3V-eTeNT expression. If pre-nodal clusters were fixed before myelination and predetermined internodal length, then the length of unmyelinated segments should be equal to or a multiple of the typical internodal length. However, our data showed that unmyelinated segments in the NL region were less than half the typical NL internodal length, contradicting the hypothesis that fixed pre-nodal clusters determine internodal length along NM axons in this region.

      2) The claim that axonal diameter is constant along the axonal length needs to be demonstrated at the EM level. This would also allow to measure possible regional differences in the thickness of the myelin sheath and number of myelin wraps.

      Response:

      As mentioned in our reply to comment 2 by Reviewer #1, the diameter of NM axons was already evaluated using electron microscopy (EM) in the pioneering study by Seidl et al. (2010). Additionally, EM-based analysis makes it difficult to clearly distinguish between the main trunk of NM axons and thin collateral branches at the NL region. Accordingly, we did not perform EM analysis in this revision.

      In Figure 4, we used palGFP, which is targeted to the cell membrane, allowing us to measure axon diameter by evaluating the distance between two membrane signal peaks. This approach minimizes the influence of the blurring of fluorescence signals on diameter measurements. Thus, we believe that our method is sufficient to evaluate the relative difference in axon diameters between regions and hence to show that axon diameter is not the primary determinant of the 3-fold difference in internodal length between regions.
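
      The peak-to-peak measurement described above can be illustrated with a small sketch. Given a fluorescence intensity profile sampled across an axon, the code finds the two brightest local maxima (taken here to be the two membrane signals) and converts their separation to micrometres. This is a simplified Python illustration, not the measurement procedure used in the study; the function name, the pixel size, and the profile values are hypothetical.

```python
def membrane_peak_distance(profile, pixel_size_um):
    """Distance between the two brightest local maxima of a line profile."""
    # A local maximum rises above its left neighbor and is not below its right one.
    peaks = [
        i for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("need at least two local maxima")
    # Keep the two brightest peaks, then order them by position.
    top_two = sorted(sorted(peaks, key=lambda i: profile[i], reverse=True)[:2])
    return (top_two[1] - top_two[0]) * pixel_size_um


# Hypothetical intensity profile across one axon, sampled at 0.1 um per pixel.
profile = [2, 3, 9, 30, 12, 6, 5, 7, 14, 33, 10, 3, 2]
print(membrane_peak_distance(profile, pixel_size_um=0.1))
```

      Measuring between peaks rather than at the outer edges of the signal is what limits the influence of blur, since blur broadens each peak but shifts its position much less.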

      3) The observation that internodal length differs is explained by heterogeneity of sources of oligodendrocytes is not convincing. Oligodendrocytes a priori from the same origin remyelinate shorter internodes after a demyelination event.

      Response:

      The heterogeneity in oligodendrocyte morphology would reflect differences in gene profiles, which, in turn, may arise from differences in their developmental origin and/or the pericellular microenvironment of OPCs. We made this point as clear as possible in the Discussion (page 9, line 21).

      Significance

      The authors suggest that the difference in internodal length is attributed to heterogeneity of oligodendrocytes. In the tract region oligodendrocytes would contribute longer myelin internodes, while oligodendrocytes in the nucleus laminaris region would synthesize shorter myelin internodes. Not only length of myelin internodes differs, but also along the same axon unmyelinated areas between two internodes may vary. This is an interesting contribution since all these differences contribute to differential conduction velocity regulating ipsilateral and contralateral innervation of coincidence detector neurons.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Egawa et al describe the developmental timeline of the assembly of nodes of Ranvier in the chick brainstem auditory circuit. In this unique system, the spacing between nodes varies significantly in different regions of the same axon from early stages, which the authors suggest is critical for accurate sound localization. Egawa et al set out to determine which factors regulate this differential node spacing. They do this by using immunohistological analyses to test the correlation of node spacing with morphological properties of the axons, and properties of oligodendrocytes, glial cells that wrap axons with the myelin sheaths that flank the nodes of Ranvier. They find that axonal structure does not vary significantly, but that oligodendrocyte density and morphology varies in the different regions traversed by these axons, which suggests this is a key determinant of the region-specific differences in node density and myelin sheath length. They also find that differential oligodendrocyte density is partly determined by secreted neuronal signals, as (presumed) blockage of vesicle fusion with tetanus toxin reduced oligodendrocyte density in the region where it is normally higher. Based on these findings, the authors propose that oligodendrocyte morphology, myelin sheath length, and consequently nodal distribution are primarily determined by intrinsic oligodendrocyte properties rather than neuronal factors such as activity.

      Major points, detailed below, need to be addressed to overcome some limitations of the study.

      Major comments:

      1. It is essential that the authors validate the efficiency of TeNT to prove that vesicular release is indeed inhibited, to be able to make any claims about the effect of vesicular release on oligodendrogenesis/myelination.
      2. Related to 1, can the authors clarify if their TeNT expression system results in the whole tract being silenced? It appears from Fig. 6 that their approach leads to sparse expression of TeNT in individual neurons, which enables them to measure myelination parameters. Can the authors discuss how silencing a single axon can lead to a regional effect in oligodendrocyte number?
      3. The authors need to fully revise their statistical analyses throughout and supply additional information that is needed to assess if their analyses are adequate:

      3.1) the authors use a variety of statistical tests and it is not always obvious why they chose a particular test. For example, in Fig. 2G they chose a Kruskal-Wallis test instead of a two-way ANOVA or Mann-Whitney U test, which are much more common in the field. What is the rationale for the test choice?

      3.2) in some cases, the choice of test appears wholly inappropriate. For example, in Fig. 3H-K, an unpaired t-test is inappropriate if the two regions were analysed in the same samples. In Fig. 5, was a t-test used for comparisons between multiple groups in the same dataset? If so, an ANOVA may be more appropriate.

      3.3) in some cases, the authors do not mention which test was used (Fig 3: E-G no test indicated, despite asterisks; G/L/M - which regression test that was used? What does r indicate?)

      3.4) more concerningly, throughout the results, data may have been pseudo-replicated. t-tests and ANOVAs assume that each observation in a dataset is independent of the other observations. In figures 1-4 and 6 there is a very large "n" number, but the authors do not indicate what this corresponds to. This leaves it open to interpretation, and the large values suggest that the number of nodes, internodal segments, or cells may have been used. These are not independent experimental units, and should be averaged per independent biological replicate - i.e. per animal (N).

      3.5) related to the pseudo-replication issue, can the authors include individual datapoints in graphs for full transparency, per biological replicates, in addition or in alternative to bar-graphs (e.g. Fig. 5 and 6).

      4. The main finding of the study is that the density of nodes differs between two regions of the chicken auditory circuit, probably due to morphological differences in the respective oligodendrocytes. Can the authors discuss if this finding is likely to be specific to the bird auditory circuit?
      5. Provided the authors amend their statistical analyses, and assuming significant differences remain as shown, the study shows a correlation (but not causation) between node spacing and oligodendrocyte density, but the authors did not manipulate oligodendrocyte density per se (i.e. cell-autonomously). Therefore, the authors should either include such experiments, or revise some of their phrasing to soften their claims and conclusions. For example, the word "determine" in the title could be replaced by "correlate with" for a more accurate representation of the work. Similar sentences throughout the main text should be amended.
      6. The authors fail to introduce, or discuss, very pertinent prior studies, in particular to contextualize their findings with:

      6.1) known neuron-autonomous modes of node formation prior to myelination, e.g. Zonta et al (PMID 18573915); Vagionitis et al (PMID 35172135); Freeman et al (PMID 25561543)

      6.2) known effects of vesicular fusion directly on myelinating capacity and oligodendrogenesis, e.g. Mensch et al (PMID 25849985)

      6.3) known correlation of myelin length and thickness with axonal diameter, e.g. Murray & Blakemore (PMID 7012280); Ibrahim et al (PMID 8583214); Hildebrand et al (PMID 8441812).

      6.4) regional heterogeneity in the oligodendrocyte transcriptome (page 9, studies summarized in PMID 36313617)

      Minor comments:

      1. Can the authors amend Fig. 1G with the correct units of measurement, not millimetres.
      2. The Olig2 staining in Fig 2C does not appear to be nuclear, as would be expected of a transcription factor and as is well established for Olig2, but rather appears to be excluded from the nucleus, as it is in a ring or donut shape. Can the authors comment on this?

      Significance

      In our view the study tackles a fundamental question likely to be of interest to a specialized audience of cellular neuroscientists. This descriptive study is suggestive that in the studied system, oligodendrocyte density determines the spacing between nodes of Ranvier, but further manipulations of oligodendrocyte density per se are needed to test this convincingly.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Egawa and colleagues investigates differences in nodal spacing in an avian auditory brain stem circuit. The results are clearly presented and data are of very high quality. The authors make two main conclusions:

      1. Node spacing, i.e. internodal length, is intrinsically specified by the oligodendrocytes in the region they are found in, rather than axonal properties (branching or diameter).
      2. Activity is necessary (we don't know what kind of signaling) for normal numbers of oligodendrocytes and therefore the extent of myelination.

      These are interesting observations, albeit phenomenological. I have only a few criticisms that should be addressed:

      1. The use of the term 'distribution' when describing the location of nodes is confusing. I think the authors mean rather than the patterns of nodal distribution, the pattern of nodal spacing. They have investigated spacing along the axon. I encourage the authors to substitute node spacing or internodal length for node distribution.
      2. In Seidl et al. (J Neurosci 2010) it was reported that axon diameter and internodal length (nodal spacing) were different for regions of the circuit. Can the authors help me better understand the difference between the Seidl results and those presented here?
      3. The authors looked only in very young animals - are the results reported here applicable only to development, or does additional refinement take place with aging?
      4. The fact that internodal length is specified by the oligodendrocyte suggests that activity may not modify the location of nodes of Ranvier - although again, the authors have only looked during early development. This is quite different than this reviewer's original thoughts - that activity altered internodal length and axon diameter. Thus, the results here argue against node plasticity. The authors may choose to highlight this point or argue for or against it based on results in adult birds?

      Significance

      This paper may argue against node plasticity as a mechanism for tuning of neural circuits. Myelin plasticity is a very hot topic right now and node plasticity reflects myelin plasticity. This seems to be a circuit where perhaps plasticity is NOT occurring. That would be interesting to test directly. One limitation is that this is limited to development.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Regarding the manuscript's clarity, the sentence on page 5, "We also stained VTA sections for Tyrosine hydroxylase (TH) to estimate the rate of ChR2 colocalization with DA neurons," reads awkwardly. Removing the word "rate" could improve clarity.

      We have made the recommended clarifying edit (page 5, lines 30-31).

      Additionally, the anatomical data and findings are largely non-quantitative in nature. However, solid microscopy images are presented to support each claim. Additional quantification would strengthen the paper, specifically the quantification of projection density for each population and the proportion of each subpopulation that projects to their regions of interest.

      To rigorously quantify the projection density of each subpopulation would require a level of exhaustivity our study was not designed for. This is because during microscopy we focused efforts on imaging regions containing dense signals but did not exhaustively image regions receiving apparently weak or no input. While we considered including a semi-quantitative table of projection density, based on the data available we could not discriminate with confidence between, e.g., regions recipient of minimal input versus no input from VTA populations. Thus, while we stand by our descriptive statements we do not expand on those further.

      The authors should consider discussing the possibility that subpopulations of these cells could still be true interneurons especially if cells were looked at the single neuron level of resolution.

      We agree that some of the VTA populations we studied could include subpopulations that are bona fide interneurons. The identification of alternate markers or combinations of markers, or use of single-cell imaging approaches may indeed support this possibility in future. This is discussed in the context of currently available evidence on page 5 lines 32-34, page 11 lines 2-4, page 12 lines 2-11, and page 12 lines 15-16.

      Overall, the paper is well-written and important for the field and beyond.

      Thank you!

      Reviewer #2:

      Weaknesses:

      While the authors use several Cre driver lines to identify GABAergic projection neurons, they then use wild-type mice to show that projection neurons synapse onto neighboring cells within the VTA. This does not seem to lend evidence to the idea that previously described "interneurons" are projection neurons that collateralize within the VTA.

      We think the use of WT mice is a strength because it allows us to measure both GABA and non-GABA synapses made by VTA projections on to the same cells within VTA. However, we have also done this experiment targeting NAc-projecting VTA VGAT-Cre neurons, and VP-projecting VTA MOR-Cre neurons. Consistent with the WT dataset, we find that these defined projection neurons also make intra-VTA synapses. These data are now included as Figure 7.

      More broadly, our review of the literature finds very little evidence to support the notion of a VTA interneuron as we define it: a VTA neuron that makes only local connections. But the absence of evidence need not imply evidence of absence, thus we do not claim that all VTA neurons previously presumed to be interneurons must be projection neurons. We do express confidence in our findings that VTA projection neurons (that include GABA-releasing neurons) make local synapses in VTA. We argue that in the absence of compelling positive evidence for the existence of VTA interneurons, such as a selective marker, “we”, “the field”, should not presume their existence.

      Other suggestions:

      (1) While the authors present evidence that some projection neurons also synapse locally, there is no quantification as to the proportion of each neuronal subtype that collateralizes within the VTA. This would be a useful analysis.

      We agree this would be useful information. But our experiments were not designed to answer this question. Indeed, we have not conceived of a feasible method to discriminate between collateralizing and non-collateralizing VTA projection neurons at the single-cell level, thus we do not know how we would calculate such proportions.

      (2) There is significant interest in the molecular heterogeneity and spatial topography of the VTA. Additional analyses of the spatial topography of labeled projectors would be useful. For example, knowing if Pvalb+ projection neurons are distributed throughout the VTA or located along the midline would be a useful analysis.

      Prior studies and public databases (e.g., Allen brain atlas, GENSAT) allow one to visualize the location of VTA neurons positive for Pvalb and the other markers we investigated (Olson & Nestler, 2007). However, these label the entire population of neurons and thereby include those that project to any of the various projection targets. There are also studies that have used retrograde labeling approaches to map the distribution of labeled VTA cells projecting to one or another target (Beier et al., 2015; Lammel et al., 2008; Margolis et al., 2006). For example, LHb-projecting neurons (the LHb being a major target of Pvalb+ VTA neurons) were found to be enriched in medial VTA (Root et al., 2014). From this evidence we might infer that Pvalb+ VTA neurons that project to LHb are likely to be medially biased. Future studies may more carefully map the intersection of specific projection targets for each VTA subpopulation.

      Reviewer #3 (Recommendations For The Authors):

      Weaknesses:

      This study has a few modest shortcomings, of which the first is likely addressable with the authors' existing data, while the latter items will likely need to be deferred to future studies:

      (1) Some key anatomical details are difficult to discern from the images shown. In Figure 1, the low-magnification images of the VTA in the first column, while essential for seeing what overall section is being shown, are not of sufficient resolution to distinguish soma from processes. A supplemental figure with higher-resolution images could be helpful.

      We uploaded a higher resolution file for figure 1.

      Also, where are the insets shown in the second column obtained from? There is not a corresponding marked region on the low-magnification images. Is this an oversight, or are these insets obtained from other sections that are not shown?

      This was an oversight, we added the corresponding marked region to the low-magnification images.

      Lastly, there is a supplemental figure showing the NAc injection sites corresponding to Figure 5, but not one showing VP or PFC injection sites in Figure 6. Why not?

      We added a figure with histology examples for the VP and the PFC injection sites as done for Figure 5, included as Supplemental Figure 3.

      (2) Because multiple ChR2 neurons are activated in the optogenetic experiments, it is not clear how common it is for any specific projection neuron to make local connections. Are the observed synaptic effects driven by just a few neurons making extensive local collateralizations (while other projection neurons do not), or do most VTA projection neurons have local collaterals? I realize this is a complex question that may not have an easy answer.

      This is a great question but, indeed, we don’t know the answer. As mentioned in response to Reviewer #2, we are not convinced there is a currently feasible way to discriminate between collateralizing and non-collateralizing cells at the single cell level.

      (3) There is something of a conceptual disconnect between the early and later portions of this paper. Whereas Figures 1-4 examine forebrain projections of genetic subtypes of VTA neurons, the optogenetic studies do not address genetic subtypes at all. I do realize that is outside of the scope of the authors' intent, but it does give the impression of somewhat different (but related) studies being stitched together. For example, the MOR-expressing neurons seem to project strongly to the VP, but it is not addressed whether these are also the ones making local projections. Also, after showing that PV neurons project to the LHb, the opto experiments do not examine the LHb projection target at all.

      This too was raised by Reviewer #2. While addressing this question for all the populations we investigated feels redundant, we now include optogenetic data showing that NAc-projecting VTA VGAT-Cre and VP-projecting VTA MOR-Cre neurons also make local collaterals (Figure 7). We think this allows us to connect the two approaches to a greater degree. Based on our findings using a dual virus approach to express Syn:Ruby in each population of VTA projection neuron, we think it very likely that we’d continue to find similar results using optogenetics-assisted slice electrophysiology for each population.

      Other suggestions:

      (1) I appreciated the extensive and high-quality anatomical figures shown in Figures 2-4. However, the layout was sometimes left-to-right, and sometimes right-to-left, which felt distracting. At some point, the text refers to "Fig. 3KJ", i.e. with the letters being in backward alphabetical order, and Figures 3I and 3L do not appear mentioned anywhere in the main text, leading me to wonder if that text was intended to read "Fig. 3I-L".

      Thank you for noting this. We have harmonized the layout of Figures 2-4 and adjusted the in-text Figure call-outs.

      Also, the inset in Figure 3J appears to show local collaterals of NTS neurons in the VTA, since there is no soma in that inset. This is interesting, and worth reporting, but is not explained in either the main text or Figure legend.

      We added a more complete description in the result section (page 6 line 25-30).

      (2) Perhaps I missed it, but I could not find any mention of the intensity of the LED light delivered during the optogenetic experiments. While acknowledging that this can be variable, do the authors have at least a rough range?

      We have added this information to the methods, page 17 line 8.

      Editor's Note:

      Should you choose to revise your manuscript, please double check that you have fully reported all statistics including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals.

      We confirm that we have fully reported all statistics including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals.

      Note to Editor and Readers

      While reanalyzing our data for resubmission, we discovered that some of the short-latency optogenetic evoked postsynaptic currents (oPSCs) we detected were erroneously categorized. Specifically, some VTA cells that showed large outward currents (oIPSCs) when held at 0 mV also had small inward currents when held at -60 mV. These small inward currents were initially categorized as oEPSCs, suggesting these VTA cells received input from populations of VTA projection neurons that released GABA and/or glutamate. However, the kinetics of these small inward currents were slow and aligned with the within-cell kinetics of the oIPSCs, indicating that these were very likely mediated by GABA-A receptors. In one case the opposite was apparent, with a small PSC initially miscategorized as an oIPSC. These miscategorized oEPSCs and oIPSC were presumably detected because our holding potentials were not precisely identical to the reversal potentials for GABA-A and AMPA receptors, respectively. For this reason, we removed these 14 oEPSCs and 1 oIPSC from our analyses in the revised version. The revised dataset suggests that VTA glutamate projection neurons may be less likely to collateralize widely within VTA compared to GABA projection neurons. But, importantly, this correction does not affect any of our conclusions.

      Citations:

      Beier, K. T., Steinberg, E. E., DeLoach, K. E., Xie, S., Miyamichi, K., Schwarz, L., Gao, X. J., Kremer, E. J., Malenka, R. C., & Luo, L. (2015). Circuit Architecture of VTA Dopamine Neurons Revealed by Systematic Input-Output Mapping. Cell, 162(3), 622-634. https://doi.org/10.1016/j.cell.2015.07.015

      Lammel, S., Hetzel, A., Hackel, O., Jones, I., Liss, B., & Roeper, J. (2008). Unique properties of mesoprefrontal neurons within a dual mesocorticolimbic dopamine system. Neuron, 57(5), 760-773. https://doi.org/10.1016/j.neuron.2008.01.022

      Margolis, E. B., Lock, H., Chefer, V. I., Shippenberg, T. S., Hjelmstad, G. O., & Fields, H. L. (2006). Kappa opioids selectively control dopaminergic neurons projecting to the prefrontal cortex. Proc Natl Acad Sci U S A, 103(8), 2938-2942. https://doi.org/10.1073/pnas.0511159103

      Olson, V. G., & Nestler, E. J. (2007). Topographical organization of GABAergic neurons within the ventral tegmental area of the rat. Synapse, 61(2), 87-95. https://doi.org/10.1002/syn.20345

      Root, D. H., Mejias-Aponte, C. A., Zhang, S., Wang, H. L., Hoffman, A. F., Lupica, C. R., & Morales, M. (2014). Single rodent mesohabenular axons release glutamate and GABA. Nat Neurosci, 17(11), 1543-1551. https://doi.org/10.1038/nn.3823

    1. Reviewer #2 (Public review):

      In this work the authors develop a mathematical model that incorporates three contributions to cellular force generation in 3D matrices: (1) actively generated contractile forces via myosin motors and consumption of ATP; (2) the energy stored in the extracellular matrix as it is deformed by the contractile cell; and (3), the energy associated with the interactions at the interface between the matrix and the cell, e.g. at focal adhesions. The authors make predictions about the dependence of cell shape on these three contributions.

      The authors succeed in making a number of predictions of how cell shapes will depend on these contributions to force generation. However, these predictions seem to be largely buried in the supplemental material and come in a form that will be accessible to a certain type of physicist and modeler but will likely not be accessible to many experimentalists who may want to test the predictions of the model. The authors show a comparison between their expected cell shape distributions and those predicted by the model, under multiple regimes: cells in two different concentrations of collagen (Figure 4c), cells with inhibited myosin and therefore reduced contractility (Figure 4d), cells with impaired interactions with the ECM (Figure 4e), and for cells with both contractility and ECM interactions impaired. They find a strong agreement between the experiments and their predictions. However, it should be noted that there are multiple "tuning parameters" in their model, so the ability to match experiment and theory may not be ultimately so surprising.

      While the authors do achieve their aim of building this model and testing it in comparison to experimental data, the text is frequently unclear and doesn't seem to have the right information at the right place and time to allow the reader to most clearly understand the motivation, the approach, or the results. A number of elements of this manuscript were confusing to this reviewer, and I discuss these below in the hopes that raising these points here can bring more clarity in future revisions, and/or that readers will be able to provide additional insight or attention to these questions.

      There are certain elements of the writing that obscure, rather than clarify, the model and the results. For example, the authors frequently refer to "matrix stiffening" and "strain stiffening", which are typically used in the literature to describe the phenomenon whereby an applied force changes the mechanical properties of the substrate; here, for example in regard to the discussion of Figure 4C, these terms instead seem to be simply referring to the experimental intervention of exposing different cells to different concentrations of the collagen matrix. While there may be some element of classically understood strain stiffening, incorporated into the model as the function f(λ_i), this doesn't seem to match the experimental validation - which, as described above, is not about strain stiffening but instead simply uses softer vs. stiffer gels. Therefore, it is unclear what exactly is meant throughout the manuscript by strain stiffening - does it mean "difference in stiffness between two conditions" or does it mean "change in substrate stiffness upon application of force"?

      Furthermore, while the introductory text emphasizes collective migration, the model itself focuses on the interactions between single cells and their environments. The emphasis on collective migration and cell shape in the introduction invokes previous literature focusing on collective phase transitions, but that is misleading. This paper is all about individual cell mechanics, not about collective migration or unjamming.

      The experimental validation seems to have a significant flaw. The mechanics and interactions of the cellular extensions seem to be completely ignored. We see, in Figure 4, that cell bodies are outlined to determine cell shape, but that the extremely long extensions are simply ignored. We know from previous studies that these extensions are generating quite a bit of traction and are contractile, and yet they've been excluded from the analysis. This doesn't make physical sense or fit with previous literature, and would seem to indicate that the regimes predicted by the model are missing an essential component of force generation and cell-matrix interaction.

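    1. Color theory

      Aristotle's theory of color

      Aristotle (384–322 BC) was an ancient Greek philosopher and scientist whose theory of color is an important part of ancient color research. Although the theory has limitations from the standpoint of modern science, it was groundbreaking in its time and had a profound influence on later studies of color.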


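      I. Theoretical background

      1. Philosophical basis

        • Aristotle's theory of color rests on his natural philosophy, in particular the doctrine of the four causes (material, formal, efficient, final) and the theory of the elements (earth, water, fire, air).
        • He held that color is a property of an object's surface, closely tied to its interaction with light.

      2. Observation and experiment

        • Aristotle proposed a mechanism for the formation of color by observing natural phenomena (such as the rainbow and the changes in light passing through different media).
        • Although his theory lacked modern experimental methods, it shows the beginnings of early scientific thinking.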

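      II. Core content of the theory

      1. Basic classification of colors

        • Aristotle divided colors into simple colors and compound colors.
        • Simple colors: white, black, yellow, red, green, blue, violet.
        • Compound colors: produced by mixing simple colors, e.g. orange (red + yellow).

      2. How colors are formed

        • Aristotle held that color results from the interaction of light with the surface of an object.
        • He proposed a theory of the transparent medium: when light passes through a transparent medium (such as air or water), it is reflected or absorbed by the surface of an object, producing different colors.

      3. The relationship between color and light

        • Aristotle regarded light as a state of the transparent medium rather than an independent entity.
        • Color results from the interaction of light with an object's surface; white is the complete reflection of light and black its complete absorption.

      4. Explanation of the rainbow

        • Aristotle offered an early explanation of the rainbow, holding that it results from sunlight being refracted and reflected by raindrops.
        • He divided the colors of the rainbow into five: red, yellow, green, blue, and violet.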

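      III. Strengths and limitations of the theory

      1. Strengths

        • Systematic: Aristotle made the first attempt to systematize color phenomena, proposing both a classification of colors and a mechanism for their formation.
        • Philosophical: his theory is tightly bound to his natural philosophy and reflects the character of ancient Greek thought.
        • Observational: it is based on observation of natural phenomena and, although it lacks experimental verification, shows the beginnings of early scientific thinking.

      2. Limitations

        • Lack of experimental support: the theory relies mainly on observation and reasoning and lacks the experimental verification of modern science.
        • Incomplete classification: his classification does not cover all the colors of the visible spectrum, and his account of color mixing is rather simple.
        • Misunderstanding of light and color: Aristotle did not understand the nature of light (such as wavelength and frequency), which skewed his explanation of how colors are formed.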

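      IV. Influence on later generations

      1. Foundation of medieval color studies

        • Aristotle's theory of color was widely accepted in the Middle Ages and became the basis of European color studies.
        • Many medieval scholars (such as Roger Bacon) carried out further research building on it.

      2. Inspiration for the Renaissance

        • Renaissance artists and scientists (such as Leonardo da Vinci) drew on Aristotle's theory to explore the relationship between light and color.
        • Leonardo cites Aristotle's views repeatedly in his Treatise on Painting.

      3. Groundwork for modern color science

        • Although Aristotle's theory has limitations from the standpoint of modern science, its systematic way of thinking inspired later color research.
        • Newton's spectrum experiments (17th century) partly corrected Aristotle's theory while also drawing on his idea of classifying colors.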

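      V. Comparison with modern color theory

      | Aspect | Aristotle's theory | Modern color theory |
      |--------------|------------------------------------|----------------------------------|
      | Color classification | Simple colors (7) and compound colors | Visible spectrum (continuous wavelengths) |
      | Light and color | Light is a state of the transparent medium; color arises from the interaction of light and objects | Light is an electromagnetic wave; color is determined by wavelength |
      | Rainbow | Sunlight refracted and reflected by raindrops | Sunlight refracted, reflected, and dispersed by raindrops |
      | Experimental basis | Observation and reasoning | Experiment and mathematical models |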


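      VI. Summary

      Aristotle's theory of color is a major achievement of ancient color research. Despite its limitations from the standpoint of modern science, its systematic way of thinking and its observational method laid the groundwork for later studies of color. The theory influenced medieval scholars and had a profound effect on the artists and scientists of the Renaissance. Aristotle's work marks the shift in human understanding of color from mythological explanation to scientific inquiry.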


    1. pg 4 - 75-year deficit equal to 3.5% of taxable payroll. The shift from 3.61% to 3.5% between 2023 and 2024 is the result of 3 assumptions: 1) Economy: greater-than-anticipated growth last year led to an increase in the assumed level of productivity growth over the projection period; 2) Disability incidence: it has continued to drop, driven largely by the strength of the economy and a stricter process for awarding benefits on appeal; 3) Fertility: it has declined sharply because of lower marriage rates, high childcare costs, and lower fertility in the countries from which new immigrants arrive.

    1. Based on the transcript of the YouTube video, here is a summary document for a briefing on the TSI preparatory classes, with timestamps to make specific testimonies easier to locate.

      General Themes

      • Overview of Paths [0:00-0:03]:

        • Former TSI students describe their varied paths after the prépa.
        • Importance of the original baccalauréat, often STI2D or STL, as the foundation of their training.
        • Varied experiences in engineering school, in a BUT program, or after changing track.

      • Difficulties and Challenges in Prépa [1:00-2:00]:

        • The initial "shock" is a recurring theme, linked to a heavier workload.
        • Some students run into difficulties in the first year, others in the second.
        • Pressure and workload are identified as the major challenges.

      • Competitive Exams, Success and Failure [2:00-3:00]:

        • Results in the competitive exams vary: some students succeed, others fail.
        • The importance of persevering after failure is stressed.
        • Switching to other programs such as a BUT is a viable option.

      • Experience in Engineering School [3:00-4:00]:

        • Project work and hands-on practice are valued in engineering school.
        • Student associations are a major asset for personal development.
        • Some regret the lack of supervision compared with the prépa.
        • The schools are becoming more and more demanding.

      • Internships and Professional Experience [4:00-5:00]:

        • Internships are seen as opportunities for learning and discovery.
        • They help students build skills and refine their career plans.

      • Importance of Languages, English First [5:00-6:00]:

        • English is crucial for obtaining the degree and for opportunities abroad.
        • Languages should not be neglected.

      • Student Associations and Engagement [6:00-7:00]:

        • Association life builds skills and strengthens the CV.
        • It offers opportunities for project management and teamwork.

      • Reflections on ENIM [7:00-8:00]:

        • ENIM is seen as a double-edged school, with changes of track possible.

      • Changing Track and Adaptability [8:00-9:00]:

        • Former students describe changing track after the prépa or after engineering school.

      • Advice for Future Students [9:00-10:00]:

        • Choosing one's track carefully and working regularly are essential.
        • Do not neglect languages, and help one another.
        • Make the most of student life.

      • Skills Gained in Prépa [10:00-11:00]:

        • The prépa teaches students how to learn: the methodology for mastering a subject and for stepping out of their comfort zone.
        • The prépa teaches students to adapt.

      • The Role of Teachers [11:00-12:00]:

        • Teachers are available and provide good supervision.

      Specific Points and Anecdotes

      • Personal experiences:

        • Some students joined the prépa somewhat by chance, encouraged by their teachers.
        • Others had this choice in mind from their first year of lycée (seconde).

      • Internships abroad:

        • Many engineering schools offer internships abroad.

      • VIE:

        • The Volontariat International en Entreprise is a contract of 10 to 24 months.
        • It allows one to work abroad for a company while under the protection of the State.

      • The importance of the network:

        • Former students can help new ones find internships.

      • AI:

        • Artificial intelligence will transform the world of work; students need to learn to use it.

      Conclusion

      This briefing aims to give an overview of the TSI preparatory classes through the experiences of those who went through them. It highlights the challenges, the opportunities, and the key skills needed to succeed on this demanding path.

    1. A diagram demonstrating common color mixing terms. A tint is any mixture of a bright “pure” color with white. A shade is any mixture of a “pure” color with black. A tone is any mixture of all three: white, black, and the “pure” color.

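      The figure above is a flowchart of the color-mixing terms used in painting. It shows how mixing a pure color with other colors (white, black, or gray) produces different variations. Each term is explained below:

      1. Pure Color

      • A pure color is a base color that has not been adjusted with any other color. It is fully saturated and contains no added white, black, or gray.

      2. Tint

      • A tint is produced by adding white to a pure color. This makes the color lighter while preserving the original hue. A tint is a bright color with a white component.

      3. Tone

      • A tone is produced by adding gray to a pure color. Adding gray lowers the saturation, making the color less vivid and more muted, while the basic hue is preserved.

      4. Shade

      • A shade is produced by adding black to a pure color. This makes the color darker and adds depth and shadow. A shade is a dark variation, typically used to give a color a sense of depth.

      5. Gray

      • Gray is a mixture of black and white. It has no distinct hue and is a neutral color. It can be varied by changing the ratio of black to white, and it is typically used to moderate other colors.

      6. Black

      • Black is complete darkness, the absence of any color. It is often used to adjust other colors by reducing their brightness or adding depth.

      7. White

      • White represents full brightness and has no hue. It is the lightest color and is often used to adjust other colors by increasing their brightness.

      Summary:

      The diagram explains the color-mixing process in painting: mixing a pure color with white, gray, or black yields a tint, a tone, or a shade, respectively. These terms help artists control the lightness, saturation, and depth of a color precisely and so create rich visual effects.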
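
      As a rough numerical illustration of tint, shade, and tone, the sketch below blends an 8-bit RGB color toward white, black, or mid-gray by linear interpolation. The function names (`mix`, `tint`, `shade`, `tone`), the RGB representation, and the choice of (128, 128, 128) as gray are assumptions made for this example; physical paint mixing is subtractive and does not follow this simple arithmetic exactly.

```python
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
GRAY = (128, 128, 128)  # assumed mid-gray; any black/white ratio would do


def mix(color, other, amount):
    """Blend `color` toward `other`: amount=0 keeps `color`, amount=1 gives `other`."""
    if not 0.0 <= amount <= 1.0:
        raise ValueError("amount must be between 0 and 1")
    return tuple(round(c + (o - c) * amount) for c, o in zip(color, other))


def tint(color, amount):
    """Pure color + white: lighter, same hue."""
    return mix(color, WHITE, amount)


def shade(color, amount):
    """Pure color + black: darker, same hue."""
    return mix(color, BLACK, amount)


def tone(color, amount):
    """Pure color + gray: less saturated, same hue."""
    return mix(color, GRAY, amount)


if __name__ == "__main__":
    pure_red = (255, 0, 0)
    for name, fn in (("tint", tint), ("shade", shade), ("tone", tone)):
        print(name, fn(pure_red, 0.5))
```

      Keeping all three operations as calls to one `mix` helper mirrors the diagram: the only thing that changes between a tint, a shade, and a tone is the color being added.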

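    2. Calculation process

      Converting RGB to HSV is an important operation on color models: it maps the RGB color space (commonly used for screen display) to the HSV color space (commonly used in graphic design and image processing). The RGB model represents a color by the three primaries red, green, and blue, whereas the HSV model represents it by hue, saturation, and value.

      1. Principle of the RGB-to-HSV conversion

      The RGB model describes a color by the intensities of red, green, and blue light, while the HSV model describes it by hue, saturation, and value. Converting RGB to HSV therefore means computing three components: H, S, and V.

      2. Steps of the RGB-to-HSV conversion

      Suppose we have an RGB color value, where:

      • R: red component, in the range 0 to 255
      • G: green component, in the range 0 to 255
      • B: blue component, in the range 0 to 255

      The steps are as follows: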

      2.1 归一化RGB值

      首先,需要将RGB的值从0到255的范围归一化到0到1的范围。这是为了便于后续的计算。

      [ R' = \frac{R}{255}, \quad G' = \frac{G}{255}, \quad B' = \frac{B}{255} ]

      其中,\( R' \)、\( G' \)和\( B' \)是归一化后的RGB值,范围是 0到1

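      2.2 Compute the maximum and minimum

      Compute the maximum (Max) and minimum (Min) of the normalized RGB components:

      \[ \text{Max} = \max(R', G', B') \] \[ \text{Min} = \min(R', G', B') \]

      2.3 Compute value (V)

      Value (V) in HSV represents the brightness of the color. It is simply the maximum of the normalized RGB components:

      \[ V = \text{Max} \]

      2.4 Compute saturation (S)

      Saturation (S) expresses the purity of the color: the higher the value, the more vivid the color. It is computed as:

      \[ S = \frac{\text{Max} - \text{Min}}{\text{Max}} \quad \text{if } \text{Max} \neq 0 \] \[ S = 0 \quad \text{if } \text{Max} = 0 \]

      If Max is 0, the color is black, so the saturation is also 0.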

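      2.5 Compute hue (H)

      Hue (H) indicates the type of color and ranges from 0° to 360°. The formula depends on which normalized component is the maximum:

      • If Max = R' (red is largest): \[ H = 60^\circ \times \left( \frac{G' - B'}{\text{Max} - \text{Min}} \right) \] If the resulting H is negative, add 360° so that it falls between 0° and 360°.

      • If Max = G' (green is largest): \[ H = 60^\circ \times \left( \frac{B' - R'}{\text{Max} - \text{Min}} \right) + 120^\circ \]

      • If Max = B' (blue is largest): \[ H = 60^\circ \times \left( \frac{R' - G'}{\text{Max} - \text{Min}} \right) + 240^\circ \]

      • If Max = Min (the color is a gray), the denominator is zero and the hue is undefined; by convention H is set to 0°.

      These formulas use the largest RGB component to determine the hue angle, so that it lands at the appropriate position on the color wheel.

      2.6 Final HSV values

      • H: hue, ranging from 0° to 360°
      • S: saturation, ranging from 0 to 1
      • V: value, ranging from 0 to 1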

      3. Complete RGB-to-HSV conversion formulas

      To summarize, the steps for converting an RGB value to an HSV value are:

      1. Normalize RGB: \[ R' = \frac{R}{255}, \quad G' = \frac{G}{255}, \quad B' = \frac{B}{255} \]

      2. Compute Max and Min: \[ \text{Max} = \max(R', G', B'), \quad \text{Min} = \min(R', G', B') \]

      3. Compute value (V): \[ V = \text{Max} \]

      4. Compute saturation (S): \[ S = \frac{\text{Max} - \text{Min}}{\text{Max}} \quad \text{if } \text{Max} \neq 0, \qquad S = 0 \quad \text{if } \text{Max} = 0 \]

      5. Compute hue (H):

         • If Max = R': \( H = 60^\circ \times \left( \frac{G' - B'}{\text{Max} - \text{Min}} \right) \)
         • If Max = G': \( H = 60^\circ \times \left( \frac{B' - R'}{\text{Max} - \text{Min}} \right) + 120^\circ \)
         • If Max = B': \( H = 60^\circ \times \left( \frac{R' - G'}{\text{Max} - \text{Min}} \right) + 240^\circ \)

      6. Final result:

         • H: hue (0° to 360°)
         • S: saturation (0 to 1)
         • V: value (0 to 1)

      4. Worked example of RGB-to-HSV conversion

      Suppose we have the RGB value (R=255, G=100, B=50) and want to convert it to HSV.

      1. Normalize RGB: \[ R' = \frac{255}{255} = 1.0, \quad G' = \frac{100}{255} \approx 0.392, \quad B' = \frac{50}{255} \approx 0.196 \]

      2. Compute Max and Min: \[ \text{Max} = 1.0, \quad \text{Min} = 0.196 \]

      3. Compute value (V): \[ V = 1.0 \]

      4. Compute saturation (S): \[ S = \frac{1.0 - 0.196}{1.0} = 0.804 \]

      5. Compute hue (H): because Max = R', we use the formula for red: \[ H = 60^\circ \times \left( \frac{0.392 - 0.196}{1.0 - 0.196} \right) = 60^\circ \times \left( \frac{0.196}{0.804} \right) \approx 14.6^\circ \]

      6. The final HSV value is: \[ H = 14.6^\circ, \quad S = 0.804, \quad V = 1.0 \]
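
      The conversion steps can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `rgb_to_hsv` and its return convention (H in degrees, S and V in 0 to 1) are choices made for this example, and gray inputs (Max = Min) are assigned H = 0 by convention.

```python
import colorsys


def rgb_to_hsv(r, g, b):
    """Convert 8-bit RGB (0-255 each) to (H in degrees, S in 0-1, V in 0-1)."""
    # Step 1: normalize to the range 0-1.
    rn, gn, bn = r / 255, g / 255, b / 255
    # Step 2: maximum and minimum of the normalized components.
    mx, mn = max(rn, gn, bn), min(rn, gn, bn)
    delta = mx - mn
    # Step 3: value is the maximum.
    v = mx
    # Step 4: saturation, guarding against division by zero for black.
    s = delta / mx if mx != 0 else 0.0
    # Step 5: hue depends on which component is the maximum.
    if delta == 0:
        h = 0.0  # gray: hue is undefined, 0 by convention
    elif mx == rn:
        h = 60 * ((gn - bn) / delta)
    elif mx == gn:
        h = 60 * ((bn - rn) / delta) + 120
    else:
        h = 60 * ((rn - gn) / delta) + 240
    if h < 0:
        h += 360  # bring negative angles back into 0-360
    return h, s, v


h, s, v = rgb_to_hsv(255, 100, 50)
print(round(h, 1), round(s, 3), round(v, 3))  # 14.6 0.804 1.0

# Cross-check against the standard library, which returns H as a fraction of a turn.
ref_h, ref_s, ref_v = colorsys.rgb_to_hsv(255 / 255, 100 / 255, 50 / 255)
print(round(ref_h * 360, 1))  # 14.6
```

      The standard-library function `colorsys.rgb_to_hsv` expects inputs already normalized to 0 to 1 and reports hue in the range 0 to 1, which is why the cross-check multiplies by 360.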

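      5. Summary

      RGB-to-HSV conversion involves:

      • normalizing the RGB values,
      • computing the color's value (V) and saturation (S),
      • computing the hue (H) from the maximum and minimum components.

      The resulting HSV representation lets us adjust a color's brightness (via V), saturation (via S), and hue (via H) intuitively. This makes the model well suited to image editing and color selection, because each attribute of a color can be manipulated easily.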
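
      As an illustration of why HSV is convenient for editing, the sketch below adjusts hue, saturation, and value separately using Python's standard `colorsys` module. The helper `adjust_hsv` and its parameter names are hypothetical, introduced only for this example.

```python
import colorsys


def adjust_hsv(rgb, hue_shift_deg=0.0, sat_scale=1.0, val_scale=1.0):
    """Return an 8-bit RGB color after shifting hue and scaling saturation and value."""
    r, g, b = (c / 255 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)      # h, s, v are all in 0-1 here
    h = (h + hue_shift_deg / 360.0) % 1.0       # rotate around the color wheel
    s = min(1.0, max(0.0, s * sat_scale))       # clamp to the valid range
    v = min(1.0, max(0.0, v * val_scale))
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))


orange = (255, 100, 50)
print(adjust_hsv(orange, val_scale=0.5))      # darker, same hue and saturation
print(adjust_hsv(orange, sat_scale=0.0))      # fully desaturated
print(adjust_hsv(orange, hue_shift_deg=120))  # same saturation and value, rotated hue
```

      Doing the same edits directly on R, G, and B would require changing all three channels in a coordinated way, which is the practical reason editors tend to expose HSV-style sliders.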

    1. Here is a timed summary, with timestamps based on the transcript of the YouTube video, highlighting the key points of the presentation on Parcoursup:

      • 0:00-2:37: Introduction and presentation of the speakers.

      • 2:37-3:14: General overview of Parcoursup.

      • 3:14-4:04: Transparency of the selection criteria and of the profiles of admitted students.

      • 4:04-4:24: Parcoursup works toward equal access and student success.

      • 4:24-5:03: Parcoursup update and open-day dates. Open days let students get answers to their questions.

      • 5:03-5:52: Parcoursup calendar and current stage.

      • 5:52-7:12: Guidance support at the lycée. Guidance support is mandatory for students.

      • 7:12-7:57: Online guidance resources (ONISEP, Parcoursup).

      • 7:57-9:22: Types of programs: student status (selective and non-selective) and apprentice status. Applicants are advised to diversify their wishes (vœux) by including both selective and non-selective programs.

      • 9:22-11:04: Information on apprenticeship programs and the role of the CFAs. CFAs can help applicants find an employer for apprenticeship programs.

      • 11:04-12:00: More on apprenticeship and the possibility of switching from school-based status to apprentice status during the year. It is possible to start a program under school-based status and continue as an apprentice if a company is found.

      • 12:00-15:34: Focus on the program information sheets and their six main tabs. It is important to take the time to examine each program.

      • 15:34-16:20: New for 2025: access figures and statistics on further study.

      • 16:20-17:18: How applications are examined and the role of the application review committees. The review committees are made up of teachers.

      • 17:18-17:54: Visibility of the application criteria and statistics.

      • 17:54-18:26: Reminder of step 2: registration, formulating wishes, and finalizing the application file.

      • 18:26-21:45: Activating the Parcoursup file and checking the information. The account must be activated by the student.

      • 21:45-24:58: Formulating wishes: freedom of choice, advice from teachers, and statistics.

      • 24:58-27:41: Number of wishes allowed and multiple wishes. Applicants can formulate up to 10 wishes under school-based status and 10 additional wishes for apprenticeships.

      • 27:41-31:46: Focus on multiple wishes and sub-wishes. A multiple wish counts as a single wish among the 10 allowed.

      • 31:46-34:23: Examples of multiple wishes (BTS Métiers de la Chimie) and exceptions (IFSI).

      • 34:23-35:55: Competitive entrance exams and groupings of schools (engineering schools).

      • 35:55-37:11: Apprenticeship wishes and the absence of a deadline for apprenticeship programs.

      • 37:11-38:00: The geographic recruitment sector and its impact.

      • 38:00-41:41: Geographic sector: selective vs. non-selective programs and home académies. For non-selective programs, priority is given to candidates from the académie.

      • 41:41-44:34: Handling special situations: parents living in different académies.

      • 44:34-45:34: Geographic sector: the case of Île-de-France.

      • 45:34-48:24: Gap year (année de césure): conditions and procedures. A gap year is not granted automatically and requires the agreement of the program.

      • 48:24-50:06: Reminder of the deadline for formulating wishes (13 March) and the importance of not waiting until the last minute.

      • 50:06-51:15: Support for writing cover letters.

      • 51:15-54:48: Finalizing the file and confirming wishes: the roles of the lycée and of the candidate.

      • 54:48-57:22: Advice on the cover letter and the importance of tailoring it to the program. Using artificial intelligence tools to write the cover letter is discouraged.

      • 57:22-58:46: The "Autres projets" (other plans) section: its mandatory nature and the confidentiality of the information.

      • 58:46-1:01:20: Activities and interests: showcasing the applicant's profile and extracurricular commitments.

      • 1:01:20-1:02:32: Specific certificate required for law degrees (licences de droit).

      • 1:02:32-1:07:25: Items sent by the school: report cards, the fiche Avenir, and teachers' assessments. The fiche Avenir is completed by the homeroom teachers (professeurs principaux) and the head of school.

      • 1:07:25-1:08:33: Fiche Avenir: transferable skills and the head of school's opinion.

      • 1:08:33-1:10:20: Specific pathways, certifications (Pix), and the importance of the Pix certification.

      • 1:10:20-1:11:41: Details on absences and how they are taken into account.

      • 1:11:41-1:13:03: Fiche Avenir: available for consultation from 2 June 2025.

      • 1:13:03-1:13:42: Support for students with disabilities.

      • 1:13:42-1:14:13: Reminder of the 2 April deadline and the consequences of missing it.

      • 1:14:13-1:15:16: It is not Parcoursup that decides on admission, but a committee.

      • 1:15:16-1:15:40: Question about how the Pix certification is reported.

      • 1:15:40-1:17:11: Equal-opportunity policy and places prioritized for certain baccalauréat tracks. Places are prioritized for vocational baccalauréat holders in BTS programs and for technological baccalauréat holders in BUT programs.

      • 1:17:11-1:18:03: Next step: dedicated support to anticipate the responses.

      • 1:18:03-1:20:28: STAPS: selective or non-selective program, and grant applications.

      • 1:20:28-1:21:29: Grants: the CROUS manages grants, and applicants can run a simulation.

      • 1:21:29-1:22:33: CROUS calendar and contact for questions about grants.

      • 1:22:33-1:23:00: Thanks and next YouTube live.

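    1. Munsell was dissatisfied with the shortcomings of traditional color teaching and believed that color needed a more scientific, more systematic means of expression

      Detailed explanation: Munsell was dissatisfied with the shortcomings of traditional color teaching and believed that color needed a more scientific, more systematic means of expression

      Munsell's criticism and reform of traditional color teaching grew out of a series of problems he encountered in his practice as an art educator. A detailed analysis follows: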


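      I. Shortcomings of traditional color teaching

      1. Highly subjective, lacking a scientific basis

         • Traditional color teaching relied mainly on artists' experience and intuition, without systematic theoretical support.
         • Example: teachers typically guided students in mixing colors by “feel” rather than by quantifiable rules.

      2. Vague color descriptions

         • Traditional methods described colors in imprecise language (such as “deep red” or “light blue”) that could not convey color attributes exactly.
         • Example: different people may understand “deep red” in markedly different ways.

      3. No unified standard

         • Artists and schools classified and named colors differently, which made a common standard hard to establish.
         • Example: the same color may have different names in different cultures or artistic movements.

      4. Neglect of the three-dimensional nature of color

         • Traditional methods usually attended only to hue (such as red or blue) and overlooked the importance of value and chroma.
         • Example: they could not accurately describe the difference between “a bright deep red” and “a dull deep red”.

      5. Limited teaching tools

         • Traditional teaching relied mainly on the color wheel and a limited set of color cards, which could not show the full complexity of color relationships.
         • Example: a color wheel cannot directly show variations in lightness or differences in saturation.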

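      II. Munsell's approach to improvement

      1. A scientific definition of color

         • He introduced three dimensions, hue, value, and chroma, which define color as a quantifiable object of study.
         • Example: 5YR 6/8 precisely describes a color with a yellow-red hue, medium value, and high chroma.

      2. A systematic color space

         • He built a three-dimensional color space model that shows the relationships between colors directly.
         • Example: the hue circle shows changes in hue, the value axis shows changes from dark to light, and the chroma axis shows changes in saturation.

      3. Optimizing for perceptual uniformity

         • He used experiments to make the visual difference between adjacent color chips consistent, so that the color space matches human perception more closely.
         • Example: on the hue circle, the transition from red to yellow looks as even as the transition from yellow to green.

      4. Standardized color naming

         • He developed a unified color notation system to avoid vague descriptions.
         • Example: writing 10B 5/6 instead of “a blue of medium lightness”.

      5. Developing teaching tools

         • He designed the Munsell color solid and a color atlas to help students grasp color relationships visually.
         • Example: the color solid shows the three-dimensional structure of color, and the atlas provides concrete color samples.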

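      III. Core content of A Color Notation

      1. The three-dimensional theory of color

         • The book sets out in detail the definitions of hue, value, and chroma and how they relate to one another.
         • Example: the hue circle is divided into 10 major hues, each further subdivided into 10 steps.

      2. The color notation system

         • It proposes a notation in the format H V/C.
         • Example: 5YR 6/8 denotes a color with a yellow-red hue, medium value, and high chroma.

      3. Perceptual-uniformity experiments

         • It describes how the color space was adjusted experimentally to ensure perceptual uniformity.
         • Example: the spacing between color chips was determined through a large number of visual matching experiments.

      4. Teaching applications

         • It shows how the Munsell system can be applied in art education and in practical color work.
         • Example: students use the color solid to understand color relationships and the notation system to mix colors.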
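
      To make the H V/C notation concrete, here is a small Python sketch that splits a string such as 5YR 6/8 into its parts. It is an illustrative parser only: the names `MunsellColor` and `parse_munsell` are hypothetical, the list of ten hue-family abbreviations is the commonly used one, and neutral grays (written with N and no chroma) are deliberately not handled.

```python
import re
from typing import NamedTuple

# The ten hue families, in order around the hue circle.
HUE_FAMILIES = ("R", "YR", "Y", "GY", "G", "BG", "B", "PB", "P", "RP")

_NUMBER = r"\d+(?:\.\d+)?"
_PATTERN = re.compile(
    rf"^\s*({_NUMBER})\s*([A-Z]{{1,2}})\s+({_NUMBER})\s*/\s*({_NUMBER})\s*$"
)


class MunsellColor(NamedTuple):
    hue_step: float   # position within the hue family, e.g. 5
    hue_family: str   # e.g. "YR"
    value: float      # lightness, the V in H V/C
    chroma: float     # colorfulness, the C in H V/C


def parse_munsell(notation: str) -> MunsellColor:
    """Parse a chromatic Munsell notation such as '5YR 6/8'."""
    match = _PATTERN.match(notation)
    if match is None:
        raise ValueError(f"not in H V/C form: {notation!r}")
    step, family, value, chroma = match.groups()
    if family not in HUE_FAMILIES:
        raise ValueError(f"unknown hue family: {family!r}")
    return MunsellColor(float(step), family, float(value), float(chroma))


print(parse_munsell("5YR 6/8"))
# MunsellColor(hue_step=5.0, hue_family='YR', value=6.0, chroma=8.0)
```

      Converting a parsed notation to RGB is a separate problem that requires measured lookup data, so it is outside the scope of this sketch.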

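      IV. Significance of the Munsell system

      1. Putting color research on a scientific footing

         • It gave color science a quantifiable research tool and advanced the development of color theory.

      2. Transforming art education

         • Its systematic teaching method helps students master color knowledge more efficiently.

      3. Cross-disciplinary applications

         • It is widely used in fields such as soil science, industrial design, and digital color management.

      4. Standardizing color expression

         • Its unified notation system resolved the vagueness of traditional methods.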

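      V. Comparison with traditional methods

      | Aspect | Traditional methods | Munsell system |
      |--------------|------------------------------|------------------------------|
      | Color definition | Subjective description (e.g. “deep red”) | Scientific notation (e.g. 5YR 6/8) |
      | Teaching tools | Color wheel, limited color cards | Color solid, systematic color atlas |
      | Color relationships | Vague, hard to quantify | Shown in three-dimensional space, perceptually uniform |
      | Fields of application | Mainly limited to artistic creation | Cross-disciplinary (science, design, etc.) |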


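      VI. Summary

      Through a scientific and systematic approach, Munsell addressed the vagueness, subjectivity, and limitations of traditional color teaching. His work not only transformed art education but also laid a foundation for color science and for cross-disciplinary applications. The publication of A Color Notation marked the start of a new era in color research.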

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02831

      Corresponding author(s): Charisios Tsiairis

      1. General Statements [optional]

      We are very pleased that all three reviewers found our work to be solid, well-supported by the data, and free of major flaws. It is particularly gratifying that they did not request additional experimental work to support our conclusions. Instead, their comments focused on clarifications, textual improvements, and refinements in data presentation, which we have carefully addressed.

      We have made revisions to improve the clarity of the manuscript, incorporating insightful suggestions from the reviewers. These include refining key explanations, adjusting figure annotations, and modifying the structure of certain sentences. Additionally, we have addressed specific points regarding statistical significance, genome assembly references, and phylogenetic comparisons, ensuring that all aspects of our study are as precise and informative as possible.

      We are confident that these revisions have strengthened the manuscript.

      2. Point-by-point description of the revisions

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)): *

      *Overall, the paper is well-written, the figures are easy to interpret, and the conclusions are well supported by the data. Most of the points discussed below could be addressed with simple text changes. *

      *General Points: *

      • *The upregulation of Gata3 in response to Zic4 RNAi is relatively modest compared to the more pronounced upregulation of Zic4 following Gata3 knockdown, but this point is not really addressed. While these issues could be simply technical, they might also hint at additional layers of regulation that are not yet fully understood.*

      The observed differences in upregulation are primarily technical. Expression levels are measured relative to unperturbed tissue, and in the control, Zic4 expression in the foot is detected only at noise levels (see Fig. 2C). As a result, any increase in Zic4 expression upon Gata3 knockdown appears relatively high when normalized to the minimal control levels. In contrast, Gata3 is already present at detectable levels in control samples from the upper body, head, and tentacles (see Fig. 2D). Therefore, while its upregulation following Zic4 RNAi appears more modest, we interpret this as a qualitative indication of increased gene expression in the absence of the opposing transcription factor. That said, we acknowledge the possibility of additional regulatory layers contributing to these differences.

      • *Extending the time course would strengthen the conclusion that, in the Gata3 knockdown, the existing basal disk cells remain stable while body column cells migrating into the region differentiate into tentacle cells. If this hypothesis is correct, one would predict that by approximately 20 days, the basal disk cells would be completely replaced.*

      This is a valid point; however, the interpretation is complicated by the technical limitations of RNAi-based knockdown rather than a complete knockout of Gata3. Over time, the effect of RNAi diminishes, and we have observed that GFP expression returns within four weeks following GFP RNAi, indicating a temporal limit to RNAi-mediated knockdown. Therefore, while an extended time course would be informative, the transient nature of the knockdown makes it challenging to definitively track long-term cell replacement dynamics.

      • *The conclusion that tentacle cells transdifferentiate into basal disc cells in the Zic4 knockdown may require more nuance, as only the tips of the tentacles express peroxidase. Do the more proximal regions of the tentacle express peduncle markers?*

      We appreciate the reviewer’s comment. In our previous publication (Vogg et al., 2022), we provided evidence supporting this phenomenon. As demonstrated in our data published there, markers of the peduncle, rather than the basal disc—such as manacle (gene ID 100212761) (Bridge et al., 2000) and Bmp5-8 (gene ID 100206618) (Reinhardt et al., 2004)—are also upregulated, suggesting a transition towards a peduncle-like state. However, we opted not to elaborate on this aspect in the current manuscript to maintain focus and avoid redundancy with previously published findings.

      *Specific Points: *

      *Figure 1A, Figure 4E: The pictorial representation of Zic4 expression may need to be revised, as in situ hybridization data from Vogg et al., 2022, suggests that Zic4 is absent from the hypostome and tentacle tips. While in situ hybridization can sometimes lack precision due to variability in staining protocols and subjective decisions on when to stop the reaction, this observation aligns with scRNA-seq data, which also indicates a lack of Zic4 expression in the hypostome and tips of the tentacles. *

      Our intention was to illustrate the general presence of Zic4 in the oral domain, but we acknowledge the reviewer’s point that this could be misleading regarding its precise expression pattern. To address this concern, we have updated the figure panels to more accurately reflect the available in situ hybridization and scRNA-seq data.

      *Figure 1 Legend: For panel D, the legend says "data taken from 28" but the references are not numbered. Same problem for panel E legend. *

      We thank the reviewer for catching this error. We have now corrected the references, replacing the numbering with the first authors' last names and publication dates.

      Figure 1D: There may be a mistake in the Hydra body part labeling. Is "B" supposed to be "P" for peduncle?

      We appreciate the reviewer’s observation. The label refers to the budding zone, and we acknowledge our omission in specifying this. We have now updated the figure and its legend to clarify this.

      *Figure 1 Panel E: Please provide clarification regarding what each box means. Are these 8 replicates of the same condition, or are these the proximal and distal regions of the tentacles as was collected in the Vogg paper? *

      We appreciate the reviewer’s request for clarification. These conditions are indeed similar to those in the previously published Vogg et al. paper. The boxes in the figure represent proximal and distal tentacle regions, each with four replicates. We have now updated the figure and its legend to make this explicit.

      *Figure 2A: Consider using the co-expression stats from Fig S2, which are very informative. *

      We have added the percentages of cells expressing Zic4, Gata3, and both genes to the panel.

      *Figure 2E, F: It would be more intuitive to group each experimental sample with its corresponding control. *

      To make the figure clearer, we modified it and grouped each experimental sample with its corresponding control.

      *Figure 2C-F: Consider conducting statistical tests of significance between control and treatment groups. *

      We have now expanded the statistical analyses, ensuring that significance tests are presented in all relevant instances. However, we note that while statistical significance is important, it should be interpreted alongside other factors such as the magnitude of the effect, consistent trends across replicates, and biological relevance. Additionally, high standard deviations in certain conditions may influence absolute p-values, and we encourage consideration of the broader context of the data when interpreting these results.

      *Figure 2 E - Considering the error bars, Gata3 upregulation in response to Zic4 knockdown does not look significant based on qPCR. Showing the significance of the up-regulation in the RNA-seq data may be more convincing. (I believe RNA-seq to be more reliable anyway). *

      We understand the reviewer’s concern. The p-value for the qPCR data is slightly above 0.05, primarily due to high standard deviation. As the reviewer notes, qPCR on RNAi samples can be noisy, so the data should be interpreted in context. Importantly, the consistent qualitative increase in Gata3 levels after Zic4 knockdown aligns with the RNA-seq results, which, as the reviewer correctly points out, provide a more reliable measurement. Additionally, qPCR samples include a broader portion of head tissue, likely diluting the Gata3 signal from the tentacles and contributing to the observed variability.

      *Figure S2: Might be helpful to show co-expression UMAPs here, like what is shown in Figure 2A. *

      We appreciate the reviewer’s suggestion. However, we believe that displaying co-expression UMAPs for Zic4 would be redundant. Additionally, for genes with greater positional overlap, such as FoxI1 and Nfat5, co-expression UMAPs make visualization more challenging. To ensure clarity and optimize the interpretability of the data, we have chosen to present the expression profiles of each gene separately.

      *Page 4: "Interestingly, a similar binary choice pattern appears in certain neuronal lineages as well. A recent study demonstrated the involvement of Gata3 in specifying neurons at the aboral end (Primack et al. 2023), suggesting that this cross-regulation between Zic4 and Gata3 may extend beyond the epidermal lineage." Just a note that this paper shows expression, but doesn't show function as the statement implies, so the statement should be changed accordingly. *

      Indeed, the study does not focus on the functional role of Gata3 in these neurons. We have revised the sentence, replacing "involvement of Gata3 in specifying neurons" with "expression of Gata3 in neurons emerging" to more accurately reflect the study’s findings.

      *Page 10: "Transcription Factor Binding site analysis... Hydra promoter sequences were compiled from the NCBI Hydra RP 105 assembly." Authors should provide a repository identifier for the genome they are using. Based on the information provided, it appears the authors are using Genome assembly "Hydra_RP_1.0" RefSeq GCF_000004095.1. However, that genome assembly has been suppressed for the following reason: "superseded by newer assembly for species". Authors should consider updating the reference assembly they are using to map their sequencing data and identify promoter sequences. *

      We appreciate the reviewer’s concern. However, we have chosen to use the Hydra_RP_1.0 assembly for Figure 1 to maintain consistency with previously published data, which were also mapped to this assembly. Since these publications predate the newer assembly, using the same reference ensures comparability in our analysis. Importantly, the assembly we used remains downloadable and accessible to every researcher. That said, for the phylogenetic analysis in Figure 2, we have used the latest available genome assemblies and annotations for all species, including Hydra. We have now clarified this in the Methods section.

      *The paper makes great use of the Hydra scRNA-seq data set! Minor point, when referring to the Hydra scRNA-seq data set, please cite Siebert et al., 2019 (data collection) and Cazet et al., 2023 (analysis that is being used in this paper). *

      We appreciate the reviewer’s suggestion and have updated the references accordingly to include Siebert et al., 2019, for data collection and Cazet et al., 2023, for the analysis used in this paper.

      Something to keep in mind: To an audience without expertise in Hydra cell type morphology, the nematocyte marker HCR will likely be more convincing than the actin staining in Figure 3D to identify and quantify nematocytes.

      We agree with the reviewer that the nematocyte marker HCR provides a more specific identification of nematocytes. This is why we have also used the nematocilin marker in separate samples. However, actin staining adds important information on the morphology of the surrounding epithelial cells, which become indistinguishable from battery cells in Gata3 KDs. Unfortunately, combining actin staining with HCR is technically challenging, as the tissue preparation protocols for these two approaches are not compatible, and we have therefore decided to show both stainings next to each other.

      *Minor Wording Issues: *

      *Page 2. "However, the mechanism by which Zic4 prevents the battery cell program from misexpression in normal tentacles remained unclear." Could read more clearly as: However, the mechanism by which Zic4 prevents the misexpression of the battery cell program in normal tentacles remained unclear. *

      We have made the suggested change.

      *Page 2. "Potential candidates for this function could be found among TFs with highly enriched binding sites in the dataset, which are themselves Zic4 targets." Could read more clearly as: We reasoned that this intermediary factor, likely a target of Zic4, would be a transcription factor with highly enriched binding sites in the dataset. *

      We are grateful for the suggestion and have changed the text accordingly.

      *p3-4. "Q-PCR performed on dissected oral and aboral body regions confirmed this finding (Fig. 2C-D)" It is unclear which "finding" is being confirmed. *

      We are referring to the upregulation of gata3 expression in tentacles upon Zic4 knockdown. To make this clearer, we have revised the wording to: “Q-PCR performed on dissected oral and aboral body regions confirmed the upregulation of gata3 upon Zic4 knockdown (Fig. 2C-D).”

      *Reviewer #1 (Significance (Required)): *

      *This compelling study from the Tsiairis lab uncovers a double-negative feedback loop between the transcription factors Zic4 and Gata3, functioning as a toggle switch to control oral and aboral fates in Hydra's epidermal lineage. Addressing fundamental questions in developmental biology, this research sheds light on the mechanisms underlying cell fate determination in relationship to their spatial organization. In Hydra, Wnt signaling, a conserved pathway critical for establishing primary body axes, promotes oral fate, emanating from an organizer at the oral end. Hydra body column epidermal cells can differentiate into distinct cell types, including oral battery cells and aboral basal disk cells, but the regulatory mechanisms remained elusive. Recent research from the Tsiairis lab identified Zic4 as a direct Wnt signaling target necessary for repressing basal disk-specific genes. Knocking down Zic4 caused battery cells to transform into basal disk cells, though Zic4 did not directly activate basal disk-specific genes, pointing to an intermediary regulator. This study identifies Gata3 as a key regulator of basal disk gene expression, as it is highly expressed at the aboral end, is inversely correlated with Zic4, and is upregulated in Zic4 knockouts. Functional experiments revealed mutual inhibition between Zic4 and Gata3: knocking down Gata3 led to differentiation of battery cells at the aboral end, while simultaneous knockdowns of Zic4 and Gata3 rescued the phenotypes of individual knockdowns. These findings demonstrate a finely tuned balance between Zic4 and Gata3 in regulating cell fate along the oral-aboral axis in Hydra. This paper therefore offers new insights into the spatial organization of cell type specification in Hydra and into broader principles of cell fate determination. *

      We appreciate the reviewer’s thoughtful summary and recognition of our study’s significance.

      *Reviewer #2 (Evidence, reproducibility and clarity (Required)): *

      *Summary: *

      *The authors use the freshwater hydrozoan Hydra as a model to investigate mechanisms of cell fate decisions in the context of terminal epithelial differentiation. The epithelia migrates towards the extremities of the animal and takes on one of two fates: elongated battery cells that house the cnidocytes ( stinging cells ) in the oral ( head ) end of the animal, or more compact secretory basal disc cells at the aboral ( foot ) end. In this manuscript the authors build on previous work that showed the transcription factor Zic4 is necessary for battery cell formation. The authors use in situ hybridization and additional labelling techniques to assess cell fate under a variety of conditions. The authors first screen for Zic4 binding sites in the promoter regions of aboral genes that previously were demonstrated to be up-regulated in response to Zic4 knockdown, and survey publicly available expression databases to identify GATA3 as a candidate transcription factor that shows complementary expression patterns. The authors also screen the promoter regions of Zic4 and GATA3 from a number of other cnidarians and find reciprocal binding sites in all but one case. This is interpreted by the authors as evidence for a Zic4/GATA3 cnidarian regulatory motif. The authors demonstrate that KD of GATA3 results in the opposite phenotype: ectopic differentiation of oral battery cells, and that animals with perturbed GATA3 function fail to regenerate the aboral basal disk cells but rather show oral battery cell phenotype. Further, KD of both genes (Zic4: battery cells and GATA3: pedal disc cells) results in a rescue of the phenotype of either single KD, thereby illustrating that together these two genes function as a negative feedback loop controlling the terminal differentiation of the ectodermal epithelia. *

      *Major comments: *

      *- Are the key conclusions convincing? *

      *The key conclusions are convincing. *

      *- Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? *

      *The cross species comparison of binding sites is insightful, but is presented very early in the manuscript. This would be better placed as a final piece, to place the Hydra-specific findings in a larger context. *

      *- Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. *

      *No. *

      *- Are the data and the methods presented in such a way that they can be reproduced? *

      *Yes, *

      *- Are the experiments adequately replicated and statistical analysis adequate? *

      *Yes. *

      *Minor comments: *

      *- Specific experimental issues that are easily addressable. *

      *None. *

      *- Are prior studies referenced appropriately? *

      *Yes. *

      *- Are the text and figures clear and accurate? *

      *Yes. The figures are very nice. *

      *- Do you have suggestions that would help the authors improve the presentation of their data and conclusions? *

      *1) Move the phylogenetic comparisons to the end *

      *2) Similarly, in the section on GATA3 KD, present the normal condition first, and then the regeneration experiment results. *

      We thank the reviewer for their positive assessment and constructive suggestions. Below, we comment on each point:

      • Placement of cross-species comparison: This suggestion concerns the emphasis and structure of the manuscript. We appreciate the reviewer's interest in the evolutionary aspects of our work. However, we believe that moving this analysis to the end would dilute the main message, which is reinforced by the schematic in Figure 4E-F. We aim to conclude with the experimental results demonstrating the minimization of phenotypic consequences when both factors are knocked down. Therefore, we have chosen to retain the cross-species comparison in its current position to emphasize the conservation of the double-negative interaction before presenting the functional consequences of its perturbation.
      • Reordering of Gata3 KD results: We understand the rationale behind this suggestion. However, our ordering is guided by the fact that foot regeneration deficiency under Gata3 KD has already been documented and presented in previous work (Ferenc et al., 2021). For this reason, we begin with that reference, then build upon it with a deeper examination of the phenotype.

      We are grateful for the reviewer’s feedback and for recognizing the clarity of our figures and analysis.

      **Referee cross-commenting**

      *I have read the other two reviews and find that we are all in agreement that the work presented in this manuscript is sound and is a valuable scientific contribution. I would encourage the authors to consider my own suggestions for the order of presentation of the data, to retain a specific-to-broad theme (normal then regeneration / Hydra then comparisons), and to incorporate the detailed corrections highlighted by reviewer 1. *

      *Regarding reviewer 3's comment regarding SoxA in cnidarians. This is likely true and the nomenclature of the gene likely comes from an automated pipeline to infer gene identities. Unless the authors follow up on this gene, I don't think the onus is on the authors to confirm the identity. *

      We appreciate Reviewer #3's remark about the nuance of transcription factor homology. The situation is exactly as described here by Reviewer #2: the gene names in Figure 1 are based on the results of NCBI automated homology annotation, which we have now clarified in a note in the legend of Figure 1.

      *Reviewer #2 (Significance (Required)): *

      *- Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. *

      *This paper is a beautiful illustration of the importance of relative gene expression levels in controlling cell fate decisions. Together with their previous works, the role of both transcription factors in specifying one of two possible terminal fates is very clearly illustrated. The final observation, that a mutual knockdown of both factors leads to a rescue of the polarity of the cell type balance is an excellent example of the importance of relative gene expression levels in controlling homeostatic balance between two mutually exclusive cell fates. *

      *- Place the work in the context of the existing literature (provide references, where appropriate). *

      *The manuscript does a good job of placing the work into the appropriate context. *

      *- State what audience might be interested in and influenced by the reported findings. *

      *Readers with interest in gene regulation, cell specification, and mechanisms of cell type diversification would find these results of interest. *

      *- Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. *

      *Comparative invertebrate embryogenesis; Single cell transcriptomics; Cell and tissue evolution *

      We greatly appreciate the reviewer’s positive feedback and recognition of our study's focus on gene expression in cell fate decisions. We're pleased that our findings on the mutual knockdown and the broader context were well received. Thank you for highlighting the relevance of our work to gene regulation and cell specification.

      *Reviewer #3 (Evidence, reproducibility and clarity (Required)): *

      *Ferenc et al. have studied the role of transcription factors Zic4 and Gata3 in Hydra epithelial cell fate decision. The Tsiairis team has published a paper recently in which they had studied the role of Zic4 in promoting tentacle formation. Here, they discover a negative feedback loop between Zic4 and Gata3 in the context of epithelial cell differentiation. The authors used computational techniques to identify Zic4 binding sites in Hydra promoters of genes that are upregulated in basal disks, known from a previous study, and identified eight candidate genes. Previous studies were also used to narrow down potential Zic4 targets. They argue that Gata3 appears as a strong candidate to be suppressed by Zic4 in the head and being expressed in the foot. Knockdown experiments, followed by qPCR revealed that Gata3 and Zic4 expression is mutually exclusive such that the one represses the other. Next, they report that Gata3 RNAi results in ectopic battery cells at the lower body column, although basal disk cells maintained their identity following Gata3 knockdown. Finally, knocking down both Gata3 and Zic4 resulted in a more normal phenotype, as predicted if a negative feedback loop existed between the two. *

      *A minor comment: one of the predicted Zic4 targets is a gene called Sry. Sry is a mammalian male determinant and a SOX-related protein (SoxA). I was wondering if the authors performed phylogenetic analysis or simply took a BLAST hit as the source for this gene's name. I am unaware of SoxA-like genes in cnidarians. Therefore, I would recommend performing a SOX phylogeny and renaming it according to its closest relatives, which probably won't be Sry. *

      The naming of the gene as Sry was indeed based on the NCBI automated homology annotation, and we have clarified this in the revised manuscript. Since we did not pursue further analysis of this gene, we believe that a deeper phylogenetic analysis may not be necessary and could potentially divert attention from the main focus of our study on Gata3's role.

      *Reviewer #3 (Significance (Required)): *

      *This work closes some gaps that remained after publication of previous research by the Tsiairis lab and others. The data are of high quality, solid, and support the authors' conclusions. The manuscript is of general interest for developmental biologists and evodevo workers. *

      We thank the reviewer for the thoughtful assessment of our work. We appreciate their feedback and the recognition of the quality and significance of our findings.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Overall, the paper is well-written, the figures are easy to interpret, and the conclusions are well supported by the data. Most of the points discussed below could be addressed with simple text changes.

      General Points:

      1. The upregulation of Gata3 in response to Zic4 RNAi is relatively modest compared to the more pronounced upregulation of Zic4 following Gata3 knockdown, but this point is not really addressed. While these issues could be simply technical, they might also hint at additional layers of regulation that are not yet fully understood.
      2. Extending the time course would strengthen the conclusion that, in the Gata3 knockdown, the existing basal disk cells remain stable while body column cells migrating into the region differentiate into tentacle cells. If this hypothesis is correct, one would predict that by approximately 20 days, the basal disk cells would be completely replaced.
      3. The conclusion that tentacle cells transdifferentiate into basal disc cells in the Zic4 knockdown may require more nuance, as only the tips of the tentacles express peroxidase. Do the more proximal regions of the tentacle express peduncle markers?

      Specific Points:

      Figure 1A, Figure 4E: The pictorial representation of Zic4 expression may need to be revised, as in situ hybridization data from Vogg et al., 2022, suggests that Zic4 is absent from the hypostome and tentacle tips. While in situ hybridization can sometimes lack precision due to variability in staining protocols and subjective decisions on when to stop the reaction, this observation aligns with scRNA-seq data, which also indicates a lack of Zic4 expression in the hypostome and tips of the tentacles.

      Figure 1 Legend: For panel D, the legend says "data taken from 28" but the references are not numbered. Same problem for panel E legend.

      Figure 1D: There may be a mistake in the Hydra body part labeling. Is "B" supposed to be "P" for peduncle?

      Figure 1 Panel E: Please provide clarification regarding what each box means. Are these 8 replicates of the same condition, or are these the proximal and distal regions of the tentacles as was collected in the Vogg paper?

      Figure 2A: Consider using the co-expression stats from Fig S2, which are very informative.

      Figure 2E, F: It would be more intuitive to group each experimental sample with its corresponding control.

      Figure 2C-F: Consider conducting statistical tests of significance between control and treatment groups.

      Figure 2 E - Considering the error bars, Gata3 upregulation in response to Zic4 knockdown does not look significant based on qPCR. Showing the significance of the up-regulation in the RNA-seq data may be more convincing. (I believe RNA-seq to be more reliable anyway).

      Figure S2: Might be helpful to show co-expression UMAPs here, like what is shown in Figure 2A.

      Page 4: "Interestingly, a similar binary choice pattern appears in certain neuronal lineages as well. A recent study demonstrated the involvement of Gata3 in specifying neurons at the aboral end (Primack et al. 2023), suggesting that this cross-regulation between Zic4 and Gata3 may extend beyond the epidermal lineage." Just a note that this paper shows expression, but doesn't show function as the statement implies, so the statement should be changed accordingly.

      Page 10: "Transcription Factor Binding site analysis... Hydra promoter sequences were compiled from the NCBI Hydra RP 105 assembly." Authors should provide a repository identifier for the genome they are using. Based on the information provided, it appears the authors are using Genome assembly "Hydra_RP_1.0" RefSeq GCF_000004095.1. However, that genome assembly has been suppressed for the following reason: "superseded by newer assembly for species". Authors should consider updating the reference assembly they are using to map their sequencing data and identify promoter sequences.

      The paper makes great use of the Hydra scRNA-seq data set! Minor point, when referring to the Hydra scRNA-seq data set, please cite Siebert et al., 2019 (data collection) and Cazet et al., 2023 (analysis that is being used in this paper).

      Something to keep in mind: To an audience without expertise in Hydra cell type morphology, the nematocyte marker HCR will likely be more convincing than the actin staining in Figure 3D to identify and quantify nematocytes.

      Minor Wording Issues:

      Page 2. "However, the mechanism by which Zic4 prevents the battery cell program from misexpression in normal tentacles remained unclear." Could read more clearly as: However, the mechanism by which Zic4 prevents the misexpression of the battery cell program in normal tentacles remained unclear.

      Page 2. "Potential candidates for this function could be found among TFs with highly enriched binding sites in the dataset, which are themselves Zic4 targets." Could read more clearly as: We reasoned that this intermediary factor, likely a target of Zic4, would be a transcription factor with highly enriched binding sites in the dataset.

      p3-4. "Q-PCR performed on dissected oral and aboral body regions confirmed this finding (Fig. 2C-D)" It is unclear which "finding" is being confirmed.

      Significance

      This compelling study from the Tsiairis lab uncovers a double-negative feedback loop between the transcription factors Zic4 and Gata3, functioning as a toggle switch to control oral and aboral fates in Hydra's epidermal lineage. Addressing fundamental questions in developmental biology, this research sheds light on the mechanisms underlying cell fate determination in relationship to their spatial organization. In Hydra, Wnt signaling, a conserved pathway critical for establishing primary body axes, promotes oral fate, emanating from an organizer at the oral end. Hydra body column epidermal cells can differentiate into distinct cell types, including oral battery cells and aboral basal disk cells, but the regulatory mechanisms remained elusive. Recent research from the Tsiairis lab identified Zic4 as a direct Wnt signaling target necessary for repressing basal disk-specific genes. Knocking down Zic4 caused battery cells to transform into basal disk cells, though Zic4 did not directly activate basal disk-specific genes, pointing to an intermediary regulator. This study identifies Gata3 as a key regulator of basal disk gene expression, as it is highly expressed at the aboral end, is inversely correlated with Zic4, and is upregulated in Zic4 knockouts. Functional experiments revealed mutual inhibition between Zic4 and Gata3: knocking down Gata3 led to differentiation of battery cells at the aboral end, while simultaneous knockdowns of Zic4 and Gata3 rescued the phenotypes of individual knockdowns. These findings demonstrate a finely tuned balance between Zic4 and Gata3 in regulating cell fate along the oral-aboral axis in Hydra. This paper therefore offers new insights into the spatial organization of cell type specification in Hydra and into broader principles of cell fate determination.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript submitted by Qi et al., the authors study the mechanism of RNA methylation by the METTL3-METTL14 complex. This complex deposits the major epitranscriptomic methylation mark on nuclear RNA, including mRNA and lncRNAs. It catalyzes the transfer of a methyl group from SAM to the N6 of adenosine in RNA, producing m6A. Mutations in this complex have been associated with several diseases, such as type 2 diabetes and several types of cancer. The primary focus of this study was to understand the post-catalytic state of METTL3-14 bound to a structural mimic of a reaction product known as N6-methyladenosine monophosphate (m6A) using X-ray crystallography. The authors show that the m6A occupies a novel pocket at the interface of the METTL3-14 complex and identify that residues interacting with m6A are mutated in several cancers. Furthermore, the authors demonstrate that the mutations lead to a significant loss in catalytic activity, alter RNA binding, and hinder the proper positioning of the substrate adenine in the active site. Lastly, the authors perform supervised molecular dynamics simulations to understand the effect of the mutations on the interaction network with m6A. The evidence for this study is good, with the combination of X-ray crystallography, functional assays, and molecular dynamics justifying their overall conclusions. This structure is significant as it provides new insights into the structural determinants of known cancer-associated mutations of this important class of enzymes. However, some issues need to be addressed.

      Strengths:

      (1) The X-ray structure is well determined, and the density map has the quality to observe all the interactions of the METTL3-14 complex with m6A.

      (2) The structure reveals a novel 'cryptic pocket' in the complex that is 16 Å away from the SAM binding site. It is a functional m6A-sensor, illustrating a mechanism where the complex switches its functionality from an m6A writer to a reader.

      (3) The structure illustrates that the residues forming the cryptic pocket are affected by multiple cancer-associated mutations and are well conserved across several organisms.

      (4) The functional assays (methyl transferase, RNA binding, kinetic, and SPR assays) provide a complete picture of the effect of the mutations on the activity of the METTL3-14 complex.

      (5) Molecular dynamics simulations were done to understand the impact of the mutations on the pocket structure and its dynamics and support the X-ray structure findings.

      Weaknesses:

      (1) Although the X-ray structure is well determined, the statistics are a bit troubling, particularly the Ramachandran, sidechain, and RSRZ outliers, which are well above the average for structures at that resolution. Alternative software such as ISOLDE may help to improve those parameters.

      (2) The authors should expand their discussion as to why the affinity for the product is higher than the substrate and the implications on the mechanism.

      (3) The SPR profiles of the association kinetics look to have several minor association-dissociation events occurring. Multiple binding sites? Authors should provide an explanation for such behavior. Also, what is the structural explanation of the difference in binding modes between the wt vs. mutant (one vs. two-state binding modes)?

      (4) The Materials and Methods state that the data in Figure 2a were fitted to a Michaelis-Menten equation; however, the Y axis shows normalized methylation and not initial rates. The authors should elaborate on their approach. In addition, more than three initial velocity points per protein are needed to fit a Michaelis-Menten curve confidently. Finally, where can the Michaelis-Menten parameters be found?
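
      For reference, the textbook form of the Michaelis-Menten relation that such a fit would normally assume is sketched below. This is the standard equation, not necessarily the exact expression the authors used, and the symbols (initial rate $v_0$, maximal rate $V_{\max}$, Michaelis constant $K_M$, substrate concentration $[S]$) are the conventional ones rather than the manuscript's notation.

      ```latex
      % Standard Michaelis-Menten relation (conventional symbols, assumed here)
      \[
        v_0 = \frac{V_{\max}\,[S]}{K_M + [S]}
      \]
      ```

      Because the model relates initial rates to substrate concentration through two free parameters ($V_{\max}$ and $K_M$), a Y axis of normalized methylation and only three points per protein leave the fit poorly constrained, which is the concern raised in point (4).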

    1. Reviewer #2 (Public review):

      Summary:

      Kargulyan et al. investigate the function of the transsynaptic adhesion molecule RTN4RL2 in the formation and function of ribbon synapses between type I spiral ganglion neurons (SGNs) and inner hair cells. For this purpose, they study constitutive RTN4RL2 knock-out mice. Using immunohistochemistry, they reveal defects in the recruitment of protein to ribbon synapses in the knockouts. Serial block-face EM reveals defects in SGN projections in mutants. Electrophysiological recordings suggest a small but statistically significant depolarized shift in the activation of Cav1.3 Ca2+ channels. Auditory thresholds are also elevated in the mutant mice. The authors conclude that RTN4RL2 contributes to the formation and function of auditory afferent synapses to regulate auditory function.

      Strengths:

      The authors have excellent tools to analyze ribbon synapses.

      Weaknesses:

      However, there are several concerns that substantially reduce my enthusiasm for the study.

      (1) The analysis of the expression pattern of RTN4RL2 in Figure 1 is incomplete. The authors should show a developmental time course of expression up into maturity to correlate gene expression with major developmental milestones such as axon outgrowth, innervation, and refinement. This would allow the development of models supporting roles in axon outgrowth versus innervation or both.

      (2) It would be important to improve the RNAscope data. Controls should be provided for Figure 1B to show that no signal is observed in hair cells from knockouts. The authors apparently already have the sections because they analyzed gene expression in SGNs of the knock-outs (Figure 1C).

      (3) It is unclear from the immunolocalization data in Figure 1D if all type I SGNs express RTN4RL2. Quantification would be important to properly document the presence of RTN4RL2 in all or a subset of type I SGNs. If only a subset of SGNs express RTN4RL2, it could significantly affect the interpretation of the data. For example, SGNs selectively projecting to the pillar or modiolar side of hair cells could be affected. These synapses significantly differ in their properties.

      (4) It is important to show proper controls for the RTN4RL2 immunolocalization data to show that no staining is observed in knockouts.

      (5) The authors state in the discussion that no staining for RTN4RL2 was observed at synaptic sites. This is surprising. Did the authors stain multiple ages? Was there perhaps transient expression during development? Or in axons indicative of a role in outgrowth, not synapse formation?

      (6) In Figure 2 it seems that images in mutants are brighter compared to wildtypes. Are exposure times equivalent? Is this a consistent result?

      (7) The number of synaptic ribbons for wildtype in Figure 2 is at 10/IHCs, and in Figure 2 Supplementary Figure 2 at 20/IHCs (20 is more like what is normally reported in the literature). The value for mutant similarly drastically varies between the two figures. This is a significant concern, especially because most differences that are reported in synaptic parameters between wild-type and mutants are far below a 2-fold difference.

      (8) The authors report differences in ribbon volume between wild-type and mutant. Was there a difference between the modiolar/pillar region of hair cells? It is known that synaptic size varies across the modiolar-pillar axis. Maybe smaller synapses are preferentially lost?

      (9) The authors show in Figure 2 - Supplement 3 that GluA2/3 staining is absent in the mutants. Are GluA4 receptors upregulated? Otherwise, synaptic transmission should be abolished, which would be a dramatic phenotype. Antibodies are available to analyze GluA4 expression, the experiment is thus feasible. Did the authors carry out recordings from SGNs?

      (10) The authors use SBEM to analyze SGN projections and synapses. The data suggest that a significant number of SGNs are not connected to IHCs. A reconstruction in Figure 3 shows hair cells and axons. It is not clear how the outline of hair cells was derived, but this should be indicated. Also, is this a defect in the formation of synapses and subsequent retraction of SGN projections? Or could RTN4RL2 mutants have a defect in axonal outgrowth and guidance that secondarily affects synapses? To address this question, it would be useful to sparsely label SGNs in mutants, for example with AAV vectors expressing GFP, and to trace the axons during development. This would allow us to distinguish between models of RTN4RL2 function. As it stands, it is not clear that RTN4RL2 acts directly at synapses.

      (11) The authors observe a tiny shift in the operation range of Ca2+ channels that has no effect on synaptic vesicle exocytosis. It seems very unlikely that this difference can explain the auditory phenotype of the mutant mice.

      (12) ABR recordings were conducted in whole-body knockouts. Effects on auditory thresholds could be a secondary consequence of perturbation along the auditory pathway. Conditional knockouts or precisely designed rescue experiments would go a long way to support the authors' hypothesis. I realize that this is a big ask and floxed mice might not be available to conduct the study.

    2. Reviewer #3 (Public review):

      In this study, the authors used RNAscope and immunostaining to confirm the expression of RTN4RL2 RNA and protein in hair cells and spiral ganglia. Through RTN4RL2 gene knockout mice, they demonstrated that the absence of RTN4RL2 leads to an increase in the size of presynaptic ribbons and a depolarized shift in the activation of calcium channels in inner hair cells. Additionally, they observed a reduction in GluA2/3 AMPA receptors in postsynaptic neurons and identified additional "orphan PSDs" not paired with presynaptic ribbons. These synaptic alterations ultimately resulted in an increased hearing threshold in mice, confirming that the RTN4RL2 gene is essential for normal hearing. These data are intriguing as they suggest that RTN4RL2 contributes to the proper formation and function of auditory afferent synapses and is critical for normal hearing. However, a thorough understanding of the known or postulated roles of RTN4RL2 is lacking.

      While the conclusions of this paper are generally well supported by the data, several aspects of the data analysis warrant further clarification and expansion.

      (1) A quantitative assessment is necessary in Figure 1 when discussing RNA and protein expression. It would be beneficial to show that expression levels are quantitatively reduced in KO mice compared to wild-type mice. This suggestion also applies to Figure 2-supplement 3.D, which examines expression levels.

      (2) In Figure 2, the authors present a morphological analysis of synapses and discuss the presence of "orphan PSDs." I agree that Homer1 not juxtaposed with Ctbp2 is increased in KO mice compared to the control group. However, in quantifying this, they opted to measure the number of Homer1 juxtaposed with Ctbp2 rather than directly quantifying the number of Homer1 not juxtaposed with Ctbp2. Quantifying the number of Homer1 not juxtaposed with Ctbp2 would more clearly represent "orphan PSDs" and provide stronger support for the discussion surrounding their presence.

      (3) In Figure 2, Supplementary 3, the authors discuss GluA2/3 puncta reduction and note that Gria2 RNA expression remains unchanged. However, there is an issue with the lack of quantification for Gria2 RNA expression. Additionally, it is noted that RNA expression was measured at P4. While the timing for GluA2/3 puncta assessment is not specified, if it was assessed at 3 weeks old as in Figure 2's synaptic puncta analysis, it would be inappropriate to link Gria2 RNA expression with GluA2/3 protein expression at P4. If RNA and protein expression were assessed at P4, please indicate this timing for clarity.

      (4) In Figure 3, the authors indicate that RTN4RL2 deficiency reduces the number of type 1 SGNs connected to ribbons. Given that the number of ribbons remains unchanged (Figure 2), it is important to clearly explain the implications of this finding. It is already known that each type I SGN forms a single synaptic contact with a single IHC. The fact that the number of ribbons remains constant while additional "orphan PSDs" are present suggests that the overall number of SGNs might need to increase to account for these findings. An explanation addressing this would be helpful.

      (5) In Figure 4F and 5Cii, could you clarify how voltage sensitivity (k) was calculated? Additionally, please provide an explanation for the values presented in millivolts (mV).

      (6) In Figure 6, the authors measured the threshold of ABR at 2-4 months old. Since previous figures confirming synaptic morphology and function were all conducted on 3-week-old mice, it would be better to measure ABR at 3 weeks of age if possible.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This is by far the phylogenetic analysis with the most comprehensive coverage for the Nemacheilidae family in Cobitoidea. It is a much-lauded effort. The conclusions derived using phylogenetic tools coincide with geological events, though not without difficulties (Africa pathway).

      Strengths:

      Comprehensive use of genetic tools

      Weaknesses:

      Lack of more fossil records

      Thank you for appreciating the comprehensiveness of our study.

      We agree that additional nemacheilid fossils would have provided valuable support for reconstructing the evolutionary history of the family. However, the nemacheilid fossil used in our study is currently the only fossil species of the family, which precludes the possibility of including more. To address this limitation, we incorporated fossils from closely related fish families, as well as a geological event, to calibrate the time tree. We have added further details on this point in the “Divergence time estimations and ancestral range reconstruction” section of the Methods. The reconstruction of the pathway by which loaches reached northeast Africa is further complicated by the extensive aridification of the Arabian Peninsula and the Nile valley, which has left no fossil or extant species of Nemacheilidae to provide insights into the distribution of the family during the late Miocene.

      Reviewer #2 (Public review):

      Summary:

      The authors present the results of molecular phylogenetic analysis with very comprehensive samplings including 471 specimens belonging to 250 species, trying to give a holistic reconstruction of the evolutionary history of freshwater fishes (Nemacheilidae) across Eurasia since the early Eocene. This is of great interest to general readers.

      Strengths:

      They provide a very large dataset and conduct comprehensive analyses. They suggest that Nemacheilidae contain 6 major clades, and that the earliest differentiation can be dated to the early Eocene.

      Weaknesses:

      The analysis is incomplete, and the manuscript discussion is not well organized. The authors did not discuss the systematic problems that widely exist. They also did not use the conventional way to discuss the evolutionary process of branches or clades, but just chronologically described the overall history.

      In the revised version, we address the systematic issues within Nemacheilidae in a new paragraph. The polyphyly of the genus Schistura and the polyphyly or paraphyly of many other nemacheilid genera are well-known challenges in ichthyology. However, the large size of the family Nemacheilidae and the absence of a clear basal classification system have made systematic work difficult.

      The chronological order of the description follows the sequence in which the events occurred over time and corresponds with Figure 8. Additionally, a clade-by-clade description would make it challenging to capture the periods before all clades were formed. As a compromise, the revised version includes a new table where each clade is represented by a column, allowing readers to trace the history of each clade in a clear overview. With this table, we provide both the chronological and clade-by-clade perspectives to enhance reader understanding.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      I have no major comments, except for Figure 8, where the colour code for Sunda is not consistent, appearing as light purple and then dark purple. I was trying to locate the colour legend, maybe include this for all figures or refer to it.

      Figure 8 has been revised to improve matching of the colours.

      Reviewer #2 (Recommendations for the authors):

      (1) It would be better to discuss the evolutionary history of the major inner groups. For example, why did Branches A and B differentiate? How did the 6 major clades differentiate?

      As mentioned above, the new table provides an overview of the evolutionary history of the major clades and, where known, the mechanism that led to their differentiation. For branches A and B, the underlying causes of differentiation remain unknown. Currently, the extensive morphological variability within each clade prevents a definitive morphological diagnosis, but such a study is planned for the future.

      (2) In this study, some phylogenetic or systematic problems remain unresolved. For example, the genus Schistura remains polyphyletic even in different major clades. The situation is similar for the genus Triplophysa, though not so serious. These need to be discussed or at least partially solved before discussing the evolutionary history.

      We discuss these topics now in a new paragraph ‘Taxonomic implications’.

      (3) In Table S1, what is the meaning of "-". Does this mean no data available? If so, how do the authors treat this in their phylogenetic analysis?

      Indeed, in Table S1, a ‘-’ indicates that no sequence was available for the given species and gene. In the phylogenetic analyses, these cases were treated as missing data.

      (4) What is the source of Figure 8? There are different opinions on the geological events. The authors need to indicate the source of their information.

      The sources of Fig. 8 are now provided in the figure caption.

      (5) The Eastern Clade forms continuous distribution in Figure 6, but discontinuous in Figure 8. Is this correct?

      Figure 6 does not display the distribution areas for the clades, but illustrates the biogeographic regions used in the biogeographic analysis.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript by Anbarcia et al. re-evaluates the function of the enigmatic Rete Ovarii (RO), a structure that forms in close association with the mammalian ovary. The RO has generally been considered a functionless structure in the adult ovary. This manuscript follows up on a previous study from the lab that analyzed ovarian morphogenesis using high-resolution microscopy (McKey et al., 2022). The present study adds finer details to RO development and possible function by 1) identifying new markers for RO sub-regions (e.g. GFR1a labels the connecting rete), suggesting that the sub-regions are functionally distinct, 2) showing that the RO sub-regions are connected by a luminal system that allows transport of material from the extra-ovarian rete (EOR) to the inter-ovarian rete (IOR), 3) identifying proteins that are secreted into the RO lumen and that may regulate ovarian homeostasis, and finally, 4) better defining how the vasculature, nervous, and immune systems integrate with the RO.

      Strengths:

      The data are beautifully presented and convincing. They show that the RO is composed of three distinct domains that have unique gene expression signatures and thus likely are functionally distinct.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Weaknesses: 

      It is not always clear what the novel findings are that this manuscript is presenting. It appears to be largely similar to the analysis done by McKey et al. (2022) but with more time points and molecular markers. The novelty of the present study's findings needs to be better articulated. 

      The previous study focused on placing the Rete Ovarii in the context of ovarian development. The current study focuses on the novel finding that the EOR is an active structure that sends fluid/information to the ovary. We show this by characterizing the presence of secretory proteins in the RO epithelial cells, by dye injections into the EOR and observing transport of the dye to the ovary, and by collection of EOR fluid followed by proteomic analysis. We also show that the RO is embedded in an elaborate vascular network and contacted by neurons. None of these data were discussed in the McKey 2022 paper.

      Reviewer #2 (Public Review):

      Clarifications: 

      (1) Is there any comparative data on the proteomics of RO and rete testis in early development? With some molecular markers also derived from rete testis, it would be better to provide the data or references.

      To the best of our knowledge, there are no available proteomic datasets of the embryonic or early postnatal mouse Rete Testis or Epididymis. The authors agree that having this information would be very useful. 

      (2) Although the size of RO and its components is quite small and difficult to operate, the researchers in this article had already been able to perform intracavitary injection of EOR and extract EOR or CR for mass spectrometry analysis. Therefore, can EOR, CR, or IOR be damaged or removed, providing further strong evidence of ovarian development function?

      We attempted to genetically ablate the RO by expressing the diphtheria toxin receptor (DTR) in RO cells and adding DT. This approach was not successful in ablating the RO. We also tried to use Pax2/8 homo- and heterozygous mutants for ablation (as used in the McKey 2022 paper), but so far, we cannot find a genetic combination that ablates the RO, but not the oviduct, uterus and/or kidneys. We have also embarked on a study to surgically remove the RO. This assay is taking some time to optimize. The goal of the current study was to characterize the cells along the length of the RO and to present evidence that it is a secretory appendage of the ovary.

      (3) Although IOR is shown on the schematic diagram, it cannot be observed in the immunohistochemistry pictures in Figure 1 and Figure 3. The authors should provide a detailed explanation.

      An annotation has been added to Figure 1 to indicate the IOR. As the images within the panels are of maximum intensity projections, it is often difficult to clearly see the IOR as it is deeper within the ovary. In Figure 3, the view of the ovary is from the ventral side:  this view does not allow for clear visualization of the IOR.

      Reviewer #3 (Public Review):

      Weaknesses: 

      There is a lack of conclusive data supporting many conclusions in the manuscript. Therefore, the paper's overall conclusions should be moderated until functional validations are conducted.

      We have moderated the conclusions where appropriate.

      Reviewer #1 (Recommendations For The Authors):

      (1) The introduction is relatively brief and does not mention some historical data/hypotheses on the role of the RO in ovarian function (e.g. regulation of meiotic entry) or development (e.g. Mayère et al., 2022).

      Mayère 2022 was cited in line 57. Stein's hypothesis about entry into meiosis has been added at line 58.

      (2) L82-84: It is stated that KRT8 was first identified as a potential RO marker by sc/snRNAseq (Anbarci et al., 2023) and then validated in this manuscript. However, KRT8 was used by McKey et al. (2022) as a RO marker, and they noted there that KRT8 was enriched in the EOR. It is not clear why McKey et al. is not cited as the primary reference validating KRT8 as an EOR marker.

      The embryonic and neonatal timecourse of KRT8 expression is first described in this paper. McKey 2022 only highlights KRT8 at E18.5. A reference has been added to address this at line 85.

      (3) Figure 1: Can the IOR be seen in these images? If so, please label. 

      The label has been added.

      (4) L107: It is hypothesized that "the RO may respond to or interpret homeostatic cues." Can transcriptomics data shed light on what signals the RO may be capable of responding to? E.g. what receptors are expressed by cells of the RO (e.g. ER, LHCGR, FSHR)?

      The RO expresses ESR1, PGR, INSR, and IGF1R. The IOR exclusively expresses LHCGR and FSHR. This has been added to the manuscript at line 309.

      (5) L152: Mass spec was used to identify proteins secreted into the lumen of the RO. These proteins were then compared to the mammalian secretome to filter out possible nonsecreted protein contaminants. Finally, the candidates were compared to the RO scRNAseq data from Anbarci et al., (2023). This method gives a very conservative candidate list. However, it may also be informative to compare the sc/snRNA-seq gene list directly to the secretome to ID other possible candidate-secreted proteins that may not have been detected in the mass spec data set. 

      There are quite a number of secreted proteins that are also not actively secreted. This is a good suggestion for future analysis. For the current study we wanted to take a more conservative approach, and chose to do proteomics to determine proteins that are actively secreted. 
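
      The two filtering strategies discussed in point (5) can be sketched as set operations. This is a minimal illustration under stated assumptions, not the authors' pipeline: the identifiers `P1` to `P7` are hypothetical placeholders, and the three input sets stand in for the mass spec hits, the mammalian secretome annotation, and the genes detected in the sc/snRNA-seq data.

```python
# Hypothetical placeholder identifiers; none of these are real results.
mass_spec = {"P1", "P2", "P3", "P4"}   # proteins detected in RO luminal fluid
secretome = {"P2", "P3", "P5", "P6"}   # proteins annotated as secreted
rna_seq = {"P3", "P4", "P5", "P7"}     # genes expressed in RO cells


def conservative_candidates(ms, sec, rna):
    """Proteins supported by all three lines of evidence."""
    return ms & sec & rna


def additional_candidates(ms, sec, rna):
    """Expressed, annotated as secreted, but not detected by mass spec."""
    return (rna & sec) - ms


print(sorted(conservative_candidates(mass_spec, secretome, rna_seq)))
print(sorted(additional_candidates(mass_spec, secretome, rna_seq)))
```

      The first function mirrors the conservative list described by the reviewer. The second mirrors the reviewer's suggestion of comparing the expression data directly to the secretome, which would surface candidates that proteomics alone could miss.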

      (6) L195: It is not clear if IGFBP2 is expressed by both RO and granulosa cells or only granulosa cells. It would be informative to know what ovarian cell types express both IGFBP2 and IGF1R (e.g. from sc/snRNA-seq)? This information is referenced in the discussion (L285-287) but would be better to reference it in the results section for clarity.

      Both RO and granulosa cells express IGFBP2 and IGF1R. A sentence has been added to results for clarity. (Line 197)

      (7) L295: "...the RO participates in endocrine signaling..." might be more accurate to say "...the RO responds to endocrine signaling...".

      The authors agreed that this statement is more accurate and the changes have been made. 

      Reviewer #3 (Recommendations For The Authors): 

      Several issues significantly affect the paper's quality in the current version. Firstly, there is a lack of conclusive data supporting many conclusions in the manuscript. For instance, the assertion in line 105 that "EOR was directly innervated by neurons" lacks substantial evidence beyond basic immunofluorescent staining. 

      We agree that the term “innervated” might be a step too far since we rely on IF evidence.  We changed the wording of this sentence to say, “The EOR was directly contacted by neurons”.

      In another pivotal experiment illustrated in Figure 3, the provided images lack temporal continuity and quantitative analysis, suggesting the incorporation of time-lapse imaging for improved sequential presentation in Figure 3.

      The microscope where we can perform injections cannot record movies.  We have tried moving the rete to another microscope after injection, but so far, we have been unable to capture dextran moving through the RO. We therefore believe that transport is rapid, but future experiments will be needed to optimize this imaging.

      Moreover, relying solely on proteomics analysis, as seen in lines 188-189, makes it challenging to assert conclusions such as "EOR actively secretes proteins." Therefore, the paper's overall conclusions should be moderated until functional validations are conducted. 

      The findings that (1) the cells of the EOR express SNARE complex proteins at their apical surfaces and (2) luminal fluid expelled from the EOR contains abundant secreted proteins strongly suggest that the RO is involved in active secretion. We use the word “suggest” in this sentence, lines 188-189 as we realize that further experiments should be done to validate this conclusion.

      Furthermore, the predominant methods in this study involve immunostaining and imaging. However, the current images exhibit a notable inconsistency in color definitions for different markers by the authors. For instance, in Figure 2.A/C, PAX8 is portrayed as cyan, while in D, it is represented in yellow. Similarly, in Figure 4, E-CAD is depicted using both cyan and yellow. Utilizing different colors for the same protein within a figure can significantly confuse readers' interpretation of the experiments. Rectifying these inconsistencies is essential to enhance the clarity and comprehension of the experimental results.

      These colors were chosen to be visible to those with color vision impairments. We typically used cyan and magenta to emphasize the most important markers in the image. E-Cad and KRT8 were often used either to emphasize a structure or to landmark it by the localization of these proteins. When KRT8 and E-Cad were highlighted, they were represented in cyan and magenta for visibility. When these proteins were used as a landmark to orient the reader and not as the main point, they were labeled in yellow.

      At last, many markers in this study are derived from bulk and single-cell sequencing of developing RO. However, it seems that these important data were separated into another paper as a preprint. If this data were incorporated into the current manuscript, the manuscript would become more comprehensive for guiding future research on the RO.

      Since we have single cell and single nuclei data from fetal and adult estrus and metestrus stages, we found that incorporating all this data into the present manuscript was overwhelming. Instead, we devoted another manuscript to presenting and validating that data. We believe a quick look at the sequencing manuscript will make this clear.

    1. expressed as (adapted

      Are there any reasons to drop the term \(\frac{n_T + n_C - 2}{n_T + n_C - 4}\) from Eq 24, Morris and to replace the last term \(\frac{d^2}{c^2}\) by \(d^2\)?

    1. Author response:

      We appreciate the reviewers’ thoughtful and constructive feedback, which has provided valuable insights to refine our manuscript. Below, we outline the planned revisions in response to the public reviews.

      Response to Reviewer #1

      We are grateful for the reviewer’s recognition of our methodological approach and the potential significance of CD47 as a novel MSC marker for cartilage repair. To address the concerns raised:

      (1) Clarifying the proteomics data supporting CD47 as an MSC marker

      · The manuscript will be revised to clearly indicate where the proteomics data demonstrate elevated CD47 expression in MSCs compared to non-MSCs.

      · Additional figure annotations or a supplemental figure may be included to enhance clarity.

      (2) Providing further details on CD47hi and CD47lo MSC populations

      · Information on the number of isolated CD47hi and CD47lo cells, along with any necessary expansion steps before in vivo use, will be explicitly detailed.

      (3) Expanding the characterization of CD47hi MSCs in vitro

      · A more comprehensive analysis of the chondrogenic differentiation capacity of CD47hi MSCs will be incorporated to strengthen the findings.

      (4) Clarifying experimental details of the in vivo rat OA model

      · The methodology section will be updated to specify the number of injected cells and their labeling strategies.

      · Representative histological images will be added to support the results.

      · To further substantiate the cartilage repair potential of CD47hi MSCs, additional staining for Collagen Type II will be included alongside Sox9 expression.

      Response to Reviewer #2

      We appreciate the reviewer’s enthusiasm for the study and recognition of its rigor and translational significance. The following revisions are planned to address the feedback:

      (1) Addressing additional assessments for OA phenotype in the rat model

      · While this study primarily relied on histology, the limitations of this approach will be acknowledged in the discussion.

      · The absence of microCT and behavioral assessments will be explained, with suggestions for incorporating these methods in future studies.

      (2) Justifying the focus on CD47

      · The rationale behind prioritizing CD47 over other proteomics-identified markers will be expanded to provide better context for this choice.

      (3) Clarifying MSC engraftment patterns

      · The manuscript will include a discussion on whether CD47hi MSCs specifically engraft in articular cartilage or contribute to ectopic cartilage formation (e.g., osteophytes).

      (4) Contextualizing findings within recent research on synovial progenitors

      · Additional discussion will highlight recent studies on DPP4+ PI16+ CD34+ stromal cells and how the identified MSC populations may relate to these universal fibroblasts.

      We are confident that these revisions will strengthen the manuscript and enhance its clarity and impact. The reviewers’ insights have been invaluable, and we look forward to refining the study accordingly.

    1. Intro pg 2 - 27% of direct care workers in the U.S. are immigrants vs. 17% of total U.S. labor force (and increasing)

      4 areas of federal policy intervention: 1. Providing work authorizations and pathways to citizenship that allow more immigrants to work in direct care and other essential sectors 2. Supporting the creation, testing, and replication of workforce innovations that recruit, retain, train, and support immigrant direct care workers 3. Strengthening immigrant-specific supports for these workers 4. Improving data collection and research to fully capture the realities of immigrant direct care workers.

    Annotators

    1. a musician was paid $1.00 for riding on a truck and playing for five hours, $2.50 if he played a ball, from 8 p.m. until 4 a.m., with one hour intermission. House parties paid $1.50 to $2.00 and you played about five hours—8 p.m. to 1 a.m., or 9 p.m. to 2 a.m.

      I wonder what these amounts would be worth today after inflation.

    2. in New Orleans during the early jazz years. He cites figures showing that “in 1870, among the city’s 40,000 blacks, only 7 listed their major occupation as musician in the census survey. This compared to 177 black policemen, 397 cigarmakers, and 249 shoemakers. Ten years later the number of black musicians rose to only 53.”

      This is a very important time in history, as it really shows how black people were overlooked in all aspects of life, including in musical history: in the music realm, few black people took up musical aspirations. That is wild to me, because today there are many musical legends and icons who are black, who make amazing music, and who inspire us. It really shows how far black culture and its combination with music have come. Now many black people aspire to be musical artists.

    3. Treat It Gentle stands as one of the true gems of autobiography—jazz or otherwise—it is a beautiful story, beautifully told. If, as John Chilton has pointed out, Bechet’s tale does not always correlate with what we might regard as historical fact, the work does tell us what Bechet would have us believe to be true, and so serves as an invaluable account of his ideals and aspirations.

      This paragraph introduces an interesting approach to "primary and secondary sources" and whether they can be regarded as "historical facts." Even an autobiography that most would consider a primary source can't always be labeled as "historical facts." But even if sources can't be labeled as "historical facts," they can provide great insight into the topic at hand.

    4. I contend that Creoles are a unique race of people . . . [with] specific traits and traditions that have been transmitted from generation to generation by Creole speaking people, a unique nation of mixed bloods.

      The usage of this quote further amplifies that jazz isn't just a black tradition, but a Creole tradition. Diving deeper into the roots of jazz gives more meaning not only to the music, but to the people as well. It gives them a sense of pride.

    5. This derision only intensified as Creoles saw the demand for their more polite brand of musicking shrink in favor of “hotter,” “dirtier” styles. Their forced musical interaction must have struck many Downtown players as particularly galling because most of them did not even consider Uptown players to be “musicians” at all.

      Even after all this time, there still seems to be segregation, but in this case, it is between the Creoles and the Uptown players. The Creoles viewed the Uptown folks as inferior because they couldn’t read notation, didn’t know the “classics,” and didn’t seek a “refined” tone. This indicates that, even though they were of the same race, Creole musicians viewed themselves as superior due to their education and the demand for their polite style of jazz, unlike the Uptown players’ "dirtier" style. This divide between Creoles and Uptown musicians reflects how societal hierarchies and internalized perceptions of class and culture influenced the birth of jazz.

    6. This derision only intensified as Creoles saw the demand for their more polite brand of musicking shrink in favor of “hotter,” “dirtier” styles. Their forced musical interaction must have struck many Downtown players as particularly galling because most of them did not even consider Uptown players to be “musicians” at all.

      After reviewing this passage, it made me think of how every few decades, history and things always repeat themselves in some aspect. In this more modern age we've found that newer artists are also looked down upon by older artists for the same reasons as the new "uptown jazz artists" of their time. While also thinking about this, I've seen more connection to the modern world. In my opinion, how can you be an artist if you're not willing to see what can evolve from the past masters?

    7. The Creoles, in much the same manner as the house Negroes on plantations in other areas, adopted as much of the French culture as they could and turned their backs on the “darker” culture of their half-brothers. It is safe to assume, for instance, that there were no black Creoles dancing in Congo Square.”

      Did the Creoles think they stood at a higher position than their “half brother”?

    8. Moreover, given the present tendencies to anoint jazz as “America’s classical music” and its practitioners as “treasured artists,” it might be useful to recall that these lofty understandings developed only recently (and not just in the mainstream white community).

      I find it interesting how, when jazz first came about, there was a negative connotation behind it and it was not respected like other genres of music. I'm curious to know what shifted the level of respect, for it to now be considered “America's classical music” and for its artists to be “treasured”.

    9. Anthony’s work reveals the serious differences separating the two predominant African-diasporic communities in New Orleans and, as we will see, challenges notions of jazz as the product of a unified, Southern black “folk” culture.

      What are the serious differences separating the two predominant African-diasporic communities?

    1. Maybe it's because I have posted here before, reddit keeps recommending this forum to me when I log in, and I'm immensely frustrated by the posts asking questions about "the Zettlekasten method" and the responses. Why? Because folks are talking about different things all the time. It's like chickens talking to ducks. From my observation, people define "the Zettlekasten method" at least in two ways: (1) A paper or digital index card note system organized by folders, tags, links, tables of contents. (I don't think it's fair to give it a German name as its use can at least be dated in various cultures since the middle ages. Maybe the book authors and influencers want to lure people to think, fancy name=magic bullet?) (2) A note system "based on the principles and practices of Niklas Luhmann's zettelkasten method," as the sidebar of this forum describes. These are different concepts! (2) is a special case of (1). Anything you agree or disagree is meaningless if one of you is talking about (1) and the other is talking about (2). So what is this forum about, (1) or (2)? When you say you are attracted by "the Zettlekasten method," do you mean (1) or (2)? I don't think many people disagree with you if you mean Definition (1). Why you talk about "my zettelkasten," if you maintain a generic index card system, you are not doing Zettlekasten in the Luhmann sense. At least, when you post, whether OP or as response, please specify which definition you are using, 1, 2, or 3, 4.

      reply to u/Active-Teach6311 at https://old.reddit.com/r/Zettelkasten/comments/1ilvvnc/you_need_to_first_define_the_zettlekasten_methoda/

      #1 == #2 In German contexts, zettelkasten subsumed both ideas which can easily be seen in the 2013 Marbach Exhibition: Zettelkasten: Machines of Fantasy. That exhibition featured six different Zettelkasten of which Luhmann's was but one. It wasn't until after this that sites like zettelkasten.de, this Reddit sub, or the popularity of Ahrens' book shifted the definition to a Luhmann-centric one, particularly in English language contexts which lacked a marketing term on which to latch to sell the idea. The productivity porn portion of the equation assisted in erasing the prior art and popularity of these methods.

      One can easily show mathematically that there is a one-to-one and onto mapping of Luhmann's method with all the other variations. This means that they're equivalent in structure and only differ in the names you give them.

      Even Ahrens suggests as much in his own book when he mentions that in digital contexts one doesn't need numbered cards in particular orders for the system to work. If Erasmus, Agricola, or Melanchthon were to magically arrive from the 15th century to the present day, they would have no difficulty recognizing their commonplacing work at play in a so-called Luhmann-artig zettelkasten.

      I would suggest that Luhmann didn't write more about his method himself because it would have been generally fruitless for him as everyone around him was doing exactly the same thing. The method was both literally and figuratively commonplace! J. E. Heyde's book, from which Luhmann modeled his own system, went through 10 editions from the 1930s through the 1970s in Luhmann's own lifetime.

    1. Reviewer #1 (Public review):

      Summary

      The authors describe a method for gastruloid formation using mouse embryonic stem cells (mESCs) to study YS and AGM-like hematopoietic differentiation. They characterise the gastruloids during nine days of differentiation using a number of techniques including flow cytometry and single-cell RNA sequencing. They compare their findings to a published data set derived from E10-11.5 mouse AGM. At d9, gastruloids were transplanted under the adrenal gland capsule of immunocompromised mice to look for the development of cells capable of engrafting the mouse bone marrow. The authors then applied the gastruloid protocol to study overexpression of Mnx1 which causes infant AML in humans.

      In the introduction, the authors define their interpretation of the different waves of hematopoiesis that occur during development. 'The subsequent wave, known as definitive, produces: first, oligopotent erythro-myeloid progenitors (EMPs) in the YS (E8-E8.5); and later myelo-lymphoid progenitors (MLPs - E9.5-E10), multipotent progenitors (MPPs - E10-E11.5), and hematopoietic stem cells (HSCs - E10.5-E11.5), in the aorta-gonad-mesonephros (AGM) region of the embryo proper.' Herein they designate the yolk sac-derived wave of EMP hematopoiesis as definitive, according to convention, although paradoxically it does not develop from intra-embryonic mesoderm or give rise to HSCs.

      General comments

      The authors make the following claims in the paper:

      (1) The development of a protocol for hemogenic gastruloids (hGx) that recapitulates YS and AGM-like waves of blood from HE.

      (2) The protocol recapitulates both YS and EMP-MPP embryonic blood development 'with spatial and temporal accuracy'.

      (3) The protocol generates HSC precursors capable of short-term engraftment in an adrenal niche.

      (4) Overexpression of MNX1 in hGx transforms YS EMP to 'recapitulate patient transcriptional signatures'.

      (5) hGx is a model to study normal and leukaemic embryonic hematopoiesis.

      There are major concerns with the manuscript. The statements and claims made by the authors are not supported by the data presented, data is overinterpreted, and the conclusions cannot be justified. Furthermore, the data is presented in a way that makes it difficult for the reader to follow the narrative, causing confusion. The authors have not discussed how their hGx compares to the previously published mouse embryoid body protocols used to model early development and hematopoiesis.

      Specific points

      (1) It is claimed that HGxs capture cellularity and topography of developmental blood formation. The hGx protocol described in the manuscript is a modification of a previously published gastruloid protocol (Rossi et al 2022). The rationale for the protocol modifications is not fully explained or justified. There is a lack of novelty in the presented protocol as the only modifications appear to be the inclusion of Activin A and an extension of the differentiation period from 7 to 9 days of culture. No direct comparison has been made between the two versions of gastruloid differentiation to justify the changes.

      The inclusion of Activin A at high concentration at the beginning of differentiation would be expected to pattern endoderm rather than mesoderm. BMP signaling is required to induce Flk1+ mesoderm, even in the presence of Wnt. FACS analysis of the hGx during differentiation is needed to demonstrate the co-expression of Flk1-GFP and lineage markers such as CD34 to indicate patterning of endothelium from Flk1+ mesoderm. The FACS plots in Figure 1 show c-Kit expression but very little VE-cadherin which suggests that CD34 is not induced. Early endoderm expresses c-Kit, CXCR4, and Epcam but not CD34 which could account for the lack of vascular structures within the hGx as shown in Figure 1E.

      (2) The protocol has been incompletely characterised, and the authors have not shown how they can distinguish between either wave of Yolk Sac (YS) hematopoiesis (primitive erythroid/macrophage and erythro-myeloid EMP) or between YS and intraembryonic Aorta-Gonad-Mesonephros (AGM) hematopoiesis. No evidence of germ layer specification has been presented to confirm gastruloid formation, organisation, and functional ability to mimic early development. Furthermore, differentiation of YS primitive and YS EMP stages of development in vitro should result in the efficient generation of CD34+ endothelial and hematopoietic cells. There is no flow cytometry analysis showing the kinetics of CD34 cell generation during differentiation. Benchmarking the hGx against developing mouse YS and embryo data sets would be an important verification.

      Single-cell RNA sequencing was used to compare hGx with mouse AGM. The authors incorrectly conclude that '...specification of endothelial and HE cells in hGx follows with time-dependent developmental progression into putative AGM-like HE...' and '...HE-projected hGx cells...expressed Gata2 but not Runx1, Myb, or Gfi1b...'. Hemogenic endothelium is defined by the expression of Runx1, and Gfi1b is downstream of Runx1.

      (3) The hGx protocol 'generates hematopoietic SC precursors capable of short-term engraftment' is not supported by the data presented. Short-term engraftment would be confirmed by flow cytometric detection of hematopoietic cells within the recipient bone marrow, spleen, thymus, and peripheral blood that expressed the BFP transgene. This analysis was not provided. PCR detection of transcripts, following an unspecified number of amplification cycles, as shown in Figure 3G (incorrectly referred to as Figure 3F in the legend) is not acceptable evidence for engraftment. Transplanted hGx formed teratoma-like structures, with hematopoietic cells present at the site of transplant only analysed histologically. Indeed, the quality of the images provided does not provide convincing validation that donor-derived hematopoietic cells were present in the grafts.

      There is no justification for the authors' conclusion that '... the data suggest that 216h hGx generate AGM-like pre-HSC capable of at least short-term multilineage engraftment upon maturation...'. Indeed, this statement is in conflict with previous studies demonstrating that pre-HSCs in the dorsal aorta of the mouse embryo are immature and actually incapable of engraftment.

      The statement '...low-level production of engrafting cells recapitulates their rarity in vivo, in agreement with the embryo-like qualities of the gastruloid system....' is incorrect. Firstly, no evidence has been provided to show the hGx has formed a dorsal aorta facsimile capable of generating cells with engrafting capacity. Secondly, although engrafting cells are rare in the AGM, approximately one per embryo, they are capable of robust and extensive engraftment upon transplantation.

      (4) Expression of MNX1 transcript and protein in hematopoietic cells in MNX1-rearranged acute myeloid leukaemia (AML) is one cause of AML in infants. In the hGx model of this disease, Mnx1 is overexpressed in the mESCs that are used to form gastruloids. Mnx1 overexpression seems to confer an overall growth advantage on the hGx and increase the serial replating capacity of the small number of hematopoietic cells that are generated. The inefficiency with which the hGx model generates hematopoietic cells makes it difficult to model this disease. The poor quality of the cytospin images prevents accurate identification of cells. The statement that the kit-expressing cells represent leukemic blast cells is not sufficiently validated to support this conclusion. What other stem cell genes are expressed? Surface kit expression also marks mast cells, frequently seen in clonogenic assays of blood cells. Flow cytometric and gene expression analyses using known markers would be required.

      (5) In human infant MNX1 AML, the mutation is thought to arise at the fetal liver stage of development. There is no evidence that this developmental stage is mimicked in the hGx model.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors develop an exciting new hemogenic gastruloid (hGX) system, which they claim reproduces the sequential generation of various blood cell types. The key advantage of this cellular system would be its potential to more accurately recapitulate the spatiotemporal emergence of hematopoietic progenitors within their physiological niche compared to other available in vitro systems. The authors present a large set of data and also validate their new system in the context of investigating infant leukemia.

      Strengths:

      The development of this new in vitro system for generating hematopoietic cells is innovative and addresses a significant drawback of current in vitro models. The authors present a substantial dataset to characterize this system, and they also validate its application in the context of investigating infant leukemia.

      Weaknesses:

      The characterization is incomplete, and it has not been fully demonstrated that the cells produced truly represent distinct waves of hematopoietic progenitors. The data presented to support the generation of late yolk sac (YS) progenitors, such as lymphoid cells, and aorta-gonad-mesonephros (AGM)-like progenitors, including pre-hematopoietic stem cells (pre-HSCs), by this system are not entirely convincing. Given that this is likely the manuscript's most crucial claim, it warrants further scrutiny and direct experimental validation. Ideally, the identity of these progenitors should be further demonstrated by directly assessing their ability to differentiate into lymphoid cells or fully functional HSCs. Instead, the authors primarily rely on scRNA-seq data and a very limited set of markers (e.g., Ikzf1 and Mllt3) to infer the identity and functionality of these cells. Many of these markers are shared among various types of blood progenitors, and only a well-defined combination of markers could offer some assurance of the lymphoid and pre-HSC nature of these cells, although this would still be limited in the absence of functional assays.

      The identification of a pre-HSC-like CD45⁺CD41⁻/lo c-Kit⁺VE-Cadherin⁺ cell population is presented as evidence supporting the generation of pre-HSCs by this system, but this claim is questionable. This FACS profile may also be present in progenitors generated in the yolk sac such as early erythro-myeloid progenitors (EMPs). It is only within the AGM context, and in conjunction with further functional assays demonstrating the ability of these cells to differentiate into HSCs and contribute to long-term repopulation, that this profile could be strongly associated with pre-HSCs. In the absence of such data, the cells exhibiting this profile in the current system cannot be conclusively identified as true pre-HSCs.

      The engraftment data presented are also not fully convincing, as the observed repopulation is very limited and evaluated only at 4 weeks post-transplantation. The cells detected after 4 weeks could represent the progeny of EMPs that have been shown to provide transient repopulation rather than true HSCs.

    3. Reviewer #3 (Public review):

      In this study, the authors employ a mouse ES-derived "hemogenic gastruloid" model which they generated and which they claim to be able to deconvolute YS and AGM stages of blood production in vitro. This work could represent a valuable resource for the field. However, in general, I find the conclusions in this manuscript poorly supported by the data presented. Importantly, it isn't clear what exactly are the "YS" and the "AGM"-like stages identified in the culture and where is the data that backs up this claim. In my opinion, the data in this manuscript lack convincing evidence that can enable us to identify what kind of hematopoietic progenitor cells are generated in this system. Therefore, the statement that "our study has positioned the MNX1-OE target cell within the YS-EMP stage (line 540)" is not supported by the evidence presented in this study. Overall, the system seems to be very preliminary and requires further optimization before those claims can be made.

      Specific comments below:

      (1) The flow cytometric analysis of gastruloids presented in Figure 1 C-D is puzzling. There is a large % of c-Kit+ cells generated, but few VE-Cad+ Kit+ double positive cells. Similarly, there are many CD41+ cells, but very few CD45+ cells, which one would expect to appear toward the end of the differentiation process if blood cells are actually generated. It would be useful to present this analysis as consecutive gating (i.e. evaluating CD41 and CD45 within VE-Cad+ Kit+ cells, especially if the authors think that the presence of VE-Cad+ Kit+ cells is suggestive of EHT). The quantification presented in D is misleading as the scale of each graph is different.

      (2) The imaging presented in Figure 1E is very unconvincing. C-Kit and CD45 signals appear as speckles and not as membrane/cell surfaces as they should. This experiment should be repeated and nuclear stain (i.e. DAPI) should be included.

      (3) Overall, I am not convinced that hematopoietic cells are consistently generated in these organoids. The authors should sort hematopoietic cells and perform May-Grunwald Giemsa stainings as they did in Figure 6 to confirm the nature of the blood cells generated.

      (4) The scRNAseq in Figure 2 is very difficult to interpret. Specific points related to this:
      - Cluster annotation in Figure 2a is missing and should be included.
      - Why do the heatmaps show the expression of genes within sorted cells? Couldn't the authors show expression within clusters of hematopoietic cells as identified transcriptionally (which ones are they? See previous point)? Gene names are illegible.
      - I see no expression of Hlf or Myb in CD45+ cells (Figure 2G). Hlf is not expressed by any of the populations examined (panels E, F, G). This suggests no MPP or pre-HSC are generated in the culture, contrary to what is stated in lines 242-245. (PMID 31076455 and 34589491). Later on, it is again stated that "hGx cells... lacked detection of HSC genes like Hlf, Gfi1, or Hoxa9" (lines 281-283). To me, this is proof of the absence of AGM-like hematopoiesis generated in those gastruloids.

      (5) Mapping of scRNA-Seq data onto the dataset by Thambyrajah et al. is not proof of the generation of AGM HE. The dataset they are mapping to only contains AGM cells, therefore cells do not have the option to map onto something that is not AGM. The authors should try mapping to other publicly available datasets also including YS cells.

      (6) Conclusions in Figure 3, named "hGx specify cells with preHSC characteristics" are not supported by the data presented here. Again, I am not convinced that hematopoietic cells can be efficiently generated in this system, and certainly not HSCs or pre-HSCs.
      - FACS analysis in 3A is again very unconvincing. I do not think the population identified as c-Kit+ CD144+ is real. Also, why not try gating the other way around, as commonly done (e.g. VE-Cad+ Kit+ and then CD41/CD45)?
      - The authors must have tried really hard, but the lack of short- or long-engraftment in a number of immunodeficient mouse models (lines 305-313) really suggests that no blood progenitors are generated in their system. I am not familiar with the adrenal gland transplant system, but it seems like a very non-physiological system for trying to assess the maturation of putative pre-HSCs. The data supporting the engraftment of these mice, essentially seen only by PCR and in some cases with a very low threshold for detection, are very weak, and again unconvincing. It is stated that "BFP engraftment of the Spl and BM by flow cytometry was very low level albeit consistently above control (Fig. S4E)" (lines 337-338). I do not think that two dots in a dot plot can be presented as evidence of engraftment.

      (7) Given the above, I find that the foundations needed for extracting meaningful data from the system when perturbed are very shaky at best. Nevertheless, the authors proceed to overexpress MNX1 by LV transduction, a system previously shown to transform fetal liver cells, mimicking the effect of the t(7;12) AML-associated translocation. Comments on this section:
      - The increase in the size of the organoid when MNX1 is expressed is a very unspecific finding and not necessarily an indication of any hematopoietic effect of MNX1 OE.
      - The mild increase of cKit+ cells (Figure 4E) at the 144hr timepoint and the lack of any changes in CD41+ or CD45+ cells suggests that the increase in Kit+ cells % is not due to any hematopoietic effect of MNX1 OE. No hematopoietic GO categories are seen in RNA seq analysis, which supports this interpretation. Could it be that just endothelial cells are being generated?

      (8) There seems to be a relatively convincing increase in replating potential upon MNX1-OE, but this experiment has been poorly characterized. What type of colonies are generated? What exactly is the "proportion of colony forming cells" in Figures 5B-D? The colony increase is accompanied by an increase in Kit+ cells; however, the flow cytometry analysis has not been quantified.

      (9) Do hGx cells engraft upon MNX1-OE? This experiment, which appears not to have been performed, is essential to conclude that leukemic transformation has occurred.
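
      The reasoning in comment (5) can be illustrated with a toy label-transfer sketch. It is not the authors' mapping method: the nearest-centroid rule, the two-dimensional coordinates, and the labels `AGM_HE`, `AGM_IAHC`, and `YS_EMP` are all hypothetical. It only shows that a query cell is always assigned some label from the reference, so a reference containing only AGM populations can return nothing but AGM labels.

```python
import math


def nearest_label(cell, reference):
    """Return (label, distance) for the reference centroid closest to `cell`."""
    return min(
        ((label, math.dist(cell, centroid)) for label, centroid in reference.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical centroids in an arbitrary 2-D embedding (illustrative values only).
agm_only = {"AGM_HE": (1.0, 1.0), "AGM_IAHC": (2.0, 1.0)}
agm_and_ys = dict(agm_only, YS_EMP=(8.0, 8.0))

# A hypothetical query cell that lies far from both AGM centroids.
query = (7.5, 8.2)

print(nearest_label(query, agm_only))    # forced onto an AGM label, at a large distance
print(nearest_label(query, agm_and_ys))  # picks the YS label once it is available
```

      Reporting the distance alongside the label is one simple way to flag assignments that are forced rather than supported.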

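    1. CMYK is based on the subtractive color principle: mixing inks at different concentrations removes some wavelengths from the reflected light

      The CMYK color model is a way of representing color based on the subtractive color principle (subtractive color model), and it is widely used in the printing industry. The core idea is that mixing different inks removes certain wavelengths of light, which changes the color of the light reflected back and produces the desired color.

      1. Basic concept of subtractive color

      Subtractive color is the opposite of additive color (as in the RGB model). In an additive model, colors are produced by superimposing light; in a subtractive model, colors are produced by absorbing light. Specifically, a subtractive model renders different colors by removing certain wavelengths from the reflected light.

      How are wavelengths removed?

      Each pigment (or ink) absorbs (subtracts) some wavelengths of the light that strikes it and reflects the rest. By mixing pigments (or inks), we control which wavelengths are absorbed and therefore the color of the reflected light. For example:
      - Cyan absorbs red light and reflects green and blue light.
      - Magenta absorbs green light and reflects red and blue light.
      - Yellow absorbs blue light and reflects red and green light.

      When these pigments are mixed, each absorbs its own part of the spectrum, and the light that remains appears as a new color.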

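      2. The four inks of the CMYK color model

      The CMYK model forms colors by mixing four inks:
      - Cyan: absorbs red light, reflects blue and green light.
      - Magenta: absorbs green light, reflects red and blue light.
      - Yellow: absorbs blue light, reflects red and green light.
      - Key (black): provides dark tones and increases contrast; it is used mainly to deepen colors and to render black (because mixing the three primaries does not produce a true black).

      3. How CMYK works: mixing inks to remove wavelengths from reflected light

      On a printed page, light strikes the paper and interacts with the pigments in the ink. Each ink removes certain wavelengths from the light reflected off the paper.

      Suppose the three inks (cyan, magenta, yellow) are applied to the paper. Where they overlap, incoming light interacts with the mixed ink particles: some wavelengths are absorbed and the remainder is reflected. This is the core of the subtractive process.

      For example:
      - Cyan + magenta: cyan absorbs red light and magenta absorbs green light. The reflected light is mainly blue, which is why mixing cyan and magenta gives a purple (blue-violet) result.
      - Cyan + yellow: cyan absorbs red light and yellow absorbs blue light. The reflected light is green, because the mixture reflects only green light.
      - Magenta + yellow: magenta absorbs green light and yellow absorbs blue light. The reflected light is red.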
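
      The pairwise mixing results can be reproduced with a minimal sketch. It assumes idealized inks that each absorb exactly one additive primary and models overprinting as channel-wise multiplication of reflectances; the names `INK_REFLECTANCE` and `mix` are illustrative, and real inks are far less clean than this.

```python
# Idealized reflectance of each ink as (R, G, B), with values in [0, 1].
# Assumption: every ink absorbs one primary completely and reflects the other two.
INK_REFLECTANCE = {
    "cyan": (0.0, 1.0, 1.0),     # absorbs red
    "magenta": (1.0, 0.0, 1.0),  # absorbs green
    "yellow": (1.0, 1.0, 0.0),   # absorbs blue
}


def mix(*inks):
    """Overprint inks on white paper by multiplying reflectances per channel."""
    r, g, b = 1.0, 1.0, 1.0  # white paper reflects all three primaries
    for ink in inks:
        ir, ig, ib = INK_REFLECTANCE[ink]
        r, g, b = r * ir, g * ig, b * ib
    return (r, g, b)


print(mix("cyan", "magenta"))            # (0.0, 0.0, 1.0): blue
print(mix("cyan", "yellow"))             # (0.0, 1.0, 0.0): green
print(mix("magenta", "yellow"))          # (1.0, 0.0, 0.0): red
print(mix("cyan", "magenta", "yellow"))  # (0.0, 0.0, 0.0): black, but only for ideal inks
```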

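      4. Introducing black (K)

      Although cyan, magenta, and yellow can be mixed into many colors, mixing all three does not produce a perfect black. The result is usually a dark gray, so the CMYK model adds a black ink (K) to increase contrast and depth.

      • Black ink gives deeper shadows and stronger contrast, making colors richer, and in printing it avoids laying down excessive amounts of cyan, magenta, and yellow ink.

      For example:
      - To print a dark area, black ink is more efficient than a mix of the three primaries.
      - Black ink sharpens image detail and makes shadow areas clearer.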
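
      The ink saving from a separate black channel can be sketched with the common device-independent RGB-to-CMYK formula, which moves the shared gray component of C, M, and Y into K. This is an illustration only: real print workflows rely on device color profiles, and the helper names `rgb_to_cmy`, `rgb_to_cmyk`, and `total_ink` are hypothetical.

```python
def rgb_to_cmy(r, g, b):
    """Naive CMY without black: each ink is the complement of one RGB channel."""
    return (1.0 - r, 1.0 - g, 1.0 - b)


def rgb_to_cmyk(r, g, b):
    """Naive CMYK: extract the common dark component into K, then rescale C, M, Y."""
    k = 1.0 - max(r, g, b)
    if k == 1.0:  # pure black: avoid dividing by zero, use black ink only
        return (0.0, 0.0, 0.0, 1.0)
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return (c, m, y, k)


def total_ink(channels):
    """Sum of ink coverage across all channels (1.0 = one full layer of ink)."""
    return sum(channels)


mid_gray = (0.5, 0.5, 0.5)
print(rgb_to_cmy(*mid_gray), total_ink(rgb_to_cmy(*mid_gray)))    # three inks at 50%
print(rgb_to_cmyk(*mid_gray), total_ink(rgb_to_cmyk(*mid_gray)))  # black ink only

# Dark green comes out as full cyan and yellow plus 50% black.
print(rgb_to_cmyk(0.0, 0.5, 0.0))
```

      For the mid-gray input, the version with K uses a third of the ink of the CMY-only version, which is the efficiency argument made above.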

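      5. Effects of CMYK mixing:

      • Cyan + magenta + yellow: in theory these should mix to black, but because of the spectral limitations of real pigments the result is a dark gray. Black ink (K) is added to make up the difference.
      • Cyan + yellow + black: produces a very deep green.
      • Magenta + black: produces a dark magenta (a deep purplish red), adding shadow and depth.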

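      6. Practical applications

      The CMYK model is widely used in print color management, especially in image printing, publishing, advertising design, and packaging design. Typical uses include:
      - Printing industry: books, posters, magazines, advertisements, and other printed materials rely on the CMYK model for accurate color reproduction.
      - Digital printing: digital presses and inkjet printers typically use CMYK inks to control color output precisely.
      - Color management systems (CMS): to keep colors consistent across devices (such as displays and printers), the CMYK model is used for color calibration.

      7. Summary

      The CMYK color model is a subtractive model: cyan, magenta, yellow, and black inks are mixed to remove wavelengths from the reflected light and so produce different colors. Its main use is in printing, and it suits any printing technology that controls color by mixing inks. By absorbing different wavelengths of light, CMYK offers a wide range of colors, and the black ink adds depth and contrast. This makes it one of the core technologies in printing and color management.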

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      What follows is our revision plan.

      Manuscript number: RC-2024-02794

      Corresponding author(s): Jo Morris

      [The "revision plan" should delineate the revisions that authors intend to carry out in response to the points raised by the referees. It also provides the authors with the opportunity to explain their view of the paper and of the referee reports.

      The document is important for the editors of affiliate journals when they make a first decision on the transferred manuscript. It will also be useful to readers of the reprint and help them to obtain a balanced view of the paper.

      If you wish to submit a full revision, please use our "Full Revision" template. It is important to use the appropriate template to clearly inform the editors of your intentions.]

      1. General Statements [optional]

      We feel the reviewers understood the paper well and made many reasonable points for improvement.

      Reviewer three raised a concern in the 'Significance' section about the reliance on SAE2 over-expression: "One limitation is the strong reliance on the use of an acetyl-mimicking mutant". We were minded not to rely on the mutant. Hence, the paper contains considerable data on the HDAC6 deacetylase, which is responsible for SAE2 deacetylation. We show that HDAC6 inhibition phenocopies SAE2-K164Q expression and, moreover, that the approaches which rescue the mitotic defects of SAE2-K164Q-expressing cells also rescue the defects of HDAC6-inhibited cells. These observations, we believe, overcome the concern.

      2. Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.

      Revisions.

      R1: As the authors state, SUMO1 conjugates decrease during mitosis and this is somewhat at odds with the proposed model regarding NuMA. The authors can detect a SUMOylated NuMA conjugate (fig. 4a). To test whether the proposed model is correct, the authors could check: a. Whether this form is indeed SUMO1-NuMA b. Whether it decreases upon expression of the SAE2K164Q variant.

      R2: Figure 4: The authors show an ML792-sensitive high molecular weight smear of NuMA in nocodazole-treated cells. It would be very convincing if the authors could demonstrate whether endogenous NuMA is conjugated to SUMO1 or SUMO2 in mitosis by SUMO IPs and whether they can detect a change upon expression of SAE2 variants as in Figure 3a. By replicating this experiment, it would be important to demonstrate the presence of both free and conjugated SUMO paralogs in the input and paralog-specific sumoylation in general (smear) and of NuMA in the IP.

      Response: These are important points. We intend to perform the suggested experiments to address which isoform modifies NuMA, and what the impact of the variant is.

      R2: Figure 2C/Supplementary Figure 3c: The enzyme concentrations used in these reactions are much too high. To discriminate between thioester- and isopeptide-linked SUMO, the same samples should be analyzed in the absence (detection of thioester and isopeptide linkages) and presence of high concentrations of DTT (detection of isopeptide-linked SUMO only). The presented assay is problematic as it shows dimeric SUMO and RanGAP1:SUMO bands in the absence of ATP and no UBC9 but SAE2 thioester/isopeptide formation in the absence of RanGAP1 (preferentially UBC9 should form a thioester/isopeptide bond in this condition as higher molarities of UBC9 over E1 are used). Dimeric SUMO should not be detected unless disulfide bridges are formed between cysteines - this happens when DTT is not present in the reaction - under such conditions, SAE2 and UBC9 can also form disulfide bridges via their catalytic cysteines, impairing their enzymatic activity. In order to interpret the results correctly, it is important to add low concentrations of DTT (~0.1 mM) even in thioester reactions and to distinguish between thioester and isopeptide linkages.

      R2: Figure 2F/ Supplementary Figure 3d: Again, the enzyme concentrations are much too high and need to be reduced to a concentration where mainly RanGAP1 monosumoylation with SUMO1 is detected. As RanGAP1 is the most efficient SUMO substrate known, the enzyme concentrations and reaction time can be greatly reduced to limit the auto-modification of the enzymes and SUMO chain formation. Due to the efficient chain-forming activity of SUMO2, this is more difficult with SUMO2, but can be reduced by limiting the concentration of UBC9 in particular or by using a SUMO2 KallR mutant. In the reaction shown, the authors used only twice the molarity of SUMO compared to the substrate, too low taking into account SUMO2 chain formation, enzyme and substrate modification (The reaction should be limited by enzyme activity not by SUMO2). How can the authors be sure that the band they report as RanGAP1 high MW SUMO2 is indeed RanGAP1 modified and not SAE2 (in comparison to Suppl Figure 3b)?

      Response: We intend to repeat these assays as suggested by the reviewer, reducing the enzyme concentrations and using low-concentration DTT, with the relevant controls and blots to show the identity of the RanGAP1-SUMO2 product. Further, we will show control experiments with and without DTT that demonstrate the sensitivity of the SAE2~SUMO band to the reducing agent.

      R2: Figure 3 nicely shows that ML792-resistant SAE2 variants conjugate SUMO2 equally well, whereas SAE2 K163R is reduced and SAE2 K163Q appears to be abolished in SUMO1 conjugation. However, only high molecular weight SUMO conjugates are shown. What are the levels of free SUMO after overexpression of SAE2 variants and the indicated treatments?

      Response: We will attempt to show free SUMO levels in mitotic cells.

      R2: According to the work of Zhang et al from the Matunis lab (cited as reference 39 in the proposed study), SUMO conjugation is greatly reduced in nocodazole-arrested cells, but is restored after release in G1. Furthermore, SUMO1 and SUMO2 localize to different subcellular regions during mitosis. Have the authors tested whether SAE2 variants differ in their intracellular localization or alter the subcellular localization of SUMO1 and SUMO2 in interphase and mitotic cells?

      Response: We will examine the localisation of the SAE2 variants (see section below for the SUMO proteins).

      R3: It would be helpful if the authors could more clearly separate the two steps catalyzed by the E1. This would be needed to determine whether the accumulation of the SUMO1-AMP intermediate by the K164Q mutant is due to a faster rate of formation or a reduced rate of conversion to the thioester. They could test the AMP formation step in isolation in a straightforward manner by using the double mutant K164Q C173G and measuring a time course of SUMO1-AMP versus SUMO2-AMP build-up. Alternatively, they could try to isolate the second step by adding SUMO1-AMP versus SUMO2-AMP to the E1 de novo - although isolation of the intermediates may be more involved.

      Response: We intend to perform the first approach suggested, making and examining the double mutant's activity as suggested.

      R3: The reason for the isoform selectivity in the context of NuMA SUMOylation remains unresolved. The study would be significantly strengthened if the authors could address the question of whether the mitotic defects come from a lack of NuMA SUMOylation or the wrong type of SUMOylation. In other words, does it matter which isoform of SUMO is attached to NuMA? This could be addressed by also creating a SUMO2 fusion construct and testing whether that suppresses some of the phenotypes observed with the K164Q mutant and upon HDAC6 inhibition.

      Response. This is an excellent suggestion. We intend to make the constructs suggested and perform this experiment for our revision.

      R3. It would be helpful to show a time course of endogenous SAE2 acetylation over the cell cycle, using synchronized cultures.

      Response. We will attempt to gain a view of SAE2 acetylation over the cell cycle, which requires the precipitation of endogenous SAE2 following synchronisation.

      R3: Fig 2a: The figure would be easier to understand if the same colour scheme was used for S1 versus S2 to aid the comparison.

      Response: We will change this.

      R3: The title is not immediately understandable. "SUMO protein bias for mitotic stability" sounds a bit awkward. It would be clearer to be more explicit about isoforms.

      Response: We have considered the title "HDAC6-Dependent Deacetylation of SAE2 enhances SUMO1 Conjugation for Mitotic Integrity". We have not changed it in the current manuscript so as not to confuse the reader; we will change it at the journal level.

      R3: Fig 2b: I don't understand the units of this graph. Why does normalization result in a value of zero, not 1? On this scale, what would a value of 1 signify? How can a value become negative? I would have expected values relative to the WT, with the WT being set to 1 or to 100%. The authors should also show the raw data for this plot.

      Response: The data will be normalised to the WT condition (1 instead of 0), and raw data shown.
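
      The two conventions at issue can be contrasted in a minimal sketch. How the original Fig. 2b was computed is not stated here, so this only shows, with hypothetical band intensities and hypothetical function names, that ratio normalisation puts the WT at 1 while subtracting the WT puts it at 0 and lets other conditions go negative.

```python
def fold_change(values, reference):
    """Express each condition as a ratio to the reference (reference becomes 1.0)."""
    ref = values[reference]
    return {name: value / ref for name, value in values.items()}


def difference_from_reference(values, reference):
    """Express each condition as a difference from the reference (reference becomes 0.0)."""
    ref = values[reference]
    return {name: value - ref for name, value in values.items()}


# Hypothetical densitometry readings in arbitrary units (not data from the manuscript).
raw = {"WT": 200.0, "K164R": 150.0, "K164Q": 50.0}

print(fold_change(raw, "WT"))                # {'WT': 1.0, 'K164R': 0.75, 'K164Q': 0.25}
print(difference_from_reference(raw, "WT"))  # {'WT': 0.0, 'K164R': -50.0, 'K164Q': -150.0}
```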

      R3: Fig 2c: Please also show representative raw data.

      Response: Representative images will be shown.

      R3: Fig 2d,f: Again, the legend should explain what the plots were normalized to.

      Response: Inserted in the legend for Fig. 2d&f: 'The RanGAP1-SUMO1 products are normalised to the WT SAE1:SAE2:SUMO1-only condition (top) and the RanGAP1-SUMO2 products are normalised to the WT SAE1:SAE2:SUMO2-only condition (bottom).'

      R3: Fig S5b: The authors argue with the hydrogen bonding capacities of the different pairings. However, acetylation at K164 should not necessarily prevent a hydrogen bond to SUMO1-E93, considering that the "NH" group is likely still at a comparable distance to the carboxylate of E93 and could in principle undergo H-bonding unless prevented by the steric bulk introduced by the acetyl group. On the other hand, the K164-E93 interaction is the only electrostatic interaction among the 4 possible combinations. While a contribution is not easy to prove experimentally, I think the possibility of charge-charge interactions having an impact should be considered in the discussion.

      Response: Agreed. The figure will be redrawn, and the possibility will be discussed.

      R1: Fig. 2c: Why does C173G form a thioester with SUMO2 up to 40% of the WT?

      Response: We believe this arose from an error in measuring background density in the blots. We will re-assess the method used.

      R3: Fig 4b: The images have very poor contrast. In addition, the merged image would be clearer if two different colours were used.

      Response: We will change one of the colours.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      R1.2. Please clarify the use of Dox addition earlier in the text and legend (it is currently found in Supp. Fig 4).

      Response: Inserted before first result using doxycycline: 'Furthermore, we generated U2OS with a doxycycline-inducible (wild-type) WT FLAG-SAE2 or a FLAG-SAE2-K164R mutant.'

      R1.3. Fig. 4f: what is the difference between the first (invisible NUMA) bipolar and the second, NuMA visible bipolar spindle?

      Response: Fig. 4f now annotated with 'Untransfected' and 'GFP-NuMA transfected'.

      R1.4. ML972- should read ML792 on pg 8.

      Response: Corrected.

      R3: All the experiments showing acetylation are done with transfected FLAG-tagged constructs - are they overexpressed?

      Response: Supplemental Figure 4a illustrates that, with the exception of the C173G mutant, the remaining constructs (WT and the K164 mutants) are all expressed at near-WT levels and are not over-expressed. The C173G mutant is highly expressed.

      R3: On page 3, the authors could introduce a justification of why they tested IR treatment.

      Response: Now justified.

      R3: The authors repeatedly use the word "codon" when they describe a site in the protein. Codon refers to mRNA, so the word "residue" would be more appropriate when talking about a protein.

      Response: Agreed. Done.

      R3: Page 8: "confirmation" should be "conformation".

      Response: Done.

      R3: Page 8: "While we find a little role for..." - delete "a"

      Response: Done.

      R2: Supplementary Figure 2: Please indicate the size of the marker bands, the fraction numbers and which fractions were pooled for further analysis. Is there any explanation why SAE1:SAE2K164R elutes in two peaks, suggesting two complexes? How different are they in size?

      Response: Ladder markers added to each gel image. Fraction numbers added. Black box indicates fractions pooled. Figure updated with relevant recombinant protein preps generated for updated in vitro experiments. The additional SAE1:SAE2-K164R peak which appeared in the previous manuscript Supp. Fig. 2a eluted in the void volume, so we think it comprised aggregated SAE1:SAE2 protein; more recent preparations do not show it.

      R3: The authors should include a more detailed discussion of the importance of the absolute and relative concentrations of free SUMO1 versus SUMO2/3 as a possible mechanism to impose isoform bias. Specifically, they should consider the different KM values of the E1 for the isoforms. The literature says that the E1 has a lower KM (higher affinity) for SUMO1 than SUMO2/3 but also a lower kcat (considering both steps of its reaction together), resulting in an approximately equal Kcat/KM. This would mean that at low overall SUMO concentrations, SUMO1 would have an advantage, whereas with rising SUMO concentrations SUMO2/3 would be favoured (which might be particularly important during stress conditions). What part of the curve does the cellular environment reflect?

      Response: Yes, good point. Now included:

      R3: Fig 3g: Could the authors comment on the detrimental effects of both SUMO1 and SUMO2 in the WT background?

      Response: Comment included.

      R3: Fig 3h: typo ("Trasfect")

      Response: Done.

      R3: Fig 4f: The DAPI signal is hardly visible - better contrast would help.

      Response: Improved.

      R3: Fig S2: It would be appropriate to indicate which fractions were actually collected or combined during the purification.

      Response: Ladder markers added to each gel image. Fraction numbers added. Black box indicates the fractions pooled.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      R2: According to the work of Zhang et al from the Matunis lab (cited as reference 39 in the proposed study), SUMO conjugation is greatly reduced in nocodazole-arrested cells, but is restored after release in G1. Furthermore, SUMO1 and SUMO2 localize to different subcellular regions during mitosis. Have the authors tested whether SAE2 variants differ in their intracellular localization or alter the subcellular localization of SUMO1 and SUMO2 in interphase and mitotic cells?

      Response: We have investigated SUMO isoform location. However, in our hands, using a range of SUMO antibodies, we do not see the previously reported localisations in mitotic wild-type cells, and thus we are not able to assess the impact of the SAE2 variants. As our phenotypes are restricted to mitosis, we do not consider it worthwhile to look at interphase.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, the authors report on an interesting regulatory mechanism that influences the balance between conjugation of the different SUMO isoforms, SUMO1 versus SUMO2/3. The authors describe that acetylation of a specific residue, K164, in the SUMO activating enzyme (E1) subunit, SAE2, biases the E1's preference towards SUMO2/3. Specifically, they use an acetylation-mimicking K164Q mutation to show that the acetylation state of SAE2 likely affects the affinity of the E1 to SUMO and the rate of thioester formation. With an antibody, they demonstrate the acetylation of SAE2 in cells. Mechanistically, they locate the cause of the isoform bias to a residue in the C-terminus of SUMO in proximity to K164 of SAE2, where SUMO1 carries glutamate, while SUMO2/3 has glutamine. Switching these residues between the SUMO isoforms reverses the isoform preference of the E1. Phenotypically, the SAE2 K164Q mutant induces mitotic problems that the authors attribute to the SUMOylation of the NuMA complex. They assign the deacetylation of SAE2 to HDAC6 and report that deacetylation occurs during mitosis. These results are consistent with a model that SUMO1 modification of the NuMA complex in mitosis is important for mitotic fidelity and that the cell cycle-dependent changes in the acetylation status of SAE2 promote this. Accordingly, fusion of SUMO1 to a NuMA subunit partially overcomes the problems induced by the K164Q mutant or the inhibition of HDAC6.

      Major comments:

      The experiments are largely performed in a well-controlled manner, and overall, the study is very convincing. I would like to suggest a few experiments that would strengthen the authors' conclusions, and there are a few minor issues with some of the figures.

      1. It would be helpful if the authors could more clearly separate the two steps catalyzed by the E1. This would be needed to determine whether the accumulation of the SUMO1-AMP intermediate by the K164Q mutant is due to a faster rate of formation or a reduced rate of conversion to the thioester. They could test the AMP formation step in isolation in a straightforward manner by using the double mutant K164Q C173G and measuring a time course of SUMO1-AMP versus SUMO2-AMP build-up. Alternatively, they could try to isolate the second step by adding SUMO1-AMP versus SUMO2-AMP to the E1 de novo - although isolation of the intermediates may be more involved.
      2. The reason for the isoform selectivity in the context of NuMA SUMOylation remains unresolved. The study would be significantly strengthened if the authors could address the question of whether the mitotic defects come from a lack of NuMA SUMOylation or the wrong type of SUMOylation. In other words, does it matter which isoform of SUMO is attached to NuMA? This could be addressed by also creating a SUMO2 fusion construct and testing whether that suppresses some of the phenotypes observed with the K164Q mutant and upon HDAC6 inhibition.
      3. The authors should include a more detailed discussion of the importance of the absolute and relative concentrations of free SUMO1 versus SUMO2/3 as a possible mechanism to impose isoform bias. Specifically, they should consider the different KM values of the E1 for the isoforms. The literature says that the E1 has a lower KM (higher affinity) for SUMO1 than SUMO2/3 but also a lower kcat (considering both steps of its reaction together), resulting in an approximately equal Kcat/KM. This would mean that at low overall SUMO concentrations, SUMO1 would have an advantage, whereas with rising SUMO concentrations SUMO2/3 would be favoured (which might be particularly important during stress conditions). What part of the curve does the cellular environment reflect?
      4. It would be helpful to show a time course of endogenous SAE2 acetylation over the cell cycle, using synchronized cultures. All the experiments showing acetylation are done with transfected FLAG-tagged constructs - are they overexpressed? Is it not possible to work with endogenous SAE2?
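
      The kinetic argument in point 3 can be explored with a minimal Michaelis-Menten sketch. The parameter values below are purely hypothetical, chosen only so that the two isoforms have the same kcat/KM; each isoform is treated in isolation, so the sketch does not model competition for a shared pool of E1, and the names `mm_rate` and `rate_ratio` are illustrative.

```python
def mm_rate(s, kcat, km, e_total=1.0):
    """Michaelis-Menten rate v = kcat * [E] * [S] / (KM + [S]) in arbitrary units."""
    return kcat * e_total * s / (km + s)


# Hypothetical parameters: SUMO1 has the lower KM and lower kcat, SUMO2/3 the higher
# KM and higher kcat, with identical kcat/KM (1.0 in both cases).
SUMO1 = {"kcat": 1.0, "km": 1.0}
SUMO2 = {"kcat": 5.0, "km": 5.0}


def rate_ratio(s):
    """Ratio of the SUMO2/3 rate to the SUMO1 rate at the same free concentration."""
    return mm_rate(s, **SUMO2) / mm_rate(s, **SUMO1)


for s in (0.01, 1.0, 100.0):
    print(f"[S] = {s:>6}: v(SUMO1) = {mm_rate(s, **SUMO1):.4f}, "
          f"v(SUMO2/3) = {mm_rate(s, **SUMO2):.4f}, ratio = {rate_ratio(s):.2f}")
```

      Under these assumptions the two rates nearly coincide at low concentration and the ratio climbs toward the kcat ratio as the concentration rises, so where the cellular free SUMO pool sits relative to the two KM values decides how much the higher-kcat isoform is favoured.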

      Minor comments:

      • The title is not immediately understandable. "SUMO protein bias for mitotic stability" sounds a bit awkward. It would be clearer to be more explicit about isoforms.
      • On page 3, the authors could introduce a justification of why they tested IR treatment.
      • The authors repeatedly use the word "codon" when they describe a site in the protein. Codon refers to mRNA, so the word "residue" would be more appropriate when talking about a protein.
      • Page 8: "confirmation" should be "conformation".
      • Page 8: "While we find a little role for..." - delete "a"
      • Fig 2a: The figure would be easier to understand if the same colour scheme was used for S1 versus S2 to aid the comparison.
      • Fig 2b: I don't understand the units of this graph. Why does normalization result in a value of zero, not 1? On this scale, what would a value of 1 signify? How can a value become negative? I would have expected values relative to the WT, with the WT being set to 1 or to 100%. The authors should also show the raw data for this plot.
      • Fig 2c: Please also show representative raw data.
      • Fig 2d,f: Again, the legend should explain what the plots were normalized to.
      • Fig 3g: Could the authors comment on the detrimental effects of both SUMO1 and SUMO2 in the WT background?
      • Fig 3h: typo ("Trasfect")
      • Fig 4b: The images have very poor contrast. In addition, the merged image would be clearer if two different colours were used.
      • Fig 4f: The DAPI signal is hardly visible - better contrast would help.
      • Fig S2: It would be appropriate to indicate which fractions were actually collected or combined during the purification.
      • Fig S5b: The authors argue with the hydrogen bonding capacities of the different pairings. However, acetylation at K164 should not necessarily prevent a hydrogen bond to SUMO1-E93, considering that the "NH" group is likely still at a comparable distance to the carboxylate of E93 and could in principle undergo H-bonding unless prevented by the steric bulk introduced by the acetyl group. On the other hand, the K164-E93 interaction is the only electrostatic interaction among the 4 possible combinations. While a contribution is not easy to prove experimentally, I think the possibility of charge-charge interactions having an impact should be considered in the discussion.

      Significance

      The results presented here are interesting and novel. Importantly, the authors provide a molecular model for a new mechanism of how the SUMO system achieves isoform specificity, which is still a very poorly understood phenomenon. The manuscript makes a significant advance by contributing an important new aspect of how the SUMO conjugation machinery chooses between isoforms. A strength of the manuscript is the very good evidence it provides for its conclusions. One limitation is the strong reliance on the use of an acetyl-mimicking mutant; this limitation could be overcome by placing a bit more emphasis on detecting endogenous SAE2 acetylation.

      Audience: The study should be relevant to a broad audience, given the impact of the SUMO system on cellular regulation; after all, the study addresses a very fundamental problem in the field. In addition, it should be of interest to researchers studying regulation of mitosis.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Walker et al. characterized lysine 164 acetylation of the catalytic SUMO-activating enzyme subunit SAE2 and observed that this modification causes a bias towards SUMO2/3 over SUMO1 involving their C-terminal tails. While several enzymes appear to mediate SAE2 acetylation, HDAC6 is responsible for deacetylating SAE2 in mitosis, thereby promoting mitotic SUMO1 modification. The nuclear mitotic apparatus protein, NuMA, was identified as a putative mitotic SUMO1 substrate upon SAE2 deacetylation. Replacement of endogenous SAE2 with an acetylation-mimetic SAE2-K164Q mutant restricts SUMO1 conjugation of NuMA, resulting in multipolar spindle formation that can be rescued either by overexpression of SUMO1 or by a SUMO1-NuMA fusion.

      Major comments:

      • Figure 2c/Supplementary Figure 3c: The enzyme concentrations used in these reactions are much too high. To discriminate between thioester- and isopeptide-linked SUMO, the same samples should be analyzed in the absence (detection of thioester and isopeptide linkages) and presence of high concentrations of DTT (detection of isopeptide-linked SUMO only). The presented assay is problematic as it shows dimeric SUMO and RanGAP1:SUMO bands in the absence of ATP, and it shows no UBC9 but SAE2 thioester/isopeptide formation in the absence of RanGAP1 (UBC9 should preferentially form a thioester/isopeptide bond in this condition, as higher molarities of UBC9 over E1 are used). Dimeric SUMO should not be detected unless disulfide bridges are formed between cysteines, which happens when DTT is not present in the reaction; under such conditions, SAE2 and UBC9 can also form disulfide bridges via their catalytic cysteines, impairing their enzymatic activity. In order to interpret the results correctly, it is important to add low concentrations of DTT (~0.1 mM) even in thioester reactions and to distinguish between thioester and isopeptide linkages.
      • Figure 2f/Supplementary Figure 3d: Again, the enzyme concentrations are much too high and need to be reduced to a concentration where mainly RanGAP1 monosumoylation with SUMO1 is detected. As RanGAP1 is the most efficient SUMO substrate known, the enzyme concentrations and reaction time can be greatly reduced to limit the auto-modification of the enzymes and SUMO chain formation. Due to the efficient chain-forming activity of SUMO2, this is more difficult with SUMO2, but chain formation can be reduced by limiting the concentration of UBC9 in particular or by using a SUMO2 KallR mutant. In the reaction shown, the authors used only twice the molarity of SUMO compared to the substrate, which is too low when taking into account SUMO2 chain formation as well as enzyme and substrate modification (the reaction should be limited by enzyme activity, not by SUMO2). How can the authors be sure that the band they report as RanGAP1 high MW SUMO2 is indeed modified RanGAP1 and not SAE2 (in comparison to Suppl. Figure 3b)?
      • Figure 3 nicely shows that ML792-resistant SAE2 variants conjugate SUMO2 equally well, whereas SAE2 K163R is reduced and SAE2 K163Q appears to be abolished in SUMO1 conjugation. However, only high molecular weight SUMO conjugates are shown. What are the levels of free SUMO after overexpression of SAE2 variants and the indicated treatments? According to the work of Zhang et al from the Matunis lab (cited as reference 39 in the proposed study), SUMO conjugation is greatly reduced in nocodazole-arrested cells, but is restored after release in G1. Furthermore, SUMO1 and SUMO2 localize to different subcellular regions during mitosis. Have the authors tested whether SAE2 variants differ in their intracellular localization or alter the subcellular localization of SUMO1 and SUMO2 in interphase and mitotic cells? Can the authors comment on the proportion of SAE2 that is acetylated?
      • Figure 4: The authors show an ML792-sensitive high molecular weight smear of NuMA in nocodazole-treated cells. It would be very convincing if the authors could demonstrate by SUMO IPs whether endogenous NuMA is conjugated to SUMO1 or SUMO2 in mitosis, and whether they can detect a change upon expression of SAE2 variants as in Figure 3a. When replicating this experiment, it would be important to demonstrate the presence of both free and conjugated SUMO paralogs in the input, as well as paralog-specific sumoylation in general (smear) and of NuMA in the IP.

      Minor comments:

      • Supplementary Figure 2: Please indicate the size of the marker bands, the fraction numbers and which fractions were pooled for further analysis. Is there any explanation why SAE1:SAE2K164R elutes in two peaks, suggesting two complexes? How different are they in size?

      Significance

      The finding that E1 acetylation regulates SUMO paralog specificity is very exciting, particularly because of its link to key regulatory mitotic functions. Overall, the findings are intriguing and supported in part by various biological and biochemical methods. However, some concerns remain unsatisfactorily addressed, as outlined above.

      The findings provide a novel basic concept of how E1 enzyme regulation contributes to the specification of modifier selectivity, demonstrate cross-talk with other PTMs and reveal a biological function. Therefore, the study is of interest to a broad audience.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In their manuscript, Walker et al. investigate the physiological role of deacetylation of the SAE2 subunit of the SUMO E1 enzyme. They find that SAE1:SAE2-acK164 is deacetylated in an HDAC6-dependent manner and use a series of biochemical assays to show that deacetylation of the SAE2 subunit shifts the bias of the SUMO E1 towards SUMO1 conjugation in vitro, proposing a mechanism that is similar to the one that the NEDD8 E1 employs to discriminate between NEDD8 and ubiquitin.

      The authors continue to examine the role of different SAE2 variants in different cellular stresses and show that the acetyl-mimicking SAE2K164Q variant displays reduced levels of high molecular weight SUMO1 conjugates in mitotic cells. This variant cannot support proper mitotic spindle formation leading to the appearance of multipolar spindles and centromere-containing micronuclei. Finally, they go on to identify the mechanism underlying these phenotypes and examine NuMA SUMOylation. They test SUMOylation-refractive NuMA variants as well as an already published SUMO1-NuMA fusion that mimics the SUMOylated protein form. They propose a model in which deacetylation of SAE2 changes the bias of the SUMO E1 to increase SUMO1-NuMA conjugation during mitosis, promoting bipolar spindle formation.

      Major point:

      As the authors state, SUMO1 conjugates decrease during mitosis and this is somewhat at odds with the proposed model regarding NuMA. The authors can detect a SUMOylated NuMA conjugate (fig. 4a). To test whether the proposed model is correct, the authors could check:

      a. Whether this form is indeed SUMO1-NuMA

      b. Whether it decreases upon expression of the SAE2K164Q variant.

      Minor points:

      1. Fig. 2c: Why does C173G form a thioester with SUMO2 up to 40% of the WT?
      2. Please clarify the use of Dox addition earlier in the text and legend (it is currently found only in Supp. Fig. 4).
      3. Fig. 4f: What is the difference between the first bipolar spindle (NuMA invisible) and the second bipolar spindle (NuMA visible)?
      4. "ML972" should read "ML792" on pg. 8.

      Significance

      General assessment:

      This is a thorough study with complex but well-controlled experiments and contains a large amount of valuable information. One point could be clarified to provide further support for the proposed model.

      Advance:

      The manuscript advances our understanding of the regulation of the SUMO conjugation system and links it to a physiological context.

      Audience:

      Basic science: the ubiquitin family field and the mitosis-cytoskeleton field. Applied science: the use of SUMO inhibitors in cancer.

      Expertise: SUMO regulation of the cytoskeleton during mitosis (yeast system)

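    1. Relationship to YUV

      Yes, your understanding is accurate! YUV is essentially similar to CIELAB: both build on the human eye's high sensitivity to luminance, and both handle luminance and chrominance separately, aiming to improve efficiency and match the way the eye perceives images.

      1. Similarities between YUV and CIELAB

      In the YUV format, the Y component represents luminance (how light or dark the image is), while the U and V components represent chrominance (the hue and saturation of the color). This closely resembles the design of CIELAB, which uses L* for lightness and a* and b* for chromaticity (color differences along the red-green and blue-yellow axes).

      Why is YUV similar to CIELAB?

      • The human eye is more sensitive to luminance: the eye is far more sensitive to changes in luminance (light and dark) than to changes in chrominance (hue and saturation). This is because we rely on luminance information to recognize detail and contrast in an image and are much less sensitive to changes in color. Because of this property, separating luminance from chrominance in video and image processing effectively reduces redundant information and improves compression efficiency.

      In YUV, the Y component represents luminance, which dominates human visual perception; the U and V components represent chrominance, and changes in them have less effect on what we perceive. In video compression and transmission, the U and V components can therefore usually be subsampled (for example with 4:2:0 sampling), reducing the amount of data without noticeably affecting visual quality.

      • Chroma compression: because the eye is less sensitive to chrominance (such as red-green and blue-yellow changes) than to luminance, YUV formats compress the chroma components, and in many video coding formats the chroma resolution is lower than the luma resolution. This resembles CIELAB, where the chromatic part (a* and b*) is kept separate from lightness (L*) and changes in the chromatic part have a comparatively small visual impact.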
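
      The luma/chroma split and 4:2:0 subsampling described above can be sketched in a few lines of Python. This is a minimal illustration, not a codec: it assumes 8-bit full-range values and the BT.601 luma weights (the digital form usually called Y'CbCr, as used in JPEG), and the function names are made up for this example. Real video pipelines use other ranges, other weights for HD content, and more careful chroma filtering.

```python
# Sketch: full-range BT.601 RGB -> Y'CbCr ("YUV") and 4:2:0 chroma subsampling.
# Assumptions: 8-bit full-range input, BT.601 luma weights, even image dimensions.

def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range Y'CbCr (returned as floats)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b          # luma: weighted sum of R, G, B
    cb = 128.0 + 0.5 * (b - y) / (1.0 - 0.114)     # blue-difference chroma
    cr = 128.0 + 0.5 * (r - y) / (1.0 - 0.299)     # red-difference chroma
    return y, cb, cr


def subsample_420(plane):
    """Average every 2x2 block of a chroma plane (halves width and height)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, width, 2)
        ]
        for i in range(0, height, 2)
    ]


def encode_420(image):
    """image: rows of (r, g, b) tuples. Returns full-resolution Y and subsampled Cb, Cr."""
    converted = [[rgb_to_ycbcr(*pixel) for pixel in row] for row in image]
    y_plane = [[p[0] for p in row] for row in converted]
    cb_plane = [[p[1] for p in row] for row in converted]
    cr_plane = [[p[2] for p in row] for row in converted]
    return y_plane, subsample_420(cb_plane), subsample_420(cr_plane)


if __name__ == "__main__":
    image = [[(10 * i, 20 * j, 30) for j in range(4)] for i in range(4)]
    y_plane, cb_plane, cr_plane = encode_420(image)
    samples = sum(len(row) for plane in (y_plane, cb_plane, cr_plane) for row in plane)
    print("RGB samples:", 4 * 4 * 3, "4:2:0 samples:", samples)
```

      For a 4x4 image, the three RGB planes hold 48 samples, while the 4:2:0 representation holds 16 luma samples plus 4 + 4 chroma samples, i.e. half the data before any further compression.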

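      2. Differences between YUV and CIELAB

      Although YUV and CIELAB have a similar structure, they differ mainly in their design goals and application areas:

      • CIELAB is a device-independent color space intended to provide a unified standard for describing how the human eye perceives color, and it can be used for color conversion between different devices. It does not depend on any particular device or display technology.

      • YUV was designed for video coding and transmission. It specifically takes data compression into account and reduces redundant information by subsampling chroma. It is mainly used for transmitting and storing video signals, especially in efficient compression schemes (for example, H.264 and the MPEG standards).

      3. Summary

      • Both YUV and CIELAB build on the eye's sensitivity to luminance, so both improve efficiency by separating luminance from chrominance information. YUV uses Y as the luminance component and U and V as the chrominance components, which makes chroma compression possible, whereas CIELAB uses L* for lightness and a* and b* for chromaticity so that the color representation matches human perception.

      • The goal of YUV is efficient video compression and transmission, whereas the goal of CIELAB is device-independent color representation and cross-device color management, so that colors stay consistent across devices (such as displays and printers).

      In short, YUV and CIELAB share a similar design philosophy but differ in application and function: YUV focuses on the efficiency of transmitting video data, while CIELAB focuses on providing a unified color standard for color management and color conversion between devices.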

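    2. Unlike color models such as RGB or CMYK, the CIELAB color space does not depend on any particular device (such as a display or printer); it is designed as a unified standard that can be used by any device and color-processing system

      To understand how the CIELAB color space differs from color models such as RGB and CMYK, we need to compare them in terms of device dependence and device independence. A detailed explanation follows:

      1. Device dependence of the RGB and CMYK color models

      RGB and CMYK are typical device-dependent color models, meaning that they depend on the color characteristics and rendering method of a particular device.

      • RGB color space: RGB is an additive color model, typically used for electronic displays (such as monitors, televisions, and phone screens). RGB color values depend on the display characteristics of the device, for example the type of backlight, the gamut of the LCD panel, and the display's brightness and contrast. The same RGB values may therefore appear as different colors on different displays. RGB color spaces are device-specific because different devices have different gamuts (the range of colors a device can display).

      • CMYK color space: CMYK is a subtractive color model used mainly for printing devices such as printers. It uses cyan, magenta, yellow, and black inks to produce colors. Because ink formulations, paper types, and printer performance differ, the output of the CMYK model also varies from device to device.

      What the two models have in common is that they depend on the physical characteristics of the device: color reproduction is limited and influenced by the device, so colors may look different on different devices.

      2. Device independence of the CIELAB color space

      Unlike RGB and CMYK, the CIELAB (Lab) color space is a device-independent color model; that is, it does not depend on the color characteristics of any particular device. CIELAB is a color space based on human visual perception, intended to provide a standardized, device-independent way of representing color.

      • Device independence: CIELAB was designed with the perceptual characteristics of the human visual system in mind, so that the representation of color comes as close as possible to how the eye naturally perceives it. It does not depend on the working principle or technology of any particular device (such as a display, printer, or scanner) but is based on a unified standard, the CIE 1931 XYZ color space (defined by the International Commission on Illumination, CIE). After conversion to CIELAB, color differences and color appearance can be represented precisely, regardless of the physical characteristics of the display or output device.

      • Unified standard: CIELAB provides a global standard that applies to all devices, so color differences between devices can be expressed uniformly in CIELAB. Because the CIELAB color space is independent of a device's optical and electronic characteristics, it can accurately describe and compare colors on any device. For example, although a display and a printer have different gamuts, CIELAB allows accurate color conversion and matching between them.

      • Color consistency: because CIELAB is device-independent, it can serve as an intermediate color space between devices, helping to transfer and match colors across them. For example, when an image goes from a digital camera to a computer display and then from the display to a printer, a color management system typically first converts the image data to the CIELAB color space and then adjusts it according to each device's gamut and characteristics, keeping the colors as consistent as possible across devices.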
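
      The route from a device value through CIE XYZ to CIELAB can be illustrated with a short Python sketch. The key assumption, and the point of the example, is that the RGB input has to be interpreted through some device profile before it has Lab coordinates at all; here the input is assumed to be sRGB with a D65 reference white, and the helper names are invented for this illustration. A real color management system would use measured device profiles instead of a fixed matrix.

```python
# Sketch: sRGB (8-bit) -> CIE XYZ (D65) -> CIELAB.
# Assumption: the input is sRGB-encoded; another device would need its own profile here.

D65_WHITE = (0.95047, 1.00000, 1.08883)  # reference white, normalised so that Y = 1


def srgb_to_linear(c8):
    """Undo the sRGB transfer curve for one 8-bit channel."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def srgb_to_xyz(r8, g8, b8):
    """Linearise the channels, then apply the sRGB-to-XYZ matrix (D65)."""
    r, g, b = (srgb_to_linear(v) for v in (r8, g8, b8))
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    return x, y, z


def _f(t):
    """CIELAB companding function: cube root with a linear segment near zero."""
    delta = 6.0 / 29.0
    return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3.0 * delta ** 2) + 4.0 / 29.0


def xyz_to_lab(x, y, z, white=D65_WHITE):
    """Convert XYZ to (L*, a*, b*) relative to the given reference white."""
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), white))
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def srgb_to_lab(r8, g8, b8):
    return xyz_to_lab(*srgb_to_xyz(r8, g8, b8))


if __name__ == "__main__":
    for name, rgb in [("white", (255, 255, 255)), ("black", (0, 0, 0)), ("red", (255, 0, 0))]:
        print(name, tuple(round(v, 2) for v in srgb_to_lab(*rgb)))
```

      Under these assumptions, white maps to roughly L* = 100 with a* and b* near zero, and black maps to L* = 0. The same RGB triple interpreted through a different device profile would land at different Lab coordinates, which is the device dependence described above.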

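      3. Practical applications of CIELAB

      Its device independence has led to CIELAB being widely used in several areas:

      • Color management: color management systems (CMS) use CIELAB as an intermediate color space for converting colors between devices. In printing and display workflows, for example, using CIELAB as the intermediary keeps image colors consistent when converting from display to printer.
      • Color-difference calculation: CIELAB is designed to represent the eye's perception of color differences more accurately, so it is widely used for color-difference calculations (such as ΔE values) that quantify the difference between two colors.
      • Cross-device color matching: in industries such as photography, printing, and digital image processing, CIELAB is used to ensure color consistency across devices (displays, printers, scanners, and so on).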
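
      As a concrete example of the ΔE calculation mentioned above, the simplest color-difference formula (CIE76) is just the Euclidean distance between two colors in Lab coordinates. The sketch below uses that formula with made-up sample values; later formulas such as CIE94 and CIEDE2000 add perceptual corrections and are not shown here.

```python
# Sketch: CIE76 color difference, i.e. Euclidean distance in (L*, a*, b*).
import math


def delta_e_cie76(lab1, lab2):
    """Return the CIE76 color difference between two (L*, a*, b*) triples."""
    return math.sqrt(sum((c1 - c2) ** 2 for c1, c2 in zip(lab1, lab2)))


if __name__ == "__main__":
    reference = (50.0, 0.0, 0.0)   # hypothetical target color
    measured = (50.0, 3.0, 4.0)    # hypothetical measured color
    print(delta_e_cie76(reference, measured))
```

      With these sample values the difference is 5.0, because the two colors differ by 3 units in a* and 4 units in b* and have the same lightness.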

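      4. Summary

      • RGB and CMYK are device-dependent: they depend on the physical characteristics and working principles of a specific device, such as a display's light source or a printer's inks, so the same RGB or CMYK values may look different on different devices.
      • CIELAB is device-independent: it provides a standardized color model based on human perception rather than on the characteristics of a particular device, so it can accurately describe and compare colors on different devices and ensure color consistency across them.

      The design of CIELAB therefore makes it an ideal choice for cross-device color management and color matching: it helps keep colors consistent and also provides more precise color description and quantification of color differences.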

    1. Reviewer #3 (Public review):

      Genetic manipulation of Leishmania has some challenges, including some limitations in the DNA repair strategies that are present in the organism and the absence of RNA interference in many species. The senior author has contributed significantly to expanding the available routes towards Leishmania genetic manipulation by developing and adapting CRISPR-Cas9 tools to allow gene manipulation via DNA double strand break repair and, more recently, base modification. This work seeks to improve on some limitations in the tools previously described for the latter approach of base modification leading to base change.

      The work in the paper is meticulously described, with solid evidence for the improvements that are claimed: Fig. 1 clearly describes reduced impairment in growth of parasites expressing sgRNAs via changes in promoters; Figs. 2 and 3 compellingly document the usefulness of using AsCas12a for integration after transformation; Figs. 1 and 4 demonstrate the capacity of the combined modifications to efficiently edit a gene in three different Leishmania species; and Fig. 5 shows that this approach can be conducted at scale, providing a means of assessing the fitness of mutant pools. There is little doubt these new tools will be adopted by the Leishmania community, adding to the growing arsenal of approaches for genetic manipulation.

      Two weaknesses suggested in the initial submission have been completely addressed.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      While CRISPR/Cas technology has greatly facilitated the ability to perform precise genome edits in Leishmania spp., the lack of a non-homologous DNA end-joining (NHEJ) pathway in Leishmania has prevented researchers from performing large-scale Cas-based perturbation screens. With the introduction of base editing technology to the Leishmania field, the Beneke lab has begun to address this challenge (Engstler and Beneke, 2023).

      In this study, the authors build on their previously published protocols and develop a strategy that:

      (1) allows for very high editing efficiency. The cell editing frequency of 1 edit per 70 cells reported in this study represents a 400-fold improvement over the previously published protocol,

      (2) reduces the negative effects of high sgRNA levels on parasite growth by using a weaker T7 promoter to drive sgRNA transcription.

      The combination of these two improvements should open the door to exciting large-scale screens and thus be of great interest to researchers working with Leishmania and beyond.

      We thank reviewer #1 for these encouraging comments.

      Reviewer #2 (Public Review):

      Summary:

      Previously, the authors published a Leishmania cytosine base editor (CBE) genetic tool that enables the generation of functionally null mutants. This works by utilising a CAS9-cytidine deaminase variant that is targeted to a genetic locus by a small guide RNA (sgRNA) and causes cytosine to thymine conversion. This has the potential to generate a premature stop codon and therefore a loss of function mutant.
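
      To illustrate how a cytosine-to-thymine edit can create a premature stop codon, the following Python sketch lists the in-frame codons of a coding sequence that become TAA, TAG, or TGA through C-to-T changes alone. It is a simplified illustration and not the authors' guide design tool: it ignores the editing window, PAM requirements, and sequence context, and the function names and the example sequence are hypothetical. A C-to-T edit on the opposite strand appears as G-to-A on the coding strand, so both cases are checked.

```python
# Sketch: which codons can a cytosine base editor turn into a stop codon?
# Simplification: no editing window, PAM or sequence-context rules are modelled.
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def edited_variants(codon, frm, to):
    """All codons reachable by converting any non-empty subset of `frm` bases to `to`."""
    options = [(base, to) if base == frm else (base,) for base in codon]
    return {"".join(variant) for variant in product(*options)} - {codon}


def stop_edits(codon):
    """Stop codons reachable by C->T (coding strand) or G->A (C->T on the opposite strand)."""
    if codon in STOP_CODONS:
        return set()
    reachable = edited_variants(codon, "C", "T") | edited_variants(codon, "G", "A")
    return reachable & STOP_CODONS


def find_stop_sites(cds):
    """Return (codon index, codon, sorted reachable stops) for each editable in-frame codon."""
    cds = cds.upper()
    sites = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        stops = stop_edits(codon)
        if stops:
            sites.append((i // 3, codon, sorted(stops)))
    return sites


if __name__ == "__main__":
    # Hypothetical toy sequence: ATG CAA TGG GCA TAA
    for index, codon, stops in find_stop_sites("ATGCAATGGGCATAA"):
        print(index, codon, "->", ", ".join(stops))
```

      In this toy sequence, the glutamine codon CAA can become TAA, and the tryptophan codon TGG can become TAG, TGA, or TAA via edits on the opposite strand; CAG and CGA are the other two codons that a coding-strand C-to-T edit converts to a stop.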

      CBE has advantages over existing CAS-based knockout tools because it allows the targeting of multicopy gene families and, potentially, the easier generation of pooled loss of function mutants in complex population experiments. Although successful, the first generation of this genetic tool had several limitations that may have prevented its wider adoption, especially in complex genome-wide screens. These include nonspecific toxicity of the sgRNAs, low transfection efficiencies, low editing efficiencies, a proportion of transfectants that express multiple different sgRNAs, and insufficient effectivity in some Leishmania species.

      Here, the authors set out to systematically solve each of these limitations. By trialling different transfection conditions and different CAS12a cut sites to promote sgRNA expression cassette integration, they increase the transfection efficiency 400-fold and ensure that only a single sgRNA expression cassette integrates that edits with high efficiencies. By trialling different T7 promoters, they significantly reduce the non-specific toxicity of sgRNA expression whilst retaining high editing efficiencies in several Leishmania species (Leishmania major, L. mexicana and L. donovani). By improving the sgRNA design, the authors predict that null mutants will be more efficiently produced after editing.

      This tool will find adoption for producing null mutants of single-copy genes, multicopy gene families, and potentially genome-wide mutational analyses.

      Strengths:

      This is an impressive and thorough study that significantly improves the previous iteration of the CBE. The approach is careful and systematic and reflects the authors' excellent experience developing CRISPR tools. The quality of data and analysis is high and data are clearly presented.

      Weaknesses:

      Figure 4 shows that editing of PF16 is 'reversed' between day 6 and day 16 in L. mexicana WTpTB107 cells. The authors reasonably conclude that in drug-selected cells there is a mixed population of edited and non-edited cells, possibly due to mis-integration of the sgRNA expression construct, and non-edited cells outcompete edited cells due to a growth defect in PF16 loss of function mutants. However, this suggests that the CBE tool will not work well for producing mutants with strong fitness phenotypes without incorporating a limiting dilution cloning step (at least in L. mexicana and quite possibly other Leishmania species). Furthermore, it suggests it will not be possible to incorporate genes associated with a growth defect into a pooled drop-out screen as described in the paper. This issue is not well explored in the paper and the authors have not validated their tool on a gene associated with a severe growth defect, or shown that their tool works in a mixed population setting.

      We would like to thank reviewer #2 for this helpful comment and valid point. We have now included a small-scale loss-of-function screen in L. mexicana, targeting nine known essential genes with 24 CBE sgRNAs and 15 non-targeting control sgRNAs. This approach successfully detected all of the known growth-associated phenotypes that were included, in a pooled screening format. This experiment is now shown in Figure 5 and described in section “Detection of fitness-associated phenotypes in a pooled loss-of-function screen”.

      In addition, we would like to reiterate our initial public response to this comment. We believe that escapes or reversals of mutant phenotypes can also be observed with other genetic tools used for loss-of-function screening, including lentiviral CRISPR approaches in mammalian systems and RNAi in Trypanosoma brucei (e.g. Ariyanayagam et al., 2005 and Schlecker et al., 2005). Notably, in lentivirally delivered CRISPR screens, sgRNA expression cassettes are integrated in random places within the genome and multiple cassettes can be integrated depending on the viral titre. In these types of screens, cells can escape phenotypes through various mechanisms, such as promoter silencing or selection of non-deleterious mutations. Additionally, not every CRISPR guide is efficient in generating a mutant phenotype, and RNAi constructs can also vary in their effectiveness. Despite these challenges, genome-wide loss-of-function screens have been successfully carried out in mammalian cells and Trypanosoma parasites. Therefore, we believe that the observed escape of one mutant phenotype does not preclude the detection of growth-associated or other phenotypes in pooled screens. Moreover, we did not observe a reversal of the mutant phenotype in L. mexicana, L. donovani, and L. major parasites expressing tdTomato from an expression cassette integrated into the 18S rRNA SSU locus (Figure 4). The small-scale fitness screen that we have now included (Figure 5) confirms these assumptions and shows that we can detect “strong” growth-associated phenotypes. We would also like to point out that we have recently successfully conducted several genome-wide loss-of-function screens in vivo and in vitro, ultimately confirming the feasibility of this type of screen on a genome-wide scale (manuscript in preparation).

      We have included a discussion of these points under section “Integration of CBE sgRNA expression cassettes via AsCas12a ultra-introduced DSBs increase editing rates” and section “Detection of fitness-associated phenotypes in a pooled loss-of-function screen” in our revised manuscript.

      Although welcome, the improvements to the crRNA CBE design tool are hypothetical and untested.

      We agree that the improvements to the CBE sgRNA design are currently hypothetical. We plan to systematically test our guide design principles in future studies. Since this will require testing hundreds of guides to draw robust conclusions, we believe that this aspect is beyond the scope of the current study. In section “Improved CBE sgRNA design to prioritize edits resulting only in STOP codons” of our revised manuscript we now discuss these future plans.

      The Sanger and Oxford Nanopore Technology analyses of the integration sites of the sgRNA expression cassette will not detect the mis-integration of the sgRNA expression construct into an entirely different locus.

      We have now re-analysed our ONT data and have extracted all ONT contigs that match the CBE sgRNA expression cassette. All extracted contigs align to the 18S rRNA SSU locus, showing integration of the cassette into this locus. It is important to note that here a population was sequenced and not a clone. Despite this, no contigs could be found that would link the CBE sgRNA expression cassettes to another locus. This is now shown in Figure 4 S2 and described in section “Cas12a-mediated DSB ensures the integration of one CBE sgRNA per L. mexicana transfectant”.

      Reviewer #3 (Public Review):

      Genetic manipulation of Leishmania has some challenges, including some limitations in the DNA repair strategies that are present in the organism and the absence of RNA interference in many species. The senior author has contributed significantly to expanding the available routes towards Leishmania genetic manipulation by developing and adapting CRISPR-Cas9 tools to allow gene manipulation via DNA double-strand break repair and, more recently, base modification. This work seeks to improve on some limitations in the tools previously described for the latter approach of base modification leading to base change.

      The work in the paper is meticulously described, with solid evidence for most of the improvements that are claimed: Figure 1 clearly describes reduced impairment in the growth of parasites expressing sgRNAs via changes in promoters; Figures 2 and 3 compellingly document the usefulness of using AsCas12a for integration after transformation; and Figures 1 and 4 demonstrate the capacity of the combined modifications to efficiently edit a gene in three different Leishmania species. There is little doubt these new tools will be adopted by the Leishmania community, adding to the growing arsenal of approaches for genetic manipulation.

      There are two weaknesses the authors may wish to address, one smaller and one larger.

      (1) The main advance claimed here is in this section title: 'Integration of CBE sgRNA expression cassettes via AsCas12a ultra-introduced DSBs increase editing rates', with the evidence for this presented in Figure 4. It is hard work in the submission to discern what direct evidence there is for editing rates being improved relative to earlier, Cas9-based approaches. Did they directly compare the editing by the new and old approach? If not, can they more clearly explain how they are able to make this claim, either by adding text or a new figure? A side-by-side comparison would emphasise the advance of the new approach more clearly.

      We would like to thank reviewer #3 for this helpful comment. We have directly compared our improved method to our previous base editing method in Figures 1E and 4, demonstrating higher editing rates in a much shorter time. In particular, the L. major panel in Figure 4B shows that, in a direct comparison between the previously published system (Engstler and Beneke, eLife 2023) and the new system presented here, editing can only be observed with the new version. However, to clarify the improvements we made, we now compare data from our previous screen in Engstler and Beneke, eLife 2023 with a loss-of-function screen carried out with our updated method (see Figure 5 and section “Detection of fitness-associated phenotypes in a pooled loss-of-function screen”).

      In addition, we also feel that our title might have been misleading in the sense that it suggested that Cas12a editing is more efficient than other Cas9-based approaches, which is something that we do not want to state here. Given that we have now included a small-scale CRISPR screen and given that we generally show improved base editing compared to our previous method (improved in terms of less toxicity, more editing in a shorter time, higher transfection rates and less species-specific variation), we have rephrased our title to: “Improved base editing and functional screening in Leishmania via co-expression of the AsCas12a ultra, a T7 RNA Polymerase, and a cytosine base editor”.

      (2) The ultimate, stated goal of this work is (abstract) to 'enable a variety of loss-of-function screens', as the older approach had some limitations. This goal is not tested for the new tools that have been developed here; the experiment in Figure 5 merely shows that they can, not unexpectedly, make a gene mutant, which was already possible with available tools. Thus, to what extent is this paper describing a step forward? Why have the authors not run an experiment - even the same one that was described previously in Engstler and Beneke (2023) - to show that the new approach improves on previous tools in such a screen, either in scale or accuracy?

      We have now included a small-scale loss-of-function screen in L. mexicana, targeting nine known essential genes with 24 CBE sgRNAs and 15 non-targeting control sgRNAs. This approach successfully detected all of the known growth-associated phenotypes that were included, in a pooled screening format. This experiment is now shown in Figure 5 and described in section “Detection of fitness-associated phenotypes in a pooled loss-of-function screen”. We believe that this supports the claims made here and that our updated toolbox will indeed enable a variety of loss-of-function screens.

      As pointed out in the response to reviewer #2, we have recently successfully conducted several genome-wide loss-of-function screens in vivo and in vitro, ultimately confirming the feasibility of this type of screen on a genome-wide scale (manuscript in preparation). Without the improvements presented here, such as the higher transfection and base editing rates, these genome-wide screens could not have been carried out.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I would like to compliment Tom Beneke and his lab on their continued efforts to develop tools to facilitate genome editing in Leishmania.

      I have no doubt that the toolkit presented in this study will be very useful for the community. The submitted paper is very well written and contains all the necessary controls to support the authors' claims. There is only one point that left me a bit concerned if this strategy is to be used for large-scale screens, and that is the potential for integration of multiple sgRNA expression cassettes in a single cell.

      We would like to thank reviewer 1 for helpful comments. We have addressed the major concern raised by including a small-scale loss-of-function screen in our revised manuscript. By targeting nine known essential genes with 24 CBE sgRNAs and 15 non-targeting control sgRNAs, this approach successfully detected growth-associated phenotypes in a pooled format (see section “Detection of fitness-associated phenotypes in a pooled loss-of-function screen” and Figure 5). Regarding the point of multiple sgRNA expression cassette integration, please see the next comment below.

      Major points:

      Integration of multiple sgRNA expression cassettes:

      While Illumina-based gDNA-seq is well suited to determine changes in ploidy, I don't think it is sensitive enough to draw conclusions about possible double integration in a small percentage of cells. In fact, the data shown in Figure 4 S1D show a normalized coverage >1.5 for sgRNA cassette and NeoR, suggesting that they may have integrated >1 times in some cells.

      To verify that the integration of the CBE sgRNA expression cassette is specific, we have re-analysed our ONT results and confirmed that the only ONT contigs detected are those that link the CBE sgRNA expression cassette to the 18S rRNA locus. No other integration sites can be found. We also do not detect any contigs containing multiple CBE sgRNA expression cassettes. This is now shown in Figure 4 S2 and described in section “Cas12a-mediated DSB ensures the integration of one CBE sgRNA per L. mexicana transfectant”.

      Nevertheless, it is a valid concern that the sequencing depth is not sufficient to detect a small percentage of cells that have integrated the CBE sgRNA expression cassette multiple times. However, we would also like to make the point that such a small percentage of cells within a screen is unlikely to be relevant, and we have therefore added a small-scale pooled loss-of-function screen targeting essential genes to the manuscript (see new Figure 5) to support our claim. If the integration of multiple sgRNAs into one cell had any measurable combinatorial effect, the non-targeting controls in our screen would have been depleted as well. However, there is no detectable difference between the 15 controls included in our small-scale screen.

      We have addressed all points in sections “Cas12a-mediated DSB ensures the integration of one CBE sgRNA per L. mexicana transfectant” and “Detection of fitness-associated phenotypes in a pooled loss-of-function screen”.

      To avoid double integration, wouldn't it be easiest to just create an allele-specific "landing pad" on one chromosome? I believe that a double integration rate of ~20% could severely complicate the analysis of any large-scale screen later on.

      We thank the reviewer for this suggestion. We did test an allele-specific "landing pad" and described this in the first version of our manuscript (see section “DSBs introduced by AsCas12a ultra increase integration rates of donor DNA constructs”). Specifically, we integrated CBE sgRNA expression cassettes into the neomycin resistance marker contained in the tdTomato expression cassette (Figure 2 S1D, Cas12a crRNA-5 and 6), but this resulted in lower transfection rates (Figure 2F: crRNA-5 1 in ~47,000; crRNA-6 1 in ~32,000) than when using a Cas12a crRNA that targets the 18S rRNA locus directly (Figure 2F: crRNA-4 1 in ~2,000). As we believe a high transfection rate is key for pooled large-scale screens, we pursued further experiments with crRNA-4. However, since a different crRNA can easily be selected for our tool, simply by changing the Cas12a crRNA during transfection, users can choose a different integration site or other “landing pads” if they wish. We have updated section “Cas12a-mediated DSB ensures the integration of one CBE sgRNA per L. mexicana transfectant” to clarify these details.

      Also, it is not clear to me how the integration of tdTomato could affect the integration of the sgRNA expression cassette 400 bp downstream.

      As stated above, our ONT data clearly show integration into only one locus (Figure 4 S1 and S2). Given that the recognition site of crRNA-4 is contained in the homology flank used to integrate tdTomato into the 18S rRNA locus, this may contribute to the effect we observe. But since the homology sequences match the original sequences within the locus, the reasons why this affects integration of the CBE sgRNA expression cassettes also remain elusive to us. We now discuss this in more detail in the section “Cas12a-mediated DSB ensures the integration of one CBE sgRNA per L. mexicana transfectant”.

      Data accessibility:

      The Illumina and ONT data should be made publicly available.

      ONT and Illumina fastq reads are now available at the European Nucleotide Archive (ENA Accession Number: PRJEB83088).

      Minor point:

      Line 30: It would be easier for readers if the authors could briefly explain what bar-seq is.

      We have added more details:[…] and bar-seq screens, which involve individually deleting, barcoding, and pooling mutants for analysis, have facilitated […].

      Lines 114, 120: I think the authors are referring to Figures 1E and F, not Figures 2E and F.

      Many thanks for picking this up, we have corrected the Figure reference.

      Reviewer #2 (Recommendations For The Authors):

      This has the potential to be a valuable tool for the community if it is efficiently distributed. If the authors have not yet done so they should make their plasmids available to the community via Addgene.

      We have started the deposit process with Addgene and plasmids will be available soon. In the meantime, all plasmid maps are available on our website www.leishbaseedit.net and can be requested for shipment from our lab.

      Line 162-165, 400-401: The potential for using AsCAS12a's intrinsic RNase activity for "multiplexing" would benefit from a little more explanation (i.e. how this would work, and what multiplexing means in this context).

      We have added further details on multiplexing with Cas12a and point out potential applications.

      “For example, Cas12a crRNA arrays with four or more guides can be assembled and transfected to introduce multiple DSBs within one gene. Since Cas12a generates sticky DNA ends that facilitate recombination via microhomology-mediated end joining and homologous recombination (Zhang et al., 2021), this approach could effectively disrupt target genes without requiring the addition of donor DNA and this may provide an alternative approach to our here presented base editing method in the future. Moreover, CBE sgRNAs could be multiplexed by interspacing them with Cas12a direct repeats (DRs), enabling simultaneous targeting of multiple genes in one cell.”

      Line 193-194: can the authors offer an explanation for the reduction in mNG editing observed with 30nt homology flanks?

      We assume this is caused by imprecise recombination events in some cells and have revised the original sentence.

      In several places in the manuscript, it is unclear if an analysis has been done on an individual clone or a population derived from multiple transfected cells. If on mixed population, clarify this and calculate the number of clones that the mixture represents. E.g. lines 195-196 and 221-223 (Sanger sequencing of integration site); Line 333-352 (ONT analysis of CBE expression cassette integration).

      We generated and analysed clones only when testing whether multiple CBE sgRNAs were integrated (Figure 4 S3). In all other experiments we analysed parasite populations. For clarity, we have indicated this in the revised manuscript wherever possible (e.g. at the lines requested).

      Line 259: "site by site" should presumably be "side by side".

      Many thanks for pointing this out. We have corrected this typo.

      Lines 315-317: Clarify why the mis-integration of the CBE sgRNA expression cassette might cause a lack of editing (e.g. lack of expression?).

      We have added: “This could potentially result in the silencing of the CBE sgRNA expression or even lead to the deletion of the guide cassette”

      Line 364 - 367: it is unlikely there is the statistical power to state that 2/10 represents lower than the previously observed 38% of double integrants.

      We agree that the statistical power is low and have therefore changed our phrasing to an overall estimation.
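
      The reviewer's point about statistical power can be illustrated with a short calculation. The sketch below is not part of the manuscript analysis. It assumes that each of the 10 analysed clones is an independent draw with a 38% chance of being a double integrant, and it asks how often 2 or fewer double integrants would be seen by chance under that assumption. The variable names are illustrative only.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Values taken from the reviewer comment: 2 of 10 clones vs. a reference rate of 38%.
n_clones = 10
observed_double = 2
reference_rate = 0.38

# One-sided exact binomial tail: probability of observing 2 or fewer double
# integrants among 10 clones if the true rate were still 38%.
p_lower_tail = binom_cdf(observed_double, n_clones, reference_rate)
print(f"P(X <= {observed_double} | n={n_clones}, p={reference_rate}) = {p_lower_tail:.3f}")
```

      Under these assumptions the tail probability is roughly 0.2, so an observation of 2/10 is compatible with an unchanged rate of 38%, which is why an overall estimate is the more defensible phrasing.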

      Reviewer #3 (Recommendations For The Authors):

      I suggest that the authors make clearer to the reader the evidence for improved editing efficiency in the new CBE system described here relative to the system described in Engstler and Beneke, 2023. Such clarification could be as simple as an extra paragraph or figure, clearly comparing the editing rates with the two systems in, as far as possible, equivalent conditions.

      We have directly compared our improved method to our previous base editing method in Figures 1E and 4, demonstrating higher editing rates in a much shorter time. In particular, the L. major panel in Figure 4B shows that, in a direct comparison between the previously published (Engstler and Beneke, eLife 2023) and new system, editing can only be observed with the version presented here. However, to clarify the improvements we made, we now compare data from our previous screen in Engstler and Beneke, eLife 2023 with a loss-of-function screen carried out with our updated method (see Figure 5 and section “Detection of fitness-associated phenotypes in a pooled loss-of-function screen”).

      The significance of this work would be improved by running the type of loss of fitness screen described previously in Engstler and Beneke (2023), thereby showing that the new approach improves on previous tools. Without such data, questions remain about potential confounding effects that might not be anticipated from the targeted experiments provided in the current manuscript.

      We thank the reviewer for this suggestion. The requested experiment is now presented in Figure 5 and described in section “Detection of fitness-associated phenotypes in a pooled loss-of-function screen”.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The authors examined whether aberrantly projecting retinal ganglion cells in albino mice innervate a separate population of thalamocortical neurons, as would be predicted for Hebbian learning rules. The authors find support for this hypothesis in correlated light and electron microscopy (CLEM) reconstructions of retinal ganglion cell axons and thalamocortical neurons. In a second line of investigation, the authors ask the same question about retinal ganglion cell innervation of local inhibitory interneurons of the mouse LGN. The authors conclude that these connections are less specific.

      Strengths:

      The authors make good use of CLEM to test a circuit-level hypothesis, and they find an interesting difference in RGC synaptic innervation patterns for thalamocortical neurons vs. local interneurons.

      Weaknesses:

      The conclusions about the local interneuron innervation are a little more difficult to interpret. One would expect to only capture a small part of the local interneuron dendritic field, as compared to the smaller thalamocortical neurons, right? Doesn't that imply that finding some evidence of promiscuous connectivity means that other dendrites that were not observed probably connect to many different RGCs?

      We will try to clarify this point.

      Reviewer #2 (Public review):

      In this article, the authors examined the organization of misplaced retinal inputs in the visual thalamus of albino mice at electron-microscopic (EM) resolution to determine whether these synaptic inputs are segregated from the rest of the retinogeniculate circuitry.

      The study's major strengths include its high resolution, achieved through serial EM and confocal microscopy, which enabled the identification of all synaptic inputs onto neurons in the dorsolateral geniculate nucleus (dLGN).

      The experiments are very precise and demanding; thus, only the synaptic inputs of a few neurons were fully reconstructed in one animal. A few figures could be improved in their presentation.

      Despite this, the authors clearly demonstrate the synaptic segregation of misrouted retinal axons onto dLGN neurons, separate from the rest of the retinogeniculate circuitry.

      This finding is impactful because retinal inputs typically do not segregate within the mouse dLGN, and it was previously thought that this was due to the nucleus's small size, which might prevent proper segregation. The study shows that in cases where axons are misrouted and exhibit a different activity pattern than surrounding retinal inputs, segregation of inputs can indeed occur. This suggests that the normal system has the capacity to segregate inputs, despite the limited volume of the mouse dLGN.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Please include page numbers and line numbers in future submissions.

      Done

      (2) I am red-green colorblind, and I had a lot of trouble seeing the red channels when they were mixed with green. I recommend using magenta when possible.

      Thanks for the heads up. We have switched to green and magenta where possible. In the tinted EM where switching colors did not seem helpful, we added an asterisk to RGC boutons so that red and green would not be the only identifiers.

      (3) It would help if the figure captions also stated the conclusions that can be drawn from the figures. I recommend stating the main conclusion in the first sentence of the caption, rather than stating only what we are viewing. Similarly, the last sentence of the caption can help summarize what has been seen.

      We have included summary sentences at the beginning and end of figure legends.

      (4) In the text when discussing Figure 2J, do the authors mean to cite Supplementary Figure 2?

      Yes, thanks.

      (5) I don't think TC was ever defined (or I didn't find it).

      Corrected

      (6) In the subsection "An exclusive set..." cite Liang et al. as more evidence of non-specific innervation.

      We cite Liang et al. in the Discussion, but we do not see a good place to cite it in the referenced Results section. Please elaborate if we are missing something.

      (7) Supplementary Figure 3 is never cited.

      We have added the citation to Figure 3.

      (8) I found myself unsure of what to conclude after the results on LIN. A few more sentences of interpretation and restating what was found would help.

      We have added additional clarification in the Results:

      “The LIN results are consistent with our prediction that shaft dendrites would be indifferent to island/non-island boundaries while individual targeted dendrites would target either the island or non-island RGC boutons. However, the restriction of the targeted dendrites to one or the other RGC field does not appear to be an absolute rule. Rather the scale of targeted dendrite exploration and the size of the exclusion zone is likely to reduce the chances that a targeted dendrite would find partners on both in the island and outside of the island. This matching between the exploration of targeted LIN dendrites and the segregation of retinogeniculate connectivity means that targeted LIN dendrites will have an RGC input profile (island/non-island) that matches the TCs they innervate.”

      Reviewer #2 (Recommendations for the authors):

      (1) The abbreviation TC is used in the text without a definition.

      Corrected

      (2) The features that allow for labeling the different dendrites/cells (TC and LIN) in Serial EM data (Figure 1) are necessary. While the explanation is provided for RGC boutons, the labeling for thalamic cells is not discussed.

      We added the sentence:

      “Thalamocortical dendrites were distinguished from local inhibitory neuron dendrites by the presence of spines and the absence of synaptic outputs.”

      (3) Image 2C (EM) appears blurry or pixelated. Enhancing its resolution could improve clarity.

      Image 2C is a demonstration of how much we felt we could sacrifice image quality and still reconstruct TC arbors and RGC inputs.

      (4) The gray circles that show the innervation of TC17 in Figure 2E are barely visible, especially on-screen without high magnification. A more contrasting color and wider lines would enhance visibility. It would also be helpful to indicate TC17 in Figure 2H and 2G, as this cell is special and highlighted in the main text.

      We have made the requested changes

      (5) A TC with no RGC input is mentioned. Have you identified other synaptic inputs, potentially related to SC or the cortex?

      Both TC17 (a few exclusion zone RGC inputs) and TC5 (no RGC inputs) were innervated by some large, dark mitochondria boutons that could be SC inputs. However, we did not perform enough reconstruction of the axons to confidently describe their non-RGC input profile. I have previously observed occasional TCs in the same region of the dLGN where RGC inputs are almost entirely replaced by SC inputs, so finding two such cells was not surprising.

      (6) Two fully reconstructed TCs are mentioned. Please specify their exact number in the text, as citing Figure 2J or Supplementary Figure 1 alone is not sufficient for identification.

      Clarified as “(TC3, TC4, Figure 2J, Supplementary Figure 2,3).”

      (7) A correlation between the position of the dendrites and the location of RGC inputs would provide additional insights. This is somewhat reminiscent of the dendrite orientation of Layer IV spiny stellate neurons in the somatosensory cortex that receive inputs from the thalamocortical axons and could be mentioned in the discussion.

      We believe that the images provided are a strong argument for TC arbors being shaped by RGC bouton distributions. We agree that reporting the correlation between dendrites and RGC boutons would be useful, but we found this correlation difficult to quantify. One of the challenges is that we would need to perform several-fold more reconstruction of dendrites and RGC boutons to have an unbiased mapping of both. Currently, most of the reconstructions stop when the dendrites assume a distal morphology and stop interacting with RGC boutons. Likewise, the only RGC boutons reconstructed from the EM are those that innervate the reconstructed cells. We considered simply quantifying the asymmetry of the TC arbors relative to a symmetrical distribution and a random distribution, but we felt that quantification would be difficult to interpret without a similar analysis performed in the same region of dLGN on wild-type TCs.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper is an incremental follow-up to the authors' recent paper which showed that Purkinje cells make inhibitory synapses onto brainstem neurons in the parabrachial nucleus which project directly to the forebrain. In that preceding paper, the authors used a mouse line that expresses the presynaptic marker synaptophysin in Purkinje cells to identify Purkinje cell terminals in the brainstem, and they observed labeled puncta not only in the vestibular and parabrachial nuclei, as expected, but also in neighboring dorsal brainstem nuclei, prominently the central pontine grey. The present study, motivated by the lack of thorough characterization of PC projections to the brainstem, uses the same mouse line to anatomically map the density and a PC-specific channelrhodopsin mouse line to electrophysiologically assess the strength of Purkinje cell synapses in dorsal brainstem nuclei. The main findings are (1) the density of Purkinje cell synapses is highest in vestibular and parabrachial nuclei and correlates with the magnitude of evoked inhibitory synaptic currents, and (2) Purkinje cells also synapse in the central pontine grey nucleus but not in the locus coeruleus or mesencephalic nucleus.

      Strengths:

      The complementary use of anatomical and electrophysiological methods to survey the distribution and efficacy of Purkinje cell synapses on brainstem neurons in mouse lines that express markers and light-sensitive opsins specifically in Purkinje cells is the major strength of this study. By systematically mapping presynaptic terminals and light-evoked inhibitory postsynaptic currents in the dorsal brainstem, the authors provide convincing evidence that Purkinje cells do synapse directly onto pontine central grey and nearby neurons but do not synapse onto trigeminal motor or locus coeruleus neurons. Their results also confirm previously documented heterogeneity of Purkinje cell inputs to the vestibular nucleus and parabrachial neurons.

      Weaknesses:

      Although the study provides strong evidence that Purkinje cells do not make extensive synapses onto LC neurons, which is a helpful caveat given previous reports to the contrary, it falls short of providing the comprehensive characterization of Purkinje cell brainstem synapses which seemed to be the primary motivation of the study. The main information provided is a regional assessment of PC density and efficacy, which seems of limited utility given that we are not informed about the different sources of PC inputs, variations in the sizes of PC terminals, the subcellular location of synaptic terminals, or the anatomical and physiological heterogeneity of postsynaptic cell types. The title of this paper would be more accurate if "characterization" were replaced by "survey".

      Several of the study's conclusions are quite general and have already been made for vestibular nuclei, including the suggestions in the Abstract, Results, and Discussion that PCs selectively influence brainstem subregions and that PCs target cell types with specific behavioral roles.

      We agree that we did not provide an in-depth characterization of PC synapses onto all identified types of brainstem neurons. With so many types of neurons in the brainstem, this would be a monumental task. Despite this limitation we prefer to keep our original title, since our study makes the following advances:

      • We provide a comprehensive map of all PC synaptic boutons across the brainstem, and corresponding maps of PC synaptic input sizes. The input sizes vary widely, but are often multiple nanoamps, indicating that the cerebellum is an important regulator of activity in these regions. These maps will be indispensable for future investigations of cerebellar outputs.

      • We find that PC projections and the synapses they make are spatially restricted within most target nuclei such as the vestibular and parabrachial nuclei. This suggests that the influence of the cerebellum is spatially segregated within these nuclei, and likely allows the cerebellum to regulate specific behaviors. While some aspects of these gradients have been described previously, our study is comprehensive, and has a higher degree of specificity than can be achieved with immunohistochemistry.

      • We discover that PCs form functional synapses in the pontine central grey and nearby nuclei. Much of this region’s function is unknown, but certain subregions are important for micturition and valence. PCs make large synapses onto a small fraction of cells in this region, which suggests that PCs may target specific cell types to control novel nonmotor behaviors.

      • We provide clarification regarding PC projections to the locus coeruleus. Multiple high-profile, highly influential studies using rabies tracing (Schwarz et al., Nature 2015; Breton-Provencher and Sur, Nature Neuroscience 2019; and others) described a prominent PC input to the locus coeruleus. We showed that this projection is essentially nonexistent, both anatomically and functionally. We previously addressed this issue, but the PC-specific optogenetic approach we used here provides the most compelling evidence against a prominent PC-LC connection. This is an important finding for the cerebellum and a cautionary tale for conclusions based solely on viral tracing methods. We will expand on this issue in response to the comments of reviewer #3.

      Reviewer #2 (Public review):

      Summary:

      While it is often assumed that the cerebellar cortex connects, via its sole output neuron, the Purkinje cell, exclusively to the cerebellar nuclei, axonal projections of the Purkinje cells to dorsal brainstem regions have been well documented. This paper provides comprehensive mapping and quantification of such extracerebellar projections of the Purkinje cells, most of which are confirmed with electrophysiology in slice preparation. A notable methodological strength of this work is the use of highly Purkinje cell-specific transgenic strategies, enabling selective and unbiased visualization of Purkinje terminals in the brainstem. By utilizing these selective mouse lines, the study offers compelling evidence challenging the general assumption that Purkinje cell targets are limited to the cerebellar nuclei. While the individual connections presented are not entirely novel, this paper provides a thorough and unambiguous demonstration of their collective significance. Regarding another major claim of this paper, "characterization of direct Purkinje cell outputs (Title)", however, the depth of electrophysiological analysis is limited to the presence/absence of physiological Purkinje input to postsynaptic brainstem neurons whose known cell types are mostly blinded. Overall, conceptual advance is largely limited to confirmatory or incremental, although it would be useful for the field to have the comprehensive landscape presented.

      Strengths:

      (1) Unsupervised comprehensive mapping and quantification of the Purkinje terminals in the dorsal brainstem are enabled, for the first time, by using the current state-of-the-art mouse lines, BAC-Pcp2-Cre and synaptophysin-tdTomato reporter (Ai34).

      (2) Combinatorial quantification with vGAT puncta and synaptophysin-tdTomato labeled Purkinje terminals clarifies the anatomical significance of the Purkinje terminals as an inhibitory source in each dorsal brainstem region.

      (3) Electrophysiological confirmation of the presence of physiological Purkinje synaptic input to 7 out of 9 dorsal brainstem regions identified.

      (4) Pan-Purkinje ChR2 reporter provides solid electrophysiological evidence to help understand the possible influence of the Purkinje cells onto LC.

      Weaknesses:

      (1) The present paper is largely confirmatory of what is presented in a previous paper published by the authors' group (Chen et al., 2023, Nat Neurosci). In this preceding paper, the authors' group used AAV1-mediated anterograde transsynaptic strategy to identify postsynaptic neurons of the Purkinje cells. The experiments performed in the present paper are, by nature, complementary to the AAV1 tracing which can also infect retrogradely and thus is not able to demonstrate the direction of synaptic connections between reciprocally connected regions. Anatomical findings are all consistent with the preceding paper. The likely absence of robust physiological connections from the Purkinje to LC has also been evidenced in the preceding paper by examining c-Fos response to Purkinje terminal photoinhibition at the PBN/LC region.

      We agree that we previously dealt with the issue of PC-LC synapses (Chen et al., 2023, Nat Neurosci), but our conclusions differed from those of several high-profile publications (Schwarz et al., Nature 2015; Breton-Provencher and Sur, Nature Neuroscience 2019) and still met considerable resistance. We felt that the optogenetic approach provided the most definitive means of evaluating the presence and strength of PC-LC synapses, which will hopefully settle this issue. These experiments also set a standard for future studies assessing the presence of PC synapses onto other target neurons in the brainstem.

      (2) Although the authors appear to assume uniform cell type and postsynaptic response in each of the dorsal brainstem nuclei (as noted in the Discussion, "PCs likely function similarly to their inputs to the cerebellar nuclei, where a very brief pause in firing can lead to large and rapid elevations in target cell firing"), we know that the responses to the Purkinje cell input are cell type dependent, which vary in neurotransmitter, output targets, somata size, and distribution, in the cerebellar and vestibular nuclei (Shin et al., 2011, J Neurosci; Najac and Raman, 2015, J Neurosci; Özcan et al., 2020, J Neurosci). This consideration impacts the interpretation of two key findings: (a) "Large ... PC-IPSCs are preferentially observed in subregions with the highest densities of PC synapses (Abstract)". For example, we know that the terminal sparse regions reported in the present paper do contain Floccular Targeted Neurons that are sparse yet have dense somatic terminals with profound postinhibitory rebound (Shin et al.). Despite their sparsity, these postsynaptic neurons play a distinct and critical role in proper vestibuloocular reflex. Therefore, associating broad synaptic density with "PC preferential" targets, as written in the Abstract, may not fully capture the behavioral significance of Purkinje extracerebellar projections. (b) "We conclude ... only a small fraction of cell. This suggests that PCs target cell types with specific behavioral roles (Abstract, the last sentence)". Prior research has already established that "PCs target cell types with specific behavioral roles in brainstem regions". Also, whether 23 % (for PCG), for example, is "a small fraction" would be subjective: it might represent a numerically small but functionally important cell type population. 
      The physiological characterization provided in the present cell type-blind analysis could, from a functional perspective, even be decremental when compared to existing cell type-specific analyses of the Purkinje cell inputs in the literature.

      We now cite the papers suggested by the reviewer (Shin et al., 2011, J Neurosci; Najac and Raman, 2015, J Neurosci; Özcan et al., 2020, J Neurosci) and add to the discussion.

      (3) The quantification analyses used to draw conclusions about (a) the significance of PC terminals among all GABAergic terminals and the fractions of electrophysiologically responsive postsynaptic brainstem neurons may have potential sampling considerations:

      (a.i) this study appears to have selected subregions from each brainstem nucleus for quantification (Figure 2). However, the criteria for selecting these subregions are not explicitly detailed, which could affect the interpretation of the results.

      Additional explanation has been added to the Results in the section “Quantification of PC synapses in the brainstem.”

      (a.ii) the mapping of recorded cells (Figure 3) seems to show a higher concentration in terminal-rich regions of the vestibular nuclei.

      In Figure 3, we strove to record in an unbiased manner. However, there may have been a slight bias toward recording in areas of lower myelination, where patching is easier. We now clarify this issue in the text.

      Reviewer #3 (Public review):

      Summary:

      The manuscript by Chen and colleagues explores the connections from cerebellar Purkinje cells to various brainstem nuclei. They combine two methods - presynaptic puncta labeling as putative presynaptic markers, and optogenetics, to test the anatomical projections and functional connectivity from Purkinje cells onto a variety of brainstem nuclei. Overall, their study provides an atlas of sorts of Purkinje cell connectivity to the brainstem, which includes a critical analysis of some of their own data from another publication. Overall, the value of this work is to both provide neural substrates by which Purkinje cells may influence the brainstem and subsequent brain regions independent of the deep cerebellar nuclei and also, to provide a critical analysis of viral-based methods to explore neuronal connectivity.

      Strengths:

      The strengths lie in the simplicity of the study, the number of cells patched, and the relationship between the presence of putative presynaptic puncta and electrophysiological results. This type of study is important and should provide a foundation for future work exploring cerebellar inputs and outputs. Overall, I think that the critique of viral-based methods to define connectivity, and a more holistic assessment of what connectivity is and how it should be defined is timely and warranted, as I think this is under-appreciated by many groups and overall, there is a good deal of research being published that do not properly consider the issues that this manuscript raises about what viral-based connectivity maps do and do not tell us.

      We thank the reviewer for highlighting this important aspect of this work, and for agreeing with our thesis concerning viral-based connectivity maps.

      Weaknesses:

      While I overall liked the manuscript, I do have a few concerns that relate to interpretation of results, and discussion of technological limitations. The main concerns I have relate to the techniques that the authors use, and an insufficient discussion of their limitations. The authors use a Cre-dependent mouse line that expresses a synaptophysin-tomato marker, which the authors confidently state is a marker of synapses. This is misleading. Synaptophysin is a vesicle marker, and as such, labels axons, where vesicles are present in transit, and likely cell bodies where the protein is being produced. As such, the presence of tdtomato should not be interpreted definitively as the presence of a synapse. The use of vGAT as a marker, while this helps to constrain the selection of putative pre-synaptic sites, is also a vesicle marker and will likely suffer the same limitations (though in this case, the expression is endogenous and not driven by the ROSA locus). A more conservative interpretation of the data would be that the authors are assessing putative pre-synaptic sites with their analysis. This interpretation is wholly consistent with their findings showing the presence of tdtomato in some regions but only sparse connectivity - this would be expected in the event that axons are passing through. If the authors wish to strongly assert that they are specifically assessing synapses, a marker better restricted to synapses and not vesicles may be more appropriate.

      We agree that synaptophysin-tdTomato is an imperfect marker, although it is vastly superior to cytosolic tdTomato. We found that viral expression of synaptophysin-GFP gives much more punctate labeling, but an appropriate synaptophysin-GFP line is not available. We carefully point out this issue, and threshold the images to avoid faint labeling associated with fibers of passage. The intersection of VGAT labeling and synaptophysin-tdTomato labeling provides us with superior identification of PC boutons. We will add additional clarification to point out that these are putative presynaptic boutons, and that this alone does not establish the existence or the strength of functional synapses.

      Similarly, while optogenetics/slice electrophysiology remains the state of the art for assessing connectivity between cell populations, it is not without limitations. For example, connections that are not contained within the thickness of the slice (here, 200 µm, which is not particularly thick for slice ephys preps) will not be detected. As such, the absence of connections is harder to interpret than the presence of connections. Slices were only made in the coronal plane, which means that if there is a particular topology to certain connections that is orthogonal to that plane, those connections may be under-represented. As such, all connectivity analyses likely are under-representations of the actual connectivity that exists in the intact brain. Therefore, perhaps the authors should consider revising their assessments of connections, or lack thereof, of Purkinje cells to, e.g., LC cells. While their data do make a compelling case that the connections between Purkinje cells and LC cells are not particularly strong or numerous, especially compared to other nearby brainstem nuclei, their analyses do indicate that at least some such connections do exist. Thus, rather than saying that viral methods such as rabies virus are not accurate reflections of connectivity, perhaps a more circumspect argument would be that the quantitative connectivity maps reported by other groups using rabies virus do not always reflect connectivity defined by other means, e.g., functional connections with optogenetics. In some cases, the authors do suggest this (e.g., "Together, these findings indicate that reliance on anatomical tracing experiments alone is insufficient to establish the presence and importance of a synaptic connection"), but in other cases, they are more dismissive of viral tracing results (e.g., "it further suggests that these neurons project to the cerebellum and were not retrogradely labeled").
Furthermore, some statements are a bit misleading e.g., mentioning that rabies methods are critically dependent on starter cell identity immediately following the citation of studies mapping inputs onto LC cells. While in general, this claim has merit, the studies cited (19-21) use Dbh-Cre to define LC-NE cells which does have good fidelity to the cells of interest in the LC. Therefore, rewording this section in order to raise these issues generally without proximity to the citations in the previous sentence may maintain the authors' intention without suggesting that perhaps the rabies studies from LC-NE cells that identified inputs from Purkinje cells were inaccurate due to poor fidelity of the Cre line. Overall, this manuscript would certainly not be the first report indicating that the rabies virus does not provide a quantitative map of input connections. In my opinion, this is still under-appreciated by the broad community and should be explicitly discussed. Thus, an acknowledgment of previous literature on this topic and how their work contributes to that argument is warranted.

      We have a different take on connectivity and the use of optogenetics. Based on our years of experience studying synapses in brain slices, axons survive very well even when they are cut. It is not necessary to preserve intact axons that extend for long distances. It is also true that activation of these axons, with either extracellular electrical stimulation or with optogenetics, is sufficient to evoke synaptic inputs. Robust synaptic responses are evoked with optogenetic activation regardless of the slice orientation. We thank the reviewer for raising this issue, and we have added a couple of sentences to clarify this point under the section “Characterization of functional properties of PC synapses in the brainstem.”

      The discussion on starter cell specificity was not referring to the specificity of Cre in transgenic animals, but to the TVA/G helper proteins that are introduced by AAV and used in conjunction with the rabies virus. The issues related to this have recently been discussed in eLife (Beier, 2022) in addition to citations 58 and 59 in the manuscript. We have more explicitly highlighted this issue in the revised manuscript in the section “Lack of significant PC inputs to LC neurons.”

      Recommendations for the authors:  

      Reviewer #1 (Recommendations for the authors):

      (1) Methods need detail to be replicable, particularly in how PC synapses were identified and automatically counted. It is not clear what was the variation within subregions across mice. How were neurons selected or rejected for recordings and analyses? Was each subregion sampled at equal spacing? Methods for anatomy should mention sagittal sections.

      The wording in the Methods section “Anatomy” was changed to better reflect how PC synapses were identified as colabeled segments of vGAT and tdTomato labeling.

      Each data point in Figure 2D-F is the quantification of a region for one section from one mouse. The color of the data point indicates the anterior-posterior location of the section. The violin plot shows the median and quartile values for all points across sections and mice. The variability captured by the violin plot reflects variability across the anterior-posterior axis.

      Neurons were mostly randomly selected in each slice, and rejected based on unstable holding current or access resistance. Cell locations were recorded and updated with each experiment so that we minimized oversampling of easier-to-patch regions.

      Sagittal sections were added in methods.

      (2) Figure 2D-F what is the black line and grey region?

      Additional text was added in the caption for Figure 2D-F

      (3) MEV is confusing given LAV stands for lateral vestibular - perhaps call it ME5?

      We will remain consistent with the abbreviations in the Allen Brain Reference Atlas.

      Reviewer #2 (Recommendations for the authors):

      (1) What are the criteria for distinguishing large, small, and non-responders?

      Large are in the nA range, small are in the hundreds of pA, and non-responders are effectively zero. Manual curation of these responses indicated that a current amplitude threshold of 45 pA clearly separated non-responders from responders. To be clear, the average response (as stated in text and displayed in Figure 3D) includes all cells.
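
      The categories described above can be written as a simple amplitude classification. The following is a minimal sketch, not the analysis code used for the manuscript: the 45 pA responder threshold is taken from the response above, whereas the 1 nA boundary between small and large responses, the function names, and the example amplitudes are assumptions made purely for illustration.

```python
from statistics import mean

RESPONDER_THRESHOLD_PA = 45.0  # from the response: separates non-responders from responders
LARGE_THRESHOLD_PA = 1000.0    # assumption: "nA range" is taken to mean >= 1 nA


def classify_response(amplitude_pa: float) -> str:
    """Label a cell by the absolute amplitude of its evoked current (in pA)."""
    amplitude = abs(amplitude_pa)
    if amplitude < RESPONDER_THRESHOLD_PA:
        return "non-responder"
    if amplitude < LARGE_THRESHOLD_PA:
        return "small"
    return "large"


def average_response(amplitudes_pa):
    """Mean absolute amplitude over all cells, non-responders included."""
    return mean(abs(a) for a in amplitudes_pa)


# Hypothetical amplitudes in pA, for illustration only (not data from the study)
example_amplitudes = [0.0, 12.0, 300.0, 2500.0]
labels = [classify_response(a) for a in example_amplitudes]
```

      Keeping non-responders in the average, as in the second function, mirrors the statement that the average response includes all cells.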

      (2) p1. "Unexpectedly": it would not be unexpected, rather, expected, because it was reported in Chen et al., 2023, Nat Neurosci.

      The PCG was hinted at, but an actual functional, anatomical connection was not reported in our previous manuscript.

      (3) p1. "We combined electrophysiological recordings with immunohistochemistry to assess the molecular identities of these PC targets": please clarify "these" here. It could be read that it refers to "pontine central gray and nearby subnuclei" but it doesn't make sense. Immuno has only been performed for MeV and LC.

      Corrected

      (4) p1. "but only inhibit a small fraction of cells in many nuclei": as far as I read Fig.3, it seems that ~50% for PBN/VN and ~25% for PCG: would this be "a small fraction"?

      The small fraction of cells was in reference to subnuclei within the PCG, but we agree this statement is too broad to be useful and have eliminated it.

      (5) p2. "conventional tracer": viral tracer is becoming a standard, so dye tracer could be better here.

      Corrected

      (6) p3. "rostral/cauda": typo.

      Corrected.  

      (7) p3. Quantification of PC synapses in the brainstem: it would be helpful to introduce why synapto-tdT alone is not sufficient, and the purpose of adding vGAT immunostaining.

      We have added more on vGAT labeling putative presynaptic sites and quantifying only synaptic labeling instead of axonal tdTomato in the Results, “Quantification of PC synapses in the brainstem.” In addition, vGAT staining allows us to examine the PC contribution to total inhibition in each region.

      (8) p7. "PB and are": typo.

      Corrected. And all instances of PBN were changed to PB

      (9) p7. "they are likely a mix of excitatory and inhibitory inputs 54,55": Bagnall et al., 2009, J Neurosci, would be critically relevant here.

      Added, thank you

      (10) Figures 2-3: Yellow/Blue color scheme is hard to distinguish, and having two colors could be read as implying two distinct regions.

      We are unsure what the reviewer is referring to exactly here, but the colors refer to the sections in 2C (see the color bar on the bottom right of each atlas schematic). The points represent an individual section that was quantified, and thus do represent distinct samples from distinct regions.

      (11) Figure 2D-F: what is indicated by each point?

      Each data point is the number of PC boutons (D), the density of boutons (E), or the percentage of synaptophysin/vGAT (F) quantified for each region per section. Each color represents a coronally distinct section of a region. Additional text was added into the captions to clarify this and point 10.

      (12) Figure 3E, right: what is the correlation coefficient?

      The correlation coefficient was found to be 0.74

      Reviewer #3 (Recommendations for the authors):

      Some minor grammatical errors and typos need to be cleaned up (e.g. "To quantifying the densities...", "The medial-ventral region of the PBN...have extensive...").

      These errors have been corrected

    1. Reviewer #1 (Public review):

      Summary:

      The work by Fisher et al describes the role of novel RSPO mimetics in the activation of WNT signaling and hepatocyte regeneration. However, the results of the experiments and weaknesses of the methods used do not support the conclusions of the authors that the new therapy can promote liver regeneration in alcohol-induced liver cirrhosis.

      Strengths:

      Similarly to its precursor, aASGR1-RSPO2-RA-IgG, SZN-043 can upregulate Wnt target genes and promote hepatocyte proliferation in the liver.

      Weaknesses:

      (1) The authors rely on the expression of a single gene, CYP1A1, as a readout of Wnt/ß-catenin target gene expression. A more systemic evaluation of Wnt/ß-catenin activity should be performed.

      (2) The lack of the mRNA upregulation of cell cycle genes is not sufficient to draw a conclusion of the impaired regeneration in cirrhotic livers.

      (3) The authors present a single-dose pharmacokinetic (PK) profile of SZN-043. It is not clear how that compares to its precursor, to justify better pharmacokinetic properties.

      (4) The specificity of Wnt/ß-catenin activation should be evaluated in ß-catenin KO mice to show no target gene induction in the absence of ß-catenin.

      (5) The authors demonstrated that the drug promoted hepatocyte proliferation. How it affects liver functional parameters in alcohol-fed mice, hepatocyte differentiation markers, albumin production, and coagulation factor synthesis is not clear.

      (6) Female mice only were used for alcohol studies; the effect on the male mice needs to be evaluated as well.

      (7) Alcohol feeding did not reduce Wnt/ß-catenin target gene expression in mice suggesting that it is a bad model to study the efficacy of the SZN-043 in alcohol-induced liver cirrhosis.

      (8) The authors used CCl4-induced fibrosis as a model of ALD fibrosis. However, this is not a suitable fibrosis model for ALD studies. Adding alcohol to CCl4 treatment could potentially address this issue. Alternatively, the authors should use an ALD model that produces significant fibrosis.

      (9) Sex for the CCl4-treated mice is not indicated.

      (10) Histology and fibrosis assessment data for alcohol-fed mice should be presented.

      (11) The rationale for using 13.5-month-old aging mice for alcohol studies and immunodeficient mice only for CCl4 studies is not clear.

    2. Reviewer #2 (Public review):

      Summary:

      The study by Fisher et al investigates a therapeutic role for SZN-043, a hepatocyte-targeted R-spondin mimetic, for its potential role in restoring Wnt signaling and promoting liver regeneration in alcohol-associated liver disease (ALD). Using multiple preclinical models, the compound was shown to promote hepatocyte proliferation and reduce fibrosis. This study highlights the efficacy of promoting liver regeneration while maintaining controlled signaling. Limitations include a need for further exploration of off-target effects and fibrosis mechanisms. The findings support SZN-043 as a promising candidate for ALD therapy, warranting further clinical evaluation. This is a well-designed study with thorough investigation using multiple disease models.

      Strengths:

      (1) Well-written manuscript with clear design, robust methods, and discussion.

      (2) Using multiple models strengthens the findings and expands beyond ALD.

      (3) Identification of SZN-043 as a novel potent drug for liver regeneration.

      Weaknesses:

      (1) The introduction needs to be re-structured with an emphasis on liver regeneration. It seems that the entire manuscript is focused on liver regeneration, however, only the last two sentences or so describe liver regeneration. The frequency of liver transplants owing to a reduced ability for liver regeneration in AH patients needs to be highlighted.

      (2) In Figure 4, it appears that the humanized mice liver was injected with the SZN-043. Is it possible that using a partial hepatectomy model will be beneficial for assessing the effects of SZN-043 rather than using them in mice without any hepatocyte damage?

      (3) Figure 4B. Panel 3 has 10mpk merged inside the figure. Please correct this.

      (4) Figure 4B. DAPI staining will be vital to show that the Ki67 staining is specific to hepatocytes (at least visually we can do co-localization with a double nucleus in each cell). The current image shows Ki67 staining in some cells that are not binuclear.

      (5) The alcohol feeding was performed for 8 weeks and is described as the NIAAA model in the methods section. The NIAAA model is 11 days of alcohol plus one binge. Please correct this or clarify it in the methods section, as this is not reflected. ASGR1 may also be expressed by macrophages, so it is important to show the specificity.

      (6) Is it possible that SZN-043 also has an effect on macrophages, promoting an anti-inflammatory state? This should be discussed.

      (7) Potential off-target effects of SZN-043, particularly in stellate cell activation in the context of fibrosis should be discussed.

      (8) Discuss the limitations of current models and how they might influence the interpretation of the results.

      (9) Clearly explain how SZN-043 overcomes limitations of prior RSPO-based therapies.

    1. Reviewer #1 (Public review):

      Summary:

      Participants in this study completed three visits. In the first, participants received experimental thermal stimulations which were calibrated to elicit three specific pain responses (30, 50, 70) on a 0-100 visual analogue scale (VAS). Experimental pressure stimulations were also calibrated to elicit the same three pain intensity responses. In the subsequent two visits, participants completed another pre-calibration check (Visit 2 of 3 only). Then, prior to the exercise, NALOXONE or a SALINE placebo-control was administered intravenously. Participants then completed 1 of 4 blocks of HIGH (100%) or LOW (55%) intensity cycling which was tailored according to a functional threshold power (FTP) test completed in Visit 1. After each block of cycling lasting 10 minutes, participants entered an MRI scanner and were stimulated with the same thermal and pressure stimulations that corresponded to 30, 50, and 70 pain intensity ratings from the calibration stage. Therefore, this study ultimately sought to investigate whether aerobic exercise does indeed incur a hypoalgesia effect. More specifically, researchers tested the validity of the proposed endogenous pain modulation mechanism. Whether the intensity of exercise had an effect on pain and on the neurological activation of pain-related brain centres was also explored.

      Results show that in the experimental visits (Visits 2 and 3), participants exercised at two distinct intensities as intended: power output, heart rate, and perceived effort ratings were higher during the HIGH versus LOW-intensity cycling. In particular, HIGH-intensity exercise was perceived as "hard" / ~15 on the Borg (1974, 1998) scale, whereas LOW-intensity exercise was perceived as "very light" / ~9 on the same scale.

      The fMRI data from Figure 1 indicates that the anterior insula, dorsal posterior insula, and middle cingulate cortex show pronounced activation as stimulation intensity and subsequent pain responses increased, thus linking these brain regions with pain intensity and corroborating what many studies have shown before.

      Results also showed that participants rated a higher pain intensity in the NALOXONE condition at all three stimulation intensities compared to the SALINE condition. Therefore, the expected effect of NALOXONE in this study seemed to occur whereby opioid receptors were "blocked" and thus resulted in higher pain ratings compared to a SALINE condition where opioid receptors were "not blocked". When accounting for participant sex, NALOXONE had negligible effects at lower experimental nociceptive stimulations for females compared to males who showed a hyperalgesia effect to NALOXONE at all stimulation intensities (peak effect at 50 VAS). Females did show a hyperalgesia effect at stimulation intensities corresponding to 50 and 70 VAS pain ratings. The fMRI data showed that the periaqueductal gray (PAG) showed increased activation in the NALOXONE versus SALINE condition at higher thermal stimulation intensities. The PAG is well-linked to endogenous pain modulation.

      When assessing the effects of NALOXONE and SALINE after exercise, results showed no significant differences in subsequent pain intensity ratings.

      When assessing the effect of aerobic exercise intensity on subsequent pain intensity ratings, the authors suggested that aerobic exercise in the form of a continuous cycling exercise tailored to an individual's FTP is not effective at eliciting an exercise-induced hypoalgesia response, irrespective of exercise intensity. This is because results showed that pain responses did not differ significantly between HIGH and LOW intensity exercise with (NALOXONE) and without (SALINE) an opioid antagonist. Therefore, the authors have also questioned the mechanisms (endogenous opioids) behind this effect.

      Strengths:

      Altogether, the paper is a great piece of work that has provided some truly useful insight into the neurological and perceptual mechanisms associated with pain and exercise-induced hypoalgesia. The authors have gone to great lengths to delve into their research question(s) and their methodological approach is relatively sound. The study has incorporated effective pseudo-randomisation and conducted a rigorous set of statistical analyses to account for as many confounds as possible. I will particularly credit the authors on their analysis which explores the impact of sex and female participants' stage of menses on the study outcomes. It would be particularly interesting for future work to pursue some of these lines of research which investigate the differences in the endogenous opioid mechanism between sexes and the added interaction of stage of menses or training status.

      There are certainly many other areas that this article contributes to the literature due to the depth of methods the research team has used. For example, the authors provide much insight into: the impact of exercise intensity on the exercise-induced hypoalgesia effect; the impact of sex on the endogenous opioid modulation mechanism; and the impact of exercise intensity on the neurological indices associated with endogenous pain modulation and pain processing. All of which, the researchers should be credited for due to the time and effort they have spent completing this study. Indeed, their in-depth analysis of many of these areas provides ample support for the claims they make in relation to these specific questions. As such, I consider their evidence concerning the fMRI data to be very convincing (and interesting).

      Weaknesses:

      Although the authors have their own view of their results, I do, however, have a slightly different take on what the post-exercise pain ratings seem to show and its implications for judging whether an exercise-induced hypoalgesia effect is present or not. From what I have read, I cannot seem to find whether the authors have compared the post-exercise pain ratings against any data that was collected pre-exercise/at rest or as part of the calibration. Instead, I believe the authors have only compared post-exercise pain ratings against one another (i.e., HIGH versus LOW, NALOXONE versus SALINE). In doing so, I think the authors cannot fully assume that there is no exercise-induced hypoalgesia effect as there is no true control comparison (a no-exercise condition).

      In more detail, Figure 6A appears to show an average of all pain ratings combined per participant (is this correct?). As participants were exposed to stimulations expected to elicit a 30, 50, or 70 VAS rating based on pre-calibration values, the average rating would be expected to be around 50. What Figure 6A shows is that in the SALINE condition, average pain ratings are in fact ~10-15 units lower (~35) and then in the NALOXONE condition, average pain ratings are ~5 units lower (~45) for both exercise intensities. From this, I would surmise the following:

      It appears there is an exercise-induced hypoalgesia effect as average pain ratings are ~30% lower than pre-calibrated/resting pain ratings within the SALINE condition at the same temperature of stimulation (it would also be interesting to see if this effect occurred for the pressure pain).

      It appears there is evidence for the endogenous opioid mechanism as the NALOXONE condition demonstrates a minimal hypoalgesia effect after exercise. I.e., NALOXONE indeed blocked the opioid receptors, and such inhibition prevented the endogenous opioid system from taking effect.

      It appears there is no effect of exercise intensity on the exercise-induced hypoalgesia effect. That is, participants can cycle at a moderate intensity (55% FTP) and incur the same hypoalgesia benefits as cycling at an intensity that demarcates the boundary between heavy and severe intensity exercise (100%FTP). This is a great finding in my mind as anyone wishing to reduce pain can do so without having to engage in exercise that is too effortful/intense and therefore aversive - great news! This likely has many applications within the field of public health.
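
      The arithmetic behind the three points above can be sketched as follows. This is only an illustration, under the assumption that the three calibrated intensities were presented equally often, so that the expected average rating is the mean of 30, 50 and 70. The values 35 and 45 are approximate readings from Figure 6A rather than exact data, and the function name is hypothetical.

```python
from statistics import mean

# VAS ratings targeted during calibration
calibrated_targets = [30, 50, 70]
# Assumption: each intensity contributes equally to the post-exercise average
expected_mean = mean(calibrated_targets)


def percent_reduction(observed: float, expected: float) -> float:
    """Reduction of the observed rating relative to the expected rating, in percent."""
    return 100.0 * (expected - observed) / expected


# Approximate average ratings read from Figure 6A (not exact data)
saline_reduction = percent_reduction(35, expected_mean)
naloxone_reduction = percent_reduction(45, expected_mean)
```

      Under these assumptions the SALINE average sits about 30% below the calibrated expectation and the NALOXONE average about 10% below it, which is the contrast the points above rely on.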

      I will very slightly caveat my summaries with the fact that a more ideal comparison here would be a control condition whereby participants did the same experimental visit but without any exercise prior to entering the MRI scanner. I consider the overall strength of the evidence to be solid, with the answer to the primary research question still a little ambiguous.

    2. Reviewer #2 (Public review):

      Summary:

      This interesting study compared two different intensities of aerobic exercise (low-intensity, high-intensity) and their efficacy in inducing a hypoalgesic reaction (i.e. exercise-induced hypoalgesia; EIH). fMRI was used to identify signal changes in the brain, with the infusion of naloxone used to identify hypoalgesia mechanisms. No differences were found in post-exercise pain perception between the high-intensity and low-intensity conditions, with naloxone infusion causing increased pain perception across both conditions which was mirrored by activation in the medial frontal cortex (identified by fMRI). However, the primary conclusion made in this manuscript (i.e. that aerobic exercise has no overall effect on pain in a mixed population sample) cannot be supported by this study design, because the methodology did not include a baseline (i.e. pain perception following no exercise) to compare high/low-intensity exercise against. Therefore, some of the statements/implications of the findings made in this manuscript need to be very carefully assessed.

      Strengths:

      (1) The use of fMRI and naloxone provides a strong approach by which to identify possible mechanisms of EIH.

      (2) The infusion of naloxone to maintain a stable concentration helps to ensure a consistent effect and that the time course of the protocol won't affect the consistency of changes in pain perception.

      (3) The manipulation checks (differences in intensity of exercise, appropriate pain induction) are approached in a systematic way.

      (4) Whilst the exploratory analyses relating to the interactions for fitness level and sex were not reported in the study pre-registration, they do provide some interesting findings which should be explored further.

      Weaknesses:

      (1) Given that there is no baseline/control condition, it cannot be concluded that aerobic exercise has no effect on pain modulation because that comparison has not been made (i.e. pain perception at 'baseline' has not been compared with pain perception after high/low-intensity exercise). Some of the primary findings/conclusions throughout the manuscript state that there is 'No overall effect of aerobic exercise on pain modulation', but this cannot be concluded.

      (2) Across the manuscript, a number of terms are used interchangeably (and applied, it seems, incorrectly) which makes the interpretation of the manuscript difficult (e.g. how the authors use the term 'exercise-induced pain').

      (3) There is a lack of clarity on the interventions used in the methods, for example, it is not exactly clear the time and order in which the exercise tasks were implemented.

      (4) The exercise test (functional threshold power) used to set the intensity of the low/high exercise bouts is not an accurate means of demarcating steady state and non-steady state exercise. As a result, at the intensity selected for the high-intensity exercise in this study, it is likely that the challenge presented for the high-intensity exercise would have been very different between participants (e.g. some would have been in the 'heavy' domain, whereas others would be in the 'severe' domain).

      (5) It is likely that participants did not properly understand how to use the 6-20 Borg scale to rate their perceived effort, and so caution must be taken in how this RPE data is used/interpreted.

      (6) Although interesting, the secondary analyses (relating to the interaction effects of fitness level and sex) were not included in the study pre-registration, and so the study was not designed to undertake this analysis. These findings should be taken with caution.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Abstract

      I don't think you need the first two sentences of the abstract. This is not a grant and your results are exciting enough to justify a full basic science-based approach.

      We fully understand this perspective.  However, we prefer to introduce the work in the broader context of sleep medicine.  This manuscript is part of our long-standing efforts to develop cavefish as a model for sleep disorders and we believe this provides important context.

      Last sentence of the abstract: the subject is missing. "That have developed..." who has developed?

      Thank you. We have corrected this error, the sentence now reads “...these findings suggest that cavefish have developed resilience to sleep loss...”

      Introduction

      First paragraph. Worth explaining in a sentence what is the link between DNA damage and ROS.

      We now state ‘Further, chronic sleep loss results in elevated reactive oxygen species (ROS), a known mediator of DNA damage, in the gut and/or brain that contribute to mortality in Drosophila and mice [11,16].’

      "A. mexicanus exists as blind cave populations and an extant surface population that are interfertile". This needs rephrasing. As it is, it sounds like the surface population is infertile.

      We have rephrased for clarity; the line now reads: “while the surface and cave populations are geographically isolated, they remain interfertile and capable of hybridization in nature as well as laboratory settings”.

      "Further, the evolved differences in DNA repair genes, including links between mechanisms regulating sleep, light responsiveness, and DNA repair across all three cave populations studied to date [27,29]" This sentence is incomplete.

      We have corrected the phrasing, which now reads “...evolved differences in DNA repair genes have been identified across all three cave populations studied to date, including links between mechanisms regulating sleep, light responsiveness, and DNA repair”.

      Figure 1

      I recommend improving the legibility of the figure copying some of the information provided in the legend directly within the figure itself.

      A, B: label in the panel itself what is blue and what is green.

      Thank you, we have made this change.

      C: Make it clear in the figure itself that you are measuring yH2AX. Also, probably you have enough room in the figure to avoid abbreviations for Rhomb, mes, and tele. It may also help if you could add a little cartoon that explains what those three brain regions are.

      We have added text to the y-axis indicating that yH2AX fluorescence is being measured, and replaced the abbreviations with the full names of the regions.

      G: again, explain that DHE is being measured here. And perhaps pick a different colour choice to highlight the difference from C?

      We have added clarification to the y-axis of the figure, but have retained the color scheme for consistency; in all surface-cave comparisons in the manuscript, gray is used for surface fish and red for cavefish.

      In the text: I would recommend adding some quantitative reminder of what is the difference in sleep amount between the two species (cave vs surface).

      We have added the following to highlight the magnitude of the difference in sleep: “Strikingly, cavefish sleep as little as 1-2 hours per day, in contrast to their surface counterparts, which sleep as much as 6-10 hours a day.”

      "Together, these findings fortify the notion that cellular stress is elevated in the gut of cavefish relative to surface fish." Were the two populations fed the same diet and raised in the same lab conditions? If this is pinpointed to sleep amount, it's worth ruling out possible confounding factors.

      We have added a sentence to the results underlining this point: “Prior to imaging, both surface and cavefish had been reared in a temperature-controlled incubator, and relied solely on their yolk sac for nutrients; so, differences in gut ROS cannot be attributed to differences in rearing or feeding conditions.”

      Figure 2

      Spell out, somewhere in the figure itself, that the 30s and 60s refer to UV treatment protocols.

      We have added X-axis titles to clarify this in Fig 2 and supp. Fig 1.

      It would be worth providing a cartoon of the experimental setup that shows for instance what time of the day UV was given (it's only specified in the text) and which subsequent sleep period was selected for comparisons.

      We have added arrows to all sleep plots indicating the time of UV treatment, and brackets indicating the time period used for statistical comparisons, as well as text in the figure legends indicating this.

      Figure 3

      A. I don't think this is needed, to be honest, and if you want to keep it, it needs a better legend.

      We have edited the figure legend to increase clarity.

      B. I would make it clear in the figure that this refers to transcriptomics analysis. Perhaps you could change the order and show C, D, and then B.

      We have added text to the figure legend and the results text to state more explicitly that the PCA plot shows the transcriptional response. However, we have retained the original figure order, as we feel this figure is important to establish that both populations have strong but distinct responses to the UV treatment.

      Figure 4

      A. Spell it out in the figure itself that you're staining for CPD.

      Thank you, we have made this change.

      B. You are using the same colour combination you had in Figure 1 but for yet another pairing. This is a bit confusing.

      Thank you for bringing this to our attention.  We have added descriptions of the colors in the figure legend.

      Discussion

      "Beyond the Pachón cavefish population, all three other cavefish populations have been found to have reduced sleep (Cite)." Citation missing here.

      Thank you.  We have now clarified this sentence and included a citation.

      Reviewer #2 (Recommendations For The Authors):

      Consideration of Environmental Conditions:

      Evaluate whether the lab conditions, which may more closely resemble surface environments, could influence the observed increase in neuronal DNA damage and gut ROS levels in cavefish. Adjusting these conditions or discussing their potential impact in the manuscript would strengthen the findings.

      We are very excited about these experiments.  We have a paper that will be submitted to BioRxiv this week where we record wild-caught fish, as well as fish in caves.  The conclusion is that sleep loss is present in both populations.  This field work took over 10 years to come together and still lacks the power of the lab based assays.  Nevertheless, we can conclusively say that the phenotypes we have observed for the last ~15 years in the lab are present in a natural setting.  We have included a statement about the need for future work to test these findings in a natural setting.

      Alternative Stressors:

      Given that cavefish are albino and blind (to my knowledge), consider using alternative sources of genotoxic stress beyond UV-induced damage. This could include chemical agents or other forms of environmental stress to provide a more comprehensive assessment of DDR.

      We agree and are enthusiastic about looking more generally at stress.  We note that we have previously found that cavefish rebound following sleep deprivation (McGaugh et al, 2020) suggesting that they are responsive to sleep disruption.  This will be a major research focus area moving forward.

      Broader Stress Responses:

      Investigate whether other forms of stress, such as dietary changes or temperature fluctuations, elicit similar differences in sleep patterns and DDR responses. This could provide additional insights into the robustness of the observed phenomena.

      We fully agree. This will be the primary focus of this research area moving forward. We hypothesize that cavefish are generally less responsive to their environment. Unpublished data reveal that temperature stress, circadian changes, and aging (presented here) do little to impact gene expression in surface fish. We would like to test the hypothesis that transcriptional stability of cavefish contributes to their longevity.

      Potential Protective Mechanisms:

      Discuss the possibility that lower levels of gamma-H2AX in cavefish might be protective, as DDR can lead to cellular senescence or cancer. This perspective could add depth to the interpretation of the results.

      This was the hypothesis underlying this manuscript.  However, we found elevated levels of gamma-H2AX.  We believe there may be additional protective mechanisms that have evolved in cavefish, but cannot identify them to date.  Our hope is future functional studies by our group, as well as other groups’ access to this published work, may help address these questions.

      Strengthening the Sleep-DNA Damage Link:

      Further experiments are needed to directly link sleep differences to the observed variations in DNA damage and DDR. This could involve manipulating sleep patterns in surface fish and cavefish to observe corresponding changes in DNA repair mechanisms.

      We agree. We have referenced work that conclusively showed this relationship in zebrafish. Our current method for limiting sleep involves shaking, and this has too many confounds. We are working on developing genetic tools and applying the gentle rocking methods used previously in zebrafish to address these questions.

      Clarification of Causal Directionality:

      Address the potential that sleep patterns and DDR responses may both be downstream effects of a common cause or independent adaptations to the cave environment. Clarifying this in the manuscript would provide a more nuanced understanding of the evolutionary adaptations.

      Thank you for this suggestion.  We have now added a paragraph describing how these experiments (and the ones described above) are necessary for understanding the relationship between sleep and DDR.

      Clarification and Presentation:

      Fix the many typos, and improve the clarity of the figures and their legends to ensure they are easily interpretable. Additional context in the discussion section would help readers understand the significance and potential implications of the findings.

      Thank you, we have now included this.

      Reviewer #3 (Recommendations For The Authors):

      There are a number of suggestions that I have made in the public review, but there are a few things that I would like to add here.

      The methods section is missing many important details; for instance, the intensity of the illumination used in the UV exposure in larvae is not reported but is vital for the interpretation/replication of these experiments. In general, this section should be redone with a greater effort to include all important information. Similarly, the figure legends could be greatly improved, with important details like n-numbers and significance thresholds defined (e.g. see Figure 1C and G).

      We have added greater detail to the methods section to specify the spectral peak and power output of the bulbs used.

      There are a number of passages in the manuscript that do not make sense, which suggests that a future version of record should be carefully proofread. I know that this can be a case of reading multiple versions of a manuscript so many times that one doesn't really see it anymore, but, for example, phrases like "To differentiate between these two possibilities" are confusing to the reader when there has been no introduction of alternate possibilities.

      Thank you for this comment.  We have fixed this mistake and proofread the manuscript.

      Additionally, there are multiple examples of errors in citations/references. A few examples are below:

      "Further, chronic sleep loss results in elevated reactive oxygen species (ROS) in the gut and/or brain that contribute to mortality in Drosophila and mice [11, 16]". Reference 16 does not include mice at all, and reference 11 is Vaccaro et al. 2020, where Drosophila mortality is assessed, but mouse mortality is not.

      We have added the appropriate citations and revised this sentence.

      References 13 and 15 are the same.

      Thank you, we have fixed this.

      References 24 and 26 are the same.

      Thank you, we have fixed this.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Lloyd et al employ an evolutionary comparative approach to study how sleep deprivation affects DNA damage repair in Astyanax mexicanus, using the cave vs surface species evolution as a playground. The work shows, convincingly, that the cavefish population has evolved an impaired DNA damage response both following sleep deprivation or a classical paradigm of DNA damage (UV).

      Strengths:

      The study employs a thorough multidisciplinary approach. The experiments are well conducted and generally well presented.

      Weaknesses:

      Having a second experimental means to induce DNA damage would strengthen and generalise the findings.

      Overall, the study represents a very important addition to the field. The model employed underlines once more the importance of using an evolutionary approach to study sleep and provides context and caveats to statements that perhaps were taken a bit too much for granted before. At the same time, the paper manages to have an extremely constructive approach, presenting the platform as a clear useful tool to explore the molecular aspects behind sleep and cellular damage in general. The discussion is fair, highlighting the strengths and weaknesses of the work and its implications.

      We fully agree with this assessment.  We are currently performing experiments to test the effects of additional DNA damaging agents.  We hope to extend these studies beyond DNA-damage agents to look more generally at how animals respond to stress including ROS, sleep deprivation, and high temperature.  This will be a major direction of the laboratory moving forward.

      Reviewer #2 (Public Review):

      The manuscript investigates the relationship between sleep, DNA damage, and aging in the Mexican cavefish (Astyanax mexicanus), a species that exhibits significant differences in sleep patterns between surface-dwelling and cave-dwelling populations. The authors aim to understand whether these evolved sleep differences influence the DNA damage response (DDR) and oxidative stress levels in the brain and gut of the fish.

      Summary of the Study:

      The primary objective of the study is to determine if the reduced sleep observed in cave-dwelling populations is associated with increased DNA damage and altered DDR. The authors compared levels of DNA damage markers and oxidative stress in the brains and guts of surface and cavefish. They also analyzed the transcriptional response to UV-induced DNA damage and evaluated the DDR in embryonic fibroblast cell lines derived from both populations.

      Strengths of the Study:

      Comparative Approach:

      The study leverages the unique evolutionary divergence between surface and cave populations of A. mexicanus to explore fundamental biological questions about sleep and DNA repair.

      Multifaceted Methodology:

      The authors employ a variety of methods, including immunohistochemistry, RNA sequencing, and in vitro cell line experiments, providing a comprehensive examination of DDR and oxidative stress.

      Interesting Findings:

      The study presents intriguing results showing elevated DNA damage markers in cavefish brains and increased oxidative stress in cavefish guts, alongside a reduced transcriptional response to UV-induced DNA damage.

      Weaknesses of the Study:

      Link to Sleep Physiology:

      The evidence connecting the observed differences in DNA damage and DDR directly to sleep physiology is not convincingly established. While the study shows distinct DDR patterns, it does not robustly demonstrate that these are a direct result of sleep differences.

      We agree with this assessment. We are currently working to apply tools developed in zebrafish to examine the physiology of sleep. While this is important, and our results are promising, we will note that functional analysis of sleep physiology in fish has been limited to zebrafish. We hope future studies will allow us to integrate approaches that examine the physiology of sleep.

      Causal Directionality:

      The study fails to establish a clear causal relationship between sleep and DNA damage. It is possible that both sleep patterns and DDR responses are downstream effects of a common cause or independent adaptations to the cave environment.

      We agree; however, we note that this could be the case for all animals in which sleep has been linked to DNA damage. We believe the most likely explanation, for Astyanax and the other animals studied, is that sleep and DDR are downstream of, or interface with, the sleep homeostat.

      Environmental Considerations:

      The lab conditions may not fully replicate the natural environments of the cavefish, potentially influencing the results. The impact of these conditions on the study's findings needs further consideration.

      This is correct. We have considered this carefully. After nearly a decade of effort, we have completed an analysis of sleep in the wild. These results will be uploaded to BioRxiv within the next week.

      Photoreactivity in Albino Fish:

      The use of UV-induced DNA damage as a primary stressor may not be entirely appropriate for albino, blind cavefish. Alternative sources of genotoxic stress should be explored to validate the findings.

      We have addressed this above.  Future work will examine additional stressors. Both fish are transparent at 6dpf and so it is unlikely that albinism impacts the amount of UV that reaches the brain.

      Assessment of the Study's Achievements:

      The authors partially achieve their aims by demonstrating differences in DNA damage and DDR between surface and cavefish. However, the results do not conclusively support the claim that these differences are driven by or directly related to the evolved sleep patterns in cavefish. The study's primary claims are only partially supported by the data.

      Impact and Utility:

      The findings contribute valuable insights into the relationship between sleep and DNA repair mechanisms, highlighting potential areas of resilience to DNA damage in cavefish. While the direct link to sleep physiology remains unsubstantiated, the study's data and methods will be useful to researchers investigating evolutionary biology, stress resilience, and the molecular basis of sleep.

      Reviewer #3 (Public Review):

      Lloyd, Xia, et al. utilised the existence of surface-dwelling and cave-dwelling morphs of Astyanax mexicanus to explore a proposed link between DNA damage, aging, and the evolution of sleep. Key to this exploration is the behavioural and physiological differences between cavefish and surface fish, with cavefish having been previously shown to have low levels of sleep behaviour, along with metabolic alterations (for example chronically elevated blood glucose levels) in comparison to fish from surface populations. Sleep deprivation, metabolic dysfunction, and DNA damage are thought to be linked and to contribute to aging processes. Given that cavefish seem to show no apparent health consequences of low sleep levels, the authors suggest that they have evolved resilience to sleep loss. Furthermore, as extended wake and loss of sleep are associated with increased rates of damage to DNA (mainly double-strand breaks) and sleep is linked to repair of damaged DNA, the authors propose that changes in DNA damage and repair might underlie the reduced need for sleep in the cavefish morphs relative to their surface-dwelling conspecifics.

      To fulfill their aim of exploring links between DNA damage, aging, and the evolution of sleep, the authors employ methods that are largely appropriate, and comparison of cavefish and surface fish morphs from the same species certainly provides a lens by which cellular, physiological and behavioural adaptations can be interrogated. Fluorescence and immunofluorescence are used to measure gut reactive oxygen species and markers of DNA damage and repair processes in the different fish morphs, and measurements of gene expression and protein levels are appropriately used. However, although the sleep tracking and quantification employed are quite well established, issues with the experimental design relate to attempts to link induced DNA damage to sleep regulation (outlined below). Moreover, although the methods used are appropriate for the study of the questions at hand, there are issues with the interpretation of the data and with these results being over-interpreted as evidence to support the paper's conclusions.

      This study shows that a marker of DNA repair molecular machinery that is recruited to DNA double-strand breaks (γH2AX) is elevated in brain cells of the cavefish relative to the surface fish and that reactive oxygen species are higher in most areas of the digestive tract of the cavefish than in that of the surface fish. As sleep deprivation has been previously linked to increases in both these parameters in other organisms (both vertebrates and invertebrates), their elevation in the cavefish morph is taken to indicate that the cavefish show signs of the physiological effects of chronic sleep deprivation.

      It has been suggested that induction of DNA damage can directly drive sleep behaviour, with a notable study describing both the induction of DNA damage and an increase in sleep/immobility in zebrafish (Danio rerio) larvae by exposure to UV radiation (Zada et al. 2021 doi:10.1016/j.molcel.2021.10.026). In the present study, an increase in sleep/immobility is induced in surface fish larvae by exposure to UV light, but there is no effect on behaviour in cavefish larvae. This finding is interpreted as representing a loss of a sleep-promoting response to DNA damage in the cavefish morph. However, induction of DNA damage is not measured in this experiment, so it is not certain if similar levels of DNA damage are induced in each group of intact larvae, nor how the amount of damage induced compares to the pre-existing levels of DNA damage in the cavefish versus the surface fish larvae. In both this study with A. mexicanus surface morphs and the previous experiments from Zada et al. in zebrafish, observed increases in immobility following UV radiation exposure are interpreted as following from UV-induced DNA damage. However, in interpreting these experiments it is important to note that the cavefish morphs are eyeless and blind. Intense UV radiation is aversive to fish, and it has previously been shown in zebrafish larvae that (at least some) behavioural responses to UV exposure depend on the presence of an intact retina and UV-sensitive cone photoreceptors (Guggiana-Nilo and Engert, 2016, doi:10.3389/fnbeh.2016.00160). It is premature to conclude that the lack of behavioural response to UV exposure in the cavefish is due to a different response to DNA damage, as their lack of eyes will likely inhibit a response to the UV stimulus.

      We believe that in A. mexicanus, like in zebrafish, it is highly unlikely that the effects of UV are mediated through visual processing. Even if this were the case, the timeframe of UV activation is very short compared to the time-scale of sleep measurements so this is unlikely to be a confound.

      Indeed, were the equivalent zebrafish experiment from Zada et al. to be repeated with mutant larvae fish lacking the retinal basis for UV detection it might be found that in this case too, the effects of UV on behaviour are dependent on visual function. Such a finding should prompt a reappraisal of the interpretation that UV exposure's effects on fish sleep/locomotor behaviour are mediated by DNA damage.

      We prefer not to comment on Zada et al, as that is a separate manuscript.

      An additional note, relating to both Lloyd, Xia, et al., and Zada et al., is that though increases in immobility are induced following UV exposure, in neither study have assays of sensory responsiveness been performed during this period. As a decrease in sensory responsiveness is a key behavioural criterion for defining sleep, it is, therefore, unclear that this post-UV behaviour is genuinely increased sleep as opposed to a stress-linked suppression of locomotion due to the intensely aversive UV stimulus.

      We understand this concern and are working on improved methodology for measuring sleep.  However, behavioral measurements are the standard for almost every manuscript that has studied sleep in zebrafish, flies, and worms to date. 

      The effects of UV exposure, in terms of causing damage to DNA, inducing DNA damage response and repair mechanisms, and in causing broader changes in gene expression are assessed in both surface and cavefish larvae, as well as in cell lines derived from these different morphs. Differences in the suite of DNA damage response mechanisms that are upregulated are shown to exist between surface fish and cavefish larvae, though at least some of this difference is likely to be due to differences in gene expression that may exist even without UV exposure (this is discussed further below).

      UV exposure induced DNA damage (as measured by levels of cyclobutane pyrimidine dimers) to a similar degree in cell lines derived from both surface fish and cave fish. However, γH2AX shows increased expression only in cells from the surface fish, suggesting induction of an increased DNA repair response in these surface morphs, corroborated by their cells' increased ability to repair damaged DNA constructs experimentally introduced to the cells in a subsequent experiment. This "host cell reactivation assay" is a very interesting assay for measuring DNA repair in cell lines, but the power of this approach might be enhanced by introducing these DNA constructs into larval neurons in vivo (perhaps by electroporation) and by tracking DNA repair in living animals. Indeed, in such a preparation, the relationship between DNA repair and sleep/wake state could be assayed.

      Comparing gene expression in tissues from young (here 1 year) and older (here 7-8 years) fish from both cavefish and surface fish morphs, the authors found that there are significant differences in the transcriptional profiles in brain and gut between young and old surface fish, but that for cavefish being 1 year old versus being 7-8 years old did not have a major effect on transcriptional profile. The authors take this as suggesting that there is a reduced transcriptional change occurring during aging and that the transcriptome of the cavefish is resistant to age-linked changes. This seems to be only one of the equally plausible interpretations of the results; it could also be the case that alterations in metabolic cellular and molecular mechanisms, and particularly in responses to DNA damage, in the cavefish mean that these fish adopt their "aged" transcriptome within the first year of life.

      This is indeed true.  However, one could also interpret this as a lack of aging.  If the profile does not change over time, the difference seems largely semantic.

      A major weakness of the study in its current form is the absence of sleep deprivation experiments to assay the effects of sleep loss on the cellular and molecular parameters in question. Without such experiments, the supposed link of sleep to the molecular, cellular, and "aging" phenotypes remains tenuous. Although the argument might be made that the cavefish represent a naturally "sleep-deprived" population, the cavefish in this study are not sleep-deprived, rather they are adapted to a condition of reduced sleep relative to fish from surface populations. Comparing the effects of depriving fish from each morph on markers of DNA damage and repair, gut reactive oxygen species, and gene expression will be necessary to solidify any proposed link of these phenotypes to sleep.

      We agree this would be beneficial. We note that relatively few papers have sleep-deprived fish. While we have done this before in A. mexicanus, the assay is less than ideal and likely induces generalizable stress. We are working on adapting more recently developed methods in zebrafish.

      A second important aspect that limits the interpretability and impact of this study is the absence of information about circadian variations in the parameters measured. A relationship between circadian phase, light exposure, and DNA damage/repair mechanisms is known to exist in A. mexicanus and other teleosts, and differences exist between the cave and surface morphs in their phenomena (Beale et al. 2013, doi: 10.1038/ncomms3769). Although the present study mentions that their experiments do not align with these previous findings, they do not perform the appropriate experiments to determine if such a misalignment is genuine. Specifically, Beale et al. 2013 showed that white light exposure drove enhanced expression of DNA repair genes (including cpdp which is prominent in the current study) in both surface fish and cavefish morphs, but that the magnitude of this change was less in the cave fish because they maintained an elevated expression of these genes in the dark, whereas the darkness suppressed the expression of these genes in the surface fish. If such a phenomenon is present in the setting of the current study, this would likely be a significant confound for the UV-induced gene expression experiments in intact larvae, and undermine the interpretation of the results derived from these experiments: as samples are collected 90 minutes after the dark-light transition (ZT 1.5) it would be expected that both cavefish and surface fish larvae should have a clear induction of DNA repair genes (including cpdp) regardless of 90s of UV exposure. The data in Supplementary Figure 3 is not sufficient to discount this potentially serious confound, as for larvae there is only gene expression data for time points from ZT2 to ZT 14, with all of these time points being in the light phase and not capturing any dynamics that would occur at the most important timepoints from ZT0-ZT1.5, in the relevant period after dark-light transition. 
      Indeed, an appropriate control for this experiment would involve frequent sampling at least across 48 hours to assess light-linked and developmentally-related changes in gene expression that would occur in 5-6dpf larvae of each morph independently of the exposure to UV.

      We agree that this would be useful, however, frequent sampling is not feasible given the experiments presented here and the challenges of working with an emerging model.

      On a broader point, given the effects of both circadian rhythm and lighting conditions that are thought to exist in A. mexicanus (e.g. Beale et al. 2013) experiments involving measurements of DNA damage and repair, gene expression, and reactive oxygen species, etc. at multiple times across >1 24 hour cycle, in both light-dark and constant illumination conditions (e.g. constant dark) would be needed to substantiate the authors' interpretation that their findings indicate consistently altered levels of these parameters in the cavefish relative to the surface fish. Most of the data in this study is taken at only single time points.

      Again, see the comment above. The goal was to identify whether there are differences in the DNA damage response between A. mexicanus morphs. Extending this to examine interactions with the circadian system could be a useful path to pursue in the future.

      In summary, the authors show that there are differences in gene expression, activity of DNA damage response and repair pathways, response to UV radiation, and gut reactive oxygen species between the Pachón cavefish morph and the surface morph of Astyanax mexicanus. However, the data presented does not make the precise nature of these differences very clear, and the interpretation of the results appears to be overly strong. Furthermore, the evidence of a link between these morph-specific differences and sleep is unconvincing.

    1. Reviewer #2 (Public review):

      Summary:

      This study employed voltage imaging in the CA1 region of the mouse hippocampus during the exploration of a novel environment. The authors report that synchronous activity, involving almost half of the imaged neurons, occurred during periods of immobility. These events did not correlate with SWRs but instead occurred during theta oscillations and were phase-locked to the trough of theta. Moreover, pairs of neurons with high synchronization tended to display non-overlapping place fields, leading the authors to suggest these events may play a role in binding a distributed representation of the context.

      Strengths:

      Technically, this is an impressive study, using an emerging approach that allows single-cell resolution voltage imaging in animals that, while head-fixed, can move through a real environment. The paper is written clearly and suggests novel observations about population-level activity in CA1.

      Weaknesses:

      The evidence provided is weak, with the authors making surprising population-level claims based on a very sparse data set (5 data sets, each with fewer than 20 neurons simultaneously recorded) acquired with exciting, but less tested, technology. Further, while the authors link these observations to the novelty of the context, both in the title and text, they do not include data from subsequent visits to support this. Detailed comments are below:

      (1) My first question for the authors, which is not addressed in the discussion, is why these events have not been observed in the countless extracellular recording experiments conducted in rodent CA1 during exploration of novel environments. Those data sets often have 10x the neurons simultaneously recorded compared to the present data, thus the highly synchronous firing should be very hard to miss. Ideally, the authors could confirm their claims via the analysis of publicly available electrophysiology data sets. Further, the claim of high extra-SWR synchrony is complicated by the observation that their recorded neurons fail to spike during the limited number of SWRs recorded during behavior, again not agreeing with much of the previous electrophysiological recordings.

      (2) The authors posit that these events are linked to the novelty of the context, both in the text, as well as in the title and abstract. However, they do not include any imaging data from subsequent days to demonstrate the failure to see this synchrony in a familiar environment. If these data are available, it would strengthen the proposed link to novelty if they were included.

      (3) In the discussion, the authors begin by speculating that the theta present during these synchronous events may be slower type II or attentional theta. This can be supported by demonstrating a frequency shift in the theta recorded during these events/immobility versus the theta recorded during movement.

      (4) The authors mention in the discussion that they image deep-layer PCs in CA1; however, this is not mentioned in the text or methods. They should include data, such as imaging of a slice of a brain post-recording with immunohistochemistry for a layer-specific gene, to support this.

      Comments on revisions:

      I have no further major requests and thank the authors for the additional data and analyses.

    1. Reviewer #2 (Public review):

      Summary:

      This study makes a significant contribution to understanding the microenvironment of megakaryocytes (MKs) in the bone marrow, identifying an extracellular matrix (ECM) cage structure that influences MK localization and maturation. The authors provide compelling evidence for the presence of this ECM cage and its role in MK homeostasis, employing an array of sophisticated imaging techniques and molecular analyses. While the work is innovative and impactful, there are several points that require clarification or further data to fully support the conclusions.

      Major Strengths:

      Novelty: The identification of an ECM cage as a regulator of MK localization and maturation in the bone marrow is a novel and exciting finding.

      Imaging Techniques: The use of advanced microscopy to visualize the 3D structure of the ECM cage and its role in MK homeostasis provides a strong visual foundation for the study's claims.

      Comprehensive Analysis: The integration of in vivo and ex vivo approaches enhances the significance of the findings, offering valuable insights into the molecular mechanisms involved in ECM cage formation.

      Areas for Improvement and Clarifications:

      (1) ECM cage imaging:

      a) The value or additional information provided by the staining on nano-sections (A) is not clear, especially considering that the thick vibratome sections already display the entirety of the laminin γ1 cage structure effectively. Further clarification on the unique insights gained from each approach would help justify its inclusion.

      b) The sMK shown in Supplementary Figure 1C appears to be linked to two sinusoids, releasing proplatelets to the more distant vessels. Is this observation representative, and if so, can further discussion be provided?

      c) Freshly isolated BM-derived MKs are reported to maintain their laminin γ1 cage. Are the proportions of MKs with/without cages consistent with those observed in microscopy?

      (2) ECM cage formation:

      a) The statement "the full assembly of the 3D ECM cage required megakaryocyte interaction with the sinusoidal basement membrane" on page 7 is too strong given the data presented at this stage of the study. Supplemental Figure 1C shows that approximately 10% of pMKs form cages without direct vessel contact, indicating that other factors may also play a role in cage formation.

      b) The data supporting the statement that "pMK represent a small fraction of the total MK population" (cell number or density) could be shown to help contextualize the 10% of them with a cage.

      c) How "the full assembly of the 3D ECM cage" is defined at this stage of the study should be clarified, specifically regarding the ECM components and structural features that characterize its completion.

      (3) Data on MK Circulation and Cage Integrity: Does the cage require full component integrity to prevent MK release in circulation? Are circulating MKs found in Lama4-/- mice? Is the intravasation affected in these mice? Are the ~50% sinusoid associated MK functional?

      (4) Methodology:

      a) Details on fixation time are not provided, which is critical as it can impact antibody binding and staining. Including this information would improve reproducibility and feasibility for other researchers.

      b) The description of 'random length measuring' is unclear, and the rationale behind choosing random quantification should be explained. Additionally, in the shown image, it appears that only the branching ends were measured, which makes it difficult to discern the randomness in the measurements.

      (5) Figures:

      a) Overall, the figures and their corresponding legends would benefit from greater clarity if some panels were split, such as separating images from graph quantifications.

    2. Reviewer #3 (Public review):

      In this manuscript, Masson, Scandola, et al. investigate how interactions between megakaryocytes and the extracellular matrix contribute to the regulation of thrombopoiesis using primary murine bone marrow MK cultures, integrin B1/B3 knock-out mice, and high-resolution 2D and 3D imaging. They find that laminin and collagen IV create a 3D "cage" of ECM surrounding MKs and anchor them at the sinusoidal basement membrane, which contributes to MK maturation and proplatelet intravasation into circulation. Deletion of laminin α4 disrupts the localization of MKs and the endothelial basement membrane, reducing the number of MKs associated with the sinusoid while having no effect on MK-associated collagen IV. Deletion of B1/B3 integrin reduces the quantity, localization, and structural organization of multiple ECM components surrounding MKs, and reduces MK adhesion when subjected to conditions of sinusoidal flow.

      Further, using intravital microscopy of calvarial bone marrow and the pulmonary vasculature, they provide data suggesting that the stabilization of ECM around MKs (either in the BM or lung) prevents MKs from entering circulation as intact cells. Interestingly, deletion of B1 integrin reduces MK coverage in laminin γ1, but deletion of both B1 and B3 independently results in increased MK intravasation into the sinusoidal space. Comparison of integrin KO MKs with GPVI KO MKs suggests that ECM cage formation, vessel adhesion, and intravasation are likely dependent on integrin activation/signaling rather than GPVI signals.

      Further, they provide data that the balance of ECM synthesis and degradation is essential for MK maturation and also provide data showing that inhibition of ECM turnover (in vivo inhibition of MMPs) results in increased ECM cage components that correspond with reduced MK maturation, and reduced demarcation membrane development.

      The conclusions of the paper are supported by the data, but there are some areas that would benefit from clarification or expansion.

      (1) The data linking ECM cage formation to MK maturation raises several interesting questions. As the authors mention, MKs have been suggested to mature rapidly at the sinusoids, and both integrin KO and laminin KO MKs appear mislocalized away from the sinusoids. Additionally, average MK distances from the sinusoid may also help separate whether the maturation defects could be in part due to impaired migration towards CXCL12 at the sinusoid. Presumably, MKs could appear mislocalized away from the sinusoid given the data presented suggesting they are leaving the BM and entering circulation. Additional data or commentary on intrinsic (ex-vivo) MK maturation phenotypes may help strengthen the authors' conclusions and shed light on whether an essential function of the ECM cage is integrin activation at the sinusoid.

      (2) The data demonstrating that intact MKs enter circulation is intriguing - can the authors comment or provide evidence as to whether MKs are detectable in blood? A quantitative metric may strengthen these observations.

      (3) Supplementary Figure 6 shows no effect on in vitro MK maturation, proplatelets, or MK area, but Figures 6B/6C demonstrate an increase in total MK number in MMP-inhibitor treated mice compared to control. Some additional clarification in the text may substantiate the authors' conclusions as to either the source of the MMPs or the in vitro environment not fully reflecting the complex and dynamic niche of the BM ECM in vivo.

      (4) Similarly, one function of the ECM discussed relates to MK maturation, but in the B1/3 integrin KO mice, the presence of the ECM cage is reduced yet there appears to be no significant impact upon maturation (Supplementary Figure 4). By contrast, MMP inhibition in vivo (but not in vitro) reduces MK maturation. These data could be better clarified in the text, or by the addition of experiments addressing whether the composition and quantity of ECM cage components directly inhibit maturation, or whether effects of MMP inhibitors, perhaps leading to over-activation of the integrins (as with the B4galt KO in the discussion), are responsible for the differences in maturation.

    3. Author response:

      Reviewer #1 (Public review):

      Point 1. The authors postulate a synergistic role for Itgb1 and Itgb3 in the intravasation phenotype, because the single KOs did not replicate the phenotype of the DKO. However, this is not a correct interpretation in the opinion of this reviewer. The roles appear rather to be redundant. Synergistic roles would rather demonstrate a modest effect in the single KO with potentiation in the DKO.

      We agree that the interaction between Itgb1 and Itgb3 appears redundant and we will correct this point in the revised manuscript.

      Point 2. The experiment does not explain how these integrins influence the interaction of the MK with their microenvironment. It is not surprising that attachment will be impacted by the presence or absence of integrins. However, it is unclear how activation of integrins allows the MK to become "architects for their ECM microenvironment" as the authors posit. A transcriptomic analysis of control and DKO MKs may help elucidate these effects.

      We do not currently understand how α5β1 or αvβ3 integrin activation would contribute to ECM remodeling by megakaryocytes. Integrins are well-known key regulators of ECM remodelling (https://doi.org/10.1016/j.ceb.2006.08.009). They can transmit traction force that provokes ECM remodelling (https://doi.org/10.1016/j.bpj.2008.10.009). We will discuss our previous study on the observed reduction in RhoA activation in double knockout (DKO) mice (Guinard et al., 2023, PMID: 37171626), which likely impacts the organization of the ECM microenvironment. Alternatively, integrin signalling contributes to the regulation of genes involved in ECM remodelling (ECM proteins, proteases, etc.). We do agree with the reviewer that the transcriptomic analysis could provide strong evidence; however, it is challenging to perform this analysis in vivo. Isolation of native megakaryocytes (MKs) from DKO mice is difficult because of their reduced numbers: obtaining sufficient RNA would require too many mice and would carry a risk of cell contamination. An alternative approach will be to analyze platelets, which are more abundant and easier to isolate, while still mimicking the characteristics of bone marrow MKs. We will use PCR array technology for selected ECM panels and adhesion molecules (from all players currently known to contribute to ECM remodelling), providing a practical way to address the reviewer's suggestions and provide valuable insights.

      Point 3. Integrin DKO have a 50% reduction in platelets counts as reported previously, however laminin α4 deficiency only leads to 20% reduction in counts. This suggests a more nuanced and subtle role of the ECM in platelet growth. To this end, functional assays of the platelets in the KO and wildtype mice may provide more information.

      The difference in platelet counts between integrin DKO and laminin α4 KO mice is not fully understood. Although our study specifically focuses on MK-ECM interactions in the bone marrow, we recognize the importance of providing additional information on platelet functionality. To address this, we will use flow cytometry to examine the levels of P-selectin surface expression and fibrinogen binding under basal conditions and after stimulation with collagen-related peptide and TRAP.

      Point 4. There is insufficient information in the Methods Section to understand the BM isolation approach. Did the authors flush the bone marrow and then image residual bone, or the extruded bone marrow itself as described in PMID: 29104956?

      Additional information on the methodology will be provided to clarify the BM isolation.

      Point 5. The references in the Methods section were very frustrating. The authors reference Eckly et al 2020 (PMID: 32702204) which provides no more detail but references a previous publication (PMID: 24152908), which also offers no information and references a further paper (PMID: 22008103), which, as far as this reviewer can tell, did not describe the methodology of in situ bone marrow imaging.

      To address this confusion, we will add the reference "In Situ Exploration of the Major Steps of Megakaryopoiesis Using Transmission Electron Microscopy" by C. Scandola et al. (PMID: 34570102), which provides a standardized protocol for bone marrow isolation.

      Therefore, this reviewer cannot tell how the preparation was performed and, importantly, how can we be sure that the microarchitecture of the tissue did not get distorted in the process?

      Thank you for pointing this out. While we cannot completely rule out the possibility of distortion, we will clarify the precautions taken to minimize it. We utilized a double fixation process immediately after extruding the bone marrow, followed by embedding it in agarose to preserve its integrity as much as possible. We will address this point in greater detail in the Methods section of the revised version.

      Reviewer #2 (Public review):

      Point 1. ECM cage imaging

      a) The value or additional information provided by the staining on nano-sections (A) is not clear, especially considering that the thick vibratome sections already display the entirety of the laminin γ1 cage structure effectively. Further clarification on the unique insights gained from each approach would help justify its inclusion.

      Ultrathin cryosections allow high-resolution imaging (a 10-fold increase in Z resolution), facilitating the analysis of signal superposition. This study explores the interactions between MKs and their immediate ECM microenvironment, located at a distance of less than one micrometer, making nano-sections optimal for precise analysis of ECM distribution both within and surrounding MKs. This high-resolution approach has revealed the presence of collagen IV, laminin, fibronectin, and fibrinogen near MKs. More importantly, ultrathin cryosections allow us to show clearly, at high resolution, the presence of activated integrin in contact with laminin and collagen IV fibers (see Fig. 3).

      We employed large-volume whole-mount imaging to clarify the overall three-dimensional architecture of the ECM interface, allowing us to identify the cages. Our findings emphasize the role of specific ECM components in facilitating proplatelet passage through the sinusoid barrier, an essential step for platelet production. Further details will be addressed in the revised manuscript.

      b) The sMK shown in Supplementary Figure 1C appears to be linked to two sinusoids, releasing proplatelets to the more distant vessels. Is this observation representative, and if so, can further discussion be provided?

      This observation is not representative; MKs can also be associated with just one sinusoid.

      c) Freshly isolated BM-derived MKs are reported to maintain their laminin γ1 cage. Are the proportions of MKs with/without cages consistent with those observed in microscopy?   

      In the revised manuscript, we will include the quantification of the proportion of BM-derived MKs with/without cages.

      Point 2.  ECM cage formation

      a) The statement "the full assembly of the 3D ECM cage required megakaryocyte interaction with the sinusoidal basement membrane" on page 7 is too strong given the data presented at this stage of the study. Supplemental Figure 1C shows that approximately 10% of pMKs form cages without direct vessel contact, indicating that other factors may also play a role in cage formation.

      The reviewer is correct. We will modify the text to reflect a more cautious interpretation of our results.

      b) The data supporting the statement that "pMK represent a small fraction of the total MK population" (cell number or density) could be shown to help contextualize the 10% of them with a cage.

      New bar graphs will be provided to represent the density of MK in the parenchyma against the total MK in the bone marrow.

      c) How "the full assembly of the 3D ECM cage" is defined at this stage of the study should be clarified, specifically regarding the ECM components and structural features that characterize its completion.

      We recognize that the term "full assembly" of the 3D ECM cage can be misleading, as it might suggest different stages of cage formation, such as a completed cage, one that is in the process of formation, or an incomplete cage. Since we have not yet studied this concept, we will eliminate the term "full assembly" from the manuscript to avoid any confusion. Instead, we will simply mention the presence of a cage.

      Point 3. Data on MK Circulation and Cage Integrity: Does the cage require full component integrity to prevent MK release in circulation? Are circulating MKs found in Lama4-/- mice? Is the intravasation affected in these mice? Are the ~50% sinusoid associated MK functional?  

      These are very valid points. We will answer all these questions by performing a detailed analysis of MK localization, vessel association and intravascular MK detection using IF and high-resolution EM imaging of Lamα4-/- mice. Additionally, we will analyze data from Lamα4-/- bone marrow explants to assess the capacity of MKs to extend proplatelets.

      Point 4. Methodology

      a) Details on fixation time are not provided, which is critical as it can impact antibody binding and staining. Including this information would improve reproducibility and feasibility for other researchers.

      We will add this information to the Methods section.

      b) The description of 'random length measuring' is unclear, and the rationale behind choosing random quantification should be explained. Additionally, in the shown image, it appears that only the branching ends were measured, which makes it difficult to discern the randomness in the measurements.

      The random length measurement method uses random sampling to provide unbiased data on laminin/collagen fibers in a 3D cage. Contrary to what the initial image might have suggested, measurements go beyond just the branching ends; they include intervals between various branching points throughout the cage.

      To clarify this process, we will outline these steps: 1) acquire 3D images, 2) project onto 2D planar sections, 3) select random intersection points for measurement, 4) measure intervals using ImageJ software, and 5) repeat the process for a representative dataset. This will better illustrate the randomness of our measurements.
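
      The sampling logic in steps 3 to 5 above can be sketched as follows. This is only an illustrative sketch, not the authors' ImageJ workflow: the branching-point coordinates, the interval list, and the function name `sample_interval_lengths` are hypothetical, and the points are assumed to have already been extracted from a 2D projection.

```python
import math
import random

def sample_interval_lengths(points, intervals, n_samples, seed=0):
    """Pick n_samples intervals at random (without replacement) and
    return the straight-line length of each one.

    points    -- dict mapping a branching-point label to (x, y) coordinates
    intervals -- list of (label_a, label_b) pairs joining two branching points
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    chosen = rng.sample(intervals, n_samples)
    return [math.dist(points[a], points[b]) for a, b in chosen]

# Hypothetical branching points (x, y in micrometers) on a 2D projection.
points = {"p0": (0.0, 0.0), "p1": (3.0, 4.0), "p2": (3.0, 0.0), "p3": (6.0, 4.0)}
# Hypothetical fiber intervals between those branching points.
intervals = [("p0", "p1"), ("p0", "p2"), ("p1", "p2"), ("p1", "p3"), ("p2", "p3")]

lengths = sample_interval_lengths(points, intervals, n_samples=3)
mean_length = sum(lengths) / len(lengths)
print(lengths, mean_length)
```

      Drawing the intervals at random from all intervals in the cage, rather than from the branching ends only, is what keeps the length estimate unbiased; repeating the draw over several cells would build up the representative dataset described in step 5.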

      Point 5.  Figures

      a) Overall, the figures and their corresponding legends would benefit from greater clarity if some panels were split, such as separating images from graph quantifications.

      Following the reviewer’s suggestion, we will fully update all the Figures and separate images from graph quantifications.

      Reviewer #3 (Public review):

      Point 1. The data linking ECM cage formation to MK maturation raises several interesting questions. As the authors mention, MKs have been suggested to mature rapidly at the sinusoids, and both integrin KO and laminin KO MKs appear mislocalized away from the sinusoids. Additionally, average MK distances from the sinusoid may also help separate whether the maturation defects could be in part due to impaired migration towards CXCL12 at the sinusoid. Presumably, MKs could appear mislocalized away from the sinusoid given the data presented suggesting they are leaving the BM and entering circulation. Additional data or commentary on intrinsic (ex-vivo) MK maturation phenotypes may help strengthen the authors' conclusions and shed light on whether an essential function of the ECM cage is integrin activation at the sinusoid.

      The hypothesis of MK migration towards CXCL12 is interesting, although it has recently been challenged by Stegner et al. (2017), who found that MKs are primarily sessile. However, we cannot exclude this possibility. To address the reviewer's concerns, we will quantify the distance of MKs from the sinusoids. This could help to determine whether the maturation defects are due to impaired migration towards CXCL12 at the sinusoids or other factors, such as the ECM cage.

      We would appreciate some clarification regarding the second point raised by the reviewer. Is the question specifically addressing whether the ECM cage has an effect on the activation of integrins in the sinusoids? If so, we will use immunofluorescence (IF) to investigate the relationship between the presence of an ECM cage and the activation of integrins on the surface of endothelial cells within the sinusoids. Thank you for your guidance on this matter.

      Point 2. The data demonstrating that intact MKs enter circulation is intriguing - can the authors comment or provide evidence as to whether MKs are detectable in blood? A quantitative metric may strengthen these observations.

      We will conduct flow cytometry experiments and prepare blood smears to determine whether intact MKs are detectable in blood.

      Point 3. Supplementary Figure 6 shows no effect on in vitro MK maturation, proplatelets, or MK area, but Figures 6B/6C demonstrate an increase in total MK number in MMP-inhibitor treated mice compared to control. Some additional clarification in the text may substantiate the authors' conclusions as to either the source of the MMPs or the in vitro environment not fully reflecting the complex and dynamic niche of the BM ECM in vivo.

      This is a valid point. We will revise the text to include further clarification.

      Point 4. Similarly, one function of the ECM discussed relates to MK maturation, but in the B1/3 integrin KO mice, the presence of the ECM cage is reduced yet there appears to be no significant impact upon maturation (Supplementary Figure 4). By contrast, MMP inhibition in vivo (but not in vitro) reduces MK maturation. These data could be better clarified in the text, or by the addition of experiments addressing whether the composition and quantity of ECM cage components directly inhibit maturation, or whether effects of MMP inhibitors, perhaps leading to over-activation of the integrins (as with the B4galt KO in the discussion), are responsible for the differences in maturation.

      These are very good questions, but they are difficult to assess in situ. To approach this, we will perform in vitro experiments:

      (1) We will vary collagen IV and laminin 411 concentrations in the culture conditions to determine how this affects MK maturation; and

      (2) We will assess the integrin activation states on cultured MKs treated with MMP inhibitors to determine if MMP inhibitors could influence MK maturation through over-activation of integrins.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This paper contains what could be described as a "classic" approach towards evaluating a novel taste stimulus in an animal model, including standard behavioral tests (some with nerve transections), taste nerve physiology, and immunocytochemistry of the tongue. The stimulus being tested is ornithine, from a class of stimuli called "kokumi", which are stimuli that enhance other canonical tastes, essentially increasing the hedonic attributes of these other stimuli; the mechanism for ornithine detection is thought to be GPRC6A receptors expressed in taste cells. The authors showed evidence for this in an earlier paper with mice; this paper evaluates ornithine taste in a rat model.

      Strengths:

      The data show the effects of ornithine on taste: in two-bottle and briefer intake tests, adding ornithine results in a higher intake of most, but not all, stimuli tested. Bilateral nerve cuts or the addition of GPRC6A antagonists decrease this effect. Small effects of ornithine are shown in whole-nerve recordings.

      Weaknesses:

      The conclusion seems to be that the authors have found evidence for ornithine acting as a taste modifier through the GPRC6A receptor expressed on the anterior tongue. It is hard to separate their conclusions from the possibility that any effects are additive rather than modulatory. Animals did prefer ornithine to water when presented by itself. Additionally, the authors refer to evidence that ornithine is activating the T1R1-T1R3 amino acid taste receptor, possibly at higher concentrations than they use for most of the study, although this seems speculative. It is striking that the largest effects on taste are found with the other amino acid (umami) stimuli, leading to the possibility that these are largely synergistic effects taking place at the tas1r receptor heterodimer.

      We would like to thank Reviewer #1 for the valuable comments. Our basis for considering ornithine as a taste modifier stems from our observation that a low concentration of ornithine (1 mM), which does not elicit a preference on its own, enhances the preference for umami substances, sucrose, and soybean oil through the activation of the GPRC6A receptor. Notably, this receptor is not typically considered a taste receptor. The reviewer suggested that the enhancement of umami taste might be due to potentiation occurring at the TAS1R receptor heterodimer. However, we propose that a different mechanism may be at play, as an antagonist of GPRC6A almost completely abolished this enhancement. In the revised manuscript, we will endeavor to provide additional information on the role of ornithine as a taste modifier acting through the GPRC6A receptor.

      Reviewer #2 (Public review):

      Summary:

      The authors used rats to determine the receptor for a food-related perception (kokumi) that has been characterized in humans. They employ a combination of behavioral, electrophysiological, and immunohistochemical results to support their conclusion that ornithine-mediated kokumi effects are mediated by the GPRC6A receptor. They complemented the rat data with some human psychophysical data. I find the results intriguing, but believe that the authors overinterpret their data.

      Strengths:

      The authors examined a new and exciting taste enhancer (ornithine). They used a variety of experimental approaches in rats to document the impact of ornithine on taste preference and peripheral taste nerve recordings. Further, they provided evidence pointing to a potential receptor for ornithine.

      Weaknesses:

      The authors have not established that the rat is an appropriate model system for studying kokumi. Their measurements do not provide insight into any of the established effects of kokumi on human flavor perception. The small study on humans is difficult to compare to the rat study because the authors made completely different types of measurements. Thus, I think that the authors need to substantially scale back the scope of their interpretations. These weaknesses diminish the likely impact of the work on the field of flavor perception.

      We would like to thank Reviewer #2 for the valuable comments and suggestions. Regarding the question of whether the rat is an appropriate model system for studying kokumi, we have chosen this species for several reasons: it is readily available as a conventional experimental model for gustatory research; the calcium-sensing receptor (CaSR), known as the kokumi receptor, is expressed in taste bud cells; and prior research has demonstrated the use of rats in kokumi studies involving gamma Glu-Val-Gly (Yamamoto and Mizuta, Chem. Senses, 2022).

      We acknowledge that fundamentally different types of measurements were conducted in the human psychophysical study and the rat study. Kokumi can indeed be assessed and expressed in humans; however, we do not currently have the means to confirm that animals experience kokumi in the same way that humans do. Therefore, human studies are necessary to evaluate kokumi, a conceptual term denoting enhanced flavor, while animal studies are needed to explore the potential underlying mechanisms of kokumi. We believe that a combination of both human and animal studies is essential, as is the case with research on sugars. While sugars are known to elicit sweetness, it is unclear whether animals perceive sweetness identically to humans, even though they exhibit a strong preference for sugars. In the revised manuscript, we will incorporate additional information to address the comments raised by the reviewer. We will also carefully review and revise our previous statements to ensure accuracy and clarity.

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors set out to investigate whether GPRC6A mediates kokumi taste initiated by the amino acid L-ornithine. They used Wistar rats, a standard laboratory strain, as the primary model and also performed an informative taste test in humans, in which miso soup was supplemented with various concentrations of L-ornithine. The findings are valuable and overall the evidence is solid. L-Ornithine should be considered a useful test substance in future studies of kokumi taste, and the class C G protein-coupled receptor known as GPRC6A (C6A), along with its homolog, the calcium-sensing receptor (CaSR), should be considered candidate mediators of kokumi taste.

      Strengths:

      The overall experimental design is solid based on two-bottle preference tests in rats. After determining the optimal concentration for L-Ornithine (1 mM) in the presence of MSG, it was added to various tastants, including inosine 5'-monophosphate; monosodium glutamate (MSG); mono-potassium glutamate (MPG); intralipos (a soybean oil emulsion); sucrose; sodium chloride (NaCl); citric acid and quinine hydrochloride. Robust effects of ornithine were observed in the cases of IMP, MSG, MPG, and sucrose, and little or no effects were observed in the cases of sodium chloride, citric acid, and quinine HCl. The researchers then focused on the preference for Ornithine-containing MSG solutions. The inclusion of the C6A inhibitors Calindol (0.3 mM but not 0.06 mM) or the gallate derivative EGCG (0.1 mM but not 0.03 mM) eliminated the preference for solutions that contained Ornithine in addition to MSG. The researchers next performed transections of the chorda tympani nerves (with sham operation controls) in anesthetized rats to identify the role of the chorda tympani branches of the facial nerves (cranial nerve VII) in the preference for Ornithine-containing MSG solutions. This finding implicates the anterior half to two-thirds of the tongue in ornithine-induced kokumi taste. They then used electrical recordings from intact chorda tympani nerves in anesthetized rats to demonstrate that ornithine enhanced MSG-induced responses following the application of tastants to the anterior surface of the tongue. They went on to show that this enhanced response was insensitive to amiloride, selected to inhibit 'salt tastant' responses mediated by the epithelial Na+ channel, but eliminated by Calindol. Finally, they performed immunohistochemistry on sections of rat tongue demonstrating C6A-positive spindle-shaped cells in fungiform papillae that partially overlapped in their distribution with the IP3 type-3 receptor, used as a marker of Type-II cells, but not with (i) gustducin, the G protein partner of Tas1 receptors (T1Rs), used as a marker of a subset of type-II cells; or (ii) 5-HT (serotonin) and Synaptosome-associated protein 25 kDa (SNAP-25) used as markers of Type-III cells.

      Weaknesses:

      The researchers undertook what turned out to be largely confirmatory studies in rats with respect to their previously published work on Ornithine and C6A in mice (Mizuta et al Nutrients 2021).

      The authors point out that animal models pose some difficulties of interpretation in studies of taste and raise the possibility in the Discussion that umami substances may enhance the taste response to ornithine (Line 271, Page 9). They miss an opportunity to outline the experimental results from the study that favor their preferred interpretation that ornithine is a taste enhancer rather than a tastant.

      At least two other receptors in addition to C6A might mediate taste responses to ornithine: (i) the CaSR, which binds and responds to multiple L-amino acids (Conigrave et al, PNAS 2000), and which has been previously reported to mediate kokumi taste (Ohsu et al., JBC 2010) as well as responses to Ornithine (Shin et al., Cell Signaling 2020); and (ii) T1R1/T1R3 heterodimers which also respond to L-amino acids and exhibit enhanced responses to IMP (Nelson et al., Nature 2001). While the experimental results as a whole favor the authors' interpretation that C6A mediates the Ornithine responses, they do not make clear either the nature of the 'receptor identification problem' in the Introduction or the way in which they approached that problem in the Results and Discussion sections. It would be helpful to show that a specific inhibitor of the CaSR failed to block the ornithine response. In addition, while they showed that C6A-positive cells were clearly distinct from gustducin-positive, and thus T1R-positive cells, they missed an opportunity to clearly differentiate C6A-expressing taste cells and CaSR-expressing taste cells in the rat tongue sections.

      It would have been helpful to include a positive control kokumi substance in the two-bottle preference experiment (e.g., one of the known gamma-glutamyl peptides such as gamma-glu-Val-Gly or glutathione), to compare the relative potencies of the control kokumi compound and Ornithine, and to compare the sensitivities of the two responses to C6A and CaSR inhibitors.

      The results demonstrate that enhancement of the chorda tympani nerve response to MSG occurs at substantially greater ornithine concentrations (10 and 30 mM) than were required to observe differences in the two-bottle preference experiments (1.0 mM; Figure 2). The discrepancy requires careful discussion and, if necessary, further experiments using the two-bottle preference format.

      We would like to thank Reviewer #3 for the valuable comments and helpful suggestions. We propose that ornithine has two stimulatory actions: one acting on GPRC6A, particularly at lower concentrations, and another on amino acid receptors such as T1R1/T1R3 at higher concentrations. Consequently, ornithine is not preferable at lower concentrations but becomes preferable at higher concentrations. For our study on kokumi, we used a low concentration (1 mM) of ornithine. The possibility mentioned in the Discussion that 'the umami substances may enhance the taste response to ornithine' is entirely speculative. We will reconsider including this description in the revised version. As the reviewer suggested, in addition to GPRC6A, ornithine may bind to CaSR and/or T1R1/T1R3 heterodimers. However, we believe that ornithine mainly binds to GPRC6A, as a specific inhibitor of this receptor almost completely abolished the enhanced response to umami substances, and our immunohistochemical study indicated that GPRC6A-expressing taste cells are distinct from CaSR-expressing taste cells (see Supplemental Fig. 3). We conducted essentially the same experiments using gamma-Glu-Val-Gly in Wistar rats (Yamamoto and Mizuta, Chem. Senses, 2022) and compared the results in the Discussion. The reviewer may have misunderstood the chorda tympani results: we added the same concentration (1 mM) used in the two-bottle preference test to MSG (Fig. 5-B). Fig. 5-A shows nerve responses to five concentrations of plain ornithine. In the revised manuscript, we will strive to provide more precise information reflecting the reviewer’s comments.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The behavioral effects found with the GPRC6A antagonists are not entirely convincing, as the antagonist is seemingly just mixed up in the solution with the stimuli. There are no control experiments demonstrating that the antagonists do not have a taste themselves.

      We mixed the antagonists into both liquids used in the two-bottle preference test to eliminate any potential taste effects of the antagonists themselves. In the electrophysiological experiments, the antagonist was incorporated into the solution after confirming that it did not elicit any appreciable response in the taste nerve.

      (2) The effects of ornithine found with quinine did not have a satisfying explanation - if there is some taste cell-taste cell modulation that accounts for the taste enhancement, why is the quinine less aversive? Why is it not enhanced like the other compounds?

      The effects of ornithine on quinine responses remain difficult to explain. A previous study (Tokuyama et al., Chem Pharm Bull, 2006) proposed that ornithine prevents bitter substances from binding to bitter receptors, although this hypothesis lacks definitive evidence. In the present study, our findings suggest that the binding of quinine to bitter receptors is essential, as another agonist, gallate, also enhanced the preference for quinine, but this effect was abolished by EGCG, a GPRC6A antagonist (see Supplemental Fig. 2).

      (3) Unless I am missing something, there appears to be no quantitative analysis of the immunocytochemical data, just assertions.

      We have added quantitative analyses to the revised text, and the following sentences have been added: “Approximately 11% of GPRC6A-positive cells overlapped with IP3R3 (9 double-positive cells/80 GPRC6A-positive cells), while approximately 8.3% of IP3R3-positive cells expressed GPRC6A (9 double-positive cells/109 IP3R3-positive cells). In addition, GPRC6A-positive cells were unlikely to colocalize with α-gustducin, another marker for a subset of type II cells, in single taste cells (0 double-positive cells/93 GPRC6A-positive cells). Regarding type III cell markers, GPRC6A-positive cells were unlikely to colocalize with 5-HT in single taste cells (0 double-positive cells/75 GPRC6A-positive cells).”
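
The overlap percentages quoted above follow directly from the reported cell counts. As a minimal sketch for readers who want to check them (the function name `overlap_percent` and the dictionary layout are illustrative assumptions, not part of the authors' analysis), the arithmetic can be reproduced as follows:

```python
def overlap_percent(double_positive: int, total: int) -> float:
    """Return the percentage of `total` cells that are double-positive."""
    if total <= 0:
        raise ValueError("total must be a positive cell count")
    return 100.0 * double_positive / total


# Cell counts as quoted in the response: (double-positive cells, marker-positive cells).
counts = {
    "GPRC6A+ cells that are also IP3R3+": (9, 80),
    "IP3R3+ cells that are also GPRC6A+": (9, 109),
    "GPRC6A+ cells that are also alpha-gustducin+": (0, 93),
    "GPRC6A+ cells that are also 5-HT+": (0, 75),
}

for label, (double_pos, total) in counts.items():
    print(f"{label}: {overlap_percent(double_pos, total):.2f}%")
```

The two IP3R3 ratios are computed against different denominators (80 GPRC6A-positive cells versus 109 IP3R3-positive cells), which is why the same nine double-positive cells give two different percentages.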

      (4) The hallmarks of Kokumi taste include descriptors such as "thickness", and "mouthfeel", which sound like potential somatosensory attributes. Perhaps the authors should consider this possibility for at least some of the effects found.

      The term kokumi, a Japanese word, refers to a phenomenon in which the flavor of complexly composed foods is enhanced through certain processes, making them more delicious. To date, kokumi has been described using the representative terms thickness, mouthfulness, and continuity, originally introduced in the first paper on kokumi by Ueda et al. (1990). However, these terms are derived from Japanese and may not fully convey the nuances of the original language when translated into these simple English words. In particular, thickness is often interpreted as referring to physical properties such as viscosity or somatosensory sensations. Since kokumi inherently lacks somatosensory elements, this revised paper adopts alternative terms and explanations for the three components of kokumi to prevent misunderstanding and confusion.

      Therefore, to clarify that kokumi attributes are inherently gustatory, thickness is replaced with intensity of whole complex tastes (rich flavor with complex tastes), emphasizing the synergistic effects of a variety of tastes rather than the mere enhancement of a single flavor. Mouthfulness is clarified as not referring to mouthfeel (the tactile sensation a food gives in the mouth) but rather as spread of taste and flavor throughout the oral cavity, describing how the flavor fills the mouth. Continuity is replaced with persistence of taste (lingering flavor).

      (5) I don't think the human experiment (S1) belongs to the paper, even as a supplementary bit of data. It's only 17 subjects, they are all female, and we don't know anything about how they were selected, even though it states they are all students/staff at Kio. Were any of them lab members? Were they aware of the goals of the experiment? Could simply increasing the amount of solute in the soup make it seem thicker? This (sparse) data seems to have been shoehorned into the paper without enough detail/justification.

      Despite the reviewer’s suggestion, we would like to include the human experiment because the rationale of the present study is to confirm, through a human sensory test, that the kokumi of a complex solution (in this case, miso soup) is enhanced by the addition of ornithine. This is followed by basic animal experiments to investigate the underlying mechanisms. Therefore, this human study serves an important role.

      The total number of participants increased to 22 (19 women and three men) following an additional experiment with 5 new participants. New results have been shown in Supplemental Figure 1 with statistical analyses. The rewritten parts are as follows:

      We recruited 22 participants (19 women and three men, aged 21-28 years) from Kio University who were not affiliated with our laboratory, including students and staff members. All participants passed a screening test based on taste sensitivity. According to the responses obtained from a pre-experimental questionnaire, we confirmed that none of the participants had any sensory abnormalities, eating disorders, or mental disorders, or were taking any medications that may potentially affect their sense of taste. All participants were instructed not to eat or drink anything for 1 hour prior to the start of the experiment. We provided them with a detailed explanation of the experimental procedures, including safety measures and personal data protection, without revealing the specific goals of the study.

      (6) The introduction could be more concise - for example, when describing Kokumi stimuli such as ornithine and its possible receptors, the authors do not need to add the detail about how this stimulus was deduced from adding clams to the soup. Details like this can be reserved for the discussion.

      Thank you for this comment. We have tried to shorten the Introduction.

      (7) Line 86: awkward phrasing - this doesn't need to be a rhetorical question.

      We have deleted the sentence.

      (8) Supplementary Figure 1: The labels on the figure say "Miso soup in 1 mM Orn" when the Orn is dissolved into the soup.

      Thank you for pointing out our mistake. We have changed the label to “1 mM Orn in miso soup”.

      Reviewer #2 (Recommendations for the authors):

      Major concerns

      (1) The impact of "kokumi" taste ligands on food perception appears to be profound in humans. This observation is fascinating because it implies that molecules like ornithine impact a variety of flavor perceptions, some of which are non-gustatory in nature (e.g., spread, mouthfulness and harmony). What remains unclear is whether "kokumi" ligands produce analogous sensations in rodents. If they don't, then rodents are an inappropriate model system for studying the impact of kokumi on flavor perceptions. The authors fail to address this key issue, and uncritically assume that kokumi ligands produce sensations like thickness, mouthfulness, and continuity in rodents. For this reason, the authors' reference to GPRC6A as a kokumi receptor is inappropriate.

      Thank you very much for the valuable comments. The term kokumi refers to a phenomenon in which the flavor of complexly composed foods is enhanced through certain processes, making them more delicious. It is an important concept in the field of food science, which studies how to make prepared dishes more enjoyable. Kokumi is also considered a higher-order, profound cognitive function evaluated by humans who experience a wide variety of foods. However, it is unclear whether animals, particularly experimental animals, can perceive kokumi in the same way humans do.

      To date, kokumi has been described using the representative terms thickness, mouthfulness, and continuity, originally introduced in the first paper on kokumi by Ueda et al. (1990). However, these terms are derived from Japanese and may not fully convey the nuances of the original language when translated into these simple English words. In particular, thickness is often interpreted as referring to physical properties such as viscosity or somatosensory sensations. Since kokumi inherently lacks somatosensory elements, this revised paper adopts alternative terms and explanations for the three components of kokumi to prevent misunderstanding and confusion.

      Therefore, to clarify that kokumi attributes are inherently gustatory, thickness is replaced with intensity of whole complex tastes (rich flavor with complex tastes), emphasizing the synergistic effects of a variety of tastes rather than the mere enhancement of a single flavor. Mouthfulness is clarified as not referring to mouthfeel (the tactile sensation a food gives in the mouth) but rather as spread of taste and flavor throughout the oral cavity, describing how the flavor fills the mouth. Continuity is replaced with persistence of taste (lingering flavor).

      Rodents are thought to possess basic taste functions similar to humans, such as the expression of taste receptors, including kokumi receptors, in taste cells. Regardless of whether rodents can perceive kokumi, findings from studies on rodents may provide insights into aspects of the kokumi concept as experienced by humans.

      Indeed, the results of this study indicate that ornithine enhances umami, sweetness, fat taste, and saltiness, leading to the enhancement of complex flavors—referred to as intensity of whole taste. The activation of various taste cells, resulting in the enhancement of multiple tastes, may contribute to the sensation of flavors spreading throughout the oral cavity. Furthermore, the strong enhancement of MSG and MPG suggests that glutamate contributes to the mouthfulness and persistence of taste characteristic of kokumi.

      (2) A related concern is that the authors did not make any measurements that model kokumi sensations documented in the literature. For example, they would need to develop behavioral/electrophysiological measurements that reflect the known effects of kokumi ligands on flavor perception (i.e., increases in intensity, spread, continuity, richness, harmony, and punch). For example, ornithine is thought to produce more "punch" (i.e., a more rapid rise in intensity). This could be manifested as a more rapid rise in peripheral taste response or a more rapid fMRI response in the taste cortex. Alternatively, ornithine is thought to increase "continuity" (i.e., make the taste response more persistent). This response would presumably be manifested as a peripheral taste response that adapts more slowly or a more persistent fMRI response. As it stands, the authors have documented that ornithine increases (i) the preference of rats for some chemical stimuli, but not others; and (ii) the response of the CT nerve to some but not all taste stimuli.

      In animal experiments, it is challenging to examine each attribute of kokumi. The increase of complex tastes can be investigated through behavioral experiments and neural activity recordings. However, phenomena such as spread or harmony, which arise from profound human judgments, are difficult to validate in animal studies.

      While it was possible to examine persistence through neural responses to tastants, all stimuli were rinsed at 30 seconds after onset of stimulation, so the exact duration of persistence was not investigated. However, since the MSG response was enhanced approximately 1.5 times with the addition of ornithine, it is strongly suggested that the duration might also have been prolonged.

      Regarding punch, no differences were observed in the neural responses when ornithine was added, likely because the phasic response already had a rapid onset.

      In the context of fMRI studies, there has been a report that adding glutathione to mixtures of umami and salt solutions increases responses (Goto et al. Chem Senses, 2016). However, research specifically examining the attributes of kokumi has not yet been reported.

      (3) The quality of the SNAP-25 immunohistochemistry is poor (see Figure 7D), with lots of seemingly nonspecific staining in and outside the taste bud.

      The quality of the SNAP-25 immunostaining is not poor. SNAP-25 is known to label not only type III cells but also the dense network of intragemmal nerve fibers (Tizzano et al., Immunohistochemical Analysis of Human Vallate Taste Buds. Chem Senses 40:655-60, 2015). Therefore, the seemingly nonspecific staining is due to intense SNAP-25 immunoreactivity of the nerve fibers.

      (4) The authors need to drastically scale back the scope of their conclusions. What they can say is that ornithine appears to enhance the taste responses of rats to a variety of taste stimuli and that this effect appears to be mediated by the GPRC6A receptor. They cannot use their data to address kokumi effects in humans, as they have not attempted to model any of these effects. Given the known problems with pharmacological blocking agents (e.g., nonspecificity), the authors would significantly strengthen their case if they could generate similar results in a GPRC6A knockout mouse.

      Our research approach begins with confirming in humans that the addition of ornithine to complex foods (such as miso soup) induces kokumi. Based on this confirmation, we conduct fundamental studies using animal models to investigate the peripheral taste mechanisms underlying the expression of kokumi.

      It is possible that the key to kokumi expression lies in the enhancement of desirable tastes (particularly umami) and the suppression of unpleasant tastes. Moving forward, we will deepen our fundamental research on the action of ornithine mediated through GPRC6A, including studies using knockout mice.

      (5) The introduction is too long. Much of the discussion of kokumi perception in humans should either be removed or shortened considerably.

      Following the reviewer’s suggestion, the introduction has been shortened.

      (6) I recommend that the authors break up the Methods and Results sections into different experiments. This would enable the authors to provide separate rationales for each procedure. For instance, the authors conducted a variety of different behavioral procedures (e.g., long- and short-term preference tests, and preference tests with and without GPRC6A receptor antagonists).

      Rather than following the reviewer’s suggestion, we have added subheadings to describe the purpose of each experiment. This approach would help readers better understand the experimental flow, as each experiment is relatively straightforward.

      (7) The inclusion of the human data is odd for two reasons. First, the measurements used to assess the impact of ornithine on flavor perception in humans were totally different than those used in rats. This makes it impossible to compare the human and rat datasets. Second, the human study was rather limited in scope, had small effect sizes, and had a lot of individual variation. For these reasons, the human data are not terribly helpful. I recommend that the authors remove the human data from this paper, and publish them as part of a more extensive study on humans.

      Despite the reviewer’s suggestion, we would like to include the human experiment because the rationale of the present study is to confirm, through a human sensory test, that the kokumi of a complex solution (in this case, miso soup) is enhanced by the addition of ornithine. This is followed by basic animal experiments to investigate the underlying mechanisms. Therefore, this human study serves an important role. The considerable variation in the scores suggests that evaluating the three kokumi attributes is challenging and likely influenced by differences in judgment criteria among participants.

      The total number of participants increased to 22 (19 women and three men) following an additional experiment with 5 new participants. New results have been shown in Supplemental Figure 1 with statistical analyses. The rewritten parts are as follows:

      We recruited 22 participants (19 women and three men, aged 21-28 years) from Kio University who were not affiliated with our laboratory, including students and staff members. All participants passed a screening test based on taste sensitivity. According to the responses obtained from a pre-experimental questionnaire, we confirmed that none of the participants had any sensory abnormalities, eating disorders, or mental disorders, or were taking any medications that may potentially affect their sense of taste. All participants were instructed not to eat or drink anything for 1 hour prior to the start of the experiment. We provided them with a detailed explanation of the experimental procedures, including safety measures and personal data protection, without revealing the specific goals of the study.

      (8) While the use of English is generally good, there are many instances where the English is a bit awkward. I recommend that the authors ask a native English speaker to edit the text.

      Thank you for this comment. The text has been edited by a native English speaker.

      Minor concerns

      (1) Lines 13-14: The authors state that "the concept of 'kokumi' has garnered significant attention in gustatory physiology and food science." This is an exaggeration. Kokumi has generated considerable interest in food science but has yet to generate much interest in gustatory physiology.

      We have rewritten this part: “The concept of “kokumi” has generated considerable interest in food science but kokumi has not been well studied in gustatory physiology.”

      (2) Line 20: The use of "specific taste" is unclear in this context. The authors indicate (in Figure 5A) that 1 mM ornithine generates a CT nerve response. They also reveal (in Figure 1A) that rats do not prefer 1 mM ornithine over water. The results from a preference test do not provide insight into whether a solution can be tasted; they merely demonstrate a lack of preference for that solution. Based on these data, the authors cannot infer that 1 mM ornithine cannot be tasted.

      We agree with the reviewer’s comment. Ornithine at 1 mM concentration may have a weak taste because this solution elicited a small neural response (Fig. 5-A). We have rewritten the text: “… at a concentration without preference for this solution.”

      (3) Line 44: Sensory information from foods enters the oral and the nasal cavity.

      The nasal cavity has been added.

      (5) Line 59: The terms "thickness", "mouthfulness" and "continuity" are not intuitive in English, and may reflect, at least in part, a failure in translation. The word thickness implies a tactile sensation (e.g., owing to high viscosity), but the authors use it to indicate a flavor that is more intense and onsets more quickly. The word mouthfulness is supposed to indicate that a flavor is experienced throughout the oral cavity. The problem here is that this happens with all tastants, independent of the presence of substances like ornithine. Indeed, taste buds occur in a limited portion of the oral epithelium, but we nevertheless experience tastes throughout the oral cavity, owing to a phenomenon called tactile referral (see the following reference: Todrank and Bartoshuk, 1991, "A taste illusion: taste sensation localized by touch," Physiology & Behavior 50:1027-1031). The word continuity does not imply that the taste is long-lasting or persistent.

      These three attributes were originally introduced by Ueda et al. (1990), who translated Japanese terms describing the profound characteristics of kokumi, which are deeply rooted in Japanese culinary culture. However, these simply translated terms have caused global misunderstanding and confusion, because they sound like somatosensory rather than gustatory descriptions. Therefore, to clarify that kokumi attributes are inherently gustatory, in the revised version we use the terms “intensity of whole complex tastes (rich flavor with complex tastes)” instead of thickness, “mouthfulness (spread of taste and flavor throughout the oral cavity),” and “persistence of taste (lingering flavor)” instead of continuity.

      The results of this study indicate that ornithine enhances umami, sweetness, fat taste, and saltiness, leading to the enhancement of complex flavors—referred to as intensity of whole taste. The activation of various taste cells, resulting in the enhancement of multiple tastes, may contribute to the sensation of flavors spreading throughout the oral cavity. Furthermore, the strong enhancement of MSG and MPG suggests that glutamate contributes to the mouthfulness and persistence of taste characteristic of kokumi.

      (6) Figure legends: The authors provide results of statistical comparisons in several of the figures. They need to explain what statistical procedures were performed. As it stands, it is impossible to interpret the asterisks provided.

      We have explained statistical procedures in each Figure legend.

      (7) I did not see any reference to the sources of funding or any mention of potential conflicts of interest.

      We have added the following information:

      Funding: JSPS KAKENHI Grant Numbers JP17K00935 (to TY) and JP22K11803 (to KU).

      Declaration of interests: The authors declare that they have no competing interests.

      Reviewer #3 (Recommendations for the authors):

      (1) I suggest that the authors increase their level of interest in glutathione and gamma-glutamyl peptides. This might include an appropriate gamma-glutamyl control substance in the two-bottle preference study (see Public Review). It might also include more careful attention to the work that identified glutathione as an activator of the CaSR (Wang et al., JBC 2006) and the nature of its binding site on the CaSR which overlaps with its site for L-amino acids (Broadhead et al., JBC 2011). This latter article also identified S-methyl glutathione, in which the free-SH group is blocked, as a high-potency activator of the CaSR. It would be expected to show comparable potency to gamma-glu-Val-Gly in assays of kokumi taste.

      We have appropriately referenced glutathione and gamma-Glu-Val-Gly, potent agonists of CaSR, where necessary. In our previous study (Yamamoto and Mizuta, Chem Senses, 2022), we examined the additive effects of these substances on basic taste stimuli in rodents, and the results were compared in greater detail with those obtained from the addition of ornithine in the present study. We have also discussed the potential binding of ornithine to other receptors, including CaSR and T1R1/T1R3 heterodimers.

      (2) Figures:

      - None of the figures were labelled with their Figure numbers. I have inferred the Figure numbers from the legends and their positions in the pdf.

      We are sorry for this inconvenience.

      - The labelling of Figure 1 and Figure 2 are problematic. In Figure 1 it should be made clear that the horizontal axes refer to the Ornithine concentration. In Figure 2 it should be made clear that the horizontal axes refer to the tastant concentrations (MSG, IMP, etc) and that the Ornithine concentrations were fixed at either zero or 1.0 mM.

      We are sorry for the lack of information about the horizontal axes. We have explained the horizontal axes in figure legends in Figs. 1 and 2. The labelling of both figures has also been modified to make this clear.

      - Figure 3B: 'Control' should appear at the top of this panel since the panels that follow all refer to it.

      Following the reviewer’s suggestion, we have added ‘Control’ at the top of Figure 3B.

      - Figure 5A. Provide a label for the test substance, presumably Ornithine.

      Yes, we have added ‘Ornithine’.

      - Figure 7 would be strengthened by the inclusion of immunohistochemistry analyses of the CaSR.

      We are sorry that we did not perform immunohistochemistry for the CaSR, because a previous study had already analyzed CaSR expression in rat taste cells precisely. We have analyzed the co-expression of GPRC6A and CaSR (see Supplemental Figure 3).

      (3) Other Matters:

      - Line 38: list the five basic taste modalities here.

      Yes, we have included the five basic taste modalities here.

      - Line 107: 'even if ... kokumi ... is less developed in rodents' - if there is evidence that kokumi is less developed in rodents it should be cited here.

      We cannot cite any references here because no studies have compared the perception of kokumi between humans and rodents.

      - Line 308: 'recently we conducted experiments in rats using gallate ...' - the authors appear to imply that they performed the research in Reference 43, however, I was unable to find an overlap between the two lists of authors.

      We did not conduct the research in Reference 43 (Reference 40 in the revised paper). Because Reference 43 showed that gallate is an agonist of GPRC6A, we were interested in conducting similar behavioral experiments using gallate instead of ornithine.

      The sentences have been rewritten to avoid misunderstanding.

      - Line 506: the sections are said to be 20 mm thick - should this read 20 micrometers?

      Thank you. We have changed to 20 micrometers.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Manuscript number: RC-2024-02767

      Corresponding author(s): Kazuaki Maruyama

      1. General Statements

      Response to Reviewer #1:

      We sincerely appreciate your thoughtful review of our manuscript. Our primary objective is to elucidate the pathogenic mechanisms underlying congenital low-flow vascular malformations, thereby informing the development of novel therapeutic strategies. We recognize that, given the dual nature of our study encompassing both fundamental and clinical science, the presentation may have appeared somewhat convoluted. In response, we have revised the manuscript to clarify these points and have reformatted the text corresponding to your comments—originally presented as a single continuous block—into defined, numbered sections to enhance readability.

      Response to Reviewer #2:

      We are deeply grateful for the time and effort you have dedicated to reviewing our manuscript despite your busy schedule. Your comments have been particularly insightful, especially regarding the section on the preclinical mouse model. In light of your suggestions, we have conducted additional experiments and revised the manuscript accordingly. We trust that these modifications address your concerns and contribute to the overall improvement of our work.

      The revised sections have been highlighted in red in the text.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors investigate the pathogenesis of congenital vascular malformations by overexpressing the Pik3caH1047R mutation under the R26 locus in different cell populations and developmental stages using various Cre and CreERT2 lines, including endothelial-specific and different mesoderm precursor lines. The authors provide a thorough characterization of the vascular malformation phenotypes across models. Specifically, they claim that expressing Pik3caH1047R in the cardiopharyngeal mesoderm (CPM) precursors results in vascular abnormalities localized to the head and neck region of the embryo. The study also includes scRNAseq data analyses, including from previously published data and new data generated by the authors. Trajectory inference analysis of a previous scRNA-seq dataset revealed that Isl1+ mesodermal cells can differentiate into ETV2+ cells, directly giving rise to Prox1+ lymphatic endothelial cell progenitors, bypassing the venous stage. Single-cell RNA sequencing of their CPM model and other in vitro datasets show that Pik3caH1047R upregulates VEGF-A via HIF-1α-mediated hypoxia signaling, findings further corroborated in human samples. Finally, preclinical studies in adult mice confirm that pharmacological inhibition of HIF-1α and VEGF-A reduces the number and size of mutant vessels.

      Major comments

      1. While the study provides a nice characterization of Pik3caH1047R-derived vascular phenotypes induced by expressing this mutation in different cells, the main message of the study is unclear. What is the main question that the authors want to address with this manuscript?

      Response:

      Our main message is as follows:

      1. Elucidation of pathogenesis based on developmental cellular origins: This study focuses on using embryonic models to elucidate the mechanism by which the Pik3caH1047R mutation induces low-flow vascular malformations. Specifically, we demonstrate that expression of Pik3caH1047R in cells derived from the cardiopharyngeal mesoderm (CPM) induces vascular abnormalities that are confined to the head and neck region. Furthermore, vascular malformations originating from another cell type (for example, Pax3+ cells) are confined to the lower body. This suggests that the embryonic origin of endothelial cells may determine the anatomical location of vascular malformations, with important implications for clinical severity and treatment strategies.

      2. Molecular signaling pathways and targeted therapeutic approaches:

      Through single-cell RNA sequencing, we have identified hypoxia signaling—particularly via HIF-1α and VEGF-A—as central to the pathogenesis of these malformations. Moreover, preclinical mouse model experiments demonstrate that pharmacological inhibition of HIF-1α and VEGF-A significantly reduces lesion formation, supporting the potential of targeting these pathways as a novel therapeutic strategy.

      In summary, our main message is that by elucidating the developmental and molecular mechanisms underlying Pik3caH1047R-driven low-flow vascular malformations, especially the pivotal role of hypoxia signaling via HIF-1α/VEGF-A, we provide a strong rationale for novel therapeutic strategies aimed at these challenging conditions.

      To further clarify these points, we have revised the manuscript by incorporating additional experiments and reorganizing the text into clearly defined sections.

      Is it the precursor type from which these lesions arise, that venous and lymphatic malformations emerge independently, or when and where this phenotype appears?

      Response:

      In Tie2-Cre; R26R-Pik3caH1047R mutant embryos, no prominent phenotype was observed at E9.5 or E11.5. Vascular (venous) malformations are evident from E12.5, whereas lymphatic malformations become prominent from E13.5. We propose that the emergence of the lymphatic phenotype after E13.5 is due to the fact that lymphatic vessels, particularly in the upper body, begin forming a luminal structure mainly from E13.5 onward (Maruyama et al, 2022). For further details, please refer to the explanation provided in Question 6.

      To address this, we have newly included Supplemental Figure 2 and revised the Results section as follows:

      Whereas clear phenotypes were evident at E12.5 and E13.5, no pronounced external abnormalities were observed at E9.5 or E11.5 (Supplemental Figure 2A–B). Similarly, histological examination revealed no significant differences in the short-axis diameter of the PECAM+ CV or in the number of Prox1+ LECs surrounding the CV between control and mutant embryos at E11.5 (Supplemental Figure 2C–F). We also assessed Tie2-Cre; R26R-Pik3caH1047R mutant embryos at E14.0 from five pregnant mice. Only two embryos were alive at this stage, and both showed severe edema and hemorrhaging, indicating they were nearly moribund. These observations suggest that the critical point for survival of these mutant embryos lies between E13.5 and E14.0 (Supplemental Figure 2G). (Page 5, lines 157–165)

      The manuscript needs some work to make the sections more cohesive and to structure better the main findings and the rationale for choosing the models. Authors should explain better when and where the pathogenic phenotypes refer to blood and/or lymphatic malformations. From the quantifications provided in Figure 1, Pik3caH1047R leads to different phenotypes in blood and lymphatic vessels. These are larger diameters with no difference in the number of blood vessels (are you quantifying all pecam1 positive? Vein, arteries, capillaries?), and an increase in the number of lymphatics vessels. Please clarify and discuss.

      Response:

      We interpreted this as a question regarding which vessels were quantified. The answer to this question is provided in Question 4.

      Which vessel types are considered for the quantifications shown in Fig. 1I, M, Q? All Pecam1+ vessels, including lymphatic, vein, capillaries and arteries or which ones? Provide clarifications.

      Response:

      Vessel types were characterized based on anatomical and histological features. For the anatomical details, we referred to The Atlas of Mouse Development by M.H. Kaufman.

      This aspect is described in the Methods section, as follows:

      Veins and arteries were classified based on anatomical criteria. Vessels demonstrating continuity with a clearly identifiable vein (e.g., the anterior cardinal vein) in serial sections were defined as veins. In contrast, the aorta and pulmonary artery, each exhibiting a distinct wall structure indicative of a direct connection to the heart, were designated as arteries. Lymphatic vessels were identified based on the combined expression of Prox1, VEGFR3, and PECAM, along with the developmental stage, morphology, and anatomical location as described in our previous studies (Maruyama et al, 2019, 2022, 2021). PECAM+ vessels that lacked a definitive wall structure, did not express lymphatic markers, or did not exhibit clearly identifiable continuity necessary for classification as veins or capillaries were collectively designated as blood vessels or vasculatures. (Page 16, lines 530-539)

      Regarding Figure 1I:

      In the tongue and mandible, the facial vein—which branches from the anterior cardinal vein—is dilated, and its continuity with the venous system is confirmed. In contrast, Figure 1J shows the number of PECAM+ vasculatures; however, for smaller vessels, continuity is not always demonstrable, so these are designated as vasculatures according to the criteria.

      Regarding Figures 1M and N:

      In the liver, the dilated vessels are classified as veins because they exhibit continuity with the inferior vena cava. Even in the control group, the central veins tend to have relatively large diameters. Therefore, we compared the average area and quantified the number of abnormal central veins—defined as those contiguous with a vein and exceeding a specified area.

      Regarding Figures 1Q and R:

      Cerebral vessels are classified as veins due to their continuity with the common cardinal and jugular veins. However, as these vessels extend into the periphery, this continuity becomes less distinct, and they are consequently designated as blood vessels lacking Prox1 expression.

      The authors propose that the CPM model results in localized head and neck vascular malformations. However, I am not convinced. The images supporting the neck defects are evident, but it is unclear whether there are phenotypes in the head.

      Response:

      Perhaps the discrepancy arises from a terminological issue. According to the WHO Classification of Tumours, commonly used in clinical settings, the term "Head and Neck" refers to the facial and cervical regions (including the oral cavity, larynx, pharynx, salivary glands, nasal cavity, etc.) and excludes the central nervous system. The inclusion of the brain in Figure 1O-R may have led to some confusion. We included the brain because cerebral cavernous malformations are classified as venous malformations, and thus serve as an example of common sites for venous malformations in humans. To clarify this point, we have made slight revisions to the first part of the Introduction, as follows:

      They frequently manifest in the head and neck region, here defined as the orofacial and cervical areas, excluding the brain. (Page 2, lines 52-53)

      Why are half of the experiments with the Tie2-Cre model conducted at E12.5 (e.g., validation of recombination, signaling, proliferation) and the others at E13.5? It becomes confusing for the reader why the authors start the results section with E13.5 and then study E12.5.

      Response:

      This is also related to the previous question (Question 4). We decided to include extensive anatomical information in a single figure. In Supplemental Figure 1, sagittal sections at E12.5 were used so that the pulmonary artery, aorta, and dilated common cardinal vein could be visualized within one sample. This allowed us to demonstrate that the Pik3caH1047R mutation does not affect arteries by contrasting them with the dilated veins. At E13.5, in addition to the dilation observed at E12.5, the common cardinal vein becomes markedly dilated and compresses the surrounding structures. Capturing both veins and arteries simultaneously would require multiple images, which could potentially confuse the reader. Moreover, lymphatic and other organ phenotypes (e.g., in the liver) are more prominent at E13.5. Therefore, we selectively employed both E12.5 and E13.5 stages to suit our specific objectives.

      The quantifications provided do not clarify what the "n" represents or how many embryos or litters were analyzed. 

      Response:

      Thank you for your feedback. We have now incorporated the sample size (n) directly into the graphs and figure legends.

      Blasio et al. (2018) and Hare et al. (2015) reported that Pik3caH1047R with Tie2-Cre embryos die before E10.5. How do the authors explain the increase in survival here? Were embryos at E13.5 alive? What was the Mendelian ratio observed by the authors? Please provide this information and discuss this point.

      Response:

      Two types of Tie2-Cre lines are widely used worldwide. The mouse line employed by Blasio et al. (2018) differs from that used in our study (their manuscript did not specify whether the background was B6 or a mixed strain). In contrast, although Hare et al. (2015) used the same mouse line as we did, they maintained a C57BL/6 background. We selected a mixed background of B6 and ICR, as we believe that a heterogeneous genetic background more accurately reflects the diversity of human pathology. We examined five pregnant females, which yielded approximately 30 embryos, of which only two survived until E14.0. Based on these observations, we consider E13.5 to be the appropriate survival limit (see Supplemental Figure 2G for additional details). In our breeding strategy, mice in the Tie2-Cre or Tie2-Cre; R26R-eYFP line were maintained as heterozygotes for Tie2-Cre and homozygotes for R26R-eYFP, whereas those carrying the R26R-Pik3caH1047R allele were homozygous. This approach produced control (Cre-negative) and heterozygous offspring in an expected 1:1 ratio at all examined stages: E9.5 (mutant n = 4, control n = 4 from two pregnant females), E11.5 (mutant n = 8, control n = 8 from two pregnant females), E12.5 (mutant n = 4, control n = 4 from two pregnant females), and E13.5 (mutant n = 5, control n = 5 from two pregnant females), with no deviation from the anticipated Mendelian ratio.
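      As an illustration only (this is not part of the analysis reported in the manuscript), the genotype counts listed above can be pooled and compared against the expected 1:1 ratio with an exact two-sided binomial test. The function name `binom_two_sided_p` and the choice of test are assumptions made for this sketch; only the per-stage counts come from the text.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value (hypothetical helper).

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under the null proportion p.
    """
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))

# (mutant, control) embryo counts per stage, as stated in the response.
counts = {"E9.5": (4, 4), "E11.5": (8, 8), "E12.5": (4, 4), "E13.5": (5, 5)}

mutant_total = sum(m for m, _ in counts.values())
control_total = sum(c for _, c in counts.values())

# Null hypothesis: mutant and control genotypes occur at a 1:1 ratio.
p_value = binom_two_sided_p(mutant_total, mutant_total + control_total)
print(f"mutant={mutant_total}, control={control_total}, p={p_value:.3f}")
```

      Because the pooled counts are exactly balanced, the test gives no evidence against the 1:1 expectation. With samples of this size, however, such a test would detect only large deviations.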

      Regarding this point, we have described it in the Results section as follows:

      Whereas clear phenotypes were evident at E12.5 and E13.5, no pronounced external abnormalities were observed at E9.5 or E11.5 (Supplemental Figure 2A–B). Similarly, histological examination revealed no significant differences in the short-axis diameter of the PECAM+ CV or in the number of Prox1+ LECs surrounding the CV between control and mutant embryos at E11.5 (Supplemental Figure 2C–F). We also assessed Tie2-Cre; R26R-Pik3caH1047R mutant embryos at E14.0 from five pregnant mice. Only two embryos were alive at this stage, and both showed severe edema and hemorrhaging, indicating they were nearly moribund. These observations suggest that the critical point for survival of these mutant embryos lies between E13.5 and E14.0 (Supplemental Figure 2G). (Page 5, lines 157-165)

      Please explain the rationale for using the Cdh5-CreERT2. It is likely due to the lethality observed with Tie2Cre, but this was not mentioned.

      Response:

      Thank you very much for your comment. As mentioned above, nearly all Tie2‐Cre;Pik3caH1047R embryos fail to survive past E14.0.

      The lethality observed with Tie2‐Cre mice is described as follows:

      We also assessed Tie2-Cre; R26R-Pik3caH1047R mutant embryos at E14.0 from five pregnant mice. Only two embryos were alive at this stage, and both showed severe edema and hemorrhaging, indicating they were nearly moribund. These observations suggest that the critical point for survival of these mutant embryos lies between E13.5 and E14.0 (Supplemental Figure 2G). (Page 5, lines 161-165)

      The rationale for using CDH5-CreERT2 mice is described as follows:

      To investigate whether the resulting human disease subtype (e.g., lesions confined to the head and neck region) is determined by the specific embryonic stage at which Pik3caH1047R is expressed, we crossed tamoxifen-inducible, pan-endothelial CDH5-CreERT2 mice with R26R-Pik3caH1047R mice and analyzed the embryos at E16.5 or E17.5. (Page 5, lines 169-172)

      Why were tamoxifen injections done at various time points (E9.5, E12.5, E15.5)? Please clarify the reasoning behind administering tamoxifen at these specific times. Explaining the rationale will help the reader follow the experimental design more easily. Additionally, including an initial diagram summarizing all the strategies to guide the reader from the beginning would be helpful.

      Response:

      Martinez‐Corral et al. (Nat. Commun., 2020) focused on lymphatic malformations, arguing that the timing of tamoxifen administration during the embryonic period determines the anatomical features of these lesions. They stated, “The majority of lesions appeared as large isolated cysts that were localized mainly to the cervical, and less frequently to the sacral region of the skin (Figure 2)”. Although not stated definitively, their data suggest that early embryonic tamoxifen administration results in the formation of large‐caliber lymphatic vessels with region‐specific distribution in the cervical skin (Figure 2C, Supplemental Figure 2). This description likely reflects an intention to model human vascular malformations, implying that the anatomical characteristics of these malformations are influenced by the developmental stage at which the Pik3caH1047R somatic mutation occurs.

      Inspired by these findings, we conducted experiments to determine whether altering the timing of tamoxifen administration would yield region-specific anatomical patterns in vascular malformation development. However, our results indicate that changing the timing of tamoxifen administration does not lead to an anatomical bias similar to that observed in human vascular malformations. Instead, we propose that the embryological cellular origin plays a more significant role in the formation of these human pathologies.

      Regarding this section, we have slightly revised the introductory part of the Figure 2 explanation as follows:

      To investigate whether the resulting human disease subtype (e.g., lesions confined to the head and neck region) is determined by the specific embryonic stage at which Pik3caH1047R is expressed, we crossed tamoxifen-inducible, pan-endothelial CDH5-CreERT2 mice with R26R-Pik3caH1047R mice and analyzed the embryos at E16.5 or E17.5. (Page 5, lines 169-172)

      Additionally, we have added a schematic diagram of the tamoxifen administration schedule at the beginning of Figure 2 and Supplemental Figure 3.

      Why do you use the Isl1-Cre constitutive line (instead of the CreERT2)? The former does not allow control of the timing of recombination (targeting specifically your population of interest) and loses the ability to trace the mutant cell behaviors over time. Is the constitutive expression of Pik3caH1047R in Isl1+ cells lethal at any embryonic time, or do the animals survive into adulthood? When you later use the Isl1-CreERT2 line, why do you induce recombination specifically at E8.5? It would be helpful for the reader to have an explanation for this choice, along with a reference to your previous paper.

      Response:

      Thank you for your comments. We did attempt the same experiments using Isl1-CreERT2 under various conditions. However, administering tamoxifen earlier than E8.5 invariably caused embryonic lethality, likely due to both Pik3ca activity and tamoxifen toxicity, leaving no embryos for analysis. In our previous study, repeated attempts from E6.5 to E16.5 resulted in only two surviving embryos (Maruyama et al., eLife, 2022, Supplemental Figure 3). We also failed to recover any live embryos with tamoxifen administration at E7.5.

      Even reducing the tamoxifen dose to one-fifth did not succeed when given before E8.5. Although E8.5 administration was feasible, the observed phenotype remained mild, and no phenotype was detected at E9.5, E11.5, E12.5, or later stages. These findings align with our earlier observations that moving tamoxifen injection from E8.5 to E9.5 markedly diminishes the Isl1+ contribution to the endothelial lineage.

      Furthermore, Supplemental Figures 5 and 6 suggest that a decrease in Isl1 mRNA, which occurs as early as E8.0–E8.25, triggers the shift toward endothelial differentiation. Considering these data and the mild phenotype at E8.5, earlier administration would be ideal for impacting Isl1+ cell fate. However, technical constraints prevented us from doing so, leading us to utilize the constitutive Isl1-Cre line instead.

      This section was already included in the Discussion; however, for clarity, we have revised it as follows:

      Given that Isl1 expression disappears at a very early stage and contributes to endothelial differentiation, experiments using Isl1-Cre or Isl1-CreERT2 mice cannot clearly distinguish between LMs, VMs, and capillary malformations. In other words, Isl1+ cells likely label a common progenitor population for multiple endothelial subtypes. Consequently, the diverse vascular malformations in the head and neck, including mixed venous-lymphatic and capillary malformations as well as the macro- and microcystic subtypes of LMs, cannot be fully accounted for by this study alone. (Page 13, lines 419-425)

      What is the purpose of using this battery of CreERT2 lines (for example, the Myf5-CreERT2)?

      Response:

      The head and neck mesoderm arises primarily from the cardiopharyngeal mesoderm and the cranial paraxial mesoderm. Myf5-CreERT2 labels the cranial paraxial mesoderm in the facial region, which gives rise to facial skeletal muscles. Stone et al. (Dev Cell, 2019) reported that a subset of this lineage contributes to head and neck lymphatic vessels, whereas our study (Maruyama et al., eLife, 2022) found no such contribution—an ongoing point of debate. Nevertheless, expressing Pik3caH1047R in this lineage did not induce any vascular malformations.

      Pax3-CreERT2 mice label Pax3⁺ paraxial mesoderm (including cranial paraxial mesoderm), which reportedly contributes to the common cardinal vein and subsequently forms trunk lymphatics (Stone & Stainier, 2019; Lupu et al, 2022). When Pik3caH1047R was expressed in Pax3⁺ cells, we observed abnormal vasculature in the lower trunk and around the vertebrae, consistent with that report.

      Synthesizing these observations with our results from Isl1-Cre, Isl1-CreERT2, and Mef2c-AHF-Cre lines, we propose that Pik3caH1047R mutations within the cardiopharyngeal mesoderm underlie the clinically significant vascular malformations seen in the head and neck region.

      We have also incorporated the following explanation into the main text.

      Regarding the Pax3-CreERT2:

      The head and neck mesoderm arises primarily from the cardiopharyngeal mesoderm and the cranial paraxial mesoderm. In Pax3-CreERT2; R26R-Pik3caH1047R embryos, Pax3+ paraxial mesoderm (including cranial paraxial mesoderm) is labeled; this lineage reportedly contributes to the common cardinal vein and subsequently forms trunk lymphatics (Lupu et al, 2022). (Page 8, lines 247-250)

      Regarding the Myf5-CreERT2:

      In Myf5-CreERT2; R26R-tdTomato mice—which label the cranial paraxial mesoderm, particularly muscle satellite cells—crossed with R26R-Pik3caH1047R, tamoxifen was administered to pregnant mice at E9.5. (Page 8, lines 255-257)

      I find the scRNAseq data in Fig. S4 and S5 very interesting, although I am unsure how they fit with the rest of the story. In principle, the differentiation of a subset of Isl1+ cardiopharyngeal mesoderm (CPM) derivatives into lymphatic endothelial cells was already demonstrated in a previous publication from the group. What is the novelty and purpose here?

      Response:

      This also addresses Question 11. Our aim in using the Isl1⁺ lineage was to determine the extent of analysis possible with this experimental system. Through reanalysis, we found that the downregulation of Isl1 triggers a switch toward endothelial cell differentiation, with this cell fate decision occurring at a very early embryonic stage. Consequently, our single‐cell analysis supports the conclusion that, regardless of the Isl1-CreERT2 line used or the timing of tamoxifen administration, it is challenging to precisely recapitulate the fine clinical phenotypes observed in humans (e.g., lymphatic or venous malformations) with this experimental system. We believe that this single‐cell analysis provides a theoretical basis for the notion that our Isl1-Cre-based developmental model can only generate a mixed phenotype of vascular and lymphatic malformations.

      This section is explained in a similar manner in the revised Discussion for Question 11 as follows:

      Given that Isl1 expression disappears at a very early stage and contributes to endothelial differentiation, experiments using Isl1-Cre or Isl1-CreERT2 mice cannot clearly distinguish between LMs, VMs, and capillary malformations. In other words, Isl1+ cells likely label a common progenitor population for multiple endothelial subtypes. Consequently, the diverse vascular malformations in the head and neck, including mixed venous-lymphatic and capillary malformations as well as the macro- and microcystic subtypes of LMs, cannot be fully accounted for by this study alone. (Page 13, lines 419-425)

      Why in Fig. 4 ECs were not subclustered for further analysis (as in Fig. S4,5)? This is a missed opportunity to understand the pathogenic phenotypes.

      Response:

      Thank you for your question. We performed sub-clustering analysis, particularly focusing on why no phenotype is observed in arteries, as we believed this approach could provide molecular-level insights. Accordingly, we conducted the analysis presented in Figure 1 for Reviewer 1.





      Figure legend for Figure 1 for Reviewer 1. The number of endothelial cells was insufficient, making subclustering ineffective.

      (Figure for Reviewer 1A, B) Left: UMAP plot showing color-coded clusters (0–3). Subcluster analysis of the Endothelium (Cluster 1) from Fig. 4B. Right: UMAP plot color-coded by condition. (Figure for Reviewer 1C) Heatmap showing the average gene expression of marker genes for each cluster by condition. After cluster annotation, subclusters 0, 1, 2, and 3 were defined as Vein, Capillary, Artery, and Lymphatics, respectively. (Figure for Reviewer 1D) Cell type proportions. (Figure for Reviewer 1E) Number of differentially expressed genes (DEGs) in each subcluster of the PIK3CAH1047R group relative to Control. (Figure for Reviewer 1F) Comparison of enrichment analysis between EC subclusters from scRNA-seq. The bar graph shows the top 20 significantly altered Hallmark gene sets in EC subclusters from scRNA-seq using ssGSEA (escape R package). Red bars represent significantly upregulated Hallmark gene sets in mutants (FDR …).

      Initially, we performed sub-clustering on endothelial cells; however, this resulted in a considerably reduced number of cells per sub-cluster, especially in the control group (Figure for Reviewer 1A, B). In the control group, there were only approximately 149 endothelial cells in total, and dividing these into four clusters led to very few cells per cluster, thereby introducing statistical instability. Although arterial endothelial cells were relatively well defined by their high expression of Hey1 and Hey2 and lower levels of Nr2f2 and Aplnr, the boundaries between venous, capillary, and lymphatic endothelial cells were less distinct. In particular, defining lymphatic endothelial cells solely by Prox1 expression yielded a very small population; even after incorporating additional lymphatic markers such as Flt4 and Lyve1, it remained challenging to clearly separate the venous, capillary, and lymphatic populations (Figure for Reviewer 1C). Consequently, the proportion of lymphatic endothelial cells was markedly low, and discrepancies with the histological findings further reduced our confidence in this dataset (Figure for Reviewer 1D, E). Moreover, the number of differentially expressed genes (DEGs) increased with the number of cells, and the results of the enrichment analysis as well as the volcano plot were nearly identical to those shown in Figure 4 (Figure for Reviewer 1F, G). In other words, the subclustering process itself had limitations, resulting in the overall outcome being dominated by the most abundant venous cluster.

      It is possible that these limitations in sub-clustering are due to the relatively small number of endothelial cells. Nonetheless, a major strength of our single-cell analysis is its ability to compare various cell types derived from Isl1+ lineages, not just endothelial cells. Therefore, the relative scarcity of endothelial cells represents a limitation of this experimental system. For these reasons, we decided to omit this figure from the final version of the manuscript.

      This point is described in the Discussion section as follows:

      Additionally, we performed endothelial subclustering to explore potential differences in gene expression among arterial, venous, capillary, and lymphatic endothelium. However, in the control embryos, the number of endothelial cells was too low to yield reliable data (data not shown). (Page 13, lines 434-437)

      Hypoxia and glycolysis signatures are not specific to mutant ECs. Do the authors have an explanation for this? It is well known that PI3K overactivation increases glycolysis; please acknowledge this.

      Response:

      Thank you for your important comment. We have now added to the Discussion section, along with relevant references, a statement that PI3K overactivation increases glycolysis, as follows:

      It is well known that overactivation of PI3K enhances glycolysis (Hu et al, 2016). In our study, the elevated expression of glycolytic enzymes, including Ldha, suggests a shift toward aerobic glycolysis, consistent with the Warburg effect. (Page 13, lines 447-450)

      Do you have an explanation for the expression of VEGFA by lymphatic mutant cells?

      Response:

      VEGF-A acts on VEGFR2 expressed on LECs, thereby promoting their proliferation and migration (Hong et al, 2004; Dellinger & Brekken, 2011). To clarify this point, we have revised the text accordingly and added additional references as follows:

      We focused on Vegf-a, a key regulator of EC proliferation and a downstream target of Hif-1α. Vegf-a likely drives both cell-autonomous and non-cell-autonomous effects on blood ECs, as well as LECs (Hong et al, 2004; Dellinger & Brekken, 2011). (Page 13, lines 445-447)

      Likewise, why did mesenchymal cells traced from the Isl1-Cre decrease upon expression of Pik3caH1047R?

      Response: When comparing the mesenchyme cluster with other mesoderm-derived cells, we observed a marked downregulation of signaling pathways, notably those that promote epithelial-to-mesenchymal transition (EMT), such as TGF-β, Wnt/β-catenin, and MYC target genes (Supplemental Figure 7B). Downregulation of these pathways is associated with decreased EMT (Xu et al, 2009; Singh et al, 2012; Larue & Bellacosa, 2005; Yu et al, 2015), which could explain the reduction in the number of mesenchymal cells. However, PI3K activation is generally considered to promote EMT, so this finding is at odds with previous studies.

      On the other hand, several investigations, including those using ES cells, suggest that PI3K activation could suppress TGF-β signaling via SMAD2/3 (Yu et al, 2015), and in some undifferentiated cell contexts, it may also inhibit the Wnt/β-catenin pathway via SMAD2/3 (Singh et al, 2012). These multifaceted roles of PI3K could be particularly important during embryonic development (Larue & Bellacosa, 2005).

      Understanding how mesenchymal cell changes under PI3K activation affect endothelial cells is an important issue that requires further study. Accordingly, we have added these points to the Discussion section as follows:

      In our data, the mesenchymal cell population was decreased, and within this cluster, pathways typically promoting epithelial-mesenchymal transition (EMT) (e.g., TGF-β, Wnt, and MYC target genes) were downregulated (Supplemental Figure 7B). Although PI3K activation is generally thought to enhance EMT, several studies in undifferentiated cells have reported that PI3K can suppress these signals via SMAD2/3 (Singh et al, 2012; Yu et al, 2015). Elucidating how these changes in the mesenchyme contribute to vascular malformation pathogenesis remains an important avenue for future research. (Page 13, lines 437-444)

      Authors need to characterize the preclinical model before conducting any preclinical study. No controls are provided, including wild-type mice and phenotypes, before starting the treatment (day 4).

      Response:

      Thank you very much for your comment. We have now added new images illustrating skin under three conditions: untreated skin at Day 7, skin from Cre-negative animals that received tamoxifen, and skin from Cre-positive animals examined 4 days after tamoxifen administration. Additionally, we have included the corresponding statistical data for these skin samples (Figure 6C–E).

      Why did the authors not use their developmental model of head and neck malformation model for preclinical studies? This would be much more coherent with the first part of the manuscript. Also, how many animals were treated and quantified for the different conditions?

      Response:

      We have now indicated the number of animals (n) used under each condition directly on the graphs for clarity. As for why we did not use the Isl1-Cre model, we observed that, as with the Tie2-Cre line, all Isl1-Cre mutant embryos died between E13.5 and E14.0 (indeed, none survived beyond E14.0; see our newly added Figure 3N). Consequently, we could not perform any postnatal treatment experiments. Moreover, as previously noted, the Isl1-CreERT2 line has an extremely narrow developmental window for vascular malformation formation, making it less suitable as a general model.

      Although we considered potential in utero or maternal interventions (e.g., direct uterine injection or placental transfer), these approaches demand extensive technical optimization and remain an area for future investigation. From a clinical standpoint, postnatal therapy meets a more immediate need: while vascular malformations are congenital, they often enlarge over time (Ryu et al, 2023), becoming more apparent and more likely to require treatment.

      In this study, because embryonic Pik3caH1047R expression was lethal before birth, we generated and treated postnatal cutaneous vascular malformations instead. Although this model does not strictly recapitulate the embryonic disease state, previous studies assessing drug efficacy have similarly employed postnatal tamoxifen-inducible mouse models (Martinez-Corral et al, 2020), lending validity to this approach. Moreover, because lesions typically become evident later in life rather than in utero, this method more closely aligns with clinical reality and may be more readily translated into practice.

      Minor Comments

      References in the introduction need to be revised. Specifically, how authors reached the stats on head and neck vascular malformations needs to be clarified. For instance, one of the cited papers refers to all types of vascular malformation, while the other focuses exclusively on lymphatic malformations with PIK3CA mutations. Moreover, in the latter, the groups are divided into orofacial and neck and body categories. How do authors substrate the information from the neck and head here?

      Response:

      We have clarified our definition of the “head and neck” region early in the Introduction and separated the discussion on anatomical localization from that on PIK3CA genetics. Additionally, we removed the percentage data of localization to avoid potential confusion with the genetic aspects.

      In Japan, lymphatic and other vascular malformations of the head and neck typically require complex, multidisciplinary management. Consequently, these conditions are officially designated as “intractable diseases,” and the government provides financial assistance for their treatment. Although most of the information is available only in Japanese, we refer reviewers to the following websites for details on head and neck vascular malformations:

      https://www.nanbyou.or.jp/entry/4893 https://www.nanbyou.or.jp/entry/4631 https://www.nanbyou.or.jp/entry/4758.

      (Please read with an English translator, e.g., the Google Chrome translator.)

      We are not aware of a comparable system in other countries. However, it is well recognized that vascular malformations frequently occur in the head and neck region (Nair, 2018; Alsuwailem et al, 2020; Sadick et al, 2017), as evidenced by over 250 PubMed hits when searching for “vascular malformation” and “head and neck.”
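
      For readers who wish to check the hit count, the search can be approximated through the NCBI E-utilities `esearch` endpoint. The sketch below only builds the query URL and performs no network request. Joining the two quoted phrases with `AND` is an assumption about how the search was phrased, and the helper name `build_pubmed_count_url` is hypothetical. The count returned by PubMed will also drift as new records are indexed.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_count_url(phrases):
    """Build an esearch URL that asks PubMed only for the number of hits.

    Each phrase is quoted so that it is searched as an exact phrase, and the
    phrases are combined with AND (an assumption, see the lead-in).
    """
    term = " AND ".join(f'"{phrase}"' for phrase in phrases)
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": term,        # the combined query string
        "rettype": "count",  # return only the hit count
        "retmode": "json",   # JSON response format
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_pubmed_count_url(["vascular malformation", "head and neck"])
print(url)
```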

      Incorporating this comment, we have revised the early part of the Introduction as follows:

      They frequently manifest in the head and neck region—here defined as the orofacial and cervical areas, excluding the brain (Zenner et al, 2019; Lee & Chung, 2018; Nair, 2018; Alsuwailem et al, 2020). (Page 2, lines 52-53)

      Also, in line 79, I need clarification on ref 24 about fibrosis.

      Response:

      Thank you very much for pointing out the error. We have corrected the placement of the reference accordingly.

      Include references: Studies in mice have shown that p110α is essential for normal blood and lymphatic vessel development. Please clarify and correct. 

      Response:

      Thank you very much. We have now added the references (Graupera et al, 2008; Gupta et al, 2007; Stanczuk et al, 2015).

      Please define PIP2 and PIP3

      Response:

      Thank you very much for your comment. We have now added the following definitions to the Introduction:

      PIP2: Phosphatidylinositol 4,5-bisphosphate

      PIP3: Phosphatidylinositol 3,4,5-trisphosphate


      Why is Prox1 showing positivity in erythrocytes in Figure 1?

      Response:

      We used paraffin-embedded sections to preserve tissue morphology. Although we applied a reagent to suppress autofluorescence, some spillover from excitation around 488 nm was unavoidable. Moreover, in the mutant mice, blood remained within the abnormal vessels rather than being completely flushed out, which further increased the autofluorescence. Despite our efforts to mitigate this, some residual autofluorescence persisted. Consequently, we also employed DAB-based staining to confirm the specificity of Prox1 labeling in other Figures.

      Regarding Figure 1, I suggest organizing the quantifications in the same order to facilitate phenotype comparisons. For example, I, J vs. Q, R. What is the difference between M and N?

      Response:

      To facilitate the comparison between Figures 1I, J and 1Q, R, we have swapped Figures 1Q and R. Regarding Figures 1M and N, these panels represent the average cross-sectional area of an enlarged malformed vessel and the number of vessels exceeding a defined size, respectively. Although some central veins appeared slightly enlarged in the control group, the mutant liver exhibited both a significant dilation of malformed vessels and an increased number of such vessels.

      Add the reference of the Bulk RNseq data.

      Response:

      We have added the following reference: (Jauhiainen et al, 2023).

      Mark in the Fig. 4F that the volcano plots are from cluster one of the scRNASeq (this is explained in text and legend, but when you go to the figure, it isn't very clear).

      Response:

      We have added the label “Cluster 1: Volcano Plot (genes associated with hypoxia/glycolysis)” to Figure 4F.

      Please label Figure 6D/E with the proper labels.

      Response:

      We have provided appropriate labels for Figure 6.

      In Fig. 6, it is mentioned that vacuoles are from the tamoxifen injection, how do you know? Do you also see them if you add oil alone (without tamoxifen) or tamoxifen in a WT background?

      Response:

      In Figure 6C, we have included both the image at Day 4 and the condition of Cre(–) animals 7 days after tamoxifen injection.

      **Referees cross-commenting**

      I completely agree with referee #2 regarding the preclinical studies. Bevacizumab does not neutralize murine VEGFA. This is a major issue.

      Response:

      As noted in the Reviewer #2 section, there appears to be some effect on mouse vasculature (Lin et al, 2022). However, given the ongoing debate regarding this issue, we performed additional experiments using a neutralizing antibody against mouse VEGF-A (clone 2G11). This antibody has been shown to suppress the proliferation of mouse vascular endothelial cells in vivo (for example, Mashima et al, 2021; Wuest & Carr, 2010). Our results demonstrate that it suppresses the proliferation of malformed vasculatures (both blood and lymphatic vessels) more strongly than bevacizumab. Based on these additional experiments, we revised the figures and updated them as Figure 6.

      Reviewer #1 (Significance (Required)):

      This study addresses a timely and relevant question: the origins, onset and progression of congenital vascular malformations, a field with limited understanding. The work is novel in its approach, employing complex embryonic models that aim to mimic the disease in its native context. By focusing on the effects of Pik3caH1047R mutations in cardiopharyngeal mesoderm-derived endothelial cells, it sheds light on how these mutations drive phenotypic outcomes through specific pathways, such as HIF-1α and VEGF-A signaling, while also identifying potential therapeutic targets. A strong aspect of the study is the use of embryonic models, which enables the investigation of disease onset in a context that closely resembles the in vivo environment. This is particularly valuable for congenital disorders, where native developmental cues are an integral aspect of disease progression. The study also integrates advanced techniques, including single-cell RNA sequencing, to dissect the cellular and molecular responses induced by the Pik3caH1047R mutation. Moreover, from a translational perspective, it provides novel therapeutic strategies for these diseases. Limitations of the study are that (1) the main question the authors try to address, and the main conclusions derived from it, are unclear; (2) the different parts of the manuscript are not well connected and the rationale is not clear; (3) the scRNA-seq analysis is underdeveloped; and (4) characterization of the preclinical model is not provided.

      Audience:

      The findings presented here interest specialized audiences within developmental biology, vascular biology, and congenital disease research fields, and clinicians by providing new therapies to treat vascular anomalies. Moreover, the study's integration of single-cell and in vivo models could inspire further research in other contexts where understanding clonal behavior and signaling pathways is critical.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This paper focuses on vascular malformations driven by PI3K mutation, with particular interest in the vascular defects localized at head and neck anatomical sites. The authors exploit the H1047R mutant, which has been widely demonstrated to induce both vascular and lymphatic malformations. To limit the effect of H1047R to tissues originating from the cardiopharyngeal mesoderm, Pik3caH1047R mice were crossed with mice expressing Cre under the control of the promoter of Isl1, a transcription factor that contributes to the development of cardiopharyngeal mesoderm-derived tissues. By comparing the embryo phenotype of this model with that observed by inducing the expression of Pik3caH1047R at different times of development, the authors conclude that the Isl1-Cre; Pik3caH1047R; R26R-eYFP model better recapitulates the anatomical features of human vascular malformations, in particular those localized at the head and neck. In my opinion, the newly proposed model represents significant progress in the study of human vascular malformations. Furthermore, scRNA-seq analysis has allowed the authors to propose a mechanism focused on the role of HIF and VEGFA. The authors provide partial evidence that HIF and VEGFA inhibitors halt the development of vascular malformations in VE-cadherin-Cre; Pik3caH1047R mice. This experiment is characterized by a conceptual mistake because bevacizumab does not recognize murine VEGFA (see, for instance, 10.1073/pnas.0611492104; 10.1167/iovs.07-1175). This error dampens my enthusiasm.

      CRITICISM

      1. Fig 1A. E13.5 corresponds to the early phase of vascular remodelling. What is the phenotype at earlier stages (e.g., E9.5 or E10.5)?

      Response:

      Thank you very much for your comment. We have created new Supplemental Figure 2, which demonstrates that no obvious phenotype is observed in mutant embryos at E9.5 and E11.5, and that the survival limit of these mutant embryos is around E13.5 to E14.0.

      In response to Reviewer 1’s question, we note that a previous study (Hare et al, 2015) has shown that on a B6 background, this mouse model exhibits an earlier onset of phenotype, resulting in early lethality. However, we selected a mixed background of B6 and ICR, as we believe that a heterogeneous genetic background more accurately reflects the diversity of human pathology. We examined five pregnant females, which yielded approximately 30 embryos, of which only two survived until E14.0. Based on these observations, we consider E13.5 to E14.0 to be the appropriate survival limit (see Supplemental Figure 2G for additional details).

      We have described this in the Results section as follows:

      Whereas clear phenotypes were evident at E12.5 and E13.5, no pronounced external abnormalities were observed at E9.5 or E11.5 (Supplemental Figure 2A–B). Similarly, histological examination revealed no significant differences in the short-axis diameter of the PECAM+ CV or in the number of Prox1+ LECs surrounding the CV between control and mutant embryos at E11.5 (Supplemental Figure 2C–F). We also assessed Tie2-Cre; R26R-Pik3caH1047R mutant embryos at E14.0 from five pregnant mice. Only two embryos were alive at this stage, and both showed severe edema and hemorrhaging, indicating they were nearly moribund. These observations suggest that the critical point for survival of these mutant embryos lies between E13.5 and E14.0 (Supplemental Figure 2G). (Page 5, lines 157-165)

      Fig 1, 2, 3. The analysis of VEGFR2 expression is required. This request is important because of the paradigmatic and non-overlapping role of this receptor in early and late vascular development. Furthermore, these data would better clarify the mechanism suggested by the experiments reported in Fig 5 (VEGFA and HIF expression).

      Response:

      Thank you very much for your comment. For each mouse presented in Figures 1, 2, and 3, we performed VEGFR2 immunostaining on serial sections corresponding to each figure and created a new Supplemental Figure 9. VEGFR2 was broadly expressed in both vascular and lymphatic endothelial cells in control and mutant embryos.

      We have described this in the Results section as follows:

      Furthermore, to verify whether VEGF‐A can act via VEGFR2, we performed VEGFR2 immunostaining on several mouse models: Tie2‐Cre; R26R‐Pik3caH1047R embryos (E13.5, corresponding to Figure 1), CDH5‐CreERT2; R26R‐Pik3caH1047R embryos (tamoxifen administered at E9.5 and analyzed at E16.5, corresponding to Figure 2), and Isl1‐Cre; R26R‐Pik3caH1047R embryos (E11.5 and E13.5, corresponding to Figure 3). In all cases, both control and mutant embryos exhibited widespread VEGFR2 expression in blood and lymphatic vessels at early and late developmental stages (Supplemental Figure 9A-R’). These findings suggest that Pik3caH1047R may act in an autocrine manner, at least in part via the VEGF‐A/VEGFR2 axis in endothelial cells, potentially explaining the observed phenotype. (Page 11, lines 352-361)

      As done in Fig 1,2 and 3, data quantification by morphometric analysis is also required for results reported in supplemental figure 3

      Response:

      Thank you for your comment. We have now added additional statistics and graphs for clarity, which are presented as Supplemental Figure 4.

      Lines 166-174. I suppose that the reported observations were done at E16.5. What happens later? It's crucial to sustain the statement at lines 187-190

      Response:

      At E9.5 and E12.5, we reduced the tamoxifen dose to one-fifth of the standard dose. After collecting embryos from approximately 10 pregnant females, we were only able to obtain three embryos at these stages. When tamoxifen was administered at E15.5, three embryos were obtained from two litters. In most cases, miscarriages occurred by E16.5, making further observation difficult. We focused on the time point around E16.5 because it is generally believed that the basic distribution of the lymphatic system throughout the body is established around this stage (Srinivasan et al, 2007; Maruyama et al, 2022).

      A similar experiment has been reported using T-CreERT2 to induce mosaic expression of Pik3caH1047R in the mesoderm, which resulted in subcutaneous venous malformations in mice at P1–P5 (Castillo et al, 2016). However, that study did not report whether the mice survived normally after birth. In fact, regarding the survival rate, the authors stated, “Our observations on the lethality and vascular defects in MosMes-Pik3caH1047R (T-CreERT2;R26R-Pik3caH1047R) embryos are similar to the previously reported phenotypes of ubiquitous or EC-specific expression of Pik3caH1047R in the developing embryo (Hare et al, 2015),” suggesting a high mortality rate when Pik3caH1047R is expressed using Tie2-Cre. Moreover, according to Hare et al., analysis of 250 Tie2-Cre; R26R-Pik3caH1047R embryos revealed that all were lethal by E11.5. Thus, considering our results in conjunction with those from previous studies, it appears that expression of Pik3caH1047R in the mesoderm or endothelial cells during embryonic development results in the death of most embryos before birth.

      We have supplemented the Results section with the following details:

      Since the standard tamoxifen dose (125 mg/kg body weight) leads to miscarriage or embryonic death within 1–2 days, we diluted it to one-fifth of the original concentration. (Pages 5-6, lines 175-177)

      scRNAseq was performed at E13.5 (Fig 4). It's mandatory to perform the same analysis at E16.5, which corresponds to the phenotypic analysis shown in Fig 3. This experiment is required to understand how hypoxia and glycolysis genes change along the development of the vascular malformation.

      Response:

      Thank you very much for your comment. First, regarding the experiments using Isl1‐Cre, we acknowledge that embryo survival was not adequately addressed in the original manuscript. Our Isl1‐Cre embryos die between E13.5 and E14.0, which makes it practically impossible to perform single‐cell analysis beyond this stage (please refer to the newly added Figure 3N). Similarly, for experiments using CDH5‐CreERT2, the limited number of embryos obtained renders further analysis extremely challenging. Additionally, we have supplemented the Results section with the following description:

      These Isl1-Cre; R26R-Pik3caH1047R mutant embryos likely died from facial hemorrhaging between E13.5 and E14.0 (Figure 3N). (Page 7, lines 236-237)

      Further analysis at later embryonic stages proved challenging. Consequently, we aimed to investigate the effects of Pik3caH1047R on endothelial cells by comparing gene expression at E10.5 with that at E13.5. We performed single‐cell RNA sequencing on E10.5 embryos from both the control (Isl1-Cre; R26R-eYFP) and mutant (Isl1-Cre; R26R-eYFP; R26R-Pik3caH1047R) embryos. Unfortunately, the quality of both datasets was insufficient for reliable analysis. In the control sample, only 40.3% of reads were assigned to cell‐associated barcodes—substantially below the ideal threshold of >70%—with an estimated 790 cells and a median of 598 genes per cell. Similarly, in the mutant sample, only 37.0% of reads were associated with cells, despite an estimated cell count of 7,326 and a median of only 526 genes per cell. These metrics indicate that both datasets were severely compromised by high levels of ambient RNA or by a significant number of cells with low RNA content, precluding robust downstream analysis. This may be due to the fact that immature cells are particularly susceptible to damage incurred during FACS sorting and transportation to the analysis facility. Moreover, the relatively low number of control endothelial cells at E13.5 led us to conclude that performing similar experiments at earlier stages would be difficult. Despite our best efforts, we acknowledge this as a limitation of the present study.
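
      The read-fraction criterion quoted above can be written as a simple check. The sketch below is illustrative only: the 70% threshold and the per-sample metrics are taken from the text, while the class name, function name, and sample labels are hypothetical, and this is not the pipeline that was actually used for the analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleQC:
    """Summary metrics for one scRNA-seq sample, as reported in the text."""
    name: str
    fraction_reads_in_cells: float  # reads assigned to cell-associated barcodes
    estimated_cells: int
    median_genes_per_cell: int


# Threshold quoted in the text ("ideal threshold of >70%").
MIN_FRACTION_READS_IN_CELLS = 0.70


def passes_read_fraction(sample, threshold=MIN_FRACTION_READS_IN_CELLS):
    """Return True if the fraction of reads in cells exceeds the threshold."""
    return sample.fraction_reads_in_cells > threshold


# Values reported for the E10.5 datasets; the labels are hypothetical.
samples = [
    SampleQC("control_E10.5", 0.403, 790, 598),
    SampleQC("mutant_E10.5", 0.370, 7326, 526),
]

qc_results = {sample.name: passes_read_fraction(sample) for sample in samples}

for sample in samples:
    status = "pass" if qc_results[sample.name] else "fail"
    print(f"{sample.name}: {sample.fraction_reads_in_cells:.1%} reads in cells -> {status}")
```

      Both samples fall below the threshold, in line with the conclusion that neither dataset was suitable for downstream analysis.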

      Lines 326-343. In this section the authors provide pharmacological evidence that HIF and VEGFA are involved in vascular malformation caused by H1047R. However, I'm surprised by the efficacy of bevacizumab, which neutralizes human but not murine VEGFA. Genentech has developed the B20 mAb, which specifically neutralizes murine VEGFA. So the data shown require clarification by the authors, and the experiments must be done with the appropriate reagent. Furthermore, what are the pharmacokinetics of these compounds when topically applied?

      Response:

      Thank you very much for your comment. There are reports that bevacizumab exerts an in vivo inhibitory effect on neovascularization mediated by mouse Vegf-A (Lin et al, 2022). However, given the contentious nature of this issue, we conducted additional experiments. Because obtaining the B20 mAb from Genentech requires an MTA, and given the time constraints during revision, we opted to use a neutralizing antibody against mouse VEGF-A (clone 2G11) instead. This antibody has been shown to suppress the proliferation of mouse vascular endothelial cells in vivo (Mashima et al, 2021; Wuest & Carr, 2010).

      The dosing regimen for 2G11 was determined based on previous studies (Surve et al, 2024; Churchill et al, 2022). Moreover, an example of effective local administration is provided in Nagao et al (2017). Since this product is an antibody drug, it is metabolized and does not function as a prodrug. Although the precise half-life of 2G11 is unknown, rat IgG2a antibodies generally have a circulating half-life of approximately 7–10 days in rats. When administered to mice, however, the half-life is often significantly reduced due to interspecies differences in neonatal Fc receptor (FcRn) binding affinity, with estimates in murine models typically around 2–4 days (Abdiche et al, 2015; Medesan et al, 1998). In our model, the injection is subcutaneous, almost equivalent to an intradermal injection (Figure 6B, C). Because this method is expected to provide a more sustained, slow-release effect (similar to the tuberculin reaction), the half-life should be longer than that achieved with intravenous administration. Consequently, we believe that sufficient efficacy is maintained in this model.

      Regarding LW-6:

      LW-6 is a small molecule that, due to its hydrophobic nature, is believed to freely cross cell membranes. Once inside the cell, it facilitates the degradation of HIF-1α, leading to reduced expression of its downstream targets (Lee et al, 2010). Although its half-life is estimated to be around 30 minutes, the active metabolites may exert sustained secondary effects (Lee et al, 2021). When administered intravenously, peak blood concentrations are reached within 5 minutes, making Cmax a critical parameter due to the rapid onset of action. In our experiments, we based the dosing regimen on previous studies (Lee et al, 2010; Song et al, 2016; Xu et al, 2022, 2024). While those studies administered doses comparable to or twice as high as ours via intravenous, intraperitoneal, or oral routes, our experimental design—in which a single dose was administered on Day 4 and samples were collected on Day 7—necessitated a single-dose protocol.

      Regarding Rapamycin:

      Several studies have demonstrated that local administration yields anti-inflammatory effects (Takayama et al, 2014; Tyler et al, 2011). Similar outcomes have been observed in vascular malformations (Boscolo et al, 2015; Martinez-Corral et al, 2020). Although the half-life of rapamycin is estimated to be approximately 6 hours following intravenous administration, it may be even shorter (Comas et al, 2012; Popovich et al, 2014).
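
      To put the quoted half-lives on a common scale, the sketch below computes the fraction of each compound that would remain after 72 hours under simple first-order elimination. This is a rough illustration under stated assumptions, not a pharmacokinetic model of the experiment: it assumes a single compartment, ignores slow release from the injection site and any active metabolites, and applies the Day 4 to Day 7 interval described for LW-6 to all three compounds. The half-life values are the estimates given in the text, and the function name is hypothetical.

```python
def fraction_remaining(elapsed_hours, half_life_hours):
    """Fraction of compound left after first-order decay: 0.5 ** (t / t_half)."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return 0.5 ** (elapsed_hours / half_life_hours)


# Assumed interval: dosing on Day 4, sampling on Day 7 (3 days = 72 hours).
ELAPSED_HOURS = 72.0

# Half-life estimates quoted in the text, converted to hours.
half_lives_hours = {
    "2G11, lower estimate (2 days)": 48.0,
    "2G11, upper estimate (4 days)": 96.0,
    "rapamycin (about 6 hours)": 6.0,
    "LW-6 (about 30 minutes)": 0.5,
}

remaining = {
    name: fraction_remaining(ELAPSED_HOURS, half_life)
    for name, half_life in half_lives_hours.items()
}

for name, fraction in remaining.items():
    print(f"{name}: {fraction:.3g} of the dose remaining after {ELAPSED_HOURS:.0f} h")
```

      Under these assumptions, roughly 35–59% of the antibody would remain at the end of the window, whereas the parent small molecules would be essentially cleared. This fits the emphasis above on Cmax and secondary effects for LW-6 and on sustained exposure for 2G11.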

      In light of these comments, we have revised Figure 6. Furthermore, the Results section pertaining to Figure 6 has been updated as follows:

      Hif-1α and Vegf-A inhibitors suppress the progression of vascular malformations.

      We next examined whether administering Hif-1α and Vegf-A inhibitors could effectively treat vascular malformations. Tamoxifen was administered to 3–4-week-old CDH5-CreERT2;R26R-Pik3caH1047R mice to induce mutations in the dorsal skin. Anti-VEGF-A, a Vegf-A neutralizing antibody; LW6, a Hif-1α inhibitor; and rapamycin, an mTOR inhibitor, were topically applied, and their effects were analyzed (Figure 6A). Both anti-VEGF-A and LW6 reduced the visible swelling in the dorsal skin, whereas the difference between the drug-treated and control groups was less pronounced with rapamycin (Figure 6B). In tamoxifen-treated Cre(–) mice, inflammatory cell infiltration and fibrosis were observed from the dermis to the subcutaneous tissue; however, there were no changes in the number of PECAM⁺ vasculatures or VEGFR3⁺ lymphatic vessels, including their enlarged forms, compared to the untreated control (Figure 6C–E). In contrast, tamoxifen administration to CDH5-CreERT2;R26R-Pik3caH1047R mice resulted in an increase in these vascular structures by day 4 (Figure 6C–E). At day 7, comparing mice with or without treatment using anti-VEGF-A, LW6, or rapamycin, the number of PECAM⁺ vasculatures was reduced in the treated groups; however, in the rapamycin group, the number of enlarged PECAM⁺ vasculatures did not differ from that in the untreated group (Figure 6F–M). Similarly, for VEGFR3⁺ lymphatic vessels, both anti-VEGF-A and LW6 induced a reduction, whereas rapamycin did not produce a statistically significant decrease (Figure 6N–U). (Page 11, lines 363-381)

      **Referees cross-commenting**

      The issues raised by referee #1 related to the phenotype analysis are right. In my opinion, the Isl model proposed here mimics human pathology well, even if the vascular damage at the head is not so evident.

      Response:

      Perhaps the discrepancy arises from a terminological issue. According to the WHO Classification of Tumours, commonly used in clinical settings, the term "Head and Neck" refers to the facial and cervical regions (including the oral cavity, larynx, pharynx, salivary glands, nasal cavity, etc.) and excludes the central nervous system. The inclusion of the brain in Figure 1O-R may have led to some confusion. We included the brain because cerebral cavernous malformations are classified as venous malformations, and thus serve as an example of common sites for venous malformations in humans.

      To clarify this point, we have made slight revisions to the first part of the Introduction, as follows:

      They frequently manifest in the head and neck region—here defined as the orofacial and cervical areas, excluding the brain. (Page2, lines 52-53)

      Reviewer #2 (Significance (Required)):

      General assessment

      STRENGTH: A new mouse model seems to recapitulate human vascular malformation well. Possible key molecules have been identified.

      WEAKNESS: The pharmacological approach to support the role of VEGFA and HIF is not appropriate.

      References for the review:

      Abdiche YN, Yeung YA, Chaparro-Riggers J, Barman I, Strop P, Chin SM, Pham A, Bolton G, McDonough D, Lindquist K, et al (2015) The neonatal Fc receptor (FcRn) binds independently to both sites of the IgG homodimer with identical affinity. mAbs 7: 331–343

      Alsuwailem A, Myer CM & Chaudry G (2020) Vascular anomalies of the head and neck. Semin Pediatr Surg 29: 150968

      Boscolo E, Limaye N, Huang L, Kang K-T, Soblet J, Uebelhoer M, Mendola A, Natynki M, Seront E, Dupont S, et al (2015) Rapamycin improves TIE2-mutated venous malformation in murine model and human subjects. J Clin Investig 125: 3491–3504

      Castillo SD, Tzouanacou E, Zaw-Thin M, Berenjeno IM, Parker VER, Chivite I, Milà-Guasch M, Pearce W, Solomon I, Angulo-Urarte A, et al (2016) Somatic activating mutations in Pik3ca cause sporadic venous malformations in mice and humans. Sci Transl Med 8: 332ra43

      Churchill MJ, Bois H du, Heim TA, Mudianto T, Steele MM, Nolz JC & Lund AW (2022) Infection-induced lymphatic zippering restricts fluid transport and viral dissemination from skin. J Exp Med 219: e20211830

      Comas M, Toshkov I, Kuropatwinski KK, Chernova OB, Polinsky A, Blagosklonny MV, Gudkov AV & Antoch MP (2012) New nanoformulation of rapamycin Rapatar extends lifespan in homozygous p53−/− mice by delaying carcinogenesis. Aging (Albany NY) 4: 715–722

      Dellinger MT & Brekken RA (2011) Phosphorylation of Akt and ERK1/2 Is Required for VEGF-A/VEGFR2-Induced Proliferation and Migration of Lymphatic Endothelium. PLoS ONE 6: e28947

      Graupera M, Guillermet-Guibert J, Foukas LC, Phng L-K, Cain RJ, Salpekar A, Pearce W, Meek S, Millan J, Cutillas PR, et al (2008) Angiogenesis selectively requires the p110α isoform of PI3K to control endothelial cell migration. Nature 453: 662–666

      Gupta S, Ramjaun AR, Haiko P, Wang Y, Warne PH, Nicke B, Nye E, Stamp G, Alitalo K & Downward J (2007) Binding of Ras to Phosphoinositide 3-Kinase p110α Is Required for Ras-Driven Tumorigenesis in Mice. Cell 129: 957–968

      Hare LM, Schwarz Q, Wiszniak S, Gurung R, Montgomery KG, Mitchell CA & Phillips WA (2015) Heterozygous expression of the oncogenic Pik3ca H1047R mutation during murine development results in fatal embryonic and extraembryonic defects. Dev Biol 404: 14–26

      Hong Y, Lange‐Asschenfeldt B, Velasco P, Hirakawa S, Kunstfeld R, Brown LF, Bohlen P, Senger DR & Detmar M (2004) VEGF‐A promotes tissue repair‐associated lymphatic vessel formation via VEGFR‐2 and the α1β1 and α2β1 integrins. FASEB J 18: 1111–1113

      Hu H, Juvekar A, Lyssiotis CA, Lien EC, Albeck JG, Oh D, Varma G, Hung YP, Ullas S, Lauring J, et al (2016) Phosphoinositide 3-Kinase Regulates Glycolysis through Mobilization of Aldolase from the Actin Cytoskeleton. Cell 164: 433–446

      Jauhiainen S, Ilmonen H, Vuola P, Rasinkangas H, Pulkkinen HH, Keränen S, Kiema M, Liikkanen JJ, Laham-Karam N, Laidinen S, et al (2023) ErbB signaling is a potential therapeutic target for vascular lesions with fibrous component. eLife 12: e82543

      Larue L & Bellacosa A (2005) Epithelial–mesenchymal transition in development and cancer: role of phosphatidylinositol 3′ kinase/AKT pathways. Oncogene 24: 7443–7454

      Lee JW & Chung HY (2018) Vascular anomalies of the head and neck: current overview. Arch Craniofacial Surg 19: 243–247

      Lee K, Kang JE, Park S-K, Jin Y, Chung K-S, Kim H-M, Lee K, Kang MR, Lee MK, Song KB, et al (2010) LW6, a novel HIF-1 inhibitor, promotes proteasomal degradation of HIF-1α via upregulation of VHL in a colon cancer cell line. Biochem Pharmacol 80: 982–989

      Lee K, Lee J-Y, Lee K, Jung C-R, Kim M-J, Kim J-A, Yoo D-G, Shin E-J & Oh S-J (2021) Metabolite Profiling and Characterization of LW6, a Novel HIF-1α Inhibitor, as an Antitumor Drug Candidate in Mice. Molecules 26: 1951

      Lin Y, Dong M, Liu Z, Xu M, Huang Z, Liu H, Gao Y & Zhou W (2022) A strategy of vascular‐targeted therapy for liver fibrosis. Hepatology 76: 660–675

      Lupu I-E, Kirschnick N, Weischer S, Martinez-Corral I, Forrow A, Lahmann I, Riley PR, Zobel T, Makinen T, Kiefer F, et al (2022) Direct specification of lymphatic endothelium from non-venous angioblasts. bioRxiv: 2022.05.11.491403

      Martinez-Corral I, Zhang Y, Petkova M, Ortsäter H, Sjöberg S, Castillo SD, Brouillard P, Libbrecht L, Saur D, Graupera M, et al (2020) Blockade of VEGF-C signaling inhibits lymphatic malformations driven by oncogenic PIK3CA mutation. Nat Commun 11: 2869

      Maruyama K, Miyagawa-Tomita S, Haneda Y, Kida M, Matsuzaki F, Imanaka-Yoshida K & Kurihara H (2022) The cardiopharyngeal mesoderm contributes to lymphatic vessel development in mouse. eLife 11

      Maruyama K, Miyagawa-Tomita S, Mizukami K, Matsuzaki F & Kurihara H (2019) Isl1-expressing non-venous cell lineage contributes to cardiac lymphatic vessel development. Dev Biol 452: 134–143

      Maruyama K, Naemura K, Arima Y, Uchijima Y, Nagao H, Yoshihara K, Singh MK, Uemura A, Matsuzaki F, Yoshida Y, et al (2021) Semaphorin3E-PlexinD1 signaling in coronary artery and lymphatic vessel development with clinical implications in myocardial recovery. iScience: 102305

      Mashima T, Wakatsuki T, Kawata N, Jang M-K, Nagamori A, Yoshida H, Nakamura K, Migita T, Seimiya H & Yamaguchi K (2021) Neutralization of the induced VEGF-A potentiates the therapeutic effect of an anti-VEGFR2 antibody on gastric cancer in vivo. Sci Rep 11: 15125

      Medesan C, Cianga P, Mummert M, Stanescu D, Ghetie V & Ward ES (1998) Comparative studies of rat IgG to further delineate the Fc : FcRn interaction site. Eur J Immunol 28: 2092–2100

      Nagao M, Hamilton JL, Kc R, Berendsen AD, Duan X, Cheong CW, Li X, Im H-J & Olsen BR (2017) Vascular Endothelial Growth Factor in Cartilage Development and Osteoarthritis. Sci Rep 7: 13027

      Nair SC (2018) Vascular Anomalies of the Head and Neck Region. J Maxillofac Oral Surg 17: 1–12

      Popovich IG, Anisimov VN, Zabezhinski MA, Semenchenko AV, Tyndyk ML, Yurova MN & Blagosklonny MV (2014) Lifespan extension and cancer prevention in HER-2/neu transgenic mice treated with low intermittent doses of rapamycin. Cancer Biol Ther 15: 586–592

      Ryu JY, Chang YJ, Lee JS, Choi KY, Yang JD, Lee S-J, Lee J, Huh S, Kim JY & Chung HY (2023) A nationwide cohort study on incidence and mortality associated with extracranial vascular malformations. Sci Rep 13: 13950

      Sadick M, Wohlgemuth WA, Huelse R, Lange B, Henzler T, Schoenberg SO & Sadick H (2017) Interdisciplinary Management of Head and Neck Vascular Anomalies: Clinical Presentation, Diagnostic Findings and Minimalinvasive Therapies. Eur J Radiol Open 4: 63–68

      Singh AM, Reynolds D, Cliff T, Ohtsuka S, Mattheyses AL, Sun Y, Menendez L, Kulik M & Dalton S (2012) Signaling Network Crosstalk in Human Pluripotent Cells: A Smad2/3-Regulated Switch that Controls the Balance between Self-Renewal and Differentiation. Cell Stem Cell 10: 312–326

      Song JG, Lee YS, Park J-A, Lee E-H, Lim S-J, Yang SJ, Zhao M, Lee K & Han H-K (2016) Discovery of LW6 as a new potent inhibitor of breast cancer resistance protein. Cancer Chemother Pharmacol 78: 735–744

      Srinivasan RS, Dillard ME, Lagutin OV, Lin F-J, Tsai S, Tsai M-J, Samokhvalov IM & Oliver G (2007) Lineage tracing demonstrates the venous origin of the mammalian lymphatic vasculature. Gene Dev 21: 2422–2432

      Stanczuk L, Martinez-Corral I, Ulvmar MH, Zhang Y, Laviña B, Fruttiger M, Adams RH, Saur D, Betsholtz C, Ortega S, et al (2015) cKit Lineage Hemogenic Endothelium-Derived Cells Contribute to Mesenteric Lymphatic Vessels. Cell Reports 10: 1708–1721

      Stone OA & Stainier DYR (2019) Paraxial Mesoderm Is the Major Source of Lymphatic Endothelium. Dev Cell 50: 247-255.e3

      Surve CR, Duran CL, Ye X, Chen X, Lin Y, Harney AS, Wang Y, Sharma VP, Stanley ER, Cox D, et al (2024) Signaling events at TMEM doorways provide potential targets for inhibiting breast cancer dissemination. bioRxiv: 2024.01.08.574676

      Takayama K, Kawakami Y, Kobayashi M, Greco N, Cummins JH, Matsushita T, Kuroda R, Kurosaka M, Fu FH & Huard J (2014) Local intra-articular injection of rapamycin delays articular cartilage degeneration in a murine model of osteoarthritis. Arthritis Res Ther 16: 482

      Tyler B, Wadsworth S, Recinos V, Mehta V, Vellimana A, Li K, Rosenblatt J, Do H, Gallia GL, Siu I-M, et al (2011) Local delivery of rapamycin: a toxicity and efficacy study in an experimental malignant glioma model in rats. Neuro-Oncol 13: 700–709

      Wuest TR & Carr DJJ (2010) VEGF-A expression by HSV-1–infected cells drives corneal lymphangiogenesis. J Exp Med 207: 101–115

      Xu H, Chen Y, Li Z, Zhang H, Liu J & Han J (2022) The hypoxia-inducible factor 1 inhibitor LW6 mediates the HIF-1α/PD-L1 axis and suppresses tumor growth of hepatocellular carcinoma in vitro and in vivo. Eur J Pharmacol 930: 175154

      Xu J, Lamouille S & Derynck R (2009) TGF-β-induced epithelial to mesenchymal transition. Cell Res 19: 156–172

      Xu Q, Liu H, Ye Y, Wuren T & Ge R (2024) Effects of different hypoxia exposure on myeloid-derived suppressor cells in mice. Exp Mol Pathol 140: 104932

      Yu JSL, Ramasamy TS, Murphy N, Holt MK, Czapiewski R, Wei S-K & Cui W (2015) PI3K/mTORC2 regulates TGF-β/Activin signalling by modulating Smad2/3 activity via linker phosphorylation. Nat Commun 6: 7212

      Zenner K, Cheng CV, Jensen DM, Timms AE, Shivaram G, Bly R, Ganti S, Whitlock KB, Dobyns WB, Perkins J, et al (2019) Genotype correlates with clinical severity in PIK3CA-associated lymphatic malformations. Jci Insight 4

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This paper focuses on vascular malformations driven by PI3K mutation, with particular interest in the vascular defects localized at head and neck anatomical sites. The authors exploit the H1047R mutant, which has been widely demonstrated to induce both vascular and lymphatic malformations. To limit the effect of H1047R to tissues originating from the cardiopharyngeal mesoderm, Pik3caH1047R mice were crossed with mice expressing Cre under the control of the promoter of Isl1, a transcription factor that contributes to the development of cardiopharyngeal mesoderm-derived tissues. By comparing the embryo phenotype of this model with that observed by inducing the expression of Pik3caH1047R at different times of development, the authors conclude that the Isl1-Cre; Pik3caH1047R; R26R-eYFP model better recapitulates the anatomical features of human vascular malformations, in particular those localized at the head and neck. In my opinion, the newly proposed model represents significant progress for the study of human vascular malformations. Furthermore, scRNA-seq analysis has allowed the authors to propose a mechanism focused on the role of HIF and VEGFA. The authors provide partial evidence that HIF and VEGFA inhibitors halt the development of vascular malformations in VeCAdCre; Pik3caH1047R mice. This experiment is characterized by a conceptual mistake, because bevacizumab does not recognize murine VEGFA (see for instance 10.1073/pnas.0611492104; 10.1167/iovs.07-1175). This error dampens my enthusiasm.

      Criticism

      Fig 1A. E13.5 corresponds to the early phase of vascular remodelling. What is the phenotype at earlier stages (e.g., E9.5 or E10.5)?

      Fig 1, 2, 3. The analysis of VEGFR2 expression is required. This request is important because of the paradigmatic and non-overlapping roles of this receptor in early and late vascular development. Furthermore, these data would better clarify the mechanism suggested by the experiments reported in Fig 5 (VEGFA and HIF expression).

      As done in Fig 1,2 and 3, data quantification by morphometric analysis is also required for results reported in supplemental figure 3

      Lines 166-174. I suppose that the reported observations were done at E16.5. What happens later? It's crucial to sustain the statement at lines 187-190

      scRNAseq was performed at E13.5 (Fig 4). It's mandatory to perform the same analysis at E16.5, which corresponds to the phenotypic analysis shown in Fig 3. This experiment is required to understand how hypoxia and glycolysis genes change during the development of the vascular malformation.

      Lines 326-343. In this section the authors provide pharmacological evidence that HIF and VEGFA are involved in the vascular malformation caused by H1047R. However, I'm surprised by the efficacy of bevacizumab, which neutralizes human but not murine VEGFA. Genentech has developed the B20 mAb, which specifically neutralizes murine VEGFA. So the data shown require clarification by the authors, and the experiments must be done with the appropriate reagent. Furthermore, what are the pharmacokinetics of these compounds when topically applied?

      Referees cross-commenting

      The issues raised by referee #1 related to the phenotype analysis are right. In my opinion, the Isl1 model proposed here mimics the human pathology well, even if the vascular damage at the head is not so evident.

      Significance

      General assessment

      Strength: a new mouse model seems to recapitulate human vascular malformation well. Possible key molecules have been identified.

      Weakness: The pharmacological approach used to support the role of VEGFA and HIF is not appropriate.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      The authors investigate the pathogenesis of congenital vascular malformations by overexpressing the Pik3caH1047R mutation under the R26 locus in different cell populations and developmental stages using various Cre and CreERT2 lines, including endothelial-specific and different mesoderm precursor lines. The authors provide a thorough characterization of the vascular malformation phenotypes across models. Specifically, they claim that expressing Pik3caH1047R in the cardiopharyngeal mesoderm (CPM) precursors results in vascular abnormalities localized to the head and neck region of the embryo. The study also includes scRNAseq data analyses, including from previously published data and new data generated by the authors. Trajectory inference analysis of a previous scRNA-seq dataset revealed that Isl1+ mesodermal cells can differentiate into ETV2+ cells, directly giving rise to Prox1+ lymphatic endothelial cell progenitors, bypassing the venous stage. Single-cell RNA sequencing of their CPM model and other in vitro datasets show that Pik3caH1047R upregulates VEGF-A via HIF-1α-mediated hypoxia signaling, findings further corroborated in human samples. Finally, preclinical studies in adult mice confirm that pharmacological inhibition of HIF-1α and VEGF-A reduces the number and size of mutant vessels.

      Major comments

      While the study provides a nice characterization of Pik3caH1047R-derived vascular phenotypes induced by expressing this mutation in different cells, the main message of the study is unclear. What is the main question that the authors want to address with this manuscript? The precursor type from which these lesions arise, that venous and lymphatic malformations emerge independently, or when and where this phenotype appears? The manuscript needs some work to make the sections more cohesive and to better structure the main findings and the rationale for choosing the models.

      The authors should better explain when and where the pathogenic phenotypes refer to blood and/or lymphatic malformations. From the quantifications provided in Figure 1, Pik3caH1047R leads to different phenotypes in blood and lymphatic vessels: larger diameters with no difference in the number of blood vessels (are you quantifying all Pecam1-positive vessels? Veins, arteries, capillaries?), and an increase in the number of lymphatic vessels. Please clarify and discuss.

      Which vessel types are considered for the quantifications shown in Fig. 1I, M, Q? All Pecam1+ vessels, including lymphatics, veins, capillaries and arteries, or only some of them? Please provide clarifications.

      The authors propose that the CPM model results in localized head and neck vascular malformations. However, I am not convinced. The images supporting the neck defects are evident, but it is unclear whether there are phenotypes in the head.

      Why are half of the experiments with the Tie2-Cre model conducted at E12.5 (e.g., validation of recombination, signaling, proliferation) and the others at E13.5? It becomes confusing for the reader why the authors start the results section with E13.5 and then study E12.5.

      The quantifications provided do not clarify what the "n" represents or how many embryos or litters were analyzed.

      Blasio et al. (2018) and Hare et al. (2015) reported that embryos carrying Pik3caH1047R with Tie2-Cre die before E10.5. How do the authors explain the increase in survival here? Were embryos at E13.5 alive? What was the Mendelian ratio observed by the authors? Please provide this information and discuss this point.

      Please explain the rationale for using the Cdh5-CreERT2. It is likely due to the lethality observed with Tie2-Cre, but this was not mentioned. Including this information will help readers who are less familiar with the vasculature or the different Cre lines.

      Why were tamoxifen injections done at various time points (E9.5, E12.5, E15.5)? Please clarify the reasoning behind administering tamoxifen at these specific times. Explaining the rationale will help the reader follow the experimental design more easily. Additionally, an initial diagram summarizing all the strategies would help guide the reader from the beginning.

      Why do you use the constitutive Isl1-Cre line (instead of the CreERT2)? The former does not allow control of the timing of recombination (targeting specifically your population of interest) and loses the ability to trace the mutant cell behaviors over time. Is the constitutive expression of Pik3caH1047R in Isl1+ cells lethal at any embryonic time, or do the animals survive into adulthood?

      When you later use the Isl1-CreERT2 line, why do you induce recombination specifically at E8.5? It would be helpful for the reader to have an explanation for this choice, along with a reference to your previous paper. What is the purpose of using this battery of CreERT2 lines (for example, the Myf5-CreERT2)?

      I find the scRNAseq results in Fig S4 and S5 very interesting, although I am unsure how they fit with the rest of the story. In principle, the differentiation of a subset of Isl1+ cardiopharyngeal mesoderm (CPM) derivatives into lymphatic endothelial cells was already demonstrated in a previous publication from the group. What is the novelty and purpose here?

      Why were the ECs in Fig. 4 not subclustered for further analysis (as in Fig. S4 and S5)? This is a missed opportunity to understand the pathogenic phenotypes. Hypoxia and glycolysis signatures are not specific to mutant ECs. Do the authors have an explanation for this? It is well known that PI3K overactivation increases glycolysis; please acknowledge this. Do you have an explanation for the expression of VEGFA by lymphatic mutant cells? Likewise, why did the mesenchymal cells traced from the Isl1-Cre decrease upon expression of Pik3caH1047R?

      The authors need to characterize the preclinical model before conducting any preclinical study. No controls are provided, including wild-type mice and phenotypes before starting the treatment (day 4). Why did the authors not use their developmental model of head and neck malformation for the preclinical studies? This would be much more coherent with the first part of the manuscript. Also, how many animals were treated and quantified for the different conditions?

      Minor Comments

      References in the introduction need to be revised. Specifically, how the authors reached the stats on head and neck vascular malformations needs to be clarified. For instance, one of the cited papers refers to all types of vascular malformation, while the other focuses exclusively on lymphatic malformations with PIK3CA mutations. Moreover, in the latter, the groups are divided into orofacial and neck and body categories. How do the authors extract the head and neck information here? Also, in line 79, I need clarification on ref 24 about fibrosis.

      Include references: Studies in mice have shown that p110α is essential for normal blood and lymphatic vessel development. Please clarify and correct.

      Please define PIP2 and PIP3.

      Why is Prox1 showing positivity in erythrocytes in Figure 1?

      Regarding Figure 1, I suggest organizing the quantifications in the same order to facilitate phenotype comparisons. For example, I, J vs. Q, R. What is the difference between M and N?

      Add the reference for the bulk RNA-seq data.

      Mark in Fig. 4F that the volcano plots are from cluster one of the scRNAseq (this is explained in the text and legend, but it is not very clear in the figure itself).

      Please label Figure 6D/E with the proper labels.

      In Fig. 6, it is mentioned that the vacuoles are from the tamoxifen injection; how do you know? Do you also see them if you add oil alone (without tamoxifen) or tamoxifen in a WT background?

      Referees cross-commenting

      I completely agree with referee #2 regarding the preclinical studies. Bevacizumab does not neutralize murine VEGFA. This is a major issue.

      Significance

      This study addresses a timely and relevant question: the origins, onset and progression of congenital vascular malformations, a field with limited understanding. The work is novel in its approach, employing complex embryonic models that aim to mimic the disease in its native context. By focusing on the effects of Pik3caH1047R mutations in cardiopharyngeal mesoderm-derived endothelial cells, it sheds light on how these mutations drive phenotypic outcomes through specific pathways, such as HIF-1α and VEGF-A signaling, while also identifying potential therapeutic targets. A strong aspect of the study is the use of embryonic models, which enables the investigation of disease onset in a context that closely resembles the in vivo environment. This is particularly valuable for congenital disorders, where native developmental cues are an integral aspect of disease progression. The study also integrates advanced techniques, including single-cell RNA sequencing, to dissect the cellular and molecular responses induced by the Pik3caH1047R mutation. Moreover, from a translational perspective, it provides novel therapeutic strategies for these diseases.

      Limitations of the study are: (1) the main question the authors try to address, and the main conclusions derived from it, are unclear; (2) the different parts of the manuscript are not well connected and the rationale is not clear; (3) the scRNAseq analysis is underdeveloped; (4) a characterization of the preclinical model is not provided.

      Audience:

      The findings presented here interest specialized audiences within developmental biology, vascular biology, and congenital disease research fields, and clinicians by providing new therapies to treat vascular anomalies. Moreover, the study's integration of single-cell and in vivo models could inspire further research in other contexts where understanding clonal behavior and signaling pathways is critical.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Serra et al have conducted transcriptomic analyses for thalamic Sox2 and Nr2f1 cKO mice, revealing gene regulatory networks underlying development and functions of dLGN which plays pivotal roles in visual sensation. The findings are also potentially important for understanding vision disability in human. Their conclusions are mostly supported by the data, but some reinforcement and additional explanations may further improve the paper.

      *We thank the reviewer for their appreciation of our work, and the constructive comments.*

      Major points:

      1. Although they showed that Sox2 does not regulate Nr2f1 by immunostaining in Fig.1, this would be reinforced by the RNA-seq results. What about evidence for regulation of Sox2 by Nr2f1? I could not find any.

      *We have now highlighted, in Fig.1D, the requested RNAseq results from Table S1, showing a very limited reduction of expression of Nr2f1 in Sox2 mutants and of Sox2 in Nr2f1 mutants. We further added ISH results confirming these data (Fig. 4A).*

      The onset of and specificity among the thalamic nuclei of Sox2 and Nr2f1 expression would better be mentioned in the beginning. As far as I remember, both genes are quite widely expressed in the thalamic nuclei, not necessarily specific to dLGN.

      We previously reported in Mercurio et al 2019 (ref. 7) that Sox2 is highly expressed in the dorsal thalamus (precursor to the sensory thalamic nuclei) at least from E15.5 and is later expressed in all the sensory thalamic nuclei, though not in surrounding regions (Mercurio et al 2019 Fig.1). A similar expression pattern was previously reported for Nr2f1 in Chou et al 2013 (ref. 6). A brief mention of this point is now present in Introduction.

      Mechanistically, how Sox2 function becomes distinct in neural stem cells and neurons would be of a great interest (e.g., changes in binding partner). But, it might be too much for the present package.

      *We agree on the interest of this point. We note that SOX2 binding sites in neurons (but not in stem cells), as detected by CUT&RUN, are enriched for SOX2 and RORA/NRF binding sites. The co-presence of SOX and NRF potential binding motifs (Fig. 2F-G) suggests the possibility of direct physical interaction between SOX2 and NR2F1 mediating joint binding to DNA. This is interesting and will be experimentally addressed in a follow-up study.*

      Minor points: 1. Explanation for the values in Fig.3A in the text or the figure legend would be helpful for readers unfamiliar with MuSiC.

      We clarified the figure legend, better explaining how the plotted values were computed and their meaning.

      Since Ror-alpha is also expressed in layer 4 of the cortex, some explanation for why these phenotypes are caused by thalamic defects may be provided. I know that expression of Sox2 and Ror-alpha do not overlap in layer 4, though.

      *In fact, we propose that downregulation of RORa in layer 4 may be caused by reduced thalamic afferents to layer 4, possibly also acting through a reduced delivery of VGF to the cortex; in fact, as the reviewer correctly states, Sox2 itself is not expressed in the cortex.*

      Why did the authors use two types of Sox2 antibodies in Fig.4A?

      We strive to replicate our CUT&RUN data such that we can rely only on the reproducible binding events. We have often noted that, CUT&RUN being a “challenging” application for antibodies, different antibodies yield non-fully overlapping binding profiles. While we do not have a clear explanation for this, we consider it more robust to converge on those binding events that are obtained with two independent antibodies, when such tools are available. This, in our opinion and experience, drastically decreases the chance of stumbling upon false positive hits.

      Quantification for Fig.1A, Fig.2A and 2B may be necessary for the current publication standards.

      The requested quantification has been added in Fig. S1A and in Fig. 4C.

      In Introduction, NRF1 or NRF is somewhat confusing because there is a different gene named NRF (Nuclear respiratory factor).

      *We corrected this.*

      Reference 14 is identical to 44.

      *We corrected this.*

      Reviewer #1 (Significance (Required)):

      This work provides a basis of gene regulatory network involved in development and function of dLGN neurons, which may also be important for understanding mechanisms of vision disability in human caused by genetic mutations. Although I am not an expert in this particular field (GRNs in thalamic neurons), a series of the authors' works certainly establish a molecular basis of the roles of Sox2 ranging from neural stem/progenitor cells to neurons. Limitations of the current study in my opinion would be that it only lists candidate genes for the functions or cause of visual sensations or defects, and thus experimental proof awaits actual biological experiments. Although the results and conclusion provided by the authors are reasonable and convincing, conceptual advance may be limited to some extent. Readers in both basic and clinical research will be interested in the finding that vision disability caused by mutations in Sox2 and Nr2f1 could be explained by synapse-related genes, axon guidance molecules, or secreted factors like VGF, albeit not with big surprise. My research expertise would be in the field of brain development, particularly in regionalization and morphogenesis of the brain. Yet, I am not particularly familiar with transcriptomic analyses in general.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In the current manuscript, Serra, Mercurio, and colleagues carried out Ror-alpha-Cre specific conditional mutant analysis of Sox2 and Nr2f1 in the thalamus/dLGN. The workflow primarily focused on potential mechanisms underlying transcriptional regulation. With RNA-Seq, the authors identified multiple "common" targets shared by both Sox2 and Nr2f1 factors. In parallel, the authors also carried out CUT-RUN analysis for Sox2 binding patterns in dLGN chromatin.

      The current work is built upon the intellectual framework of two papers: the past work led by the senior author in 2019, as well as an earlier work by Chou /O'Leary 2013, in terms of genetic reagents and anatomical and functional analysis. While the newly performed experiments may open some new avenues for future investigation, the current manuscript did NOT vigorously validate bioinformatics predictions using experimental approaches. The current dataset did NOT present any functional and anatomical analysis, esp. in terms of the target gene functions back to the same circuits/connections (thalamus-cortex). The manuscript presented in the current format offers limited biological insights into the neurobiology of dLGN. The limited experimental data also indicated that the manuscript may not be suitable for a very general readership.

      We thank the reviewer for pointing out contributions as well as limitations of our work. We are convinced that our work does indeed open up "new avenues for future investigation", reporting for the first time hundreds of targets of SOX2 and NR2F1 as well as hundreds of direct SOX2 binding sites in dLGN neurons that will contribute to future investigations.

      Major points: 1. Unless I missed anything - I was not sure why the current Figure 1/ Tables 1&2 took a sharp pause without any in situ/histochemical validations of the "prominent" downstream targets - at minimum, the authors should validate the common targets, including VGF among others;

      We have now validated the downregulation of Vgf and Sox5 at the RNA level by ISH, and confirmed SOX5 downregulation by IF. These data are presented in the new Fig. 4, in results page 5 and discussion page 7.

      Could the over-expression of any targets (Sox5, etc) reverse the loss of Sox2-phenotypes, esp. in terms of the establishment of thalamic-cortical connections, as assayed by Fig 2A (as well as Mercurio, 2019, Figure4)? Having such an assay would significantly boost the significance of the current study.

      The experiment suggested by the reviewer would undoubtedly be interesting to address Sox5's contribution to the mutant phenotype; unfortunately, this is too demanding for the present paper.

      However, for the sake of data interpretation, we propose that the mutant phenotypes observed rather result from the global deregulation of a set of genes, not just of a single gene. Indeed, we discuss the potential contribution of several different genes, among those co-regulated by SOX2 and NR2F1. From this point of view, we don't necessarily expect the contribution of a specific gene to be prominent. In fact, we believe an interesting result emerging from our work is the identification of a rather numerous set of genes collectively responding to both Sox2 and Nr2f1 mutation, many of which may contribute to the shared phenotypes of the two mutants.

      Figure 3 is presented in a very inconvenient manner for any reviewers/future readers to understand and interpret. The plots in B and C are what matter the most, while the raw data in 3A could be included in a table. The presentation and comparison of this figure need some significant work.

      We have now modified Fig. 3 as requested and moved the raw data to the Supplementary material (Table S4).

      The Cut-n-Run assays offered several dLGN unique (non-neurogenesis) targets. However, the study paused at bioinformatics prediction without experimental validations as well, including the dLGN peaks near Vgf and Sox5.

      We are not sure we understand the reviewer's question. The "dLGN unique (non-neurogenesis) targets" that we report are not the results of a bioinformatics prediction, but of the CUT&RUN experiment itself, including the dLGN peaks near Vgf and Sox5. In addition, we experimentally validated the downregulation of Vgf and Sox5 by in situ hybridization in the new Figure 4.

      Minor points: For general readers, (1) please explicitly document whether Ror-alpha-Cre does NOT(?) impact the retina and cortex;

      This is now mentioned in results, in agreement with the results in Chou et al. 2013 and Mercurio et al. 2019.

      Chou et al. explicitly mention the absence of Rora Cre activity in the cortex, and this is also in agreement with our own results in Mercurio et al. 2019. As to the retina, we reported not observing any retinal phenotypes in Sox2 mutants, in agreement with the absence of any Sox2 deletion within the retina, which would have caused a drastic phenotype as reported in Taranova et al. 2006.

      (2) please explain the timing of Ror-alpha-Cre expression - is it solely post-mitotic in the dLGN? The authors may have taken these for granted, esp. given Mercurio 2019 and Chou 2013, but such information may help readers outside the field.

      The onset of Rora Cre activity is at a stage in which dLGN neurogenesis is completed and most, if not all, cells are postmitotic, as reported in Chou et al. 2013. This point is now more explicitly mentioned in results.

      Reviewer #2 (Significance (Required)):

      The manuscript offers limited new information to general readers. It might be a good dataset for researchers specialized in transcriptional regulation in terms of finding useful/relevant information to design future experiments. However, the study did NOT offer any histological and functional assays based on bioinformatics tests.

      • General assessment: The strengths were a careful analysis of dLGN in early development using both RNA-Seq and Cut-n-Run with a focus on Sox2's post-mitotic role. The limitations were that the study lacked histological validations and functional tests of the candidate genes.

      We now added histological validation of selected targets as requested in the new Fig. 4.

      • Advance: The advance of the study is limited, though the experiments were carefully launched.

      • Audience: Very limited audience with a specialty in transcription factors in visual system development.

      The reviewer is an expert in neurodevelopment using the mouse genetics approach, with primary interests in studying the retina and retino-recipient zone development.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This manuscript investigates the role of Sox2 and Nr2f1 in dLGN development. The authors perform RNA-seq on thalamus-specific conditional knock-outs of Sox2 and Nr2f1. The authors compile lists of the genes that showed the greatest change in detection between control mice (3 and 3) and mutant mice (3 and 3). The authors find significant overlap in the lists of genes most altered in the mutants and argue that this overlap is consistent with the two transcription factors regulating the same gene network. The authors also perform a CUT&RUN analysis of Sox2 binding sites and find overlap in the list of genes that Sox2 binds to and the genes with altered expression levels in the Sox2-cKO. Regulation of neuron-specific cellular components is highly represented in both the list of binding sites and genes with altered expression levels.

      The RNA-seq data and binding site data are valuable resources for researchers trying to understand the development of the dLGN and should be published. However, I am not confident that the authors' interpretations of their data are supported by what is provided in the manuscript.

      Major comments:

      Issues with the statistical logic

      -Lack of statistical significance is not evidence of equality. The fact that Sox2 and Nr2f1 do not pass the FDR threshold is not evidence that they are unchanged in the two conditional knock-outs.

      The purpose of statistical testing and significance in this context is to assess whether, and to what extent, the observed changes in RNA-Seq estimated transcript levels can be due only to experimental variability (not significant) or, vice versa, whether there is an additional biological factor (the knock-out of Sox2 or Nr2f1, in this case) behind the changes observed. Clearly, the more “significant” (lower) the p-value/FDR values associated with the changes observed for a gene, the more likely it is that the gene transcript levels are affected by the knock-outs. Vice versa, if the change is reported to be “not significant”, there isn’t enough evidence, at least from a statistical point of view, that the observed changes in transcript levels are due to the effect of the knock-outs. Three replicates per condition are required in order to estimate variance, which is gene-specific and captures the “natural” range of variability of each gene due only to experimental variability (and not generated by the knock-outs).

      We now report the RNAseq data for Sox2 and Nr2f1 in Fig. 1D and complete them with ISH data in the new Fig. 4. The results are consistent with a limited reduction of Nr2f1 in the Sox2 mutants and of Sox2 in the Nr2f1 mutants. Though we cannot rule out that they might contribute to some extent to the mutant phenotype, we document a stronger downregulation, in both mutants, of a vast set of other genes (Fig. 1C) onto which our analysis focuses.

      -Many arguments are based on the result that Sox2 knock out has a "strong" effect on a gene. FDR and p-values do not provide evidence about effect size beyond "not 0". Average TPN values are provided but, without sorting through thousands of values in the supplementary data, it is not possible to judge the reliability of a claimed effect size. Finally, no biological reference is given for what should be considered a strong effect size besides the relative values within the knockout experiment. I would like to see the replicates for the relevant TPN data presented in the main text and I would like to see the variance between those replicates considered in the author's conclusions. Space in the tables could be saved by reporting fewer digits in the fold changes.

      See previous point. The more “significant” the changes in transcript levels according to statistical testing, the “stronger” the effect of the knock-out on them, where by “strong” we mean a more relevant variation of transcript levels. However, since we realized that this term could confuse the reader, we rephrased the relevant parts. Variance is taken into account in the computation of p-values/FDRs, so the same difference in mean TPM values for two different genes can turn out to be more or less significant depending on the estimated variance of the values.
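
      The point about variance can be illustrated with a small sketch. The responses do not name the statistical test used, so Welch's t statistic is used here purely as a stand-in, and all TPM values below are hypothetical, not taken from the study. Two genes with the same mean difference give very different test statistics once their replicate spread differs.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(control, mutant):
    """Welch's t statistic for two independent groups of replicates."""
    # Standard error of the difference in means, using each group's own variance.
    se = sqrt(variance(control) / len(control) + variance(mutant) / len(mutant))
    return (mean(mutant) - mean(control)) / se


# Hypothetical TPM values, three replicates per condition (not from the study).
# Both genes drop by 20 TPM on average, but gene B is far noisier than gene A.
gene_a_control, gene_a_mutant = [99.0, 100.0, 101.0], [79.0, 80.0, 81.0]
gene_b_control, gene_b_mutant = [80.0, 100.0, 120.0], [60.0, 80.0, 100.0]

t_a = welch_t(gene_a_control, gene_a_mutant)
t_b = welch_t(gene_b_control, gene_b_mutant)

print(f"gene A: t = {t_a:.2f}")
print(f"gene B: t = {t_b:.2f}")
```

      Dedicated RNA-Seq tools typically model counts rather than TPM and share variance information across genes, so this is only a way to see the principle, not a substitute for those methods.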

      -The authors identify 469 dLGN specific SOX2 binding sites by subtracting the 248 high confidence binding sites identified in non-dLGN cells from the 717 high confidence binding sites identified in dLGN. This subtraction is basically a comparison of p-values with the false assumption that lack of statistical significance means there was no change. The quantitation required to make the claim would be a direct comparison of the two data sets for each binding site.

      We appreciate the reviewer's concern. CUT&RUN, especially when performed in vivo rather than in cell lines, has high intrinsic variability between experiments, and even between technical replicates (DOI: 10.1093/nar/gkae180). While it would be possible, for example, to run DiffBind (built for ChIP-seq) on the dLGN data versus the NS data, these are not, in our opinion, directly comparable, as they were not performed in the same batch, on the same type of material (dissected mouse tissue versus cultured cells), or even with the same batches of reagents. We therefore do not believe it is correct to quantify them in terms of signal at specific loci without taking into account factors such as global background, local background, and overall signal-to-noise ratio. There are many attempts in the field to better quantify CUT&RUN data (spike-in yeast or E. coli DNA at different moments, spike-in Drosophila nuclei, etc.), but a general consensus on which is best or most trustworthy has yet to be reached. The best way we could perform the comparison, with our data as generated, was as pointed out above: by comparing the statistically significant events in the dLGN with those in the NS, so that each dataset is considered independently before the overlap is computed. To help alleviate the reviewer's concerns, we have provided here, for the reviewer, signal profiles and heatmaps of the dLGN-only regions in both dLGN and NS CUT&RUN.

      Non-quantitative issues:

      -It is known that both the Sox2 and Nr2f1 mutants have similar dLGN phenotypes. How, then, can we know if individual changes in gene expression reflect direct regulation by Sox2 and Nr2f1 or the dramatically altered state of the dLGN? The binding data would add to the argument of direct regulation, but it is difficult to judge the specificity of the binding data.

      The timepoint of the RNAseq analyses was chosen to precede any phenotypic changes detected in the dLGN based on our previous analyses reported in Mercurio et al. 2019 as stated in Results page 3.

      -The authors argue that a decrease in layer 4 of the cortex argues that Vgf1 is a likely link between Sox2 and cortical development. However, some decrease in layer 4 thickness is a given if the number of thalamocortical cells in dLGN is reduced.

      We agree with the Reviewer. The possible contribution of VGF has been rephrased considering a possible wider contribution of thalamic afferents in general.

      -Immunofluorescence is used to support the idea that the number of cells strongly expressing Sox5 is reduced in the Sox2 cKO. The image shows a reduced patch of Sox5 labeling. However, the dLGN is generally reduced in the Sox2 cKO so it is not clear if there is a difference in the proportion of cells expressing Sox5. The sample size also appears to be 1.

      The time of this analysis was chosen to precede dLGN size reduction in mutants, as clearly shown in our previous work Mercurio et al. 2019 and further confirmed by the new ISH for Sox2 and Nr2f1 presented in the new Fig. 4.

      The sample size is n=4 as reported in the Figure legend.

      Minor

      Introduction:

      -Writing could be improved.

      -Descriptions of effects of Sox2 or Nr2f1 using RORalpha-Cre use words like "reduced", "significant", "important". It is unclear what the actual effects or effect sizes are.

      We revised the wording for this point.

      RESULTS

      -What is "Three independent pools of mutant and control dissected visual thalami"? Three mice for each condition (twice for control)?

      -Why are there two groups of 3 control mice each and not one group of 6?

      As reported in Materials and Methods " RNA sequencing was performed on three independent samples for both mutant and control dLGN. Each sample was composed of dLGNs from three animals of the same genotype pooled together."

      Thalami from 3 mice represent an adequate amount of RNA to perform a single RNAseq experiment. 3 x 3 represents a biological triplicate for the RNAseq experiment.

      Section 2

      -For the model in which the probability of genes changing in the same direction is calculated, are all genes assumed to have the same chance of passing the FDR? Gene variance and detection rate will be correlated between conditions. I would suggest a more conservative comparison. What is the correlation of fold change for genes that pass FDR? Of 514 that change in both, 481 go in the same direction and 33 go in a different direction. If everything is random, the number would be 257/257. The claim of four times random overlap does not seem like the conservative estimate.

      Genes were selected with the same FDR thresholds in both experiments. The assumption is in fact simpler: the probability that a gene shows a significant change (passing the FDR threshold) in one experiment does not influence its probability of changing in the other, and vice versa. That is, we compute the probability of obtaining a given number of up- or down-regulated genes in common between the two experiments, assuming that the two experiments are independent of one another. This is the usual strategy for assessing whether the overlap between two gene sets obtained from two different genome-wide experiments can be considered random: if the number of genes in the overlap is close to the value expected at random, the two sets can be considered independent of one another.
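
      The independence assumption described above is commonly formalized as a hypergeometric test. The sketch below is an illustration under that assumption, not the authors' actual computation. The function names and the gene-set sizes are hypothetical, since the responses do not state the universe size or the size of each deregulated set.

```python
from math import comb


def expected_overlap(universe, set_a, set_b):
    """Expected number of shared genes if the two sets are independent."""
    return set_a * set_b / universe


def overlap_p_value(universe, set_a, set_b, overlap):
    """P(X >= overlap) when set_b genes are drawn at random from the universe,
    of which set_a are 'marked' (hypergeometric upper tail)."""
    total = comb(universe, set_b)
    upper = min(set_a, set_b)
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical sizes for illustration only (not from the study):
# 20,000 tested genes, 1,000 deregulated in one mutant, 800 in the other,
# 200 deregulated in both.
universe, set_a, set_b, shared = 20_000, 1_000, 800, 200

print("expected by chance:", expected_overlap(universe, set_a, set_b))
print("fold over chance:  ", shared / expected_overlap(universe, set_a, set_b))
print("p-value:           ", overlap_p_value(universe, set_a, set_b, shared))
```

      The reviewer's objection still applies to this kind of calculation: it treats every gene as equally likely to pass the FDR threshold, which is not true when variance and detection rate are correlated between experiments.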

      Section 3

      -I don't see any basis to judge the p-values in Fig 1D. How do these changes compare to what you would expect from other dramatic manipulations of neural tissue? Can figure 1D be compared to changes in a non-neuronal standard? How about metabolism and cell death?

      The graph shown represents the most significantly enriched functional annotations (GO annotations, pathways, etc.) among the deregulated genes as computed by Enrichr, one of the many tools developed for this task. And as for all the tools performing this analysis, the p-value means “the probability of having the same number of genes sharing the same functional annotation in a set of genes chosen at random”, computed with the same strategy employed for the overlap between the two deregulated gene sets described before.

      Section "Deconvolution..."

      -It is great that results for each replicate are presented.

      We thank the reviewer.

      -There are too many significant digits in Fig 3A given the variance.

      This has been adjusted as suggested.

      -Why do the NR2F1 mutants look more like the Sox2 controls (in terms of excitatory Neurons) than the NR2F1 controls do?

      The graphical presentation of the data in Fig. 3 has been improved, and the numerical data (former panel A) have been moved to the supplementary materials (Table S4) as recommended.

      Controls for Nr2f1 and Sox2 mutants have similar values for excitatory neurons, as expected, see Table S4. Fig. 3 shows the variation between each knock-out and its respective control experiments, and although excitatory neurons are reduced in both mutants the extent of reduction is greater in the Sox2 mutant.

      Section "CUT&RUN..."

      -How many overlaps (Figure 4B) would you expect by chance?

      This is an extremely difficult number to calculate. It is possible, for example, to generate a random set of genomic fragments of similar length and check how many of them overlap. This would, however, be extremely unfair, as CUT&RUN is naturally biased towards open chromatin, and a “randomly” digested set would thus preferentially contain these types of regions. Additionally, data analysis and mapping biases further inflate the overlaps that occur. To circumvent this, we i) use an IgG control, which should identify and remove regions that are nonspecifically digested and sequenced during the experiment, and ii) performed our analysis after first removing sets of known artifact regions (Nordin et al 2023, ref. 43).
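
      For readers who want to see what the naive baseline mentioned above would look like, here is a minimal sketch. It places peaks uniformly at random on a single hypothetical chromosome and counts overlaps. All coordinates, names, and the genome length are invented for illustration, and, as the response points out, a uniform null ignores the open-chromatin bias of CUT&RUN and would therefore understate the overlap expected by chance.

```python
import random


def count_overlaps(peaks_a, peaks_b):
    """Count peaks in peaks_a that overlap at least one peak in peaks_b.

    Peaks are (start, end) half-open intervals on a single chromosome.
    """
    return sum(
        any(a_start < b_end and b_start < a_end for b_start, b_end in peaks_b)
        for a_start, a_end in peaks_a
    )


def shuffled_overlaps(peaks_a, peaks_b, genome_length, n_shuffles=1000, seed=0):
    """Overlap counts after re-placing each peak in peaks_a uniformly at random,
    keeping its length. This is the naive (uniform) null model."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_shuffles):
        shuffled = []
        for start, end in peaks_a:
            length = end - start
            new_start = rng.randrange(0, genome_length - length + 1)
            shuffled.append((new_start, new_start + length))
        counts.append(count_overlaps(shuffled, peaks_b))
    return counts


# Hypothetical peak sets (not from the study).
dlgn_peaks = [(100, 200), (500, 600), (900, 1000)]
ns_peaks = [(150, 250), (550, 650), (2000, 2100)]

observed = count_overlaps(dlgn_peaks, ns_peaks)
null_counts = shuffled_overlaps(dlgn_peaks, ns_peaks, genome_length=10_000, n_shuffles=200)

print("observed overlaps:", observed)
print("mean overlaps under uniform null:", sum(null_counts) / len(null_counts))
```

      A fairer null would restrict the random placement to accessible regions, which is one reason the authors rely on an IgG control and artifact-region filtering instead.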

      -Fig 4J needs more description. What does the first full pie represent?

      We have added more description to the figure legend; it now reads:

      Schematic depiction of CUT&RUN and RNA-seq overlap, showing Sox2 peak associated genes that are transcribed ( > 5 TPM, 784/1102) and those that are differentially expressed (DEG) in Sox2 mutant dLGN (FDR

      -Please include the denominator in the binding event argument. It is difficult to judge the specificity of the effect in this section.

      We apologize but we don't understand this comment.

      Reviewer #3 (Significance (Required)):

      The mouse dorsal lateral geniculate nucleus (dLGN) is an important model system for understanding vision and the development of visual circuitry. A considerable literature exists on the role of activity dependent development and molecular gradients in shaping the synaptic connections between the retina and the dLGN. Less is known about the transcriptional networks that regulate dLGN development. Mutations in the transcription factors Sox2 and NR2F1 are associated with severe vision defects and conditional knockout of Sox2 has been shown to cause dramatic defects in dLGN development. The data provided in the current study adds to our understanding of how these transcription factors influence gene expression and circuit formation in the dLGN. Their work points to changes in VGF expression and fewer thalamocortical cells as the most salient effects of Sox2 deletion. These results increase our understanding of the transcriptional networks underlying dLGN development and several visual pathologies.

      I think the manuscript should be helpful to researchers interested in the dLGN or researchers interested in the transcription factors important for neural circuit development. My own expertise covers dLGN development but not transcription factors and the interpretation of RNA-seq data. My impression was that the biggest contribution of this manuscript was in obtaining gene expression levels in the Sox2 conditional knockout with multiple RNA-seq replicates. The impact of the paper, as written, is lessened by the fact that the confidence gained by replicating the analysis is not leveraged in the main text of the manuscript.

      Performing RNA-Seq analysis in replicates is common practice, and as we detailed in our replies to the reviewer’s comments, the goal of replicates is to obtain reliable estimates of the parameters needed (mean and variance of each gene) for the subsequent statistical analyses. We therefore leveraged the information obtained from the replicates to identify with high confidence which genes could be considered to be affected by the knock-outs.

      Much of the results, interpretation, and discussion depend on sorting strong effects on genes from weak ones without presenting replicates for effect size or confidence intervals. The replicate data is available in the supplementary data and should be a good resource for future research.

      As discussed in the previous responses, the statistical evaluations usually performed on estimated transcript levels and their variance can be translated into a more qualitative evaluation of the effect of the knock-outs: the larger the impact on the transcript levels of a gene relative to its estimated variance (variability), the stronger the effect is assumed to be. Confidence intervals are not usually employed in this context; the “confidence” with which the experimental setting can be assumed to affect gene expression is summarized by the p-values and the subsequent FDR values.
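
      To make the reviewer's request concrete, the sketch below shows one simple way an effect size with a confidence interval could be reported for a single gene from three replicates per condition. It is an illustration only, not the authors' pipeline: the pooled-variance t interval on log2(TPM + 1), the pseudocount, the function name, and the TPM values are all assumptions.

```python
from math import log2, sqrt
from statistics import mean, variance

# Two-sided 95% critical value of Student's t with 4 degrees of freedom
# (3 + 3 replicates, pooled variance).
T_CRIT_DF4 = 2.776


def log2_fold_change_ci(control_tpm, mutant_tpm, pseudocount=1.0):
    """log2 fold change (mutant / control) with a pooled-variance 95% CI.

    Assumes exactly three replicates per group, matching T_CRIT_DF4.
    """
    if len(control_tpm) != 3 or len(mutant_tpm) != 3:
        raise ValueError("this sketch assumes three replicates per group")
    c = [log2(x + pseudocount) for x in control_tpm]
    m = [log2(x + pseudocount) for x in mutant_tpm]
    diff = mean(m) - mean(c)
    pooled_var = (variance(c) + variance(m)) / 2  # valid for equal group sizes
    se = sqrt(pooled_var * (1 / len(c) + 1 / len(m)))
    half_width = T_CRIT_DF4 * se
    return diff, diff - half_width, diff + half_width


# Hypothetical TPM values for one gene (not from the study).
control = [120.0, 100.0, 110.0]
mutant = [30.0, 25.0, 28.0]

lfc, low, high = log2_fold_change_ci(control, mutant)
print(f"log2FC = {lfc:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

      An interval that excludes zero and is narrow relative to the fold change conveys both that the change is unlikely to be noise and how large it plausibly is, which a p-value alone does not.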

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      This manuscript investigates the role of Sox2 and Nr2f1 in dLGN development. The authors perform RNA-seq on thalamus-specific conditional knock outs of Sox2 and Nr2f1. The authors compile lists of the genes that showed the greatest change in detection between control mice (3 and 3) and mutant mice (3 and 3). The authors find significant overlap in the lists of genes most altered in the mutants and argue that this overlap is consistent with the two transcription factors regulating the same gene network. The authors also perform a CUT&RUN analysis of Sox2 binding sites and find overlap in the list of genes that Sox2 binds to and the genes with altered expression levels in the Sox2-cKO. Regulation of neuron-specific cellular components is highly represented in both the list of binding sites and genes with altered expression levels.

      The RNA-seq data and binding site data are valuable resources for researchers trying to understand the development of the dLGN and should be published. However, I am not confident that the authors' interpretations of their data are supported by what is provided in the manuscript.

      Major comments:

      Issues with the statistical logic

      • Lack of statistical significance is not evidence of equality. The fact that Sox2 and Nr2f1 do not pass the FDR threshold is not evidence that they are unchanged in the two conditional knockouts.
      • Many arguments are based on the result that Sox2 knock out has a "strong" effect on a gene. FDR and p-values do not provide evidence about effect size beyond "not 0". Average TPM values are provided but, without sorting through thousands of values in the supplementary data, it is not possible to judge the reliability of a claimed effect size. Finally, no biological reference is given for what should be considered a strong effect size besides the relative values within the knockout experiment. I would like to see the replicates for the relevant TPM data presented in the main text and I would like to see the variance between those replicates considered in the authors' conclusions. Space in the tables could be saved by reporting fewer digits in the fold changes.
      • The authors identify 469 dLGN specific SOX2 binding sites by subtracting the 248 high confidence binding sites identified in non-dLGN cells from the 717 high confidence binding sites identified in dLGN. This subtraction is basically a comparison of p-values with the false assumption that lack of statistical significance means there was no change. The quantitation required to make the claim would be a direct comparison of the two data sets for each binding site.

      Non-quantitative issues:

      • It is known that both the Sox2 and Nr2f1 mutants have similar dLGN phenotypes. How, then, can we know if individual changes in gene expression reflect direct regulation by Sox2 and Nr2f1 or the dramatically altered state of the dLGN? The binding data would add to the argument of direct regulation, but it is difficult to judge the specificity of the binding data.
      • The authors argue that a decrease in layer 4 of the cortex argues that Vgf1 is a likely link between Sox2 and cortical development. However, some decrease in layer 4 thickness is a given if the number of thalamocortical cells in dLGN is reduced.
      • Immunofluorescence is used to support the idea that the number of cells strongly expressing Sox5 is reduced in the Sox2 cKO. The image shows a reduced patch of Sox5 labeling. However, the dLGN is generally reduced in the Sox2 cKO so it is not clear if there is a difference in the proportion of cells expressing Sox5. The sample size also appears to be 1.

      Minor

      Introduction:

      • Writing could be improved.
      • Descriptions of effects of Sox2 or Nr2f1 using RORalpha-Cre use words like "reduced", "significant", "important". It is unclear what the actual effects or effect sizes are.

      RESULTS

      • What is "Three independent pools of mutant and control dissected visual thalami"? Three mice for each condition (twice for control)?
      • Why are there two groups of 3 control mice each and not one group of 6?

      Section 2

      • For the model in which the probability of genes changing in the same direction is calculated, are all genes assumed to have the same chance of passing the FDR? Gene variance and detection rate will be correlated between conditions. I would suggest a more conservative comparison. What is the correlation of fold change for genes that pass FDR? Of 514 that change in both, 481 go in the same direction and 33 go in a different direction. If everything is random, the number would be 257/257. The claim of four times random overlap does not seem like the conservative estimate.
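
      The 257/257 baseline in the comment above corresponds to a simple sign test: if direction were random and genes were independent, each of the 514 shared genes would change in the same direction in both mutants with probability 0.5. The sketch below computes that baseline from the counts quoted in the comment. The function name is hypothetical, and the independence assumption is exactly what the comment questions, so the resulting p-value should be read as an optimistic bound rather than a conservative estimate.

```python
from math import comb


def binomial_upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_shared = 514          # genes changed in both mutants (from the comment)
n_same_direction = 481  # of those, changed in the same direction

expected_same = n_shared * 0.5             # 257 under the random baseline
concordance = n_same_direction / n_shared  # observed fraction in the same direction
p_value = binomial_upper_tail(n_shared, n_same_direction)

print(f"expected same-direction genes by chance: {expected_same:.0f}")
print(f"observed concordance: {concordance:.3f}")
print(f"sign-test p-value: {p_value:.3e}")
```

      A correlation of log fold changes across the shared genes, as the comment suggests, would complement this by using the size of each change and not only its sign.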

      Section 3

      • I don't see any basis to judge the p-values in Fig 1D. How do these changes compare to what you would expect from other dramatic manipulations of neural tissue? Can figure 1D be compared to changes in a non-neuronal standard? How about metabolism and cell death?

      Section "Deconvolution..."

      • It is great that results for each replicate are presented.
      • There are too many significant digits in Fig 3A given the variance.
      • Why do the NR2F1 mutants look more like the Sox2 controls (in terms of excitatory Neurons) than the NR2F1 controls do?

      Section "CUT&RUN..."

      • How many overlaps (Figure 4B) would you expect by chance?
      • Fig 4J needs more description. What does the first full pie represent?
      • Please include the denominator in the binding event argument. It is difficult to judge the specificity of the effect in this section.

      Significance

      The mouse dorsal lateral geniculate nucleus (dLGN) is an important model system for understanding vision and the development of visual circuitry. A considerable literature exists on the role of activity dependent development and molecular gradients in shaping the synaptic connections between the retina and the dLGN. Less is known about the transcriptional networks that regulate dLGN development. Mutations in the transcription factors Sox2 and NR2F1 are associated with severe vision defects and conditional knockout of Sox2 has been shown to cause dramatic defects in dLGN development. The data provided in the current study adds to our understanding of how these transcription factors influence gene expression and circuit formation in the dLGN. Their work points to changes in VGF expression and fewer thalamocortical cells as the most salient effects of Sox2 deletion. These results increase our understanding of the transcriptional networks underlying dLGN development and several visual pathologies.

      I think the manuscript should be helpful to researchers interested in the dLGN or researchers interested in the transcription factors important for neural circuit development. My own expertise covers dLGN development but not transcription factors and the interpretation of RNA-seq data. My impression was that the biggest contribution of this manuscript was in obtaining gene expression levels in the Sox2 conditional knockout with multiple RNA-seq replicates. The impact of the paper, as written, is lessened by the fact that the confidence gained by replicating the analysis is not leveraged in the main text of the manuscript. Much of the results, interpretation, and discussion depend on sorting strong effects on genes from weak ones without presenting replicates for effect size or confidence intervals. The replicate data is available in the supplementary data and should be a good resource for future research.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In the current manuscript, Serra, Mercurio, and colleagues carried out Ror-alpha-Cre specific conditional mutant analysis of Sox2 and Nr2f1 in the thalamus/dLGN. The workflow primarily focused on potential mechanisms underlying transcriptional regulation. With RNA-Seq, the authors identified multiple "common" targets shared by both Sox2 and Nr2f1 factors. In parallel, the authors also carried out CUT-RUN analysis for Sox2 binding patterns in dLGN chromatin.

      The current work is built upon the intellectual framework of two papers: the past work led by the senior author in 2019, as well as an earlier work by Chou /O'Leary 2013, in terms of genetic reagents and anatomical and functional analysis. While the newly performed experiments may open some new avenues for future investigation, the current manuscript did NOT vigorously validate bioinformatics predictions using experimental approaches. The current dataset did NOT present any functional and anatomical analysis, esp. in terms of the target gene functions back to the same circuits/connections (thalamus-cortex). The manuscript presented in the current format offers limited biological insights into the neurobiology of dLGN. The limited experimental data also indicated that the manuscript may not be suitable for a very general readership.

      Major points:

      1. Unless I missed anything - I was not sure why the current Figure 1/ Tables 1&2 took a sharp pause without any in situ/histochemical validations of the "prominent" downstream targets - at minimum, the authors should validate the common targets, including VGF among others;
      2. Could the over-expression of any targets (Sox5, etc) reverse the loss of Sox2-phenotypes, esp. in terms of the establishment of thalamic-cortical connections, as assayed by Fig 2A (as well as Mercurio, 2019, Figure4)? Having such an assay would significantly boost the significance of the current study.
      3. Figure 3 is presented in a very inconvenient manner for any reviewers/future readers to understand and interpret. The plots in B and C are what matter the most, while the raw data in 3A could be included in a table. The presentation and comparison of this figure need some significant work.
      4. The Cut-n-Run assays offered several dLGN unique (non-neurogenesis) targets. However, the study paused at bioinformatics prediction without experimental validations as well, including the dLGN peaks near Vgf and Sox5.

      Minor points:

      For general readers, (1) please explicitly document whether Ror-alpha-Cre does NOT(?) impact the retina and cortex; (2) please explain the timing of Ror-alpha-Cre expression: is it solely post-mitotic in the dLGN? The authors may have taken these for granted, esp. given Mercurio 2019 and Chou 2013, but such information may help readers outside the field.

      Significance

      The manuscript offers limited new information to general readers. It might be a good dataset for researchers specialized in transcriptional regulation in terms of finding useful/relevant information to design future experiments. However, the study did NOT offer any histological and functional assays based on bioinformatics tests.

      General assessment:

      The strengths were a careful analysis of dLGN in early development using both RNA-Seq and Cut-n-Run with a focus on Sox2's post-mitotic role. The limitation was that the study lacked histological validation and functional tests of the candidate genes.

      Advance:

      The advance of the study is limited, though the experiments were carefully launched.

      Audience:

      Very limited audience with a specialty in transcription factors in visual system development.

      The reviewer is an expert in neurodevelopment using the mouse genetics approach, with primary interests in studying the retina and retino-recipient zone development.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Serra et al have conducted transcriptomic analyses of thalamic Sox2 and Nr2f1 cKO mice, revealing gene regulatory networks underlying the development and functions of the dLGN, which plays pivotal roles in visual sensation. The findings are also potentially important for understanding vision disability in humans. Their conclusions are mostly supported by the data, but some reinforcement and additional explanations may further improve the paper.

      Major points:

      1. Although they showed by immunostaining in Fig.1 that Sox2 does not regulate Nr2f1, this would be reinforced by the RNA-seq results. What about evidence for regulation of Sox2 by Nr2f1? I could not find any.
      2. The onset of and specificity among the thalamic nuclei of Sox2 and Nr2f1 expression would better be mentioned in the beginning. As far as I remember, both genes are quite widely expressed in the thalamic nuclei, not necessarily specific to dLGN.
      3. Mechanistically, how Sox2 function becomes distinct in neural stem cells and neurons would be of great interest (e.g., changes in binding partner). But it might be too much for the present package.

      Minor points:

      1. Explanation for the values in Fig.3A in the text or the figure legend would be helpful for readers unfamiliar with MuSiC.
      2. Since Ror-alpha is also expressed in layer 4 of the cortex, some explanation of why these phenotypes are caused by thalamic defects may be provided. I know that expression of Sox2 and Ror-alpha do not overlap in layer 4, though.
      3. Why did the authors use two types of Sox2 antibodies in Fig.4A?
      4. Quantification for Fig.1A, Fig.2A and 2B may be necessary for the current publication standards.
      5. In Introduction, NRF1 or NRF is somewhat confusing because there is a different gene named NRF (Nuclear respiratory factor).
      6. Reference 14 is identical to 44.

      Significance

      This work provides a basis for the gene regulatory network involved in the development and function of dLGN neurons, which may also be important for understanding the mechanisms of vision disability in humans caused by genetic mutations. Although I am not an expert in this particular field (GRNs in thalamic neurons), the authors' series of works certainly establishes a molecular basis for the roles of Sox2, ranging from neural stem/progenitor cells to neurons. A limitation of the current study, in my opinion, is that it only lists candidate genes for the functions or causes of visual sensations or defects, and thus experimental proof awaits actual biological experiments. Although the results and conclusions provided by the authors are reasonable and convincing, the conceptual advance may be limited to some extent. Readers in both basic and clinical research will be interested to learn that vision disability caused by mutations in Sox2 and Nr2f1 could be explained by synapse-related genes, axon guidance molecules, or secreted factors like VGF, although this comes as no great surprise.

      My research expertise would be in the field of brain development, particularly in regionalization and morphogenesis of the brain. Yet, I am not particularly familiar with transcriptomic analyses in general.

    1. Symbolism

      The image you've uploaded is a detailed version of Salvador Dalí's "The Persistence of Memory". The annotations in the image explain the symbolism of various elements within the painting. Let me summarize the explanation for each of the numbered items:

      1. Soft Watches: The three melting clocks represent time, which is relative and in constant motion. As in dreams, past, present, and future coexist and interact with each other. The different positions of the clocks symbolize these various temporalities.

      2. Orange Watch: This watch does not melt, symbolizing the passage of time in a more tangible way. It is also covered with ants, which represent decay and death. Dalí observed ants on a bat's remains as a child, which linked them to decomposition.

      3. Ants: The ants swarm over the solid watch, symbolizing decomposition and the inevitability of death. They suggest the inevitable passage of time and its destructive effects.

      4. Insect: The insect in the image can be interpreted as representing time's fleeting nature, as it is often associated with the passage of time and change.

      5. Strange Object: This object or character lying on the ground might represent Dalí himself or the dreamlike state of the inner world, symbolizing the surreal, subconscious elements of human experience.

      6. Mirror: The mirror in the painting symbolizes the inconstancy of reality. It reflects both the real world and the imaginary, blurring the line between the two.

      7. Olive Tree: Traditionally a symbol of wisdom, the tree here is dry and dead, suggesting the decay of past knowledge or wisdom. It may represent the idea of the past becoming irrelevant or obsolete.

      8. Empty Shoreline: The empty shoreline reflects Dalí's emotional emptiness, emphasizing the loneliness or void of the inner self, and the passage of time.

      9. Bright Sea: The sea is a symbol of memory and reality. It contrasts with the dark foreground, representing the boundary between the known (memory and reality) and the unknown (the imaginary and overwhelming world).

      10. Mountains: These mountains are rooted in the ground, symbolizing permanence and memory. They are part of Dalí's childhood and hold personal significance.

      11. Egg: The egg is a symbol of birth, renewal, and the potential for new beginnings. It reflects hope, creation, and transformation.

      This combination of symbolism is a key feature of Dalí's surrealism, where the fluidity of time, memory, and personal experiences are intertwined, forming a complex narrative about existence, perception, and the subconscious.

    1. Data set 4:

      Since two groups are compared with respect to time, I would reformulate this data set into a line graph with one line for “Revolution helmet” and another for “Standard helmet”. Each line would have a unique color and pattern labeled in a legend. The x-axis label = “Interval (weeks)” with tick marks for 1, 2, 3, and 4. The y-axis label = “Return to Play Rates (%)” with tick marks from 0 to 100.

      Title of the graph = “Return to Play Rates After Experiencing a Concussion When Wearing a Revolution vs. Standard Helmet”

    2. Figure 1: Average GPA (on 4-point scale) by hours studied per week of students at University X. Figure 2 shows the average test scores of male and female students. Figure 3: Levelized costs (in dollars per megawatt hours of electricity) of five different power plants. Table 1: Memory usage of five different web browsers with one tab open. Figure 4: Harley-Davidson is the most popular motorcycle. Table 2: Popularity of the top-selling motorcycle brands among registered motorcycle owners in Pittsburgh. Harley-Davidson is the most popular.

      This caption is too long and encapsulates two main points. Average GPA should be on the x or y axis. The hours studied per week at Uni X should be a title.

    1. LIST OF PROPOSALS FROM THE CHILDREN CONSULTED

      Drawing on the 2024 Children's Report (Rapport Enfant 2024) of the Défenseur des droits, here are the actors to mobilize for each of the 20 proposals from the children consulted, listed in Annex 2 of the report:

      1. Increase public investment in general-interest projects for vulnerable people, to promote equitable access to essential services such as drinking water and transport, and to increase support for developing countries.
        • Actors: Public authorities, local authorities, international financial institutions, non-governmental organizations (NGOs).
      2. Provide more information and raise awareness about the environmental impacts of mass production and overconsumption, in order to foster collective awareness. Ban single-use plastics, drastically reduce packaging, and actively promote the "zero waste" approach.
        • Actors: Schools, environmental awareness associations, public authorities, media, businesses.
      3. Combat food waste in school canteens and institutional catering by improving stock management, raising diners' awareness, and setting up systems for donating food to charities.
        • Actors: Managers of school canteens and institutional catering, pupils, public authorities, charities.
      4. Establish a national water-saving policy, limiting consumption per household and making rainwater-collection tanks widespread, in order to make the whole population aware of the need to preserve this resource.
        • Actors: Public authorities, local authorities, water agencies, water distribution companies, citizens.
      5. Develop renewable energy (wind, solar, hydro, biomass, geothermal) and encourage the energy transition by granting financial aid to individuals and businesses that invest in clean technologies.
        • Actors: Public authorities, energy-sector companies, local authorities, citizens.
      6. Promote high-quality, sustainable, and transparent agri-food production:
        • Actors: Agricultural producers, distributors, consumers, public authorities, certification bodies.
      7. Promote healthier, more responsible, and more sustainable consumption by:
        • Actors: Consumers, distributors, producers, public authorities, consumer associations.
      8. Introduce incentives to encourage businesses to adopt environmentally friendly practices (bonus-malus schemes, eco-labels, etc.) and to reduce their carbon footprint.
        • Actors: Public authorities, businesses, certification bodies, consumers.
      9. Strengthen inspections and penalties against companies that do not comply with environmental standards, and create a compensation fund for victims of industrial pollution.
        • Actors: Public authorities, labour inspectorate, the justice system, businesses, environmental protection associations.
      10. Encourage the use of alternatives to the private car (public transport, cycling, walking) by developing infrastructure and offering attractive fares.
        • Actors: Local authorities, transport operators, citizens.
      11. Create low-emission zones in urban centres to limit traffic from the most polluting vehicles and improve air quality.
        • Actors: Local authorities, public authorities, citizens.
      12. Support research and innovation in green technologies and sustainable solutions to meet environmental challenges.
        • Actors: Public authorities, research organizations, businesses, universities.
      13. Organize large-scale awareness campaigns to inform the public about environmental issues and encourage the adoption of eco-responsible behaviour in daily life.
        • Actors: Public authorities, environmental awareness associations, media, businesses.
      14. Ensure more regular collection of bins in public spaces.
        • Actors: Local authorities, public authorities, citizens.
      15. Set up regular consultations with children, or even co-decision processes with youth representatives, so that they can contribute to the decisions that concern them, in particular urban planning projects.
        • Actors: Local authorities, public authorities, schools, youth associations.
      16. Green the places where children are cared for and live.
        • Actors: Local authorities, public authorities, schools, parents' associations.
      17. Give greater recognition to everyday individual and collective actions to protect the environment, as well as to concrete solutions for facing environmental challenges.
        • Actors: Media, public authorities, environmental awareness associations, citizens.
      18. Integrate environmental issues more fully into school curricula, with dedicated classes to guarantee clear and reliable information on these subjects.
        • Actors: Schools, Ministry of National Education, teachers.
      19. Set up "Green Days" to raise awareness among, and mobilize, the adults of today and tomorrow.
        • Actors: Local authorities, environmental awareness associations, businesses, citizens.
      20. Promote active citizenship among children by creating spaces for dialogue where they can share their ideas and proposals, among themselves and with adults.
        • Actors: Local authorities, schools, youth associations, public authorities.
    2. LIST OF RECOMMENDATIONS

      List of the 20 recommendations of the Défenseur des droits

      Drawing on the 2024 Children's Report (Rapport Enfant 2024) of the Défenseur des droits, here are the actors to mobilize for each of the 20 recommendations, in order to promote children's right to a healthy environment:

      1. Initiate international negotiations with a view to adopting a legally binding international treaty for the protection of the environment and an additional protocol to the Convention for the Protection of Human Rights and Fundamental Freedoms expressly recognizing the right to a healthy environment.

        • Actors: French Government, Ministry for Europe and Foreign Affairs, France's representatives to international organizations (UN, Council of Europe).
      2. Take children's particular vulnerability into account when defining public policies for environmental protection.

        • Actors: Public authorities, Parliament, Ministry of Ecological Transition, Ministry of Health, local authorities.
      3. Take appropriate measures to meet the targets for reducing greenhouse gas emissions and reaching carbon neutrality, notably the 3rd national climate change adaptation plan (PNACC-3), still in preparation.

        • Actors: Government, Ministry of Ecological Transition, Agence de la transition écologique (ADEME), Haut Conseil pour le climat.
      4. Strengthen the training of all justice-system actors by including teaching on the right to a healthy environment in the core curriculum of every university law programme.

        • Actors: Universities, training schools for judges, Ministry of Justice, bar association.
      5. Make consultations dedicated to prenatal pesticide exposure generally available in hospitals.

        • Actors: Ministry of Health, regional health agencies (ARS), hospitals, health professionals (general practitioners, gynaecologists, paediatricians).
      6. Guarantee access to drinking water throughout the national territory, and particularly in the overseas territories.

        • Actors: Public authorities, local authorities, water agencies, water distribution companies, Ministry of Ecological Transition.
      7. Offer healthy and sustainable food in all places that receive children, and encourage diverse, balanced eating habits that serve their health and respect the environment.

        • Actors: Managers of school canteens, schools, crèches, leisure centres, local authorities, Ministry of Agriculture and Food Sovereignty.
      8. Increase the number and quality of very-low-rent social housing units for the most vulnerable families, and develop transitional facilities, from shelter to housing, suited to families with children.

        • Actors: Local authorities, social housing bodies, the State, associations fighting poor housing.
      9. Include, in the specifications applicable to emergency accommodation facilities, requirements that meet children's right to a healthy environment.

        • Actors: Local authorities, associations managing emergency accommodation centres, Ministry of Housing.
      10. Speed up the renovation of school buildings and of all places that receive children:

        • Actors: Local authorities (municipalities, departments, regions), schools, Ministry of National Education, ADEME.
      11. Improve children's living conditions in Traveller sites (aires d'accueil des gens du voyage):

        • Actors: Local authorities, associations supporting Travellers, the State.
      12. Rethink public spaces at child height to reduce children's exposure to pollution, notably by generalizing the pedestrianization of areas near schools and after-school facilities and by developing alternatives to the car.

        • Actors: Municipalities, urban planners, mobility agencies, schools.
      13. Increase the per-capita ratio of green space in cities, based on a national benchmark, and encourage the greening of school playgrounds and of all places that receive children.

        • Actors: Municipalities, urban planners, landscape architects, schools.
      14. Ensure more regular collection of bins in public spaces. Install, or increase the number of, bins suited to waste sorting in public spaces and collective facilities. Run education campaigns on waste sorting, recycling, and composting.

        • Actors: Municipalities, waste collection services, environmental awareness associations, schools.
      15. Set up regular consultations with children, or even co-decision processes with youth representatives, so that they can contribute to the decisions that concern them, in particular urban planning projects.

        • Actors: Municipalities, urban planners, municipal youth councils, schools.
      16. Combat all forms of environmental discrimination suffered by the most vulnerable children.

        • Actors: Public authorities, anti-discrimination associations, Défenseur des droits, local authorities.
      17. Strengthen environmental education:

        • Actors: Ministry of National Education, schools, teachers, environmental education associations, local authorities.
      18. Broaden the membership of the Parlement des enfants (Children's Parliament) so that more children can take part, and make parliamentary debate of the texts resulting from their work mandatory.

        • Actors: Assemblée nationale, Sénat, Ministry of National Education.
      19. Take children's voices effectively into account in public decision-making:

        • Actors: Public authorities, local authorities, youth councils, schools, Défenseur des droits.
      20. Strengthen environmental justice:

        • Actors: Judges, lawyers, Ministry of Justice, environmental protection associations, Défenseur des droits.
    1. If you concatenate a list with 2 items and a list with 4 items, you will get a new list with 6 items (not a list with two sublists). Similarly, repetition of a list of 2 items 4 times will give a list with 8 items.

      Very important to know: the result is a new list, not a structure that links or modifies the two originals. The original lists are left unaltered; the expression builds a new list object and gives back a reference to it.
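
      A minimal sketch of the point above, assuming small example lists (the names a, b, combined, repeated, and nested are illustrative and not from the source text). It shows that concatenation and repetition each build a new flat list and leave the operands untouched:

```python
# Hypothetical example values; all names here are illustrative.
a = [1, 2]
b = [3, 4, 5, 6]

combined = a + b      # concatenation builds a new flat list
repeated = a * 4      # repetition builds a new list with the items repeated
nested = [a, b]       # by contrast, this is a list holding two sublists

print(len(combined))  # 6
print(len(repeated))  # 8
print(len(nested))    # 2

# The originals are untouched, and the result is a distinct object.
print(a, b)
print(combined is a)  # False
```

      Note that the new list holds references to the same items as the originals, which only matters when the items themselves are mutable.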

    1. Here is a summary of the interview with Gabrielle Légeret, founder of the association "De l'or dans les mains", with timestamps:

      • True or false: perceptions of manual trades [0:00]:
        • Being steered toward vocational tracks is often perceived as a failure [0:00].
        • Manual intelligence is valued less than intellectual intelligence [0:06].
        • Young people are no longer interested in craft trades (False) [0:11].
        • Crafts and manual trades are incompatible with innovation (False) [0:17].
        • Manual trades are associated with a lack of intelligence [0:24].
        • Valuing manual trades more could reduce unemployment [0:31].
        • Young people prefer intellectual careers to manual ones (False) [0:37].
        • Manual trades can offer competitive salaries [0:42].
        • Manual skills are increasingly in demand [0:48].
      • Interventions by "De l'or dans les mains" in collèges (lower secondary schools) [1:07]:
        • Contempt for manual trades is rooted in society and in education [1:09-1:18].
        • How can manual trades be given back their prestige? [1:35]
        • Presentation of the flagship programme "Je découvre les métiers manuels" [2:04].
        • A 15-hour programme for 5ème classes (second year of collège) [2:10].
        • Creating links between school curricula and craft practice [2:19-2:26].
        • Artisans come into classes to make objects linked to the curriculum [2:41-2:50].
        • Discovery of nine trades through craft practice [3:18-3:27].
        • Pupils discover that they have powers in their hands and succeed at school [3:55-4:02].
        • The interventions change how teachers see their pupils [4:02-4:11].
        • Struggling pupils can find themselves succeeding [4:11].
        • A good pupil can discover joinery as an option [5:14-5:25].
        • The school system suits only a certain type of brain [5:25-5:43].
        • Importance of hands-on manipulation and experimentation for learning [5:43-5:52].
        • The intelligence of gesture also engages the intellect [6:14-6:19].
        • Children need to be taught to connect their bodies to what they experience [6:27-6:41].
      • Reconnecting head and hands [7:00]:
        • The opposition between head and hands does a lot of harm and puts young people in a position of academic failure [7:17-7:23].
        • Need to put manual intelligence back at the heart of society's project [7:32-7:39].
        • Importance of a material culture in the face of consumerist logic [7:39-7:58].
        • Higher education no longer guarantees a good place in society [8:10-8:16].
        • Manual trades can respond to the ecological transition and rebuild social ties [8:42-9:01].
        • The jobs of tomorrow are jobs of the ecological transition, and manual ones [9:15-9:35].
        • Importance of revealing young people's manual talents on an equal footing with intellectual talents [9:42-9:58].
        • Manual practice should feed into core learning [10:20-10:41].
        • Importance of giving pupils a real choice about what they want to try [11:53-12:13].
        • Young people do not go into manual trades for lack of information and of recognition of their manual skills [12:21-12:50].
      • Explaining the opposition [13:23]:
        • The origins of the devaluation of manual trades are complex and typically French [13:23-13:39].
        • In Switzerland and Germany, moving into a vocational track is less stigmatized and there are more bridges between tracks [13:39-14:13].
        • Political decisions in France to offshore industry and become a service society [14:22-14:34].
        • Manual trades were often physically arduous [14:34-14:47].
        • Office jobs also raise issues of arduousness [14:53-15:06].
        • Importance of eliminating arduous conditions in companies and of paying decent wages [15:16-15:53].
        • France puts children in situations of competition and comparison from a very young age [16:29-16:45].
        • On a building site, there is a collective responsibility toward the trade and toward society [16:45-17:38].
        • Importance of collective projects at school [17:55-18:13].
        • A society that does not look down on trades is better off [18:19-18:41].
        • Many young people end up, at 40, wanting to retrain for manual trades [18:41-19:13].
        • One in three employees in entreprises du patrimoine vivant (living heritage companies) is a career changer [19:20-19:29].
        • You should not wait until you have proved something to society before choosing a manual trade [20:05-20:19].
      • Changing the game [20:27]:
        • Bring manual practice back into schools [20:48-20:53].
        • The goal is to restore manual work to the education system for the benefit of young people [21:19-21:40].
        • Foster self-development and academic success [21:40].
        • Build on existing public education policies and serve teachers and pupils [22:33-22:48].
        • Change the views of public authorities, public opinion, parents, children, and teachers [22:57-23:02].
        • Mobilize the whole of society on this issue [23:29-23:36].
      • For a different future [23:47]:
        • How can we imagine a system in which guidance serves citizens' self-fulfilment? [23:47-23:58]
        • Guidance must be demystified, and people must be allowed to make mistakes and start again [24:06-24:19].
        • Value and assess skills other than academic ones [24:42-24:50].
        • Bring the world of work and business further into the collège [25:23-25:32].
        • Guidance needs to be experienced [26:02].
        • Put experiences into practice, such as mini-companies [26:22-26:32].
        • Make schools more rooted in their local area and responsive to general-interest projects [26:38-26:50].
        • Work for the common good [27:24-27:33].
        • Raise the status of the teaching profession [28:00-28:05].
        • Give teachers the time and means to innovate [29:10-29:19].
        • Bring artisans into the collège and have artisans and teachers work together [30:12-30:34].
        • The business world would have every interest in coming into our classrooms more [30:45-30:50].
        • Importance of partnerships with Onisep and its Parcours Avenir platform [31:34-31:43].
        • There are things to change and realizations to be had [32:08-32:26].
        • The body should not be used only in sport [33:10-33:19].
        • Importance of classroom layout and of practice in lessons [33:19-33:33].
        • Teach children how their brains work and encourage them to develop learning methods that engage the body [34:33-34:46].
        • A society that keeps its body moving is a society at ease in its body [35:09-35:15].
        • Importance of food education [35:21-35:36].
        • Regain a global view of what life is and of what is needed to raise individuals in good mental and physical health [36:13-36:31].
        • Recreate a holistic relationship with the world around us [36:59-37:05].
        • Give children the keys to anchor themselves in society and have the power to act [38:06-38:14].
        • Experience the sensory and give it back its place [39:16-39:22].
    1. Moorish tessellations including this one at the Alhambra inspired Escher's work with tilings of the plane. He made sketches of this and other Alhambra patterns in 1936.([6])

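      Moorish tessellations, especially those in the Alhambra, have fascinated artists and mathematicians with their intricate geometric shapes and symmetry, among them the Dutch artist M.C. Escher. These patterns display great artistic beauty and also embody rich mathematical principles, particularly concerning tilings of the plane, which later became central to Escher's work.

      Characteristics of Moorish tessellations

      1. Geometric precision: The patterns are built from geometric shapes (triangles, squares, hexagons, and so on); combining and repeating these shapes produces complex repeating patterns that fill the whole surface.

      2. Symmetry: The patterns typically use several kinds of symmetry, including translation, rotation, and reflection, which gives them a strong sense of symmetric beauty. They sometimes also show fractal-like features, with motifs repeating at different scales.

      3. Abstraction: Because of Islamic restrictions on depicting animate beings, the patterns are usually abstract; common elements include stars, interlaced curves, and plant motifs rather than figurative animals or people.

      4. Interlocking forms: Many patterns use interlocking polygons that join one another without leaving gaps, producing a striking visual effect.

      Escher and the Alhambra patterns

      When Escher visited Spain in 1936, he was captivated by the Alhambra's tessellations and made sketches of them. These patterns inspired his subsequent work, especially his later exploration of tilings of the plane.

      Escher's exploration of tessellation

      Escher's best-known works often feature plane tilings similar to Moorish patterns. He did not merely imitate these patterns; he went further and treated them as mathematical problems, exploring how geometry could produce designs that are both visually precise and inventive.

      In his creative process, Escher studied in depth how geometric shapes (especially regular polygons) can fill the plane and how symmetry can be expressed in art. These ideas came directly from the complex geometric patterns he saw at the Alhambra.

      Moorish influence in Escher's work

      1. Repetition and interlocking: In works such as Day and Night (1938), shapes interlock and change, for example in the seamless transition between birds and land. This conveys continuity and flow in much the same way as the Alhambra patterns.

      2. Geometric transformation: Escher was very interested in how symmetry can turn one shape into another, which fits closely with the geometric variation in Moorish patterns. The Alhambra patterns often combine several geometric transformations, and Escher undertook similar explorations in his Regular Division of the Plane.

      3. Metamorphosis: Escher's Metamorphosis series (1939) drew inspiration from the idea of gradual change in Moorish patterns, in which one pattern visually turns into another while remaining geometrically harmonious.

      The influence of Escher's 1936 sketches

      During his 1936 visit to the Alhambra, Escher made detailed sketches recording the palace's complex geometric patterns. These sketches laid the foundation for his later work in plane tessellation and mathematical art. They let him draw on the geometric precision of Moorish art and turn it into visual paradoxes and impossible shapes.

      Summary

      Moorish tessellations, especially those at the Alhambra, deeply influenced Escher's work. He borrowed the geometric symmetry and plane-tiling techniques of these patterns and, by combining art and mathematics, created many works with strong visual impact and mathematical depth. Escher's artistic exploration was both a tribute to Moorish art and a push toward new ground where art and mathematics meet.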

    1. Here is a summary document for a briefing on setting up an association newsletter, based on the Solidatech webinar:

      Objectives and Target Audiences

      • Before starting, it is crucial to define the newsletter's objectives: driving traffic to the website, fundraising, mobilizing volunteers, etc.
      • The target audiences must also be identified: volunteers, donors, members.

      Database and GDPR

      • Clean database: appoint a person responsible, identify the useful information, and standardize the data. Prefer a dedicated solution to a spreadsheet.
      • GDPR: obtain consent (boxes not pre-ticked), be able to prove that consent was obtained, make unsubscribing easy, and be transparent about why the email is being sent.

      Editorial Calendar

      • Set up an editorial calendar to get an overview, gather ideas, ensure regularity, and share out the work.
      • Define the types of mailing and their frequency, list important events, and plan the content.

      Creation and Writing

      • Create a mailing template in the association's colours.
      • Write short, clear, concise sentences. Add a call to action (CTA) to each topic.
      • Take care over the newsletter's subject line and preview text.

      Anti-Spam Tips

      • Avoid capital letters in the subject line, attachments, and purchased contact lists.

      Emailing Tools

      • Choose a tool suited to your needs and budget.
      • Free tools (with limits) or reduced rates for associations (GetResponse, CleverReach, Mailchimp, Mailjet, ActiveTrail).
      • For editorial planning: Google Sheets, Airtable, Trello, Asana.

      Monitoring and Analysis

      • Track the key indicators: open rate, click rate, unsubscribe rate.
      • Build your own benchmark and optimize your practices.

      Key Steps

      1. Think the project through as a team.
      2. Appoint a person responsible.
      3. Clean up the database.
      4. Choose a mailing tool.
      5. Create a mailing template.
      6. Draw up an editorial schedule.
      7. Write and proofread the first mailing.
      8. Send it and track the indicators.

      Additional Resources

      • Solidatech training courses on digital communication and newsletter tools.
    1. Monroe’s Motivated Sequence is a five-step organization pattern that attempts to persuade an audience by making a topic relevant, using positive and/or negative motivation, and including a call to action. The five steps are (1) attention, (2) need, (3) satisfaction, (4) visualization, and (5) action (Monroe & Ehninger, 1964).

      It is really interesting to me that there can be a formula for convincing people to get off their butts and do something. It feels like such an abstract idea to me: if you're trying to convince someone, you're trying to reach them emotionally, internally. The idea that there's a "five-easy-steps" way of organizing your thoughts to help you convince people to do something is a strange concept to me, but I'm going to think about it a lot as I work on this speech.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this work, Harpring et al. investigated divisome assembly in Chlamydia trachomatis serovar L2 (Ct), an obligate intracellular bacterium that lacks FtsZ, the canonical master regulator of bacterial cell division. They find that divisome assembly is initiated by the protein FtsK in Ct by showing that it forms discrete foci at the septum and future division sites. Additionally, knocking down ftsK prevents divisome assembly and inhibits cell division, further supporting their hypothesis that FtsK regulates divisome assembly. Finally, they show that MreB is one of the last chlamydial divisome proteins to arrive at the site of division and is necessary for the formation of septal peptidoglycan rings but does not act as a scaffold for division assembly as previously proposed.

      Strengths:

      The authors use microscopy to clearly show that FtsK forms foci both at the septum as well as at the base of the progenitor cell where the next septum will form. They also show that the Ct proteins PBP2, PBP3, MreC, and MreB localize to these same sites suggesting they are involved in the divisome complex.

      Using CRISPRi the authors knock down ftsK and find that most cells are no longer able to divide and that PBP2 and PBP3 no longer localized to sites of division suggesting that FtsK is responsible for initiating divisome assembly. They also performed a knockdown of pbp2 using the same approach and found that this also mostly inhibited cell division. Additionally, FtsK was still able to localize in this strain, however PBP3 did not, suggesting that FtsK acts upstream of PBP2 in the divisome assembly process while PBP2 is responsible for the localization of PBP3.

      The authors also find that performing a knockdown of ftsK also prevents new PG synthesis further supporting the idea that FtsK regulates divisome assembly. They also find that inhibiting MreB filament formation using A22 results in diffuse PG, suggesting that MreB filament formation is necessary for proper PG synthesis to drive cell division.

      Overall the authors propose a new hypothesis for divisome assembly in an organism that lacks FtsZ and use a combination of microscopy and genetics to support their model that is rigorous and convincing. The finding that FtsK, rather than a cytoskeletal or "scaffolding" protein is the first division protein to localize to the incipient division site is unexpected and opens up a host of questions about its regulation. The findings will progress our understanding of how cell division is accomplished in bacteria with non-canonical cell wall structure and/or that lack FtsZ.

      Weaknesses:

      No major weaknesses were noted in the data supporting the main conclusions. However, there was a claim of novelty in showing that multiple divisome complexes can drive cell wall synthesis simultaneously that was not well-supported (i.e. this has been shown previously in other organisms). In addition, there were minor weaknesses in data presentation that do not substantially impact interpretation (e.g. presenting the number of cells rather than the percentage of the population when quantifying phenotypes and showing partial western blots instead of total western blots).

      We agree with the weaknesses identified by the reviewer. We removed the statements in the Results and Discussion that multiple independent divisome complexes can simultaneously direct PG synthesis. We presented the data in Figs. 3-5 as % of the cells in the population, and complete western blots are shown in Supp. Fig. S1.

      Reviewer #2 (Public review):

      Summary:

      Chlamydial cell division is a peculiar event, whose mechanism was mysterious for many years. C. trachomatis division was shown to be polar and involve a minimal divisome machinery composed of both homologues of divisome and elongasome components, in the absence of a homologue of the classical division organizer FtsZ. In this paper, Harpring et al. show that FtsK is required at an early stage of the chlamydial divisome formation.

      Strengths:

      The manuscript is well-written and the results are convincing. Quantification of divisome component localization is well performed, number of replicas and number of cells assessed are sufficient to get convincing data. The use of a CRISPRi approach to knock down some divisome components is an asset and allows a mechanistic understanding of the hierarchy of divisome components.

      Weaknesses:

      The authors did not analyse the role of all potential chlamydial divisome components and did not show how FtsK may initiate the positioning of the divisome. Their conclusion that FtsK initiates the assembly of the divisome is an overinterpretation and is not backed by the data. However, data show convincingly that FtsK, if perhaps not the initiator of chlamydial division, is definitely an early and essential component of the chlamydial divisome.

      The following statement has been included in the Discussion (pg. 16 of the revised manuscript): “Although we focused our study on a subset of the divisome and elongasome proteins that Chlamydia expresses (bolded in Fig. 6G), our results support our conclusion that chlamydial budding is dependent upon a hybrid divisome complex and that FtsK is required for the assembly of this hybrid divisome. At this time, we cannot rule out that other proteins act upstream of FtsK to initiate divisome assembly in this obligate intracellular bacterial pathogen.”

      We will soon be submitting another manuscript that addresses how FtsK specifies the site of divisome assembly. This work is too extensive to be included in this manuscript.

      Reviewer #3 (Public review):

      Summary:

      The obligate intracellular bacterium Chlamydia trachomatis (Ct) divides by binary fission. It lacks FtsZ, but still has many other proteins that regulate the synthesis of septal peptidoglycan, including FtsW and FtsI (PBP3) as well as divisome proteins that recruit and activate them, such as FtsK and FtsQLB. Interestingly, MreB is also required for the division of Ct cells, perhaps by polymerizing to form an FtsZ-like scaffold. Here, Harpring et al. show that MreB does not act early in division and instead is recruited to a protein complex that includes FtsK and PBP2/PBP3. This indicates that Ct cell division is organized by a chimera between conserved divisome and elongasome proteins. Their work also shows convincingly that FtsK is the earliest known step of divisome activity, potentially nucleating the divisome as a single protein complex at the future division site. This is reminiscent of the activity of FtsZ, yet fundamentally different.

      Strengths:

      The study is very well written and presented, and the data are convincing and rigorous. The data underlying the proposed localization dependency order of the various proteins for cell division is well justified by several different approaches using small molecule inhibitors, knockdowns, and fluorescent protein fusions. The proposed dependency pathway of divisome assembly is consistent with the data and with a novel mechanism for MreB in septum synthesis in Ct.

      Weaknesses:

      The paper could be improved by including more information about FtsK, the "focus" of this study. For example, if FtsK really is the FtsZ-like nucleator of the Ct divisome, how is the Ct FtsK different sequence-wise or structurally from FtsK of, e.g. E. coli? Is the N-terminal part of FtsK sufficient for cell division in Ct like it is in E. coli, or is the DNA translocase also involved in focus formation or localization? Addressing those questions would put the proposed initiator role of FtsK in Ct in a better context and make the conclusions more attractive to a wider readership.

      We will be submitting another manuscript soon that details the conserved domain organization of FtsK from different bacteria, and the role of the various domains of chlamydial FtsK (including the N-terminus and the C-terminal translocase domain) in directing its localization in dividing Chlamydia. We have added text to the discussion (pg. 16 of the revised manuscript) that describes the sequence homology of chlamydial FtsK to FtsK from E. coli.

      Another weakness is that the title of the paper implies that FtsK alone initiates divisome assembly. However, the data indicate only that FtsK is important at an early stage of divisome assembly, not that it is THE initiator. I suggest modifying the title to account for this--perhaps "FtsK is required to initiate....".

      We agree with the reviewer and modified the title to “FtsK is Critical for the Assembly of the Unique Divisome Complex of the FtsZ-less Chlamydia trachomatis”. We have also modified the text throughout to indicate that FtsK is required for the assembly of the hybrid divisome of Chlamydia.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Suggestions for improvement (mostly minor):

      (1) For several of the graphs, the authors plot the number of cells with a given phenotype on the y-axis, but then describe percentages of cells in the text. It would make it clearer if all the graphs had the percentage of cells on the y-axis instead.

      We have modified the figures to indicate the percentage of cells on the y-axis with a given phenotype.

      (2) In Figures 3, 4, and 5 the authors show separate graphs for plus/minus drug or inducer. These should be on the same graph as they are directly comparing these two different conditions. Having them on separate graphs makes it less clear whether these differences are significant between the two conditions.

      We modified Fig. 4 to show +/- inducer in ftsk and pbp2 knockdown strains in the same graph.  Regarding Figures 3 and 5, we believe the figures in the original submission effectively demonstrate the +/- drug conditions, so these figures remain unchanged in the revised manuscript.

      (3) In Figure 2 the authors show microscopy of the colocalization of FtsK with several other divisome proteins from Ct. Quantification of the colocalization of FtsK with these other proteins would provide a more holistic understanding of their colocalization and help further support their argument that FtsK initiates the assembly of the divisome.

      Supp. Fig. S4A of the revised manuscript contains images showing the colocalization of FtsK with the fusions at the septum and the base of dividing cells, and the colocalization of FtsK with the fusions that are only at the base of dividing cells. Supp. Fig. S4B quantified the percentage of dividing cells where FtsK overlaps the localization of each of the fusions at the septum, at the septum and the base, and at the base alone.

      (4) In Figure 6 the authors mention that the PG ring was at a slight angle relative to the MOMP-stained septum. What is the significance of this? The authors mention it several times but do not explain its relevance to divisome assembly. It is not really evident in the images presented.

      We mention in the Discussion (pgs. 17-18 of the revised manuscript) that “The relevance of the angled orientation of PG and MreC rings relative to the MOMP-stained septum in division intermediates is unclear. However, it appears to be a conserved feature of the cell division process and may arise because the divisome proteins are often positioned slightly above or below the plane of the MOMP-stained septum. The positioning of divisome proteins above or below the septum is indicated in Figs. 1 and 2.”

      We included cartoons in Fig. 6C of the revised manuscript to assist the reader in visualizing the angled orientation of the PG ring relative to the MOMP-stained septum.

      (5) In line 270 the authors claim that "these are the first data in any system to suggest that septal PG synthesis/modification is simultaneously directed by multiple independent divisome complexes." However, their experiments do not demonstrate that multiple divisome complexes are active at the same time. They show that multiple foci of FtsK etc. are present at sites where PG synthesis has occurred, but that does not necessarily mean that each focus/complex was actively synthesizing PG at the same time. Moreover, similar approaches were used to support a claim that septal PG synthesis is directed by multiple discrete divisome complexes previously (e.g. in Figure 1 of Bisson-Filho et al. 2017 (PMID: 28209898) in Bacillus subtilis and in Perez et al 2021 (PMID: 33269494) in Streptococcus pneumoniae). This claim is not central to the main conclusions of the study and could just be removed.

      This statement has been removed from the Results and the Discussion.

      (6) In Figure 6B the authors see three distinct FtsK foci. Why is this the only place in the manuscript where they see three foci? They mentioned previously that they saw foci at the septum and at the base of the progenitor mother cell, but why are there three foci here?

      The vast majority of dividing cells displayed one focus at the septum and/or the base. Representative images were chosen that reflected the localization profiles observed in the majority of cells. While we observed cells with multiple foci, as shown in Figure 6C, these cells were relatively rare (~2% of cells for all the divisome proteins in 3 independent experiments). Since cells with multiple foci were relatively rare, we chose to group these cells with the cells that had single foci in the septum, septum and base, or base alone categories in the quantification shown in Fig. 2C. This is stated in the legend of Fig. 2 of the revised manuscript.

      (7) The Discussion section is lacking a couple of things that would put the data in a broader context. Can the authors speculate on how FtsK knows how to find the division site? I.e. what might be upstream of FtsK localization? Additionally, the authors do not talk about the FtsK sequence or domains at any point in the paper. Does Ct FtsK have a similar sequence/structure to FtsKs from other bacteria? Are there any differences in sequence/structure that might tell us about its function in Ct?

      We will be submitting another manuscript soon that examines how the site of assembly of the divisome is defined in dividing Chlamydia. This manuscript will also define the localization of the different sub-domains of chlamydial FtsK during cell division.  For this manuscript, we added a paragraph in the Discussion (pg. 16 of the revised manuscript) that states the domain organization is conserved in FtsK proteins from different bacteria. This paragraph includes information regarding the % sequence identity of the C-terminus and the N-terminus of chlamydial FtsK when compared to E. coli FtsK.

      (8) For Supplementary Figure S1B-C. The authors should show the full blots rather than just the single band of the protein of interest to show that the antibodies are specific. Additionally, the authors should include a loading control to show that they loaded the same amount of protein for each sample.

      We have included the full blots in Supp. Fig. S1 of the revised manuscript. We do not see the need for including a loading control for these blots because we are not making arguments about the relative level of the proteins that were assayed. We only use the blots to show that the fusion proteins are primarily a single species of the predicted molecular mass.

      (9) In Supplementary Figure S4A the authors use RT-qPCR to measure ftsK and pbp2 transcript levels. Since they have antibodies against these proteins, they should also include Western blots to show that the proteins are not being produced when targeted using CRISPRi.

      We have included data in Supp. Fig. S5E of the resubmission that indicate foci of FtsK and PBP2 could not be detected following the knockdown of ftsk and pbp2. We feel that these data support our conclusion that the induced expression of dCas12 in the ftsk and pbp2 knockdown strains results in the downregulation of the endogenous FtsK and PBP2 polypeptides.

      (10) In lines 261-262 the authors say that "PG organization was the same or differed at the septum." What is the PG organization being compared to? Same or different from what?

      We agree with the reviewer that the text in lines 261-262 in the original submission was confusing.  The text has been modified.

      (11) Lines 201-215 the authors refer to Supplementary Figure S3 throughout this section, but they should refer to Supplementary Figure S4.

      This has been corrected.

      Reviewer #2 (Recommendations for the authors):

      I am not convinced that this paper shows that FtsK initiates the assembly of the divisome since the authors did not analyse the role and localization of all other chlamydial divisome components. Out of the ten homologues of divisome and elongasome components encoded by C. trachomatis genome, only five are investigated in this study. There is no explanation about how these five were chosen.

      We state on pg. 16 of the revised manuscript that “Although we focused our study on a subset of the divisome and elongasome proteins that Chlamydia expresses (bolded in Fig. 6G), our results support our conclusion that chlamydial budding is dependent upon a hybrid divisome complex and that FtsK is required for the assembly of this hybrid divisome. At this time, we cannot rule out that other proteins act upstream of FtsK to initiate divisome assembly in this obligate intracellular bacterial pathogen.”

      Results convincingly indicate that FtsK is an early divisome component, but proof is lacking that it initiates divisome formation. Indeed, the authors do not show how FtsK would be the first protein to selectively accumulate at a given location to initiate divisome formation. For this reason, the model they propose at the end of their study is not backed by sufficient data, in my opinion.

      We agree with the reviewer that our data do not show that FtsK initiates divisome assembly. The title of the manuscript has been modified to “FtsK is Critical for the Assembly of the Unique Divisome Complex of the FtsZ-less Chlamydia trachomatis” and the text throughout has been modified to indicate that FtsK is the first protein we assayed that associates with nascent divisomes at the base of dividing cells. We will soon be submitting another manuscript that details how FtsK is recruited to a specific site to initiate nascent divisome assembly. This work is too extensive to be included in this manuscript.

      There are also discrepancies in the number of cells analysed to quantify the localization of divisome components, ranging from 50 to 250 cells. The authors could better explain why there are such variations.

      There were differences in the number of cells analyzed in the various experiments, but in every instance the effect of inhibitors (A22 and mecillinam) or ftsk and pbp2 knockdown on divisome assembly was statistically significant.

      There are a few mistakes in the text regarding figure numbering (Figure S4 is mentioned as S3 in the text). Figures 5B and D are not specifically cited.

      These mistakes have been corrected in the revised manuscript.

      Line 261-262: the sentence starting "Our imaging analysis.." is not clear to me.

      We agree with the reviewer that the text in lines 261-262 was confusing.  The text has been modified (pg. 14 of the revised manuscript).

      Lines 270-271: there is insufficient proof to say that there are multiple independent divisome complexes. This is in my opinion an overinterpretation of the data, since there is no proof that these complexes are independent.

      This statement has been removed from the text.

      A few details are lacking in the figure legends:

      Figure 2C: when was the expression of the different mCherry and 6xHis constructs induced?

      The onset and length of the induction of the fusions have been included in the legend of Fig. 2.

      Bars are sometimes mentioned as uM and should be um. Bar sizes, number of replicates, and/or meaning of the error bars are lacking in legends of Figures S2, S3, and S4.

      This has been corrected in the revised manuscript.

      The consistency of Figures could be improved between Figures 3A, 4A, B, and 5A. The results of treated cells could be always shown as dark grey. It would help the reader.

      We have used consistent coloring in Figs. 3-5 to indicate the treated cells.

      Reviewer #3 (Recommendations for the authors):

      (1) Lines 113-118: do Ct cells increase in size as they get closer to starting division? If so, could a pseudo-time course (demograph) be done to bolster the evidence that the base foci formed mainly in predivisional cells and not newborn cells? This evidence might be more convincing than the data in Figures 1F and G.

      Chlamydial cells in the population were heterogeneous in size at the timepoint we are studying. This observation is consistent with previous reports in the literature (Liechti et al., 2021). While we agree that a pseudo-time course could potentially bolster the evidence about when FtsK foci appear, we believe our current analysis sufficiently demonstrates that basal foci of FtsK appear prior to the appearance of new buds at the base of dividing cells.

      (2) Figure 3E: It looks like MreC localization to foci doesn't strictly require MreB polymerization. Is this known for E. coli or other species?

      To our knowledge, MreC assembly into a filament has not been shown to be dependent upon MreB in other bacteria.  In Caulobacter crescentus, MreC forms a helical structure that is not dependent upon MreB or MreB filament formation (Dye et al., 2005. PNAS; Divakaruni et al., 2005. PNAS).

      (3) Figure 5E: why is nearly half of PBP2 and PBP3 still localized to foci at the membrane even after treatment with mecillinam? This suggests, as the authors mention, that mecillinam reduces the efficiency of localization to the divisome but does not eliminate it. Any ideas why?

      At this time, we do not know why inhibiting the catalytic activity of PBP2 with mecillinam does not fully prevent the association of PBP2 with the chlamydial divisome. We have included a statement in the Results (pg. 13 of the revised manuscript) that inhibiting the catalytic activity of PBP2 prevents it from efficiently associating with or maintaining its association with polarized divisome complexes.

      (4) Lines 262-263: This sentence is confusing; please rephrase. The same as what? Differed from what?

      We agree with the reviewer. The wording in lines 262-263 of the original submission has been modified.  

      (5) Lines 265-267 and Figure 6: Adding cartoon schematics might help readers visualize cell orientations in Fig. 6 (especially 6B).

      Cartoons have been added to Fig. 6C (Fig. 6B in the original submission) to orient the reader.

      (6) Line 294-298: Do the authors think that the residual 5-10% of PG foci after FtsK knockdown is due to the ability of residual FtsK to organize divisomes?

      We show that knockdown of FtsK is not complete, and while we cannot be certain, it is likely that the PG foci detected in FtsK knockdown cells are due to the ability of the residual FtsK to organize divisomes that direct PG synthesis.

      (7) Do the authors have any evidence that FtsK foci are mobile like treadmilling FtsZ?

      We have not performed real-time imaging studies, and we currently have no evidence that FtsK foci are mobile.

      (8) FtsK foci here are reminiscent of mobile foci formed by the FtsK-like SpoIIIE at the Bacillus subtilis sporulation septum. This might be a good idea to mention in the Discussion. Is it possible that Ct FtsK is also involved in coordinating chromosome partitioning through the developing septum? (That is another reason why it would be useful to know if the translocase domain was dispensable for localization/activity).

      We are currently preparing another manuscript that documents the contribution of the various domains of FtsK to its localization profile and whether the division defect in ftsk knockdown cells can be suppressed by specific subdomains of FtsK. That manuscript will include these data as well as experiments that address how the site of polarized budding is defined. In the revised manuscript, we have included a description of how the domain organization of chlamydial FtsK is similar to E. coli FtsK (pg. 16 of revision). Chlamydial FtsK also has a domain organization similar to that of SpoIIIE from B. subtilis. The C-terminal catalytic domain of SpoIIIE is 45% identical to chlamydial FtsK. The N-terminus of SpoIIIE is predicted to encode 4 transmembrane spanning helices, like chlamydial FtsK. However, the N-terminus of SpoIIIE shares no sequence homology with the N-terminus of chlamydial FtsK. We have not included the similar domain organization of SpoIIIE and chlamydial FtsK in the revised manuscript.

      (9) It seems that FtsK foci localize to a particular spot opposite from the active septum, although how this spot is specified is not clear. Is there any geometric clue for FtsK's localization like there is for Min-specified FtsZ localization?

      As mentioned above, we are currently preparing another manuscript that documents our efforts to understand how the site of polarized budding is defined.  This analysis is too extensive to include in this study.

      (10) As mentioned in the Summary, do the authors know whether the N-terminal membrane binding part of FtsK (FtsKn) is sufficient for localization/divisome assembly in Ct as it is in other species? Ouellette et al. 2012 showed that FtsKn could interact with MreB in BACTH.

      We are currently preparing another manuscript that documents the contribution of the various domains of FtsK to its localization profile.

      (11) The previous BACTH result with MreB and FtsKn implies that this interaction is direct, yet the current data suggest that this is not the case. Can the authors comment on this? Is this due to bridging effects inherent in the BACTH system?

      We have not presented any data to indicate that FtsK and MreB do not interact. We have only shown that FtsK localization is not dependent upon MreB filament formation (Fig. 3).

      (12) The FtsZ-independent role of FtsK in nucleating the divisome suggests that Ct FtsK may differ from other FtsKs structurally - can this be explored, perhaps with AlphaFold 3?

      As mentioned above, we have included a paragraph in the discussion of the revised manuscript (pg. 16 of the revised manuscript) that states the domain organization of chlamydial FtsK is similar to E. coli FtsK. This conserved domain organization is evident when we view the structures of the proteins using AlphaFold.

      (13) Typo on line 559: should be HeLa.

      This has been corrected.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents a comprehensive exploration of the role of liver-specific Survival Motor Neuron (SMN) depletion in peripheral and central nervous system tissue pathology through a well-constructed mouse model. This study is pioneering in its approach, focusing on the broader physiological implications of SMN, which has traditionally been associated predominantly with spinal muscular atrophy (SMA).

      Strengths:

      (1) Novelty and Relevance: The study addresses a significant gap in understanding the role of liver-specific SMN depletion in the context of SMA. This is a novel approach that adds valuable insights into the multi-organ impact of SMN deficiency.

      (2) Comprehensive Methodology: The use of a well-characterized mouse model with liver-specific SMN depletion is a strength. The study employs a robust set of techniques, including genetic engineering, histological analysis, and various biochemical assays.

      (3) Detailed Analysis: The manuscript provides a thorough analysis of liver pathology and its potential systemic effects, particularly on the pancreas and glucose metabolism.

      (4) Clear Presentation: The manuscript is well written. The results are presented clearly with well-designed figures and detailed legends.

      Weaknesses:

      (1) Limited Time Points: The study primarily focuses on a single time point (P19). This limits the understanding of the temporal progression of liver and pancreatic pathology in the context of SMN depletion. Longitudinal studies would provide a better understanding of disease progression.

      (2) Incomplete Recombination: The mosaic pattern of Cre-mediated excision leads to variability in SMN depletion, which complicates the interpretation of some results. Ensuring more consistent recombination across samples would strengthen the conclusions.

      After the revision, the authors addressed the reviewers' questions by extending their analyses to include P60 mice, conducting both liver and pancreatic analyses, and adding a comprehensive panel of metabolic hormones related to glucose metabolism in animals at P19 and P60. They also corrected all errors identified during the initial review process and expanded the discussion to clarify raised issues. All my questions have now been addressed.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript presents a comprehensive exploration of the role of liver-specific Survival Motor Neuron (SMN) depletion in peripheral and central nervous system tissue pathology through a well-constructed mouse model. This study is pioneering in its approach, focusing on the broader physiological implications of SMN, which has traditionally been associated predominantly with spinal muscular atrophy (SMA).

      Strengths:

      (1) Novelty and Relevance: The study addresses a significant gap in understanding the role of liver-specific SMN depletion in the context of SMA. This is a novel approach that adds valuable insights into the multi-organ impact of SMN deficiency.

      (2) Comprehensive Methodology: The use of a well-characterized mouse model with liver-specific SMN depletion is a strength. The study employs a robust set of techniques, including genetic engineering, histological analysis, and various biochemical assays.

      (3) Detailed Analysis: The manuscript provides a thorough analysis of liver pathology and its potential systemic effects, particularly on the pancreas and glucose metabolism.

      (4) Clear Presentation: The manuscript is well written. The results are presented clearly with well-designed figures and detailed legends.

      We thank the reviewer for their positive comments. They had some concerns for us to consider (see below). We provide a point-by-point response to their comments.

      Weaknesses:

      (1) Limited Time Points: The study primarily focuses on a single time point (P19). This limits the understanding of the temporal progression of liver and pancreatic pathology in the context of SMN depletion. Longitudinal studies would provide a better understanding of disease progression.

      We thank the reviewer for the suggestion. We extended our analysis to include P60 mice and performed both liver and pancreatic analyses at this time point to address this suggestion.

      (2) Incomplete Recombination: The mosaic pattern of Cre-mediated excision leads to variability in SMN depletion, which complicates the interpretation of some results. Ensuring more consistent recombination across samples would strengthen the conclusions.

      The variability in Cre-mediated excision is inherently stochastic, influenced by factors such as Cre expression levels, timing of recombination, and the accessibility of the target locus in individual cells. Achieving complete consistency across samples is particularly challenging, especially given the complexity of our breeding scheme, which occasionally results in litters without any animals of the desired genotype. Importantly, our study not only demonstrates that liver-specific SMN depletion results in liver alterations and pancreatic dysfunction but also highlights the limitations and challenges associated with this mouse model. By doing so, we aim to provide valuable insights for other researchers considering similar approaches in future studies.

      Reviewer #2 (Public review):

      Summary:

      Marylin Alves de Almeida et al. developed a novel mouse cross via conditionally depleting functional SMN protein in the liver (AlbCre/+;Smn2B/F7). This mouse model retains a proportion of SMN in the liver, which better recapitulates SMN deficiency observed in SMA patients and allows further investigation into liver-specific SMN deficiency and its systemic impact. They show that AlbCre/+;Smn2B/F7 mice do not develop an apparent SMA phenotype as mice did not develop motor neuron death, neuromuscular pathology or muscle atrophy, which is observed in the Smn2B/- controls. Nonetheless, at P19, these mice develop mild liver steatosis, and interestingly, this conditional depletion of SMN in the liver impacts cells in the pancreas.

      Strengths:

      The current model has clearly delineated the apparent metabolic perturbations which involve a significantly increased lipid accumulation in the liver and pancreatic cell defects in AlbCre/+;Smn2B/F7 mice at P19. Standard methods like H&E and Oil Red-O staining show that in AlbCre/+;Smn2B/F7 mice, their livers closely mimic the livers of Smn2B/- mice, which have the full body knockout of SMN protein. Unlike previous work, this liver-specific conditional depletion of SMN is superior in that it is not lethal to the mouse, which allows an opportunity to investigate the long-term effects of liver-specific SMN on the pathology of SMA.

      We thank the reviewer for their positive comments. They had some concerns for us to consider (see below). We provide a point-by-point response to their comments (review comments in black, our response in red).

      Weaknesses:

      Given that SMA often involves fatty liver, dyslipidemia and insulin resistance, using the current mouse model, the authors could have explored the long-term effects of liver-specific depletion of SMN on metabolic phenotypes beyond P19, as well as systemic effects like glucose homeostasis. Given that the authors also report pancreatic cell defects, the long-term effect on insulin secretion and resistance could be further explored. The mechanistic link between a liver-specific SMN depletion and apparent pancreatic cell defects is also unclear.

      We extended our analysis to include P60 mice and performed both liver and pancreatic analyses at this time point to address this suggestion. In addition, we discussed the liver-pancreas axis in the Discussion.

      Discussion:

      This current work explores a novel mouse cross in order to specifically deplete liver SMN using an Albumin-Cre driver line. This provides insight into the contribution of liver-specific SMN protein to the pathology of SMA, which is relevant for understanding metabolic perturbations in SMA patients. Nonetheless, given that SMA in patients involves a systemic deletion or mutation of the SMN gene, the authors could emphasize the utility of this liver-specific mouse model, as opposed to using in vitro models, which have been recently reported (Leow et al, 2024, JCI). Authors should also discuss why a mild metabolic phenotype is observed in this current mouse model, as opposed to other SMA mouse models described in literature.

      We appreciate the reviewer’s insightful comment. We have thoroughly addressed this suggestion in the Discussion section, particularly in lines 284-298, 309-322, and 334-359.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Longitudinal Studies: Conduct studies at one or more additional postnatal time points to provide a clearer picture of how liver-specific SMN depletion affects tissue pathology over time.

      We extended our analysis to include P60 mice and performed both liver and pancreatic analyses at this time point to address this suggestion.

      (2) Functional Assays: Incorporate glucose tolerance tests, insulin sensitivity tests, and more detailed metabolic profiling to better understand the physiological consequences of liver-specific SMN depletion on glucose metabolism and pancreatic function.

      We sincerely thank the reviewer for this suggestion. We have included a full panel of metabolic hormones associated with glucose metabolism from animals at P19 and P60. These new data, along with additional figures, have now been provided in our revised manuscript.

      (3) Mechanism: Discuss the molecular pathways affected by SMN depletion in the liver and pancreas. Mechanistic studies including transcriptomic or proteomic analyses to identify dysregulated pathways will help.

      We appreciate the reviewer’s insightful comment. We have thoroughly addressed this suggestion in the Discussion section, particularly in lines 284-298 and 334-359.

      (4) Typos in the abstract: beta cells secrete insulin and alpha cells produce glucagon.

      Thank you for catching this error. It has been corrected to reflect that insulin is produced by beta cells and glucagon by alpha cells.

      (5) Efficiency and specificity of the Alb-Cre: if possible, cross the Alb-Cre with the Rosa26 reporter line to test the efficiency and specificity of the Alb-Cre.

      We agree that this would provide valuable insights. However, initiating a new breeding program to generate the required genotypes would take over a year and is beyond the scope of this study. To address this in part, we performed Cre immunostaining of the liver, pancreas, and spinal cord at P19, as well as the liver at P60. These results, now included in Supplemental Figure 1, demonstrate liver-specific expression and variability across hepatocytes.

      Reviewer #2 (Recommendations for the authors):

      The title of this manuscript is potentially misleading. The manuscript largely investigates the involvement of SMN protein on peripheral organs such as the liver, muscles, neuromuscular junction, and the pancreas. Yet, the title could be interpreted that the peripheral nervous system or central nervous system is the main focus. The title should be edited to indicate key terms such as "motor neuron and peripheral tissue pathology".

      Thank you for pointing this out. We have revised the title to better represent the study’s focus. It is now “Impact of liver-specific survival motor neuron (SMN) depletion on central nervous system and peripheral tissue pathology”.

      Suggestions:

      Please clarify and explain clearly the various mouse lines (Smn2B/+, Smn2B/- and +/+; Smn2B/F7 ) used as controls as the nomenclature used is confusing. In addition, authors could consider the use of a wild-type mouse line to be used as a control to validate changes in AlbCre/+;Smn2B/F7 mice.

      We have now provided clarification on mouse line nomenclature in the Results section (lines 104–124). Full-body heterozygous mice (Smn2B/+) are used as controls due to their slightly reduced SMN protein levels and absence of phenotypic changes compared to wild-type mice.

      Given that the main phenotype implicated by the liver-specific depletion of SMN protein in AlbCre/+;Smn2B/F7 mice is pancreatic abnormalities (changes in alpha- and beta- cell numbers and blood glucose levels), authors should expand further on the pancreatic phenotype.  

      We added a full panel of metabolic hormones related to glucose metabolism in animals at P19 and P60. Furthermore, this has been discussed in detail in lines 284-298 and 334-344 of the Discussion.

      A pancreas-specific depletion of SMN would provide this current manuscript with a better understanding of the role of SMN in regulating SMA pathology and provide more definitive conclusions on the contribution of liver-specific SMN depletion on normal pancreatic function.

      We agree that this would be very informative. However, to do this would require initiation of a new breeding program that will take more than a year to arrive at the right genotypes. Although valuable, it is beyond the scope of the present study.

      The authors should also delineate the role of hepatic SMN in pancreatic function, and how the intrinsic liver-specific loss of SMN directly impacts the pancreas. Currently, literature demonstrates that the fatty liver phenotype in SMA patients is a primary SMN-dependent hepatocyte-intrinsic liver defect associated with mitochondrial and other hepatic metabolism implications (see Leow et al, 2024 J Clin Invest). Given that the authors describe that SMN protein levels are not altered in the pancreas of AlbCre/+;Smn2B/F7 mice at P19, the authors ought to clarify how pancreas development and function is impacted in this mouse model, whether in-utero or postnatally. This could potentially underscore the cross-talk between liver SMN and pancreas function.

      We have discussed the relationship between hepatic SMN and pancreatic function in the Discussion at lines 284-298 and 334-359.

      Authors should also perform some metabolic tolerance tests to both oral glucose and insulin at an older age (e.g. P60) to study their homeostasis in these mice. These would help to substantiate the authors' conclusion and provide the paper with a greater level of novelty.

      We thank the reviewer for this suggestion. A full panel of metabolic hormones related to glucose metabolism at P19 and P60 has been included, supported by additional figures that enhance the manuscript's novelty and depth.

      Authors mentioned in the Discussion in lines 238 to 240: "Altogether, our findings underscore the necessity of conducting further investigations at later time points to unveil potential modifications in other pathways and their repercussions on liver physiology". Please elucidate the effects of longer term liver-specific depletion of SMN beyond P19, such as the onset of NAFLD or a diabetic phenotype due to pancreatic dysfunctions.

      We extended our data to include P60 mice and performed liver and pancreatic analyses at these time points. The observed effects were transient, possibly due to the stochastic nature of Cre expression.

      In addition, while AlbCre/+;Smn2B/F7 mice had similar weight gain trends as controls, it does appear that AlbCre/+;Smn2B/F7 mice weigh more than their controls by P60 (Figure 9C). This data would provide more convincing evidence of the metabolic defects observed in these mice.

      As per the reviewer’s suggestion, we included new data (Figure 9D) showing % weight gain at P60 normalized to basal weight at P7. However, no statistically significant differences were detected.

      Other than protein quantification, authors should perform immunohistochemistry or in-situ hybridization of SMN and imaging of AlbCre/+;Smn2B/F7 organs to validate the loss of liver-specific SMN. It is unclear from western blots that the expression of SMN is only in hepatocytes.

      We thank the reviewer for the suggestion. Unfortunately, SMN antibodies have not produced reliable tissue immunostaining. To address this, we performed Cre immunostaining of the liver, pancreas, and spinal cord at P19, and the liver at P60, which demonstrated liver-specific expression. These results are now included in Supplemental Figure 1.

      Authors should consider re-wording lines 228 through 231: "While our current analysis did not reveal significant differences in AlbCre/+;Smn2B/F7 mice, the observed upward trend in transferrin and HO levels suggests ongoing changes in iron metabolism, which may not be fully manifested at P19". Alternatively, a higher number of mouse samples would allow them to qualify this statement. Authors should also consider comparing levels of liver biomarkers such as ALT and AST, to check for liver homeostatic function.

      We have removed speculative statements to avoid unsupported claims.

      Recommendations:

      The methods and additional details to generate the AlbCre/+;Smn2B/F7 should be explained better in section 2.1 of the Results. It is potentially confusing as to why these mice had to carry both 2B and F7 alleles. Additionally, the role of the F7 allele is not deliberately clear in the Introduction.

      Additional details are now included in the Introduction (lines 87-90) and the Results section (lines 104-124).

      Authors should refer to Leow et al 2024 (J Clin Invest) and discuss how their current findings compare with their hepatocyte-intrinsic SMN deficiency IPSCs model.
      We note a previous publication (Deguise et al 2021 Cell Mol Gastroenterol Hepatol) by the authors which characterized the Smn2B/- mouse model and its NAFLD/NASH features. From our understanding, the Smn2B/- mouse model appears to recapitulate SMA phenotype well, such as the early onset of hepatic steatosis and neurological conditions. As a follow-up to this publication, authors should discuss why this current study of a liver-specific SMN depletion is important and relevant to the study of SMA pathology.

      We thank the reviewer for the insightful suggestions. We have included a discussion of these findings and their relevance to the study of SMA pathology in lines 284-298 and 309-322.

      Minor corrections:

      Abstract (line 32) reads: "a decrease in insulin producing alpha-cells and an increase in glucagon producing beta-cells". The authors should clarify and correct as insulin producing beta-cells and glucagon producing alpha-cells.

      Thank you for catching the error. We corrected the description of insulin- and glucagon-producing cells.

      Please clarify the number and gender of mice used for weight tracking and motor function experiments up to P60 (Figure 9C). It would be inappropriate if male and female mice were plotted together. If so, authors should stratify data by gender.

      We thank the reviewer for the suggestion. Unfortunately, we did not stratify the animals by sex due to the unequal and insufficient number of males and females in our study. To address this, we normalized weight gain to each animal’s starting weight, and no significant differences were observed (now shown in Figure 9D).

      The number of figures should be reduced. We recommend merging Figures 1 and 2 (generation of AlbCre/+;Smn2B/F7 mouse line and validation) and Figures 3 and 4 (liver function). Figures 5 through 9 may be supplemental figures instead.

      We thank the reviewer for the suggestions. We merged Figures 1 and 2, and Figures 3 and 4, as requested. However, we would prefer to keep the other figures within the main results as they assess the impact of liver-specific depletion of SMN on other pathologies within the mouse model.

      Standardize the use of asterisks and reporting p-values in Figure 2. All other figures in the manuscript utilize asterisks, but Figures 2C', 2D' and 2E' use p-values across comparisons.

      P-values were included only when they approached statistical significance, providing additional clarity to the results.

      It is unclear what the white arrow in Figure 7A indicates.

      It is meant to point out the absence of an innervating axon. Please see Figure 5 legend, lines 801-802.

      Note spelling errors in Figures 8B and 8C: 'Muscle flber'.

      Thank you for catching this. We have corrected the typo to indicate muscle fiber instead.

      Please clarify if muscle fiber size should be indicated as µm2 instead of µ2 in Figures 8B and 8C, as written in Materials and Methods under line 394.

      Thank you for catching this. We corrected the typo to indicate µm2 instead.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      (1) The overall conclusion, as summarized in the abstract as "Together, our study documents the diversification of locomotor and oculomotor adaptations among hunting teleost larvae" is not that compelling. What would be much more interesting would be to directly relate these differences to different ecological niches (e.g. different types of natural prey, visual scene conditions, height in water column etc), and/or differences in neural circuit mechanisms. While I appreciate that this paper provides a first step on this path, by itself it seems on the verge of stamp collecting, i.e. collecting and cataloging observations without a clear, overarching hypothesis or theoretical framework.

      There are limited studies on the prey capture behaviors of larval fishes, and ours is the first to compare multiple species systematically using a common analysis framework. Our analysis approach could have uncovered a common set of swim kinematics and capture strategies shared by all species; but instead, we found that medaka used a monocular strategy rather than the binocular strategy of cichlids and zebrafish. Our analysis similarly could have revealed first-feeding larvae of all species go through a “bout” stage, which was previously proposed as important for sensorimotor decision making (Bahl et al., 2019), but instead we found that medaka and some cichlids have more continuous swimming from an early life stage. Finally, the rate at which prey capture kinematics evolves is not known. Our approach could have revealed rapid diversification of feeding strategies in cichlids (similarly to how adult feeding behavior evolves), but instead we found smaller differences within cichlids than between cichlids and medaka.

      (2) The data to support some of the claims is either weak or lacking entirely.

      Highlighted timestamps in videos, new stats in fig 1H and fig 2, updated supplementary figures now provide additional support for claims.

      - It would be helpful to include previously published data from zebrafish for comparison.

      We appreciate the suggestion. Mearns et al. (2020) provided a comprehensive account of prey capture in zebrafish larvae in an almost identical setup with similar analyses. We do not feel it is necessary to recount all the findings in that paper here. There are many studies on prey capture in zebrafish from the past 20 years, and reproducing these here would not add anything to that extensive pre-existing literature.

      - Justification is required for why it is meaningful to compare hunting strategies when both fish species and prey species are being varied. For instance, artemia and paramecia are different sizes and have different movement statistics.

      We added text explaining why different food was chosen for medaka/cichlids. There is no easy way to stage match fishes as evolutionarily diverged as cichlids, medaka, and zebrafish. Size is a reasonable metric within a species, but there is no guarantee that size-matched larvae of two different species are at the same level of maturity. Therefore, we thought the most appropriate stage to address is when larvae first start feeding, as this enables us to study innate prey capture behavior before any learning or experience-dependent changes have taken place. Given that zebrafish, medaka and cichlid larvae are different sizes when they first start feeding, it was necessary to study their hunting behavior with different prey items.

      - It would be helpful in Figure 1A to add the abbreviations used elsewhere in the paper. I found it slightly distracting that the authors switch back and forth in the paper between using "OL" and "medaka" to refer to the same species: please pick one and then remain consistent.

      Medaka is the common name for the Japanese rice fish, O. latipes. Cichlids do not have common names and are only referred to by their scientific names. Since readers are more likely to be familiar with the common name, medaka, we now use medaka (OL) throughout the manuscript, which we hope makes the text clearer.

      - The conceptual meaning of behavioral segmentation is somewhat unclear. For zebrafish, the bouts already come temporally segmented. However in medaka for instance, swimming is more continuous, and the segmentation is presumably more in terms of "behavioral syllables" as have been discussed for example mouse or drosophila behavior (in the last row of Figure S1 it is not at all obvious why some of the boundaries were placed at their specific locations). It's not clear whether it's meaningful to make an equivalence between syllables and bouts, and so whether for instance Figure 1H is making an apples-to-apples comparison.

      We clarified the text to say we are comparing syllables, rather than bouts.

      - The interpretation of 1H is that "medaka exhibited significantly longer swims than cichlids"; however this is not supported by the appropriate statistical test. The KS test only says that two probability distributions are different; to say that one quantity is larger than another requires a comparison of means.

      Updated Fig 1H: we now use a bootstrap test (difference of medians) and have replotted the data as violin plots.
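
      A bootstrap comparison of medians of the kind mentioned above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' analysis code: the function name, the number of resamples, the percentile interval, and the example durations are all hypothetical.

```python
import random
import statistics


def bootstrap_median_difference(a, b, n_boot=2000, seed=0):
    """Bootstrap median(a) - median(b).

    Returns the observed difference and a 95% percentile interval.
    If the interval excludes zero, the data support a directional claim
    (e.g. "group a has longer swims than group b").
    """
    rng = random.Random(seed)
    observed = statistics.median(a) - statistics.median(b)

    diffs = []
    for _ in range(n_boot):
        # Resample each group independently, with replacement.
        resampled_a = rng.choices(a, k=len(a))
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(statistics.median(resampled_a) - statistics.median(resampled_b))
    diffs.sort()

    lower = diffs[n_boot * 25 // 1000]
    upper = diffs[n_boot * 975 // 1000 - 1]
    return observed, lower, upper


# Hypothetical syllable durations in milliseconds (illustrative values only).
medaka_ms = [300 + 10 * i for i in range(30)]
cichlid_ms = [100 + 5 * i for i in range(30)]
observed, lower, upper = bootstrap_median_difference(medaka_ms, cichlid_ms)
```

      Unlike a KS test, which only detects that two distributions differ, this statistic has a sign, so it speaks directly to which group has the longer swims.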

      - I think the evidence that there are qualitatively different patterns of eye convergence between species is weak. In Figure 2A I admire the authors addressing this using BIC, and the distributions are clearly separated in LA (the Hartigan dip test could be a useful additional test here). However for LO, NM, and AB the distributions only have one peak, and it's therefore unclear why it's better to fit them with two Gaussians rather than e.g. a gamma distribution. Indeed the latter has fewer parameters than a two-gaussian model, so it would be worthwhile to use BIC to make that comparison. The positions of the two Gaussians for LO, NM, and AB are separated by only a handful of degrees (cf LA, where the separation is ~20 degrees), which further supports the idea that there aren't really two qualitatively different convergence states here.

      Added explanation to text.
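
      The reviewer's point about parameter counts can be made concrete with the BIC formula, BIC = k ln(n) - 2 ln(L). The sketch below is illustrative only: it assumes a one-dimensional two-Gaussian mixture with 5 free parameters and a gamma distribution with 2 (location fixed at zero), and the function names are hypothetical. It computes how much log-likelihood the larger model must gain before BIC prefers it.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate the preferred model."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Free parameters of each candidate model for a 1D distribution (assumed).
K_TWO_GAUSSIANS = 5  # two means, two variances, one mixing weight
K_GAMMA = 2          # shape and scale, with location fixed at zero


def min_loglik_gain(n_obs, k_big=K_TWO_GAUSSIANS, k_small=K_GAMMA):
    """Log-likelihood gain the larger model needs just to match the smaller model's BIC."""
    return 0.5 * (k_big - k_small) * math.log(n_obs)
```

      With these counts, the two-Gaussian fit has to improve the maximized log-likelihood by more than 1.5 ln(n) over the gamma fit; otherwise BIC favors the single-distribution description.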

      - Figure S2 is unfortunately misleading in this regard. I don't claim the authors aimed to mislead, but they have made the well-known error of using colors with very different luminances in a plot where size matters (see e.g. https://www.r-project.org/conferences/DSC2003/Proceedings/Ihaka.pdf).

      Thus, to the eye, it appears there's a big valley between the red and blue regions, but actually, that valley is full of points: it's really just one big continuous blob.

      Kernel density estimates of eye convergence angles were added to Figure S2. The point we wish to make is that there is higher density when both eyes are rotated inwards (converged) in cichlids, but not medaka (O. latipes). The valley between converged and unconverged states being full of points is due to (1) slight variation in the placement of key points in SLEAP, which blurs the boundary between states, and (2) the fact that the eye convergence angle must pass through the valley in order to become converged, so necessarily there are points in between the two extremes of eye convergence.
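
      The argument that a populated valley is compatible with two distinct states can be illustrated with a one-dimensional Gaussian kernel density estimate. This is a sketch under assumptions, not the method used for Figure S2 (which may well be two-dimensional): the function name, the bandwidth, and the angle values are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical eye convergence angles in degrees (illustrative values only).
unconverged = [18, 19, 20, 20, 21, 22]
converged = [48, 49, 50, 50, 51, 52]
transitions = [30, 35, 40]  # frames captured while the eyes move between states

density = gaussian_kde(unconverged + converged + transitions, bandwidth=3.0)
peak_low, valley, peak_high = density(20), density(35), density(50)
```

      In this toy example the density at the two states is several times higher than in between, even though the intermediate angles are not empty.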

      - In Figure 2D please could the authors double-check the significance of the difference between LO and NM: they certainly don't look different in the plot.

      Thank you for flagging this. We realize the way we previously reported the stats was open to misinterpretation. We have updated figure 2C, D and F to use letters to indicate statistical groupings, which hopefully makes it clearer which species are statistically different from each other.

      - In Figure 2G it's not clear why AB is not included. It is mentioned that the artemia was hard to track in the AB videos, but the supplementary videos provided do not support this.

      The contrast of the artemia in the AB videos is sufficiently different from the other cichlid videos that our pre-trained YOLO model fails. Retraining the model would be a lot of extra work and we feel like a comparison of three species is sufficient to address the sensorimotor transformations that occur over the course of prey capture in cichlids.

      - The statement "Zebrafish larvae have a unique swim repertoire during prey capture, which is distinct from exploratory swim bouts" is not supported by the work of others or indeed the authors' own work. In Figure 4F all types of bouts can occur at any time, it's just the probability at which they occur that varies during prey capture versus other times (see also Mearns et al (2020) Figure S4B).

      The point is well taken that there probably is not a hard separation between spontaneous and prey capture swims based on tail kinematics alone, which is also shown in Marques et al. (2018). However, we think that figure 2I of Mearns et al., which plots the probability of swims being drawn from different parts of the behavior space during prey capture (eyes converged) or not (eyes unconverged), shows that the repertoire of swims during the two states is substantially different. Points are blue or red; there are very few pale blue/pale red points in that figure panel. Figure S4B is showing clustered data, and clustering is a notoriously challenging problem for which there exists no perfect solution (Kleinberg, 2002). The clusters in Mearns et al. incorporated information about transition structure, as this was necessary for obtaining interpretable clusters for subsequent analyses. However, a different clustering approach could have yielded different boundaries, which may have shown more (or less) separation of bout types during prey capture/exploratory swimming. Therefore, we have updated the text to say that zebrafish preferentially perform different swim types during prey capture and exploration, and re-interpreted the behavior of cichlids similarly.

      - More discussion is warranted of the large variation in the number of behavioral clusters found between species (11-32). First, how much is this variation really to be trusted? I appreciate the affinity propagation parameters were the same in all cases, but what parameters "make sense" is somewhat dependent on the particular data set. Second, if one does believe this represents real variation, then why? This is really the key question, and it's unsatisfying to merely document it without trying to interpret it.

      Extended paragraph with more interpretation.

      - What is the purpose of "hovers"? Why not stay motionless? Could it be a way of reducing the latency of a subsequent movement? Is this an example of the scallop theorem?

      Added a couple of sentences speculating on function.

      - I'm not sure "spring-loaded" is a good term here: the tension force of a coiled tail is fairly negligible since there's little internal force actively trying to straighten it.

      Rewrote this part to highlight that fish spring toward the prey, without the implication that tension forces in the tail are responsible for the movement. However, we are not aware of any literature measuring passive forces within the tail of fishes. Presumably the notochord is relatively stiff and may provide an internal force trying to straighten the tail.

      - There are now several statements for which no direct evidence is presented. We shouldn't have to rely on the author's qualitative impressions of what they observed: show us quantitative analysis.

      * "often hover"

      * "cichlids often alternate between approaches and hover swims"

      * "over many hundreds of milliseconds"

      * "we have also observed suction captures and ram-like attacks"

      * "may swim backwards"

      * "may expel prey from their mouth"

      * "cichlid captures often occur in two phases"

      Added references to supplementary videos with timestamps to highlight these behaviors.

      - I don't find it plausible that sated fish continue hunting prey that they know they're not going to eat just for the practice.

      Removed the speculation.

      - In Figure 3 is it not possible to include medaka, based on the hand-tracked paramecia?

      The videos are recorded at high frame rate, so it would be a lot of additional work to track these manually. Furthermore, earlier in prey capture it is very difficult to tell by watching videos which prey the medaka are tracking, especially as single paramecia can drift in and out of focus in the videos. Since there is no eye convergence, it is very difficult to ascertain when tracking a given prey begins. In Fig 4, it was only possible to track paramecia by hand since it is immediately prior to the strike and from the video it is possible to see which paramecium the fish targeted. Our analysis of heading changes was performed over the 200 ms prior to a strike, which we think is a conservative enough cutoff to say that fish were probably pursuing prey in this window (it is shorter than the average behavioral syllable duration in medaka).

      - Figure 3 (particularly 3D) suggests the interesting finding that LA essentially only hunt prey that is directly in front of them (unlike LO and NM, the distribution of prey azimuth actually seems to broaden slightly over the duration of hunting events).

      This is worthy of discussion.

      We offer a suggestion for the many instances of prey capture being initiated in the central visual field in LA later in the manuscript when we discuss spitting behavior. We have added text to make this point earlier in the manuscript. The increase in azimuthal range at the end of prey capture may be due to abort swims (e.g. supp. vid. 1, 00:21). The widening of azimuthal angles is present in LO and NM also and is not unique to LA.

      - The reference Ding et al (2016) is not in the reference list.

      Wrong paper was referenced. Should be Ding 2019, which has been added to bibliography.

      - I am not convinced that medaka exhibit a unique side-swing behavior. I agree there is this tendency in the example movie, however, the results of the quantification (Figure 4) are underwhelming. First, cluster 5 in 4K appears to include a proportion of cases from LA and AB. These proportions may be small, but anything above zero means this is not unique to medaka. Second, the heading angle (4N) starts at 4 degrees for LA and 8 degrees for medaka. This difference is genuine but very small, much smaller than what's drawn in the schematic (4M). I'm not sure it's justifiable to call a difference of 4 degrees a qualitatively different strategy.

      We have changed the text to highlight that side swing is highly enriched in medaka. Comparing 4J to 3B we would argue that there is a qualitative difference in the strategy used to capture prey in the cichlid larvae we study here and medaka. We agree that further work is required to understand distance estimation behaviors in different species. In this manuscript, we use heading angle as a proxy for how prey position might change on the retina over a hunting sequence. But as the heading and distance are changing over time, the actual change in angle on the retina for prey may be much larger than the ~8 degree shift reported here. The actual position of the prey is also important here, which, for reasons mentioned above, we could not track. Given the final location of prey in the visual field prior to the strike (Fig 4J), the most parsimonious explanation of the data is that the prey is always in the monocular visual field. In cichlids, the prey is more-or-less centered in the 200 ms preceding the strike. While it is true that the absolute difference in heading is 4 degrees, when converted to an angular velocity (4N, right), the medaka (OL) effectively rotate twice as fast as LA (40 deg/s vs 20 deg/s), which we think is a substantial difference and evidence of a different targeting strategy.
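
      The conversion from heading change to angular velocity quoted above is simple arithmetic and can be checked directly. The sketch below assumes that the heading offsets given in the review (about 4 degrees for LA, about 8 degrees for medaka) are closed over the full 200 ms analysis window at a constant rate; the function name is hypothetical.

```python
def mean_angular_velocity(heading_change_deg, window_ms):
    """Mean rotation rate in deg/s for a heading change over a time window."""
    return heading_change_deg * 1000.0 / window_ms


WINDOW_MS = 200  # analysis window before the strike, as stated in the response

la_rate = mean_angular_velocity(4, WINDOW_MS)  # LA: about 4 degrees
ol_rate = mean_angular_velocity(8, WINDOW_MS)  # medaka (OL): about 8 degrees
ratio = ol_rate / la_rate
```

      Under these assumptions LA comes out at 20 deg/s and medaka at 40 deg/s, which is the twofold difference the response refers to.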

      - 4K: This is referred to in the caption as a confusion matrix, which it's not.

      Fixed.

      - 4N right panel: how many fish contributed to the points shown?

      Added to figure legend (n=113, LA; n=36, OL). Same data in left and right panels.

      - In the Discussion it is hypothesized that medaka use their lateral line in hunting more than in other species. Testing this hypothesis (even just compared to one other species) would be fairly straightforward, and would add significant interest to the paper overall.

      We agree that this is an interesting experiment for follow up studies, but it is beyond the scope of the current manuscript as we do not have the appropriate animal license for this experiment.

      Reviewer 2:

      The paper is rather descriptive in nature, although more context is provided in the discussion. Most figures are great, but I think the authors could add a couple of visual aids in certain places to explain how certain components were measured.

      Added new supplemental figure (Supp Fig 2)

      Figure 1B- it could be useful to add zebrafish and medaka to the scientific names (I realize it's already in Figure A but I found myself going back and forth a couple of times, mostly trying to confirm that O. latipes is medaka).

      Added common names to 1B, sprinkled reminders of OL/medaka throughout text.

      Figure 1G. I wasn't sure how to interpret the eye angle relative to the midline. Can they rotate their eyes or is this due to curvature in the 'upper' body of the fish? Adding a schematic figure or something like that could help a reader who is not familiar with these methods. Related to this, I was a bit confused by Figure 2A. After reading the methods section, I think I understand - but a little cartoon to describe this would help. It also reminds the reader (especially if they don't work with fish) that fish eyes can rotate. I also wanted to note that initially, I thought convergence was a measure of how the two eyes were positioned relative to the prey given the emphasis given on binocular vision, and only after reading certain sections again did I realize convergence was a measure of eye rotation/movement.

      New supplemental figure explaining how eye tracking is performed

      Figure 3. It was not immediately clear to me what onset, middle, and end represented - although it is explained in the caption. I think what tripped me up is the 'eye convergence' title in the top right corner of Figure 3A.

      Updated figure with schematic illustrating that time is measured relative to eye convergence onset and end.

      The result section about attack swim, S-strike, capture spring, etc. was a bit confusing to read and could benefit from a couple of concise descriptions of these behaviors. For example, I am not familiar with the S strike but a couple of paragraphs into this section, the reader learns more about the difference between S strike vs. attack swim. This can be mentioned in the first paragraph when these distinct behaviors are mentioned.

      Added description of behavior earlier in text.

      Figure 4. Presents lots of interesting data! I wonder if using Figure 1E could help the reader better understand how these measurements were taken.

      New supplemental figure added, explaining how tail tracking is performed.

      I probably overlooked this, but I wonder why so many panels are just focused on one species.

      Added explanation to the text.

      Is the S-shaped capture strategy the same as an S strike?

      Clarified in text to say "S-strike-like". This is a description of prey capture from adult largemouth bass in New et al. (2002). From the still frames shown in that paper, the kinematics look similar to an S-strike or capture spring. The important point we wish to make is that the tail is coiled in an S-shape prior to a strike, which indicates that a kinematically similar behavior exists in fishes beyond just larval cichlids and zebrafish.

      At the end of the page, when continuous swimming versus interrupted swimming is discussed, please remind the reader that medaka shows more continuous swimming (longer bouts).

      Added "while medaka swim continuously with longer bouts ("gliding")".

      After reading the discussion, it looks like many findings are unique. For example, given that medaka is such a popular model species in biology, it strikes me that nobody has ever looked into their hunting movements before. If their findings are novel, perhaps they should state so it is clear that the authors are not ignoring the literature.

      We have highlighted what we believe to be the novelty of our findings (first description of prey capture in larval cichlids and medaka). To our knowledge, we are the first to describe hunting in medaka; but there is an extensive literature on medaka dating back to the early 20th century, some of which is only published in Japanese. We have done our best to review the literature, but we cannot rule out that there are papers that we missed. No English language article or review we found mentions literature on hunting behavior in medaka larvae.

      Reviewer 3:

      More evidence is needed to assess the types of visual monocular depth cues used by medaka fish to estimate prey location, but that is beyond the scope of this compelling paper. For example, medaka may estimate depth through knowledge of expected prey size, accommodation, defocus blur, ocular parallax, and/or other possible algorithms to complement cues from motion parallax.

      Added sentence to discussion highlighting that other cues may also contribute to distance estimation in cichlids and medakas. Follow-up studies will require new animal license.

      None. It's quite nice, timely, and thorough work! For future work, one could use 3D pose estimation of eye and prey kinematics to assess the dynamics of the 2D image (prey and background) cast onto the retina. This sort of representation could be useful to infer which monocular depth cues may be used by medaka during hunting.

      Great suggestion for follow-up studies. Bolton et al. and Mearns et al. both find changes in z associated with prey capture, and it would be interesting to see how other fish species use the full 3-dimensional water column during prey capture, especially considering the diversity of hunting strategies in adult cichlids (ranging from piscivorous species, like LA, to algal grazers).

      In Figure 4N, you use "change in heading leading up to a strike as a proxy for the change in visual angle of the prey for cichlids and medaka." This proxy makes sense, but you also have the eye angles and (in some cases) the prey positions. One could estimate the actual change in visual angle from this information, which would also allow one to measure whether the fish are trying to stabilize the position of the prey on a high-acuity patch of the retina during the final moments of the hunt. This information may also shed light on which monocular depth cues are used.

      As addressed in comment to reviewer 1, this would require actually manually tracking individual paramecia over hundreds of frames. It is not possible to determine exactly when hunting begins in medaka, and it is prone to errors if medaka switch between targets over the course of a hunting episode. This question is better addressed with psychophysics experiments in embedded animals where it is possible to precisely control the stimulus, but this requires new animal licenses and is beyond the scope of this paper.

      In Figure 5, you could place the prey object a little farther from the D. rerio fish for the S-strike diagram.

      Fixed.

      Figure 4F legend should read "...at the peak of each bout."

      Fixed.

    1. For Nieto de Herrera, it was clear that jazz, the dance, was the creation of a single individual, Oscar Duryea, a well-known dancer, choreographer, and dance instructor in New York at the time, who had contributed in 1914 to the standardization and popularization of the steps for the foxtrot and who apparently invented a new dance in 1916: the ‘two-two’.47 Thus, the intricate set of dancing instructions that Nieto de Herrera presented in Spanish to her readers in Havana came from a set of instructions published previously in English by Duryea. All things considered, following the steps seems to make evident that jazz dance – as prescribed by Duryea and amplified by Nieto de Herrera – was bounded within the stylistic realm of foxtrot, tango, waltz, one-step, two-step, and other ‘society dances’ of the 1910s. While the specificity and sophistication of the instructions reminds us, in a way, of much older society dances – such as the contradance – they also point to social parameters defined by Victorian and bourgeois sensibilities:

      Walk slowly during three steps in the line of direction, starting with the left foot. Stop on the third step, balancing the weight of the body [bending?] over the left foot; count long ‘one, two, three,’ which should account for six [beats] in the music. Quickly bring the right foot next to the left foot, put the left foot forward and change the step so that it falls over the right foot. Count two and four. Repeat the last two steps starting with the right foot and stopping over the right foot when giving the third step forward and count two. Advance with the left foot and stop for an instant. (1). Cross the right foot behind the left foot, closing suddenly to find the step of the

      45 ‘[E]l jazz, elegantísimo baile que está haciendo furor en los salones de alta sociedad neoyorkinos. El jazz es el último derivado del Fox Trot, más animado, más entusiasta y más elegante que el primero, pues su ritmo hace lucir, en toda su amplitud, la gracia y la gentileza de los bailadores.’ El Universal, 17 April 1918, 8. The ‘Dancing casino’ parties, announced as ‘recepciones’ and ‘bailes de etiqueta’ were to take place at the business’s salons, located at ‘Casa No. 12 de la calle San Juan de Letrán’.
      46 See Wald, How the Beatles Destroyed Rock ’n’ Roll, 49–59.
      47 ‘Oscar Duryea,’ in Sonny Watson’s Street Swing website: www.streetswing.com/histmai2/d2durya1.htm (accessed 13 August 2022); The Northeastern Reporter, vol. 90, 4 January–22 March 1910, 1019, 1141.
      362 Ospina Romero The Dawn of the Jazz Age in the Caribbean

      As a dancer who trained in ballet, modern, jazz, Afro-Caribbean, hip hop, African, and many more styles for about 15 years, I find it quite interesting to learn more about the origin of jazz, because there is a well-known saying that "ballet is the foundation of all dance." For a while I believed that, because ballet is what we all started in, it's the most popular, etc. But reading this makes me wonder how untrue that is. A series of genres can come together to form one new style of dance; it's not always ballet, or ballet on its own. I wonder what caused ballet to take the title. (And for a dancer who has recently left high school for dance and has forgotten most of the dance terms, the bottom is really helpful.)

    1. Reviewer #1 (Public review):

      Summary:

      Dorrego-Rivas et al. investigated two different DA neurons and their neurotransmitter release properties in the main olfactory bulb. They found that the two different DA neurons in mostly glomerular layers have different morphologies as well as electrophysiological properties. The anaxonic DA neurons are able to self-inhibit but the axon-bearing ones are not. The findings are interesting and important to increase the understanding both of the synaptic transmissions in the main olfactory bulb and the DA neuron diversity. However, there are some major questions that the authors need to address to support their conclusions.

      (1) It is known that there are two types of DA neurons in the glomerular layer with different diameters and capacitances (Kosaka and Kosaka, 2008; Pignatelli et al., 2005; Angela Pignatelli and Ottorino Belluzzi, 2017). In this manuscript, the authors need to articulate better in which layer the imaging and ephys recordings took place: all glomerular layers, or with an exception. They also have to report the electrophysiological properties of their recordings, including capacitance, input resistance, etc.

      (2) It is understandable that recording the DA neurons in the glomerular layer is not easy. However, the authors still need to increase their n's and repeat the experiments at least three times to make their conclusion more solid. For example (but not limited to), Fig 3B, n=2 cells from 1 mouse. Fig.4G, the recording only has 3 cells.

      (3) The statistics also use pseudoreplicates. It might be better to present the biological replicates, too.

      (4) In Figure 4D, the authors report the values in the manuscript. It is recommended to make a bar graph to be more intuitive.

      (5) In Figure 4F and G, although the data with three cells suggest no phenotype, the kinetics looked different. So, the authors might need to explore that aside from increasing the n.

      (6) Similarly, for Figure 4I and J, L and M, it is better to present and analyze it like F and G, instead of showing only the after-antagonist effect.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Thank you for your constructive feedback and recognition of our work. We followed your suggestion and improved the accuracy of the language used to interpret some of our findings. 

      Summary:

      The present study by Mikati et al demonstrates an improved method for in-vivo detection of enkephalin release and studies the impact of stress on the activation of enkephalin neurons and enkephalin release in the nucleus accumbens (NAc). The authors refine their pipeline to measure met and leu enkephalin using liquid chromatography and mass spectrometry. The authors subsequently measured met and leu enkephalin in the NAc during stress induced by handling, and fox urine, in addition to calcium activity of enkephalinergic cells using fiber photometry. The authors conclude that this improved tool for measuring enkephalin reveals experimenter handling stress-induced enkephalin release in the NAc that habituates and is dissociable from the calcium activity of these cells, whose activity doesn't habituate. The authors subsequently show that NAc enkephalin neuron calcium activity does habituate to fox urine exposure, is activated by a novel weigh boat, and that fox urine acutely causes increases in met-enk levels, in some animals, as assessed by microdialysis.

      Strengths:

      A new approach to monitoring two distinct enkephalins and a more robust analytical approach for more sensitive detection of neuropeptides. A pipeline that potentially could help for the detection of other neuropeptides.

      Weaknesses:

      Some of the interpretations are not fully supported by the existing data or would require further testing to draw those conclusions. This can be addressed by appropriately tempering interpretations and acknowledging other limitations that the authors did not cover, which arise from procedural differences between experiments.

      We have taken time to go through the manuscript ensuring we are more detailed and precise with our interpretations as well as appropriately acknowledging limitations. 

      Reviewer #2 (Public Review):

      Thank you for your constructive and thorough assessment of our work. In our revised manuscript, we adjusted the text to reflect the references you mentioned regarding the methionine oxidation procedure. Additionally, we expanded the methods section to include the key details of the statistical tests and procedures that you outlined. 

      Summary:

      The authors aimed to improve the detection of enkephalins, opioid peptides involved in pain modulation, reward, and stress. They used optogenetics, microdialysis, and mass spectrometry to measure enkephalin release during acute stress in freely moving rodents. Their study provided better detection of enkephalins due to the implementation of previously reported derivatization reaction combined with improved sample collection and offered insights into the dynamics and relationship between Met- and Leu-Enkephalin in the Nucleus Accumbens shell during stress.

      Strengths:

      A strength of this work is the enhanced opioid peptide detection resulting from an improved microdialysis technique coupled with an established derivatization approach and sensitive and quantitative nLC-MS measurements. These improvements allowed basal and stimulated peptide release with higher temporal resolution, lower detection thresholds, and native-state endogenous peptide measurement.

      Weaknesses:

      The draft incorrectly credits itself for the development of an oxidation method for the stabilization of Met- and Leu-Enk peptides. The use of hydrogen peroxide reaction for the oxidation of Met-Enk in various biological samples, including brain regions, has been reported previously, although the protocols may slightly vary. Specifically, the manuscript writes about "a critical discovery in the stabilization of enkephalin detection" and that they have "developed a method of methionine stabilization." Those statements are incorrect and the preceding papers that relied on hydrogen peroxide reaction for oxidation of Met-Enk and HPLC for quantification of oxidized Enk forms should be cited. One suggested example is Finn A, Agren G, Bjellerup P, Vedin I, Lundeberg T. Production and characterization of antibodies for the specific determination of the opioid peptide Met5-Enkephalin-Arg6-Phe7. Scand J Clin Lab Invest. 2004;64(1):49-56. doi: 10.1080/00365510410004119. PMID: 15025428.

      Thank you for highlighting this. It was not our intention to imply that we developed the oxidation method, but rather that we were able to improve the detection of Met-enkephalin by oxidation of the methionine without compromising the detection resolution of Leu-enkephalin, enabling the simultaneous detection of both peptides. We have addressed this in the manuscript and included the suggested citation.

      Another suggestion for this draft is to make the method section more comprehensive by adding information on specific tools and parameters used for statistical analysis:

      (1) Need to define "proteomics data" and explain whether calculations were performed on EIC for each m/z corresponding to specific peptides or as a batch processing for all detected peptides, from which only select findings are reported here. What type of data normalization was used, and other relevant details of data handling? Explain how Met- and Leu-Enk were identified from DIA data, and what tools were used.

      Thank you for pointing out this source of confusion. We believe it arises because we use a different DIA method than is typically used in other literature. Briefly, we use a DIA method with a targeted inclusion list to ensure MS2 triggering, as opposed to using large isolation widths to capture all precursors for fragmentation, as is typically done with MS1 features. For our method, MS2 is triggered based on the 4 selected m/z values (heavy and light versions of Leu- and Met-Enkephalin peptides) at specific retention time windows with an isolation width of 2 Da, regardless of the MS1 intensity of the peptides.

      (2) Simple Linear Regression Analysis: The text mentions that simple linear regression analysis was performed on forward and reverse curves, and line equations were reported, but it lacks details such as the specific variables being regressed (although figures have labels) and any associated statistical parameters (e.g., R-squared values). 

      Additional detail about the linear regression process was added to the methods section, please see lines 614-618. The R squared values are also now shown on the figure. 

      ‘For the forward curves, the regression was applied to the measured concentration of the light standard as the theoretical concentration was increased. For plotting purposes, we show the measured peak area ratios for the light standards in the forward curves. For the reverse curves, the regression was applied to the measured concentration of the heavy standard, as the theoretical concentration was varied.’
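      The regression described above can be sketched as an ordinary least-squares fit that returns the slope, intercept, and R squared. This is a minimal illustration, not the authors' analysis (which was run in Graphpad Prism); the function name `fit_line` and the concentration values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical forward curve: theoretical vs. measured light-standard
# concentration (amol). These numbers are made up for illustration.
theoretical = [1.0, 5.0, 10.0, 20.0, 40.0]
measured = [1.2, 4.8, 10.5, 19.4, 40.9]
slope, intercept, r2 = fit_line(theoretical, measured)
```

      For a reverse curve, the same function would be applied with the heavy-standard concentrations as `y`.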

      (3) Violin Plots: The proteomics data is represented as violin plots with quartiles and median lines. This visual representation is mentioned, but there is no detail regarding the software/tools used for creating these plots.

      We used Graphpad Prism to create these plots. This detail has been added to the statistical analysis section. See line 630.

      (4) Log Transformation: The text states that the data was log-transformed to reduce skewness, which is a common data preprocessing step. However, it does not specify the base of the logarithm used or any information about the distribution before and after transformation.

      We have added the requested details about the log transformation, and how the data looked before and after, to the statistical analysis section. We followed the convention that log denotes base 10 unless otherwise specified as natural log (base e) or a different base. See lines 622-625.

      ‘The data was log10 transformed to reduce the skewness of the dataset caused by the variable range of concentrations measured across experiments/animals. Prior to log transformation, the measurements failed normality testing for a Gaussian distribution. After the log transformation, the data passed normality testing, which provided the rationale for the use of statistical analyses that assume normality.’
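      To illustrate why a log10 transform helps with data spanning a wide concentration range, the sketch below compares skewness before and after the transform. It uses only the Python standard library and made-up values; it does not reproduce the normality tests the authors ran, and the helper name `skewness` is hypothetical.

```python
import math
import statistics


def skewness(values):
    """Population skewness: mean cubed deviation divided by the cubed SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


# Hypothetical concentrations (amol/sample) spanning orders of magnitude.
raw = [2.0, 3.0, 5.0, 8.0, 20.0, 60.0, 400.0]
logged = [math.log10(v) for v in raw]

# The long right tail of `raw` is compressed by the transform, so the
# skewness of `logged` is closer to zero than that of `raw`.
skew_raw = skewness(raw)
skew_logged = skewness(logged)
```

      A formal normality test (for example Shapiro-Wilk) would still be needed to justify parametric statistics; reduced skewness alone is only suggestive.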

      (5) Two-Way ANOVA: Two-way ANOVA was conducted with peptide and treatment as independent variables. This analysis is described, but there is no information regarding the software or statistical tests used, p-values, post-hoc tests, or any results of this analysis.

      Information about the two-way ANOVA analysis has been added to the statistical analysis section. Additionally, more detailed information has been added to the figure legends about the statistical results. Please see lines 625-628.

      ‘Two-way ANOVA testing with peptide (Met-Enk or Leu-Enk) and treatment (buffer or stress for example) as the two independent variables. Post-hoc testing was done using Šídák's multiple comparisons test and the p values for each of these analyses are shown in the figures (Figs. 1F, 2A).’ 
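      For readers unfamiliar with the Šídák correction, the textbook single-step adjustment for a family of m comparisons is p_adj = 1 - (1 - p)^m. The sketch below applies that formula to hypothetical p-values. It is an assumption that this matches the software's exact procedure, which may differ in detail.

```python
def sidak_adjust(p_values):
    """Šídák-adjusted p-values for a family of m comparisons.

    Each raw p-value p becomes 1 - (1 - p) ** m, where m is the number
    of comparisons in the family.
    """
    m = len(p_values)
    return [1 - (1 - p) ** m for p in p_values]


# Hypothetical raw p-values for two post-hoc comparisons
# (e.g. buffer vs. stress within each peptide).
adjusted = sidak_adjust([0.01, 0.04])
```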

      (6) Paired T-Test: A paired t-test was performed on predator odor proteomic data before and after treatment. This step is mentioned, but specific details like sample sizes, and the hypothesis being tested are not provided.

      The sample size is included in the figure legend to which we have included a reference. We have also included the following text to highlight the purpose of this test. See lines 628-630

      A paired t-test was performed on the predator odor proteomic data before and after odor exposure to test the hypothesis that Met-Enk increases following exposure to predator odor (Fig. 3F). These analyses were conducted using Graphpad Prism.
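      As a rough illustration of the paired design, the t statistic is the mean of the within-animal differences divided by its standard error. The sketch below computes the statistic and degrees of freedom only; obtaining a p-value requires the t distribution, which is not in the Python standard library. The function name and data are hypothetical.

```python
import math
import statistics


def paired_t(before, after):
    """Paired t statistic and degrees of freedom for (after - before)."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 denominator
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical log10 Met-Enk values before and after odor exposure.
t_stat, dof = paired_t([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 5.0])
```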

      (7) Correlation Analysis: The text mentions a simple linear regression analysis to correlate the levels of Met-Enk and Leu-Enk and reports the slopes. However, details such as correlation coefficients, and p-values are missing.

      We apologize for the use of the word correlation as we think it may have caused some confusion and have adjusted the language accordingly. Since this was a linear regression analysis, there is no correlation coefficient. The slope of the fitted line is reported on the figures to show the fitted values of Met-Enk to Leu-Enk. 

      (8) Fiber Photometry Data: Z-scores were calculated for fiber photometry data, and a reference to a cited source is provided. This section lacks details about the calculation of z-scores, and their use in the analysis.

      These details have been added to the statistical analysis section. See lines 634-637

      ‘For the fiber photometry data, the z-scores were calculated as described in using GuPPy, which is an open-source Python toolbox for fiber photometry analysis. The z-score equation used in GuPPy is z = (DF/F - mean of DF/F) / (standard deviation of DF/F), where F refers to fluorescence of the GCaMP6s signal.’
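      A z-score subtracts the mean of the DF/F trace from each sample and divides by the trace's standard deviation. The sketch below is a generic standard-library version, not GuPPy's code. It assumes the mean and SD are taken over the whole trace and that the population SD is used, neither of which is stated in the response.

```python
import statistics


def zscore_trace(dff):
    """Z-score a DF/F trace: (value - mean) / standard deviation.

    Assumption: mean and SD are computed over the entire trace, using
    the population SD (n denominator).
    """
    mean = statistics.fmean(dff)
    sd = statistics.pstdev(dff)
    return [(v - mean) / sd for v in dff]


# Hypothetical DF/F samples.
z = zscore_trace([1.0, 2.0, 3.0])
```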

      (9) Averaged Plots: Z-scores from individual animals were averaged and represented with SEM. It is briefly described, but more details about the number of animals, the purpose of averaging, and the significance of SEM are needed.

      We have added additional information about the averaging process in the statistical analysis section. See lines 639-643.

      ‘The purpose of the averaged traces is to show the extent of concordance of the response to experimenter handling and predator odor stress among animals with the SEM demonstrating that variability. The heatmaps depict the individual responses of each animal. The heatmaps were plotted using Seaborn in Python and mean traces were plotted using Matplotlib in Python.’
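      The averaging step can be sketched as a per-timepoint mean across animals, with the SEM computed as the sample standard deviation divided by the square root of the number of animals. The function name and the toy traces below are hypothetical, and the sketch omits plotting.

```python
import math
import statistics


def mean_and_sem(traces):
    """Per-timepoint mean and SEM across animals.

    `traces` is a list of equal-length z-score traces, one per animal.
    """
    n = len(traces)
    means, sems = [], []
    for samples in zip(*traces):  # iterate over timepoints
        means.append(statistics.fmean(samples))
        sems.append(statistics.stdev(samples) / math.sqrt(n))
    return means, sems


# Three hypothetical animals, two timepoints each.
means, sems = mean_and_sem([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
```

      A narrow SEM band around the mean trace indicates that animals responded similarly; the per-animal heatmaps show what the average hides.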

      A more comprehensive and objective interpretation of results could enhance the overall quality of the paper.

      We have taken this opportunity to improve our manuscript following comments from all the reviewers that we hope has resulted in a manuscript with a more objective interpretation of results. 

      Reviewer #3 (Public Review):

      Thank you for your thoughtful review of our work. To clarify some of the points you raised, we revised the manuscript to include more detail on how we distinguish between the oxidized endogenous and standard signal, as well as refine the language concerning the spatial resolution. We also edited the manuscript regarding the concentration measurements. We conducted technical replicates, so we appreciate you raising this point and clarify that in the main text. 

      Summary:

      This important paper describes improvements to the measurement of enkephalins in vivo using microdialysis and LC-MS. The key improvement is the oxidation of met- to prevent having a mix of reduced and oxidized methionine in the sample which makes quantification more difficult. It then shows measurements of enkephalins in the nucleus accumbens in two different stress situations - handling and exposure to predator odor. It also reports the ratio of released met- and leu-enkephalin matching what is expected from the digestion of proenkephalin. Measurements are also made by photometry of Ca2+ changes for the fox odor stressor. Some key takeaways are the reliable measurement of met-enkephalin, the significance of directly measuring peptides as opposed to proxy measurements, and the opening of a new avenue into the research of enkephalins due to stress based on these direct measurements.

      Strengths:

      -Improved methods for measurement of enkephalins in vivo.

      -Compelling examples of using this method.

      -Opening a new area of looking at stress responses through the lens of enkephalin concentrations.

      Weaknesses:

      (1) It is not clear if oxidized met-enk is endogenous or not and this method eliminates being able to discern that.

      We clarified our wording in the text copied below to provide an explanation on how we distinguish between the two. Even after oxidation, the standard signal has a higher m/z ratio due to the presence of the Carbon and Nitrogen isotopes as described in the Chemicals section of the methods ‘For Met Enkephalin, a fully labeled L-Phenylalanine (<sup>13</sup>C<sub>9</sub>, <sup>15</sup>N) was added (YGGFM). The resulting mass shift between the endogenous (light) and heavy isotope-labeled peptide are 7Da and 10Da, respectively.’, so they can still be differentiated from the endogenous signal. We have clarified the language in the results section. See lines 82-87. 

      ‘After each sample collection, we add a consistent known concentration of isotopically labeled internal standard of Met-Enk and Leu-Enk of 40 amol/sample to the collected ISF for the accurate identification and quantification of endogenous peptide. These internal standards have a different mass/charge (m/z) ratio than endogenous Met- and Leu-Enk. Thus, we can identify true endogenous signal for Met-Enk and Leu-Enk (Suppl Fig. 1A,C) versus noise, interfering signals, and standard signal (Suppl. Fig. 1B,D).’

      (2) It is not clear if the spatial resolution is really better as claimed since other probes of similar dimensions have been used.

      Apologies for any confusion here. To clarify, we primarily state that our approach improves temporal resolution, and in a few cases we refer to improved spatiotemporal resolution, which we believe we show. The dimensions of the microdialysis probe used in these experiments allow us to target the nucleus accumbens shell, and the probe is smaller, especially at the membrane level, than a fiber photometry probe.

      (3) Claims of having the first concentration measurement are not quite accurate.

      Thank you for your feedback. To clarify, we do not claim that we have the first concentration measurements, rather we are the first to quantify the ratio of Met-Enk to Leu-Enk in vivo in freely behaving animals in the NAcSh. 

      (4) Without a report of technical replicates, the reliability of the method is not as well evaluated as might be expected.

      We have added these details in the methods section, please see lines 521-530. 

      ‘Each sample was run in two technical replicates and the peak area ratio was averaged before concentration calculations of the peptides were conducted. Several quality control steps were conducted prior to running the in vivo samples. 1) Two technical replicates of a known concentration were injected and analyzed – an example table from 4 random experiments included in this manuscript is shown below. 2) The buffers used on the day of the experiment (aCSF and high K+ buffer) were also tested for any contaminating Met-Enk or Leu-Enk signals by injecting two technical replicates for each buffer. Once these two criteria were met, the experiment was analyzed through the system. If either step failed, which happened a few times, the samples were frozen and the machines were cleaned and restarted until the quality control measures were met.’

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      • The authors should provide appropriate citations of a study that has validated the Enkephalin-Cre mouse line in the nucleus accumbens or provide verification experiments if they have any available.

      Thank you for your comment. We have added a reference validating the Enk-Cre mouse line in the nucleus accumbens to the methods section; it is copied here.

      D.C. Castro, C.S. Oswell, E.T. Zhang, C.E. Pedersen, S.C. Piantadosi, M.A. Rossi, A.C. Hunker, A. Guglin, J.A. Morón, L.S. Zweifel, G.D. Stuber, M.R. Bruchas, An endogenous opioid circuit determines state-dependent reward consumption, Nature 598 (2021) 646–651. https://doi.org/10.1038/s41586-021-04013-0.

      • Better definition of the labels y1,y2,b3 in Figures 1 and S1 would be useful. I may have missed it but it wasn't described in methods, results, or legends.

      Thank you for this comment. We have added this information to the Fig. 1 legend: ‘Y1, y2, b3 refer to the different elution fragments resulting from Met-Enk during LC-MS.’

      • It is interesting that the ratio of KCl-evoked release is what changes differentially for Met- vs Leu. Leu enk increases to the range of met-enk. There is non-detectable or approaching being non-detectable leu-enk (below the 40 amol / sample limit of quantification) in most of the subjects that become apparent and approach basal levels of met-enkephalin. This suggests that the K+ evoked response may be more pronounced for leu-enk. This is something that should be considered for further analysis and should be discussed.

      Thank you for this astute observation, and you make a great point. We have added some discussion of this finding in the results and discussion sections; see lines 111-112 and lines 253-257.

      ‘Interestingly, Leu-Enk showed a greater fold change compared to baseline than did Met-Enk with the fold changes being 28 and 7 respectively based on the data in Fig.1F.’

      ‘We also noted that Leu-Enk showed a greater fold increase relative to baseline after depolarization with high K+ buffer as compared to Met-Enk. This may be due to increased Leu-Enk packaging in dense core vesicles compared to Met-Enk or due to the fact that there are two distinct precursor sources for Leu-Enk, namely both proenkephalin and prodynorphin while Met-Enk is mostly cleaved from proenkephalin (see Table 1 [48]).’

      • For example in 2E, it would be helpful to label in the graph axis what samples correspond to the manipulation and also in the text provide the reader with the sample numbers. The authors interpret the relationship between the last two samples of baseline and posthandling stress as the following in the figure legend "the concentration released in later samples is affected; such influence suggests that there is regulation of the maximum amount of peptide to be released in NAcSh. E. The negative correlation in panel d is reversed by using a high K+ buffer to evoke Met-Enk release, suggesting that the limited release observed in D is due to modulation of peptide release rather than depletion of reserves." However, the correlations are similar between 2D and E and it appears that two mice are mediating the difference between the two groups. The appropriate statistical analysis would be to compare the regressions of the two groups. Statistics for the high K+ (and all other graphs where appropriate) need to be reported, including the r2 and p-value.

      Thank you for your constructive critique. To elucidate the effect of high K+, we have plotted the regression line and reported the slope for Fig. 2E. Notably, the slope is reduced by a factor of 2 and appears to be driven by a large subset of the animals. The statistics for the high K+ graph are shown on the figure (Fig 1F) which test the hypothesis of whether high K+ leads to the release of Leu-Enk and Met-Enk respectively compared to baseline with aCSF. We have added the test statistics to the figure legend for additional clarity. Fig. 1G has no statistics because it is only there to elucidate the ratio between Met-Enk and Leu-Enk in the same samples. We did not test any hypotheses related to whether there are differences between their levels as that is not relevant to our question. The correlation on the same data is depicted in Fig. 1H, and we have added the R<sup>2</sup> value per your request. 

      • The interpretation that handling stress induces enkephalin release from microdialysis experiments is also confounded by other factors. For instance, from the methods, it appears that mice were connected and sample collection started 30 min after surgery, therefore recovery from anesthesia is also a confounding variable, among other technical aspects, such as equilibration of the interstitial fluid to the aCSF running through the probe that is acting as a transmitter and extracellular molecule "sink". Did the authors try to handle the mice post hookup similar to what was done with photometry to have a more direct comparison to photometry experiments? This procedural difference, recording from recently surgerized animals (microdialysis) vs well-recovered animals with photometry should be mentioned in addition to the other caveats the authors mention.

      Thank you for your comment. We are aware of this technical limitation, and it is largely why we sought to conduct the fiber photometry experiments to get at the same question. As you requested, we have included additional language in the discussion to acknowledge this limitation and how we chose to address it by measuring calcium activity in the enkephalinergic neurons, which would presumably be the same cell population whose release we are quantifying using microdialysis. See lines 262-273.  

      ‘Our findings showed a robust increase in peptide release at the beginning of experiments, which we interpreted as due to experimenter handling stress that directly precedes microdialysis collections. However, there are other technical limitations to consider such as the fact that we were collecting samples from mice that were recently operated on. Another consideration is that the circulation of aCSF through the probe may cause a sudden shift in oncotic and hydrostatic forces, leading to increased peptide release to the extracellular space. As such, we wanted to examine our findings using a different technique, so we chose to record calcium activity from enkephalinergic neurons - the same cell population leading to peptide release. Using fiber photometry, we showed that enkephalinergic neurons are activated by stress exposure, both experimenter handling and fox odor, thereby adding more evidence to suggest that enkephalinergic neurons are activated by stress exposure which could explain the heightened peptide levels at the beginning of microdialysis experiments.’

      • The authors should provide more details on handling stress manipulation during photometry. For photometry what was the duration of the handling bout, what was the interval between handling events, and can the authors provide a description of what handling entailed? Were mice habituated to handling days before doing photometry recording experiments?

      Thank you for your suggestion. We have addressed all of your points in the methods section. See lines 564-570. 

      ‘The handling bout which mimicked traditional scruffing lasted about 3-5 seconds. The mouse was then let go and the handling was repeated another two times in a single session with a minimum of 1-2 minutes between handling bouts. Mice were habituated to this manipulation by being attached to the fiber photometry rig, for 3-5 consecutive days prior to the experimental recording. Additionally, the same maneuver was employed when attaching/detaching the fiber photometry cord, so the mice were subjected to the same process several times.’

      • For the novel weigh boat experiments, the authors should explicitly state when these experiments were done in relation to the fox urine, was it a different session or the same session? Were they the same animals? Statements like the following (line 251) imply it was done in the same animals in the same session but it should be clarified in the methods "We also showed using fiber photometry that the novelty of the introduction of a foreign object to the cage, before adding fox odor, was sufficient to activate enkephalinergic neurons."

      As shown in supplementary figure 4, individual animal data is shown for both water and fox urine exposure (overlaid) to depict whether there were differences in their responses to each manipulation – in the same animal. And yes, you are correct, the animals were first exposed to water 3 times in the recording session and then exposed to fox urine 3 times in the same session. We have added that to the methods section describing in vivo fiber photometry. See lines 575-576.  

      • Statistical testing would be needed to affirm the conclusions the authors draw from the fox urine and novel weigh boat experiments. For example, statistics should show that the response attenuates, that it does not differ between fox urine and the novel object (the response looks stronger to fox urine in the individual animals), etc. These data look clear, but statistics are formally needed. Formal statistics are also missing in other parts of the manuscript where conclusions are drawn from the data but direct statistical comparisons are not included (e.g. Fig 2.G-I).

      The photometry data is shown as z-scores which is a formal statistical analysis. ANOVA would be inappropriate to run to compare z-scores. We understand that this is erroneously done in fiber photometry literature, however, it remains incorrect. The z-scores alone provide all the information needed about the deviation from baseline. We understand that this is not immediately clear to readers, and we thank you for allowing us to explain why this is the case. We have added test statistics to figure legends where hypothesis testing was done and p-values were reported. 

      • Did the authors try to present the animals with repeated fox urine exposure to see if this habituates like the photometry?

      No, we did not do that experiment due to the constrained timing within which we had to run our microdialysis/LC-MS timeline, but it is a great point for future exploration. 

      • It would be useful to present the time course of the odor experiment for the microdialysis experiment.

      The timeline is shown in Fig.1a and Fig.3e. To reiterate, each sample is 13 minutes long.

      • Can the authors determine if differences in behavior (e.g. excessive avoidance in animals with one type of response) or microdialysis probe location dictate whether animals fall into categories of increased release, no release, or no-detection? From the breakdown, it looks like it is almost equally split into three parts, but the authors' descriptions of this split are somewhat misleading (line 210): "The response to predator odor varies appreciably: although most animals show increased Met-Enk release after fox odor exposure, some show continued release with no elevation in Met-Enk levels, and a minority show no detectable release".

      Thank you for your constructive feedback. We do not believe the difference in behavior is correlated with probe placement. The hit map can be found in suppl. Fig 3 and shows that all mice included in the manuscript had probes in the NAcSh. We purposely did not distinguish between dorsal and ventral because our 1 mm membrane would make it hard to presume exclusive sampling from one subregion. That is a great point though, and we have thought about it extensively for future studies. We have edited the language to reflect the almost even split of responses for Met-Enk and appreciate you pointing that out. 

      • Overall, given the inconsistencies in experimental design and the associated caveats, I think the authors are unable to draw reasonable conclusions from the repeated stressor experiments. They should either refrain from drawing strong conclusions from these observations or perform additional experiments that provide the grounds to derive those conclusions.

      We have included additional language on the caveats of our study, and our use of a dual approach using fiber photometry and microdialysis was largely driven by a desire to offer additional support of our conclusions. We expected pushback about our conclusions, so we wanted to offer a secondary analysis using a different technique to test our hypothesis. To be honest, the tone and content of this comment are not particularly constructive (especially for trainees), nor does it offer a space to realistically address anything. This work took multiple years to optimize, it was led by a graduate student, and required a multidisciplinary team. As highlighted, we believe it offers an important contribution to the literature and pushes the field of peptide detection forward.  

      Reviewer #2 (Recommendations For The Authors):

      A more comprehensive and objective interpretation of results could enhance the overall quality of the paper. The manuscript contains statements like "we are the first to confirm," which can be challenging to substantiate and may not significantly enhance the paper. It's essential to ensure that novelty statements are well-founded. For example, the release of enkephalins from other brain regions after stress exposure is well-documented but not addressed in the paper. Similarly, the role of the NA shell in stress has been extensively studied but lacks coverage in this manuscript.

      We have edited the language to reflect your feedback. We have also included relevant literature expanding on the demonstrated roles of enkephalins in the literature. We would like to note that most studies have focused on chronic stress, and we were particularly interested in acute stress. See lines 129-134.

      ‘These studies have included regions such as the locus coeruleus, the ventral medulla, the basolateral nucleus of the amygdala, and the nucleus accumbens core and shell. Studies using global knockout of enkephalins have shown varying responses to chronic stress interventions where male knockout mice showed resistance to chronic mild stress in one study, while another study showed that enkephalin-knockout mice showed delayed termination of corticosteroid release. [33,34]’ 

      Finally, not a weakness but a clarification suggestion: the method description mentions the use of 1% FA in the sample reconstitution solution and LC solvents, which is an unusually high concentration of acid. If this concentration is intentional for maintaining the peptides' oxidation state, it would be beneficial to mention this in the text to assist readers who might want to replicate the method.

      This is correct and has been clarified in the methods section.

      Reviewer #3 (Recommendations For The Authors):

      -The Abstract should state the critical improvements that are made. Also, quantify the improvements in spatiotemporal resolution.

      Thank you for your comment. We have edited the abstract to reflect this. 

      - The use of "amol/sample" as concentration is less informative than SI units (e.g., pM concentration) and should be changed, especially since the volume used was the same for in vivo sampling experiments.

      Thank you for your comment. We chose to report amol/sample because we are measuring such a small concentration and wanted to account for any slight errors in volume that can make drastic differences in reported concentrations, especially since samples are dried and resuspended.  

      -Please check this sentence: "After each collection, the samples were spiked with 2 µL of 12.5 fM isotopically labeled Met-Enkephalin and Leu-Enkephalin" This dilution would yield a concentration of ~2 fM. In a 12 µL sample, that would be ~0.02 amol, well below the detection limit. (Note that fM would be a femtomolar concentration and fmol would be femtomoles added.)

      -"liquid chromatography/mass spectrometry (LC-MS) [9-12]"... Reference 9 is a RIA analysis paper, not LC-MS as stated.

      Thank you for catching these. We have corrected the unit and citation. 
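
      The reviewer's unit check on the spike can be reproduced with a short calculation. The sketch below is only illustrative: it takes the 2 µL and 12.5 fM figures from the quoted sentence and the 12 µL sample volume from the reviewer's comment, and it assumes the spike ends up in a final volume of 12 µL. The helper name `spike_amount_mol` is hypothetical.

```python
# Unit check for the isotopic spike quoted by the reviewer.
# Assumptions (not confirmed by the manuscript): final volume is 12 µL.

FEMTO = 1e-15   # femto prefix (fM, fmol)
ATTO = 1e-18    # atto prefix (amol)
MICRO = 1e-6    # micro prefix (µL)


def spike_amount_mol(conc_molar, volume_litre):
    """Moles delivered by a spike of the given concentration and volume."""
    return conc_molar * volume_litre


spike_mol = spike_amount_mol(12.5 * FEMTO, 2 * MICRO)
spike_amol = spike_mol / ATTO                   # amount added, in attomoles
diluted_fM = spike_mol / (12 * MICRO) / FEMTO   # concentration in 12 µL, in fM

print(f"{spike_amol:.3f} amol added, {diluted_fM:.2f} fM after dilution")
```

      Under these assumptions the spike delivers about 0.025 amol at roughly 2 fM, which matches the reviewer's rounded figures and explains why the unit needed correcting.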

      -Given that improvements in temporal resolution are claimed, the lack of time course data with a time axis is surprising. Rather, data for baseline and during treatment appear to be combined in different plots. Time course plots of individuals and group averages would be informative.

      Due to the expected variability between individual animal time course data, where for example, we measure detectable levels in one sample followed by no detection, it was very difficult to combine data across time. Therefore, to maximize data inclusion from all animals that showed baseline measurements and responses to individual manipulations, we opted to report snapshot data. Our improvement in temporal resolution refers to the duration of each sample rather than continuous sampling, so those two are unrelated. Thank you for your feedback and allowing us to clarify this.

      - I do not understand this claim "We use custom-made microdialysis probes, intentionally modified so they are similar in size to commonly used fiber photometry probes to avoid extensive tissue damage caused by traditional microdialysis probes (Fig. 1B)." The probes used are 320 um OD and 1 mm long. This is not an uncommon size of microdialysis probes and indeed many are smaller, so is their probe really causing less damage than traditional probes?

      Thank you for your comment. We are only trying to make the point that the tissue damage from these probes is comparable to commonly used fiber photometry probes. We only point that out because tissue damage is used as a point to dissuade the usage of microdialysis in some literature, and we just wanted to disambiguate that. We have clarified the statement you pointed out.  

      -The oxidation procedure is a good idea, as mentioned above. It would be interesting to compare met-enk with and without the oxidation procedure to see how much it affects the result (I would not say this is necessary though). It is not uncommon to add antioxidants to avoid losses like this. Also, it should be acknowledged that the treatment does prevent the detection of any in vivo oxidation, perhaps that is important in met-enk metabolism?

      The comparison between oxidized and unoxidized Met-Enk detection is in figure 1C. 

      -It would be a best practice to report the standard deviation of signal for technical replicates (say near in vivo concentrations) of standards and repeated analysis of a dialysate sample to be able to understand the variability associated with this method. Similarly, an averaged basal concentration from all rats.

      Thank you for your comment. We have included a table showing example quality control standard injections from 4 randomly selected experiments included in the manuscript that were run before and after each experiment and descriptive statistics associated with these technical replicates. We also added some detail to the methods section to describe how quality control is done. See lines 521-530. 

      ‘Each sample was run in two technical replicates and the peak area ratio was averaged before concentration calculations of the peptides were conducted. Several quality control steps were conducted prior to running the in vivo samples. 1) Two technical replicates of a known concentration were injected and analyzed – an example table from 4 random experiments included in this manuscript is shown below. 2) The buffers used on the day of the experiment (aCSF and high K+ buffer) were also tested for any contaminating Met-Enk or Leu-Enk signals by injecting two technical replicates for each buffer. Once these two criteria were met, the experiment was analyzed through the system. If either step failed, which happened a few times, the samples were frozen and the machines were cleaned and restarted until the quality control measures were met.’

      EDITORS NOTE

      Should you choose to revise your manuscript, please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.

      Thank you for your suggestion. We have included more detail about statistical analysis in the figure legends per this comment and reviewer comments.

    1. 4.6 Threading Issues In this section, we discuss some of the issues to consider in designing multithreaded programs. 4.6.1 The fork() and exec() System Calls In Chapter 3, we described how the fork() system call is used to create a separate, duplicate process. The semantics of the fork() and exec() system calls change in a multithreaded program. If one thread in a program calls fork(), does the new process duplicate all threads, or is the new process single-threaded? Some UNIX systems have chosen to have two versions of fork(), one that duplicates all threads and another that duplicates only the thread that invoked the fork() system call. The exec() system call typically works in the same way as described in Chapter 3. That is, if a thread invokes the exec() system call, the program specified in the parameter to exec() will replace the entire process—including all threads. Which of the two versions of fork() to use depends on the application. If exec() is called immediately after forking, then duplicating all threads is unnecessary, as the program specified in the parameters to exec() will replace the process. In this instance, duplicating only the calling thread is appropriate. If, however, the separate process does not call exec() after forking, the separate process should duplicate all threads. 4.6.2 Signal Handling A signal is used in UNIX systems to notify a process that a particular event has occurred. A signal may be received either synchronously or asynchronously, depending on the source of and the reason for the event being signaled. All signals, whether synchronous or asynchronous, follow the same pattern: 1. A signal is generated by the occurrence of a particular event. 2. The signal is delivered to a process. 3. Once delivered, the signal must be handled. Examples of synchronous signals include illegal memory access and division by 0. If a running program performs either of these actions, a signal is generated. 
Synchronous signals are delivered to the same process that performed the operation that caused the signal (that is the reason they are considered synchronous). When a signal is generated by an event external to a running process, that process receives the signal asynchronously. Examples of such signals include terminating a process with specific keystrokes (such as <control><C>) and having a timer expire. Typically, an asynchronous signal is sent to another process. A signal may be handled by one of two possible handlers: 1. A default signal handler 2. A user-defined signal handler Every signal has a default signal handler that the kernel runs when handling that signal. This default action can be overridden by a user-defined signal handler that is called to handle the signal. Signals are handled in different ways. Some signals may be ignored, while others (for example, an illegal memory access) are handled by terminating the program. Handling signals in single-threaded programs is straightforward: signals are always delivered to a process. However, delivering signals is more complicated in multithreaded programs, where a process may have several threads. Where, then, should a signal be delivered? In general, the following options exist: 1. Deliver the signal to the thread to which the signal applies. 2. Deliver the signal to every thread in the process. 3. Deliver the signal to certain threads in the process. 4. Assign a specific thread to receive all signals for the process. The method for delivering a signal depends on the type of signal generated. For example, synchronous signals need to be delivered to the thread causing the signal and not to other threads in the process. However, the situation with asynchronous signals is not as clear. Some asynchronous signals—such as a signal that terminates a process (<control><C>, for example)—should be sent to all threads. 
The standard UNIX function for delivering a signal is kill(pid_t pid, int signal) This function specifies the process (pid) to which a particular signal (signal) is to be delivered. Most multithreaded versions of UNIX allow a thread to specify which signals it will accept and which it will block. Therefore, in some cases, an asynchronous signal may be delivered only to those threads that are not blocking it. However, because signals need to be handled only once, a signal is typically delivered only to the first thread found that is not blocking it. POSIX Pthreads provides the following function, which allows a signal to be delivered to a specified thread (tid): pthread_kill(pthread_t tid, int signal) Although Windows does not explicitly provide support for signals, it allows us to emulate them using asynchronous procedure calls (APCs). The APC facility enables a user thread to specify a function that is to be called when the user thread receives notification of a particular event. As indicated by its name, an APC is roughly equivalent to an asynchronous signal in UNIX. However, whereas UNIX must contend with how to deal with signals in a multithreaded environment, the APC facility is more straightforward, since an APC is delivered to a particular thread rather than a process. 4.6.3 Thread Cancellation Thread cancellation involves terminating a thread before it has completed. For example, if multiple threads are concurrently searching through a database and one thread returns the result, the remaining threads might be canceled. Another situation might occur when a user presses a button on a web browser that stops a web page from loading any further. Often, a web page loads using several threads—each image is loaded in a separate thread. When a user presses the stop button on the browser, all threads loading the page are canceled. A thread that is to be canceled is often referred to as the target thread. 
Cancellation of a target thread may occur in two different scenarios: 1. Asynchronous cancellation. One thread immediately terminates the target thread. 2. Deferred cancellation. The target thread periodically checks whether it should terminate, allowing it an opportunity to terminate itself in an orderly fashion. The difficulty with cancellation occurs in situations where resources have been allocated to a canceled thread or where a thread is canceled while in the midst of updating data it is sharing with other threads. This becomes especially troublesome with asynchronous cancellation. Often, the operating system will reclaim system resources from a canceled thread but will not reclaim all resources. Therefore, canceling a thread asynchronously may not free a necessary system-wide resource. With deferred cancellation, in contrast, one thread indicates that a target thread is to be canceled, but cancellation occurs only after the target thread has checked a flag to determine whether or not it should be canceled. The thread can perform this check at a point at which it can be canceled safely. In Pthreads, thread cancellati

      Multithreaded programs face several challenges, including handling system calls like fork() and exec(). The behavior of fork() varies—some UNIX implementations duplicate all threads, while others duplicate only the calling thread. If exec() is called immediately after forking, duplicating all threads is unnecessary. Signal handling is another challenge, as signals can be synchronous (e.g., division by zero) or asynchronous (e.g., termination signals). Signals in multithreaded programs can be delivered to a specific thread, all threads, or certain threads based on signal type. POSIX Pthreads provide pthread_kill() to direct signals to a specific thread. Windows uses Asynchronous Procedure Calls (APCs) to handle event-driven notifications, similar to UNIX signals. Developers must ensure proper signal handling to avoid unintended behavior. Thread cancellation is another concern, requiring careful implementation to prevent resource leaks and ensure that terminated threads do not leave operations incomplete.
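
      Deferred cancellation, as described in the quoted passage, can be sketched with a flag that the target thread checks at points where stopping is safe. This is a minimal Python illustration, not the Pthreads API; the names `search_worker`, `cancel_requested`, and `progress` are hypothetical.

```python
import threading
import time


def search_worker(cancel_requested, progress):
    """Hypothetical target thread that polls a cancellation flag."""
    while not cancel_requested.is_set():   # cancellation point: safe to stop here
        progress.append(len(progress))     # stand-in for one unit of real work
        time.sleep(0.001)
    progress.append("cleaned up")          # orderly shutdown before exiting


cancel_requested = threading.Event()
progress = []
target = threading.Thread(target=search_worker, args=(cancel_requested, progress))
target.start()

time.sleep(0.05)          # let the target thread do some work
cancel_requested.set()    # another thread requests cancellation
target.join(timeout=5)    # the target stops itself at its next check

print(target.is_alive(), progress[-1])
```

      Because the target thread decides when to stop, it can release resources and leave shared data in a consistent state, which is the advantage the passage attributes to deferred over asynchronous cancellation.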

    2. 4.2.1 Programming Challenges The trend toward multicore systems continues to place pressure on system designers and application programmers to make better use of the multiple computing cores. Designers of operating systems must write scheduling algorithms that use multiple processing cores to allow the parallel execution shown in Figure 4.4. For application programmers, the challenge is to modify existing programs as well as design new programs that are multithreaded. In general, five areas present challenges in programming for multicore systems: 1. Identifying tasks. This involves examining applications to find areas that can be divided into separate, concurrent tasks. Ideally, tasks are independent of one another and thus can run in parallel on individual cores. 2. Balance. While identifying tasks that can run in parallel, programmers must also ensure that the tasks perform equal work of equal value. In some instances, a certain task may not contribute as much value to the overall process as other tasks. Using a separate execution core to run that task may not be worth the cost. 3. Data splitting. Just as applications are divided into separate tasks, the data accessed and manipulated by the tasks must be divided to run on separate cores. 4. Data dependency. The data accessed by the tasks must be examined for dependencies between two or more tasks. When one task depends on data from another, programmers must ensure that the execution of the tasks is synchronized to accommodate the data dependency. We examine such strategies in Chapter 6. 5. Testing and debugging. When a program is running in parallel on multiple cores, many different execution paths are possible. Testing and debugging such concurrent programs is inherently more difficult than testing and debugging single-threaded applications. 
Because of these challenges, many software developers argue that the advent of multicore systems will require an entirely new approach to designing software systems in the future. (Similarly, many computer science educators believe that software development must be taught with increased emphasis on parallel programming.)

      Programming for multicore systems introduces significant challenges. Identifying tasks involves finding independent operations that can be executed concurrently. Balancing workload is critical to ensure efficient use of all cores, preventing bottlenecks from uneven task distribution. Data splitting requires dividing data efficiently so that multiple threads can process different sections in parallel. Data dependency complicates execution, as synchronization mechanisms must prevent race conditions and inconsistencies. Testing and debugging are inherently more difficult in parallel programs due to non-deterministic execution paths. These challenges necessitate new programming paradigms and tools to develop efficient, scalable parallel applications that fully leverage multicore architectures.
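
      The data-splitting and balance challenges can be illustrated with a small sketch that divides a list into near-equal chunks and processes each chunk in its own task. The function names `split_evenly` and `parallel_sum` are hypothetical. In CPython the global interpreter lock means threads show the structure here rather than a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(data, parts):
    """Split data into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)   # spread the remainder for balance
        chunks.append(data[start:start + size])
        start += size
    return chunks


def parallel_sum(data, workers=4):
    """Sum each chunk in a separate task, then combine the partial results."""
    chunks = split_evenly(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))  # tasks are independent of each other
    return sum(partials)


total = parallel_sum(list(range(1, 101)))
print(total)
```

      The chunks share no data, so no synchronization is needed until the final combination step; a data dependency between tasks would require the synchronization the passage mentions.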

    3. The benefits of multithreaded programming can be broken down into four major categories: 1. Responsiveness. Multithreading an interactive application may allow a program to continue running even if part of it is blocked or is performing a lengthy operation, thereby increasing responsiveness to the user. This quality is especially useful in designing user interfaces. For instance, consider what happens when a user clicks a button that results in the performance of a time-consuming operation. A single-threaded application would be unresponsive to the user until the operation had been completed. In contrast, if the time-consuming operation is performed in a separate, asynchronous thread, the application remains responsive to the user. 2. Resource sharing. Processes can share resources only through techniques such as shared memory and message passing. Such techniques must be explicitly arranged by the programmer. However, threads share the memory and the resources of the process to which they belong by default. The benefit of sharing code and data is that it allows an application to have several different threads of activity within the same address space. 3. Economy. Allocating memory and resources for process creation is costly. Because threads share the resources of the process to which they belong, it is more economical to create and context-switch threads. Empirically gauging the difference in overhead can be difficult, but in general thread creation consumes less time and memory than process creation. Additionally, context switching is typically faster between threads than between processes. 4. Scalability. The benefits of multithreading can be even greater in a multiprocessor architecture, where threads may be running in parallel on different processing cores. A single-threaded process can run on only one processor, regardless how many are available. We explore this issue further in the following section.

      Multithreaded programming provides key advantages, making applications more efficient and responsive. Responsiveness is crucial for user-friendly interfaces, ensuring a program remains functional even when a thread is blocked. Resource sharing allows threads to utilize the same memory and resources, avoiding complex communication mechanisms like message passing. Economy is another major advantage, as creating threads is significantly less resource-intensive than creating entire processes, reducing memory and CPU overhead. Lastly, scalability enables multithreaded applications to take full advantage of multiprocessor systems, distributing workload across multiple cores for enhanced performance. These benefits collectively make multithreading a fundamental concept in modern computing, particularly in systems requiring parallel execution, real-time processing, and interactive performance enhancements.
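
      The resource-sharing benefit can be shown with a short sketch in which several threads update one dictionary directly, with no shared-memory or message-passing setup. The names `shared_counts` and `record` are hypothetical, and the lock is an added safeguard for the concurrent updates.

```python
import threading

shared_counts = {}         # lives in the process's single address space
lock = threading.Lock()    # guards concurrent updates to the shared dict


def record(name, times):
    """Increment this thread's entry in the shared dictionary."""
    for _ in range(times):
        with lock:
            shared_counts[name] = shared_counts.get(name, 0) + 1


threads = [threading.Thread(target=record, args=(f"t{i}", 1000)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_counts)
```

      Separate processes would need an explicitly arranged mechanism to achieve the same effect, which is the contrast the passage draws.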

    1. Reviewer #1 (Public review):

      The results of these experiments support a modest but important conclusion: If sub-optimal methods are used to collect retrospective reports, such as simple yes/no questions, inattentional blindness (IB) rates may be overestimated by up to ~8%.

      (1) In experiment 1, data from 374 subjects were included in the analysis. As shown in figure 2b, 267 subjects reported noticing the critical stimulus and 107 subjects reported not noticing it. This translates to a 29% IB rate if we were to only consider the "did you notice anything unusual Y/N" question. As reported in the results text (and figure 2c), when asked to report the location of the critical stimulus (left/right), 63.6% of the "non-noticer" group answered correctly. In other words, 68 subjects were correct about the location while 39 subjects were incorrect. Importantly, because the location judgment was a 2-alternative-forced-choice, the assumption was that if 50% (or at least not statistically different than 50%) of the subjects answered the location question correctly, everyone was purely guessing. Therefore, we can estimate that ~39 of the subjects who answered correctly were simply guessing (because 39 guessed incorrectly), leaving 29 subjects from the non-noticer group who were correct on the 2AFC above and beyond the pure guess rate. If these 29 subjects are moved from the non-noticer to the noticer group, the corrected rate of IB for Experiment 1 is 20.86% instead of the original 28.61% rate that would have been obtained if only the Y/N question was used. In other words, relying only on the "Y/N did you notice anything" question led to an overestimate of IB rates by 7.75% in Experiment 1.

      In the revised version of their manuscript, the authors provided the data that was missing from the original submission, which allows this same exercise to be carried out on the other 4 experiments. Using the same logic as above, i.e., calculating the pure-guess rate on the 2AFC, moving the number of subjects above this pure-guess rate from the non-noticer group to the noticer group, and then re-calculating a "corrected IB rate", the other experiments demonstrate the following:

      Experiment 2: IB rates were overestimated by 4.74% (original IB rate based only on Y/N question = 27.73%; corrected IB rate that includes the 2AFC = 22.99%)

      Experiment 3: IB rates were overestimated by 3.58% (original IB rate = 30.85%; corrected IB rate = 27.27%)

      Experiment 4: IB rates were overestimated by ~8.19% (original IB rate = 57.32%; corrected IB rate for color* = 39.71%, corrected IB rate for shape = 52.61%, corrected IB rate for location = 55.07%)

      Experiment 5: IB rates were overestimated by ~1.44% (original IB rate = 28.99%; corrected IB rate for color = 27.56%, corrected IB rate for shape = 26.43%, corrected IB rate for location = 28.65%)

      *note: the highest overestimate of IB rates was from Experiment 4, color condition, but the authors admitted that there was a problem with 2AFC color guessing bias in this version of the experiment. This was a main motivation for running Experiment 5, which corrected for this bias.

      Taken as a whole, this data clearly demonstrates that even with a conservative approach to analyzing the combination of Y/N and 2AFC data, inattentional blindness was evident in a sizeable portion of the subject populations. An important (albeit modest) overestimate of IB rates was demonstrated by incorporating these improved methods.
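
      The correction applied in point (1) can be written out as a short calculation. The sketch below uses only the Experiment 1 counts given above (374 subjects, 107 non-noticers, 68 correct on the 2AFC); the function name `corrected_ib_rate` is hypothetical, and it assumes, as the reviewer does, that each incorrect 2AFC response is matched by one correct response that was a pure guess.

```python
def corrected_ib_rate(n_total, n_non_noticers, n_correct_2afc):
    """Return (original, corrected) IB rates as fractions of n_total."""
    n_incorrect = n_non_noticers - n_correct_2afc
    # Correct responses beyond the pure-guess count are treated as noticers.
    above_chance = max(n_correct_2afc - n_incorrect, 0)
    original = n_non_noticers / n_total
    corrected = (n_non_noticers - above_chance) / n_total
    return original, corrected


original, corrected = corrected_ib_rate(374, 107, 68)
print(f"{original:.2%} {corrected:.2%} {original - corrected:.2%}")
```

      For Experiment 1 this reproduces the 28.61% original rate, the 20.86% corrected rate, and the 7.75% overestimate reported above.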

      (2) One of the strongest pieces of evidence presented in this paper was the single data point in Figure 3e showing that in Experiment 3, even the super subject group that rated their non-noticing as "highly confident" had a d' score significantly above zero. Asking for confidence ratings is certainly an improvement over simple Y/N questions about noticing, and if this result were to hold, it could provide a key challenge to IB. However, this result can most likely be explained by measurement error.

      In their revised paper, the authors reported data that was missing from their original submission: the confidence ratings on the 2AFC judgments that followed the initial Y/N question. The most striking indication that this data is likely due to measurement error comes from the number of subjects who indicated that they were highly confident that they didn't notice anything on the critical trial, but then when asked to guess the location of the stimulus, indicated that they were highly confident that the stimulus was on the left (or right). There were 18 subjects (8.82% of the high-confidence non-noticer group) who responded this way. To most readers, this combination of responses (high confidence in correctly judging a stimulus feature that one is highly confident in having not seen at all) indicates that a portion of subjects misunderstood the confidence scales (or just didn't read the questions carefully or made mistakes in their responses, which is common for experiments conducted online).

      In the authors' rebuttal to the first round of peer review, they wrote, "it is perfectly rationally coherent to be very confident that one didn't see anything but also very confident that if there was anything to be seen, it was on the left." I respectfully disagree that such a combination of responses is rationally coherent. The more parsimonious interpretation is that a measurement error occurred, and it's questionable whether we should trust any responses from these 18 subjects.

      In their rebuttal, the authors go on to note that 14 of the 18 subjects who rated their 2AFC with high confidence were correct in their location judgment. If these 14 subjects were removed from analysis (which seems like a reasonable analysis choice, given their contradictory responses), d' for the high-confidence non-noticer group would most likely fall to chance levels. In other words, we would see a data pattern similar to that plotted in Figure 3e, but with the first data point on the left moving down to zero d'. This corrected Figure 3e would then provide a very nice evidence-based justification for including confidence ratings along with Y/N questions in future inattentional blindness studies.

      (3) In most (if not all) IB experiments in the literature, a partial attention and/or full attention trial is administered after the critical trial. These control trials are very important for validating IB on the critical trial, as they must show that, when attended, the critical stimuli are very easy to see. If a subject cannot detect the critical stimulus on the control trial, one cannot conclude that they were inattentionally blind on the critical trial, e.g., perhaps the stimulus was just too difficult to see (e.g., too weak, too brief, too far in the periphery, too crowded by distractor stimuli, etc.), or perhaps they weren't paying enough attention overall or failed to follow instructions. In the aggregate data, rates of noticing the stimuli should increase substantially from the critical trial to the control trials. If noticing rates are equivalent on the critical and control trials, one cannot conclude that attention was manipulated in the first place.

      In their rebuttal to the first round of peer review, the authors provided weak justification for not including such a control condition. They cite one paper that argues such control conditions are often used to exclude subjects from analysis (those who fail to notice the stimulus on the control trial are either removed from analysis or replaced with new subjects) and such exclusions/replacements can lead to underestimations of inattentional blindness rates. However, the inclusion of a partial or full attention condition as a control does not necessitate the extra step of excluding or replacing subjects. In the broadest sense, such a control condition simply validates the attention manipulation, i.e., one can easily compare the percent of subjects who answered "yes" or who got the 2AFC judgment correct during the critical trial versus the control trial. The subsequent choice about exclusion/replacement is separate, and researchers can always report the data with and without such exclusions/replacements to remain more neutral on this practice.

      If anyone were to follow up on this study, I would highly recommend including a partial or full attention control condition, especially given the online nature of data collection. It's important to know the percentage of online subjects who answer yes and who get the 2AFC question correct when the critical stimulus is attended, because that is the baseline (in this case, the "ceiling level" of performance) to which the IB rates on the critical trial can be compared.

    2. Reviewer #2 (Public review):

      In this study, Nartker et al. use variations of classic inattentional blindness studies to examine how much observers are conscious of. The key idea is that rather than simply asking observers whether they noticed a critical object with one yes/no question, the authors also ask follow-up questions to determine whether observers are aware of more than the yes/no question suggests. Specifically, by having observers make forced-choice guesses about the critical object, the authors find that many observers who initially said "no", they did not see the object, can still "guess" above chance about the critical object's location, color, etc. Thus, the authors argue that prior claims of inattentional blindness are mistaken and that using such simple methods has led numerous researchers to overestimate how little observers see in the world. To quote the authors themselves, these results imply that "inattentionally blind subjects consciously perceive these stimuli after all... they show sensitivity to IB stimuli because they can see them."

      Before getting to a few issues I have with the paper, I do want to make sure to explicitly compliment the researchers for many aspects of their work. Getting massive amounts of data, using signal detection measures, and the novel use of a "super subject" are all important contributions to the literature that I hope are employed more in the future.

      Main point 1: My primary issue with this work is that I believe the authors are misrepresenting the way people often perform inattentional blindness studies. In effect, the authors are saying, "People do the studies 'incorrectly' and report that people see very little. We perform the studies 'correctly' and report that people see much more than previously thought." But the way previous studies are conducted is not accurately described in this paper. The authors describe previous studies as follows on page 3:

      "Crucially, however, this interpretation of IB and the many implications that follow from it rest on a measure that psychophysics has long recognized to be problematic: simply asking participants whether they noticed anything unusual. In IB studies, awareness of the unexpected stimulus (the novel shape, the parading gorilla, etc.) is retroactively probed with a yes/no question, standardly, "Did you notice anything unusual on the last trial which wasn't there on previous trials?". Any subject who answers "no" is assumed not to have any awareness of the unexpected stimulus."

      If this quote were true, the authors would have a point. Unfortunately, I do not believe it is true. This is simply not how many inattentional blindness studies are run. Some of the most famous studies in the inattentional blindness literature do not simply ask observers a yes/no question (e.g., the invisible gorilla (Simons et al. 1999), the classic door study where the person changes (Simons and Levin, 1998), the study where observers do not notice a fight happening a few feet from them (Chabris et al., 2011)). Instead, these papers consistently ask a series of follow-up questions and even tell the observers what just occurred to confirm that observers did not notice that critical event (e.g., "If I were to tell you we just did XYZ, did you notice that?"). In fact, after a brief search on Google Scholar, I was able to relatively quickly find over a dozen papers that do not just use a yes/no procedure, and instead ask a series of multiple questions to determine if someone is inattentionally blind. In no particular order, some papers:

      (1) Most et al. (2005) Psych Review
      (2) Drew et al. (2013) Psych Science
      (3) Drew et al. (2016) Journal of Vision
      (4) Simons et al. (1999) Perception
      (5) Simons and Levin (1998) Perception
      (6) Chabris et al. (2011) iPerception
      (7) Ward & Scholl (2015) Psych Bulletin and Review
      (8) Most et al. (2001) Psych Science
      (9) Todd & Marois (2005) Psych Science
      (10) Fougnie & Marois (2007) Psych Bulletin and Review
      (11) New and German (2015) Evolution and Human Behaviour
      (12) Jackson-Nielsen (2017) Consciousness and cognition
      (13) Mack et al. (2016) Consciousness and cognition
      (14) Devue et al. (2009) Perception
      (15) Memmert (2014) Cognitive Development
      (16) Moore & Egeth (1997) JEP:HPP
      (17) Cohen et al. (2020) Proc Natl Acad Sci
      (18) Cohen et al. (2011) Psych Science

      This is a critical point. The authors' key idea is that when you ask more than just a simple yes/no question, you find that other studies have overestimated the effects of inattentional blindness. But none of the studies listed above only asked simple yes/no questions. Thus, I believe the authors are misrepresenting the field. Moreover, many of the studies that do much more than ask a simple yes/no question are cited by the authors themselves! Furthermore, as far as I can tell, the authors believe that if researchers do these extra steps and ask more follow-ups, then the results are valid. But since so many of these prior studies do those extra steps, I am not exactly sure what is being criticized.

      To make sure this point is clear, I'd like to use a paper of mine as an example. In this study (Cohen et al., 2020, Proc Natl Acad Sci USA) we used gaze-contingent virtual reality to examine how much color people see in the world. On the critical trial, the part of the scene they fixated on was in color, but the periphery was entirely in black and white. As soon as the trial ended, we asked participants a series of questions to determine what they noticed. The list of questions included:

      (1) "Did you notice anything strange or different about that last trial?"
      (2) "If I were to tell you that we did something odd on the last trial, would you have a guess as to what we did?"
      (3) "If I were to tell you we did something different in the second half of the last trial, would you have a guess as to what we did?"
      (4) "Did you notice anything different about the colors in the last scene?"
      (5) We then showed observers the previous trial again and drew their attention to the effect and confirmed that they did not notice that previously.

      In a situation like this, when the observers are asked so many questions, do the authors believe that "the inattentionally blind can see after all?" I believe they would not say that and the reason they would not say that is because of the follow-up questions after the initial yes/no question. But since so many previous studies use similar follow-up questions, I do not think you can state that the field is broadly overestimating inattentional blindness. This is why it seems to me to be a bit of a straw-man: most people do not just use the yes/no method.

      Main point 2: Let's imagine for a second that every study did just ask a yes/no question and then stop. In that case, the criticism the authors are bringing up would be valid (even though I believe it is not). Even so, I am not entirely sure that above-chance performance on a forced-choice task proves that the inattentionally blind can see after all. Could it just be a form of subliminal priming? Could there be a significant number of participants who basically would say something like, "No I did not see anything, and I feel like I am just guessing, but if you want me to say whether the thing was to the left or right, I will just 100% guess"? I know the literature on priming from things like change and inattentional blindness is a bit unclear, but this seems like maybe what is going on. In fact, maybe the authors are getting some of the best priming from inattentional blindness because of their large sample size, which previous studies do not use.

      I'm curious how the authors would relate their studies to masked priming. In masked priming studies, observers say they did not see the target (like in this study) but still are above chance when forced to guess (like in this study). Do the researchers here think that this is evidence that "masked stimuli are truly seen" even if a participant openly says they are guessing?

      Main point 3: My last question is about how the authors interpret a variety of inattentional blindness findings. Previous work has found that observers fail to notice a gorilla in a CT scan (Drew et al., 2013), a fight occurring right in front of them (Chabris et al., 2011), a plane on a runway that pilots crash into (Haines, 1991), and so forth. In a situation like this, do the authors believe that many participants are truly aware of these items but simply failed to answer a yes/no question correctly? For example, imagine the researchers made participants choose if the gorilla was in the left or right lung and some participants who initially said they did not notice the gorilla were still able to correctly say if it was in the left or right lung. Would the authors claim "that participant actually did see the gorilla in the lung"? I ask because it is difficult to understand what it means to be aware of something as salient as a gorilla in a CT scan, but say "no" you didn't notice it when asked a yes/no question. What does it mean to be aware of such important, ecologically relevant stimuli, but not act in response to them and openly say "no" you did not notice them?

      Overall: I believe there are many aspects of this set of studies that are innovative and I hope the methods will be used more broadly in the literature. However, I believe the authors misrepresent the field and overstate what can be interpreted from their results. While I am sure there are cases where more nuanced questions might reveal inattentional blindness is somewhat overestimated, claims like "the inattentionally blind can see after all" or "Inattentionally blind subjects consciously perceive these stimuli after all" seem to be incorrect (or at least not at all proven by this data).

    3. Author response:

      The following is the authors’ response to the current reviews.

      Responses to Reviewer #1:

      We thank the reviewer for these additional comments, and more generally for their extensive engagement with our work, which is greatly appreciated. Here, we respond to the three points in their latest review in turn.

      The results of these experiments support a modest but important conclusion: If sub-optimal methods are used to collect retrospective reports, such as simple yes/no questions, inattentional blindness (IB) rates may be overestimated by up to ~8%.

      It is true, of course, that we think the field has overstated the extent of IB, and we appreciate the reviewer characterizing our results as important along these lines. Nevertheless, we respectfully disagree with the framing and interpretation the reviewer attaches to them. As explained in our previous response, we think this interpretation — and the associated calculations of IB overestimation ‘rates’ — perpetuates a binary approach to perception and awareness which we regard as mistaken.

      A graded approach to IB and visual awareness 

      Our sense is that many theorists interested in IB have conceived of perception and awareness as ‘all or nothing’: You either see a perfectly clear gorilla right in front of you, or you see nothing at all. This is implicit in the reviewer’s characterization of our results as simply indicating that fewer subjects fail to see the critical stimulus than previously assumed. To think that way is precisely to assume the orthodox binary position about perception, i.e., that any given subject can neatly be categorized into one of two boxes, saw or didn’t see.

      Our perspective is different. We think there can be degraded forms of perception and awareness that fall neatly into neither of the categories “saw the stimulus perfectly clearly” or “saw nothing at all”. On this graded conception, the question is not: “What proportion of subjects saw the stimulus?” but: “What is the sensitivity of subjects to the stimulus?” This is why we prefer signal detection measures like d′ over % noticing and % correct. This powerful framework has been successful in essentially every domain to which it has been applied, and we think perception and visual awareness are no exception. We understand that the reviewer may not think the same way about this foundational issue, but since part of our goal is to promote a graded approach to perception, we are keen to highlight our disagreement here and so resist the reviewer’s interpretation of our results (even to the extent that it is a positive one!).

      Finally, we note that given this perspective, we are correspondingly inclined to reject many of the summary figures following below in Point (1) by the reviewer. These calculations (given in terms of % noticing and not noticing) make sense on the binary conception of awareness, but not on the SDT-based approach we favor. We say more about this below. 

      (1) In experiment 1, data from 374 subjects were included in the analysis. As shown in figure 2b, 267 subjects reported noticing the critical stimulus and 107 subjects reported not noticing it. This translates to a 29% IB rate if we were to only consider the "did you notice anything unusual Y/N" question. As reported in the results text (and figure 2c), when asked to report the location of the critical stimulus (left/right), 63.6% of the "non-noticer" group answered correctly. In other words, 68 subjects were correct about the location while 39 subjects were incorrect. Importantly, because the location judgment was a 2-alternative-forced-choice, the assumption was that if 50% (or at least not statistically different than 50%) of the subjects answered the location question correctly, everyone was purely guessing. Therefore, we can estimate that ~39 of the subjects who answered correctly were simply guessing (because 39 guessed incorrectly), leaving 29 subjects from the non-noticer group who were correct on the 2AFC above and beyond the pure guess rate. If these 29 subjects are moved from the non-noticer to the noticer group, the corrected rate of IB for Experiment 1 is 20.86% instead of the original 28.61% rate that would have been obtained if only the Y/N question was used. In other words, relying only on the "Y/N did you notice anything" question led to an overestimate of IB rates by 7.75% in Experiment 1.
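
      The guess-correction arithmetic in the paragraph above can be reproduced with a minimal sketch. The function name `corrected_ib_rate` and its arguments are hypothetical labels introduced here for illustration; the counts are the Experiment 1 figures quoted above, and the logic assumes (as the reviewer does) that each incorrect 2AFC answer is matched by one correct answer that was a pure guess.

```python
def corrected_ib_rate(n_total, n_correct, n_incorrect):
    """Guess-corrected IB rate under the reviewer's all-or-nothing assumption.

    n_total:     all subjects included in the analysis
    n_correct:   non-noticers who answered the 2AFC question correctly
    n_incorrect: non-noticers who answered the 2AFC question incorrectly
    """
    n_non_noticers = n_correct + n_incorrect
    # Assumption: every incorrect answer is mirrored by one lucky correct guess,
    # so only the surplus of correct answers counts as "above the guess rate".
    n_above_guess = max(n_correct - n_incorrect, 0)
    original = n_non_noticers / n_total
    corrected = (n_non_noticers - n_above_guess) / n_total
    return original, corrected, original - corrected


# Experiment 1 counts as quoted in the review: 374 subjects, 68 correct, 39 incorrect.
original, corrected, overestimate = corrected_ib_rate(374, 68, 39)
print(f"original IB rate:  {original:.2%}")      # 28.61%
print(f"corrected IB rate: {corrected:.2%}")     # 20.86%
print(f"overestimate:      {overestimate:.2%}")  # 7.75%
```

      Note that this sketch encodes the binary "saw / did not see" framing that the calculation relies on; it is not a signal detection analysis.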

      In the revised version of their manuscript, the authors provided the data that was missing from the original submission, which allows this same exercise to be carried out on the other 4 experiments.  

      (To briefly interject: All of these data were provided in our public archive since our original submission and remain available at https://osf.io/fcrhu. The difference now is only that they are included in the manuscript itself.)

      Using the same logic as above, i.e., calculating the pure-guess rate on the 2AFC, moving the number of subjects above this pure-guess rate to the noticer group, and then re-calculating a "corrected IB rate", the other experiments demonstrate the following:

      Experiment 2: IB rates were overestimated by 4.74% (original IB rate based only on Y/N question = 27.73%; corrected IB rate that includes the 2AFC = 22.99%)

      Experiment 3: IB rates were overestimated by 3.58% (original IB rate = 30.85%; corrected IB rate = 27.27%)

      Experiment 4: IB rates were overestimated by ~8.19% (original IB rate = 57.32%; corrected IB rate for color* = 39.71%, corrected IB rate for shape = 52.61%, corrected IB rate for location = 55.07%)

      Experiment 5: IB rates were overestimated by ~1.44% (original IB rate = 28.99%; corrected IB rate for color = 27.56%, corrected IB rate for shape = 26.43%, corrected IB rate for location = 28.65%)

      *Note: the highest overestimate of IB rates came from the color condition of Experiment 4, but the authors acknowledged a 2AFC color-guessing bias in that version of the experiment, which was a main motivation for running Experiment 5, where this bias was corrected.

      Taken as a whole, this data clearly demonstrates that even with a conservative approach to analyzing the combination of Y/N and 2AFC data, inattentional blindness was evident in a sizeable portion of the subject populations. An important (albeit modest) overestimate of IB rates was demonstrated by incorporating these improved methods.

      We appreciate the work the reviewer has put into making these calculations. However, as noted above, such calculations implicitly reflect the binary approach to perception and awareness that we reject. 

      Consider how we’d think about the single subject case where the task is 2afc detection of a low contrast stimulus in noise. Suppose that this subject achieves 70% correct. One way of thinking about this is that the subject fully and clearly sees the stimulus on 40% of trials (achieving 100% correct on those) and guesses completely blindly on the other 60% (achieving 50% correct on those) for a total of 40% + 30% = 70% overall. However, this is essentially a ‘high threshold’ approach to the problem, in contrast to an SDT approach. On an SDT approach — an approach with tremendous evidential support — on every trial the subject receives samples from probabilistic distributions corresponding to each interval (one noise and one signal + noise) and determines which is higher according to the 2afc decision rule. Thus, across trials, they have access to differentially graded information about the stimulus. Moreover, on some trials they may have significant information from the stimulus (perhaps, well above their single interval detection criterion) but still decide incorrectly because of high noise from the other spatial interval. From this perspective, there is no nonarbitrary way of saying whether the subject saw/did not see on a given trial. Instead, we must characterize the subject’s overall sensitivity to the stimulus/its visibility to them in terms of a parameter such as d′ (here, ~ 0.7).
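
      The two readings of the 70%-correct example can be put side by side in a short sketch. It assumes the standard equal-variance Gaussian model of 2AFC with an unbiased observer, under which d′ = √2 · z(proportion correct); the function names are hypothetical and introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def high_threshold_pc(p_see):
    """Proportion correct if the subject sees clearly on a fraction p_see of
    trials (always correct) and guesses blindly on the rest (50% correct)."""
    return p_see + (1.0 - p_see) * 0.5


def dprime_2afc(pc):
    """Sensitivity implied by 2AFC proportion correct, assuming equal-variance
    Gaussian SDT and no interval bias: d' = sqrt(2) * z(pc)."""
    return sqrt(2.0) * STANDARD_NORMAL.inv_cdf(pc)


def pc_from_dprime_2afc(dprime):
    """Inverse mapping: 2AFC proportion correct predicted by a given d'."""
    return STANDARD_NORMAL.cdf(dprime / sqrt(2.0))


print(f"high-threshold reading: {high_threshold_pc(0.4):.2f} correct")  # 0.70
print(f"SDT reading: d' = {dprime_2afc(0.7):.2f}")                      # 0.74
```

      Both descriptions reproduce the same 70% correct; they differ in whether that figure is read as a mixture of "seen" and "guessed" trials or as a single graded sensitivity parameter.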

      We take the same attitude to the subjects in our experiments (and specifically to our ‘super subject’). Instead of calculating the proportion of subjects who saw or failed to see the stimulus (with some characterized as aware and some as unaware), we think the best way to characterize our results is that, across subjects (and so trials also), there was differential graded access to information from the stimulus, and this is best represented in terms of the group-level sensitivity parameter d′. This is why we frame our results as demonstrating that subjects traditionally considered inattentionally blind exhibit significant residual visual sensitivity to the critical stimulus.

      (2) One of the strongest pieces of evidence presented in this paper was the single data point in Figure 3e showing that in Experiment 3, even the super subject group that rated their non-noticing as "highly confident" had a d' score significantly above zero. Asking for confidence ratings is certainly an improvement over simple Y/N questions about noticing, and if this result were to hold, it could provide a key challenge to IB. However, this result can most likely be explained by measurement error.

      In their revised paper, the authors reported data that was missing from their original submission: the confidence ratings on the 2AFC judgments that followed the initial Y/N question. The most striking indication that this data is likely due to measurement error comes from the number of subjects who indicated that they were highly confident that they didn't notice anything on the critical trial, but then when asked to guess the location of the stimulus, indicated that they were highly confident that the stimulus was on the left (or right). There were 18 subjects (8.82% of the high-confidence non-noticer group) who responded this way. To most readers, this combination of responses (high confidence in correctly judging a stimulus feature that one is highly confident in having not seen at all) indicates that a portion of subjects misunderstood the confidence scales (or just didn't read the questions carefully or made mistakes in their responses, which is common for experiments conducted online).

      In the authors' rebuttal to the first round of peer review, they wrote, "it is perfectly rationally coherent to be very confident that one didn't see anything but also very confident that if there was anything to be seen, it was on the left." I respectfully disagree that such a combination of responses is rationally coherent. The more parsimonious interpretation is that a measurement error occurred, and it's questionable whether we should trust any responses from these 18 subjects.

      In their rebuttal, the authors go on to note that 14 of the 18 subjects who rated their 2AFC with high confidence were correct in their location judgment. If these 14 subjects were removed from analysis (which seems like a reasonable analysis choice, given their contradictory responses), d' for the high-confidence non-noticer group would most likely fall to chance levels. In other words, we would see a data pattern similar to that plotted in Figure 3e, but with the first data point on the left moving down to zero d'. This corrected Figure 3e would then provide a very nice evidence-based justification for including confidence ratings along with Y/N questions in future inattentional blindness studies.

      We appreciate the reviewer’s highlighting of this particular piece of evidence as amongst our strongest. (At the same time, we must resist its characterization as a “single data point”: it derives from a large pre-registered experiment involving some 7,000 subjects total, with over 200 subjects in the relevant bin — both figures being far larger than a typical IB experiment.) We also appreciate their raising the issue of measurement error.

      Specifically, the reviewer contends that our finding that even highly confident non-noticers exhibit significant sensitivity is “most likely … explained by measurement error” due to subjects mistakenly inverting our confidence scale in giving their response. In our original reply, we gave two reasons for thinking this quite unlikely; the reviewer has not addressed these in this revised review. First, we explicitly labeled our confidence scale (with 0 labeled as ‘Not at all confident’ and 3 as ‘Highly confident’) so that subjects would be very unlikely simply to invert the scale. This is especially so as it is very counterintuitive to treat “0” as reflecting high confidence. More importantly, however, we reasoned that any measurement error due to inverting or misconstruing the confidence scale should be symmetric. That is: If subjects are liable to invert the confidence scale, they should do so just as often when they answer “yes” as when they answer “no” – after all the very same scale is being used in both cases. This allows us to explore evidence of measurement error in relation to the large number of high-confidence “yes” subjects (N = 2677), thus providing a robust indicator as to whether subjects are generally liable to misconstrue the confidence scale. Looking at the number of such high confidence noticers who subsequently respond to the 2afc question with low confidence (a pattern which might, though need not, suggest measurement error), we found that the number was tiny. Only 28/2677 (1.05%) of high-confidence noticers subsequently gave the lowest level of confidence on the 2afc question, and only 63/2677 (2.35%) subjects gave either of the two lower levels of confidence. For these reasons, we consider any measurement error due to misunderstanding the confidence scale to be extremely minimal.

      The reviewer is correct to note that 18/204 (9%) subjects reported both being highly confident that they didn't notice anything and highly confident in their 2afc judgment, although only 14/18 were correct in this judgment. Should we exclude these 14? Perhaps if we agree with the reviewer that such a pattern of responses is not “rationally coherent” and so must reflect a misconstrual of the scale. But such a pattern is in fact perfectly and straightforwardly intelligible. Specifically, in a 2afc task, two stimuli can individually fall well below a subject’s single interval detection criterion, leading to a high confidence judgment that nothing was presented in either interval. Quite consistent with this, the left-hand stimulus may produce a signal that is much higher than the right-hand stimulus, leading to a high confidence forced-choice judgment that, if something was presented, it was on the left. (By analogy, consider how a radiologist could look at a scan and say the following: “We’re 95% confident there’s no tumor. But even on the 5% chance that there is, our tests completely rule out that it’s a malignant one, so don’t worry.”)
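
      The coherence argument above can be illustrated with a small Monte Carlo sketch under equal-variance Gaussian SDT. Everything numerical here is a hypothetical choice made for illustration and not a value from the experiments: the sensitivity `d_prime`, the conservative detection `criterion`, and the `margin` used as a crude stand-in for a "clear" difference between the two intervals. A "no" report is modeled, as a simplifying assumption, as both internal responses falling below the criterion.

```python
import random


def simulate(d_prime=1.0, criterion=2.0, margin=1.0, n_trials=100_000, seed=0):
    """Simulate 2AFC trials and summarize those on which the subject says 'no'.

    Returns (share of 'no' trials,
             2AFC accuracy on 'no' trials,
             share of 'no' trials with an interval difference above margin).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_no = n_no_correct = n_no_clear = 0
    for _ in range(n_trials):
        signal = rng.gauss(d_prime, 1.0)  # interval containing the stimulus
        noise = rng.gauss(0.0, 1.0)       # empty interval
        # Modeled 'no' report: neither interval exceeds the detection criterion.
        if signal < criterion and noise < criterion:
            n_no += 1
            n_no_correct += signal > noise              # forced choice picks the larger
            n_no_clear += abs(signal - noise) > margin  # the two responses differ clearly
    return n_no / n_trials, n_no_correct / n_no, n_no_clear / n_no


p_no, accuracy_given_no, clear_given_no = simulate()
print(f"'no' reports:               {p_no:.1%}")
print(f"2AFC accuracy given 'no':   {accuracy_given_no:.1%}")
print(f"clear difference given 'no': {clear_given_no:.1%}")
```

      Under these assumed parameters, forced-choice accuracy on "no" trials stays above chance, and a substantial share of those trials show a large difference between the two intervals. The sketch does not model confidence ratings directly, so it speaks only to whether the response pattern is possible in principle, not to how often it should occur in the actual data.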

      (3) In most (if not all) IB experiments in the literature, a partial attention and/or full attention trial is administered after the critical trial. These control trials are very important for validating IB on the critical trial, as they must show that, when attended, the critical stimuli are very easy to see. If a subject cannot detect the critical stimulus on the control trial, one cannot conclude that they were inattentionally blind on the critical trial, e.g., perhaps the stimulus was just too difficult to see (e.g., too weak, too brief, too far in the periphery, too crowded by distractor stimuli, etc.), or perhaps they weren't paying enough attention overall or failed to follow instructions. In the aggregate data, rates of noticing the stimuli should increase substantially from the critical trial to the control trials. If noticing rates are equivalent on the critical and control trials, one cannot conclude that attention was manipulated in the first place.

      In their rebuttal to the first round of peer review, the authors provided weak justification for not including such a control condition. They cite one paper that argues such control conditions are often used to exclude subjects from analysis (those who fail to notice the stimulus on the control trial are either removed from analysis or replaced with new subjects) and such exclusions/replacements can lead to underestimations of inattentional blindness rates. However, the inclusion of a partial or full attention condition as a control does not necessitate the extra step of excluding or replacing subjects. In the broadest sense, such a control condition simply validates the attention manipulation, i.e., one can easily compare the percent of subjects who answered "yes" or who got the 2AFC judgment correct during the critical trial versus the control trial. The subsequent choice about exclusion/replacement is separate, and researchers can always report the data with and without such exclusions/replacements to remain more neutral on this practice.

      If anyone were to follow up on this study, I would highly recommend including a partial or full attention control condition, especially given the online nature of data collection. It's important to know the percentage of online subjects who answer yes and who get the 2AFC question correct when the critical stimulus is attended, because that is the baseline (in this case, the "ceiling level" of performance) to which the IB rates on the critical trial can be compared.

      We agree with the reviewer that future studies could benefit from including a partial or full attention condition. They are surely right that we might learn something additional from such conditions. 

      Where we differ from the reviewer is in thinking of these conditions as “controls” appropriate to our research question. This is why we offered the justification we did in our earlier response. When these conditions are used as controls, they are used to exclude subjects in ways that serve to inflate the biases we are concerned with in our work. For our question, the absence of these conditions does not impact the significance of the findings, since such conditions are designed to answer a question which is not the one at the heart of our paper. Our key claim is that subjects who deny noticing an unexpected stimulus in a standard inattentional blindness paradigm nonetheless exhibit significant residual sensitivity (as well as a conservative bias in their response to the noticing question); the presence or absence of partial- or full-attention conditions is orthogonal to that question.

      Moreover, we note first that our tasks were precisely chosen to be classic tasks widely used in the literature to manipulate attention. Thus, by common consensus in the field, they are effective means to soak up attention, and have in effect been tested in partial- and full-attention control settings in a huge number of studies. Second, we think it very doubtful that subjects in a full-attention trial would not overwhelmingly have detected our critical stimuli. The reviewer worries that they might have been “too weak, too brief, too far in the periphery, too crowded by distractor stimuli, etc.” But consider E5, where the stimulus was a highly salient orange or green shape, present on the screen for 5 seconds. The reviewer also suggests that subjects in the full-attention control might not have detected the stimulus because they “weren't paying enough attention overall”. But evidently, if they weren’t paying attention even in the full-attention trial, this would be reason for thinking that there was inattentional blindness even in this condition (a point made by White et al. 2018), and certainly not a reason for thinking there was not an attentional effect in the critical trial. Lastly, the reviewer suggests that a full-attention condition would have helped ensure that subjects were following instructions. But we ensured this already by (as per our pre-registration) excluding subjects who performed poorly in the relevant primary tasks.

      Thus, both in principle and in practice, we do not see the absence of such conditions as impacting the interpretation of our findings, even as we agree that future work posing a different research question could certainly learn something from including such conditions.

      Responses to Reviewer #2:

      We note that this report is unchanged from an earlier round of review, and not a response to our significantly revised manuscript. We believe our latest version fully addresses all the issues which the reviewer originally raised. The interested reader can see our original response below. We again thank the reviewer for their previous report which was extremely helpful.

      ---

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This study presents valuable findings to the field interested in inattentional blindness (IB), reporting that participants indicating no awareness of unexpected stimuli through yes/no questions, still show above-chance sensitivity to specific properties of these stimuli through follow-up forced-choice questions (e.g., its color). The results suggest that this is because participants are conservative and biased to report not noticing in IB. The authors conclude that these results provide evidence for residual perceptual awareness of inattentionally blind stimuli and that therefore these findings cast doubt on the claim that awareness requires attention. Although the samples are large and the analysis protocol novel, the evidence supporting this interpretation is still incomplete, because effect sizes are rather small, the experimental design could be improved and alternative explanations have not been ruled out.

      We are encouraged to hear that eLife found our work “valuable”. We also understand, having closely looked at the reviews, why the assessment also includes an evaluation of “incomplete”. We gave considerable attention to this latter aspect of the assessment in our revision. In addition to providing additional data and analyses that we believe strengthen our case, we also include a much more substantial review and critique of existing methods in the IB literature to make clear exactly the gap our work fills and the advance it makes. (Indeed, if it is appropriate to say this here, we believe one key aspect of our work that is missing from the assessment is our inclusion of ‘absent’ trials, which is what allows us to make the crucial claims about conservative reporting of awareness in IB for the first time.) Moreover, we refocus our discussion on only our most central claims, and weaken several of our secondary claims so that the data we’ve collected are better aligned with the conclusions we draw, to ensure that the case we now make is in fact complete. Specifically, our two core claims are (1) that there is residual sensitivity to visual features for subjects who would ordinarily be classified as inattentionally blind (whether this sensitivity is conscious or not), and (2) that there is a tendency to respond conservatively on yes/no questions in the context of IB. We believe we have very compelling support for these two core claims, as we explain in detail below and also through revisions to our manuscript.

      Given the combination of strengthened and clarified case, as well as the weakening of any conclusions that may not have been fully supported, we believe and hope that these efforts make our contribution “solid”, “convincing”, or even “compelling” (especially because the “compelling” assessment characterizes contributions that are “more rigorous than the current state-of-the-art”, which we believe to be the case given the issues that have plagued this literature and that we make progress on).

      Reviewer #1 (Public review):

      Summary:

      In the abstract and throughout the paper, the authors boldly claim that their evidence, from the largest set of data ever collected on inattentional blindness, supports the views that "inattentionally blind participants can successfully report the location, color, and shape of stimuli they deny noticing", "subjects retain awareness of stimuli they fail to report", and "these data...cast doubt on claims that awareness requires attention." If their results were to support these claims, this study would overturn 25+ years of research on inattentional blindness, resolve the rich vs. sparse debate in consciousness research, and critically challenge the current majority view in cognitive science that attention is necessary for awareness.

      Unfortunately, these extraordinary claims are not supported by extraordinary (or even moderately convincing) evidence. At best, the results support the more modest conclusion: If sub-optimal methods are used to collect retrospective reports, inattentional blindness rates will be overestimated by up to ~8% (details provided below in comment #1). This evidence-based conclusion means that the phenomenon of inattentional blindness is alive and well as it is even robust to experiments that were specifically aimed at falsifying it. Thankfully, improved methods already exist for correcting the ~8% overestimation of IB rates that this study successfully identified.

      We appreciate here the reviewer’s recognition of the importance of work on inattentional blindness, and the centrality of inattentional blindness to a range of major questions. We also recognize their concerns with what they see as a gap between our data and the claims made on their basis. We address this in detail below (as well as, of course, in our revised manuscript). However, from the outset we are keen to clarify that our central claim is only the first one the reviewer mentions — and the one which appears in our title — namely that, as a group, participants can successfully report the location, color, and shape of stimuli they deny noticing, and thus that there is “Sensitivity to visual features in inattentional blindness”. This is the claim that we believe is strongly supported by our data, and all the more so after revising the manuscript in light of the helpful comments we’ve received.

      By contrast, the other claims the reviewer mentions, concerning awareness (as opposed to residual sensitivity, which might be conscious or unconscious), were intended as both secondary and tentative. We agree with the referee that these are not as strongly supported by our data (and indeed we say so in our manuscript), whereas we do think our data strongly support the more modest (and, to us, central) claim that, as a group, inattentionally blind participants can successfully report the location, color, and shape of stimuli they deny noticing.

      We also feel compelled to resist somewhat the reviewer’s summary of our claims. For example, the reviewer attributes to us the claim that “subjects retain awareness of stimuli they fail to report”; but while that phrase does appear in our abstract, what we in fact say is that our data are “consistent with an alternative hypothesis about IB, namely that subjects retain awareness of stimuli they fail to report”. We do in fact believe that our data are consistent with that hypothesis, whereas earlier investigations seemed not to be. We mention this only because we had used that careful phrasing precisely for this sort of reason, so that we wouldn’t be read as saying that our results unequivocally support that alternative.

      Still, looking back, we see how we may have given more emphasis than we intended to some of these more secondary claims. So, we’ve now gone through and revised our manuscript throughout to emphasize that our main claim is about residual sensitivity, and to make clear that our claims about awareness are secondary and tentative. Indeed, we now say precisely this, that although we favor an interpretation of “our results in terms of residual conscious vision in IB … this claim is tentative and secondary to our primary finding”. We also weaken the statements in the abstract that the reviewer mentions, to better reflect our key claims.

      Finally, we note one further point: Dialectically, inattentional blindness has been used to argue (e.g.) that attention is required for awareness. We think that our data concerning residual sensitivity at least push back on the use of IB to make this claim, even if (as we agree) they do not provide decisive evidence that awareness survives inattention. In other words, we think our data call that claim into question, such that it’s now genuinely unclear whether awareness does or does not survive inattention. We have adjusted our claims on this point accordingly as well.

      Comments:

      (1) In experiment 1, data from 374 subjects were included in the analysis. As shown in figure 2b, 267 subjects reported noticing the critical stimulus and 107 subjects reported not noticing it. This translates to a 29% IB rate, if we were to only consider the "did you notice anything unusual Y/N" question. As reported in the results text (and figure 2c), when asked to report the location of the critical stimulus (left/right), 63.6% of the "non-noticer" group answered correctly. In other words, 68 subjects were correct about the location while 39 subjects were incorrect. Importantly, because the location judgment was a 2-alternative-forced-choice, the assumption was that if 50% (or at least not statistically different from 50%) of the subjects answered the location question correctly, everyone was purely guessing. Therefore, we can estimate that ~39 of the subjects who answered correctly were simply guessing (because 39 guessed incorrectly), leaving 29 subjects from the non-noticer group who may have indeed actually seen the location of the stimulus. If these 29 subjects are moved to the noticer group, the corrected rate of IB for experiment 1 is 21% instead of 29%. In other words, relying only on the "Y/N did you notice anything" question leads to an overestimate of IB rates by 8%. This modest level of inaccuracy in estimating IB rates is insufficient for concluding that "subjects retain awareness of stimuli they fail to report", i.e. that inattentional blindness does not exist.

      In addition, this 8% inaccuracy in IB rates only considers one side of the story. Given the data reported for experiment 1, one can also calculate the number of subjects who answered "yes, I did notice something unusual" but then reported the incorrect location of the critical stimulus. This turned out to be 8 subjects (or 3% of the "noticer" group). Some would argue that it's reasonable to consider these subjects as inattentionally blind, since they couldn't even report where the critical stimulus they apparently noticed was located. If we move these 8 subjects to the non-noticer group, the 8% overestimation of IB rates is reduced to 6%.

      The same exercise can and should be carried out on the other 4 experiments; however, the authors do not report the subject numbers for any of the other experiments, i.e., how many subjects answered Y/N to the noticing question and how many in each group correctly answered the stimulus feature question. From the limited data reported (only total subject numbers and d' values), the effect sizes in experiments 2-5 were all smaller than in experiment 1 (d' for the non-noticer group was lower in all of these follow-up experiments), so it can be safely assumed that the ~6-8% overestimation of IB rates was smaller in these other four experiments. In a revision, the authors should consider reporting these subject numbers for all 5 experiments.
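
      The guessing correction applied in comment (1) can be written out as a short calculation. The sketch below is illustrative only: the function name and its parameters are hypothetical, and it assumes, as the comment does, that every incorrect 2AFC response among non-noticers is matched by one lucky correct guess. The inputs are the Experiment 1 counts quoted above (374 subjects, 107 non-noticers, 68 of them correct on location, 8 noticers incorrect).

```python
def ib_rate(n_total, n_nonnoticers, n_correct_nonnoticers=None, n_wrong_noticers=0):
    """Inattentional blindness rate, optionally corrected for guessing.

    Hypothetical helper for illustration; not taken from the paper.
    """
    blind = n_nonnoticers
    if n_correct_nonnoticers is not None:
        n_wrong = n_nonnoticers - n_correct_nonnoticers
        # ASSUMPTION: each wrong 2AFC answer is matched by one lucky correct
        # guess, so only the surplus of correct answers reflects real seeing.
        n_probably_saw = max(n_correct_nonnoticers - n_wrong, 0)
        blind -= n_probably_saw
    # Optionally count noticers who misreported the location as blind.
    blind += n_wrong_noticers
    return blind / n_total


# Experiment 1 counts as quoted in comment (1).
raw = ib_rate(374, 107)
corrected = ib_rate(374, 107, n_correct_nonnoticers=68)
two_sided = ib_rate(374, 107, n_correct_nonnoticers=68, n_wrong_noticers=8)
print(f"raw={raw:.0%} corrected={corrected:.0%} two_sided={two_sided:.0%}")
```

      Rounded to whole percentages, these values correspond to the 29%, 21% and 23% figures behind the 8-point and 6-point overestimates described above.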

      We now report, as requested, all these subject numbers in our supplementary data (see Supplementary Tables 1 and 2 in our Supplementary Materials).

      However, we wish to address the larger question the reviewer has raised: Do our data only support a relatively modest reduction in IB rates? Even if they did, we still believe that this would be a consequential result, suggesting a significant overestimation of IB rates in classic paradigms. However, part of our purpose in writing this paper is to push back against a certain binary way of thinking about seeing/awareness. Our sense is that the field has conceived of awareness as “all or nothing”: You either see a perfectly clear gorilla right in front of you, or you see nothing at all. Our perspective is different: We think there can be degraded forms of awareness that fall into neither of those categories. For that reason, we are disinclined to see our results in the way that the reviewer suggests, namely as simply indicating that fewer subjects fail to see the stimulus than previously assumed. To think that way is, in our view, to assume the orthodox binary position about awareness. If, instead, one conceives of awareness as we do (and as we believe the framework of signal detection theory should compel us to), then it isn’t quite right to think of the proportion of subjects who were aware, but rather (e.g.) the sensitivity of subjects to the relevant stimulus. This is why we prefer measures like d′ over % noticing and % correct. We understand that the reviewer may not think the same way about this issue as we do, but part of our goal is to promote that way of thinking in general, and so some of our comments below reflect that perspective and approach.

      For example, consider how we’d think about the single subject case where the task is 2AFC detection of a low contrast stimulus in noise. Suppose that this subject achieves 70% correct. One way of thinking about that is that the subject sees the stimulus on 40% of trials (achieving 100% correct on those) and guesses blindly on the other 60% (achieving 50% correct on those) for a total of 40% + 30% = 70% overall. However, this is essentially a “high threshold” approach to the problem, in contrast to an SDT approach. On an SDT approach (an approach with tremendous evidential support), on every trial the subject receives samples from probabilistic distributions corresponding to each interval (one noise and one signal + noise) and determines which is higher according to the 2AFC decision rule. Thus, across trials they have access to differentially graded information about the stimulus. Moreover, on some trials they may have significant information from the stimulus (perhaps, well above their single interval detection criterion) but still decide incorrectly because of high noise from the other spatial interval. From this perspective, there is no non-arbitrary way of saying whether the subject saw/did not see on a given trial. Instead, we must characterize the subject’s overall sensitivity to the stimulus/its visibility to them in terms of a parameter such as d′ (here, ~0.7).
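
      The two readings of 70% correct can be made concrete with a short calculation. The following is a minimal sketch and not part of the original analysis: the function names are hypothetical, the high-threshold estimate uses the standard correction for guessing, and the SDT estimate assumes the equal-variance Gaussian model, under which d′ = √2 · z(proportion correct) for a 2AFC task.

```python
from math import sqrt
from statistics import NormalDist


def high_threshold_seen_fraction(p_correct, n_alternatives=2):
    """Fraction of trials 'seen' if every other trial is a blind guess."""
    guess = 1.0 / n_alternatives
    return (p_correct - guess) / (1.0 - guess)


def dprime_2afc(p_correct):
    """Equal-variance Gaussian SDT sensitivity for a 2AFC task."""
    # z() is the inverse of the standard normal CDF.
    return sqrt(2) * NormalDist().inv_cdf(p_correct)


p = 0.70
print(f"high-threshold 'seen' fraction: {high_threshold_seen_fraction(p):.2f}")
print(f"SDT d-prime: {dprime_2afc(p):.2f}")
```

      The first quantity presupposes a binary split into seen and unseen trials, whereas the second summarizes graded evidence across all trials with a single sensitivity parameter.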

      We take the same attitude to our super subject. Rather than saying that some subjects saw/failed to see the stimuli, we suggest that the best way to characterize our results is that across subjects (and so trials also) there was differential graded access to information from the stimulus, best represented in terms of the group-level sensitivity parameter d′.

      We acknowledge that (despite ourselves) we occasionally fell into an all-too-natural binary/high threshold way of thinking, as when we suggested that our data show that “inattentionally blind subjects consciously perceive these stimuli after all” and “the inattentionally blind can see after all." (p.17) We have removed such problematic phrasing as well as other problematic phrasing as noted below.

      (2) Because classic IB paradigms involve only one critical trial per subject, the authors used a "super subject" approach to estimate sensitivity (d') and response criterion (c) according to signal detection theory (SDT). Some readers may have issues with this super subject approach, but my main concern is with the lack of precision used by the authors when interpreting the results from this super subject analysis.

      Only the super subject had above-chance sensitivity (and it was quite modest, with d' values between 0.07 and 0.51), but the authors over-interpret these results as applying to every subject. The methods and analyses cannot determine if any individual subject could report the features above-chance. Therefore, the following list of quotes should be revised for accuracy or removed from the paper as they are misleading and are not supported by the super subject analysis:

      "Altogether this approach reveals that subjects can report above-chance the features of stimuli (color, shape, and location) that they had claimed not to notice under traditional yes/no questioning" (p.6)

      "In other words, nearly two-thirds of subjects who had just claimed not to have noticed any additional stimulus were then able to correctly report its location." (p.6)

      "Even subjects who answer "no" under traditional questioning can still correctly report various features of the stimulus they just reported not having noticed, suggesting that they were at least partially aware of it after all." (p.8)

      "Why, if subjects could succeed at our forced-response questions, did they claim not to have noticed anything?" (p.8)

      "we found that observers could successfully report a variety of features of unattended stimuli, even when they claimed not to have noticed these stimuli." (p.14)

      "our results point to an alternative (and perhaps more straightforward) explanation: that inattentionally blind subjects consciously perceive these stimuli after all... they show sensitivity to IB stimuli because they can see them." (p.16)

      "In other words, the inattentionally blind can see after all." (p.17)

      We thank the reviewer for pointing out how these quotations may be misleading as regards our central claim. We intended them all to be read generically as concerning the group, and not universally as claiming that all subjects could report above-chance/see the stimuli etc. We agree entirely that the latter universal claim would not be supported by our data. In contrast, we do contend that our super-subject analysis shows that, as a group, subjects traditionally considered inattentionally blind exhibit residual sensitivity to features of stimuli (color, shape, and location) that they had all claimed not to notice, and likewise that as a group they could succeed at our forced-choice questions.

      To ensure this claim is clear throughout the paper, and that we are not interpreted as making an unsupported universal claim, we have revised the language in all of the quotations above, as follows, as well as in numerous other places in the paper.

      “Altogether this approach reveals that subjects can report above-chance the features of stimuli (color, shape, and location) that they had claimed not to notice under traditional yes/no questioning” (p.6) => “Altogether this approach reveals that as a group subjects can report above-chance the features of stimuli (color, shape, and location) that they had all claimed not to notice under traditional yes/no questioning” (p.6)

      “Even subjects who answer “no” under traditional questioning can still correctly report various features of the stimulus they just reported not having noticed, suggesting that they were at least partially aware of it after all.” (p.8) => “... even subjects who answer “no” under traditional questioning can, as a group, still correctly report various features of the stimuli they just reported not having noticed, indicating significant group-level sensitivity to visual features. Moreover, these results are even consistent with an alternative hypothesis about IB, that as a group, subjects who would traditionally be classified as inattentionally blind are in fact at least partially aware of the stimuli they deny noticing.” (p.8)

      “Why, if subjects could succeed at our forced-response questions, did they claim not to have noticed anything?” (p.8) => “Why, if subjects could succeed at our forced-response questions as a group, did they all individually claim not to have noticed anything?” (p.8)

      “we found that observers could successfully report a variety of features of unattended stimuli, even when they claimed not to have noticed these stimuli.” (p.14) => “we found that groups of observers could successfully report a variety of features of unattended stimuli, even when they all individually claimed not to have noticed those stimuli.” (p.14)

      “our results point to an alternative (and perhaps more straightforward) explanation: that inattentionally blind subjects consciously perceive these stimuli after all... they show sensitivity to IB stimuli because they can see them.” (p.16) => “our results just as easily raise an alternative (and perhaps more straightforward) explanation: that inattentionally blind subjects may retain a degree of awareness of these stimuli after all.” (p.16) Here deleting: “they show sensitivity to IB stimuli because they can see them.”

      “In other words, the inattentionally blind can see after all.” (p.17) => “In other words, as a group, the inattentionally blind enjoy at least some degraded or partial sensitivity to the location, color and shape of stimuli which they report not noticing.” (p.17)

      In one case, we felt the sentence was correct as it stood, since it simply reported a fact about our data:

      “In other words, nearly two-thirds of subjects who had just claimed not to have noticed any additional stimulus were then able to correctly report its location.” (p.6)

      After all, if subjects were entirely blind and simply guessed, it would be true to say that 50% of subjects would be able to correctly report the stimulus location (by guessing).

      In addition to these and numerous other changes, we also added the following explicit statement early in the paper to head off any confusion on this point: “Note that all analyses reported here relate to this super subject as opposed to individual subjects”.

      (3) In addition to the d' values for the super subject being slightly above zero, the authors attempted an analysis of response bias to further question the existence of IB. By including in some of their experiments critical trials in which no critical stimulus was presented, but asking subjects the standard Y/N IB question anyway, the authors obtained false alarm and correct rejection rates. When these FA/CR rates are taken into account along with hit/miss rates when critical stimuli were presented, the authors could calculate c (response criterion) for the super subject. Here, the authors report that response criteria are biased towards saying "no, I didn't notice anything". However, the validity of applying SDT to classic Y/N IB questioning is questionable.

      For example, with the subject numbers provided in Box 1 (the 2x2 table of hits/misses/FA/CR), one can ask, 'how many subjects would have needed to answer "yes, I noticed something unusual" when nothing was presented on the screen in order to obtain a non-biased criterion estimate, i.e., c = 0?' The answer turns out to be 800 subjects (out of the 2761 total subjects in the stimulus-absent condition), or 29% of subjects in this condition.

      In the context of these IB paradigms, it is difficult to imagine 29% of subjects claiming to have seen something unusual when nothing was presented. Here, it seems that we may have reached the limits of extending SDT to IB paradigms, which are very different than what SDT was designed for. For example, in classic psychophysical paradigms, the subject is asked to report Y/N as to whether they think a threshold-level stimulus was presented on the screen, i.e., to detect a faint signal in the noise. Subjects complete many trials and know in advance that there will often be stimuli presented and the stimuli will be very difficult to see. In those cases, it seems more reasonable to incorrectly answer "yes" 29% of the time, as you are trying to detect something very subtle that is out there in the world of noise. In IB paradigms, the stimuli are intentionally designed to be highly salient (and unusual), such that with a tiny bit of attention they can be easily seen. When no stimulus is presented and subjects are asked about their own noticing (especially of something unusual), it seems highly unlikely that 29% of them would answer "yes", which is the rate of FAs that would be needed to support the null hypothesis here, i.e., of a non-biased criterion. For these reasons, the analysis of response bias in the current context is questionable and the results claiming to demonstrate a biased criterion do not provide convincing evidence against IB.

      We are grateful to the reviewer for highlighting this aspect of our data. We agree with several of these points. For example, it is indeed striking that — given the corresponding hit rate — a false alarm rate of 29% would be needed to obtain an unbiased criterion. At the same time, we would respectfully push back on other points above. In our first experiment that uses the super-subject analysis, for example, d′ is 0.51 and highly significant; to describe that figure, as the reviewer does, as “slightly above zero” seemed not quite right to us (and all the more so given that these experiments involve very large samples and preregistered analysis plans). 

      We also respectfully disagree that our data call into question the validity of applying SDT to classic yes/no IB questioning. The mathematical foundations of SDT are rock solid, and have been applied far more broadly than we have applied them here. In fact, in a way we would suggest that exactly the opposite attitude is appropriate: rather than thinking that IB challenges an immensely well-supported, rigorously tested and broadly applicable mathematical model of perception, we think that the conflict between our SDT-based model of IB and the standard interpretation constitutes strong reason to disfavor the standard interpretation. Several points are worth making here.

      First, it is already surprising that 11.03% of our subjects in E2 (46/417) and 7.24% of our subjects in E5 (200/2761) reported noticing a stimulus when no stimulus was present. But while this may have seemed unlikely in advance of inquiry, this is in fact what the data show and forms the basis of our criterion calculations. Thus, our criterion calculations already factor in a surprising but empirically verified high false alarm rate of subjects answering “yes” when no stimulus was presented and they were asked about their noticing. (We also note that the only paper we know of to report a false alarm rate in an IB paradigm, though not one used to calculate a response criterion, found a very consistent false alarm rate of 10.4%. See Devue et al. 2009.)

      Second, while the reviewer is of course correct that a common psychophysical paradigm involves detection of a “threshold-level”/faint stimulus in noise, it is widely recognized that SDT has an extremely broad application, being applicable to any situation in which two kinds of event are to be discriminated (Pastore & Scheirer 1975) and being “almost universally accepted as a theoretical account of decision making in research on perceptual detection and recognition and in numerous extensions to applied domains” quite generally (Estes 2002, see also: Wixted 2020). Indeed, cases abound in which SDT has been successfully applied to situations which do not involve near threshold stimuli in noise. To pick two examples at random, SDT has been used in studying acceptability judgments in linguistics (Huang and Ferreira 2020) and the assessment of physical aggression in child-student interactions (Lerman et al. 2010; for more general discussion of practical applications, see Swets et al. 2000). Given that the framework of SDT is so widely applied and well supported, and that we see no special reason to make an exception, we believe it can be relied on in the present context.

      Finally, we note that inattentional blindness can in many ways be considered analogous to “near threshold” detection since inattention is precisely thought to degrade or even abolish awareness of stimuli, meaning that our stimuli can be construed as near threshold in the relevant sense. Indeed, our relatively modest d′ values suggest that under inattention stimuli are indeed hard to detect. Thus, even were SDT more limited in its application, we think it still would be appropriate to apply to the case of IB.
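
      The criterion arithmetic discussed in this exchange can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used for the paper. It uses the standard equal-variance definition c = −½[z(H) + z(FA)], takes the false alarm counts quoted above (46/417 in E2, 200/2761 in E5), and back-derives the E5 hit rate from the reviewer's statement that 800/2761 false alarms would yield c = 0, which under this definition requires FA = 1 − H. The hit rate itself is not reported in this response, so that value is an inference, and the variable names are hypothetical.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse standard normal CDF


def criterion(hit_rate, fa_rate):
    """Equal-variance SDT response criterion c; c > 0 means a bias to say 'no'."""
    return -0.5 * (_z(hit_rate) + _z(fa_rate))


# False alarm rates from the counts quoted in the response.
fa_e2 = 46 / 417
fa_e5 = 200 / 2761

# Reviewer's figure: false alarm rate needed for an unbiased criterion in E5.
fa_unbiased = 800 / 2761

# ASSUMPTION: hit rate back-derived from c = 0 <=> FA = 1 - H; it is not
# taken from the paper's Box 1.
implied_hit_rate = 1 - fa_unbiased

c_unbiased = criterion(implied_hit_rate, fa_unbiased)
c_observed = criterion(implied_hit_rate, fa_e5)

print(f"E2 false alarm rate: {fa_e2:.2%}")
print(f"E5 false alarm rate: {fa_e5:.2%}")
print(f"c at the reviewer's 29% false alarm rate: {c_unbiased:.3f}")
print(f"c at the observed E5 false alarm rate: {c_observed:.3f}")
```

      Under this convention a positive c indicates a conservative bias, so the sketch shows why an observed false alarm rate well below 1 − H implies a criterion shifted toward answering "no".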

      (4) One of the strongest pieces of evidence presented in the entire paper is the single data point in Figure 3e showing that in Experiment 3, even the super subject group that rated their non-noticing as "highly confident" had a d' score significantly above zero. Asking for confidence ratings is certainly an improvement over simple Y/N questions about noticing, and if this result were to hold, it could provide a key challenge to IB. However, this result hinges on a single data point, it was not replicated in any of the other 4 experiments, and it can be explained by methodological limitations. I strongly encourage the authors (and other readers) to follow up on this result, in an in-person experiment, with improved questioning procedures.

      We agree that our finding that even the super-subject group that rated their non-noticing as “highly confident” had a d' score significantly above zero is an especially strong piece of evidence, and we thank the reviewer for highlighting that here. At the same time, we note that while the finding is represented by a single marker in Figure 3e, it seemed not quite right to call this a “single data point”, as the reviewer does, given that it derives from a large pre-registered experiment involving some 7,000 subjects total, with over 200 subjects in the relevant bin (both figures being far larger than in a typical IB experiment). It would of course be tremendous to follow up on this result, and we certainly hope our work inspires various follow-up studies. That said, we note that recruiting the necessary numbers of in-person subjects would be an absolutely enormous, career-level undertaking: it would involve bringing more than the entire undergraduate population at our own institution, Johns Hopkins, into our laboratory! While those results would obviously be extremely valuable, we wouldn’t want to read the reviewer’s comments as implying that only an experiment of that magnitude, requiring thousands upon thousands of in-person subjects, could make progress on these issues. Indeed, because every subject can only contribute one critical trial in IB, it has long been recognized as an extremely challenging paradigm to study in a sufficiently well-powered and psychophysically rigorous way. We believe that our large preregistered online approach represents a major leap forward here, even if it involves certain trade-offs.

      In the current Experiment 3, the authors asked the standard Y/N IB question, and then asked how confident subjects were in their answer. Asking back-to-back questions, the second one with a scale that pertains to the first one (including a tricky inversion, e.g., "yes, I am confident in my answer of no"), may be asking too much of some subjects, especially subjects paying half-attention in online experiments. This procedure is likely to introduce a sizeable degree of measurement error.

      An easy fix in a follow-up study would be to ask subjects to rate their confidence in having noticed something with a single question using an unambiguous scale:

      On the last trial, did you notice anything besides the cross?

      (1): I am highly confident I didn't notice anything else

      (2): I am confident I didn't notice anything else

      (3): I am somewhat confident I didn't notice anything else

      (4): I am unsure whether I noticed anything else

      (5): I am somewhat confident I noticed something else

      (6): I am confident I noticed something else

      (7): I am highly confident I noticed something else

      If we were to re-run this same experiment, in the lab where we can better control the stimuli and the questioning procedure, we would most likely find a d' of zero for subjects who were confident or highly confident (1-2 on the improved scale above) that they didn't notice anything. From there on, the d' values would gradually increase, tracking along with the confidence scale (from 3-7 on the scale). In other words, we would likely find a data pattern similar to that plotted in Figure 3e, but with the first data point on the left moving down to zero d'. In the current online study with the successive (and potentially confusing) retrospective questioning, a handful of subjects could have easily misinterpreted the confidence scale (e.g., inverting the scale) which would lead to a mixture of genuine high-confidence ratings and mistaken ratings, which would result in a super subject d' that falls between zero and the other extreme of the scale (which is exactly what the data in Fig 3e shows).

      One way to check on this potential measurement error using the existing dataset would be to conduct additional analyses that incorporate the confidence ratings from the 2AFC location judgment task. For example, were there any subjects who reported being confident or highly confident that they didn't see anything, but then reported being confident or highly confident in judging the location of the thing they didn't see? If so, how many? In other words, how internally (in)consistent were subjects' confidence ratings across the IB and location questions? Such an analysis could help screen out subjects who made a mistake on the first question and corrected themselves on the second, as well as subjects who weren't reading the questions carefully enough.
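
      The consistency check proposed here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout and field names (`noticed`, `ib_conf`, `loc_conf`) are hypothetical, the 0-3 confidence scale follows the description of Experiment 3, and treating ratings of 2 or 3 as "confident" is an assumed cut-off rather than anything taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    noticed: bool  # answer to the yes/no IB question
    ib_conf: int   # confidence in that answer: 0 (not at all) to 3 (highly)
    loc_conf: int  # confidence in the 2AFC location judgment: 0 to 3


def inconsistent_non_noticers(subjects, threshold=2):
    """Return non-noticers who were confident in BOTH of their answers.

    `threshold` is an assumed cut-off for counting a rating as confident.
    """
    return [
        s for s in subjects
        if not s.noticed and s.ib_conf >= threshold and s.loc_conf >= threshold
    ]


# Toy records, invented purely for illustration (not the study's data).
subjects = [
    Subject(noticed=False, ib_conf=3, loc_conf=0),  # consistent non-noticer
    Subject(noticed=False, ib_conf=3, loc_conf=3),  # flagged as inconsistent
    Subject(noticed=True, ib_conf=3, loc_conf=3),   # noticer, not relevant
    Subject(noticed=False, ib_conf=1, loc_conf=2),  # unsure non-noticer
]

flagged = inconsistent_non_noticers(subjects)
n_non_noticers = sum(not s.noticed for s in subjects)
print(f"{len(flagged)} of {n_non_noticers} non-noticers flagged")
```

      Whether flagged subjects should then be excluded is a separate judgment call; the sketch only counts them.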

      As far as I could tell, the confidence rating data from the 2AFC location task were not reported anywhere in the main paper or supplement.

      We are grateful to the reviewer for raising this issue and for requesting that we report the confidence rating data from our 2afc location task in Experiment 3. We now report all this data in our Supplementary Materials (see Supplementary Table 3).

      We of course agree with the reviewer’s concern about measurement error, which is a concern in all experiments. What, then, of the particular concern that some subjects might have misunderstood our confidence question? It is surely impossible in principle to rule out this possibility; however, several factors bear on the plausibility of this interpretation. First, we explicitly labeled our confidence scale (with 0 labeled as ‘Not at all confident’ and 3 as ‘Highly confident’) so that subjects would be very unlikely simply to invert the scale. This is especially so as it is very counterintuitive to treat “0” as reflecting high confidence. However, we accept that it is a possibility that certain subjects might nonetheless have been confused in some other way.

      So, we also took a second approach. We examined the confidence ratings on the 2afc question of subjects who reported being highly confident that they didn't notice anything.

      Reassuringly, the large majority of these high confidence “no” subjects (~80%) reported low confidence of 0 or 1 on the 2afc question, and the majority (51%) reported the lowest confidence of 0. Only 18/204 (9%) subjects reported high confidence on both questions. 

      Still, the numbers of subjects here are small and so may not be reliable. This led us to take a third approach. We reasoned that any measurement error due to inverting or misconstruing the confidence scale should be symmetric. That is: If subjects are liable to invert the confidence scale, they should do so just as often when they answer “yes” as when they answer “no” – after all the very same scale is being used in both cases. This allows us to explore evidence of measurement error in relation to the much larger number of high-confidence “yes” subjects (N = 2677), thus providing a much more robust indicator as to whether subjects are generally liable to misconstrue the confidence scale. Looking at the number of such high confidence noticers who subsequently respond to the 2afc question with low-confidence, we found that the number was tiny. Only 28/2677 (1.05%) of high-confidence noticers subsequently gave the lowest level of confidence on the 2afc question, and only 63/2677 (2.35%) subjects gave either of the two lower levels of confidence. In this light, we consider any measurement error due to misunderstanding the confidence scale to be extremely minimal.
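
      For readers who want to verify the arithmetic, the proportions quoted in this response can be recomputed directly from the reported counts. The short Python sketch below uses only the counts stated in the text; the helper name `pct` and the variable names are illustrative.

```python
def pct(k, n):
    """Percentage of k out of n."""
    return 100.0 * k / n


# Counts as reported in the response for Experiment 3.
yes_lowest_2afc = pct(28, 2677)  # high-confidence noticers, lowest 2afc confidence
yes_low_2afc = pct(63, 2677)     # high-confidence noticers, either low 2afc level
no_high_both = pct(18, 204)      # high-confidence non-noticers, high 2afc confidence

print(f"{yes_lowest_2afc:.2f}% {yes_low_2afc:.2f}% {no_high_both:.1f}%")
```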

      What should we make of the 18 subjects who were highly confident non-noticers but then also highly confident on the 2afc question? Importantly, we do not think that these 18 subjects necessarily made a mistake on the first question and so should be excluded. There is no a priori reason why one’s confidence criterion in a yes/no question should carry over to a 2afc question. After all, it is perfectly rationally coherent to be very confident that one didn’t see anything but also very confident that if there was anything to be seen, it was on the left. Moreover, these 18 subjects were not all correct on the 2afc question despite their high confidence (4/18 or 22% getting the wrong answer).

      Nonetheless, and again reassuringly, we found that the above-chance patterns in our data remained the same even excluding these 18 subjects. We did observe a slight reduction in percent correct and d′ but this is absolutely what one should expect since excluding the most confident performers in any task will almost inevitably reduce performance.

      In this light, we consider it unlikely that measurement error fully explains the residual sensitivity found even amongst highly confident non-noticers. That said, we appreciate this concern. We now raise the issue and the analysis of high confidence noticers which addresses it in our revised manuscript. We also thank the reviewer for pressing us to think harder about this issue, which led directly to these new analyses that we believe have strengthened the paper.

      (5) In most (if not all) IB experiments in the literature, a partial attention and/or full attention trial (or set of trials) is administered after the critical trial. These control trials are very important for validating IB on the critical trial, as they must show that, when attended, the critical stimuli are very easy to see. If a subject cannot detect the critical stimulus on the control trial, one cannot conclude that they were inattentionally blind on the critical trial, e.g., perhaps the stimulus was just too difficult to see (e.g., too weak, too brief, too far in the periphery, too crowded by distractor stimuli, etc.), or perhaps they weren't paying enough attention overall or failed to follow instructions. In the aggregate data, rates of noticing the stimuli should increase substantially from the critical trial to the control trials. If noticing rates are equivalent on the critical and control trials one cannot conclude that attention was manipulated.

      It is puzzling why the authors decided not to include any control trials with partial or full attention in their five experiments, especially given their online data collection procedures where stimulus size, intensity, eccentricity, etc. were uncontrolled and variable across subjects. Including such trials could have actually helped them achieve their goal of challenging the IB hypothesis, e.g., excluding subjects who failed to see the stimulus on the control trials might have reduced the inattentional blindness rates further. This design decision should at least be acknowledged and justified (or noted as a limitation) in a revision of this paper.

      We acknowledge that other studies in the literature include divided and full attention trials, and that they could have been included in our work as well. However, we deliberately decided not to include such control trials for an important reason. As the referee comments, the main role of such trials in previous work has been to exclude from analysis subjects who failed to report the unexpected stimulus on the divided and/or full attention control trials.

      (For example, as Most et al. 2001 write: “Because observers should have seen the object in the full-attention trial (Mack & Rock, 1998), we used this trial as a control … Accordingly, 3 observers who failed to see the cross on this trial were replaced, and their data were excluded from the analyses.”) As the reviewer points out, excluding such subjects would very likely have ‘helped’ us. However, the practice is controversial. Indeed, in a review of 128 experiments, White et al. 2018 argue that the practice has “problematic consequences” and “may lead researchers to understate the pervasiveness of inattentional blindness”. Since we wanted to offer as simple and demanding a test of residual sensitivity in IB as possible, we thus decided not to use any such exclusions, and for that reason decided not to include divided/full attention trials.

      As recommended, we discuss this decision not to include divided/full attention trials and our logic for not doing so in the manuscript. As we explain, not having those conditions makes it more impressive, not less impressive, that we observed the results we in fact did — it makes our results more interpretable, not less interpretable, and so absence of such conditions from our manuscript should not (in our view) be considered any kind of weakness.

      (6) In the discussion section, the authors devote a short paragraph to considering an alternative explanation of their non-zero d' results in their super subject analyses: perhaps the critical stimuli were processed unconsciously and left a trace such that when later forced to guess a feature of the stimuli, subjects were able to draw upon this unconscious trace to guide their 2AFC decision. In the subsequent paragraph, the authors relate these results to above-chance forced-choice guessing in blindsight subjects, but reject the analogy based on claims of parsimony.

      First, the authors dismiss the comparison of IB and blindsight too quickly. In particular, the results from experiment 3, in which some subjects adamantly (confidently) deny seeing the critical stimulus but guess a feature at above-chance levels (at least at the super subject level and assuming the online subjects interpreted and used the confidence scale correctly), seem highly analogous to blindsight. Importantly, the analogy is strengthened if the subjects who were confident in not seeing anything also reported not being confident in their forced-choice judgments, but as mentioned above this data was not reported.

      Second, the authors fail to mention an even more straightforward explanation of these results, which is that ~8% of subjects misinterpreted the "unusual" part of the standard IB question used in experiments 1-3. After all, colored lines and shapes are pretty "usual" for psychology experiments and were present in the distractor stimuli everyone attended to. It seems quite reasonable that some subjects answered this first question, "no, I didn't see anything unusual", but then when told that there was a critical stimulus and asked to judge one of its features, adjusted their response by reconsidering, "oh, ok, if that's the unusual thing you were asking about, of course I saw that extra line flash on the left of the screen". This seems like a more parsimonious alternative compared to either of the two interpretations considered by the authors: (1) IB does not exist, (2) super-subject d' is driven by unconscious processing. Why not also consider: (3) a small percentage of subjects misinterpreted the Y/N question about noticing something unusual. In experiments 4-5, they dropped the term "unusual" but do not analyze whether this made a difference nor do they report enough of the data (subject numbers for the Y/N question and 2AFC) for readers to determine if this helped reduce the ~8% overestimate of IB rates.

      Our primary ambition in the paper was to establish, as our title suggests, residual sensitivity in IB. The ambition is quite neutral as to whether the sensitivity reflects conscious or unconscious processing (i.e. is akin to blindsight as traditionally conceived). We were evidently not clear about this, however, leading to two referees coming away with an impression of our claims that is different than we intended. We have revised our manuscript throughout to address this. But we also want to emphasize here that we take our data primarily to support the more modest claim that there is residual sensitivity (conscious or unconscious) in the group of subjects who are traditionally classified as inattentionally blind. We believe that this claim has solid support in our data.

      We do in the discussion section offer one reason for believing that there is residual awareness in the group of subjects who are traditionally classified as inattentionally blind. However, we acknowledge that this is controversial and now emphasize in the manuscript that this claim “is tentative and secondary to our primary finding”. We also emphasize that part of our point is dialectical: Inattentional blindness has been used to argue (e.g.) that attention is required for awareness. We think that our data concerning residual sensitivity at least push back on the use of IB to make this claim, even if they do not provide decisive evidence (as we agree) that awareness survives inattention. (Cf. here, Hirschhorn et al. 2024 who take up a common suggestion in the field that awareness is best assessed by using both subjective and objective measures, with claims about lack of awareness ideally being supported by both; our data suggest at a minimum that in IB objective measures do not neatly line up with subjective measures.)

      We hope this addresses the referee’s concern that we dismiss “the comparison of IB and blindsight too quickly”. We do not intend to dismiss that comparison at all; indeed, we raise it because we consider it a serious hypothesis. Our aim is simply to raise one possible consideration against it. But, again, our main claim is quite consistent with sensitivity in IB being akin to “blindsight”.

      We also agree with the referee that a possible explanation of why some subjects say they do not notice something unusual in IB paradigms is not that they didn’t notice anything but that they didn’t consider the unexpected stimulus sufficiently unusual. However, the reviewer is incorrect that we did not mention this interpretation; to the contrary, it was precisely the kind of concern which led us to be dissatisfied with standard IB methods and so motivated our approach. As we wrote in our main text: “However, yes/no questions of this sort are inherently and notoriously subject to bias… For example, observers might be under-confident whether they saw anything (or whether what they saw counted as unusual); this might lead them to respond “no” out of an excess of caution.” On our view, this is exactly the kind of reason (among other reasons) that one cannot rely on yes/no reports of noticing unusual stimuli, even though the field has relied on just these sorts of questions in just this way.

      We do not, however, think that this explanation accounts for why all subjects fail to report noticing, nor do we think that it accounts for our finding of above-chance sensitivity amongst non-noticers. This is for two critical reasons. First, whereas the word “unusual” did appear in the yes/no question in our Experiments 1-3, it did not appear in our Experiments 4 and 5 on dynamic IB. (In both cases, we used the exact wording of such questions in the experiments we were basing our work on.) And, of course, we still found significant residual sensitivity amongst non-noticers in Experiments 4 and 5. Second, in relation to our confidence experiment, we think it unlikely that subjects who were highly confident that they did not notice anything unusual only said that because they thought what they had seen was insufficiently unusual. Yet even in this group of subjects who were maximally confident that they did not notice anything unusual, we still found residual sensitivity.

      (7) The authors use sub-optimal questioning procedures to challenge the existence of the phenomenon this questioning is intended to demonstrate. A more neutral interpretation of this study is that it is a critique on methods in IB research, not a critique on IB as a manipulation or phenomenon. The authors neglect to mention the dozens of modern IB experiments that have improved upon the simple Y/N IB questioning methods. For example, in Michael Cohen's IB experiments (e.g., Cohen et al., 2011; Cohen et al., 2020; Cohen et al., 2021), he uses a carefully crafted set of probing questions to conservatively ensure that subjects who happened to notice the critical stimuli have every possible opportunity to report seeing them. In other experiments (e.g., Hirschhorn et al., 2024; Pitts et al., 2012), researchers not only ask the Y/N question but then follow this up by presenting examples of the critical stimuli so subjects can see exactly what they are being asked about (recognition-style instead of free recall, which is more sensitive). These follow-up questions include foil stimuli that were never presented (similar to the stimulus-absent trials here), and ask for confidence ratings of all stimuli. Conservative, pre-defined exclusion criteria are employed to improve the accuracy of their IB-rate estimates. In these and other studies, researchers are very cautious about trusting what subjects report seeing, and in all cases, still find substantial IB rates, even to highly salient stimuli. The authors should consider at least mentioning these improved methods, and perhaps consider using some of them in their future experiments.

      The concern that we do not sufficiently discuss the range of “improved” methods in IB studies is well-taken. A similar concern is raised by Reviewer #2 (Dr. Cohen). To address the concern, we have added to our manuscript a substantial new discussion of such improved methods. However, although we do agree that these methods can be helpful and may well address some of the methodological concerns which our paper raises, we do not think that they are a panacea. Thus, our discussion of these methods also includes a substantial discussion of the problems and pitfalls with such methods which led us to favor our own simple forced-response and 2afc questions, combined with SDT analysis. We think this approach is superior both to the classic approach in IB studies and to the approach raised by the reviewers.

      In particular, we have four main concerns about the follow up questions now commonly used in the field:

      First, many follow up questions are used not to exclude people from the IB group but to include people in the IB group. Thus, Most et al. 2001 asked follow up questions but used these to increase their IB group, only excluding subjects from the IB group if they both reported seeing and answered their follow ups correctly: “Observers were regarded as having seen the unexpected object if they answered 'yes' when asked if they had seen anything on the critical trial that had not been present before and if they were able to describe its color, motion, or shape.” This means that subjects who saw the object but failed to see its color, say, would be treated as inattentionally blind. This has the effect of inflating IB rates, in exactly the way our paper is intended to critique. So, in our view this isn’t an improvement but rather part of the approach we take issue with.

      Second, many follow up questions remain yes/no questions or nearby variants, all of which are subject to response bias. For example, in Cohen’s studies which the reviewer mentions, it is certainly true that “he uses a carefully crafted set of probing questions to conservatively ensure that subjects who happened to notice the critical stimuli have every possible opportunity to report seeing them.” We agree that this improves over a simple yes/no question in some ways. However, such follow up probes nonetheless remain yes/no questions, subject to response bias, e.g.:

      (1) “Did you notice anything strange or different about that last trial?”

      (2) “If I were to tell you that we did something odd on the last trial, would you have a guess as to what we did?”

      (3) “If I were to tell you we did something different in the second half of the last trial, would you have a guess as to what we did?”

      (4) “Did you notice anything different about the colors in the last scene?”

      Indeed, follow up questions of this kind can be especially susceptible to bias, since subjects may be reluctant to “take back” their earlier answers and so be conservative in responding positively to avoid inconsistency or acknowledgement of earlier error. This may explain why such follow up questions produce remarkable consistency despite their rather different wording. Thus, Simons and Chabris (1999) report: “Although we asked a series of questions escalating in specificity to determine whether observers had noticed the unexpected event, only one observer who failed to report the event in response to the first question (“did you notice anything unusual?'') reported the event in response to any of the next three questions (which culminated in “did you see a ... walk across the screen?''). Thus, since the responses were nearly always consistent across all four questions, we will present the results in terms of overall rates of noticing.” Thus, while there are undoubtedly merits to these follow ups, they do not resolve problems of bias.

      This same basic issue affects the follow up question used in Pitts et al. 2012 which the reviewer mentions. Pitts et al. write: “If a participant reported not seeing any patterns and rated their confidence in seeing the square pattern (once shown the sample) as a 3 or less (1 = least confident, 5 = most confident), she or he was placed in Group 1 and was considered to be inattentionally blind to the square patterns.” The confidence rating follow-up question here remains subject to bias. Moreover, and strikingly, the inclusion criterion used means that subjects who were moderately confident that they saw the square pattern when shown (i.e. answered 3) were counted as inattentionally blind (!). We do not think this is an appropriate inclusion criterion.

      The third problem is that follow up questions are often free/open-response. For instance, Most et al. (2005) ask the follow up question: "If you did see something on the last trial that had not been present during the first two trials, what color was it? If you did not see something, please guess." This is a much more difficult and to that extent less sensitive question than our binary forced-response/2afc questions. For this reason, we believe our follow up questions are more suitable for ascertaining low levels of sensitivity.

      The fourth and final issue is that whereas 2afc questions are criterion free (in that they naturally have an unbiased decision rule), this is in fact not true of n-afc questions in general, nor is it true in general of delayed n-alternative match to sample designs. Thus, even when limited response options are given, they are not immune to response biases and so require SDT analysis. Moreover, some such tasks can involve decision spaces which are often poorly understood or difficult to analyze without making substantial assumptions about observer strategy.
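
      The contrast between yes/no and 2afc measures can be made concrete with the standard equal-variance Gaussian signal detection formulas. The Python sketch below is a textbook illustration, not the authors' analysis code: the function names are hypothetical, the counts in the usage lines are invented, and the log-linear correction (adding 0.5 to each count and 1 to each total) is one common, assumed way of avoiding infinite z-scores.

```python
from math import sqrt
from statistics import NormalDist

# Inverse of the standard normal CDF (the "z-transform" of a proportion).
_z = NormalDist().inv_cdf


def corrected_rate(k, n):
    """Log-linear corrected proportion; keeps the rate strictly inside (0, 1)."""
    return (k + 0.5) / (n + 1.0)


def dprime_yes_no(hits, n_signal, false_alarms, n_noise):
    """Yes/no sensitivity: d' = z(H) - z(F)."""
    h = corrected_rate(hits, n_signal)
    f = corrected_rate(false_alarms, n_noise)
    return _z(h) - _z(f)


def criterion_yes_no(hits, n_signal, false_alarms, n_noise):
    """Yes/no response bias: c = -(z(H) + z(F)) / 2; c > 0 means a bias toward 'no'."""
    h = corrected_rate(hits, n_signal)
    f = corrected_rate(false_alarms, n_noise)
    return -0.5 * (_z(h) + _z(f))


def dprime_2afc(left_resp_left, n_left, left_resp_right, n_right):
    """2AFC sensitivity, treating 'left' responses on left trials as hits and
    'left' responses on right trials as false alarms: d' = (z(H) - z(F)) / sqrt(2)."""
    h = corrected_rate(left_resp_left, n_left)
    f = corrected_rate(left_resp_right, n_right)
    return (_z(h) - _z(f)) / sqrt(2)


# Invented counts for illustration only.
conservative = criterion_yes_no(hits=20, n_signal=100, false_alarms=5, n_noise=100)
sensitivity = dprime_yes_no(hits=20, n_signal=100, false_alarms=5, n_noise=100)
two_afc = dprime_2afc(left_resp_left=80, n_left=100, left_resp_right=20, n_right=100)
print(conservative, sensitivity, two_afc)
```

      In the invented yes/no example, a low hit rate coexists with a positive d' because the criterion is conservative, which is the kind of pattern that a raw "no" response rate cannot distinguish from genuine insensitivity.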

      This last point (as well as the first) is relevant to Hirschhorn et al. 2024. Hirschhorn et al. write that they “used two awareness measures. Firstly, participants were asked to rate stimulus visibility on the Perceptual Awareness Scale (PAS, a subjective measure of awareness: Ramsøy & Overgaard, 2004), and then they were asked to select the stimulus image from an array of four images (an objective measure: Jakel & Wichmann, 2006).”

      While certainly an improvement on simple yes/no questioning, the PAS remains subject to response bias. On the other hand, we applaud Hirschhorn et al.’s use of objective measures in the context of IB which of course our design implements. However, while Hirschhorn et al. 2024 suggest that their task is a spatial 4afc following the recommendation of this design by Jakel & Wichmann (2006), it is strictly a 4-alternative delayed match to sample task, so it is doubtful if it can be considered a preferred psychophysical task for the reasons Jakel & Wichmann offer. Regardless, the more crucial point is that observers in such a task might be biased towards one alternative as opposed to another. Thus, use of d′ (as opposed to percent correct as in Hirschhorn et al. 2024) is crucial in assessing performance in such tasks.

      For all these reasons, then, while we agree that the field has taken significant steps to move beyond the simple yes/no question traditionally used in IB studies (and we have revised our manuscript to make this clear), we do not think it has resolved the methodological issues which our paper seeks to highlight and address, and we believe that our approach contributes something additional that is not yet present in the literature. We have now revised our manuscript to make these points much more clearly, and we thank the reviewer for prompting these improvements.

      Reviewer #2 (Public review):

      In this study, Nartker et al. examine how much observers are conscious of using variations of classic inattentional blindness studies. The key idea is that rather than simply asking observers if they noticed a critical object with one yes/no question, the authors also ask follow-up questions to determine if observers are aware of more than the yes/no questions suggest. Specifically, by having observers make forced choice guesses about the critical object, the authors find that many observers who initially said "no" they did not see the object can still "guess" above chance about the critical object's location, color, etc. Thus, the authors claim that prior claims of inattentional blindness are mistaken and that using such simple methods has led numerous researchers to overestimate how little observers see in the world. To quote the authors themselves, these results imply that "inattentionally blind subjects consciously perceive these stimuli after all... they show sensitivity to IB stimuli because they can see them."

      Before getting to a few issues I have with the paper, I do want to make sure to explicitly compliment the researchers for many aspects of their work. Getting massive amounts of data, using signal detection measures, and the novel use of a "super subject" are all important contributions to the literature that I hope are employed more in the future.

      We really appreciate this comment and that the reviewer found our work to make these important contributions to the literature. We wrote this paper expecting not everyone to accept our conclusions, but hoping that readers would see the work as making a valuable contribution to the literature promoting an underexplored alternative in a compelling way. Given that this reviewer goes on to express some skepticism about our claims, it is especially encouraging to see this positive feedback up top!

      Main point 1: My primary issue with this work is that I believe the authors are misrepresenting the way people often perform inattentional blindness studies. In effect, the authors are saying, "People do the studies 'incorrectly' and report that people see very little. We perform the studies 'correctly' and report that people see much more than previously thought." But the way previous studies are conducted is not accurately described in this paper. The authors describe previous studies as follows on page 3:

      "Crucially, however, this interpretation of IB and the many implications that follow from it rest on a measure that psychophysics has long recognized to be problematic: simply asking participants whether they noticed anything unusual. In IB studies, awareness of the unexpected stimulus (the novel shape, the parading gorilla, etc.) is retroactively probed with a yes/no question, standardly, "Did you notice anything unusual on the last trial which wasn't there on previous trials?". Any subject who answers "no" is assumed not to have any awareness of the unexpected stimulus."

      If this quote were true, the authors would have a point. Unfortunately, I do not believe it is true. This is simply not how many inattentional blindness studies are run. Some of the most famous studies in the inattentional blindness literature do not simply ask observers a yes/no question (e.g., the invisible gorilla (Simons et al. 1999), the classic door study where the person changes (Simons and Levin, 1998), the study where observers do not notice a fight happening a few feet from them (Chabris et al., 2011)). Instead, these papers consistently ask a series of follow-up questions and even tell the observers what just occurred to confirm that observers did not notice that critical event (e.g., "If I were to tell you we just did XYZ, did you notice that?"). In fact, after a brief search on Google Scholar, I was able to relatively quickly find over a dozen papers that do not just use a yes/no procedure, and instead ask a series of multiple questions to determine if someone is inattentionally blind. In no particular order, some of these papers (full disclosure: including my own):

      (1) Most et al. (2005) Psych Review

      (2) Drew et al. (2013) Psych Science

      (3) Drew et al. (2016) Journal of Vision

      (4) Simons et al. (1999) Perception

      (5) Simons and Levin (1998) Perception

      (6) Chabris et al. (2011) iPerception

      (7) Ward & Scholl (2015) Psych Bulletin and Review

      (8) Most et al. (2001) Psych Science

      (9) Todd & Marois (2005) Psych Science

      (10) Fougnie & Marois (2007) Psych Bulletin and Review

      (11) New and German (2015) Evolution and Human Behaviour

      (12) Jackson-Nielsen (2017) Consciousness and cognition

      (13) Mack et al. (2016) Consciousness and cognition

      (14) Devue et al. (2009) Perception

      (15) Memmert (2014) Cognitive Development

      (16) Moore & Egeth (1997) JEP:HPP

      (17) Cohen et al. (2020) Proc Natl Acad Sci

      (18) Cohen et al. (2011) Psych Science

      This is a critical point. The authors' key idea is that when you ask more than just a simple yes/no question, you find that other studies have overestimated the effects of inattentional blindness. But none of the studies listed above only asked simple yes/no questions. Thus, I believe the authors are mis-representing the field. Moreover, many of the studies that do much more than ask a simple yes/no question are cited by the authors themselves! Furthermore, as far as I can tell, the authors believe that if researchers do these extra steps and ask more follow-ups, then the results are valid. But since so many of these prior studies do those extra steps, I am not exactly sure what is being criticized.

      To make sure this point is clear, I'd like to use a paper of mine as an example. In this study (Cohen et al., 2020, Proc Natl Acad Sci USA) we used gaze-contingent virtual reality to examine how much color people see in the world. On the critical trial, the part of the scene they fixated on was in color, but the periphery was entirely in black and white. As soon as the trial ended, we asked participants a series of questions to determine what they noticed. The list of questions included:

      (1) "Did you notice anything strange or different about that last trial?"

      (2) "If I were to tell you that we did something odd on the last trial, would you have a guess as to what we did?"

      (3) "If I were to tell you we did something different in the second half of the last trial, would you have a guess as to what we did?"

      (4) "Did you notice anything different about the colors in the last scene?"

      (5) We then showed observers the previous trial again and drew their attention to the effect and confirmed that they did not notice that previously.

      In a situation like this, when the observers are asked so many questions, do the authors believe that "the inattentionally blind can see after all?" I believe they would not say that and the reason they would not say that is because of the follow-up questions after the initial yes/no question. But since so many previous studies use similar follow-up questions, I do not think you can state that the field is broadly overestimating inattentional blindness. This is why it seems to me to be a bit of a strawman: most people do not just use the yes/no method.

      We appreciate this reviewer raising this issue. As he (Dr. Cohen) states, his “primary issue” concerns our discussion of the broader literature (which he worries understates recent improvements made to the IB methodology), rather than, e.g., the experiments we’ve run. We take this concern very seriously and address it comprehensively here.

      A very similar issue is identified by Reviewer #1, comment (7). To review some of what we say in reply to them: To address the concern we have added to our manuscript a substantial new discussion of such improved methods. However, although we do agree that these methods can be helpful and may well address some of the methodological concerns which our paper raises, we do not think that they are a panacea. Thus, our discussion of these methods also includes a substantial discussion of the problems and pitfalls with such methods which led us to favor our own simple forced-response and 2afc questions, combined with SDT analysis. We think this approach is superior both to the classic approach in IB studies and to the approach raised by the reviewers.

      In particular, we have three main concerns about the follow up questions now commonly used in the field:

      First, many follow up questions are used not to exclude subjects from the IB group but to include subjects in the IB group. Thus, Most et al. (2001) asked follow up questions but used these to increase their IB group, only excluding subjects from the IB group if they both reported seeing and answered their follow ups correctly: “Observers were regarded as having seen the unexpected object if they answered ‘yes’ when asked if they had seen anything on the critical trial that had not been present before and if they were able to describe its color, motion, or shape.” This means that subjects who saw the object but failed to describe it in these respects would be treated as inattentionally blind. This is problematic since failure to describe a feature (e.g., color, shape) does not imply a complete lack of information concerning that feature; and even if a subject did lack all information concerning these features of an object, this would not imply a complete failure to see the object. Similarly, Pitts et al. (2012) asked subjects to rate their confidence in their initial yes/no response from 1 = least confident to 5 = most confident, and used these ratings to include in the IB group those who rated their confidence in seeing at 3 or less. This is evidently problematic, since there is a large gap between being underconfident that one saw something and being completely blind to it. More generally, using follow ups to inflate IB rates in such ways raises precisely the kinds of issues our paper is intended to critique. So in our view this isn’t an improvement but rather part of the approach we take issue with.
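
      To make the inclusion-rule concern concrete, the following is a minimal Python sketch that compares two ways of assigning subjects to the IB group. The response data and the names `Response`, `is_ib_simple` and `is_ib_with_followup` are hypothetical illustrations, not taken from any of the cited studies; the second rule only paraphrases the criterion quoted above from Most et al. (2001).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    said_yes: bool           # answered "yes" to the initial noticing question
    described_feature: bool  # correctly described color, motion, or shape


def is_ib_simple(r: Response) -> bool:
    """Classic rule: IB if the subject answered "no" to the noticing question."""
    return not r.said_yes


def is_ib_with_followup(r: Response) -> bool:
    """Follow-up rule: counted as a noticer only if "yes" AND a feature was described."""
    return not (r.said_yes and r.described_feature)


# Hypothetical sample of 12 subjects (assumed numbers, for illustration only).
responses = (
    [Response(True, True)] * 6      # said yes and described a feature
    + [Response(True, False)] * 2   # said yes but could not describe a feature
    + [Response(False, False)] * 4  # said no
)

ib_simple = sum(is_ib_simple(r) for r in responses)
ib_followup = sum(is_ib_with_followup(r) for r in responses)
print(f"IB group, simple rule: {ib_simple}; with follow-up rule: {ib_followup}")
```

      Under these assumed numbers the follow-up rule can only enlarge the IB group, because every subject who says "yes" but fails the description test is moved into it.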

      Second, many follow up questions remain yes/no questions or nearby variants, all of which are subject to response bias. For example, in the reviewer’s own studies (Cohen et al. 2020, 2011; see also: Simons et al., 1999; Most et al., 2001, 2005; Drew et al., 2013; Memmert, 2014) a series of follow up questions are used to try and ensure that subjects who noticed the critical stimuli are given the maximum opportunity to report doing so, e.g.:

      (1) “Did you notice anything strange or different about that last trial?”

      (2) “If I were to tell you that we did something odd on the last trial, would you have a guess as to what we did?”

      (3) “If I were to tell you we did something different in the second half of the last trial, would you have a guess as to what we did?”

      (4) “Did you notice anything different about the colors in the last scene?”

      We certainly agree that such follow up questions improve over a simple yes/no question in some ways. However, such follow up probes nonetheless remain yes/no questions, intrinsically subject to response bias. Indeed, follow up questions of this kind can be especially susceptible to bias, since subjects may be reluctant to “take back” their earlier answers and so be conservative in responding positively to avoid inconsistency or acknowledgement of earlier error. This may explain why such follow up questions produce remarkable consistency despite their rather different wording. Thus, Simons and Chabris (1999) report: “Although we asked a series of questions escalating in specificity to determine whether observers had noticed the unexpected event, only one observer who failed to report the event in response to the first question (‘did you notice anything unusual?’) reported the event in response to any of the next three questions (which culminated in ‘did you see a ... walk across the screen?’). Thus, since the responses were nearly always consistent across all four questions, we will present the results in terms of overall rates of noticing.” Thus, while there are undoubtedly merits to these follow ups, they do not resolve problems of bias.

      It is also important to recognize that whereas 2afc questions are criterion free (in that they naturally have an unbiased decision rule), this is not true of n-afc or delayed n-alternative match-to-sample designs in general. Performance in such tasks thus requires SDT analysis – which itself may be problematic if the decision space is not properly understood or requires making substantial assumptions about observer strategy.
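
      As an illustration of what SDT analysis of a yes/no question involves, here is a minimal Python sketch that separates sensitivity (d′) from response criterion (c) under the standard equal-variance Gaussian model. The function name `dprime_and_criterion`, the log-linear correction and the counts are assumptions made for this example; they are not the analysis code or data of any study discussed here.

```python
from statistics import NormalDist


def dprime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Estimate d' and criterion c from yes/no counts (equal-variance Gaussian SDT).

    A log-linear correction (add 0.5 to each cell) keeps the rates away from
    0 and 1, where the inverse normal CDF is undefined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # c > 0 means a bias toward "no"
    return d_prime, criterion


# Hypothetical counts: few "yes" responses on both present and absent trials.
d_prime, criterion = dprime_and_criterion(
    hits=30, misses=70, false_alarms=5, correct_rejections=95
)
print(f"d' = {d_prime:.2f}, c = {criterion:.2f}")
```

      With these assumed counts the observer says "yes" on only 30% of stimulus-present trials, yet the estimated d′ is positive and c is positive, which is the pattern of genuine sensitivity combined with a conservative criterion.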

      Third, and finally, many follow up questions are insufficiently sensitive (especially with small sample sizes). For instance, Todd, Fougnie & Marois (2005) used a 12-alternative match-to-sample task (see similarly: Fougnie & Marois, 2007; Devue et al., 2009). And Most et al. (2005) asked an open-response follow-up: “If you did see something on the last trial that had not been present during the first two trials, what color was it? If you did not see something, please guess.” These questions are more difficult and to that extent less sensitive than binary forced-response/2afc questions of the sort we use in our own studies – a difference which may be critical in uncovering degraded perceptual sensitivity.
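
      One way to see why many-alternative questions are harder is to compute the expected proportion correct for a fixed underlying sensitivity. The sketch below assumes an idealized m-alternative forced-choice model with independent, equal-variance Gaussian evidence and a pick-the-maximum decision rule; real match-to-sample tasks may not satisfy these assumptions, and the function name `expected_accuracy` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def expected_accuracy(d_prime: float, m: int, step: float = 0.01) -> float:
    """Expected proportion correct in m-AFC: integral of phi(x - d') * Phi(x)**(m - 1).

    The integral is approximated with the trapezoid rule over a range wide
    enough that the Gaussian tails are negligible (assumes d_prime >= 0).
    """
    std = NormalDist()
    lo, hi = -8.0, 8.0 + d_prime
    n = int(round((hi - lo) / step))
    total = 0.0
    for i in range(n + 1):
        x = lo + i * step
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * std.pdf(x - d_prime) * std.cdf(x) ** (m - 1)
    return total * step


# Same assumed sensitivity, different numbers of alternatives.
for m in (2, 4, 12):
    print(m, round(expected_accuracy(1.0, m), 3), "chance =", round(1 / m, 3))
```

      Under this model accuracy at a fixed d′ falls as the number of alternatives grows, so the same degraded signal yields fewer correct responses in a 12-alternative task than in a 2afc task.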

      For all these reasons, then, while we agree that the field has taken significant steps to move beyond the simple yes/no question traditionally used in IB studies (and we have revised our manuscript to make this clear), we do not think it has resolved the methodological issues which our paper seeks to highlight and address. We believe that our approach of using 2afc or forced-response questions combined with signal detection analysis is an important improvement on prior methods and contributes something additional that is not yet present in the literature. We have now revised our manuscript to make these points much clearer.

      Other studies that improve on the standard methodology

      This reviewer adds something else, however: A very helpful list of 18 papers which include follow ups and that he believes overcome many of the issues we raise in our paper. To just state our reaction bluntly: We are familiar with every one of these papers (indeed, one of them is a paper by one of us!), and while we think these are all very valuable contributions to the literature, it is our view that none of these 18 papers resolves the worries that led us to conduct our work.  

      Here we briefly comment on the relevant pitfalls in each case. We hope this serves to underscore the importance of our methodological approach.

      (1) Most et al. (2005) Psych Review

      Either a 2-item or 5-item questionnaire was used. The 2-item questionnaire ran as follows:

      (1) On the last trial, did you see anything other than the 4 circles and the 4 squares (anything that had not been present on the original two trials)? Yes No 

      (2) If you did see something on the last trial that had not been present during the original two trials, please describe it in as much detail as possible.

      This clearly does not substantially improve on the traditional simple yes/no question. Moreover, the second question (as well as being open-ended) was used to include additional subjects in the IB group, in that participants were counted as having seen the object only if they responded “yes” to Q1 and in addition “were able to report at least one accurate detail” in response to Q2. In other words, either a subject says “no” (and is treated as unaware), or says “yes” and then is asked to prove their awareness, as it were. If anything, this intensifies the concerns we raise, by inflating IB rates. 

      The 5-item questionnaire looked like this: 

      (1) On the last trial, did you see anything other than the black and white L’s and T’s (anything that had not been present on the first two trials)?

      (2) If you did see something on the last trial that had not been present during the first two trials, please describe it.

      (3) If you did see something on the last trial that had not been present during the first two trials, what color was it? If you did not see something, please guess. (Please indicate whether you did see something or are guessing)

      (4) If you did see something during the last trial that had not been present in the first two trials, please draw an arrow on the “screen” below showing the direction in which it was moving. If you did not see something, please guess. (Please indicate whether you did see something or are guessing)

      (5) If you did see something during the last trial that had not been present during the first two trials, please circle the shape of the object below [4 shapes are presented to choose from]. If you did not see anything, please guess. (Please indicate whether you did see something or are guessing)

      Q5 was not used for analysis purposes. (It suffers from the second issue raised above.) Q1 is the traditional y/n question. Qs 2&3 are open ended. It is unclear how responses to Q4 were analyzed (at the limit it could be considered a helpful, forced-choice question – though it again would suffer from the second issue raised above). However, as noted with respect to the 2-item questionnaire, these responses were not used to exclude people from the IB group but to include people in it. So again, this approach does not in any way address the issues we are concerned about, and if anything, only makes them worse. 

      (2)  Drew et al. (2013) Psych Science

      All follow ups were yes/no: “we asked a series of questions to determine whether they noticed the gorilla: ‘Did the final trial seem any different than any of the other trials?’, ‘Did you notice anything unusual on the final trial?’, and, finally, ‘Did you see a gorilla on the final trial?’”. So, this paper essentially implements the standard methodology we mention (and criticize). 

      (3)  Drew et al. (2016) Journal of Vision

      Follow up questions were used, but the reported procedure does not provide sufficient details to evaluate them (we are only told: “After the final trial, they were asked: ‘On that last trial of the task, did you notice anything that was not there on previous trials?’ They then answered questions about the features of the unexpected stimulus on a separate screen (color, shape, movement, and direction of movement).”). It is not clear that these follow ups were used to exclude any subjects from the analysis. Finally, given that the unexpected object could be the same color as the targets/distractors, it is clear that biases would have been introduced which would need to be considered (but which were not).

      (4)  Simons & Chabris (1999) Perception

      All follow ups were yes/no: “observers were … asked to provide answers to a surprise series of additional questions. (i) While you were doing the counting, did you notice anything unusual on the video? (ii) Did you notice anything other than the six players? (iii) Did you see anyone else (besides the six players) appear on the video? (iv) Did you see a gorilla [woman carrying an umbrella] walk across the screen? After any ‘yes’ response, observers were asked to provide details of what they noticed. If at any point an observer mentioned the unexpected event, the remaining questions were skipped.” As noted previously, the analyses in fact did not use these questions to exclude subjects since answers were so consistent.

      (5)  Simons and Levin (1998) Perception

      This is a change detection paradigm, not a study of inattentional blindness. And in any case, one yes/no follow up was used: “Did you notice that I'm not the same person who approached you to ask for directions?”

      (6)  Chabris et al. (2011) iPerception

      Two yes/no questions were asked: “we asked whether the subjects had seen anything unusual along the route, and then whether they had seen anyone fighting.” It seems that follow up questions (a request to describe the fight) were asked only of those who said yes.

      This is in fact a common procedure – follow up questions only being asked of the “yes” group. As discussed, it is sometimes used to increase rates of IB, compounding the problem we identify in our paper. So this is another example of a follow-up question that makes the problem we identify worse, not better.

      (7) Ward & Scholl (2015) Psych Bulletin and Review

      Two yes/no questions were used: “...observers were asked whether they noticed ‘anything … that was different from the first three trials’ — and if so, to describe what was different. They were then shown the gray cross and asked if they had noticed it—and if so, to describe where it was and how it moved. Only observers who explicitly reported not noticing the cross were counted as ‘nonnoticers’ to be included in the final sample (N = 100).” In each case, combining the traditional noticing question with a request to describe and identify may have induced conservative response biases in the noticing question, since a subject might consider being able to describe or identify the unexpected stimulus a precondition of giving a positive answer to the noticing question.

      (8) Most et al. (2001) Psych Science

      The same 5-item questionnaire discussed above in relation to Most et al. (2005) was used: 

      (1) On the last trial, did you see anything other than the black and white L’s and T’s (anything that had not been present on the first two trials)?

      (2) If you did see something on the last trial that had not been present during the first two trials, please describe it.

      (3) If you did see something on the last trial that had not been present during the first two trials, what color was it? If you did not see something, please guess. (Please indicate whether you did see something or are guessing)

      (4) If you did see something during the last trial that had not been present in the first two trials, please draw an arrow on the “screen” below showing the direction in which it was moving. If you did not see something, please guess. (Please indicate whether you did see something or are guessing)

      (5) If you did see something during the last trial that had not been present during the first two trials, please circle the shape of the object below [4 shapes are presented to choose from]. If you did not see anything, please guess. (Please indicate whether you did see something or are guessing)

      Q5 was not used for analysis purposes. (It suffers from the second issue raised above.) Q1 is the traditional yes/no question. Qs 2&3 are open ended. It is unclear how responses to Q4 were analyzed (at the limit it could be considered a helpful, forced-choice question – though it again would suffer from the second issue raised above). However, as noted with respect to the 2-item questionnaire in Most et al. (2005), these responses were not used to exclude people from the IB group but to include people in it. So again, this approach does not in any way address the issues we are concerned about, and if anything, only makes them worse.

      (9) Todd, Fougnie & Marois (2005) Psych Science

      “participants were probed with three questions to determine whether they had detected the critical stimulus ... The first question assessed whether subjects had seen anything unusual during the trial; they responded ‘yes’ or ‘no’ by pressing the appropriate key on the keyboard. The second question asked participants to select which stimulus they might have seen among 12 possible objects and symbols selected from MacIntosh font databases. The third question asked participants to select the quadrant in which the critical stimulus may have appeared by pressing one of four keys, each of which corresponded to one of the quadrants.”

      These follow ups were used to include people in the IB group: “In keeping with previous studies (Most et al., 2001), participants were considered to have detected the critical stimulus successfully if they (a) reported seeing an unexpected stimulus and (b) correctly selected its quadrant location.” In line with our third point about sensitivity, the object identity test transpired to be “too difficult even under full-attention conditions … Thus, performance with this question was not analyzed further.”

      (10) Fougnie & Marois (2007) Psych Bulletin and Review

      Same exact methods and problems as with Todd, Fougnie & Marois (2005) Psych Science, just discussed.

      (11) New and German (2015) Evolution and Human Behaviour

      “After the fourth trial containing the additional experimental stimulus, the participant was asked, “Did you see anything in addition to the cross on that trial?” and which quadrant the additional stimulus appeared in. They were then asked to identify the stimulus in an array which in Experiment 1 included two variants chosen randomly from the spider stimuli and the two needle stimuli. Participants in Experiment 2 picked from all eight stimuli used in that experiment.”

      Our second concern about response biases and the need for appropriate SDT analysis of the 4/8 alternative tasks applies to all these questions. We also note that analyses were only performed on groups separately (those who detected/failed to detect, those who located/failed to locate, and those who identified/failed to identify) and on the group which did all three/failed to do any one of the three. Especially in light of the fact that some subjects could clearly detect the stimulus without being able to identify it (e.g.), the most stringent test given our concerns (which were not obviously New and German’s comparative concerns) would be to consider the group which could not detect, identify or localize.

      (12) Jackson-Nielsen (2017) Consciousness and cognition

      This is a very interesting example of a follow-up which used a 3-AFC recognition test:

      “participants were immediately asked, ‘which display looks most like what you just saw?’ from 3 alternatives”. However, though such an objective test is definitely to be preferred in our view to an open-ended series of probes, the 3-AFC test administered clearly had issues with response biases, as discussed, and actually yielded significantly below chance performance in one of the experiments.

      (13) Mack et al. (2016) Consciousness and cognition

      The follow ups here were essentially yes/no combined with an assessment of surprise. Participants were asked to enter letters into a box, and if they did so “were immediately asked by the experimenter whether they had noticed anything different about the array on this last trial and if they did not, they were told that there had been no letters and their responses to that news were recorded. Clearly, if they expressed surprise, this would be compelling evidence that they were unaware of the absence of the letters. Those observers who did not enter letters and realized there were no letters present were considered aware of the absence.” So, this again has all of the same problems we identify, considering subjects unaware because they expressed surprise.

      (14) Devue et al. (2009) Perception

      An 8-alternative task was used. The authors were primarily interested in a comparative analysis and so did not use this task to exclude subjects. We note that an 8-alternative task is very demanding – compare the 12-alternative task used in Todd, Fougnie & Marois (2005). There was an attempt to investigate biases in a separate bias trial; however, SDT measures were not used.

      (15) Memmert (2014) Cognitive Development

      “After watching the video and stating the number of passes, participants answered four questions (following Simons & Chabris, 1999): (1) While you were counting, did you perceive anything unusual on the video? (2) Did you perceive anything other than the six players? (3) Did you see anyone else (besides the six players) appear on the video? (4) Did you notice a gorilla walk across the screen? After any “yes” reply, children were asked to provide details of what they noticed. If at any point a child mentioned the unexpected event, the remaining questions were omitted.” All of these follow-up questions are yes/no judgments, used to determine awareness in exactly the way we critique as problematic.

      (16) Moore & Egeth (1997) JEP:HPP

      This study (which includes one of us, Egeth, as author) did use forced choice questions. In one case, the question was 2-alternative, in the other it was 4-alternative. In the latter case, SDT would have been appropriate but was not used. In the former case, it may have been that a larger sample would have revealed evidence of sensitivity to the background pattern (as it stood 55% answered the 2-alternative question correctly). Although these results have been replicated, unfortunately the replication in Wood and Simons 2019 used a 6-alternative recognition task and this was not analyzed using SDT. We also note that the task is rather difficult in this study. Wood and Simons report: “Exclusion rates were much higher than anticipated, primarily due to exclusions when subjects failed to correctly report the pattern on the full-attention trial; we excluded 361 subjects, or 58% of our sample.”

      (17) Cohen et al. (2020) Proc Natl Acad Sci

      While this paper improves over a simple yes/no question in some ways, especially in that it used the follow up questions to exclude subjects from the unaware (IB) group, the follow up probes nonetheless remain yes/no questions, subject to response bias, e.g.:

      (1) “Did you notice anything strange or different about that last trial?”

      (2) “If I were to tell you that we did something odd on the last trial, would you have a guess as to what we did?”

      (3) “If I were to tell you we did something different in the second half of the last trial, would you have a guess as to what we did?”

      (4) “Did you notice anything different about the colors in the last scene?”

      Follow up questions of this kind can be especially susceptible to bias, since subjects may be reluctant to “take back” their earlier answers and so be conservative in responding positively to avoid inconsistency or acknowledgement of earlier error. This may explain why such follow up questions can produce remarkable consistency despite their rather different wording. 

      (18) Cohen et al. (2011) Psych Science

      Here are the probes used in this study:

      (1) Did you notice anything different on that trial?

      (2) Did you notice something different about the background stream of images?

      (3) Did you notice that a different type of image was presented in the background that was unique in some particular way?

      (4) Did you see an actual photograph of a natural scene in that stream?

      (5) If I were to tell you that there was a photograph in that stream, can you tell me what it was a photograph of?

      Qs 1-4 are yes/no. Q5 is yes/no with an open-ended response. After this, a 5 or 6-alternative recognition test was administered. So again, this faces the same issues, since y/n questions are subject to bias in the way we have described, and many-alternative tests are more problematic than 2afc tests.
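
      The sample-size point raised above under Moore & Egeth (1997), where 55% of subjects answered the 2-alternative question correctly, can be illustrated with an exact one-sided binomial test against chance. The sample sizes below are hypothetical and are not those of the original study; `p_at_least` is a name introduced for this sketch.

```python
from math import comb


def p_at_least(k: int, n: int) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, 0.5), i.e. pure guessing in a 2-alternative task.

    Integer arithmetic is used for the sum so that large n does not underflow.
    """
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n


# The same observed accuracy (55% correct) at three assumed sample sizes.
for n in (20, 200, 2000):
    k = 55 * n // 100  # number of correct responses at 55% accuracy
    print(f"n = {n:4d}, correct = {k:4d}, one-sided p = {p_at_least(k, n):.2g}")
```

      Under these assumptions, 55% correct is unremarkable in a sample of 20 but would be strong evidence of above-chance performance in a sample of 2000, which is why a small-sample null result in a 2-alternative question is hard to interpret.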

      In summary

      We really appreciate the care that went into compiling this list, and we agree that these papers and the improved methods they contain are relevant. But as hopefully made clear above, the approaches in each of these papers simply don’t solve the foundational issues our critique is aimed at (though they may address other issues). This is why we felt our new approach was necessary. And we continue to feel this way even after reading and incorporating these comments from Dr. Cohen.

      Nevertheless, there is clearly lots for us to do in light of these comments. And so, as noted earlier, we have now added a very substantial new section to our discussion section to more fairly and completely portray the state of the art in this literature. This is really to our benefit in the end, since we now not only better acknowledge the diverse approaches present, but also set ourselves up to make our novel contribution exceedingly clear.

      Main point 2: Let's imagine for a second that every study did just ask a yes/no question and then would stop. So, the criticism the authors are bringing up is valid (even though I believe it is not). I am not entirely sure that above chance performance on a forced choice task proves that the inattentionally blind can see after all. Could it just be a form of subliminal priming? Could there be a significant number of participants who basically would say something like, "No I did not see anything, and I feel like I am just guessing, but if you want me to say whether the thing was to the left or right, I will just 100% guess"? I know the literature on priming from things like change and inattentional blindness is a bit unclear, but this seems like maybe what is going on. In fact, maybe the authors are getting some of the best priming from inattentional blindness because of their large sample size, which previous studies do not use.

      I'm curious how the authors would relate their studies to masked priming. In masked priming studies, observers say they did not see the target (like in this study) but still are above chance when forced to guess (like in this study). Do the researchers here think that that is evidence of "masked stimuli are truly seen" even if a participant openly says they are guessing?

      We’re grateful to the reviewer for raising this question. As we say in response to Reviewer #1, our primary ambition in the paper is to establish, as our title suggests, residual sensitivity in IB. The ambition is quite neutral as to whether the sensitivity reflects conscious or unconscious processing (i.e. is akin to blindsight as traditionally conceived, or what the reviewer here suggests may be happening in masked priming). Since we were evidently insufficiently clear about this we have revised our manuscript in several places to clarify that we take our data primarily to support the more modest claim that there is residual sensitivity (conscious or unconscious) in the group of subjects who are traditionally classified as inattentionally blind. We believe that this claim has much more solid support in our data than our secondary and tentative suggestion about awareness.

      This said, we do consider masked priming studies to be susceptible to the critique that performance may reflect degraded conscious awareness which is unreported because of conservative response criteria. There is good evidence that response criteria tend to be conservative near threshold (Björkman et al. 1993; see also: Railo et al. 2020), including specifically in masked priming studies (Sand 2016, cited in Phillips 2021). So, we consider it a perfectly reasonable hypothesis that subjects who say they feel they are guessing in fact have conscious access to a degraded signal which is insufficient to reach a conservative response criterion but nonetheless sufficient to perform above chance in 2afc detection. Of course, we appreciate that this hypothesis is controversial, so it is not one we argue for in our paper (though we are happy to share our feelings about it here).
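
      The hypothesis in the preceding paragraph can be illustrated with a small forward model. The sketch below assumes equal-variance Gaussian SDT, with evidence on stimulus-present trials distributed as N(d′, 1) and a "yes" response given only when evidence exceeds the criterion. The parameter values and function names are hypothetical, and the 2afc figure is the unconditional expected accuracy, not the accuracy of the subset of subjects who said "no".

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()


def yes_rate(d_prime: float, criterion: float) -> float:
    """P("yes" | stimulus present) when evidence ~ N(d', 1) must exceed the criterion."""
    return 1.0 - STD.cdf(criterion - d_prime)


def two_afc_accuracy(d_prime: float) -> float:
    """Expected 2afc proportion correct for an unbiased observer: Phi(d' / sqrt(2))."""
    return STD.cdf(d_prime / sqrt(2))


d_prime = 1.0      # assumed weak but nonzero sensitivity
conservative = 2.0  # assumed criterion placed well above the signal mean

print(f'P("yes") with conservative criterion: {yes_rate(d_prime, conservative):.3f}')
print(f"Expected 2afc accuracy:               {two_afc_accuracy(d_prime):.3f}")
```

      With these assumed values, fewer than one in five stimulus-present trials would produce a "yes" report, while 2afc accuracy would sit well above 50%, so low report rates and above-chance forced-choice performance are jointly compatible with a single degraded signal.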

      Main point 3: My last question is about how the authors interpret a variety of inattentional blindness findings. Previous work has found that observers fail to notice a gorilla in a CT scan (Drew et al., 2013), a fight occurring right in front of them (Chabris et al., 2011), a plane on a runway that pilots crash into (Haines, 1991), and so forth. In a situation like this, do the authors believe that many participants are truly aware of these items but simply failed to answer a yes/no question correctly? For example, imagine the researchers made participants choose if the gorilla was in the left or right lung and some participants who initially said they did not notice the gorilla were still able to correctly say if it was in the left or right lung. Would the authors claim "that participant actually did see the gorilla in the lung"? I ask because it is difficult to understand what it means to be aware of something as salient as a gorilla in a CT scan, but say "no" you didn't notice it when asked a yes/no question. What does it mean to be aware of such important, ecologically relevant stimuli, but not act in response to them and openly say "no" you did not notice them?

      Our view is that in such cases, observers may well have a “degraded” percept of the relevant feature (gorilla, plane, fight etc.). But crucially we do not suggest that this percept is sufficient for observers to recognize the object/event as a gorilla, plane, fight etc. Our claim is only that, in our studies at least, observers (as a group) do have enough information about the unexpected stimuli to locate them, and discriminate certain low level features better than chance. Crudely, it may be that subjects see the gorilla simply as a smudge or the plane as a shadowy patch etc. (One of us who is familiar with the gorilla CT scan stimuli notes that the gorilla is in fact rather hard to see even when you know which slide it is on, suggesting that it is not as “salient” as the reviewer suggests!)

      More precisely, in the paper we write that in our view perhaps “...unattended stimuli are encoded in a partial or degraded way. Here we see a variety of promising options for future work to investigate. One is that unattended stimuli are only encoded as part of ensemble representations or summary scene statistics (Rosenholtz, 2011; Cohen et al., 2016). Another is that only certain basic “low-level” or “preattentive” features (see Wolfe & Utochkin, 2019 for discussion) can enter awareness without attention. A final possibility consistent with the present data is that observers can in principle be aware of individual objects and higher-level features under inattention but that the precision of the corresponding representations is severely reduced. Our central aim here is to provide evidence that awareness in inattentional blindness is not abolished. Further work is needed to characterize the exact nature of that awareness.” We hope this sheds light on our perspective while still being appropriately cautious not to go too far beyond our data.

      Overall: I believe there are many aspects of this set of studies that are innovative and I hope the methods will be used more broadly in the literature. However, I believe the authors misrepresent the field and overstate what can be interpreted from their results. While I am sure there are cases where more nuanced questions might reveal inattentional blindness is somewhat overestimated, claims like "the inattentionally blind can see after all" or "Inattentionally blind subjects consciously perceive these stimuli after all" seem to be incorrect (or at least not at all proven by this data).

      Once again, we would like to thank this reviewer for his feedback, which obviously comes from a place of tremendous expertise on these issues. We appreciate his assessment that our studies are innovative and that our methodological advances will be of use more broadly. We also hear the reviewer loud and clear about the passages in question, which on reflection we agree are not as central to our case as the other claims we make (regarding residual sensitivity and conservative responding), and so we have now edited them accordingly to refocus our discussion on only those claims that are central and supported. Thank you for making our paper stronger!

      Reviewer #3 (Public review):

      Summary:

      Authors try to challenge the mainstream scientific as well as popularly held view that Inattentional Blindness (IB) signifies subjects having no conscious awareness of what they report not seeing (after being exposed to unexpected stimuli). They show that even when subjects indicate NOT having seen the unexpected stimulus, they are at above chance level for reporting features such as location, color or movement of these stimuli. Also, they show that 'not seen' responses are in part due to a conservative bias of subjects, i.e. they tend to say no more than yes, regardless of actual visibility. Their conclusion is that IB may not (always) be blindness, but possibly amnesia, uncertainty etc.

      We just wanted to say that we felt this was a very accurate summary of our claims, and in some ways it underscores the modesty we had hoped to convey. This is especially true of the reviewer’s final sentence: “Their conclusion is that IB may not (always) be blindness, but possibly amnesia, uncertainty etc.”; as we noted in response to other reviewers, our claim is not that IB doesn’t exist, that subjects are always conscious of the stimulus, etc.; it is only that the cohort of IB subjects shows sensitivity to the unattended stimulus in ways that suggest they are not as blind as traditionally conceived. Thank you for reading us as intended!

      Strengths:

      A huge pool of (25,000) subjects is used. They perform several versions of the IB experiments, both with briefly presented stimuli (as in the classic Mack and Rock paradigm), as well as with prolonged stimuli moving over the screen for 5 seconds (a bit like the famous gorilla version), and all these versions show similar results, pointing in the same direction: above chance detection of unseen features, as well as conservative bias towards saying not seen.

      We’re delighted that the reviewer appreciated these strengths in our manuscript!

      Weaknesses:

      Results are all significant but effects are not very strong, typically a bit above chance. Also, it is unclear what to compare these effects to, as there are no control experiments showing what performance would have been in a dual task version where subjects also have to report features etc. for stimuli that they know will appear in some trials.

      The backdrop to the experiments reported here is the “consensus view” (Noah & Mangun, 2020) according to which inattention completely abolishes perception, such that subjects undergoing IB “have no awareness at all of the stimulus object” (Rock et al., 1992) and that “one can have one’s eyes focused on an object or event … without seeing it at all” (Carruthers, 2015). In this context, we think our findings of significant above-chance sensitivity (e.g., d′ = 0.51 for location in Experiment 1; chance, of course, would be d′ = 0 here) are striking and constitute strong evidence against the consensus view. We of course agree that the residual sensitivity is far lower than amongst subjects who noticed the stimulus. For this reason, we certainly believe that inattention has a dramatic impact on perception. To that extent, our data speak in favor of a “middle ground” view on which inattention substantially degrades but crucially does not abolish perception/explicit encoding. We see this as an importantly neglected option in a literature which has overly focused on seen/not seen binaries (see our section ‘Visual awareness as graded’).
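      For readers less familiar with signal detection measures, the two quantities at issue in this exchange (sensitivity, d′, and response bias, c) can be sketched as follows. This is a minimal illustration using the standard equal-variance yes/no definitions; it is not the authors' analysis code, the exact computation behind the reported d′ = 0.51 is not given in this response, and the rates used below are hypothetical.

```python
from statistics import NormalDist

# Inverse of the standard normal CDF (the "z-transform" of a proportion).
_z = NormalDist().inv_cdf


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity: d' = z(H) - z(F). Zero means chance performance."""
    return _z(hit_rate) - _z(false_alarm_rate)


def criterion(hit_rate: float, false_alarm_rate: float) -> float:
    """Response bias: c = -(z(H) + z(F)) / 2. Positive values indicate
    a conservative tendency to answer "no"."""
    return -0.5 * (_z(hit_rate) + _z(false_alarm_rate))


# Hypothetical rates, chosen only for illustration.
# Both rates must lie strictly between 0 and 1, or inv_cdf raises an error.
hit_rate, false_alarm_rate = 0.30, 0.10
sensitivity = d_prime(hit_rate, false_alarm_rate)
bias = criterion(hit_rate, false_alarm_rate)
```

      In this hypothetical case both values are positive: the group is sensitive to the stimulus (d′ > 0) while also being biased towards "no" responses (c > 0), which is the pattern the authors describe as residual sensitivity combined with conservative responding.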

      Regarding the absence of a control condition, we think those conditions wouldn’t have played the same role in our experiments as they typically play in other experiments. As Reviewer #1 comments, the main role of such trials in previous work has been to exclude from analysis subjects who failed to report the unexpected stimulus on the divided and/or full attention control trials. As Reviewer #1 points out, excluding such subjects would very likely have ‘helped’ us. However, the practice is controversial. Indeed, in a review of 128 experiments, White et al. (2018) argue that the practice has “problematic consequences” and “may lead researchers to understate the pervasiveness of inattentional blindness”. Since we wanted to offer as simple and demanding a test of residual sensitivity in IB as possible, we thus decided not to use any exclusions, and for that reason decided not to include divided/full attention trials.

      As recommended, we discuss this decision not to include divided/full attention trials and our logic for not doing so in the manuscript. As we explain, not having those conditions makes it more impressive, not less impressive, that we observed the results we in fact did — it makes our results more interpretable, not less interpretable, and so absence of such conditions from our manuscript should not (in our view) be considered any kind of weakness.

      There are quite some studies showing that during IB, neural processing of visual stimuli continues up to high visual levels, for example, Vandenbroucke et al 2014 doi:10.1162/jocn_a_00530 showed preserved processing of perceptual inference (i.e. seeing a kanizsa illusion) during IB. Scholte et al 2006 doi: 10.1016/j.brainres.2005.10.051 showed preserved scene segmentation signals during IB. Compared to the strength of these neural signatures, the reported effects may be considered not all that surprising, or even weak.

      We agree that such evidence of neural processing in IB is relevant to — and perhaps indeed consistent with — our picture, and we’re grateful to the reviewer for pointing out further studies along those lines. Previously, we mentioned a study from Pitts et al., 2012 in which, as we wrote, “unexpected line patterns have been found to elicit the same Nd1 ERP component in both noticers and inattentionally blind subjects (Pitts et al., 2012).” We have added references to both the studies which the reviewer mentions – as well as an additional relevant study – to our manuscript in this context. Thank you for the helpful addition.

      We do however think that our studies are importantly different to this previous work. Our question is whether processing under IB yields representations which are available for explicit report and so would constitute clear evidence of seeing, and perhaps even conscious experience. As we discuss, evidence for this kind of processing remains wanting: “A handful of prior studies have explored the possibility that inattentionally blind subjects may retain some visual sensitivity to features of IB stimuli (e.g., Schnuerch et al., 2016; see also Kreitz et al., 2020, Nobre et al., 2020). However, a recent meta-analysis of this literature (Nobre et al., 2022) argues that such work is problematic along a number of dimensions, including underpowered samples and evidence of publication bias that, when corrected for, eliminates effects revealed by earlier approaches, concluding “that more evidence, particularly from well-powered pre-registered experiments, is needed before solid conclusions can be drawn regarding implicit processing during inattentional blindness” (Nobre et al., 2022).” Our paper is aimed at addressing this question which evidence of neural processing can only speak to indirectly.

      Recommendations for the authors:  

      Reviewer #1 (Recommendations for the authors):

      (1) Please report all of the data, especially the number of subjects in each experiment that answered Y/N and the numbers of subjects in each of the Y and N groups that guessed a feature correctly/incorrectly on the 2AFC tasks. And also the confidence ratings for the 2AFC task (for comparison with the confidence ratings on the Y/N questions).

      We now report all this data in our (revised) Supplementary Materials. We agree that this information will be helpful to readers.

      (2) Consider adding a control condition with partial attention (dual task) or full attention (single task) to estimate the rates of seeing the critical stimulus when it's expected.

      This is the only recommendation we have chosen not to implement. The reason, as we explain in detail above (especially in response to Reviewer #1 comment 5), is that this would not in fact be a “control condition” in our studies, and indeed would only inflate the biases we are concerned with in our work. As the referee comments, the main role of such trials in previous work has been to exclude from analysis subjects who failed to report the unexpected stimulus on the divided and/or full attention control trials. And the practice is controversial: Indeed, in a review of 128 experiments, White et al. (2018) argue that the practice has “problematic consequences” and “may lead researchers to understate the pervasiveness of inattentional blindness” (emphasis added). So, our choice not to have such conditions ensures an especially stringent test of our central claim. Not having those conditions (and their accompanying exclusions) makes our results more interpretable, not less interpretable, and so the absence of such conditions from our manuscript should not (in our view) be considered any kind of weakness.

      We have added a paragraph to our “Design and analytical approach” section explaining the logic behind our deliberate decision not to include divided or full attention trials in our experiments. (For even fuller discussion, see our response to Reviewer #1’s comment 5 above.)

      (3) Consider revising the interpretations to be more precise about the distinction between the super subject being above chance versus each individual subject who cannot be at chance or above chance because there was only a single trial per subject.

      We have now done this throughout the manuscript, as discussed above. We have also added a substantive additional discussion to our “Design and analytical approach” section discussing what should be said about individual subjects in light of our group level data.

      This was a very helpful point, and greatly clarifies the claims we wish to make in the paper. Thank you for this comment, which has certainly made our paper stronger.
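      The distinction between the group ("super subject") and the individual can be made concrete with a small sketch. Because each subject contributes a single forced-choice trial, only the pooled responses can be tested against chance. The code below is an illustrative exact binomial tail computation under that assumption; the counts are hypothetical and this is not the authors' analysis.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float = 0.5) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts: one 2AFC trial per non-noticer, pooled across subjects.
n_non_noticers = 200
n_correct = 120

# One-sided test of the pooled group against chance (p = 0.5).
p_value = binomial_tail(n_correct, n_non_noticers)
```

      A small p-value here licenses a claim about the group only. Any single subject's one response is either correct or incorrect, so it cannot by itself be classified as at or above chance.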

      Reviewer #2 (Recommendations for the authors):

      I would be curious to hear the authors' response to two points:

      (1) What do they have to say about prior studies that do more than just ask yes/no questions (and ask several follow-ups)? Are those studies "valid"?

      A very substantial new discussion of this important point has been added. As you will see above, we comment on every one of the 18 papers this reviewer raised (as well as the general argument made); we contend that while many of these papers improve on past methodology in various ways, most in fact do “just ask yes/no questions”, and none of them makes the methodological advance we offer in our manuscript. However, this discussion has helped us clarify that very advance, and so working through this issue has really helped us improve our paper and make its relation to existing literature that much clearer. Thank you for raising this crucial point.

      (2) Do the authors think it is possible that in many cases, people are just guessing about a critical item's location or color and this is at least in part a form of priming?

      We have clarified our discussion in numerous places to further emphasize that our main point concerns above-chance sensitivity, not awareness. Given this, we take very seriously the hypothesis that something like priming of a kind sometimes proposed to occur in cases of blindsight or other putative cases of unconscious perception could be what is driving the responses in non-noticers.

      Reviewer #3 (Recommendations for the authors):

      (1) Control dual task version with expected stimuli would be nice

      We have added a paragraph to our “Design and analytical approach” section explaining the logic behind our deliberate decision not to include divided or full attention trials, which would not in fact be a “control” task in our experiments. For full discussion, see our response to Reviewer 3 above, as well as our summary here in the Recommendations for Authors section in responding to Reviewer 1, recommendation (2).

      (2) Please do a better job in discussing and introducing experiments about neural signatures during IB.

      A discussion of Vandenbroucke et al. 2014 and Scholte et al. 2006 has been added to our discussion of neural signatures in IB, as well as an additional reference to an important early study of semantic processing in IB (Rees et al., 1999). Thank you for these very helpful suggestions!

    1. Joint public review:

      Summary

      In this manuscript, Dong et al. study the directed cell migration of tracheal stem cells in Drosophila pupae. The authors study how the directionality of these cells is regulated along the dorsal trunk. They show that inter-organ communication between the tracheal stem cells and the nearby fat body plays a role in posterior migration. They provide compelling evidence that Upd2 production in the fat body and JAK/STAT activation in the tracheal stem cells play a role. Moreover, they show that JAK/STAT signalling might induce the expression of apicobasal and planar cell polarity genes in the tracheal stem cells which appear to be needed to ensure unidirectional migration. Finally, the authors suggest that trafficking and vesicular transport of Upd2 from the fat body towards the tracheal cells might be important.

      Strengths

      The manuscript is well written and presents extensive and varied experimental data to show a link between Upd2-JAK/STAT signaling from the fat body and tracheal progenitor cell migration. The authors provide convincing evidence that the fat body, located near the trachea, secretes vesicles containing the Upd2 cytokine and that affecting JAK-STAT signaling results in aberrant migration of some of the tracheal stem cells towards the anterior. Using ChIP-seq as well as analysis of GFP-protein trap lines of planar cell polarity genes in combination with RNAi experiments, the authors show that STAT92E likely regulates the transcription of planar cell polarity genes and some apicobasal cell polarity genes in tracheal stem cells which appear to be needed for unidirectional migration. The work presented here provides some novel insights into the mechanism that ensures polarized migration of tracheal stem cells, preventing bidirectional migration. This might have important implications for other types of directed cell migration in invertebrates or vertebrates including cancer cell migration. Overall, the authors have substantially improved their manuscript since the first submission but there are still some weaknesses.

      Weaknesses

      Overall, the manuscript lacks insights into the potential significance of the observed phenotypes and of the proposed new signaling model. Most of our concerns could be dealt with by adjusting the text (explaining some parts better and toning down some statements).

      (1) Directional migration of tracheal progenitors is only partially compromised, with some cells migrating anteriorly and others maintaining their posterior migration, a quite discrete phenotype. The strongest migration defects quantified in graphs (e.g. 100 μm) are not shown in images, since they would be out of frame, it would be beneficial to see them. In addition, the consequence of defects in polarized migration on tracheal development is not clear and data showing phenotypes on the final trachea morphology in pupae are not explained nor linked to the previous phenotypes.

      (2) Some important information is lacking, such as the origin of mutant and UAS-RNAi lines, which are not reported in the material and methods. For instance, mutants for components of the JAK-STAT pathway are used but not described. Are they all viable at the pupal stage? Otherwise, pupae would not be homozygous mutants. From the figure legend, it seems that the Stat92EF allele has been used, which is a point mutation, thus not leading to an absence of protein. If the hopTUM allele has been used, as mentioned in the legend, it is a gain-of-function allele. Thus, the authors should not conclude that "The aberrant anterior migration of tracheal progenitors in the absence of JAK/STAT components led to impairment of tracheal integrity and caused melanization in the trachea (Figure 3-figure supplement 1E-I)".

      (3) The authors observe that tracheal progenitors display a polarized distribution of Fat that is controlled by JAK-STAT signaling. However, this conclusion is made from a single experiment using only 3 individuals with no statistics. This is insufficient to support the claim that "JAK/STAT signaling promotes the expression of genes involved in planar cell polarity leading to asymmetric localization of Fat in progenitor cells", as mentioned in the abstract, or that "the activated tracheal progenitors establish a disciplined migration through the asymmetrical distribution of polarity proteins which is directed by an Upd2-JAK/STAT signaling stemming from the remote organ of fat body."

      (4) The authors demonstrate that Upd2 is transported through vesicles from the fat body to the tracheal progenitors. It remains somewhat unclear in the proposed model how Upd2 activates JAK-STAT signaling. Are vesicles internalized, as it seems to be proposed, and thus how does Upd2 activate JAK-STAT signaling intracellularly? Or is Upd2 released from vesicles to bind Dome extracellularly to activate the JAK-STAT pathway? Moreover, it is not clear nor discussed what would be the advantage of transporting the ligand in vesicles compared to classical ligand diffusion.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, Dong et al. study the directed cell migration of tracheal stem cells in Drosophila pupae. The migration of these cells which are found in two nearby groups of cells normally happens unidirectionally along the dorsal trunk towards the posterior. Here, the authors study how this directionality is regulated. They show that inter-organ communication between the tracheal stem cells and the nearby fat body plays a role. They provide compelling evidence that Upd2 production in the fat body and JAK/STAT activation in the tracheal stem cells play a role. Moreover, they show that JAK/STAT signalling might induce the expression of apicobasal and planar cell polarity genes in the tracheal stem cells which appear to be needed to ensure unidirectional migration. Finally, the authors suggest that trafficking and vesicular transport of Upd2 from the fat body towards the tracheal cells might be important.

      Strengths:

      The manuscript is well written. This novel work demonstrates a likely link between Upd2-JAK/STAT signalling in the fat body and tracheal stem cells and the control of unidirectional cell migration of tracheal stem cells. The authors show that hid+rpr or Upd2 RNAi expression in the fat body or Dome RNAi, Hop RNAi, or STAT92E RNAi expression in tracheal stem cells results in aberrant migration of some of the tracheal stem cells towards the anterior. Using ChIP-seq as well as analysis of GFP-protein trap lines of planar cell polarity genes in combination with RNAi experiments, the authors show that STAT92E likely regulates the transcription of planar cell polarity genes and some apicobasal cell polarity genes in tracheal stem cells which appear to be needed for unidirectional migration. Moreover, the authors hypothesise that extracellular vesicle transport of Upd2 might be involved in this Upd2-JAK/STAT signalling in the fat body and tracheal stem cells, which, if true, would be quite interesting and novel.

      Overall, the work presented here provides some novel insights into the mechanism that ensures unidirectional migration of tracheal stem cells that prevents bidirectional migration. This might have important implications for other types of directed cell migration in invertebrates or vertebrates including cancer cell migration.

      Weaknesses:

      It remains unclear to what extent Upd2-JAK/STAT signalling regulates unidirectional migration. While there seems to be a consistent phenotype upon genetic manipulation of Upd2-JAK/STAT signalling and planar cell polarity genes, as in the aberrant anterior migration of a fraction of the cells, the phenotype seems to be rather mild, with the majority of cells migrating towards the posterior.

      We agree that the phenotype is mild, as perturbing JAK/STAT signaling in the progenitors specifically affects the coordinated migration of the cells rather than altering their direction or completely blocking migration. Our data indicate that inter-organ communication ensures coordinated behavior of the progenitor cells, although the differential responses exhibited by individual cells represent an interesting unresolved issue that awaits future in-depth investigation.

      While I am not an expert on extracellular vesicle transport, the data presented here regarding Upd2 being transported in extracellular vesicles do not appear to be very convincing.

      We performed additional PLA experiments which support the interaction between Upd2 and the core components of extracellular vesicles (revised Figure 8). Furthermore, we performed electron microscopy to visualize the Lbm-containing vesicles in fat body (Figure 8-figure supplement 1D).

      These data are now provided in the revised manuscript.

      Major comments:

      (1) The graphs showing the quantification of anterior (and in some cases also posterior migration) are quite confusing. E.g. Figure 1F (and 5E and all others): These graphs are difficult to read because the quantification for the different conditions is not shown separately. E.g. what is the migration distance for Fj RNAi anterior at 3h in Figure 5E? Around -205 µm (green plus all the other colors) or around -70 µm (just green, even though the green bar goes to -205 µm). If it's -205 µm, then the images in C' or D' do not seem to show this strong phenotype. If it's around -70, then the way the graph shows it is misleading, because some readers will interpret the result as -205. Moreover, it's also not clear what exactly was quantified and how it was quantified. The details are also not described in the methods. It would be useful to mark with two arrowheads in the image (e.g. 5A'-D') where the migration distance is measured (anterior margin and point zero).

      Overall, it would be better, if the graph showed the different conditions separately. Also, n numbers should be shown in the figure legend for all graphs.

      We apologize for the inappropriate presentation and insufficient description, and thank you for kindly pointing them out. We used different colors to represent different genotypes, and the columns were superimposed. We have chosen to show the quantification for the different conditions separately in the revised figures. The anterior migration distance for Fj RNAi is around 70 µm.

      We have now provided a detailed description in the revised methods. For migration distance measurement, we took snapshots at 0 h, 1 h, 2 h and 3 h, and measured the distance from the starting point (the junction of TC and DT) to the leading edge of progenitor clusters. Velocity was calculated as v = d/t, where d is the distance in micrometers and t is the time in minutes. As you kindly suggested, we have indicated the anterior margin and point zero in the corresponding panels. We have added n numbers in the legends.
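      As an illustration of the measurement described above, the velocity calculation can be sketched as below. This is a minimal example rather than the authors' quantification script: the function name is hypothetical, and the 70 µm value is used only because it is the approximate anterior distance quoted for Fj RNAi at 3 h.

```python
def migration_velocity(distance_um: float, elapsed_min: float) -> float:
    """Mean velocity v = d / t, with d in micrometers and t in minutes."""
    if elapsed_min <= 0:
        raise ValueError("elapsed time must be positive")
    return distance_um / elapsed_min


# Distance from the starting point (TC/DT junction) to the leading edge
# of the progenitor cluster, measured on the 3 h snapshot.
distance_um = 70.0      # approximate value quoted above for Fj RNAi
elapsed_min = 3 * 60    # snapshots are taken hourly; 3 h = 180 min

velocity = migration_velocity(distance_um, elapsed_min)  # in µm/min
```

      Note that this gives the mean velocity over the whole interval. Velocities between consecutive hourly snapshots would require the per-snapshot distances, which are not listed in this response.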

      (2) Figure 2-figure supplement 1: C-L and M: From these images and graph it appears that Upd2 RNAi results in no aberrant anterior migration. Why is this result different from Figures 2D-F where it does?

      The fat body-expressing lsp2-Gal4 was used in Figure 2-figure supplement 1C-L and Figure 2D-F, while trachea-specific btl-Gal4 was used in Figure 2-figure supplement 1K-L. The lsp2-Gal4-driven but not btl-Gal4-driven upd2 RNAi causes aberrant anterior migration, suggesting that fat body-derived Upd2 plays a role. We have further clarified this in the text.

      (3) Figure 5F: The data on the localisation of planar cell polarity proteins in the tracheal stem cell group is rather weak. Figure 5G and J should at least be quantified for several animals of the same age for each genotype. Is there overall more Ft-GFP in the cells on the posterior end of the cell group than on the opposite side? Or is there a more classic planar cell polarity in each cell with Ft-GFP facing the posterior side of each cell? Maybe it would be more convincing if the authors assessed what the subcellular localisation of Ft is through the expression of Ft-GFP in clones to figure out whether it localises posteriorly or anteriorly in individual cells.

      We staged the animals, measured several animals for each genotype and provided the quantifications in the revised manuscript. The level of Ft-GFP is higher in the cells at the frontal edge. We tried to examine the expression of Ft-GFP at the single-cell level. However, this turned out to be technically difficult because the tracheal stem cells are not as regularly arranged as epithelial cells and the proximal-distal axis of the tracheal stem cells remains unclear. We thus decided to measure the fluorescence signal of groups of stem cells along the DT regardless of the polarity within individual cells.

      (4) Regarding the trafficking of Upd2 in the fat body, is it known, whether Grasp65, Lbm, Rab5, and 7 are specifically needed for extracellular vesicle trafficking rather than general intracellular trafficking? What is the evidence for this?

      In our experiments, knocking down rab5, rab7, grasp65 or lbm in the trachea using btl-Gal4 did not cause abnormality in the disciplined migration, which excludes their intracellular contribution in the trachea (Figure 7-figure supplement 1). Perturbation of Grasp65 or Lbm in the fat body increased intracellular Upd2-containing vesicles, indicating that intracellular production is functional (Figure 6J). Grasp65 is specifically required for Upd2 production. Lbm, Rab5 and Rab7 are important for vesicle trafficking. Our conclusion does not pertain specifically to the extracellular or intracellular compartment.

      (5) Figure 8A-B: The data on the proximity of Rab5 and 7 to the Upd2 blobs are not very convincing.

      The confocal images indicate the proximity of Rab5 and Rab7 to the Upd2 vesicles. We interpret the proximity together with the results from Co-IP and PLA data (Figure 8E-K).

      (6) The authors should clarify whether or not their work has shown that "vesicle-mediated transport of ligands is essential for JAK/STAT signaling". In its current form, this manuscript does not appear to provide enough evidence for extracellular vesicle transport of Upd2.

      Lbm belongs to the tetraspanin protein family that contains four transmembrane domains, which are the principal components of extracellular vesicles. We show that Lbm interacts with Upd2. The JAK/STAT signaling depends on the Upd2 in the fat body as well as vesicle trafficking machinery. Furthermore, we performed electron microscopy and show the presence of Lbm-containing vesicles in fat body (Figure 8-figure supplement 1D).

      (7) What is the long-term effect of the various genetic manipulations on migration? The authors don't show what the phenotype at later time points would be, regarding the longer-term migration behaviour (e.g. at 10h APF when the cells should normally reach the posterior end of the pupa). And what is the overall effect of the aberrant bidirectional migration phenotype on tracheal remodelling?

      We observed that the integrity of the tracheal network, especially the dorsal trunk, was impaired, which may be due to incomplete regeneration (Figure 3-figure supplement 1E-I).

      (8) The RNAi experiments in this manuscript are generally done using a single RNAi line. To rule out off-target effects, it would be important to use two non-overlapping RNAi lines for each gene.

      We validated the phenotype using several independent RNAi alleles.

      Reviewer #2 (Public review):

      Summary:

      This work by Dong and colleagues investigates the directed migration of tracheal stem cells in Drosophila pupae, essential for tissue homeostasis. These cells, found in two nearby groups, migrate unidirectionally along the dorsal trunk towards the posterior to replenish degenerating branches that disperse the FGF mitogen. The authors show that inter-organ communication between tracheal stem cells and the neighboring fat body controls this directionality. They propose that the fat body-derived cytokine Upd2 induces JAK/STAT signaling in tracheal progenitors, maintaining their directional migration. Disruption of Upd2 production or JAK/STAT signaling results in erratic, bidirectional migration. Additionally, JAK/STAT signaling promotes the expression of planar cell polarity genes, leading to asymmetric localization of Fat in progenitor cells. The study also indicates that Upd2 transport depends on Rab5- and Rab7-mediated endocytic sorting and Lbm-dependent vesicle trafficking. This research addresses inter-organ communication and vesicular transport in the disciplined migration of tracheal progenitors.

      Strengths:

      This manuscript presents extensive and varied experimental data to show a link between Upd2-JAK/STAT signaling and tracheal progenitor cell migration. The authors provide convincing evidence that the fat body, located near the trachea, secretes vesicles containing the Upd2 cytokine. These vesicles reach tracheal progenitors and activate the JAK-STAT pathway, which is necessary for their polarized migration. Using ChIP-seq, GFP-protein trap lines of planar cell polarity genes, and RNAi experiments, the authors demonstrate that STAT92E likely regulates the transcription of planar cell polarity genes and some apicobasal cell polarity genes in tracheal stem cells, which seem to be necessary for unidirectional migration.

      Weaknesses:

      Directional migration of tracheal progenitors is only partially compromised, with some cells migrating anteriorly and others maintaining their posterior migration.

      Our results suggest that Upd2-JAK/STAT signaling is required for the consistency of disciplined migration. Although only a few tracheal progenitors display anterior migration, these cells lose their commitment to directional movement. We acknowledge that the phenotype is moderate.

      Additionally, the authors do not examine the potential phenotypic consequences of this defective migration.

      We examined the long-term effects of the aberrant migration and observed an impairment of tracheal integrity and melanized tracheal branches (Figure 3-figure supplement 1E-I).

      It is not clear whether the number of tracheal progenitors remains unchanged in the different genetic conditions. If there are more cells, this could affect their localization rather than migration and may change the proposed interpretation of the data.

      We examined the progenitor cell number in the bidirectional-movement samples and the control group. The results show no significant difference in cell number between the control and bidirectional-movement groups (Figure 3-figure supplement 1).

      Upd2 transport by vesicles is not convincingly shown.

      We performed additional PLA experiments to further support the interaction between Upd2 and the core components of extracellular vesicles. Furthermore, we performed electron microscopy and show the presence of Lbm-containing vesicles in fat body (Figure 8-supplement 1D). Additional experiments such as colocalization and Co-IP assay and better quantification are provided in the revised manuscript (see revised Figure 8).

      Data presentation is confusing and incomplete.

      We used different colors to represent different genotypes, and the columns were superimposed. We have changed the graphs to show the quantification for each condition separately and revised the data presentation to avoid confusion.

      Reviewer #3 (Public review):

      Summary:

      Dong et al tackle the mechanism leading to polarized migration of tracheal progenitors during Drosophila metamorphosis. This work fits in the stem cell research field and its crucial role in growth and regeneration. While it has been previously reported by others that tracheal progenitors migrate in response to FGF and Insulin signals emanating from the fat body in order to regenerate tracheal branches, the authors identified an additional mechanism involved in the communication of the fat body and tracheal progenitors.

      Strengths:

      The data presented were obtained using a wide range of complementary techniques combining genetics, molecular biology, quantitative, and live imaging techniques. The authors provide convincing evidence that the fat body, found in close proximity to the trachea, secrete vesicles containing the Upd2 cytokine that reach tracheal progenitors leading to JAK-STAT pathway activation, which is required for their polarized migration. In addition, the authors show that genes regulating planar cell polarity are also involved in this inter-organ communication.

      Weaknesses:

      (1) Affecting this inter-organ communication leads to a quite discrete phenotype where polarized migration of tracheal progenitors is partially compromised. The study lacks data showing the consequences of this phenotype on the final trachea morphology, function, and/or regeneration capacities at later pupal and adult stages. This could potentially increase the significance of the findings.

      Following your suggestion, we examined the long-term effects of the aberrant migration and observed impairment of tracheal integrity and melanized tracheal branches (Figure 3-figure supplement 1E-I).

      (2) The conclusions of this paper are mostly well supported by data, but some aspects of data acquisition and analysis need to be clarified and corrected, such as recurrent errors in plotting of tracheal progenitor migration distance that mislead the reader regarding the severity of the phenotype.

      We used different colors to represent different genotypes, and the columns were superimposed. We have changed the graphs to show the quantification for each condition separately. We thank you for kindly pointing this out.

      (3) The number of tracheal progenitors should be assessed since they seem to be found in excess in some genetic conditions that affect their behavior. A change in progenitor number could lead to crowding, thus affecting their localization rather than migration capacities, thereby changing the proposed interpretation. In addition, the authors show data suggesting a reduced progenitor migration speed when the fat body is affected, which would also be consistent with a crowding of progenitors.

      We examined cell number and cell proliferation in the bidirectional-movement samples and the control group and observed no significant difference between them (Figure 3-figure supplement 2).

      (4) The authors claim that tracheal progenitors display a polarized distribution of PCP proteins that is controlled by JAK-STAT signaling. However, this conclusion is made from a single experiment that is not quantified and for which there is no explanation of how the plot profile measurements were performed. It also seems that this experiment was done only once. Altogether, this is insufficient to support the claim. Finally, a quantification of the number of posterior edges presenting filopodia rather than the number of filopodia at the anterior and posterior leading edges would be more appropriate.

      We staged the animals, measured several animals for each genotype, and provided the quantifications in the revised manuscript. The level of Ft-GFP is higher in the cells at the frontal edge. We tried to examine the expression of Ft-GFP at the single-cell level. However, this proved difficult because the tracheal stem cells are not regularly patterned like epithelial cells and their proximal-distal axis is not well defined. We therefore measured the fluorescence signal of groups of stem cells along the DT regardless of their individual polarity.

      (5) The authors demonstrate that Upd2 is transported through vesicles from the fat body to the tracheal progenitors where they propose they are internalized. Since the Upd2 receptor Dome ligand binding sites are exposed to the extracellular environment, it is difficult to envision in the proposed model how Upd2 would be released from vesicles to bind Dome extracellularly and activate the JAK-STAT pathway. Moreover, data regarding the mechanism of the vesicular transport of Upd2 are not fully convincing since the PLA experiments between Upd2 and Rab5, Rab7, and Lbm are not supported by proper positive and negative controls and co-immunoprecipitation data in the main figure do not always correlate to the raw data.

      We used molecular modeling to show that Upd2 and Lbm intermingle and that Upd2 is not entirely encapsulated in vesicles (Figure 8-supplement 1E). We performed PLA experiments using animals not expressing upd2-Cherry as the negative control (Figure 8 E-J). We corrected the Co-IP panel and apologize for this error.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Minor comments:

      (1) Figure 1-figure supplement 1: E: How was the migration velocity assessed? By live imaging individual cells or following the cell front of the group? Over what time period? Do the data points in the graph correspond to individual cells or the cell group? It would be important to show confocal images that go along with this quantification.

      We took snapshots of pupae at 0 hr, 1 hr, 2 hr, and 3 hr, and measured the distance covered by the migrating progenitor cells from the starting point (the junction of TC and DT) to the leading edge of the progenitor group. We then calculated the migration rate as v = d/t, with d in micrometers and t in minutes. Because the progenitor cells revolve around and migrate along the DT, tracking a single tracheoblast through the intact cuticle is technically challenging. We therefore measured the leading edge as a proxy for the whole cell group. We agree with you that time-lapse imaging is preferable for the analysis of migration.
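
      The rate calculation described above can be sketched as follows. This is a minimal illustration, not the analysis script used in the study: the function name and the distance values are hypothetical, and the sketch assumes the rate is taken between the first and last snapshot.

```python
def migration_rate_um_per_min(distances_um, times_hr):
    """Mean leading-edge migration rate (micrometers per minute).

    distances_um: distance from the TC-DT junction to the leading edge
                  of the progenitor group at each snapshot.
    times_hr:     snapshot times in hours (e.g. 0, 1, 2, 3).
    """
    if len(distances_um) != len(times_hr) or len(times_hr) < 2:
        raise ValueError("need matching lists with at least two time points")
    elapsed_min = (times_hr[-1] - times_hr[0]) * 60.0
    if elapsed_min <= 0:
        raise ValueError("time points must be increasing")
    # v = d / t, using the displacement between first and last snapshot
    return (distances_um[-1] - distances_um[0]) / elapsed_min


# Hypothetical leading-edge distances, for illustration only.
times_hr = [0, 1, 2, 3]
distances_um = [0.0, 30.0, 66.0, 90.0]
rate = migration_rate_um_per_min(distances_um, times_hr)
```

      Because only the leading edge is measured, the result describes the group front and says nothing about the behavior of individual cells.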

      (2) Figure 1-figure supplement 1: F: Why is there Gal80ts in the genotype? (and in Figure 1H). Also, what pupal age was used for this quantification?

      Expression of hid and rpr at the L3 stage impaired fat body integrity and adipocyte abundance and caused lethality. Gal80ts was used to control the expression of rpr.hid. Pupae at 0 hr APF were used in the EdU experiment.

      (3) Figure 2C: what is shown in the 6 columns (why 3 each for control and rpr/hid)?

      We conducted 3 replicates of each group for control and rpr.hid.

      (4) In the methods, several Drosophila stocks are listed as 'source:" from a particular person (e.g. Dr Ma). Please list the real source of this stock, e.g. Bloomington stock number, or the lab and publication in which the stock was originally made.

      We provide the information on these stocks in the revised methods.

      (5) The SKOV3 carcinoma cell and S2 cell work is not described in the methods.

      We added a detailed description of this experiment to the revised Methods section (Cell culture and transfection).

      (6) Figure 6 (F) 'Bar graph plots the abundance of Upd2-mCherry-containing vesicles in progenitors.' What does abundance mean? What was quantified, the number of vesicles, or the mean intensity? This is also not mentioned in the methods.

      We counted the number of Upd2-mCherry-containing vesicles in fat body cells and trachea progenitors and added the description of measurement in the method.

      (7) There are a few language mistakes throughout the manuscript. E.g.

      (a) Line 117 and other places: Language: 'fat body' should be 'the fat body'.

      We thank you for pointing out these errors and have corrected them accordingly.

      (b) Line 1276 Language mistakes: 'Video 1 3D-view of confocal image stacks of tracheal progenitors and fat body. Scale bar: 100 μm. Genotypes: UAS-mCD8-GFP/+;lsp2-Gal4,P[B123]-RFP-moe/+.' :stacks and genotypes should be singular.

      We fixed these errors and thank you for kindly pointing them out. We also proofread the entire manuscript to ensure accuracy.

      (8) In general, it is hard to figure out the exact genotypes used in experiments. This is mostly not written very clearly in the figure legends. E.g. Figure 2: genotype for A-C missing in figure legend (is B from control animals?)

      We added genotypes to the figure legends. For Figure 2A and C, the genotypes are lsp2-Gal4,P[B123]-RFP-moe/+ (control) and UAS-rpr-hid/+;Gal80ts/+;lsp2-Gal4,P[B123]-RFP-moe/+ (rpr.hid); panel B is from control animals.

      Reviewer #2 (Recommendations for the authors):

      Major comments:

      (1) The phenotype resulting from Upd2 downregulation by RNAi is subtle and shown by unconvincing images. In addition, these phenotypes are analyzed using only one RNAi line.

      We used two independent upd2 RNAi lines from THFC (THU1288 and THU1331) and observed similar phenotypes. For RNAi experiments, we always use multiple independent lines.

      (2) The authors should analyze the phenotypic consequences of directional migration changes. Is there an effect on tracheal remodeling?

      We observed that the integrity of the tracheal network, especially the dorsal trunk, was impaired and that melanized tracheal branches were present, which may be due to incomplete regeneration (Figure 3-figure supplement 1E-I).

      (3) The number of tracheal progenitors should be quantified, as some genetic conditions may affect cell numbers, as is apparent in some panels.

      We examined cell number and cell proliferation and observed no significant difference between the control and bidirectional-movement groups (Figure 3-figure supplement 1).

      (4) The data on PCP protein distribution are unconvincing, unquantified, and insufficient to support one of the main conclusions of the study, which is stated in the abstract: "JAK/STAT signaling promotes the expression of genes involved in planar cell polarity, leading to asymmetric localization of Fat in progenitor cells."

      We staged the animals, measured several animals for each genotype, and provided the quantifications in the revised manuscript. The level of Ft-GFP is higher in the cells at the frontal edge. We tried to examine the expression of Ft-GFP at the single-cell level. However, this proved difficult because the tracheal stem cells are not regularly patterned like epithelial cells and their proximal-distal axis is not well defined. We therefore measured the fluorescence signal of groups of stem cells along the DT regardless of their individual polarity.

      Minor comments:

      (1) Language should be revised. In many places in the manuscript, starting in line 113, "fat body" should be "the fat body".

      Thank you for pointing out this error. We corrected it accordingly.

      (2) Genotypes used in experiments should be described.

      We added all the genotypes. We proofread the entire manuscript to complete the figure legends for genotypes.

      (3) Line 67, the reference to "The progenitor cells reside in Tr4 and Tr5 metameres and start to move along the tracheal branch" should include (Chen and Krasnow, Science 2014).

      We added the reference in the manuscript.

      (4) Line 1081, Figure 7 Legend. "Bar graph plots the abundance of Upd2-mCherry-containing vesicles" Abundance is the number of vesicles? The graph displays the average number of vesicles? Please explain and describe the quantification.

      The bar graph represents the number of Upd2-mCherry-containing vesicles in different conditions. We quantified the number of vesicles per area.

      (5) Figure 1 (I-J) What is shown on the panels? Progenitors marked with? This information is not present in the figure or figure legend. Same for Figure 2 (D-E).

      Figure 1I-J show the vector of migrating progenitors. We added the information in the legends. The tracheal cells were labeled by nls-mCherry in Figure 1I-J. In Figure 2D-E, the progenitors were marked with P[B123]-RFP-moe.

      (6) Figure 3 Q, Stat92E-GFP values in the graph are not well-explained. What do the numbers in the y-axis refer to?

      The y-axis represents the intensity of Stat92E-GFP normalized to control. We have changed the y-axis label to ‘normalized Stat92E-GFP intensity’ in the legends.
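
      Normalization to control can be sketched as below. This is a minimal illustration under the assumption that each measurement is divided by the mean of the control group; the function name and all intensity values are hypothetical.

```python
from statistics import mean


def normalize_to_control(control, sample):
    """Divide every value by the control-group mean.

    Assumption: "normalized to control" means division by the control mean,
    so the control group averages 1 while single control values may lie
    above or below 1.
    """
    ctrl_mean = mean(control)
    if ctrl_mean == 0:
        raise ValueError("control mean is zero; cannot normalize")
    norm_control = [v / ctrl_mean for v in control]
    norm_sample = [v / ctrl_mean for v in sample]
    return norm_control, norm_sample


# Hypothetical raw intensities (arbitrary units), for illustration only.
control_raw = [80, 100, 120]
knockdown_raw = [40, 50, 60]
norm_control, norm_knockdown = normalize_to_control(control_raw, knockdown_raw)
```

      Under this assumption, individual control animals scatter around 1, which is one way a control data point can exceed 1 on such a plot.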

      (7) In general, figures and figure legends must be revised. Sometimes stainings are not well-defined, some scale bars are missing and plots do not say what the values are.

      We apologize for the inadequate information and have revised the figures and legends accordingly.

      Reviewer #3 (Recommendations for the authors):

      Several points should be addressed by the authors in order to improve their manuscript.

      Major points:

      (1) The phenotype obtained from decreasing the inter-organ signaling is quite discrete. It is further weakened by the fact that the images chosen to illustrate the measures are not really convincing. No image at 1h APF shows any clear anterior migration. Based on the scale, most of the images at 3h APF do not show a striking difference compared to the control, and in any case, stronger phenotypes would be missed anteriorly since they would thus be out of frame. In addition, at 3h APF, progenitors migrating anteriorly from Tr5 position get mixed with those migrating posteriorly from Tr4 so it is not clear how measurements were made. Given that most phenotypes are observed upon the use of RNAis, it is possible that phenotypes are weak due to persistent gene expression. Using null clones for dome, hop, or stat in progenitors could therefore aggravate the phenotypes and support further the significance of the study. Finally, assessing the consequences of compromised fat body-tracheal communication on trachea morphology, function, and regeneration later in pupal development and on adult flies would also help strengthen the importance of the findings.

      We agree with you that anteriorly migrating Tr5 progenitors adjoining Tr4 progenitors hinder measurements and that mutants may give stronger phenotypes than RNAi lines. We measured only Tr4 progenitors (instead of Tr5) when assessing anterior migration. We also performed experiments using mutant alleles, which gave aberrant migration of tracheal progenitors (Figure 3-figure supplement 1A-D). We can now show that the integrity of the tracheal network, especially the dorsal trunk, was impaired, which may be due to incomplete regeneration (Figure 3-figure supplement 1E-I).

      (2) Although the authors did not observe defects in tracheal progenitor proliferation, progenitors seem to be present in excess in some key genetic background (e.g, upon expression of rpr.hid, statRNAi, Rab-RNAi or in the presence of BFA). This excess could be the result of another mechanism than proliferation (recruitment of extra progenitors since it is not clear how they originate, defect in apoptosis...) and could impact the localization of progenitors, those being pushed anteriorly as a consequence of crowding. A proper characterization of tracheal progenitor number would thus help to discriminate between defects in migration or crowding. This point could also be addressed by performing individual tracking of tracheal progenitors, to find out whether each progenitor is indeed migrating in the wrong direction or if the movement assessed by the global tracking method that is used is just a consequence of progenitor excess.

      We examined the cell number in the bidirectional-movement samples and the control group. The results show no significant difference between the control and bidirectional-movement groups (Figure 3-figure supplement 1). We also tried to follow every progenitor but were unable to obtain convincing results with P[B123]-RFP-moe, as tracking a single tracheoblast through the intact cuticle is technically challenging.

      (3) Regarding the ChIP-seq experiment, an explanation of why choosing the "establishment of planar polarity" family should be provided since data indicate a quite low GeneRatio. Indeed, the "cell adhesion" family seems a more obvious candidate, which would be further supported by the fact that the JAK-STAT pathway has been shown to affect cell adhesion components such as E-Cadherin and FAK (Silver and Montell 2001, Mallart et al 2024). Also, have these known targets of JAK-STAT signaling been found in the ChIP-seq data? Since filopodia polarization is affected in tracheal progenitors when JAK-STAT signaling is decreased, the same question also applies to enabled, which is involved in filopodia formation and has been recently identified as a target of JAK-STAT signaling.

      As you kindly suggested, we tested a number of cell adhesion-related genes such as E-Cadherin (shg), fak, robo2 and enabled (ena). We did not observe an apparent aberrancy in the migration of tracheal progenitors (Figure 5-supplement 1J).

      (4) Data investigating PCP protein distribution is not convincing, not quantified, and not sufficient to draw one of the main conclusions of the study, which is even written in the abstract "JAK/STAT signaling promotes the expression of genes involved in planar cell polarity leading to asymmetric localization of Fat in progenitor cells."

      We improved the quantification of Ft abundance in the progenitors at the frontal edge and in those lagging behind. The traces in the figures plot multiple replicates. The level of Ft-GFP is higher in the cells at the frontal edge.

      (5) Overall, the figures together with their caption and/or the material and methods section lack some important information for the reader to fully understand the data. In addition, some errors are found in multiple plots throughout the article and must be corrected. Here are some examples:

      Following your suggestion, we revised the legends and the Methods section to include sufficient information.

      (a) Migration distance plots from Figure 3E do not match the data presented in the source data file. It seems that, when creating the plot, instead of superimposing the bars, bars were stacked. This should be corrected for all migration distance plots from Figure 3E onward, including in supplementary figures.

      We apologize for the misleading representation. We have revised the plots accordingly and now show the quantification for each condition separately.

      (b) The number of analyzed flies and/or clusters of tracheal progenitors from different flies should be stated for all quantification or observations made on images. This information is lacking for all migration distance plots, for progenitor migration tracking (Figure 1 I, J), for DIPF reporter in Figure 2J, for plot profiles (Figure 5G, J), for Upd2-Rab5/Rab7/Lbm co-detections, PLA, CoIP, and lbm-pHluorin experiments. This also applies to RNA seq, ChIP seq, and surface proteomics, for which the number of pupae and number of replicates is not indicated.

      We changed the graphs to show the quantification and n number in different conditions separately.

      We also added the n number of replicates in methods.

      (c) How quantifications were performed is not sufficiently explained. For example, the reference point for migration distance measurement is not defined, and neither is whether the measures were made on fixed or live imaging samples. In fluorescence intensity measurements and Upd2 vesicle counting, information on whether measures were made on a single z slice or on a projection of several z slices should be stated together with what ROI and which FIJI tool for quantification were used. For plot profiles, the same information regarding z slices misses together with how the orientation, the thickness, and the length of the line were chosen, and again the number of times the experiment was conducted should be mentioned and error bars should appear on graphs.

      We thank this reviewer for the suggestions which help clarify the methodology of our experiments and improve presentation of our data. We have made the changes according to the suggestions and modified our methods section and the related figures to incorporate these changes.

      To measure the migration distance of tracheal progenitors, we took snapshots of living pupae at 0 hr, 1 hr, 2 hr, and 3 hr APF and measured the distance from the starting point (the junction of TC and DT) to the leading edge of the progenitor group.

      To measure the fluorescence intensity of Stat92E-GFP and DIPF, we took z-stack confocal images of the samples and quantified the fluorescence intensity using FIJI. Specifically, intensity was quantified for regions of interest using the Analysis and Measurement tools. To quantify Upd2-mCherry vesicles, z-stack confocal images of the fat body were taken and the cell counting function of FIJI was used to measure the vesicle number.

      To quantify the fluorescence intensity of in vivo tagged Ds, Ft, and Fj proteins, a single z slice was used. The expression level of each protein was assessed as the integrated fluorescence intensity normalized to area.
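
      Integrated intensity normalized to area can be sketched as follows. This is a minimal illustration that assumes a single z slice given as a 2D grid of pixel values and a region of interest given as a boolean mask; the function name, the image, and the mask are hypothetical and do not reproduce the FIJI workflow used in the study.

```python
def intensity_per_area(image, mask):
    """Integrated intensity inside the ROI divided by the ROI area (in pixels)."""
    total = 0.0
    area = 0
    for img_row, mask_row in zip(image, mask):
        for value, inside in zip(img_row, mask_row):
            if inside:
                total += value  # integrated intensity
                area += 1       # ROI area in pixels
    if area == 0:
        raise ValueError("ROI mask selects no pixels")
    return total / area


# Hypothetical 2 x 3 slice and ROI, for illustration only.
image = [[10, 20, 30],
         [40, 50, 60]]
mask = [[True, True, False],
        [False, True, True]]
level = intensity_per_area(image, mask)
```

      Dividing by area makes regions of different size comparable, which matters when progenitor clusters differ in extent between genotypes.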

      To measure the Ft-GFP distribution, a single z slice of the progenitors immediately proximal to the DT was imaged. A line was drawn along the migration direction from the starting TC-DT junction to the leading front (the length of the line corresponds to the distribution range of the tracheal stem cell cluster). Fluorescence intensity along the line was then calculated automatically with the built-in measurement function of the Zeiss confocal software.
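
      A plot profile of this kind can be sketched as below. This is a minimal illustration using nearest-pixel sampling along a straight line; it is not the Zeiss software routine, and the function name, image, and coordinates are hypothetical.

```python
def line_profile(image, start, end, n_samples):
    """Sample pixel values along a straight line from start to end.

    image:      2D list of pixel intensities (rows x columns).
    start, end: (row, column) endpoints, e.g. the TC-DT junction and
                the leading front.
    n_samples:  number of evenly spaced sample points (at least 2).
    """
    if n_samples < 2:
        raise ValueError("need at least two sample points")
    (r0, c0), (r1, c1) = start, end
    profile = []
    for i in range(n_samples):
        t = i / (n_samples - 1)          # fraction of the way along the line
        r = round(r0 + t * (r1 - r0))    # nearest pixel row
        c = round(c0 + t * (c1 - c0))    # nearest pixel column
        profile.append(image[r][c])
    return profile


# Hypothetical slice with signal increasing toward the right edge.
image = [[0, 0, 0, 0, 0],
         [1, 2, 4, 8, 16],
         [0, 0, 0, 0, 0]]
profile = line_profile(image, start=(1, 0), end=(1, 4), n_samples=5)
```

      Averaging such profiles over several animals, with the line placed by the same rule each time, is what would allow error bars to be shown on the plot.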

      Minor points:

      (1) In several instances, the authors generalize that stem cells migrate to leave their niche, but this is not the case for all stem cells.

      The phenomenon that stem cells leave their niche when they are activated is commonly observed. We interpreted the general mechanism from our system of tracheal stem cells. We fully agree with you that it may not be the case for all stem cells. We modified the text accordingly.

      (2) Line 122 -a reference paper or an image showing the expression pattern of the lsp2-Gal4 driver is missing.

      We added the reference in the manuscript.

      (3) Line 136 - The term "traces of individual progenitors" is overstated and should be reformulated as the method used does not seem to be individual cell tracking.

      We rephrased accordingly in the revised manuscript.

      (4) Line 146 - Fat body and tracheal progenitors are qualified as interdependent organs, in which aspect do tracheal progenitors affect the fat body?

      Current knowledge suggests a close inter-organ crosstalk between trachea and fat body: The fly trachea provides oxygen to the body and influences the oxidation and metabolism of the whole body. When the trachea is perturbed, the body is in hypoxia, which causes inflammatory response in adipose tissue as an important immune organ (Shin et al., 2024).

      (5) Line 163 - Not all the genes tested are cytokines, so the sentence should be reformulated. In addition, in supplementary Fig2-1 C-J, the KD of hh seems to abolish completely tracheal progenitor migration, which is not commented on.

      Following your suggestion, we revised the description of the genes tested. We also added comments on the phenotype of hh knockdown to the revised manuscript.

      (6) Line 180 - Conclusion is made on Dome expression while using a dome-Gal4 construct, which does not necessarily recapitulate the endogenous pattern of dome expression, so it should be reformulated. Ideally, dome expression should be assessed in another way. Also, it is not clear whether GFP is present only in progenitors since images are zoomed.

      We revised the statement and provided a larger view of dome>GFP that shows enriched expression in the tracheal progenitors (Figure 2-figure supplement 2E), an expression pattern that is consistent with FlyBase.

      (7) Line 199 - Is it upd-Gal4 or upd2-Gal4 that is used? Since the conclusion of the experiment is made on upd2, the use of upd-gal4 would not be relevant. If upd2-gal4 is used, it should be corrected. In general, the provenance of the Gal4 lines should be provided. In addition, a strong GFP signal in the trachea is visible on the image in Supplementary Figure 2-2F but not commented on and seems contradictory with the conclusion mentioning that fat body and gut are the main source of Upd2 production.

      We removed data obtained from the use of this irrelevant upd-Gal4 line.

      (8) Figures:

      -  Figure 1 G, H - Scale bar is missing.

      We added it accordingly.

      -  Figure 1 I, J - The information on the staining is missing.

      We added it in the revised manuscript.

      -  Figure 2A - Providing explanations of the terms "Count" and "Gene ratio" in the caption would be helpful for readers who are not used to this kind of data. In addition, the color code is confusing since the same color is used for the selected gene family and for high p-values (the same applies to other similar graphs).

      Gene ratio refers to the proportion of genes in a dataset that are associated with a particular biological process, function, or pathway. Count indicates the number of genes from the input gene list that are associated with a specific GO term. We used red to indicate a smaller p-value and thus higher significance.
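
      The two quantities defined above can be sketched as follows. This is a minimal illustration that assumes the gene ratio is the count divided by the size of the input gene list; enrichment tools may use a different denominator. The function name and the placeholder gene identifiers are hypothetical.

```python
def go_term_stats(input_genes, term_genes):
    """Return (count, gene_ratio) for one GO term.

    count:      number of input genes annotated to the term.
    gene_ratio: count divided by the number of distinct input genes
                (assumed denominator, see the note above).
    """
    input_set = set(input_genes)
    if not input_set:
        raise ValueError("input gene list is empty")
    count = len(input_set & set(term_genes))
    return count, count / len(input_set)


# Placeholder identifiers, for illustration only.
input_genes = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
term_genes = ["g1", "g2", "g9"]
count, gene_ratio = go_term_stats(input_genes, term_genes)
```

      A term can therefore have a low gene ratio yet a small p-value if it contains few genes overall, which is why both values are shown together in such plots.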

      -  Figure 2 B, C - What does the color scale represent? What do the columns in C correspond to, different time points, different replicates?

      The color scale represents the normalized expression. The columns in C correspond to different replicates of control and rpr.hid.

      -  Figure 2 F - The error bars on the 3h APF posterior bars are missing.

      We added error bars accordingly.

      -  Figure 2 G - The legend "Down-Stable-Up" is in comparison to what?

      The control group was generated from the reaction without H2O2. The comparison was relative to the control group.

      -  Figure 2 J - The specificity of the DIPF tool that has been created should be validated in other tissues displaying known JAK-STAT activity and/or in conditions of decreased JAK-STAT signaling. In addition, the added value of the tool as compared to the JAK-STAT activity reporter used later, which has been well characterized, is not obvious.

      We added the signal of DIPF in fat body and salivary gland, both of which harbor active JAK/STAT signaling (Figure 2-figure supplement 2F-H). As opposed to the well characterized Stat92E-GFP reporter that assays the downstream transcription activity, the DIPF reporter measures the upstream event of receptor dimerization.

      -  Figure 3 I-P - Reporter tool validation in Images I-L could be moved to supplementary data. In images M-P, staining of nuclei and/or membranes would be useful to assess cell integrity.

      We revised the figures accordingly.

      -  Figure 3Q and similar plots in the following figures do not explain the normalization performed and how it can be higher than 1 in control conditions.

      In these figures, we normalized the signal relative to control groups; e.g., the value of Stat92E-GFP in the btl-GFP control group was set to 1 in the previous Figure 3Q (revised Figure 3-supplementary Figure B-J).

      -  Figure 4C - These representations lack explanations to be fully understood by a broad audience.

      The figure shows that Stat92E binding was detected in the promoters and intronic regions (the orange peaks) of genes functioning in distal-to-proximal signaling, such as ds, fj, fz, stan, Vang and fat2. We added this information to the figure legend according to your suggestion.

      -  Figure 5 K,L - What is the x-axis missing, together with the method of tracking used?

      The x-axis refers to the time of recording from a t-stack series with a time interval of 5 min. We revised the Methods section to provide a detailed procedure for this experiment.

      -  Figures 6 and 8- The overall figures lack a wider view of the cells/tissues/organs and/or additional staining to understand what is presented.

      We showed a preparation of the fat body. To resolve the vesicles, we used high magnification. We have now added wider views of the tissues under investigation (e.g. Figure 6-figure supplement 1).

      -  Figure 6 D,E - The scale bar is missing.

      We added it accordingly.

      -  Figure 8 O-S - What is the blue staining?

      The blue staining shows DAPI-stained nuclei. We have added the information in the legend.

      -  PLA experiments can give a lot of non-specific background. What kind of controls have been used in Figure 8 F-J? Negative controls should be done on cells that do not express upd2-mCherry using both antibodies to detect non-specific background, which does not usually appear completely black.

      If possible, a positive control using a known protein interacting with Rab5-GFP should be included.

      In the previous Figure 8, we used control samples lacking one of the primary antibodies. In the revised Figure 8, we conducted the experiment as you suggested, with controls that do not express upd2-mCherry (Figure 8 E-J).

      -  Co-IP experiments - The raw data file for blots is quite hard to read through. Some legends are not facing the right lane and some blots presented in the main figure are difficult to track since several blots are presented in the raw data file. e.g.

      (a)  Raw blot for Figure 8 K: the band for mCherry in the IP anti-GFP blot (lane one in K) is not convincing, it is not distinguishable from other aspecific bands. On the reverse IP presented only in raw data, on the input from blot IB anti-mCherry, both lanes present exactly the same bands at 72kb when one of the lanes corresponds to extract from flies not expressing upd2-mCherry.

      We thank you for pointing out the incorrect labels. We apologize for the errors and have corrected them accordingly.

      (b)  Raw blot for Figure 8 L: on the input blot IB anti-GFP, there is a band corresponding to Rab7-GFP in the lane of the extract from flies not expressing Rab7-GFP.

      We corrected it.

      (c)  Raw data for Figure 8 M: on the last blot, legends are missing above the input IB anti-GFP blot.

      We added the missing legends in the figure.

      Shin, M., Chang, E., Lee, D., Kim, N., Cho, B., Cha, N., Koranteng, F., Song, J.J., and Shim, J. (2024). Drosophila immune cells transport oxygen through PPO2 protein phase transition. Nature 631, 350-359.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      (1) Summary:

      In this manuscript, the model's capacity to capture epistatic interactions through multi-point mutations and its success in finding the global optimum within the protein fitness landscape highlights the strength of deep learning methods over traditional approaches.

      We thank the reviewer for his/her recognition of our model’s potential and advantages.

      (2) Strengths:

      It is impressive that the authors used AI combined with limited experimental validation to achieve such significant enhancements in protein performance. Besides, the successful application of the designed antibody in industrial settings demonstrates the practical and economic relevance of the study. Overall, this work has broad implications for future AI-guided protein engineering efforts.

      We are thankful for the editor’s appreciation of our work, and especially for the acknowledgment of the practical application of our model.

      (3) Weaknesses:

      However, the authors should conduct a more thorough computational analysis to complement their manuscript. While the identification of improved multi-point mutants is commendable, the manuscript lacks a detailed investigation into the mechanisms by which these mutations enhance protein properties. The authors briefly mention that some physicochemical characteristics of the mutants are unusual, but they do not delve into why these mutations result in improved performance. Could computational techniques, such as molecular dynamics simulations, be employed to explore the effects of these mutations?

      We thank the reviewer for this good question, which allows us to investigate more deeply the mechanisms by which the mutations significantly enhance the alkali resistance of the protein. Following the reviewer’s suggestion, we have expanded our analysis with molecular dynamics (MD) simulations to understand the impact of the mutations. As an example, we focused on the representative alkali-resistant mutant, A57D;P29T, and examined its MD simulation results. As shown in Figure S4A, the two-point mutant A57D;P29T has a Tm increase of around 8 ℃ and a much stronger binding affinity than the WT. Our analysis of the MD trajectories indicates that the A57D;P29T mutant has a more rigid structure than the WT, as reflected in its lower root mean squared deviation (RMSD) (Figure S4B). Furthermore, we calculated the root mean squared fluctuation (RMSF) for each residue and found that the mutant displayed less fluctuation at residue 29 but similar flexibility at residue 57. Interestingly, residues at positions 10, 108 and 118, which are spatially distant from residues 29 and 57, exhibited markedly weaker fluctuations in the mutant than in the WT (Figure S4C), implying that a more rigid structure contributes to the mutant's improved resistance to high temperature and strong alkalinity. However, Figure S4D shows that the AlphaFold3-predicted structures of the WT and the mutant are quite similar.

      To identify the origin of the change in structural flexibility, we computed the intramolecular interactions, such as salt bridges and hydrogen bonds, for both the WT and the mutant. We observed that the mutations increased the number of hydrogen bonds between the mutation sites and the rest of the protein (Figure S4E). However, the overall structure of the mutant did not show significant changes, which is also evident from the solvent-accessible surface area (SASA) analysis (Figure S4F). We also analyzed changes in salt bridges and found that although residue 57 was mutated to Histidine, no new salt bridges were formed. Additionally, the RMSF results showed that residues 10, 108, and 118 became more rigid, but further analysis revealed no significant change in hydrogen bonds or other interactions in these regions. Overall, the MD results suggest that the additional hydrogen bonds introduced by the A57D;P29T mutations stabilize the protein, leading to the enhanced alkali resistance observed in the mutant. These results are now presented in Figure S4 and discussed in detail in the revised manuscript.

      Specifically, we have added the following discussion in the main text:

      “In order to gain deeper insights into the mechanisms by which the identified mutations enhance protein properties, we performed molecular dynamics (MD) simulations on the best alkali-resistant mutant. The simulation results revealed several key observations that help explain the observed improvements in protein stability and alkali resistance. As shown in Figure S4A, the two-point mutant A57D;P29T has a Tm increase of around 8℃ and a much stronger binding affinity than the WT. Our analysis of the MD trajectories indicates that the A57D;P29T mutant has a more rigid structure than the WT, as reflected in its lower root mean squared deviation (RMSD) (Figure S4B). Furthermore, we calculated the root mean squared fluctuation (RMSF) for each residue and found that the mutant displayed less fluctuation at residue 29 but similar flexibility at residue 57. Interestingly, residues at positions 10, 108 and 118, which are spatially distant from residues 29 and 57, exhibited markedly weaker fluctuations in the mutant than in the WT (Figure S4C), implying that a more rigid structure contributes to the mutant's improved resistance to high temperature and strong alkalinity. However, Figure S4D shows that the AlphaFold3-predicted structures of the WT and the mutant are quite similar. To identify the origin of the change in structural flexibility, we computed the intramolecular interactions, such as salt bridges and hydrogen bonds, for both the WT and the mutant. We observed that the mutations increased the number of hydrogen bonds between the mutation sites and the rest of the protein (Figure S4E). However, the overall structure of the mutant did not show significant changes, which is also evident from the solvent-accessible surface area (SASA) analysis (Figure S4F). We also analyzed changes in salt bridges and found that although residue 57 was mutated to Histidine, no new salt bridges were formed. Additionally, the RMSF results showed that residues 10, 108, and 118 became more rigid, but further analysis revealed no significant changes in hydrogen bonds or other interactions in these regions. Taken together, these findings suggest that the enhanced alkali resistance of the mutant is likely due to an overall increase in protein stability, rather than a dramatic change in its structural conformation. The MD simulation results, which are detailed in Figure S4, provide a deeper understanding of how specific mutations can improve protein properties and offer valuable insights for future protein engineering applications.”

      And we also included the following content in the SI:

      “Molecular Dynamics (MD) simulations

      The initial structures for molecular dynamics (MD) simulations of both the wild type and the mutant were predicted using AlphaFold3. To simulate experimental conditions, each protein was placed in a cubic water box containing 0.1 M NaCl. The CHARMM27 force field and the TIP4P water model were applied throughout the simulations. After an initial energy minimization of 50,000 steps, the systems were heated and equilibrated for 1 ns in the NVT ensemble at 300 K followed by an additional 1 ns in the NPT ensemble at 1 atm. The production phase then involved 200-ns simulations with periodic boundary conditions, using a 2 fs integration time step. The LINCS algorithm was used to constrain covalent bonds involving hydrogen atoms, while Lennard-Jones interactions were cut off at 10 Å. Electrostatic interactions were computed with the particle mesh Ewald method, using a 10 Å cutoff and a grid spacing of approximately 1.6 Å with a fourth-order spline. Temperature and pressure were regulated by the velocity rescaling thermostat and Parrinello-Rahman algorithm, respectively. All simulations were performed using GROMACS 2020.4 software packages. Both systems have reached equilibrium according to the analyses of root mean squared deviation (RMSD).”
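
      As a rough illustration of the two flexibility measures used in this response, the sketch below computes RMSD and per-atom RMSF for a toy trajectory in plain Python. It is not the authors' analysis pipeline: the function names and the toy coordinates are hypothetical, and the frames are assumed to be already superimposed on the reference (no least-squares fitting is performed).

```python
import math


def rmsd(frame, reference):
    """RMSD between two conformations with identical atom order.

    Assumes both conformations are already aligned (no fitting is done here).
    """
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


def rmsf(trajectory):
    """Per-atom RMSF about each atom's time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement from that average.
        msd = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in trajectory for k in range(3)
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations


# Hypothetical toy trajectory: 3 frames, 2 atoms (atom 0 is static, atom 1 moves).
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0)],
]

print(rmsd(toy[1], toy[0]))
print(rmsf(toy))
```

      In this picture, a lower RMSF at a given residue corresponds to the reduced fluctuation reported for positions 10, 29, 108 and 118 in the mutant.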

      (4) Additionally, the authors claim that their method is efficient. However, the selected VHH is relatively short (<150 AA), resulting in lower computational costs. It remains unclear whether the computational cost of this approach would still be acceptable when designing larger proteins (>1000 AA). Besides, the design process involves a large number of prediction tasks, including the properties of both single-site saturation and multi-point mutants. The computational load is closely tied to the protein length and the number of mutation sites. Could the authors analyze the model's capability boundaries in this regard and discuss how scalable their approach is when dealing with larger proteins or more complex mutation tasks?

      In our prior work, we have demonstrated that our method is applicable to larger proteins as well [Jiang et al., Sci. Adv. 10, eadr2641 (2024)]. For instance, when engineering a protein with 1000 amino acids, inferring the fitness of one million mutants using the model on a single 4090 GPU takes approximately 20 hours. However, it remains infeasible to explore all possible mutations when designing multi-point mutants due to the vast space. To address this challenge, we propose the design of a reliable mutant library. In the first round of experiments, we used the model to score all single-point mutations, and then constructed the multi-point mutant library by combining experimentally tested single-point mutations. In this way, even when designing five-point mutants, we only need to score on the order of millions of mutants, making the inference process time-efficient and fully acceptable. As a result, the number of single-point mutations selected for combination into the multi-point mutant library becomes a crucial parameter that affects both inference time and scope. We limited the number of single-point mutations to between 30 and 50 to strike a balance between efficiency and accuracy.

      These results are discussed in the revised manuscript. Specifically, we have added the following discussion to Section 2.2 of the main text:

      “Although the model inference is fast, it is not feasible to explore all possible mutations when designing multi-point mutants due to the exponential increase in the number of potential combinations. To manage this challenge, we constructed a mutant library based on a two-stage design process. In the first stage, we scored all single-point mutations using the model, and in the second stage, we combined experimentally validated single-point mutations to create the multi-point mutant library. This approach ensures that even when designing multi-point mutants (e.g., five-point mutants), the number of mutants to score remains in the millions, which is computationally efficient and practical. The number of single-point mutations selected for the multi-point mutant library is a key factor influencing both the computational load and the scope of the design space. To maintain a balance between efficiency and accuracy, we limited the number of single-point mutations to between 30 and 50. This strategic approach allows us to achieve both scalability and precision in our protein engineering tasks.”
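
      The size of the combinatorial library described above can be checked with a short calculation. The sketch below counts the ways to combine a pool of 30 to 50 single-point mutations into multi-point mutants, and scales the inference time linearly from the figure quoted in this response (about 20 hours per million mutants for a 1000-residue protein on a single GPU). The function names are hypothetical, the linear scaling is an assumption, and the counts are upper bounds because they ignore that two mutations at the same residue cannot be combined.

```python
from math import comb


def library_sizes(n_single, max_order):
    """Number of k-point combinations (k = 2..max_order) from n_single mutations.

    Upper bound: combinations that place two mutations at one site are not excluded.
    """
    return {k: comb(n_single, k) for k in range(2, max_order + 1)}


def estimate_hours(n_mutants, hours_per_million=20.0):
    """Inference time assuming cost scales linearly with the number of mutants."""
    return n_mutants / 1_000_000 * hours_per_million


for n_single in (30, 40, 50):
    sizes = library_sizes(n_single, 5)
    five_point = sizes[5]
    print(n_single, five_point, round(estimate_hours(five_point), 1))
```

      With a pool of 50 single-point mutations, the five-point count is C(50, 5) = 2,118,760, which is consistent with the statement that the number of mutants to score stays on the order of millions.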

      Reviewer #2 (Public review):

      In this paper, the authors aim to explore whether an AI model trained on natural protein data can aid in designing proteins that are resistant to extreme environments. While this is an interesting attempt, the study's computational contributions are weak, and the design of the computational experiments appears arbitrary.

      The reviewer’s comments give us an opportunity to further state the novelty of this study. Although the AI model has been reported in our previous work [Sci. Adv. 10, eadr2641 (2024)], the unnatural physicochemical properties of proteins have, to the best of our knowledge, never been predicted using AI models. Our preceding work [Sci. Adv. 10, eadr2641 (2024)] demonstrated that the large language model can predict the performance of mutants in terms of thermostability, catalytic activity, binding affinity, etc. However, whether AI models are able to evaluate the unnatural properties of mutants remained unexplored. Our work shows that AI models trained on natural proteins can be used to design mutants that resist extreme conditions, such as strong alkalinity, substantially expanding the application of AI in bioengineering. Moreover, our design of the computational experiments was driven by the nature of the task and the availability of experimental data. We employed different strategies for designing single-point and multi-point mutants, specifically using a zero-shot approach for single-point mutations to overcome the challenge of scarce data and fine-tuning the model for multi-point mutations to leverage the experimental data from single-point mutations.

      (1) The writing throughout the paper is poor. This leaves the reader confused.

      The manuscript has been revised accordingly, and we would be glad to address any remaining points of confusion.

      (2) The main technical issue the authors address is whether AI can identify protein mutations that adapt to extreme environments based solely on natural protein data. However, the introduction could be more concise and focused on the key points to better clarify the significance of this question.

      We thank the reviewer for this comment. We have revised the manuscript, particularly the introduction, where we focused on the research questions, methods, and main findings, while removing excessive background information to improve the manuscript’s conciseness and clarity.

      “Protein engineering, situated at the nexus of molecular biology, bioinformatics, and biotechnology, focuses on the design of proteins to introduce novel functionalities or enhance existing attributes[1-3]. With the exponential growth of biological data and computational power, protein engineering has experienced a significant shift towards advanced computational methodologies, particularly deep learning, to expedite the design process and unravel complex protein-function relationships[4-9]. However, a significant challenge in industrial protein engineering is designing proteins with inherent resistance to extreme conditions, such as high temperature and extreme pH environments (acidic or alkaline)[17, 18]. Unlike proteins in natural ecosystems, those used in industrial processes often encounter harsh physical and chemical conditions, necessitating exceptional resilience to maintain functionality[19, 20]. Previous efforts to enhance protein resistance have often relied on rational design and mutant library screening. These methods are typically labor-intensive, inefficient, and yield limited improvements[23-26]. Moreover, proteins resilient to harsh environments, despite the industrial demand for them, are notably absent from the training datasets of Artificial Intelligence (AI) models. Exploring whether AI can achieve the evolution of protein resistance to extreme environments is crucial for broadening protein applications and improving modification efficiency.

      Recent advances in large-scale protein language models (LLMs) have enabled zero-shot predictions of protein mutants based on self-supervised learning from natural protein sequences. Although AI-guided protein design has been applied to predict mutants with greater thermostability and higher activity[34-36], it remains unexplored whether these models, which are based on natural protein information, can find mutants that adapt to unnatural extreme environments, such as alkaline solutions with a pH value higher than 13.

      Here, we employed an LLM (large language model) developed by our group, the Pro-PRIME model[27], to predict dozens of mutants of a nano-antibody against growth hormone (a VHH antibody), and examined their fitness, including alkali resistance and thermostability, to evaluate their performance under extreme environments.

      We utilized the Pro-PRIME model to score saturated single-point mutations of the VHH in a zero-shot setting, and selected the top 45 mutants for experimental testing. Some mutants exhibited improved alkali resistance, while others demonstrated higher thermal stability or affinity. Subsequently, we fine-tuned the Pro-PRIME model to predict dozens of multi-point mutations. As a result, we obtained three multi-point mutants with enhanced alkali resistance, higher thermostability, and strong affinity for the target protein. In addition, the dynamic binding capacity of the selected mutant did not decline significantly after more than 100 cycles, making it suitable for practical application in industrial production. The selected mutant has been used in practical production and has lowered costs by over one million dollars per year. To the best of our knowledge, this is the first protein product developed by an LLM that has been successfully applied in mass production. Because the Pro-PRIME model achieves precise predictions of multi-point mutations while relying on only a small amount of experimental data, our two-round design process involved experimental validation of only 65 mutants in two months, demonstrating remarkably high efficiency. Furthermore, we performed a systematic analysis of these findings and determined that the model can yield more valuable predictive outcomes while remaining consistent with rational design principles. Specifically, within the framework of multi-point combinations, the model's incorporation of negative single-point mutations into the combinatorial space led to exceptional results, showcasing its capacity to capture epistatic interactions. Notably, in striving for the global optimum, deep learning methods offer distinct advantages over traditional rational design approaches.”

      (3) The authors did not develop a new model but instead used their previously developed Pro-PRIME model. This significantly weakens the novelty and contribution of this work.

      While it is true that the Pro-PRIME model was previously developed, the contribution of this work lies in its novel application to the design of proteins with properties that are absent or rare in nature. In our original work, the Pro-PRIME model was used to optimize proteins for existing, well-established properties, such as thermal stability, enzymatic activity, and affinity. However, in this study, we extended the model’s capabilities to design proteins that exhibit resilience to extreme environments, such as high pH, a property that is not inherently present in most natural proteins. To our knowledge, no existing model has addressed the challenge of engineering alkali-resistant proteins, nor is there a relevant dataset available for training such models.

      This shift from optimizing existing characteristics to engineering entirely new properties represents a significant step forward in the field of protein design. By focusing on the design of proteins that can survive and function in harsh, unnatural environments, we have demonstrated the broader applicability of the Pro-PRIME model beyond its initial scope. This expansion of the model's application is a novel contribution that has the potential to accelerate the development of proteins for industrial, agricultural, and biotechnological applications.

      Thus, while the Pro-PRIME model itself is not new, its application to the new challenge of engineering proteins with alkali resistance and other novel properties significantly enhances the impact and novelty of this work. Moreover, this work is groundbreaking not only in terms of the model’s novel application but also because no previous studies have specifically targeted alkali resistance or provided data for training models on such extreme properties. Therefore, our approach is unique, marking a new direction in protein engineering.

      We have made the following revisions to the conclusions section of the manuscript:

      “Through two rounds of evolution, we successfully designed a VHH antibody with strong resistance to extreme environments and enhanced affinity using the Pro-PRIME model. Although few proteins in our pre-training dataset can tolerate extreme pH and saline conditions, the Pro-PRIME model showed impressive performance after supervised learning with limited data, especially in capturing epistatic effects. The analysis of these 65 mutants revealed that the Pro-PRIME model is adept at exploring the large space of protein fitness, being less susceptible to local optima and having greater potential to find the global optimum. Our efficient method of designing mutants that improve multiple properties holds promise for the industrial application of proteins. Specifically, the VHH antibody has been deployed in practical production and has significantly enhanced the efficiency of the entire production line after our design. While the Pro-PRIME model itself has been reported, this work demonstrates its first application to the challenge of designing proteins with alkali resistance and other extreme properties that are not found in natural proteins; no previous studies have addressed or provided data for such applications. This shift from optimizing existing protein properties to engineering entirely new, unnatural traits is a significant advance in the field. This study shows that AI models, such as Pro-PRIME, can not only guide the evolution of protein thermal stability, enzymatic activity, ligand affinity, etc., but also enable the development of mutants adapted to harsh unnatural environments, such as extreme pH and concentrated salt, greatly expanding their application. The novelty of this work lies in the ability to design and engineer proteins with novel properties, specifically alkali resistance, which is an unprecedented achievement in AI-assisted protein engineering. The great potential of AI models is expected to significantly accelerate the development of proteins for diverse applications in medicine, agriculture, bioengineering, etc.”

      (4) The computational experiments are not well-justified. For instance, the authors used a zero-shot setting for single-point mutation experiments but opted for fine-tuning in multiple-point mutation experiments. There is no clear explanation for this discrepancy. How does the model perform in zero-shot settings for multiple-point mutations? How would fine-tuning affect single-point mutation results? The choice of these strategies seems arbitrary and lacks sufficient discussion.

      We appreciate the reviewer’s comment regarding the use of zero-shot and fine-tuning settings for single-point and multi-point mutation experiments, and we are grateful for the opportunity to further clarify this aspect of our work.

      In the first round of design, we used the zero-shot approach for single-point mutations because the number of possible single-point mutations is limited, and no prior experimental data was available. In the absence of relevant data, the zero-shot approach allows the model to make predictions based on the learned sequence patterns from the pre-trained protein language model. Given that single-point mutations are relatively fewer in number and computationally feasible to evaluate, the zero-shot approach was deemed appropriate for this task.

      However, when it comes to designing multi-point mutants, the number of potential combinations increases exponentially, making it computationally impractical to explore all possible mutations in a reasonable timeframe. Furthermore, since we had already obtained some experimental data for single-point mutations in the first round, we fine-tuned the model with this data in the second round to improve the accuracy of predictions for multi-point mutants. Fine-tuning helps the model better capture the specific features that contribute to protein functionality, which are critical when dealing with multi-point mutations where multiple residues interact. This allows the model to produce more reliable and targeted predictions for multi-point mutants, ultimately leading to better design outcomes.

      Regarding the model's performance in zero-shot settings for multi-point mutations, we tested this approach, and the results did not align well with the experimental data for multi-point mutants. Specifically, the Spearman correlation coefficient between the zero-shot predictions and experimental results was -0.71, indicating that zero-shot predictions for multi-point mutations were not as accurate as those from the fine-tuned model.
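
      For readers unfamiliar with the metric, the Spearman coefficient quoted above is the Pearson correlation computed on ranks, so a negative value means the predicted ordering runs against the measured one. The sketch below is a minimal stdlib implementation with average ranks for ties; the function names and the toy scores are hypothetical and are not the authors' data or code.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(predicted, measured):
    """Spearman rank correlation between model scores and measurements."""
    return pearson(average_ranks(predicted), average_ranks(measured))


# Hypothetical scores for four mutants: the model ranks them in reverse order.
predicted_scores = [0.9, 0.7, 0.4, 0.1]
measured_fitness = [10.0, 20.0, 30.0, 40.0]
print(spearman(predicted_scores, measured_fitness))
```

      A coefficient near +1 would indicate that the model orders the mutants as the experiment does, while a value near 0 would indicate no monotonic relationship.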

      In summary, the choice of using zero-shot for single-point mutations and fine-tuning for multi-point mutations was driven by the nature of the task and the availability of experimental data. Fine-tuning the model improves its predictive performance, especially for more complex multi-point mutation tasks. We have now clarified these choices in the manuscript and have added further discussion on the trade-offs between zero-shot and fine-tuning approaches.

      Specifically, we have added the following discussion to Section 2.2 of the main text:

      “Note that we employed different strategies for designing single-point and multi-point mutants, specifically using a zero-shot approach for single-point mutations and fine-tuning the model for multi-point mutations. These choices were made based on the distinct characteristics of the two tasks and the availability of experimental data. For single-point mutations, the number of possible mutations is relatively limited, and at the outset, there were no experimental data available. In such cases, the zero-shot setting was chosen because it allows the model to predict the fitness of mutants based solely on the information learned during pre-training on a large protein sequence dataset. Since single-point mutations are computationally manageable, this approach was deemed appropriate to generate initial predictions for protein engineering. However, when designing multi-point mutants, the situation changes significantly. The potential combinations of mutations increase exponentially, and without prior data, it becomes computationally infeasible to evaluate every possible combination within a reasonable timeframe. Moreover, by the time we reached the multi-point mutation design stage, experimental data for several single-point mutations had already been obtained. This data enabled us to fine-tune the model to better capture the specific structural and functional features that contribute to protein stability and resistance, especially in the context of multiple interacting mutations. Fine-tuning improves the model’s accuracy by adjusting its parameters to align more closely with the experimental data, ensuring that the predicted multi-point mutants are more likely to meet the desired engineering goals. After the second round of design, the fitness of the mutants was further improved. 
In improving alkali resistance, experimental results showed that 15 of the 45 designed mutants exhibited positive responses, yielding a success rate of 30%, close to the 35% success rate achieved in the second round. Compared to the wild type, the best single-point mutant improved alkali resistance by approximately 44.7%, while the best multi-point mutant achieved a 67.7% increase. For thermal stability enhancement, the success rate in the first round was 77.8%, rising to 100% in the second round. The top single-point mutant exhibited a Tm increase of 6.37°C over the wild type, while the best multi-point mutant had a Tm increase of 10.02°C. We also tested the performance of the zero-shot approach for multi-point mutants, and the results showed that this method did not yield satisfactory predictions. The Spearman correlation coefficient between the zero-shot predictions and experimental results for multi-point mutants was -0.71, indicating a significant discrepancy. This further highlights the importance of fine-tuning the model for multi-point mutations, as the fine-tuned model provided more accurate and reliable results. In summary, the choice of zero-shot for single-point mutations and fine-tuning for multi-point mutations was driven by practical considerations regarding computational feasibility and the availability of experimental data. Fine-tuning the model significantly enhances its predictive performance, particularly for complex multi-point mutations where multiple residues interact. We believe this strategy strikes an optimal balance between computational efficiency and predictive accuracy, making it well-suited for practical protein engineering applications.”

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates the impact of Pink1 loss on glial function and neuronal health in a Drosophila model, highlighting the role of mitochondria-organelle contacts and key genes such as Ccz1, Vps13, Mon1, and Rab7. The work provides insights into cellular processes underlying neurodegenerative diseases, with a focus on glia-neuron interactions. While the findings are promising, the study lacks critical controls, detailed mechanistic evidence, and explanatory figures to strengthen its claims.

      Strengths:

      (1) The study addresses an important topic in neuroscience, exploring the mechanisms of Pink1 loss, which has implications for Parkinson's disease and neurodegeneration.

      (2) The focus on mitochondria-organelle contacts and their regulation by Rab7-mediated pathways is novel and provides a potential mechanism for neuronal dysfunction.

      (3) The identification of key genes (Ccz1, Vps13, Mon1, Rab7) and their potential roles in Pink1-related pathways adds valuable knowledge to the field.

      (4) The manuscript uses a combination of genetic tools, Drosophila models, and functional assays to approach the problem from multiple angles.

      Weaknesses:

      (1) Specificity of Mz-Gal4: The study lacks validation of Mz-Gal4 specificity, as it may also drive expression in a few neurons or other types of glia. Additional control experiments using nls-GFP with Elav, Repo, or Draper antibody staining or alternative glial drivers would be helpful.

      (2) DLG staining is central to the story but is not well-supported by high-resolution Z-stack imaging, which should be included in the supplementary figures.

      (3) The manuscript does not confirm whether the candidate RNAi (Ccz1, Vps13, Mon1, Rab7) directly influence Rab7-mediated membrane trafficking or mitochondria-lysosome contacts in Pink1 mutants.

      (4) Using ERG as a readout for EG effects in the antenna is not a direct or appropriate assay. Alternative functional assays relevant to antenna glia should be considered.

      (5) A graphical explanation of the interactions and functions of the candidate genes in Pink1 KO mutants is missing. This would greatly enhance the manuscript's clarity.

      (6) The study lacks details on sample sizes, effect sizes, and reproducibility, which are necessary for robust conclusions.

      (7) There are repeated words on page 3 ("olfactory Olfactory Receptor Neurons") and a lack of explanation in Figure 3C regarding the most up-regulated and down-regulated genes and the significance of large red dots.

    2. Reviewer #2 (Public review):

      Summary: This study proposes a novel role for ensheathing glia (EG) in a Pink1 model of Parkinson's disease and shows that this cell population exhibits the highest number of DEGs at a pre-symptomatic stage. In the olfactory system, there seem to be morphological changes in this cell type that resemble an 'activated' state, and the authors further show that the neuronal loss of Pink1 is responsible for this defect. The authors go on to show that manipulation of Pink1 in EG also leads to some defects in the visual system and in the dopaminergic neurons (DAN) that innervate the mushroom body (MB), and perform a screen based on the 'on-transient' defect of the ERG to identify potential genes that may modulate the function of EG in synaptic regulation. They focus on several genes related to Rab7/Vps13 and perform some additional experiments in the visual system and MB to propose the role of vesicle/lipid trafficking in EG as an important factor for PD pathogenesis.

      Strengths: The study proposes functional and mechanistic connections between several genes that have been linked to PD (PINK1, VPS13A/C). I feel that the experiments presented in Figure 1 and Fig3A-C are performed with rigor and that the data are convincing/novel. The selection of Drosophila to study these questions is also a strength, and the lab has extensive experience in this field and model organism.

      Weaknesses: There is one fundamental concern I have with the genetic experiments performed in this paper (especially in Fig 3D and Fig4, see major issue #1), and I feel that there is a bit of a disconnect between the EG 'activation' phenotype the authors show in the olfactory system and the other two neuronal systems (visual system, MB DAN) that the authors investigate (see major issue #2). Also, quite a bit of information is not provided in the manuscript (see major issues #3 and #4), which makes it difficult for me to judge the rigor and interpretation of several experiments.

      Major Concern #1: A number of lines used in this study are referred to as "RNAi" lines but when I look at the actual genotypes of reagents listed in the table in the METHODS section, many are actually NOT RNAi lines. Quite a few lines, including lines that the authors use as RNAi against Ccz1, Rab7 and Mon1, are gRNA lines for the TKO (TRiP-CRISPR knockout) system. While these reagents can theoretically knock out these genes in somatic cells if used in combination with UAS-Cas9, there is no mention that UAS-Cas9 was used in this work throughout the manuscript. Hence, when these lines are just crossed to GAL4 with or without the Pink1 mutant, they shouldn't be having any effects. Similarly, the strongest hit from their screen was a TOE (TRiP-CRISPR Over Expression) gRNA against PIG-A, which could allow overexpression of PIG-A if there is a UAS-dCas9::VP64. However, I also do not see any mention that such an activator was introduced into the crossing scheme. Considering that 3 of the 4 'hits' from their screen are not RNAi lines, I am quite skeptical of the study. Similarly, except for Vps13, all reagents used in Fig4 are TKO gRNA lines. Therefore, if this experiment was conducted without a UAS-Cas9, most of the data shown here are problematic. Also, note that several of the 'RNAi' lines listed in the Table in the METHODS section are actually MiMIC alleles. While some MiMIC lines could function as strong LOF alleles (if they are inserted in the exon or in an intron of the gene in the same orientation as the gene), some of the lines are not expected to affect gene function (e.g. FASN2 and CG17712, MiMICs are in introns and face the opposite orientation). Hence, the rationale of including these reagents in the screen doesn't make much sense. The description of the modifier screen should be much more detailed in the RESULTS and METHODS section and if the UAS-Cas9/dCas9::VP64 transgenes were not introduced when the TKO/TOE reagents were utilized, what can be concluded?

      In addition, for the 4 genes that the authors further study in Fig4, there are many other reagents that the authors can use, including mutant alleles, previously characterized RNAi lines (e.g. Vps13) and dominant negative/constitutively active lines (e.g. especially for Rab7). The authors should validate their results with independent reagents to convincingly show that the same conclusions can be drawn for the Vps13/Rab7 related genes, since this is the key takeaway message of this paper.

      Also, they do not show whether the manipulation of these genes in a wild-type background (they only show what happens in Pink1 mutants) affects ERG and MB DAN synapse morphology. If these manipulations alone dramatically affect these phenotypes, it would be very difficult to interpret their data.

      Major Concern #2: In Figure 1, the authors show some morphological evidence that EG are 'activated' in Pink1 mutants, but whether the same phenomenon occurs in the visual system and in the MB is not shown. Since all of the studies in Fig3D and Fig4 are done in the visual system and MB, it is not clear whether the visual system and MB phenotypes are related to 'activation' of EG.

      Also, in the RNA-seq data in Fig1A and Fig3C, is there any molecular evidence that EG are indeed 'activated'? The only evidence that the authors show to state that EG are 'activated' in young Pink1 null animals is based on increased CD8::GFP staining in the olfactory system.

      The authors cannot draw a strong conclusion that EG are indeed 'activated' based on these data (e.g. perhaps the expression level of CD8::GFP is just increased). Additional evidence that the EG are 'activated' could be provided by looking at the increase in Draper intensity (as reported by Doherty et al. and MacDonald et al. that the authors cite), not only in the olfactory system, but also in the visual system and in the MB. It would also be informative if the authors could look at the morphology of the EG in the visual system and MB to convincingly show that the data in Fig4 are relevant to EG 'activation'.

      Major Concern #3: In Fig3, there is no clear explanation why they focus on the ON transients and ignore the OFF transients, and also why the difference in the depolarization is not quantified in Fig4.

      Major Concern #4: While the authors claim that mz709-GAL4 is an EG-specific driver, do the authors know that this is indeed true in the tissues and stages that are studied here? The Ito et al. paper that is cited in the METHODS section has only looked at the expression of this reporter in embryonic and larval stages. The authors should validate their findings with an additional EG-specific driver and/or provide additional data that mz709-GAL4 is indeed specific to EG in the adult fly brain and eye. If mz709-GAL4 is expressed in other cell types, the interpretation of many of the data in this paper becomes quite questionable. I believe the data in Fig3B suggest that mz709-GAL4 is indeed specific to glial cells and not expressed in neurons, but whether this driver is truly specific to EG (and not in other glial types), especially in the visual system (including the lamina as well as in the eye), is not obvious.

    1. Omega-3 fatty acids appear to slow the aging process
      • Daily intake of 1g of omega-3 fatty acids can slow down aging, especially when combined with vitamin D and exercise - as demonstrated by a 3-year study on 777 Swiss seniors[1][2].

      • Epigenetic clocks were used to measure the pace of aging, showing up to 4 months of biological age reduction[1][3].

      • Additional benefits included:

        • 61% reduction in invasive cancer risk[4]
        • 10% reduction in falls[4]
        • 13% decrease in infection rates[4]
      • Sources of omega-3: fish, seafood, rapeseed and flaxseed oil, walnuts, supplements[7][9]

      Citations:
        [1] https://www.sci.news/medicine/omega-3-supplementation-biological-aging-13635.html
        [2] https://www.news-medical.net/news/20250204/Omega-3-fatty-acids-and-vitamin-D-slow-biological-aging-in-older-adults.aspx
        [3] https://www.nature.com/articles/s43587-024-00793-y
        [4] https://www.medonet.pl/leki-od-a-do-z/witaminy-i-mineraly,wystarczy-gram-dziennie--spowolnisz-starzenie-nawet-o-cztery-miesiace,artykul,86261609.html
        [5] https://www.eurekalert.org/news-releases/1072552
        [6] https://www.national-geographic.pl/nauka/kwasy-omega-3-to-pigulka-mlodosci-zwlaszcza-w-polaczeniu-z-witamina-d/
        [7] https://dzienniknaukowy.pl/kwasy-tluszczowe-omega-3-wydaja-sie-spowalniac-procesy-starzenia
        [8] https://translate.google.com/?prev=_t&hl=pl&ie=UTF-8&eotf=1&sl=en&tl=zh-CN
        [9] https://www.businessinsider.com/taking-omega-3-supplements-may-slow-down-aging-study-2025-2
        [10] https://pubmed.ncbi.nlm.nih.gov/36629088/
        [11] https://www.sciencedaily.com/releases/2025/02/250204132423.htm

    1. Reviewer #2 (Public review):

      The authors inject, into the rete testis, mRNA and plasmids encoding mRNAs for GFP and then ARMC2 (into infertile Armc2 KO mice) in a gene therapy approach to express exogenous proteins in male germ cells. They do show GFP epifluorescence and ARMC2 protein in KO tissues, although the evidence presented is weak. Overall, the data do not necessarily make sense given the biology of spermatogenesis, and more rigorous testing of this model is required to fully support the conclusion that gene therapy can be used to rescue male infertility.

      In this revision, the authors attempt to respond to the critiques from the first round of reviews. While they did address many of the minor concerns, there are still a number to be addressed. With that said, the data still do not support the conclusions of the manuscript.

      (1) The authors have not satisfactorily provided an explanation for how a naked mRNA can persist and direct expression of GFP or luciferase for ~3 weeks. The most stable mRNAs in mammalian cells have half-lives of ~24-60 hours. The stability of the injected mRNAs should be evaluated and reported using cell lines. GFP protein's half-life is ~26 hours, and luciferase protein's half-life is ~2 hours.

      (2) There are no convincing data shown in Figs. 1-8 that the GFP is even expressed in germ cells, which is obviously a prerequisite for the Armc2 KO rescue experiment shown in the later figures! In fact, to this reviewer the GFP appears to be in Sertoli cell cytoplasm, which spans the epithelium and surrounds germ cells - thus, it can be oft-confused with germ cells. In addition, if it is in germ cells, then the authors should be able to show, on subsequent days, that it is present in clones of germ cells that are maturing. Due to intercellular bridges, a molecule like GFP has been shown to diffuse readily and rapidly (in a matter of minutes) between adjacent germ cells. To clarify, the authors must generate single cell suspensions and immunostain for GFP using any of a number of excellent commercially-available antibodies to verify it is present in germ cells. It should also be present in sperm, if it is indeed in the germline.

      Other comments:

      70-1 This is an incorrect interpretation of the findings from Ref 5 - that review stated there were ~2,000 testis-enriched genes, but that does not mean "the whole process involves around two thousand of genes"

      74 would specify 'male'

      79-84 Are the concerns with ICSI due to the procedure itself, or the fact that it's often used when there is likely to be a genetic issue with the male whose sperm was used? This should be clarified if possible using references from the literature, as this reviewer imagines this could be a rather contentious issue with clinicians who routinely use this procedure, even in cases where IVF would very likely have worked

      199 Codon optimization improvement of mRNA stability needs a reference; in one study using yeast transcripts, optimization improved RNA stability on the order of minutes (e.g., from ~5 minutes to ~17 minutes); is there some evidence that it could be increased dramatically to days or weeks?

      472-3 The reported half-life of EGFP is ~36 hours - so, if the mRNA is unstable (and not measured, but certainly could be estimated by qRT-PCR detection of the transcript on subsequent days after injection) and EGFP is comparatively more stable (but still hours), how does EGFP persist for 21 days after injection of naked mRNA??

      Curious why the authors were unable to get anti-GFP to work in immunostaining?

      In Fig. 3-4, the GFP signals are unremarkable, in that they cannot be fairly attributed to any structure or cell type - they just look like blobs; and why, in Fig. 4D-E, does the GFP signal appear stronger at 21 days than 15 days? And why is it completely gone by 28 days? These data are unconvincing. If the authors did a single cell suspension, what types or percentage of cells would be GFP+? Since germ cells are not adherent in culture, a simple experiment could be done whereby a single cell suspension could be made, cultured for 4-6 hours, and non-adherent cells "shaken off" and imaged vs adherent cells. Cells could also be fixed and immunostained for GFP, which has worked in many other labs using anti-GFP.

      In Fig. 5, what is the half-life of luciferase? From this reviewer's search of the literature, it appears to be ~2-3 h in mammalian cells. With this said, how do the authors envision detectable protein for up to 20 days from a naked mRNA? The stability of the injected mRNAs should be shown in a mammalian cell line - perhaps this mRNA has an incredibly long half-life, which might help explain these results. However, even the most stable endogenous mRNAs (e.g., globin) are ~24-60 hrs.
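      The half-life argument in the comments above can be made concrete with a back-of-the-envelope calculation. The sketch below is only illustrative: it assumes simple first-order decay with no new synthesis, uses the approximate half-lives quoted in this review, and the function name and dictionary labels are hypothetical.

```python
def fraction_remaining(elapsed_hours: float, half_life_hours: float) -> float:
    """Fraction of a molecule left after first-order decay with no new synthesis."""
    return 0.5 ** (elapsed_hours / half_life_hours)


# Approximate half-lives (hours) quoted in the review; labels are illustrative.
half_lives = {
    "most stable endogenous mRNA (upper bound)": 60.0,
    "EGFP protein": 36.0,
    "luciferase protein": 3.0,
}

elapsed = 21 * 24  # 21 days after injection, in hours

for name, t_half in half_lives.items():
    left = fraction_remaining(elapsed, t_half)
    print(f"{name}: fraction remaining after 21 days = {left:.2e}")
```

      Under these assumptions, even a species with a 60-hour half-life would be down to roughly 0.3% of its starting amount by day 21, which is why a direct measurement of the injected mRNA's stability is being requested.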

      527-8 The Sertoli cell cytoplasm is not just present along the basement membrane as stated, but also projects all the way to the lumina

      529-30 This is incorrect, as round spermatids are never "localized between the spermatocytes and elongated spermatids" - if elongated spermatids are present, rounds are not - they are never coincident in the same testis section

      Fig. 7 To this reviewer, all of the GFP appears to be in Sertoli cell cytoplasm

      In Figs 1-8 there is no convincing evidence presented that GFP is expressed in germ cells! In fact, it appears to be in Sertoli cells

      Fig. 9 - alpha-tubuline?

      Fig. 11 - how was sperm morphology/motility not rescued on "days 3, 6, 10, 15, or 28 after surgery", but it was in some at 21 and 35? How does this make sense, given the known kinetics of male germ cell development?? And at least one of the sperm in the KO in Fig. B5 looks relatively normal, and the flagellum may be out-of-focus in the image? With only a few sperm for reviewers to see, how can we know these represent the population?

    1. Reviewer #1 (Public review):

      Summary:

      This work presents an Interpretable protein-DNA Energy Associative (IDEA) model for predicting binding sites and affinities of DNA-binding proteins. Experimental results demonstrate that such an energy model can predict DNA recognition sites and their binding strengths across various protein families and can capture the absolute protein-DNA binding free energies.

      Strengths:

      (1) The IDEA model integrates both structural and sequence information, although such an integration is not completely original.

      (2) The IDEA predictions seem to have agreement with experimental data such as ChIP-seq measurements.

      Weaknesses:

      (1) The authors claim that the binding free energy calculated by IDEA, trained using one MAX-DNA complex, correlates well with experimentally measured MAX-DNA binding free energy (Figure 2) based on the reported Pearson Correlation of 0.67. However, the scatter plot in Figure 2A exhibits distinct clustering of the points and thus the linear fit to the data (red line) may not be ideal. As such, the use of the Pearson correlation coefficient that measures linear correlation between two sets of data may not be appropriate and may provide misleading results for non-linear relationships.

      (2) In the same vein, the linear Pearson Correlation analysis performed in Figure 5A and the conclusion drawn may be misleading.

      (3) The authors included the sequences of the protein and DNA residues that form close contacts in the structure in the training dataset, whereas a series of synthetic decoy sequences were generated by randomizing the contacting residues in both the protein and DNA sequences. In particular, synthetic decoy binders were generated by randomizing either the DNA (1000 sequences) or protein sequences (10,000 sequences) from the strong binders. However, the justification for such randomization and how it might impact the model's generalizability and transferability remain unclear.

      (4) The authors performed Receiver Operating Characteristic (ROC) analysis and reported the Area Under the Curve (AUC) scores in order to quantitate the successful identification of the strong binders by IDEA. It would be beneficial to analyze the precision-recall (PR) curve and report the PRAUC metric which could be more robust.
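      To illustrate the concern in weakness (1), the sketch below uses made-up numbers (not data from the manuscript) in which two well-separated clusters produce a high overall Pearson coefficient even though there is no positive relationship within either cluster. All function names and values are hypothetical; a rank-based (Spearman) coefficient is shown alongside for comparison.

```python
from statistics import mean


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy data: two clusters (weak vs strong binders), invented for illustration.
predicted = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
measured = [0.3, 0.1, 0.2, 0.0, 5.1, 5.3, 5.0, 5.2]

print("overall Pearson :", round(pearson(predicted, measured), 3))
print("overall Spearman:", round(spearman(predicted, measured), 3))
print("cluster 1 Pearson:", round(pearson(predicted[:4], measured[:4]), 3))
print("cluster 2 Pearson:", round(pearson(predicted[4:], measured[4:]), 3))
```

      In this toy case the overall coefficient is driven almost entirely by the separation between the clusters, so reporting within-cluster or rank-based statistics next to the overall value may give a fairer picture.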

    2. Author response:

      Reviewer 1:

      Summary: This work presents an Interpretable protein-DNA Energy Associative (IDEA) model for predicting binding sites and affinities of DNA-binding proteins. Experimental results demonstrate that such an energy model can predict DNA recognition sites and their binding strengths across various protein families and can capture the absolute protein-DNA binding free energies.

      We appreciate the reviewer’s careful assessment of the paper, and we thank the reviewer for the insightful suggestions and comments.

      Strengths:

      (1) The IDEA model integrates both structural and sequence information, although such an integration is not completely original. (2) The IDEA predictions seem to have agreement with experimental data such as ChIP-seq measurements.

      We appreciate the reviewer’s comments on the strength of the paper.

      Weaknesses:

      (1) The authors claim that the binding free energy calculated by IDEA, trained using one MAX-DNA complex, correlates well with experimentally measured MAX-DNA binding free energy (Figure 2) based on the reported Pearson Correlation of 0.67. However, the scatter plot in Figure 2A exhibits distinct clustering of the points and thus the linear fit to the data (red line) may not be ideal. As such, the use of the Pearson correlation coefficient that measures linear correlation between two sets of data may not be appropriate and may provide misleading results for non-linear relationships.

      We thank the reviewer for the insightful comments and agree that the linear fit between our predictions and the experimental data may not be ideal. The primary utility of the IDEA model is for assessing the relative binding affinities of different DNA sequences. To further support this, we plan to conduct additional statistical analyses that are independent of the linear correlation assumption but instead focus on the ranked order of DNA sequence binding affinities.

      (2) In the same vein, the linear Pearson Correlation analysis performed in Figure 5A and the conclusion drawn may be misleading.

      We thank the reviewer for the insightful comments. We will perform the same analysis for Figure 5A as detailed in our response to the previous comments.

      (3) The authors included the sequences of the protein and DNA residues that form close contacts in the structure in the training dataset, whereas a series of synthetic decoy sequences were generated by randomizing the contacting residues in both the protein and DNA sequences. In particular, synthetic decoy binders were generated by randomizing either the DNA (1000 sequences) or protein sequences (10,000 sequences) from the strong binders. However, the justification for such randomization and how it might impact the model’s generalizability and transferability remain unclear.

      We thank the reviewer for the insightful comments. We will perform additional analyses to assess the robustness of our model predictions with respect to the number of randomized decoys. Additionally, we will examine how randomization would potentially affect the model’s generalizability and transferability.

      (4) The authors performed Receiver Operating Characteristic (ROC) analysis and reported the Area Under the Curve (AUC) scores in order to quantitate the successful identification of the strong binders by IDEA. It would be beneficial to analyze the precision-recall (PR) curve and report the PRAUC metric which could be more robust.

      We agree with Reviewer 1 that more statistical metrics should be used to evaluate our model’s performance. We will include a more robust approach, such as PRAUC, to evaluate our model.
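      As a rough illustration of why a precision-recall summary can differ from ROC AUC when strong binders are rare, the sketch below computes both on an invented, imbalanced ranking (2 positives among 20 sequences). The labels, scores and function names are hypothetical, and average precision is used here as one common step-wise estimate of the area under the PR curve.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at each positive hit (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / k  # precision among the top-k predictions
    return total / n_pos


# Invented example: 20 sequences with scores 20..1; strong binders sit at ranks 2 and 4.
scores = list(range(20, 0, -1))
labels = [1 if s in (19, 17) else 0 for s in scores]

print("ROC AUC          :", round(roc_auc(labels, scores), 3))
print("average precision:", round(average_precision(labels, scores), 3))
```

      Here the ROC AUC stays above 0.9 because most negatives are ranked below both positives, while the average precision is only 0.5 because half of the top four predictions are false positives.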

      Reviewer 2:

      Summary:

      Zhang et al. present a methodology to model protein-DNA interactions via learning an optimizable energy model, taking into account a representative bound structure for the system and binding data. The methodology is sound and interesting. They apply this model for predicting binding affinity data and binding sites in vivo. However, the manuscript lacks discussion of/comparison with state-of-the-art and evidence of broad applicability. The interpretability aspect is weak, yet over-emphasized.

      We appreciate the reviewer’s excellent summary of the paper, and we thank the reviewer for the insightful suggestions and comments.

      Strengths:

      The manuscript is well organized with good visualizations and is easy to follow. The methodology is discussed in detail. The IDEA energy model seems like an interesting way to study a protein-DNA system in the context of a given structure and binding data. The authors show that an IDEA model trained on one system can be transferred to other structurally similar systems. The authors show good performance in discriminating between binding-vs-decoy sequences for various systems, and binding affinity prediction. The authors also show evidence of the ability to predict genome-wide binding sites.

      We appreciate the reviewer’s strong assessment of the strengths of this paper.

      Weaknesses:

      An energy-based model that needs to be optimized for specific systems is inherently an uncomfortable idea. Is this kind of energy model superior to something like Rosetta-based energy models, which are generally applicable? Or is it superior to family-specific knowledge-based models? It is not clear.

      We thank the reviewer for the insightful comments. We will include predictions by generic protein-DNA energy models, such as the Rosetta-based energy model or family-specific knowledge-based model, to compare with our model performance.

      Prediction of binding affinity is a well-studied domain and many competitors exist, some of which are well-used. However, no quantitative comparison to such methods is presented. To understand the scope of the presented method, IDEA, the authors should discuss/compare with such methods (e.g. PMID 35606422).

      We thank the reviewer for the insightful comments. In our initial submission, Figure S5 presents a comparison between our model’s prediction and those of an existing method using 10-fold cross-validation. We agree a more comprehensive comparison with other methods is needed and will include a discussion and comparison of the IDEA model’s performance with additional state-of-the-art models.

      The term “interpretable” has been used lavishly in the manuscript while providing little evidence on the matter. The only evidence shown is the family-specific residue-nucleotide interaction/energy matrix and speculations on how these values are biologically sensible. Recent works already present more biophysical, fine-grained, and sometimes family-independent interpretability (e.g. PMID 39103447, 36656856, 38352411, etc.). The authors should put into context the scope of the interpretability of IDEA among such works.

      We agree that “interpretability” should be discussed in a relevant context. We will discuss the scope of IDEA interpretability within the context of recent works, including those suggested by the reviewers.

      The manuscript disregards subtle yet important differences in commonly used terminology in the field. For example, the authors use the terms “specificity” and “affinity” almost interchangeably (for example, the caption for Figure 3A uses “specificity” although the Methods text describes the prediction as about “affinity”). If the authors are looking to predict specificity, IDEA needs to be put in the context of the corresponding state-of-the-art (PMID 36123148, 39103447, 38867914, 36124796, etc).

      We really appreciate the reviewer for pointing out our conflation of “specificity” and “affinity” in the manuscript. To clarify, IDEA’s primary function is to predict the binding affinities of protein-DNA pairs in a sequence-specific manner. The acquired binding affinities of target DNA sequences can then be used to assess the specific binding motifs. We will revise our text to clarify this point.

      It is not clear how much the learned energy model is dependent on the structural model used for a specific system/family. It would be interesting to see the differences in learned model based on different representative PDB structures used. Similarly, the supplementary figures show a lack of discriminative power for proteins like PDX1 (homeodomain family), POU, etc. Can the authors shed some light on why such different performances?

      We thank the reviewer for the insightful comments and agree that the family-specific energy model could provide insight into the model predictions. We will examine different energy models based on the protein family, and especially investigate whether they can explain the lack of discriminative power for certain proteins.

      It is also not clear if IDEA’s prediction for reverse complement sequences is the same for a given sequence. If so, how is this property being modelled? Either this description is lacking or I missed it.

      We thank the reviewer for the insightful comments. The IDEA model treats reverse complementary sequences separately. We will provide additional details on how these sequences are modeled.
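      For readers unfamiliar with the reverse-complement question, the sketch below shows what the two strand readings of a site are and one possible way to score both. It is not the IDEA implementation: `toy_energy` is a hypothetical stand-in for any sequence-scoring function, and taking the minimum (most favourable) energy over the two strands is an assumption made only for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def strand_symmetric_score(seq: str, score_fn) -> float:
    """Score both strand readings and keep the lower (more favourable) energy.

    Taking the minimum is an illustrative assumption, not the authors' method.
    """
    return min(score_fn(seq), score_fn(reverse_complement(seq)))


def toy_energy(seq: str) -> float:
    """Hypothetical scorer: rewards sequences that start with 'CAC'."""
    return -1.0 if seq.startswith("CAC") else 0.0


print(reverse_complement("CACGTG"))  # palindromic site: identical on both strands
print(reverse_complement("AAAGTG"))
print(strand_symmetric_score("AAAGTG", toy_energy))
```

      A palindromic site such as CACGTG reads the same on both strands, so the question only matters for non-palindromic sites, where a model that scores the two readings separately can return two different values for the same physical DNA.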

      Reviewer 3:

      Summary:

      Protein-DNA interactions and sequence readout represent a challenging and rapidly evolving field of study. Recognizing the complexity of this task, the authors have developed a compact and elegant model. They have applied well-established approaches to address a difficult problem, effectively enhancing the information extracted from sparse contact maps by integrating a set of artificial decoy sequences and available experimental data. This has resulted in the creation of a practical tool that can be adapted for use with other proteins.

      We appreciate the reviewer’s excellent summary of the paper, and we thank the reviewer for the insightful suggestions and comments.

      Strengths:

      (1) The authors integrate sparse information with available experimental data to construct a model whose utility extends beyond the limited set of structures used for training. (2) A comprehensive methods section is included, ensuring that the work can be reproduced. Additionally, the authors have shared their model as a GitHub project, reflecting their commitment to transparency of research.

      We appreciate the reviewer’s strong assessment of the strengths of this paper.

      Weaknesses:

      (1) The coarse-graining procedure appears artificial, if not confusing, given that full-atom crystal structures provide more detailed information about residue-residue contacts. While the selection procedure for distance threshold values is explained, the overall motivation for adopting this approach remains unclear. Furthermore, since this model is later employed as an empirical potential for molecular modeling, the use of P and C5 atoms raises concerns, as the interactions in 3SPN are modeled between Cα and the nucleic base, represented by its center of mass rather than P or C5 atoms.

      We appreciate the reviewer’s insightful comments. The selection of P and C5 atoms will augment our model prediction, but the prediction is robust without this selection scheme. We will provide more details on the motivation behind this selection.

      Regarding the simulation model, we acknowledge a potential disconnection between the coarse-grained level of the 3SPN model (3 coarse-grained sites per nucleotide) and the data-driven model (1 coarse-grained site per nucleotide). The selection of nucleic bases for molecular interactions in the 3SPN model follows the PI’s previous work [PMID: 34057467] and its code implementation. We will test the simulation model by incorporating interactions between Cα and P atoms. In the future, we will work on implementing IDEA model output for 1-bead-per-nucleotide DNA simulation models.

      (2) Although the authors use a standard set of metrics to assess model quality and predictive power, some ∆∆G predictions compared to MITOMI-derived ∆∆G values appear nonlinear, which casts doubt on the interpretation of the correlation coefficient.

      We thank the reviewer for the insightful comments and agree that the linear fit between our model’s prediction and the experimental data may not be ideal. The primary utility of the IDEA model is for assessing the relative binding affinities of different DNA sequences. To this end, we plan to perform additional statistical analyses that are independent of the linear correlation assumption but instead focus on the ranked order of DNA sequence binding affinities.

      (3) The discussion section lacks information about the model’s limitations and a comprehensive comparison with other models. Additionally, differences in model performance across various proteins and their respective predictive powers are not addressed.

      We thank the reviewer for the insightful comments and will compare the performance of the IDEA model with state-of-the-art methods. We will also perform detailed analyses of the learned energy models across different proteins and examine their correlation with the model’s predictive powers.

    1. Reviewer #2 (Public review):

      Summary:

      The authors aimed to investigate how IL-4 modulates the reactive state of microglia in the context of neuropathic pain. Specifically, they sought to determine whether IL-4 drives an increase in CD11c+ microglial cells, a population associated with anti-inflammatory responses, and whether this change is linked to the suppression of neuropathic pain. The study employs a combination of behavioral assays, pharmacogenetic manipulation of microglial populations, and characterization of microglial markers to address these questions.

      Strengths:

      The methodological approach in this study is robust, providing convincing evidence for the proposed mechanism of IL-4-mediated microglial regulation in neuropathic pain. The experimental design is well thought out, utilizing two distinct neuropathic pain models (SpNT and SNI), each yielding different outcomes. The SpNT model demonstrates spontaneous pain remission and an increase in the CD11c+ microglial population, which correlates with pain suppression. In contrast, the SNI model, which does not show spontaneous pain remission, lacks a significant increase in CD11c+ microglia, underscoring the specificity of the observed phenomenon. This design effectively highlights the role of the CD11c+ microglial population in pain modulation. The use of behavioral tests provides a clear functional assessment of IL-4 manipulation, and pharmacogenetic tools allow for precise control of microglial populations, minimizing off-target effects. Notably, the manipulation targets the CD11c promoter, which presumably reduces the risk of non-specific ablation of other microglial populations, strengthening the experimental precision. Moreover, the thorough characterization of microglial markers adds depth to the analysis, ensuring that the changes in microglial populations are accurately linked to the behavioral outcomes.

      Weaknesses:

      One potential limitation of the study is that the mechanistic details of how IL-4 induces the observed shift in microglial populations are not fully explored. While the study demonstrates a correlation between IL-4 and CD11c+ microglial cells, a deeper investigation into the specific signaling pathways and molecular processes driving this population shift would greatly strengthen the conclusions. Additionally, the paper does not clearly integrate the findings into the broader context of microglial reactive state regulation in neuropathic pain.

    1. Reviewer #3 (Public review):

      Summary

      The study presents an investigation into how hypothalamic orexin neurons (HONs) track body movement with high precision. Using techniques including fiber photometry, video-based movement metrics, and empirical mode decomposition (EMD), the authors demonstrate that HONs encode net body movement consistently across a range of behaviors and metabolic states. They also compare the ability of HONs to track body movement with that of other subcortical neural populations and show that HON activity is distinguishable from the activity of those populations.

      Strengths:

      The study characterizes HON activity as a key indicator of movement and arousal, and this method may have implications for understanding sleep disorders, energy regulation, and brain-body coordination. Overall, I think this is a very interesting story, with novel findings and implications about sensorimotor systems in animals. The manuscript is clearly written and the evidence presented is rigorous. The conclusions are well supported by experimental data with clear statistical analyses.

      Weaknesses/suggestions:

      There are a couple of issues I think the authors could address to make the paper better and more complete:

      (1) The study primarily focuses on steady-state behaviors. It would be interesting if the authors' current dataset allows analyses of HON dynamics during transitions between behavioral states (e.g., resting to running or grooming to sniffing). This could provide additional insights into how HONs adapt to rapid changes in body movement.

      (2) Given the established role of HONs in arousal and wakefulness, the study could further investigate how movement-related HON dynamics interact with arousal states. For example, does HON encoding of movement differ during sleep versus wakefulness?

      (3) Although HON ablation experiments suggest that HONs do not shape movement frequency profiles, it would be more compelling if the authors could investigate whether HONs contribute to specific types of movements (e.g., fine motor vs. gross motor movements) or modulate movement initiation thresholds.

      (4) The heterogeneous movement-related orexin dynamics observed in the LC and SNc raise intriguing questions about the circuit-level mechanisms underlying these differences. Optogenetic or chemogenetic manipulation of these projections could validate the functional implications of these dynamics.

    1. Here is a summary briefing document based on the November 2024 report of the Inspection générale de l'éducation, du sport et de la recherche (IGÉSR) on teaching and learning in an environment of trust and safety:

      Report title: Teaching and learning in trust and safety: an essential issue for the nation.

      Context and stakes:

      * The report stresses the crucial importance of trust and safety in the teaching and learning process. These feelings are essential to the motivation and well-being of students and staff.
      * Growing vulnerabilities among students and staff, such as deteriorating mental health and loss of motivation, make these issues even more important.
      * Societal phenomena (climate, health, and economic crises, urban violence, cyberbullying, conspiracy thinking, social divides, distrust of institutions) affect the school climate.

      Main findings:

      * Staff show a high-quality commitment to acting every day with loyalty, rigor, and care in order to foster a climate of trust and protection.
      * Trust is still insufficiently examined as an issue in its own right, even though it is a fundamental condition for the quality of the teaching relationship and a cement of collective commitment.
      * Insecurity is a societal problem that enters the school and weakens trust.

      pHARe program:

      * The pHARe anti-bullying program is broadly appreciated and considered effective, although some of its pillars need to be consolidated to make it easier to understand.
      * The shared concern method (méthode de la préoccupation partagée, MPP) is judged effective against recent cases of intimidation and bullying, provided that certain conditions of use are respected. Many staff members regret its removal from the national recommendations.

      Key recommendations:

      * Reaffirm respect for others as a fundamental learning objective.
      * Protect a weekly session of moral and civic education (enseignement moral et civique, EMC). Increase the weighting of EMC in examinations.
      * Explain the meaning of the school rules to parents and students.
      * Open a national discussion on the conditions that could make assessment and guidance less anxiety-inducing.
      * Draft a founding text (a charter or other document) for a shared professional ethics among national education staff.
      * Extend training in active listening and dialogue with adolescents to all staff.
      * Roll out the regular practice of collaborative activities, particularly in lower secondary school (collège).
      * Make school-life reviews (bilans de vie scolaire) standard practice in order to evaluate the impact of each school's educational policy.
      * Develop a dual educational alliance, with students as well as with parents.

      Priorities for a relational climate of trust and safety:

      1. Priority 1: Give a central place to respect for others and develop psychosocial skills.
      2. Priority 2: Strengthen support for educational teams (discussion groups, supervision, training for school heads).
      3. Priority 3: Encourage students and teams to engage in collaborative dynamics and strengthen the sense of belonging.
      4. Priority 4: Develop a dual educational alliance, with students as well as with parents.

      Actions and measures:

      * Support for staff: discussion groups, reassuring leadership, improved quality of working life.
      * Student engagement: collaborative activities, collective projects, student ambassadors and sentinels.
      * Educational alliance: parental involvement, clear communication about bullying, actions to make it easier for families to participate.
      * Tools and methods: shared concern method (MPP), school-life reviews, digital tools.

      Conclusion:

      * It is crucial to act with firmness and care to create a teaching and learning context marked by trust and safety.
      * It is essential to identify problems clearly and to strengthen collaboration at every level of the education system.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Hahn et al use bystander BRET, NanoBiT assays, and APEX2 proteomics to investigate endosomal signaling of CCR7 by two agonists, CCL19 and CCL21. The authors suggest that CCR7 signals from early endosomes following internalisation. They use spatial proteomics to try to identify novel interacting partners that may facilitate this signaling and use this data to specifically enhance a Rac1 signaling pathway. Many of the results in the first few figures showing simultaneous recruitment of Barr and G proteins by CCR7 have been shown previously (Laufer et al, 2019, Cell Reports), as has signaling from endomembranes, and Rac1 activation at intracellular sites. The new findings are the APEX2 proteomics studies, which could be useful to the scientific community. Unfortunately, the authors only follow up on a single finding, and the expansion of this section would improve the manuscript.

      First of all, we would like to thank the reviewer for helping with the manuscript. The summary is mostly accurate except for the statement that simultaneous recruitment of barr and G protein to CCR7 has been shown before. It should also be noted that it has not previously been demonstrated that CCR7 activates G proteins from endosomes, nor has the functional role of this signaling mechanism been established. However, that CCR7 activity at endomembranes is associated with Rac1 signaling was demonstrated in the Laufer et al. study, as the reviewer correctly points out.

      Strengths:

      (1) The APEX2 resource will be valuable to the GPCR and immunology community. It offers many opportunities to follow up on findings and discover new biology. The resource could also be used to validate earlier findings in the current manuscript and in previous manuscripts. Was there enrichment of early endosomal markers, Barr and Gi as this would provide further evidence for their earlier claims regarding endosomal signaling? Previous studies have suggested signaling from the TGN, so it is possible that the different ligands also direct to different sites. This could easily be investigated using the APEX2 data.

      Thank you for your comment. We do in fact observe enrichment of TGN/Golgi markers in response to chemokine stimulation, which we now have highlighted in the manuscript (fourth paragraph on page 7).

      (2) The results section is well written and can be followed very easily by the reader.

      We are glad that the reviewer found the results section very readable.

      (3) Some findings verify previous studies (e.g. endomembrane signalling). This should be acknowledged as this shows the validity of the findings of both studies.

      This is correct. We have now included more discussion of previous work related to CCR7 signaling at endomembranes (third paragraph on page 10).

      Weaknesses:

      (1) The findings are interesting although the studies are almost all performed in HEK293 cells. I understand that these are commonly used in GPCR biology and are easy to transfect and don't express many GPCRs at high concentrations, but their use is still odd when there are many cell-lines available that express CCR7 and are more reflective of the endogenous state (e.g. they are polarised, they can perform chemotaxis/ migration). Some of the findings within the study should also be verified in more physiologically relevant cells. At the moment only the final figure looks at this, but findings need to be verified elsewhere.

      We thank the reviewer for raising this point and giving us an opportunity to elaborate in further detail. The major goal of our study was to investigate whether CCR7 activates G protein from endosomes, the underlying mechanism, and functions of this potential signaling mechanism. The reason we chose CCR7 as our model receptor was that it belongs to a group of GPCRs, the chemokine receptors, that most often have features associated with the ability to promote endosomal G protein activation (phosphorylation site clusters in the C-terminal region).

      Specific detection of G protein activation at distinct subcellular compartments is currently very challenging in truly endogenous systems despite new innovative biosensors that are available (not just related to CCR7, but GPCRs in general). To our knowledge, most if not all studies that detect direct activation of G protein at a specific compartment, whether at the plasma membrane, endosome, Golgi, or other compartments, have overexpressed either the receptor, G protein, or both. This is why we chose the HEK293 cell system, which is easy to manipulate, for most of our experiments. That being said, we did confirm major findings in an indirect manner using Jurkat T-cells, which express CCR7 endogenously and are physiologically relevant. Our hope is that in the future we will be able to use highly sensitive biosensors to directly confirm our findings in such a cell system, as the reviewer wisely suggests.

      (2) The authors acknowledge that the kinetic patterns of the signals at the early endosome are not consistent with the rates of internalisation. They mention that this could be due to trafficking elsewhere. This could be easily looked at in their APEX2 data. Is there evidence of proximity to markers of other membranes? Perhaps this could be added to the discussion. Similarly, previous studies have shown that CCR7 signaling may involve the TGN. Was there enrichment of these markers? If not, this could also be an interesting finding and should be discussed. It is also possible that the Rab5 reporter is just not as efficient as the trafficking one, especially as in later figures the very convincing differences in the two ligands are not as robust as the differences in trafficking.

      Excellent point. We have now highlighted the possibility of CCR7 being further trafficked to the trans-Golgi network (TGN) as possible explanation for the transient translocation of activated CCR7 to the early endosome in Fig. 1G-H (second paragraph on page 3).

      Furthermore, in the APEX2 experiment we observe enrichment of proteins involved in lysosomal trafficking (LAMP1, VPS16, VAMP7, WDR91, and PP4P1) by CCL19 stimulation at 25 min, and recycling endosomes/TGN markers (SNX6, RAB7L, and GGA) by CCL21 stimulation at 25 min. In addition to this, several markers of TGN/Golgi (SNX3, COG5, YIF1A, SC22B, and AP3S1) were enriched as well in response to both CCL19 and CCL21 stimulation. We have now included a statement in the manuscript, which describes the likely trafficking of CCR7 to the TGN/Golgi in response to CCL19 and CCL21 stimulation (fourth paragraph on page 7).

      (3) In the final sentence of paragraph 2 of the results the authors state that the internalisation is specific to CCR7 as there isn't recruitment to V2R. I'm not sure this is the best control. The authors can only really say it doesn't recruit to unrelated receptors. The authors could have used a different chemokine receptor which does not respond to these ligands to show this.

      The point with this control experiment was to demonstrate that the loss of NanoBiT signal in response to CCL19 in CCR7-SmBiT/LgBiT-CAAX expressing cells, but not in V2R-SmBiT/LgBiT-CAAX expressing cells, was a result of bona fide CCR7 internalization rather than potential artifactual effects of CCL19 on the NanoBiT system. Our intent was not to demonstrate specificity of CCL19 among chemokine receptors, which already has been thoroughly tested in previous studies. We have now modified the sentence (second paragraph on page 3) “Moreover, CCL19/CCL21-stimulation of receptor internalization to endosomes is specific to CCR7 as none of the chemokines promote internalization or trafficking to endosomes of the vasopressin type 2 receptor (V<sub>2</sub>R)-SmBiT construct (Fig. S1E-F)” to “Moreover, CCL19/CCL21-stimulation did not promote internalization or trafficking to endosomes of the vasopressin type 2 receptor (V<sub>2</sub>R)-SmBiT construct, which validates that these chemokines act specifically via the CCR7-SmBiT system (Fig. S1E-F).”

      (4) The miniGi-Barr1 and imaging showing co-localisation could be more convincing if it was also repeated in a more physiological cell line as in the final figure. Imaging of CCR7, miniGi, and Barr1 would also provide further evidence that the receptor is also present within the complex.

      We agree with the reviewer’s assessment. However, as mentioned above, it is currently extremely challenging to detect endogenous G protein coupling/activation to endogenous receptors. In addition, we are not sure if overexpressing fluorophore-tagged receptor, miniG, and barr1 in a physiologically relevant cell line would provide truly physiological conditions, as the expression of these proteins would still be artificially high. This is why we chose to conduct these mechanistic experiments in HEK293 cells and then indirectly verify key findings in an endogenous and physiologically relevant cell line.

      (5) The findings regarding Rac1 are interesting, although an earlier paper found similar results (Laufer et al, 2019, Cell Reports), so perhaps following up on another APEX2-identified protein pathway would have been more interesting. The authors' statement that Rac1 is specifically activated, and RhoA and Cdc42 are not, is unconvincing from the current data. Only a single NanoBiT assay was used, and as raw values are not reported it is difficult for the reader to glean some essential information. The authors should show evidence that these reporters work well for other receptors (or cite previous studies) and also need evidence from an independent (i.e. non-NanoBiT or BRET) assay.

      The major focus of the study was to investigate whether CCR7 can activate G protein after having been internalized into endosomes via formation of CCR7-Gi/o-barr megaplexes, and to dissect out potential functions of said endosomal G protein signaling. To do this, we used CCL19 and CCL21, which stimulate G protein to the same extent but differ in their ability to promote barr recruitment and receptor internalization, with CCL19 being superior to CCL21. To this end, we found that CCL19 also promotes endosomal G protein activation to a greater extent than CCL21, and therefore, we specifically looked for proteins enriched by CCL19 in our APEX experiment. This led us to some Rho GTPase regulators that were differentially enriched by CCL19 and CCL21. We agree that there were other interesting effectors related to CCR7 biology identified in the APEX experiment such as EYA2, GRIP2, and EI24. However, those proteins were enriched similarly by CCL19 and CCL21 challenge, and thus, do not seem to be activated specifically at endosomes. Following the same argument, we also did not observe any difference in the activity of RhoA or Cdc42 when stimulated with CCL19 or CCL21, so we cannot conclude that these signaling proteins are activated specifically in endosomes. On the other hand, Rac1 was stimulated to a larger degree by CCL19 than CCL21, and its activity was inhibited by the Gi/o inhibitor PTX and the endocytosis inhibitors Dyngo-4a and PitStop2. CCR7-mediated Rac1 signaling was also inhibited by expression of a dominant negative dynamin mutant that inhibits receptor internalization, and Rac1 was not activated by an internalization-deficient CCR7-DS/T mutant. Finally, the involvement of Rac1 in CCR7-mediated chemotaxis of Jurkat T cells was also demonstrated. We believe that these findings together provide a strong basis for the claim that endosomal Gi/o protein signaling by CCR7 activates Rac1.

      Following the reviewer’s suggestion, we have now included experiments to show that the activation of RhoA, Rac1, and Cdc42 by CXCR4 also can be detected by the NanoBiT biosensors (Fig. S7D-F). We have also added the appropriate references to the original studies where these biosensors were developed in the results section (first paragraph on page 8).

      (6) At present, the studies in Figure 7 do not go beyond those in the previous Laufer et al study in which they showed blocking endocytosis affected Rac1 signalling. The authors could show that Rac1 signalling is from early endosomes to improve this, otherwise, it could be from the TGN as previously reported.

      The major purpose of Figure 7 was to indirectly confirm findings from the HEK293 cell experiments and to tie them to physiological functions. Our experiments using Jurkat T-cells show that CCL19 promotes a stronger chemotactic response than CCL21 despite a similar Gi/o response. In addition, we showed that CCR7-mediated Gi/o activation, receptor endocytosis, and Rac1 activity are required to drive chemotaxis. The Laufer et al. study did not investigate whether CCR7 activates G protein after having been internalized into endosomes via formation of CCR7-Gi/o-barr megaplexes, and thus, did not focus on functional outcomes of this signaling mechanism. Based on this, we believe our work provides new and valuable knowledge to the field.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript describes a comprehensive analysis of signalling downstream of the chemokine receptor CCR7. A comprehensive dataset supports the authors' hypothesis that G protein and beta-arrestin signalling can occur simultaneously at CCR7 with implications for continued signalling following receptor endocytosis.

      We would like to thank the reviewer for helping with the manuscript. We agree on all points made and have now updated the manuscript accordingly.

      Strengths:

      The experiments are well controlled and executed, employing a wide range of assays using - in the main - CCR7 transfectants. Data are well presented, with the authors' claims supported by the data. The paper also has an excellent narrative which makes it relatively easy to follow. I think this would certainly be of interest to the readership of the journal.

      We appreciate the positive assessment of strengths.

      Weaknesses:

      Since the authors show a differential enrichment of RhoGTPases by CCR7 stimulation with CCL19 versus CCL21, I think that they also need to show that the Gi/o coupling of HEK-293-CCR7-APEX2 cells to both CCL19 and CCL21 is not perturbed by the modification. Currently, the authors only show data for CCL19 signalling, which leaves the potential for a false negative finding in terms of CCL21 signalling being selectively impaired. This should be relatively easy to do and should strengthen the authors' conclusions.

      We agree with the reviewer and have now included experiments to show that both CCL19- and CCL21-mediated CCR7-APEX2 stimulation leads to Gi/o activation (Fig. S4C). In addition, our proteomics experiments show strong effects of both CCL19 and CCL21 stimulation, which suggest that the receptor is activated by both ligands.

      The authors conclude the discussion by suggesting that their findings highlight endosomal signalling as a general mechanism for chemokine receptors in cell migration. I think this is an overreach. The authors chose several studies of CXC chemokine receptors to support their argument that C-terminal truncation or mutation of the C-terminal phosphorylation sites impairs endocytosis and chemotaxis (refs 40-42). However, in some instances, e.g. at the related chemokine receptor CCR4, C-terminal removal of these sites impairs endocytosis but promotes chemotaxis (Nakagawa et al, 2014; Anderson et al, 2020). I therefore think that either the final statement needs to be tempered or the counterargument discussed a little.

      We appreciate the reviewer highlighting this point. We have now modified the concluding sentence from “Thus, the findings from our study highlight endosomal G protein signaling by chemokine receptors as a potential general mechanism that regulates key aspects of cell migration” to “Thus, the findings from our study highlight endosomal G protein signaling by some chemokine receptors as a potential mechanism that regulates key aspects of cell migration.” We hope that the temper level of this sentence is more appropriate.

      References:

      Anderson, C. A. et al. A degradatory fate for CCR4 suggests a primary role in Th2 inflammation. J Leukocyte Biol 107, 455-466 (2020).

      Nakagawa, M. et al. Gain-of-function CCR4 mutations in adult T cell leukaemia/lymphoma. Journal of Experimental Medicine 211, 2497-2505 (2014).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The results section is well written, although the introduction needs more information on what is known about CCR7 trafficking and endomembrane signaling. I understand this is because the authors wanted to focus on GPCR signaling, but the study will equally be of interest to researchers in the immunology and chemokine fields, and therefore more CCR7-focussed discussion in the introduction would be useful. Similarly, the discussion would benefit from more discussion of previous studies of CCR7 trafficking and endomembrane signaling (in particular the Laufer et al paper) to acknowledge that many of the findings within this paper verify previous studies.

      We have now included additional immunology/endomembrane background information about CCR7 at the place where the receptor is introduced (first paragraph on page 3). We have also expanded our discussion of our work in relation to the Laufer et al. study (third paragraph on page 10).

      (2) On page 5, the authors state that 'The response to chemokine stimulation was not observed in mock transfected HEK293 cells'. Figure S4D does not have a legend so it is difficult to see what they mean by mock transfected. Do they mean not transfecting with anything or not with the receptor? The better control would be transfecting the reporters but not the receptor. This may have been done, but the wording needs clarifying and S4D needs a legend.

      Thanks for pointing this out. We believe the reviewer refers to Figure S2D and we have now highlighted/clarified the legend better. Mock transfected conditions refer to HEK293 cells transfected with the reporter, but not the receptor. This is written in the legend as “(D) Change in luminescence signal generated between SmBiT-barr1 and LgBiT-miniGi in response to 100 nM CCL19 or 100 nM CCL21 in mock transfected HEK293 cells (no CCR7)”, which we believe should be clear to the audience.

      (3) The validation of the APEX2 receptor construct relies on a single assay with one ligand. The authors should show that the receptor expresses at the cell surface, is internalised normally, and that both ligands activate the receptor.

      We have now included additional data to show that (1) the receptor is expressed at the cell surface, (2) CCR7-APEX2 recruits barr1 to the plasma membrane, (3) this association leads to barr1 translocation to the early endosomes as an indirect measurement of receptor internalization, and (4) both CCL19- and CCL21-stimulation inhibit forskolin-induced cAMP production (Fig. S4A-C, and described in fifth paragraph on page 6).

      (4) The APEX2 section is very short, especially as this is novel data. It lacks some important information, e.g. when the authors state that 'we identified a total of 579 proteins', is this in total for both ligands, separately or were some shared? More information on each ligand separately and combined would make this clearer.

      We have now specified that the total number of proteins identified as enriched by our APEX2 approach refers to cells stimulated with either CCL19 or CCL21 (third paragraph on page 7). Furthermore, we have included a Venn diagram in Fig. S5C to show how many proteins were enriched by CCL19 or CCL21 stimulation and how many of those were shared at different time points.
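
      As a minimal sketch of the counts that such a Venn diagram summarizes, the example below uses only the marker proteins named earlier in this response (lysosomal trafficking markers for CCL19, recycling endosome/TGN markers for CCL21, and TGN/Golgi markers for both). It is not the 579-protein dataset or Fig. S5C; the set names and the `venn_counts` helper are hypothetical and introduced here purely for illustration.

```python
# Marker proteins named in the response text. Grouping them into Python sets
# is an illustrative assumption, not the authors' analysis pipeline.
shared_markers = {"SNX3", "COG5", "YIF1A", "SC22B", "AP3S1"}  # both ligands
ccl19_enriched = {"LAMP1", "VPS16", "VAMP7", "WDR91", "PP4P1"} | shared_markers
ccl21_enriched = {"SNX6", "RAB7L", "GGA"} | shared_markers


def venn_counts(a, b):
    """Return the three region sizes of a two-set Venn diagram."""
    return {
        "only_a": len(a - b),  # enriched by the first ligand only
        "both": len(a & b),    # enriched by both ligands
        "only_b": len(b - a),  # enriched by the second ligand only
    }


counts = venn_counts(ccl19_enriched, ccl21_enriched)
print(counts)
```

      Repeating the same set arithmetic for each time point would give one set of region counts per panel of the diagram.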

      (5) The discussion would benefit from some further work. The current first two paragraphs just reiterate the introduction and don't discuss the current paper so could be removed completely. The Laufer et al study needs much more discussion as they report many of the findings of the current paper (signaling following endocytosis, Rac1 endomembrane signaling) five years ago. The APEX2 findings that are discussed, though interesting, are not followed up by further experimental evidence and there is little discussion of why the two ligands have different responses or what the physiological effects could be.

      We appreciate the reviewer’s effort in helping with the discussion. To this end, we have now expanded our discussion of the mentioned paper further as suggested (third paragraph on page 10). We agree that the findings from our APEX experiment are interesting, but the focus of this study relates to proteins enriched specifically at endosomes. Several of the most enriched proteins did not show this localization bias, which is why these proteins were not further investigated.

      Minor changes:

      (1) The authors should remove the word 'recent' at the start of the first sentence of the third paragraph. Endosomal signaling by GPCRs was described 15 years ago so cannot really be seen as recent anymore.

      We have now adjusted the manuscript accordingly.

      (2) Tukey defaulted to Turkey in some places.

      We thank the reviewer for pointing out these typos, which now have been corrected.

      Reviewer #2 (Recommendations For The Authors):

      Minor Points:

      (1) ACKRs do not couple to G proteins so it is peculiar to see them in this table. I would limit the table to the conventional CCR1-10, CXCR1-6 and XCR1. The ligand for XCR1 is XCL1 which is absent from the table.

      We have now modified the table accordingly.

      (2) CCL19 (formerly known as ELC) has been long known to be a more efficacious and potent ligand in chemotaxis assays (Bardi et al, 2001). This earlier reference should be added to the citations in the preceding statement on page 10.

      This is an important study showing that CCL19 is more efficacious than CCL21 in promoting chemotaxis and that this has been known for decades. We have now included the reference accordingly (reference 59 in second paragraph on page 11).

      (3) Figure 6, Panel Q. I think the legends for CCR7 and CCR7 delta ST might be flipped.

      We thank the reviewer for pointing out this error. We have now corrected the figure panel.

      (4) Figure S5 (or 5) might benefit from simple Venn diagrams showing the numbers of differentially enriched proteins following treatment with the two ligands at different time points.

      We have included a Venn diagram in Fig. S5C to show how many proteins were enriched by CCL19 or CCL21 stimulation and how many of those were shared.

      Reference:

      Bardi, G., Lipp, M., Baggiolini, M. & Loetscher, P. The T cell chemokine receptor CCR7 is internalized on stimulation with ELC, but not with SLC. European Journal of Immunology 31, 3291-3297 (2001).

    1. Reviewer #3 (Public review):

      In this work, the authors conducted a large-scale field trial of 130 indica accessions in normal vs. moderate salt stress conditions. The experiment consists of 3 replicates for each accession in each treatment, making it 780 plants in total. Leaf transcriptome, plant traits, and final yield were collected. Starting from a quantitative genetics framework, the authors first dissected the heritability and selection forces acting on gene expression. After summarizing the selection force acting on gene expression (or plant traits) in each environment, the authors described the difference in gene expression correlation between environments. The final part consists of eQTL investigation and categorizing cis- and trans-effects acting on gene expression.

      Building on the group's previous study and using a similar methodology (Groen et al. 2020, 2021), the unique aspect of this study is in incorporating large-scale empirical fieldwork and combining gene expression data with plant traits. Unlike many systems biology studies, this study strongly emphasizes the quantitative genetics perspective and investigates the empirical fitness effects of gene expression data. The large amounts of RNAseq data (one sample for each plant individual) also allow heritability calculation. This study also utilizes the population genetics perspective to test for traces of selection around eQTL. As there are too many genes to fit in multiple regression (for selection analysis) and to construct the G-matrix (for the breeder's equation), grouping genes into PCs is a very good idea.

      In the previous review, three major points were mentioned. The manuscript was modified, and here I briefly summarize them as a reference for future works:

      (1) The separate sections (selection analysis, transcript correlation structure change, and eQTL) could use better integration.
      (2) It would be worth considering joint analyses integrating the two environments together.
      (3) Whether gene expression PCs or unique expression modules should be used in selection analyses.

      Regarding whether to use PCs or WGCNA eigengenes to summarize gene expression for selection analyses, the authors reported that only a few WGCNA eigengenes were under selection, citing this observation as the rationale for choosing PC over eigengenes. However, as the relative false positive-negative rates of these choices likely require another dedicated study to explore, at this stage, it might be premature to state which method is better based on which gives more positive results. On one hand, one could easily imagine that plants screwed up by salinity have erratic genomewide expression and become extreme data points on the PCs, making the PCs a good proxy to correlate with fitness. On the other, it remains to be discussed whether this genomewide screwed-up-ness is what we want to measure in this study or whether we should focus on more dedicated gene modules instead. I suggest the authors acknowledge both possibilities. In this revision, I do not see relevant WGCNA results (as mentioned in the previous response letter) reported.

      Figure 4: The observation that chlorophyll a content is under negative selection under BOTH conditions is a bit counterintuitive. The manuscript only mentioned "consistent with the general trend for reduced photosynthesis under salinity stress" (line 329) but did not mention why this increased fitness, even in normal conditions.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Understanding the mechanisms of how organisms respond to environmental stresses is a key goal of biological research. Assessment of transcriptional responses to stress can provide some insights into those underlying mechanisms. The researchers quantified traits, fitness, and gene expression (transcriptional) response to salinity stress (control vs stress treatments) for 130 accessions of rice (three replicates for each accession), which were grown in the field in the Philippines. This experimental design allowed for many different types of downstream analyses to better understand the biology of the system. These analyses included estimating the strength of selection imposed on transcription in each environment, evaluating possible trade-offs in gene expression, testing whether salinity induces transcriptional decoherence, and conducting various eQTL-type analyses.

      Strengths:

      The study provides an extensive analysis of gene expression responses to stress in rice and offers some insights into underlying mechanisms of salinity responses in this important crop system. The fact that the study was conducted under field conditions is a major plus, as the gene expression responses to soil salinity are more realistic than if the study was conducted in a greenhouse or growth chamber. The preprint is generally well-written and the methods and results are mostly well-described.

      Weaknesses:

      While the study makes good use of analyzing the dataset, it is not clear how the current work advances our understanding of gene regulatory evolution or plant responses to soil salinity generally. Overall, the results are consistent with other prior studies of gene expression and studies of selection across environmental conditions. Some of the framing of the paper suggests that there is more novelty to this study than there is in reality. That said, the results will certainly be useful for those working in rice and should be interesting to scientists interested in how gene expression responses to stress occur under field conditions. I detail other concerns I had about the preprint below:

      The abstract on lines 33-35 illustrates some of my concerns about the overstatement of the novelty of the current study. For example, is it really true that the role of gene expression in mediating stress response and adaptation is largely unexplored? There have been numerous studies that have evaluated gene expression responses to stresses in a wide range of organisms. Perhaps, I am missing something critically different about this study. If so, I would recommend that the authors reword this sentence to clarify what gap is being filled by this study. Further, is it really the case that none of them have evaluated how the correlational structure of gene expression changes in response to stresses in plants, as implied in lines 263-265? Don't the various modules and PC analyses of gene expression get at this question?

      We have re-worded these sentences, and highlighted the novelty of our work.

      There were some places in the methods of the preprint that required more information to properly evaluate. For example, more information should be provided on lines 664-668 about how G, E, and GxE effects were established, especially since this is so central to this study. What programs/software (R? SAS? Other?) were used for these analyses? If R, how were the ANOVAs/models fit? What type of ANOVA was used? How exactly was significance determined for each term? Which effects were considered fixed and which were random? If the goal was to fit mixed models, why not use an approach like voom-limma (Law et al. 2014 Genome Biology)? More details should also be added to lines 688-709 about these analyses, including what software/programs were used for these analyses.

      We have added more details in the methods. Also, although we could in principle use voom-limma to fit our mixed model, to be able to partition variance into G, E and G×E, we need to use the function fitExtractVarPartModel (from the package variancePartition), which requires all categorical variables to be modeled as random effects. Therefore, we couldn’t model environment as a fixed effect.
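
      A minimal sketch of what partitioning expression variance into G, E, and G×E components involves, assuming a balanced design and a fully random two-way model. It uses ANOVA method-of-moments estimators rather than the REML fit that variancePartition performs, and the function name `partition_variance` and the toy data layout are hypothetical, not taken from the manuscript.

```python
def partition_variance(y):
    """Fraction of variance due to G, E, GxE, and residual for one transcript.

    y[i][j] is a list of replicate values for genotype i in environment j.
    Assumes a balanced design (same number of replicates in every cell)
    and treats genotype, environment, and their interaction as random effects.
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[i][j]) / n for j in range(b)] for i in range(a)]
    g_mean = [sum(cell[i]) / b for i in range(a)]
    e_mean = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(g_mean) / a

    # Mean squares of the balanced two-way layout.
    ms_g = b * n * sum((m - grand) ** 2 for m in g_mean) / (a - 1)
    ms_e = a * n * sum((m - grand) ** 2 for m in e_mean) / (b - 1)
    ms_ge = n * sum(
        (cell[i][j] - g_mean[i] - e_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    ) / ((a - 1) * (b - 1))
    ms_res = sum(
        (v - cell[i][j]) ** 2
        for i in range(a) for j in range(b) for v in y[i][j]
    ) / (a * b * (n - 1))

    # Method-of-moments variance components, truncated at zero.
    comps = {
        "G": max((ms_g - ms_ge) / (b * n), 0.0),
        "E": max((ms_e - ms_ge) / (a * n), 0.0),
        "GxE": max((ms_ge - ms_res) / n, 0.0),
        "residual": ms_res,
    }
    total = sum(comps.values())
    return {name: value / total for name, value in comps.items()}


# Toy data: three genotypes, two environments, two replicates per cell,
# with genotype differences only (no environment or interaction effect).
toy = [[[g - 0.5, g + 0.5] for _ in range(2)] for g in (0.0, 2.0, 4.0)]
parts = partition_variance(toy)
```

      In this toy case nearly all variance is assigned to G, with the remainder left as residual.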

      One thing that I found a bit confusing throughout was the intermixing of different terms and types of selection. In particular, there seemed to be some inconsistencies with the usage of quantitative genetics terms for selection (e.g. directional, stabilizing) vs molecular evolution terms for selection (e.g. positive, purifying). I would encourage the authors to think carefully about what they mean by each of these terms and make sure that those definitions are consistently applied here.

      We have defined the selection terms used in the study and used these terms consistently throughout the manuscript.

      It would be useful to clarify the reasons for the inherent bias in the detection of conditional neutrality (CN) and antagonistic pleiotropy (AP; Lines 187-196). It is also not clear to me what the authors did to deal with the bias in terms of adjusting P-value thresholds for CN and AP the way it is currently written. Further, I found the discussion of antagonistic pleiotropy and conditional neutrality to be a bit confusing for a couple of reasons, especially around lines 489-491. First of all, does it really make sense to contrast gene expression versus local adaptation, when lots of local adaptation likely involves changes in gene expression? Second, the implication that antagonistic pleiotropy is more common for local adaptation than the results found in this study seems questionable. Conditional neutrality appears to be more common for local adaptation as well: see Table 2 of Wadgymar et al. 2017 Methods in Ecology and Evolution. That all said, it is always difficult to conclude that there are no trade-offs (antagonistic pleiotropy) for a particular locus, as the detecting trade-offs may only manifest in some years and not others and can require large sample sizes if they are subtle in effect.

      We have now explained the cause of the inherent bias in the detection of CN, and also elaborated on how we deal with this bias. Also, we have edited our discussion and added relevant citations to indicate both conditional neutrality and antagonistic pleiotropy can lead to local adaptations and added the caveat regarding detecting antagonistic pleiotropy.

      Reviewer #2 (Public Review):

      The authors investigate the gene expression variation in a rice diversity panel under normal and saline growth conditions to gain insight into the underlying molecular adaptive response to salinity. They present a convincing case to demonstrate that environmental stress can induce selective pressure on gene expression, which is in agreement with their earlier study (Groen et al, 2020). The data seems to be a good fit for their study and overall the analytic approach is robust.

      (1) The work started by investigating the effect of genotype and their interaction at each transcript level using 3'-end-biased mRNA sequencing, and detecting a wide-spread GXE effect. Later, using the total filled grain number as a proxy of fitness, they estimated the strength of selection on each transcript and reported stronger selective pressure in a saline environment. However, this current framework relies on precise estimation of fitness and, therefore can be sensitive to the choice of fitness proxy.

      We now acknowledge this caveat in the discussion.

      (2) Furthermore, the authors decomposed the genetic architecture of expression variation into cis- and trans-eQTL in each environment separately and reported more unique environment-specific trans-eQTLs than cis-. The relative contribution of cis- and trans-eQTL depends on both the abundance and effect size. I wonder why the latter was not reported while comparing these two different genetic architectures. If the authors were to compare the variation explained by these two categories of eQTL instead of their frequency, would the inference that trans-eQTLs are primarily associated with expression variation still hold?

      We have now also reported the effect sizes for both cis- and trans-eQTLs in the two environments and showed that the trans-eQTLs have higher effect sizes as compared to cis-eQTLs, indicating that they are able to explain a higher proportion of variation in transcript abundances in the two environments.

      (3) Next, the authors investigated the relationship between cis- and trans-eQTLs at the transcript level and revealed an excess of reinforcement over the compensation pattern. Here, I struggle to understand the motivation for testing the relationship by comparing the effect of cis-QTL with the mean effect of all trans-eQTLs of a given transcript. My concern is that taking the mean can diminish the effect of small trans-eQTLs potentially biasing the relationship towards the large-effect eQTLs.

      We wanted to estimate compensating vs reinforcing effects, which essentially entails identifying genes that have opposing directionality of cis and trans-effects. To get the total trans-effect we decided to take the mean effect of trans-eQTLs. This mean was only used to identify the compensating/reinforcing genes and although the mean effect diminishes the effect of small trans-eQTLs, this mean was not used in downstream analyses.
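
      The classification described in this response can be sketched as follows, under the assumption that a transcript is called reinforcing when its cis effect and its mean trans effect share a sign, and compensating when the signs differ. The function name `classify_cis_trans` and the effect values are hypothetical.

```python
from statistics import mean


def classify_cis_trans(cis_effect, trans_effects):
    """Label a transcript by comparing the sign of its cis-eQTL effect
    with the sign of the mean effect of its trans-eQTLs."""
    if not trans_effects:
        return "undetermined"
    mean_trans = mean(trans_effects)
    if cis_effect == 0 or mean_trans == 0:
        return "undetermined"
    same_direction = (cis_effect > 0) == (mean_trans > 0)
    return "reinforcing" if same_direction else "compensating"


# Two small negative trans effects are outweighed by one large positive
# trans effect, so the mean trans effect stays positive.
label = classify_cis_trans(0.5, [1.0, -0.1, -0.1])
```

      The last call illustrates the reviewer's concern: a single large-effect trans-eQTL decides the label even though most trans-eQTLs point the other way.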

      Reviewer #3 (Public Review):

      In this work, the authors conducted a large-scale field trial of 130 indica accessions in normal vs. moderate salt stress conditions. The experiment consists of 3 replicates for each accession in each treatment, making it 780 plants in total. Leaf transcriptome, plant traits, and final yield were collected. Starting from a quantitative genetics framework, the authors first dissected the heritability and selection forces acting on gene expression. After summarizing the selection force acting on gene expression (or plant traits) in each environment, the authors described the difference in gene expression correlation between environments. The final part consists of eQTL investigation and categorizing cis- and trans-effects acting on gene expression.

      Building on the group's previous study and using a similar methodology (Groen et al. 2020, 2021), the unique aspect of this study is in incorporating large-scale empirical field works and combining gene expression data with plant traits. Unlike many systems biology studies, this study strongly emphasizes the quantitative genetics perspective and investigates the empirical fitness effects of gene expression data. The large amounts of RNAseq data (one sample for each plant individual) also allow heritability calculation. This study also utilizes the population genetics perspective to test for traces of selection around eQTL. As there are too many genes to fit in multiple regression (for selection analysis) and to construct the G-matrix (for breeder's equation), grouping genes into PCs is a very good idea.

      Building on large amounts of data, this study conducted many analyses and described some patterns, but a central message or hypothesis would still be necessary. Currently, the selection analysis, transcript correlation structure change, and eQTL parts seem to be independent. The manuscript currently looks like a combination of several parallel works, and this is reflected in the Results, where each part has its own short introduction (e.g., 185-187, 261-266, 349-353). It would be great to discuss how these patterns observed could be translated to larger biological insights. On a related note, since this and the previous studies (focusing on dry-wet environments) use a similar methodology, one would also wonder what the conclusions from these studies would be. How do they agree or disagree with each other?

      We acknowledge that the manuscript currently presents some analyses in a somewhat independent manner. Although it would be ideal to have a central hypothesis/message, our study is meant to broadly outline the various responses and fitness effects of salinity stress in rice. Throughout the manuscript, we have also included comparisons between our findings and that of our previous studies on drought stress to highlight any consistent themes or novel insights.

      Many analyses were done separately for each environment, and results from these two environments are listed together for comparison. Especially for the eQTL part, no specific comparison was discussed between the two environments. It would be interesting to consider whether one could fit the data in more coherent models specifically modeling the X-by-environment effects, where X might be transcripts, PCs, traits, transcript-transcript correlation, or eQTLs.

      We do plan to consider fitting models that explicitly incorporate X-by-environment interactions to provide a more detailed understanding of the genetics of plasticity between the two environments, but it is beyond the scope of this paper. This will be explored in a separate report.

      As stated, grouping genes into PCs is a good idea, but although in theory, the PCs are orthogonal, each gene still has some loadings on each PC (ie. each PC is not controlled by a completely different set of genes). Another possibility is to use any gene grouping method, such as WGCNA, to group genes into modules and use the PC1 of each module. There, each module would consist of completely different sets of genes, and one would be more likely to separate the biological functions of each module. I wonder whether the authors could discuss the pros and cons of these methods.

      We recognize that individual genes can contribute to multiple PCs, and this is precisely why we chose PCA clustering over WGCNA, where one gene can belong to only one module. Our aim was to recognize all biological processes that could be under selection in either environment, and since one gene can be involved in various different processes, we wanted to identify the contribution of these genes to different processes, which can be done effectively by a PCA analysis.

      Reviewer #4 (Public Review):

      The manuscript examines how patterns of selection on gene expression differ between a normal field environment and a field environment with elevated salinity based on transcript abundances obtained from leaves of a diverse panel of rice germplasm. In addition, the manuscript also maps expression QTL (eQTL) that explains variation in each environment. One highlight from the mapping is that a small group of trans-mapping regulators explains some gene expression variation for large sets of transcripts in each environment. The overall scope of the datasets is impressive, combining large field studies that capture information about fecundity, gene expression, and trait variation at multiple sites. The finding related to patterns indicating increased LD among eQTLs that have cis-trans compensatory or reinforcing effects is interesting in the context of other recent work finding patterns of epistatic selection. However, other analyses in the manuscript are less compelling or do not make the most of the value of collected data. Revisions are also warranted to improve the precision with which field-specific terminology is applied and the language chosen when interpreting analytical findings.

      Selection of gene expression:

      One strength of the dataset is that gene expression and fecundity were measured for the same genotypes in multiple environments. However, the selection analyses are largely conducted within environments. The addition of phenotypic selection analyses that jointly analyze gene expression across environments and or selection on reaction norms would be worthwhile.

      We do plan to consider fitting models that explicitly incorporate G×E interactions to provide a more detailed understanding of the genetics of plasticity between the two environments, but it is beyond the scope of this paper. This will be explored in a separate report.

      Gene expression trade-offs:

      The terminology and possibly methods involved in the section on gene expression trade-offs need amendment. I specifically recommend discontinuing reference to the analysis presented as an analysis of antagonistic pleiotropy (rather than more general trade-offs) because pleiotropy is defined as a property of a genotype, not a phenotype. Gene expression levels are a molecular phenotype, influenced by both genotype and the environment. By conducting analyses of selection within environments as reported, the analysis does not account for the fact that the distribution of phenotypic values, the fitness surface, or both may differ across environments. Thus, this presents a very different situation than asking whether the genotypic effect of a QTL on fitness differs across environments, which is the context in which the contrasting terms antagonistic pleiotropy and conditional neutrality have been traditionally applied. A more interesting analysis would be to examine whether the covariance of phenotype with fitness has truly changed between environments or whether the phenotypic distribution has just shifted to a different area of a static fitness surface.

      We recognize that pleiotropy is a property of a genotype, and not phenotype, but since our phenotype (gene expression) is strongly coupled with the genotype, we choose to call trade-offs as antagonistic pleiotropy. That being said, we did test whether the covariance of gene expression with phenotype significantly varies between environments, and found that to indeed be the case.

      Biological processes under selection / Decoherence: PCs are likely not the most ideal way to cluster genes to generate consolidated metrics for a selection gradient analysis. Because individual genes will contribute to multiple PCs, the current fractional majority-rule method applied to determine whether a PC is under direct or indirect selection for increased or decreased expression comes across as arbitrary and with the potential for double-counting genes. A gene co-expression network analysis could be more appropriate, as genes only belong to one module and one can examine how selection is acting on the eigengene of a co-expression module. Building gene co-expression modules would also provide a complementary and more concrete framework for evaluating whether salinity stress induces "decoherence" and which functional groups of genes are most impacted.

      We recognize that individual genes can contribute to multiple PCs, and this is precisely why we chose PCA clustering over WGCNA, where one gene can belong to only one module. Our aim was to recognize all biological processes that could be under selection in either environment, and since one gene can be involved in various different processes, we wanted to identify the contribution of these genes to different processes, which can be done effectively by a PCA analysis. But again, as pointed out by the reviewer, our PCs did contain a contribution (even if negligible) from each gene, so to identify the ‘primary’ biological processes represented by the PCs, we chose the majority rule. As for testing decoherence, we agree that a co-expression module analysis would have provided additional support to the specific test performed in our manuscript, but since it would just be additional support, we chose not to add it to the manuscript.

      But based on the recommendation of the reviewer(s), we did perform a WGCNA analysis and found a total of 14 and 13 modules in normal and saline conditions, of which 0 and 2 modules (with no significant GO enrichment) were under directional selection. This supports our reasoning of potentially missing out on the identification of processes under selection.

      Selection of traits:

      Having paired organismal and molecular trait data is a strength of the manuscript, but the organismal trait data are underutilized. The manuscript as written only makes weak indirect inferences based on GO categories or assumed gene functions to connect selection at the organismal and molecular levels. Stronger connections could be made for instance by showing a selection of co-expression module eigengene values that are also correlated with traits that show similar patterns of selection, or by demonstrating that GWAS hits for trait variation co-localize to cis-mapping eQTL.

      We did perform a GWAS for all the traits collected in both normal and saline environments, and only found significant hits for fecundity (in both normal and saline environments) and chlorophyll_a content (in the saline environment). But these regions did not overlap with any candidate genes or cis-mapping eQTL. Hence we choose to mention it in the manuscript. Additionally, using the WGCNA modules, we found that the only two modules under selection in the saline environment were not significantly correlated with any of the traits measured.

      Genetic architecture of gene expression variation:

      The descriptive statistics of the eQTL analysis summarize counts of eQTLs observed in each environment, but these numbers are not broken down to the molecular trait level (e.g., what are the median and range of cis- and trans-eQTLs per gene). In addition, genetic architecture is a combination of the numbers and relative effect sizes of the QTLs. It would be useful to provide information about the relative distributions of phenotypic variance explained by the cis- vs. trans- eQTLs and whether those distributions vary by environment. The motivation for examining patterns of cis-trans compensation specifically for the results obtained under high salinity conditions is unclear to me. If the lines sampled have predominantly evolved under low salinity conditions and the hypothesis being evaluated relates to historical experience of stabilizing selection, then my intuition is that evaluating the eQTL patterns under normal conditions provides the more relevant test of the hypothesis.

      We have added the median number of eQTLs per gene in each environment. Additionally, we recognize that genetic architecture is a combination of numbers and effect sizes, and we have added information regarding the effect sizes of eQTLs by type and by environment as recommended by another reviewer. We did explore the distributions of phenotypic variance explained by the cis- vs. trans- eQTLs as recommended here, and found that trans-eQTLs explain more phenotypic variance than cis-eQTLs in both environments and that the distribution of either type of eQTL does not vary by environment. We are choosing to not add this in the main text due to space limitations. Lastly, we examined the patterns of cis-trans compensation/reinforcement under both normal and salinity conditions and have compared and contrasted the results from both in the main text.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Lines 126: I would recommend citing those who originally developed the 3' end targeted RNA sequencing methods (e.g. Meyer et al 2011 Molecular Ecology).

      We have cited the recommended paper.

      Lines 128-130: It would be useful to include a description here of what models were fit to the data to partition out G, E, and GxE effects.

      Due to space limitations, we have in brief added a sentence to this effect.

      Line 139: I would suggest changing "found little" to "no" since the test was not significant.

      The sentence has been modified to say no evidence.

      Line 313: I think you mean directional selection instead of positive selection.

      We have corrected the text.

      Lines 362-363: Would the authors also expect an enrichment of reinforcing genes for most scenarios where there has been divergent selection, such as local adaptation among populations?

      Based on our hypothesis, we would indeed expect an enrichment of reinforcing genes for scenarios of local adaptation where different alleles are maintained in different populations due to local adaptation.

      Reviewer #3 (Recommendations For The Authors):

      Figures 1d-e are not mentioned in the Results.

      The figures have been referenced in appropriate places.

      Lines 41-45: Terms such as reinforcement and compensation need to be explained in this specific context. Also "different selection regimes" is a bit broad and vague.

      Due to word-count limitation, we are choosing to not elaborate the terms reinforcement and compensation in the abstract (since these are commonly used in the literature, and we have also defined these in the main text). Additionally, we now explicitly state the selection pressures associated with cis and trans eQTLs.

      Table 1: Please explain S and C in the footnote.

      We have added the recommended footnote.

      Figures: Some panel labels (a, b, c...) are mingled with the graphs.

      We have re-made our figures such that the panel labels do not mingle with the plots.

      Lines 588-591: font.

      Modified

      Lines 620-633: Please describe how these RNAseq libraries were allocated/pooled into different sequencing lanes to avoid potential batch effects among sequencing lanes.

      The sequencing was performed on the same Illumina NextSeq 500 machine and we have added the sequencing libraries/pool plan in the methods (lines 688-689). 

      Lines 690-692: At the beginning of this paragraph, it was mentioned that the un-standardized coefficients were estimated. But here, it seems like the transcript data were already standardized in the data preparation step. What do lines 687-688 refer to? Further standardizing those estimated coefficients so that the whole distribution has mean=0 and sd=1?

      Thank you for pointing out our oversight. We checked our scripts and data preparation did not include transcript standardization, and we have removed the above line from the manuscript.

      Lines 705-711: Please explain why assigning the positive/negative selection status for each gene is important. "Positive selection" here is defined as genes whose increased expression also increases fitness, but traditionally positive selection was defined as "the derived state is favored over the ancestral state". For a gene whose ancestral expression is high but lower expression increases fitness in this experiment, could we also say this gene is under positive selection? Given that we don't know the ancestral state here, maybe the authors could explain whether this definition is necessary. Also, given that many genes positively or negatively regulate each other in a pathway, it is also unclear whether it is necessary to assign the positive/negative status for a PC using the majority rule (lines 710-711).

      We have now defined the different selection terms with respect to our study and use them consistently throughout the manuscript.

      Lines 711-715: If I understand correctly, PCs were used as traits, and by definition PCs should all be orthogonal. Is this section saying only retaining PCs whose correlation < 0.6 with each other? What is the rationale?

      PCA were performed on transcript abundance and the resulting orthogonal PCs explaining over 0.5% variance were all retained for selection analyses.

      We also performed selection analyses on the functional traits measured in the field, but since these functional traits are correlated (and as such would not satisfy the independent variable requirement of regression analyses), we retained only those functional traits which had a Pearson correlation coefficient < 0.6.
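
      One way such a correlation filter could be applied is sketched below. The response does not say which member of a correlated pair was dropped or whether the sign of the correlation mattered, so the greedy first-come order and the use of the absolute coefficient are assumptions, as are the function names and the trait names in the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def filter_correlated(traits, threshold=0.6):
    """Keep traits in input order, skipping any trait whose absolute
    correlation with an already-kept trait reaches the threshold."""
    kept = []
    for name, values in traits.items():
        if all(abs(pearson(values, traits[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical trait measurements on five plants.
traits = {
    "height": [1, 2, 3, 4, 5],
    "biomass": [2, 4, 6, 8, 11],      # nearly collinear with height
    "chlorophyll": [3, 1, 4, 1, 5],   # weakly correlated with height
}
retained = filter_correlated(traits)
```

      With these values, `biomass` is dropped because it is almost perfectly correlated with `height`, while `chlorophyll` is retained.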

      Line 729: Please briefly describe what CLIP is doing.

      We have added the required description.

      Lines 736-741: The accession numbers do not add up to 125.

      Thank you for catching our oversight. We have edited the text, and now the numbers add up to 125.

      Line 796: Please remind readers where these 247k SNPs come from. Supposedly all accessions have been whole-genome sequenced, so the total number of SNPs should be larger than this.

      We have described in detail how the SNPs were obtained and processed in the lines preceding this. Indeed the number of SNPs would have been much bigger, but the stringent cutoffs and linkage disequilibrium pruning reduced our dataset to about 247k SNPs.

      Lines 154-160: This is a bit confusing. The authors first mentioned, for the raw selection differentials, the mean and variance differ between environments, meaning they are misleading (why?). The next sentence then says non-standardized selection differentials will be used.

      The mean and variance for transcript abundances vary between the two environments. Because traits are usually measured in different scales, it is recommended to standardize trait values using variance or mean before estimating selection coefficients. Multiplying this variance (or mean) standardized selection differential with heritability gives the expected response to selection in standard deviation (or mean) units. But if the trait variance (or mean) varies between traits or environments, it leads to a conflation between the standardized selection differential and trait variance (or mean), which can be misleading. So to avoid this, and given that our traits (transcript abundance in this case) were all measured on the same scale, we chose to not standardize our trait values and estimated raw selection differentials.
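
      The difference between a raw and a variance-standardized selection differential can be sketched as follows. This assumes the common definition of the selection differential as the covariance between relative fitness and the trait, which the response does not spell out, and the function name and numbers are hypothetical.

```python
def selection_differentials(trait, fitness):
    """Return (raw, variance-standardized) selection differentials.

    raw          = Cov(relative fitness, trait), in trait units
    standardized = raw / sd(trait), in standard deviation units
    """
    n = len(trait)
    mean_w = sum(fitness) / n
    rel_w = [w / mean_w for w in fitness]  # relative fitness has mean 1
    mean_z = sum(trait) / n
    cov = sum((z - mean_z) * (w - 1.0) for z, w in zip(trait, rel_w)) / (n - 1)
    sd_z = (sum((z - mean_z) ** 2 for z in trait) / (n - 1)) ** 0.5
    return cov, cov / sd_z


# Same fitness values, but the trait is twice as variable in the second case.
fitness = [1.0, 2.0, 3.0, 4.0]
raw_a, std_a = selection_differentials([1.0, 2.0, 3.0, 4.0], fitness)
raw_b, std_b = selection_differentials([2.0, 4.0, 6.0, 8.0], fitness)
```

      Here the raw differential doubles with the trait scale while the standardized one is unchanged, which shows how the two measures diverge once trait variance differs between environments.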

      Figure 1 c-e: Please explain how the horizontal axis values were obtained. Is it assuming these selection differentials have a normal distribution of mean=0 & sd=1?

      Yes, the horizontal axis represents the theoretical quantiles for the selection differentials, assuming they have a normal distribution with mean=0 and sd=1. This has been added to the figure legend.
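
      A minimal sketch of how such horizontal-axis values can be computed for a quantile-quantile plot. The plotting positions (i - 0.5)/n are an assumption, since the response does not state which convention was used, and the function names are hypothetical.

```python
from statistics import NormalDist


def theoretical_quantiles(n):
    """Standard-normal quantiles at plotting positions (i - 0.5) / n."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def qq_points(values):
    """Pair each sorted observed value with its theoretical quantile."""
    return list(zip(theoretical_quantiles(len(values)), sorted(values)))


# Hypothetical selection differentials for three transcripts.
points = qq_points([0.2, -0.1, 0.05])
```

      Each pair holds the theoretical quantile (horizontal axis) and the observed selection differential (vertical axis).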

      Line 162-168: Please clarify this part. What does “general trend towards stronger positive compared to negative selection on gene expression” mean? Does it mean the whole distribution of S is significantly different from 0, the difference in the number of genes in the S>0 vs S<0 category, or the a-bit-higher median |S| in the S>0 vs S<0 category? If it is the last one, are the small differences biologically meaningful (0.053 vs. 0.047 for control & 0.051 vs. 0.050 for salt conditions), given that the authors defined |S|<0.1 as neutral?

      By “general trend towards stronger positive compared to negative selection on gene expression”, we mean that more transcripts were under positive directional selection as compared to negative directional selection. We have also clarified this in the text now.

      Line 177-178: This sentence implies disruptive selection is more important than stabilizing selection in the saline environment, but the test was not significant (line 176).

      Although there was no significant difference in the magnitude of stabilizing vs disruptive selection within the saline environment, the number of transcripts experiencing stronger disruptive selection in the saline conditions was greater than the number experiencing disruptive selection in the normal conditions. Comparing between conditions, therefore, disruptive selection plays an important role in the saline conditions.

      Line 188-190: How CN vs. AP was statistically defined was not mentioned in the Methods section.

      We have added this definition to the main text, in the Results section.

      Line 203-214: How do these results fit with the previous observations that almost all transcripts have significant heritability?

      Although we do find that all but three transcripts have a significant genetic effect (and thus significant heritability), the median broad-sense heritability for the 51 antagonistically pleiotropic genes is 0.23. Given that, and with our sample size of n=130, we would only be able to detect SNPs with large effects on gene expression. Additionally, we used a very stringent criterion (FDR < 0.001) to define eQTLs. In combination, these two factors could explain why we did not detect significant eQTLs for AP genes.

      Line 246-250: Please explain why the current conclusion would be opposite from the previous study. Supposedly the PCA, G matrix, and breeder’s equation were done for each environment separately. It makes sense that the G matrix and response to selection could be different between saline and drought treatments, but for the control treatments in the two studies, do they still differ? Why? Also in Table S7, it would be nice to show the % variation explained by each PC.

      Although the two studies had largely overlapping samples, about 20% of the samples were unique to each study. Additionally, although the site where the study was performed was the same across the two studies, we found significant temporal differences in gene expression due to micro-environmental differences. Both of these factors can lead to changes in direct and indirect selection and in the response to selection, and we are examining these differences as part of a separate study. We also highlight these caveats in our discussion.

      Information on the percent variation explained by each PC is given in Table S5.

      Figure 2b: The vertical axis was labeled as “selection gradient”, but I think the responses to selection (D, I, T) have different units.

      We have re-labeled the vertical axis as “selection”.
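      For readers unfamiliar with the quantities discussed in the two preceding comments, the sketch below shows one common way to split a multivariate response to selection into components using the multivariate breeder's equation, R = Gβ. It is not the authors' analysis: the function name and toy numbers are hypothetical, and reading D, I, and T as direct, indirect, and total responses is an assumption based on the reviewer's wording.

```python
def response_components(G, beta):
    """Split the predicted response R = G @ beta into per-trait components.

    For trait i:
      direct   = G[i][i] * beta[i]                (selection on the trait itself)
      indirect = sum over j != i of G[i][j] * beta[j]  (selection on correlated traits)
      total    = direct + indirect
    Returns a list of (direct, indirect, total) tuples, one per trait.
    """
    n = len(beta)
    components = []
    for i in range(n):
        direct = G[i][i] * beta[i]
        indirect = sum(G[i][j] * beta[j] for j in range(n) if j != i)
        components.append((direct, indirect, direct + indirect))
    return components


# Toy genetic variance-covariance matrix and selection gradients (hypothetical).
G = [[1.0, 0.5],
     [0.5, 2.0]]
beta = [0.2, -0.1]
components = response_components(G, beta)
```

      Note that each component is a product of a genetic (co)variance and a selection gradient, so its units differ from those of the gradient itself, which is the point raised by the reviewer.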

      Reviewer #4 (Recommendations For The Authors):

      The manuscript mixes terminology for selection from quantitative genetics with that from population genetics. This is problematic, and the adjectives positive and negative should be replaced as descriptors of selection by instead rewording, for example, positive directional selection as directional selection for higher transcript abundance.

      Lines 193-196: The phrasing here reads as if the selection is solely acting on the presence/absence of expression rather than on quantitative variation in expression. During revision, it would be worth considering including an analysis of genes that parses genes that show the presence/absence of variation of expression within or across environments separately from genes that are expressed to non-trivial levels in both environments.

      We have modified the sentence in question now. Also, we pre-processed RNA-seq data to remove all transcripts with low expression signals (sigma signal < 20), and further retained only transcripts that had non-trivial expression in at least 10% of the population, which we believe represents presence/absence of variation of expression within or across environments.

      Lines 216-231: Is this analysis solely for directional selection? Not clear since previous sections examined both directional and stabilizing selection.

      Yes, we performed this analysis for only directional selection, and have clarified this in the text too.

      Lines 224-226: The meaning of this sentence is unclear and should be written more concretely.

      We have rephrased the sentence to be more clear.

      Lines 232-241: The description of the scientific logic here could be read as implying that genes interacting in networks are the sole source of indirect selection. I recommend revising the language to indicate this cause is one of several potential causes.

      We have reworded the sentence such that we indicate selection acting on interacting genes is just one of the causes of indirect selection.

      The strength of the conclusions of the decoherence analysis should be evaluated in light of caveats with such analyses (see Cai and Des Marais New Phytologist 2023).

      We have added the caveat with relevant citation in the manuscript.

      Rename this section as "Selection on Organismal Traits", as the previous sections have also been investigating selection on traits, just molecular traits.

      We have renamed the section as recommended.

      Lines 314-318: Rewrite for clarity. Most environments select for an optimal phenotype; it is just the case here that the phenotypic distribution in the high salinity environment overlaps with the optimum.

      We have rephrased and clarified the statement.

      Lines 343-345: Rephrase to "These results indicate that natural variation in gene regulation under..."

      Rephrased.

      Line 354: "most" reads as too strong a descriptor here if the majority is ~60%.

      We have reworded the sentence to read “more than half”

      Lines 359-361: It is unclear to me how this interpretation follows from the above analysis.

      We have reworded the sentence so that the claim follows our analysis.

      Line 372: Is the expectation here more specifically one of epistatic selection? Other processes could stochastically lead to the genetic fixation of compensatory/reinforcing variants, but I think only epistasis for fitness would cause the interesting patterns of LD observed.

      The expectation here is that certain cis and trans variants exist only to compensate/reinforce, potentially through epistasis. We have clarified this in the text.

      Line 405: Change "adaptive organismal responses of organisms" to "organismal responses." As written, the sentence reads as being about plasticity rather than evolutionary responses, which are by populations, not organisms. None of the analyses included in the manuscript specifically test for adaptive plasticity.

      Rephrased.

    1. Reviewer #1 (Public review):

      The conserved AAA-ATPase PCH-2 has been shown in several organisms including C. elegans to remodel classes of HORMAD proteins that act in meiotic pairing and recombination. In some organisms the impact of PCH-2 mutations is subtle but becomes more apparent when other aspects of recombination are perturbed. Patel et al. performed a set of elegant experiments in C. elegans aimed at identifying conserved functions of PCH-2. Their work provides such an opportunity because in C. elegans meiotically expressed HORMADs localize to meiotic chromosomes independently of PCH-2. Work in C. elegans also allows the authors to focus on nuclear PCH-2 functions as opposed to cytoplasmic functions also seen for PCH-2 in other organisms.

      The authors performed the following experiments:

      (1) They constructed C. elegans animals with SNPs that enabled them to measure crossing over in intervals that cover most of four of the six chromosomes. They then showed that double-crossovers, which were common on most of the four chromosomes in wild-type, were absent in pch-2. They also noted shifts in crossover distribution in the four chromosomes.

      (2) Based on the crossover analysis and previous studies they hypothesized that PCH-2 plays a role at an early stage in meiotic prophase to regulate how SPO-11 induced double-strand breaks are utilized to form crossovers. They tested their hypothesis by performing ionizing irradiation and depleting SPO-11 at different stages in meiotic prophase in wild-type and pch-2 mutant animals. The authors observed that irradiation of meiotic nuclei in zygotene resulted in pch-2 nuclei having a larger number of nuclei with 6 or greater crossovers (as measured by COSA-1 foci) compared to wildtype. Consistent with this observation, SPO-11 depletion, starting roughly in zygotene, also resulted in pch-2 nuclei having an increase in 6 or more COSA-1 foci compared to wildtype. The increased number at this time point appeared beneficial because a significant decrease in univalents was observed.

      (3) They then asked if the above phenotypes correlated with the localization of MSH-5, a factor that stabilizes crossover-specific DNA recombination intermediates. They observed that pch-2 mutants displayed an increase in MSH-5 foci at early times in meiotic prophase and an unexpectedly higher number at later times. They conclude based on the differences in early MSH-5 localization and the SPO-11 and irradiation studies that PCH-2 prevents early DSBs from becoming crossovers and early loading of MSH-5. By analyzing different HORMAD proteins that are defective in forming the closed conformation acted upon by PCH-2, they present evidence that MSH-5 loading was regulated by the HIM-3 HORMAD.

      (4) They performed a crossover homeostasis experiment in which DSB levels were reduced. The goal of this experiment was to test if PCH-2 acts in crossover assurance. Interestingly, in this background PCH-2 negative nuclei displayed higher levels of COSA-1 foci compared to PCH-2 positive nuclei. This observation and a further test of the model suggested that "PCH-2's presence on the SC prevents crossover designation."

      (5) Based on their observations indicating that early DSBs are prevented from becoming crossovers by PCH-2, the authors hypothesized that the DNA damage kinase CHK-2 and PCH-2 act to control how DSBs enter the crossover pathway. This hypothesis was developed based on their finding that PCH-2 prevents early DSBs from becoming crossovers and previous work showing that CHK-2 activity is modulated during meiotic recombination progression. They tested their hypothesis using a mutant synaptonemal complex component that maintains high CHK-2 activity that cannot be turned off to enable crossover designation. Their finding that the pch-2 mutation suppressed the crossover defect (as measured by COSA-1 foci) supports their hypothesis.

      Based on these studies the authors provide convincing evidence that PCH-2 prevents early DSBs from becoming crossovers and controls the number and distribution of crossovers to promote a regulated mechanism that ensures the formation of obligate crossovers and crossover homeostasis. As the authors note, such a mechanism is consistent with earlier studies suggesting that early DSBs could serve as "scouts" to facilitate homolog pairing or to coordinate the DNA damage response with repair events that lead to crossing over. The detailed mechanistic insights provided in this work will certainly be used to better understand functions for PCH-2 in meiosis in other organisms.

      Comments on revisions:

      The authors responded very carefully to all of my concerns expressed in the first review, which were primarily aimed at improving the clarity of the manuscript.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public review): 

      The conserved AAA-ATPase PCH-2 has been shown in several organisms including C. elegans to remodel classes of HORMAD proteins that act in meiotic pairing and recombination. In some organisms the impact of PCH-2 mutations is subtle but becomes more apparent when other aspects of recombination are perturbed. Patel et al. performed a set of elegant experiments in C. elegans aimed at identifying conserved functions of PCH-2. Their work provides such an opportunity because in C. elegans meiotically expressed HORMADs localize to meiotic chromosomes independently of PCH-2. Work in C. elegans also allows the authors to focus on nuclear PCH-2 functions as opposed to cytoplasmic functions also seen for PCH-2 in other organisms. 

      The authors performed the following experiments: 

      (1) They constructed C. elegans animals with SNPs that enabled them to measure crossing over in intervals that cover most of four of the six chromosomes. They then showed that double-crossovers, which were common on most of the four chromosomes in wild-type, were absent in pch-2. They also noted shifts in crossover distribution in the four chromosomes. 

      (2) Based on the crossover analysis and previous studies they hypothesized that PCH-2 plays a role at an early stage in meiotic prophase to regulate how SPO-11 induced double-strand breaks are utilized to form crossovers. They tested their hypothesis by performing ionizing irradiation and depleting SPO-11 at different stages in meiotic prophase in wild-type and pch-2 mutant animals. The authors observed that irradiation of meiotic nuclei in zygotene resulted in pch-2 nuclei having a larger number of nuclei with 6 or greater crossovers (as measured by COSA-1 foci) compared to wildtype. Consistent with this observation, SPO-11 depletion, starting roughly in zygotene, also resulted in pch-2 nuclei having an increase in 6 or more COSA-1 foci compared to wild type. The increased number at this time point appeared beneficial because a significant decrease in univalents was observed. 

      (3) They then asked if the above phenotypes correlated with the localization of MSH-5, a factor that stabilizes crossover-specific DNA recombination intermediates. They observed that pch-2 mutants displayed an increase in MSH-5 foci at early times in meiotic prophase and an unexpectedly higher number at later times. They conclude based on the differences in early MSH-5 localization and the SPO-11 and irradiation studies that PCH-2 prevents early DSBs from becoming crossovers and early loading of MSH-5. By analyzing different HORMAD proteins that are defective in forming the closed conformation acted upon by PCH-2, they present evidence that MSH-5 loading was regulated by the HIM-3 HORMAD. 

      (4) They performed a crossover homeostasis experiment in which DSB levels were reduced. The goal of this experiment was to test if PCH-2 acts in crossover assurance. Interestingly, in this background PCH-2 negative nuclei displayed higher levels of COSA-1 foci compared to PCH-2 positive nuclei. This observation and a further test of the model suggested that "PCH-2's presence on the SC prevents crossover designation." 

      (5) Based on their observations indicating that early DSBs are prevented from becoming crossovers by PCH-2, the authors hypothesized that the DNA damage kinase CHK-2 and PCH-2 act to control how DSBs enter the crossover pathway. This hypothesis was developed based on their finding that PCH-2 prevents early DSBs from becoming crossovers and previous work showing that CHK-2 activity is modulated during meiotic recombination progression. They tested their hypothesis using a mutant synaptonemal complex component that maintains high CHK-2 activity that cannot be turned off to enable crossover designation. Their finding that the pch-2 mutation suppressed the crossover defect (as measured by COSA-1 foci) supports their hypothesis. 

      Based on these studies the authors provide convincing evidence that PCH-2 prevents early DSBs from becoming crossovers and controls the number and distribution of crossovers to promote a regulated mechanism that ensures the formation of obligate crossovers and crossover homeostasis. As the authors note, such a mechanism is consistent with earlier studies suggesting that early DSBs could serve as "scouts" to facilitate homolog pairing or to coordinate the DNA damage response with repair events that lead to crossing over. The detailed mechanistic insights provided in this work will certainly be used to better understand functions for PCH-2 in meiosis in other organisms. My comments below are aimed at improving the clarity of the manuscript. 

      We thank the reviewer for their concise summary of our manuscript and their assessment of our work as “convincing” and providing “detailed mechanistic insight.”

      Comments 

      (1) It appears from reading the Materials and Methods that the SNPs used to measure crossing over were obtained by mating Hawaiian and Bristol strains. It is not clear to this reviewer how the SNPs were introduced into the animals. Was crossing over measured in a single animal line? Were the wild-type and pch-2 mutations made in backgrounds that were isogenic with respect to each other? This is a concern because it is not clear, at least to this reviewer, how much of an impact crossing different ecotypes will have on the frequency and distribution of recombination events (and possibly the recombination intermediates that were studied). 

      We have clarified these issues in the Materials and Methods of our updated preprint. The control and pch-2 mutants were isogenic in either the Bristol or Hawaiian backgrounds. Control lines were the original Bristol and Hawaiian lines and pch-2 mutants were originally made in the Bristol line and backcrossed at least 3 times before analysis. Hawaiian pch-2 mutants were made by backcrossing pch-2 mutants at least 8 times to the Hawaiian background and verifying the presence of Hawaiian SNPs on all chromosomes tested in the recombination assay. To perform the recombination assays, these lines were crossed to generate the relevant F1s.

      (2) The authors state that in pch-2 mutants there was a striking shift of crossovers (line 135) to the PC end for all of the four chromosomes that were tested. I looked at Figure 1 for some time and felt that the results were more ambiguous. Map distances seemed similar at the PC end for wildtype and pch-2 on Chrom. I. While the decrease in crossing over in pch-2 appeared significant for Chrom. I and III, the results for Chrom. IV, and Chrom. X. seemed less clear. Were map distances compared statistically? At least for this reviewer the effects on specific intervals appear less clear and without a bit more detail on how the animals were constructed it's hard for me to follow these conclusions. 

      We hope that the added details above make the results of these assays clearer. Map distances were compared and the differences were not statistically significant, except where indicated. While we agree that the comparisons between control animals and pch-2 mutants may seem less clear with individual chromosomes, we argue that more general, consistent patterns become clear when analyzing multiple chromosomes. Indeed, this is why we expanded our recombination analysis beyond Chromosome III and the X Chromosome, as reported in Deshong, 2014. We have edited this sentence to: “Moreover, there was a striking and consistent shift of crossovers to the PC end of all four chromosomes tested.”

      (3) Figure 2. I'm curious why non-irradiated controls were not tested side-by-side for COSA-1 staining. It just seems like a nice control that would strengthen the authors' arguments. 

      We have added these controls in the updated preprint as Figure 2B.

      (4) Figure 3. It took me a while to follow the connection between the COSA-1 staining and DAPI staining panels (12 hrs later). Perhaps an arrow that connects each set of time points between the panels or just a single title on the X-axis that links the two would make things clearer. 

      To make this figure more clear, we have generated two different cartoons for the assay that scores GFP::COSA-1 foci and the assay that scores bivalents. We have also edited this section of the results to make it more clear.

      Reviewer #2 (Public review): 

      Summary: 

      This paper has some intriguing data regarding the different potential roles of Pch-2 in ensuring crossing over. In particular, the alterations in crossover distribution and Msh-5 foci are compelling. My main issue is that some of the models are confusingly presented and would benefit from some reframing. The role of Pch-2 across organisms has been difficult to determine; the ability to separate pairing and synapsis roles in worms provides a great advantage for this paper. 

      Strengths: 

      Beautiful genetic data, clearly made figures. Great system for studying the role of Pch-2 in crossing over. 

      We thank the reviewers for their constructive and useful summary of our manuscript and the analysis of its strengths. 

      Weaknesses: 

      (1) For a general audience, definitions of crossover assurance, crossover eligible intermediates, and crossover designation would be helpful. This applies to both the proposed molecular model and the cytological manifestation that is being scored specifically in C. elegans. 

      We have made these changes in an updated preprint.

      (2) Line 62: Is there evidence that DSBs are introduced gradually throughout the early prophase? Please provide references. 

      We have referenced Woglar and Villeneuve 2018 and Joshi et al. 2015 to support this statement in the updated preprint.

      (3) Do double crossovers show strong interference in worms? Given that the PC is at the ends of chromosomes don't you expect double crossovers to be near the chromosome ends and thus the PC? 

      Despite their rarity, double crossovers do show interference in worms. However, the PC is limited to one end of the chromosome. Therefore, even if interference ensures the spacing of these double crossovers, the preponderance of one of these crossovers toward one end (and not both ends) suggests something functionally unique about the PC end.

      (4) Line 155 - if the previous data in Deshong et al is helpful it would be useful to briefly describe it and how the experimental caveats led to misinterpretation (or state that further investigation suggests a different model etc.). Many readers are unlikely to look up the paper to find out what this means. 

      We have added this to the updated preprint: “We had previously observed that meiotic nuclei in early prophase were more likely to produce crossovers when DSBs were induced by the Mos transposon in pch-2 mutants than in control animals but experimental caveats limited our ability to properly interpret this experiment.”

      (5) Line 248: I am confused by the meaning of crossover assurance here - you see no difference in the average number of COSA-1 foci in Pch-2 vs. wt at any time point. Is it the increase in cells with >6 COSA-1 foci that shows a loss of crossover assurance? That is the only thing that shows a significant difference (at the one time point) in COSA-1 foci. The number of DAPI bodies shows the loss of Pch-2 increases crossover assurance (fewer cells with unattached homologs). So this part is confusing to me. How does reliably detecting foci vs. DAPI bodies explain this? 

      We have removed this section to avoid confusion.

      (6) Line 384: I am confused. I understand that in the dsb-2/pch-2 mutant there are fewer COSA-1 foci. So fewer crossovers are designated when DSBs are reduced in the absence of PCH-2.

      How then does this suggest that PCH-2's presence on the SC prevents crossover designation? Its absence is preventing crossover designation at least in the dsb-2 mutant. 

      We have tried to make this clearer in the updated preprint. In this experiment, we had identified three possible explanations for why PCH-2 persists on some nuclei that do not have GFP::COSA-1 foci: 1) PCH-2 removal is coincident with crossover designation; 2) PCH-2 removal depends on crossover designation; and 3) PCH-2 removal facilitates crossover designation. The decrease in the number of GFP::COSA-1 foci in dsb-2::AID;pch-2 mutants argues against the first two possibilities, suggesting that the third might be correct. We have edited the sentence to read: “These data argue against the possibility that PCH-2’s removal from the SC is simply in response to or coincident with crossover designation and instead, suggest that PCH-2’s removal from the SC somehow facilitates crossover designation and assurance.”

      (7) Discussion Line 535: How do you know that the crossovers that form near the PCs are Class II and not the other way around? Perhaps early forming Class I crossovers give time for a second Class II crossover to form. In budding yeast, it is thought that synapsis initiation sites are likely sites of crossover designation and class I crossing over. Also, the precursors that form class I and II crossovers may be the same or highly similar to each other, such that Pch-2's actions could equally affect both pathways. 

      We do not know that the crossovers that form near the PC are Class II but hypothesize that they are, based on the close, functional relationship that exists between Class I crossovers and synapsis and the apparent antagonistic relationship that exists between Class II crossovers and synapsis. We agree that Class I and Class II crossover precursors are likely to be the same or highly similar, that they exhibit extensive crosstalk that may complicate straightforward analysis, and that PCH-2 is likely to affect both, as strongly suggested by our GFP::MSH-5 analysis. We present this hypothesis based on the apparent relationship between PCH-2 and synapsis in several systems but agree that it needs to be formally tested. We have tried to make this argument clearer in the updated preprint.

      Reviewer #3 (Public review): 

      Summary: 

      This manuscript describes an in-depth analysis of the effect of the AAA+ ATPase PCH-2 on meiotic crossover formation in C. elegans. The authors reach several conclusions, and attempt to synthesize a 'universal' framework for the role of this factor in eukaryotic meiosis. 

      Strengths: 

      The manuscript makes use of the advantages of the 'conveyor belt' system within the C. elegans reproductive tract, to enable a series of elegant genetic experiments. 

      We thank this reviewer for the useful assessment of our manuscript and the articulation of its strengths.

      Weaknesses: 

      A weakness of this manuscript is that it heavily relies on certain genetic/cell biological assays that can report on distinct crossover outcomes, without clear and directed control over other aspects and variables that might also impact the final repair outcome. Such assays are currently out of reach in this model system. 

      In general, this manuscript could be more accessible to non-C. elegans readers. Currently, the manuscript is hard to digest for non-experts (even if meiosis researchers). In addition, the authors should be careful to consider alternative explanations for certain results. At several steps in the manuscript, results could ostensibly be caused by underlying defects that are currently unknown (for example, can we know for sure that pch-2 mutants do not suffer from altered DSB patterning, and how can we know what the exact functional and genetic interactions between pch-2 and HORMAD mutants tell us?). Alternative explanations are possible and it would serve the reader well to explicitly name and explain these options throughout the manuscript. 

      We have made the manuscript more accessible to non-C. elegans readers and discuss alternate explanations for specific results in the updated preprint. 

      Recommendations for the authors:  

      Reviewing Editor Comments: 

      (1) Please provide 'n' values for each experiment. 

      n values are now included in the Figure legends for each experiment.

      (2) Line 129: Please represent the DCOs as percent or fraction (1%-9.8%, instead of 1-13). 

      We have made this change.

      (3) Figure 3A legend: the grey bar should read 20hr. COSA-1/ 32 hr DAPI. In Figure 3E, it is not clear why 36hr Auxin and 34hr Auxin show a significant difference in DAPI bodies between control and pch-2, but 32hr Auxin treatment does not. Here again 'n' values will help. 

      We have made this change. We also are not sure why the 32 hour auxin treatment did not show a significant difference in DAPI stained bodies. We have included the n values, which are not very different between timepoints and therefore are unlikely to explain the difference. The difference may reflect the time that it takes for SPO-11 function to be completely abrogated.

      (4) Line 360: Please provide the fraction of PCH-2 positive nuclei in dsb-2.

      We have made this change. 

      Please also address all reviewer comments. 

      Reviewer #1 (Recommendations for the authors): 

      (1) Page 3, line 52. While I agree that crossing over is important to generate new haplotypes, work has suggested that the contribution by an independent assortment of homologs to generate new haplotypes is likely to be significantly greater. One reference for this is: Veller et al. PNAS 116:1659. 

      We deeply appreciate this reviewer pointing us to this paper, especially since it argues that controlling crossover distribution contributes to gene shuffling, and we now cite it in our introduction! While we agree that this paper concludes that independent assortment likely explains the generation of new haplotypes to a greater degree than crossovers, the authors performed this analysis with human chromosomes and explicitly include the caveat that their modeling assumes uniform gene density across chromosomes. For example, we know this is not true in C. elegans. It would be interesting to perform the same analysis with C. elegans chromosomes in control and pch-2 mutants, taking into account this important difference.

      (2) Figure 2. It would really help the reader if an arrow and text were shown below each irradiation sign to indicate the stage in meiosis in which the irradiation was done as well as another arrow in the late pachytene box to show when the COSA-1 foci were analyzed. In general, having text in the figures that help stage the timing in meiosis would help the non C. elegans reader. This is also an issue where staging of C. elegans is shown (Figure 4). 

      We have made these changes to Figure 2. To help readers interpret Figure 4, we have added TZ and LP to the graphs in Figure 4B and 4D and indicated what these acronyms (transition zone and late pachytene, respectively) are in the Figure legend.

      (3) Page 12, line 288. It would be valuable to first outline why the him-3R93Y and htp-3H96Y alleles were chosen. This was eventually done on Page 13, but introducing this earlier would help the reader. 

      We have introduced these mutations earlier in the manuscript.

      (4) Page 13, line 323. A one sentence description of the OLLAS tagging system would be useful. 

      We have added this sentence: “we generated wildtype animals and pch-2 mutants with both GFP::MSH-5 and a version of COSA-1 that has been endogenously tagged at the N-terminus with the epitope tag, OLLAS, a fusion of the E. coli OmpF protein and the mouse Langerin extracellular domain”

      Reviewer #2 (Recommendations for the authors): 

      (1) The title is a little awkward. Consider: PCH-2 controls the number and distribution of crossovers in C. elegans by antagonizing their formation 

      We have made this change.

      (2) Abstract: 

      Consider removing "that is observed" from line 20. 

      We have made this change.

      I'm confused by the meaning of "reinforcement of crossover-eligible intermediates" from line 27. 

      We have removed this phrase from the abstract.

      A definition of crossover assurance would be helpful in the abstract. 

      We have added this to the abstract: “This requirement is known as crossover assurance and is one example of crossover control.”

      (3) Line 36: I know I'm a stickler, but many meioses only produce one haploid gamete (mammalian oocytes, for example) 

      Thanks for the reminder! We have removed the “four” from this sentence.

      (4) Line 284 - are you defining MSH-5 foci as crossover-eligible intermediates? If so, please state this earlier. 

      We have added this to the introduction to this section of the results: “In C. elegans, these crossover-eligible intermediates can be visualized by the loading of the pro-crossover factor MSH-5, a component of the meiosis-specific MutSγ complex that stabilizes crossover-specific DNA repair intermediates called joint molecules”

      (5) Can the control be included in Figure S1? 

      We have made this change.

      (6) Can you define that crossover designation is the formation of a COSA-1 focus? 

      We did this in the section introducing GFP::MSH-5: “In the spatiotemporally organized meiotic nuclei of the germline, a functional GFP tagged version of MSH-5, GFP::MSH-5, begins to form a few foci in leptotene/zygotene (the transition zone), becoming more numerous in early pachytene before decreasing in number in mid pachytene to ultimately colocalize with COSA-1 marked sites in late pachytene in a process called designation” 

      (7) Would it be easier to see the effect of DSB to crossover eligible intermediates in Spo-11, Pch-2 vs. Spo-11 mutant with irradiation using your genetic maps? At least for early vs. late breaks? 

      Unfortunately, irradiation does not show the same bias towards genomic location that endogenous double-strand breaks do, so it is unlikely to recapitulate the effects on the genetic map.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Weaknesses:

      In my estimation, the following would improve this manuscript:

      (1) The physiological relevance of these data could be better highlighted. For instance, future work could revolve around incubating oocytes with oviduct fluid (or OVGP1) to reduce polyspermy in porcine IVF, and naturally improve sperm selection in human IVF.

      Thank you for the suggestions. We have added these physiological relevance points at the end of the discussion.

      (2) Biological and technical replicate values for each experiment are unclear - for semen, oocytes, and oviduct fluid pools. I suggest providing in the Materials and Methods and/or Figure legends.

      Biological and technical replicates are now indicated in the M&M. The numbers of oocytes or ZPs used were already indicated in every Supplementary Table.

      (3) Although differences presented in the bar charts seem obvious, providing statistical analyses would strengthen the manuscript.

      Statistical analyses are now indicated in each bar chart.

      (4) Results are presented as ± SEM (line 677); however, I believe standard deviation is more appropriate.

      This was a mistake; all the results are indicated as standard deviation.

      (5) Given the many independent experimental variables and combinations, a schematic depiction of the experimental design may benefit readers.

      A schematic depiction of the experimental design is now included as Figure 1. This new figure shifts the numbering of the remaining figures.

      (6) Attention to detail can be improved in parts, as delineated in the "author recommendation" review section.

      Done

      Reviewer #2 (Public review):

      Weaknesses:

      The authors postulate a role for oviductal fluid in species-specific fertilization, but in my opinion, they cannot rule out hormonal effects or differences in the method of oocyte maturation employed.

      As we indicate below, the effect of hormones has been analyzed, and we have demonstrated that it is not the cause of zona pellucida specificity.

      They also cannot unequivocally prove that OVGP1 is the oviductal protein involved in the effect. Additional experiments are necessary to rule out these alternative explanations.

      Our work does not rule out the involvement of other proteins, but it does show that OVGP1 is involved in the process.

      When performing the EZPT assay on mouse oocytes obtained either from the ovary or from the oviduct, the oocytes obtained from the ovary came from mice primed with eCG, whereas the ones collected from the oviduct were obtained from superovulated mice (eCG plus hCG). This difference in the hormonal environment may make a difference in the properties of the ZP. Additionally, the ones obtained from the ovary were in vitro matured, which is also different from the freshly ovulated eggs and, again, may change the properties of the ZP. I suggest doing this experiment superovulating both groups of mice but collecting the fully matured MII eggs from the ovary before they get ovulated. In that way the hormonal environment will be the same in both groups and in both groups, oocytes will be matured in vivo. Hence, the only difference will be the exposure to oviductal fluids.

      In Figure 2, we compare ZPs from murine oocytes obtained from the ovary using only PMSG with ZPs from oviductal oocytes treated with both hCG and PMSG. In Figure 7, however, we compared ZPs from murine oocytes exposed only to PMSG, with the only difference being whether or not they had been in contact with OVGP1. This shows that it is not the hormone treatment but rather contact with OVGP1 that determines their specificity.

      Mice with OVGP1 deletion are viable and fertile. It would be quite interesting to investigate the species-specificity of sperm-ZP binding in this model. That would indicate whether OVGP1 is the only glycoprotein involved in determining species-specificity. Alternatively, the authors could immunodeplete OVGP1 from oviductal fluid and then ascertain whether this depleted fluid retains the ability to impede cross-species fertilization.

      We agree with the reviewer that it would be interesting to investigate sperm-ZP binding in this model. Unfortunately, we do not have the OVGP1 knockout mouse strain. We also believe that immunodepletion of OVGP1 would not completely remove the protein, so its effect would likely not be entirely eliminated.

      What is the concentration of OVGP1 in the oviduct? How did the authors decide what concentration of protein to use in the experiments where they exposed ZPs to purified OVGP1? Why did they use this experimental design to check the structure of the ZP by SEM? Why not do it on oocytes exposed to oviductal fluid, which would be more physiological?

      We now state in the manuscript that the concentration of OVGP1 in the oviductal fluid was quantified using ImageJ software by comparing the mean gray value of the band in the oviductal fluid lane to that of the band in the recombinant protein lane. From this relationship, together with the known amount of recombinant protein and the total protein amount of the oviductal fluid, the concentration of OVGP1 in the oviductal fluid was determined as the average of three western blots. The concentration of OVGP1 in oviductal fluid was in the range of 100-150 ng/µL in mice and 150-200 ng/µL in cows. We have also included in the manuscript the concentration used for the EZPTs: 30 ng/µL of recombinant OVGP1 (bovine, murine, and human) for 30 minutes in 20 µL drops. With this concentration, we observed a clear effect on zona specificity with no negative impact on the gametes.
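
      The densitometric estimate described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' analysis: it assumes band intensity scales linearly with protein mass, and every number, as well as the function name `ovgp1_ng_per_ul`, is hypothetical.

```python
from statistics import mean


def ovgp1_ng_per_ul(sample_gray, recombinant_gray, recombinant_ng, fluid_volume_ul):
    """Estimate OVGP1 concentration (ng/uL) in one blot.

    Assumes (hypothetically) that mean gray value is proportional to the
    protein mass in the band, so the recombinant lane acts as a one-point
    calibration standard.
    """
    if recombinant_gray <= 0 or fluid_volume_ul <= 0:
        raise ValueError("reference intensity and loaded volume must be positive")
    # Mass of OVGP1 in the oviductal fluid lane, scaled from the standard.
    ovgp1_ng = sample_gray / recombinant_gray * recombinant_ng
    # Convert mass to concentration using the volume of fluid loaded.
    return ovgp1_ng / fluid_volume_ul


# Hypothetical measurements from three western blots (not data from the study).
blots = [
    {"sample_gray": 120.0, "recombinant_gray": 100.0, "recombinant_ng": 500.0, "fluid_volume_ul": 5.0},
    {"sample_gray": 130.0, "recombinant_gray": 100.0, "recombinant_ng": 500.0, "fluid_volume_ul": 5.0},
    {"sample_gray": 110.0, "recombinant_gray": 100.0, "recombinant_ng": 500.0, "fluid_volume_ul": 5.0},
]

estimates = [ovgp1_ng_per_ul(**blot) for blot in blots]
average_ng_per_ul = mean(estimates)  # reported value = average of the three blots
print(f"Estimated OVGP1: {average_ng_per_ul:.1f} ng/uL")
```

      A single-point calibration like this is only as good as the linearity assumption; a dilution series of the recombinant protein would be the usual way to check it.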

      As shown in Supplementary Fig. S8B, we have already performed SEM on oocytes exposed to oviductal fluid.

      None of the figures show any statistical analysis. Please perform analysis for all the data presented, include p values, and indicate which statistical tests were performed. The Statistical analysis section in the Methods indicating that repeated measures ANOVA was used must refer to the tables. Was normality tested? I doubt all the data are normally distributed, in which case using ANOVA is not appropriate.

      Statistical results are now included in each Figure and Table. All the statistical analyses are included. All the data passed the tests for normality, homogeneity of variance, and independence; for this reason, the data were analyzed using a one-way ANOVA followed by Tukey's post hoc test. The significance level was set at p < 0.05.
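
      For readers unfamiliar with the test named above, the F statistic of a one-way ANOVA can be computed as in the minimal sketch below. The group names and values are invented for illustration and are not data from the study; the p-value and Tukey's post hoc comparisons are not computed here and would normally come from a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for three treatment groups (illustrative only).
control = [1.0, 2.0, 3.0]
treatment_a = [2.0, 3.0, 4.0]
treatment_b = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([control, treatment_a, treatment_b])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

      The F value is then compared against the F distribution with the two degrees of freedom to obtain the p-value that is checked against the 0.05 threshold.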

      Why was OVGP1 selected as the probable culprit of the species specificity? In the Results section entitled "Homology of bovine, human and murine OVGP1 proteins..." the authors delve into the possible role of this protein without any rationale for investigating it. What about other oviductal proteins?

      A sentence indicating this rationale for investigating OVGP1 has been introduced in this paragraph.

      Reviewer #3 (Public review):

      Weaknesses:

      The manuscript began with a well-written introduction, but problems started to surface in the Results section, in the Discussion, as well as in the Materials and Methods. Major concerns include inconsistencies, misinterpretation of results, lacking up-to-date literature search, numerous errors found in the figure legends, misleading and incorrect information given in the Materials and Methods, missing information regarding statistical analysis, and inadequate discussion. These concerns raise questions regarding the authenticity of the study, reliability of the findings, and interpretation of the results. The manuscript does not provide solid and convincing findings to support the conclusion.

      We have modified and clarified all the issues, some of which were misunderstandings. We have also performed the suggested experiment of putting sperm in contact with OVGP1.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Ensure consistency in (past) tense, for example, "decondensed" (line 102), "induced" (line 103), and elsewhere.

      Done

      (2) Replace "table" with "Table" throughout.

      Done

      (3) The authors often refer to "co-incubation". I believe this should read "incubation". My understanding is that oocytes were incubated with oviduct fluid or sperm but never both simultaneously as "co-incubation" implies.

      Done

      (4) Synonymous terms "OVGP1" and "oviductin" are used interchangeably. Consider using one or the other for consistency.

      We believe that using both terms makes the text read more fluidly.

      (5) Delete "around" on line 256 and "approximately" on line 263 and provide actual percentages.

      Done

      (6) The point of the sentence on lines 311-313 is unclear to me.

      Rewritten

      (7) Suggest specifying "wildtype" on line 419.

      All the mice used in this work are wildtype

      (8) Do the authors have details regarding cattle oocyte donor breeds?

      Done

      (9) What do the authors mean by "strengthen" on line 500?

      The word “strengthen” has been changed to “carefully isolated”.

      (10) Ponceau and vinculin (Figure 3) details are not provided in the manuscript.

      Ponceau and vinculin details are now included in the manuscript

      (11) Address formatting issues (e.g. citation 26 among others).

      Done

      (12) Primary and secondary antibody controls for immunofluorescent imaging (to fully exclude autofluorescence) are lacking.

      Controls for immunofluorescent imaging are indicated in Supplementary Figure S7.

      (13) The corresponding author on the manuscript and in the eLife submission system are different

      This was a problem during submission; it is now corrected.

      Reviewer #2 (Recommendations for the authors):

      (1) For the experiment depicted in Figures 3C and D, the authors need to perform a negative control to demonstrate that this fluorescent signal is specific. What happens if they express a different FLAG-tagged protein instead of bOVGP1 and mOVGP1? FLAG antibodies give quite strong non-specific binding. Or if they expressed untagged bovine and mouse OVGP1?

      The negative controls are in Supplementary Figure S7. A rabbit polyclonal antibody to human OVGP1 was used for murine and bovine IVM ZPs from ovaries and for murine superovulated ZPs recovered from mouse oviducts. There is a marked difference between the ZPs not incubated with any OVGP1 and those carrying the endogenous protein, reflecting the specificity of the antibody.

      In addition, IVM mouse and bovine oocytes incubated or not with OF were probed with anti-Flag-tag antibody. Since none of them carries Flag-tagged OVGP1, there is no signal in the immunofluorescence.

      (2) For the Western blots of recombinant proteins, why are the authors not showing the blots using His and FLAG tag antibodies? Is the 50-kDa band observed for the mouse OVGP1 detected with His-Tag antibody?

      We have included a Supplementary Figure S6 with the western blot using anti-His and anti-Flag antibodies. The band around 50 kDa is not specific (there is no signal with anti-Flag). This new figure shifts the numbering of the remaining supplementary figures (S6-S8).

      (3) How was the estrous cycle stage determined in mice? It is not described in the Methods.

      Estrous cycle stage was determined in mice by visual examination of the vaginal opening and cytological examination of the vaginal smear. This is now included in the M&M.

      (4) For sperm binding, what does the percentage mean?

      This was a mistake: the percentages referred to pronuclear formation and cleavage, not to sperm binding. This is now corrected.

      (5) In Figure 3A, the labels for regions C, D, and E are mixed up. It is regions A and C that are conserved (or orange and blue, if the letters are incorrect). The purple region is only present in the mouse (E?), and the red region (D?) is only in the human form. Also, the legend for this panel is repeated verbatim in the Results section. Please remove one of them.

      Errors in Figure 3A have been corrected. The repeated legend has been removed.

      (6) In the title of Figure 1B and in different places in the text, it should be mouse (not mice) oocytes.

      Done

      (7) In line 140, I would change the part indicating "We extracted the cytoplasmic contents from the oocytes". It is not only the cytoplasm, but all the oocyte, including the nucleus and membranes, that are being removed.

      Done

      (8) Please rephrase the sentence in lines 245-247, as it is quite confusing.

      Done

      (9) In line 236, the authors indicate that "During in vitro maturation (IVM), oocytes displayed a porous ZP structure...". Do they mean after IVM? When were those oocytes collected for SEM?

      The sentence has been modified to read “after IVF”. Bovine oocytes were collected from slaughterhouse ovaries and were similar to those used in the rest of the experiments in the manuscript.

      (10) In the legend of Figure 1, please indicate what the parthenogenic group is.

      Done

      (11) In the legend to Figure 1G, the text indicates "Note sperm only appear outside the zona". However, I cannot see any sperm in that image.

      The phrase has been removed: when the image was enlarged to show the sperm inside the area more clearly, those outside were no longer visible.

      (12) In the legend to Figure 2 describing the different zona pictures, the letters of the panels are not correct.

      Done

      (13) In line 999, please provide the right concentration for NMase (it indicates 10 μ/mL).

      Done

      (14) Where does the model depicted at the end of the manuscript go? Is it a Figure? A graphical abstract? In that model, please correct some typos: it should be "ZP obtained from ovarian oocytes"; and change specie for species in all three panels.

      Done. It is a model (Fig. 10)

      (15) The FITC-PNA staining to visualize acrosomes is not described in the Methods section.

      Done

      Reviewer #3 (Recommendations for the authors):

      The present study reports findings from a series of experiments suggesting that bovine oviductal fluid and species-specific oviductal glycoprotein (OVGP1 or oviductin) from bovine, murine, or human sources modulate the species specificity of bovine and murine oocytes. The manuscript began with a well-written introduction, but problems started to surface in the Results section, Discussion as well as in the Materials and Methods. Major concerns include inconsistencies, misinterpretation of results, lacking up-to-date literature search, numerous errors found in the figure legends, misleading and incorrect information given in the Materials and Methods, missing information regarding statistical analysis, and inadequate discussion.

      We have modified and clarified all the issues, some of which were misunderstandings. We have also performed the suggested experiment of putting sperm in contact with OVGP1.

      Specific comments:

      (1) Lines 142 to 143 on page 5: It is stated that "Because this experiment was done on empty ZPs, we called this test "empty zona penetration test" (EZPT)". In fact, the experiment was not actually done on empty ZPs, but on oocytes with the ooplasm extracted. Therefore, the zona pellucidae used in the experiment were not empty but contained an intact zona matrix of glycoproteins. The term "EZPT" used by the authors in the manuscript is a misnomer. A better term should be used to reflect the ZPs which were intact and not empty.

      We extracted the cytoplasmic contents, including all the organelles, nucleus, and membranes, as well as the polar body. This has been clarified in the text.

      (2) The authors need to distinguish between sperm penetration and sperm binding in the manuscript. In lines 169 to 177 on page 6, the authors mixed up the terms "penetration" and "binding" in the text. In writing about events leading to fertilization in reproductive biology, the term "sperm binding" refers to the interaction between the sperm plasma membrane and the oocyte zona pellucida (ZP), whereas the term "sperm penetration" refers to the passage of the sperm through the ZP. Therefore, the statements in lines 169 to 177 describing the binding of bovine, murine, and human sperm to bovine oocytes with and without prior treatment with oviductal fluid are misleading and not correct. In fact, Figure 2 and Table 6 show sperm penetration and not sperm binding.

      Figures 2A and 2B (now 3A and 3B) and Table S6 show both sperm penetration (% penetration rate and average sperm in penetrated ZPs) and sperm binding (average sperm bound to ZPs). Throughout the manuscript, a clear distinction is made between sperm attached to the ZP and sperm that have penetrated it.

      (3) Lines 182 to 187 on page 6: What is being described in the text here does not match what is being shown in Figure 3A. As a result, the information provided in lines 182 to 187 is not correct and misleading. For example, it is stated in lines 182 to 183 that "As depicted in Fig. 3A, the sequences of these three OVGP1 have five distinct regions (A, B, C, D and E)." However, Figure 3A shows that hOVGP1 and mOVGP1 both have only 4 regions and bOVGP1 has only 3 regions. None of the three has 5 regions. In lines 183 to 184, the authors continued to state that "Regions A and D are conserved in the different mammals." This statement is also not true because Figure 3A shows that only region A is conserved in all three species but not region D which is found only in the human. What is stated in lines 186 to 187 is also not correct based on the information provided in Figure 3A. It is stated here that "Region C is an insertion present only in the mouse (Mus) and region E is typical of human oviductin." However, based on the color codes provided in Figure 3A, region C is present in all three species while region E is present only in the mouse.

      Errors with naming regions in Figure 3A (now 4A) have been corrected.

      (4) In lines 195 to 197 on page 6, the authors stated that "Western blots of the three OVGP1 recombinants indicated expected sizes based on those of the proteins: 75 kDa for human and murine OVGP1 and around 60 kDa for bovine OVGP1 (Fig. 3B)." However, the expected size of the recombinant human OVGP1 is not in agreement with what has been published in literature regarding the molecular weight of recombinant human OVGP1. It has been previously reported that a single protein band of approximately 110-150 kDa was detected for recombinant human OVGP1 using an antibody against human OVGP1. The authors provided Western blots of murine oviductal fluid and bovine oviductal fluid in Figure 3B but not a Western blot of native human oviductal fluid. The latter should have been included for a comparison with the recombinant human OVGP1.

      We do not have human oviductal fluid, but we have now included Supplementary Figure S6, a western blot with antibodies against His and Flag (both present in the recombinant OVGP1), which shows that the size of the recombinant protein is as indicated in Figure 3B (now 4B).

      (5) Lines 220 to 229 on page 7: In this experiment, the authors conducted the EZPT using ZPs from bovine oocytes that were either treated with or without bOVGP1 followed by incubation, respectively, with homologous sperm (bovine) and heterologous sperm (human and murine). This is a logical experiment to determine if OVGP1 plays a species-specific role in setting the specificity of the zona pellucida. However, in the in vivo situation, sperm that reach the lumen of the ampulla region of the oviduct where fertilization takes place are also exposed to oviductal fluid of which OVGP1 is a major constituent. Therefore, an additional experiment in which sperm are treated with OVGP1 prior to incubation with ZP should be carried out for a comparison.

      The additional experiment in which sperm are treated with OVGP1 prior to incubation with ZP has been done (Table S9). No effects were observed. This is now included in the manuscript.

      (6) Regarding the results obtained with the use of neuraminidase (lines 278 to 293 on pages 8 to 9), if neuraminidase treatment of bovine ZP prevented bovine sperm penetration regardless of whether ZPs had been or had not been in contact with OVGP1, that means OVGP1 is not responsible for penetration despite the description of earlier findings in the manuscript. Sialic acid is likely associated with the sugar side chains of ZP glycoproteins and not sugar side chains of OVGP1. To attribute the species-specific property of sialic acid to OVGP1 for sperm binding, an experiment in which OVGP1 will be treated with neuraminidase prior to performing the EZPT is needed.

      We conducted the experiment by treating only OVGP1 with neuraminidase and then separating OVGP1 from the enzyme before incubating the treated OVGP1 with ZPs. The results agree with our previous findings, indicating the importance of sialic acid on OVGP1 for sperm binding and penetration, and confirming that OVGP1 is responsible for species-specific penetration. Results are shown in Fig. 9 and Table S14.

      (7) The Discussion appears superficial and a more in-depth discussion regarding the results obtained in the present study in relation to other reports about OVGP1 published in literature is needed (e.g. a recent paper published by Kenji Yamatoya et al. (2023) Biology of Reproduction https://doi.org/10.1093/biolre/ioad159). Lines 317 to 342 of the Discussion on pages 10 to 11 should belong to the Introduction.

      The results of Yamatoya et al. are now included in the discussion. Part of the discussion from lines 317 to 342 is now in the introduction.

      (8) It is not clear what the authors exactly want to say in lines 343 to 344 of the Discussion on page 11. It is stated here that "The empty zona penetration test (EZPT) enables heterologous sperm to overcome the oocyte's second barrier, the plasma membrane or oolemma." Do the authors mean that the sperm can now enter the empty space encircled by the ZP without having to go through the plasma membrane or oolemma? In Figure S4 which depicts the method used to empty the ooplasm in the bovine oocyte, does the method extract only the ooplasm (or cytoplasmic contents) leaving behind the intact plasma membrane or oolemma? This needs to be clearly shown and clearly explained. High magnifications of the zona pellucida are also needed to show whether the plasma membrane (or oolemma) is still present and intact after extraction of the ooplasm.

      This is clearly explained in the text. To obtain empty ZP, everything except ZP (nucleus, organelles, membranes and cytoplasmic contents of the oocytes) was removed using a micromanipulator, following the procedure outlined in Figure S4.

      (9) The authors stated in the Discussion in lines 383 to 383 on page 12 that "After ovulation, the changes reported in the carbohydrate composition of the ZP (3, 25) are likely induced by the addition of glycoproteins of oviductal origin, as we have seen here with OVGP1." There is no evidence in the present study to suggest that OVGP1 or glycoproteins of oviductal origin have changed or can change the carbohydrate composition of the ZP. At present, it is not known if OVGP1 or glycoproteins of oviductal origin directly interact with ZP glycoproteins (including ZP1, ZP2, ZP3 and/or ZP4) that make up the zona matrix.

      There is scientific evidence suggesting that oviductal glycoproteins, including OVGP1, interact with the zona pellucida (ZP) glycoproteins of the oocyte. Studies have shown that OVGP1 binds to the ZP of the oocyte. Specifically, OVGP1 is thought to interact with ZP glycoproteins, such as ZP2 and ZP3, in a way that may help stabilize the oocyte or modify the ZP structure during its passage through the oviduct. This interaction is believed to influence processes like sperm binding, oocyte maturation, and potentially the prevention of polyspermy during fertilization. For example, in several studies, the absence of OVGP1 in knockout animals (such as in Ovgp1-KO hamsters) has been associated with impaired fertilization and embryonic development, which indicates the importance of this interaction. However, the detailed molecular mechanisms and functional significance of these interactions require further exploration. We have used the word “likely” to soften this statement.

      Velásquez, J. G., Canovas, S., Barajas, P., Marcos, J., Jiménez‐Movilla, M., Gallego, R. G., ... & Coy, P. (2007). Role of sialic acid in bovine sperm–zona pellucida binding. Molecular reproduction and development, 74(5), 617-628.

      Kunz, P., et al. (2013). "The role of oviductal glycoprotein 1 in sperm–egg interaction and early embryonic development." Reproduction, 145(3), 225-233. DOI: 10.1530/REP-12-0300

      Yamatoya, K., Kurosawa, M., Hirose, M., Miura, Y., Taka, H., Nakano, T., ... & Araki, Y. (2024). The fluid factor OVGP1 provides a significant oviductal microenvironment for the reproductive process in golden hamster. Biology of reproduction, 110(3), 465-475.

      (10) Lines 390 to 391 page 12: The statement "This determines that OVGP1 modifications are critical to define the barrier among the different species of mammals." needs to be rephrased because there is no evidence in the present study showing that OVGP1 has been modified. There are many concerns with errors, important information that is missing, and inconsistencies as well as wrong and misleading information in the Materials and Methods which are troublesome. These concerns raise questions regarding the authenticity and reliability of the study. Some of the major concerns are listed below:

      All concerns have been addressed.

      (11) It says in line 399 on page 13 that "Human semen samples were obtained from a normozoospermic donor...". Do the authors really mean that the semen samples were obtained from only one donor?

      Samples were obtained from 3 normozoospermic donors; this is now indicated in the M&M.

      (12) In lines 409 to 411 on page 13, what do the authors mean by "...the samples were frozen into pellets..."? Was centrifugation of the samples carried out prior to freezing the samples? Secondly, what do the authors mean by "....and stored in liquid nitrogen at -196°C or lower.", particularly what do the authors mean by "or lower"? The temperature of liquid nitrogen is -196°C. What is the "lower" temperature?

      Centrifugation of the samples was not carried out at this time. A more detailed protocol is now included. The word “lower” has been removed.

      (13) Line 424 on page 13: Provide the full name of "M2" when it is first used in the text then followed by the abbreviation.

      Done

      (14) Is there a reason why different counting chambers were used to determine sperm concentrations? In line 432 on page 13, a Thomas cell counting chamber was used to determine the sperm count of epididymal mouse sperm whereas it is mentioned in line 441 on page 14 that a Neubauer cell counting chamber was used to determine epididymal cat sperm. Furthermore, where did the cat's sperm come from?

      The cat sperm was obtained and processed at the Faculty of Veterinary Medicine and the rest of the samples were processed in the INIA-CSIC lab, and different chambers were used in both places.

      (15) The mention of the use of cat spermatozoa in line 439 on page 14 is a worrisome problem of the manuscript. The present study used bovine, mouse, and human sperm and not cat. Therefore, the sudden mentioning of the use of cat spermatozoa in the Materials and Methods is troublesome and worrisome. It appears that the paragraph from lines 439 to 450 was directly copied and pasted from previously published work. Furthermore, lines 441 to 445 do not flow and do not make sense. In fact, what is described in this paragraph (lines 439 to 450) does not appear to correspond to the method(s) used to obtain the results presented in the Results section of the manuscript.

      We do not understand why the reviewer says we do not use cat sperm. This study uses cat sperm, and the results for cat sperm are shown in Figure 1A (now 2A). We have modified the M&M to clarify the description of the freezing procedure.

      (16) Similarly, several problems are also found in the paragraphs (lines 453-478 on page 14) describing the methods and procedures to obtain homologous and heterologous IVF of bovine oocytes. Firstly, it is mentioned here (in line 460) that COCs were co-incubated with selected sperm without removing the cumulus cells. However, the results of the sperm penetration experiment indicated otherwise. Figures 2 and 3 show that the oocytes were denuded of cumulus cells. Secondly, it is very worrisome and troublesome to read what is written in line 468 on page 14 that "...from other species (cat, human, mouse, and rabbit)." One wonders where the cat and rabbit came from. Again, it appears that this paragraph was directly copied and pasted from previously published work.

      Cat sperm was used in this manuscript, and this is correctly indicated in every section and figure. Regarding the IVF and EZPT protocols: in the IVF protocol for bovine oocytes, COCs were used without removing the cumulus cells. For the EZPT, cumulus cells were removed; this is described in the following sections of the Materials and Methods. The word “rabbit” was a mistake and has been removed.

      (17) In lines 468 to 469 on page 14, it is mentioned that "Sperm-egg interactions were assessed through a sperm-ZP binding assay...". The authors only examined sperm penetration in their study. Therefore, this needs to be specified in the Materials and Methods. Secondly, the authors did not use the conventional sperm-ZP binding assay in their study. Instead, they used the EZPT in their study. There appear to be many inconsistencies throughout the manuscript.

      When the IVF experiments using bovine COCs were performed (Fig 2A and C, Fig 1S to 3S, and Tables 1S to 4S), conventional sperm-egg interaction was assessed 2.5 hours after IVF. EZPT was used in the rest of the experiments. IVF with COCs and EZPT with ZPs are different experiments.

      (18) Lines 480 to 489 on page 15 under the sub-heading of "In vitro culture of presumptive zygotes to first cleavage embryos on Day 2" do not provide the correct methodology used for obtaining the results presented in the manuscript. In line 482, it is not clear where the "synthetic oviductal fluid" came from. In fact, in the Results section, none of the results came from the use of synthetic oviductal fluid. In line 487, humans and rabbits are mentioned here. However, human and rabbit oocytes were not used in the present study. It is very strange indeed to read human and rabbit in the sentence.

      A reference for SOF is now included. Human results are in Fig 1A; the sentence refers to the cultures of bovine oocytes inseminated with sperm of bull, human, mouse or cat. The word "rabbit" was a mistake and has now been removed from the manuscript.

      (19) In line 500 on page 15, what do the authors mean by "Each oviduct was strengthen by removing the adjacent tissue..."?

      The sentence has been modified.

      (20) On page 15 in the Materials and Methods, the authors described the collection of bovine and mouse oviductal fluid. However, there is no mention of human oviductal fluid and how it was collected. This important information is missing.

      We did not use human oviductal fluid in this manuscript.

      (21) Line 510 on page 15: The sub-heading of "Preparation of empty zonae pellucidae from bovine ovarian oocytes" should be rephrased. As pointed out earlier in my review, the ZPs prepared by the authors were intact and not "empty". It was the oocyte which was empty after extraction of the ooplasm.

      Everything except the ZP was removed from the oocyte; this is now clarified in the manuscript.

      (22) Line 518 on page 16 and line 553 on page 17: "Figure S5" should be "Figure 4S".

      Done

      (23) Line 538 and line 547 on page 16: "mice oocytes" should be "mouse oocytes".

      Done

      (24) On page 17, the procedures for in vitro fertilization, sperm penetration, and binding assessment in mice were described here in lines 560 to 574. Several problems are noted in this paragraph as listed below:

      a. As mentioned earlier the authors in the present manuscript mixed up sperm penetration and sperm binding which are two separate events. Based on the results presented in the manuscript, they represent sperm penetration and not sperm binding. Therefore, the authors need to precisely explain in the manuscript whether the results presented refer to sperm penetration or sperm binding.

      Both sperm penetration and binding have been analyzed in this work.

      b. In line 570 on page 17, the term "insemination" is wrongly used here. Insemination is the introduction of semen into the female reproductive tract either through sexual intercourse or through an instrument. The procedure used in the present study was carried out in vitro in a co-incubation manner and not by transferring sperm into the female reproductive tract.

      The word "insemination" has been changed to "incubation".

      c. Information regarding procedures for treatment with various oviductal fluid and OVGP1s are all missing in the Materials and Methods.

      This information is now in M&M

      d. The concentrations of various oviductal fluids and OVGP1s used and the number of ZPs used in each incubation are also missing.

      Concentrations are now indicated in the manuscript. All the numbers and ZPs used are indicated in supplementary figures.

      (25) Lines 577 to 603 on pages 17 to 18: Were recombinant bovine and murine glycoproteins prepared using the same methodology? In line 595 on page 18, it is stated that "Supernatant was saved in subsequent experiments." It is not clear exactly what experiments the supernatant was subsequently used in.

      Details of how the bovine and murine glycoproteins were prepared are now included. The sentence about subsequent experiments has been deleted; the supernatant was used for the next steps of protein purification.

      (26) What is being described in lines 604 to 609 on page 18 is problematic. The paragraph starts by saying that "Human recombinant oviductin was obtained from Origene Technologies....". Strangely, the paragraph continues by saying that the recombinant proteins were produced by transfection in HEK293T...". If recombinant human OVGP1 had already been obtained from Origene Technologies, why did the authors want to produce it again? It does not make sense.

      We briefly described the method that Origene used for the production of the human recombinant OVGP1

      (27) In lines 626 to 627 on page 18, it is stated that "Zonae pellucidae previously incubated with OVGP1 proteins from several species and murine oviductal fluid...". Were the zonae pellucidae previously incubated with only murine oviductal fluid or also with others?

      It was only incubated with OVGP1 or with oviductal fluid, this is now clarified in the text.

      (28) In lines 638 and 639 on page 19, can the authors please explain the difference between "endogenous OVGP1 and bOVGP1" and "exogenous recombinant hOVGP1 and mOVGP1"?

      This is now clarified

      (29) As stated in lines 676 to 679 on page 20, statistical analysis was performed in the study. Strangely, no "n" numbers and p values were provided in any of the figures that require statistical analysis. This is problematic.

      Statistical analysis and significant differences are now included in the figures; all the numbers used are included in the supplementary tables related to the figures.

      There are also many errors noted in the Figure Legends. These concerns raise questions regarding the reliability of the findings and interpretation of the results. Some major ones that require attention are listed below:

      (30) Figure legend 1 on page 27: In line 912, where did the "cat sperm" come from? In line 913, where did the "feline sperm" come from? In line 918, as pointed out earlier, the term "empty zona penetration test (EZPT)" is a misnomer and should be replaced with a better term. In line 924, it is stated that "Note sperm only appear outside the zona." However, no sperm can be seen outside the zona pellucida shown in Figure 1.

      Cat sperm is used in this manuscript. The term EZPT is now clarified. The sentence about sperm outside of the ZP has been removed.

      (31) Figure legend 2 on page 27 (lines 928 to 940) needs to be rewritten. Some of the sentences are not clearly written. Authors, please check all the capital labeling letters some of which appear to be wrong.

      Done

      (32) As is written, Figure legend 3 on pages 28 and 29 (lines 943 to 959) presents many problems:

      a. Contrary to what is stated in the figure legend, not all five regions are present in the hOVGP1, mOVGP1, and bOVGP1.

      Done

      b. Contrary to what is stated in line 946, region D is not conserved in the mouse and bull as shown in Figure 3A, and region C is not present only in the mouse.

      Done

      c. Based on what is shown in Figure 3A, region E is present only in the mouse and not in the human.

      Done

      d. What is stated in line 951 that "Proteins were expressed in mammalian cells..." is not correct. Based on the information provided in the manuscript, recombinant human OVGP1 was obtained from Origene Technologies and was not expressed in mammalian cells as claimed.

      All the recombinant proteins were produced in mammalian cells.

      (33) Figure legend 6 on page 28: In lines 985 to 986, what do the authors mean by "...and combinations of the three oviductins with sperm of the three species."? As is written, it appears that the bovine ZPs were pretreated with a combination of all three oviductins and then co-incubated with sperm from the bull, mouse and human together.

      We have clarified this sentence

      (34) What is described in the figure legend for the supplemental figure (Figure S7) does not make sense.

      Legend of Fig S7 (now S8) is related to pictures A to E, the legend is now clarified.

      (35) In addition to the figures and supplemental figures provided in the manuscript, there is also an additional figure labeled with "Model" showing three diagrams. Strangely, there is no mention of this additional figure in the manuscript. There is no figure legend for or description of this figure. It is not clear what is being shown in this figure, and it is not clear about the purpose of the use of this figure.

      We have included a legend to the model that is now Figure 10.

    1. Reviewer #2 (Public Review):

      Summary:

      Environmental influences on development are ubiquitous, affecting many phenotypes in organisms. However molecular genetic and cellular mechanisms transducing environmental signals are still only barely understood. This study examines part of one such intracellular mechanism in a polyphenic (or dimorphic) aphid.

      Strengths:

      While other published reports have linked phenotypic plasticity to RNA editing before, this study reports such an interaction in insects. The study uses a wide array of molecular tools to identify connections upstream and downstream of the RNA editing to elucidate the regulatory mechanism, which is illuminating.

      Weaknesses:

      While this system is intriguing, this report does not foster confidence in its conclusions. Many of the analyses seem based on very small sample sizes. It is itself problematic that sample sizes are not obvious in most figures, although based on Methods section covering RNAseq, they seem to be either 3, 6 or 9, depending on whether stages were pooled, but that point is not made clear. With such small sample sizes, statistical tests of any kind are unreliable. Besides the ambiguity on sample sizes, it's unclear what error bars or whiskers show in plots throughout this study. When sample sizes are small estimates of variance are not reliable. Student's t-test is not appropriate for comparisons with such small sample sizes. Presently, it is not possible to replicate the tests shown in Figures 3, 4 and 6. (Besides the HT-seq reads, other data should also be made publicly available, following the journal's recommendations.) Regardless, effect sizes in some comparisons (Fig 3J, 4A-C, 6E,H) are clearly not large, making confidence in conclusions low. The authors should be cautious about over-interpreting these data.

      [Editors' note: The authors made a great effort to address the reviewers' concerns. The current manuscript is significantly improved with additional data and clarification.]

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, a chromosome-level genome of the rose-grain aphid M. dirhodum was assembled with high quality, and A-to-I RNA-editing sites were systematically identified. The authors then demonstrated that: 1) Wing dimorphism induced by crowding in M. dirhodum is regulated by 20E (ecdysone signaling pathway); 2) an A-to-I RNA editing prevents the binding of miR-3036-5p to CYP18A1 (the enzyme required for 20E degradation), thus elevating CYP18A1 expression, decreasing 20E titer, and finally regulating the wing dimorphism of offspring.

      Strengths:

      The authors present both genome and A-to-I RNA editing data. An interesting finding is that an A-to-I RNA editing site in CYP18A1 disrupts the binding site of miR-3036-5p, and that loss of miR-3036-5p regulation leads to less 20E and to winged offspring.

      Weaknesses:

      How crowding represses miR-3036-5p is still unclear.

      Reviewer #2 (Public Review):

      Summary:

      Environmental influences on development are ubiquitous, affecting many phenotypes in organisms. However molecular genetic and cellular mechanisms transducing environmental signals are still only barely understood. This study examines part of one such intracellular mechanism in a polyphenic (or dimorphic) aphid.

      Strengths:

      While other published reports have linked phenotypic plasticity to RNA editing before, this study reports such an interaction in insects. The study uses a wide array of molecular tools to identify connections upstream and downstream of the RNA editing to elucidate the regulatory mechanism, which is illuminating.

      Weaknesses:

      While this system is intriguing, this report does not foster confidence in its conclusions. Many of the analyses seem based on very small sample sizes. It is itself problematic that sample sizes are not obvious in most figures, although based on Methods section covering RNAseq, they seem to be either 3, 6 or 9, depending on whether stages were pooled, but that point is not made clear. With such small sample sizes, statistical tests of any kind are unreliable. Besides the ambiguity on sample sizes, it's unclear what error bars or whiskers show in plots throughout this study. When sample sizes are small estimates of variance are not reliable. Student's t-test is not appropriate for comparisons with such small sample sizes. Presently, it is not possible to replicate the tests shown in Figures 3, 4 and 6. (Besides the HT-seq reads, other data should also be made publicly available, following the journal's recommendations.) Regardless, effect sizes in some comparisons (Fig 3J, 4A-C, 6E, H) are clearly not large, making confidence in conclusions low. The authors should be cautious about over-interpreting these data.

      We very much appreciate the time the reviewers spent on our manuscript, and we thank the referees for their valuable suggestions and comments.

      To Reviewer #1:

      At present, research on miRNAs mainly focuses on their role in gene regulation through binding to the mRNA of target genes; how miRNAs themselves are regulated has received less attention.

      Recent research indicates that the expression of miRNAs is also regulated at the transcriptional and post-transcriptional levels. Transcriptional regulation, including changes in the promoters of microRNA genes, and post-transcriptional mechanisms, such as changes in miRNA processing and stability, can both affect the final expression level of miRNAs.

      This article did not address how crowding treatment regulates miRNA expression. This is a very interesting issue, and we will pay attention to it in our future research.

      Thank you for this suggestion.

      To Reviewer #2:

      (1) “Transgenerational wing dimorphism was observed in M. dirhodum in which crowding of the parent (100 mother aphids in a 10 cm³ tube) increased the winged offspring (Fig 3E).” In this experiment, over 250 offspring were used to calculate the proportion of winged and wingless individuals in the normal (277), crowding (255) and crowding+20E (272) groups, respectively.

      “The RNAi-mediated knockdown of CYP18A1 and ADAR2 can significantly increase the titer of 20E (Fig. 4E) and reduce the number of winged offspring by 29.6% and 24.4% (Fig. 4F), respectively.” In this experiment, over 245 offspring were used to calculate the proportion of winged and wingless individuals in the dsEGFP (273), dsCYP18A1 (248), and dsADAR2 (250) groups, respectively.

      “miR-3036-5p agomir and antagomir treatments could affect the proportion of winged offspring under normal conditions (Fig. 6F), but have no effect on the wing dimorphism of offspring under crowded conditions (Fig. 6L).” In this experiment, over 235 offspring were used to calculate the proportion of winged and wingless individuals in each group.

      We therefore think that our conclusion that crowding treatment, A-to-I RNA editing, and miRNAs could affect the wing dimorphism of offspring in M. dirhodum is reliable, because the number of aphids counted is sufficient.

      (2) Quantitative PCR was used to detect changes in the expression levels of CYP18A1 and ADAR2 after treatment with crowding, 20E, dsRNA, miRNA agomir and antagomir, and the results are shown in Fig. 3J, 4A-C, 5B, 6B, H, respectively. Five biological replicates (more than 100 aphids per biological replicate) were used for each sample, which should be sufficient for qPCR experiments. The differences in gene expression levels among these biological replicates are relatively small.

      (3) The titer of 20E was measured after treatment with crowding, 20E, dsRNA, miRNA agomir and antagomir, and the results are shown in Fig. 3I, 4E, 6E, K, respectively. Eight biological replicates (more than 100 aphids per biological replicate) were used for each sample.

      The number of biological replicates used in each analysis and the number of aphids included in each biological replicate have been added in the Materials and Methods section. Thank you very much for pointing out this important issue.
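
      As a rough illustration of the sample-size argument in point (1), the sketch below computes the worst-case 95% margin of error for a proportion estimated from the reported group sizes (277, 255 and 272 offspring). It is a minimal sketch, not the authors' analysis: it assumes a normal approximation and treats every offspring as an independent observation, which ignores any clustering by mother or tube. The function name `worst_case_margin` is hypothetical.

```python
import math


def worst_case_margin(n, z=1.96):
    """Largest normal-approximation half-width of a confidence interval
    for a proportion estimated from n individuals (attained at p = 0.5)."""
    return z * math.sqrt(0.25 / n)


# Group sizes quoted in point (1) of the response.
group_sizes = {"normal": 277, "crowding": 255, "crowding+20E": 272}

# Assumption: offspring are independent observations (no clustering by mother).
margins = {name: worst_case_margin(n) for name, n in group_sizes.items()}

for name, margin in margins.items():
    print(f"{name}: n={group_sizes[name]}, margin <= {margin * 100:.1f} percentage points")
```

      Under these assumptions each proportion is determined to within roughly six percentage points. The counts bear only on the winged-offspring proportions; the qPCR and 20E titer comparisons depend on the replicate numbers given in points (2) and (3).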

      Reviewer #1 (Recommendations For The Authors):

      Several questions:

      (1) This study was conducted on the rose-grain aphid M. dirhodum. However, pea aphid Acyrthosiphon pisum seems to be a better object in wing dimorphism and development studies. Have the authors also identified the A-to-I RNA editing on pea aphids or other aphids?

      Wheat is one of the main grain crops in China and worldwide. Metopolophium dirhodum is one of the most important wheat aphids in China and poses a significant threat to grain production. The current study was conducted to determine the regulatory mechanism of wing dimorphism in M. dirhodum, which might be of great significance for better control of this pest in wheat production.

      Admittedly, the pea aphid offers more established experimental tools and genomic resources. However, with the development of high-throughput sequencing technology, chromosome-level genomes have been assembled for many insect species. This means that a wide range of insects can now be studied as model species, not only Drosophila melanogaster, Acyrthosiphon pisum, etc.

      We did not identify A-to-I RNA editing in pea aphids or other aphids. A recent study has shown that editing events are poorly conserved across different Xenopus species. Even sites that are detected in both X. laevis and X. tropicalis show largely divergent editing levels or developmental profiles. In protein-coding regions, only a small subset of sites that are found mostly in the brain are well conserved between frogs and mammals. The conservation of RNA editing in aphids is still unknown, and we will continue to pay attention to this issue in our future research.

      Reference: Nguyen TA, Heng JWJ, Ng YT, Sun R, Fisher S, Oguz G, Kaewsapsak P, Xue S, Reversade B, Ramasamy A, Eisenberg E, Tan MH. Deep transcriptome profiling reveals limited conservation of A-to-I RNA editing in Xenopus. BMC Biology. 2023, 21(1):251.

      (2) "Two miRNA-target prediction software programs, miRanda and RNAhybrid, were used to identify the miRNAs that potentially act on CYP18A1. The results showed that miR-3036-5p could bind to the sequence containing edited position (editing site 528) of CYP18A1 in M. dirhodum." Is there any other miRNA that can also act on CYP18A1, thereby regulating its expression?

      The predictions indicate that several other miRNAs can act on CYP18A1, but none of them can bind to this editing site (editing site 528). Therefore, we did not pursue the other miRNAs.

      (3) 11678 A-to-I RNA-editing sites were systematically identified in M. dirhodum. Does that mean RNAi-mediated knockdown of ADAR2 may affect the RNA-editing and expression of a large number of genes? Please clarify.

      It is of course possible that RNAi-mediated knockdown of ADAR2 affects the RNA editing and expression of a large number of genes. A-to-I RNA editing was also observed in five other genes involved in the 20E biosynthesis and signaling pathways, but no evident difference in RNA editing or expression levels was identified for these five genes after crowding treatment (Fig. S2, Table S5). This suggests that the A-to-I RNA editing of CYP18A1 might be crucial in 20E-mediated wing dimorphism in M. dirhodum.

      (4) It is interesting that "the transcriptional level of ADAR2 was 2.19 fold higher in the crowding+20E treatment parent than that in the normal group, but no significant difference was identified between the crowding and normal groups". ADAR2 can be induced by 20E, rather than crowding. How should the author explain? It seems that 20E induction can also cause many RNA editing events.

      20-hydroxyecdysone (20E) can affect the growth and development, molting, metamorphosis, and reproductive processes of insects. According to this result, 20E induction can also cause RNA editing events by regulating the expression of ADAR2, which may provide a valuable reference for future studies on 20E. We will also continue to pay attention to this issue in our future research.

      (5) Authors provided a lot of text to describe the genome assembly. I don't think it's necessary, authors can make appropriate deletions.

      Thank you for this suggestion. This is the first high-quality chromosome-level genome of M. dirhodum, which will be very helpful for the cloning, functional verification, and evolutionary analysis of genes in this important species or even other Hemiptera insects. Therefore, I think it is necessary to provide a detailed description. We will also make appropriate deletions in the “Result and Discussion” sections.

      Reviewer #2 (Recommendations For The Authors):

      Additional concerns

      - With an existing genome sequence available for the pea aphid *Acyrthosiphon pisum*, why have these authors chosen to use the rose-grain aphid for this study? It would be helpful to address any limitations in *Acyrthosiphon pisum* or advantages in *Metopolophium dirhodum* that explain that decision.

      Wheat is one of the main grain crops in China and worldwide. Metopolophium dirhodum is one of the most important wheat aphids in China and poses a significant threat to grain production. The current study was conducted to determine the regulatory mechanism of wing dimorphism in M. dirhodum, which might be of great significance for better control of this pest in wheat production.

      Admittedly, the pea aphid offers more established experimental tools and genomic resources. However, with the development of high-throughput sequencing technology, chromosome-level genomes have been assembled for many insect species. This means that a wide range of insects can now be studied as model species, not only Drosophila melanogaster, Acyrthosiphon pisum, etc.

      - In Figure 5E, what anatomy is being shown in FISH? Moreover, this represents a single sample. It would be preferable to include a supplemental figure with comparable images from at least 3 additional specimens.

      It is the whole aphid body, and we have uploaded two additional FISH images to the supplementary material (Fig. S5). Thank you for this suggestion.

      - L190: Conservation alone seems inadequate to conclude that a chromosome functions as a sex chromosome. It would be fine to note the homology between Chr1 and the X of other Aphidini, but there are other explanations for that. Inference that Chr 1 is a sex chromosome might come from observations in karyotypes (by relative size comparisons or ideally from FISH) or from comparison of reads mapped to the chromosomes, suggesting Chr1 is hemizygous in males.

      Karyotype analysis was not conducted in this research, so the sex chromosome was inferred from chromosome homology between the M. dirhodum and A. pisum genomes. We have made appropriate modifications to the description in the article. Thank you for this suggestion.

      - L205: It's unclear to me how to interpret RNA editing results, based on RNAseq data, that map to "intergenic regions", especially when this is such a large fraction (37.3%) of the total result. Does this suggest a fundamental problem with the analysis, that so much RNAseq data maps to parts of the genome that are not annotated as genes?

      Non-coding RNA regions often account for a large proportion of the genome, and these RNA-seq data (37.3%) map to non-coding RNA transcription regions located between protein-coding genes (intergenic regions).

      - L288-290: What degrees of confidence are attached to the predictions of these miRNA targets?

      There is no clear research indicating the accuracy of miRNA target prediction software. However, by comprehensively utilizing multiple prediction tools and experimental verification, the accuracy and reliability of prediction can be significantly improved.

      Actually, the prediction of miRNA targets is only a preliminary identification step, and we have subsequently demonstrated that miR-3036-5p can act on CYP18A1 through dual-luciferase reporter assay, RNA immunoprecipitation and FISH, etc.

      - L296-298: The mechanism proposed in this study seems to imply that miR-3036-5p should be absent (not expressed) in aphids under crowded conditions. Therefore, relative realtime PCR is not particularly useful here. Finding that the miR relative expression is reduced by 48.8% is meaningless, because in *relative* expression, zero has no special meaning. In this case, absolute quantitative PCR, measuring actual transcript numbers, would be far more informative.

      miR-3036-5p is not absent in aphids under crowded conditions. We identified only a significant decrease in the expression level of miR-3036-5p under crowded conditions compared with normal feeding conditions (Fig. 5B). It is therefore reasonable to use relative quantification for the expression level analysis.
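
      To make the scale of the reported change concrete, the sketch below converts the 48.8% reduction into the cycle-threshold shift it would correspond to under the widely used 2^-ΔΔCt relative quantification model. This is an assumption made for illustration: the response does not state which quantification model was used, the model presumes 100% amplification efficiency, and the function names are hypothetical.

```python
import math


def fold_change(ddct):
    """Relative expression under the 2^-ddCt model (assumes 100% PCR efficiency)."""
    return 2.0 ** (-ddct)


def ddct_from_reduction(reduction):
    """ddCt that corresponds to a fractional reduction in relative expression."""
    return -math.log2(1.0 - reduction)


# 48.8% is the reduction in miR-3036-5p relative expression quoted by the reviewer.
ddct = ddct_from_reduction(0.488)
remaining = fold_change(ddct)

print(f"ddCt = {ddct:.3f} cycles, remaining relative expression = {remaining:.3f}")
```

      Under this model the reduction corresponds to a shift of just under one cycle, that is, roughly a halving of relative expression. Relative quantification reports this ratio but not the absolute number of transcripts.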

      - L361: Isn't alternative mRNA splicing a more common post-transcriptional modification?

      We apologize for this error. The sentence has been modified to “A-to-I RNA editing is one of the most prevalent forms of posttranscriptional modification in animals, plants, and other organisms.” Thank you for this suggestion.

      - L372: "Functional wing polymorphism is commonly observed in insects as a form of adaptation and a source of variation for natural selection (14)." The relationship between plastic phenotypic variation and natural selection is complex, and there is a large theoretical literature in evolutionary biology and evo-devo on this topic, but it is not a focus in the cited review by Zhang et al.. It would be helpful if the authors could expand on this idea with reference to some of this literature (e.g. Levins 1968; Harrison 1980; Moran 1992; Roff 1996; West-Eberhard 2003; Zera 2009).

      I have changed the citation and expanded on this idea. “Wing polymorphism is commonly observed in insects, resulting from variation in both genetic factors and environmental factors (Zera 2009).”

      - L404: Use of the word "accurate" seems inappropriate in this context. Both morphs are equally "accurate".

      This sentence has been modified to “resulting in the alteration of CYP18A1 expression and wing dimorphism of offspring regulated by miR-3036-5p”. Thank you for this suggestion.

      - L412: Reference 67 seems irrelevant to this point.

      References have been changed and added.

      67. E.J. Duncan, C.B. Cunningham, P.K. Dearden. Phenotypic plasticity: what has DNA methylation got to do with it? Insects. 13(2):110 (2022).

      68. K.J. Rangan, S.L. Reck-Peterson, RNA recoding in cephalopods tailors microtubule motor protein function. Cell 186, 2531-2543 (2023).

      - L443: Is this referring to "mixed stage" aphids?

      Yes. To make it clearer, this sentence has been modified to “Approximately 200 mg of fresh M. dirhodum with mixed stages (including first- to fourth-instar nymphs and winged and wingless adults)”.

      - L483: What mass or number of individual aphids was used? I assume multiple individuals were pooled?

      Each sample contains approximately 200 aphids.

      - L499: Why was k = 17 used? The default is k = 21.

      The selection of k is usually an odd number between 15 and 21, which ensures that the types of k-mers can cover the genome while being small enough to avoid erroneous effects. Therefore, using 17 is very reasonable.
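
      The trade-off behind the choice of k can be sketched numerically. For odd k no k-mer equals its own reverse complement, so the number of distinct canonical k-mers is exactly 4^k / 2; the sketch compares that space with a genome size. The genome size used here is a placeholder chosen only for illustration (it is not taken from the manuscript), and `canonical_kmer_space` is a hypothetical helper.

```python
def canonical_kmer_space(k: int) -> int:
    """Number of distinct canonical k-mers for odd k, where a k-mer and its
    reverse complement are counted as one."""
    if k % 2 == 0:
        # Even k admits reverse-complement palindromes, so 4**k // 2 undercounts.
        raise ValueError("the formula 4**k // 2 holds only for odd k")
    return 4 ** k // 2


# Placeholder genome size in bases; NOT a value from the manuscript.
ILLUSTRATIVE_GENOME_SIZE = 500_000_000

for k in (15, 17, 19, 21):
    space = canonical_kmer_space(k)
    ratio = space / ILLUSTRATIVE_GENOME_SIZE
    print(f"k={k}: {space:,} canonical k-mers, {ratio:.1f}x the placeholder genome size")
```

      A larger k-mer space relative to the genome makes chance repeats less likely, while a smaller k keeps more k-mers free of sequencing errors; the appropriate balance depends on the actual genome size and read quality.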

      - L574: what does it mean "multiple editing types"? What different types are possible? Are you referring to things other than A-to-I editing?

      It means that, besides A-to-I, the locus may also show other editing types, such as A-to-C. Loci where this occurs are discarded.

      - L635: Which luciferase construct or plasmid has been used in this experiment? Citation to that source is necessary.

      The pmirGLO vector (Promega, Leiden, Netherlands) was used in this experiment, and a reference has been added.

      B. Zhu, L. Li, R. Wei, P. Liang, X. Gao. Regulation of GSTu1-mediated insecticide resistance in Plutella xylostella by miRNA and lncRNA. PLoS Genetics. 17(10), e1009888 (2021).

      - L644: Did cDNA synthesis employ random primers or a poly-dT primer?

      This kit provides mixed primers, including random and poly-dT primers. (PrimeScript™ RT reagent Kit with gDNA Eraser (Perfect Real Time), Takara Biotechnology, Dalian, China).

      - Fig 4D: Seems like this panel should be divided to cover the two sites, as in Fig 3F. Right now the x-axis labels seem redundant.

      Done. Thank you for this suggestion.

      - Fig 7: Consider adding ADAR2 to this figure.

      Done. Thank you for this suggestion.

      - Table 1: It would be helpful to represent this data in a figure where the phylogenetic relationships among the species can be shown.

      The phylogenetic relationships among the species were shown in Fig. 1D, and the table here may present genome information in more detail.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review)

      This paper focuses on secondary structure and homodimers in the HIV genome. The authors introduce a new method called HiCapR which reveals secondary structure, homodimer, and long-range interactions in the HIV genome. The experimental design and data analysis are well-documented and statistically sound. However, the manuscript could be further improved in the following aspects.

      Major comments:

      (1) Please give the full name of an abbreviation the first time it appears in the paper, for example, in L37, "5' UTR" "RRE".

      Thank you for your suggestion. We have added the full name of these abbreviations.

      (2) The introduction could be strengthened by discussing the limitations of existing methods for studying HIV RNA structures and interactions and highlighting the specific advantages of the HiCapR method.

      Thank you for your insightful suggestion. We have modified sentences in the introduction section (lines 66-71 and lines 80-81 in the revised manuscript).

      (3) Please reorganize Results Part 1.

      Thank you for your advice. We have reorganized Results Part 1. We hope the revision provides a logical flow and clarity to the results, making it easier for readers to follow the progression of the study and the significance of the findings regarding the HiCapR method.

      (4) Is there any reason that the authors mention "genome structure of SARS-CoV-2" in L95?

      Thank you for your insightful question. We have deleted this sentence in the revised paper.

      Initially, the mention of our previous work on SARS-CoV-2 served two purposes: first, to demonstrate our capability to perform proximity ligation assays on viral samples; and second, to underscore the necessity of the hybridization step, which is particularly relevant for the study of HIV.

      Unlike SARS-CoV-2, which is highly abundant in infected cells and does not require post-library hybridization, HIV-1 presents a unique challenge due to its typically low viral RNA input within cells. The simplified SPLASH protocol, while effective for more abundant viral RNAs, does not provide the necessary coverage for high-resolution analysis when applied directly to HIV samples.

      Now, we have deleted this sentence according to your comments, and discuss the technical difference elsewhere.

      (5) L102: Please clarify the purpose of comparing "NL4-3" and "GX2005002." Additionally, could you explain what NL4-3 and GX2005002 are? The connection between NL4-3, GX2005002, and HIV appears to be missing.

      Thank you for your question, and we apologize for the confusion. "NL4-3" and "GX2005002" are two distinct HIV-1 strains that exhibit different prevalence patterns in various geographical regions. The NL4-3 strain is a well-characterized laboratory strain that is widely used in HIV research and is representative of the HIV-1 subtype B, which is highly prevalent in Europe and the Americas. On the other hand, GX2005002 is a primary isolate of the CRF01_AE subtype, which is one of the most prevalent strains in Southeast Asia, particularly in China.

      The reason for comparing these two strains in our study is twofold. Firstly, it allows us to assess the applicability and versatility of our HiCapR method across different HIV-1 strains that may have distinct genetic and structural features. This is crucial for understanding the potential broad utility of our method in studying various HIV-1 strains globally. Secondly, by comparing these strains, we can begin to elucidate any strain-specific differences in RNA structure, homodimer formation, and long-range interactions, which may have implications for viral pathogenesis, transmission, and response to therapeutic interventions.

      The connection between NL4-3, GX2005002, and HIV lies in their representation of different subtypes of the HIV-1 virus, which exhibit genetic diversity and are associated with different geographical distributions. This diversity is epidemiologically and clinically relevant, as it may be associated with different pathogenesis and resistance mechanisms, and might have implications for vaccine development and treatment strategies.

      (6) Figure 1A is not able to clearly present the innovation point of HiCapR.

      Thank you for your comment. We have revised this figure to more clearly illustrate the steps and principles of the post-library capture process, in which hybridization with pooled HIV probes and streptavidin pull-down enrich HIV RNA-derived chimeras.

      (7) Please compare the contact metrics detected by HiCapR and current techniques like SHAPE on the local interactions to assess the accuracy of HiCapR in capturing local RNA interactions relative to established methods.

      Thank you for this request. We address below how HiCapR compares with established methods such as SHAPE in capturing local RNA interactions.

      In this study, HiCapR has demonstrated its ability to identify key structural elements within the HIV genome, including TAR, polyA, SL1, SL2, and SL3, as well as the polyA-SL1 in the monomeric conformation. These elements are crucial for understanding the local RNA structures involved in HIV replication and pathogenesis. By visualizing the base pairing probability as a heatmap, we have identified the most stable base pairs in the 5’ UTR of HIV, which are consistent across both NL4-3 and GX2005002 strains (Figure 2D). This consistency suggests robustness in the overall structure despite sequence variations and alternative RNA conformations, indicating a high level of agreement between HiCapR and SHAPE methods in detecting local interactions.

      Furthermore, HiCapR not only confirms the presence of known structural elements but also reveals alternative conformations of the 5'UTR that support the alternative conformations found in SHAPE analysis. This additional layer of information provides a more comprehensive view of the RNA structures, highlighting HiCapR's ability to capture local RNA interactions with a high degree of accuracy comparable to established methods like SHAPE.

      (8) The paper needs further language editing.

      We have thoroughly revised the paper. We hope it’s improved significantly.

      Reviewer #2 (Public review):

      Summary:

      In the manuscript "Mapping HIV-1 RNA Structure, Homodimers, Long-Range Interactions and persistent domains by HiCapR", Zhang et al. report results from an omics-type approach to mapping RNA crosslinks within the HIV RNA genome under different conditions, i.e. in infected cells and in virions. Reportedly, they used a previously published method which, in the present case, was improved for application to RNAs of low abundance.

      Their claims include the detection of numerous long-range interactions, some of which differ between cellular and virion RNA. Further claims concern the detection and analysis of homodimers.

      Strengths:

      (1) The method developed here works with extremely little viral RNA input and allows for the comparison of RNA from infected cells versus virions.

      (2) The findings, if validated properly, are certainly interesting to the community.

      Thank you for your comprehensive review and insightful comments on our manuscript. We appreciate your recognition of the strengths of our HiCapR method and the potential interest of our findings to the scientific community.

      Weaknesses:

      (1) On the communication level, the present version of the manuscript suffers from a number of shortcomings. I may be insufficiently familiar with habits in this community, but for RNA aficionados just a little bit outside of the viral-RNA-X-link community, the original method (reference 22) and the presumed improvement here are far too little explained, namely in something like three lines (98-100). This is not at all conducive to further reading.

      Thank you for your feedback on the clarity of our manuscript, particularly regarding the explanation of the HiCapR method and its improvements over the original method mentioned in reference 22.

      In response to your feedback, we have expanded the description of the HiCapR method in the revised manuscript to ensure that it is accessible to a broader audience. We provide a more thorough comparison between HiCapR and the original method, detailing the specific improvements and how they enable the analysis of low-abundance viral RNAs like HIV. This includes:

      Post-library Hybridization: Unlike the original method, HiCapR incorporates a post-library hybridization step. This innovation allows for the capture of target RNA involved in interactions after library construction, offering additional flexibility and enhancing the resolution of the analysis.

      Enhanced Sensitivity: HiCapR has been optimized to work with extremely low viral RNA input, which is a significant advancement over the original method. This is crucial for studying viruses like HIV, where obtaining high quantities of viral RNA can be challenging. As a matter of fact,

      (2) Experimentally, the manuscript seems to be based on a single biological replicate, so there is strong concern about reproducibility.

      Thank you for raising the issue of reproducibility in our study. We understand the importance of experimental replication in ensuring the reliability of our findings. In response to your concern, we would like to provide the following clarification and additional details regarding the reproducibility of our HiCapR experiments:

      Replicates in HiCapR Experiments: All ligation and control samples in our HiCapR experiments were performed in three biological replicates. This was done to ensure the high reproducibility of our results. The high degree of correlation (r > 0.99) between these replicates underscores the reliability of our findings.

      Dimer Validation Experiments: To validate the dimer formation of RRE and 5’-UTR, we employed multiple independent methods, including native agarose gel electrophoresis, capillary electrophoresis on the Agilent 4200 TapeStation, and biomolecular binding kinetics assays. These methods provide complementary perspectives on the dimer formation, enhancing the robustness of our validation process. The data presented in Figure 3C and Supplementary Figure S12 are representative results from these experiments, which consistently support our findings on dimer formation.

      Agreement Between Cellular and Virion RNA: Our study also demonstrates a significant similarity between virions in the supernatant and infected cells from the same viral strain, as shown in Supplementary Figure S3. This consistency further validates the reproducibility and reliability of our HiCapR method in capturing RNA structures and interactions under different conditions.

      Consistency across two strains: Our study includes a comprehensive analysis of two distinct HIV-1 strains, NL4-3 and GX2005002, which are prevalent in Europe and Southeast Asia, respectively. The consistency in our findings across these strains serves as a strong indicator of the reproducibility and general applicability of our HiCapR method. Specifically, the presence of key structural elements such as TAR, polyA, SL1, SL2, and SL3 in both NL4-3 and GX2005002 strains suggests a robust structural framework that is conserved across different strains, despite sequence variations. Additionally, our study reveals approximately 20 candidate dimer peaks conserved between the NL4-3 and GX2005002 strains along the genome. The conservation of these dimer peaks across strains indicates a reproducible pattern of dimerization.
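
      The replicate agreement cited above (r > 0.99) is a pairwise Pearson correlation. As a minimal sketch of how such a figure could be computed, assuming each replicate is summarized as a square contact matrix, the snippet below correlates flattened matrices. The function name `replicate_correlations` and the simulated matrices are hypothetical illustrations, not HiCapR data or the pipeline used in the study.

```python
import numpy as np


def replicate_correlations(replicates):
    """Return the matrix of pairwise Pearson r between flattened contact matrices."""
    # Each row of `flat` is one replicate; np.corrcoef treats rows as variables.
    flat = np.array([np.asarray(m, dtype=float).ravel() for m in replicates])
    return np.corrcoef(flat)


# Hypothetical data: one shared contact signal plus small independent noise
# per replicate (assumption: three replicates on a 40 x 40 bin grid).
rng = np.random.default_rng(0)
signal = rng.gamma(shape=2.0, scale=50.0, size=(40, 40))
replicates = [signal + rng.normal(0.0, 1.0, size=signal.shape) for _ in range(3)]

r = replicate_correlations(replicates)
print(np.round(r, 4))
```

      Because contact counts are typically dominated by a few strong signals, a correlation on raw counts can be high even when weak contacts disagree; reporting it alongside a rank-based or log-scale comparison is one way to make the claim more informative.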

      (3) The authors perform an extensive computational analysis from a limited number of datasets, which are in thorough need of experimental validation.

      Thank you for your comment.

      In response to your concern, we would like to clarify that while our manuscript does present an extensive computational analysis, we have also conducted a series of experiments. Specifically, we have validated dimer formation using multiple independent methods (discussed above).

      Given the time-consuming nature of additional experiments, we have chosen to share the HiCapR data with the community in a timely manner. This approach allows for more immediate communication and evaluation of the data on HIV structure, which we believe is valuable for advancing the field.

      We are committed to further investigating the functional implications of our structural findings. We plan to conduct more experiments to link these structural insights to HIV function, which will help to deepen our understanding of the virus's replication and potential antiviral strategies.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      I suggest a major revision of the manuscript.

      Minor comments:

      (1) The article lacks consistency in its presentation, and some proper nouns are written incorrectly. For example, (a) L89, "RNA:RNA interaction" → RNA-RNA interaction; (b) L431, "SARS-COV-2" → SARS-CoV-2.

      We are sorry for the inconsistency. We have corrected the mistakes.

      (2) "We identified dimers based on the methodology described in23." is not a complete sentence.

      Thank you for your insightful comment. We have revised the sentence to provide a complete and clear description of our methodology. The revised sentence is as follows: "Homodimers were identified in accordance with the methods previously reported in the literature."

      Reviewer #2 (Recommendations for the authors):

      (1) The authors perform an extensive computational analysis from a limited number of datasets, which are in thorough need of experimental validation. There is a single series on in vitro validation of the interaction of a homodimerization site, described in five lines (278-283) plus the Figure panel 3c with a very brief legend, and an extremely minimalist Figure S12. The panel to Figure 3c contains Kd values which have not been assessed for significant digits.

      Thank you for your constructive feedback on our manuscript.

      We acknowledge that our computational analysis is based on a limited number of datasets. Due to the initial exploratory nature of our study and the logistical challenges of generating additional datasets, we have focused on in-depth analysis of the available data. We are currently working on further validating our findings and are committed to publishing these results in a follow-up study.

      Regarding Experimental Validation:

      We agree that the initial description of our in vitro validation of the homodimerization site was concise. To address this, we have expanded the description of our experimental procedures. Specifically, we have detailed the methods used for the in vitro transcription, the preparation of RNA samples, and the use of the Octet R8 platform for biomolecular binding kinetics assays.

      For the Kd values presented in Figure 3c, we have now included the standard error of the mean and have defined the significant digits in the figure legend. This revision provides a more accurate representation of the binding affinities.
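
      One common convention for reporting a fitted constant with its uncertainty is to round the standard error of the mean (SEM) to one significant digit and the mean to the same decimal place. The sketch below illustrates that convention only; the helper names `mean_sem` and `format_with_sem` and the input values are hypothetical and are not taken from Figure 3c.

```python
import math
import statistics


def mean_sem(values):
    """Return the mean and the standard error of the mean of repeated measurements."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def format_with_sem(mean, sem, sem_digits=1):
    """Round the SEM to `sem_digits` significant digits and the mean to the same place."""
    if sem <= 0:
        raise ValueError("SEM must be positive")
    exponent = math.floor(math.log10(sem))   # decimal position of the leading SEM digit
    decimals = sem_digits - 1 - exponent     # negative values round to tens, hundreds, ...
    places = max(decimals, 0)                # digits to print after the decimal point
    return f"{round(mean, decimals):.{places}f} ± {round(sem, decimals):.{places}f}"


# Hypothetical repeated estimates of a dissociation constant (arbitrary units).
estimates = [10.0, 12.0, 14.0]
mean, sem = mean_sem(estimates)
print(format_with_sem(mean, sem))
```

      Rounding the mean to the precision of its SEM avoids implying more certainty than the replicate measurements support.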

      (2) As a further example to be experimentally validated, splice sites are discussed after lines 354, for which unsophisticated validation techniques such as targeted RT-PCR are widely accepted.

      In response to your comment, we would like to clarify that the splice sites mentioned in our study are well-established and widely recognized in the literature. They have been previously characterized and are considered canonical within the HIV research community. Given their established nature, we have relied on this foundational knowledge in our analysis.

      However, we concur with the importance of validating the regulatory role of homodimers in splicing, which is a significant aspect of HIV biology. While we have provided evidence for the presence of these homodimers and their potential implications for splicing, we acknowledge the need for further functional studies to elucidate their mechanistic role.

      Due to the scope and length constraints of the current manuscript, we have chosen to focus on the structural and interaction analyses provided by HiCapR. The functional validation of these homodimers and their impact on splicing will be pursued in subsequent studies, which we plan to initiate promptly. We believe that a dedicated follow-up study will allow for a more in-depth exploration of this complex and important aspect of HIV gene regulation.

      We are committed to advancing our understanding of the functional significance of these homodimers in the context of HIV splicing and will ensure that this line of investigation is thoroughly addressed in our future work.

      Thank you again for your valuable feedback. We look forward to contributing further to the field with our ongoing research.

    1. Reviewer #2 (Public Review):

      Summary:

      In this work, the authors show that the camelid single-chain antibody sdAb42 selectively inhibits Trypanosome pyruvate kinase (PYK) but not human PYK. Through the determination of the crystal structure and biophysical experiments, the authors show that the nanobody binds to the inactive T-state of the enzyme, and in silico analysis shows that the binding site coincides with an allosteric hotspot, suggesting that nanobody binding may affect the enzyme active site. Binding to the T-state of the enzyme is further supported by non-linear inhibition kinetics. PYK is an important enzyme in the glycolytic pathway, and inhibition is likely to have an impact on organisms such as trypanosomes that rely heavily on glycolysis for their energy production. The nanobody was generated against Trypanosoma congolense PYK, but for technical reasons the authors progressed to testing its impact on cell viability in Trypanosoma brucei brucei. First, they show that sdAb42 is able to inhibit Tbb PYK, albeit with lower potency. Cell-based experiments next show that expression of sdAb42 has a modest, dose-dependent effect on the growth rate of Tbb. The authors conclude that their data indicate that targeting this allosteric site affects cell growth and is a valuable new option for the development of new chemotherapeutics for trypanosomatid diseases.

      Strengths:

      The work clearly shows that sdAb42 inhibits Trypanosome and Leishmania PYK selectively, with no inhibition of the human orthologue. The crystal structure clearly identifies the binding site of the nanobody, and the accompanying analysis supports that the antibody acts as an allosteric inhibitor of PYK, by locking the enzyme in its apo state (T-state).

      Weaknesses:

      (1) The most impactful claim of this work is that sdAb42-mediated inhibition of PYK negatively affects parasite growth and that this presents an opportunity to develop novel chemotherapeutics for trypanosomatid diseases. For the following reasons I think this claim is not sufficiently supported:

      - The authors do not provide evidence of target engagement in cells, i.e. they do not show that sdAb42 binds to, or inhibits, Tbb PYK in cells, nor do they provide a functional output consistent with PYK inhibition (e.g. effect on ATP production). Measuring the extent of target engagement and inhibition is important to draw conclusions from the modest effect on growth.

      - The authors do not explore the selectivity of sdAb42 in cells. Potentially sdAb42 may cross-react with other proteins in cells, which would confound interpretation of the results.

      - sdAb42 causes only minor growth inhibition in Tbb. The growth defect is used as the main evidence to support targeting this site with chemotherapeutics; however, based on the very modest effect on the parasites, one could reasonably claim that PYK is actually not a good drug target. The strongest effect on growth is seen for the high expressor clone in Figure 4a; however, here the uninduced cells show an unusual profile, with a sudden increase in growth rate after 4 days, something that is not seen for any of the other control plots. This unexplained observation accentuates the growth difference between induced and uninduced cells, and the growth differences seen in all other experiments, including those with the highest expressors (clones 54 and 55), are much more modest. The loss of expression of sdAb42 over time is presented as a reason for the limited effect, and used to further support the hypothesis that targeting the allosteric site is a suitable avenue for the development of new drugs. However, strong evidence for this is missing.

      - For chemotherapeutic interventions to be possible, a ligandable site is required. There is no analysis provided of the antibody binding site to indicate that small molecule binding is indeed feasible.

      (2) The authors comment on the modest growth inhibition, and refer to the need to achieve over 88% reduction in Vmax of PYK to see a strong effect, something that may or may not be achieved in the cell-based model (no target-engagement or functional readout provided). The slow-binding model and the switch of species are also raised as potential explanations. While these may be plausible explanations, they are not tested, which leaves us with limited evidence to support targeting the allosteric site on PYK.

      (3) The evidence to support an allosteric mechanism is derived from structural studies, including the in silico allosteric network predictions. Unfortunately, standard enzyme kinetics mode of inhibition studies are missing. Such studies could distinguish uncompetitive from non-competitive behaviour and strengthen the claim that sdAb42 locks the enzyme complex in the apo form.

      (4) As a general comment, the graphical representation of the data could be improved in line with recent recommendations: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128, https://elifesciences.org/inside-elife/5114d8e9/webinar-report-transforming-data-visualisation-to-improve-transparency-and-reproducibility.

      - Bar-charts for potency are ideally presented as dot plots, showing the individual data points, or box plots with datapoints shown.

      - Images in Figure 7 show significant heterogeneity of nanobody expression, but the extent of this cannot be gleaned from Figure 7B. It would be much better to use box plots or violin plots for each cell line on this figure panel. The same applies to Figure 10.
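
      The mode-of-inhibition studies requested in point (3) rest on how an inhibitor shifts the apparent Vmax and Km. The sketch below uses the textbook Michaelis-Menten expressions for the three classical modes, with a single inhibition constant. This is a deliberate simplification: PYK is an allosterically regulated tetramer, so real data may need a cooperative model. The function names and parameter values are hypothetical.

```python
import math


def apparent_parameters(vmax, km, inhibitor, ki, mode):
    """Return (apparent Vmax, apparent Km) for a classical reversible inhibitor."""
    factor = 1.0 + inhibitor / ki
    if mode == "competitive":       # binds free enzyme only: Km rises, Vmax unchanged
        return vmax, km * factor
    if mode == "uncompetitive":     # binds enzyme-substrate complex only: both fall
        return vmax / factor, km / factor
    if mode == "noncompetitive":    # binds both with equal affinity: Vmax falls, Km unchanged
        return vmax / factor, km
    raise ValueError(f"unknown mode: {mode}")


def rate(substrate, vmax, km):
    """Michaelis-Menten initial rate."""
    return vmax * substrate / (km + substrate)


# Hypothetical parameters in arbitrary units.
VMAX, KM, INHIBITOR, KI = 100.0, 2.0, 4.0, 2.0
for mode in ("competitive", "uncompetitive", "noncompetitive"):
    vmax_app, km_app = apparent_parameters(VMAX, KM, INHIBITOR, KI, mode)
    print(mode, round(vmax_app, 2), round(km_app, 2), round(rate(KM, vmax_app, km_app), 2))
```

      In this simplified picture, uncompetitive inhibition leaves the ratio Vmax/Km unchanged while non-competitive inhibition leaves Km unchanged, which is why measuring rates across both substrate and inhibitor concentrations can separate the two.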

      Comments on revision:

      The authors have reduced the emphasis on the potential drug discovery applications. They are now referring to opportunities using a so-called "chemo-superior" approach. This is not a commonly used term, and the newly added text seems to indicate that "chemo-superiors" target sites exposed by antibody binding, whereas the paper that the authors refer to (Lawson, 2012) defines "chemo-superiors" as small molecules that induce similar effects to antibodies. I suggest removing the term "chemo-superior" altogether, as it has not been used since being coined in 2012, and instead simply pointing out the examples where antibodies have successfully informed small-molecule design.

      Unfortunately, the authors were unable to carry out additional experiments. Any experimental data to support their hypotheses as to why the observed growth defect is only marginal, and how the effect on growth could be increased, would have been very useful. As such, the evidence to support embarking on a drug discovery campaign for this allosteric site remains very limited.

      The authors do provide some evidence of a druggable allosteric pocket, that partially overlaps with the antibody binding site, which is useful. However, I also ran the APOP tool on TcoPYK and it reveals 217 potential allosteric pockets all over the protein. The authors should provide the rank and APOP confidence score for the pocket that they have selected, to show that this is a high confidence allosteric pocket.

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      “This work presents valuable data demonstrating that a camelid single-domain antibody can selectively inhibit a key glycolytic enzyme in trypanosomes via an allosteric mechanism. The claim that this information can be exploited for the design of novel chemotherapeutics is incomplete and limited by the modest effects on parasite growth, as well as the lack of evidence for cellular target engagement in vivo.”

      We agree with this assessment. In this re-worked version, we implemented the textual changes suggested by the reviewers and performed additional in silico work. The reviewers also presented valuable suggestions for additional experiments. However, we currently don’t have dedicated hands and funding for this project, which renders it impossible for us to perform additional “wet lab” experiments at this stage. We have thus not included new experimental “wet lab” data. Finally, the claim that our results may be exploited for the design of novel chemotherapeutics perhaps came across more strongly than we intended. We still believe our findings indicate a potential for such an endeavor, but this clearly requires further investigation and experimental evidence. We have softened this statement by removing it from the abstract and have edited the discussion to end as follows.

      “Based on the presented results, we propose that sdAb42 may pinpoint a site of vulnerability on trypanosomatid PYKs that could potentially be exploited for the design of novel chemotherapeutics. Indeed, antibodies (or fragments thereof) are valuable drug discovery tools. Antibodies (and camelid sdAbs especially) are known for their ability to "freeze out" specific conformations of highly dynamic antigens, thereby exposing target sites of interest, which could be exploited for rational drug design (the development of so-called "chemo-superiors", (Lawson, 2012; Khamrui et al., 2013; van Dongen et al., 2019)). While the design of a "chemo-superior" inspired on the sdAb42-mediated allosteric inhibition mechanism will require further investigation, the results presented here provide a foundation to fuel such an endeavour.”

      REVIEWER 1:

      Summary:

      The authors identified nanobodies that were specific for the trypanosomal enzyme pyruvate kinase in previous work seeking diagnostic tools. They have shown that a site involved in the allosteric regulation of the enzyme is targeted by the nanobody, and have used elegant structural approaches to pinpoint where binding occurs, opening the way to the design of small molecules that could also target this site.

      Strengths:

      The structural work shows the binding of a nanobody to a specific site on Trypanosoma congolense pyruvate kinase and provides a good explanation as to how binding inhibits enzyme activity. The authors go on to show that by expressing the nanobodies within the parasites they can get some inhibition of growth. Although this inhibition is rather weak, they make a case for how it could point to targeting the same site with small molecules as potential trypanocidal drugs.

      Weaknesses:

      The impact on growth is rather marginal. Although explanations are offered on the reasons for that, including the high turnover rate of the expressed nanobody and the difficulty in achieving the high levels of inhibition of pyruvate kinase required to impact energy production sufficiently to kill parasites, this aspect of the work doesn't offer great support to developing small molecule inhibitors of the same site.

      Recommendations for authors:

      Generally, the paper is very well written and the figures and their legends are clear.

      Comment 1.1: I thought the Introduction could give more focus to the need for new drugs for veterinary trypanosomiasis. The reality is that with fexinidazole now available and acoziborole soon to be available, with <1,000 cases of human African trypanosomiasis in each of the last five years, the case for needing new drugs is difficult to make. For animal trypanosomiasis, however, the need for novel drugs is much more pressing.

      We agree with this comment and have included an additional section in the Introduction’s second paragraph, which reads as follows.

      “Hence, there is a need for alternative compounds, preferably with novel modes of action and/or designed based on mechanistic insights of the target’s structure-function relationship (Field et al., 2017; De Rycker et al., 2018). This need is especially pressing for AAT, which strongly impedes sustainable livestock rearing in Sub-Saharan Africa. AAT results in drastic reductions of draft power, meat, and milk production by the infected animals (small and large ruminants), and its control relies mainly on vector control and chemotherapy, with only few drugs currently available. The lack of routine field diagnosis has resulted in the misuse of trypanocidal drugs, thereby accelerating the rise of parasite resistance and further exacerbating the problem (Richards et al., 2021). As such, AAT-inflicted annual losses are estimated at around $5 billion (and the necessity to invest another $30 million each year to control AAT through chemotherapy), thereby having a devastating impact on the socio-economic development of Sub-Saharan Africa (Fetene et al., 2021). In contrast, HAT is perceived as a minor threat as it has reached a post-elimination phase as a public health problem with less than 1,000 yearly documented cases (Franco et al., 2022). In addition, new and effective drugs for HAT treatment have recently become available (De Rycker et al., 2023). HAT control currently relies on case detection and treatment, and vector control (Büscher et al., 2017).”

      Comment 1.2: A few pedantic things can be tidied up too. For example, on line 61 it is stated that tsetse flies are part of the life cycle for all trypanosomes, while some veterinary species, e.g. T. evansi and some T. vivax strains, use other biting flies for transmission. I'd also add in the Introduction that pyruvate kinase is not a glycosomal enzyme (it is covered in the legend to Figure 1, but I think it is quite important to clarify in the Introduction too, so that readers aren't left wondering whether "intrabodies" can get targeted there).

      We agree with this comment and have included an additional section in the Introduction’s third paragraph to expand on the life cycles of African trypanosomes, which reads as follows.

      “African trypanosomes are extracellular parasites that have a bipartite life cycle involving insect vectors and mammals as hosts (Radwanska et al., 2018). Most HAT (T. brucei gambiense and T. b. rhodesiense) and AAT (T. b. brucei and T. congolense) causing trypanosomes are uniquely vectored by tsetse flies (Glossina spp.) and are confined to Sub-Saharan Africa. T. b. evansi and T. vivax (both causative agents of AAT) have expanded beyond the tsetse belt due to their ability to be mechanically transmitted by a variety of biting flies (Glossina, Stomoxys, and Tabanus spp.). Finally, T. b. equiperdum infects equids and represents an exception as it is transmitted directly from animal to animal through sexual contact.”

      The introduction now also explicitly mentions that pyruvate kinase is not a glycosomal enzyme.

      Comment 1.3: The introduction would also be a good place to include some more information on what is known about the allosteric effectors of pyruvate kinase in trypanosomes, and emphasize where gaps in knowledge exist too.

      We agree with this comment and have included an additional section in the Introduction’s third paragraph, which reads as follows.

      “Pyruvate kinase (PYK) represents another attractive glycolytic target. This non-glycosomal enzyme catalyses the last step of glycolysis (the irreversible conversion of phosphoenolpyruvate (PEP) to pyruvate; Figure 1A). The importance of this reaction is two-fold: i) the generation of ATP through the transfer of a phosphoryl group from PEP to ADP and ii) the formation of pyruvate, a crucial metabolite of the central metabolism. Like most PYKs, trypanosomatid PYKs are homotetramers. The PYK monomer is a ∼55 kDa protein organized into four domains termed ’N’, ’A’, ’B’, and ’C’ (Figure 1B). The A domain constitutes the largest part of the PYK monomer and is characterized by an (α/β)8-TIM barrel fold that contains the active site. Together with the N-terminal domain, it is also involved in the formation of the PYK tetramer AA’ dimer interfaces. The B domain is known as the flexible ’lid’ domain that shields the active site during enzyme-mediated phosphotransfer. Finally, the C domain harbors the binding pocket for allosteric effectors and stabilizes the PYK tetramer by formation of CC’ dimer interfaces. Because of their role in ATP production and distribution of fluxes into different metabolic branches, the activity of trypanosomatid PYKs is tightly regulated through an allosteric mechanism known as the "rock and lock" model (Morgan et al., 2010, 2014; Pinto Torres et al., 2020). In this model (which is detailed in Figure 1C), the binding of substrates and/or effectors (and analogs thereof) to the active and effector sites, respectively, triggers a conformational change from the enzymatically inactive T state to the catalytically active R state. Known effector molecules for trypanosomatid PYKs are fructose 2,6-bisphosphate (F26BP), fructose 1,6-bisphosphate (F16BP) and sulfate (SO₄²⁻), with F26BP being the most potent one (van Schaftingen et al., 1985; Callens and Opperdoes, 1992; Ernest et al., 1994; Tulloch et al., 2008). Interestingly, trypanosomatid PYKs seem to be largely unresponsive to the allosteric regulation of enzyme activity by free amino acids (Callens et al., 1991), which contrasts with human PYKs (Chaneton et al., 2012; Yuan et al., 2018). Known trypanosomatid PYK inhibitors impair enzymatic activity through occupation of the PYK active site (Morgan et al., 2011).”

      In the Results, although I am not qualified to analyse the structural data in detail I am confident in the ability of the authors to do so.

      Comment 1.4: Differences in nanobody binding kinetics to the T. congolense enzyme when compared to T. brucei and Leishmania enzymes are attributed to the relatively few amino acid differences in those sites. It is desirable to test site-directed mutagenesis of those residues.

      This is a highly valuable suggestion from the reviewer. However, we currently don’t have dedicated hands and funding for this project, which renders it impossible for us to perform additional experiments at this stage.

      Comment 1.5: In the section on slow-binding inhibition kinetics (lines 194-220) I found it difficult to follow whether it was just the R<>T transition that slowed nanobody inhibition, or whether competition with effectors at the site would also impact on those inhibition kinetics. Can this be clarified?

      Since the sdAb42 epitope is located relatively far away from both active and effector sites (~20 and ~40 Å, respectively), it seems highly unlikely the observed “slow-binding inhibition” kinetics are the result of a competition between sdAb42 on one hand and substrates and/or effectors on the other for enzyme binding. Instead, given that sdAb42 selectively binds and locks the enzyme’s inactive T state, these data can be explained by the idea that sdAb42 can only bind to trypanosomatid PYKs after having undergone an R- to T-state transition. To clarify this matter, we slightly reformulated the discussion as indicated below. We also included a small discussion on the observation that there is a 400-fold difference between the Kd and the IC50.

      “Since the sdAb42 epitope is located relatively far away from both active and effector sites (~20 and ~40 Å, respectively), it seems highly unlikely that the observed “slow-binding inhibition” kinetics are the result of a direct competition between sdAb42 and substrates and/or effectors. Instead, given that sdAb42 selectively binds and locks the enzyme’s inactive T state, these data can be explained by the idea that sdAb42 can only bind to trypanosomatid PYKs after having undergone an R- to T-state transition. An additional observation in this context is the 400-fold difference between the K<sub>D</sub> and IC<sub>50</sub> values. Although we currently do not have a mechanistic explanation, similar differences have been observed for the sdAb-mediated allosteric inhibition of other kinases (Singh et al., 2022).”

      For the intrabody expression work, the reference cited on line 230 actually points to a growing ability to genetically modify T. congolense. However, it is justifiable to work on T. brucei given the much wider availability and advanced status of the genetic tools.

      The growth inhibition data shown in Figure 7 is weak, albeit significant and the case is made as to why that might be.

      Comment 1.6: The authors do point to the fact that inhibiting other parts of the glycolytic pathway might be helpful in getting a better growth inhibitory effect. It would be useful, in this regard, to test the ability of the PFK inhibitors in the Macnae et al. paper in the intrabody expressing line, and possibly other inhibitors e.g. 2-deoxy-D-glucose to see if these combinations do have the desired impacts. Also, looking at the metabolome of the intrabody expressors under induction could also give some further insights into changes in flux (although perhaps not on its own given the weak effects on the growth seen).

      This is a highly valuable suggestion from the reviewer. However, we currently don’t have dedicated hands and funding for this project, which renders it impossible for us to perform additional experiments at this stage. We would like to point out that, in our experience, studying the effect of enzyme inhibition on the metabolome is usually only useful shortly after the onset of inhibition. The system adapts to the lowered flux and relevant changes are mostly transient. Since the induced expression of sdAb42 is almost certainly slow, we expect the metabolic changes will be minimal.

      REVIEWER 2:

      Summary:

      In this work, the authors show that the camelid single-chain antibody sdAb42 selectively inhibits Trypanosome pyruvate kinase (PYK) but not human PYK. Through the determination of the crystal structure and biophysical experiments, the authors show that the nanobody binds to the inactive T-state of the enzyme, and in silico analysis shows that the binding site coincides with an allosteric hotspot, suggesting that nanobody binding may affect the enzyme active site. Binding to the T-state of the enzyme is further supported by non-linear inhibition kinetics. PYK is an important enzyme in the glycolytic pathway, and inhibition is likely to have an impact on organisms such as trypanosomes, that heavily rely on glycolysis for their energy production. The nanobody was generated against Trypanosoma congolense PYK, but for technical reasons the authors progressed to testing its impact on cell viability in Trypanosoma brucei brucei. First, they show that sdAb42 is able to inhibit Tbb PYK, albeit with lower potency. Cell-based experiments next show that expression of sdAb42 has a modest, and dose-dependent effect on the growth rate of Tbb. The authors conclude that their data indicate that targeting this allosteric site affects cell growth and is a valuable new option for the development of new chemotherapeutics for trypanosomatid diseases.

      Strengths:

      The work clearly shows that sdAb42 inhibits Trypanosome and Leishmania PYK selectively, with no inhibition of the human orthologue. The crystal structure clearly identifies the binding site of the nanobody, and the accompanying analysis supports that the antibody acts as an allosteric inhibitor of PYK, by locking the enzyme in its apo state (T-state).

      Weaknesses:

      (1) The most impactful claim of this work is that sdAb42-mediated inhibition of PYK negatively affects parasite growth and that this presents an opportunity to develop novel chemotherapeutics for trypanosomatid diseases. For the following reasons I think this claim is not sufficiently supported:

      Comment 2.1: The authors do not provide evidence of target-engagement in cells, i.e. they do not show that sdAb42 binds to, or inhibits, Tbb PYK in cells and/or do not provide a functional output consistent with PYK inhibition (e.g. effect on ATP production). Measuring the extent of target engagement and inhibition is important to draw conclusions from the modest effect on growth.

      The authors do not explore the selectivity of sdAb42 in cells. Potentially sdAb42 may cross-react with other proteins in cells, which would confound interpretation of the results.

      We understand the reviewer’s concern. While it is theoretically possible that sdAb42 may non-specifically (cross-)react with other proteins in the cell, this would be highly unlikely based on the following arguments. First, many studies have employed sdAbs as intrabodies and reported on specific sdAb-mediated effects (outstanding reviews on the topic are Cheloha et al. (PMID 32868455) and Soetens et al. (PMID 33322697)). Second, it has been demonstrated that selecting sdAbs from an immune library through phage display or “bacteriomatch” (a bacterial system similar to yeast two hybrid) yields highly similar results (Pellis et al., PMID 22583807), thereby indicating that sdAbs interact specifically with their target antigens in an intracellular environment. Third, we identified TcoPYK as the target for sdAb42 by employing sdAb42 as bait in a pull-down from a parasite whole cell lysate (Pinto Torres et al., PMID 29899344). The pull-down fractions were analysed by SDS-PAGE and we observed a clear prominent band, which was further analysed by mass spectrometry and revealed TcoPYK as the target with great certainty. Even though the affinity of sdAb42 for TbrPYK is lower, it still remains high (nM affinity) and we expect it to bind TbrPYK with high specificity.

      Regarding measuring the effect on ATP production, we would like to state that such experiments are not obvious. Instead of measuring ATP levels, one should measure ATP turnover as ATP levels may not necessarily be decreased. The latter was observed to be the case for the specific inhibition of trypanosomal PFK (Nare et al. PMID 36864883). The specific trypanosomal PFK inhibitor inhibits motility (and growth) of T. congolense IL3000 at concentrations that only slightly affect ATP levels. One could think of repeating the sdAb42 experiments in a T. congolense model. However, T. congolense BSF metabolism is more complicated than that of T. brucei BSF. First, the T. congolense glucose metabolic network is more expanded, allowing a lower glucose consumption rate to produce ATP and metabolites for growth. Second, pyruvate is not excreted but further metabolised, in part in the mitochondrion. Steketee et al. (PMID 34310651) have shown that T. congolense also takes up pyruvate from the medium. One can thus check if (increased) external pyruvate (partially) rescues the growth inhibition by sdAb42. It will not provide proof, but maybe an indication. As mentioned above, we are currently unable to perform such additional experiments due to lack of dedicated hands and funding.

      Comment 2.2: sdAb42 causes only minor growth inhibition in Tbb. The growth defect is used as the main evidence to support targeting this site with chemotherapeutics, however based on the very modest effect on the parasites, one could reasonably claim that PYK is actually not a good drug target. The strongest effect on growth is seen for the high expressor clone in Figure 4a, however here the uninduced cells show an unusual profile, with a sudden increase in growth rate after 4 days, something that is not seen for any of the other control plots. This unexplained observation accentuates the growth difference between induced and uninduced, and the growth differences seen in all other experiments, including those with the highest expressors (clones 54 and 55) are much more modest. The loss of expression of sdAb42 over time is presented as a reason for the limited effect, and used to further support the hypothesis that targeting the allosteric site is a suitable avenue for the development of new drugs. However, strong evidence for this is missing.

      We agree that the growth effect of sdAb42 expression is modest, and we have provided several explanations as to why this could be the case. In addition, as mentioned at the start of this rebuttal, the claim that our results may be exploited for the design of novel chemotherapeutics was perhaps expressed more strongly than we intended. We still believe our findings indicate a potential for such an endeavor, but this clearly requires further investigation and experimental evidence as mentioned by the reviewer.

      We, however, disagree that PYK would not be a good drug target. Its potential to serve as a drug target is related to its fundamentally important role in trypanosomal glycolysis and not to the features of sdAb42. Steketee et al. (PMID 34310651) have shown that glycolysis is essential for T. congolense BSF, despite a lower glycolytic flux than in T. brucei BSF. The T. congolense glucose metabolic network is more expanded, allowing a lower glucose consumption rate to produce ATP and metabolites for growth. Also here, PYK is thus almost certainly essential and from that perspective a good drug target.

      Comment 2.3: For chemotherapeutic interventions to be possible, a ligandable site is required. There is no analysis provided of the antibody binding site to indicate that small molecule binding is indeed feasible.

      We agree with the reviewer’s comment and have included APOP analysis on the TcoPYK T state crystal structure (see also reply to Comment 3.1). Briefly, APOP works by detecting pockets and then perturbing each pocket in the protein's elastic network (GNM) by adding stiffer springs between the surrounding residues. The pockets are scored and ranked based on the calculated shifts in the eigenvalues of the global GNM modes and their local hydrophobic densities, thereby also considering the pocket’s surface accessibility, which renders it suitable for the identification of allosteric (and druggable) pockets. The APOP analysis identifies pockets overlapping with the sdAb42 epitope as highly ranking allosteric ligand binding pockets. The data have been summarized in an additional supplementary figure (Figure 4 – figure supplement 1). The manuscript also contains details on the performed APOP analysis in the Materials and Methods section.
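The perturbation idea described above can be illustrated with a toy calculation. The following is a minimal sketch, not the APOP implementation: it builds a Gaussian network model (GNM) Kirchhoff matrix for a hypothetical five-node chain, adds an extra spring between the nodes of a made-up "pocket", and reports how the slowest non-zero eigenvalues shift. The chain, the pocket, the spring constants, and all function names are assumptions chosen for illustration; the hydrophobic-density term of the real scoring is omitted.

```python
import math


def kirchhoff(n, springs):
    """Kirchhoff (connectivity) matrix for n nodes; springs = [(i, j, stiffness)]."""
    gamma = [[0.0] * n for _ in range(n)]
    for i, j, k in springs:
        gamma[i][i] += k
        gamma[j][j] += k
        gamma[i][j] -= k
        gamma[j][i] -= k
    return gamma


def jacobi_eigenvalues(matrix, sweeps=100, tol=1e-18):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations (ascending)."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def mode_shifts(n, springs, pocket, extra=1.0, n_modes=2):
    """Shift of the n_modes slowest non-zero eigenvalues after stiffening a pocket."""
    base = jacobi_eigenvalues(kirchhoff(n, springs))
    # Hypothetical perturbation: one extra spring between every pair of pocket nodes.
    added = [(i, j, extra) for x, i in enumerate(pocket) for j in pocket[x + 1:]]
    perturbed = jacobi_eigenvalues(kirchhoff(n, springs + added))
    # Index 0 is the trivial zero mode of a connected network, so it is skipped.
    return [perturbed[m] - base[m] for m in range(1, n_modes + 1)]


# Toy input: a linear chain of five nodes with unit springs, pocket = nodes 1 and 3.
chain = [(i, i + 1, 1.0) for i in range(4)]
shifts = mode_shifts(5, chain, pocket=[1, 3], extra=1.0, n_modes=4)
print(shifts)
```

In this toy setting a pocket whose stiffening produces larger shifts in the slow modes would rank higher. A real analysis would build the network from the crystal coordinates with a distance cutoff.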

      Comment 2.4: The authors comment on the modest growth inhibition, and refer to the need to achieve over 88% reduction in Vmax of PYK to see a strong effect, something that may or may not be achieved in the cell-based model (no target-engagement or functional readout provided). The slow binding model and switch of species are also raised as potential explanations. While these may be plausible explanations, they are not tested which leaves us with limited evidence to support targeting the allosteric site on PYK.

      In our understanding of this remark, we believe it to be related to Comments 2.1 and 2.2 and thus refer to our answers formulated above.

      Comment 2.5: The evidence to support an allosteric mechanism is derived from structural studies, including the in silico allosteric network predictions. Unfortunately, standard enzyme kinetics mode of inhibition studies are missing. Such studies could distinguish uncompetitive from non-competitive behaviour and strengthen the claim that sdAb42 locks the enzyme complex in the apo form.

      We agree with the referee that a thorough kinetic analysis could distinguish between uncompetitive (i.e., sdAb only binds to the enzyme if substrate is bound) or non-competitive (i.e., sdAb can bind to apo enzyme and substrate-bound enzyme) inhibition. In both cases, however, this would correspond to an allosteric mechanism of inhibition. Although such a thorough kinetic analysis would be interesting in its own right, we would like to argue that this type of very detailed kinetics is outside the scope of this paper. This is especially the case taking into account that this analysis could be complicated by the slow-onset inhibition behavior.
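The distinction drawn above can be made concrete with the textbook steady-state rate law for mixed inhibition. The sketch below is illustrative only and does not come from the manuscript: the parameter names (`kic` for inhibitor binding to free enzyme, `kiu` for binding to the enzyme-substrate complex) and the numerical values are assumptions, and the model ignores the slow-onset behavior mentioned above.

```python
INF = float("inf")


def rate(s, i, vmax, km, kic=INF, kiu=INF):
    """Michaelis-Menten rate with a mixed inhibitor.

    kic: dissociation constant for inhibitor binding to free enzyme.
    kiu: dissociation constant for inhibitor binding to the enzyme-substrate complex.
    Setting a constant to infinity switches that binding mode off.
    """
    return vmax * s / (km * (1 + i / kic) + s * (1 + i / kiu))


def apparent_parameters(i, vmax, km, kic=INF, kiu=INF):
    """Apparent Vmax and Km seen in a substrate titration at inhibitor level i."""
    alpha = 1 + i / kic        # effect on free enzyme
    alpha_prime = 1 + i / kiu  # effect on enzyme-substrate complex
    return vmax / alpha_prime, km * alpha / alpha_prime


# Hypothetical numbers, arbitrary units.
vmax, km, inhibitor = 100.0, 5.0, 10.0

# Uncompetitive: inhibitor binds only the enzyme-substrate complex.
print(apparent_parameters(inhibitor, vmax, km, kiu=10.0))
# Non-competitive: inhibitor binds free and substrate-bound enzyme equally well.
print(apparent_parameters(inhibitor, vmax, km, kic=10.0, kiu=10.0))
```

Under these assumptions, uncompetitive inhibition lowers the apparent Vmax and Km by the same factor, whereas non-competitive inhibition lowers the apparent Vmax and leaves Km unchanged. That difference is what a mode-of-inhibition study would look for.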

      Comment 2.6: As general comment, the graphical representation of the data could be improved in line with recent recommendations: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128, https://elifesciences.org/inside-elife/5114d8e9/webinar-report-transforming-data-visualisation-to-improve-transparency-and-reproducibility.

      - Bar-charts for potency are ideally presented as dot plots, showing the individual data points, or box plots with datapoints shown.

      - Images in Figure 7 show significant heterogeneity of nanobody expression, but the extent of this can not be gleaned from Figure 7B. It would be much better to use box plots or violin plots for each cell line on this figure panel. The same applies to Figure 10.

      We thank the reviewer for these suggestions but have taken the decision not to act upon these as the other reviewers explicitly mentioned that our figures are very clear.

      Recommendations for authors:

      Please find below some minor comments:

      Comment 2.7: Line 24: "increasing number of drug failures": This does not really reflect the current situation for human African trypanosomiasis, with NECT treatment retaining efficacy, fexinidazole now being registered, and acoziborole progressing towards registration. It may be worth considering focusing the introduction more on Nagana, as all Trypanosome species used in the paper are animal infective, and the nanobody was discovered for T. congolense.

      We refer to our answer formulated in response to Comment 1.1.

      Comment 2.8: Line 55: "alarming number of reports describing ..." While resistance is a big problem, this mainly applies to malaria, bacterial and fungal diseases. For kinetoplastids, the number of reports describing resistance in the clinic is pretty limited. However, the drug discovery pipeline for these diseases is sparse, so I definitely agree there is a need to develop new compounds with differentiated mechanisms.

      We agree with the reviewer and have slightly adapted our wording here as follows.

      “Unfortunately, a number of reports describe treatment failure or parasite resistance to the currently available drugs (De Rycker et al., 2018).”

      Comment 2.9: This manuscript is about pyruvate kinase, but the enzyme is not properly introduced. I suggest a short paragraph introducing PYK at line 77 (without duplicating Figure 1), covering its role in glycolysis, the importance of pyruvate, any essentiality data from the literature, and any known inhibitors.

      We refer to our answer formulated in response to Comment 1.3.

      Comment 2.10: Figure 6: For the top insets it would be useful to somehow show the increasing antibody concentration, either by using a changing intensity or size for each line.

      We thank the reviewer for this suggestion, but decided not to act upon it as we found that the inclusion of this information in the figure made it “too crowded”, which is why we opted to provide this information in the figure legend.

      “Only a subset of the traces is shown for the sake of clarity. The following curves are shown (from bottom to top): TcoPYK (0.15 nM sdAb42, 500 nM sdAb42, 750 nM sdAb42, 1000 nM sdAb42, 1500 nM sdAb42, 2000 nM sdAb42, no enzyme control), LmePYK (5 nM sdAb42, 750 nM sdAb42, 1250 nM sdAb42, 1500 nM sdAb42, 2500 nM sdAb42, 3000 nM sdAb42, no enzyme control), and TbrPYK (1 nM sdAb42, 1000 nM sdAb42, 1750 nM sdAb42, 2000 nM sdAb42, 3500 nM sdAb42, 4000 nM sdAb42, no enzyme control).”

      Comment 2.11: You refer to the curves as biphasic, but they look like 1st order kinetics, and there are no clear 1st and 2nd phases (or at least they are not marked). It may be more appropriate to label these as non-linear.

      We agree that the term “biphasic” is potentially an over-simplification of the actual situation. What we mean is that the formation of product as a function of time ([P] versus [t] curve) is not linear at short time ranges but evolves from an initial “weakly inhibited” rate (v<sub>0</sub>) to a “strongly inhibited” steady-state rate (v<sub>ss</sub>). This conversion from v<sub>0</sub> to v<sub>ss</sub> indeed occurs in a fashion following single exponential behavior. With the term “biphasic” we thus meant a non-linear phase (before v<sub>ss</sub> is reached) followed by a linear phase (after v<sub>ss</sub> is reached). To avoid any confusion, we replaced the term “biphasic” by “non-linear”.
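The curve shape described above matches the standard integrated rate equation for slow-binding inhibition, in which the rate relaxes from v<sub>0</sub> to v<sub>ss</sub> with a single exponential. The sketch below is a generic illustration under that assumption: the rate constant `k_obs` and all numerical values are hypothetical and are not taken from the manuscript.

```python
import math


def rate(t, v0, vss, k_obs):
    """Instantaneous rate: relaxes from v0 to vss with a single exponential."""
    return vss + (v0 - vss) * math.exp(-k_obs * t)


def product(t, v0, vss, k_obs):
    """Product formed by time t (the time integral of rate from 0 to t)."""
    return vss * t + (v0 - vss) * (1.0 - math.exp(-k_obs * t)) / k_obs


# Hypothetical parameters, arbitrary units.
v0, vss, k_obs = 1.0, 0.1, 0.5

for t in (0.0, 1.0, 5.0, 20.0):
    print(t, round(product(t, v0, vss, k_obs), 4), round(rate(t, v0, vss, k_obs), 4))
```

At early times the progress curve bends (the non-linear phase), and once the exponential term has decayed it becomes a straight line with slope v<sub>ss</sub> (the linear phase).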

      Comment 2.12: IC50s - would be useful to provide a comparison with IC50s generated in the pre-incubation experiments - is the antibody less potent without pre-incubation? I could not find IC50s for the pre-incubation experiments shown in Figure 2.

      We determined an IC50 value for sdAb42 against TcoPYK under pre-incubation conditions, but initially decided not to include this into the manuscript. We agree with the reviewer that a comparison between IC50 values determined under pre- and post-incubation conditions would be of interest, and have therefore included the pre-incubation IC50 data for TcoPYK in Figure 2 (panel B). The data indeed show that sdAb42 far more efficiently inhibits an enzyme that is not continuously cycling between R and T states (IC50 values of 15 nM and 359 nM under pre- and post-incubation conditions, respectively). This is now discussed in the results section of the manuscript. We did not determine IC50 values for sdAb42 against TbrPYK and LmePYK under pre-incubation conditions, but suspect that a similar observation will be made upon comparing these values to IC50 under post-incubation conditions.

      REVIEWER 3:

      Summary:

      Out of the 20 Neglected Tropical Diseases (NTD) highlighted by the WHO, three are caused by members of the trypanosomatids, namely Leishmaniasis, Trypanosomiasis, and Chagas disease. Trypanosomal glycolytic enzymes including pyruvate kinase (PyK) have long been recognised as potential targets. In this important study, single-chain camelid antibodies have been developed as novel and potent inhibitors of PyK from T. congolense. To gain structural insight into the mode of action, binding was further characterised by biophysical and structural methods, including crystal structure determination of the enzyme-nanobody complex. The results revealed a novel allosteric mechanism/pathway with significant potential for the future development of novel drugs targeting allosteric and/or cryptic binding sites.

      Strengths:

      This paper covers an important area of science towards the development of novel therapies for three of the Neglected Tropical Diseases. The manuscript is very clearly written with excellent graphics making it accessible to a wide readership beyond experts. Particular strengths are the wide range of experimental and computational techniques applied to an important biological problem. The use of nanobodies in all areas from biophysical binding experiments and X-ray crystallography to in-vivo studies is particularly impressive. This is likely to inspire researchers from many areas to consider the use of nanobodies in their fields.

      Weaknesses:

      There is no particular weakness, but I think the computational analysis of allostery, which basically relies on a single server, could have been more detailed.

      Recommendations for authors:

      Overall an excellent paper, there are just a couple of points that the authors could consider, if time allows.

      Comment 3.1: As mentioned above the computational analysis of allostery appears to be based on a single server based on coordinates alone with no in-depth analysis. It would be extremely interesting to see if more sophisticated methods based on elastic network model and/or molecular dynamics simulation gave similar results. I realize that this would require quite a lot of work though.

      We agree with the reviewer’s comment and have complemented the perturbation analysis (previously presented in the manuscript) with dGNM and APOP analyses to identify allosteric communication pathways and allosteric binding pockets, respectively. dGNM is based on transfer entropy, allowing for a detailed characterization of the dynamic coupling and information transfer between residues. Meanwhile, APOP employs a perturbation-based approach to detect and rank allosteric pockets. The findings are in good agreement with the previously presented perturbation data and have been summarized in an additional supplementary figure (Figure 4 – figure supplement 1). The manuscript also contains details on the performed transfer entropy and APOP analyses in the Materials and Methods section.

      Comment 3.2: The figures are excellent and really help the reader - with the exception of the screenshots (Figure 8). Using pymol or chimera (or any other more expensive commercial package) would really help the reader and will not take much time.

      We agree with the referee that this is not the most beautiful figure. However, we find the quality and clarity of the figure to be adequate for its purpose (i.e., a supplemental figure).

      Comment 3.3: Finally, I would have liked to see at least the PDB validation files. This is a highly regarded and experienced team, nevertheless, the resolution is rather mediocre. As the crystal coordinates were used as input for the computational analysis, any experimental inaccuracies will affect the computational results.

      We agree with the reviewer that we could have provided the validation report together with the submitted manuscript and we apologise for the inconvenience. The validation reports will be released together with the structures following final manuscript publication. Regarding the resolution of the crystal structures, we agree with the reviewer’s comment, but we obviously employed data sets from our best diffracting crystals and could not obtain a higher resolution despite our best efforts.

    1. Reviewer #2 (Public review):

      Summary:

      This study by Pradhan et al. offers critical insights into the mechanisms by which antimony-resistant Leishmania donovani (LD-R) parasites alter host cell lipid metabolism to facilitate their own growth and, in the process, acquire resistance to amphotericin B therapy. The authors illustrate that LD-R parasites enhance LDL uptake via fluid-phase endocytosis, resulting in the accumulation of neutral lipids in the form of lipid droplets that surround the intracellular amastigotes within the parasitophorous vacuoles (PV) that support their development and contribute to amphotericin B treatment resistance. The evidence provided by the authors supporting the main conclusions is compelling, presenting rigorous controls and multiple complementary approaches. The work represents an important advance in understanding how intracellular parasites can modify host metabolism to support their survival and escape drug treatment.

      Strengths:

      (1) The study utilizes clinical isolates of antimony-resistant L. donovani and provides interesting mechanistic information regarding the increased LD-R isolate virulence and emerging amphotericin B resistance.

      (2) The authors have used a comprehensive experimental approach to provide a link between antimony-resistant isolates, lipid metabolism, parasite virulence, and amphotericin B resistance. They have combined the following approaches:

      (a) In vivo infection models involving BL/6 and Apoe-/- mice.

      (b) Ex-vivo infection models using primary Kupffer cells (KC) and peritoneal exudate macrophages (PEC) as physiologically relevant host cells.

      (c) Various complementary techniques to ascertain lipid metabolism including GC-MS, Raman spectroscopy, microscopy.

      (d) Applications of genetic and pharmacological tools to show the uptake and utilization of host lipids by the infected macrophage resident L. donovani amastigotes.

      (3) The outcome of this study has clear clinical significance: the authors support their work with patient data showing a correlation between serum lipid profiles and treatment outcomes.

      (4) The present study effectively connects the basic cellular biology of host-pathogen interactions with clinical observations of drug resistance.

      (5) Major findings in the study are well-supported by the data:

      (a) Intracellular LD-R parasites induce fluid-phase endocytosis of LDL independent of LDL receptor (LDLr).

      (b) Enhanced fusion of LDL-containing vesicles with parasitophorous vacuoles (PV) containing LD-R parasites within both infected KCs and PECs.

      (c) Intracellular cholesterol transporter NPC1-mediated cholesterol efflux from parasitophorous vacuoles is suppressed by the LD-R parasites within infected cells.

      (d) Selective exclusion of inflammatory ox-LDL through MSR1 downregulation.

      (e) Accumulation of neutral lipid droplets contributing to amphotericin B resistance.

      Weaknesses:

      The weaknesses are minor:

      (1) The authors do not show how they ascertain that they have a purified fraction of the PV post-density gradient centrifugation.

      (2) The study could have benefited from a more detailed analysis of how lipid droplets physically interfere with amphotericin B access to parasites.

      Impact and significance:

      This work makes several fundamental advances:

      (1) The authors were able to show the link between antimony resistance and enhanced parasite proliferation.

      (2) They were also able to reveal how parasites can modify host cell metabolism to support their growth while avoiding inflammation.

      (3) They were able to show a certain mechanistic basis for emerging amphotericin B resistance.

      (4) They suggest therapeutic strategies combining lipid droplet inhibitors with current drugs.

    1. Reviewer #2 (Public review):

      Frangos et al. present a set of studies aiming to determine mechanisms underlying tumour initiation and progression. Overall, this work provides some useful insights into the involvement of mitochondrial dysfunction during the cellular transformation process. This body of work could be improved in several possible directions to establish more mechanistic connections.

      (1) The interesting point of the paper: the contrast between suppressed ETC components and activated OXPHOS function is perplexing and should be resolved. It is still unclear if activated mitochondrial function triggers gene down-regulation vs compensatory functional changes (as the title suggests). Have the authors considered reversing the HER2-derived signals e.g. with PI3K-AKT-MTOR or ERK inhibitors to potentially separate the expression vs. functional phenotypes? The root of the OXPHOS component down-regulation should also be traced further, e.g. by probing into levels of core mitochondrial biogenesis factors. Are transcript levels of factors encoded by mtDNA also decreased?

      (2) The second interesting aspect of this study is the implication of mitochondrial activation in tumours, despite the downregulation of expression signatures, suggestive of a positive role for mitochondria in this tumour model. To address if this is correlative or causal, have the authors considered testing an OXPHOS inhibitor for suppression of tumorigenesis?

      (3) A number of issues concerning animal/tumour variability and further pathway dissection could be explored with in vitro approaches. Have the authors considered deriving tumour-derived cell cultures, which could enable further confirmations, mechanistic drug studies and additional imaging approaches? Culture systems would allow alternative assessment of mitochondrial function such as Seahorse or flow cytometry (mitochondrial potential and ROS levels).

      (4) The study could be greatly improved with further confirmatory studies, e.g. immunoblotting for mitochondrial components with parallel blots for phospho-signalling in the same samples. It would be interesting if trends could be maintained in tumour-derived cell cultures. It is notable that OXPHOS protein/transcript changes are more consistent (Figure 5, Supplementary Figure 4) than mitochondrial dynamics/mitophagy factors (Figure 8). Core regulatory factors in these pathways should be confirmed by conventional immunoblotting.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Ghone et al. show that HIV-1 Vif causes a pseudo-metaphase arrest rather than a G2 arrest. The metaphase arrest correlates with misregulation of the kinetochore which could be explained by the loss of phosphatase functions that determine chromosome-microtubule interactions.

      Strengths:

      The single-cell imaging using different reporters of cell cycle progression is very elegant and the quantitation is convincing. The authors clearly show that what others have characterized as a G2 arrest by flow cytometry is somewhat later in metaphase and correlates with kinetochore misregulation.

      We sincerely appreciate the reviewer recognizing the quality and precision of our study, particularly our use of long-term live cell imaging combined with single-cell resolution analysis.

      Weaknesses:

      (1) The major problem with the paper is trying to connect what is observed in tumor cell lines with actual infections in primary T cells. While all of the descriptive work in cell lines is convincing, none of these cells are relevant targets and tumor cells have different cell death and cell cycle regulation than primary T cells. Thus, while Vif might well do all of the things described in the manuscript, it is a stretch to connect any of it to what happens in vivo.

      We fully agree with this point. It is indeed technically challenging to perform 48–120 hours of live-cell imaging at high magnification and at short intervals using primary T cells because of their non-adherent nature. We also agree that Vif’s functions in pseudo-metaphase arrest and the consequent induction of cell death, observed in cancer cells (e.g., Cal51, HeLa, and MDA-MB-231 cell lines) or normal non-transformed epithelial cells (e.g., the RPE1 cell line), may differ in T cells. Further studies and refined approaches will be required to address this important question. We have revised the manuscript to include a discussion of this issue in the Limitations section of our study.

      (2) Line 109 and elsewhere. The ability of Vif to cause cell cycle arrest and bind PP2A subunits is not a completely conserved feature. Rather, it is quite variable in different HIV-1 strains. (e.g. https://doi.org/10.1016/j.bbrc.2020.04.123 and https://elifesciences.org/articles/53036). Therefore, it is necessary for the authors to quite clearly use strain designations in the manuscript rather than a generic "Vif", and to more clearly describe the viruses being used.

      Thank you for raising this important point. We utilized the NL4-3 strain in our study and have revised the manuscript to specify this detail. While this study uncovered part of the mechanism by which Vif modulates phosphatase regulation during mitosis, further research is required to elucidate the full mechanism, particularly how this degradation induces a robust pseudo-metaphase arrest.

      (3) Figure 5: This figure shows disruption of PP2A-B56 at the kinetochores. However, is this specific to the kinetochores? Since Vif has been described to more broadly degrade PP2A-B56, could this not be a result of a more general decrease in PP2A activity throughout the cell?

      Thank you for highlighting this critical point. PP2A is a major serine/threonine phosphatase that regulates numerous essential cell cycle processes. To the best of our knowledge, Vif selectively targets the degradation of the B56 family of PP2A regulatory subunits, without affecting the other three B-type subunits or the catalytic core of PP2A itself. During early mitosis, all five members of the B56 family (B56α, B56β, B56γ, B56δ, and B56ε) accumulate at kinetochores and centromeres, where they play critical roles in chromosome alignment. Many PP2A-B56 substrates are also localized to kinetochores and chromosomes during mitosis. Depletion of specific B56 isoforms or introduction of phosphorylation-deficient mutants of PP2A-B56 substrates at kinetochores has been shown to result in mitotic defects, underscoring the crucial roles of PP2A-B56 in regulating kinetochore, centromere, and chromosomal functions during mitosis. Interestingly, we observed no significant cell cycle arrest during G1, S, or G2 phases in Vif-expressing cells. While PP2A-B56 likely has important roles outside of mitosis, Vif-mediated degradation of PP2A-B56 appears to selectively disrupt its mitotic functions, particularly at the kinetochore. This finding highlights a targeted mechanism by which Vif interferes with PP2A-B56-mediated regulation of mitotic processes. However, further experiments are required to elucidate the precise mechanisms underlying Vif's inhibition of the specific mitotic roles of PP2A-B56.

      Reviewer #2 (Public review):

      Summary

      The authors characterize the cell-cycle arrest induced by HIV-1 Vif in infected cells. They show this arrest is not at G2/M as previously thought but during metaphase. They show that the metaphase plate forms normally but progression to anaphase is massively delayed, and chromosome segregation is dysregulated in a manner consistent with impaired assembly of microtubules at the kinetochore. This correlates with the lack of recruitment of the B56 subunits of the PP2A phosphatase, which are known degradation targets of Vif, suggesting that this weakens and unbalances the microtubule-mediated forces on the separating chromosomes.

      Strengths

      The authors present a very well-performed set of quantitative live cell imaging experiments that convincingly show a difference between Vif and Vpr-mediated cell cycle arrests. Through an in-depth characterization of the Vif-mediated block in metaphase, they make a strong case for this phenotype being tied to the degradation of PP2A-B56 by Vif. Furthermore, it is important that they have performed most of these experiments with virally infected cells, meaning that their observations hold at relevant viral expression levels of Vif.

      We appreciate the reviewer’s recognition of the importance and significance of our study.

      Weaknesses

      Experimentally there is very little to criticize with respect to the cellular systems used. Data from 10.1016/j.bbrc.2020.04.123 have identified selective mutants that fail to degrade B56 while maintaining A3G degradation by Cul5, and it would be nice to confirm that such a mutant behaves like the delta-Vif virus when examining metaphase, but I would expect selective ablation of B56 during mitosis to mimic Vif to be very challenging and beyond the scope.

      Thank you for your valuable suggestion. As also highlighted by Reviewer #1, it is true that certain variants of Vif, as discussed in 10.1016/j.bbrc.2020.04.123, differentially impact B56 degradation. Notably, some variants degrade A3G without inducing cell cycle arrest. We agree that investigating whether Vif's effects on B56 are directly linked to the mitotic arrest phenotype is an important direction for future research. Equipped with our advanced imaging tools, we are now preparing to extend our studies to include Vif variants from additional HIV-1 subtypes, including primary isolates. As you rightly pointed out, depletion of B56 is expected to be challenging as the B56 family comprises multiple isoforms, each with distinct and partially redundant roles in mitosis, particularly in microtubule assembly and spindle assembly checkpoint regulation. The functions of PP2A-B56 in mitosis are well-documented compared to the relatively new studies on Vif’s role in PP2A-B56 degradation. In human cells, the B56 family comprises 5 isoforms (B56α, B56β, B56γ, B56δ, and B56ε). While all B56 isoforms localize to kinetochores or centromeres during early mitosis, the reasons for their slightly different localization patterns (to either kinetochores or centromeres) remain unclear (Vallardi et al., eLife, 2019). Notably, these isoforms exhibit functional redundancy; thus, the depletion of any single isoform does not result in severe mitotic defects (Foley et al., Nature Cell Biology, 2011; Neumann et al., Nature, 2010). Supporting this redundancy, the overexpression of a single isoform (tested only B56α and B56γ) can rescue kinetochore function when all other isoforms are depleted (Foley et al., Nature Cell Biology, 2011; Vallardi et al., eLife, 2019). This complexity poses significant challenges to modulating the relative levels of individual B56 isoforms experimentally. 
      While these specific experiments are beyond the current scope of our study, we remain committed to advancing our understanding of the mechanisms driving Vif-induced pseudo-metaphase arrest. Your suggestion aligns with our ongoing efforts, and we will consider these experiments as we further explore this fascinating area.

      Where I would raise some criticism is in the relevance of these observations to the replication and pathogenesis of the virus itself, which the authors do not address or discuss. Firstly, despite clear data that both Vpr and Vif can lead to a cell cycle arrest in cycling cells, it has never been particularly clear why the virus does this. While I would agree with the authors that Vif results in the metaphase arrest through targeting B56-PP2A, this may not be the reason WHY the virus targets one of the cell's major phosphatases, but rather a knock-on effect of doing so. I appreciate that this is beyond the scope of the study, but it is something I feel should be discussed rather than the narrow mechanistic points made in the discussion. Secondly, the authors suggest that this activity of Vif is a major cause of apoptosis in infected cells and perhaps CD4+ T cell depletion in vivo. It would be good to quantify how much apoptosis is Vif-dependent in infected primary human CD4+ T cells rather than transformed tumor cells, and whether this correlates with the Vif-mediated induction of a pseudometaphase.

      Thank you for highlighting this important point. We completely agree that the full scope of Vif’s bi-functional roles, in both degrading the APOBEC3 family, which is essential for HIV-1 infection, and inducing cell cycle arrest, is not yet fully understood. The connection between Vif’s role in cell cycle arrest and the HIV-1 life cycle remains unclear. One possible explanation, as discussed in our study, is that Vif-induced pseudo-metaphase arrest may contribute to cell death, suggesting that Vif could play a role in the reduction of CD4+ T cells. Alternatively, Vif’s impact on cell cycle arrest, or its disruption of phosphatase activity, could facilitate HIV-1 virus production. However, further experiments, especially using primary human CD4+ T cells with similar approaches as in this study, are essential to gain deeper insights. This discussion has been included in the Limitations section of our study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The first paragraph of the Introduction is not necessary and anyway is quite outdated about the current state of HIV pathogenesis. Likewise, the discussion implies that HIV pathogenesis is due to virally-induced cell death, which is also outdated by more than a decade of work demonstrating that chronic immune activation is the driver of CD4 cell decline rather than direct cytotoxicity due to viral proteins.

      We have revised the first paragraph of the Introduction.

      (2) Line 134. I do not know what are Cal51 cells, and why they are being used for an HIV study here. Some rationale for being the cell of choice for this study should be included.

      Thank you for this suggestion. We have revised the text to clearly articulate the rationale for selecting the Cal51 cell line in this study. Briefly, this study focuses on the robust mitotic arrest induced by Vif. To capture this phenomenon, long-term live-cell imaging was required with a range of 48–120 hours, with imaging intervals of 6–12 minutes and 3–4 z-stacks per time point. These parameters presented considerable technical challenges. The Cal51 cell line was chosen as it has been genetically engineered by the CRISPR-Cas9 method to express mScarlet-tagged Histone H2B and mNeonGreen-tagged Tubulin, enabling extended live-cell imaging. Furthermore, the Cal51 cell line exhibits wild-type p53 expression and maintains a stable near-diploid karyotype, making it an ideal model for studying cell cycle progression.
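      As a rough illustration of the acquisition load implied by the imaging parameters quoted above, the following sketch counts time points and images per field of view. The function name `imaging_load` and the default of two channels (one per fluorescent reporter) are assumptions made for this example and are not stated in the response.

```python
def imaging_load(duration_h, interval_min, z_slices, channels=2):
    """Return (time points, total images) for one field of view.

    channels=2 is an assumption: one image each for the H2B and
    tubulin reporters at every z-slice and time point.
    """
    time_points = int(duration_h * 60 // interval_min)
    return time_points, time_points * z_slices * channels


# Lightest and heaviest ends of the ranges quoted in the response.
lightest = imaging_load(duration_h=48, interval_min=12, z_slices=3)
heaviest = imaging_load(duration_h=120, interval_min=6, z_slices=4)
print(lightest, heaviest)
```

      Under these assumptions, the heaviest setting requires several times more exposures per field of view than the lightest, which gives a sense of why stable, bright reporters matter for such long sessions.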

      (3) A description of the viruses being used is necessary. Although the authors cite a previous paper, the names in that paper do not exactly match the names used here. I presume that is the NL4.3 strain?

      Thank you for raising this important point. We utilized the subtype B HIV-1 NL4-3 strain in our study and have revised the manuscript to specify this detail.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Reviews):

      Summary:

      This study examines to what extent this phenomenon varies based on the visibility of the saccade target. Visibility is defined as the contrast level of the target with respect to the noise background, and it is related to the signal-to-noise ratio of the target. A more visible target facilitates oculomotor planning and execution; however, as speculated by the authors, it can also benefit foveal prediction even if the visibility of the foveal stimulus is held constant. Remarkably, the authors show that presenting a highly visible saccade target is beneficial for foveal vision, as the detection of stimuli with an orientation similar to that of the saccade target is improved; the lower the saccade target visibility, the less prominent the effect.

      Strengths:

      The results are convincing and the research methodology is technically sound.

      Weaknesses:

      Discussion on how this phenomenon may unfold in natural viewing conditions when the foveal and saccade target stimuli are complex and are constituted by different visual properties is lacking. Some speculations regarding feedforward vs feedback neural processing involved in the phenomenon and the speed of the feedforward signal in relation to the visibility of the target, are not well justified and not clearly supported by the data.

      We thank the reviewer for their comment. In general, we tried to address conceptual points only briefly in this Research Advance if we had discussed them in depth in our main article, to which this advance will be linked (Kroell & Rolfs, 2022: https://elifesciences.org/articles/78106). However, the reviews showed us that this made our theoretical reasoning in the current manuscript appear incomplete. In the revised Discussion section, we have elaborated on several conceptual questions. In particular, we expand on the transferability of our findings to natural viewing conditions:

      “Foveal prediction in natural visual environments

      As noted above, human observers typically move their eyes towards the most conspicuous objects in their environment (‘t Hart, Schmidt, Roth, & Einhäuser, 2013). Foveal prediction seems to benefit from this strategy as the strength of the predicted signal increases with the conspicuity of the eye movement target. Nonetheless, natural visual environments as well as naturalistic viewing behavior pose several challenges for the foveal prediction mechanism (see Kroell & Rolfs, 2022, for an initial discussion). 

      First, naturalistic saccade target stimuli will likely exhibit complex shapes and, more often than not, will include feature conjunctions rather than isolated features. Previous findings suggest that the foveal feedback mechanism is capable of operating at this level of complexity: High-level peripheral information such as the category of novel, rendered objects (Williams et al., 2008) has been successfully decoded from activation in foveal retinotopic cortex. If, indeed, temporal object-specific areas such as area TE send feedback, the foveal prediction mechanism may even be specialized for the transfer of complex visual properties.

      Second, foveal input will often be of high contrast in natural visual environments. Whether fed-back predictive signals can influence foveal perception in the presence of high-contrast feedforward input remains to be established. In our main investigation (Kroell & Rolfs, 2022; Figure 2B) as well as in previous studies (Hanning & Deubel, 2022b), pre-saccadic foveal detection performance decreased markedly in the course of saccade preparation, presumably because visuospatial attention gradually shifted towards the saccade target and away from the foveal location. This presaccadic decrease in foveal sensitivity may boost the relative weight of fed-back signals by attenuating the conspicuity of high-contrast feedforward input. In other words, the strength of feedforward input to the fovea is reduced gradually across saccade preparation. At the same time, the strength of the fed-back predictive signal should profit from the high contrast of naturalistic saccade targets.

      Third, while foveal and peripheral information was congruent on 50% of all ‘probe present’ trials in our investigation, peripheral and foveal features will often be weakly correlated or even uncorrelated in natural environments (see Samonds, Geisler, & Priebe, 2018). Again, the presaccadic attenuation of foveal feedforward processing may allow fed-back peripheral signals to influence perception even if they are uncorrelated with foveal information. Moreover, in piloting variations of our paradigm, we observed that the subjective impression of perceiving the saccade target at the pre-saccadic foveal location is most pronounced if the foveal noise region is replaced with a black Gaussian blob at certain time points before saccade onset (unpublished phenomenological accounts). In consequence, fed-back signals do not seem to require correlated feedforward input to influence perception. Quantitative evidence, however, remains to be established.

      Lastly, pre-saccadic foveal input is likely less relevant during natural viewing behavior than it is in our task. It is possible that this task-induced prioritization of the foveal location facilitated the emergence of congruency effects. In a previous experiment (Kroell & Rolfs, 2022; Figure 1D), however, the perceptual probe could appear anywhere on a horizontal axis of 9 dva length around the fixation location. Despite this spatial unpredictability, congruency effects peaked at the presaccadic foveal location, even after peripheral baseline performances had been raised to a foveal level through an adaptive increase in probe opacity. On a similar note, the orientation of the saccade target is irrelevant to the behavioral task in our design, mirroring naturalistic situations: The eye movement can be planned and executed based on local contrast variations alone, and observers are never required to report on the orientation of the peripheral target stimulus. Ultimately, however, an influence of task demands on visual processing can only be fully excluded through techniques that provide a direct readout of perceptual contents without requiring overt responses. In psychophysical investigations, a prediction of saccade target motion may be read out from observers’ eye velocities (Kroell, Mitchell, & Rolfs, 2023; Kwon, Rolfs, & Mitchell, 2019). In electroencephalographic (EEG) and electrophysiological studies, foveal predictions should manifest in early visually evoked potentials (e.g., Creel, 2019) and increased firing rates of feature-selective foveal neurons in early visual areas, respectively. In conclusion, previous findings (Williams et al., 2008), the assumed properties of the neuronal feedback mechanism (Williams et al., 2008; Bullier, 2001) and characteristics of our current and previous experimental paradigms collectively suggest that foveal feature predictions are likely to transfer to naturalistic environments and viewing situations. Experimental evidence remains to be established.”

      We have furthermore modified the Abstract to emphasize the connection of the current manuscript to the main article.

      With respect to the reviewer’s point that “speculations regarding feedforward vs feedback neural processing involved in the phenomenon and the speed of the feedforward signal in relation to the visibility of the target, are not well justified”: 

      Again, we understand that we should have elaborated on our theoretical reasoning in this Research Advance. The assumption that our initial findings rely on neuronal feedback to foveal retinotopic cortex is derived from Williams et al.’s (2008) seminal findings: In an fMRI study, the category of peripherally presented objects could be decoded from voxels in foveal retinotopic cortex, suggesting that peripheral visual information was available to neurons with strictly foveal receptive fields. We extended these findings to saccade preparation, suggesting that feedback from higher-order, non-retinotopically organized visual areas may transmit information without the requirement of efference copies (see Kroell, 2023; Dissertation; https://doi.org/10.18452/27204, pp. 54-59): Irrespective of the vector of the upcoming saccade, the features of the attended saccade target would invariably be relayed to foveal retinotopic cortex. Ultimately, only anatomical and functional studies in non-human primates can conclusively establish the role of feedback connections in the observed foveal prediction effects. At present, however, this parsimonious model could account for all of our current and previous findings, that is, a temporally, spatially and feature-specific anticipation of saccade target properties in the presaccadic center of gaze. Nonetheless, we are open to considering any other mechanism that may account for our findings, and have integrated the explanation provided by the reviewer into the paragraph on potential thalamic mechanisms (see the reviewer’s Major Point 1).

      Concerning the point that “some speculations regarding feedforward vs feedback neural processing […] and the speed of the feedforward signal in relation to the visibility of the target are not well justified and not clearly supported by the data”:

      Theoretical considerations on the impact of peripheral target contrast on feedforward processing speed were a main motivation for the current study. We apologize if our theoretical reasoning was incomplete and have added additional references and elaborations to the Introduction: 

      “In particular, neuronal response latencies decrease systematically as the contrast of visual input increases. While this phenomenon is reliably observed at varying stages of the visual processing hierarchy—such as the lateral geniculate nucleus (Lee, Elepfandt, & Virsu, 1981b), primary visual cortex (e.g., Albrecht, 1995; Carandini & Heeger, 1994; Carandini, Heeger, & Movshon, 1997; Carandini, Heeger, & Senn, 2002), and anterior superior temporal sulcus (STSa; Oram, Xiao, Dritschel, & Payne, 2002; van Rossum, van der Meer, Xiao, & Oram, 2008)—influences of contrast on neuronal response latency are particularly pronounced in higher-order visual areas: A doubling of stimulus contrast has been shown to decrease the latency of V1 neurons by 8 ms, compared to a reduction of 33 ms in area STSa (Oram et al., 2002; van Rossum et al., 2008). Assuming that the peripheral target is processed in a bottom-up fashion until it reaches higher-order object processing areas, the time point at which peripheral signals are available for feedback should be dictated by the temporal dynamics of visual feedforward processing.”

      Concerning the interpretation of the observed time courses, and regarding the reviewer’s Major points 3 & 6, we substantially revised the Results and Discussion sections. In brief, we de-emphasized the claim/interpretation of faster enhancement with increasing target opacity and instead focused on describing the oscillatory pattern mentioned by the reviewer. We provide a more temporally resolved pre-saccadic time course using a moving-window analysis and discuss all suggested and further alternative explanations (i.e., saccade-locked perceptual or attentional oscillations, longer signal accumulation intervals for low-contrast information, oscillatory nature of feedback signaling). Details and full revised paragraphs are provided in the response to this reviewer’s Major points 3 & 6.

      Unfortunately, there is no line numbering in the manuscript version I downloaded so I cannot refer to the specific lines of text here.

      We apologize for the inconvenience and have added line numbers.

      Major:

      (1) The authors speculate that the phenomenon of pre-saccadic foveal prediction arises from feedback connections from higher-order visual areas, which relay relevant saccade target features to the foveal retinotopic cortex. These feedback signals are then presumably combined with feedforward foveal input to the early visual cortex and facilitate the detection of target-congruent features at the center of gaze. This interpretation is sensible, however, it may not be the only plausible scenario. The thalamus receives copies of feedforward and feedback connections between all visual areas and is a likely candidate hub for combining information across visual space. In this latter case, the phenomenon of pre-saccadic foveal prediction may not arise from feedback from higher-order visual areas, but rather from a combination of signals occurring at the level of the thalamus. The authors should either acknowledge this possibility and the fact that this phenomenon is not necessarily the result of a feedback loop, or they should explain their rationale for excluding this scenario.

      We thank the reviewer for their highly thoughtful suggestion, and for alerting us to relevant literature. We have added the following paragraph to the Discussion section. In brief, we discuss the thalamic pulvinar as either an intermediate modulatory region or as the final receiver of the fed-back signal. Yet, we assume that—to solve the combinatorial issue associated with a transfer of feature information before saccades with any possible direction and amplitude—the contribution of non-retinotopic, higher-order object processing areas is likely required.

      “Neural implementation of foveal prediction

      Based on the body of our findings as well as previous literature, we suggested a parsimonious feedback mechanism to underlie the observed effects: the preparation of a saccadic eye movement, and the concomitant shift of pre-saccadic attention (e.g., Kowler, Anderson, Dosher, & Blaser, 1995; Deubel & Schneider, 1996), selects the peripheral target stimulus among competing information. Higher-order visual areas feed selected feature input back to early retinotopic areas—specifically, to neurons with foveal receptive fields. Fed-back feature information combines with congruent, foveal feedforward input, resulting in the enhancement effects we observe. Especially in the context of active vision, this feedback mechanism is appealing as it resolves a combinatorial issue associated with feature-specific information transfer before saccades. Consider a simplified case in which, right before a saccadic eye movement, the activation of a feature-selective neuron that encodes a certain retinal location is transferred to a neuron within the same brain area that will encode said retinal location after saccade landing. For this mechanism to function for any possible saccade direction and amplitude, most neurons would need to be connected to most other neurons (or, in a simplified version, to neurons with foveal receptive fields) in a given brain area. Assuming an information transmission via feedback rather than horizontal connections significantly reduces this dimensionality: Higher-order visual areas that encode object properties (largely) detached from retinotopic or spatiotopic reference frames selectively transfer feature information to neurons with foveal receptive fields, irrespective of the vector of the upcoming saccade. This parsimonious mechanism would have shortcomings. In particular, foveal feedback should become less effective during saccade sequences where several peripheral targets are simultaneously attended. Feature information at both attended target locations may be fed back in temporal succession or weighted and erroneously combined into a single fed-back signal. In most cases, however, foveal feedback may reasonably achieve what established transsaccadic mechanisms struggle to explain: An anticipation of the features of a single saccade target—which typically constitutes the currently most relevant object in the visual field—in foveal vision.

      While direct feedback connections from higher-order to early visual areas would constitute the most straightforward implementation, it is conceivable that feedback signals are relayed through and modulated by subcortical areas. In particular, the thalamic pulvinar has been identified as a connection hub for visual processing that receives copies of feedforward and feedback connections from different visual areas and may even combine information across visual space (Cortes, Ladret, Abbas-Farishta, & Casanova, 2024). In the case of foveal prediction, thalamic neurons may receive fed-back signals from higher-order areas and enhance those signals before passing them on to cortical neurons with foveal receptive fields. Perhaps, a modification of foveal activation within the thalamic pulvinar itself is sufficient to influence perception. To the best of our understanding, however, the fed-back signal must originate in non-retinotopic, higher-order object processing areas to reduce the number of necessary neuronal connections.”

      (2) The results presented are very compelling. I wonder to what extent they generalize to situations in which the foveal input and the peripheral input are more heterogeneous (e.g., faces or complex objects composed of many different features, orientations, and other visual properties). I think the current research raises a number of interesting questions. In general, it would be important for readers that the authors elaborate more on how the mechanism of pre-saccadic foveal prediction may play out in normal viewing conditions or in conditions in which the foveal input is completely irrelevant to the task.

      We agree and have reiterated this point in the current manuscript (see our first reply to “Weaknesses”). We also explicitly refer to Kroell & Rolfs (2022) for an extensive initial discussion of this question.

      (3) On page 10 the authors state that their data suggest that foveal enhancement emerges in earlier stages of saccade preparation as target opacity increases. However, this is not clear from the figures. When performance is locked to saccade onset (Fig. 3C), performance for the highest-opacity targets seems to oscillate, yet the authors do not comment on that. There is literature showing how saccades can reset perceptual oscillations, and maybe what is observed here is just a stronger performance oscillation when the saccade target is more visible. Why would performance drop systematically 75 ms before saccade onset and then increase again 25 ms before the onset? Can the authors elaborate more on this?

      In response to this comment, we inspected the pre-saccadic time course of enhancement effects in a more temporally resolved fashion and, indeed, observed pronounced oscillations for the two higher target opacity conditions (see Results): 

      “Especially at higher target opacities, the temporal development of foveal enhancement appears to exhibit an oscillatory pattern. To inspect this incidental observation in a more temporally resolved fashion, we determined mean enhancement values in a boxcar window of 50 ms duration sliding along all saccade-locked probe offset time points (step size = 10 ms; x-axis values in Figure 4 indicate the latest time point in a certain window). We then fitted 6th order polynomials (with no constraints on parameters) to the resulting time courses and compared the fitted values against zero using bootstrapping (see Methods). The average foveal enhancement across target opacities reached significance starting 115 ms before saccade onset (gray curve in Figure 4; all ps < .046). For every individual target opacity condition, we observed significant enhancement immediately before saccade onset, although only very briefly for the lowest opacity (-2–0 ms for 25%, -39–0 ms for 39%, -106–0 ms for 59%, and -13–0 ms for 90%; all ps < .050; yellow to dark red curves in Figure 4). Especially for the higher two target opacities, we observed a local maximum preceding eye movement onset by approximately 80 ms. Interestingly, assuming a peak in enhancement in approximately 80 ms intervals (i.e., at x-axis values of -80 and 0 ms in Figure 4) would correspond to an oscillation frequency of 12.5 Hz. In contrast to rapid feedforward processing, feedback signaling is associated with neural oscillations in the alpha and beta range (i.e., between 7 and 30 Hz; Bastos et al., 2015; Jensen, Bonnefond, Marshall, & Tiesinga, 2015; van Kerkoerle et al., 2015).”
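      The moving-window procedure described in the quoted paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the manuscript: the function names, the half-open window convention, and the use of NumPy's `Polynomial.fit` are choices made for this example, and the bootstrap comparison against zero is omitted because its details are given only in the Methods.

```python
import numpy as np


def sliding_boxcar_mean(times_ms, values, width_ms=50.0, step_ms=10.0):
    """Mean of `values` in boxcar windows slid along `times_ms`.

    Each window is labeled by its latest time point, as in the text.
    Assumption: a window covers (end - width_ms, end].
    """
    times_ms = np.asarray(times_ms, dtype=float)
    values = np.asarray(values, dtype=float)
    ends = np.arange(times_ms.min() + width_ms,
                     times_ms.max() + step_ms / 2, step_ms)
    means = np.full(ends.shape, np.nan)
    for i, end in enumerate(ends):
        in_window = (times_ms > end - width_ms) & (times_ms <= end)
        if in_window.any():
            means[i] = values[in_window].mean()
    return ends, means


def fit_polynomial(ends, means, order=6):
    """Unconstrained least-squares polynomial fit to the windowed means."""
    valid = ~np.isnan(means)
    return np.polynomial.Polynomial.fit(ends[valid], means[valid], deg=order)


def interval_to_frequency_hz(interval_ms):
    """Convert a peak-to-peak interval in ms to a frequency in Hz."""
    return 1000.0 / interval_ms


# Peaks spaced 80 ms apart correspond to 1000 / 80 = 12.5 Hz.
print(interval_to_frequency_hz(80.0))
```

      Labeling each window by its latest time point means that a value plotted at 0 ms summarizes probes offset during the final 50 ms before saccade onset, which is worth keeping in mind when reading peak latencies from the smoothed curves.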

      We had observed an oscillatory pattern in multiple previous investigations, and in both Hit Rates to foveal orientation content and reflexive gaze velocities in response to peripheral motion information. So far, we have been unsure how to explain it. The literature on thalamic visual processing mentioned by the reviewer alerted us to the oscillatory nature of feedback signaling itself. Interestingly, the temporal frequency range of feedback oscillations includes the frequency of ~12.5 Hz observed in our data. We have included this and alternative explanations in the Discussion section (see below). Throughout, we highlight that we are aware that our analysis approach is purely descriptive and that the potential explanations we give are speculative.

      “Moreover, foveal congruency effects appear to exhibit an oscillatory pattern, with peaks in a medium saccade preparation stage (~80 ms before the eye movement) and immediately before saccade onset. We have noticed this pattern in several investigations with substantially different visual stimuli and behavioral readouts. For instance, using a full-screen dot motion paradigm, we observed a pre-saccadic, small-gain ocular following response to coherent motion in the saccade target region (Kroell, Rolfs, & Mitchell, 2023, conference abstract; Kroell, 2023, dissertation). Predictive ocular following first reached significance ~125 ms before the eye movement, then decreased and subsequently ramped up again ~25 ms before saccade onset. Several explanatory mechanisms appear conceivable. Unlike rapid feedforward processing, feedback propagation has been shown to follow an oscillatory rhythm in the alpha and beta range, that is, between 7 and 30 Hz (Bastos et al., 2015; Jensen, Bonnefond, Marshall, & Tiesinga, 2015; van Kerkoerle, et al., 2015). In our case, it is possible that the object-processing areas that send feedback to retinotopic visual cortex do so at a temporal frequency of ~12.5 Hz. At higher stimulus contrasts, feedforward signals may be fed back instantaneously and without the need for signal accumulation in feedback-generating areas. The resulting perceptual time courses may reflect innate temporal feedback properties most veridically. Alternatively, the initial enhancement peak may be related to the sudden onset of the saccade target stimulus and not to movement preparation itself. In this case, the initial peak should become particularly apparent if enhancement is aligned to the onset of the target stimulus. Yet, Figure 3 and Figure 4 suggest more prominent oscillations in saccade-locked time courses.
      In accordance with this, perceptual and attentional processes have been shown to exhibit oscillatory modulations that are phase-locked to action onset (e.g., Tomassini, Spinelli, Jacono, Sandini, & Morrone, 2015; Hogendoorn, 2016; Wutz, Muschter, van Koningsbruggen, Weisz, & Melcher, 2016; Benedetto & Morrone, 2017; Tomassini, Ambrogioni, Medendorp, & Maris, 2017; Benedetto, Morrone & Tomassini, 2019). Whether the oscillatory pattern of foveal enhancement, as well as its increased prominence at higher target contrasts, relies on innate temporal properties of feedback signaling, signal accumulation, saccade-locked oscillatory modulations of feedforward processing or attention, or a combination of these factors, one conclusion remains: task-induced cognitive influences suggested to underlie the considerable variability in temporal characteristics of foveal feedback during passive fixation (e.g., Fan et al., 2016; Weldon et al., 2016; 2020) are not the only possible explanation. Low-level target properties such as its luminance contrast modulate the resulting time course and should be equally considered, at least in our paradigm.”

      In the revised Abstract, we removed our claim on an earlier emergence of enhancement at higher opacities and have added this summary instead:

      “Second, the time course of foveal enhancement appeared to show an oscillatory pattern that was particularly pronounced at higher target opacities. Interestingly, the temporal frequency of these oscillations corresponded to the frequency range typically associated with neural feedback signaling.”

      (4) What was the average difference in latency between short and long latencies? It would be good to report it in the main text.

      We apologize for the oversight. The difference was 61 ms, with latencies of md = 247±18 ms for short- and md = 308±18 ms for long-latency saccades. We have added this information to the main text.

      (5) From the saccade latency graphs in Figure S1 it seems there is some variability in the latency of saccades across subjects, I wonder if there is a correlation between saccade latency and the magnitude of the foveal prediction effect across subjects.

      We had inspected a connection between saccade latency and congruency in our first investigation (Kroell & Rolfs, 2022; not reported) and observed that participants with lower latencies tended to show more enhancement, albeit non-significantly. Likewise, we observed a non-significant negative correlation between the median saccade latency and the mean foveal prediction effect (across opacities and time points) in the current investigation, r = -0.22, p = .572. While our study involved a small number of observers (n = 9), the analysis approach illustrated in Figure 2 A-C instead makes use of the large number of trials collected per participant (mean n = 2841 trials per observer) and demonstrates a reliable influence of saccade latency on an individual-observer level.
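      To make the reported statistic easier to check, here is a minimal sketch that converts a correlation coefficient and sample size into a two-sided p-value. It assumes the standard t-test for a Pearson correlation with n - 2 degrees of freedom, which the response does not state explicitly; if a different test was used, the numbers need not match. The function name and the Simpson integration are illustrative choices, using only the Python standard library.

```python
import math


def pearson_r_p_value(r, n, steps=2000):
    """Return (t, two-sided p) for a Pearson r from n observations.

    Assumes t = r * sqrt(n - 2) / sqrt(1 - r^2) with df = n - 2.
    The Student t density is integrated numerically with Simpson's rule;
    `steps` must be even.
    """
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)

    # Normalizing constant of the Student t density with `df` degrees of freedom.
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return norm * (1.0 + x * x / df) ** (-(df + 1) / 2)

    # Simpson's rule for the probability mass between 0 and |t|.
    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    half_mass = total * h / 3.0

    p_two_sided = 1.0 - 2.0 * half_mass
    return t, p_two_sided


if __name__ == "__main__":
    t, p = pearson_r_p_value(-0.22, 9)
    print(f"t = {t:.3f}, p = {p:.3f}")
```

      With r = -0.22 and n = 9, this gives a p-value of roughly .57, in line with the reported p = .572; the small remaining difference is what one would expect from r being rounded to two decimals.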

      (6) Page 14, the authors state that their findings suggest that the feedforward processing of the peripheral saccade target is accelerated when it is presented at high contrast. I find this a bit too speculative, both in terms of assuming that there is a feedforward vs a feedback process (see my point 1) and in terms of speculating that the feedforward process is accelerated as I do not see a clear hint of this in the data (see my point 3) and it is a bit of a stretch to speculate on delays or accelerations of neural processing. It is possible that the feedforward signal is always delivered at the same speed but it is weaker in one case and the effect needs more time to build up.

      We fully agree and hope to have addressed the reviewer’s arguments in the sections preceding this point. We included the reviewer’s last sentence in the Discussion section as well: 

      “Alternatively, or in addition, it is conceivable that weaker feedforward signals require a longer accumulation interval before the feedback process can be initiated.”

      Minor:

      (1) I think the description of the linear mixed-effects model can go in the supplemental methods, if possible, and its results can be briefly mentioned in the text.

      In previous work, we have been asked to move linear mixed-effects model descriptions from supplemental to main method (or even results) sections for clarity. We have followed this suggestion ever since and, due to the relevance of the models for the interpretation of the presented results, would like to keep their description in the methods section.

      (2) This is just a minor point, but I would suggest using a different word instead of opacity (maybe visibility?).

      We had gone back and forth on this. We decided to use the term ‘conspicuity’ when we discuss our findings conceptually and the term ‘opacity’ when we refer to the experimental manipulation (since we directly manipulate the transparency, i.e., 1-opacity, of the target patch against the background). To compute the slopes in Figures 2 and 5, we ordered observers’ performances by the linearly spaced opacity conditions. Since the term ‘opacity’ is closest to both the experimental manipulation and the variable entered into analysis, we would like to adhere to this terminology. However, we have added an explicit note to the end of our introduction to avoid confusion: 

      “Throughout the paper, we use the term ‘opacity’ when we refer to the experimental manipulation (that is, a variation of the transparency, i.e., 1-opacity of the target patch against the background noise) and the term ‘conspicuity’ when we discuss our findings conceptually.”

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors ran a dual task. Subjects monitored a peripheral location for a target onset (to generate a saccade to), and they also monitored a foveal location for a foveal probe. The foveal probe could be congruent or incongruent with the orientation of the peripheral target. In this study, the authors manipulated the conspicuity of the peripheral target, and they saw changes in performance in the foveal task. However, the changes were somewhat counterintuitive.

      Strengths:

      The authors use solid analysis methods and careful experimental design.

      Weaknesses:

      I have some issues with the interpretation of the results, as explained below. In general, I feel that a lot of effects are being explained by attention and target-probe onset asynchrony etc, but this seems to be against the idea put forth by the authors of "foveal prediction for visual continuity across saccades". Why would foveal prediction be so dependent on such other processes? This needs to be better clarified and justified.

      We address the described weaknesses in the respective sections below. In general, as we point out in response to Reviewer 1 as well, the current submission is a Research Advance article meant to supplement our main article (Kroell & Rolfs, 2022, https://doi.org/10.7554/eLife.78106). To comply with the eLife recommendations for Research Advance submissions, we addressed conceptual points only briefly, especially if they had been explained in detail in our main article. To make the nature and format of the current submission as explicit as possible, and to emphasize its connection to our previous work, we refer to the submission format in our abstract and introduction now.

      Specifics:

      The explanation of decreased hit rates with increased peripheral target opacity is not convincing. The authors suggest that higher contrast stimuli in the periphery attract attention. But, then, why are the foveal results occurring earlier (as per the later descriptions in the manuscript)? And, more importantly, why would foveal prediction need to be weaker with stronger pre-saccadic attention to the periphery? What is the function of foveal prediction? What of the other interpretation that could be invoked in general for this type of task used by the authors: that the dual task is challenging and that subjects somehow misattribute what they saw in the peripheral task when planning the saccade. i.e. foveal hit rates are misperceptions of the peripheral target. When the peripheral target is easier to see, then the foveal hit rate drops.

      We will address these comments one by one:

      The authors suggest that higher contrast stimuli in the periphery attract attention. But, then, why are the foveal results occurring earlier (as per the later descriptions in the manuscript)?

      We consider these observations to rely on separate processes. Already in the main publication (Kroell & Rolfs, 2022), we had observed a continuous decrease of target-congruent and target-incongruent foveal Hit Rates (HRs) during saccade preparation, and suggested that this decrease (similarly observed in Hanning & Deubel, 2022b) is likely caused by the pre-saccadic shift of visuospatial attention to the target. In other words, as attentional resources shift towards the periphery, foveal detection performance is hampered, irrespective of peripheral and foveal feature (in-)congruency. In the current investigation, we again observed a pronounced pre-saccadic decrease of foveal HRs, irrespective of foveal probe orientation. Our argument that high-contrast peripheral saccade targets attract more attention relies on the clear observation that this decrease becomes more pronounced as the contrast of the saccade target increases. To the best of our judgment and experience with doing the task ourselves, this interpretation appears very conceivable. We explain this rationale in the Abstract and the Results sections of the manuscript (see below).

      Our hypotheses and interpretations concerning the time course of foveal prediction refer to the difference between target-congruent and target-incongruent foveal HRs (i.e., to predictive foveal feature enhancement). Irrespective of the general, feature-unspecific decrease of foveal detection performances, we had hypothesized that the peripheral target is processed faster if it exhibits a high contrast. This assumption is based on temporal processing properties of many visual neurons that we have expanded on in our revision: 

      “In particular, neuronal response latencies decrease systematically as the contrast of visual input increases. While this phenomenon is reliably observed at varying stages of the visual processing hierarchy—such as the lateral geniculate nucleus (Lee et al., 1981b), primary visual cortex (e.g., Albrecht, 1995; Carandini et al., 1997, 2002; Carandini and Heeger, 1994), and anterior superior temporal sulcus (STSa; Oram et al., 2002; van Rossum et al., 2008)— influences of contrast on neuronal response latency are particularly pronounced in higher-order visual areas: A doubling of stimulus contrast has been shown to decrease the latency of V1 neurons by 8 ms, compared to a reduction of 33 ms in area STSa (Oram et al., 2002; van Rossum et al., 2008). Assuming that the peripheral target is processed in a bottom-up fashion until it reaches higher-order object processing areas, the time point at which peripheral signals are available for feedback should be dictated by the temporal dynamics of visual feedforward processing.”

      Of note, both reviewers asked us to explore the oscillatory nature of the difference between target-congruent and target-incongruent HRs. We will post our changes in response to the reviewer’s remark below.

      And, more importantly, why would foveal prediction need to be weaker with stronger pre-saccadic attention to the periphery?

      We hope that our previous reply has cleared up that the opposite is true: In general, and irrespective of the feature congruency of target and foveal probe, foveal HRs decrease as target contrast increases. As we have stated in our Abstract and Results, “foveal Hit Rates for target-congruent and incongruent probes decreased as target opacity increased, presumably since attention was increasingly drawn to the target the more salient it became. Crucially, foveal enhancement defined as the difference between congruent and incongruent Hit Rates increased with opacity”. This finding did not appear counterintuitive to us and was, in fact, pre-registered as a main hypothesis (see https://osf.io/wceba).
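      The two quantities distinguished here can be written out in a few lines. The sketch below is illustrative and not the authors' code: the trial fields `probe_present`, `congruent`, and `said_present` are hypothetical names, and the data are made up. It follows the definitions given in this response, where a Hit is a 'present' judgment on a probe-present trial and enhancement is the congruent Hit Rate minus the incongruent Hit Rate.

```python
def hit_rate(trials, congruent):
    """Proportion of 'present' responses among probe-present trials
    of one congruency type (True = target-congruent probe)."""
    relevant = [t for t in trials
                if t["probe_present"] and t["congruent"] == congruent]
    if not relevant:
        raise ValueError("no probe-present trials for this condition")
    return sum(1 for t in relevant if t["said_present"]) / len(relevant)


def foveal_enhancement(trials):
    """Congruent minus incongruent Hit Rate."""
    return hit_rate(trials, True) - hit_rate(trials, False)


if __name__ == "__main__":
    # Made-up trials: 3 of 4 congruent probes detected, 2 of 4 incongruent.
    demo = (
        [{"probe_present": True, "congruent": True, "said_present": s}
         for s in (True, True, True, False)]
        + [{"probe_present": True, "congruent": False, "said_present": s}
           for s in (True, True, False, False)]
        # Probe-absent trials do not enter the Hit Rate.
        + [{"probe_present": False, "congruent": False, "said_present": True}]
    )
    print(hit_rate(demo, True), hit_rate(demo, False), foveal_enhancement(demo))
```

      Both Hit Rates can fall as the target becomes more conspicuous while their difference grows, which is the pattern the response describes.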

      We are unsure if this goes beyond the reviewer’s concern but we, in fact, speculate in the revised Discussion section as well as in our original eLife article that the overall, feature-unspecific decrease in foveal detection performances may aid feature-specific foveal prediction: 

      “This pre-saccadic decrease in foveal sensitivity may boost the relative weight of fed-back signals by attenuating the conspicuity of high-contrast feedforward input. In other words, the strength of feedforward input to the fovea is reduced gradually across saccade preparation. At the same time, the strength of the fed-back predictive signal should profit from the high contrast of naturalistic saccade targets.”

      What is the function of foveal prediction?

      Please refer to the section ‘What is the function of foveal prediction?’ in our main article. We have pasted this paragraph below for the reviewer’s convenience. 

      “What is the function of foveal prediction?

      As stated above, previous investigations on foveal feedback required observers to make peripheral discrimination judgments. We, in contrast, did not ask observers to generate a perceptual judgment on the orientation of the saccade target. Instead, detecting the target was necessary to perform the oculomotor task. While the identification of local contrast changes would have sufficed to direct the eye movement, the orientation of the target enhanced foveal processing of congruent orientations. The automatic nature of foveal enhancement showcases that perceptual and oculomotor processing are tightly intertwined in active visual settings: planning an eye movement appears to prioritize the features of its target; commencing the processing of these features before the eye movement is executed may accelerate post-saccadic target identification and ultimately provide a head start for corrective gaze behavior (Deubel et al., 1982; Ohl and Kliegl, 2016; Tian et al., 2013).”

      What of the other interpretation that could be invoked in general for this type of task used by the authors: that the dual task is challenging and that subjects somehow misattribute what they saw in the peripheral task when planning the saccade. i.e. foveal hit rates are misperceptions of the peripheral target. When the peripheral target is easier to see, then the foveal hit rate drops.

      Alternative explanations in general: In our main article, we ruled out—either through direct experimentation or by considering relevant properties of our findings—the following alternative explanations: i) spatially global feature-based attention to the target orientation, ii) a multiplicative combination of spatial and feature-based attention, and iii) shifts of decision criterion. While dual tasks (i.e., simultaneous oculomotor planning and perceptual detection) are standard in psychophysical investigations of active vision, we acknowledge the potential influence of an explicit foveal task in the revised manuscript, and in response to both reviewers: 

      “Lastly, pre-saccadic foveal input is likely less relevant during natural viewing behavior than it is in our task. It is possible that this task-induced prioritization of the foveal location facilitated the emergence of congruency effects. In a previous experiment (Kroell & Rolfs, 2022; Figure 2D), the perceptual probe could appear anywhere on a horizontal axis of 9 dva length around the screen center. Despite this spatial unpredictability, however, congruency effects peaked at the pre-saccadic foveal location, even after peripheral baseline performances had been raised to a foveal level through an adaptive increase in probe opacity. Ultimately, an influence of task demands on visual processing can only be fully excluded through techniques that provide a direct readout of perceptual contents without requiring keyboard responses. In psychophysical investigations, a prediction of saccade target motion may be read out from observers’ eye velocities (Kroell, Mitchell, & Rolfs, 2023; Kwon, Rolfs, & Mitchell, 2019). In electroencephalographic (EEG) and neurophysiological studies, foveal predictions should manifest in early visual evoked potentials (e.g., Creel, 2019) and increased firing rates of feature-selective foveal neurons in early visual areas, respectively.”

      Difficulty of the task: Concerning the perceptual detection task, every experimental session was preceded by an adaptive staircase procedure that adjusted the transparency of the foveal probe (and, thus, task difficulty) depending on the respective observer’s performance (see Methods for details). Concerning the oculomotor task, observers were able to perform accurate saccades with typical movement latencies for all target opacity conditions (see Results, Supplements & Figure S1). In general, we are unsure how high task difficulty could produce a feature-, temporally and spatially specific enhancement of both filtered and incidental target-congruent foveal orientation information. In fact, a main finding of our current submission is that foveal HRs decrease as the target becomes easier to see and the oculomotor task thus becomes easier to perform.

      Perceptual confusion of target and probe stimulus: We observe a specific increase in HRs for foveal probes that exhibit the same orientation as the peripheral saccade target. Just like in our main article, a response is defined as a ‘Hit’ if a foveal probe is presented and the observer generates a ‘present’ judgment. To our understanding, the suggestion that a confusion of target and probe stimuli may account for these effects necessarily implies that this confusion hinges on the congruency between peripheral and foveal feature inputs. In other words, peripheral and foveal signals should be more readily “confused” if they exhibit similar features. We assume that peripheral feature information is fed back to neurons with foveal receptive fields and combines with feature-congruent feedforward input. Whether this combination of signals can be described as low-level perceptual “confusion” likely depends on individual linguistic judgments (it would certainly be a novel description of feedback-feedforward interactions). Perhaps a defining difference between the reviewer’s concern and our assumed mechanism is the spatial specificity of the resulting congruency effects. We suggest that only neurons with foveal receptive fields receive feature information via feedback. And indeed, we demonstrate a clear spatial specificity of congruency effects around the pre-saccadic foveal location, even after parafoveal performances had been raised to a foveal level by an adaptive increase in probe opacity (see Kroell & Rolfs, 2022; Figure 2C & Figure 3). In other words, observers’ perception is altered in their pre-saccadic center of gaze while the target is presented peripherally. We struggle to conceive a scenario in which a confusion of signals should be feature-specific as well as specific to an interaction between peripheral and foveal signals without being meaningful at the same time. If the reviewer is referring to confusions on the response or decision level, we would like to point them towards the Discussion section ‘Can our findings be explained by established mechanisms other than foveal prediction?’ in our main article. In this paragraph, we provide detailed arguments for a dissociation between our findings and shifts in decision criterion that would exceed the scope of a Research Advance.

      When the peripheral target is easier to see, then the foveal hit rate drops.

      We agree. Target-congruent and incongruent foveal HRs decreased as the contrast of the target increased. However, and as we stated in response to the reviewer’s first comment, the difference between target-congruent and target-incongruent foveal HRs (and, thus, foveal enhancement of the target orientation) increased with peripheral target contrast.

      The analyses of Fig. 3C appear to be overly convoluted. They also imply an acknowledgment by the authors that target-probe temporal difference matters. Doesn't this already negate the idea that the foveal effects are associated with the saccade generation process itself? If the effect is related to target onset, how is it interpreted as related to a foveal prediction that is associated with the saccade itself? 

      We indeed conducted analyses that can reveal an influence of target presentation duration at probe onset, the saccade preparation stage at probe offset, as well as a combination of both factors. The fact that target presentation duration may have an influence on foveal prediction would not negate a simultaneous influence of saccade preparation and vice versa. In the main article, we directly investigated the influence of saccade preparation on foveal enhancement by introducing a passive fixation condition (Kroell & Rolfs, 2022; Figure 5). At identical target-probe offset durations, pre-saccadic foveal enhancement was significantly more pronounced and accelerated compared to enhancement during passive fixation. We have added a purely saccade-locked time course (uncorrected by target-probe interval) to our Results section and to Figure 3 (second row). We still believe that the target-locked, saccade-locked and combined analyses are informative for future investigations and would like to present them all for completeness.

      Also, the oscillatory nature of the effect in Fig. 3C for 59% and 90% opacity is quite confusing and not addressed. The authors simply state that enhancement occurs earlier before the saccade for higher contrasts. But, this is not entirely true. The enhancement emerges then disappears and then emerges again leading up to the saccade. Why would foveal prediction do that?

      In response to this comment and a suggestion by Reviewer 1, we inspected the pre-saccadic time course of enhancement effects in a more temporally resolved fashion and, indeed, observed pronounced oscillations for the two higher target opacity conditions (see Results): 

      “Especially at higher target opacities, the temporal development of foveal enhancement appears to exhibit an oscillatory pattern. To inspect this incidental observation in a more temporally resolved fashion, we determined mean enhancement values in a boxcar window of 50 ms duration sliding along all saccade-locked probe offset time points (step size = 10 ms; x-axis values in Figure 4 indicate the latest time point in a certain window). We then fitted 6th order polynomials to the resulting time courses and compared the fitted values against zero using bootstrapping (see Methods). The average foveal enhancement across target opacities reached significance starting 115 ms before saccade onset (gray curve in Figure 4; all ps < .046). For every individual target opacity condition, we observed significant enhancement immediately before saccade onset, although only very briefly for the lowest opacity (-2–0 ms for 25%; -39–0 ms for 39%, -106–0 ms for 59% &  -13–0 ms for 90%; all ps < .050; yellow to dark red curves in Figure 4). Especially for the higher two target opacities, we observed a local maximum preceding eye movement onset by approximately 80 ms. Interestingly, assuming a peak in enhancement in approximately 80 ms intervals (i.e., at x-axis values of -80 and 0 ms in Figure 4) would correspond to an oscillation frequency of 12.5 Hz. In contrast to rapid feedforward processing, feedback signaling is associated with neural oscillations in the alpha and beta range (i.e., between 7 and 30 Hz; Bastos et al., 2015; Jensen, Bonnefond, Marshall, & Tiesinga, 2015; van Kerkoerle et al., 2015).”

      We had observed an oscillatory pattern in multiple previous investigations, and in both Hit Rates to foveal orientation content and reflexive gaze velocities in response to peripheral motion information. So far, we have been unsure how to explain it. The literature on thalamic visual processing mentioned by the reviewer alerted us to the oscillatory nature of feedback signaling itself. Interestingly, the temporal frequency range of feedback oscillations includes the frequency of ~12.5 Hz observed in our data. We have included this and alternative explanations in the Discussion section (see below). We are aware, and acknowledge in the manuscript, that our analysis approach is purely descriptive, and that the potential explanations we give are speculative. 

      “Moreover, foveal congruency effects appeared to exhibit an oscillatory pattern, with peaks in a medium saccade preparation stage (~80 ms before the eye movement) and immediately before saccade onset. We have noticed this pattern in several investigations with substantially different visual stimuli and behavioral readouts. For instance, using a full-screen dot motion paradigm, we observed a pre-saccadic, small-gain ocular following response to coherent motion in the saccade target region (Kroell, Rolfs, & Mitchell, 2023, conference abstract; Kroell, 2023, dissertation). Predictive ocular following first reached significance ~125 ms before the eye movement, then decreased and subsequently ramped up again ~25 ms before saccade onset. Several explanatory mechanisms appear conceivable. Unlike rapid feedforward processing, feedback propagation has been shown to follow an oscillatory rhythm in the alpha and beta range, that is, between 7 and 30 Hz (Bastos et al., 2015; Jensen, Bonnefond, Marshall, & Tiesinga, 2015; van Kerkoerle, et al., 2015). In our case, it is possible that the object-processing areas that send feedback to retinotopic visual cortex do so at a temporal frequency of ~12.5 Hz. At higher stimulus contrasts, feedforward signals may be fed back instantaneously and without the need for signal accumulation in feedback-generating areas. The resulting perceptual time courses may reflect innate temporal feedback properties most veridically. Alternatively, the initial enhancement peak may be related to the sudden onset of the saccade target stimulus and not to movement preparation itself. In this case, the initial peak should become particularly apparent if enhancement is aligned to the onset of the target stimulus. Yet, Figure 3 and Figure 4 suggest more prominent oscillations in saccade-locked time courses. 
      In accordance with this, perceptual and attentional processes have been shown to exhibit oscillatory modulations that are phase-locked to action onset (e.g., Tomassini, Spinelli, Jacono, Sandini, & Morrone, 2015; Hogendoorn, 2016; Wutz, Muschter, van Koningsbruggen, Weisz, & Melcher, 2016; Benedetto & Morrone, 2017; Tomassini, Ambrogioni, Medendorp, & Maris, 2017; Benedetto, Morrone & Tomassini, 2019). Whether the oscillatory pattern of foveal enhancement, as well as its increased prominence at higher target contrasts, relies on innate temporal properties of feedback signaling, signal accumulation, saccade-locked oscillatory modulations of feedforward processing or attention, or a combination of these factors, one conclusion remains: task-induced cognitive influences suggested to underlie the considerable variability in temporal characteristics of foveal feedback during passive fixation (e.g., Fan et al., 2016; Weldon et al., 2016; 2020) are not the only possible explanation. Low-level target properties such as its luminance contrast modulate the resulting time course and should be equally considered, at least in our paradigm.”

      The interpretation of Fig. 4 is also confusing. Doesn't the longer latency already account for the lapse in attention, such that visual continuity can proceed normally now that the saccade is actually eventually made? In all results, it seems that the effects are all related to the dual nature of the task and/or attention, rather than to the act of making the saccade itself. Why should visual continuity (when a saccade is actually made, whether with short or long latency) have different "fidelity"? And, isn't this disruptive to the whole idea of visual continuity in the first place?

      We are unsure if we grasp the unifying concern behind these remarks. For the reviewer’s point on the dual-task nature of our paradigm, please consider our answer above. Perhaps it is important to note that we do not (and would never) claim that foveal prediction is the only mechanism underlying visual continuity. We believe that multiple mechanisms, including but not limited to pre-saccadic shifts of attention, predictive remapping of attention pointers and the perception of intra-saccadic signals interact and jointly contribute to visual continuity. It appears highly conceivable that, like most processes in biological systems, motor and perceptual performances are subject to fluctuations. We argue that saccade latencies as well as the magnitude of foveal prediction constitute read-outs of these variations. We also suggest that those read-outs are innately correlated beyond their common moderator of, perhaps, attentional state; we have previously presented clear evidence for a link between eye movement preparation and foveal prediction (Kroell & Rolfs, 2022; Figure 2). To the best of our judgment, we consider it reasonable that the effectiveness of movement-contingent perceptual processes varies with the effectiveness (in programming or execution) of the very movement motivating them. We present evidence for this assumption in our submission. We would also like to make clear that we do not assume our vision to fail entirely, even if every single well-known mechanism of visual continuity were to break down at once. Upon saccade landing, the visual system receives reliable visual input. Nonetheless, the visual system has undeniably developed mechanisms to optimize this process. We believe foveal prediction to rank among them.

      Small question: is it just me or does the data in general seem to be too excessively smoothed?

      We did not apply any smoothing to either the analysis or visualization of our data in the initial manuscript.

      Every observer completed a large number of trials (mean n = 2841 trials per observer; total trial number > 25,500), which likely contributes to the clarity of our data. To inspect the oscillatory pattern of enhancement in a more temporally resolved fashion (in response to the reviewer’s point above), we applied a moving window analysis in this revision. Because neighbouring windows overlap, this analysis introduces a certain degree of smoothing. Nonetheless, data patterns are comparable to the time course with only a few non-overlapping time bins (Figure 3B; second row). In general, we have described all steps of our analysis routine extensively in the Methods section and will make our data publicly available upon publication of the Reviewed Preprint.
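      The smoothing that overlapping windows introduce can be illustrated with a minimal sketch. The function names, window width, and step size below are hypothetical and are not taken from the authors' analysis; the code only shows that when the step is smaller than the window width, a single sample contributes to more than one window mean.

```c
#include <stddef.h>
#include <math.h>

/* Mean of y over samples whose time stamp t[i] lies in
 * [center - width/2, center + width/2). Returns NAN for an empty window. */
static double window_mean(const double *t, const double *y, size_t n,
                          double center, double width)
{
    double sum = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (t[i] >= center - width / 2.0 && t[i] < center + width / 2.0) {
            sum += y[i];
            count++;
        }
    }
    return count ? sum / (double)count : NAN;
}

/* Slide a window of the given width from first_center to last_center in
 * increments of step. If step < width, neighbouring windows share samples,
 * which smooths the resulting time course. Writes at most max_out window
 * means into out and returns the number written. */
size_t moving_window(const double *t, const double *y, size_t n,
                     double first_center, double last_center,
                     double width, double step,
                     double *out, size_t max_out)
{
    size_t k = 0;
    while (k < max_out) {
        double center = first_center + (double)k * step;
        if (center > last_center + 1e-9)
            break;
        out[k] = window_mean(t, y, n, center, width);
        k++;
    }
    return k;
}
```

      With samples at t = 0, 10, 20, 30, 40 and a single non-zero value at t = 20, a 20-unit window stepped by 10 units spreads that one value over two adjacent window means, whereas non-overlapping windows (step equal to width) would assign it to exactly one.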

      General comment: it is important to include line numbers in manuscripts, to help reviewers point to specific parts of the text when writing their comments. Otherwise, the peer review process is rendered unnecessarily complicated for the reviewers.

      We apologize and have added line numbers.


    1. List of recommendations

      The 2025 IGESR report "La classe de seconde : étape-clé pour l’élève de lycée" ("The seconde year: a key stage for the lycée student") makes several recommendations organized around four main axes. They aim to improve the collège-lycée transition, to manage differences in student attainment better, to optimize orientation, and to strengthen citizenship. The main points are summarized below:

      • Axis 1: Strengthen the collège-lycée link

        • Formalize the collège-lycée link by structuring collaboration between troisième and seconde teachers, to ensure consistency and a gradual evolution of rules, teaching methods, and assessment.
        • Circulate institutional documents on schooling, such as the collège personal competency booklets (livrets personnels de compétences) to lycée teachers and the seconde termly reports to collège teachers, to inform teams better about students' needs and subsequent progress.
        • Organize information and immersion sessions to support students in their orientation pathway, by offering joint information sessions and targeted immersions in the vocational and technological tracks.
      • Axis 2: Manage academic heterogeneity better

        • Improve the identification and support of the most vulnerable students by relying on referent teachers and by strengthening the time devoted to tutoring, project building, and personalized follow-up.
        • Develop time for peer reflection and for training on the specifics of teaching in seconde, by offering training on managing mixed-ability classes and by extending the assessment used in the cycle terminal (the final two years) to seconde.
        • Create teaching projects suited to the objectives of seconde, by building on existing experiments and evaluating their contribution to improved learning.
      • Axis 3: Improve the quality of orientation choices

        • Update the placement tools to take better account of students' plans to enter the vocational track, by adapting Affelnet and rethinking the placement procedures.
        • Inform students and teachers about post-seconde pathways, by ensuring that the hours dedicated to the Parcours avenir are actually delivered and by building a digital tool that traces students' pathways.
        • Strengthen the technological dimension of the general and technological seconde, by presenting the different pathways to troisième students and by reworking the SNT curriculum to develop its technological dimension.
      • Axis 4: Reveal and affirm students' citizenship

        • Draw up, at the local level, an information document on opportunities for engagement inside and outside the lycée, relying on the local structures for educational continuity.
        • Enable every student to get involved during the lycée years, by providing places and times for students to share their experiences and by raising their awareness of the purposes of engagement.
        • Recognize the skills acquired through civic engagement, by giving real effect to the parcours citoyen and by recording the acquired skills in the school report and the school record booklet.
    1. Reviewer #1 (Public review):

      Dixit, Noe, and Weikl apply coarse-grained and all-atom molecular dynamics to determine the response of the mechanosensitive proteins Piezo 1 and Piezo 2 proteins to tension. Cryo-EM structures in micelles show a high curvature of the protein whereas structures in lipid bilayers show lower curvature. Is the zero-stress state of the protein closer to the micelle structure or the bilayer structure? Moreover, while the tension sensitivity of channel function can be inferred from the experiment, molecular details are not clearly available. How much does the protein's height and effective area change in response to tension? With these in hand, a quantitative model of its function follows that can be related to the properties of the membrane and the effect of external forces.

      Simulations indicate that in a bilayer the protein relaxes from the highly curved cryo-EM dome (Figure 1).

      Under applied tension, the dome flattens (Figure 2) including the underlying lipid bilayer. The shape of the system is a combination of the membrane mechanical and protein conformational energies (Equation 1). The membrane's mechanical energy is well-characterized. It requires only the curvature and bending modulus as inputs. They determine membrane curvature and the local area metric (Equation 4) by averaging the height on a grid and computing second derivatives (Equations 7, 8) consistent with known differential geometric formulas.

      The bending energy can be limited to the nano dome but this implies that the noise in the membrane energy is significant. Where there is noise outside the dome there is noise inside the dome. At the least, they could characterize the noisy energy due to inadequate averaging of membrane shape.

      My concern for this paper is that they are significantly overestimating the membrane deformation energy based on their numerical scheme, which in turn leads to a much stiffer model of the protein itself. Two things would address this:

      (1) Report the membrane energy under different graining schemes (e.g., report schemes up to double the discretization grain).

      (2) For a Gaussian bump with sigma=6 nm I obtained a bending energy of 0.6 kappa, so certainly in the ballpark with what they are reporting but significantly lower (compared to 2 kappa, Figure 5 lower left). It would be simpler to use the Gaussian approximation to their curves in Figure 3 - and I would argue more accurate, especially since they have not reported the variation of the membrane energy with respect to the discretization size and so I cannot judge the dependence of the energy on discretization. I view reporting the variation of the membrane energy with respect to discretization as being essential for the analysis if their goal is to provide a quantitative estimate for the force of Piezo. The Helfrich energy computed from an analytical model with a membrane shape closely resembling the simulated shapes would be very helpful. According to my intuition, finite-difference estimates of curvatures will tend to be overestimates of the true membrane deformation energy because white noise tends to lead to high curvature at short-length scales, which is strongly penalized by the bending energy.
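      The analytical reference suggested in point (2) can be sketched as follows. This is a worked estimate under assumptions the review does not state: the small-gradient (Monge) form of the Helfrich bending energy, a radially symmetric Gaussian bump, and a bump height h_0 introduced here purely as an illustrative symbol.

```latex
% Assumptions: small-gradient (Monge) approximation, zero spontaneous curvature;
% h_0 is a hypothetical bump height, \sigma the Gaussian width, \kappa the bending modulus.
\[
  h(r) = h_0\, e^{-r^2/(2\sigma^2)}, \qquad
  \nabla^2 h = h'' + \frac{h'}{r}
             = \left(\frac{r^2}{\sigma^4} - \frac{2}{\sigma^2}\right) h(r)
\]
\[
  E_{\mathrm{bend}} \approx \frac{\kappa}{2}\int (\nabla^2 h)^2 \, dA
  = \frac{\kappa}{2}\,\frac{\pi h_0^2}{\sigma^2}\int_0^\infty (u-2)^2 e^{-u}\, du
  = \frac{\pi \kappa\, h_0^2}{\sigma^2},
  \qquad u = \frac{r^2}{\sigma^2}
\]
```

      In this approximation the energy depends only on the ratio h_0/σ, so a value of 0.6 κ at σ = 6 nm would correspond to h_0 ≈ 2.6 nm. The review does not give the height actually used, so this serves only as a consistency check against which a finite-difference estimate at several grid spacings could be compared.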

      The fitting of the system deformation to the inverse time appears to be incredibly ad hoc ... Nor is it clear that the quantified model will be substantially changed without extrapolation. The authors should either justify the extrapolation more clearly (sorry if I missed it!) or also report the unextrapolated numbers alongside the extrapolated ones.

      In summary, this paper uses molecular dynamics simulations to quantify the force of the Piezo 1 and Piezo 2 proteins on a lipid bilayer using simulations under controlled tension, observing the membrane deformation, and using that data to infer protein mechanics. While much of the physical mechanism was previously known, the study itself is a valuable quantification. I identified one issue in the membrane deformation energy analysis that has large quantitative repercussions for the extracted model.

    1. unsigned long newsize = roundup_pow_of_two(size); if (newsize <= max / 32) newsize = newsize * 4; else if (newsize <= max / 4) newsize = newsize * 2; else newsize = max; return newsize;

      These are all heuristics to determine the initial readahead size. They trade potential memory overuse for faster access. The code could be changed so that it does not rely on these hardcoded values.
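      The heuristic quoted above can be exercised in user space with a small sketch. The helper `roundup_pow_of_two_ul` is a stand-in written here for illustration (the kernel provides its own `roundup_pow_of_two`), and the function name `init_ra_size` is hypothetical; `size` and `max` are presumably page counts.

```c
/* Smallest power of two >= n, for n >= 1. Illustrative user-space stand-in
 * for the kernel's roundup_pow_of_two(); assumes n is small enough that the
 * shift does not overflow. */
static unsigned long roundup_pow_of_two_ul(unsigned long n)
{
    unsigned long p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Initial readahead window for a request of `size`, capped at `max`.
 * The thresholds max/32 and max/4 and the factors 4 and 2 are the
 * hardcoded values the annotation refers to. */
unsigned long init_ra_size(unsigned long size, unsigned long max)
{
    unsigned long newsize = roundup_pow_of_two_ul(size);

    if (newsize <= max / 32)
        newsize = newsize * 4;   /* small request: quadruple the window */
    else if (newsize <= max / 4)
        newsize = newsize * 2;   /* medium request: double the window */
    else
        newsize = max;           /* large request: use the maximum */

    return newsize;
}
```

      With max = 128, requests of 3 and 5 both yield an initial window of 16, and a request of 40 is raised straight to 128, which shows how readily the heuristic reads more than was asked for.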

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      The study presents important findings on inositol-requiring enzyme (IRE1α) inhibition on diet-induced obesity (overnutrition) and insulin resistance where IRE1α inhibition enhances thermogenesis and reduces the metabolically active and M1-like macrophages in adipose tissue. The evidence supporting the conclusions is convincing but can be enhanced with information/data on the validity, specificity, selectivity, and toxicity of the IRE1α inhibitor and supported with more detail on the mechanisms by which adipose tissue macrophages influence adipocyte metabolism. The work will be of interest to cell biologists and biochemists working in metabolism, insulin resistance, and inflammation.

      We thank the editors for the assessment and appreciation of our findings in this study. In the revision, we have added information on the validity, selectivity and toxicity of the IRE1α inhibitor. In addition, we discuss the likely effect of suppressing the metabolically activated proinflammatory macrophage population in adipose tissue on the reversal of adipose remodeling and on thermogenesis. We have also improved the text and figures significantly throughout the manuscript, following the reviewers' recommendations.

      Public Reviews:

      Reviewer #1 (Public review):

      First, the authors confirm the up-regulation of the main genes involved in the three branches of the Unfolded Protein Response (UPR) system in diet-induced obese mice in AT, observations that have been extensively reported before. Not surprisingly, IRE1a inhibition with STF led to an amelioration of the obesity and insulin resistance of the animals. Moreover, non-alcoholic fatty liver disease was also improved by the treatment. More novel are their results in terms of thermogenesis and energy expenditure, where IRE1a seems to act via activation of brown AT. Finally, mice treated with STF exhibited significantly fewer metabolically active and M1-like macrophages in the AT compared to those under vehicle conditions. Overall, the authors conclude that targeting IRE1a has therapeutical potential for treating obesity and insulin resistance.

      The study has some strengths, such as the detailed characterization of the effect of STF in different fat depots and a thorough analysis of macrophage populations. However, the lack of novelty in the findings somewhat limits the study's impact on the field.

      We thank the reviewer for the appreciation of our findings and the comments about the novelty. We would emphasize several novel findings presented in this manuscript. First, as the reviewer correctly pointed out, we discovered that IRE1 inhibition by STF activates brown AT and promotes thermogenesis, and we show for the first time that IRE1 inhibition not only significantly attenuates the newly discovered CD9+ ATMs and the “M1-like” CD11c+ ATMs but also diminishes the M2 ATMs. These discoveries are very important and novel. In obesity, it was originally proposed that ATMs undergo M1/M2 polarization from an anti-inflammatory M2 to a classical pro-inflammatory M1 state. It was further reported that IRE1 deletion improves thermogenesis by boosting the M2 population, which then synthesizes and secretes catecholamines to promote thermogenesis. It is now known that M2 macrophages do not synthesize catecholamines or promote thermogenesis. In this study, we discovered that IRE1 inhibition does not increase (but instead decreases) the M2 population and that IRE1 inhibition promotes thermogenesis, likely by suppressing pro-inflammatory macrophage populations including the M1-like ATMs and, most importantly, the newly identified metabolically active macrophages, given that ATM inflammation has been reported to suppress thermogenesis. Second, this study presents the first characterization of the relationship between the more classical M1-like ATMs and the newly discovered metabolically active ATMs, showing that the CD11c+ M1-like ATMs largely overlap with, yet are not identical to, the CD9+ ATMs in the eWAT under HFD. Third, although upregulation of ER stress response genes in the adipose tissues of diet-induced obese mice has been extensively reported, this does not necessarily mean that targeting IRE1a or ER stress can reverse existing insulin resistance and obesity. It is not uncommon for a therapy to fail to yield the expected effect. For instance, amyloid plaques are a hallmark of Alzheimer's disease (AD), and interventions that prevent or reverse beta amyloid deposition have been expected to prevent progression or even reverse cognitive impairment in AD patients. However, clinical trials on such therapies have been disappointing. In essence, experimental demonstration of effectiveness or feasibility for any potential therapeutic target is a first step toward any future clinical implementation.

      Reviewer #2 (Public review):

      The manuscript by Wu et al demonstrated that IRE1a inhibition mitigated insulin resistance and other comorbidities through increased energy expenditure in DIO mice. In this reviewer's opinion, this timely study has high significance in the field of metabolism research for the following reasons.

      (1) The authors' findings are significant and may offer a new therapeutic target to treat metabolic diseases, including diabetes, obesity, NAFLD, etc.

      (2) The authors carefully profiled the ATMs and examined the changes in gene expression after STF treatment.

      (3) The authors presented evidence collected from both systemic indirect calorimetry and individual tissue gene expression to support the notion of increased energy expenditure.

      Overall, the authors have presented sufficient background in a clear and logically organized structure, clearly stated the key question to be addressed, used the appropriate methodology, produced significant and innovative main findings, and made a justified conclusion.

      We thank the reviewer for the appreciation of our work.

      Reviewer #3 (Public review):

      Summary:

      The manuscript by Wu D. et al. explores an innovative approach to immunometabolism and obesity by investigating the potential of targeting macrophage Inositol-requiring enzyme 1α (IRE1α) in cases of overnutrition. Their findings suggest that pharmacological inhibition of IRE1α could influence key aspects such as adipose tissue inflammation, insulin resistance, and thermogenesis. Notable discoveries include the identification of High-Fat Diet (HFD)-induced CD9+ Trem2+ macrophages and the reversal of metabolically active macrophages' activity with IRE1α inhibition using STF. These insights could significantly impact future obesity treatments.

      Strengths:

      The study's key strengths lie in its identification of specific macrophage subsets and the demonstration that inhibiting IRE1α can reverse the activity of these macrophages. This provides a potential new avenue for developing obesity treatments and contributes valuable knowledge to the field.

      Weaknesses:

      The research lacks an in-depth exploration of the broader metabolic mechanisms involved in controlling diet-induced obesity (DIO). Addressing this gap would strengthen the understanding of how targeting IRE1α might fit into the larger metabolic landscape.

      Impact and Utility:

      The findings have the potential to advance the field of obesity treatment by offering a novel target for intervention. However, further research is needed to fully elucidate the metabolic pathways involved and to confirm the long-term efficacy and safety of this approach. The methods and data presented are useful, but additional context and exploration are required for broader application and understanding.

      We thank the reviewer for the appreciation of strengths in our manuscript. In particular, we appreciate the reviewer’s recommendation on the exploration of broader metabolic landscape, such as the effect of IRE1 inhibition on non-adipose tissue macrophages and metabolism. We agree that achieving these will certainly broaden the therapeutic potential of IRE1 inhibition to larger metabolic disorders and we will pursue these explorations in future studies.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      A list of recommendations for the authors is presented below:

      (1) Please, update the literature review to include more recent studies relevant to the topic.

      We thank the reviewer for the suggestions. We have added more references from recent studies.

      (2) Please, provide a detailed explanation of how STF functions, including potential off-target effects or issues related to specificity.

      We thank the reviewer for the suggestions. STF is a small-molecule inhibitor designed to selectively inhibit the RNase activity of IRE1a. Once IRE1a is activated (e.g., in obesity), its RNase domain initiates the unconventional splicing of the transcription factor X-box binding protein 1 (XBP1) mRNA and the Regulated IRE1-Dependent Decay (RIDD) of microRNAs, which is detrimental if prolonged. IRE1a RNase inhibitors including STF engage the RNase-active site of IRE1a with high affinity and specificity by exploiting a shallow complementary pocket through pi-stacking interactions with His910 and Phe889 and an essential Schiff base interaction between the aldehyde moiety of the inhibitor and the side chain amino group of Lys907 (Sanches et al., NComm 2014, PMID: 25164867). This specific, high-affinity binding blocks the IRE1a RNase activity, preventing the splicing of XBP1 mRNA and RIDD. As IRE1a has been shown to be activated in multiple tissues under various pathological conditions and to be responsible for their progression, inhibition of IRE1a by pharmacological agents including STF has great potential for the treatment of various pathological disorders. Several studies have reported that STF shows no overt toxicity when administered systemically (Madhavan et al., 2022, PMID 35105890; Herlea-Pana et al., 2021, PMID 34675883; Papandreou et al., 2011, PMID 21081713; Tufanli et al., 2017, PMID 28137856).

      (3) Lines 263-266 require a reference.

      We thank the reviewer for the suggestion. A reference has been added.

      (4) Stromal vascular fraction (SVF) also contains a significant amount of preadipocytes and stem cells, not only macrophages, which might affect the conclusions reached by the authors.

      We thank the reviewer for the comments. It is true that SVF consists of multiple cell types, including endothelial cells, macrophages, preadipocytes, and various stem cell populations. In HFD-induced obesity, adipose tissue undergoes significant remodeling, and the percentage of macrophages in the SVF of obese adipose tissue increases significantly relative to other cell types. In our studies, SVFs from adipose tissues of obese mice were isolated, cultured, and treated with STF overnight. We observed that IRE1 RNase activity in SVFs was inhibited by STF treatment, and that the ATM population and the expression of pro-inflammatory genes were downregulated by STF. Given the short-term treatment, the parsimonious interpretation of the data would be that STF acts directly on ATMs. However, we note that the possibility that effects of STF on other cell types influence the ATM population and inflammatory gene expression cannot be totally ruled out. As such, we have modified our conclusion from “these results indicate that STF acts directly on ATMs to regulate inflammation” to “these results indicate that STF likely acts directly on ATMs to regulate inflammation”.

      (5) Figures 1A and G: It is common practice to present the XBP1s/XBP1u ratio; consider using this standard measure.

      We thank the reviewer for the comments. Regarding XBP1 mRNA splicing, both ways of presentation appear in publications. Quite a number of papers (for instance, PMID 25018104, Cell 2014; PMID 23086298, NCB 2012) used the XBP1s/(XBP1s+XBP1u) ratio. We preferred this presentation because it shows the proportion of spliced XBP1 (XBP1s) relative to the total XBP1 mRNA (XBP1s+XBP1u).
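      The two presentations discussed in this exchange are related by a simple monotonic transformation, sketched below. The symbols s, u, r, and f are introduced here for illustration only and do not appear in the manuscript.

```latex
% s = XBP1s level, u = XBP1u level (illustrative symbols)
\[
  r = \frac{s}{u}, \qquad
  f = \frac{s}{s+u} = \frac{r}{1+r}, \qquad
  r = \frac{f}{1-f}
\]
```

      Because f increases strictly with r, the ordering of conditions is the same under either measure; the fraction f is bounded between 0 and 1, whereas the ratio r is unbounded and becomes unstable when the unspliced level u is close to zero.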

      (6) Figure 1F: please indicate the type of AKT phosphorylation assessed.

      We thank the reviewer for the comments. We have added Ser473 as the phosphorylation site in both the figure legend and the figure.

      (7) Figures 2E-H: please clearly indicate the specific fat depots analyzed in each figure.

      We thank the reviewer for the comments. We have added the information in the figure legends and figures.

      (8) Figures 1I and 3A, and Supplementary Figures 6D-E: please include a quantification analysis of the images presented.

      We thank the reviewer for the suggestion. We have added the quantifications of the images.

      (9) In Figure 3D the image corresponding to the merge for the STF condition is a duplication of the control, please correct this.

      We thank the reviewer for pointing this out. We have replaced it with the correct image.

      (10) Figures 4B-F: please provide individual data points in the graphs to show variability and sample distribution.

      We thank the reviewer for the suggestion. We have re-plotted the graphs in Fig. 4B-F with the individual data points.

      (11) Figure 4I: it is rather unusual to have such a strong signal of UCP1 in ND conditions, please explain.

      We thank the reviewer for the comment. We wish to point out that the images were taken from BAT slides. UCP1 is expected to show strong staining in BAT under the ND condition, and this staining is, as expected, weakened under the HFD condition. STF treatment was able to correct the HFD-induced weakening of UCP1 staining in BAT.

      (12) Supplementary Figures 2C-D: please provide representative images for better clarity and interpretation.

      We thank the reviewer for the comment. The representative images for Supplementary Figures 2C-D are shown in Figures 2C and F. Supplementary Figures 2C-D merely show the quantification of adipocyte areas for Figures 2C and F.

      (13) Supplementary Table 3 is repeated, please remove.

      We thank the reviewer for the comment. We have deleted this repetition.

      Reviewer #2 (Recommendations for the authors):

      The manuscript can be further strengthened with more clarification on the following points.

      (1) The use of IRE1a pharmacological inhibitor STF-083010 (STF) needs to be validated. How was the dose determined? Were there any dose-dependent studies? Under the current dosing regimen, what are the specificity, selectivity, and toxicity of STF? Also, were the serine/threonine kinase and RNase activities measured in the adipocytes and ATMs of the animals dosed with the compound? What's the PK data?

      We thank the reviewer for the comments. In the animal study, we administered STF at 10 mg/kg by intraperitoneal injection. This dose was adopted from several recent studies (Madhavan et al., 2022, PMID 35105890; Herlea-Pana et al., 2021, PMID 34675883; Papandreou et al., 2011, PMID 21081713; Tufanli et al., 2017, PMID 28137856), in which STF treatment showed beneficial effects in the respective disease models. STF did not compromise cell viability or induce any other toxicity at the dose or concentration used in these studies (Papandreou I, et al., 2011; Upton JP, et al., 2012; Lerner AG, et al., 2012; Kemp KL, et al., 2013; Cross BC, et al., 2012). In our study, we did not observe any apparent toxicity in mice at this dose. Importantly, we did observe that STF inhibited IRE1 RNase activity in adipose tissues (F1G, S1D) and ATMs (F6Q, S8C, G, I) of the animals at this dose. As the IRE1 inhibitors including STF have been extensively examined and shown to have no effect on the kinase function of IRE1 (Cross et al., 2012, PMID: 22315414; Tufanli et al., 2017, PMID 28137856), we did not assay IRE1 kinase activity. Additionally, as the compound has been administered in several animal models with significant beneficial effects, one would assume that adequate pharmacokinetic parameters are achieved with the current dose. Systematic PK studies will be important and necessary in the future if clinical trials are to be considered.

      (2) The statistical method for individual panels in each figure needs to be specified.

      We thank the reviewer for the suggestion. We have specified the statistical method in the figure legends.

      (3) In Figure 1E, there's no difference in fasting insulin levels, though a difference was detected after the glucose load. This suggests an effect on insulin secretion but not insulin sensitivity.

      We thank the reviewer for the comments. The fasting insulin levels do differ between the Veh and STF groups, although the difference does not reach statistical significance. Under glucose stimulation, the insulin levels all showed the same trend: the STF group is lower than the Veh group. Even if insulin levels showed no difference between the two groups regardless of glucose stimulation, the fact that the blood glucose levels at all time points are lower in the STF group than in the Veh group (Fig. 1C) would indicate that insulin sensitivity is improved. In our study, insulin levels were lower in the STF group and blood glucose levels were nonetheless lowered by STF, further strengthening the notion that STF treatment improves insulin sensitivity. This is further corroborated by the ITT results (Fig. 1D).

      (4) Figure 2 and S2A did not show a decrease in BW but rather BW gain. The statement (line 308) needs to be edited. As a result of this, the relative fat mass measurement (% of BW) needs to be presented in addition to Figure 2B.

      We thank the reviewer for the comments/suggestions. As shown in Figs. 2A and S2A, we observed a slight decrease in body weight (~2 g reduction) in STF-treated mice, while the Veh group gained ~3.5 g, at the end of 4 weeks of treatment. As shown in Fig. 2B, this difference in body weight between the Veh and STF groups was primarily due to a reduction in fat tissue. In the revision, we also added the percentages of fat and lean mass relative to total body weight in Supplemental Fig. 2B, which show a similar trend.

      (5) The measurement of blood lipid levels in Figure 3F-H is informative. More importantly, hepatic lipid content needs to be measured.

      We thank the reviewer for the comments and agree with them. As this study is focused on insulin resistance and adipose tissue remodeling, we did not go deep into the comorbidities beyond the reported observations. It will be interesting to explore the effects of IRE1 inhibition on the comorbidities of obesity and insulin resistance, including measurement of hepatic lipid content, in a future study.

      Minor corrections:

      (1) Line 261: "(spliced".

      Done. We have corrected it.

      (2) Line 334: spell out "PEPCK".

      We have added the full name “Phosphoenolpyruvate carboxykinase”. Thanks!

      (3) Line 478: please rephrase.

      We thank the reviewer for the comment. We have rephrased the sentence as follows: “These results reveal that STF treatment suppresses the adipose tissue inflammation and the accumulation of pro-inflammatory ATM without augmenting (suppressing instead) M2-like ATMs.”

      (4) Figure 4L: "pGC1-a".

      We thank the reviewer for pointing this out. We have corrected the name.

      (5) Figure 4O: missing Y-axis label.

      We have added the label. Thanks!

      Reviewer #3 (Recommendations for the authors):

      The observations presented by Wu D. et al. in the manuscript are potentially interesting and relevant. The current study seeks to build upon previous findings, specifically from the work titled, "Silencing IRE1α using myeloid-specific cre suppresses alternative activation of macrophages and impairs energy expenditure in obesity." By using a pharmacological inhibitor to modulate IRE1α activity in adipose tissue macrophages (ATMs), the authors aim to develop therapeutics that could significantly impact the treatment of obesity and metabolic disease.

      The authors have performed some satisfactory experiments related to liver steatosis. However, the manuscript would benefit from a more comprehensive exploration of the mechanisms by which ATMs influence adipocyte metabolism, particularly in epididymal white adipose tissue (eWAT). In particular, the study should investigate how adiposity and lipid droplet size change in response to alterations in lipolysis and adipogenesis, as this could provide insights into how these processes contribute to the amelioration of the obesity phenotype.

      Several issues should be addressed to strengthen the manuscript and make the study more convincing. Below are specific comments and recommendations:

      Major:

      (1) The indirect calorimetric data should be normalized for dependent variables such as body weight, lean mass, and fat mass+ lean mass to accurately interpret the results. The results for 24-hour energy expenditure should be included in Figure 4B-F to provide a more comprehensive analysis. It is recommended to plot bar graphs with all individual data points for the energy expenditure (EE) results shown in Figure 4B-F, to offer a clearer and more detailed presentation of the data (Figure 4B-F).

      We thank the reviewer for the comments. Data analysis for indirect calorimetry studies has evolved over the years. One common practice was, and still is, to normalize the data by body weight. However, this approach was deemed improper some years ago (Tschop et al., Nature Methods 2012, PMID: 22205519). The Tschop paper also pointed out the shortcomings associated with normalization by lean mass. Instead, it concludes that “generalized linear model is the most appropriate statistical approach to accommodate discrete (genotype) and continuous (body mass) traits, rather than using a simple division by BW or lean BW”. In our study, we used CalR, an improved generalized linear model (which includes ANOVA and ANCOVA) (Mina et al., Cell Metabolism 2018, PMID: 30017358), for all our energy expenditure data analysis (shown in Fig. 4A-E). In the revision, we also included data analysis normalized by BW (Fig. S2F-H’), which shows an even wider difference between the Veh and STF groups than the data shown in Fig. 4A-F. As STF decreased the fat mass and had little effect on lean mass, the difference would be more drastic with normalization by fat mass or by fat mass + lean mass than in the data shown in Fig. 4A-E, and would be similar to the data shown in Fig. 4A-E with normalization by lean mass. In addition, we replotted the graphs in Fig. 4B, D, F-H with the individual data points.
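      The contrast drawn in this response, between dividing by body weight and fitting a linear model with body mass as a covariate, can be written out in a minimal form. The symbols below are illustrative assumptions and are not the CalR specification or the authors' exact model.

```latex
% EE_i: energy expenditure of animal i; G_i = 0 (Veh) or 1 (STF); M_i: body mass.
% All symbols are illustrative.
\[
  \text{ANCOVA:}\quad EE_i = \beta_0 + \beta_G\, G_i + \beta_M\, M_i + \varepsilon_i
\]
\[
  \text{Ratio normalization:}\quad \frac{EE_i}{M_i} = \alpha_0 + \alpha_G\, G_i + \eta_i
  \;\Longleftrightarrow\;
  EE_i = (\alpha_0 + \alpha_G\, G_i)\, M_i + \eta_i M_i
\]
```

      In the ANCOVA form the treatment effect is the mass-adjusted coefficient β_G. The ratio form forces the relation between energy expenditure and mass through the origin, so any non-zero intercept β_0 is attributed to the group term whenever the groups differ in mass, which is one way to read the shortcoming that the cited paper describes.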

      (2) At the thermoneutral point (30 °C), the study could benefit from testing the indirect calorimetric models of human energy physiology. Future studies could also explore this to evaluate the implications for drug development.

      We agree with the reviewer's comments. In future studies, it will be very informative to investigate the effects of STF under thermoneutral conditions, which could provide more consistent data on how drugs affect metabolic processes in humans and thereby improve translational research.

      (3) The current study missed the opportunity to investigate the effects of STF on non-adipose tissue (non-AT) resident macrophage populations, such as those in bone marrow or lymph-node macrophages. Understanding how STF modulates macrophage metabolism in these contexts would be valuable.

      We thank the reviewer for the comments and agree with them. As this study focuses on insulin resistance and adipose tissue remodeling, we were mostly restricted to adipose tissue macrophage populations. In the future, it would be interesting to investigate the effect of STF on macrophages in non-adipose tissues. This would provide a more comprehensive understanding of STF's effects on immune cell metabolism and could inform its application in various therapeutic areas.

      (4) The study should explore how STF influences the expression of CD9, Trem2, (positive subpopulations), and the secretion of pro-inflammatory cytokines by macrophages, particularly in response to LPS and IFNγ activation in stromal vascular fraction (SVF) cells and bone marrow-derived macrophages (BM-Macrophages).

      We appreciate the reviewer for the comments. Under obesity, the ATM does not undergo the classical M1/M2 polarization; instead, both M1-like/pro-inflammatory macrophages and M2 macrophages increase drastically in obesity. It will be interesting to investigate the effects of STF on the newly identified CD9- and Trem2-positive macrophage subpopulations in SVF and bone marrow macrophages in response to LPS and IFNγ stimulation in the future, although these studies might not faithfully reflect the changes in adipose tissue under obesity as these stressors typically induce classical M1/M2 polarization.

      (5) Additional macrophage gating is necessary to better understand adipose tissue macrophage (ATM) inflammation. Specifically, CD11c−MHC2 low macrophages represent a newly identified inflammatory and dynamic subset in murine adipose tissue. These ATMs accumulate rapidly after ten days of a high-fat diet (HFD) and should increase further with prolonged HFD. For this study, CD11c−MHC2 low ATMs could be subdivided for flow cytometry analysis based on their MHC2 expression, distinguishing them from CD11c−MHC2 high ATMs. All macrophage subtypes categorized here can be studied for metabolic health using Seahorse analysis as well.

      We appreciate the reviewer for the comments. It will be interesting to investigate the effects of STF on the newly identified CD11c−MHC2 low macrophage subpopulation in the future. Future studies certainly can include metabolic analysis with Seahorse which can corroborate the energy metabolism at the cellular level with organismal thermogenesis. 

      (6) All flow cytometry histograms - are they showing mean fluorescence intensity or cell# per population? Please specify. All flow cytometry dot plots - It would be helpful for readers to see populations plotted as bar graphs next to respective flow plots, as opposed to being shown as supplemental tables. Additionally, labeling dot plots with the parent population from which cells were gated on would also help readers understand faster what we're looking at.

      We appreciate the reviewer for the comments. In flow cytometry histograms, we used “normalized to mode”. The mode is often used to compare the distribution of fluorescence intensity between different samples. It focuses on the shape of the distribution (with a max of 100%) rather than the absolute cell counts, which helps remove variations caused by different cell numbers or sample sizes, making it easier to compare populations based on fluorescence intensity. When normalizing to the mode, the highest peak in the histogram is scaled to 100%, and all other values are scaled relative to that peak. This allows for easy comparison of multiple histograms, even if the total number of cells (or events) differs between samples.
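
      The scaling described above can be sketched in a few lines. This is a minimal illustration with hypothetical bin counts, not the implementation used by any particular flow cytometry software: each bin is divided by the tallest bin and expressed as a percentage.

```python
def normalize_to_mode(counts):
    """Scale histogram bin counts so that the tallest bin equals 100%."""
    peak = max(counts)
    if peak == 0:
        # An empty histogram has no mode to scale against.
        return [0.0 for _ in counts]
    return [100.0 * c / peak for c in counts]

# Hypothetical samples with the same distribution shape but a tenfold
# difference in the number of acquired events.
small_sample = [5, 20, 10]
large_sample = [50, 200, 100]

print(normalize_to_mode(small_sample))  # [25.0, 100.0, 50.0]
print(normalize_to_mode(large_sample))  # [25.0, 100.0, 50.0]
```

      Because both samples yield the same scaled profile, the histograms can be overlaid and compared by shape, while information on absolute cell counts is deliberately discarded.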

      (7) The results appear to confuse the actual sample size and p-value. Please carefully review the statistical analyses to ensure that biological replicates are accurately represented. Additionally, include p-values alongside fold change data in the text for clarity.

      We appreciate the reviewer for the comments. We have rechecked the statistical analyses confirming that the biological replicates are now properly represented. The exact number of biological replicates for each experiment is now clearly specified in both the methods section and figure legends.

      (8) To further validate the findings, consider using Seahorse analysis at the cellular level in future experiments. This could confirm indirect calorimetric data and thermogenesis responses to cold stimulation.

      We appreciate the reviewer for the comments. Yes, Seahorse analysis at the cellular level will be conducted in future experiments.

      (9) Please ensure the use of person-first language, avoiding labels or adjectives that define individuals based on a condition or characteristic.

      We appreciate the reviewer for the comments. We have changed the descriptions by using person-first language.

      (10) The manuscript does not demonstrate how STF inhibition of IRE1α in ATM, specifically through CD9 and Trem2, controls diet-induced obesity. This aspect should be further elucidated.

      We appreciate the reviewer for the comment. In this study, we observed that STF inhibits IRE1α RNase activity in SVF and in sorted ATMs as well as in adipose tissue. The improvement in diet-induced obesity can be attributable to IRE1α inhibition in both adipocytes and macrophages as shown previously by myeloid and adipocyte-specific knockouts of IRE1α. To conclude whether the IRE1α in CD9- and/or Trem2-positive ATMs controls diet-induced obesity, genetic means would be needed to generate CD9- and/or Trem2-positive ATMs-specific deletion of IRE1α, which will be technically challenging at this moment as there is no CD9 or Trem2-specific Cre lines available.

      Minor:

      (1) Line 43-44: Update terminology to "MASLD" instead of "NAFLD."

      We thank the reviewer for pointing these out. We have changed the terminology in the revision.

      (2) Line 58-59: Add a reference for the mentioned text.

      We thank the reviewer for the comment. Added a reference in the text in the revision.

      (3) Was the antibody used to detect CD9 and Trem2 validated for FACS and other analyses?

      We thank the reviewer for the comment. In our studies, we determined CD9 and Trem2 expression through flow cytometry and immunostaining. For flow cytometry, we used PE/Dazzle™ 594 anti-mouse CD9 (BioLegend Cat# 124821, RRID:AB_2800601) and APC-conjugated Trem2 (R&D Systems Cat# FAB17291N, RRID:AB_3646995), which were validated for FACS. For immunostaining, we used CD9 (Abcam Cat# ab223052, RRID:AB_2922392) and Trem2 (R&D Systems Cat# MAB17291, RRID:AB_2208679).

      (4) Studies were limited to male mice; this should be noted in the title and discussed as a limitation.

      We thank the reviewer for the comment. We have modified the wording in the revision.

      (5) Ensure all reagents are fully described with preparation details and identifiable numbers for reproducibility and/or submit the FACS protocol to any protocol archives.

      We thank the reviewer for the suggestions. Yes, we have modified the wording in the revision.

      (6) Provide the correct version numbers for all software used (FlowJo, Prism, etc.).

      We thank the reviewer for the suggestions. We have provided the correct version numbers for the software used (FlowJo and Prism).

      (7) Specify section size (µm) and blocking agent used for eWAT immunofluorescence (Line 207).

      We thank the reviewer for the suggestions. We have added this information.

      (8) Add gene accession numbers to Supplementary Table 3.

      We thank the reviewer for the suggestions. We have added this information.

      (9) Figure 2: Clarify HFD and treatment timelines with a schematic diagram.

      We thank the reviewer for the suggestions. We have added a schematic diagram in Supplemental Figure 1C.

      (10) For histology analysis, the minimum combined data from triplicate images is shown in Figure 2C-2H. For Figures 2E and H, provide complete methods for histology analysis.

      We thank the reviewer for the comments. For the histology analysis shown in Figures 2C–2H, we used a minimum of three mice per treatment group. For each mouse, 3–5 images were taken for analysis. All histology analyses were conducted using ImageJ for image quantification, and the data were processed and organized using Excel and Graphpad.

      (11) Figure 3D Macrophage markers F4/80 stained differently in Figure 5B; to avoid false positive staining, show isotype control to confirm actual staining. For eWAT immunofluorescence (Figures 3D, 5B, 6E)., counterstaining is needed in addition to macrophages, such as for adipocytes-perilipin, and phalloidin for total cells.

      We thank the reviewer for the comments. Yes, the macrophage marker F4/80 stains differently in Figure 3D than in Figure 5B because the two figures show different tissues: Figure 3D shows liver samples, while Figure 5B shows adipose tissue. In the liver, subsets of macrophages are known as Kupffer cells. Kupffer cells have distinct morphology and behavior compared to other tissue-resident macrophages. When stained with F4/80 in the liver, the pattern may reflect the specialized role of Kupffer cells, typically showing a more diffuse or localized staining around blood vessels and sinusoids. In adipose tissue, macrophages tend to accumulate around dead or dying adipocytes, forming what is known as "crown-like structures" (CLS). The F4/80 staining in adipose tissue shows a more clustered pattern, particularly around areas of fat tissue undergoing remodeling or inflammation. In adipose tissue, clear, defined cells can still be seen even without counterstaining such as perilipin, and, importantly, adipocytes are generally much larger than macrophages. We agree that counterstaining would enhance the accuracy. In future studies, we will use perilipin staining to make it easier to differentiate adipocytes from other structures and provide stronger data.

      (12) Insert scale bars in the original images for Figures 3D, 4I, 4M, 5B, 6E, S3B, S6D-E, and S7A-B. All images added a scale bar not inserted while acquiring the image or using imaging software.

      We thank the reviewer for the suggestions. The scale bars embedded in the images during acquisition are not of sufficient resolution to be clearly visible unless the images are enlarged. In the revision, we have manually added the scale bars for clarity.

      (13) Figure 5E: Please label X-axis as F4/80.

      We thank the reviewer for pointing this out. The label has been added in the revision.

      (14) Figure 5F: It is specified in the legend that cells were gated on F4/80+CD11b+CD11c+, but there is a CD11c- population shown in the histogram...How is this population appearing if all cells should be CD11c+?

      We thank the reviewer for pointing this out. We gated against CD11c in F4/80+CD11b+ population. As such, we have corrected the description in the legend.

      (15) Figure 5G: What is the F4/80+CD11b+CD11c-CD206- population gated in quadrants?

      We thank the reviewer for the comment. The F4/80+CD11b+CD11c-CD206- population was shown in Figure 5G on the lower left side, with the percentages being 15.7% for ND, 5.54% for Veh-HFD, and 26% for STF-HFD.

      (16) Figure 6J: Flow cytometry gates seem slightly misplaced and the sample appears to be overcompensated - were FMOs included in this experiment to establish proper gates? If so, please include.

      We thank the reviewer for the comment. In the study, we did include Fluorescence Minus One (FMO) control in the experiment to establish proper gating. We have included this information in the methods section.

      (17) Table 1-3: Indicate the number of replicates (n=) used in all tables.

      We thank the reviewer for the suggestion. We have provided the specific number of mice used in the study within the figure legends.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      The analysis of the dormancy rates is interesting and offers some intriguing questions related to the higher dormancy rate found for the L2 isolates and lower for the L3 ones. It will be interesting in the future to expand the data generated in this advanced in vitro platform to in vivo studies.

      Indeed, an increased dormancy propensity of L2 isolates was previously reported in broth culture and associated to specific genetic polymorphisms. The opposite phenotype observed in the L3 isolates is indeed particularly intriguing and was not described to date. Hence, we fully agree that it would be very interesting to find out whether these phenotypes are also observed in vivo.

      The authors propose that ‘strains exhibiting greater proliferative capacity are more prone to induce macrophage apoptosis, thereby contributing to the extent of the granulomatous response.’ It would be interesting to know what happens if the macrophage apoptotic response is blocked.

      This is an interesting suggestion that would deserve a dedicated comprehensive investigation covering other cell death pathways. Even though the trend is significant, the correlation coefficient is rather low in this interaction, which looks a fortiori due to substantial inter-host variability in the apoptotic propensity of macrophages from individual donors to a given strain. In addition, such blocking experiments may require performing isolated macrophage infections that would fall outside of the scope of this study, or considering the extent and the contribution of the apoptosis of other cell subsets. 

      In contrast to macrophage apoptosis, T cell activation correlated with less replicative bacteria. Are these two findings related, ie, are the granulomas showing more (apoptotic) macrophages the ones with a lower percentage of activated T cells? This would shed light on what distinguishes granulomas that are protective from those that support bacterial growth. 

      Indeed, a significant negative correlation between macrophage apoptosis induction and T cell activation can be observed, specifically with activated CD4 T cells expressing CD38 (rS = -0.36, p < 0.05) or CD69 (rS = -0.40, p < 0.01). We have added this additional result in the manuscript text (line 217).
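
      For readers less familiar with the rS statistic reported above, the following minimal sketch shows how a Spearman rank correlation is computed: both variables are converted to ranks (ties receive their average rank) and a Pearson correlation is taken on the ranks. The input values are hypothetical and are not the study's measurements.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

# Hypothetical percentages: apoptotic macrophages vs. activated CD4 T cells.
apoptosis = [12.0, 18.0, 25.0, 31.0]
activation = [8.0, 6.0, 4.0, 2.0]
print(spearman(apoptosis, activation))
```

      A perfectly monotonic decreasing relationship, as in this toy input, gives a coefficient of -1; values such as -0.36 or -0.40 indicate a weaker negative monotonic trend.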

      It would also be interesting to know the functional impact of blocking early CXCL9 or IL1b on the outcome of granulomatous response/bacteria growth.

      We have performed the suggested early blocking experiments and added the expected negative effect on granuloma formation upon neutralization of IL-1b (current Fig. 6E) in the revised version of the manuscript, and furthermore discussed the null effect on bacterial growth of the treatment with an anti-CXCL-9 specific antibody (current Fig. 6H).

      The authors acknowledge the absence of neutrophils in this model. However, this could be discussed in more detail, as neutrophils play an important part in TB pathogenesis as shown in different models of infection and human TB. 

      We concur and have expanded the importance of neutrophils in TB pathogenesis (including references) in the discussion section (line 260). 

      Related to neutrophils and TB pathogenesis, another important player is type I IFN. The multiplex assay used included IFN-alpha, was this molecule detected? If so, was there any difference in the levels of type I IFN detected among the different infections?

      We agree, and that is why we had originally included IFN-α in our screen. However, this cytokine remained under the limit of quantification at both studied time points, preventing us from drawing conclusions on the effect of Mtb strain diversity on the secretion of type I IFNs in in vitro granulomas.

      Reviewer #2:

      In Figure 1b/c, it is not clear what comparisons are being made to give the p-value annotations.

      In Figure 2a/b, it is not clear what comparisons are being made to give the p-value annotations.

      In Figure 3a, again it is not clear what comparisons are being made to give the p-value annotation.

      The p-values formerly present in the upper left corner of the panels resulted from either Friedman (Figures 1C, 2A and 3A) or Kruskal-Wallis (Figures 1B and 2B) tests and indicated whether there was a significant difference between the analyzed groups overall. To avoid confusion, those values have been removed to only leave the post-test comparison between specific groups.

      In the results narrative related to Figure 1 (lines 93-103), the authors refer to lineage heterogeneity without providing any objective quantification of this - I suggest they do so, by providing variance or standard deviations. 

      Thank you very much for this relevant suggestion, we have now included the coefficients of variation as a quantitative measure of the within-lineage heterogeneity in the manuscript (line 97). 
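
      As a brief illustration of the measure mentioned above, the coefficient of variation is the sample standard deviation expressed as a percentage of the mean. The sketch below uses hypothetical growth-rate values for one lineage; it is not the authors' analysis code.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical growth rates for three strains of a single lineage.
lineage_growth = [1.0, 2.0, 3.0]
print(coefficient_of_variation(lineage_growth))
```

      Because the measure is scaled by the mean, it allows heterogeneity to be compared between lineages whose average growth rates differ.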

      I also suggest the authors explain what the data points actually represent in this figure - do I assume each data point = cfu from a well of 'granuloma'? Are they all from the same donor PBMC? What is the sample N for each lineage? If the data are not from the same donor PBMC, I think more informative to present the results of paired statistical analyses, stratified by donor cells. In addition, the authors should include a summary table of the demographic characteristics of the donors (at least sex, ethnicity, and age). If the data are derived from a single donor, I'd advocate providing data from at least one further donor.

      In the new supplementary figure requested by Reviewer 3 Figure 1—figure supplement 1 (actual CFU data on days 1 and 8 p.i. used to calculate the growth rate) it is now indicated that bacterial load was quantified as CFU per well.

      Regarding the number of donors used, as stated in the Material and Methods section (current line 418) and depicted by the four different shapes used when data are grouped by individual infecting strain, all figures in our manuscript have been generated using PBMCs from 4 independent donors. For greater clarity, “n = 4” has now been included in the figure legends. Regarding the statistical analyses, paired statistical analyses stratified by donor were already performed in the original version of the manuscript whenever appropriate. 

      As stated in the methods section, the buffy coats used for PBMC isolation are anonymized so demographic data are unavailable.

      The premise of the analysis in Figure 2c and the results narrative ("This finding suggests that an increased ability to enter dormancy is not necessarily associated with a more pronounced growth phenotype", line 132) is not clear to me. Why would increased dormancy relate to increased growth in the same context? I suggest this analysis be removed.

      We apologize for the confusion in our original statement. We now rephrased it as “This finding suggests that an increased tendency to remain in a metabolically active state is not necessarily associated with a more pronounced growth phenotype”.

      In Figure 3b, I think it may be more informative if the data points from the same donor were linked. Likewise in Figure 3c, I'd like to see a donor-paired statistical analysis.

      For all figures, the choice of using individual symbols to identify data points from the same donor but not connecting lines was made to provide a neater image. Nevertheless, we have now modified the figure linking the data points from the same donor. The statistical analysis performed is always donor-paired whenever appropriate. 

      The causal inference suggested in the results narrative between ‘macrophage apoptosis’ and granulomatous response (lines 173-175) is not tested directly by the experiment – I suggest the authors exclude this statement.

      Fair point, the statement has been removed.

      To what extent have the authors considered whether variation in T cell responses between lineages may be confounded by variation in Mtb reactive T cell frequencies in donor PBMC. Can this be disentangled at all? This should be acknowledged as a potential limitation of the study.

      We did characterize the presence of mycobacterial antigen-specific reactive T cells in the PBMCs from the investigated donors. To do so, we performed in vitro stimulations with purified protein derivative (PPD) or an ESAT-6/CFP-10 peptide pool and quantified the frequency of IFN-γ-positive CD4 T cells by flow cytometry. The percentage of IFN-γ-positive CD4 T cells recalled by PPD stimulation ranged from 0.02% to 0.13%, while no ESAT-6/CFP-10 reactive T cells were detected. As such, we can attest that the PBMC donors never encountered Mtb even though some levels of memory recalled by PPD may be due to cross-reactivity with BCG or pre-exposure to non-tuberculous mycobacteria. We have now added a panel in Figure 5—figure supplement 2 representing the frequency of mycobacteria-specific CD4 T cells and, as suggested, discussed the impact on the extent of the T cell responses observed in granulomas in the revised version of the manuscript. Nevertheless, the observed MTBC strain-specific trends are consistent across the donors, as depicted in Figure 5B and Figure 5—figure supplement 2A-B.

      Moreover, the experimental design does not really test cause and effect for the relationship between T cell proliferation/activation and bacterial growth. What is the impact of T-cell depletion from PBMC on bacterial growth?

      The increased TB susceptibility of HIV patients demonstrated that T cells play a critical part in the control of Mtb infection. We agree and did envisage such a depletion experiment. However, depleting T cells from PBMCs would imply removing up to 70% of the cells present in the specimen, which would lead to a situation from which results cannot be compared to the original sample and therefore would not be interpretable. 

      Reviewer #3:

      Data presentation:

      - In Figure 1 (replication rate), actual cumulative CFU means from each strain for both days 1 and 8 with statistical analysis should be presented as panels in this figure.

      Agreed. We are providing the requested representation of the data and the corresponding paired statistical analysis as supplementary material Figure 1—figure supplement 1.

      - In Figure 2 (dormancy), a panel comparing the mean number of bacteria that are single positive for either Auramine-O, Nile Red, or are double positive should be included for each strain, with statistical analysis. Representative photomicrographs of phenotypes from the staining should also be included. Electron microscopy could be conducted to compare the presence of intermediate lipid inclusions within organoid-bound mycobacteria.

      As requested, percentages of single stained as well as double positive bacilli in each sample are now represented in Figure 2—figure supplement 1. In addition, we have now also followed the request and included a photomicrograph picturing representative Mtb staining phenotypes. Lastly, it would certainly be very elegant to visualize the presence of Mtb lipid inclusions within cellular aggregates by electron microscopy. However, we do not currently have the means for such investigations and the implementation of such a protocol under BSL3 conditions appears unrealistic in the context of this study.  

      - In Figure 3 (granulomatous response), the number, circularity, and size of immune aggregates are presented as "granuloma score" in which the mean ratio of size to circularity is divided by the number of inclusions. To their credit, in Supplementary Figure 2, the authors provide the data in a straightforward manner. However, the granuloma score metric is reduced as the number of observed "granulomas" increases, which is counterintuitive. Additionally, circularity is not a definitive aspect of human granulomas (Wells et al., Am J Respir Crit Care Med, 2021, PMID: 34015247). I am skeptical that the "granuloma score" is an accurate predictor of granulomatous inflammation. Is there precedent for this metric in the literature? If so, a reference should be provided. A high magnification inset of 1 representative granuloma from each strain should be included in Figure 3A.

      As requested, insets of a representative average granuloma for each strain have been included in Figure 3A. The formulation of the “granuloma score” has no precedent and cannot be referenced. By doing so, we meant to integrate within one single parameter the visual differences represented in the current Figure 3— figure supplement 2. We intentionally sought to assign the highest score to the massive aggregation that some strains may promote unlike some that trigger several small, dispersed and diffused aggregates.
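
      To make the behavior of the score under discussion concrete, the sketch below implements it as the reviewer describes it verbally: the mean ratio of size to circularity, divided by the number of aggregates. This is a reconstruction from that description only; the exact formula in the manuscript may differ, and the function name and input values are hypothetical.

```python
def granuloma_score(areas, circularities):
    """Mean (area / circularity) across aggregates, divided by their number.

    Reconstructed from the reviewer's verbal description; treat it as an
    assumption rather than the manuscript's exact definition.
    """
    n = len(areas)
    if n == 0:
        return 0.0
    mean_ratio = sum(a / c for a, c in zip(areas, circularities)) / n
    return mean_ratio / n

# Hypothetical wells: one massive aggregate vs. four small dispersed ones.
one_massive = granuloma_score([1000.0], [0.5])
four_small = granuloma_score([100.0] * 4, [0.5] * 4)
print(one_massive, four_small)  # 2000.0 50.0
```

      Under this reading, the score falls as the number of aggregates rises, which is the behavior the reviewer finds counterintuitive and which the authors state was intended, so that a single massive aggregate scores highest.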

      - In Figure 4 (macrophage apoptosis), a panel showing the percentage of dual Annexin V and 7-AAD positive cells should be included to provide the reader with the relative scope of ongoing apoptotic vs necrotic/secondary necrotic death in the model. If the data is readily available, including a control of uninfected PBMCs would also allow the reader to evaluate donor-dependent differences of in vitro cell death at baseline.

      No significant differences were observed in the percentage of dual Annexin V- and 7-AAD-positive macrophages (necrosis/secondary necrosis) between the MTBC strains at this time-point. Nevertheless, we have disclosed this result in the revised manuscript as Figure 4—figure supplement 2.

      - In Figures 5 and 6 (lymphocyte activation and soluble mediator secretion), panels showing unscaled data should be included. Panels depicting the unscaled immunoassay protein readings (pg/mL) by strain for CXCL9, granzyme B, and TNF with statistical analysis should be included in Figure 6.

      As requested, unscaled lymphocyte activation and soluble mediator data have been included as Figure 5— figure supplement 2 and Figure 6—figure supplement 1, respectively (replacing former supplementary figures 5 and 7). In addition, updated Figure 6G panel now depicts correlation analysis with the unscaled cytokine concentrations.

      The DosR-regulon:

      The authors hypothesize that differences in the prevalence of the dormancy metrics (acid-fastness or lipid inclusion prevalence) are due to strain-specific increases in expression of the DosR regulon within the model's hypoxic conditions (lines 107-114, 126-127). The claim that their model is equipped to evaluate dosR-dependent mycobacterial phenotypes was also previously proposed (Arbués et al., 2021) and should be tested. A comparison of the dosR-dependent gene expression of each strain in PBMC aggregates and broth culture by qRT-PCR would test this idea at a very basic level.

      We agree. Actually, a similar request was made during the revision of our first in vitro granuloma study for which such qPCR data were generated and presented in Fig. 1 D (PMID: 32069329). In addition, the work of Kapoor et al., who originally developed the in vitro granuloma model also demonstrated the induction of most of the DosR regulated genes by qPCR (PMID: 23308269). We trust that the reviewer will agree that this does not need to be repeated.

      The modern Beijing lineage strain L2C:

      The authors claim (Line 101-102) that the results of Figure 1 "confirm the higher virulence propensities of strains from modern lineages". From the data presented, it appears that strain L2C (Modern-Beijing) dominates the modern vs ancestral and inter/intra-lineage phenotypes of replication, dormancy, and apoptosis. Are significant differences between modern and ancestral lineages or between strains simply a facet of the distinct profile of L2C? Do the statistical differences disappear when the L2C group is excluded?

      Indeed, among the modern lineages’ isolates, L2C exhibits a hypervirulent profile in terms of bacterial replication. However, the difference between modern and ancestral strains remains statistically significant when L2C is excluded from the analysis (p = 0.002). That is also the case when we analyze the proportion of dormant bacteria. Exclusion of L2C strain results in a Kruskal-Wallis overall p = 0.005, and p = 0.0002 when we compare L2 vs. L3. Lastly, regarding the percentage of apoptotic macrophages, if we use L2B (instead of L2C) to compare, the difference is still significant vs. L1A (p = 0.008) although there is no longer a trend for L2A (p = 0.1).

      "Dormancy":

      Dormancy is definitively a non-replicative state, where bacterial growth is absent. The authors' findings and claims appear to be incompatible with that definition, which they acknowledge (Lines 130-135). The lack of correlation between growth and dormancy in their model is supported with reference to Figure 2C, a Spearman's analysis of dormancy ratio with growth rate (inclusive of all strains under consideration). The figure supports a model where "dormancy" and "growth rate" are disjunct but also appears to show high "dormancy" accompanying increasing "growth" in the L2C group. How are strains able to grow if they are in a non-replicative state? Are the "growth rate" assays actually measures of survival? Are there different rates of infectivity? Are the bacteria growing cellularly in the serum-rich ECM, etc. etc? We need to see the hard CFU and Nile Red, and Auramine-O data to contextualize these findings. Alternatively, could the accumulation of inclusions in the model not be a reliable dormancy metric (Fines et al., BioRxiv [Preprint], 2023, PMID: 37609245)?

      We fully agree. The Nile red profiles are always relative and only depict the proportion of the population that has entered a dormant state. Nevertheless, dormancy can be dynamic and bacteria may swiftly resuscitate in that model. Furthermore, and as depicted in Figure 2—figure supplement 1, despite showing an increased tendency to enter a dormant-like state, a considerable population of lineage 2 bacilli still remains metabolically active and in a replicative state. The referred preprint is very interesting and we will follow it up closely.

      Specificity of responses to PBMC aggregation:

      The authors claim that their results "reveal a broad spectrum of granulomatous responses" (Line 73) but do not show any aggregation specificity of PBMC responses beyond the model's intrinsic metrics of area and circularity. To establish that their phenotypes such as lymphocyte activation, cytokine release, cell death, or mycobacterial acid-fastness/lipid inclusion prevalence, are aspects of the granulomatous response the authors could infect PBMCs from the same donors with the same strains and perform the same assays using established Mtb-PBMC models in which the cells do not aggregate. This would answer many important questions, for example, does the rate of macrophage infection account for variability in apoptosis percentage? Phagocytosis assay and quantification of stained intracellular mycobacteria within recently infected PBMCs could be conducted to determine if phenotypes are an aspect of granulomatous aggregation or due to strain-specific differences in cell-intrinsic macrophage immunity. It would also be very informative to know what percentage of PBMCs and mycobacteria are granuloma-bound in the ECM.

      We are not aware of Mtb-PBMC models in which the cells do not aggregate. We previously compared PBMC infection models in the presence or absence of the collagen matrix and cells also spontaneously coalesced around infection foci (PMID: 34603299). Regarding the last point, the melting step of the collagen matrix requires enzymatic digestion and pipetting that dislocate the aggregates. Accordingly, we cannot distinguish the bacteria that would remain within the matrix from those replicating within cellular aggregates. However, we did resolve this question by demonstrating that the bacteria were not able to grow in the absence of cells in this culture condition (Supplementary material, PMID: 34603299).

      Minor recommendations

      - The term TNF-a should be replaced with TNF throughout the manuscript.

      We acknowledge that the term TNF-a can be interchangeable with TNF. However, we chose to use the TNFα terminology to differentiate it from lymphotoxin α, which is also referred to as TNF-β.

      - The authors cite studies conducted in murine and NHP models to support the claim that "understanding of immune protective traits in TB remains insufficient and yet dominated by data from mouse and non-human primate studies" (Lines 63-64) but ignore an abundance of data from other in vivo and in vitro models that have provided numerous valuable insights in the field of TB immunology. This line should be revised or omitted.

      For us, the term “dominate” implies that these models are widely used, not that they are the only ones. Other models have indeed provided additional relevant data. We cite the lung-on-chip model of McKinney’s lab and the in vitro granuloma model of Elkington’s lab (line 66). We would be very happy to include more references if the reviewer specifies them, even though we cannot build an extensive review here.

      - The authors claim that their model "encompasses, with the exception of neutrophils, all immune cell types involved in TB" (Lines 67-68). To support this claim, they should provide additional references or data demonstrating that the PBMC aggregates include eosinophils, mast cells, dendritic cells, yolk-sac-derived alveolar macrophages, and Langhans giant cells.

      With the aim of providing more accurate and detailed information regarding the cell types present in the model, the sentence has been reformulated as: “The model encompasses all PBMC-derived cell types involved in TB immune responses, but lacks granulocytes (i.e. neutrophils, eosinophils, basophils and mast cells)” (line 260). Notably, the presence of multinucleated giant cells was reported in Kapoor’s paper describing the in vitro granuloma model for the first time (PMID: 23308269).

      -  As an additional note, the title can be improved and made more broadly accessible by revising the use of the acronyms CXCL9, granzyme B, and TNF-α.

      To render the title more broadly accessible we propose to replace the listed acronyms with “soluble immune mediators”, but we remain open to more appropriate and specific suggestions.

      Answers to the reviewers’ public comments

      Reviewer #1:

      First of all, we would like to thank the reviewers for their feedback and suggestions to improve our manuscript. To strengthen the findings of our study, we have performed and added results from IL-1β and CXCL9 blocking experiments evaluating the impact on the granulomatous response and bacterial load, respectively. In the revised version of the manuscript, while we discuss the null effect on bacterial growth of the treatment with an anti-CXCL9 antibody and the potential reason behind it, we are now reporting a negative effect on the magnitude of granuloma formation upon neutralization of IL-1β that the correlation analysis had initially suggested.

      Reviewer #2:

      The revised version of our manuscript now incorporates all the points detailed in the private answers to the reviewer, including clarifications on the statistical tests performed, additional supplementary materials to transparently disclose the raw data behind the normalization approach, as well as flow cytometry data on the immune memory status of the blood donors. In addition, and as stated in the answer to reviewer #1, to test causal relationships between some host and pathogen traits, we have now performed IL-1β and CXCL9 blocking experiments and provide the corresponding data and interpretation.

      Reviewer #3:

      We are thankful for, and concur with, these constructive comments and insights. We have now consistently revisited the statistics in the figures to improve clarity and included new supplementary figures reporting the raw data that were missing in the initial version of the manuscript. In addition, and as mentioned in the answers to reviewers #1 and #2, we have now performed and added IL-1β and CXCL9 blocking experiments to test causal relationships between specific host and pathogen traits. In particular, we are now reporting a negative effect on the magnitude of granuloma formation upon neutralization of IL-1β that the correlation analysis had initially suggested.

      More specifically, regarding the point that our method for bacterial collection calls into question whether all Mtb plated for CFU assay resided within granulomatous aggregates, we previously reported that Mtb growth strictly required the presence of human cells in our culture conditions (Supplementary material, Arbués et al, 2021, PMID: 34603299). In the presence of cells, our microscopy read-out does allow us to observe extra-cellular growth if infections are carried on beyond an 8-day limit, which we applied in the current study to exclude this particular caveat. 

      Concerning the apparently conflicting observation that those strains displaying an increased tendency to enter a dormant-like state are the ones exhibiting the highest replication rates, we would like to point out that a considerable population of bacilli still remains metabolically active and in a replicative state. For instance, and as depicted in Figure 2—figure supplement 1, despite showing an increased tendency to enter a dormant-like state, a considerable population of lineage 2 bacilli does remain metabolically active. Moreover, dormancy can be dynamic and bacteria may swiftly resuscitate.

      Regarding the mentioned limitations of our study, which we have discussed in the revised version of our manuscript, we fully concur that PBMC-based in vitro granuloma models lack tissue structure as well as some important stromal and immune cellular players. Nevertheless, we and others demonstrated the particular relevance of the 3-dimensional infection approach within a matrix of collagen and fibronectin by providing mechanistic insights into Mtb resuscitation previously associated with treatment with various immunomodulatory drugs (Arbués et al., 2020, PMID: 32069329; Tezera et al., 2020, PMID: 32091388).

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      Our manuscript initially entitled “Ribosomal RNA synthesis by RNA polymerase I is regulated by premature termination of transcription” investigates the regulation of the initial steps of ribosome biogenesis – the synthesis of large ribosomal RNA precursor by RNA polymerase I.

      In our manuscript, we demonstrate for the first time that RNA Polymerase I (Pol I) can prematurely release nascent transcripts at the 5' end of ribosomal DNA transcription units in vivo. This achievement was made possible by comparing wild-type Pol I with a mutant form of Pol I, hereafter called SuperPol, previously isolated in our lab (Darrière et al., 2019). By combining in vivo analysis of rRNA synthesis (using pulse-labelling of nascent transcripts and cross-linking of nascent transcripts, CRAC) with in vitro analysis, we could show that SuperPol reduced premature transcript release due to altered elongation dynamics and reduced RNA cleavage activity. Such premature release could reflect regulatory mechanisms controlling rRNA synthesis. Importantly, this increased processivity of SuperPol is correlated with resistance to BMH-21, a novel anti-cancer drug inhibiting Pol I, showing the relevance of targeting Pol I during transcriptional pauses to kill cancer cells. This work offers critical insights into Pol I dynamics, rRNA transcription regulation, and implications for cancer therapeutics.

      We sincerely thank the three reviewers for their insightful comments and recognition of the strengths and weaknesses of our study. Their acknowledgment of our rigorous methodology, the relevance of our findings on rRNA transcription regulation, and the significant enzymatic properties of the SuperPol mutant is highly appreciated. We are particularly grateful for their appreciation of the potential scientific impact of this work. Additionally, we value the reviewer’s suggestion that this article could address a broad scientific community, including in transcription biology and cancer therapy research. These encouraging remarks motivate us to refine and expand upon our findings further.

      All three reviewers acknowledged the increased processivity of SuperPol compared to its wild-type counterpart. However, two of the three question our claim that premature termination of transcription can regulate ribosomal RNA transcription. This conclusion is based on the SuperPol mutant increasing rRNA production. Proving that modulation of early transcription termination is used to regulate rRNA production under physiological conditions is beyond the scope of this study. Therefore, we propose to change the title of this manuscript to focus on what we have unambiguously demonstrated:

      “Ribosomal RNA synthesis by RNA polymerase I is subjected to premature termination of transcription”.

      Reviewer 1's main criticism centers on the use of the CRAC technique in our study. We address this point in detail below, and we agree with the reviewer’s comments regarding its application to Pol II studies. We would nevertheless like to emphasize that, because it limits contamination with mature rRNA, CRAC remains the only suitable method for studying Pol I elongation over the entire transcription unit. All other methods are massively contaminated with fragments of mature RNA, which prevents any quantitative analysis of read distribution within rDNA. This perspective is widely accepted within the Pol I research community, as CRAC provides a robust approach to capturing transcriptional dynamics specific to Pol I activity.

      We hope that these findings will resonate with the readership of your journal and contribute significantly to advancing discussions in transcription biology and related fields.

      2. Description of the planned revisions

      Despite numerous text modifications (see below), we agree that one major point of discussion is the consequence of increased processivity in the SuperPol mutant on the “quality” of produced rRNA. Reviewer 3 suggested comparisons with other processive alleles, such as the rpb1-E1103G mutant of the RNAPII subunit (Malagon et al., 2006). This comparison has already been addressed by the Schneider lab (Viktorovskaya OV, Cell Rep., 2013 - PMID: 23994471), which explored Pol II (rpb1-E1103G) and Pol I (rpa190-E1224G). The rpa190-E1224G mutant revealed enhanced pausing in vitro, highlighting key differences between Pol I and Pol II catalytic rate-limiting steps (see David Schneider's review on this topic for further details).

      Reviewers 2 and 3 suggested that a decreased efficiency of cleavage upon backtracking might imply an increased error rate in SuperPol compared to the wild-type enzyme. Pol I mutants with decreased rRNA cleavage have been characterized previously, and resulted in an increased error rate. We have already started to address this point. Preliminary results from *in vitro* experiments suggest that SuperPol mutants exhibit an elevated error rate during transcription. However, these findings remain preliminary and require further experimental validation to confirm their reproducibility and robustness. We propose to consolidate these data and incorporate them into the manuscript to address this question comprehensively. This could provide valuable insights into the mechanistic differences between SuperPol and the wild-type enzyme. SuperPol is the first Pol I mutant described with an increased processivity *in vitro* and *in vivo*, and we agree that this might be at the cost of a decreased fidelity.
      

      Regulatory aspect of the process:

      To address the reviewer’s remarks, we propose to test our model by performing experiments that would evaluate PTT levels in Pol I mutants or under different growth conditions. These experiments would provide crucial data to support our model, which suggests that PTT is a regulatory element of Pol I transcription. By demonstrating how PTT varies with environmental factors, we aim to strengthen the hypothesis that premature termination plays an important role in regulating Pol I activity.

      We propose revising the title and conclusions of the manuscript. The updated version will better reflect the study's focus and temper claims regarding the regulatory aspects of termination events, while maintaining the value of our proposed model.


      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Some very important modifications have now been incorporated:



      Statistical Analyses and CRAC Replicates:

      Unlike reviewers 2 and 3, reviewer 1 suggests that we did not analyze the results statistically. In fact, the CRAC analyses were conducted in biological triplicate, ensuring robustness and reproducibility. The statistical analyses are presented in Figure 2C, which highlights significant findings supporting the fact that the WT Pol I and SuperPol distribution profiles are different. The CRAC replicates exhibit a high correlation, and we confirmed a significant effect in each region of interest (5’ETS, 18S.2, 25S.1 and 3’ ETS, Figure 1), which confirms consistency across experiments. Finally, we took care not to overinterpret the results, maintaining a rigorous and cautious approach in our analysis to ensure accurate conclusions.

      CRAC vs. Net-seq:

      Reviewer 1 asks us to comment on the differences between CRAC and Net-seq. Both methods complement each other but serve different purposes depending on the biological question and the context of the transcription analysis. Net-seq was originally designed for Pol II analysis. It captures nascent RNAs but does not eliminate mature ribosomal RNAs (rRNAs), leading to high levels of contamination. While this is manageable for Pol II analysis (in silico elimination of reads corresponding to rRNAs), it poses a significant problem for Pol I due to the dominance of rRNAs (60% of total RNAs in yeast), which share sequences with nascent Pol I transcripts. As a result, large Net-seq peaks are observed at mature rRNA extremities (Clarke 2018, Jacobs 2022). This limits the interpretation of the results to the short-lived pre-rRNA species. In contrast, CRAC has been specifically adapted by the laboratory of David Tollervey to map Pol I distribution while minimizing contamination from mature rRNAs (the CRAC protocol used exclusively recovers RNAs with 3′ hydroxyl groups that represent endogenous 3′ ends of nascent transcripts, thus removing RNAs with a 3′ phosphate, found in mature rRNAs). This makes CRAC more suitable for studying Pol I transcription, including polymerase pausing and distribution along rDNA, providing a quantitative dataset for the entire rDNA gene.

      CRAC vs. Other Methods:

      Reviewer 1 suggests using GRO-seq or TT-seq, but the experiments in Figure 2 aim to assess the distribution profile of Pol I along the rDNA, which requires a method optimized for this specific purpose. While GRO-seq and TT-seq are excellent for measuring RNA synthesis and co-transcriptional processing, they rely on Sarkosyl treatment to permeabilize cellular and nuclear membranes. Sarkosyl is known to artificially induce polymerase pausing and to inhibit RNase activities that are involved in the process. CRAC analysis avoids these artifacts because it is a direct and fully in vivo approach. In a CRAC experiment, cells are grown exponentially in rich media and arrested via rapid cross-linking, providing precise and artifact-free data on Pol I activity and pausing.

      Pol I ChIP Signal Comparison:

      The ChIP experiments previously published in Darrière et al. lack the statistical depth and resolution offered by our CRAC analyses. The detailed results obtained through CRAC would have been impossible to detect using classical ChIP. The current study provides a more refined and precise understanding of Pol I distribution and dynamics, highlighting the advantages of CRAC over traditional methods in addressing these complex transcriptional processes.

      BMH-21 Effects:

      As highlighted by Reviewer 1, the effects of BMH-21 observed in our study differ slightly from those reported in earlier work (Ref Schneider 2022), likely due to variations in experimental conditions, such as methodologies (CRAC vs. Net-seq), as discussed earlier. We also identified variations in the response to BMH-21 treatment associated with differences in cell growth phases and/or cell density. These factors likely contribute to the observed discrepancies, offering a potential explanation for the variations between our findings and those reported in previous studies. In our approach, we prioritized reproducibility by carefully controlling BMH-21 experimental conditions to mitigate these factors. These variables can significantly influence results, potentially leading to subtle discrepancies. Nevertheless, the overall conclusions regarding BMH-21's effects on WT Pol I are largely consistent across studies, with differences primarily observed at the nucleotide resolution. This is a strength of our CRAC-based analysis, which provides precise insights into Pol I activity.

      We will address these nuances in the revised manuscript to clarify how such differences may impact results and provide context for interpreting our findings in light of previous studies.

      Minor points:

      Reviewer #1:

      • *In general, the writing style is not clear, and there are some word mistakes or poor descriptions of the results, for example: On page 14: "SuperPol accumulation is decreased (compared to Pol I)". On page 16: "Compared to WT Pol I, the cumulative distribution of SuperPol is indeed shifted on the right of the graph." *

      We clarified and improved the overall writing style in line with the reviewer's comment.

      • *There are also issues with the literature, for example: Turowski et al, 2020a and Turowski et al, 2020b are the same article (preprint and peer-reviewed). Is there any reason to include both references? Please, double-check the references. *

      This was corrected in this version of the manuscript.

      • *In the manuscript, 5S rRNA is mentioned as an internal control for TMA normalisation. Why are Figure 1C data normalised to 18S rRNA instead of 5S rRNA? *

      Data are indeed normalized relative to the 5S rRNA, but the value for the 18S rRNA is arbitrarily set to 100%.
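
      One possible reading of this two-step normalization is sketched below: each probe signal is first divided by the 5S signal (the internal control), and the resulting ratios are then rescaled so that the 18S value equals 100%. The function name, the probe labels used as dictionary keys, and the numbers are illustrative assumptions, not values from the manuscript; which sample's 18S value serves as the 100% reference is not stated in the response.

```python
def normalize_to_reference(signals, loading_control="5S", reference="18S"):
    """Normalize probe signals to a loading control, then rescale so the
    reference probe equals 100 (percent)."""
    control = signals[loading_control]
    # Step 1: express every other probe relative to the internal control.
    ratios = {
        probe: value / control
        for probe, value in signals.items()
        if probe != loading_control
    }
    # Step 2: rescale so that the reference probe is arbitrarily set to 100.
    scale = 100.0 / ratios[reference]
    return {probe: ratio * scale for probe, ratio in ratios.items()}


# Hypothetical raw signals in arbitrary units, for illustration only.
raw = {"5S": 2.0, "18S": 4.0, "25S": 6.0}
normalized = normalize_to_reference(raw)
```

      Under this reading, the 5S signal corrects for loading differences between samples, while the 18S value only fixes the scale of the axis.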

      • Figure 4 should be a supplementary figure, and Figure 7D doesn't have a y-axis labelling.

      The presence of all Pol I specific subunits (Rpa12, Rpa34 and Rpa49) is crucial for the enzymatic assays we performed. In the absence of these subunits (which can vary depending on the purification batch), Pol I pausing, cleavage and elongation are known to be affected. To strengthen our conclusion, we really wanted to show the subunit composition of the purified enzyme. This important control should be shown, but can indeed be shown in a supplementary figure if desired.

      The y-axis in Figure 7D is now correctly labelled.

      • *In Figure 7C, BMH-21 treatment causes the accumulation of ~140bp rRNA transcripts only in SuperPol-expressing cells that are Rrp6-sensitive (line 6 vs line 8), suggesting that BMH-21 treatment does affect SuperPol. Could the author comment on the interpretation of this result? *

      The 140 nt product is a degradation fragment resulting from trimming, which explains its lower accumulation in the absence of Rrp6. BMH-21 significantly affects WT Pol I transcription but also has a mild effect on SuperPol transcription. As a result, the 140 nt product accumulates under these conditions.

      Reviewer #2:

      • *pp. 14-15: The authors note local differences in peak detection in the 5'-ETS among replicates, preventing a nucleotide-resolution analysis of pausing sites. Still, they report consistent global differences between wild-type and SuperPol CRAC signals in the 5'ETS (and other regions of the rDNA). These global differences are clear in the quantification shown in Figures 2B-C. A simpler statement might be less confusing, avoiding references to a "first and second set of replicates" *

      As suggested by the reviewer, the statement has been simplified in this version of the manuscript.


      • *Figures 2A and 2C: Based on these data and quantification, it appears that SuperPol signals in the body and 3' end of the rDNA unit are higher than those in the wild type. This finding supports the conclusion that reduced pausing (and termination) in the 5'ETS leads to an increased Pol I signal downstream. Since the average increase in the SuperPol signal is distributed over a larger region, this might also explain why even a relatively modest decrease in 5'ETS pausing results in higher rRNA production. This point merits discussion by the authors. *

      We agree that this is a very important point for the discussion of our results. Transcription is a very dynamic process in which paused polymerases are easily detected using the CRAC assay. Elongating polymerases are distributed over a much larger gene body, and even a small amount of polymerase detected in the gene body can represent a very large amount of rRNA synthesis. This point is of paramount importance and, as suggested by the reviewer, is now discussed in detail.


      • *A decreased efficiency of cleavage upon backtracking might imply an increased error rate in SuperPol compared to the wild-type enzyme. Have the authors observed any evidence supporting this possibility? *

      The reviewer suggested that a decreased efficiency of cleavage upon backtracking might imply an increased error rate in SuperPol compared to the wild-type enzyme. We have already started to address this point. Preliminary results from in vitro experiments suggest that SuperPol mutants exhibit an elevated error rate during transcription. However, these findings remain preliminary and require further experimental validation to confirm their reproducibility and robustness. We propose to consolidate these data and incorporate them into the manuscript to address this question comprehensively.


      • *pp. 15 and 22: Premature transcription termination as a regulator of gene expression is well-documented in yeast, with significant contributions from the Corden, Brow, Libri, and Tollervey labs. These studies should be referenced along with relevant bacterial and mammalian research. *

      Following the reviewer's suggestion, we referenced these studies.


      • *p. 23: "SuperPol and Rpa190-KR have a synergistic effect on BMH-21 resistance." A citation should be added for this statement. *

      This statement is based on unpublished data from our lab. KR and SuperPol are the only two known mutants resistant to BMH-21. We observed that resistance between both alleles is synergistic, with a much higher resistance to BMH-21 in the double mutant than in each single mutant (data not shown). Comparing their resistance mechanisms is a very important point, and we could provide these data upon request. This was added to the statement.


      • *p. 23: "The released of the premature transcript" - this phrase contains a typo *

      This is now corrected.


      Reviewer #3:

      • *Figure 1B: it would be opportune to separate the technique's schematic representation from the actual data. Concerning the data, would the authors consider adding an experiment with rrp6D cells? Some RNAs could be degraded even in such short period of time, as even stated by the authors, so maybe an exosome depleted background could provide a more complete picture. Could also the authors explain why the increase is only observed at the level of 18S and 25S? To further prove the robustness of the Pol I TMA method could be good to add already characterized mutations or other drugs to show that the technique can readily detect also well-known and expected changes. *

      The precise objective of this experiment is to avoid the use of the Rrp6 mutant. Under these conditions, we prevent the accumulation of transcripts that would result from a maturation defect. While it is possible to conduct the experiment with the Rrp6 mutant, it would be impossible to draw reliable conclusions due to this artificial accumulation of transcripts.


      • *Figure 1C: the NTS1 probe signal is missing (it is referenced in Figure 1A but not listed in the Methods section or the oligo table). If this probe was unused, please correct Figure 1A accordingly. *

      We corrected Figure 1A.


      • *Figure 2A: the RNAPI occupancy map by CRAC is hard to interpret. The red color (SuperPol) is stacked on top of the blue line, and we are not able to observe the signal of the WT for most of the position along the rDNA unit. It would be preferable to use some kind of opacity that allows to visualize both curves. Moreover, the analysis of the behavior of the polymerase is always restricted to the 5'ETS region in the rest of the manuscript. We are thus not able to observe whether termination events also occur in other regions of the rDNA unit. A Northern blot analysis displaying higher sizes would provide a more complete picture. *

      We addressed this point to make the figure more visually informative. In Northern Blot analysis, we use a TSS (Transcription Start Site) probe, which detects only transcripts containing the 5' extremity. Due to co-transcriptional processing, most of the rRNA undergoing transcription lacks its 5' extremity and is not detectable using this technique. We have the data, but it does not show any difference between Pol I and SuperPol. This information could be included in the supplementary data if asked.


      • *"Importantly, despite some local variations, we could reproducibly observe an increased occupancy of WT Pol I in 5'-ETS compared to SuperPol (Figure 1C)." should be Figure 2C. *

      Thanks for pointing out this mistake. It has been corrected.


      • *Figure 3D: most of the difference in the cumulative proportion of CRAC reads is observed in the region ~750 to 3000. In line with my previous point, I think it would be worth exploring also termination events beyond the 5'-ETS region. *

      We agree that such an analysis would have been interesting. However, with the exception of the pre-rRNA starting at the transcription start site (TSS) studied here, any cleaved rRNA at its 5' end could result from premature termination and/or abnormal processing events. Exploring the production of other abnormal rRNAs produced by premature termination is a project in itself, beyond this initial work aimed at demonstrating the existence of premature termination events in ribosomal RNA production.
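
      For context, the "cumulative proportion of CRAC reads" discussed in this point can be computed from per-position read counts as sketched below. The function name and the toy counts are hypothetical and do not come from the manuscript's data or pipeline.

```python
from itertools import accumulate


def cumulative_proportion(read_counts):
    """Return, for each position along the transcription unit, the fraction
    of all reads that map at or upstream of that position."""
    total = sum(read_counts)
    if total == 0:
        raise ValueError("no reads to accumulate")
    return [running / total for running in accumulate(read_counts)]


# Toy read counts along a short region, for illustration only.
counts = [1, 1, 2]
curve = cumulative_proportion(counts)
```

      A curve that rises later along the unit (shifted to the right) indicates that a larger share of reads maps downstream, which is how a reduced 5' accumulation appears in this representation.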


      • *Figure 4: should probably be provided as supplementary material. *

      As mentioned earlier (see comments), the presence of all Pol I specific subunits (Rpa12, Rpa34 and Rpa49) is crucial for the enzymatic activity we performed. This important control should be shown, but can indeed be shown in a supplementary figure if desired.


      • *"While the growth of cells expressing SuperPol appeared unaffected, the fitness of WT cells was severely reduced under the same conditions." I think the growth of cells expressing SuperPol is slightly affected. *

      We agree with this comment and we modified the text accordingly.


      • *Figure 7D: the legend of the y-axis is missing as well as the title of the plot. *

      The y-axis legend and the title of the plot are now present.


      • The statements concerning BMH-21, SuperPol and Rpa190-KR in the Discussion section should be removed, or data should be provided.

      This was discussed previously. See comment above.


      • *Some references are missing from the Bibliography, for example Merkl et al., 2020; Pilsl et al., 2016a, 2016b. *

      The bibliography is now fixed.


      4. Description of analyses that authors prefer not to carry out

      Does the SuperPol mutant produce more functional rRNAs?

      As Reviewer 1 requested, we agree that this point requires clarification. In cells expressing SuperPol, a higher steady state of (pre)-rRNAs is only observed in the absence of degradation machinery, suggesting that overproduced rRNAs are rapidly eliminated. We know that (pre)-rRNAs are unable to accumulate in the absence of ribosomal proteins and/or Assembly Factors (AF). Consequently, overproducing rRNAs would not be sufficient to increase ribosome content. This specific point is being further addressed in our lab but is beyond the scope of this article.

      Is premature termination coupled with rRNA processing?

      We appreciate the reviewer’s insightful comments. The suggested experiments regarding the UTP-A complex's regulatory potential are valuable and ongoing in our lab, but they extend beyond the scope of this study and are not suitable for inclusion in the current manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      In the manuscript "Ribosomal RNA synthesis by RNA polymerase I is regulated by premature termination of transcription", Azouzi and co-authors investigate the regulatory mechanisms of ribosomal RNA (rRNA) transcription by RNA Polymerase I (RNAPI) in the budding yeast S. cerevisiae. They follow up on exploring the molecular basis of a mutant allele of the second largest subunit of RNAPI, RPA135-F301S, also dubbed SuperPol, that they had previously reported (Darrière et al, 2019), and which was shown to rescue Rpa49-linked growth defects, possibly by increasing rRNA production.

      Through a combination of genomic and in vitro approaches, the authors test the hypothesis that RNAPI activity could be subjected to a Premature Transcription Termination (PPT) mechanism, akin to what is observed for RNA Polymerase II (RNAPII), and which is suggested to be an important step for the quality control of rRNA transcripts. SuperPol is proposed to lack such a regulatory mechanism, due to an increased processivity. In agreement, SuperPol is shown to be resistant to BMH-21, a drug previously shown to impair RNAPI elongation.

      Overall, the experiments are performed with rigor and include the appropriate controls and statistical analysis. Both the figures and the text present the data clearly. The Material and Methods section is detailed enough. The reported results are interesting; however, I am not fully convinced of the existence of PPT of RNAPI, and even less of its utmost importance. The existence of PPT of RNAPI would entail an intended regulatory mechanism. The authors propose that PPT could serve as quality control step for the UTP-A complex loading on the rRNA 5'-end. While this hypothesis is enticing and cautiously phrased by the authors, the lack of evidence showing a specific regulatory function (such as UTP-A loading checkpoint or else) limits these termination events to possibly abortive actions of unclear significance. The authors may want to consider comparisons to other processive alleles, such as the rpb1-E1103G mutant of the RNAPII subunit (Malagon et al, 2006) or the G1136S allele of E. coli RNAP (Bar-Nahum et al., 2005). While clearly mechanistically distinct, these mutations result in similarly processive enzymes that achieve more robust transcription, possibly at the cost of decreased fidelity. Indeed, an alternative possibility explaining these transcripts could be that they originate from unsuccessful resumption of transcription after misincorporation (see below).

      I suggest reconsidering the study's main conclusions by limiting claims about the regulatory function of these termination events (the title of the manuscript should be changed accordingly). Alternatively, the authors should provide additional investigation of their regulatory potential, for example by assessing whether this quality control is indeed linked to the correct assembly of the UTP-A complex. The expectation would be that SuperPol should rescue, at least to some extent, the defects observed in the absence of UTP-A components. Moreover, the results using the clv3 substrate suggest the possibility that SuperPol might simply be more able to tolerate mismatches, and thus be more processive in transcribing, because it is not subjected to proof-reading mechanisms, similarly to what was observed in Schwank et al., 2022. This could explain many of the observations, and I think it is worth exploring by assessing the fidelity of the enzyme, especially in the frame of suggesting a regulatory function for these termination events.

      Minor comments

      1. Figure 1B: it would be opportune to separate the technique's schematic representation from the actual data. Concerning the data, would the authors consider adding an experiment with rrp6Δ cells? Some RNAs could be degraded even in such a short period of time, as stated by the authors themselves, so an exosome-depleted background might provide a more complete picture. Could the authors also explain why the increase is only observed at the level of 18S and 25S? To further demonstrate the robustness of the Pol I TMA method, it would be good to add already characterized mutations or other drugs to show that the technique can also readily detect well-known and expected changes.
      2. Figure 1C: the NTS1 probe signal is missing (it is referenced in Figure 1A but not listed in the Methods section or the oligo table). If this probe was unused, please correct Figure 1A accordingly.
      3. Figure 2A: the RNAPI occupancy map by CRAC is hard to interpret. The red color (SuperPol) is stacked on top of the blue line, and we are not able to observe the signal of the WT for most positions along the rDNA unit. It would be preferable to use some degree of opacity that allows both curves to be visualized. Moreover, the analysis of the behavior of the polymerase is always restricted to the 5'ETS region in the rest of the manuscript. We are thus not able to observe whether termination events also occur in other regions of the rDNA unit. A Northern blot analysis displaying higher sizes would provide a more complete picture.
      4. "Importantly, despite some local variations, we could reproducibly observe an increased occupancy of WT Pol I in 5'-ETS compared to SuperPol (Figure 1C)." should be Figure 2C.
      5. Figure 3D: most of the difference in the cumulative proportion of CRAC reads is observed in the region ~750 to 3000. In line with my previous point, I think it would be worth exploring also termination events beyond the 5'-ETS region.
      6. Figure 4: should probably be provided as supplementary material.
      7. "While the growth of cells expressing SuperPol appeared unaffected, the fitness of WT cells was severely reduced under the same conditions." I think the growth of cells expressing SuperPol is slightly affected.
      8. Figure 6B: can the authors explain why most of the bands detected in their Pol I TMA assay in Figure 6B are unchanged? It is unclear to me why only the 18S and 25S bands are decreased following BMH-21 treatment. Moreover, this experiment lacks the corresponding quantification and statistical tests.
      9. Figure 7D: the legend of the y-axis is missing as well as the title of the plot.
      10. The statements concerning BMH-21, SuperPol and Rpa190-KR in the Discussion section should be removed, or data should be provided.
      11. Some references are missing from the Bibliography, for example Merkl et al., 2020; Pilsl et al., 2016a, 2016b.

      Significance

      Azouzi and co-authors' work builds on their previous study (Darrière et al, 2019) of RPA135-F301S (SuperPol), a mutant allele of the second largest RNAPI subunit, which was shown to compensate for Rpa49 loss, potentially by increasing rRNA production. The work advances the mechanistic understanding of the SuperPol allele, demonstrating the increased processivity of this enzyme compared to its wild-type counterpart. Such increased processivity "desensitizes" RNAPI from abortive transcription cycles, the existence of which is clearly shown, though the biological significance of this phenomenon remains unclear. The lack of evidence for a regulatory mechanism behind these early termination events is, in my opinion, a limitation of this study, as it does not allow for differentiation between an intended regulatory process and a byproduct of an imperfect system.

      This work is of interest for researchers studying transcription regulation, particularly those interested in understanding RNAPI's role and fidelity. Demonstrating PPT as a regulatory quality control for RNAPI could point to common strategies between RNAPI and RNAPII regulation, where premature termination has been extensively documented. However, without evidence of a specific regulatory function, these findings may currently be limited to descriptive insights.

      My expertise lies in RNAPII transcription, transcription termination, and genomic approaches to studying transcription.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      The study characterises an RNA polymerase (Pol) I mutant (RPA135-F301S) named SuperPol. This mutant was previously shown to increase yeast ribosomal RNA (rRNA) production by Transcription Run-On (TRO). In this work, the authors confirm that this mutation increases rRNA transcription using a slight variation of the TRO method, the Transcriptional Monitoring Assay (TMA), which also allows the analysis of partially degraded RNA molecules. The authors show a reduction of abortive rRNA transcription in cells expressing the SuperPol mutant and a modest occupancy decrease at the 5' region of the rRNA genes compared to WT Pol I. These results suggest that the SuperPol mutant displays a lower frequency of premature termination. Using in vitro assays, the authors found that the mutation induces an enhanced elongation speed and a lower cleavage activity on mismatched nucleotides at the 3' end of the RNA. Finally, the SuperPol mutant was found to be less sensitive to BMH-21, a DNA intercalating agent that blocks Pol I transcription and triggers the degradation of the Pol I subunit Rpa190. Compared to WT Pol I, short BMH-21 treatment has little effect on SuperPol transcription activity, and consequently, the SuperPol mutation decreases cell sensitivity to BMH-21.

      I'd suggest the following points to be taken into consideration:

      Major points:

      1. The differences in the transcriptionally engaged WT Pol I and SuperPol profiles (Figure 2) are very modest and are presented without any statistical analyses. What is the correlation between CRAC replicates? Are they separated in PCA analyses? Please include more quality control information. In my opinion, these results are not very convincing. Similarly, the effect of BMH-21 on WT Pol I activity (Figure 7) is also very subtle and doesn't match the effect observed in a previous study [1]. Could the authors comment on the reasons for these differences? These discrepancies raise concerns about the methodology. In addition, according to the laboratory's previous work [2], the Pol I ChIP signal at rDNA is not significantly different in cells expressing WT Pol I and SuperPol. How can these two observations be reconciled? I would suggest using an independent methodology to analyse Pol I transcription, for example, GRO-seq or TT-seq.
      2. While the experiments clearly show that the SuperPol mutant increases nascent transcription and decreases the production of abortive promoter-proximal transcripts compared to WT Pol I, the RPA135-F301S mutation has a minor impact on total rRNA levels, at least those shown in Figure 3B. Are steady-state rRNA levels higher in cells expressing the SuperPol mutant? It would be interesting to know if the SuperPol mutant produces more functional rRNAs.

      Minor points

      1. In general, the writing style is not clear, and there are some word mistakes or poor descriptions of the results, for example: On page 14: "SuperPol accumulation is decreased (compared to Pol I)". On page 16: "Compared to WT Pol I, the cumulative distribution of SuperPol is indeed shifted on the right of the graph."
      2. There are also issues with the literature, for example: Turowski et al, 2020a and Turowski et al, 2020b are the same article (preprint and peer-reviewed). Is there any reason to include both references? Please, double-check the references.
      3. In the manuscript, 5S rRNA is mentioned as an internal control for TMA normalisation. Why are Figure 1C data normalised to 18S rRNA instead of 5S rRNA?
      4. Figure 4 should be a supplementary figure, and Figure 7D lacks a y-axis label.
      5. In Figure 7C, BMH-21 treatment causes the accumulation of ~140bp rRNA transcripts only in SuperPol-expressing cells that are Rrp6-sensitive (line 6 vs line 8), suggesting that BMH-21 treatment does affect SuperPol. Could the authors comment on the interpretation of this result?

      References

      1. Jacobs RQ, Huffines AK, Laiho M & Schneider DA (2022) The small-molecule BMH-21 directly inhibits transcription elongation and DNA occupancy of RNA polymerase I in vivo and in vitro. J. Biol. Chem. 298: 101450
      2. Darrière T, Pilsl M, Sarthou M-K, Chauvier A, Genty T, Audibert S, Dez C, Léger-Silvestre I, Normand C, Henras AK, Kwapisz M, Calvo O, Fernández-Tornero C, Tschochner H & Gadal O (2019) Genetic analyses led to the discovery of a super-active mutant of the RNA polymerase I. PLoS Genet. 15: e1008157

      Significance

      The work further characterises a single amino acid mutation of one of the largest yeast Pol I subunits (RPA135-F301S). While this mutation was previously shown to increase rRNA synthesis, the current work expands the SuperPol mutant characterisation, providing details of how RPA135-F301S modifies the enzymatic properties of yeast Pol I. In addition, their findings suggest that yeast Pol I transcription can be subjected to premature termination in vivo. The molecular basis and potential regulatory functions of this phenomenon could be explored in additional studies.

      Our understanding of rRNA transcription is limited, and the findings of this work may be interesting to the transcription community. Moreover, targeting Pol I activity is an open strategy for cancer treatment. Thus, the resistance of SuperPol mutant to BMH-21 might also be of interest to a broader community, although these findings are yet to be confirmed in human Pol I and with more specific Pol I inhibitors in future.

      My expertise is human Pol II and Pol III transcription regulation.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This manuscript describes the impact of modulating signaling by a key regulatory enzyme, Dual Leucine Zipper Kinase (DLK), on hippocampal neurons. The results are interesting and will be important for scientists interested in synapse formation, axon specification, and cell death. The methods and interpretation of the data are solid, but the study can be further strengthened with some additional studies and controls.

      We greatly appreciate the thorough review and thoughtful suggestions from the reviewers and editors on our original manuscript. We provide a point-by-point response below. We added new studies on P10 mice and controls as suggested, and revised figures and text for clarification. The revised manuscript includes three new supplemental figures; major text revisions are copied under the corresponding responses.

      Reviewer #1 (Public Review):

      Summary:

      In this work, Ritchie and colleagues explore functional consequences of neuronal over-expression or deletion of the MAP3K DLK that their labs and others have strongly implicated in both axon degeneration, neuronal cell death, and axon regeneration. Their recent work in eLife (Li, 2021) showed that inducible over-expression of DLK (or the related LZK) induces neuronal death in the cerebellum. Here, they extend this work to show that inducible over-expression in Vglut1+ neurons also kills excitatory neurons in hippocampal CA1, but not CA3. They complement this very interesting finding with translatomics to quantify genes whose mRNAs are differentially translated in the context of DLK over-expression or knockout, the latter manipulation having little to no effect on the phenotypes measured. The authors note that several genes and pathways are differentially regulated according to whether DLK is over-expressed or knocked out. They note DLK-dependent changes in genes related to synaptic function and the cytoskeleton and ultimately relate this in cultured neurons to findings that DLK over-expression negatively impacts synapse number and changes microtubules and neurites, though with a less obvious correlation.

      Strengths:

      This work represents a conceptual advance in defining DLK-dependent changes in translation. Moreover, the finding that DLK may differentially impact neuronal death will become the basis for future studies exploring whether DLK contributes to differential neuronal susceptibility to death, which is a broadly important topic.

      We thank the reviewer for the comments on the value of our work.

      Weaknesses:

      This seems like two works in parallel that the authors have not yet connected. First is that DLK affects the translation of an interesting set of genes, and second, that DLK(OE) kills some neurons, disrupts their synapses, and affects neurite growth in culture.

      Specific questions:

      (1) Is DLK effectively knocked out? The authors reference the floxed allele in their 2016 work (PMID: 27511108); however, the methods of that paper say that the mouse will be characterized in a future publication. Has this ever been published? The major concern is that here the authors show that Cre-mediated deletion results in a smaller molecular weight protein and the maintenance of mRNA levels.

      We apologize for the out-of-date citation of the DLK(cKO)<sup>fl/fl</sup> mice. The DLK(cKO)<sup>fl/fl</sup> mice have been published in (Li et al., 2021; Saikia et al., 2022); excision of the flox-ed exon was verified using several Cre drivers (Pv-Cre, AAV-Cre, and VGlut1-Cre in this study). The flox-ed exon contains the initiation ATG and 148 amino acids. By western blot analysis using antibodies against C-terminal peptides of DLK on cerebellar extracts (in Li et al., 2021) and hippocampal extracts (this study), the full-length DLK protein was significantly reduced (Fig 1A-B); DLK is expressed in other hippocampal cells in addition to glutamatergic neurons, which explains the remaining full-length DLK detected.

      Our Ribo-seq of VGlut1-Cre; DLK(cKO)<sup>fl/fl</sup> detected remaining Dlk mRNAs lacking the floxed exon (Fig.S1C), which have several candidate ATGs at amino acid 223 and beyond (Fig.S1C1). We detected a very faint band for smaller molecular weight proteins on western blots only when the membrane was exposed 5X longer using Pico PLUS Chemiluminescent Substrate (Thermo Scientific, 34580) and a Licor Odyssey XF Imager (revised Fig. S1B). This smaller molecular weight protein might be produced using any of the candidate ATGs, but would represent an N-terminal truncated DLK protein lacking the ATP binding site and ~1/4 of the kinase domain, i.e. not a functional kinase.

      The revised manuscript has an updated citation for DLK(cKO)<sup>fl/fl</sup>. Revised Fig.S1B includes images of a western blot under normal exposure vs longer exposure using anti-DLK antibodies. New Fig.S1C1 shows the effects of the floxed exon on DLK.

      (2) Why does DLK(OE) not kill CA3 neurons? The phenomenon is clear but there is no link to gene expression changes. In fact, the highlighted transcript in this work, Stmn4, changes in a DLK-dependent manner in CA3.

      We agree that this is a very interesting question not answered by our gene expression analysis. While we verified that Stmn4 expression levels correlate with the levels of DLK, we do not think that increased Stmn4 per se in DLK(iOE) is a major factor accounting for CA1 death vs CA3 survival. Several published studies have also reported regulation of Stmn4 mRNAs in other cell types, in the contexts of cell death (Watkins et al., 2013; Le Pichon et al., 2017) and axon regeneration and cytoskeleton disruption (Asghari Adib et al., 2024; DeVault et al., 2024; Hu et al., 2019; Shin et al., 2019). As Stmns have significant expression and functional redundancy, conventional knockdown or overexpression of an individual Stmn generally does not lead to detectable effects on cellular function. As CA3 neurons are widely known for their dense connections and show resilience to NMDA-mediated neurotoxicity (Sammons et al., 2024; Vornov et al., 1991), we speculate that the differential vulnerability of CA1 and CA3 under DLK(iOE) reflects both intrinsic properties, such as gene expression, and their circuit connections.

      In the revised manuscript, we have included the following statement on pg 18:

      ‘While our data does not pinpoint the molecular changes explaining why CA3 would show less vulnerability to increased DLK, we may speculate that DLK(iOE) induced signal transduction amplification may differ in CA1 vs CA3. CA1 genes appear to be more strongly regulated than CA3 genes, consistent with our observation that increased c-Jun expression in CA1 is greater than that in CA3. Other parallel molecular factors may also contribute to resilience of CA3 neurons to DLK(iOE), such as HSP70 chaperones, different JNK isoforms, and phosphatases, some of which showed differential expression in our RiboTag analysis of DLK(iOE) vs WT (shown in File S2. WT vs DLK(iOE) DEGs). Together with other genes that show dependency on DLK, the DLK and Jun regulatory network contributes to the regional differences in hippocampal neuronal vulnerability under pathological conditions.’

      Further we state in ‘Limitation of our study’ on pg 20:

      ‘Our analysis also does not directly address why CA3 neurons are less vulnerable to increased DLK expression. Future studies using cell-type specific RiboTag profiling and other methods at a refined time window will be required to address how DLK dependent signaling interacts with other networks underlying hippocampal regional neuron vulnerability to pathological insults.’

      We hope our data will stimulate continued interest in testable hypotheses in future studies.

      (3) Why are whole hippocampi analyzed to IP ribosome-associated mRNAs? The authors nicely show a differential effect of DLK on CA1 vs CA3, but then - at least according to their methods - lyse whole hippocampi to perform IP/sequencing. Their data are therefore a mix of cells where DLK does and does not change cell death. The key issue is whether DLK does/does not have an effect based on the expression changes it drives.

      At the time of planning the Ribo-Tag experiment several years ago, we focused on the hippocampal glutamatergic neurons. Due to technical difficulty in micro-dissecting individual hippocampal regions from this early timepoint, we opted to use whole hippocampi to isolate ribosome-associated mRNAs. We agree with the reviewer that it is important to sort out DLK-dependent general gene expression changes vs those specific to a particular cell type where DLK impacts its survival. With emerging CA1, CA3 and other cell-type specific Cre drivers and advanced RNAseq technology, we hope that our work will stimulate broad interest in these questions in future studies. 

      In the revised manuscript, we have included a new analysis comparing our Vglut1-RiboTag profiling (P15) with CamK2-RiboTag (for CA1) and Grik4-RiboTag (for CA3) (P42) published in Traunmüller et al., 2023 (GSE209870). We find that >80% of the top ranked genes in their CamK2-RiboTag (for CA1) and Grik4-RiboTag (for CA3) were detected in our VGlut1-RiboTag (revised methods and Supplemental Excel File S3). CA1-enriched genes tended to be more highly expressed in DLK(cKO) compared to control, whereas CA3-enriched genes showed less significant correlation with DLK expression levels. Additionally, many genes known to specify CA1 fate do not show significant downregulation in DLK(iOE). This analysis, along with other data in our manuscript, is consistent with the idea that DLK does not regulate neuronal fate.

      In the revised manuscript, we presented this additional analysis in Fig. S6K-L, and expanded text description on page 9:

      ‘Additionally, we compared our Vglut1-RiboTag datasets with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We defined a list of genes enriched in CamK2-expressing CA1 neurons relative to Grik4-expressing CA3 neurons (CA1 genes), and those enriched in Grik4-expressing CA3 neurons (CA3 genes) (File S3). When compared with the entire list of Vglut1-RiboTag profiling in our control and DLK(cKO), we found CA1 genes tended to be expressed more in DLK(cKO) mice, compared to control (Fig.S6K), while CA3 genes showed a slight enrichment in control though the trend was less significant, and were less clustered towards one genotype (Fig.S6L). Moreover, many CA1 genes related to cell-type specification, such as FoxP1, Satb2, Wfs1, Gpr161, Adcy8, Ndst3, Chrna5, Ldb2, Ptpru, and Ntm, did not show significant downregulation when DLK was overexpressed. These observations imply that DLK likely specifically down-regulates CA1 genes both under normal conditions and when overexpressed, with a stronger effect on CA1 genes, compared to CA3 genes. Overall, the informatic analysis suggests that decreased expression of CA1 enriched genes may contribute to CA1 neuron vulnerability to elevated DLK, although it is also possible that the observed down-regulation of these genes is a secondary effect associated with CA1 neuron degeneration’.

      (4) Is the subtle decrease in synapse number (Bassoon/Homer co-loc.) in the DLK (OE) simply a function of neurons (and their synapses, presumably) having died? At the P15 time point that the authors choose because cell death is minimal, there is still a ~25% reduction in CA1 thickness (Figure 2B), which is larger than the ~15% change in synapses (Figure 5H) they describe.

      We thank the reviewer for the question. To address this, we have analyzed synapses in the CA1 region at P10 in DLK(iOE) mice when there was no detectable loss of neurons. At P10, we did not detect significant changes in Bassoon, Homer1, or colocalized puncta in CA1 (Fig.S11A-F). In P15 DLK(iOE) mice, Homer1 puncta were slightly smaller (Fig.5L) and showed a significant decrease in CA1 SR (Fig.5I).

      In the revised manuscript we have also redone our statistical analysis of synapses, using mice rather than ROIs as the unit of analysis (revised Fig. 5), as recommended by R3. We also analyzed synapses in CA3, and found no significant differences at P10 or P15 (Fig.S12). We would interpret the data to mean that the effects of DLK(OE) on synapses in CA1 may represent an early step in neuronal death. We hope that future studies will shed clarity on this question.
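      The change from ROIs to mice as the statistical unit can be illustrated with a minimal sketch. It assumes measurements are stored as (genotype, mouse, value) records; the function name `per_mouse_means` and all numbers are hypothetical and are not the authors' data or analysis pipeline. The point is only that ROI values are first averaged within each mouse, so that n equals the number of mice rather than the number of ROIs.

```python
from collections import defaultdict
from statistics import mean


def per_mouse_means(rows):
    """Collapse ROI-level values to one mean per mouse, grouped by genotype.

    rows: iterable of (genotype, mouse_id, value) tuples.
    Returns {genotype: [mean value for each mouse]}.
    """
    by_mouse = defaultdict(list)
    for genotype, mouse_id, value in rows:
        by_mouse[(genotype, mouse_id)].append(value)

    by_genotype = defaultdict(list)
    # Sort keys so the per-mouse order is deterministic.
    for (genotype, _mouse_id), values in sorted(by_mouse.items()):
        by_genotype[genotype].append(mean(values))
    return dict(by_genotype)


# Hypothetical puncta densities: three ROIs from each of two mice per genotype.
rows = [
    ("control", "m1", 10.0), ("control", "m1", 12.0), ("control", "m1", 14.0),
    ("control", "m2", 9.0), ("control", "m2", 11.0), ("control", "m2", 13.0),
    ("iOE", "m3", 8.0), ("iOE", "m3", 10.0), ("iOE", "m3", 12.0),
    ("iOE", "m4", 7.0), ("iOE", "m4", 9.0), ("iOE", "m4", 11.0),
]

means = per_mouse_means(rows)
# Any downstream test would then use two values per genotype (one per mouse),
# not six ROI values per genotype.
```

      Treating each ROI as an independent sample would inflate n, since ROIs from the same animal are not independent observations.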

      Reviewer #2 (Public Review):

      This manuscript describes the impact of deleting or enhancing the expression of the neuronal-specific kinase DLK in glutamatergic hippocampal neurons using clever genetic strategies, which demonstrates that DLK deletion had minimal effects while overexpression resulted in neurodegeneration in vivo. To determine the molecular mechanisms underlying this effect, ribotag mice were used to determine changes in active translation which identified Jun and STMN4 as DLK-dependent genes that may contribute to this effect. Finally, experiments in cultured neurons were conducted to better understand the in vivo effects. These experiments demonstrated that DLK overexpression resulted in morphological and synaptic abnormalities.

      Strengths:

      This study provides interesting new insights into the role of DLK in the normal function of hippocampal neurons. Specifically, the study identifies:

      (1) CA1 vs CA3 hippocampal neurons have differing sensitivity to increased DLK signaling.

      (2) DLK-dependent signaling in these neurons is similar to but distinct from the downstream factors identified in other cell types, highlighted by the identification of STMN4 as a downstream signal.

      (3) DLK overexpression in hippocampal neurons results in signaling that is similar to that induced by neuronal injury.

      The study also provides confirmatory evidence that supports previously published work through orthogonal methods, which adds additional confidence to our understanding of DLK signaling in neurons. Taken together, this is a useful addition to our understanding of DLK function.

      We thank the reviewer for careful reading and positive comments.

      Weaknesses:

      There are a few weaknesses that limit the impact of this manuscript, most of which are pointed out by the authors in the discussion. Namely:

      (1) It is difficult to distinguish whether the changes in the translatome identified by the authors are DLK-dependent transcriptional changes, DLK-dependent post-transcriptional changes or secondary gene expression changes that occur as a result of the neurodegeneration that occurs in vivo. Additional expression analysis at earlier time points could be one method to address this concern.

      We appreciate the reviewer’s comment, and have performed new analysis on c-Jun and p-c-Jun levels in CA1, CA3, and DG in P10 DLK(OE) mice. Our data suggest that in CA3 elevations in p-c-Jun and c-Jun occur separately from cell death in a DLK-dependent manner, though the high elevation of both p-c-Jun and c-Jun in CA1 correlates with cell death.

      The data is presented in revised Fig.S7A,B, and described in revised text on pg 9-10:

      ‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis.’

      Also, on pg.10:

      In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).

      (2) Related to the above, it is difficult to conclusively determine from the current data whether the changes in synaptic proteins observed in vivo are a secondary result of neuronal degeneration or a primary impact on synapse formation. The in vitro studies suggest this has the potential to be a primary effect, though the difference in experimental paradigm makes it impossible to determine whether the same mechanisms are present in vitro and in vivo.

      We appreciate the comment, which is related to R1 point 4. We have performed further analysis and revised the text on pg.12 with the following text:

      ‘To assess effects of DLK overexpression on synapses, we immunostained hippocampal sections from both P10 and P15, with age-matched littermate controls. Quantification of Bassoon and Homer1 immunostaining revealed no significant differences in CA1 SR and CA3 SR and SL in P10 mice of Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> and control (Fig.S11A-F, S12A-J). In P15, Bassoon density and size in CA1 SR were comparable in both mice (Fig 5G, H, K), while Homer1 density and size were reduced in DLK(iOE) (Fig.5G,I, L). Overall synapse number in CA1 SR was similar in DLK(iOE) and control mice (Fig.5J). Similar analysis on CA3 SR and SL detected no significant difference from control (Fig.S12M-V).’

      We would interpret the data to mean that the effects of DLK(OE) on synapses in CA1 may represent an early step in neuronal death. We hope that future studies will shed clarity on this question.

      Additionally, to address whether the same mechanisms are present in vitro, we have performed further analysis on cultured hippocampal neurons. As described in the Methods, we made hippocampal neuron cultures from P1 pups of the following crosses:

      For control: Vglut1<sup>Cre/+</sup> X Rosa26<sup>tdT/+</sup> 

      For DLKcKO: Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>  X Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>;Rosa26<sup>tdT/+</sup> 

      For DLKiOE: H11-DLK<sup>iOE/iOE</sup> X Vglut1<sup>Cre/+</sup>;Rosa26<sup>tdT/+</sup> 

      Dissociated cells from a given litter were pooled into the same culture. Because there were different proportions of neurons with our genotype of interest in each culture, it is not simple to know whether DLK was causing significant cell death.

      On pg 13, we stated our observation:

      ‘We did not notice an obvious effect of DLK(iOE) or DLK(cKO) on neuron density in cultures at DIV2. To assess neuronal type distribution in our cultures, we immunostained DIV14 neurons with antibodies for Satb2, as a CA1 marker (Nielsen et al., 2010), and Prox1, as a marker of DG neurons (Iwano et al., 2012). We did not observe significant differences in the proportion of cells labeled with each marker in DLK(cKO) or DLK(iOE) cultures (Fig.S13E). These data are consistent with the idea that DLK signaling does not have a strong role in neuron-type specification both in vivo and in vitro’.

      (3) The phenotype of DLK cKO mice is very subtle (consistent with previous reports) and while the outcome of increased DLK levels is interesting, the relevance to physiological DLK signaling is less clear. What does seem possible is that increased DLK may phenocopy other neuronal injuries but there are no real comparisons to directly address this in the manuscript. It would be helpful for the authors to provide this analysis as well as a table with all of the translational changes along with fold changes.

      Thank you for the suggestion. The fold changes of genes showing significantly altered expression in DLK(cKO) and DLK(iOE) are provided in the excel files (Supplementary excel File S1 WT vs DLK(cKO) DEGs and File S2. WT vs DLK(iOE) DEGs, highlighted columns B and F).  

      On pg 6, we revised the text as follows to include a comparison of DLK levels in other physiological conditions and in our mice:

      ‘Several studies have reported that DLK protein levels increase under a variety of conditions, including optic nerve crush (Watkins et al., 2013), NGF withdrawal (~2 fold) (Huntwork-Rodriguez et al., 2013; Larhammar et al., 2017), and sciatic nerve injury (Larhammar et al., 2017). Induced human neurons show increased DLK abundance about ~4 fold in response to ApoE4 treatment (Huang et al., 2019). Increased expression of DLK can lead to its activation through dimerization and autophosphorylation (Nihalani et al., 2000)’.

      And,

      ‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’

      In Discussion, we state (pg. 16): ‘The levels of DLK in our DLK(iOE) mice model appear comparable to those reported under traumatic injury and chronic stress.’

      (4) For the in vivo experiments, it is unclear whether multiple sections from each animal were quantified for each condition. More information here would be helpful and it is important that any quantification takes multiple sections from each animal into account to account for natural variability.

      We apologize this was unclear in the original manuscript.

      In the revised methods, under Confocal imaging and quantification (pg 33), we stated: “For brain tissue, three sections per mouse were imaged with a minimum of three mice per genotype for data analysis.”

      In revised figure legends, we made it clear that multiple sections from each animal have been used for quantification in all instances, i.e. “Each dot represents averaged thickness from 3 sections per mouse, N≥4 mice/genotype per timepoint.” 

      In Fig.1F-H: “Each dot represents averaged intensity from 3 sections per mouse”

      In Fig.S3B “Data points represent individual mice, averages taken across 3 sections per mouse”

      Reviewer #3 (Public Review):

      Dr Jin and colleagues revisit DLK and its established multifactorial roles in neuronal development, axonal injury, and neurodegeneration. The ambitious aim here is to understand the DLK-dependent gene network in the brain and, to pursue this, they explore the role of DLK in hippocampal glutamatergic neurons using conditional knockout and induced overexpression mice. They produce evidence that dorsal CA1 and dentate gyrus neurons are vulnerable to elevated expression of DLK, while CA3 neurons appear unaffected. Then they identify the DLK-dependent translatome featured by conserved molecular signatures and cell-type specificity. Their evidence suggests that increased DLK signaling is associated with possible STMN4 disruptions to microtubules, among else. They also produce evidence on cultured hippocampal neurons showing that expression levels of DLK are associated with changes in neurite outgrowth, axon specification, and synapse formation. They posit that downstream translational events related to DLK signaling in hippocampal glutamatergic neurons are a generalizable paradigm for understanding neurodegenerative diseases.

      Strengths

      This is an interesting paper based on a lot of work and a high number of diverse experiments that point to the pervasive roles of DLK in the development of select glutamatergic hippocampal neurons. One should applaud the authors for their work in constructing sophisticated molecular cre-lox tools and their expert Ribotag analysis, as well as technical skill and scholarly treatment of the literature. I am somewhat more skeptical of interpretations and conclusions on spatial anatomical selectivity without stereological approaches and also going directly from (extremely complex) Ribotag profiling patterns to relevance based on immunohistochemistry and no additional interventions to manipulate (e.g. by knocking down or blocking) their top Ribotag profile hits. Also, it seems to this reviewer that major developmental claims in the paper are based on gene translational profiling dependent on DLK expression, not DLK activation, despite some evidence in the paper that there is a correlation between the two. Therefore, observed patterns and correlations may or may not be physiologically or pathologically relevant. Generalizability to neurodegenerative diseases is an overreach not justified by the scope, approach, and findings of the paper.

      We thank the reviewer for the encouraging and constructive comments on the manuscript.

      Weaknesses and Suggestions:

      The authors state that the rationale for the translatomic studies is "to gain molecular understanding of gene expression associated with DLK in glutamatergic neurons" and to characterize the "DLK-dependent molecular and cellular network". However, a problem with the experimental design is the selection of an anatomical region at a time point featured by active neurodegeneration. Therefore, it is not straightforward that the differentially expressed genes or pathways caused by DLK overexpression changes could be due to processes related to neurodegeneration. Indeed, the authors find enrichment of signals related to pathways involved in extracellular matrix organization, apoptosis, unfolded protein responses, the complement cascade, DNA damage responses, and depletion of signals related to mitochondrial electron transport, etc., all of which could be the consequence of neurodegeneration regardless of cause. A more appropriate design to discover DLK-dependent pathways might be to look at a region and/or a time point that is not confounded by neurodegeneration.

      We appreciate the reviewer’s comment. We included our thoughts in ‘Limitation of the study’ (pg 20):

      ‘Future studies using cell-type specific RiboTag profiling and other methods at a refined time window will be required to address how DLK dependent signaling interacts with other networks underlying hippocampal regional neuron vulnerability to pathological insults.’

      In a related vein, the authors ask "if the differentially expressed genes associated with DLK(iOE) might show correlation to neuronal vulnerability" and, to answer this question, they select the set of differentially expressed genes after DLK overexpression and assess their expression patterns in various regions under normal conditions. It looks to me that this selection is already confounded by neurodegeneration which could be the cause for their downregulation. Therefore, such gene profiles may not be directly linked to neuronal vulnerability. A similar issue also relates to the conclusion that "...the enrichment of DLK-dependent translation of genes in CA1 suggests that the decreased expression of these genes may contribute to CA1 neuron vulnerability to elevated DLK".

      We agree with the reviewer’s concern that it is difficult to separate neurodegenerative consequences from changes caused by DLK solely based on our translatomics studies on P15 DLK(iOE) mice. As noted in our responses to reviewer 1 (point 4) and reviewer 2 (point 1), we have included a new analysis of P10 mice (Fig.S7A,B), when neurons did not show detectable signs of degeneration.

      Several lines of evidence support the idea that some differentially expressed genes in DLK(iOE) vs control are likely specific to increased DLK signaling.

      First, the genes identified in DLK(iOE) vs control represent a small set of genes (260), which is comparable to other DLK dependent datasets (Asghari Adib et al., 2024) but shows cell-type specificity.

      Second, our analysis using rank-rank hypergeometric overlap (RRHO) detects a significant correlation between upregulated genes from DLK(iOE) vs downregulated genes in DLK(cKO), and vice versa, suggesting that expression of a similar set of genes is dependent on DLK (Fig.3C, S6C-E). Consistently, GO term analysis using the list of genes coordinately regulated by DLK, derived from our RRHO analysis, leads to identification of similar GO terms related to up- and downregulated genes as using DLK(iOE)-RiboTag data alone. SynGO analysis of DLK(iOE) regulated genes and DLK(cKO) regulated genes also identified similar synaptic processes regulated by significantly regulated genes (Fig.3F and S6J).

      Third, we performed additional analysis comparing our Vglut1-RiboTag dataset with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We observed >80% overlap among the top ranked genes (revised Methods). We described this analysis on pg 9 and Fig. S6K-L (and Supplemental Excel File S3):

      ‘Additionally, we compared our Vglut1-RiboTag datasets with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We defined a list of genes enriched in CamK2-expressing CA1 neurons relative to Grik4-expressing CA3 neurons (CA1 genes), and those enriched in Grik4-expressing CA3 neurons (CA3 genes) (File S3). When compared with the entire list of Vglut1-RiboTag profiling in our control and DLK(cKO), we found CA1 genes tended to be expressed more in DLK(cKO) mice, compared to control (Fig.S6K), while CA3 genes showed a slight enrichment in control though the trend was less significant, and were less clustered towards one genotype (Fig.S6L). Moreover, many CA1 genes related to cell-type specification, such as FoxP1, Satb2, Wfs1, Gpr161, Adcy8, Ndst3, Chrna5, Ldb2, Ptpru, and Ntm, did not show significant downregulation when DLK was overexpressed. These observations imply that DLK likely specifically down-regulates CA1 genes both under normal conditions and when overexpressed, with a stronger effect on CA1 genes, compared to CA3 genes. Overall, the informatic analysis suggests that decreased expression of CA1 enriched genes may contribute to CA1 neuron vulnerability to elevated DLK, although it is also possible that the observed down-regulation of these genes is a secondary effect associated with CA1 neuron degeneration.’

      To understand the role and relevance of the DLK overexpression model, there should be a discussion of to what extent it corresponds to endogenous levels of DLK expression or DLK-MAPK pathway activation under baseline or pathological conditions.

      We appreciate the suggestion, which is similar to R2 point 3. We have revised the text and discussion to include how DLK levels may be altered in other physiological conditions vs our mice.

      Pg. 6: ‘Several studies have reported that DLK protein levels increase under a variety of conditions, including optic nerve crush (Watkins et al., 2013), NGF withdrawal (~2 fold) (Huntwork-Rodriguez et al., 2013; Larhammar et al., 2017), and sciatic nerve injury (Larhammar et al., 2017). Induced human neurons show increased DLK abundance about ~4 fold in response to ApoE4 treatment (Huang et al., 2019). Increased expression of DLK can lead to its activation through dimerization and autophosphorylation (Nihalani et al., 2000)’.

      And,

      ‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’

      In Discussion (pg. 16): ‘The levels of DLK in our DLK(iOE) mice model appear comparable to those reported under traumatic injury and chronic stress.’

      The authors posit that "dorsal CA1 neurons are vulnerable to elevated DLK expression, while neurons in CA3 appear largely resistant to DLK overexpression". This statement assumes that DLK expression levels start at a similar baseline among regions. Do the authors have any such data? Ideally, they should show whether DLK expression and p-c-Jun (as a marker of downstream DLK signaling) are the same or different across regions in both WT and overexpression mice. For example, what are the DLK/p-c-Jun expression levels in regions other than CA1 in Supplementary Figures 2-3 and how do they compare with each other? Normalization to baseline for each region does not allow such a comparison. Also, in Supplementary Figure 6, analyses and comparisons between regions are done at a time point when degeneration has already started. Ideally, these should be done at P10.

      We thank the reviewer for raising these points. In the revised manuscript we have included protein expression analysis of DLK (Fig S3), c-Jun, and p-c-Jun at P10 (Fig. S7).

      We provided a quantification of DLK immunostaining intensity in CA1 and CA3 in Fig.S3D,E and find roughly comparable levels between regions.

      Pg. 6: ‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’

      We provided our quantifications without normalization to baseline in each region for c-Jun and p-c-Jun, and revised the text accordingly:

      Pg. 9-10: ‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis’.

      Pg. 10: ‘In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).’

      Illustration of proposed selective changes in hippocampal sector volume needs to be very carefully prepared in view of the substantial claims on selective vulnerability. In 2A under P15 and especially P60, it is difficult to see the difference - this needs lower magnification and a lot of care that anteroposterior levels are identical because hippocampal sector anatomy and volumes of sectors vary from level to level. One wonders if the cortex shrinks, too. This is important.

      Thank you for raising the point. We have provided images to view the anteroposterior level in Fig.S2A-C. We noticed that the cortex in DLK(OE) mice becomes thinner, along with expansion of ventricles in some animals at later timepoints (Fig.S2C).

      One cannot be sure that there is selective death of hippocampal sectors with DLK overexpression versus, say, rearrangement of hippocampal architecture. One may need stereological analysis, otherwise this substantial claim appears overinterpreted.

      We appreciate the comment.

      In the revised manuscript, we included a new supplemental figure (Fig. S2) showing lower magnification images of coronal sections, and used cautionary wording, such as ‘CA3 is less vulnerable, compared to CA1’, to minimize the impression of over-interpretation.  By NeuN staining, at P10, P15, P60, we did not observe detectable difference in overall hippocampus architecture, apart from noted cell death of CA1 and DG and associated thinning of each of the layers. At 46 weeks, some animals showed differences in the overall shape of dorsal hippocampus, though this appeared to reflect a disproportionately large CA3 region compared to other regions (Fig S2). Increased GFAP staining (Fig.S5A-C) was detected in CA1 but not in CA3, and microglia by IBA1 staining (Fig.S5E) also displayed less reactivity in CA3, compared to CA1. Thus, based on NeuN staining, GFAP staining, IBA1 staining and analysis of the differentially regulated genes, we infer that the effect of DLK(iOE) in CA1 is different than the effect on CA3.

      Is the GFAP excess reflective of neuroinflammation? What do microglial markers show? The presence of neuroinflammation does not bode well with apoptosis. Speaking of which, TUNEL in one cell in Supplementary Figure 4E is not strong evidence of a more widespread apoptotic event in CA1.

      We have included staining data for the microglia marker IBA1. Both GFAP and IBA1 showed evidence of reactivity particularly in the CA1 region (S5A-E), supporting the differential vulnerability in different regions, though whether cell death is primarily due to apoptosis is unclear.

      We agree that our data showing sparse TUNEL staining at P15 (Fig S5F,G) do not rule out other mechanisms of cell death. We have included this in our limitations (pg.20): “While we find evidence for apoptosis, other forms of cell death may also occur.”

      In several places in the paper (as illustrated in Figure 4B, Supplementary Figure 2B, etc.): the unit of biological observation in animal models is typically not a cell, but an organism, in which averaged measures are generated. This is a significant methodological problem because it is not easy to sample neurons without involving stereological methods. With the approach taken here, there is a risk that significance may be overblown.

      We appreciate the reviewer’s point. We used the same region for quantification of RNAscope, and quantification was genotype-blind when possible. We revised the graphs to show mean values for individual mice in Fig.4B, 4C, and Fig.S3B (previously Fig.S2B).

      Other Comments and Questions:

      Supplementary Figure 9: The authors state that data points are shown for individual ROIs - ideally, they should also show averages for biological replicates. Can the authors confirm that statistical analyses are based on biological replicates (mice) and not ROIs?

      We have revised the graphs to show averages from individual mice in Fig.5B-D, Fig.5E-F (previously Fig.S9G-I), Fig.5H-J, Fig.5K-L (previously Fig.S9J-L), and Fig.S10B,C,E,F (previously Fig.S9B,C,E,F). The statistical analyses are based on biological replicates (mice).
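
      The distinction between ROI-level data points and biological replicates can be illustrated with a minimal sketch. It assumes measurements arrive as (mouse ID, value) pairs; the function name `per_mouse_means` and the example values are hypothetical and are not taken from the manuscript or its analysis code.

```python
from collections import defaultdict
from statistics import mean


def per_mouse_means(measurements):
    """Collapse ROI- or section-level values to one mean per mouse.

    `measurements` is an iterable of (mouse_id, value) pairs, so the
    returned dictionary has one entry per biological replicate.
    """
    by_mouse = defaultdict(list)
    for mouse_id, value in measurements:
        by_mouse[mouse_id].append(value)
    return {mouse_id: mean(values) for mouse_id, values in by_mouse.items()}


# Hypothetical values: three sections per mouse, two mice.
example = [
    ("m1", 10.0), ("m1", 12.0), ("m1", 14.0),
    ("m2", 20.0), ("m2", 22.0), ("m2", 24.0),
]
means = per_mouse_means(example)
```

      Any statistical test would then take the per-mouse values as its input, so that N reflects the number of animals rather than the number of ROIs.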

      For in vitro experiments, what is the effect of DLK overexpression on neuronal viability and density? Could these variables confound effects on synaptogenesis/synapse maturation?

      As described in the Methods, we made hippocampal neuron cultures from P1 pups of the following crosses:

      For control: Vglut1<sup>Cre/+</sup> X Rosa26<sup>tdT/+</sup> 

      For DLKcKO: Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>  X Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>;Rosa26<sup>tdT/+</sup> 

      For DLKiOE: H11-DLK<sup>iOE/iOE</sup> X Vglut1<sup>Cre/+</sup>;Rosa26<sup>tdT/+</sup> 

      Dissociated cells from a given litter were pooled into the same culture. Because there were different proportions of neurons with our genotype of interest in each culture, it is not simple to know whether DLK was causing significant cell death.

      On pg 13, we stated our observation:

      ‘We did not notice an obvious effect of DLK(iOE) or DLK(cKO) on neuron density in cultures at DIV2. To assess neuronal type distribution in our cultures, we immunostained DIV14 neurons with antibodies for Satb2, as a CA1 marker (Nielsen et al., 2010), and Prox1, as a marker of DG neurons (Iwano et al., 2012). We did not observe significant differences in the proportion of cells labeled with each marker in DLK(cKO) or DLK(iOE) cultures (Fig.S13E). These data are consistent with the idea that DLK signaling does not have a strong role in neuron-type specification both in vivo and in vitro’.

      We cannot rule out that variable factors in our cultures may confound effects on synaptogenesis/synapse maturation, and hope that future studies will clarify this.

      Correlations between c-jun expression and phosphorylation are extremely important and need to be carefully and convincingly documented. I am a bit concerned about Supplementary Figure 6 images, especially 6B-CA1 (no difference between control and KO, too small images) and 6D (no p-c-Jun expression at all anywhere in the hippocampus at P15?).

      At P10, P15, and P60 we stained for p-c-Jun using the Rabbit monoclonal p-c-Jun (Ser73) (D47G9) antibody from Cell Signaling (cat# 3270) at a 1:200 dilution and imaged using an LSM800 confocal microscope with a 20x objective. We observed p-c-Jun to be quite low generally in control animals. We have replaced the images in Fig.S7F (previously S6D), and adjusted the brightness/contrast to enable better visualization of the low signal in Fig.S7B,D,F (previously Fig.S6B,D).

      We revised our text to present the data carefully as stated above:

      Pg. 9-10: ‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis’.

      Pg. 10: ‘In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).’

      Recommendations for the authors:

      Several major and minor reservations were raised. The major issues are the need for more information about the over-expression of DLK and a need to extrapolate to an in vivo condition with DLK. A considerable amount of useful information is presented with some very nicely done experiments but it is not yet a coherent or integrated story. The lack of impact of DLK overexpression in some neurons is perhaps the most impactful observation of the study and would be great to have more information around the differential transcriptional/signaling response in these cell types. There is also a need for more experimental details and to address several questions about the mouse genetic and translatome analysis. They are valid concerns that require attention by the authors.

      We thank the editors and reviewers for their thoughtful evaluation and suggestions. We hope that the editors and reviewers find that the new data and text changes in our revised manuscript, along with the above point-by-point response, have addressed the concerns and strengthened our findings.

      Minor points:

      (1) The authors state that deletion of DLK has no effect on CA1 at 1yr, however, the image of CA1 in Figure S1D shows substantially fewer NeuN+ neurons. Is this a representative field of view?

      We have re-examined the images, and observed no effect on hippocampal morphology at 1 yr. We have now included representative images in the revised Fig S1D.

      (2) Is the DLK protein section staining in Figure 2C a real signal? The staining looks like speckles and is purely somatic. Axonal staining is widely expected based on the literature and the authors' own work. There should be a specificity control.

      To our knowledge, axonal staining of DLK reported in the literature is mostly based on cultured DRG neurons. In addition to the reported axonal localization, DLK is present in the cell soma, near the golgi (Hirai et al., 2002), and in the post-synaptic density (Pozniak et al., 2013).

      In the revised manuscript, we addressed this point by including controls with no primary antibody, and using an antibody against the closely related kinase, LZK. These additional data are shown in (Fig.S3C,D) (previously Fig.S2C), supporting that DLK protein staining represents real signal.  At P10 and P15, DLK immunostaining around CA3 showed axonal staining of the mossy fibers, as well as in the soma and dendritic layers (Fig.S3C,D). A similar pattern was also seen in primary cultured neurons (Fig 6A).

      (3) The protein expression of DLK in the transgenic overexpressor (Figure S7C) looks, to the resolution of this blot, to be at least 50kD heavier than 'WT' DLK. Can the authors explain this discrepancy?

      The Cre-induced DLK(iOE) transgene has T2A and tdTomato in-frame to the C-terminus of DLK. It is known that T2A ‘self-cleavage’ is often incomplete. DLK-T2A-tdTomato would be about 50 kD bigger than WT DLK. We now include the transgene design in revised Fig S1D, and also state in the figure legend of Fig.S8C (previously S7C) that ‘Larger molecular weight band of DLK in Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> would match the predicted molecular weight of DLK-T2A-tdTomato if T2A-peptide induced ‘self-cleavage’ due to ribosomal skipping is ineffective (Fig.S1D).’

      (4) Expression changes in DLK affect various aspects of neurites in CA1 cultures (Figure 6), and changes in DLK also modestly affect STMN4 (and 2, perhaps indirectly) levels (Figure S7C), but there is no indication that DLK acts via STMN4 to cause these changes. It is not clear what to make of these data. Of note, Stmn4 levels change in response to DLK in CA3, without DLK affecting cell death in this region.

      We appreciate and agree with the comment. Other studies (Asghari Adib et al., 2024; DeVault et al., 2024; Hu et al., 2019; Larhammar et al., 2017; Le Pichon et al., 2017; Shin et al., 2019; Watkins et al., 2013) reported expression changes in Stmn4 mRNAs in other cell types and cellular contexts, which appeared to depend on DLK. Hippocampal neurons express multiple Stmns (Fig.S8A). While we present our analysis on the effects of DLK dosage on Stmn4, and also Stmn2, we do not think that DLK-induced changes of Stmn4 expression per se are a major factor underlying CA1 cell death vs CA3 survival.

      In the revised manuscript, we addressed this point in ‘Limitation of our study’ (pg 20):

      ‘Additional experiments will be needed to elucidate in vivo roles of STMN4 and its interaction with other STMNs’.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      • Reviewer #1 (Evidence, reproducibility and clarity (Required)): Summary:

      In this manuscript, Hammond et al. study robustness of the vertebrate segmentation clock against morphogenetic processes such as cell ingression, cell movement and cell division to ask whether the segmentation clock and morphogenesis are modular or not. The modularity of these two would be important for evolvability of the segmenting system. The authors adopt a previously proposed 3D model of the presomitic mesoderm (Uriu et al. 2021 eLife) and include new elements; different types of cell ingression, tissue compaction and cell cycles. Based on the results of numerical simulations that synchrony of the segmentation clock is robust, the authors conclude that there is a modularity in the segmentation clock and morphogenetic processes.

      The presented results support the conclusion. The manuscript is clearly written. I have several comments that could help the authors further strengthen their arguments.

      Major comment:

      [Optional] In both the current model and Uriu et al. 2021, coupling delay in phase oscillator model is not considered. Given that several previous studies (e.g. Lewis 2003, Herrgen et al. 2010, Yoshioka-Kobayashi et al. 2020) suggested the presence of coupling delays in Delta-Notch signaling, could the authors analyze the effect of coupling delay on robustness of the segmentation clock against morphogenetic processes?

      Response: We thank the reviewer for the suggestion. Owing to the computational demands of including such a delay in the model, we cannot feasibly repeat every simulation analysed here in the presence of delay, and would like to note that the increased computational demand that delays put on the simulations is also the reason why Uriu et al. (2021) did not include it, as stated in their published exchange with reviewers. However, analogous to our analysis in figure 7, we can analyse how varying the position of progenitor cell ingression affects synchrony in the presence of the coupling delay measured in zebrafish by Herrgen et al. (2010). We show this analysis in a new figure 8 (8B, specifically), on page 21, and discuss its implications in the text on pages 20-22. Our analysis reveals that the model cannot recover synchrony using the default parameters used by Uriu et al. (2021) and reveals a much stronger dependence on the rate of cell mixing (vs) than shown in the instantaneous coupling case (cf. figure 7). However, by systematically varying the value of the delay we find that a relatively minor increase in the delay is sufficient to recover synchrony using the parameter set of Uriu et al. (see figure 8C). Repeating this across the three scenarios of cell ingression we see that the combination of coupling strength and delay determines the robustness of synchrony to varying position of cell ingression. This suggests that the combination of these two parameters constrains the evolution of morphogenesis.
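
      For readers unfamiliar with delayed coupling, the idea can be sketched with a generic Kuramoto-type phase model in which each oscillator responds to the phases its partners had a fixed time earlier. This is only an illustration under strong simplifying assumptions (all-to-all coupling, identical frequencies, no noise, no cell movement or ingression); it is not the 3D tissue model used in the manuscript, and every name and parameter value below is hypothetical.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Kuramoto order parameter: 1 for identical phases, near 0 when phases cancel."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def simulate_delayed(n=20, omega=1.0, kappa=0.5, delay_steps=10,
                     dt=0.01, steps=2000, seed=0):
    """Euler integration of
    dtheta_i/dt = omega + kappa/(n-1) * sum_j sin(theta_j(t - tau) - theta_i(t)),
    with tau = delay_steps * dt. Returns the final order parameter.
    """
    rng = random.Random(seed)
    initial = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    # Assumed history: before t = 0 every oscillator holds its initial phase.
    history = [list(initial) for _ in range(delay_steps + 1)]
    for _ in range(steps):
        current = history[-1]   # phases now
        delayed = history[0]    # phases delay_steps time steps ago
        new = []
        for i in range(n):
            coupling = sum(math.sin(delayed[j] - current[i])
                           for j in range(n) if j != i)
            new.append(current[i] + dt * (omega + kappa * coupling / (n - 1)))
        history.append(new)
        history.pop(0)          # keep exactly delay_steps + 1 time points
    return order_parameter(history[-1])
```

      Setting `delay_steps=0` gives the instantaneous-coupling case, so the same function can be used to compare synchrony with and without delay for a given coupling strength. The history buffer is what makes delayed simulations more expensive than instantaneous ones.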

      Minor comments:

      • PSM radius and oscillation synchrony are both denoted by the same letter r. The authors should use different letters for these two to avoid confusion.

      Response: We thank the reviewer for spotting this. This has now been changed throughout to rT, as shorthand for ‘radius of tissue’.

      • page 5 Figure 1 caption: (x-x_a/L) should be (x-x_a)/L.

      Response: We thank the reviewer for spotting this. This has now been corrected.

      • Figure 3C: Description of black crosses in the panels is required in the figure legend.

      Response: Thank you for spotting this. The legend has now been corrected.

      • Figure 3C another comment: In this panel, synchrony r at the anterior PSM is shown. It is true that synchrony at anterior PSM is most relevant for normal segment formation. However, in this case, the mobility profile is changed, so it may be appropriate to show how synchrony at mid and posterior PSM would depend on changes in mobility profile. Is synchrony improved by cell mobility at the region where cell ingression happens?

      Response: We thank the reviewer for the suggestion. We have now plotted the synchrony along the AP axis for varying motility profiles, and this can be seen in figure 3 supplement 1, and is briefly discussed in the text on page 11. We show that while the synchrony varies with x-position (as already expected, see figure 2), there is no trend associated with the shape of the motility profile.

      • In page 12, the authors state that "the results for the DP and DP+LV cases are exactly equal for L = 185 um, as .... and the two ingression methods are numerically equivalent in the model". I understood that in this case two ingression methods are equivalent, but I do not understand why the results are "exactly" equal, given the presence of stochasticity in the model.

      Response: These results can be exactly equal despite the simulations being stochastic because they were both initialised using the same ‘seed’ in the source code. However, we now see that this might be confusing to the reader, and we have re-generated this figure but this time initialising the simulations for each ingression scenario using a different seed value. This is now reflected in the text on page 12 and in figure 4.
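
      The seed behaviour described here is a general property of pseudo-random number generators, and can be shown with a minimal sketch. The random walk below is a hypothetical stand-in for a stochastic simulation and is unrelated to the model's source code.

```python
import random


def noisy_walk(seed, steps=5):
    """Return a short Gaussian random walk generated from a seeded generator."""
    rng = random.Random(seed)
    position = 0.0
    trajectory = []
    for _ in range(steps):
        position += rng.gauss(0.0, 1.0)
        trajectory.append(position)
    return trajectory


same_a = noisy_walk(seed=1)
same_b = noisy_walk(seed=1)     # identical to same_a, value for value
different = noisy_walk(seed=2)  # a different realisation of the same process
```

      Two runs that are numerically equivalent and share a seed therefore give exactly equal output, whereas distinct seeds give independent realisations.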

      • The authors analyze the effect of cell density on oscillation synchrony in Fig. 4 and they mention that higher density increases robustness of the clock by increasing the average number of interacting neighbours. I think it would be helpful to plot the average number of neighbouring cells in simulations as a function of density to quantitatively support the claim.

      Response: We thank the reviewer for their suggestion. Distributions of neighbour numbers for exemplar simulations with varying density can now be found in figure 4 supplementary figure 1 and are referred to in the text on page 11.
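
      As a rough illustration of why neighbour number should rise with density, the sketch below counts neighbours for cells placed uniformly at random in a cube. Everything here is an assumption made for illustration (the function name `mean_neighbours`, the cubic box, the interaction radius, and the density values); it does not reproduce the PSM geometry or the parameters of the model.

```python
import numpy as np

def mean_neighbours(density, box=6.0, radius=1.0, seed=0):
    """Mean number of cells within `radius` of each cell, for cells placed
    uniformly at random in a cube of side `box` (hypothetical units)."""
    rng = np.random.default_rng(seed)
    n_cells = int(round(density * box**3))
    pos = rng.uniform(0.0, box, size=(n_cells, 3))
    # All pairwise distances (fine for a few hundred cells).
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    # Count cells closer than `radius`, excluding each cell itself.
    counts = (dist < radius).sum(axis=1) - 1
    return counts.mean()

densities = [0.5, 1.0, 2.0]
means = [mean_neighbours(rho) for rho in densities]
for rho, m in zip(densities, means):
    print(f"density={rho:.1f}  mean neighbours={m:.2f}")
```

      In the bulk of such a toy tissue the expected count scales as density times the volume of the interaction sphere, so doubling the density roughly doubles the number of neighbours each cell couples to.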

      • The authors analyze the effect of PSM length on synchrony in Fig. 4. I think kymographs of synchrony r as shown in Fig. 2D would also be helpful to show that indeed cells get synchronized while advecting through a longer PSM.

      Response: We thank the reviewer for their suggestion and agree that visualising the data in this way is an excellent idea. We have generated the suggested kymographs and added them to figure 4 as supplements 2 and 4, and discussed these results in the text on page 12.

      • I understand that cells in M phase can interact with neighboring cells with the same coupling strength kappa in the model, although their clocks are arrested. If so, this aspect should be also mentioned in the main text in page 16, as this coupling can be another noise source for synchrony.

      Response: We agree this is an important clarification. We explicitly state this, and briefly justify our choice, in the text on page 16.

      • Figure 5-figure supplement 2: panel labels A, B, C are missing.

      Response: Thank you for bringing this to our attention. These have now been added.

      • Figure 5-figure supplement 3: panel labels A, B, C are missing.

      Response: Thank you for bringing this to our attention. These have now been added.

      • Reviewer #1 (Significance (Required)):

      Synchronization of the segmentation clock has been studied by mathematical modeling, but most previous studies considered cells in a static tissue without morphogenesis. In the previous study by Uriu et al. 2021, morphogenetic processes such as cell advection due to tissue elongation, tissue shortening, and cell mobility were considered in synchronization. The current manuscript provides methodological advances in this aspect by newly including cell ingression, tissue compaction and cell cycle. In addition, the authors bring a concept of modularity and evolvability to the field of the vertebrate segmentation clock, which is new. On the other hand, the manuscript confirms that the synchronization of the segmentation clock is robust by careful simulations, but it does not propose or reveal new mechanisms for making it robust or modular. The main targets of the manuscript will be researchers working on somitogenesis and evolutionary biologists who are interested in evolution of developmental systems. The manuscript will also be of interest to broader audiences, such as developmental biologists, biophysicists, and physicists and computer scientists who are working on dynamical systems.

      Response: We thank the reviewer for their interest in our manuscript and for acknowledging us as one of the first to address the modularity and evolvability of somitogenesis. We hope that this work will encourage others to think about these concepts in this system too. In the original submission, we identified a high enough coupling strength as the main mechanism underlying the identified modularity in somitogenesis. Since then, we have included an analysis of the coupling delay and find that it is the interplay between coupling strength and coupling delay that mediates the identified modularity, allowing PSM morphogenesis and the segmentation clock to evolve independently in regions of parameter space that are constrained and determined by the interplay between these two parameters. We have now added an extra figure (figure 8) where we explore this interplay and have discussed it at length in the last section of the results and in the discussion. We again thank the reviewer for encouraging us to include delays in our analysis.

      • Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      SUMMARY

      The manuscript from Hammond et al. investigates the modularity of the segmentation clock and morphogenesis in early vertebrate development, focusing on how these processes might independently evolve to influence the diversity of segment numbers across vertebrates.

      Methodology | The study uses a previously published computational model, parameterized for zebrafish, to simulate and analyse the interactions between the segmentation clock and the morphogenesis of the pre-somitic mesoderm (PSM). Their model integrates cell advection, motility, compaction, cell division, and the synchronization of the embryo clock. Three alternative scenarios of PSM morphogenesis were modeled to examine how these changes affect the segmentation clock.

      Model System | The computational model system combines a representation of cell movements and the phase oscillator dynamics of the segmentation clock within a three-dimensional horseshoe-shaped domain mimicking the geometry of the vertebrate embryo PSM. The parameters used for the mathematical model are mostly estimated from previously published experimental findings.

      Key Findings and Conclusions | (1) The segmentation clock was found to be broadly robust against variations in morphogenetic processes such as cell ingression and motility; (2) Changes in the length of the PSM and the strength of phase coupling within the clock significantly influenced the system's robustness; (3) The authors conclude that the segmentation clock and PSM morphogenesis exhibited developmental modularity (i.e. relative independence), allowing these two phenomena to evolve independently, and therefore possibly contributing to the diverse segment numbers observed in vertebrates.

      MAJOR COMMENTS

      1. The key conclusion drawn by the authors (that there is robustness, and therefore modularity, between the morphogenetic cellular processes modeled and the embryo clock synchronization) stems directly from the modeling results appropriately presented and discussed in the manuscript. The model comprises some strong assumptions, however all have been clearly explained and the parameterization choices are supported by experimental findings, providing biological meaning to the model. Estimated parameters are well explained and seem reasonable assumptions (from the embryology perspective).

      Response: We thank the reviewer for their positive comments about our work.

      2. This study, as is, achieves its proposed goal of evaluating the potential robustness of the embryo clock to changes in (some) morphogenetic processes. The authors do not claim that the model used is complete, and they properly identify some limitations, including the lack of cell-cell interactions. Given the recognized importance of cellular physical interactions for successful embryo development, including them in the model would be a significant addition in future studies.

      Response: We would like to clarify that the model does include cell-cell interactions as cells interact with their neighbours’ clock phase to synchronise and to avoid occupying the same physical space.

      3. The authors have deposited all the code used for analysis in a public GitHub repository that is updated and available for the research community.

      Response: We support open source coding practices.

      4. On page 6, the authors justify their choice of clock parameters for cells ingressing the PSM: "As ingressing cells do not appear to express segmentation clock genes (Mara et al. (2007)), the position at which cells ingress into the PSM can create challenges for clock patterning, as only in the 'off' phase of the clock will ingressing cells be in-phase with their neighbours."

      However, there are several lines of evidence (in chick and mouse) that some oscillatory clock genes are already being expressed as early as in the gastrulation phase (so prior to PSM ingression) (Freitas et al., 2001 [10.1242/dev.128.24.5139]; Jouve et al., 2002 [10.1242/dev.129.5.1107]; Maia-Fernandes et al., 2024 [10.1371/journal.pone.0297853]).

      Question: Is this also true in zebrafish? (I.e. is there any recent experimental evidence that the clock genes are not expressed at ingression, since the paper cited to support this assumption is from 2007). If they are expressed in zebrafish (as they are in mouse and chick), then the cell addition should have random clock gene periods when they enter the PSM and not start all with a constant initial phase of zero. Probably this will not impact the results since the cells will also be out of phase with their neighbours when they "ingress", however, it will model more closely the biological scenario (and avoid such criticism).

      Response: We thank the reviewer for their comments. While it is known that in zebrafish the clock begins oscillating during epiboly and before the onset of segmentation (Riedel-Kruse et al., 2007), to our knowledge no-one has examined whether posteriorly or laterally ingressing progenitor cells express clock genes prior to their ingression into the PSM, which occurs later in development than the first oscillations which give rise to the first somites. We have not found any published evidence of her/hes gene expression in the dorsal donor tissues or lateral tissues surrounding the PSM, however we acknowledge that this has not been actively studied before and our assumption relies on an absence of evidence, rather than evidence of absence.

      However, we agree with the reviewer that one should include such an analysis for completeness, and we have now generated additional simulations where progenitor cells ingress with a random clock phase. This data is presented in figure 2 supplement 1 and mentioned in the main text on page 9.
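
      The difference between the two ingression assumptions can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function `ingress_phases` and its `mode` argument are hypothetical names, and phases are taken to lie in [0, 2π).

```python
import numpy as np

def ingress_phases(n_new, mode, rng):
    """Initial clock phase for n_new ingressing cells (hypothetical helper).

    mode="zero":   every new cell starts at phase 0 (clock genes not yet expressed).
    mode="random": every new cell starts at a phase drawn uniformly from [0, 2*pi).
    """
    if mode == "zero":
        return np.zeros(n_new)
    if mode == "random":
        return rng.uniform(0.0, 2.0 * np.pi, n_new)
    raise ValueError(f"unknown mode: {mode!r}")

rng = np.random.default_rng(0)
zero_phases = ingress_phases(5, "zero", rng)
random_phases = ingress_phases(5, "random", rng)
print(zero_phases)
print(random_phases)
```

      In either mode a newly added cell is generally out of phase with its neighbours, which is why the reviewer expected the change to have little impact on the results.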

      MINOR COMMENTS:

      1. The citations are appropriate and cover the major labs that have published work related to this study (although with some overrepresentation of the lab that published the model used).

      Response: We have cited the vast literature on somitogenesis to the best of our ability and do recognise that the work of the Oates lab appears prominently, but this is probably because their experimental data were originally used to parametrise the model in Uriu et al. 2021.

      The text is clear, carefully written, and both the methods and the reasoning behind them are clearly explained and supported by proper citations.

      Response: We are very glad to see that the reviewer found that the manuscript was clearly presented.

      1. The figures are comprehensive, properly annotated, with explanatory self-contained legends. I have no comments regarding the presentation of the results.

      Response: Thank you.

      Minor suggestions:

      1. Page 26: In the Cell addition sub-section of the Methods section, correct all instances where the word domain is used, but subdomain should be used (for clarity and coherence with the description of the model, stated as having a single domain comprising 3 subdomains).

      Response: We thank the reviewer for raising this, this is a good point. We have now corrected to ‘subdomain’ where appropriate.

      2. Page 32: Table 1. Parameter values used in our work, unless otherwise stated -> Suggestion: Add a column with the individual citations used for each parameter (to facilitate the confirmation of each corresponding reference).

      Response: Thank you for the suggestion; we have now done this (see table 1, page 36).

      **Referee Cross-commenting**

      I carefully read the reports provided by my fellow reviewers. My cross-comments aim to enhance the collective evaluation of the manuscript by Hammond et al.

      • On reviewer #1's Comments:

      I agree with Reviewer #1's overall evaluation of the manuscript's value and relevance, and with their general comments. I particularly support the suggestion to optionally include coupling delays known to influence the clock's period, as this would improve the model's completeness and benefit the research community. I also view this as an optional but desirable addition, not mandatory.

      Response: As per reviewer #1’s suggestion, we have now included this analysis (figure 8).

      In Fig. 4, I agree that showing kymographs, similar to Fig. 2D, for each PSM length would greatly improve the visualization of the results, given the relevance of this result to the manuscript's main message.

      Response: As per reviewer #1’s suggestion, we have now included such an analysis (figure 4 supplements 2 and 4) and agree with both reviewers that they improve the communication of our results.

      The remaining minor comments are useful and relevant to improving the manuscript.

      • On reviewer #3's Comments:

      Although I agree with Reviewer #3 that the paper is somewhat lengthy, I find the detailed description of the model in its biological context necessary and welcomed by the embryology research community. Without this detail, the paper might be too 'dry' and lose part of its audience. Conversely, focusing mostly on embryology without detailing the model parameters and simulation findings would deprive it of novelty and critical insights.

      Response: We thank Reviewer #2 for this assessment, which we agree with. Nonetheless we have sought to streamline our writing throughout to increase clarity without reducing the content.

      Overall, I find Reviewer #3's suggestions scientifically interesting, particularly comments 3, 4, and 5, which express legitimate questions for future study. However, I find them tangential to the main question addressed in this manuscript, which pertains to the modularity of the segmentation clock and morphogenesis. Therefore, I do not see them as significant improvements for the authors to implement in the current study.

      Response: We thank Reviewer #2 for their comments here and refer them to our responses to Reviewer #3.

      I would like to know how the authors respond to comments 1 and 2, which I do not have the expertise to evaluate.

      Response: We have now addressed these concerns in our response to Reviewer #3. Please see below.

      I agree with comment 6 that a brief mention of the known pathways/gene networks to which the assumptions apply (in zebrafish) would be a good addition. However, I do not think a detailed discussion is needed, as specific genes/networks can be different for different organisms.

      Response: We now justify this assumption in the methods on page 32.

      I disagree with comment 7, as Fig. 3 shows that the clock is robust to changes in cell ingression regime across all cell motility profiles tested. This is an important result for the manuscript's take home message, and should remain in the main text, not as a supplementary figure.

      Response: We agree with Reviewer #2 and have included this in our response to Reviewer #3.

      Finally, regarding Reviewer #3's concern about the incompleteness of the results, I find the results robust given the formalism chosen and within the scenarios where the assumptions hold. I believe this concern applies to the formalism (which is a choice) and not to the quality or relevance of the work presented in the manuscript. Additionally, some of the model's limitations have been adequately addressed by the authors.

      Response: We thank Reviewer #2 for their comments.

      • Reviewer #2 (Significance (Required)): GENERAL ASSESSMENT

      • This study uses a previously published model to simulate alternative scenarios of morphogenetic parameters to infer the potential independence (termed here modularity) between the segmentation clock and a set of morphogenetic processes, arguing that such modularity could allow the evolution of more flexible body plans, therefore partially explaining the variability in the number of segments observed in the vertebrates. This question is fundamental and relevant, yet still poorly researched. This work provides a comprehensive simulation with a model that tries to simplify the many morphogenetic processes described in the literature, reducing it to a few core fundamental processes that allow drawing the conclusions sought. It provides theoretical insight to support a conceptual advance in the field of evolutionary vertebrate embryology.

      ADVANCE

      • This study builds on a model recently published by Uriu et al. (eLife, 2021) that incorporates quantitative experimental data within a modeling framework including cell and tissue-level parameters, allowing the study of multiscale phenomena active during zebrafish embryo segmentation. Uriu's publication reports many relevant and often non-intuitive insights uncovered by the model, most notably the description of phase vortices formed by the synchronizing genetic oscillators interfering with the traveling-wave front pattern.

      However, this model can be further explored to ask additional questions beyond those described in the original paper. A good example is the present study, which uses this mathematical framework to investigate the potential independence between two of the modeled processes, thereby extracting extra knowledge from it. Accordingly, the present study represents a step forward in the direction of using relevant theoretical frameworks to quantitatively explore the landscape of complex molecular hypotheses in silico, and with it shed some light on fundamental open questions or inform the design of future experiments in the lab.

      • The study incorporates a wide range of existing literature on the developmental biology of vertebrates. It comprehensively cites prior work, such as the foundational studies by Cooke and Zeeman on the segmentation clock and the role of FGF signaling in PSM development as discussed by Gomez et al. The literature properly covers the breadth of knowledge in this field.

      AUDIENCE

      • Target audience | This study is relevant for fundamental research in developmental biology, specifically targeting researchers who focus on early embryo development and morphogenesis from both experimental and theoretical perspectives. It is also relevant for evolutionary biologists investigating the genetic factors that influence vertebrate evolution, as well as to computational biologists and bioinformatics researchers studying developmental processes and embryology.

      Developmental researchers studying the segmentation clock in other vertebrate model organisms (namely mouse and chick) will find this publication especially valuable since it provides insights that can help them formulate new hypotheses to elucidate the molecular mechanisms of the clock (for example finding a set of evolutionarily divergent genes that might interfere with PSM length). Additionally, this study provides a set of cellular parameters that have yet to be measured in mouse and chick, therefore guiding the design of future experiments to measure them, allowing the simulation of the same model with sets of parameters from different vertebrate model organisms, therefore testing the robustness of the findings reported for zebrafish.

      MY EXPERTISE

      My areas of research (relevant for this study): Vertebrate embryo clock oscillations in Gallus gallus; Computational biology.

      I can evaluate the relevance and validity of the model, critically evaluate its outputs and parameters, and the significance of the model assumptions for drawing relevant biological insights; however, I am not an expert on this mathematical formalism.

      • Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Verd and colleagues explored how various biologically relevant factors influence the robustness of clock dynamics synchronization among neighboring cells within the context of somitogenesis, adapting a mathematical model presented by Uriu et al. in 2021 in a similar context. Specifically, they show that clock dynamics is robust to different biological mechanisms such as cell infusion, cellular motility, compaction-extension and cell-division. On the other hand, the length of the presomitic mesoderm (PSM) and the density of cells in it have a significant role in the robustness of clock dynamics. While the manuscript is well-written and provides clear descriptions of methods and technical details, it tends to be somewhat lengthy. Below are the comments I would like the authors to address:

      1. The authors mention that "...the model is three dimensional and so can quantitatively recapture the rates of cell mixing that we observe in the PSM". I am not convinced by this justification for using a 3D model. None of the effects the authors explore in this manuscript requires a three-dimensional model or a full physical description of the cellular mechanics, such as excluded volume interaction. A one-dimensional model characterized by cell position along the arclength of the PSM and somitic region and by the segmentation clock phase θ can incorporate all the physics the authors describe in this manuscript and would be significantly cheaper computationally, allowing the authors to explore the effect of different parameters in greater detail.

      Response: One of the main objectives of the work we present in this manuscript is to assess how the evolution of PSM morphogenesis affects, or does not affect, segment patterning. The PSM is a three-dimensional tissue with differing cell rearrangement dynamics along its anterior-posterior axis. In addition, PSM dimension, density, the rearrangement rate, and patterns of cell ingression all vary across vertebrate species, and they are functional, especially cell mixing as it promotes synchronisation and drives elongation. In order to answer questions on the modularity of somitogenesis we therefore consider it absolutely necessary to include a three-dimensional representation of the PSM that captures single cells and their movements. In addition, this will allow us, as Reviewer #2 also pointed out, to reparametrize our model using species-specific data as it becomes available.

      While the reviewer is right in that lower dimensional representations would be computationally more efficient, and are generally more tractable, it would not be possible to represent cell mixing in one dimension, as this happens in three dimensions. One could perhaps encode the synchrony-promoting effect of cell mixing via some coupling function κ(x) that increases towards the posterior, however it is unclear what existing biological data one could use to parameterise this function or determine its form. Cell mixing can be modelled in a two-dimensional framework, however this cannot quantitatively recapture the rate of cell mixing observed in vivo, which is an advantage of this model.

      Furthermore, it is unclear how one would simulate processes such as compaction-extension using a one-dimensional model. The two different scenarios of cell ingression which we consider can also not be replicated in a one-dimensional model, as having a population of cells re-acquiring synchrony on the dorsal surface of the tissue while new material is added to the ventral side, creating asynchrony, is qualitatively different from a one-dimensional scenario where cells are introduced continuously along the spatial axis.

      2. I am not sure about the justification for limiting the quantification of phase synchrony to a very limited (one cell diameter wide) region at one end of the somitic part (Page 33 below Fig. 9). From my understanding of the manuscript, the segments appear in significant length anterior to this region. Wouldn't an ensemble average of multiple such one cell diameter wide regions in the somitic region be a more accurate metric for quantifying synchrony?

      Response: Indeed, such a metric (e.g. as that used by Uriu et al. to quantify synchrony along the x-axis) would be more accurate for determining synchrony within the PSM. However, as per the clock and wavefront model of somitogenesis, only synchrony at the very anterior of the PSM (or at the wavefront, equivalently) is functional for somitogenesis and thus evolution. Therefore, we restrict our analysis to the anterior-most region of the PSM. We now further justify this in the main text on page 9.
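
      To make the metric under discussion concrete, the sketch below computes a synchrony value restricted to a one-cell-diameter slice at the anterior end. It assumes that the synchrony r is the standard Kuramoto order parameter, which this exchange does not state explicitly; the function names, the argument `cell_diameter`, and the convention that x increases posteriorly from the anterior boundary `x_a` are likewise assumptions for illustration.

```python
import numpy as np

def kuramoto_r(theta):
    """Kuramoto order parameter: 1 means perfect synchrony, near 0 means incoherence."""
    theta = np.asarray(theta, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * theta))))

def anterior_synchrony(x, theta, x_a, cell_diameter):
    """Synchrony of the cells lying within one cell diameter of the anterior end x_a.

    Assumes x increases towards the posterior, so the slice is [x_a, x_a + cell_diameter).
    """
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta, dtype=float)
    mask = (x >= x_a) & (x < x_a + cell_diameter)
    if not mask.any():
        return float("nan")  # no cells in the slice
    return kuramoto_r(theta[mask])

rng = np.random.default_rng(0)
r_sync = kuramoto_r(np.full(200, 1.3))                     # identical phases
r_random = kuramoto_r(rng.uniform(0.0, 2.0 * np.pi, 200))  # uniformly random phases
print(r_sync, r_random)
```

      The ensemble average suggested by the reviewer would correspond to calling `anterior_synchrony` on several such slices along x and averaging the results, whereas the authors' choice corresponds to the single slice at the wavefront.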

      3. While studying the effect of cellular ingression, the authors study three discrete modes (random, DP and DP+LV) and show that in the DP+LV mode the clock synchrony becomes affected. I would like the authors to explore this in a continuous fashion from pure DP ingression to pure LV ingression and intermediates.

      Response: We thank the reviewer for this suggestion; this is a very interesting question. We are currently working on a related computational and experimental project to address the question of how PSM morphogenesis can change over evolutionary time to evolve the different modes that we see across species. As part of this work, we are running precisely the simulations suggested by the reviewer to find regions of parameter space in which all the relevant morphogenetic processes can freely evolve. While interesting, this work is however outside the scope of the current manuscript.

      4. While studying the effect of length and density of cells in PSM on cellular synchrony, the authors restrict to 3 values of density and 6 values of PSM length keeping the other parameter constant. I would be interested to see a phase diagram similar to Fig. 7 in the two-dimensional parameter space of L and ρ0. I am curious if a scaling relation exists for the parameter values that partition the parameter space with and without synchrony.

      Response: We thank the reviewer for their suggestion and agree that this would constitute an interesting addition to the manuscript. We have now generated these data, which are shown in figure 4 supplement 5 and mentioned on page 13. We see no clear relationship between these two variables when co-varying in the presence of random ingression.

      5. Both in the abstract and introduction, the authors discuss at great length the variability in the number of segments. I am curious how the number and width of the segments observed depend on different parameters related to cellular mechanics and the segmentation clock?

      Response: We thank the reviewer for this question. It was not clear to us if this was something the reviewer wants us to address in the study’s background and introduction, or an analysis we should include in the results. Therefore, we have responded to both comprehensively below:

      The prevailing conceptual framework for understanding this is the clock and wavefront model (Cooke and Zeeman, 1976), which posits that the somite length is inversely proportional to the frequency of the clock relative to the speed of the wavefront, and that the total number of segments is the relative frequency multiplied by the total duration of somitogenesis.
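
      The relationships stated in the preceding paragraph can be written compactly. The symbols are introduced here for illustration and are not taken from the manuscript: $S$ is the somite length, $v$ the wavefront speed, $f$ the clock frequency at the wavefront (with period $T = 1/f$), $\tau$ the total duration of somitogenesis, and $N$ the total number of segments.

$$
S = \frac{v}{f} = v\,T, \qquad N = f\,\tau
$$

      Under these idealised relations, halving $f$ at fixed $v$ doubles $S$, and at fixed $\tau$ it halves $N$; conversely, extending $\tau$ at fixed $f$ increases $N$ without changing $S$.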

      Experimentally we know that the frequency is determined in part by the coupling strength (Liao, Jorg, and Oates, 2016), and from comparative embryological studies (Gomez et al., 2008; Steventon et al., 2016) we know that changes in the elongation dynamics of the PSM correlate with changes in somite number, presumably by altering the total duration of somitogenesis (Gomez et al., 2009). These changes in elongation are thought to be driven by the changes in cell and tissue mechanics we test in our manuscript.

      Within our model, we cannot in general predict how the number of segments responds to changes in either clock parameters or cell mechanical parameters, as we lack understanding of what causes somitogenesis to cease; this is thus not encoded in our model and segmentation can in principle proceed indefinitely. Therefore, we have not performed this analysis.

      Similarly, we have not included an analysis of somite length. This is for two reasons: 1) as per the clock and wavefront model, the frequency at the PSM anterior (which we analyse) is equivalent to this measurement, as we assume (in general) the wavefront ($x = x_{a}$) is inertial. 2) the length of the nascent somite is not thought to be of much relevance to the adult phenotype, and by extension evolution. Somites undergo cell division and growth soon after their patterning by the segmentation clock, therefore their final size does not majorly depend on the dynamics of the segmentation clock. Rather, the main function of the clock is to control their number (and polarity).

      6. The authors assume that the phase dynamics of the chemical network may be described by an oscillator with constant frequency. For the completeness of the manuscript, the authors should discuss in detail for which chemical networks this is a good assumption.

      Response: We thank the reviewer for their suggestion and now justify this assumption in the methods on page 31.

      Such an assumption is appropriate for the segmentation clock, as the clock in the posterior of the PSM is thought to oscillate with a constant frequency, at least for the majority of somitogenesis although the frequency of somite formation slows towards the end of this process in zebrafish (Giudicelli et al., 2007, PLoS Biol.). In addition, PSM cells isolated and cultured in the presence of FGF (thus replicating the signalling environment of the posterior PSM) will continue to exhibit her1 oscillations with an apparently constant frequency (Webb et al., 2016).

      We note that such formulations are widely used within the segmentation clock literature (e.g. Riedel-Kruse et al., 2007, Morelli et al., 2009).
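
      A phase-oscillator formulation of this general kind can be sketched as follows. This is a deliberately simplified stand-in and not the model used in the manuscript: coupling is all-to-all rather than between spatial neighbours, there is no tissue geometry or cell movement, and the function `simulate_clock` and every parameter value are arbitrary choices for illustration.

```python
import numpy as np

def simulate_clock(kappa, n_cells=200, omega=1.0, noise=0.05,
                   dt=0.01, n_steps=2000, seed=0):
    """Euler-Maruyama integration of noisy phase oscillators with a common,
    constant intrinsic frequency omega and all-to-all sine coupling of strength kappa.
    Returns the final Kuramoto order parameter r."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, n_cells)  # start desynchronised
    for _ in range(n_steps):
        # Mean field z = r * exp(i * psi); the identity
        # (1/N) * sum_j sin(theta_j - theta_i) = r * sin(psi - theta_i)
        # avoids an explicit double loop over cell pairs.
        z = np.mean(np.exp(1j * theta))
        coupling = kappa * np.abs(z) * np.sin(np.angle(z) - theta)
        theta = (theta + (omega + coupling) * dt
                 + np.sqrt(2.0 * noise * dt) * rng.standard_normal(n_cells))
    return float(np.abs(np.mean(np.exp(1j * theta))))

r_coupled = simulate_clock(kappa=1.0)
r_uncoupled = simulate_clock(kappa=0.0)
print(f"r with coupling: {r_coupled:.2f}, without coupling: {r_uncoupled:.2f}")
```

      With a constant frequency the only sources of desynchronisation in this toy are the initial conditions and the phase noise, so the final synchrony is determined by how strongly coupling counteracts them.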

      7. Figure 3 and the associated text show no effect of the cellular motility profile on the synchrony of the segmentation clock. This may be moved to the supplementary material considering the length of this manuscript.

      Response: Thank you for the suggestion. However, we would argue that the lack of effect is a crucial result when discussing modularity. Reviewer #2 agrees with this assessment.

      • Reviewer #3 (Significance (Required)):

      The manuscript answers some important questions about the synchrony of the segmentation clock in vertebrates, utilizing a model published earlier. However, the presented result is incomplete in some aspects (points 2 to 5 of section A), and that could be overcome by a more detailed analysis using a simpler one-dimensional model (point 1 of section A). I believe this manuscript could be of interest to an intersecting audience of developmental biologists, systems biologists, and physicists/engineers interested in dynamical systems.

      My research interests are building physics and engineering based models of cell and tissue scale biological phenomena.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, Verd and colleagues explored how various biologically relevant factors influence the robustness of clock dynamics synchronization among neighboring cells within the context of somitogenesis, adapting a mathematical model presented by Uriu et al. in 2021 in a similar context. Specifically, they show that clock dynamics is robust to different biological mechanisms such as cell infusion, cellular motility, compaction-extension and cell-division. On the other hand, the length of the presomitic mesoderm (PSM) and the density of cells in it have a significant role in the robustness of clock dynamics. While the manuscript is well-written and provides clear descriptions of methods and technical details, it tends to be somewhat lengthy. Below are the comments I would like the authors to address:

      1. The authors mention that "...the model is three dimensional and so can quantitatively recapture the rates of cell mixing that we observe in the PSM". I am not convinced by this justification for using a 3D model. None of the effects the authors explore in this manuscript require a three-dimensional model or a full physical description of the cellular mechanics, such as excluded volume interactions. A one-dimensional model characterized by cell position along the arclength of the PSM and somatic region and by the segmentation clock phase θ can incorporate all the physics the authors describe in this manuscript, while being significantly cheaper computationally, allowing the authors to explore the effect of different parameters in greater detail.
      2. I am not sure about the justification for limiting the quantification of phase synchrony in a very limited (one cell diameter wide) region at one end of the somatic part (Page 33 below Fig. 9). From my understanding of the manuscript, the segments appear in significant length anterior to this region. Wouldn't an ensemble average of multiple such one cell diameter wide regions in the somatic region be a more accurate metric for quantifying synchrony?
      3. While studying the effect of cellular ingression, the authors consider three discrete modes (random, DP and DP+LV) and show that in the DP+LV mode the clock synchrony is affected. I would like the authors to explore this in a continuous fashion, from pure DP ingression to pure LV ingression and the intermediates.
      4. While studying the effect of length and density of cells in PSM on cellular synchrony, the authors restrict to 3 values of density and 6 values of PSM length keeping the other parameter constant. I would be interested to see a phase diagram similar to Fig. 7 in the two dimensional parameter space of L and ρ0. I am curious if a scaling relation exists for the parameter values that partition the parameter space with and without synchrony.
      5. In both the abstract and the introduction, the authors discuss at great length the variability in the number of segments. I am curious how the number and width of the segments observed depend on different parameters related to cellular mechanics and the segmentation clock.
      6. The authors assume that the phase dynamics of the chemical network may be described by an oscillator with constant frequency. For the completeness of the manuscript, the authors should discuss in detail for which chemical networks this is a good assumption.
      7. Figure 3 and the associated text show no effect of the cellular motility profile on the synchrony of the segmentation clock. This may be moved to the supplementary material, considering the length of this manuscript.
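
      The ensemble average suggested in comment 2 can be sketched as follows. This is only an illustration under stated assumptions: synchrony is taken to be the standard Kuramoto order parameter r = |mean(exp(iθ))|, which may differ from the exact definition in the manuscript, and the function names, the slab geometry (`x_start`, `width`, `n_windows`) and the use of a single position coordinate are hypothetical choices, not the authors' implementation.

```python
import numpy as np

def kuramoto_r(theta):
    """Kuramoto order parameter r = |mean(exp(i*theta))|, a value in [0, 1]."""
    theta = np.asarray(theta, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * theta))))

def windowed_synchrony(x, theta, x_start, width, n_windows):
    """Average r over n_windows adjacent slabs of the given width along x.

    x and theta are per-cell positions and clock phases (hypothetical inputs).
    Slabs that contain no cells are skipped; NaN is returned if all are empty.
    """
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta, dtype=float)
    values = []
    for k in range(n_windows):
        lo = x_start + k * width
        mask = (x >= lo) & (x < lo + width)
        if mask.any():
            values.append(kuramoto_r(theta[mask]))
    return float(np.mean(values)) if values else float("nan")
```

      Setting `n_windows=1` recovers a single one-cell-diameter-wide measurement region, so the two metrics can be compared directly on the same simulation output.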

      Significance

      The manuscript answers some important questions about the synchrony of the segmentation clock in vertebrates, using a model published earlier. However, the presented results are incomplete in some aspects (points 2 to 5 of section A), which could be overcome by a more detailed analysis using a simpler one-dimensional model (point 1 of section A). I believe this manuscript could be of interest to an intersecting audience of developmental biologists, systems biologists, and physicists/engineers interested in dynamical systems.

      My research interests are building physics- and engineering-based models of cell- and tissue-scale biological phenomena.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The manuscript from Hammond et al. investigates the modularity of the segmentation clock and morphogenesis in early vertebrate development, focusing on how these processes might independently evolve to influence the diversity of segment numbers across vertebrates.

      Methodology | The study uses a previously published computational model, parameterized for zebrafish, to simulate and analyse the interactions between the segmentation clock and the morphogenesis of the pre-somitic mesoderm (PSM). Their model integrates cell advection, motility, compaction, cell division, and the synchronization of the embryo clock. Three alternative scenarios of PSM morphogenesis were modeled to examine how these changes affect the segmentation clock.

      Model System | The computational model system combines a representation of cell movements and the phase oscillator dynamics of the segmentation clock within a three-dimensional horseshoe-shaped domain mimicking the geometry of the vertebrate embryo PSM. The parameters used for the mathematical model are mostly estimated from previously published experimental findings.

      Key Findings and Conclusions | (1) The segmentation clock was found to be broadly robust against variations in morphogenetic processes such as cell ingression and motility; (2) Changes in the length of the PSM and the strength of phase coupling within the clock significantly influenced the system's robustness; (3) The authors conclude that the segmentation clock and PSM morphogenesis exhibited developmental modularity (i.e. relative independence), allowing these two phenomena to evolve independently, and therefore possibly contributing to the diverse segment numbers observed in vertebrates.

      Major comments

      1. The key conclusion drawn by the authors (that there is robustness, and therefore modularity, between the morphogenetic cellular processes modeled and the embryo clock synchronization) stems directly from the modeling results appropriately presented and discussed in the manuscript. The model comprises some strong assumptions, however all have been clearly explained and the parameterization choices are supported by experimental findings, providing biological meaning to the model. Estimated parameters are well explained, and seem reasonable assumptions (from the embryology perspective).
      2. This study, as is, achieves its proposed goal of evaluating the potential robustness of the embryo clock to changes in (some) morphogenetic processes. The authors do not claim that the model used is complete, and they properly identify some limitations, including the lack of cell-cell interactions. Given the recognized importance of cellular physical interactions for successful embryo development, including them in the model would be a significant addition in future studies.
      3. The authors have deposited all the code used for analysis in a public GitHub repository that is updated and available for the research community.
      4. In page 6, the authors justify their choice of clock parameters for cells ingressing the PSM: "As ingressing cells do not appear to express segmentation clock genes (Mara et al. (2007)), the position at which cells ingress into the PSM can create challenges for clock patterning, as only in the 'off' phase of the clock will ingressing cells be in-phase with their neighbors."

      However, there are several lines of evidence (in chick and mouse) that some oscillatory clock genes are already expressed as early as the gastrulation phase, and so prior to PSM ingression (Freitas et al., 2001 [10.1242/dev.128.24.5139]; Jouve et al., 2002 [10.1242/dev.129.5.1107]; Maia-Fernandes et al., 2024 [10.1371/journal.pone.0297853]).

      Question: Is this also true in zebrafish? (I.e. is there any recent experimental evidence that the clock genes are not expressed at ingression, since the paper cited to support this assumption is from 2007). If they are expressed in zebrafish (as they are in mouse and chick), then the cell addition should have random clock gene periods when they enter the PSM and not start all with a constant initial phase of zero. Probably this will not impact the results since the cells will also be out of phase with their neighbors when they "ingress", however, it will model more closely the biological scenario (and avoid such criticism).

      Minor comments

      1. The citations are appropriate and cover the major labs that have published work related to this study (although with some overrepresentation of the lab that published the model used).
      2. The text is clear, carefully written, and both the methods and the reasoning behind them are clearly explained and supported by proper citations.
      3. The figures are comprehensive, properly annotated, with explanatory self-contained legends. I have no comments regarding the presentation of the results.
      4. Minor suggestions:
      5. Page 26: In the Cell addition sub-section of the Methods section, correct all instances where the word domain is used, but subdomain should be used (for clarity and coherence with the description of the model, stated as having a single domain comprising 3 subdomains).
      6. Page 32: Table 1. Parameter values used in our work, unless otherwise stated -> Suggestion: Add a column with the individual citations used for each parameter (to facilitate the confirmation of each corresponding reference).

      Referee Cross-commenting

      I carefully read the reports provided by my fellow reviewers. My cross-comments aim to enhance the collective evaluation of the manuscript by Hammond et al.

      Reviewer #1's Comments:

      I agree with Reviewer #1's overall evaluation of the manuscript's value and relevance, and with their general comments. I particularly support the suggestion to optionally include coupling delays known to influence the clock's period, as this would improve the model's completeness and benefit the research community. I also view this as an optional but desirable addition, not mandatory.

      In Fig. 4, I agree that showing kymographs, similar to Fig. 2D, for each PSM length would greatly improve the visualization of the results, given the relevance of this result to the manuscript's main message.

      The remaining minor comments are useful and relevant to improving the manuscript.

      Reviewer #3's Comments:

      Although I agree with Reviewer #3 that the paper is somewhat lengthy, I find the detailed description of the model in its biological context necessary and welcomed by the embryology research community. Without this detail, the paper might be too 'dry' and lose part of its audience. Conversely, focusing mostly on embryology without detailing the model parameters and simulation findings would deprive it of novelty and critical insights.

      Overall, I find Reviewer #3's suggestions scientifically interesting, particularly comments 3, 4, and 5, which express legitimate questions for future study. However, I find them tangential to the main question addressed in this manuscript, which pertains to the modularity of the segmentation clock and morphogenesis. Therefore, I do not see them as significant improvements for the authors to implement in the current study.

      I would like to know how the authors respond to comments 1 and 2, which I do not have the expertise to evaluate.

      I agree with comment 6 that a brief mention of the known pathways/gene networks to which the assumptions apply (in zebrafish) would be a good addition. However, I do not think a detailed discussion is needed, as specific genes/networks can be different for different organisms.

      I disagree with comment 7, as Fig. 3 shows that the clock is robust to changes in cell ingression regime across all cell motility profiles tested. This is an important result for the manuscript's take home message, and should remain in the main text, not as a supplementary figure.

      Finally, regarding Reviewer #3's concern about the incompleteness of the results, I find the results robust given the formalism chosen and within the scenarios where the assumptions hold. I believe this concern applies to the formalism (which is a choice) and not to the quality or relevance of the work presented in the manuscript. Additionally, some of the model's limitations have been adequately addressed by the authors.

      Significance

      GENERAL ASSESSMENT

      • This study uses a previously published model to simulate alternative scenarios of morphogenetic parameters to infer the potential independence (termed here modularity) between the segmentation clock and a set of morphogenetic processes, arguing that such modularity could allow the evolution of more flexible body plans, thereby partially explaining the variability in the number of segments observed in vertebrates. This question is fundamental and relevant, yet still poorly researched. This work provides a comprehensive simulation with a model that simplifies the many morphogenetic processes described in the literature, reducing them to a few core processes that allow the conclusions sought to be drawn. It provides theoretical insight to support a conceptual advance in the field of evolutionary vertebrate embryology.

      ADVANCE

      • This study builds on a model recently published by Uriu et al. (eLife, 2021) that incorporates quantitative experimental data within a modeling framework including cell and tissue-level parameters, allowing the study of multiscale phenomena active during zebrafish embryo segmentation. Uriu's publication reports many relevant and often non-intuitive insights uncovered by the model, most notably the description of phase vortices formed by the synchronizing genetic oscillators interfering with the traveling-wave front pattern. However, this model can be further explored to ask additional questions beyond those described in the original paper. A good example is the present study, which uses this mathematical framework to investigate the potential independence between two of the modeled processes, thereby extracting extra knowledge from it. Accordingly, the present study represents a step forward in the direction of using relevant theoretical frameworks to quantitatively explore the landscape of complex molecular hypotheses in silico, and with it shed some light on fundamental open questions or inform the design of future experiments in the lab.
      • The study incorporates a wide range of existing literature on the developmental biology of vertebrates. It comprehensively cites prior work, such as the foundational studies by Cooke and Zeeman on the segmentation clock and the role of FGF signaling in PSM development as discussed by Gomez et al. The literature properly covers the breadth of knowledge in this field.

      AUDIENCE

      • Target audience | This study is relevant for fundamental research in developmental biology, specifically targeting researchers who focus on early embryo development and morphogenesis from both experimental and theoretical perspectives. It is also relevant for evolutionary biologists investigating the genetic factors that influence vertebrate evolution, as well as to computational biologists and bioinformatics researchers studying developmental processes and embryology.

      Developmental researchers studying the segmentation clock in other vertebrate model organisms (namely mouse and chick), will find this publication especially valuable since it provides insights that can help them formulate new hypotheses to elucidate the molecular mechanisms of the clock (for example finding a set of evolutionarily divergent genes that might interfere with PSM length). Additionally, this study provides a set of cellular parameters that have yet to be measured in mouse and chick, therefore guiding the design of future experiments to measure them, allowing the simulation of the same model with sets of parameters from different vertebrate model organisms, therefore testing the robustness of the findings reported for zebrafish.

      MY EXPERTISE

      My areas of research (relevant for this study): Vertebrate embryo clock oscillations in Gallus gallus; Computational biology.

      I can evaluate the relevance and validity of the model, critically evaluate its outputs and parameters, and the significance of the model assumptions for drawing relevant biological insights; however, I am not an expert on this mathematical formalism.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, Hammond et al. study robustness of the vertebrate segmentation clock against morphogenetic processes such as cell ingression, cell movement and cell division to ask whether the segmentation clock and morphogenesis are modular or not. The modularity of these two would be important for evolvability of the segmenting system. The authors adopt a previously proposed 3D model of the presomitic mesoderm (Uriu et al. 2021 eLife) and include new elements; different types of cell ingression, tissue compaction and cell cycles. Based on the results of numerical simulations that synchrony of the segmentation clock is robust, the authors conclude that there is a modularity in the segmentation clock and morphogenetic processes.

      The presented results support the conclusion. The manuscript is clearly written. I have several comments that could help the authors further strengthen their arguments.

      Major comment:

      [Optional] In both the current model and Uriu et al. 2021, coupling delay in phase oscillator model is not considered. Given that several previous studies (e.g. Lewis 2003, Herrgen et al. 2010, Yoshioka-Kobayashi et al. 2020) suggested the presence of coupling delays in Delta-Notch signaling, could the authors analyze the effect of coupling delay on robustness of the segmentation clock against morphogenetic processes?

      Minor comments:

      • PSM radius and oscillation synchrony are both denoted by the same letter r. The authors should use different symbols for these two quantities to avoid confusion.
      • page 5 Figure 1 caption: (x-x_a/L) should be (x-x_a)/L.
      • Figure 3C: Description of black crosses in the panels is required in the figure legend.
      • Figure 3C another comment: In this panel, synchrony r at the anterior PSM is shown. It is true that synchrony at anterior PSM is most relevant for normal segment formation. However, in this case, the mobility profile is changed, so it may be appropriate to show how synchrony at mid and posterior PSM would depend on changes in mobility profile. Is synchrony improved by cell mobility at the region where cell ingression happens?
      • In page 12, the authors state that "the results for the DP and DP+LV cases are exactly equal for L = 185 um, as .... and the two ingression methods are numerically equivalent in the model". I understood that in this case two ingression methods are equivalent, but I do not understand why the results are "exactly" equal, given the presence of stochasticity in the model.
      • The authors analyze the effect of cell density on oscillation synchrony in Fig. 4 and they mention that higher density increases robustness of the clock by increasing the average number of interacting neighbors. I think it would be helpful to plot the average number of neighboring cells in simulations as a function of density to quantitatively support the claim.
      • The authors analyze the effect of PSM length on synchrony in Fig. 4. I think kymographs of synchrony r as shown in Fig. 2D would also be helpful to show that indeed cells get synchronized while advecting through a longer PSM.
      • I understand that cells in M phase can interact with neighboring cells with the same coupling strength kappa in the model, although their clocks are arrested. If so, this aspect should be also mentioned in the main text in page 16, as this coupling can be another noise source for synchrony.
      • Figure 5-figure supplement 2: panel labels A, B, C are missing.
      • Figure 5-figure supplement 3: panel labels A, B, C are missing.
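
      The neighbor-count plot requested above could be produced along the following lines. This is a rough sketch under assumptions that are not taken from the manuscript: cells are placed uniformly at random in a cubic box rather than in the horseshoe-shaped PSM domain, and the names `mean_neighbors` and `neighbors_vs_density`, as well as the `box`, `radius` and `seed` values, are hypothetical placeholders for the simulation's own geometry and interaction range.

```python
import numpy as np

def mean_neighbors(positions, radius):
    """Mean number of other cells within `radius` of each cell (brute force)."""
    positions = np.asarray(positions, dtype=float)
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    within = dist <= radius
    np.fill_diagonal(within, False)  # a cell is not its own neighbor
    return float(within.sum(axis=1).mean())

def neighbors_vs_density(densities, box=10.0, radius=1.0, seed=0):
    """Mean neighbor count for uniformly random cells at each density.

    Densities are in cells per unit volume; box and radius share the same
    length unit (placeholder values, not parameters from the manuscript).
    """
    rng = np.random.default_rng(seed)
    result = []
    for rho in densities:
        n_cells = max(int(round(rho * box**3)), 1)
        points = rng.uniform(0.0, box, size=(n_cells, 3))
        result.append(mean_neighbors(points, radius))
    return result
```

      In practice the positions would come from the simulation snapshots rather than from a random number generator, and `radius` would be set to the coupling range used for the phase oscillators.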

      Significance

      Synchronization of the segmentation clock has been studied by mathematical modeling, but most previous studies considered cells in a static tissue without morphogenesis. In the previous study by Uriu et al. 2021, morphogenetic processes such as cell advection due to tissue elongation, tissue shortening, and cell mobility were considered in synchronization. The current manuscript provides methodological advances in this respect by newly including cell ingression, tissue compaction and the cell cycle. In addition, the authors bring the concepts of modularity and evolvability to the field of the vertebrate segmentation clock, which is new. On the other hand, the manuscript confirms through careful simulations that the synchronization of the segmentation clock is robust, but it does not propose or reveal new mechanisms for making it robust or modular. The main targets of the manuscript will be researchers working on somitogenesis and evolutionary biologists who are interested in the evolution of developmental systems. The manuscript will also be of interest to broader audiences, such as developmental biologists, biophysicists, and physicists and computer scientists working on dynamical systems.

    1. Reviewer #2 (Public review):

      Piersma et al. continue to work on deciphering the role and function of Ly49 NK cell receptors. This manuscript shows that a single inhibitory Ly49 receptor is sufficient to license NK cells and eliminate MHC-I-deficient target cells in mice. In short, they refined the mouse model ∆Ly49-1 (Parikh et al., 2020) into the Ly49KO model in which all Ly49 genes are disrupted. Using this model, they confirmed that NK cells from Ly49KO mice cannot be licensed, produce lower levels of IFN-gamma, and cannot reject MHC-I-deficient cells. To study the effect of a single Ly49 receptor in the function of NK cells, the authors backcrossed Ly49KO mice to H-2Dd transgenic KODO (D8-KODO) Ly49A knock-in mice in which a single inhibitory Ly49A receptor that recognizes H-2Dd ligands is expressed. By doing so, they demonstrate that a single inhibitory Ly49 receptor expressed by all NK cells is sufficient for licensing and missing-self killing.

      While the results of the study are largely consistent with the conclusions, it is important to address some discrepancies. For instance, in the title of Figure 1, the authors state that NK cells in Ly49KO mice compared to WT mice have a less mature phenotype, which is not consistent with the corresponding text in the Results section (lines 170-171) that states there is no difference in maturation. These differences are not evident in Figure 1, panel D. It is crucial to acknowledge these inconsistencies to ensure a comprehensive understanding of the research findings.

      In the legend of Figure 2, the text related to panel C indicates the use of dyes to label the splenocytes, and CFSE, CTV, and CTFR are mentioned. However, only CTV and CTFR are shown on the plots and mentioned in the corresponding text in the Results section. Similarly, in the legend of Figure 4, which is related to panel C, the authors write that splenocytes were differentially labeled with CFSE and CTV as indicated; however, in Figure 4C and the Results section text, there is no mention of CFSE.

      The authors should clarify why they assume that KLRG1 expression is influenced by the expression of inhibitory Ly49 receptors and not by manipulations on chromosome 6, where the genes for both KLRG1 and Ly49 receptors are located. However, a better explanation for the possible influence of other inhibitory NK cell receptors still needs to be included. In the study by Zhang et al. (doi: 10.1038/s41467-019-13032-5), the authors showed the synergized regulation of NK cell education by the NKG2A receptor and the specific Ly49 family members. Although in this study, Piersma and colleagues show the control of MHC-I deficient cells by Ly49A+ NKG2A- NK cells in Figure 4, this receptor is not mentioned in the Results or in the Discussion section, so its role in this story needs to be clarified. Therefore, the reader would benefit from more information regarding the NKG2A receptor and NKG2A+/- populations in their results.

      Comments on revisions: The authors have successfully answered all my questions and edited the manuscript accordingly.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The article by Piersma et al. aims to reduce the complex process of NK cell licensing to the action of a single inhibitory receptor for MHC class I. This is achieved using a mouse strain lacking all of the Ly49 receptors expressed by NK cells and inserting the Ly49a gene into the Ncr1 locus, leading to expression on the majority of NK cells.

      Strengths:

      The mouse model used represents a precise deletion of all NK-expressed genes within the Ly49 cluster. The re-introduction of the Ly49a gene into the Ncr1 locus allows expression by most NK cells. Convincing effects of Ly49a expression on in vitro activation and in vivo killing assay are shown.

      Weaknesses:

      The choice of Ly49a provides a clear picture of H-2D<sup>d</sup> recognition by this Ly49. It would be valuable to perform additional studies investigating Ly49c and Ly49i receptors for H-2b. This is of interest because there are reports indicating that Ly49c may not be a functional receptor in B6 mice due to strong cis interactions.

      We agree with the reviewer that it will be important to extend our findings to H-2b haplotypes with individual cognate Ly49 receptors (Ly49C and Ly49I). While these experiments are the subject of our ongoing studies, they are beyond the scope of the current manuscript considering the significant time, effort and cost required to generate these new Ly49C and Ly49I knock-in mice.

      This work generates an excellent mouse model for the study of NK cell licensing by inhibitory Ly49s that will be useful for the community. It provides a platform whereby the functional activity of a single Ly49 can be assessed.

      Reviewer #2 (Public review):

      Piersma et al. continue to work on deciphering the role and function of Ly49 NK cell receptors. This manuscript shows that a single inhibitory Ly49 receptor is sufficient to license NK cells and eliminate MHC-I-deficient target cells in mice. In short, they refined the mouse model ∆Ly49-1 (Parikh et al., 2020) into the Ly49KO model in which all Ly49 genes are disrupted. Using this model, they confirmed that NK cells from Ly49KO mice cannot be licensed, produce lower levels of IFN-gamma, and cannot reject MHC-I-deficient cells. To study the effect of a single Ly49 receptor in the function of NK cells, the authors backcrossed Ly49KO mice to H-2D<sup>d</sup> transgenic KODO (D8-KODO) Ly49A knock-in mice in which a single inhibitory Ly49A receptor that recognizes H-2D<sup>d</sup> ligands is expressed. By doing so, they demonstrate that a single inhibitory Ly49 receptor expressed by all NK cells is sufficient for licensing and missing-self killing.

      While the results of the study are largely consistent with the conclusions, it is important to address some discrepancies. For instance, in the title of Figure 1, the authors state that NK cells in Ly49KO mice compared to WT mice have a less mature phenotype, which is not consistent with the corresponding text in the Results section (lines 170-171) that states there is no difference in maturation. These differences are not evident in Figure 1, panel D. It is crucial to acknowledge these inconsistencies to ensure a comprehensive understanding of the research findings.

      We thank the reviewer for pointing this out. We have corrected the figure legend title to: “Mice generated to lack all NK-related Ly49 molecules using CRISPR have NK cells that display alterations in select surface molecules.”

      In the legend of Figure 2, the text related to panel C indicates the use of dyes to label the splenocytes, and CFSE, CTV, and CTFR are mentioned. However, only CTV and CTFR are shown on the plots and mentioned in the corresponding text in the Results section. Similarly, in the legend of Figure 4, which is related to panel C, the authors write that splenocytes were differentially labeled with CFSE and CTV as indicated; however, in Figure 4C and the Results section text, there is no mention of CFSE.

      We thank the reviewer for pointing out these inconsistencies. We did label target cells with CFSE to distinguish them from host cells. To clarify this, we have done the following:

      We have removed CFSE from figure legends of Figure 2 and 4.

      We included the following on CFSE labeling in the Materials and Methods section: “Target splenocytes were additionally labeled with CFSE to identify transferred target splenocytes from host cells.”

      The authors should clarify why they assume that KLRG1 expression is influenced by the expression of inhibitory Ly49 receptors and not by manipulations on chromosome 6, where the genes for both KLRG1 and Ly49 receptors are located.

      The effect on KLRG1 expression is phenocopied in the Ly49A KI mice (on a Ly49 KO background). The Ly49A KI allele is encoded by the Ncr1 locus, which is located on chromosome 7 and not on chromosome 6 where KLRG1 is located, thus excluding involvement of cis-regulatory elements encoded by the Ly49 locus on chromosome 6.

      We have clarified this in the discussion section (lines 350-358):

      “The Ly49 gene family as well as Klrg1 is located within the NKC on chromosome 6 (Yokoyama and Plougastel, 2003) ….  expression of only Ly49A, encoded in the Ncr1 locus located on chromosome 7, in Ly49KO mice on a H-2D<sup>d</sup> background restored KLRG1 expression”

      However, a better explanation for the possible influence of other inhibitory NK cell receptors still needs to be included. In the study by Zhang et al. (doi: 10.1038/s41467-019-13032-5), the authors showed the synergized regulation of NK cell education by the NKG2A receptor and the specific Ly49 family members. Although in this study, Piersma and colleagues show the control of MHC-I deficient cells by Ly49A+ NKG2A- NK cells in Figure 4, this receptor is not mentioned in the Results or in the Discussion section, so its role in this story needs to be clarified. Therefore, the reader would benefit from more information regarding the NKG2A receptor and NKG2A+/- populations in their results.

      We agree with the reviewer that it is important to describe our results in the context of other inhibitory receptors. To clarify the role of NKG2A and potentially other inhibitory receptors we have made the following improvements to our manuscript:

      We discuss the role of NKG2A in the discussion section, which now includes (lines 259-266):

      “While our results did not interrogate licensing by inhibitory receptors outside of the Ly49 receptor family, such as has been reported for NKG2A (Anfossi et al., 2006; Zhang et al., 2019), they do demonstrate that expression of Ly49A without other Ly49 family members can mediate NK cell licensing. Moreover, we found that Ly49 receptors are required and sufficient for missing-self rejection under steady-state conditions. However, these observations do not rule out involvement of other inhibitory receptors under specific inflammatory conditions. For example, NKG2A contributes to rejection of missing-self targets in poly(I:C)-treated mice (Zhang et al., 2019).”

      We also added the following to the result section (lines 179-182):

      NKG2A has been implicated in NK cell licensing by the non-classical MHC-I molecule Qa1 (Anfossi et al., 2006); to eliminate potential confounding effects of this interaction, effector functions of NKG2A- NK cells were evaluated as described before (Bern et al., 2017).

      Reviewer #3 (Public review):

      Summary:

      In this study, Piersma et al. successfully generated a mouse model with all Ly49 genes knocked out, resulting in the complete absence of Ly49 receptor expression on the cell surface. The absence of Ly49 expression led to the loss of NK cell education/licensing and consequently, a failure in responsiveness against missing-self target cells. The experimental work and findings are partially overlapping with the previous work by Zhang et al. (2019), who also performed knockout of the entire Ly49 locus in mice and demonstrated that loss of NK responsiveness was due to the removal of inhibitory, and not activating, Ly49 genes. The authors demonstrate the restoration of NK cell licensing by knocking in a single Ly49 gene, Ly49A, in a mouse expressing the H-2D<sup>d</sup> ligand for this receptor, which is a novel and important finding.

      Strengths:

      The authors established a novel mouse model enabling them to have a clean and thorough study on the function of Ly49 on NK cell licensing. Also, by knocking in a single Ly49, they were able to investigate the function of a given Ly49 receptor excluding the "contamination" of co-expression of any other Ly49 genes. Their idea and method were novel, though the mouse model was somewhat genetically similar to that of a previous study. The experimental design and data interpretation were logically clear and the evidence was solid.

      Weaknesses:

      The paper is very poorly written and confusing. The authors should be more accurate in the usage of terminology, provide more details on experimental procedures, and revise much of the text to improve clarity and coherence. A thorough revision aiming to clarify the paper would be helpful.

      We regret that the manuscript was confusing to the reviewer. We have made thorough revisions to the different sections, which we hope will improve the clarity of the manuscript.

      We have made changes to all sections of the manuscript, including the title. These revisions include improved clarity on description of NK cell licensing and consistent usage throughout the manuscript, per the reviewer recommendations. We hope that all our improvements help the clarity of the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      I was confused by lines 262-270 in the discussion. The data from Hanke et al. is presented as contradictory to the observation that Ly49s bind more efficiently to H2-Kb than -Db, but they showed that Ly49c/i did not bind Kb-deficient cells, supporting the preferred binding to Kb.

      We have clarified this issue and the paragraph now reads: “This is further supported by early studies using Ly49 transfectants binding to Con A blasts showing that Ly49C and Ly49I can bind to H-2D<sup>b</sup>-deficient but not H-2K<sup>b</sup>-deficient cells (Hanke et al., 1999), despite the caveat of testing binding to cells overexpressing Ly49s in these studies.”

      Reviewer #2 (Recommendations for the authors):

      The authors' conclusion that one type of inhibitory Ly49 receptor expressed on NK cells is sufficient for successful licensing and rejection of missing self-cells is a significant step forward. However, it would be beneficial to complement this with additional data. For instance, exploring the role of a single inhibitory Ly49 receptor responsible for licensing in a mouse model with a different haplotype (e.g. Ly49C or Ly49I on H-2b MHC I haplotype in C57BL/6J mice) could provide valuable insights and open new avenues for research in the field.

      We agree with the reviewer that it will be important to extend our findings to additional MHC-I haplotypes with single cognate Ly49 receptors. While these experiments are subject of our ongoing studies, they are beyond the scope of the current manuscript considering the significant effort, time, and cost to generate these new Ly49C and Ly49I knockin mice.

      Reviewer #3 (Recommendations for the authors):

      Specific issues that should be addressed are as follows:

      (1) The title of the paper: "Expression of a single inhibitory Ly49 receptor is sufficient to license NK cells for effector functions" is ambiguous. When I first read the title, I thought the authors meant that only a single Ly49 molecule on the NK cell surface was necessary to induce licensing. It might be better to replace "single inhibitory receptor" with "single member of Ly49 receptor family".

      We have changed the title to: “Expression of a single inhibitory member of the Ly49 receptor family is sufficient to license NK cells for effector functions”

      (2) In the abstract, introduction, and results, the authors distinguish "licensing" and "rejection of missing-self targets" as two distinct phenomena. An example includes Abstract, lines 51-53: "Herein, we showed mice lacking expression of all Ly49s were unable to reject missing-self target cells in vivo, were defective in NK cell licensing, and displayed lower KLRG1 on the surface of NK cells". Similarly, the title of the second subsection of the Results states: "Ly49-deficient NK cells are defective in licensing and rejection of cognate MHC-I deficient target cells" (line 176). In these instances, it seems that by "licensing", they mean only response to plate-bound anti-NK1.1 stimulation and not a response to missing-self targets. Alternatively, in the first paragraph of the Discussion, it sounds as if licensing includes both anti-NK1.1 and missing-self responses (lines 258-260): "...NK cells were fully licensed in terms of their functional phenotype, including the capacity to be activated by an activation receptor in vitro and efficient rejection of MHC-I deficient target cells in vivo". Please define the terms and use the terms consistently throughout the paper.

      We were the first to describe the term licensing and have defined this as acquisition of NK cell functional competence by self-MHC molecules (Kim et al., 2005), which is characterized by increased NK cell effector functions to activating signals. Thus, licensed NK cells are prevented from attacking normal MHC-I<sup>+</sup> cells by the same self-MHC-I-specific receptor that conferred licensing, while unlicensed NK cells without appropriate Ly49 receptors are functionally incompetent.

      To clarify we made changes throughout the manuscript including the following:

      Lines 91-101:

      “In addition to effector function in missing-self, Ly49 receptors that recognize their cognate MHC-I ligands are involved in licensing or education of NK cells to acquire functional competence. NK cell licensing is characterized by potent effector functions including IFNγ production and degranulation in response to activation receptor stimulation (Elliott et al., 2010; Kim et al., 2005). Like missing-self recognition, inhibitory Ly49s require SHP-1 for NK cell licensing which interacts with the ITIM-motif encoded in the cytosolic tail of inhibitory Ly49s (Bern et al., 2017; Kim et al., 2005; Viant et al., 2014). Moreover, lower expression of SHP-1, particularly within the immunological synapse, is associated with licensed NK cells (Schmied et al., 2023; Wu et al., 2021). Thus, inhibitory Ly49s have a second function that licenses NK cells to self-MHC-I thereby generating functionally competent NK cells but it has not been possible to exclude contributions from other co-expressed Ly49s.”

      Lines 268-271 (previously 258-260):

      “Yet the NK cells were fully licensed in terms of IFNγ production and degranulation in vitro and efficiently rejected MHC-I deficient target cells in vivo. Thus, a single Ly49 receptor is capable to confer the licensed phenotype and missing-self rejection in vitro and in vivo.”

      Lines 309-312:

      “In conclusion, these data show that expression of a single inhibitory Ly49 receptor is necessary and sufficient to license NK cells and mediate missing self-rejection under steady state conditions in vivo.”

      (3) Introduction, lines 76-79. Please provide the C57BL/6 MHC-I genotype. It is difficult to follow the text here without this information. In general, please provide information to help the reader who may not be working in this precise field.

      We thank the reviewer for pointing this out. We have included this and the lines now read: “For example, in the C57BL/6 background, Ly49C and Ly49I can recognize H-2<sup>b</sup> MHC-I molecules that include H-2K<sup>b</sup> and H-2D<sup>b</sup>, while Ly49A and Ly49G cannot recognize H-2<sup>b</sup> molecules and instead they recognize H-2<sup>d</sup> alleles.”

      (4) Introduction, lines 85-97. Please use commas: "...the MHC-I specificities of other Ly49s have been primarily studied with MHC tetramers containing human b2m, which is not recognized by Ly49A, on cells overexpressing Ly49s" in order to clarify the sentence.

      Commas have been added as suggested by the reviewer.

      (5) Introduction, lines 91-101. The whole paragraph starting with the following sentence does not make sense and should be re-written. "In addition to effector function in missing-self, when inhibitory Ly49 receptors recognize their cognate MHC-I ligands in vivo, they license or educate NK cells for potent effector functions including IFNγ production and degranulation in response to activation receptor stimulation".

      We regret that this paragraph was not clear to the reviewer. We have changed this paragraph to:

      “In addition to effector function in missing-self, Ly49 receptors that recognize their cognate MHC-I ligands are involved in licensing or education of NK cells to acquire functional competence. NK cell licensing is characterized by potent effector functions including IFNγ production and degranulation in response to activation receptor stimulation (Elliott et al., 2010; Kim et al., 2005). Like missing-self recognition, inhibitory Ly49s require SHP-1 for NK cell licensing which interacts with the ITIM-motif encoded in the cytosolic tail of inhibitory Ly49s (Bern et al., 2017; Kim et al., 2005; Viant et al., 2014). Moreover, lower expression of SHP-1, particularly within the immunological synapse, is associated with licensed NK cells (Schmied et al., 2023; Wu et al., 2021). Thus, inhibitory Ly49s have a second function that licenses NK cells to self-MHC-I thereby generating functionally competent NK cells but it has not been possible to exclude contributions from other co-expressed Ly49s.”

      (6) Results, line 181. Please edit: "...MHC-I-deficient H-2K<sup>b</sup> x H-2D<sup>b</sup> deficient (KODO) mice".

      This sentence now reads “... NK cells from H-2K<sup>b</sup> and H-2D<sup>b</sup> double deficient (KODO) mice”

      (7) Results, line 192. Please re-word the following phrase: "missing-self is dominated by H-2K<sup>b</sup> in the C57BL/6 background", as it is unclear. Do you mean that H-2K<sup>b</sup> is protected from lysis as opposed to H-2D<sup>b</sup>?

      We thank the reviewer for pointing this out, line 192 now reads: “..missing-self recognition in the C57BL/6 background depends on the absence of H-2K<sup>b</sup> rather than H-2D<sup>b</sup>.”

      (8) Please briefly describe the Ncr1-Ly49A knockin procedure so that the reader understands the link between NKp46 and Ly49A expression without going to the earlier paper. Also, it needs to be mentioned that Ncr1 is the gene encoding NKp46.

      Lines 201-205 now read: “To investigate the potential of a single inhibitory Ly49 receptor on mediating NK cell licensing and missing-self rejection, the Ly49KO mice were backcrossed to H-2D<sup>d</sup> transgenic KODO (D8-KODO) Ly49A KI mice that express Klra1 cDNA encoding the inhibitory Ly49A receptor in the Ncr1 locus encoding NKp46 and its cognate ligand H-2D<sup>d</sup> but not any other classical MHC-I molecules (Parikh et al., 2020).”

      In the Materials and Methods section, the following has been added (lines 324-326):

      “In Ly49A KI mice the stop codon of Ncr1 encoding NKp46 is replaced with a P2A peptide-cleavage site upstream of the Ly49A cDNA, while maintaining the 3’ untranslated region.”

      (9) Figure 4C, legend. There is no CFSE staining in this experiment. Please correct.

      We did label target cells with CFSE to distinguish them from host cells. To clarify this, we have done the following:

      We have removed CFSE from figure legends of Figure 2 and 4.

      We included the following on CFSE labeling in the Materials and Methods section (lines 377-379): “Target splenocytes were additionally labeled with CFSE to identify transferred target splenocytes from host cells.”

      (10) Discussion, lines 262-270. This paragraph sounds as if data by Hanke et al. does not agree with the data presented in the paper. On the contrary, Hanke et al. demonstrate that Ly49C and Ly49I detectably bind to H-2K<sup>b</sup>, but poorly to H-2D<sup>b</sup>, supporting observations shown in Figure 2C.

      We have clarified this issue and the paragraph now reads: “This is further supported by early studies using Ly49 transfectants binding to Con A blasts showing that Ly49C and Ly49I can bind to H-2D<sup>b</sup>-deficient but not H-2K<sup>b</sup>-deficient cells (Hanke et al., 1999), despite the caveat of testing binding to cells overexpressing Ly49s in these studies.”

    1. Reviewer #2 (Public Review):

      When people help others is an important psychological and neuroscientific question. It has received much attention from the psychological side, but comparatively less from neuroscience. The paper translates some ideas from social psychology to neuroscience using a neuroeconomically oriented computational approach. In particular, the paper is concerned with the idea that people help others based on perceptions of merit/deservingness, but also because they require/need help. To this end, the authors conduct two experiments with an overlapping participant pool:

      (1) A social perception task in which people see images of people that have previously been rated on merit and need scales by other participants. In a blockwise fashion, people decide whether the depicted person a) deserves help, b) needs help, and c) uses both hands (== control condition).

      (2) In an altruism task, people make costly helping decisions by deciding between giving a certain amount of money to themselves or another person. It is manipulated how much the other person needs and deserves the money.

      The authors use a sound and robust computational modelling approach for both tasks, based on evidence accumulation models. They analyse behavioural data for both tasks, showing that the behaviour is indeed influenced, as expected, by the deservingness and the need of the shown people. Neurally, the authors use a block-wise analysis approach to find differences in activity levels across conditions of the social perception task. The authors do find large activation clusters in areas related to theory of mind. Interestingly, they also find that activity in TPJ that relates to the deservingness condition correlates with people's deservingness ratings while they do the task, but also with computational parameters related to helping others in the second task, the one that was conducted many months later. Some behavioural parameters also correlate across the two tasks, suggesting that how deserving of help others are perceived reflects a relatively stable feature that translates into concrete helping decisions later on.

      The conclusions of the paper are overall well supported by the data.

      (1) I found that the modelling was done very thoroughly for both tasks. Overall, I had the impression that the methods are very solid with many supplementary analyses. The computational modelling is done very well.

      (2) A slight caveat, however, regarding this aspect, is that, in my view, the tasks are relatively simplistic, so that even the complex computational models do not achieve as much as they can in the case of more complex paradigms. For example, the bias term in the model seems to correspond to the mean response rate in a very direct way (please correct me if I am wrong).

      (3) Related to the simple tasks: The fMRI data is analysed in a simple block fashion. This is in my view not appropriate to discern the more subtle neural substrates of merit/need-based decision making or person perception. Correspondingly, the neural activation patterns (merit > control, need > control) are relatively broad and unspecific. They do not seem to differ in the classic theory of mind regions that are the focus of the analyses.

      (4) However, the relationship between neural signal and behavioural merit sensitivity in TPJ is noteworthy.

      (5) The latter is even more the case, as the neural signal and aspects of the behaviour are correlated across subjects with the second task that is conducted much later. Such a correlation is very impressive and suggests that the tasks are sensitive for important individual differences in helping perception/behaviour.

      (6) That being said, the number of participants in the latter analyses is at the lower end of what is used these days for across-participant correlations.

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      General Statements<br /> The reviewer comments helped us improve the paper by including new computations, figures, and analyses related to vasopressin, drug dosages, and treatment cessation. We have also removed confusing terminology from the text. We believe that the paper is now more comprehensive, clear, and rigorous.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      The authors address the question of lowering long-term elevated cortisol levels by affecting the parameters in a previously published mathematical model of the hypothalamic-pituitary-adrenal (HPA) axis. The parameters are related to various pathways. The elevation in cortisol levels is related to diseases e.g. mood disorders and Cushing's syndrome.<br /> The authors conducted a systematic in silico analysis of various points of intervention in the HPA axis. They found that only two interventions targeting corticotropin-releasing hormone (CRH) can lower long-term cortisol. Other drug targets either fail to lower cortisol due to gland-mass compensation or lower cortisol but harm other aspects of the HPA axis. Thus, they identify potential drug targets, including CRH-neutralizing antibodies and CRH synthesis inhibitors, for lowering long-term cortisol in mood disorders and in those suffering from chronic stress.<br /> The method used is in silico investigations of the mathematical model.<br /> The draft is well written with a single typo in line 270. I have no further comments!

      Response: The typo is fixed.

      Reviewer #1 (Significance):

      Making in silico predictions without direct use of data is a weakness, but the analysis conducted is convincing. An improved understanding of why some drugs work and others do not is important and is postulated to agree with clinical evidence.

      Response: We thank the reviewer for this endorsement.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary<br /> The authors utilise a mathematical model of the hypothalamic-pituitary-adrenal axis to address the utility of interventions altering its various outputs (CRH, ACTH and cortisol) to ameliorate axis disruption in response to chronic stress. They show that a lowering of circulating CRH by either blocking its synthesis or increasing its clearance is effective at returning the HPA axis to basal activity at all levels. In contrast, interventions altering ACTH or cortisol production, their circulating levels or actions are ineffective in the model. This is consistent with data on the long-term efficacy of drugs reducing excess corticosteroids in patients and animal models. The use of mathematical models to describe complex interactions in endocrine systems is a valuable advance in our understanding of potential mechanisms and therapies and this is an excellent example.

      Response: We thank the reviewer for this endorsement.

      Major comments<br /> 1. The model of the HPA axis that the authors have described previously is a little simplistic when considering the known physiology. Specifically, this model ignores the contribution of vasopressin to the axis, which has been described as being the primary hypothalamic factor driving HPA axis activity in chronic stress (see doi.org/10.1016/S0079-6123(08)00403-2). Including this may be beyond the scope of the current model, however it should be considered and at least commented on. It is notable that the model fits the clinical and animal model data, which may suggest that the contribution of vasopressin in the long term may be overestimated, possibly as a result of differential effects of the two hypothalamic factors, with CRH driving ACTH release and POMC gene expression, whilst vasopressin only increases ACTH release without augmenting POMC expression. This is worthy of discussion.

      Response: We thank the reviewer for this comment which helped us discuss vasopressin. We agree that adding it as a variable in the model is beyond the scope of the current study. We describe its effects in the introduction and discussion sections. Interestingly, when one considers the best characterized effect of vasopressin, namely enhancing CRH-dependent ACTH release, one can use this model to investigate the effects of inhibiting vasopressin. We predict that vasopressin inhibition is unlikely to be an effective strategy for lowering long-term cortisol and alleviating stress-related mental disorders, as evidenced by the failure of clinical trials.

      In the introduction we add:<br /> 1. “CRH stimulates the secretion of adrenocorticotropic hormone (ACTH) by corticotroph cells in the anterior pituitary, an effect enhanced by vasopressin (Aguilera et al, 2008; Antoni, 2017).” (lines 35-37)<br /> 2. Clinical trials for two vasopressin 1b receptor antagonist candidates, SSR149415 and TS-121, in the table of HPA-related clinical trials (Table 1)

      In the discussion we add (lines 398-409): “One important factor not explicitly considered in the model is the contribution of vasopressin to the axis. Vasopressin potentiates the CRH-dependent release of ACTH from pituitary corticotrophs by acting on the V1b receptor (V1bR) (Aguilera et al, 2008; Antoni, 2017). Including this hormone explicitly is beyond the current scope. However, a simple analysis indicates that the effect of elevated vasopressin can be modeled by increasing the ACTH secretion parameter b2. This suggests that vasopressin V1b receptor antagonists should have effects similar to inhibitors of ACTH production. As such, vasopressin receptor antagonists should be compensated by the HPA axis without long-term effects on cortisol. Accordingly, V1bR antagonists did not show statistically significant efficacy in clinical trials for major depressive disorder and generalized anxiety disorder (Griebel et al, 2012; Chaki, 2021; Kamiya et al, 2020). However, vasopressin may have additional relevant effects on the HPA axis and the central nervous system which warrant a more detailed modeling analysis.”
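      The compensation argument can be illustrated with a minimal sketch. It assumes a simplified gland-mass form in which CRH (x1) drives both ACTH secretion and pituitary functional mass (P), ACTH (x2) drives both cortisol secretion and adrenal functional mass (A), and cortisol (x3) feeds back as 1/x3 on the two upstream hormones. Apart from b2, every name and value below is an illustrative assumption and not the parameterization used in the paper.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Params:
    # All values are illustrative placeholders, not fitted parameters.
    u: float = 1.0   # stress input to the hypothalamus
    b1: float = 1.0  # CRH production
    a1: float = 1.0  # CRH removal
    b2: float = 1.0  # ACTH secretion (raised by vasopressin, lowered by a V1bR antagonist)
    a2: float = 1.0  # ACTH removal
    b3: float = 1.0  # cortisol secretion
    a3: float = 1.0  # cortisol removal
    bP: float = 1.0  # pituitary functional-mass growth per unit CRH
    aP: float = 1.0  # pituitary functional-mass removal
    bA: float = 1.0  # adrenal functional-mass growth per unit ACTH
    aA: float = 1.0  # adrenal functional-mass removal


def steady_state(p: Params) -> dict:
    """Steady state of the assumed equations:

    dx1/dt = b1*u/x3 - a1*x1        dP/dt = P*(bP*x1 - aP)
    dx2/dt = b2*P*x1/x3 - a2*x2     dA/dt = A*(bA*x2 - aA)
    dx3/dt = b3*A*x2 - a3*x3
    """
    x1 = p.aP / p.bP                   # pinned by pituitary mass balance
    x2 = p.aA / p.bA                   # pinned by adrenal mass balance
    x3 = p.b1 * p.u / (p.a1 * x1)      # from dx1/dt = 0
    P = p.a2 * x2 * x3 / (p.b2 * x1)   # from dx2/dt = 0
    A = p.a3 * x3 / (p.b3 * x2)        # from dx3/dt = 0
    return {"CRH": x1, "ACTH": x2, "cortisol": x3, "P": P, "A": A}


stressed = Params(u=2.0)                # chronic stress: doubled input
antagonist = replace(stressed, b2=0.5)  # hypothetical V1bR antagonist halves b2

before = steady_state(stressed)
after = steady_state(antagonist)
print(before["cortisol"], after["cortisol"])  # cortisol is the same in both cases
print(before["P"], after["P"])                # pituitary functional mass grows to compensate
```

      Under these assumptions b2 appears only in the expression for P, so changing it shifts pituitary functional mass and leaves steady-state cortisol untouched.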

      1. The model that this study relies on is dependent on slow changes in the various levels of the endocrine axis and the authors have focused on alterations in cell number as the process leading to a prolongation of their dysfunction. For the stress axis, the evidence for changes in corticotroph cell number is weak and the recent paper of Lopez et al (DOI: 10.1126/sciadv.abe44) suggests that chronic stress, at least over a period of 3 weeks does not lead to an alteration in the number of corticotrophs, despite cell population changes in the adrenal gland. There are other processes which could lead to prolonged alteration of corticotroph output and it would be better to focus (as the authors have in places) on functional mass, rather than cell number which may suggest it is not the trophic effect of CRH that is important for increased functional mass.

      Response: We thank the reviewer for this. We now refer only to functional mass changes. We corrected all places in which hyperplasia of corticotrophs is mentioned. We also state in lines 125-126 that the model is agnostic as to whether growth in functional mass is due to hyperplasia or hypertrophy.<br /> We also added a citation for Lopez et al. 2021 (line 86) to support the growth of cortisol-secreting cells in the zona fasciculata of the adrenal gland under stress conditions.

      1. The parameters in the model for interventions are described as simply being less than or greater than one- to what extent are the effects of these interventions dependent on their specific value? For example, presumably if the I1 value is close to zero, then the CRH-synthesis inhibitor would be ineffective. Likewise, if it were close to 1 then there would be negligible release of CRH in response to stress, and the preservation of a response to acute stress would be lost. Can the authors show the range of values for I1, C1 and A1 where the interventions are effective at normalising HPA axis function whilst (for I1 and A1) still preserving the acute stress response?

      Response: We thank the reviewer for this comment that helped us to add a new section in the results on dose response, and three new figures (Figure 4, Figure S2 and Figure S3):

      “CRH interventions have a dose-dependent response in the model<br /> We computed the effects of drug doses by varying the relevant model parameter, where zero dose means no change in the parameter and high doses mean large changes in the parameter. We find that both candidate interventions for lowering cortisol - CRH-synthesis inhibitors and CRH-blocking antibodies - cause a dose-dependent reduction of steady-state cortisol (Figure 4A). This indicates that putative treatment may require finding the appropriate dose to return the patients to their normal cortisol baseline range. Other drug candidates have no effect on long-term cortisol steady state (Figure S2).

      At all doses, the steady states of CRH and ACTH remain normal (Figure 4B-C). The acute stress response, defined as peak cortisol upon acute stress input relative to steady-state cortisol, is dose dependent (Figure 4D and Figure S3). At a dose that returns cortisol to the normal range, the acute response is also normalized.

      We also tested the effects of abrupt treatment cessation. For both CRH interventions, stopping treatment led to a rapid return to hypercortisolemia (Figure 4E-F and Figure S4).

      Figure 4. Predicted effective interventions have a dose-dependent effect on cortisol, and cortisol abruptly rises when treatment is ceased. (A) Cortisol steady state in the model upon changes in doses of CRH-synthesis inhibitors and CRH-blocking antibodies. (B-C) The same changes in drug doses have no effect on ACTH (B) and CRH (C) steady state levels. (D) Cortisol peak response to an acute stress relative to steady state for different drug doses. (E-F) HPA dynamics upon cessation of CRH-synthesis inhibitors (E) and anti-CRH antibodies (F) after 50 days.”
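      The dose dependence described here can be sketched numerically. The snippet assumes a simplified steady state in which gland-mass balance pins CRH, so that cortisol is proportional to u*b1/a1. The dose parameterization (a fold change of 1 + dose), the function names, and all values are hypothetical and are not taken from the paper.

```python
def cortisol_steady_state(u, b1=1.0, a1=1.0, crh_setpoint=1.0):
    """Cortisol steady state under the assumed relation x3 = b1*u / (a1*x1),
    where x1 (CRH) is pinned at crh_setpoint by pituitary mass balance."""
    return b1 * u / (a1 * crh_setpoint)


def dose_response(u, doses, target):
    """Map each dose to steady-state cortisol.

    target="synthesis": a CRH-synthesis inhibitor divides b1 by (1 + dose).
    target="antibody":  a CRH-blocking antibody multiplies a1 by (1 + dose).
    The (1 + dose) fold change is an illustrative choice.
    """
    results = []
    for dose in doses:
        fold = 1.0 + dose
        if target == "synthesis":
            x3 = cortisol_steady_state(u, b1=1.0 / fold)
        elif target == "antibody":
            x3 = cortisol_steady_state(u, a1=fold)
        else:
            raise ValueError(f"unknown target: {target}")
        results.append((dose, x3))
    return results


stress = 3.0  # chronic stress input, 3-fold above a baseline of 1
doses = [0.0, 0.5, 1.0, 2.0, 4.0]
for target in ("synthesis", "antibody"):
    for dose, x3 in dose_response(stress, doses, target):
        print(f"{target:9s} dose={dose:3.1f} cortisol={x3:.2f}")
```

      In this sketch a dose of 2 (a 3-fold parameter change) returns a 3-fold elevated cortisol to baseline, and larger doses push it below baseline, which is why a dose would need to be titrated.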

      In the supplemental information:

      “Cortisol dose response to HPA-targeting drugs

      Figure S2. Cortisol steady state dose response to HPA-targeting drugs, related to Figure 4.

      Figure S3. Cortisol peak response to acute stressor under varying concentrations of HPA-targeting drugs, related to Figure 4.”

      1. In the models that the authors describe with CRH interventions, what is the impact of stopping the intervention on axis output in the short and long-term? Presumably ceasing the use of CRH antagonists would lead to much more severe axis dysregulation than CRH neutralising antibodies or CRH synthesis inhibitors.

      Response: We have now added a new analysis of drug cessation (new Figure 4E-F, Figure S4). After a 50-day treatment, sudden cessation caused a rapid return to hypercortisolemia:<br /> We added in lines 277-278: “We also tested the effects of abrupt treatment cessation. For both CRH interventions, stopping treatment led to a rapid return to hypercortisolemia (Figure 4E-F).”

      Reviewer #2 (Significance):

      Whilst the study builds on the use of a previously described mathematical model, its utility in identifying potential targets for therapy within the important area of chronic stress makes it an important example of the value of the modelling approach to decisions on appropriate targets for therapy. The model does not include important known factors which have been described as being important in the HPA axis response to chronic stress and would be considerably improved if these could be incorporated.<br /> The study builds on conceptual insights into the role a delayed or slow functional mass change might play in dysregulation of endocrine axes and this could be applied to other physiological systems and will be of interest to modellers and physiologists alike. The authors are leaders in this field and there are few other modellers considering systems level interactions over this timescale.

      Response: We thank the reviewer for this endorsement.

      As a pituitary physiologist, I have focused my review on the interactions between the various players in HPA axis function; I do not have the expertise to comment on the mathematical modelling aspects.

      Reviewer #3 (Evidence, reproducibility and clarity):

      This extremely interesting paper asks why various attempts to treat depression and bipolar disorder with glucocorticoid antagonists or cortisol synthesis inhibitors have failed. The starting point for their analysis is a simple computational model that, importantly, includes the facts that CRH stimulates not only ACTH release but also corticotroph growth and ACTH stimulates not only cortisol production but also the growth of cells in the adrenal cortex. They call this the "gland mass model". According to the model, if the hypothalamus receives a continuous stress input, all of the HPA hormones will be elevated: CRH transiently and the others in a sustained fashion. Adding a sufficient dose of a CRH inhibitor (decreasing the rate constant b1 in the model) or a CRH antibody (increasing the rate constant a1) normalizes the hormone levels, whereas blocking cortisol function or production does not. This is demonstrated by numerical simulations and backed up by deriving analytical expressions for the hormone concentrations at steady state. The paper provides a plausible explanation for why past therapeutic efforts have failed and points to a couple of approaches that might succeed. These conclusions are hypotheses (they haven't been tested experimentally, and we really don't know how accurately the system is described by this nice, simple model), but they are really intriguing hypotheses that could lead to therapeutic breakthroughs. I strongly recommend publication.

      Response: We thank the reviewer for this endorsement.

      My only criticisms are minor:

      1. The authors should specify what exact change in the model's parameters they are making to implement their therapeutic interventions. E.g. in Fig 1B top left and 2A, what is the change in the value of b1 that corresponds to the addition of a CRH-synthesis inhibitor? (I'd guess it's being dropped to zero, but if this is stated, I missed it)

      Response: We thank the reviewer for this comment, which helped us to clarify the parameter change required to normalize cortisol. We have now added in lines 173-175: “According to equation (1), as a general guideline, treating cortisol levels that are x-fold higher than baseline requires a drug dose that alters the relevant parameter (e.g., CRH production or removal rate) by a similar x-fold.”
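      The x-fold guideline can be illustrated with a short derivation. Equation (1) itself is not shown in this response, so the following assumes a simplified form in which cortisol ($x_3$) inhibits CRH ($x_1$) production as $1/x_3$ and CRH sets the growth rate of pituitary functional mass ($P$). The symbols $u$, $b_P$ and $a_P$ are assumptions introduced for this sketch; only $b_1$ and $a_1$ are named in the reviews.

```latex
\begin{aligned}
\dot{x}_1 &= b_1 \frac{u}{x_3} - a_1 x_1, \qquad
\dot{P} = P\,(b_P x_1 - a_P) \\[4pt]
\dot{P} = 0 \;&\Rightarrow\; x_1^{\mathrm{st}} = \frac{a_P}{b_P} \\[4pt]
\dot{x}_1 = 0 \;&\Rightarrow\; x_3^{\mathrm{st}} = \frac{b_1\, u}{a_1\, x_1^{\mathrm{st}}}
  = \frac{b_1 b_P}{a_1 a_P}\, u \\[4pt]
u \to x\,u \;&\Rightarrow\; x_3^{\mathrm{st}} \to x\, x_3^{\mathrm{st}} \\[4pt]
b_1 \to \frac{b_1}{x} \;\;\text{or}\;\; a_1 \to x\, a_1 \;&\Rightarrow\;
  x_3^{\mathrm{st}} = \frac{b_1 b_P}{a_1 a_P}\, u \quad \text{(baseline restored)}
\end{aligned}
```

      Under these assumptions steady-state cortisol scales linearly with $b_1 u / a_1$, so an x-fold elevation caused by stress input is offset by an x-fold decrease in CRH production or an x-fold increase in CRH removal.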

      2. I think it would also be useful to show a dose-response relationship for the various interventions.

      Response: We thank the reviewer for this comment, which led us to add a new Results section on dose response and three new figures (Figure 4, Figure S2 and Figure S3):

      “CRH interventions have a dose-dependent response in the model

      We computed the effects of drug doses by varying the relevant model parameter, where zero dose means no change in the parameter and high doses mean large changes in the parameter. We find that both candidate interventions for lowering cortisol - CRH-synthesis inhibitors and CRH-blocking antibodies - cause a dose-dependent reduction of steady-state cortisol (Figure 4A). This indicates that putative treatment may require finding the appropriate dose to return the patients to their normal cortisol baseline range. Other drug candidates have no effect on long-term cortisol steady state (Figure S2).

      At all doses, the steady states of CRH and ACTH remain normal (Figure 4B-C). The acute stress response, defined as peak cortisol upon acute stress input relative to steady-state cortisol, is dose dependent (Figure 4D and Figure S3). At a dose that returns cortisol to the normal range, the acute response is also normalized.

      We also tested the effects of abrupt treatment cessation. For both CRH interventions, stopping treatment led to a rapid return to hypercortisolemia (Figure 4E-F and Figure S4).

      Figure 4. Predicted effective interventions have a dose-dependent effect on cortisol, and cortisol abruptly rises when treatment is ceased. (A) Cortisol steady state in the model upon changes in doses of CRH-synthesis inhibitors and CRH-blocking antibodies. (B-C) The same changes in drug doses have no effect on ACTH (B) and CRH (C) steady state levels. (D) Cortisol peak response to an acute stress relative to steady state for different drug doses. (E-F) HPA dynamics upon cessation of CRH-synthesis inhibitors (E) and anti-CRH antibodies (F) after 50 days.”
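
      The dose-response computation described above can be sketched as follows. The sketch assumes, hypothetically, that a dose d divides the CRH production rate b1 by (1 + d) for a synthesis inhibitor and multiplies the CRH removal rate a1 by (1 + d) for a blocking antibody, and that steady-state cortisol scales as u * b1 / a1. Neither the dose mapping nor the scaling is taken from the manuscript; both are stand-ins so that zero dose leaves the parameter unchanged and higher doses change it more.

```python
def cortisol_steady_state(u, b1, a1):
    """Assumed scaling (hypothetical): steady-state cortisol ~ u * b1 / a1."""
    return u * b1 / a1


def dose_response(doses, u=3.0, b1=1.0, a1=1.0, drug="inhibitor"):
    """Return assumed steady-state cortisol for each dose of one drug type."""
    levels = []
    for d in doses:
        scale = 1.0 + d  # zero dose -> scale of 1 -> parameter unchanged
        if drug == "inhibitor":
            # CRH-synthesis inhibitor: reduce the production rate b1.
            levels.append(cortisol_steady_state(u, b1 / scale, a1))
        elif drug == "antibody":
            # CRH-blocking antibody: increase the removal rate a1.
            levels.append(cortisol_steady_state(u, b1, a1 * scale))
        else:
            raise ValueError(f"unknown drug type: {drug}")
    return levels


doses = [0.0, 0.5, 1.0, 2.0, 4.0]
inhibitor_curve = dose_response(doses, drug="inhibitor")
antibody_curve = dose_response(doses, drug="antibody")

for d, ci, ca in zip(doses, inhibitor_curve, antibody_curve):
    print(f"dose={d:.1f}  inhibitor={ci:.3f}  antibody={ca:.3f}")
```

      In this toy version the two curves coincide because cortisol depends only on the ratio b1 / a1; the full model need not share that property.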

      In the supplemental information:

      “Cortisol dose response to HPA-targeting drugs

      Figure S2. Cortisol steady state dose response to HPA-targeting drugs, related to Figure 4.

      Figure S3. Cortisol peak response to acute stressor under varying concentrations of HPA-targeting drugs, related to Figure 4.”

      *Referees cross-commenting*

      It looks like we are all enthusiastic about this work.

      Response: Thank you.

      Reviewer #3 (Significance):

      Strengths: It's a beautiful new insight on a really important topic, extracted from a simple, understandable mathematical model of the HPA axis.

      Weaknesses: It is based on a model and the model could be wrong. This does not however diminish my enthusiasm for this provocative work.

      Advance: It is highly original.

      Audience: I hope it attracts a wide audience: modelers, endocrinologists, psychiatrists, drug developers.

      My expertise: I am a systems biologist, have taught psychopharmacology to medical students, and have an interest in endocrine signaling.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The authors utilise a mathematical model of the hypothalamic-pituitary-adrenal axis to address the utility of interventions altering its various outputs (CRH, ACTH and cortisol) to ameliorate axis disruption in response to chronic stress. They show that a lowering of circulating CRH by either blocking its synthesis or increasing its clearance is effective at returning the HPA axis to basal activity at all levels. In contrast, interventions altering ACTH or cortisol production, their circulating levels or actions are ineffective in the model. This is consistent with data on the long-term efficacy of drugs reducing excess corticosteroids in patients and animal models. The use of mathematical models to describe complex interactions in endocrine systems is a valuable advance in our understanding of potential mechanisms and therapies and this is an excellent example.

      Major comments

      1. The model of the HPA axis that the authors have described previously is a little simplistic when considering the known physiology. Specifically, this model ignores the contribution of vasopressin to the axis, which has been described as being the primary hypothalamic factor driving HPA axis activity in chronic stress (see doi.org/10.1016/S0079-6123(08)00403-2). Including this may be beyond the scope of the current model, however it should be considered and at least commented on. It is notable that the model fits the clinical and animal model data, which may suggest that the contribution of vasopressin in the long term may be overestimated, possibly as a result of differential effects of the two hypothalamic factors, with CRH driving ACTH release and POMC gene expression, whilst vasopressin only increases ACTH release without augmenting POMC expression. This is worthy of discussion.
      2. The model that this study relies on is dependent on slow changes in the various levels of the endocrine axis and the authors have focused on alterations in cell number as the process leading to a prolongation of their dysfunction. For the stress axis, the evidence for changes in corticotroph cell number is weak and the recent paper of Lopez et al (DOI: 10.1126/sciadv.abe44) suggests that chronic stress, at least over a period of 3 weeks does not lead to an alteration in the number of corticotrophs, despite cell population changes in the adrenal gland. There are other processes which could lead to prolonged alteration of corticotroph output and it would be better to focus (as the authors have in places) on functional mass, rather than cell number which may suggest it is not the trophic effect of CRH that is important for increased functional mass.
      3. The parameters in the model for interventions are described as simply being less than or greater than one. To what extent do the effects of these interventions depend on their specific values? For example, presumably if the I1 value is close to zero, then the CRH-synthesis inhibitor would be ineffective. Likewise, if it were close to 1 then there would be negligible release of CRH in response to stress, and the preservation of a response to acute stress would be lost. Can the authors show the range of values for I1, C1 and A1 where the interventions are effective at normalising HPA axis function whilst (for I1 and A1) still preserving the acute stress response?
      4. In the models that the authors describe with CRH interventions, what is the impact of stopping the intervention on axis output in the short and long-term? Presumably ceasing the use of CRH antagonists would lead to much more severe axis dysregulation than CRH neutralising antibodies or CRH synthesis inhibitors.

      Significance

      Whilst the study builds on the use of a previously described mathematical model, its utility in identifying potential targets for therapy within the important area of chronic stress makes it an important example of the value of the modelling approach to decisions on appropriate targets for therapy. The model does not include important known factors which have been described as being important in the HPA axis response to chronic stress and would be considerably improved if these could be incorporated.

      The study builds on conceptual insights into the role a delayed or slow functional mass change might play in dysregulation of endocrine axes and this could be applied to other physiological systems and will be of interest to modellers and physiologists alike. The authors are leaders in this field and there are few other modellers considering systems level interactions over this timescale.

      As a pituitary physiologist, my review has focused on the interactions between the various players in HPA axis function; I do not have the expertise to comment on mathematical modelling aspects.

    1. Reviewer #2 (Public review):

      In this manuscript, Hua et al. proposed SLC7A11, a protein facilitating cellular cystine uptake, as a potential target for the treatment of trastuzumab-resistant HER2-positive breast cancer. If this claim holds true, the finding would be of significance and might be translated to clinical practice. Nevertheless, this reviewer finds that the conclusion was poorly supported by the data.

      Notably, most of the data (Figures 2-6) were based on two cell lines: JIMT1 as a representative trastuzumab-resistant cell line, and SKBR3 as a representative trastuzumab-sensitive cell line. As such, these findings could be cell-line specific and entirely irrelevant to trastuzumab sensitivity. Furthermore, the authors claimed ferroptosis simply based on lipid peroxidation (Figure 3). Cell viability was not determined, and the rescuing effects of ferroptosis inhibitors were missing. The xenograft experiments were also suspicious (Figure 4). The description of how cysteine starvation was performed on xenograft tumors was lacking, and the compound (i.e., erastin) used by the authors is not suitable for in vivo experiments due to low solubility and low metabolic stability. Finally, it is confusing why the authors focused on epigenetic regulations (Figures 5 & 6), without measuring major transcription factors (e.g., NRF2, ATF4) which are known to regulate SLC7A11.

      To sum up, this reviewer finds that the most valuable data in this manuscript is perhaps Figure 1, which provides unbiased information concerning the metabolic patterns in trastuzumab-sensitive and primary resistant HER2-positive breast cancer patients.

    2. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Hua et al show how targeting amino acid metabolism can overcome Trastuzumab resistance in HER2+ breast cancer.

      Strengths:

      The authors used metabolomics, transcriptomics and epigenomics approaches in vitro and in preclinical models to demonstrate how trastuzumab-resistant cells utilize cysteine metabolism.

      Thank you for your valuable comments. We would like to extend our appreciation for your efforts. Your constructive suggestions will help improve our research.

      Weaknesses:

      However, there are some key aspects that need to be addressed.

      Major:

      (1) Patient Samples for Transcriptomic Analysis: It is unclear from the text whether tumor tissues or blood samples were used for the transcriptomic analysis. This distinction is crucial, as these two sample types would yield vastly different inferences. The authors should clarify the source of these samples.

      Thank you for your valuable comments. In the transcriptomic analysis, we included data from HER2-positive breast cancer patients who received trastuzumab in the I-SPY2 trial (GSE181574). Tumor tissues were used in this dataset.

      (2) The study only tested one trastuzumab-resistant and one trastuzumab-sensitive cell line. It is unclear whether these findings are applicable to other HER2-positive tumor cell lines, such as HCC1954. The authors should validate their results in additional cell lines to strengthen their conclusions.

      Thank you for your valuable comments. We agree with your opinion that exploring multiple cell lines would make our research findings more comprehensive. This is a limitation of our study, and we will continue to improve our design and methods in future experiments.

      (3) Relevance to Metastatic Disease: Trastuzumab resistance often arises in patients during disease recurrence, which is frequently associated with metastasis. However, the mouse experiments described in this paper were conducted only in the primary tumors. This article would have more impact if the authors could demonstrate that the combination of Erastin or cysteine starvation with trastuzumab can also improve outcomes in metastasis models.

      Thank you for your valuable comments. We agree with your suggestions. Exploring metastatic disease would make our research more meaningful and help better address key clinical issues. In our future studies, we will continue to investigate the association between cysteine metabolism and the invasive and metastatic capabilities of trastuzumab-resistant HER2-positive breast cancer.

      Minor:

      (1) The figures lack information about the specific statistical tests used. Including this information is essential to show the robustness of the results.

      Thank you for your valuable comments. We will include the statistical information in our figure legends.

      (2) Figure 3K Interpretation: The significance asterisks in Figure 3K do not specify the comparison being made. Are they relative to the DMSO control? This should be clarified.

      Thank you for your valuable comments. We will clarify the comparison information in our figure legends.

      Reviewer #2 (Public review):

      In this manuscript, Hua et al. proposed SLC7A11, a protein facilitating cellular cystine uptake, as a potential target for the treatment of trastuzumab-resistant HER2-positive breast cancer. If this claim holds true, the finding would be of significance and might be translated to clinical practice. Nevertheless, this reviewer finds that the conclusion was poorly supported by the data.

      Notably, most of the data (Figures 2-6) were based on two cell lines: JIMT1 as a representative trastuzumab-resistant cell line, and SKBR3 as a representative trastuzumab-sensitive cell line. As such, these findings could be cell-line specific and entirely irrelevant to trastuzumab sensitivity. Furthermore, the authors claimed ferroptosis simply based on lipid peroxidation (Figure 3). Cell viability was not determined, and the rescuing effects of ferroptosis inhibitors were missing. The xenograft experiments were also suspicious (Figure 4). The description of how cysteine starvation was performed on xenograft tumors was lacking, and the compound (i.e., erastin) used by the authors is not suitable for in vivo experiments due to low solubility and low metabolic stability. Finally, it is confusing why the authors focused on epigenetic regulations (Figures 5 & 6), without measuring major transcription factors (e.g., NRF2, ATF4) which are known to regulate SLC7A11.

      To sum up, this reviewer finds that the most valuable data in this manuscript is perhaps Figure 1, which provides unbiased information concerning the metabolic patterns in trastuzumab-sensitive and primary resistant HER2-positive breast cancer patients.

      Thank you for your valuable comments. We agree with your suggestions. Your feedback will help enhance the quality of our research.

      (1) Our research was mainly conducted in JIMT1 (trastuzumab-resistant) and SKBR3 (trastuzumab-sensitive) cells, and this is a limitation of our study. Experimental validation in additional cell lines would make our research findings more persuasive. In our future research, we will continue to optimize our experimental design and methods to make our findings more comprehensive.

      (2) Ferroptosis was detected in our research mainly by evaluating lipid peroxidation. Experiments measuring cell viability and the rescuing effects of ferroptosis inhibitors would provide more evidence.

      (3) In the xenograft experiments, cysteine starvation was performed by feeding a cysteine-free diet. The drug dissolution and other conditions were optimized by referring to previous relevant literature. We will clarify more details in our article.

      (4) Epigenetic modifications have been recognized as crucial factors in the development of drug resistance. An increasing number of studies have emphasized the importance of epigenetic changes in regulating the abnormal expression of oncogenes and tumor suppressor genes related to drug resistance. Currently, the role of epigenetic changes in the development of trastuzumab resistance in HER2-positive breast cancer is still being explored. We investigated the dysregulation of histone modifications and DNA methylation in trastuzumab-resistant HER2-positive breast cancer. Our findings indicated that targeting H3K4me3 and DNA methylation could decrease SLC7A11 expression and induce ferroptosis. This provides more evidence for exploring trastuzumab resistance mechanisms. We will provide a more detailed discussion in the article.

      We would like to extend our appreciation for your constructive suggestions and continue to improve our research in future experiments.

    1. Reviewer #1 (Public review):

      In this manuscript, the authors report that GPR55 activation in presynaptic terminals of Purkinje cells decreases GABA release at the PC-DCN synapse. The authors use an impressive array of techniques (including highly challenging presynaptic recordings) to show that GPR55 activation reduces the readily releasable pool of vesicles without affecting presynaptic AP waveform and presynaptic Ca2+ influx. This is an interesting study, which is seemingly well-executed and proposes a novel mechanism for the control of neurotransmitter release. However, the authors' main conclusions are heavily, if not solely, based on pharmacological agents that more often than not demonstrate affinity at multiple targets. Below are points that the authors should consider in a revised version.

      Major points:

      (1) There is no clear evidence that GPR55 is specifically expressed in presynaptic terminals at the PC-DCN synapse. The authors cited Ryberg 2007 and Wu 2013 in the introduction, mentioning that GPR55 is potentially expressed in PCs. Ryberg (2007) offers no such evidence, and the expression in PC suggested by Wu (2013) does not necessarily correlate with presynaptic expression. The authors should perform additional experiments to demonstrate the presynaptic expression of GPR55 at PC-DCN synapse.

      (2) The authors' conclusions rest heavily on pharmacological experiments, with compounds that are sometimes not selective for single targets. Genetic deletion of GPR55 would be a more appropriate control. The authors should also expand their experiments with occlusion experiments, showing if the effects of LPI are absent after AM251 or O-1602 treatment. In addition, the authors may want to consider AM281 as a CB1R antagonist without reported effects at GPR55.

      (3) It is not clear how long the different drugs were applied, and at what time the recordings were performed during or following drug application. It appears that GPR55 agonists can have transient effects (Sylantyev, 2013; Rosenberg, 2023), possibly due to receptor internalization. The timeline of drug application should be reported, where IPSC amplitude is shown as a function of time and drug application windows are illustrated.

      (4) A previous investigation of the role of GPR55 in the control of neurotransmitter release is neither cited nor discussed: Sylantyev et al. (2013, PNAS, "Cannabinoid- and lysophosphatidylinositol-sensitive receptor GPR55 boosts neurotransmitter release at central synapses"). Similarities and differences should be discussed.

      Minor point:

      (1) What is the source of LPI? What isoform was used? The multiple isoforms of LPI have different affinities for GPR55.

    2. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      In this manuscript, the authors report that GPR55 activation in presynaptic terminals of Purkinje cells decreases GABA release at the PC-DCN synapse. The authors use an impressive array of techniques (including highly challenging presynaptic recordings) to show that GPR55 activation reduces the readily releasable pool of vesicles without affecting presynaptic AP waveform and presynaptic Ca2+ influx. This is an interesting study, which is seemingly well-executed and proposes a novel mechanism for the control of neurotransmitter release. However, the authors' main conclusions are heavily, if not solely, based on pharmacological agents that more often than not demonstrate affinity at multiple targets. Below are points that the authors should consider in a revised version.

      We thank the reviewer for the encouraging comments, and will fully address the reviewer’s concerns as detailed below.

      Major points:

      (1) There is no clear evidence that GPR55 is specifically expressed in presynaptic terminals at the PC-DCN synapse. The authors cited Ryberg 2007 and Wu 2013 in the introduction, mentioning that GPR55 is potentially expressed in PCs. Ryberg (2007) offers no such evidence, and the expression in PC suggested by Wu (2013) does not necessarily correlate with presynaptic expression. The authors should perform additional experiments to demonstrate the presynaptic expression of GPR55 at PC-DCN synapse.

      We agree with the reviewer’s concern that the present manuscript lacks evidence for the localization of GPR55 at PC axon terminals. Honestly, our previous attempt to immunolabel GPR55 did not work well. We now realize that different antibodies are commercially available, and we are going to test them. In the revised manuscript, we hope to present immunocytochemical images showing GPR55 at the terminals of PCs.

      (2) The authors' conclusions rest heavily on pharmacological experiments, with compounds that are sometimes not selective for single targets. Genetic deletion of GPR55 would be a more appropriate control. The authors should also expand their experiments with occlusion experiments, showing if the effects of LPI are absent after AM251 or O-1602 treatment. In addition, the authors may want to consider AM281 as a CB1R antagonist without reported effects at GPR55.

      We thank the reviewer for pointing out the essential issue regarding the specificity of GPR55 activation in our study. Regarding the direct manipulation of GPR55, such as genetic deletion, we will try acute knock-down of its expression, considering the possibility of compensation, which sometimes occurs when a complete knock-out is performed. In addition, following the reviewer’s suggestion, we will examine whether the effects of LPI and AM251 occlude each other, and also perform control experiments showing the lack of CB1R involvement.

      (3) It is not clear how long the different drugs were applied, and at what time the recordings were performed during or following drug application. It appears that GPR55 agonists can have transient effects (Sylantyev, 2013; Rosenberg, 2023), possibly due to receptor internalization. The timeline of drug application should be reported, where IPSC amplitude is shown as a function of time and drug application windows are illustrated.

      As suggested, the timing and duration of drug application will be indicated together with the time course of changes of IPSC amplitudes. This change will make things much clearer. Thank you for the suggestion.

      (4) A previous investigation of the role of GPR55 in the control of neurotransmitter release is neither cited nor discussed: Sylantyev et al. (2013, PNAS, "Cannabinoid- and lysophosphatidylinositol-sensitive receptor GPR55 boosts neurotransmitter release at central synapses"). Similarities and differences should be discussed.

      We are really sorry for omitting this important study from our citations and discussion. In the revised version, we will discuss their findings in relation to our data.

      Minor point:

      (1) What is the source of LPI? What isoform was used? The multiple isoforms of LPI have different affinities for GPR55.

      We are sorry for the insufficient explanation of the LPI used in our study. We used LPI derived from soy (Merck, catalog #L7635) that was estimated to contain 58% C16:0 and 42% C18:0 or C18:2 LPI. This information will be added to the Materials and Methods in the revised manuscript.

      Reviewer #2 (Public review):

      Summary:

      This paper investigates the mode of action of GPR55, a relatively understudied type of cannabinoid receptor, in presynaptic terminals of Purkinje cells. The authors use demanding techniques of patch clamp recording of the terminals, sometimes coupled with another recording of the postsynaptic cell. They find a lower release probability of synaptic vesicles after activation of GPR55 receptors, while presynaptic voltage-dependent calcium currents are unaffected. They propose that the size of a specific pool of synaptic vesicles supplying release sites is decreased upon activation of GPR55 receptors.

      Strengths:

      The paper uses cutting-edge techniques to shed light on a little-studied, potentially important type of cannabinoid receptor. The results are clearly presented, and the conclusions are for the most part sound.

      We are really happy to hear the encouraging comments from the reviewer.

      Weaknesses:

      The nature of the vesicular pool that is modified following activation of GPR55 is not definitively characterized.

      During revision, we will perform further analysis and additional experiments to obtain deeper insights into the vesicle pools affected by GPR55 as much as possible.

      Reviewer #3 (Public review):

      Summary:

      Inoshita and Kawaguchi investigated the effects of GPR55 activation on synaptic transmission in vitro. To address this question, they performed direct patch-clamp recordings from axon terminals of cerebellar Purkinje cells and fluorescent imaging of vesicular exocytosis utilizing synapto-pHluorin. They found that exogenous activation of GPR55 suppresses GABA release at Purkinje cell to deep cerebellar nuclei (PC-DCN) synapses by reducing the readily releasable pool (RRP) of vesicles. This mechanism may also operate at other synapses.

      Strengths:

      The main strength of this study lies in combining patch-clamp recordings from axon terminals with imaging of presynaptic vesicular exocytosis to reveal a novel mechanism by which activation of GPR55 suppresses inhibitory synaptic strength. The results strongly suggest that GPR55 activation reduces the RRP size without altering presynaptic calcium influx.

      We thank the reviewer for the positive evaluation on our conclusions.

      Weaknesses:

      The study relies on the exogenous application of GPR55 agonists. It remains unclear whether endogenous ligands released due to physiological or pathological activities would have similar effects. There is no information regarding the time course of the agonist-induced suppression. There is also little evidence that GPR55 is expressed in Purkinje cells. This study would benefit from using GPR55 knockout (KO) mice. The downstream mechanism by which GPR55 mediates the suppression of GABA release remains unknown.

      We agree with the reviewer on all the points raised as weaknesses. Most issues will be made much clearer by the additional experiments and analysis described above in response to the issues raised by the other reviewers. Whether endogenous GPR55 ligands cause the synaptic depression, and the downstream mechanism involved, are very important issues; we will discuss these points in the revised manuscript and would like to work on them in future studies.

    1. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Edwards et al. describe hamFISH, a customizable and cost-efficient method for performing targeted spatial transcriptomics. hamFISH utilizes highly amplified multiplexed branched DNA amplification, and the authors extensively describe hamFISH development and its advantages over prior variants of this approach.

      The authors then used hamFISH to investigate an important circuit in the mouse brain for social behavior, the medial amygdala (MeA). To develop a hamFISH probe set capable of distinguishing MeA neurons, the authors mined published single-cell RNA-sequencing datasets of the MeA, ultimately creating a panel of 32 hamFISH probes that mostly cover the identified MeA cell types. They evaluated over 600,000 MeA cells and classified neurons into 16 inhibitory and 10 excitatory types, many of which are spatially clustered. The authors combined hamFISH with viral and other circuit tracer injections to determine whether the identified MeA cell populations sent and/or received unique inputs from connected brain regions, finding evidence that several cell types had unique patterns of input and output. Finally, the authors performed hamFISH on the brains of male mice that were placed in behavioral conditions that elicit aggressive, infanticidal, or mating behaviors, finding that some cell populations are selectively activated (as assessed by c-fos mRNA expression) in specific social contexts.

      Strengths:

      (1) The authors developed an optimized tissue preparation protocol for hamFISH and implemented oligopools instead of individually synthesized oligonucleotides to reduce costs. The branched DNA amplification scheme improved smFISH signal compared to previous methods, and multiple variants provide additional improvements in signal intensity and specificity. Compared to other spatial transcriptomics methods, the pipeline for imaging and analysis is streamlined and is compatible with other techniques like fluorescence-based circuit tracing. This approach is cost-effective and has several advantages that make it a valuable addition to the list of spatial transcriptomics toolkits.

      (2) Using 31 probes, hamFISH was able to detect 16 inhibitory and 10 excitatory neuron types in the MeA subregions, including the vast majority of cell types identified by other transcriptomics approaches. The authors quantified the distributions of these cell types along the anterior-posterior, dorsal-ventral, and medial-lateral axes, finding spatial segregation among some, but not all, MeA excitatory and inhibitory cell types. The authors additionally identified a class of inhibitory neurons expressing Ndnf (and a subset of these that express Chrna7) that project to multiple social chemosensory circuits.

      (3) The authors combined hamFISH with MeA input and output mapping, finding cell-type biases in the projections to the MPOA, BNST, and VMHvl, and inputs from multiple regions.

      (4) The authors identified excitatory and inhibitory cell types, and patterns of activity across cell types, that were selectively activated during various social behaviors, including aggression, mating, and infanticide, providing new insights and avenues for future research into MeA circuit function.

      Weaknesses:

      (1) Gene selection for hamFISH is likely to still be a limiting factor, even with the expanded (32-probe) capacity. This may have contributed to the lack of ability to identify sexually dimorphic cell types (Figure S2B). This is an expected tradeoff for a method that has major advantages in terms of cost and adaptability.

      (2) Adapting hamFISH, for example to other brain regions or tissues, may require extensive optimization.

      (3) Pairing this method with behavioral experiments is likely to require further optimization, as c-fos mRNA expression is an indirect and incomplete survey of neuronal activity (e.g. not all cell types upregulate c-fos when electrically active). As such, there is a risk of false negative results that limit its utility for understanding circuit function.

      (4) The limited compatibility of hamFISH with thicker tissue samples and lack of optical sectioning introduce additional technical limitations. For example, it would be difficult to densely sample larger neural circuits using serial 20 micron sections. Also, because the imaging modality is not clear from the methods, it is difficult to know whether the analysis methods introduce the risk of misattributing gene expression to overlapping cells.

    2. Author response:

      Reviewer #1:

      In their paper entitled "Combined transcriptomic, connectivity, and activity profiling of the medial amygdala using highly amplified multiplexed in situ hybridization (hamFISH)" Edwards et al. present a new method designated as hamFISH (highly amplified multiplexed in situ hybridization) that enables sequential detection of ≤32 genes using multiplexed branched DNA amplification. As proof-of-principle, the authors apply the new technique - in conjunction with connectivity, and activity profiling - to the medial amygdala (MeA) of the mouse, which is a critical nucleus for innate social and defensive behaviors.

      As mentioned by Edwards et al., hamFISH could prove beneficial as an affordable alternative to other in situ transcriptomic methods, including commercial platforms, that are resource-intensive and require complex analysis pipelines. Thus, the authors envision that the method they present could democratize in situ cell-type identification in individual laboratories.

      The data presented by Edwards et al. is convincing. The authors use the appropriate and validated methodology in line with the current state-of-the-art. The paper makes a strong case for the benefits of hamFISH when combining transcriptomics studies with connectivity tracing and immediate early gene-based activity profiling. Notably, the authors also discuss the caveats and limitations of their study/approach in an open and transparent manner.

      In its current state, the manuscript touches upon a number of most intriguing, yet rather preliminary findings. For example, the roles of inhibitory neuron cluster i3 or of the selective and apparently MeA neuron-specific projections (Figure 3 - Figure Supplement 2D) remain elusive. As it is the authors' prime intent to provide "a proof-of-principle example of overlaying transcriptomic types, projection, and activity in a behaviorally relevant manner and demonstrates the usefulness of hamFISH in multiplexed in situ gene expression profiling", such studies might be beyond the scope of the present manuscript. The absence of such more in-depth hypothesis-based analysis, however, prevents an even more enthusiastic overall assessment.

      We thank the reviewer for their positive assessment and agree that further studies are needed to explore and understand the MeA circuit further.

      Reviewer #2:

      The authors describe the development and implementation of hamFISH, a sensitive multiplexed ISH method. They leverage a pre-existing scRNA-seq dataset for the MeA to design 32 probes that combinatorically represent MeA neuronal populations - ~80% of MeA neurons express three of these markers. Using these markers to assess the spatial organization of the MeA, the authors identify a novel population of Ndnf+ projection neurons and characterize their connectivity with anterograde and retrograde labeling. They additionally combine hamFISH with CTB labeling of three principal MeA projection sites to show that 75% of MeA neurons have only a single projection target. Finally, they engage adult male mice in encounters with other adult males (aggression), females (mating), and pups (infanticide), followed by hamFISH and c-fos labeling to relate cell identity to behavior. Their overall conclusion is that hamFISH-defined cell types are broadly active to multiple sensory stimuli. However, the data presented are not sufficient to conclude that no selectivity exists within the MeA. A weakness of the study is that the selected hamFISH genes contain only Lhx6 as a lineage-marking transcription factor. Instead, the authors predominately use neuropeptides as markers. Genes such as Tac1, Cartpt, Adcyap1, Calb1, and Gal are expressed throughout the MeA, and many other brain regions; they are not restricted to a single transcriptomic cell type and they do not denote any developmental origins. By design, the panel has low cell type specificity as all MeA neurons express at least three of the genes. Therefore, the authors' conclusions may not hold with a more stringent classification of cell type or cell identity.

      We agree with the reviewer that a deeper level of cell type classification may reveal the selectivity of cell types that may have been missed. The design of our hamFISH bridge-readout probes allows modification to be compatible with a barcoded readout system such as MERFISH, which would substantially increase the number of genes that can be included in the gene panel. This would, however, increase the complexity of the analysis pipeline and reduce throughput, but would be a potential avenue to explore to define MeA cell types at a deeper level. An advantage of hamFISH is the ease of including and reading out alternative gene panels. For example, one panel could examine developmental-lineage-specific genes. Overall, our panel captures the highest hierarchical level (similar to the subclass level of the Allen taxonomy) of MeA transcriptomic types, based on published data available at the time of our gene panel design. Genes including Tac1, Cartpt, Adcyap1, Calb1, and Gal are expressed in specific patterns within the MeA and are useful for classification. In the original manuscript, we also included our rationale for dropping Foxp2, a lineage-specific marker gene in the MeA.

      Reviewer #3:

      In this manuscript, Edwards et al. describe hamFISH, a customizable and cost-efficient method for performing targeted spatial transcriptomics. hamFISH utilizes highly amplified multiplexed branched DNA amplification, and the authors extensively describe hamFISH development and its advantages over prior variants of this approach.

      The authors then used hamFISH to investigate an important circuit in the mouse brain for social behavior, the medial amygdala (MeA). To develop a hamFISH probe set capable of distinguishing MeA neurons, the authors mined published single-cell RNA-sequencing datasets of the MeA, ultimately creating a panel of 32 hamFISH probes that mostly cover the identified MeA cell types. They evaluated over 600,000 MeA cells and classified neurons into 16 inhibitory and 10 excitatory types, many of which are spatially clustered. The authors combined hamFISH with viral and other circuit tracer injections to determine whether the identified MeA cell populations sent and/or received unique inputs from connected brain regions, finding evidence that several cell types had unique patterns of input and output. Finally, the authors performed hamFISH on the brains of male mice that were placed in behavioral conditions that elicit aggressive, infanticidal, or mating behaviors, finding that some cell populations are selectively activated (as assessed by c-fos mRNA expression) in specific social contexts.

      Strengths:

      (1) The authors developed an optimized tissue preparation protocol for hamFISH and implemented oligopools instead of individually synthesized oligonucleotides to reduce costs. The branched DNA amplification scheme improved smFISH signal compared to previous methods, and multiple variants provide additional improvements in signal intensity and specificity. Compared to other spatial transcriptomics methods, the pipeline for imaging and analysis is streamlined and is compatible with other techniques like fluorescence-based circuit tracing. This approach is cost-effective and has several advantages that make it a valuable addition to the list of spatial transcriptomics toolkits.

      (2) Using 31 probes, hamFISH was able to detect 16 inhibitory and 10 excitatory neuron types in the MeA subregions, including the vast majority of cell types identified by other transcriptomics approaches. The authors quantified the distributions of these cell types along the anterior-posterior, dorsal-ventral, and medial-lateral axes, finding spatial segregation among some, but not all, MeA excitatory and inhibitory cell types. The authors additionally identified a class of inhibitory neurons expressing Ndnf (and a subset of these that express Chrna7) that project to multiple social chemosensory circuits.

      (3) The authors combined hamFISH with MeA input and output mapping, finding cell-type biases in the projections to the MPOA, BNST, and VMHvl, and inputs from multiple regions.

      (4) The authors identified excitatory and inhibitory cell types, and patterns of activity across cell types, that were selectively activated during various social behaviors, including aggression, mating, and infanticide, providing new insights and avenues for future research into MeA circuit function.

      Weaknesses:

      (1) Gene selection for hamFISH is likely to still be a limiting factor, even with the expanded (32-probe) capacity. This may have contributed to the lack of ability to identify sexually dimorphic cell types (Figure S2B). This is an expected tradeoff for a method that has major advantages in terms of cost and adaptability.

      We recognise that the 32-plex gene detection might not be sufficient to address key questions in the transcriptomic organization of innate social behavior circuits, and that the study fell short of addressing more quantitative gene expression differences between sexes. Detecting sexually dimorphic gene expression likely requires a more targeted approach, because the dimorphism lies in differences in expression level rather than in binary expression of marker genes, and the gene panel needs to be specifically configured for this purpose.

      (2) Adapting hamFISH to other brain regions or tissues, for example, may require extensive optimization.

      We have successfully performed hamFISH on at least two other mouse brain regions without needing to optimize further, suggesting that compatibility with other mouse brain regions is not an issue. We recognise, however, that optimization of hamFISH may be required for its application in other types of tissue or species. Human brain tissue, for example, typically suffers from high autofluorescence and different tissue preparation methods may need to be employed. We note that the amplification by hamFISH signal boost with v2 amplifiers may be useful to this end.

      (3) Pairing this method with behavioral experiments is likely to require further optimization, as c-fos mRNA expression is an indirect and incomplete survey of neuronal activity (e.g. not all cell types upregulate c-fos when electrically active). As such, there is a risk of false negative results that limit its utility for understanding circuit function.

      We acknowledge that c-fos is not the only readout of neuronal activity and that a panel of immediate early genes would allow a more comprehensive readout of activity-dependent gene expression. We fully agree that immediate early gene induction is an indirect readout of neural activity, and alternative methods such as in vivo physiology would provide a complementary insight into the selectivity of MeA neuron responses.

      (4) The limited compatibility of hamFISH with thicker tissue samples and lack of optical sectioning introduce additional technical limitations. For example, it would be difficult to densely sample larger neural circuits using serial 20 micron sections. Also, because the imaging modality is not clear from the methods, it is difficult to know whether the analysis methods introduce the risk of misattributing gene expression to overlapping cells.

      We agree that the use of hamFISH as described here is restricted to thin (<20 µm) sections. We have shown, however, that our encoding probe and bridge-readout probe design are compatible with HCR-based mRNA detection, which is compatible with thicker sections. Regarding the misattribution of gene expression to overlapping cells in the z-axis, we used epifluorescence microscopy with 14 × 500 nm z-steps to collect our raw data and generate maximum intensity projections for further analysis. Because of the thin sections (10 µm) used for the imaging, the overlap between cells in z is expected to be minimal. Regarding throughput, we agree that hamFISH is likely not suitable for brain-wide questions that require large volume coverage, but its major advantage is that it allows routine use of low-level multiplexing for targeted brain areas.

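    1. Listing 5-13: Definition of the custom event listener MethodExecutionEventListener

           public interface MethodExecutionEventListener extends EventListener {
               /**
                * Handles the MethodExecutionEvent published when a method starts executing.
                */
               void onMethodBegin(MethodExecutionEvent evt);

               /**
                * Handles the MethodExecutionEvent published when method execution is about to end.
                */
               void onMethodEnd(MethodExecutionEvent evt);
           }

       The event listener interface first extends java.util.EventListener, then declares a handler method for each point at which an event may be published. Most importantly, the parameter these handler methods accept is an event of type MethodExecutionEvent. In other words, our custom event listener class is responsible only for listening to, and handling, its corresponding custom event.

      An event listener interface should extend EventListener; classes that implement the interface handle only the target event class.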
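      The listener contract described above can be sketched with a small implementing class. This is only an illustration under stated assumptions: the excerpt names MethodExecutionEvent but does not show its definition, so the version here (an EventObject subclass carrying a method name) is hypothetical, as are RecordingListener, ListenerSketch, and the demo() helper.

```java
import java.util.ArrayList;
import java.util.EventListener;
import java.util.EventObject;
import java.util.List;

class ListenerSketch {

    // Hypothetical event type: the source names MethodExecutionEvent
    // but does not show its fields, so methodName is an assumption.
    static class MethodExecutionEvent extends EventObject {
        private final String methodName;

        MethodExecutionEvent(Object source, String methodName) {
            super(source);
            this.methodName = methodName;
        }

        String getMethodName() {
            return methodName;
        }
    }

    // The listener interface as given in the listing.
    interface MethodExecutionEventListener extends EventListener {
        void onMethodBegin(MethodExecutionEvent evt);

        void onMethodEnd(MethodExecutionEvent evt);
    }

    // Hypothetical implementation: it handles only MethodExecutionEvent
    // and records what it saw, in order.
    static class RecordingListener implements MethodExecutionEventListener {
        final List<String> log = new ArrayList<>();

        @Override
        public void onMethodBegin(MethodExecutionEvent evt) {
            log.add("begin:" + evt.getMethodName());
        }

        @Override
        public void onMethodEnd(MethodExecutionEvent evt) {
            log.add("end:" + evt.getMethodName());
        }
    }

    // Plays the role of a publisher: notifies the listener at both points.
    static List<String> demo() {
        RecordingListener listener = new RecordingListener();
        MethodExecutionEvent evt = new MethodExecutionEvent(new Object(), "doWork");
        listener.onMethodBegin(evt);
        listener.onMethodEnd(evt);
        return listener.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

      Because both handler methods take a MethodExecutionEvent, the compiler prevents this listener from being handed any other event type, which is the point the annotation makes.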

    1. List at least three ways to make up for missing notes because you miss a class.

      1. Ask a classmate for their notes. 2. Meet with the instructor to discuss the missed material. 3. Review online resources such as podcasts or lectures.

    2. __________________________________________________________________
      1. You get a better understanding of what your notes are actually saying.
      2. Although it's a way of taking notes, it's all just fast writing down of what your teacher is saying, and it doesn't really give a good idea of what everything means.
      1. What is a chemical bond? A chemical bond is an interaction between atoms or ions that stabilizes their outer shells.
      2. What subatomic particles participate in a chemical bond? The valence electrons in the outermost orbital shell.
      3. What is a molecule? Molecules are substances composed of two or more atoms held together by a chemical bond.
      4. What is a salt? Salts are substances composed of ions held together by chemical bonds.
      5. What is a compound? Salts and molecules made up of two or more atoms of different elements.
      6. What is an ionic bond? Occurs between ions with opposite charges.
      7. What is a covalent bond? Occurs between atoms of the same molecule.
      8. What is a hydrogen bond? Occurs between atoms in different molecules, or different parts of the same molecule.
    1. Background In recent years, Large Language Models (LLMs) have shown promise in various domains, notably in biomedical sciences. However, their real-world application is often limited by issues like erroneous outputs and hallucinatory responses. Results We developed the Knowledge Graph-based Thought (KGT) framework, an innovative solution that integrates LLMs with Knowledge Graphs (KGs) to improve their initial responses by utilizing verifiable information from KGs, thus significantly reducing factual errors in reasoning. The KGT framework demonstrates strong adaptability and performs well across various open-source LLMs. Notably, KGT can facilitate the discovery of new uses for existing drugs through potential drug-cancer associations, and can assist in predicting resistance by analyzing relevant biomarkers and genetic mechanisms. To evaluate the Knowledge Graph Question Answering (KGQA) task within biomedicine, we utilize a pan-cancer knowledge graph to develop a pan-cancer question answering benchmark, named the Pan-cancer Question Answering (PcQA). Conclusions The KGT framework substantially improves the accuracy and utility of LLMs in the biomedical field. This study serves as a proof-of-concept, demonstrating its exceptional performance in biomedical question answering.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giae082), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer: Linhao Luo

      Summary: This paper proposes a novel framework called KGT that integrates Large Language Models (LLMs) with Knowledge Graphs (KGs) for pan-cancer question answering. The KGT framework can effectively retrieve knowledge from KGs and improve the accuracy of LLMs for question answering. Moreover, it can provide interpretable and faithful explanations with the help of structured KGs.

      Comments:

      1. This paper constructs a new dataset, denoted PcQA, from a customized KG called SOKG for the evaluation of pan-cancer question answering. This is a great contribution to the community. However, it is unclear how such a dataset was constructed. More details about the construction process and the statistics of the final dataset should be discussed in the paper. For example, how were the natural language questions and answers generated? How were the questions linked with the related KG information (i.e., entities and relations)? How many questions can be answered by the KG (i.e., the answer coverage rate)? How many questions have been generated? What is the ratio of each question type defined in Table 2?

      2. In Table 2, the authors define 4 reasoning types. What about other reasoning types, such as union and negation? Can these types be incorporated into the dataset?

      3. The proposed method is novel and interesting. However, some details are unclear. In the candidate path search, do we want to search reasoning paths or relational chains? The definitions of these two kinds of paths are also unclear. Please give clear definitions of them in the preliminaries. If it is the reasoning paths, do we only keep the type information during the BFS?

      4. I do not understand why we need to generate a Cypher query to retrieve a subgraph and then construct relation paths from the KG. We can directly retrieve relational paths from KGs by BFS. What are the benefits and motivations of using this two-stage pipeline?

      5. What do the X and √ in the figure mean, and how are they obtained?

      6. In the experiments, other advanced KGQA methods could be compared, e.g., RoG [1] and ToG [2].

      7. An analysis of the tokens used, time, and cost should be discussed in the paper.

      8. Can the proposed method be applied to other KGs (e.g., SynLethKG and SDKG) or KGQA tasks (MetaQA and FACTKG) to show its generalizability?

      [1] Luo, L., Li, Y. F., Haf, R., & Pan, S. Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning. In The Twelfth International Conference on Learning Representations.

      [2] Sun, J., Xu, C., Tang, L., Wang, S., Lin, C., Gong, Y., ... & Guo, J. (2023). Think-on-graph: Deep and responsible reasoning of large language model with knowledge graph. arXiv preprint arXiv:2307.07697

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:  

      Reviewer #1 (Public Review):  

      Summary:  

      The study by Pudlowski et al. investigates how the intricate structure of centrioles is formed by studying the role of a complex formed by delta- and epsilon-tubulin and the TEDC1 and TEDC2 proteins. For this, they employ knockout cell lines, EM, and ultrastructure expansion microscopy as well as pull-downs. Previous work has indicated a role of delta- and epsilon-tubulin in triplet microtubule formation. Without triplet microtubules centriolar cylinders can still form, but are unstable, resulting in futile rounds of de novo centriole assembly during S phase and disassembly during mitosis. Here the authors show that all four proteins function as a complex and knockout of any of the four proteins results in the same phenotype. They further find that mutant centrioles lack inner scaffold proteins and contain an extended proximal end including markers such as SAS6 and CEP135, suggesting that triplet microtubule formation is linked to limiting proximal end extension and formation of the central region that contains the inner scaffold. Finally, they show that mutant centrioles seem to undergo elongation during early mitosis before disassembly, although it is not clear if this may also be due to prolonged mitotic duration in mutants.  

      Strengths:  

      Overall this is a well-performed study, well presented, with conclusions mostly supported by the data. The use of knockout cell lines and rescue experiments is convincing.  

      Weaknesses:  

      In some cases, additional controls and quantification would be needed, in particular regarding cell cycle and centriole elongation stages, to make the data and conclusions more robust. 

      We thank the reviewer for these comments and have improved our analyses of these as detailed below.

      Reviewer #2 (Public Review):  

      Summary:  

      In this article, the authors study the function of TEDC1 and TEDC2, two proteins previously reported to interact with TUBD1 and TUBE1. Previous work by the same group had shown that TUBD1 and TUBE1 are required for centriole assembly and that human cells lacking these proteins form abnormal centrioles that only have singlet microtubules that disintegrate in mitosis. In this new work, the authors demonstrate that TEDC1 and TEDC2 depletion results in the same phenotype with abnormal centrioles that also disintegrate in mitosis. In addition, they were able to localize these proteins to the proximal end of the centriole, a result not previously achieved with TUBD1 and TUBE1, providing a better understanding of where and when the complex is involved in centriole growth.

      Strengths:  

      The results are very convincing, particularly the phenotype, which is the same as previously observed for TUBD1 and TUBE1. The U-ExM localization is also convincing: despite a signal that's not very homogeneous, it's clear that the complex is in the proximal region of the centriole and procentriole. The phenotype observed in U-ExM on the elongation of the cartwheel is also spectacular and opens the question of the regulation of the size of this structure. The authors also report convincing results on direct interactions between TUBD1, TUBE1, TEDC1, and TEDC2, and an intriguing structural prediction suggesting that TEDC1 and TEDC2 form a heterodimer that interacts with the TUBD1-TUBE1 heterodimer.

      Weaknesses:  

      The phenotypes observed in U-ExM on cartwheel elongation merit further quantification, enabling the field to appreciate better what is happening at the level of this structure.  

      We thank the reviewer for these comments and have improved our analyses of cartwheel elongation as detailed below.

      Reviewer #3 (Public Review):  

      Summary:  

      Human cells deficient in delta-tubulin or epsilon-tubulin form unstable centrioles, which lack triplet microtubules and undergo a futile formation and disintegration cycle. In this study, the authors show that human cells lacking the associated proteins TEDC1 or TEDC2 have these identical phenotypes. They use genetics to knock out TEDC1 or TEDC2 in p53-negative RPE-1 cells and expansion microscopy to structurally characterize mutant centrioles. Biochemical methods and AlphaFold-multimer prediction software are used to investigate interactions between tubulins and TEDC1 and TEDC2.

      The study shows that mutant centrioles are built only of A tubules, which elongate and extend their proximal region, fail to incorporate structural components, and finally disintegrate in mitosis. In addition, they demonstrate that delta-tubulin or epsilon-tubulin and TEDC1 and TEDC2 form one complex and that TEDC1 and TEDC2 can interact independently of tubulins. Finally, they show that the localization of the four proteins is mutually dependent.

      Strengths:  

      The results presented here are mostly convincing, the study is exciting and important, and the manuscript is well-written. The study shows that delta-tubulin, epsilon-tubulin, TEDC1, and TEDC2 function together to build a stable and functional centriole, significantly contributing to the field and our understanding of the centriole assembly process.  

      Weaknesses:  

      The ultrastructural characterization of TEDC1 and TEDC2 obtained by U-ExM is inconclusive. Improving the quality of the signals is paramount for this manuscript.  

      We thank the reviewer for these comments and have improved our imaging of TEDC1 and TEDC2 localization, as detailed below.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):  

      The reviewers agreed that the conclusions are largely supported by solid evidence, but felt that improving the following aspects would make some of the data more convincing:  

      (1) The UExM localizations of TEDC1/2 are not very convincing and the reviewers suggest complementing these with alternative super-resolution approaches (e.g. SIM) and/or different labeling techniques such as pre-expansion labeling using STAR red/orange secondaries (also robust for SIM and STED), use of the Halo tag, different tag antibodies, etc.

      We thank the reviewers for these recommendations and have adapted two of these strategies to improve our imaging of TEDC1 and TEDC2 localization. First, we used an alternative super-resolution approach, a Yokogawa CSU-W1 SoRA confocal scanner (resolution = 120 nm) and imaged cells grown on coverslips (not expanded). We found that TEDC1 and TEDC2 localize to procentrioles and the proximal end of parental centrioles (Fig 2 – Supplementary Figure 1a, b). Second, we used a recently described expansion gel chemistry (Kong et al., Methods Mol Biol 2024) combined with Abberior Star red and orange secondary antibodies. This technique resulted in robust signal at centrosomes and in the cytoplasm and indicated that TEDC1 and TEDC2 localize near the centriole walls of procentrioles and the proximal region of parental centrioles, near CEP44 (Fig 2 – Supplementary Figure 1c, d). These results complement and support our initial observations (Fig 2C, D) and we have edited the text to reflect this (lines 157-163). We also note that these Flag tag and V5 tag primary antibodies are specific and have little background signal in all applications (Fig 2 – Supplementary Fig 1E-J), while other commercially available antibodies against these tags did exhibit non-specific signal. 

      (2) The cell cycle classifications of centrioles would strongly benefit, apart from a better description, from adding quantifications of average centriole length at a given stage based on tubulin staining (not acTub). 

      We thank the reviewers for these recommendations. We have added an improved description of our cell cycle analyses (lines 234-237). We have also added new analyses for centriole length as measured by staining with alpha-tubulin (Fig 4 – Supp 3 and Fig 4 – Supp 4). We find that in all mutants, acetylated tubulin elongates along with alpha-tubulin in a similar way as control centrioles.

      Reviewer #1 (Recommendations For The Authors):  

      Specific points:  

      (1) The introduction is a bit oddly structured. About halfway through it summarizes what is going to be presented in the study, giving the impression that it is about to conclude, but then continues with additional, detailed introduction paragraphs. Overall, the authors may also want to consider making it more concise.

      We thank the reviewer for these suggestions and have shortened and restructured the introduction for clarity and conciseness.

      (2) The text should explain to the non-expert reader why endogenous proteins are not detected and why exogenously expressed, tagged versions are used. Related to this, the authors state overexpression, but what is this assessment based on? Does expression at the endogenous level also rescue? At least by western blot, these questions should be addressed. 

      In the text, we have added clarification about why endogenous proteins were not detected for immunofluorescence (lines 149-151). To quantify the overexpression, we have added Western blots of TEDC1 and TEDC2 to Fig 1 – Supplementary Figure 1E,F. We note that endogenous levels of both proteins are very low, and the rescue constructs are overexpressed 20 to 70 fold above endogenous levels.  

      (3) The figures should clearly indicate when tagged proteins are used and detected. Currently, this info is only found in the legends but should be in the figure panels as well.

      We have made these changes to the figure panels in Fig 2, Fig 2 – Supp 1, and Fig 3.

      (4)  I could not find a description and reference to Figure 2 Supplement 2 and 3. 

      We have replaced these supplements with new supplementary figures for TEDC1 and TEDC2 localization (Fig 2 – Supp 1).

      (5) The multiple bands including unspecific (?) bands should be labeled to guide the reader in the western blots. 

      We have labeled nonspecific bands in our Western blots with asterisks (Fig 1 – Supp 1, Fig 3).

      (6) The AlphaFold prediction suggests that TUBD1 can bind to the TED complex in the absence of TUBE1; can this be shown? This would be a nice validation of the predicted architecture of the complex. I also missed a bit of a discussion of the predicted architecture. How could it be linked to triplet microtubule formation? Is the latest AlphaFold version 3 adding anything to this analysis?

      In our pulldown experiments, we found that TUBD1 cannot bind to TEDC1 or TEDC2 in the absence of TUBE1 (Fig 3C, D, IB: TUBD1). We performed this experiment with three biological replicates and found the same result. It is possible that TUBD1 and TUBE1 form an intact heterodimer, similar to alpha-tubulin and beta-tubulin, and this will be an exciting area of future research.

      We have added new analysis from AlphaFold3 (Fig 3 – Supp 1B). AlphaFold3 predicts a similar structure as AlphaFold Multimer.

      We have also added additional discussion about the AlphaFold prediction to the text (lines 220-222, 365-367). Thanks to the reviewer for pointing out this oversight.

      (7) I suggest briefly explaining in the text how cells and centrioles at different cell cycle stages were identified. I found some info in the legend of Figure 1, but no info for other figures or in the text. Related to this, how are procentrioles defined in de novo formation? There is no parental centriole to serve as a reference. 

      We have added a brief explanation of the synchronization and identification in lines 234-237. We have also clarified the text regarding de novo centrioles, and now term these “de novo centrioles in the first cell cycle after their formation” (lines 271-272).

      (8) Related to point 7: using acetylated tubulin as a universal length and width marker seems unreliable since it is a PTM. The authors should use general tubulin staining to estimate centriole dimensions, or at least establish that acetylated tubulin correlates well with the overall tubulin signal in all mutants. 

      We have added two supplementary data figures (Fig 4 – supp 3 and Fig 4 – supp 4) in which we co-stain control and mutant centrioles with alpha-tubulin. We found that acetylated tubulin marked mutant centrioles well and as alpha-tubulin length increased, acetylated tubulin length also increased. 

      (9) Presence and absence of various centriolar proteins. These analyses lack a clear reference for the precise centriole elongation stage. This is particularly problematic for proteins that are recruited at specific later stages (such as inner scaffold proteins). The staining should be correlated with centriole length measurements, ideally using general tubulin staining.  

      As described for point 8, we have added two supplementary data figures in which we co-stain control and mutant centrioles with alpha-tubulin and found that acetylated tubulin also increases as overall tubulin length increases in all mutants. We note that inner scaffold proteins are absent in all our mutant centrioles at all stages of the cell and centriole cycle, as also previously reported for POC5 in Wang et al., 2017.

      Reviewer #2 (Recommendations For The Authors):  

      Here's a list of points I think could be improved:  

      -  As the authors previously published, the centriole appears to have a smaller internal diameter than mature centrioles. Could the authors measure to see if the phenotype is identical? Is the centriole blocked in the bloom phase (Laporte et al. 2024)? 

      We have added an additional supplementary figure (Fig 4 – supp 5) to show that mutant centrioles have smaller diameters than mature centrioles, as we previously reported for the delta-tubulin and epsilon-tubulin mutant centrioles by EM. We thank the reviewers for the additional question of the bloom phase. Given the comparatively smaller number of centrioles we analyzed in this paper compared to Laporte et al (50 to 80 centrioles per condition here, versus 800 centrioles in Laporte et al), it is difficult to definitively conclude whether there is a block in bloom phase. This would be an interesting area for future research.  

      -  The images of the centrioles in EM are beautiful. Would it be possible to apply a symmetrisation on it to better see the centriolar structures? For example, is the A-C linker present? 

      We thank the reviewer for this excellent suggestion. Using centrioleJ, we find that the A-C linker is absent from mutant centrioles. The symmetrized images have been added to Fig 1 – Supplementary Fig 2, and additional discussion has been added to the text (line 143-144, line 368-374).  

      -  How many EM images were taken? Did the centrioles have 100% A-microtubule only or sometimes with B-MT? 

      For TEM, we focused on centrioles that were positioned to give perfect cross-section images of the centriolar microtubules, and thus did not take images of off-angle or rotated centrioles. Given the difficulty of this experiment (centrioles are small structures within the cell, centrosomes are single-copy organelles, and off-angle centrioles were not imaged), we were lucky to image 3 centrioles that were in perfect cross-section – 2 for Tedc1<sup>-/-</sup> and 1 for Tedc2<sup>-/-</sup>. Our images indicate that these centrioles only have A-tubules (Fig 1 – Supp Fig 2).

      -  In Figure 2 - it would be preferable to write TEDC2-flag or TEDC1-flag and not TEDC2/1. 

      We have made this change.

      -  It seems that Figures 2C and D aren't cited, and some of the data in the supplemental data are not described in the main text. 

      We have replaced these supplements with new supplementary figures for TEDC1 and TEDC2 localization (Fig 2 – Supp 1).

      -  The signal in U-ExM with the anti-Flag antibody is heterogeneous. Did the authors test several anti-FLAG antibodies in U-ExM? 

      We tested several anti-Flag and anti-V5 antibodies for our analyses, and chose these because they have little background signal in all applications (Fig 2 – Supplementary Fig 1E-J). Other commercially available antibodies against these tags did exhibit non-specific signal.

      -  The AlphaFold prediction is difficult to interpret, the authors should provide more views and the PDB file. 

      We have added 2 additional views of the AlphaFold prediction in Fig 3 – Supp 1A.

      -  In general, but particularly for Figure 4: the length doesn't seem to be divided by the expansion factor, it is therefore difficult to compare with known EM dimensions. Can the authors correct the scale bars? 

      We have corrected the scale bars for all figures to account for the expansion factor.

      -  Concerning Gamma-tubulin that is "recruited to the lumen of centrioles by the inner scaffold, had localization defects in mutant centrioles. However, we were unable to reliably detect gamma-tubulin within the lumen of control or de novo-formed centrioles in S or G2-phase (Figure 4 - Supplement 1E), and thus were unable to test this hypothesis". In Laporte et al 2024, Gamma-tubulin arrives later than the inner scaffold and only on mature centrioles, so this result appears to be in line with previous observation. However, the authors should be able to detect a proximal signal under the microtubules of the procentriole, is this the case? 

      We agree that this is an exciting question. However, in our expansion microscopy staining, we frequently observe that gamma-tubulin surrounds centrioles, corresponding to its role in the pericentriolar material (PCM). In our hands, we find it difficult to distinguish between centriolar gamma-tubulin at the base of the A-tubule from gamma-tubulin within the PCM.  

      -  In the signal elongation of SAS-6, STIL, CEP135, CPAP, and CEP44, would it be possible to quantify the length of these signals (with dimensions divided by the expansion factor for comparison with known TEM distances)? 

      We have quantified the lengths of SAS-6 and CEP135 in new Fig 4 – Supp 3 and Fig 4 – Supp 4.  

      -  The authors observe that centrin is present, but only as a SFI1 dot-like localization (which is another protein that would be interesting to look at), and not an inner scaffold localization. Can the authors elaborate? These results suggest that the distal part is correctly formed with only a microtubule singlet. 

      We agree with the reviewer’s interpretation that the centriole distal tip is likely correctly formed with only singlet microtubules, as both distal centrin and CP110 are present. We have added this point to the discussion (line 415).

      -The authors observe that CPAP is elongated, but CPAP has two locations, proximal and distal. Is it distal or proximal elongation? Is the proximal signal of CPAP longer than that of CEP44 in the mutants? The authors discuss that the elongation could come from overexpression of CPAP, but here it seems that the centriole is not overlong, just the structures around the cartwheel. 

      We thank the reviewer for this point. It is difficult for us to conclude whether the proximal or distal region is extended in the mutants, as our mutant centrioles lack a visible separation between these two regions. It would be interesting to probe this question in the future by testing whether subdomains of CPAP may be differentially regulated in our mutants.

      Reviewer #3 (Recommendations For The Authors):  

      It isn't apparent to me what was counted in Figure 1C. Were all centrioles (mother centrioles and procentrioles) counted? Where is the 40% in control cells coming from? Can this set of data be presented differently? 

      We apologize for the confusion. In this figure, all centrioles were counted. We have updated the figure legend for clarity. We performed this analysis in a similar way as in Wang et al., 2017 to better compare phenotypes.  

      Figure 2C and the text in lines 182-187: The ultrastructural characterization of TEDC1 and TEDC2 suffers from the low quality of the TEDC1 and TEDC2 signals obtained post-expansion. In comparison with the robust low-resolution immunosignal, it appears that most of the signal cannot be recovered after expansion. Another sub-resolution imaging method to re-analyze TEDC1 and TEDC2 localization would be essential. The same concern applies to Figures 2 - Supplement 2 and 3. Also, Figure 2 - Supplement 2 and Supplement 3 do not seem to be cited. 

      We thank the reviewer for these recommendations. As also mentioned above, we used an alternative super-resolution approach, a Yokogawa CSU-W1 SoRA confocal scanner (resolution = 120 nm), and found that TEDC1 and TEDC2 localize to procentrioles and the proximal end of parental centrioles (Fig 2 – Supplementary Figure 1a, b). Second, we used a recently described expansion gel chemistry (Kong et al., Methods Mol Biol 2024) combined with Abberior Star red and orange secondary antibodies. This technique resulted in robust signal at centrosomes and in the cytoplasm and indicated that TEDC1 and TEDC2 localize near the centriole walls of procentrioles and the proximal region of parental centrioles, near CEP44 (Fig 2 – Supplementary Figure 1c, d). These stainings complement and support our initial observations (Fig 2C, D) and we have edited the text to reflect this (lines 157-163). We have also removed the supplementary figures that were uncited in the text.

      TUBD1 and TUBE1 form a dimer and TEDC2 and TEDC1 can interact. Any speculation as to why TEDC2 does not pull down both TUBE1 and TUBD1? 

      We apologize for the confusion. TEDC2 does pull down both TUBE1 and TUBD1 (Fig 3D, pull-down, second column, Tedc2-V5-APEX2 rescuing the Tedc2<sup>-/-</sup> cells pulls down TUBD1, TUBE1, and TEDC1).  

      Figure 4A and B. The authors use acetylated tubulin to determine the length of procentrioles in the S and G2 phases. However, procentrioles are not acetylated on their distal ends in these cell cycle phases (as the authors also mention further in the text). Why has alpha tubulin not been used since it works well in U-ExM? The average size of the control G2 procentrioles seems too small in Figure 4A and not consistent with other imaging data (for instance, in Figure 4 - Supplement 1 C, Cp110, and CPAP staining). There is no statistical analysis in Figure 4A.  

      We have added two supplementary data figures (Fig 4 – supp 3 and Fig 4 – supp 4) in which we co-stain control and mutant centrioles with alpha-tubulin. We found that acetylated tubulin correlates well with overall tubulin signal in all mutants. We have added statistical analysis to the figure legend of Fig 4A.

      Lines 260 - 262: "These results indicate that centrioles with singlet microtubules can elongate to the same length as controls, and therefore that triplet microtubules are not essential for regulating centriole length." It is hard to agree with this statement. Mutant procentrioles show aberrantly elongated proximal signals of several tested proteins. In addition, in lines 326 - 328, the authors state that "Together, these results indicate that centrioles lacking compound microtubules are unable to properly regulate the length of the proximal end."  

      We thank the reviewer and have clarified the statement to state that these results indicate that centrioles with singlet microtubules can elongate to the same overall length as control centrioles in G2 phase.  

      Line 353: The authors suggest that elongated procentriole structure in mitosis may represent intermediates in centriole disassembly. Another interpretation, more in line with the EM data from Wang et al., 2017, would be that these mutant procentrioles first additionally elongate before they disassemble in late mitosis. The aberrant intermediate structure concept would need further exploration. For instance, anti-alpha/beta-tubulin antibodies could be used to investigate centriole microtubules.  

      We apologize for the confusion and have edited this section for clarity (lines 341-343): “We conclude that in our mutant cells, centrioles elongate in early mitosis to form an aberrant intermediate structure, followed by fragmentation in late mitosis.”

      References need to be included in lines 122, 277, 279. 

      We have added these references.

      Line 281: Add references PMID: 30559430 and PMID: 32526902.  

      We have added these references (lines 265-266).

      Line 289: "Moreover, our results suggest that centriole glutamylation is a multistep process, in which long glutamate side chains are added later during centriole maturation." This does not seem like an original observation. For instance, see PMID: 32526902.  

      We have added this reference (lines 273-274).

    1. Author response:

      Reviewer 1:

      (1) Provide RMSD and DALI scores to show how similar the AlphaFold-predicted structures of BrrG are to other anti-termination factors. This should be done for Fig1B and also for Suppl. Fig 1 to support the claim that BrrG, GafA, GafZ, Q21 share structural features.

      In the revised manuscript we will provide RMSD and DALI scores.

      (2) Throughout the manuscript, flow cytometry data of gfp expression was used and shown as single replicate. Korotaev et al wrote in the legends that error bars are shown (that is not true for e.g. Figs. 3, 4, and 5). It is difficult for reviewers/readers to gauge how reliable are their experiments.

      As stated in the manuscript all flow cytometry data were performed in triplicate. In the revised manuscript we will include the two replicates not presented in the main figures as supplementary information.

      (3) I am unsure how ChIP-seq in Fig. 2A was performed (with anti-FLAG or anti-HA antibodies? I cannot tell from the Materials & Methods). More importantly, I did not see the control for this ChIP-seq experiment. If a FLAG-tagged BrrG was used for ChIP-seq, then a WT non-tagged version should be used as a negative control (not sequencing INPUT DNA), this is especially important for anti-terminator that can co-travel with RNA polymerase. Please also report the number of replicates for ChIP-seq experiments.

      Fig. 2A presents a coverage plot from the ChIP-Seq of ∆brrG +pTet:brrG-3xFLAG (N’). A replicate of this N-terminally tagged construct will be added as supplementary data in the revised version. As anticipated by the referee, we had used ∆brrG +pTet:brrG (untagged) as control.

      (4) Korotaev et al mentioned that BrrG binds to DNA (as well as to RNA polymerase). With the availability of existing ChIP-seq data, the authors should be able to locate the DNA-binding element of BrrG, this additional information will be useful to the community.

      We will mine the ChIP-Seq data to define the BrrG binding site as closely as possible and include the analysis in the revised version of the manuscript.

      (5) Mutational experiments to break the potential hairpin structure are required to strengthen the claim that this putative hairpin is the potential transcriptional terminator.

      We did not claim that the identified hairpin is a terminator but rather suggested it as a candidate terminator. We agree with the referee that the proposed experiment would be necessary to definitively prove its terminator function. However, our primary aim was to demonstrate that BrrG acts as a processive antiterminator, which we have shown by replacing the putative terminator with a well-characterized synthetic terminator that BrrG successfully overcame. Therefore, we prefer not to conduct the proposed experiment and will instead further tone down our conclusions regarding the putative terminator function of the identified hairpin structure.

      Reviewer 2:

      (1) The authors wrote "GTAs are not self-transmitting because the DNA packaging capacity of a GTA particle is too small to package the entire gene cluster encoding it" (page 3). I thought that at least the Bartonella capsid gene cluster should be self-transmissible within the 14 kb packaged DNA (https://doi.org/10.1371/journal.pgen.1003393, https://doi.org/10.1371/journal.pgen.1000546). This was also concluded by Lang et al (https://doi.org/10.1146/annurev-virology-101416-041624). In this case the presented results would have important implications. As the gene cluster and the anti-terminator required for its expression are separated on the chromosome, it would not be possible to transfer an active GTA gene cluster, although the DNA coding for the genes required for making the packaging agent itself, theoretically fits into a BaGTA particle. Could the authors comment on that? I think it would be helpful to add the sizes of the different gene clusters and the distance between them in Fig. 2A. The ROR amplified region spans 500kb, is the capsid gene cluster within this region?

      We thank the reviewer for bringing up this interesting point. The bgt cluster (capsid cluster) is approximately 9.2 kb in size and could feasibly be packaged in its entirety into a GTA particle. In contrast, the ror gene cluster, which encodes the antiterminator BrrG, is approximately 20 kb in size—exceeding the packaging limit of GTA particles—and is separated from the bgt cluster by approximately 35 kb. Consequently, if the bgt cluster is transferred via a GTA particle into a recipient host that does not encode the ror gene cluster, the bgt cluster would not be expressed.

      (2) Another side-note regarding the introduction: On page three the authors write: "GTAs encode bacteriophage-like particles and in contrast to phages transfer random pieces of host bacterial DNA". While packaging is not specific, certain biases in the packaging frequency are observed in both studied GTA families. For Bartonella this is ROR. In the two GTA-producing strains D. shibae and C. crescentus origin and terminus of replication are not packaged and certain regions are overrepresented (https://doi.org/10.1093/gbe/evy005, https://doi.org/10.1371/journal.pbio.3001790). Furthermore, D. shibae plasmids are not packaged but chromids are. I think the term "random" does not properly describe these observations. I would suggest using "not specific" instead.

      We thank the reviewer for this suggestion and will adjust the wording accordingly.

      (3) Page 5: Remove "To address this". It is not needed as you already state "To test this hypothesis" in the previous sentence.

      We will adjust the wording accordingly.

      (4) I think the manuscript would greatly benefit from a summary figure to visualize the Q-like antiterminator-dependent regulatory circuit for GTA control and its four components described on pages 15 and 16.

      We thank the reviewer for this valuable suggestion and will include a summary figure illustrating the deduced regulatory mechanism in the revised manuscript.

      (5) Page 17: It might be worth noting that GafA is highly conserved among GTAs in Rhodobacterales (https://doi.org/10.3389/fmicb.2021.662907), and so, probably, is its regulatory integration into the ctrA network (https://doi.org/10.3389/fmicb.2019.00803). It's an old mechanism. It would also be interesting to know if it is a common feature of the two archetypical GTAs that the regulator is not part of the cluster itself.

      We agree with the points raised by the reviewer and will address them in the revised manuscript. Specifically, we will highlight the high conservation of GafA in GTAs across Rhodobacterales and its regulatory integration within the ctrA network. Additionally, we will analyze whether the GafA-like antitermination regulator is typically located outside the regulated gene cluster, as we have demonstrated for BrrG of BaGTA in the Bartonellae.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, Huang et al used SMRT sequencing to identify methylated nucleotides (6mA, 4mC, and 5mC) in Pseudomonas syringae genome. They show that the most abundant modification is 6mA and they identify the enzymes required for this modification as when they mutate HsdMSR they observe a decrease of 6mA. Interestingly, the mutant also displays phenotypes of change in pathogenicity, biofilm formation, and translation activity due to a change in gene expression likely linked to the loss of 6mA. Overall, the paper represents an interesting set of new data that can bring forward the field of DNA modification in bacteria.

      Thank you for your valuable feedback on our paper exploring the impact of 6mA modification in P. syringae.

      Major Concerns:

      Most of the authors' data concern the Psph pathovar. I am not sure that the authors' conclusions are supported by the two other pathovars they used in the initial 2 figures. If the authors want to broaden their conclusions to Pseudomonas syringae and not restrict them to Psph, the authors should have stronger methylation data using replicates. Additionally, they should discuss why Pss is so different from Pst and Psph. Could they do a blot to confirm it is really the case and not a sequencing artifact? Is the change of methylation during bacterial growth conserved among the pathovars? The authors should obtain mutants in the other pathovars to see if they have the same phenotype. The authors have a nice set of data concerning Psph but the broadening of the results to other pathovars requires further investigation.

      We appreciate the reviewer’s insightful comments. While the majority of our data focuses on the Psph, we recognize the importance of validating these findings in Pss and Pst. To this end, we have performed additional experiments using dot blot and mutant construction to enhance our conclusions in other pathovars.

      We agree that we should discuss why Pss is different from Psph and Pst. We performed a dot blot assay using genomic DNA from Pss and Pst, presented in Figure S5A. We also compared the 6mA modification levels of Pss and Pst in different growth phases. As shown in Figure S5A, the change of methylation during bacterial growth is conserved in Pst. The change was not obvious in Pss, which might be due to the lack of a type I R-M system.

      “In accordance with previous studies showing that growth conditions affect the bacterial methylation status, we applied dot blot experiments using the same amount of DNA (1 μg) from these three P. syringae strains to detect the 6mA levels during both logarithmic and stationary phases. The results revealed that 6mA levels in the stationary phase were much higher compared to the logarithmic phase in Psph and Pst, but showed no significant change in Pss. Additionally, we found that during the stationary phase, 6mA methylation levels in Psph and Pst were higher than those in Pss. These findings were consistent with the MTase prediction for these three strains, since Pss does not harbor any type I R-M systems, which are important for 6mA modification in bacteria.”

      Please see Figure S5A and Lines 220-228 in the revised manuscript.

      We also tried to construct an HsdM mutant in Pst to explore whether the influence of 6mA methylation was conserved in P. syringae, but it failed after multiple attempts. We did not construct a Pss mutant because no type I R-M system was predicted, and few methylation sites were identified via SMRT-seq in this strain. Therefore, we overexpressed HsdM in Pst instead. We have performed additional experiments in WT and the HsdM overexpression strains, including dot blot and growth curve assays.

      Please see Figures S5B-C and Lines 228-232 in the revised manuscript.

      The authors should include proper statistical analysis of their data. A lot of terms are descriptive but not supported by a deeper analysis to sustain the conclusions. For example, in Figure 4E, we do not know if the overlap is significant or not. Are DEGs more overlapping to 6mA sites than non-DEGs? Here is a non-exhaustive list of terms that need to be supported by statistics: different level (L145), greater conservation (L162), significant conservation (L165), considerable similarity (L175), credible motifs (L189), Less strong (L277) and several "lower" and "higher" throughout the text.

      Thank you for the insightful feedback. We have made the following revisions in the manuscript to ensure that the terms are more precise and do not require statistical significance testing.

      (1) Statistical analysis: We have added statistical tests for the overlap between DEGs and 6mA sites in Figure 4E. We performed a Fisher test and found that the overlap was not significant (p > 0.05); neither DEGs nor non-DEGs overlapped significantly with 6mA sites. Please see Figure 4E and Lines 261-262.

      “Less strong” was used to describe the influence of HsdM on biofilm formation in Figure 5D. All figures with “*” labels were analyzed using two-tailed Student's t-tests, with “*” indicating a significant change (p < 0.05).

      (2) Revised wording: For terms used to describe comparisons, we have revised the wording to be clearer and ensure that the terminology used did not imply the need for statistical significance testing when not required. For example:

      “Different level” has been removed. Please see Lines 143-144.

      “Greater conservation” has been revised to “more conserved functional terms”. Please see Lines 161-162.

      “Significant conservation” has been revised to “notable conservation”. Please see Line 165.

      “Credible motifs” has been revised to “identified motifs”. Please see Line 186.
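
      As an illustration of the overlap test mentioned for Figure 4E, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table (DEG vs. non-DEG, with vs. without a 6mA site). It assumes that the "Fisher test" refers to Fisher's exact test; the function name and the gene counts are hypothetical and are not taken from the manuscript.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: DEG / non-DEG. Columns: with 6mA site / without 6mA site.
    Returns the sum of the probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x genes in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    deg_with_6ma, deg_without_6ma = 120, 280
    nondeg_with_6ma, nondeg_without_6ma = 1400, 3300
    p = fisher_exact_two_sided(
        deg_with_6ma, deg_without_6ma, nondeg_with_6ma, nondeg_without_6ma
    )
    print(f"two-sided p-value: {p:.4f}")
```

      A p-value above 0.05 from such a table would correspond to the non-significant overlap reported in the response.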

      The authors performed SMRT sequencing of the delta hsdMSR showing a reduction of 6mA. Could they include a description of their results similar to Figures 1-2. How reduced is the 6mA level? Is it everywhere in the genome? Does it affect other methylation marks? This analysis would strengthen their conclusions.

      Yes, we agree. We have provided additional analysis and descriptions to strengthen the conclusions in response to these valuable comments. We determined the three types of methylation sites in the HsdMSR mutant strain and compared the overlapping genes among these modification patterns. Specifically, we focused on the 6mA sites in Psph WT, the HsdMSR mutant, and the HsdM motif CAGCN<sub>(6)</sub>CTC. As expected, we found that almost all of the 6mA sites lost in ΔhsdMSR were from the motif CAGCN<sub>(6)</sub>CTC. We also noticed that the 5mC and 4mC sites in the mutant were relatively similar to those in WT, and the slight difference might be caused by sequencing errors. Overall, we propose that HsdMSR catalyzes only the 6mA located in the motif CAGCN<sub>(6)</sub>CTC and does not affect other 6mA sites or other modification types.

      Please see Figures S4D-E and Lines 212-218 in the revised manuscript.

      In Figure 6E to conclude that methylation is required on both strands, the authors are missing the control CAGCN6CGC construct otherwise the effect could be linked to the A on the complementary strand.

      Thank you for your valuable suggestions. We have provided the control result on the complementary strand. Please see Figure 6C. The new result supports the conclusion that 6mA methylation regulates gene transcription based on methylation on both strands.

      Please see Figure 6C and Lines 329-330 in the revised manuscript.

      Reviewer #2 (Public Review):

      In the present manuscript, Huang et al. revealed the significant roles of the DNA methylome in regulating virulence and metabolism within Pseudomonas syringae, with a particular focus on the HsdMSR system in this model strain. The authors used SMRT-seq to profile the DNA methylation patterns (6mA, 5mC, and 4mC) in three P. syringae strains (Psph, Pss, and Psa) and displayed the conservation among them. They further identified the type I restriction-modification system (HsdMSR) in P. syringae, including its specific motif sequence. The HsdMSR participated in the process of metabolism and virulence (T3SS & Biofilm formation), as demonstrated through RNA-seq analyses. Additionally, the authors revealed the mechanisms of the transcriptional regulation by 6mA. Strictly from the point of view of the interest of the question and the work carried out, this is a worthy and timely study that uses third-generation sequencing technology to characterize the DNA methylation in P. syringae. The experimental approaches were solid, and the results obtained were interesting and provided new information on how epigenetics influences the transcription in P. syringae. The conclusions of this paper are mostly well supported by data, but some aspects of data analysis and discussion need to be clarified and extended.

      Thank you for your positive feedback and recognition of the importance of our study. We appreciate the suggestions for further clarification and extension of some aspects of data analysis and discussion. We added further analysis of the SMRT-seq result of the ΔhsdMSR and overexpressed HsdM in Pst to provide more information on conservation. We added this content to the discussion in the revised manuscript. Please see Figure 6C and Figure S5.

      Reviewer #3 (Public Review):

      Summary:

      The article by Huang et al. presents an in-depth study on the role of DNA methylation in regulating virulence and metabolism in Pseudomonas syringae, a model phytopathogenic bacterium. This comprehensive research utilized single-molecule real-time (SMRT) sequencing to profile the DNA methylation landscape across three model pathovars of P. syringae, identifying significant epigenetic mechanisms through the Type-I restriction-modification system (HsdMSR), which includes a conserved sequence motif associated with N6-methyladenine (6mA). The study provides novel insights into the epigenetic mechanisms of P. syringae, expanding the understanding of bacterial pathogenicity and adaptation. The use of SMRT sequencing for methylome profiling, coupled with transcriptomic analysis and in vivo validation, establishes a robust evidence base for the findings.

      Strengths:

      The results are presented clearly, with well-organized figures and tables that effectively illustrate the study's findings.

      Weaknesses:

      It would be helpful to add more details, especially in the methods, which make it easy to evaluate and enhance the manuscript's reproducibility.

      Thank you for the positive evaluation of our study, as well as the constructive feedback provided. We have added more details in methods for RNA-seq analysis and Ribo-seq analysis. Please see Lines 484-515.

      “Briefly, bacteria were cultured to an OD<sub>600</sub> of 0.4, at which point chloramphenicol was added to a final concentration of 100 µg/mL for 2 minutes. Cells were then pelleted and washed with pre-chilled lysis buffer [25 mM Tris-HCl, pH 8.0; 25 mM NH4Cl; 10 mM MgOAc; 0.8% Triton X-100; 100 U/mL RNase-free DNase I; 0.3 U/mL Superase-In; 1.55 mM chloramphenicol; and 17 mM GMPPNP]. The pellet was resuspended in lysis buffer, followed by three freeze-thaw cycles using liquid nitrogen. Sodium deoxycholate was then added to a final concentration of 0.3% before centrifugation. The resulting supernatant was adjusted to 25 A260 units and mixed with 2 mL of 500 mM CaCl<sub>2</sub> and 12 µL MNase, making up a total volume of 200 µL. After the digestion, the reaction was quenched with 2.5 mL of 500 mM EGTA. Monosomes were isolated using Sephacryl S400 MicroSpin columns, and RNA was purified using the miRNeasy Mini Kit (Qiagen). rRNA was removed using the NEBNext rRNA Depletion Kit, and the final library was constructed with the NEBNext Small RNA Library Prep Kit. For each sample, ribosome footprint reads were mapped to the Psph 1448A reference genome, and the translational efficiency was calculated by dividing the normalized Ribo-seq counts by the normalized RNA counts. Two biological replicates were performed for all experiments.”
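
      The translational efficiency calculation described at the end of the quoted methods (normalized Ribo-seq counts divided by normalized RNA counts) can be sketched as follows. This is a minimal illustration only: the function name, the gene identifiers, the count values, and the choice to skip genes without RNA signal are assumptions and do not describe the authors' actual pipeline.

```python
def translational_efficiency(ribo_norm, rna_norm):
    """Per-gene translational efficiency (TE).

    TE = normalized Ribo-seq count / normalized RNA-seq count.
    Genes with no RNA signal are skipped (an assumption made here
    to avoid division by zero).
    """
    te = {}
    for gene, ribo in ribo_norm.items():
        rna = rna_norm.get(gene, 0.0)
        if rna > 0:
            te[gene] = ribo / rna
    return te


if __name__ == "__main__":
    # Hypothetical normalized counts, for illustration only.
    ribo = {"geneA": 150.0, "geneB": 40.0, "geneC": 12.0}
    rna = {"geneA": 75.0, "geneB": 80.0, "geneC": 0.0}
    for gene, value in translational_efficiency(ribo, rna).items():
        print(gene, round(value, 2))
```

      Comparing such TE values between WT and mutant strains, averaged over the biological replicates, is one common way to summarize changes in translation.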

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      I would recommend the authors limit their manuscript to Psph pathovar and include statistical analysis supporting their conclusions.

      Thank you for your suggestion.

      Minor

      • L104: "significantly" please add a p-value and explain the analysis.

      Sorry for the confusion. We have added the p-value and explained the analysis in the Methods section. The p-value used for SMRT-seq was the modification quality value (QV) score, which was used to call the modified bases A (QV=50) and C (QV=100). Please see Lines 452-454.

      • Figures 1B, D, F, and Figure 2A: make the Venn diagram to scale

      Yes, we have revised.

      • L110-111: missing p-value to say that the authors observe a bigger overlap in Pst than Psph as they observed more modified sites in Pst

      Sorry for the confusion. We described the overlap as bigger in Pst because the value of 17.7 in Pst was larger than the value of 15 in Psph. To avoid misunderstanding, we revised the wording to “more genes equipped with all three modification types were detected in Pst than Psph”. Please see Lines 110-111.

      • L112: missing description of their Pss analysis (IDP, sites...)

      We have added the information for Pss in the revised manuscript.

      “Additionally, the methylome atlas of Pss revealed a lower incidence of methylation than those of Psph and Pst, particularly in terms of 6mA modifications, which were only seen in 457 significant 6mA occurrences under the same threshold (IPD > 1.5) and a total of 2,853 and 1,438 methylation sites were detected as 5mC and 4mC, respectively”. Please see Lines 114-116.

      • L118: "modification" to "modified "

      We have revised. Please see Line 119.

      • L120: "modification sites" to "modified nucleotides"

      We have revised. Please see Line 121.

      • L142: correct the title "Methylated genes revealed highly functional conservation among three P. syringae strains" maybe to "Methylated genes are functionally conserved among ..."

      We have revised. Please see Line 142.

      • Figure 2C: not easy to read and interpret

      Sorry for the confusion. Figure 2C shows the significantly enriched functional pathways in the GO and KEGG databases among the three P. syringae strains. The names of the pathways are listed on the left, and each column of dots indicates the number of genes carrying one type of methylation in one of the three P. syringae strains. The larger the dot, the greater the number of genes.

      We have revised the legend of Figure 2C. Please see Lines 575-579.

      “The dot plot revealed the significantly enriched functional pathways in GO and KEGG databases among three P. syringae strains. The specific names of each pathway were listed on the left, and each column with dots indicated the number of genes within one kind of methylation in one of three P. syringae strains. The size of the dots indicates the number of related genes.”

      • Figure 6B-C: what is the difference between B 24h and C?

      Figure 6B showed the expression difference between WT and mutant over 24 hours, whereas Figure 6C showed only the 24-hour time point. To avoid repetition, we have removed Figure 6C.

      • Figure 6C-D: if the same maybe remove Figure 2C

      We have removed Figure 6D.

      Reviewer #2 (Recommendations For The Authors):

      The manuscript could be improved by addressing the following concerns:

      (1) In line 146: How to understand the percentage conserved in "more than two of the strains"?

      Sorry for the confusion. We intended to indicate the pattern conserved in two or three strains. We have revised it to: “Notable, about 25% to 45% of methylated genes were conserved in two and three strains”. Please see Line 145.

      (2) In line 178: Five conserved sequence motifs should be replaced by "Six conserved sequence motifs".

      We have revised. Please see Line 176.

      (3) In Figure 2B, specify the C1, C2 and C3. "m6A" should be replaced by "6mA".

      Yes, we have revised.

      (4) In Figure S2, "m6A" should be replaced by "6mA".

      Yes, we have revised.

      (5) In line 212, please add references for the previous studies showing that growth conditions affect bacterial methylation status.

      Thank you for your suggestion. We have added the relevant references (Gonzalez and Collier, 2013), (Krebes et al., 2014), (Sanchez-Romero and Casadesus, 2020).

      (6) In line 217, "illustrate" should be "illustrated".

      Yes, we have revised. Please see Line 210.

      (7) There are some genes colored in grey, revealing bigger differences between the two strains than those related to ribosomal protein, T3SS, and alginate synthesis in Fig. 4A. Do they have important functional roles as well?

      Thank you for your suggestion. A total of 116 genes showed bigger differences (|Log<sub>2</sub>FC| > 2), excluding the genes related to ribosomal protein, T3SS, and alginate synthesis. Among these genes, 31 were annotated as hypothetical proteins and 4 as transcription factors with unknown functions, and the remaining genes mostly encoded metabolism-related enzymes. These enzymes might contribute to the growth defects in ΔhsdMSR. We added this information to the revised manuscript. Please see Lines 249-254.

      (8) The authors should discuss what will be the potential signals or factors that can regulate the activity of HsdMSR. In other words, what can decide the extent of methylation through activating or suppressing the expression of HsdMSR?

      Thank you for your valuable suggestion. We have added this point to the Discussion section. Please see Lines 404-415.

      “Apart from the established roles of 6mA and HsdMSR in P. syringae, certain signals or factors may influence HsdMSR expression. For instance, we confirmed that the growth phase affects methylation levels in P. syringae. Previous studies have shown that increased temperatures can reduce methylation levels, as observed in PAO1(Doberenz et al., 2017). These findings suggest that HsdMSR expression may be responsive to both intrinsic cellular states and extrinsic environmental conditions. To further explore potential upstream TFs regulating the expression of HsdMSR, we searched for TF-binding sites in the HsdMSR promoter using our own databases (Fan et al., 2020; Shao et al., 2021; Sun et al., 2024). As a result, we found three candidate TFs (PSPPH_0061, PSPPH_3268, and PSPPH_3504) that might directly bind and regulate HsdMSR expression. Future studies on these TFs and their interactions with the HsdMSR promoter would help clarify the regulatory network governing HsdMSR activity.”

      Reviewer #3 (Recommendations For The Authors):

      (1) Some figures contain dense information, which may be overwhelming for readers. Streamlining the legend for Figure 1 and resizing the Venn diagrams within it could enhance clarity and visual appeal.

      Thank you for your suggestion. We have scaled all the Venn plots in the revised version.

      (2) Incorporating a discussion about the role of the restriction-modification (RM) system in bacterial defense against phage infection into the discussion section could enrich the manuscript's context and relevance.

      Thank you for your valuable suggestion. We have added this point to the Discussion section. Please see Lines 416-427.

      “RM systems are known for their intrinsic role as innate immune systems in anti-phage infection, and present in around 90% of bacterial genomes (Oliveira et al., 2014). RM systems protect bacteria self by recognizing and degrading foreign phage DNA via methylation-specific site and restriction endonucleases (REases) (Loenen et al., 2014). In addition, other phage-resistance systems are similar to RM systems but carry extra genes. One is called the phage growth limitation (Pgl) system, which modifies and cleaves phage DNA. However, the Pgl only modifies the phage DNA in the first infection cycle, and cleaves phage DNA in the subsequent infections, which gives a warn to the neighboring cells (Hampton et al., 2020; Hoskisson et al., 2015). To counteract RM and RM-like systems, phages have evolved strategies, including unusual modifications such as hydroxymethylation, glycosylation, and glucosylation. They can also encode their own MTases to protect their DNA or employ strategies to evade restriction systems and other anti-RM defenses (Iida et al., 1987; Murphy et al., 2013; Vasu and Nagaraja, 2013).”

      (3) In line 152: What is the importance of the mentioned example of Cro/CI family TF?

      Thank you for your comments. Cro and CI are important TFs present in phages. The interaction between Cro and CI affects bacterial immunity status in enterohemorrhagic Escherichia coli (EHEC) strains (Jin et al., 2022). RM systems are known as a kind of phage-defense system, and hence we mentioned this example here. We have added this description in the revised manuscript. Please see Lines 152-154.

      References:

      (1) Doberenz, S., Eckweiler, D., Reichert, O., Jensen, V., Bunk, B., Sproer, C., Kordes, A., Frangipani, E., Luong, K., Korlach, J., et al. (2017). Identification of a Pseudomonas aeruginosa PAO1 DNA Methyltransferase, Its Targets, and Physiological Roles. mBio 8. 10.1128/mBio.02312-16.

      (2) Fan, L., Wang, T., Hua, C., Sun, W., Li, X., Grunwald, L., Liu, J., Wu, N., Shao, X., Yin, Y., et al. (2020). A compendium of DNA-binding specificities of transcription factors in Pseudomonas syringae. Nat Commun 11, 4947. 10.1038/s41467-020-18744-7.

      (3) Gonzalez, D., and Collier, J. (2013). DNA methylation by CcrM activates the transcription of two genes required for the division of Caulobacter crescentus. Mol Microbiol 88, 203-218. 10.1111/mmi.12180.

      (4) Hampton, H.G., Watson, B.N., and Fineran, P.C. (2020). The arms race between bacteria and their phage foes. Nature 577, 327-336.

      (5) Hoskisson, P.A., Sumby, P., and Smith, M.C. (2015). The phage growth limitation system in Streptomyces coelicolor A3(2) is a toxin/antitoxin system, comprising enzymes with DNA methyltransferase, protein kinase and ATPase activity. Virology 477, 100-109.

      (6) Iida, S., Streiff, M.B., Bickle, T.A., and Arber, W. (1987). Two DNA antirestriction systems of bacteriophage P1, darA, and darB: characterization of darA− phages. Virology 157, 156-166.

      (7) Jin, M., Chen, J., Zhao, X., Hu, G., Wang, H., Liu, Z., and Chen, W.-H. (2022). An engineered λ phage enables enhanced and strain-specific killing of enterohemorrhagic Escherichia coli. Microbiology Spectrum 10, e01271-01222.

      (8) Krebes, J., Morgan, R.D., Bunk, B., Sproer, C., Luong, K., Parusel, R., Anton, B.P., Konig, C., Josenhans, C., Overmann, J., et al. (2014). The complex methylome of the human gastric pathogen Helicobacter pylori. Nucleic Acids Res 42, 2415-2432. 10.1093/nar/gkt1201.

      (9) Loenen, W.A., Dryden, D.T., Raleigh, E.A., Wilson, G.G., and Murray, N.E. (2014). Highlights of the DNA cutters: a short history of the restriction enzymes. Nucleic Acids Res 42, 3-19.

      (10) Murphy, J., Mahony, J., Ainsworth, S., Nauta, A., and van Sinderen, D. (2013). Bacteriophage orphan DNA methyltransferases: insights from their bacterial origin, function, and occurrence. Appl Environ Microb 79, 7547-7555.

      (11) Oliveira, P.H., Touchon, M., and Rocha, E.P. (2014). The interplay of restriction-modification systems with mobile genetic elements and their prokaryotic hosts. Nucleic Acids Res 42, 10618-10631.

      (12) Sanchez-Romero, M.A., and Casadesus, J. (2020). The bacterial epigenome. Nature reviews. Microbiology 18, 7-20. 10.1038/s41579-019-0286-2.

      (13) Shao, X., Tan, M., Xie, Y., Yao, C., Wang, T., Huang, H., Zhang, Y., Ding, Y., Liu, J., Han, L., et al. (2021). Integrated regulatory network in Pseudomonas syringae reveals dynamics of virulence. Cell Rep 34, 108920. 10.1016/j.celrep.2021.108920.

      (14) Sun, Y., Li, J., Huang, J., Li, S., Li, Y., Lu, B., and Deng, X. (2024). Architecture of genome-wide transcriptional regulatory network reveals dynamic functions and evolutionary trajectories in Pseudomonas syringae. bioRxiv, 2024.01.18.576191.

      (15) Vasu, K., and Nagaraja, V. (2013). Diverse functions of restriction-modification systems in addition to cellular defense. Microbiol Mol Biol Rev 77, 53-72. 10.1128/MMBR.00044-12.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the editors and reviewers for the comments and suggestions on our manuscript.  The main point that we wished to convey in this paper was the concept and the kinetic model that enabled the estimation of nuclear export rate from an image of single mRNAs localised in single cells.  By studying the influenza viral transcripts with this model, we report the variation in the mRNA nuclear export rate of the eight viral segments.  Of note, the hemagglutinin and neuraminidase mRNAs were the slowest among the eight segments in exiting the nucleus.  We agree that the potential mechanism and the biological impact of this observation require further validation, as the reviewers pointed out.  We revised our manuscript to describe these points separately (Lines 21-25, Abstract; Lines 86-91, Introduction; Lines 316-320, Results; Lines 372-381, Discussion).  Below, we also highlight the revisions that we made to address the specific points raised by the reviewers.

      Influenza viral transcription

      The authors used specific settings for their virology experiments and several assumptions regarding their mathematical modelling, so it's extremely important that the reader has the viral life cycle clearly understood before immersing themselves in the results. Thus, a detailed explanation of the viral life cycle, including the kinetics of each step, would be extremely helpful if included in the introduction section.  Reviewer #1

      We have included the molecular composition of influenza vRNP and the mechanism of viral transcription in the revised manuscript (Lines 46-53).  

      Line 45: "Eight viral RNA segments are transcribed by the same set of molecular machinery" (Ref. 7). What's known about the arrival of the viral RNA segments in the nucleus? Is it synchronized? The authors will understand that my concern is related to the fact that a differential arrival would indeed impact the transcription and export processes.  Reviewer #1

      The arrival of the eight vRNPs in the nucleus is not synchronised, with each of the eight vRNPs arriving independently (Chou et al. PLOS Pathogens 2013; Lakadamyali et al. PNAS 2003).  This does not compromise our model, as our model estimates the export rate of each mRNA species individually (also please see our response in Model assumption below).  This is included in the second paragraph of the Discussion section (Lines 390-400).

      Model assumption

      Even though I do not have the expertise to assess the authors' mathematical model, I do not doubt its robustness. Even so, I find some virological concerns related to the set-up of their experiments. According to what I understand, the authors performed non-synchronized 2 h-long infections with the WSN strain of influenza A virus. They did this to avoid cRNA production (and cross-reaction of the probes), which they claim to occur "much later than mRNA synthesis". Then they omit the degradation of the mRNAs for their model without giving an explanation for having done so. So, taking all these into account, it seems to me that too many assumptions are made without a strong argument. I understand that they are made in order to simplify their model, but I strongly consider that the model would gain strength if some of these events were experimentally considered. Thus, would it be possible to perform synchronized infections? Would it be possible to empirically demonstrate that cRNA production does not occur within the first 2 hours of infection and/or separate transcription and replication? Would it be possible to incorporate a degradation inhibitor of the mRNAs into their infections? If all these could be achieved, then the results coming out of the mathematical model would be enormously reinforced.  Reviewer #1

      * The study lacks experimental data that would help support the conclusions. For instance, perturbations are many times used to prove a point related to gene expression. An example for Fig. 2 for such an experiment could be to treat the cells with transcription inhibitors (e.g. DRB, 5,6-dichloro-1-beta-D-ribofuranosylbenzimidazole). Preventing transcription leaves only mature RNAs in the nucleus, and then using this system one can compare the export rate of different RNAs.  Reviewer #2

      We agree that the primary concern in our model was the assumption that the mRNA degradation could be omitted.  Synchronised infection is not necessary; in fact, non-synchronised infection is preferred, as we explain later in our response.  Additionally, the dominance of mRNA production over cRNA production has been documented elsewhere.  To address mRNA degradation and validate our model estimation, we performed a time-course measurement using baloxavir.  Baloxavir efficiently blocks viral transcription by inhibiting the nuclease activity in PA.  DRB, suggested by the reviewer, allows influenza viral transcription and causes viral transcripts to accumulate in the nucleus through unknown mechanisms (Amorim et al. Traffic 2007 and our observation using smFISH, not shown).  The additional experiment, now presented in Fig. 5 in the revised manuscript, indicated that the mRNA degradation is minimal, and the export rate estimated in our model and the time-course experiment agreed well for the HA segment.  The experiment raised the possibility that the time-course measurement underestimates the export rate of transcripts that exit the nucleus rapidly, such as NP.  Real-time imaging of single transcripts would be necessary to directly measure the true nuclear export rate; however, this is beyond the scope of our paper.  The new result is now presented in Fig. 5, Supplementary figures 3 and 4, and in the main text (Lines 322-360).  An alteration was also made in Line 286 to refer the reader to Fig. 5.  The Materials and Methods section was updated (Lines 478-482).

      We note that our model does not require synchronised infection.  Even under synchronised infection, such as incubating cells with the virus at 4°C to facilitate attachment and subsequently shifting to 37°C to allow viral entry, the inherent heterogeneity in vRNP migration to the nucleus still remains.  This randomness does not compromise our model; rather, our model exploits this random arrival of each vRNP in each cell in the system.  This variation, in turn, generates cells carrying varying amounts of transcripts, enabling the estimation of nuclear export rate.  Importantly, more variation ensures the broader distribution of transcript levels, enabling more precise parameter fitting in our model.  It is also important to note that our model does not require the correlation between segments.  Our model estimates the export rate of each mRNA species individually.  These important points were explained in the Discussion section (Lines 390-400).  
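
      The reasoning in this response can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes a constant synthesis rate, first-order export with rate constant `export_rate`, no degradation, and a random time since the start of transcription in each cell. All function and parameter names are hypothetical. Under these assumptions the nuclear ratio depends on the total count only through the ratio of the export rate to the synthesis rate, so cells that started transcription at different times together trace out one curve from which that ratio can be fitted.

```python
import numpy as np

def simulate_cells(export_rate, synthesis_rate=1.0, n_cells=500, max_time=120.0, seed=0):
    """Noise-free toy cells whose transcription started at random times (assumed model)."""
    rng = np.random.default_rng(seed)
    # Time elapsed since transcription began differs between cells (non-synchronised).
    elapsed = rng.uniform(1.0, max_time, size=n_cells)
    total = synthesis_rate * elapsed  # no degradation: total count grows linearly
    # Solution of dN/dt = synthesis_rate - export_rate * N with N(0) = 0.
    nuclear = synthesis_rate / export_rate * (1.0 - np.exp(-export_rate * elapsed))
    return total, nuclear / total

def predicted_nuclear_ratio(total, rate_proxy):
    """Nuclear ratio as a function of total count; rate_proxy = export_rate / synthesis_rate."""
    x = rate_proxy * np.asarray(total, dtype=float)
    return -np.expm1(-x) / x

def fit_rate_proxy(total, ratio, grid=None):
    """Grid-search least-squares estimate of the rate proxy from single-cell data."""
    if grid is None:
        grid = np.logspace(-4, 1, 2000)
    sse = [np.sum((predicted_nuclear_ratio(total, g) - ratio) ** 2) for g in grid]
    return grid[int(np.argmin(sse))]
```

      In this sketch, a wider spread of elapsed times gives a wider spread of total counts and therefore constrains the fitted curve over a larger range, which mirrors the argument that variation between cells helps parameter fitting.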

      * There is no concrete value given for the export rates and what they might mean biologically (e.g. time present/stuck in the nucleus) - Fig. 4D. This leaves the reader in the dark.  Reviewer #2

      The export rate lambda (previously denoted as k) in our model (Fig. 4) and the decay constant k in the time-course measurement (Fig. 5) represent the proportion of mRNAs exported from the nucleus in an infinitesimal time, defining the nuclear export rate.  This has been clarified in the revised manuscript (Lines 314-316), with some alterations to make the use of the parameters clearer.

      -  The letter k previously used in Fig. 4 and the associated equations was consistently replaced with lambda to avoid confusion with the parameter k that is subsequently used for the exponential decay in Fig. 5 in the revised manuscript.

      -  The Greek letter epsilon (previously used to represent export) was replaced with mu, slightly more common for representing the rate of transport.  

      -  The term “velocity” was consistently replaced with “rate” in the context of the nuclear export (Lines 163, 215, 320, 441).  

      -  The phrase “molar concentrations of mRNAs” was corrected to “molecules of mRNAs” (Line 282).

      Also, we have now described our model in two sections: “Conceiving the model” and “Implementing a kinetic model to estimate the nuclear export rate” in the Results section.  The first section outlines the conceptual framework of the model, and the second focuses on its implementation and the parameter extraction (Lines 227 and 277).
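
      To make the meaning of a first-order decay constant concrete, the sketch below fits k to a hypothetical time course of nuclear mRNA counts after transcription is blocked and converts it into a mean nuclear residence time. It assumes single-exponential decay N(t) = N0 x exp(-k x t) with negligible degradation. The function names and the example numbers are illustrative and are not taken from the manuscript.

```python
import numpy as np

def fit_decay_constant(times, nuclear_counts):
    """Log-linear least-squares fit of N(t) = N0 * exp(-k * t).

    Returns (k, N0). All counts must be positive because the fit uses log(counts).
    """
    times = np.asarray(times, dtype=float)
    counts = np.asarray(nuclear_counts, dtype=float)
    slope, intercept = np.polyfit(times, np.log(counts), 1)
    return -slope, np.exp(intercept)

def mean_residence_time(k):
    """Mean time a transcript stays in the nucleus for a first-order process with rate k."""
    return 1.0 / k

def half_time(k):
    """Time for the nuclear pool to fall to half of its initial size."""
    return np.log(2.0) / k
```

      For example, a fitted k of 0.03 per minute would correspond to a mean nuclear residence time of about 33 minutes and a half-time of about 23 minutes.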

      Applicability of the model

      Lines 27-29. "Our framework presented in this study can be widely used for investigating the nuclear retention of nascent transcripts produced in a transcription burst." In my opinion, this is the strongest point of the manuscript: developing a mathematical model to analyze nuclear export retention as a mechanism of protein expression control, which could lay the foundation for further biological processes. The authors revisit this idea in the Discussion section. However, which would be those processes for which the model could be helpful? I consider that a more conspicuous discussion on this topic would broaden the readers scope, a crucial point under the eLife scope.  Reviewer #1

      * Could this framework be used to quantify the nuclear export rate of cellular RNAs? According to the explanation in the Discussion, it would seem that this approach is limited to quantifying the export rate of influenza RNAs.  Reviewer #2

      Our model is not limited to the influenza virus infection.  Our model is applicable to systems where transcription is initiated concurrently, such as when stimuli trigger the activation of a certain set of genes for transcription.  This makes it particularly valuable for quantifying the nuclear retention of mRNAs in a transcription burst.  This point is reiterated in Lines 383-390.

      Potential mechanisms for differential nuclear export rate of viral segments

      * There is no mechanistic insight in the study. The idea driven by this study is that gene expression is regulated by the RNA export rate. But how is that explained? Is there any molecular pathway or explanation for this model? If the transcripts are ready for export, why do the mRNAs stay inside the nucleus? One option to consider are the export factors. Viral RNAs are exported by different pathways as mentioned (line 362), or by TREX2 (Bhat P et al Nat Comm 2023). The data shows that there is no difference observed in the export rate of different pathways. How about knocking down an important export factor to show how this affects the export rates. Or the opposite, overexpress a certain factor, would this change the nucleus/cytoplasm distribution of the retained RNAs.  Reviewer #2

      As we discussed in the paper, we are beginning to consider that each viral segment has an intrinsic sequence that determines its nuclear export rate, because previous studies on the export factors do not fully explain the variation in the nuclear export rate observed in our study.  As the reviewer suggested, a recent study (Bhat et al. Nature Communications 2023) pointed to an internal sequence in the HA segment, aligning with our working hypothesis.  This point is discussed and their work (Bhat et al. 2023) has been cited in the Discussion section in the revised manuscript (Lines 446-449).

      Biological impact of the nuclear retention

      The authors mention several times throughout the manuscript that the virus might use the nuclear retention of mRNA for HA and NA to postpone the expression of these antigenic molecules. At this point, I need to admit that a great question mark appeared in my mind, maybe related to the fact that some knowledge is lacking in my analysis. Lines 328-330: "On the other hand, pushing back the expression of viral antigens HA and NA would be beneficial for the virus to delay the host immune response against the infected cells in which the virus is being replicated." As I tend to understand, the host immune response recognizes HA and NA within the viral particle. If so, and independently of the time that HA and NA arrive at the virus assembly step, the progeny viral particles that are complete and extruded from the cells would be those awakening the host immune response. If this is right, how would the delayed export of those proteins from the nucleus (and their late expression) be beneficial for delaying the immune response? I would appreciate an explanation for this point, and if I am wrong, then there could exist a relationship between nuclear export rate and the pathogenicity of different strains of influenza A virus. If so, could the authors challenge their model with additional viral strains showing a differential immune response pattern? A deeper analysis in this direction would greatly strengthen the message in their manuscript.  Reviewer #1

      * Is the timing of viral protein appearance in accordance with the time the mRNA is exported to the cytoplasm? It is logical that the first mRNA to go to the cytoplasm would be the first to become a protein. Can the authors show that nuclear retention of mRNA would push back the expression of the viral antigens HA and NA?  Reviewer #2

      Three types of immune reactions are being studied extensively.  The first is the humoral immune response, where antibodies target the viral antigens HA and NA on the viral envelope, coating and inactivating the viral particles.  The second is the cytotoxic T cell response.  There is growing evidence that cytotoxic T cells react against NP, eliciting cross-reactivity against a broader range of influenza viral strains.  This reaction is not specific to HA and NA, and antigens are processed in the cytoplasm and presented on the MHC.  The third is antibody-dependent cellular cytotoxicity (ADCC), where antibodies recognise the viral proteins (HA and NA) on the surface of infected cells, facilitating their elimination by NK cells.  Although protein translation may begin as soon as the first mRNA exits the nucleus, the virus may delay the peak of antigen production and therefore postpone the NK-mediated ADCC.  This specific point, along with references to ADCC in influenza virus infection, has been clarified in the Discussion section (Lines 377-381).

      Data analysis and presentation

      Lines 99-101. "Viral mRNAs were detected as single diffraction-limited spots in the three-dimensional image stacks, allowing for absolute mRNA quantification (Fig. 1B)". What do the authors mean to say by "absolute mRNA quantification"? Do they refer to the total spots or the total mRNAs? Is it assumed that one spot corresponds to a single mRNA transcript? This is not clear at all for this reviewer, which could be the situation for a potential reader. Since it's the beginning of the story, this should be clearly stated in the manuscript.  Reviewer #1

      Each spot of fluorescent signal corresponds to a single molecule of viral mRNA.  We quantified the absolute number of transcripts in each cell.  This is clarified in the revised manuscript (Lines 104-106).  

      * Line 151: does the baseline change according to the RNA in question? The authors say that the "baseline is defined by the median of the Z distribution of peripheral mRNAs" - it seems that the number 0.731 refers only to one type of RNA (which is not mentioned at all not in the text and not in the legend). Reviewer #2

      The baseline was set using the NP mRNAs in the cytoplasm because the NP mRNA showed the widest distribution across the cytoplasm (Line 157).  

      * Also, what is all the signal that is seen outside the marked cells in Fig. 2B? There seems to be significant background in the field, does this mean much false-positive in the multiplex FISH? If so, then how do the authors know that the staining inside the cells isn't to some degree non-specific? It would be necessary to back this up with some other type of quantitative assay like qRT-PCR.  Reviewer #2

      The cells were removed from the analysis if the cytoplasmic boundary touched any edge of the field-of-view, while the signals were recovered across the entire field-of-view.  This is clarified in the figure legend (Lines 194-195).  

      Others

      * The meaning and explanation for Figure 1H -are unclear. Rephrase and make the legend more reader friendly.  Reviewer #2

      We made alterations to the legend (Lines 132-134) and the relevant lines in the main text (Lines 148-151).  

      * Fig. 2E: Is this the total transcript count or only in the nucleus? Would it be possible to find some correlation between the segments if a pair-wise analysis is performed according to nuclear-cytoplasm distribution?  Reviewer #2

      The total counts are presented.  This is clarified in the legend (Lines 199-200).  

      * Abstract -"A mathematical modelling indicated that the relationship between the nuclear ratio and the total count of mRNAs in single cells is dictated by a proxy for the nuclear export rate." - this sentence is very unclear.  Reviewer #2

      The sentence was removed in the revised manuscript (Line 21).  This removal did not affect the overall meaning in the abstract.  We also made an alteration to Line 279 that contained a similar phrase.  

      * The use of the word "acutely" (lines 16 and 35) is strange.  Reviewer #2

      They have been removed (now Lines 15, 33).  

      * Line 157 - "This result indicates that the velocity of viral mRNA export from the nucleus varies according to the viral segments." - not velocity, maybe timing.  Reviewer #2

      We consistently replaced “velocity” with “rate” (Lines 163, 215, 320, 441).

      * Reference for line 41.  Reviewer #2

      A reference (Waker et al. Trends Microbiol. 2019) has been cited (Line 39).  

      * Reference for lines 105-106.  Reviewer #2

      The gene length of each segment was indicated in the sentence (Line 137).  

      * Line 264- why here is 0.02 M.O.I used compared to line 97 where 2 is used?  Reviewer #2

      We used M.O.I. of 0.02 to allow for spot quantification over longer periods of observation (Lines 269-270).  

      * NS1 is expressed at late infection times and might alter the nuclear export of viral mRNAs (line 352). Need to show that indeed it is not expressed in the experiments done here.  Reviewer #2

      It is not possible to definitively prove that NS1 is not expressed due to the sensitivity limitations.  However, we minimised its impact by investigating at the early time point (Lines 415-416).

      * Line 459- 30% formamide? Is this correct or should it be 10%?  Reviewer #2

      This is correct.  The probes used were longer than the others for smFISH.  Therefore, we washed away the unbound probes under a more stringent condition.

    1. Reviewer #2 (Public review):

      Summary:

      Giménez-Orenga et al. carried out this study to assess whether human endogenous retroviruses (HERVs) could be used to improve the diagnosis of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) and Fibromyalgia (FM). To this end, they used the previously developed HERV-V3 array to characterize the genome-wide changes in the expression of HERVs in patients suffering from ME/CFS, FM, or both, compared to controls. In turn, they present a useful repertoire of HERVs that might characterize ME/CFS and FM. For the most part, the paper is written in a manner that allows a natural understanding of the workflow and analyses carried out, making it compelling. The figures and additional tables present solid support for the findings. However, some statements made by the authors seem incomplete and would benefit from a more thorough literature review. Overall, this work will be of interest to the medical community seeking a better understanding of the co-occurrence of these pathologies, hinting at a novel angle by integrating HERVs, which are often overlooked, into their assessment.

      Strengths:

      (1) The work is well-presented, allowing the reader to understand the overall workflow and how the specific aims contribute to filling the knowledge gap in the field.

      (2) The analyses carried out to understand the potential impact on gene expression mediated by HERVs are in line with previous works, making it solid and robust in the context of this study.

      Weaknesses:

      (1) The authors claim to obtain genome-wide HERV expression profiles. However, the array used was developed using hg19, while the genomic analyses in this work are carried out using a liftover to hg38. It would improve the statement and findings to include a comparison of the differences in HERVs available in hg38, and how this could impact the "genome-wide" findings.

      (2) At some points, the authors are not thorough with the cited literature. Two examples are:
      a) In Lines 396-397, the authors say "the MLT1, usually found enriched near DE genes (Bogdan et al., 2020)". I checked the work by Bogdan, and they studied bacterial infection. A single work on a specific topic is not sufficient to support the statement that MLT1 is "usually" in close vicinity to differentially expressed genes. More works are needed to support this.
      b) After the previous statement, the authors go on to mention "contributing to the coding of conserved lncRNAs (Ramsay et al., 2017)". First, lnc = long non-coding, so this doesn't make sense. Second, in the work by Ramsay they mention "that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved", which is different from what the authors in this study are trying to convey. Again, additional work and a rephrasing might help to support this idea.

      (3) When presenting the clusters, the authors overlook the fact that cluster 4 is clearly control-specific, and fail to discuss what this means. Could this subset of HERV be used as bona fide markers of healthy individuals in the context of these diseases? Are they associated with DE genes? What could be the impact of such associations?

      Appraisals on aims:

      The authors set specific questions and presented the results to successfully answer them. The evidence is solid, with some weaknesses discussed above that will methodologically strengthen the work.

      Likely impact of work on the field:

      This work will be of interest to the medical community looking for novel ways to improve clinical diagnosis. Although future works with a greater population size, and more robust techniques such as RNA-Seq, are needed, this is the first step in presenting a novel way to distinguish these pathologies.

      It would be of great benefit to the community to provide a table/spreadsheet indicating the specific genomic locations of the HERVs specific to each condition. This will allow proper provenance for future researchers interested in expanding on this knowledge, as these genomic coordinates will be independent of the technique used (as was the array used here).

    2. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Giménez-Orenga et al. investigate the origin and pathophysiology of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and fibromyalgia (FM). Using RNA microarrays, the authors compare the expression profiles and evaluate the biomarker potential of human endogenous retroviruses (HERV) in these two conditions. Altogether, the authors show that HERV expression is distinct between ME/CFS and FM patients, and HERV dysregulation is associated with higher symptom intensity in ME/CFS. HERV expression in ME/CFS patients is associated with impaired immune function and higher estimated levels of plasma cells and resting CD4 memory T cells. This work provides interesting insights into the pathophysiology of ME/CFS and FM, creating opportunities for several follow-up studies.

      Strengths:

      (1) Overall, the data is convincing and supports the authors' claims. The manuscript is clear and easy to understand, and the methods are generally well-detailed. It was quite enjoyable to read.

      (2) The authors combined several unbiased approaches to analyse HERV expression in ME/CFS and FM. The tools, thresholds, and statistical models used all seem appropriate to answer their biological questions.

      (3) The authors propose an interesting alternative to diagnosing these two conditions. Transcriptomic analysis of blood samples using an RNA microarray could allow a minimally invasive and reproducible way of diagnosing ME/CFS and FM.

      Weaknesses:

      (1) The cohort analysed in this study was phenotyped by a single clinician. As ME/CFS and FM are diagnosed based on unspecific symptoms and are frequently misdiagnosed, this raises the question of whether the results can be generalised to external cohorts.

      Thank you for your comment. The study of larger cohorts will certainly be needed to determine the external validity of these results in a clinical scenario. However, this pilot study, the first of its kind, was designed to maximize homogeneity across participants, which seemed best ensured by including only female participants, all diagnosed by a single experienced observer.

      (2) The analyses performed to unravel the causes and effects of HERV expression in ME/CFS and FM are solely based on sequencing data. Experimental approaches could be used to validate some of the transcriptomic observations.

      Certainly, experimental approaches may add robustness to our findings. We are in fact considering this avenue to explore the observations presented here in greater depth. However, the limited knowledge of HERV-mediated physiological functions may hinder the task of revealing causes and effects of HERV expression in ME/CFS and FM in the short term.

      Reviewer #2 (Public review):

      Summary:

      Giménez-Orenga carried out this study to assess whether human endogenous retroviruses (HERVs) could be used to improve the diagnosis of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) and Fibromyalgia (FM). To this end, they used the previously developed HERV-V3 array to characterize the genome-wide changes in the expression of HERVs in patients suffering from ME/CFS, FM, or both, compared to controls. In turn, they present a useful repertoire of HERVs that might characterize ME/CFS and FM. For the most part, the paper is written in a manner that allows a natural understanding of the workflow and analyses carried out, making it compelling. The figures and additional tables present solid support for the findings. However, some statements made by the authors seem incomplete and would benefit from a more thorough literature review. Overall, this work will be of interest to the medical community seeking a better understanding of the co-occurrence of these pathologies, hinting at a novel angle by integrating HERVs, which are often overlooked, into their assessment.

      Strengths:

      (1) The work is well-presented, allowing the reader to understand the overall workflow and how the specific aims contribute to filling the knowledge gap in the field.

      (2) The analyses carried out to understand the potential impact on gene expression mediated by HERVs are in line with previous works, making it solid and robust in the context of this study.

      Weaknesses:

      (1) The authors claim to obtain genome-wide HERV expression profiles. However, the array used was developed using hg19, while the genomic analyses in this work are carried out using a liftover to hg38. It would improve the statement and findings to include a comparison of the differences in HERVs available in hg38, and how this could impact the "genome-wide" findings.

      This is an important point. However, the number of probes excluded from our analysis for lack of correspondence with hg38 was low (fewer than 100 among the 1,290,800 probesets), and we considered this negligible for the "genome-wide" claim. This aspect will be detailed in the revised version of the manuscript.
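
      As a rough check of scale, the two figures quoted in this response give an upper bound on the fraction of probesets lost in the liftover. The sketch below is illustrative only: the constant and function names are assumptions, and 100 is the stated upper bound ("less than 100"), not an exact count.

```python
# Figures quoted in the author response.
# MAX_EXCLUDED is an upper bound ("less than 100"), not an exact count.
TOTAL_PROBESETS = 1_290_800
MAX_EXCLUDED = 100


def excluded_fraction(excluded: int, total: int) -> float:
    """Return the fraction of probesets that were excluded."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= excluded <= total:
        raise ValueError("excluded must be between 0 and total")
    return excluded / total


upper_bound = excluded_fraction(MAX_EXCLUDED, TOTAL_PROBESETS)
print(f"Excluded fraction is below {upper_bound:.4%}")
```

      This quantifies only the probes dropped during the hg19-to-hg38 liftover.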

      (2) The authors in some points are not thorough with the cited literature. Two examples are:

      a) Lines 396-397 the authors say "the MLT1, usually found enriched near DE genes (Bogdan et al., 2020)". I checked the work by Bogdan, and they studied bacterial infection. A single work in a specific topic is not sufficient to support the statement that MLT1 is "usually" in close vicinity to differentially expressed genes. More works are needed to support this.

      b) After the previous statement, the authors go on to mention "contributing to the coding of conserved lncRNAs (Ramsay et al., 2017)". First, lnc = long non-coding, so this doesn't make sense. Second, in the work by Ramsay they mention "that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved", which is different from what the authors in this study are trying to convey. Again, additional work and a rephrasing might help to support this idea.

      Certainly, these two sentences need rephrasing to better adjust statements to current evidence and will be replaced in the revised version of this manuscript.

      (3) When presenting the clusters, the authors overlook the fact that cluster 4 is clearly control-specific, and fail to discuss what this means. Could this subset of HERV be used as bona fide markers of healthy individuals in the context of these diseases? Are they associated with DE genes? What could be the impact of such associations?

      Using control DE HERVs as bona fide markers of healthy individuals seems like an interesting possibility worth exploring. Control DE HERVs (cluster 4) are indeed associated with DE genes involved in apoptosis, T cell activation and cell-cell adhesion (modules 1 and 6) (Figure 3A). The impact of these associations deserves further study.

      Appraisals on aims:

      The authors set specific questions and presented the results to successfully answer them. The evidence is solid, with some weaknesses discussed above that will methodologically strengthen the work.

      Likely impact of work on the field:

      This work will be of interest to the medical community looking for novel ways to improve clinical diagnosis. Although future works with a greater population size, and more robust techniques such as RNA-Seq, are needed, this is the first step in presenting a novel way to distinguish these pathologies.

      It would be of great benefit to the community to provide a table/spreadsheet indicating the specific genomic locations of the HERVs specific to each condition. This will allow proper provenance for future researchers interested in expanding on this knowledge, as these genomic coordinates will be independent of the technique used (as was the array used here).

      We agree with the reviewer that sharing the genomic locations of DE HERVs in these pathologies would contribute to further development of our findings. Unfortunately, we do not hold the rights to share probe coordinates from this custom HERV-V3 microarray, which we used under an MTA with its developer.

      Reviewer #3 (Public review):

      The authors find that HERV expression patterns can be used as new criteria for differential diagnosis of FM and ME/CFS and patient subtyping. The data are based on transcriptome analysis by microarray for HERVs using patient blood samples, followed by differential expression of ERVs and bioinformatic analyses. This is a standard and solid data processing pipeline, and the results are well presented and support the authors' claim.

    1. Abstract

      DNA is a promising next-generation data storage medium. Recently, it has been theoretically proposed that non-natural or modified bases can serve as extra molecular letters to increase the information density. However, the feasibility of this strategy is challenged by the difficulty of synthesizing non-natural DNA sequences and by their complex structure. Here, we describe a practical DNA data storage transcoding scheme named R+, based on an expanded molecular alphabet that introduces 5-methylcytosine (5mC). We also demonstrate experimental validation by encoding one representative file into several 1.3~1.6 kbps in vitro DNA fragments for nanopore sequencing. The results show an average data recovery rate of 98.97% and 86.91% with and without reference, respectively. This work validates the practicability of 5mC in DNA storage systems, with a potentially wide range of applications.

      Availability & Implementation: R+ is implemented in Python and the code is available under the MIT license at https://github.com/Incpink-Liu/DNA-storage-R_plus
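
      To make the information-density argument concrete, the sketch below computes the unconstrained upper bound of log2(k) bits per nucleotide for an alphabet of k letters. It assumes a five-letter alphabet (A, C, G, T plus 5mC), which is inferred from the abstract; it is not the R+ transcoding scheme and says nothing about the density R+ actually achieves. The function name is hypothetical.

```python
import math


def max_bits_per_nt(alphabet_size: int) -> float:
    """Upper bound on bits per nucleotide for an unconstrained alphabet."""
    if alphabet_size < 2:
        raise ValueError("alphabet must contain at least two letters")
    return math.log2(alphabet_size)


# Assumption: the expanded alphabet is A, C, G, T plus 5mC (five letters).
natural = max_bits_per_nt(4)
expanded = max_bits_per_nt(5)
relative_gain = expanded / natural - 1

print(f"natural: {natural:.3f} bits/nt, expanded: {expanded:.3f} bits/nt")
print(f"relative gain in the upper bound: {relative_gain:.1%}")
```

      Real encodings fall below this bound once constraints on GC content and homopolymer length are imposed.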

      This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.147). These reviews (including a protocol review) are as follows.

      Reviewer 1. Abdur Rasool

      Is the source code available, and has an appropriate Open Source Initiative license been assigned to the code? However, the Git links have a typo; the working code is available at https://github.com/Incpink-Liu/DNA-storage-R_plus

      Is the code executable?

      Unable to test. Complete execution of the given code requires time and resources.

      Is installation/deployment sufficiently outlined in the paper and documentation, and does it proceed as outlined?

      Unable to test.

      Additional Comments:

      This manuscript focuses on DNA data storage based on an expanded molecular alphabet. In view of the challenges of non-natural bases in synthesis, sequencing, and compatibility, the manuscript proposes a DNA data storage scheme containing 5-methylcytosine based on the theory that modified bases can replace non-natural bases as extra molecular letters and develops an adaptive transcoding algorithm named R+ for corresponding experimental validation. The high data recovery rate obtained from sequencing analysis demonstrates its practicability.

      This manuscript provides a simple but relatively universal transcoding algorithm for DNA data storage that introduces additional molecular letters. The proposed DNA data storage scheme outperforms conventional DNA data storage in the potential development of information density. Considering the anticipated decrease in future synthesis costs and the expected advancements in relevant transcoding algorithms, my outlook remains optimistic regarding the potential application of this scheme. I suggest that the manuscript could be accepted after a few minor revisions listed below:

      1. Figure 3 in the paper could be further modified, specifically minimizing the excess white space on both sides of Subfigure A to make it more aesthetically pleasing.
      2. The subfigures A, B, and D in Figure 2 and Figure S2 both demonstrate the difference between poem.txt/program.py and the other four files. However, the manuscript lacks an explanation for this phenomenon. Is it relevant to the file size?
      3. The 8 nt adaptors play a key role during the sequence assembly in the experimental validation, so I suggest supplementing the specific generation process of these linkers. Text descriptions or flow charts are acceptable.
      4. It would be better to add the in silico simulation to the Methods to make its structure more complete.
      5. For the practicality of DNA storage, I suggest citing https://onlinelibrary.wiley.com/doi/10.1002/smtd.202301585 and https://academic.oup.com/bib/article/25/5/bbae463/7759103.
      6. Provide the correct URLs of GitHub links for reproducibility.

      Reviewer 2. Bi Kun

      Are there (ideally real world) examples demonstrating use of the software?

      No. Additional Comments:

      In this study, a practical DNA data storage transcoding scheme named R+, based on an expanded molecular alphabet, is proposed to increase the information density. The experimental validation demonstrates the practicability of DDS-5mC and highlights the enormous potential of modified bases represented by 5mC in the field of DNA data storage. Overall, the methods and results look appropriate and promising, but there are minor issues that need to be addressed.

      1. Please indicate the proportion of substitution: insertion: deletion in the error rates of Fig. 4C and D.
      2. What is the meaning of the vertical axis of Fig. 2B? Is it the number of homopolymers per sequence, the longest length of homopolymers, or something else?
      3. Line 304: please add an "s" ("References").
      4. The last sentence of the Abstract: "This work validates the practicability of 5mC over other non-natural bases in DNA storage systems". Please make it correspond with the last paragraph of Results (151-154).
      5. A Conclusion section can be added if the journal's guidelines call for one.
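
      The two candidate readings of the Fig. 2B axis raised in point 2 can be told apart with a small sketch like the one below. It is an illustration, not the authors' analysis code: the function name, the `min_len` threshold, and the use of `M` as a symbol for 5mC are all assumptions.

```python
from itertools import groupby


def homopolymer_stats(seq: str, min_len: int = 2) -> dict:
    """Count homopolymer runs of at least min_len and find the longest run."""
    # groupby collapses consecutive identical letters into (letter, run) pairs.
    run_lengths = [sum(1 for _ in group) for _, group in groupby(seq)]
    return {
        "count": sum(1 for n in run_lengths if n >= min_len),
        "longest": max(run_lengths, default=0),
    }


# Hypothetical sequence; "M" stands in for 5mC in this sketch.
stats = homopolymer_stats("ACCCGTTMMMMA")
print(stats)
```

      For the example sequence, the runs are A, CCC, G, TT, MMMM and A, so the two metrics differ (three runs of length two or more, with a longest run of four), which is why the axis label needs to say which one is plotted.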

      Reviewer 3. Lifu Song

      This manuscript explores the application of 5-methylcytosine (5mC) as an additional molecular letter in DNA data storage systems, expanding the molecular alphabet to increase information density. The authors present a novel transcoding scheme (R+) and validate it with both in silico and experimental data. The study explores GC content, homopolymer distribution, and data recovery rates under various conditions, offering detailed insights into practical applications. Experimental validation with nanopore sequencing demonstrates real-world feasibility. By improving storage density and ensuring compatibility with nanopore sequencing, the study addresses significant challenges in incorporating non-natural bases into DNA storage systems. Overall, the manuscript is well-structured and addresses a highly relevant topic in DNA data storage, offering valuable contributions to the field. I recommend it for publication, subject to minor revisions to enhance clarity and precision.

      Suggested minor revisions: 1) Although substitution errors, particularly between C and 5mC, were discussed, the manuscript does not provide a detailed explanation of how these errors affect long-term storage or large-scale applications, both of which are critical for archival storage, the primary use case of DNA data storage technology. 2) The manuscript could benefit from a broader comparison with other high-density DNA storage strategies, such as composite DNA letters, to contextualize the benefits and limitations of 5mC. 3) The discussion could be expanded to address practical challenges, such as strategies to reduce synthesis costs and improve sequencing accuracy for modified bases like 5mC, to provide a more holistic perspective on the technology's scalability.

      Protocol Review: I have taken a look at the experiment protocol associated with this manuscript in the website of protocols.io. The protocol looks sensible. I don't have any additional comments about it and am happy for it to go live.

      See: https://dx.doi.org/10.17504/protocols.io.q26g7mr78gwz/v1