  1. Nov 2024
    1. Reviewer #3 (Public review):

      This is a well-designed study examining an important, surprisingly understudied question: how does adaptation affect spatial frequency processing in the human visual cortex? Using a combination of psychophysics and neuroimaging, the authors test the hypothesis that spatial frequency tuning is shifted to higher or lower frequencies, depending on the preadapted state (low or high s.f. adaptation). They do so by first validating the phenomenon psychophysically, showing that adapting to 0.5 cpd stimuli causes an increase in perceived s.f., and 3.5 cpd causes a relative decrease in perceived s.f. Using the same stimuli, they then port these stimuli to a neuroimaging study, in which population receptive fields are measured under high and low spatial frequency adaptation states. They find that adaptation changes pRF size, depending on adaptation state: adapting to high s.f. led to broader overall pRF sizes across the early visual cortex, whereas adapting to low s.f. led to smaller overall pRF sizes. Finally, the authors carry out a control experiment to psychophysically rule out the possibility that the perceived contrast change with adaptation may have given rise to these imaging results (this doesn't appear to be the case). All in all, I found this to be a good manuscript: the writing is taut, and the study is well designed. There are a few points of clarification that I think would help, though, including a little more detail about the pRF analyses carried out in this study. Moreover, one weakness is that the sample size is relatively small, given the variability in the effects.

      (1) The pRF mapping stimuli and paradigm are slightly unconventional. This is, of course, fairly necessary to assess the question at hand. But, unless I missed it, there is a potentially critical piece of the analyses that I couldn't find in the results or methods: is the top-up adapter incorporated into the inputs for the pRF analyses, or was it simply estimating pRF size in response to the pRF mapping bar? Ignoring the large, full field-ish top-up seems like it might be dismissing an important nonlinearity in RF response to that aspect of the display (including that it had different s.f. content from the mapping stimulus), especially because it occurred 50% of the time during the pRF mapping procedure. While the bar/top-up were sub-TR events, you could still model the pRF probe + top-up response, then downsample to TR level afterwards. In any case, to fully understand this, some more detail is needed here regarding the pRF fitting procedure.

      (2) I appreciate the eccentricity-dependent breakdown in Figure 5b. However, it would be informative to have included the actual plots of the pRF size as a function of eccentricity, for the two conditions individually, in addition to the difference effects depicted in 5b.

      (3) I know the N is small for this, but did the authors take a look at whether there was any relationship between the magnitude of the psychophysical effect and the change in pRF size, per individual? This is probably underpowered but could be worth a peek.

    2. Author response:

      We thank the reviewers for their valuable comments. Our revision will address their recommendations and clarify any misconceptions. The main points we plan to amend are as follows:

      Direct comparison of pRF sizes

      We may have misunderstood this comment in the eLife assessment. We believe our original analyses and the figures already provided a “direct comparison between pRF sizes in the high-adapted and low-adapted conditions”. Specifically, we included a figure showing the histograms of pRF sizes in both conditions, and also reported statistical tests to compare conditions both within each participant and across the group. However, we now realize these comparisons might not be as clear to readers as we intended, which would explain Reviewer #2’s interpretations. To clarify, in our revised version we will instead show 2D plots comparing pRF sizes between conditions as suggested by Reviewer #2, and also show the pRF size plotted against eccentricity (rather than only the difference) as suggested by Reviewer #3.

      Data sharing 

      The behavioral data, fMRI data (where ethically permissible), stimulus-generation code, statistical analyses, and fMRI stimulus video are already publicly available at the link: https://osf.io/9kfgx/. However, we unfortunately failed to include the link in the preprint. We apologize for this oversight. It will be included in the revision. The repository now also contains a script for simulated adaptation effects on pRF size used in our response to Reviewer #2. Moreover, for transparency, we will include plots of all the pRF parameter maps for all participants, including pRF size, polar angle, eccentricity, normalized R2, and raw R2.

      Sample size

      The reviewers shared concerns about the sample size of our study. We disagree that this is a weakness of our study. It is important to note that large sample sizes are not necessary to obtain conclusive results, especially when the research aims to test whether an effect exists, rather than finding out how strong the effect is on average in a population (Schwarzkopf & Huang, 2024, currently out as preprint, but in press at Psychological Methods). Our results showed robust within-subject effects, consistent across multiple visual regions in most individual participants. A larger sample size would not necessarily improve the reliability of our findings. Treating each individual as an independent replication, our results suggest a high probability that they would replicate in each additional participant we could scan. 

      Reviewer #1:

      We thank the reviewer for their careful evaluation and positive comments. We will include a more detailed discussion about the issues pointed out, and an additional plot showing the polar angle for both adapter conditions. In line with previous work on the reliability of pRF estimates (van Dijk, de Haas, Moutsiana, & Schwarzkopf, 2016; Senden, Reithler, Gijsen, & Goebel, 2014), both polar angle and eccentricity maps are very stable between the two adaptation conditions.

      Reviewer #2:

      We thank the reviewer for their comments - we will improve how we report key findings which we hope will clarify matters raised by the reviewer.

      RF positions in a voxel

      The reviewer’s comments suggest that they may have misunderstood the diagram (Figure 1A) illustrating the theoretical basis of the adaptation effect, likely due to us inadvertently putting the small RFs in the middle of the illustration. We will change this figure to avoid such confusion.

      Theoretical explanation of adaptation effect

      The reviewer’s explanation for how adaptation should affect the size of pRF averaging across individual RFs is incorrect. When selecting RFs from a fixed range of semi-uniformly distributed positions (as in an fMRI voxel), the average position of RFs (corresponding to pRF position) is naturally near the center of this range. The average size (corresponding to pRF size) reflects the visual field coverage of these individual RFs. This aggregate visual field coverage thus also reflects the individual sizes. When large RFs have been adapted out, this means the visual field coverage at the boundaries is sparser, and the aggregate pRF is therefore smaller. The opposite happens when adapting out the contribution of small RFs. We demonstrate this with a simple simulation at this OSF link: https://osf.io/ebnky/.
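
      The argument above can be illustrated with a minimal sketch. It is not the authors' OSF simulation: it assumes a hypothetical voxel made of 1D Gaussian RFs with evenly spaced centers, one small and one large RF per location, and it takes the standard deviation of the summed response profile as a stand-in for pRF size. Adaptation is modelled, again as an assumption, by scaling down the gain of one RF population. All names and values are illustrative.

```python
import math

def aggregate_sigma(centers, sigmas, gains):
    """Standard deviation of the summed profile of 1D Gaussian RFs.

    Each RF is gain * exp(-(x - c)^2 / (2 s^2)), so its area is
    proportional to gain * s. The sum is a Gaussian mixture, whose
    variance has a closed form.
    """
    weights = [g * s for g, s in zip(gains, sigmas)]
    total = sum(weights)
    mean = sum(w * c for w, c in zip(weights, centers)) / total
    second = sum(w * (s * s + c * c)
                 for w, c, s in zip(weights, centers, sigmas)) / total
    return math.sqrt(second - mean * mean)

# Hypothetical voxel: 21 centers spread evenly over [-1, 1] deg,
# each location holding one small and one large RF (assumed sizes).
centers = [-1 + 0.1 * i for i in range(21)]
SMALL, LARGE = 0.5, 2.0
all_centers = centers + centers
all_sigmas = [SMALL] * 21 + [LARGE] * 21

def adapted_size(gain_small, gain_large):
    """Aggregate size when each RF population has the given gain."""
    gains = [gain_small] * 21 + [gain_large] * 21
    return aggregate_sigma(all_centers, all_sigmas, gains)

baseline = adapted_size(1.0, 1.0)
large_adapted = adapted_size(1.0, 0.3)  # large RFs contribute less
small_adapted = adapted_size(0.3, 1.0)  # small RFs contribute less

print(round(large_adapted, 3), round(baseline, 3), round(small_adapted, 3))
```

      Under these assumptions the aggregate shrinks when the large RFs are adapted out and grows when the small RFs are adapted out, which is the direction described in the response.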

      Figure S2 

      It is not actually possible to compare R2 between regions by looking at Figure S2 because it shows the pRF size change, not R2. Therefore, the arguments Reviewer #2 made based on their interpretation of the figure are not valid. Just as the reviewer expected, V1 is one of the brain regions with good pRF model fits. In our revision, we will include normalized and raw R2 maps to make this more obvious to the readers and provide additional explanations.

      V1 appeared essentially empty in that plot primarily due to the sigma threshold we selected, which was unintentionally more conservative than those applied in our analyses and other figures. We apologize for this mistake and will correct it in the revised version by including a plot with the appropriate sigma threshold.

      Thresholding details 

      Thresholding information was included in our original manuscript; however, we will include more information in the figure captions to make it more obvious.

      2D plots will replace histograms

      We thank the reviewer for this suggestion. The manuscript contained histograms showing the distribution of pRF size for both adaptation conditions for each participant and visual area (Figure S1). However, we agree that 2D plots better communicate the difference in pRF parameters between conditions, so we will replace this figure. We will consider 2D kernel density plots as suggested by the reviewer; however, such plots can obscure distributional anomalies so they may not be the optimal choice and we may opt to show transparent scatter plots of individual pRFs instead.

      (proportional) pRF size-change map 

      The reviewer requests pRF size difference maps. Figure S2 in fact demonstrates the proportional difference between the pRF sizes of the two adaptation conditions. Instead of simply taking the difference, we believe showing the proportional change map is more sensible because overall pRF size varies considerably between visual regions. We will explain this more clearly in our revision. 
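
      A small sketch of why a proportional map can be easier to compare across regions than a raw difference map. The pRF sizes below are made-up values, and dividing by the second condition is an assumption; the response does not state which condition serves as the denominator.

```python
def size_change(size_a, size_b):
    """Return (raw difference, proportional change) of A relative to B."""
    diff = size_a - size_b
    return diff, diff / size_b

# Hypothetical pRF sizes in degrees of visual angle, not study data.
v1 = size_change(1.25, 1.0)   # small pRFs, e.g. an early visual area
v3a = size_change(5.0, 4.0)   # larger pRFs, e.g. a higher visual area

print(v1, v3a)
```

      Both hypothetical regions change by the same proportion (0.25), yet the raw differences (0.25 vs 1.0 deg) differ fourfold, so a raw difference map would be dominated by regions with large pRFs.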

      pRF eccentricity plot 

      “I suspect that the difference in PRF size across voxels correlates very strongly with the difference in eccentricity across voxels.”

      Our manuscript already contains a supplementary plot (Figure S4 B) comparing the eccentricity between adapter conditions, showing no notable shift in eccentricities except in V3A - but that is a small region and the results are generally more variable. We will comment more on this finding in the main text and explain this figure in more detail. 

      To the reviewer’s point, even if there were an appreciable shift in eccentricity between conditions (as they suggest may have happened for the example participant we showed), this does not mean that the pRF size effect is “due [...] to shifts in eccentricity.” Parameters in a complex multi-dimensional model like the pRF are not independent. There is no way of knowing whether a change in one parameter is causally linked with a change in another. We can only report the parameter estimates the model produces. 

      In fact, it is conceivable that adaptation causes both: changes in pRF size and eccentricity. If more central or peripheral RFs tend to have smaller or larger RFs, respectively, then adapting out one part of the distribution will shift the average accordingly. However, as we already established, we find no compelling evidence that pRF eccentricity changes dramatically due to adaptation, while pRF size does. We will illustrate this using the 2D plots in our revision.

      Reviewer #3:

      We thank the reviewer for their comments.

      pRF model

      Top-up adapters were not modelled in our analyses because they are shared events in all TRs, critically also including the “blank” periods, providing a constant source of signal. Therefore modelling them separately cannot meaningfully change the results. However, the reviewer makes a good suggestion that it would be useful to mention this in the manuscript, so we will add a discussion of this point.

      pRF size vs eccentricity

      We will add a plot showing pRF size in the two adaptation conditions (in addition to the pRF size difference) as a function of eccentricity.

      Correlation with behavioral effect

      In the original manuscript, we pointed out why the correlation between the magnitude of the behavioral effect and the pRF size change is not an appropriate test for our data. First, the reviewer is right that a larger sample size would be needed to reliably detect such a between-subject correlation. More importantly, as per our recruitment criteria for the fMRI experiment, we did not scan participants showing weak perceptual effects. This limits the variability in the perceptual effect and makes correlation inapplicable.
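
      The range-restriction point can be illustrated with a minimal simulation. Everything here is hypothetical: the linear relation between the behavioral and neural effects, the noise level, and the inclusion cutoff are assumptions chosen for illustration, not estimates from the study.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)

# Hypothetical population: neural effect = 0.5 * behavioral effect + noise.
behavior = [rng.gauss(0, 1) for _ in range(5000)]
neural = [0.5 * b + rng.gauss(0, 1) for b in behavior]
r_full = pearson(behavior, neural)

# Keep only simulated participants with a strong behavioral effect,
# mimicking a recruitment criterion (the cutoff of 0.5 is assumed).
kept = [(b, n) for b, n in zip(behavior, neural) if b > 0.5]
r_restricted = pearson([b for b, _ in kept], [n for _, n in kept])

print(round(r_full, 2), round(r_restricted, 2))
```

      In this sketch the correlation in the selected subsample is clearly weaker than in the full simulated population, even though the underlying relation is identical, so a weak or absent correlation in a pre-selected sample says little about the true relationship.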

      References

      van Dijk, J. A., de Haas, B., Moutsiana, C., & Schwarzkopf, D. S. (2016). Intersession reliability of population receptive field estimates. NeuroImage, 143, 293–303. https://doi.org/10.1016/J.NEUROIMAGE.2016.09.013

      Schwarzkopf, D. S., & Huang, Z. (2024). A simple statistical framework for small sample studies. bioRxiv, 2023.09.19.558509. https://doi.org/10.1101/2023.09.19.558509

      Senden, M., Reithler, J., Gijsen, S., & Goebel, R. (2014). Evaluating population receptive field estimation frameworks in terms of robustness and reproducibility. PLoS ONE, 9(12). https://doi.org/10.1371/JOURNAL.PONE.0114054

    1. Author response:

      eLife Assessment 

      This valuable study investigates how the neural representation of individual finger movements changes during the early period of sequence learning. By combining a new method for extracting features from human magnetoencephalography data and decoding analyses, the authors provide incomplete evidence of an early, swift change in the brain regions correlated with sequence learning, including a set of previously unreported frontal cortical regions. The addition of more control analyses to rule out that head movement artefacts influence the findings, and to further explain the proposal of offline contextualization during short rest periods as the basis for performance improvement would strengthen the manuscript.

      We appreciate the Editorial assessment on our paper’s strengths and novelty.  We have implemented additional control analyses to show that neither task-related eye movements nor increasing overlap of finger movements during learning account for our findings, which are that contextualized neural representations in a network of bilateral frontoparietal brain regions actively contribute to skill learning.  Importantly, we carried out additional analyses showing that contextualization develops predominantly during rest intervals.

      Public Reviews:

      We thank the Reviewers for their comments and suggestions, prompting new analyses and additions that strengthened our report.

      Reviewer #1 (Public review): 

      Summary: 

      This study addresses the issue of rapid skill learning and whether individual sequence elements (here: finger presses) are differentially represented in human MEG data. The authors use a decoding approach to classify individual finger elements and accomplish an accuracy of around 94%. A relevant finding is that the neural representations of individual finger elements dynamically change over the course of learning. This would be highly relevant for any attempts to develop better brain machine interfaces - one now can decode individual elements within a sequence with high precision, but these representations are not static but develop over the course of learning. 

      Strengths: The work follows a large body of work from the same group on the behavioural and neural foundations of sequence learning. The behavioural task is well established and neatly designed to allow for tracking learning and how individual sequence elements contribute. The inclusion of short offline rest periods between learning epochs has been influential because it has revealed that a lot, if not most of the gains in behaviour (ie speed of finger movements) occur in these so-called micro-offline rest periods. The authors use a range of new decoding techniques, and exhaustively interrogate their data in different ways, using different decoding approaches. Regardless of the approach, impressively high decoding accuracies are observed, but when using a hybrid approach that combines the MEG data in different ways, the authors observe decoding accuracies of individual sequence elements from the MEG data of up to 94%. 

      We have previously shown that neural replay of MEG activity representing the practiced skill correlated with micro-offline gains during rest intervals of early learning,1 consistent with the recent report that hippocampal ripples during these offline periods predict human motor sequence learning2. However, decoding accuracy in our earlier work1 needed improvement. Here, we reported a strategy to improve decoding accuracy that could benefit future studies of neural replay or BCI using MEG.

      Weaknesses: 

      There are a few concerns which the authors may well be able to resolve. These are not weaknesses as such, but factors that would be helpful to address as these concern potential contributions to the results that one would like to rule out. Regarding the decoding results shown in Figure 2 etc, a concern is that within individual frequency bands, the highest accuracy seems to be within frequencies that match the rate of keypresses. This is a general concern when relating movement to brain activity, so is not specific to decoding as done here. As far as reported, there was no specific restraint to the arm or shoulder, and even then it is conceivable that small head movements would correlate highly with the vigor of individual finger movements. This concern is supported by the highest contribution in decoding accuracy being in middle frontal regions - midline structures that would be specifically sensitive to movement artefacts and don't seem to come to mind as key structures for very simple sequential keypress tasks such as this - and the overall pattern is remarkably symmetrical (despite being a unimanual finger task) and spatially broad. This issue may well be matching the time course of learning, as the vigor and speed of finger presses will also influence the degree to which the arm/shoulder and head move. This is not to say that no useful information is contained within either of the frequencies or broadband data. But it raises the question of whether a lot is dominated by movement "artefacts" and one may get a more specific answer if removing any such contributions.

      Reviewer #1 expresses concern that the combination of the low-frequency narrow-band decoder results, and the bilateral middle frontal regions displaying the highest average intra-parcel decoding performance across subjects is suggestive that the decoding results could be driven by head movement or other artefacts.

      Head movement artefacts are highly unlikely to contribute meaningfully to our results for the following reasons. First, in addition to ICA denoising, all “recordings were visually inspected and marked to denoise segments containing other large amplitude artifacts due to movements” (see Methods). Second, the response pad was positioned in a manner that minimized wrist, arm or more proximal body movements during the task. Third, while head position was not monitored online for this study, the head was restrained using an inflatable air bladder, and head position was assessed at the beginning and at the end of each recording. Head movement did not exceed 5 mm between the beginning and end of each scan for all participants included in the study. Fourth, we agree that despite the steps taken above, it is possible that minor head movements could still contribute to some remaining variance in the MEG data in our study. The Reviewer states a concern that “it is conceivable that small head movements would correlate highly with the vigor of individual finger movements”. However, in order for any such correlations to meaningfully impact decoding performance, such head movements would need to: (A) be consistent and pervasive throughout the recording (which might not be the case if the head movements were related to movement vigor and vigor changed over time); and (B) systematically vary between different finger movements, and also between the same finger movement performed at different sequence locations (see 5-class decoding performance in Figure 4B). It is extremely unlikely that any head movement artefacts would meet all of these conditions.

      Given the task design, a much more likely confound in our estimation would be the contribution of eye movement artefacts to the decoder performance (an issue appropriately raised by Reviewer #3 in the comments below). Recall from Figure 1A in the manuscript that an asterisk marks the current position in the sequence and is updated at each keypress. Since participants make very few performance errors, the position of the asterisk on the display is highly correlated with the keypress being made in the sequence. Thus, it is possible that if participants are attending to the visual feedback provided on the display, they may move their eyes in a way that is systematically related to the task. Since we did record eye movements simultaneously with the MEG recordings (EyeLink 1000 Plus; Fs = 600 Hz), we were able to perform a control analysis to address this question. For each keypress event during trials in which no errors occurred (which is the same time-point that the asterisk position is updated), we extracted three features related to eye movements: 1) the gaze position at the time of asterisk position update (or keyDown event), 2) the gaze position 150 ms later, and 3) the peak velocity of the eye movement between the two positions. We then constructed a classifier from these features with the aim of predicting the location of the asterisk (ordinal positions 1-5) on the display. As shown in the confusion matrix below (Author response image 1), the classifier failed to perform above chance levels (Overall cross-validated accuracy = 0.21817):

      Author response image 1.

      Confusion matrix showing that three eye movement features fail to predict asterisk position on the task display above chance levels (Fold 1 test accuracy = 0.21718; Fold 2 test accuracy = 0.22023; Fold 3 test accuracy = 0.21859; Fold 4 test accuracy = 0.22113; Fold 5 test accuracy = 0.21373; Overall cross-validated accuracy = 0.21817). Since the ordinal position of the asterisk on the display is highly correlated with the ordinal position of individual keypresses in the sequence, this analysis provides strong evidence that keypress decoding performance from MEG features is not explained by systematic relationships between finger movement behavior and eye movements (i.e., behavioral artefacts).
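
      To make the logic of this control analysis concrete, here is a minimal sketch of 5-fold cross-validation on features that carry no information about a 5-class label. The response does not name the classifier used, so the nearest-centroid classifier, the synthetic data, and all function names below are assumptions made purely for illustration.

```python
import random

def nearest_centroid_cv(features, labels, n_folds=5, seed=0):
    """Return per-fold test accuracies of a nearest-centroid classifier."""
    rng = random.Random(seed)
    order = list(range(len(labels)))
    rng.shuffle(order)
    folds = [order[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        # Class centroids are estimated from the training folds only.
        centroids = {}
        for cls in sorted(set(labels)):
            rows = [features[i] for i in train_idx if labels[i] == cls]
            centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        correct = 0
        for i in test_idx:
            pred = min(
                centroids,
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(features[i], centroids[c])),
            )
            correct += pred == labels[i]
        accuracies.append(correct / len(test_idx))
    return accuracies

rng = random.Random(1)
n_events = 2000
# Synthetic stand-ins: 5 ordinal positions and 3 features that are
# generated independently of the labels (no real eye-tracking data).
labels = [rng.randrange(1, 6) for _ in range(n_events)]
features = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n_events)]

fold_acc = nearest_centroid_cv(features, labels)
overall = sum(fold_acc) / len(fold_acc)
chance = 1 / 5

print([round(a, 3) for a in fold_acc], round(overall, 3), chance)
```

      With uninformative features, the per-fold accuracies scatter around the 5-class chance level of 0.20, which is the benchmark against which the reported fold accuracies should be read.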

      In fact, inspection of the eye position data revealed that a majority of participants on most trials displayed random walk gaze patterns around a center fixation point, indicating that participants did not attend to the asterisk position on the display. This is consistent with intrinsic generation of the action sequence, and congruent with the fact that the display does not provide explicit feedback related to performance. A similar real-world example would be manually inputting a long password into a secure online application. In this case, one intrinsically generates the sequence from memory and receives similar feedback about the password sequence position (also provided as asterisks), which is typically ignored by the user. The minimal participant engagement with the visual task display observed in this study highlights another important point – that the behavior in explicit sequence learning motor tasks is highly generative in nature rather than reactive to stimulus cues as in the serial reaction time task (SRTT).  This is a crucial difference that must be carefully considered when designing investigations and comparing findings across studies.

      We observed that initial keypress decoding accuracy was predominantly driven by contralateral primary sensorimotor cortex in the initial practice trials before transitioning to bilateral frontoparietal regions by trials 11 or 12 as performance gains plateaued. The contribution of contralateral primary sensorimotor areas to early skill learning has been extensively reported in humans and non-human animals.1,3-5 Similarly, the increased involvement of bilateral frontal and parietal regions to decoding during early skill learning in the non-dominant hand is well known. Enhanced bilateral activation in both frontal and parietal cortex during skill learning has been extensively reported6-11, and appears to be even more prominent during early fine motor skill learning in the non-dominant hand12,13. The frontal regions identified in these studies are known to play crucial roles in executive control14, motor planning15, and working memory6,8,16-18 processes, while the same parietal regions are known to integrate multimodal sensory feedback and support visuomotor transformations6,8,16-18, in addition to working memory19. Thus, it is not surprising that these regions increasingly contribute to decoding as subjects internalize the sequential task. We now include a statement reflecting these considerations in the revised Discussion.

      A somewhat related point is this: when combining voxel and parcel space, a concern is whether a degree of circularity may have contributed to the improved accuracy of the combined data, because it seems to use the same MEG signals twice - the voxels most contributing are also those contributing most to a parcel being identified as relevant, as parcels reflect the average of voxels within a boundary. In this context, I struggled to understand the explanation given, ie that the improved accuracy of the hybrid model may be due to "lower spatially resolved whole-brain and higher spatially resolved regional activity patterns".

      We strongly disagree with the Reviewer’s assertion that the construction of the hybrid-space decoder is circular. To clarify, the base feature set for the hybrid-space decoder constructed for all participants includes whole-brain spatial patterns of MEG source activity averaged within parcels. As stated in the manuscript, these 148 inter-parcel features reflect “lower spatially resolved whole-brain activity patterns” or global brain dynamics. We then independently test how well spatial patterns of MEG source activity for all voxels distributed within individual parcels can decode keypress actions. Again, the tests of these intra-parcel spatial patterns, intended to capture “higher spatially resolved regional brain activity patterns”, are completely independent of one another and of the weighting of individual inter-parcel features. These intra-parcel features could, for example, provide additional information about muscle activation patterns or the task environment. These approximately 1150 intra-parcel voxels (on average, with the total number varying between subjects) are then combined with the 148 inter-parcel features to construct the final hybrid-space decoder. In fact, this varied spatial filter approach shares some similarities with the construction of convolutional neural networks (CNNs) used to perform object recognition in image classification applications. One could also view this hybrid-space decoding approach as a spatial analogue to common time-frequency based analyses such as theta-gamma phase amplitude coupling (PAC), which combine information from two or more narrow-band spectral features derived from the same time-series data.

      We directly tested this hypothesis – that spatially overlapping intra- and inter-parcel features portray different information – by constructing an alternative hybrid-space decoder (HybridAlt) that excluded average inter-parcel features which spatially overlapped with intra-parcel voxel features, and comparing the performance to the decoder used in the manuscript (HybridOrig). The prediction was that if the overlapping parcel contained similar information to the more spatially resolved voxel patterns, then removing the parcel features (n=8) from the decoding analysis should not impact performance. In fact, despite making up less than 1% of the overall input feature space, removing those parcels resulted in a significant drop in overall performance greater than 2% (78.15% ± SD 7.03% for HybridOrig vs. 75.49% ± SD 7.17% for HybridAlt; Wilcoxon signed rank test, z = 3.7410, p = 1.8326e-04) (Author response image 2).

      Author response image 2.

      Comparison of decoding performances with two different hybrid approaches. HybridAlt: Intra-parcel voxel-space features of top ranked parcels and inter-parcel features of remaining parcels. HybridOrig: Voxel-space features of top ranked parcels and whole-brain parcel-space features (i.e., the version used in the manuscript). Dots represent decoding accuracy for individual subjects. Dashed lines indicate the trend in performance change across participants. Note that HybridOrig (the approach used in our manuscript) significantly outperforms the HybridAlt approach, indicating that the excluded parcel features provide unique information compared to the spatially overlapping intra-parcel voxel patterns.
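
      The difference between the two feature sets can be sketched as follows. This is not the authors' pipeline: the function name, the 144 voxels per parcel, and the placeholder activity values are hypothetical, chosen only so that the counts roughly match the 148 parcels, 8 top-ranked parcels, and approximately 1150 voxels mentioned in the response.

```python
def hybrid_features(parcel_voxels, top_parcels, drop_overlapping=False):
    """Build one sample's hybrid-space feature vector.

    parcel_voxels: dict mapping parcel id -> list of voxel activity values.
    top_parcels: parcels whose individual voxels are added as
        intra-parcel features.
    drop_overlapping: if True, omit the parcel averages of top_parcels
        (HybridAlt-style); if False, keep every average (HybridOrig-style).
    """
    inter = [
        sum(vox) / len(vox)
        for parcel, vox in sorted(parcel_voxels.items())
        if not (drop_overlapping and parcel in top_parcels)
    ]
    intra = [v for parcel in sorted(top_parcels) for v in parcel_voxels[parcel]]
    return inter + intra

# Hypothetical source data: 148 parcels with 144 voxels each.
parcels = {p: [float(p + k) for k in range(144)] for p in range(148)}
top = set(range(8))  # assumed top-ranked parcels

orig = hybrid_features(parcels, top)
alt = hybrid_features(parcels, top, drop_overlapping=True)

print(len(orig), len(alt), round(8 / len(orig), 4))
```

      In this sketch the two variants differ by only 8 of 1300 features (about 0.6%), which matches the response's description of the removed parcel averages as less than 1% of the input feature space.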

      Firstly, there will be a relatively high degree of spatial contiguity among voxels because of the nature of the signal measured, i.e. nearby individual voxels are unlikely to be independent. Secondly, the voxel data gives a somewhat misleading sense of precision; the inversion can be set up to give an estimate for each voxel, but there will not just be dependence among adjacent voxels, but also substantial variation in the sensitivity and confidence with which activity can be projected to different parts of the brain. Midline and deeper structures come to mind, where the inversion will be more problematic than for regions along the dorsal convexity of the brain, and a concern is that in those midline structures, the highest decoding accuracy is seen. 

      We definitely agree with the Reviewer that some inter-parcel features representing neighboring (or spatially contiguous) voxels are likely to be correlated. This has been well documented in the MEG literature20,21 and is a particularly important confound to address in functional or effective connectivity analyses (not performed in the present study). In the present analysis, any correlation between adjacent voxels presents a multi-collinearity problem, which effectively reduces the dimensionality of the input feature space. However, as long as there are multiple groups of correlated voxels within each parcel (i.e. - the effective dimensionality is still greater than 1), the intra-parcel spatial patterns could still meaningfully contribute to the decoder performance. Two specific results support this assertion.

      First, we obtained higher decoding accuracy with voxel-space features [74.51% (± SD 7.34%)] compared to parcel space features [68.77% (± SD 7.6%)] (Figure 3B), indicating individual voxels carry more information in decoding the keypresses than the averaged voxel-space features or parcel-space features. Second, individual voxels within a parcel showed varying feature importance scores in decoding keypresses (Author response image 3). This finding supports the Reviewer’s assertion that neighboring voxels express similar information, but also shows that the correlated voxels form mini subclusters that are much smaller spatially than the parcel they reside in.

      Author response image 3.

      Feature importance score of individual voxels in decoding keypresses: MRMR was used to rank the individual voxel space features in decoding keypresses and the min-max normalized MRMR score was mapped to a structural brain surface. Note that individual voxels within a parcel showed different contribution to decoding.

       

      Some of these concerns could be addressed by recording head movement (with enough precision) to regress out these contributions. The authors state that head movement was monitored with 3 fiducials, and their time courses ought to provide a way to deal with this issue. The ICA procedure may not have sufficiently dealt with removing movement-related problems, but one could eg relate individual components that were identified to the keypresses as another means for checking. An alternative could be to focus on frequency ranges above the movement frequencies. The accuracy for those still seems impressive and may provide a slightly more biologically plausible assessment. 

      We have already addressed the issue of movement-related artefacts in the first response above. With respect to a focus on frequency ranges above movement frequencies, the Reviewer states the “accuracy for those still seems impressive and may provide a slightly more biologically plausible assessment”. First, it is important to note that cortical delta-band oscillations measured with local field potentials (LFPs) in macaques are known to contain important information related to end-effector kinematics22,23, muscle activation patterns24, and temporal sequencing25 during skilled reaching and grasping actions. Thus, there is a substantial body of evidence that low-frequency neural oscillatory activity in this range contains important information about the skill learning behavior investigated in the present study. Second, our own data show (as the Reviewer also points out) that significant information related to the skill learning behavior is also present in higher frequency bands (see Figure 2A and Figure 3—figure supplement 1). As we pointed out in our earlier response to questions about the hybrid space decoder architecture (see above), it is likely that different, yet complementary, information is encoded across different temporal frequencies (just as it is encoded across different spatial frequencies). Again, this interpretation is supported by our data as the highest performing classifiers in all cases (when holding all parameters constant) were always constructed from broadband input MEG data (Figure 2A and Figure 3—figure supplement 1).

      One question concerns the interpretation of the results shown in Figure 4. They imply that during the course of learning, entirely different brain networks underpin the behaviour. Not only that, but they also include regions that would seem rather unexpected to be key nodes for learning and expressing relatively simple finger sequences, such as here. What then is the biological plausibility of these results? The authors seem to circumnavigate this issue by moving into a distance metric that captures the (neural network) changes over the course of learning, but the discussion seems detached from which regions are actually involved; or they offer a rather broad discussion of the anatomical regions identified here, eg in the context of LFOs, where they merely refer to "frontoparietal regions". 

      The Reviewer notes the shift in brain networks driving keypress decoding performance between trials 1, 11 and 36 as shown in Figure 4A. The Reviewer questions whether these substantial shifts in brain network states underpinning the skill are biologically plausible, as well as the likelihood that bilateral superior and middle frontal and parietal cortex are important nodes within these networks.

      First, previous fMRI work in humans performing a similar sequence learning task showed that flexibility in brain network composition (i.e. – changes in brain region members displaying coordinated activity) is up-regulated in novel learning environments and explains differences in learning rates across individuals26.  This work supports our interpretation of the present study data, that brain networks engaged in sequential motor skills rapidly reconfigure during early learning.

      Second, frontoparietal network activity is known to support motor memory encoding during early learning27,28. For example, reactivation events in the posterior parietal29 and medial prefrontal30,31 cortex (MPFC) have been temporally linked to hippocampal replay, and are posited to support memory consolidation across several memory domains32, including motor sequence learning1,33,34. Further, synchronized interactions between MPFC and hippocampus are more prominent during early learning as opposed to later stages27,35,36, perhaps reflecting “redistribution of hippocampal memories to MPFC” 27. MPFC contributes to very early memory formation by learning associations between contexts, locations, events and adaptive responses during rapid learning37. Consistently, coupling between hippocampus and MPFC has been shown during, and importantly immediately following (rest), initial memory encoding38,39. Importantly, MPFC activity during initial memory encoding predicts subsequent recall40. Thus, the spatial map required to encode a motor sequence memory may be “built under the supervision of the prefrontal cortex” 28, which is also engaged in the development of an abstract representation of the sequence41. In more abstract terms, the prefrontal, premotor and parietal cortices support novice performance “by deploying attentional and control processes” required during early learning42-44. The dorsolateral prefrontal cortex (DLPFC) specifically is thought to engage in goal selection and sequence monitoring during early skill practice45, all consistent with the schema model of declarative memory in which prefrontal cortices play an important role in encoding46,47. Thus, several prefrontal and frontoparietal regions contributing to long-term learning48 are also engaged in early stages of encoding. Altogether, there is strong biological support for the contribution of bilateral prefrontal and frontoparietal regions to decoding during early skill learning.
We now address this issue in the revised manuscript.

      If I understand correctly, the offline neural representation analysis is in essence the comparison of the last keypress vs the first keypress of the next sequence. In that sense, the activity during offline rest periods is actually not considered. This makes the nomenclature somewhat confusing. While it matches the behavioural analysis, having only key presses one can't do it in any other way, but here the authors actually do have recordings of brain activity during offline rest. So at the very least calling it offline neural representation is misleading to this reviewer because what is compared is activity during the last and during the next keypress, not activity during offline periods. But it also seems a missed opportunity - the authors argue that most of the relevant learning occurs during offline rest periods, yet there is no attempt to actually test whether activity during this period can be useful for the questions at hand here. 

      We agree with the Reviewer that our previous “offline neural representation” nomenclature could be misinterpreted. In the revised manuscript we refer to this difference as the “offline neural representational change”. Please note that our previous work did link offline neural activity (i.e. – 16-22 Hz beta power and neural replay density during inter-practice rest periods) to observed micro-offline gains49.

      Reviewer #2 (Public review): 

      Summary 

      Dash et al. asked whether and how the neural representation of individual finger movements is "contextualized" within a trained sequence during the very early period of sequential skill learning by using decoding of MEG signal. Specifically, they assessed whether/how the same finger presses (pressing index finger) embedded in the different ordinal positions of a practiced sequence (4-1-3-2-4; here, the numbers 1 through 4 correspond to the little through the index fingers of the non-dominant left hand) change their representation (MEG feature). They did this by computing either the decoding accuracy of the index finger at the ordinal positions 1 vs. 5 (index_OP1 vs index_OP5) or pattern distance between index_OP1 vs. index_OP5 at each training trial and found that both the decoding accuracy and the pattern distance progressively increase over the course of learning trials. More interestingly, they also computed the pattern distance for index_OP5 for the last execution of a practice trial vs. index_OP1 for the first execution in the next practice trial (i.e., across the rest period). This "off-line" distance was significantly larger than the "on-line" distance, which was computed within practice trials and predicted micro-offline skill gain. Based on these results, the authors conclude that the differentiation of representation for the identical movement embedded in different positions of a sequential skill ("contextualization") primarily occurs during early skill learning, especially during rest, consistent with the recent theory of the "micro-offline learning" proposed by the authors' group. I think this is an important and timely topic for the field of motor learning and beyond.

      Strengths

      The specific strengths of the current work are as follows. First, the use of temporally rich neural information (MEG signal) has a large advantage over previous studies testing sequential representations using fMRI. This allowed the authors to examine the earliest period (= the first few minutes of training) of skill learning with finer temporal resolution. Second, through the optimization of MEG feature extraction, the current study achieved extremely high decoding accuracy (approx. 94%) compared to previous works. As claimed by the authors, this is one of the strengths of the paper (but see my comments). Third, although some potential refinement might be needed, comparing "online" and "offline" pattern distance is a neat idea. 

      Weaknesses 

      Along with the strengths I raised above, the paper has some weaknesses. First, the pursuit of high decoding accuracy, especially the choice of time points and window length (i.e., 200 msec window starting from 0 msec from key press onset), casts a shadow on the interpretation of the main result. Currently, it is unclear whether the decoding results simply reflect behavioral change or true underlying neural change. As shown in the behavioral data, the key press speed reached 3~4 presses per second already at around the end of the early learning period (11th trial), which means inter-press intervals become as short as 250-330 msec. Thus, in almost more than 60% of training period data, the time window for MEG feature extraction (200 msec) spans around 60% of the inter-press intervals. Considering that the preparation/cueing of subsequent presses starts ahead of the actual press (e.g., Kornysheva et al., 2019) and/or potential online planning (e.g., Ariani and Diedrichsen, 2019), the decoder likely has captured this future press information as well as the signal related to the current key press, independent of the formation of genuine sequential representation (e.g., "contextualization" of individual press). This may also explain the gradual increase in decoding accuracy or pattern distance between index_OP1 vs. index_OP5 (Figure 4C and 5A), which co-occurred with performance improvement, as shorter inter-press intervals are more favorable for dissociating the two index finger presses followed by different finger presses. The compromised decoding accuracies for the control sequences can be explained by similar logic. Therefore, more careful consideration and elaborated discussion seem necessary when trying to both achieve high-performance decoding and assess early skill learning, as it can impact all the subsequent analyses.

      The Reviewer raises the possibility that (given the windowing parameters used in the present study) an increase in “contextualization” with learning could simply reflect faster typing speeds as opposed to an actual change in the underlying neural representation. The issue can essentially be framed as a mixing problem. As correct sequences are generated at higher and higher speeds over training, MEG activity patterns related to the planning, execution, evaluation and memory of individual keypresses overlap more in time. Thus, increased overlap between the “4” and “1” keypresses (at the start of the sequence) and “2” and “4” keypresses (at the end of the sequence) could artefactually increase contextualization distances even if the underlying neural representations for the individual keypresses remain unchanged (assuming this mixing of representations is used by the classifier to differentially tag each index finger press). If this were the case, it follows that such mixing effects reflecting the ordinal sequence structure would also be observable in the distribution of decoder misclassifications. For example, “4” keypresses would be more likely to be misclassified as “1” or “2” keypresses (or vice versa) than as “3” keypresses. The confusion matrices presented in Figures 3C and 4B and Figure 3—figure supplement 3A in the previously submitted manuscript do not show this trend in the distribution of misclassifications across the four fingers.

      Moreover, if the representation distance is largely driven by this mixing effect, it’s also possible that the increased overlap between consecutive index finger keypresses during the 4-4 transition marking the end of one sequence and the beginning of the next one could actually mask contextualization-related changes to the underlying neural representations and make them harder to detect. In this case, a decoder tasked with separating individual index finger keypresses into two distinct classes based upon sequence position might show decreased performance with learning as adjacent keypresses overlapped in time with each other to an increasing extent. However, Figure 4C in our previously submitted manuscript does not support this possibility, as the 2-class hybrid classifier displays improved classification performance over early practice trials despite greater temporal overlap.

      We also conducted a new multivariate regression analysis to directly assess whether the neural representation distance score could be predicted by the 4-1, 2-4 and 4-4 keypress transition times observed for each complete correct sequence (both predictor and response variables were z-score normalized within-subject). The results of this analysis affirmed that the possible alternative explanation put forward by the Reviewer is not supported by our data (Adjusted R2 = 0.00431; F = 5.62). We now include this new negative control analysis result in the revised manuscript.
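      The regression control described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the helper names (`zscore`, `solve`, `adjusted_r2`) and the synthetic data are assumptions made for the example, and it uses ordinary least squares on pooled, within-subject z-scored values.

```python
import random
import statistics


def zscore(values):
    """Z-score a list of numbers (applied separately per subject)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def adjusted_r2(predictors, response):
    """Fit OLS with an intercept and return the adjusted R^2.

    predictors: list of rows (one row per sequence, p values per row).
    response:   list of distance scores, one per sequence.
    """
    n, p = len(response), len(predictors[0])
    design = [[1.0] + list(row) for row in predictors]
    k = p + 1
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, response)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = statistics.fmean(response)
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Synthetic example: transition times unrelated to the distance score.
random.seed(0)
rows, distances = [], []
for _subject in range(5):
    t41 = zscore([random.gauss(0.30, 0.05) for _ in range(40)])  # 4-1 transition
    t24 = zscore([random.gauss(0.28, 0.05) for _ in range(40)])  # 2-4 transition
    t44 = zscore([random.gauss(0.35, 0.05) for _ in range(40)])  # 4-4 transition
    rows.extend([a, b, c] for a, b, c in zip(t41, t24, t44))
    distances.extend(zscore([random.gauss(1.0, 0.2) for _ in range(40)]))

print(adjusted_r2(rows, distances))
```

      An adjusted R2 near zero, as in this synthetic case, is the pattern expected when transition times carry essentially no linear information about the distance score.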

      Overall, we do strongly agree with the Reviewer that the naturalistic, self-paced, generative task employed in the present study results in overlapping brain processes related to planning, execution, evaluation and memory of the action sequence. We also agree that there are several tradeoffs to consider in the construction of the classifiers depending on the study aim. Given our aim of optimizing keypress decoder accuracy in the present study, the set of trade-offs resulted in representations reflecting more the latter three processes, and less so the planning component. Whether separate decoders can be constructed to tease apart the representations or networks supporting these overlapping processes is an important future direction of research in this area. For example, work presently underway in our lab constrains the selection of windowing parameters in a manner that allows individual classifiers to be temporally linked to specific planning, execution, evaluation or memory-related processes to discern which brain networks are involved and how they adaptively reorganize with learning. Results from the present study (Figure 4—figure supplement 2) showing hybrid-space decoder prediction accuracies exceeding 74% for temporal windows spanning as little as 25ms and located up to 100ms prior to the keyDown event strongly support the feasibility of such an approach.

      Related to the above point, testing only one particular sequence (4-1-3-2-4), aside from the control ones, limits the generalizability of the finding. This also may have contributed to the extremely high decoding accuracy reported in the current study. 

      The Reviewer raises a question about the generalizability of the decoder accuracy reported in our study. Fortunately, a comparison between decoder performances on Day 1 and Day 2 datasets does provide some insight into this issue. As the Reviewer points out, the classifiers in this study were trained and tested on keypresses performed while practicing a specific sequence (4-1-3-2-4). The study was designed this way so as to avoid the impact of interference effects on learning dynamics. The cross-validated performance of classifiers on MEG data collected within the same session was 90.47% overall accuracy (4-class; Figure 3C). We then tested classifier performance on data collected during a separate MEG session conducted approximately 24 hours later (Day 2; see Figure 3—figure supplement 3). We observed a reduction in overall accuracy rate to 87.11% when tested on MEG data recorded while participants performed the same learned sequence, and 79.44% when they performed several previously unpracticed sequences. Both changes in accuracy are important with regard to the generalizability of our findings. First, 87.11% performance accuracy for the trained sequence data on Day 2 (a reduction of only 3.36%) indicates that the hybrid-space decoder performance is robust over multiple MEG sessions, and thus, robust to variations in SNR across the MEG sensor array caused by small differences in head position between scans. This indicates a substantial advantage over sensor-space decoding approaches. Furthermore, when tested on data from unpracticed sequences, overall performance dropped an additional 7.67%. This difference reflects the performance bias of the classifier for the trained sequence, possibly caused by high-order sequence structure being incorporated into the feature weights. In the future, it will be important to understand in more detail how random or repeated keypress sequence training data impacts overall decoder performance and generalization.
We strongly agree with the Reviewer that the issue of generalizability is extremely important and have added a new paragraph to the Discussion in the revised manuscript highlighting the strengths and weaknesses of our study with respect to this issue.

      In terms of clinical BCI, one of the potential relevance of the study, as claimed by the authors, it is not clear that the specific time window chosen in the current study (up to 200 msec since key press onset) is really useful. In most cases, clinical BCI would target neural signals with no overt movement execution due to patients' inability to move (e.g., Hochberg et al., 2012). Given the time window, the surprisingly high performance of the current decoder may result from sensory feedback and/or planning of subsequent movement, which may not always be available in the clinical BCI context. Of course, the decoding accuracy is still much higher than chance even when using signal before the key press (as shown in Figure 4 Supplement 2), but it is not immediately clear to me that the authors relate their high decoding accuracy based on post-movement signal to clinical BCI settings.

      The Reviewer questions the relevance of the specific window parameters used in the present study for clinical BCI applications, particularly for paretic patients who are unable to produce finger movements or for whom afferent sensory feedback is no longer intact. We strongly agree with the Reviewer that any intended clinical application must carefully consider these specific input feature constraints dictated by the clinical cohort, and in turn impose appropriate and complementary constraints on classifier parameters that may differ from the ones used in the present study. We now highlight this issue in the Discussion of the revised manuscript and relate our present findings to published clinical BCI work within this context.

      One of the important and fascinating claims of the current study is that the "contextualization" of individual finger movements in a trained sequence specifically occurs during short rest periods in very early skill learning, echoing the recent theory of micro-offline learning proposed by the authors' group. Here, I think two points need to be clarified. First, the concept of "contextualization" is kept somewhat blurry throughout the text. It is only at the later part of the Discussion (around line #330 on page 13) that some potential mechanism for the "contextualization" is provided as "what-and-where" binding. Still, it is unclear what "contextualization" actually is in the current data, as the MEG signal analyzed is extracted from 0-200 msec after the keypress. If one thinks something is contextualizing an action, that contextualization should come earlier than the action itself. 

      The Reviewer requests that we: 1) more clearly define our use of the term “contextualization” and 2) provide the rationale for assessing it over a 200ms window aligned to the keyDown event. This choice of window parameters means that the MEG activity used in our analysis was coincident with, rather than preceding, the actual keypresses.  We define contextualization as the differentiation of representation for the identical movement embedded in different positions of a sequential skill. That is, representations of individual action elements progressively incorporate information about their relationship to the overall sequence structure as the skill is learned. We agree with the Reviewer that this can be appropriately interpreted as “what-and-where” binding. We now incorporate this definition in the Introduction of the revised manuscript as requested.

      The window parameters for accurately decoding individual finger movements were determined using a grid search of the parameter space (a sliding window of variable width between 25-350 ms with 25 ms increments variably aligned from 0 to +100ms with 10ms increments relative to the keyDown event). This approach generated 140 different temporal windows for each keypress for each participant, with the final parameter selection determined by comparing the resulting performance across decoders. Importantly, the decision to optimize for decoding accuracy placed an emphasis on keypress representations characterized by the most consistent and robust features shared across subjects, which in turn maximize statistical power in detecting common learning-related changes. In this case, the optimal window encompassed a 200ms epoch aligned to the keyDown event (t0 = 0 ms). We then asked if the representations (i.e. – spatial patterns of combined parcel- and voxel-space activity) of the same digit at two different sequence positions changed with practice within this optimal decoding window. Of course, our findings do not rule out the possibility that contextualization can also be found before or even after this time window, as we did not directly address this issue in the present study. Ongoing work in our lab, as pointed out above, is investigating contextualization within different time windows tailored specifically for assessing sequence skill action planning, execution, evaluation and memory processes.
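      The window grid search described above can be illustrated with a short sketch. The function names and the toy scoring function are hypothetical stand-ins; in practice the score would be the cross-validated accuracy of a decoder trained on each window. The endpoint convention for the alignment range is not spelled out in the response: counting both 0 and +100 ms would give 14 × 11 = 154 windows, so the sketch assumes 10 onsets (0 to 90 ms) to match the 140 windows reported.

```python
from itertools import product


def window_grid(widths_ms, onsets_ms):
    """Enumerate (start, end) windows in ms relative to the keyDown event."""
    return [(onset, onset + width) for width, onset in product(widths_ms, onsets_ms)]


def select_best_window(grid, score_fn):
    """Return the window with the highest score under score_fn."""
    return max(grid, key=score_fn)


WIDTHS_MS = list(range(25, 351, 25))  # 25, 50, ..., 350 ms (14 widths)
ONSETS_MS = list(range(0, 100, 10))   # assumed: 0, 10, ..., 90 ms (10 onsets)


def toy_score(window):
    """Hypothetical stand-in for decoder accuracy; peaks at the 0-200 ms window."""
    start, end = window
    return -abs(start) - abs((end - start) - 200)


grid = window_grid(WIDTHS_MS, ONSETS_MS)
best = select_best_window(grid, toy_score)
print(len(grid), best)
```

      Keeping enumeration separate from scoring makes it straightforward to swap in a different criterion, for example one restricted to pre-movement windows.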

      The second point is that the result provided by the authors is not yet convincing enough to support the claim that "contextualization" occurs during rest. In the original analysis, the authors presented the statistical significance regarding the correlation between the "offline" pattern differentiation and micro-offline skill gain (Figure 5. Supplement 1), as well as the larger "offline" distance than "online" distance (Figure 5B). However, this analysis looks like regressing two variables (monotonically) increasing as a function of the trial. Although some information in this analysis, such as what the independent/dependent variables were or how individual subjects were treated, was missing in the Methods, getting a statistically significant slope seems unsurprising in such a situation. Also, curiously, the same quantitative evidence was not provided for its "online" counterpart, and the authors only briefly mentioned in the text that there was no significant correlation between them. It may be true looking at the data in Figure 5A as the online representation distance looks less monotonically changing, but the classification accuracy presented in Figure 4C, which should reflect similar representational distance, shows a more monotonic increase up to the 11th trial. Further, the ways the "online" and "offline" representation distance was estimated seem to make them not directly comparable. While the "online" distance was computed using all the correct press data within each 10 sec of execution, the "offline" distance is basically computed by only two presses (i.e., the last index_OP5 vs. the first index_OP1 separated by 10 sec of rest). Theoretically, the distance between the neural activity patterns for temporally closer events tends to be closer than that between the patterns for temporally far-apart events. It would be fairer to use the distance between the first index_OP1 vs. the last index_OP5 within an execution period for "online" distance, as well. 

      The Reviewer suggests that the current data is not convincing enough to show that contextualization occurs during rest and raises two important concerns: 1) the relationship between online contextualization and micro-online gains is not shown, and 2) the online distance was calculated differently from its offline counterpart (i.e. - instead of calculating the distance between last IndexOP5 and first IndexOP1 from a single trial, the distance was calculated for each sequence within a trial and then averaged).

      We addressed the first concern by performing individual subject correlations between 1) contextualization changes during rest intervals and micro-offline gains; 2) contextualization changes during practice trials and micro-online gains, and 3) contextualization changes during practice trials and micro-offline gains (Author response image 4). We then statistically compared the resulting correlation coefficient distributions and found that within-subject correlations for contextualization changes during rest intervals and micro-offline gains were significantly higher than online contextualization and micro-online gains (t = 3.2827, p = 0.0015) and online contextualization and micro-offline gains (t = 3.7021, p = 5.3013e-04). These results are consistent with our interpretation that micro-offline gains are supported by contextualization changes during the inter-practice rest period.
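      A sketch of this two-stage comparison is given below: one correlation coefficient per subject, followed by a comparison of the resulting distributions. The response does not state which test was used, so the paired t statistic here is an assumption, and the toy numbers are invented purely for illustration.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [u - v for u, v in zip(a, b)]
    n = len(diffs)
    t = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Toy data: per-subject trial-wise values (hypothetical numbers).
subjects = [
    {"rest_change": [0.1, 0.4, 0.5, 0.9], "practice_change": [0.3, 0.1, 0.4, 0.2],
     "offline_gain": [0.2, 0.5, 0.4, 1.0], "online_gain": [0.5, 0.2, 0.1, 0.3]},
    {"rest_change": [0.2, 0.3, 0.7, 0.8], "practice_change": [0.4, 0.2, 0.1, 0.3],
     "offline_gain": [0.1, 0.4, 0.6, 0.7], "online_gain": [0.1, 0.4, 0.3, 0.2]},
    {"rest_change": [0.3, 0.2, 0.6, 0.9], "practice_change": [0.1, 0.5, 0.2, 0.4],
     "offline_gain": [0.4, 0.1, 0.7, 0.8], "online_gain": [0.3, 0.1, 0.4, 0.1]},
]

# One correlation coefficient per subject for each pairing of interest.
r_rest_offline = [pearson_r(s["rest_change"], s["offline_gain"]) for s in subjects]
r_practice_online = [pearson_r(s["practice_change"], s["online_gain"]) for s in subjects]

t_stat, dof = paired_t(r_rest_offline, r_practice_online)
print(t_stat, dof)
```

      Computing correlations within each subject before comparing them avoids pooling trials across participants, which would mix within- and between-subject variance.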

      Author response image 4.

      Distribution of individual subject correlation coefficients between contextualization changes occurring during practice or rest with micro-online and micro-offline performance gains. Note that the correlation distributions were significantly higher for the relationship between contextualization changes during rest and micro-offline gains than for contextualization changes during practice and either micro-online or micro-offline gains.

      With respect to the second concern highlighted above, we agree with the Reviewer that one limitation of the analysis comparing online versus offline changes in contextualization, as presented in the reviewed manuscript, is that it does not eliminate the possibility that any differences could simply be explained by the passage of time (which is smaller for the online analysis compared to the offline analysis). The Reviewer suggests an approach that addresses this issue, which we have now carried out. When quantifying online changes in contextualization from the first IndexOP1 to the last IndexOP5 keypress in the same trial, we observed no learning-related trend (Author response image 5, right panel). Importantly, offline distances were significantly larger than online distances regardless of the measurement approach and neither predicted online learning (Author response image 6).

      Author response image 5.

      Trial-by-trial trend of offline (left panel) and online (middle and right panels) changes in contextualization. Offline changes in contextualization were assessed by calculating the distance between neural representations for the last IndexOP5 keypress in the previous trial and the first IndexOP1 keypress in the present trial. Two different approaches were used to characterize online contextualization changes. The analysis included in the reviewed manuscript (middle panel) calculated the distance between IndexOP1 and IndexOP5 for each correct sequence, which was then averaged across the trial. This approach is limited by the lack of control for the passage of time when making online versus offline comparisons. Thus, the second approach controlled for the passage of time by calculating the distance between the representations associated with the first IndexOP1 keypress and the last IndexOP5 keypress within the same trial. Note that while the first approach showed an increasing online contextualization trend with practice, the second approach did not.
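      The three distance measures described in this caption can be written out as a small sketch. The Euclidean metric and the data layout (one pair of feature vectors per correct sequence) are assumptions made for illustration; the response does not specify the distance metric here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def online_within_sequence(trial):
    """Mean IndexOP1-to-IndexOP5 distance over the correct sequences in a trial.

    trial: list of (op1_pattern, op5_pattern) tuples, one per correct sequence.
    """
    return sum(euclidean(op1, op5) for op1, op5 in trial) / len(trial)


def online_first_to_last(trial):
    """Distance from the first IndexOP1 to the last IndexOP5 within one trial."""
    return euclidean(trial[0][0], trial[-1][1])


def offline_across_rest(previous_trial, current_trial):
    """Distance from the last IndexOP5 before rest to the first IndexOP1 after it."""
    return euclidean(previous_trial[-1][1], current_trial[0][0])


# Toy patterns (hypothetical two-feature vectors).
trial_a = [([0.0, 0.0], [3.0, 4.0]), ([1.0, 1.0], [1.0, 1.0])]
trial_b = [([4.0, 5.0], [0.0, 0.0])]

print(online_within_sequence(trial_a))
print(online_first_to_last(trial_a))
print(offline_across_rest(trial_a, trial_b))
```

      The second and third measures each compare two single keypresses separated by a span of time, which is what makes them directly comparable; the first averages over many closely spaced pairs.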

      Author response image 6.

      Relationship between online contextualization and online learning is shown for both within-sequence (left; note that this is the online contextualization measure used in the reviewed manuscript) and across-sequence (right) distance calculations. There was no significant relationship between online learning and online contextualization regardless of the measurement approach.

      A related concern regarding the control analysis, where individual values for max speed and the degree of online contextualization were compared (Figure 5 Supplement 3), is whether the individual difference is meaningful. If I understood correctly, the optimization of the decoding process (temporal window, feature inclusion/reduction, decoder, etc.) was performed for individual participants, and the same feature extraction was also employed for the analysis of representation distance (i.e., contextualization). If this is the case, the distances are individually differently calculated and they may need to be normalized relative to some stable reference (e.g., 1 vs. 4 or average distance within the control sequence presses) before comparison across the individuals. 

      The Reviewer makes a good point here. We have now implemented the suggested normalization procedure in the analysis provided in the revised manuscript.

      Reviewer #3 (Public review): 

      Summary: 

      One goal of this paper is to introduce a new approach for highly accurate decoding of finger movements from human magnetoencephalography data via dimension reduction of a "multi-scale, hybrid" feature space. Following this decoding approach, the authors aim to show that early skill learning involves "contextualization" of the neural coding of individual movements, relative to their position in a sequence of consecutive movements. Furthermore, they aim to show that this "contextualization" develops primarily during short rest periods interspersed with skill training and correlates with a performance metric which the authors interpret as an indicator of offline learning.

      Strengths:

      A clear strength of the paper is the innovative decoding approach, which achieves impressive decoding accuracies via dimension reduction of a "multi-scale, hybrid space". This hybrid-space approach follows the neurobiologically plausible idea of the concurrent distribution of neural coding across local circuits as well as large-scale networks. A further strength of the study is the large number of tested dimension reduction techniques and classifiers (though the manuscript reveals little about the comparison of the latter). 

      We appreciate the Reviewer’s comments regarding the paper’s strengths.

      A simple control analysis based on shuffled class labels could lend further support to this complex decoding approach. As a control analysis that completely rules out any source of overfitting, the authors could test the decoder after shuffling class labels. Following such shuffling, decoding accuracies should drop to chance level for all decoding approaches, including the optimized decoder. This would also provide an estimate of actual chance-level performance (which is informative over and beyond the theoretical chance level). Furthermore, currently, the manuscript does not explain the huge drop in decoding accuracies for the voxel-space decoding (Figure 3B). Finally, the authors' approach to cortical parcellation raises questions regarding the information carried by varying dipole orientations within a parcel (which currently seems to be ignored?) and the implementation of the mean-flipping method (given that there are two dimensions - space and time - what do the authors refer to when they talk about the sign of the "average source", line 477?). 

      The Reviewer recommends that we: 1) conduct an additional control analysis on classifier performance using shuffled class labels, 2) provide a more detailed explanation regarding the drop in decoding accuracies for the voxel-space decoding following LDA dimensionality reduction (see Fig 3B), and 3) provide additional details on how problems related to dipole solution orientations were addressed in the present study.  

      In relation to the first point, we have now implemented a random shuffling approach as a control for the classification analyses. The results of this analysis indicated that the chance level accuracy was 22.12% (± SD 9.1%) for individual keypress decoding (4-class classification), and 18.41% (± SD 7.4%) for individual sequence item decoding (5-class classification), irrespective of the input feature set or the type of decoder used. Thus, the decoding accuracy observed with the final model was substantially higher than these chance levels.  
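      For illustration, a shuffled-label control of this kind can be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the toy data, the nearest-centroid decoder, the 50/50 hold-out split, and all function names (`make_toy_data`, `holdout_accuracy`, `shuffled_label_chance`) are hypothetical stand-ins chosen to keep the example self-contained.

```python
import random
from statistics import mean, pstdev


def make_toy_data(n_per_class=50, seed=0):
    """Hypothetical stand-in data: 4 well-separated 2-D clusters, one per class."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0), (5.0, 5.0)]
    features, labels = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            features.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            labels.append(label)
    return features, labels


def nearest_centroid_accuracy(train, test):
    """Fit class centroids on `train`; return the fraction of `test` classified correctly."""
    grouped = {}
    for x, y in train:
        grouped.setdefault(y, []).append(x)
    centroids = {y: [mean(col) for col in zip(*xs)] for y, xs in grouped.items()}

    def predict(x):
        # Assign the class whose centroid has the smallest squared distance.
        return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))

    return mean(1.0 if predict(x) == y else 0.0 for x, y in test)


def holdout_accuracy(features, labels, rng):
    """Random 50/50 train/test split followed by nearest-centroid decoding."""
    data = list(zip(features, labels))
    rng.shuffle(data)
    half = len(data) // 2
    return nearest_centroid_accuracy(data[:half], data[half:])


def shuffled_label_chance(features, labels, n_shuffles=200, seed=1):
    """Empirical chance level: mean and SD of accuracy after permuting the labels."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_shuffles):
        permuted = list(labels)
        rng.shuffle(permuted)  # break the feature-label association
        accuracies.append(holdout_accuracy(features, permuted, rng))
    return mean(accuracies), pstdev(accuracies)
```

      Comparing `holdout_accuracy` on the true labels against the mean and SD returned by `shuffled_label_chance` gives an empirical chance level and its spread, which can differ from the theoretical 1/number-of-classes value.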

      Second, please note that the dimensionality of the voxel-space feature set is very high (i.e. – 15684). LDA attempts to map the input features onto a much smaller dimensional space (number of classes-1; e.g. –  3 dimensions, for 4-class keypress decoding). Given the very high dimension of the voxel-space input features in this case, the resulting mapping exhibits reduced accuracy. Despite this general consideration, please refer to Figure 3—figure supplement 3, where we observe improvement in voxel-space decoder performance when utilizing alternative dimensionality reduction techniques.
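      For reference, the limit of "number of classes-1" dimensions follows from the rank of the between-class scatter matrix in standard LDA. The notation below (C classes, n_c samples in class c, class means \mu_c, overall sample mean \mu) is introduced here for illustration and does not appear in the manuscript.

```latex
S_B = \sum_{c=1}^{C} n_c\,(\mu_c - \mu)(\mu_c - \mu)^{\top},
\qquad
\sum_{c=1}^{C} n_c\,(\mu_c - \mu) = 0
\;\Longrightarrow\;
\operatorname{rank}(S_B) \le C - 1 .
```

      With C = 4 keypress classes, at most 3 discriminant axes are available, regardless of whether the input has 15684 features or far fewer.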

      The decoders constructed in the present study assess the average spatial patterns across time (as defined by the windowing procedure) in the input feature space.  We now provide additional details in the Methods of the revised manuscript pertaining to the parcellation procedure and how the sign ambiguity problem was addressed in our analysis.

      Weaknesses: 

      A clear weakness of the paper lies in the authors' conclusions regarding "contextualization". Several potential confounds, described below, question the neurobiological implications proposed by the authors and provide a simpler explanation of the results. Furthermore, the paper follows the assumption that short breaks result in offline skill learning, while recent evidence, described below, casts doubt on this assumption. 

      We thank the Reviewer for giving us the opportunity to address these issues in detail (see below).

      The authors interpret the ordinal position information captured by their decoding approach as a reflection of neural coding dedicated to the local context of a movement (Figure 4). One way to dissociate ordinal position information from information about the moving effectors is to train a classifier on one sequence and test the classifier on other sequences that require the same movements, but in different positions50. In the present study, however, participants trained to repeat a single sequence (4-1-3-2-4). As a result, ordinal position information is potentially confounded by the fixed finger transitions around each of the two critical positions (first and fifth press). Across consecutive correct sequences, the first keypress in a given sequence was always preceded by a movement of the index finger (=last movement of the preceding sequence), and followed by a little finger movement. The last keypress, on the other hand, was always preceded by a ring finger movement, and followed by an index finger movement (=first movement of the next sequence). Figure 4 - Supplement 2 shows that finger identity can be decoded with high accuracy (>70%) across a large time window around the time of the key press, up to at least +/-100 ms (and likely beyond, given that decoding accuracy is still high at the boundaries of the window depicted in that figure). This time window approaches the keypress transition times in this study. Given that distinct finger transitions characterized the first and fifth keypress, the classifier could thus rely on persistent (or "lingering") information from the preceding finger movement, and/or "preparatory" information about the subsequent finger movement, in order to dissociate the first and fifth keypress. 
Currently, the manuscript provides no evidence that the context information captured by the decoding approach is more than a by-product of temporally extended, and therefore overlapping, but independent neural representations of consecutive keypresses that are executed in close temporal proximity - rather than a neural representation dedicated to context. 

      Such temporal overlap of consecutive, independent finger representations may also account for the dynamics of "ordinal coding"/"contextualization", i.e., the increase in 2-class decoding accuracy, across Day 1 (Figure 4C). As learning progresses, both tapping speed and the consistency of keypress transition times increase (Figure 1), i.e., consecutive keypresses are closer in time, and more consistently so. As a result, information related to a given keypress is increasingly overlapping in time with information related to the preceding and subsequent keypresses. The authors seem to argue that their regression analysis in Figure 5 - Figure Supplement 3 speaks against any influence of tapping speed on "ordinal coding" (even though that argument is not made explicitly in the manuscript). However, Figure 5 - Figure Supplement 3 shows inter-individual differences in a between-subject analysis (across trials, as in panel A, or separately for each trial, as in panel B), and, therefore, says little about the within-subject dynamics of "ordinal coding" across the experiment. A regression of trial-by-trial "ordinal coding" on trial-by-trial tapping speed (either within-subject or at a group-level, after averaging across subjects) could address this issue. Given the highly similar dynamics of "ordinal coding" on the one hand (Figure 4C), and tapping speed on the other hand (Figure 1B), I would expect a strong relationship between the two in the suggested within-subject (or group-level) regression. Furthermore, learning should increase the number of (consecutively) correct sequences, and, thus, the consistency of finger transitions. 
Therefore, the increase in 2-class decoding accuracy may simply reflect an increasing overlap in time of increasingly consistent information from consecutive keypresses, which allows the classifier to dissociate the first and fifth keypress more reliably as learning progresses, simply based on the characteristic finger transitions associated with each. In other words, given that the physical context of a given keypress changes as learning progresses - keypresses move closer together in time and are more consistently correct - it seems problematic to conclude that the mental representation of that context changes. To draw that conclusion, the physical context should remain stable (or any changes to the physical context should be controlled for). 

      The issues raised by Reviewer #3 here are similar to two issues raised by Reviewer #2 above, and we agree they must both be carefully considered in any evaluation of our findings.

      As both Reviewers pointed out, the classifiers in this study were trained and tested on keypresses performed while practicing a specific sequence (4-1-3-2-4). The study was designed this way to avoid the impact of interference effects on learning dynamics. The cross-validated performance of classifiers on MEG data collected within the same session was 90.47% overall accuracy (4-class; Figure 3C). We then tested classifier performance on data collected during a separate MEG session conducted approximately 24 hours later (Day 2; see Figure 3—supplement 3). We observed a reduction in overall accuracy rate to 87.11% when tested on MEG data recorded while participants performed the same learned sequence, and to 79.44% when they performed several previously unpracticed sequences. This classification performance difference of 7.67% when tested on the Day 2 data could reflect the performance bias of the classifier for the trained sequence, possibly caused by mixed information from temporally close keypresses being incorporated into the feature weights.

      Along these same lines, both Reviewers also raise the possibility that an increase in “ordinal coding/contextualization” with learning could simply reflect an increase in this mixing effect caused by faster typing speeds as opposed to an actual change in the underlying neural representation. The basic idea is that as correct sequences are generated at higher and higher speeds over training, MEG activity patterns related to the planning, execution, evaluation and memory of individual keypresses overlap more in time. Thus, increased overlap between the “4” and “1” keypresses (at the start of the sequence) and “2” and “4” keypresses (at the end of the sequence) could artefactually increase contextualization distances even if the underlying neural representations for the individual keypresses remain unchanged (assuming this mixing of representations is used by the classifier to differentially tag each index finger press). If this were the case, it follows that such mixing effects reflecting the ordinal sequence structure would also be observable in the distribution of decoder misclassifications. For example, “4” keypresses would be more likely to be misclassified as “1” or “2” keypresses (or vice versa) than as “3” keypresses. The confusion matrices presented in Figures 3C and 4B and Figure 3—figure supplement 3A in the previously submitted manuscript do not show this trend in the distribution of misclassifications across the four fingers.

      Following this logic, it’s also possible that if the ordinal coding is largely driven by this mixing effect, the increased overlap between consecutive index finger keypresses during the 4-4 transition marking the end of one sequence and the beginning of the next one could actually mask contextualization-related changes to the underlying neural representations and make them harder to detect. In this case, a decoder tasked with separating individual index finger keypresses into two distinct classes based upon sequence position might show decreased performance with learning as adjacent keypresses overlapped in time with each other to an increasing extent. However, Figure 4C in our previously submitted manuscript does not support this possibility, as the 2-class hybrid classifier displays improved classification performance over early practice trials despite greater temporal overlap.

      As noted in the above reply to Reviewer #2, we also conducted a new multivariate regression analysis to directly assess whether the neural representation distance score could be predicted by the 4-1, 2-4 and 4-4 keypress transition times observed for each complete correct sequence (both predictor and response variables were z-score normalized within-subject). The results of this analysis affirmed that the possible alternative explanation put forward by the Reviewer is not supported by our data (Adjusted R² = 0.00431; F = 5.62). We now include this new negative control analysis result in the revised manuscript.
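      To make the reported statistic concrete, the sketch below shows one way to z-score variables within subject and compute an adjusted R² for an ordinary least-squares fit with several predictors. It is a minimal illustration assuming plain OLS with an intercept; the function names and the pure-Python solver are hypothetical and are not taken from the analysis code used in the study.

```python
from statistics import mean, pstdev


def zscore_within_subject(values, subject_ids):
    """Z-score each value relative to the mean and SD of its own subject."""
    by_subject = {}
    for v, s in zip(values, subject_ids):
        by_subject.setdefault(s, []).append(v)
    stats = {s: (mean(vs), pstdev(vs)) for s, vs in by_subject.items()}
    return [(v - stats[s][0]) / stats[s][1] for v, s in zip(values, subject_ids)]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def adjusted_r2(predictors, response):
    """OLS with intercept via the normal equations; returns (R^2, adjusted R^2)."""
    n, p = len(predictors), len(predictors[0])
    rows = [[1.0] + list(r) for r in predictors]  # prepend intercept column
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = mean(response)
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in response)
    r2 = 1.0 - ss_res / ss_tot
    # Penalize R^2 for the number of predictors p given n observations.
    return r2, 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
```

      In this framing, the three predictors would be the 4-1, 2-4 and 4-4 transition times and the response would be the distance score, each passed through `zscore_within_subject` first. An adjusted R² near zero indicates that the predictors explain almost none of the response variance.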

      Finally, the Reviewer hints that one way to address this issue would be to compare MEG responses before and after learning for sequences typed at a fixed speed. However, given that the speed-accuracy trade-off should improve with learning, a comparison between unlearned and learned skill states would dictate that the skill be evaluated at a very low fixed speed. Essentially, such a design presents the problem that the post-training test is evaluating the representation in the unlearned behavioral state that is not representative of the acquired skill. Thus, this approach would not address our experimental question: “do neural representations of the same action performed at different locations within a skill sequence contextually differentiate or remain stable as learning evolves”.

      A similar difference in physical context may explain why neural representation distances ("differentiation") differ between rest and practice (Figure 5). The authors define "offline differentiation" by comparing the hybrid space features of the last index finger movement of a trial (ordinal position 5) and the first index finger movement of the next trial (ordinal position 1). However, the latter is not only the first movement in the sequence but also the very first movement in that trial (at least in trials that started with a correct sequence), i.e., not preceded by any recent movement. In contrast, the last index finger of the last correct sequence in the preceding trial includes the characteristic finger transition from the fourth to the fifth movement. Thus, there is more overlapping information arising from the consistent, neighbouring keypresses for the last index finger movement, compared to the first index finger movement of the next trial. A strong difference (larger neural representation distance) between these two movements is, therefore, not surprising, given the task design, and this difference is also expected to increase with learning, given the increase in tapping speed, and the consequent stronger overlap in representations for consecutive keypresses. Furthermore, initiating a new sequence involves pre-planning, while ongoing practice relies on online planning (Ariani et al., eNeuro 2021), i.e., two mental operations that are dissociable at the level of neural representation (Ariani et al., bioRxiv 2023). 

      The Reviewer argues that the last finger movement of a trial and the first finger movement of the next trial are performed in different circumstances and contexts. This is an important point and one we tend to agree with. For this task, the first sequence in a practice trial (which is pre-planned offline) is performed in a somewhat different context from the sequence iterations that follow, which involve temporally overlapping planning, execution and evaluation processes.  The Reviewer is particularly concerned about a difference in the temporal mixing effect issue raised above between the first and last keypresses performed in a trial. However, in contrast to the Reviewer’s stated argument above, findings from Kornysheva et al. (2019) showed that neural representations of individual actions are competitively queued during the pre-planning period in a manner that reflects the ordinal structure of the learned sequence.  Thus, mixing effects are likely still present for the first keypress in a trial. Also note that we now present new control analyses in multiple responses above confirming that hypothetical mixing effects between adjacent keypresses do not explain our reported contextualization finding. A statement addressing these possibilities raised by the Reviewer has been added to the Discussion in the revised manuscript.

      In relation to pre-planning, ongoing MEG work in our lab is investigating contextualization within different time windows tailored specifically for assessing how sequence skill action planning evolves with learning.

      Given these differences in the physical context and associated mental processes, it is not surprising that "offline differentiation", as defined here, is more pronounced than "online differentiation". For the latter, the authors compared movements that were better matched regarding the presence of consistent preceding and subsequent keypresses (online differentiation was defined as the mean difference between all first vs. last index finger movements during practice).  It is unclear why the authors did not follow a similar definition for "online differentiation" as for "micro-online gains" (and, indeed, a definition that is more consistent with their definition of "offline differentiation"), i.e., the difference between the first index finger movement of the first correct sequence during practice, and the last index finger of the last correct sequence. While these two movements are, again, not matched for the presence of neighbouring keypresses (see the argument above), this mismatch would at least be the same across "offline differentiation" and "online differentiation", so they would be more comparable. 

      This is the same point made earlier by Reviewer #2, and we agree with this assessment. As stated in the response to Reviewer #2 above, we have now carried out quantification of online contextualization using this approach and included it in the revised manuscript. We thank the Reviewer for this suggestion.

      A further complication in interpreting the results regarding "contextualization" stems from the visual feedback that participants received during the task. Each keypress generated an asterisk shown above the string on the screen, irrespective of whether the keypress was correct or incorrect. As a result, incorrect (e.g., additional, or missing) keypresses could shift the phase of the visual feedback string (of asterisks) relative to the ordinal position of the current movement in the sequence (e.g., the fifth movement in the sequence could coincide with the presentation of any asterisk in the string, from the first to the fifth). Given that more incorrect keypresses are expected at the start of the experiment, compared to later stages, the consistency in visual feedback position, relative to the ordinal position of the movement in the sequence, increased across the experiment. A better differentiation between the first and the fifth movement with learning could, therefore, simply reflect better decoding of the more consistent visual feedback, based either on the feedback-induced brain response, or feedback-induced eye movements (the study did not include eye tracking). It is not clear why the authors introduced this complicated visual feedback in their task, besides consistency with their previous studies.

      We strongly agree with the Reviewer that eye movements related to task engagement are important to rule out as a potential driver of the decoding accuracy or contextualization effect. We address this issue above in response to a question raised by Reviewer #1 about the impact of movement related artefacts in general on our findings.

      First, the assumption the Reviewer makes here about the distribution of errors in this task is incorrect. On average across subjects, 2.32% ± 1.48% (mean ± SD) of all keypresses performed were errors, which were evenly distributed across the four possible keypress responses. While errors increased progressively over practice trials, they did so in proportion to the increase in correct keypresses, so that the overall ratio of correct-to-incorrect keypresses remained stable over the training session. Thus, the Reviewer’s assumptions that there is a higher relative frequency of errors in early trials, and a resulting systematic trend in phase shifts between the visual display updates (i.e. – a change in asterisk position above the displayed sequence) and the keypress performed, are not substantiated by the data. To the contrary, the asterisk position on the display and the keypress being executed remained highly correlated over the entire training session. We now include a statement about the frequency and distribution of errors in the revised manuscript.

      Given this high correlation, we firmly agree with the Reviewer that the issue of eye movement-related artefacts is still an important one to address. Fortunately, we did collect eye movement data during the MEG recordings so we were able to investigate this. As detailed in the response to Reviewer #1 above, we found that gaze positions and eye-movement velocity time-locked to visual display updates (i.e. – a change in asterisk position above the displayed sequence) did not reflect the asterisk location above chance levels (Overall cross-validated accuracy = 0.21817; see Author response image 1). Furthermore, an inspection of the eye position data revealed that a majority of participants on most trials displayed random walk gaze patterns around a center fixation point, indicating that participants did not attend to the asterisk position on the display. This is consistent with intrinsic generation of the action sequence, and congruent with the fact that the display does not provide explicit feedback related to performance. As pointed out above, a similar real-world example would be manually inputting a long password into a secure online application. In this case, one intrinsically generates the sequence from memory and receives similar feedback about the password sequence position (also provided as asterisks), which is typically ignored by the user. Notably, the minimal participant engagement with the visual task display observed in this study highlights an important difference between behavior observed during explicit sequence learning motor tasks (which is highly generative in nature) and reactive responses to stimulus cues in a serial reaction time task (SRTT).  This is a crucial difference that must be carefully considered when comparing findings across studies. All elements pertaining to this new control analysis are now included in the revised manuscript.

      The authors report a significant correlation between "offline differentiation" and cumulative micro-offline gains. However, it would be more informative to correlate trial-by-trial changes in each of the two variables. This would address the question of whether there is a trial-by-trial relation between the degree of "contextualization" and the amount of micro-offline gains - are performance changes (micro-offline gains) less pronounced across rest periods for which the change in "contextualization" is relatively low? Furthermore, is the relationship between micro-offline gains and "offline differentiation" significantly stronger than the relationship between micro-offline gains and "online differentiation"? 

      In response to a similar issue raised above by Reviewer #2, we now include new analyses comparing correlation magnitudes between (1) “online differentiation” vs micro-online gains, (2) “online differentiation” vs micro-offline gains and (3) “offline differentiation” vs micro-offline gains (see Author response images 4, 5 and 6 above). These new analyses and results have been added to the revised manuscript. Once again, we thank both Reviewers for this suggestion.

      The authors follow the assumption that micro-offline gains reflect offline learning.

      This statement is incorrect. The original Bönstrup et al. (2019) 49 paper clearly states that micro-offline gains must be carefully interpreted based upon the behavioral context within which they are observed, and lays out the conditions under which one can have confidence that micro-offline gains reflect offline learning.  In fact, the excellent meta-analysis of Pan & Rickard (2015) 51, which re-interprets the benefits of sleep in overnight skill consolidation from a “reactive inhibition” perspective, was a crucial resource in the experimental design of our initial study49, as well as in all our subsequent work. Pan & Rickard stated:

      “Empirically, reactive inhibition refers to performance worsening that can accumulate during a period of continuous training (Hull, 1943). It tends to dissipate, at least in part, when brief breaks are inserted between blocks of training. If there are multiple performance-break cycles over a training session, as in the motor sequence literature, performance can exhibit a scalloped effect, worsening during each uninterrupted performance block but improving across blocks52,53. Rickard, Cai, Rieth, Jones, and Ard (2008) and Brawn, Fenn, Nusbaum, and Margoliash (2010) 52,53 demonstrated highly robust scalloped reactive inhibition effects using the commonly employed 30 s–30 s performance break cycle, as shown for Rickard et al.’s (2008) massed practice sleep group in Figure 2. The scalloped effect is evident for that group after the first few 30 s blocks of each session. The absence of the scalloped effect during the first few blocks of training in the massed group suggests that rapid learning during that period masks any reactive inhibition effect.”

      Crucially, Pan & Rickard51 made several concrete recommendations for reducing the impact of the reactive inhibition confound on offline learning studies. One of these recommendations was to reduce practice times to 10s (most prior sequence learning studies up until that point had employed 30s long practice trials). They stated:

      “The traditional design involving 30 s-30 s performance break cycles should be abandoned given the evidence that it results in a reactive inhibition confound, and alternative designs with reduced performance duration per block used instead 51. One promising possibility is to switch to 10 s performance durations for each performance-break cycle Instead 51. That design appears sufficient to eliminate at least the majority of the reactive inhibition effect 52,53.”

      We mindfully incorporated recommendations from Pan and Rickard51  into our own study designs including 1) utilizing 10s practice trials and 2) constraining our analysis of micro-offline gains to early learning trials (where performance monotonically increases and 95% of overall performance gains occur), which are prior to the emergence of the “scalloped” performance dynamics that are strongly linked to reactive inhibition effects. 
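      As a concrete illustration of the quantities discussed here, the sketch below computes micro-online and micro-offline changes from per-trial tapping speeds. The definitions used are an assumption based on the usual description in this literature (micro-online: change from the start to the end of a practice trial; micro-offline: change from the end of one trial to the start of the next) and are not spelled out in this response. The function names and the `n_early_trials` parameter are hypothetical.

```python
def micro_gains(trial_speeds):
    """Per-trial micro-online and between-trial micro-offline changes.

    trial_speeds: list of trials, each a list of tapping-speed estimates
    (e.g. keypresses/s) in temporal order within that trial.
    """
    # Within-trial change: last speed estimate minus first speed estimate.
    micro_online = [trial[-1] - trial[0] for trial in trial_speeds]
    # Across-rest change: first estimate of the next trial minus last of this one.
    micro_offline = [nxt[0] - cur[-1] for cur, nxt in zip(trial_speeds, trial_speeds[1:])]
    return micro_online, micro_offline


def cumulative_early_gains(trial_speeds, n_early_trials):
    """Sum micro-online and micro-offline changes over the first n_early_trials trials."""
    online, offline = micro_gains(trial_speeds[:n_early_trials])
    return sum(online), sum(offline)
```

      By construction, the summed micro-online and micro-offline changes over a set of trials equal the total change from the first speed estimate of the first trial to the last estimate of the final trial, so the two terms partition the overall learning over that window.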

      However, there is no direct evidence in the literature that micro-offline gains really result from offline learning, i.e., an improvement in skill level.

      We strongly disagree with the Reviewer’s assertion that “there is no direct evidence in the literature that micro-offline gains really result from offline learning, i.e., an improvement in skill level.”  The initial Bönstrup et al. (2019) 49 report was followed up by a large online crowd-sourcing study (Bönstrup et al., 2020) 54. This second (and much larger) study provided several additional important findings supporting our interpretation of micro-offline gains in cases where the important behavioral conditions clarified above were met (see Author response image 7 below for further details on these conditions).

      Author response image 7.

      Micro-offline gains observed in learning and non-learning contexts are attributed to different underlying causes. (A) Micro-offline and online changes relative to overall trial-by-trial learning. This figure is based on data from Bönstrup et al. (2019) 49. During early learning, micro-offline gains (red bars) closely track trial-by-trial performance gains (green line with open circle markers), with minimal contribution from micro-online gains (blue bars). The stated conclusion in Bönstrup et al. (2019) is that micro-offline gains only during this Early Learning stage reflect rapid memory consolidation (see also 54). After early learning, about practice trial 11, skill plateaus. This plateau skill period is characterized by a striking emergence of coupled (and relatively stable) micro-online drops and micro-offline increases. Bönstrup et al. (2019) as well as others in the literature 55-57, argue that micro-offline gains during the plateau period likely reflect recovery from inhibitory performance factors such as reactive inhibition or fatigue, and thus must be excluded from analyses relating micro-offline gains to skill learning.  The Non-repeating groups in Experiments 3 and 4 from Das et al. (2024) suffer from a lack of consideration of these known confounds.

      Evidence documented in that paper54 showed that micro-offline gains during early skill learning were: 1) replicable and generalized to subjects learning the task in their daily living environment (n=389); 2) equivalent when significantly shortening practice period duration, thus confirming that they are not a result of recovery from performance fatigue (n=118);  3) reduced (along with learning rates) by retroactive interference applied immediately after each practice period relative to interference applied after passage of time (n=373), indicating stabilization of the motor memory at a microscale of several seconds consistent with rapid consolidation; and 4) not modified by random termination of the practice periods, ruling out a contribution of predictive motor slowing (N = 71) 54.  Altogether, our findings were strongly consistent with the interpretation that micro-offline gains reflect memory consolidation supporting early skill learning. This is precisely the portion of the learning curve Pan and Rickard51 refer to when they state “…rapid learning during that period masks any reactive inhibition effect”.

      This interpretation is further supported by brain imaging evidence linking known memory-related networks and consolidation mechanisms to micro-offline gains. First, we reported that the density of fast hippocampo-neocortical skill memory replay events increases approximately three-fold during early learning inter-practice rest periods with the density explaining differences in the magnitude of micro-offline gains across subjects1. Second, Jacobacci et al. (2020) independently reproduced our original behavioral findings and reported BOLD fMRI changes in the hippocampus and precuneus (regions also identified in our MEG study1) linked to micro-offline gains during early skill learning33. These functional changes were coupled with rapid alterations in brain microstructure in the order of minutes, suggesting that the same network that operates during rest periods of early learning undergoes structural plasticity over several minutes following practice58. Third, even more recently, Chen et al. (2024) provided direct evidence from intracranial EEG in humans linking sharp-wave ripple events (which are known markers for neural replay59) in the hippocampus (80-120 Hz in humans) with micro-offline gains during early skill learning. The authors report that the strong increase in ripple rates tracked learning behavior, both across blocks and across participants. The authors conclude that hippocampal ripples during resting offline periods contribute to motor sequence learning2.

      Thus, there is actually now substantial evidence in the literature directly supporting the assertion “that micro-offline gains really result from offline learning”.  On the contrary, according to Gupta & Rickard (2024) “…the mechanism underlying RI [reactive inhibition] is not well established” after over 80 years of investigation60, possibly due to the fact that “reactive inhibition” is a categorical description of behavioral effects that likely result from several heterogenous processes with very different underlying mechanisms.

      On the contrary, recent evidence questions this interpretation (Gupta & Rickard, npj Sci Learn 2022; Gupta & Rickard, Sci Rep 2024; Das et al., bioRxiv 2024). Instead, there is evidence that micro-offline gains are transient performance benefits that emerge when participants train with breaks, compared to participants who train without breaks, however, these benefits vanish within seconds after training if both groups of participants perform under comparable conditions (Das et al., bioRxiv 2024). 

      It is important to point out that the recent work of Gupta & Rickard (2022, 2024) 55 does not present any data that directly opposes our finding that early skill learning49 is expressed as micro-offline gains during rest breaks. These studies are essentially an extension of the Rickard et al (2008) paper that employed a massed (30s practice followed by 30s breaks) vs spaced (10s practice followed by 10s breaks) design to assess if recovery from reactive inhibition effects could account for performance gains measured after several minutes or hours. Gupta & Rickard (2022) added two additional groups (30s practice/10s break and 10s practice/10s break as used in the work from our group). The primary aim of the study was to assess whether it was more likely that changes in performance when retested 5 minutes after skill training (consisting of 12 practice trials for the massed groups and 36 practice trials for the spaced groups) had ended reflected memory consolidation effects or recovery from reactive inhibition effects. The Gupta & Rickard (2024) follow-up paper employed a similar design with the primary difference being that participants performed a fixed number of sequences on each trial as opposed to trials lasting a fixed duration. This was done to facilitate the fitting of a quantitative statistical model to the data.  To reiterate, neither study included any analysis of micro-online or micro-offline gains, and neither included any comparison focused on skill gains during early learning. Instead, Gupta & Rickard (2022) reported evidence for reactive inhibition effects for all groups over much longer training periods. Again, we reported the same finding for trials following the early learning period in our original Bönstrup et al. (2019) paper49 (Author response image 7). 
Also, please note that we reported in this paper that cumulative micro-offline gains over early learning did not correlate with overnight offline consolidation measured 24 hours later (49) (see the Results section and further elaboration in the Discussion). Thus, while the composition of our data is supportive of a short-term memory consolidation process operating over several seconds during early learning, it likely differs from those involved over longer training times and offline periods, as assessed by Gupta & Rickard (2022).

      In the recent preprint from Das et al. (2024) (61), the authors make the strong claim that “micro-offline gains during early learning do not reflect offline learning”, which is not supported by their own data. The authors hypothesize that if “micro-offline gains represent offline learning, participants should reach higher skill levels when training with breaks, compared to training without breaks”. The study utilizes a spaced vs. massed practice group between-subjects design inspired by the reactive inhibition work from Rickard and others to test this hypothesis. Crucially, the design incorporates only a small fraction of the training used in other investigations to evaluate early skill learning (1, 33, 49, 54, 57, 58, 62). A direct comparison between the practice schedule designs for the spaced and massed groups in Das et al., and the training schedule all participants experienced in the original Bönstrup et al. (2019) paper highlights this issue as well as several others (Author response image 8):

      Author response image 8.

      (A) Comparison of Das et al. Spaced & Massed group training session designs, and the training session design from the original Bönstrup et al. (2019) (49) paper. Similar to the approach taken by Das et al., all practice is visualized as 10-second practice trials with a variable number (either 0, 1 or 30) of 10-second-long inter-practice rest intervals to allow for direct comparisons between designs. The two key takeaways from this comparison are that (1) the intervention differences (i.e., practice schedules) between the Massed and Spaced groups from the Das et al. report are extremely small (less than 12% of the overall session schedule) and (2) the overall amount of practice is much less than in the design from the original Bönstrup report (49) (which has been utilized in several subsequent studies). (B) Group-level learning curve data from Bönstrup et al. (2019) (49) is used to estimate the performance range accounted for by the equivalent periods covering Test 1, Training 1 and Test 2 from Das et al. (2024). Note that the intervention in the Das et al. study is limited to a period covering less than 50% of the overall learning range.

      First, participants in the original Bönstrup et al. study (49) experienced 157.14% more practice time and 46.97% less inter-practice rest time than the Spaced group in the Das et al. study (Author response image 8). Thus, the overall amounts of practice and rest differ substantially between studies, with much more limited training occurring for participants in Das et al.
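
      The percentages above follow from a simple relative-difference calculation, sketched below. This is illustrative only: the totals used (360 s vs. 140 s of practice, 350 s vs. 660 s of inter-practice rest) are assumed values chosen because they reproduce the quoted figures. They are not stated in the text and should be checked against Author response image 8. The function name `percent_difference` is hypothetical.

```python
def percent_difference(value: float, reference: float) -> float:
    """Relative difference of `value` with respect to `reference`, in percent."""
    return 100.0 * (value - reference) / reference


# Assumed totals in seconds (hypothetical; not stated in the text).
bonstrup_practice_s, das_spaced_practice_s = 360.0, 140.0
bonstrup_rest_s, das_spaced_rest_s = 350.0, 660.0

practice_diff = percent_difference(bonstrup_practice_s, das_spaced_practice_s)
rest_diff = percent_difference(bonstrup_rest_s, das_spaced_rest_s)

print(f"practice: {practice_diff:+.2f}%")  # practice: +157.14%
print(f"rest: {rest_diff:+.2f}%")          # rest: -46.97%
```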

      Second, and perhaps most importantly, the actual intervention (i.e., the difference in practice schedule between the Spaced and Massed groups) employed by Das et al. covers a very small fraction of the overall training session. Identical practice schedule segments for both the Spaced & Massed groups are indicated by the red shaded area in Author response image 8. Please note that these identical segments cover 94.84% of the Massed group training schedule and 88.01% of the Spaced group training schedule (since it has 60 seconds of additional rest). This means that the actual interventions cover less than 5% (for Massed) and 12% (for Spaced) of the total training session, which minimizes any chance of observing a difference between groups.

      Also note that the very beginning of the practice schedule (during which Figure R9 shows substantial learning is known to occur) is labeled in the Das et al. study as Test 1.  Test 1 encompasses the first 20 seconds of practice (alternatively viewed as the first two 10-second-long practice trials with no inter-practice rest). This is immediately followed by the Training 1 intervention, which is composed of only three 10-second-long practice trials (with 10-second inter-practice rest for the Spaced group and no inter-practice rest for the Massed group). Author response image 8 also shows that since there is no inter-practice rest after the third Training practice trial for the Spaced group, this third trial (for both Training 1 and 2) is actually a part of an identical practice schedule segment shared by both groups (Massed and Spaced), reducing the magnitude of the intervention even further.

      Moreover, we know from the original Bönstrup et al. (2019) paper (49) that 46.57% of all overall group-level performance gains occurred between trials 2 and 5 for that study. Thus, Das et al. are limiting their designed intervention to a period covering less than half of the early learning range discussed in the literature, which, again, minimizes any chance of observing an effect.

      This issue is amplified even further at Training 2 since skill learning prior to the long 5-minute break is retained, further constraining the performance range over these three trials. A related issue pertains to the trials labeled as Test 1 (trials 1-2) and Test 2 (trials 6-7) by Das et al. Again, we know from the original Bönstrup et al. paper (49) that 18.06% and 14.43% (32.49% total) of all overall group-level performance gains occurred during trials corresponding to Das et al. Test 1 and Test 2, respectively. In other words, Das et al. averaged skill performance over 20 seconds of practice at two time-points where dramatic skill improvements occur. Pan & Rickard (2015) (51) previously showed that such averaging can inject artefacts into analyses of performance gains.

      Furthermore, the structure of the Test in the Das et al. study appears to have an interference effect on the Spaced group performance after the training intervention. This makes sense if you consider that the Spaced group is required to now perform the task in a Massed practice environment (i.e., two 10-second-long practice trials merged into one long trial), further blurring the true intervention effects. This effect is observable in Figure 1C,E of their preprint. Specifically, while the Massed group continues to show an increase in performance during test relative to the last 10 seconds of practice during training, the Spaced group displays a marked decrease. This decrease is in stark contrast to the monotonic increases observed for both groups at all other time-points.

      Interestingly, when statistical comparisons between the groups are made at the time-points when the intervention is present (as opposed to after it has been removed) then the stated hypothesis, “If micro-offline gains represent offline learning, participants should reach higher skill levels when training with breaks, compared to training without breaks”, is confirmed.

      The data presented by Gupta and Rickard (2022, 2024) and Das et al. (2024) are in many ways more confirmatory of the constraints employed by our group and others with respect to experimental design, analysis and interpretation of study findings, rather than contradictory. Still, they do highlight a limitation of the current micro-online/offline framework, which was originally only intended to be applied to early skill learning over spaced practice schedules when reactive inhibition effects are minimized (49). Extrapolation of this current framework to post-plateau performance periods, longer timespans, or non-learning situations (e.g., the Non-repeating groups from Experiments 3 & 4 in Das et al. (2024)), when reactive inhibition plays a more substantive role, is not warranted. Ultimately, it will be important to develop new paradigms allowing one to independently estimate the different coincident or antagonistic features (e.g., memory consolidation, planning, working memory and reactive inhibition) contributing to micro-online and micro-offline gains during and after early skill learning within a unifying framework.

      References

      (1) Buch, E. R., Claudino, L., Quentin, R., Bonstrup, M. & Cohen, L. G. Consolidation of human skill linked to waking hippocampo-neocortical replay. Cell Rep 35, 109193 (2021). https://doi.org:10.1016/j.celrep.2021.109193

      (2) Chen, P.-C., Stritzelberger, J., Walther, K., Hamer, H. & Staresina, B. P. Hippocampal ripples during offline periods predict human motor sequence learning. bioRxiv, 2024.10.06.614680 (2024). https://doi.org:10.1101/2024.10.06.614680

      (3) Classen, J., Liepert, J., Wise, S. P., Hallett, M. & Cohen, L. G. Rapid plasticity of human cortical movement representation induced by practice. J Neurophysiol 79, 1117-1123 (1998).

      (4) Karni, A. et al. Functional MRI evidence for adult motor cortex plasticity during motor skill learning. Nature 377, 155-158 (1995). https://doi.org:10.1038/377155a0

      (5) Kleim, J. A., Barbay, S. & Nudo, R. J. Functional reorganization of the rat motor cortex following motor skill learning. J Neurophysiol 80, 3321-3325 (1998).

      (6) Shadmehr, R. & Holcomb, H. H. Neural correlates of motor memory consolidation. Science 277, 821-824 (1997).

      (7) Doyon, J. et al. Experience-dependent changes in cerebellar contributions to motor sequence learning. Proc Natl Acad Sci U S A 99, 1017-1022 (2002).

      (8) Toni, I., Ramnani, N., Josephs, O., Ashburner, J. & Passingham, R. E. Learning arbitrary visuomotor associations: temporal dynamic of brain activity. Neuroimage 14, 1048-1057 (2001).

      (9) Grafton, S. T. et al. Functional anatomy of human procedural learning determined with regional cerebral blood flow and PET. J Neurosci 12, 2542-2548 (1992).

      (10) Kennerley, S. W., Sakai, K. & Rushworth, M. F. Organization of action sequences and the role of the pre-SMA. J Neurophysiol 91, 978-993 (2004). https://doi.org:10.1152/jn.00651.2003

      (11) Hardwick, R. M., Rottschy, C., Miall, R. C. & Eickhoff, S. B. A quantitative meta-analysis and review of motor learning in the human brain. Neuroimage 67, 283-297 (2013). https://doi.org:10.1016/j.neuroimage.2012.11.020

      (12) Sawamura, D. et al. Acquisition of chopstick-operation skills with the non-dominant hand and concomitant changes in brain activity. Sci Rep 9, 20397 (2019). https://doi.org:10.1038/s41598-019-56956-0

      (13) Lee, S. H., Jin, S. H. & An, J. The difference in cortical activation pattern for complex motor skills: A functional near-infrared spectroscopy study. Sci Rep 9, 14066 (2019). https://doi.org:10.1038/s41598-019-50644-9

      (14) Battaglia-Mayer, A. & Caminiti, R. Corticocortical Systems Underlying High-Order Motor Control. J Neurosci 39, 4404-4421 (2019). https://doi.org:10.1523/JNEUROSCI.2094-18.2019

      (15) Toni, I., Thoenissen, D. & Zilles, K. Movement preparation and motor intention. Neuroimage 14, S110-117 (2001). https://doi.org:10.1006/nimg.2001.0841

      (16) Wolpert, D. M., Goodbody, S. J. & Husain, M. Maintaining internal representations: the role of the human superior parietal lobe. Nat Neurosci 1, 529-533 (1998). https://doi.org:10.1038/2245

      (17) Andersen, R. A. & Buneo, C. A. Intentional maps in posterior parietal cortex. Annu Rev Neurosci 25, 189-220 (2002). https://doi.org:10.1146/annurev.neuro.25.112701.142922

      (18) Buneo, C. A. & Andersen, R. A. The posterior parietal cortex: sensorimotor interface for the planning and online control of visually guided movements. Neuropsychologia 44, 2594-2606 (2006). https://doi.org:10.1016/j.neuropsychologia.2005.10.011

      (19) Grover, S., Wen, W., Viswanathan, V., Gill, C. T. & Reinhart, R. M. G. Long-lasting, dissociable improvements in working memory and long-term memory in older adults with repetitive neuromodulation. Nat Neurosci 25, 1237-1246 (2022). https://doi.org:10.1038/s41593-022-01132-3

      (20) Colclough, G. L. et al. How reliable are MEG resting-state connectivity metrics? Neuroimage 138, 284-293 (2016). https://doi.org:10.1016/j.neuroimage.2016.05.070

      (21) Colclough, G. L., Brookes, M. J., Smith, S. M. & Woolrich, M. W. A symmetric multivariate leakage correction for MEG connectomes. NeuroImage 117, 439-448 (2015). https://doi.org:10.1016/j.neuroimage.2015.03.071

      (22) Mollazadeh, M. et al. Spatiotemporal variation of multiple neurophysiological signals in the primary motor cortex during dexterous reach-to-grasp movements. J Neurosci 31, 15531-15543 (2011). https://doi.org:10.1523/JNEUROSCI.2999-11.2011

      (23) Bansal, A. K., Vargas-Irwin, C. E., Truccolo, W. & Donoghue, J. P. Relationships among low-frequency local field potentials, spiking activity, and three-dimensional reach and grasp kinematics in primary motor and ventral premotor cortices. J Neurophysiol 105, 1603-1619 (2011). https://doi.org:10.1152/jn.00532.2010

      (24) Flint, R. D., Ethier, C., Oby, E. R., Miller, L. E. & Slutzky, M. W. Local field potentials allow accurate decoding of muscle activity. J Neurophysiol 108, 18-24 (2012). https://doi.org:10.1152/jn.00832.2011

      (25) Churchland, M. M. et al. Neural population dynamics during reaching. Nature 487, 51-56 (2012). https://doi.org:10.1038/nature11129

      (26) Bassett, D. S. et al. Dynamic reconfiguration of human brain networks during learning. Proc Natl Acad Sci U S A 108, 7641-7646 (2011). https://doi.org:10.1073/pnas.1018985108

      (27) Albouy, G., King, B. R., Maquet, P. & Doyon, J. Hippocampus and striatum: dynamics and interaction during acquisition and sleep-related motor sequence memory consolidation. Hippocampus 23, 985-1004 (2013). https://doi.org:10.1002/hipo.22183

      (28) Albouy, G. et al. Neural correlates of performance variability during motor sequence acquisition. Neuroimage 60, 324-331 (2012). https://doi.org:10.1016/j.neuroimage.2011.12.049

      (29) Qin, Y. L., McNaughton, B. L., Skaggs, W. E. & Barnes, C. A. Memory reprocessing in corticocortical and hippocampocortical neuronal ensembles. Philos Trans R Soc Lond B Biol Sci 352, 1525-1533 (1997). https://doi.org:10.1098/rstb.1997.0139

      (30) Euston, D. R., Tatsuno, M. & McNaughton, B. L. Fast-forward playback of recent memory sequences in prefrontal cortex during sleep. Science 318, 1147-1150 (2007). https://doi.org:10.1126/science.1148979

      (31) Molle, M. & Born, J. Hippocampus whispering in deep sleep to prefrontal cortex--for good memories? Neuron 61, 496-498 (2009). https://doi.org:10.1016/j.neuron.2009.02.002

      (32) Frankland, P. W. & Bontempi, B. The organization of recent and remote memories. Nat Rev Neurosci 6, 119-130 (2005). https://doi.org:10.1038/nrn1607

      (33) Jacobacci, F. et al. Rapid hippocampal plasticity supports motor sequence learning. Proc Natl Acad Sci U S A 117, 23898-23903 (2020). https://doi.org:10.1073/pnas.2009576117

      (34) Albouy, G. et al. Maintaining vs. enhancing motor sequence memories: respective roles of striatal and hippocampal systems. Neuroimage 108, 423-434 (2015). https://doi.org:10.1016/j.neuroimage.2014.12.049

      (35) Gais, S. et al. Sleep transforms the cerebral trace of declarative memories. Proc Natl Acad Sci U S A 104, 18778-18783 (2007). https://doi.org:10.1073/pnas.0705454104

      (36) Sterpenich, V. et al. Sleep promotes the neural reorganization of remote emotional memory. J Neurosci 29, 5143-5152 (2009). https://doi.org:10.1523/JNEUROSCI.0561-09.2009

      (37) Euston, D. R., Gruber, A. J. & McNaughton, B. L. The role of medial prefrontal cortex in memory and decision making. Neuron 76, 1057-1070 (2012). https://doi.org:10.1016/j.neuron.2012.12.002

      (38) van Kesteren, M. T., Fernandez, G., Norris, D. G. & Hermans, E. J. Persistent schema-dependent hippocampal-neocortical connectivity during memory encoding and postencoding rest in humans. Proc Natl Acad Sci U S A 107, 7550-7555 (2010). https://doi.org:10.1073/pnas.0914892107

      (39) van Kesteren, M. T., Ruiter, D. J., Fernandez, G. & Henson, R. N. How schema and novelty augment memory formation. Trends Neurosci 35, 211-219 (2012). https://doi.org:10.1016/j.tins.2012.02.001

      (40) Wagner, A. D. et al. Building memories: remembering and forgetting of verbal experiences as predicted by brain activity. Science (New York, N.Y.) 281, 1188-1191 (1998).

      (41) Ashe, J., Lungu, O. V., Basford, A. T. & Lu, X. Cortical control of motor sequences. Curr Opin Neurobiol 16, 213-221 (2006).

      (42) Hikosaka, O., Nakamura, K., Sakai, K. & Nakahara, H. Central mechanisms of motor skill learning. Curr Opin Neurobiol 12, 217-222 (2002).

      (43) Penhune, V. B. & Steele, C. J. Parallel contributions of cerebellar, striatal and M1 mechanisms to motor sequence learning. Behav. Brain Res. 226, 579-591 (2012). https://doi.org:10.1016/j.bbr.2011.09.044

      (44) Doyon, J. et al. Contributions of the basal ganglia and functionally related brain structures to motor learning. Behavioural brain research 199, 61-75 (2009). https://doi.org:10.1016/j.bbr.2008.11.012

      (45) Schendan, H. E., Searl, M. M., Melrose, R. J. & Stern, C. E. An FMRI study of the role of the medial temporal lobe in implicit and explicit sequence learning. Neuron 37, 1013-1025 (2003). https://doi.org:10.1016/s0896-6273(03)00123-5

      (46) Morris, R. G. M. Elements of a neurobiological theory of hippocampal function: the role of synaptic plasticity, synaptic tagging and schemas. The European journal of neuroscience 23, 2829-2846 (2006). https://doi.org:10.1111/j.1460-9568.2006.04888.x

      (47) Tse, D. et al. Schemas and memory consolidation. Science 316, 76-82 (2007). https://doi.org:10.1126/science.1135935

      (48) Berlot, E., Popp, N. J. & Diedrichsen, J. A critical re-evaluation of fMRI signatures of motor sequence learning. Elife 9 (2020). https://doi.org:10.7554/eLife.55241

      (49) Bonstrup, M. et al. A Rapid Form of Offline Consolidation in Skill Learning. Curr Biol 29, 1346-1351 e1344 (2019). https://doi.org:10.1016/j.cub.2019.02.049

      (50) Kornysheva, K. et al. Neural Competitive Queuing of Ordinal Structure Underlies Skilled Sequential Action. Neuron 101, 1166-1180 e1163 (2019). https://doi.org:10.1016/j.neuron.2019.01.018

      (51) Pan, S. C. & Rickard, T. C. Sleep and motor learning: Is there room for consolidation? Psychol Bull 141, 812-834 (2015). https://doi.org:10.1037/bul0000009

      (52) Rickard, T. C., Cai, D. J., Rieth, C. A., Jones, J. & Ard, M. C. Sleep does not enhance motor sequence learning. J Exp Psychol Learn Mem Cogn 34, 834-842 (2008). https://doi.org:10.1037/0278-7393.34.4.834

      (53) Brawn, T. P., Fenn, K. M., Nusbaum, H. C. & Margoliash, D. Consolidating the effects of waking and sleep on motor-sequence learning. J Neurosci 30, 13977-13982 (2010). https://doi.org:10.1523/JNEUROSCI.3295-10.2010

      (54) Bonstrup, M., Iturrate, I., Hebart, M. N., Censor, N. & Cohen, L. G. Mechanisms of offline motor learning at a microscale of seconds in large-scale crowdsourced data. NPJ Sci Learn 5, 7 (2020). https://doi.org:10.1038/s41539-020-0066-9

      (55) Gupta, M. W. & Rickard, T. C. Dissipation of reactive inhibition is sufficient to explain post-rest improvements in motor sequence learning. NPJ Sci Learn 7, 25 (2022). https://doi.org:10.1038/s41539-022-00140-z

      (56) Jacobacci, F. et al. Rapid hippocampal plasticity supports motor sequence learning. Proceedings of the National Academy of Sciences 117, 23898-23903 (2020).

      (57) Brooks, E., Wallis, S., Hendrikse, J. & Coxon, J. Micro-consolidation occurs when learning an implicit motor sequence, but is not influenced by HIIT exercise. NPJ Sci Learn 9, 23 (2024). https://doi.org:10.1038/s41539-024-00238-6

      (58) Deleglise, A. et al. Human motor sequence learning drives transient changes in network topology and hippocampal connectivity early during memory consolidation. Cereb Cortex 33, 6120-6131 (2023). https://doi.org:10.1093/cercor/bhac489

      (59) Buzsaki, G. Hippocampal sharp wave-ripple: A cognitive biomarker for episodic memory and planning. Hippocampus 25, 1073-1188 (2015). https://doi.org:10.1002/hipo.22488

      (60) Gupta, M. W. & Rickard, T. C. Comparison of online, offline, and hybrid hypotheses of motor sequence learning using a quantitative model that incorporate reactive inhibition. Sci Rep 14, 4661 (2024). https://doi.org:10.1038/s41598-024-52726-9

      (61) Das, A., Karagiorgis, A., Diedrichsen, J., Stenner, M.-P. & Azanon, E. “Micro-offline gains” convey no benefit for motor skill learning. bioRxiv, 2024.07.11.602795 (2024). https://doi.org:10.1101/2024.07.11.602795

      (62) Mylonas, D. et al. Maintenance of Procedural Motor Memory across Brief Rest Periods Requires the Hippocampus. J Neurosci 44 (2024). https://doi.org:10.1523/JNEUROSCI.1839-23.2024

    2. eLife Assessment

      This valuable study investigates how the neural representation of individual finger movements changes during the early period of sequence learning. By combining a new method for extracting features from human magnetoencephalography data and decoding analyses, the authors provide incomplete evidence of an early, swift change in the brain regions correlated with sequence learning, including a set of previously unreported frontal cortical regions. The addition of more control analyses to rule out that head movement artefacts influence the findings, and to further explain the proposal of offline contextualization during short rest periods as the basis for performance improvement, would strengthen the manuscript.

    3. Reviewer #1 (Public review):

      Summary:

      This study addresses the issue of rapid skill learning and whether individual sequence elements (here: finger presses) are differentially represented in human MEG data. The authors use a decoding approach to classify individual finger elements, and accomplish an accuracy of around 94%. A relevant finding is that the neural representations of individual finger elements dynamically change over the course of learning. This would be highly relevant for any attempts to develop better brain machine interfaces - one now can decode individual elements within a sequence with high precision, but these representations are not static but develop over the course of learning.

      Strengths:

      The work follows a large body of work from the same group on the behavioural and neural foundations of sequence learning. The behavioural task is well established and neatly designed to allow for tracking learning and how individual sequence elements contribute. The inclusion of short offline rest periods between learning epochs has been influential because it has revealed that a lot, if not most of the gains in behaviour (ie speed of finger movements) occur in these so-called micro-offline rest periods.

      The authors use a range of new decoding techniques, and exhaustively interrogate their data in different ways, using different decoding approaches. Regardless of the approach, impressively high decoding accuracies are observed, but when using a hybrid approach that combines the MEG data in different ways, the authors observe decoding accuracies of individual sequence elements from the MEG data of up to 94%.

      Weaknesses:

      There are a few concerns which the authors may well be able to resolve. These are not weaknesses as such, but factors that would be helpful to address as these concern potential contributions to the results that one would like to rule out.

      Regarding the decoding results shown in Figure 2 etc, a concern is that within individual frequency bands, the highest accuracy seems to be within frequencies that match the rate of keypresses. This is a general concern when relating movement to brain activity, so is not specific to decoding as done here. As far as reported, there was no specific restraint to the arm or shoulder, and even then it is conceivable that small head movements would correlate highly with the vigor of individual finger movements. This concern is supported by the highest contribution in decoding accuracy being in middle frontal regions - midline structures that would be specifically sensitive to movement artefacts and don't seem to come to mind as key structures for very simple sequential keypress tasks such as this - and the overall pattern is remarkably symmetrical (despite being a unimanual finger task) and spatially broad. This issue may well be matching the time course of learning, as the vigor and speed of finger presses will also influence the degree to which the arm/shoulder and head move.

      This is not to say that no useful information is contained within either the individual frequency bands or the broadband data. But it raises the question of whether a lot is dominated by movement "artefacts", and one may get a more specific answer if removing any such contributions.

      A somewhat related point is this: when combining voxel and parcel space, a concern is whether a degree of circularity may have contributed to the improved accuracy of the combined data, because it seems to use the same MEG signals twice - the voxels most contributing are also those contributing most to a parcel being identified as relevant, as parcels reflect the average of voxels within a boundary. In this context, I struggled to understand the explanation given, ie that the improved accuracy of the hybrid model may be due to "lower spatially resolved whole-brain and higher spatially resolved regional activity patterns". Firstly, there will be a relatively high degree of spatial contiguity among voxels because of the nature of the signal measured, ie nearby individual voxels are unlikely to be independent. Secondly, the voxel data gives a somewhat misleading sense of precision; the inversion can be set up to give an estimate for each voxel, but there will not just be dependence among adjacent voxels, but also substantial variation in the sensitivity and confidence with which activity can be projected to different parts of the brain. Midline and deeper structures come to mind, where the inversion will be more problematic than for regions along the dorsal convexity of the brain, and a concern is that in those midline structures, the highest decoding accuracy is seen.

      Some of these concerns could be addressed by recording head movement (with enough precision) to regress out these contributions. The authors state that head movement was monitored with 3 fiducials, and their timecourses ought to provide a way to deal with this issue. The ICA procedure may not have sufficiently dealt with removing movement-related problems, but one could eg relate individual components that were identified to the keypresses as another means for checking. An alternative could be to focus on frequency ranges above the movement frequencies. The accuracy for those still seems impressive, and may provide a slightly more biologically plausible assessment.
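
      The regression suggested above could look roughly like the following. This is a minimal sketch, assuming a single head-position regressor and ordinary least squares with an intercept. The function name `regress_out` and the toy values are hypothetical, and a real analysis would use all three fiducial time courses together with the actual MEG channels or components.

```python
from statistics import mean


def regress_out(signal, nuisance):
    """Return `signal` with its mean and its linear fit to `nuisance` removed."""
    s_mean, n_mean = mean(signal), mean(nuisance)
    n_centered = [n - n_mean for n in nuisance]
    denom = sum(n * n for n in n_centered)
    if denom == 0:
        # Constant nuisance regressor: only the mean can be removed.
        return [s - s_mean for s in signal]
    # Least-squares slope of the signal on the centered nuisance regressor.
    beta = sum(n * (s - s_mean) for n, s in zip(n_centered, signal)) / denom
    return [(s - s_mean) - beta * n for s, n in zip(signal, n_centered)]


# Toy data (hypothetical): a channel equal to 2 * head position, plus an
# offset of 5, plus a component that is uncorrelated with head position.
head_position = [0.0, 1.0, 2.0, 3.0, 4.0]
uncorrelated_part = [1.0, -2.0, 2.0, -2.0, 1.0]
channel = [2.0 * h + 5.0 + u for h, u in zip(head_position, uncorrelated_part)]

cleaned = regress_out(channel, head_position)
print(cleaned)  # recovers the uncorrelated component
```

      If decoding accuracy stays high on the cleaned signals, head movement is less likely to be driving the result; if it drops sharply, the concern raised above gains weight.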

      One question concerns the interpretation of the results shown in Figure 4. They imply that during the course of learning, entirely different brain networks underpin the behaviour. Not only that, but they also include regions that would seem rather unexpected to be key nodes for learning and expressing relatively simple finger sequences, such as here. What then is the biological plausibility of these results? The authors seem to circumnavigate this issue by moving into a distance metric that captures the (neural network) changes over the course of learning, but the discussion seems detached from which regions are actually involved; or they offer a rather broad discussion of the anatomical regions identified here, eg in the context of LFOs, where they merely refer to "frontoparietal regions".

      If I understand correctly, the offline neural representation analysis is in essence the comparison of the last keypress vs the first keypress of the next sequence. In that sense, the activity during offline rest periods is actually not considered. This makes the nomenclature somewhat confusing. While it matches the behavioural analysis, having only key presses one can't do it in any other way, but here the authors actually do have recordings of brain activity during offline rest. So at the very least calling it offline neural representation is misleading to this reviewer because what is compared is activity during the last and during the next keypress, not activity during offline periods. But it also seems a missed opportunity - the authors argue that most of the relevant learning occurs during offline rest periods, yet there is no attempt to actually test whether activity during this period can be useful for the questions at hand here.

    4. Reviewer #2 (Public review):

      Summary

      Dash et al. asked whether and how the neural representation of individual finger movements is "contextualized" within a trained sequence during the very early period of sequential skill learning by using decoding of MEG signal. Specifically, they assessed whether/how the same finger presses (pressing index finger) embedded in the different ordinal positions of a practiced sequence (4-1-3-2-4; here, the numbers 1 through 4 correspond to the little through the index fingers of the non-dominant left hand) change their representation (MEG feature). They did this by computing either the decoding accuracy of the index finger at the ordinal positions 1 vs. 5 (index_OP1 vs index_OP5) or pattern distance between index_OP1 vs. index_OP5 at each training trial and found that both the decoding accuracy and the pattern distance progressively increase over the course of learning trials. More interestingly, they also computed the pattern distance for index_OP5 for the last execution of a practice trial vs. index_OP1 for the first execution in the next practice trial (i.e., across the rest period). This "off-line" distance was significantly larger than the "on-line" distance, which was computed within practice trials and predicted micro-offline skill gain. Based on these results, the authors conclude that the differentiation of representation for the identical movement embedded in different positions of a sequential skill ("contextualization") primarily occurs during early skill learning, especially during rest, consistent with the recent theory of the "micro-offline learning" proposed by the authors' group. I think this is an important and timely topic for the field of motor learning and beyond.
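
      The "on-line" and "off-line" pattern distances described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Euclidean distance and toy two-dimensional feature vectors, whereas the authors' actual distance metric and MEG features are not specified in this summary. The function names and data layout are hypothetical.

```python
import math
from statistics import mean


def online_distance(trial):
    """Mean index_OP1 vs. index_OP5 distance across executions within one trial."""
    return mean(math.dist(ex["op1"], ex["op5"]) for ex in trial)


def offline_distance(trial, next_trial):
    """Distance between index_OP5 of the last execution in one trial and
    index_OP1 of the first execution in the next trial (across the rest)."""
    return math.dist(trial[-1]["op5"], next_trial[0]["op1"])


# Toy feature vectors (hypothetical values, for illustration only).
# Each trial is a list of sequence executions.
trial_1 = [
    {"op1": [0.0, 0.0], "op5": [3.0, 4.0]},
    {"op1": [0.0, 0.0], "op5": [0.0, 5.0]},
]
trial_2 = [
    {"op1": [6.0, 13.0], "op5": [9.0, 17.0]},
]

online = online_distance(trial_1)             # within-trial distance
offline = offline_distance(trial_1, trial_2)  # across-rest distance
print(online, offline)
```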

      Strengths

      The specific strengths of the current work are as follows. First, the use of temporally rich neural information (MEG signal) has a large advantage over previous studies testing sequential representations using fMRI. This allowed the authors to examine the earliest period (= the first few minutes of training) of skill learning with finer temporal resolution. Second, through the optimization of MEG feature extraction, the current study achieved extremely high decoding accuracy (approx. 94%) compared to previous works. As claimed by the authors, this is one of the strengths of the paper (but see my comments). Third, although some potential refinement might be needed, comparing "online" and "offline" pattern distance is a neat idea.

      Weaknesses

      Along with the strengths I raised above, the paper has some weaknesses. First, the pursuit of high decoding accuracy, especially the choice of time points and window length (i.e., 200 msec window starting from 0 msec from key press onset), casts a shadow on the interpretation of the main result. Currently, it is unclear whether the decoding results simply reflect behavioral change or true underlying neural change. As shown in the behavioral data, the key press speed reached 3~4 presses per second already at around the end of the early learning period (11th trial), which means inter-press intervals become as short as 250-330 msec. Thus, in more than 60% of the training period data, the time window for MEG feature extraction (200 msec) spans at least around 60% of the inter-press interval. Considering that the preparation/cueing of subsequent presses starts ahead of the actual press (e.g., Kornysheva et al., 2019) and/or potential online planning (e.g., Ariani and Diedrichsen, 2019), the decoder likely has captured this future-press information as well as the signal related to the current key press, independent of the formation of genuine sequential representation (e.g., "contextualization" of individual press). This may also explain the gradual increase in decoding accuracy or pattern distance between index_OP1 vs. index_OP5 (Figure 4C and 5A), which co-occurred with performance improvement, as shorter inter-press intervals are more favorable for dissociating the two index finger presses followed by different finger presses. The compromised decoding accuracies for the control sequences can be explained by similar logic. Therefore, more careful consideration and elaborated discussion seem necessary when trying to both achieve high-performance decoding and assess early skill learning, as it can impact all the subsequent analyses.
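
      The overlap between the feature window and the inter-press interval can be checked with a short calculation. The sketch below only restates the arithmetic in the paragraph above (3 to 4 presses per second, 200 msec window); the function names are hypothetical.

```python
def inter_press_interval_ms(presses_per_second: float) -> float:
    """Mean interval between successive key presses, in milliseconds."""
    return 1000.0 / presses_per_second


def window_fraction(window_ms: float, presses_per_second: float) -> float:
    """Fraction of the inter-press interval covered by the feature window."""
    return window_ms / inter_press_interval_ms(presses_per_second)


WINDOW_MS = 200.0  # MEG feature window length quoted in the review

for rate in (3.0, 4.0):
    ipi = inter_press_interval_ms(rate)
    frac = window_fraction(WINDOW_MS, rate)
    print(f"{rate:.0f} presses/s -> IPI {ipi:.0f} ms, window covers {frac:.0%}")
# 3 presses/s -> IPI 333 ms, window covers 60%
# 4 presses/s -> IPI 250 ms, window covers 80%
```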

      Related to the above point, testing only one particular sequence (4-1-3-2-4), aside from the control ones, limits the generalizability of the finding. This also may have contributed to the extremely high decoding accuracy reported in the current study.

      In terms of clinical BCI, one of the areas of potential relevance claimed by the authors, it is not clear that the specific time window chosen in the current study (up to 200 msec since key press onset) is really useful. In most cases, clinical BCI would target neural signals with no overt movement execution due to patients' inability to move (e.g., Hochberg et al., 2012). Given the time window, the surprisingly high performance of the current decoder may result from sensory feedback and/or planning of subsequent movement, which may not always be available in the clinical BCI context. Of course, the decoding accuracy is still much higher than chance even when using signal before the key press (as shown in Figure 4 Supplement 2), but it is not immediately clear to me how the authors relate their high decoding accuracy based on post-movement signal to clinical BCI settings.

      One of the important and fascinating claims of the current study is that the "contextualization" of individual finger movements in a trained sequence specifically occurs during short rest periods in very early skill learning, echoing the recent theory of micro-offline learning proposed by the authors' group. Here, I think two points need to be clarified. First, the concept of "contextualization" is kept somewhat blurry throughout the text. It is only at the later part of the Discussion (around line #330 on page 13) that some potential mechanism for the "contextualization" is provided as "what-and-where" binding. Still, it is unclear what "contextualization" actually is in the current data, as the MEG signal analyzed is extracted from 0-200 msec after the keypress. If one thinks something is contextualizing an action, that contextualization should come earlier than the action itself.

      The second point is that the result provided by the authors is not yet convincing enough to support the claim that "contextualization" occurs during rest. In the original analysis, the authors presented the statistical significance regarding the correlation between the "offline" pattern differentiation and micro-offline skill gain (Figure 5. Supplement 1), as well as the larger "offline" distance than "online" distance (Figure 5B). However, this analysis looks like regressing two variables (monotonically) increasing as a function of the trial. Although some information in this analysis, such as what the independent/dependent variables were or how individual subjects were treated, was missing in the Methods, getting a statistically significant slope seems unsurprising in such a situation. Also, curiously, the same quantitative evidence was not provided for its "online" counterpart, and the authors only briefly mentioned in the text that there was no significant correlation between them. It may be true looking at the data in Figure 5A as the online representation distance looks less monotonically changing, but the classification accuracy presented in Figure 4C, which should reflect similar representational distance, shows a more monotonic increase up to the 11th trial. Further, the ways the "online" and "offline" representation distance was estimated seem to make them not directly comparable. While the "online" distance was computed using all the correct press data within each 10 sec of execution, the "offline" distance is basically computed by only two presses (i.e., the last index_OP5 vs. the first index_OP1 separated by 10 sec of rest). Theoretically, the distance between the neural activity patterns for temporally closer events tends to be closer than that between the patterns for temporally far-apart events. It would be fairer to use the distance between the first index_OP1 vs. the last index_OP5 within an execution period for "online" distance, as well.

      A related concern regarding the control analysis, where individual values for max speed and the degree of online contextualization were compared (Figure 5 Supplement 3), is whether the individual difference is meaningful. If I understood correctly, the optimization of the decoding process (temporal window, feature inclusion/reduction, decoder, etc.) was performed for individual participants, and the same feature extraction was also employed for the analysis of representation distance (i.e., contextualization). If this is the case, the distances are individually differently calculated and they may need to be normalized relative to some stable reference (e.g., 1 vs. 4 or average distance within the control sequence presses) before comparison across the individuals.
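
      One way to picture the suggested normalization is to divide each participant's distance of interest by that participant's own reference distance before comparing across individuals. The sketch below assumes this simple ratio; the participant IDs, the numbers, and the function name are hypothetical and are not taken from the manuscript.

```python
from statistics import mean


def normalized_distance(target_distance, reference_distances):
    """Express a distance as a multiple of the participant's mean reference distance."""
    reference = mean(reference_distances)
    if reference <= 0:
        raise ValueError("reference distance must be positive")
    return target_distance / reference


# Hypothetical values: 'target' stands for the index_OP1 vs. index_OP5 distance,
# 'reference' for distances between clearly different presses (e.g., 1 vs. 4).
participants = {
    "P01": {"target": 2.0, "reference": [4.0, 4.0]},
    "P02": {"target": 6.0, "reference": [10.0, 14.0]},
}

normalized = {
    pid: normalized_distance(d["target"], d["reference"])
    for pid, d in participants.items()
}
print(normalized)
```

      In this toy case the raw distances differ threefold between the two participants, yet both equal half of their own reference distance, so the apparent individual difference disappears once each participant's feature space is taken into account.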

    5. Reviewer #3 (Public review):

      Summary:

      One goal of this paper is to introduce a new approach for highly accurate decoding of finger movements from human magnetoencephalography data via dimension reduction of a "multi-scale, hybrid" feature space. Following this decoding approach, the authors aim to show that early skill learning involves "contextualization" of the neural coding of individual movements, relative to their position in a sequence of consecutive movements. Furthermore, they aim to show that this "contextualization" develops primarily during short rest periods interspersed with skill training, and correlates with a performance metric which the authors interpret as an indicator of offline learning.

      Strengths:

      A clear strength of the paper is the innovative decoding approach, which achieves impressive decoding accuracies via dimension reduction of a "multi-scale, hybrid space". This hybrid-space approach follows the neurobiologically plausible idea of the concurrent distribution of neural coding across local circuits as well as large-scale networks. A further strength of the study is the large number of tested dimension reduction techniques and classifiers (though the manuscript reveals little about the comparison of the latter).

      A simple control analysis based on shuffled class labels could lend further support to this complex decoding approach. As a control analysis that completely rules out any source of overfitting, the authors could test the decoder after shuffling class labels. Following such shuffling, decoding accuracies should drop to chance level for all decoding approaches, including the optimized decoder. This would also provide an estimate of actual chance-level performance (which is informative over and beyond the theoretical chance level). Furthermore, currently, the manuscript does not explain the huge drop in decoding accuracies for the voxel-space decoding (Figure 3B). Finally, the authors' approach to cortical parcellation raises questions regarding the information carried by varying dipole orientations within a parcel (which currently seems to be ignored?) and the implementation of the mean-flipping method (given that there are two dimensions - space and time - what do the authors refer to when they talk about the sign of the "average source", line 477?).
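
      The shuffled-label control can be sketched as follows. This is a minimal illustration on synthetic two-class data with a nearest-centroid classifier and leave-one-out cross-validation; it is an assumption-laden stand-in and does not reproduce the authors' hybrid-space decoder, features, or cross-validation scheme. All names are hypothetical.

```python
import random
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        train = [(f, l) for j, (f, l) in enumerate(zip(features, labels)) if j != i]
        classes = sorted({l for _, l in train})
        centroids = {c: centroid([f for f, l in train if l == c]) for c in classes}
        predicted = min(classes, key=lambda c: sq_dist(x, centroids[c]))
        if predicted == y:
            correct += 1
    return correct / len(labels)


def shuffled_label_control(features, labels, n_permutations=200, seed=0):
    """Return observed accuracy, mean accuracy under shuffled labels, and a p-value."""
    rng = random.Random(seed)
    observed = loo_accuracy(features, labels)
    null_scores = []
    for _ in range(n_permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break the link between features and labels
        null_scores.append(loo_accuracy(features, shuffled))
    exceed = sum(score >= observed for score in null_scores)
    p_value = (1 + exceed) / (1 + n_permutations)
    return observed, mean(null_scores), p_value


# Synthetic, well-separated two-class data (20 samples per class).
data_rng = random.Random(1)
features = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(20)]
features += [[data_rng.gauss(3.0, 1.0), data_rng.gauss(3.0, 1.0)] for _ in range(20)]
labels = [0] * 20 + [1] * 20

observed, empirical_chance, p_value = shuffled_label_control(features, labels)
print(f"observed={observed:.2f}, empirical chance={empirical_chance:.2f}, p={p_value:.3f}")
```

      The mean accuracy under shuffled labels gives the empirical chance level mentioned above, and the whole pipeline, including any feature selection or hyperparameter tuning, would need to be rerun inside each shuffle for the control to rule out overfitting.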

      Weaknesses:

      A clear weakness of the paper lies in the authors' conclusions regarding "contextualization". Several potential confounds, described below, question the neurobiological implications proposed by the authors and provide a simpler explanation of the results. Furthermore, the paper follows the assumption that short breaks result in offline skill learning, while recent evidence, described below, casts doubt on this assumption.

      The authors interpret the ordinal position information captured by their decoding approach as a reflection of neural coding dedicated to the local context of a movement (Figure 4). One way to dissociate ordinal position information from information about the moving effectors is to train a classifier on one sequence and test the classifier on other sequences that require the same movements, but in different positions (Kornysheva et al., Neuron 2019). In the present study, however, participants trained to repeat a single sequence (4-1-3-2-4). As a result, ordinal position information is potentially confounded by the fixed finger transitions around each of the two critical positions (first and fifth press). Across consecutive correct sequences, the first keypress in a given sequence was always preceded by a movement of the index finger (=last movement of the preceding sequence), and followed by a little finger movement. The last keypress, on the other hand, was always preceded by a ring finger movement, and followed by an index finger movement (=first movement of the next sequence). Figure 4 - Supplement 2 shows that finger identity can be decoded with high accuracy (>70%) across a large time window around the time of the key press, up to at least ±100 ms (and likely beyond, given that decoding accuracy is still high at the boundaries of the window depicted in that figure). This time window approaches the keypress transition times in this study. Given that distinct finger transitions characterized the first and fifth keypress, the classifier could thus rely on persistent (or "lingering") information from the preceding finger movement, and/or "preparatory" information about the subsequent finger movement, in order to dissociate the first and fifth keypress.
Currently, the manuscript provides no evidence that the context information captured by the decoding approach is more than a by-product of temporally extended, and therefore overlapping, but independent neural representations of consecutive keypresses that are executed in close temporal proximity - rather than a neural representation dedicated to context.

      Such temporal overlap of consecutive, independent finger representations may also account for the dynamics of "ordinal coding"/"contextualization", i.e., the increase in 2-class decoding accuracy, across Day 1 (Figure 4C). As learning progresses, both tapping speed and the consistency of keypress transition times increase (Figure 1), i.e., consecutive keypresses are closer in time, and more consistently so. As a result, information related to a given keypress is increasingly overlapping in time with information related to the preceding and subsequent keypresses. The authors seem to argue that their regression analysis in Figure 5 - Figure Supplement 3 speaks against any influence of tapping speed on "ordinal coding" (even though that argument is not made explicitly in the manuscript). However, Figure 5 - Figure Supplement 3 shows inter-individual differences in a between-subject analysis (across trials, as in panel A, or separately for each trial, as in panel B), and, therefore, says little about the within-subject dynamics of "ordinal coding" across the experiment. A regression of trial-by-trial "ordinal coding" on trial-by-trial tapping speed (either within-subject or at a group-level, after averaging across subjects) could address this issue. Given the highly similar dynamics of "ordinal coding" on the one hand (Figure 4C), and tapping speed on the other hand (Figure 1B), I would expect a strong relationship between the two in the suggested within-subject (or group-level) regression. Furthermore, learning should increase the number of (consecutively) correct sequences, and, thus, the consistency of finger transitions. 
Therefore, the increase in 2-class decoding accuracy may simply reflect an increasing overlap in time of increasingly consistent information from consecutive keypresses, which allows the classifier to dissociate the first and fifth keypress more reliably as learning progresses, simply based on the characteristic finger transitions associated with each. In other words, given that the physical context of a given keypress changes as learning progresses - keypresses move closer together in time and are more consistently correct - it seems problematic to conclude that the mental representation of that context changes. To draw that conclusion, the physical context should remain stable (or any changes to the physical context should be controlled for).
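
      The within-subject regression suggested above could look roughly like the following. This is a sketch under stated assumptions: the subject IDs and per-trial values are invented for illustration, and a plain least-squares fit per subject stands in for whatever model the authors might prefer (e.g., a mixed-effects model).

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


# Hypothetical trial-by-trial data: tapping speed (presses/s) and
# 2-class decoding accuracy ("ordinal coding") for each subject.
subjects = {
    "S01": {"speed": [1.0, 2.0, 3.0, 4.0], "accuracy": [0.55, 0.65, 0.75, 0.85]},
    "S02": {"speed": [2.0, 2.5, 3.0, 3.5], "accuracy": [0.60, 0.62, 0.70, 0.72]},
}

# Fit each subject separately so that between-subject differences
# cannot drive the relationship.
within = {
    sid: {
        "slope": ols_slope(d["speed"], d["accuracy"]),
        "r": pearson_r(d["speed"], d["accuracy"]),
    }
    for sid, d in subjects.items()
}
group_mean_slope = mean(v["slope"] for v in within.values())
print(within, group_mean_slope)
```

      A consistently positive within-subject slope would indicate that "ordinal coding" tracks tapping speed trial by trial, which is the dependence that the between-subject analysis cannot reveal.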

      A similar difference in physical context may explain why neural representation distances ("differentiation") differ between rest and practice (Figure 5). The authors define "offline differentiation" by comparing the hybrid space features of the last index finger movement of a trial (ordinal position 5) and the first index finger movement of the next trial (ordinal position 1). However, the latter is not only the first movement in the sequence but also the very first movement in that trial (at least in trials that started with a correct sequence), i.e., not preceded by any recent movement. In contrast, the last index finger of the last correct sequence in the preceding trial includes the characteristic finger transition from the fourth to the fifth movement. Thus, there is more overlapping information arising from the consistent, neighbouring keypresses for the last index finger movement, compared to the first index finger movement of the next trial. A strong difference (larger neural representation distance) between these two movements is, therefore, not surprising, given the task design, and this difference is also expected to increase with learning, given the increase in tapping speed, and the consequent stronger overlap in representations for consecutive keypresses. Furthermore, initiating a new sequence involves pre-planning, while ongoing practice relies on online planning (Ariani et al., eNeuro 2021), i.e., two mental operations that are dissociable at the level of neural representation (Ariani et al., bioRxiv 2023).

      Given these differences in the physical context and associated mental processes, it is not surprising that "offline differentiation", as defined here, is more pronounced than "online differentiation". For the latter, the authors compared movements that were better matched regarding the presence of consistent preceding and subsequent keypresses (online differentiation was defined as the mean difference between all first vs. last index finger movements during practice). It is unclear why the authors did not follow a similar definition for "online differentiation" as for "micro-online gains" (and, indeed, a definition that is more consistent with their definition of "offline differentiation"), i.e., the difference between the first index finger movement of the first correct sequence during practice, and the last index finger of the last correct sequence. While these two movements are, again, not matched for the presence of neighbouring keypresses (see the argument above), this mismatch would at least be the same across "offline differentiation" and "online differentiation", so they would be more comparable.

      A further complication in interpreting the results regarding "contextualization" stems from the visual feedback that participants received during the task. Each keypress generated an asterisk shown above the string on the screen, irrespective of whether the keypress was correct or incorrect. As a result, incorrect (e.g., additional, or missing) keypresses could shift the phase of the visual feedback string (of asterisks) relative to the ordinal position of the current movement in the sequence (e.g., the fifth movement in the sequence could coincide with the presentation of any asterisk in the string, from the first to the fifth). Given that more incorrect keypresses are expected at the start of the experiment, compared to later stages, the consistency in visual feedback position, relative to the ordinal position of the movement in the sequence, increased across the experiment. A better differentiation between the first and the fifth movement with learning could, therefore, simply reflect better decoding of the more consistent visual feedback, based either on the feedback-induced brain response, or feedback-induced eye movements (the study did not include eye tracking). It is not clear why the authors introduced this complicated visual feedback in their task, besides consistency with their previous studies.

      The authors report a significant correlation between "offline differentiation" and cumulative micro-offline gains. However, it would be more informative to correlate trial-by-trial changes in each of the two variables. This would address the question of whether there is a trial-by-trial relation between the degree of "contextualization" and the amount of micro-offline gains - are performance changes (micro-offline gains) less pronounced across rest periods for which the change in "contextualization" is relatively low? Furthermore, is the relationship between micro-offline gains and "offline differentiation" significantly stronger than the relationship between micro-offline gains and "online differentiation"?
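
      The difference between correlating cumulative values and correlating trial-by-trial changes can be illustrated with a deliberately contrived example. The numbers below are hypothetical and chosen only to make the point; they are not estimates of the authors' data.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def first_differences(values):
    """Change from each trial to the next."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical cumulative series across six rest periods. Both rise over
# trials, but whenever one makes a large step the other makes a small one.
cumulative_gains = [1.0, 4.0, 5.0, 8.0, 9.0, 12.0]
cumulative_differentiation = [3.0, 4.0, 7.0, 8.0, 11.0, 12.0]

r_cumulative = pearson_r(cumulative_gains, cumulative_differentiation)
r_changes = pearson_r(
    first_differences(cumulative_gains),
    first_differences(cumulative_differentiation),
)
print(f"cumulative r = {r_cumulative:.2f}, trial-by-trial r = {r_changes:.2f}")
```

      Here the cumulative series correlate strongly simply because both increase, while their trial-by-trial changes are perfectly anti-correlated; only the latter speaks to whether larger changes in "contextualization" accompany larger micro-offline gains.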

      The authors follow the assumption that micro-offline gains reflect offline learning. However, there is no direct evidence in the literature that micro-offline gains really result from offline learning, i.e., an improvement in skill level. On the contrary, recent evidence questions this interpretation (Gupta & Rickard, npj Sci Learn 2022; Gupta & Rickard, Sci Rep 2024; Das et al., bioRxiv 2024). Instead, there is evidence that micro-offline gains are transient performance benefits that emerge when participants train with breaks, compared to participants who train without breaks; however, these benefits vanish within seconds after training if both groups of participants perform under comparable conditions (Das et al., bioRxiv 2024).

    1. eLife Assessment

      This study presents valuable insights into the organization of second-order circuits for gustatory neurons, particularly how they integrate opposing taste inputs and the metabolic states that regulate feeding behavior. An elegant, compelling combination of multiple techniques discovered the target neurons for gustatory integration. However, the functional and behavioral evidence for the function of these neurons is incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      Mollá-Albaladejo et al. investigate the neurons downstream of Gr64f and Gr66a, called G2Ns. They identify downstream neurons using trans-Tango labeling with RFP and then perform bulk RNA-seq on the RFP-sorted cells. Gene expression is up- or downregulated between the cell populations and between fed and starved states. They specifically identify Leucokinin as a neuropeptide that is upregulated in starved Gr66a cells. Leucokinin cells, identified by a GAL4 line, indeed show higher expression when starved, especially in the SEZ. Furthermore, Leucokinin cells colocalize with the trans-Tango signal from downstream neurons of both GRs. This connection is confirmed with GRASP. According to EM data, Leucokinin cells in the SEZ receive a lot of input and connect to many downstream neurons. In behavior experiments performed with flies lacking Leucokinin neurons, flies show reduced responsiveness to sugar and bitter mixtures when starved. The authors suggest that Leucokinin neurons integrate bitter and sugar tastes and that their output is modified by a hunger state.

      Strengths:

      The authors use a multitude of tools to identify SELK neurons as downstream of taste sensory neurons and as starvation-sensitive cells. This study provides an example of how genetic labeling, RNA-seq, and EM analysis can be combined to investigate neural circuits.

      Weaknesses:

      The authors do not show a functional connection between sensory neurons and SELK neurons. Additionally, data from RNA-seq, anatomical studies, and EM analysis are sometimes contradictory in terms of connectivity. A GRASP signal is not definitive proof that cells are synaptically connected.

      The authors describe a behavioral phenotype when flies are starved; however, they do not use a specific driver for the described cell type, and they should therefore tone down their claims.

      Generally, the authors do not provide a big advancement to the field and some of the results are contradictory with previous publications.

    3. Reviewer #2 (Public review):

      Summary:

      A core task of the brain is processing sensory cues from the environment. The neural mechanisms of how sensory information is transmitted from peripheral sense organs for subsequent processing in defined brain centers remain an important topic in neuroscience. The taste system assesses the palatability of food by evaluating the chemical composition and nutrient content while integrating the current need for energy by assessing the satiation level of the organism. The current manuscript provides insights into the early circuits of gustatory coding using the fruit fly as a model. By combining trans-Tango and FACS-based bulk RNAseq to assess the target neurons of sweet sensing (using Gr64f-Gal4) and bitter sensing (using Gr66a-Gal4) in a first set of experiments, the authors investigate genes that are differentially expressed or co-expressed in normal and starved conditions. With a focus on neuropeptides and neurotransmitters, differences in expression between conditions were assessed, resulting in the identification of Leucokinin as a potentially interesting gene. The notion is further supported by RNAseq of Lk-Gal4>mCD8:GFP sorted cells and immunostainings. GRASP and BacTrace experiments further support that the two Lk-expressing cells in the SEZ should indeed be postsynaptic to both types of sensory neurons. Using EM-based connectomics data (based on a previous publication by Engert et al.), the authors also look for downstream targets of the bitter versus sweet gustatory neurons to identify the Lk-neurons. Based on the morphology they identify candidates and further depict the potential downstream neurons in the connectome, which appears largely in agreement with GRASP experiments. Finally, silencing the Lk-neurons shows an increased PER response in starved flies (when combined with bitter compounds) as well as increased feeding in a FlyPad assay.

      Strengths:

      Overall, this is an intriguing manuscript, which provides insight into the organization of 2nd-order gustatory neurons. It specifically provides strong evidence for the Lk-neurons as a target of sweet and bitter GRNs and provides evidence for their role in regulating sweet vs bitter-based behavioral responses. In particular, the elegant integration of different techniques and datasets is a strength of the manuscript. Moreover, placing the known Lk-neurons in the context of 2nd-order gustatory signalling strengthens knowledge of this pathway.

      Weaknesses:

      I do not see any major weakness in the current manuscript. Novelty is to some degree lessened by the fact that the RNAseq approach did not identify new neurons but rather highlighted the known Lk-neurons as the major finding. Similarly, the final behavioral section is not very deep and to some degree corroborates the previous publication by the Keene and Nässel labs - that said, the model they propose is indeed novel (but lacks depth in analyses; e.g. there is no physiology that would support the modulation of Lk neurons by either type of GRN). The connectomic section appears a bit out of place, and after reading it, it is not really clear what one should make of the potential downstream neurons (particularly since the Lk-receptor expression has been previously analyzed); here it might have been interesting to address if/how Lk-neurons may signal directly via a classical neurotransmitter (information that might be found easily in the adult brain single-cell data).

    4. Reviewer #3 (Public review):

      Summary:

      To make feeding decisions, animals need to process three types of information: positive cues like sweetness, negative cues like bitterness, and internal states such as hunger or satiety. This study aims to identify where the information is integrated into the fruit fly brain. The authors applied RNA sequencing on second-order gustatory neurons responsible for sweet and bitter processing, under fed and starved conditions. The sequencing data reveal significant changes in gene expression across sweet vs. bitter pathways and fed vs. starved states. The authors focus on the neuropeptide Leucokinin (Lk), whose expression is dependent on the starvation state. They identify a pair of neurons, named SELK neurons, which express Lk and receive direct input from both sweet and bitter gustatory neurons. These SELK neurons are ideal candidates to integrate gustatory and internal state information. Behavioral experiments show that blocking these neurons in starved flies alters their tolerance to bitter substances during feeding.

      Strengths:

      (1) The study employs a well-designed approach, targeting specific neuronal populations, which is more efficient and precise compared to traditional large-scale genetic screening methods.

      (2) The RNAseq results provide valuable data that can be utilized in future studies to explore other molecules beyond Lk.

      (3) The identification of SELK neurons offers a promising avenue for future research into how these neurons integrate conflicting gustatory signals and internal state information.

      Weaknesses:

      (1) Unfortunately, due to technical challenges, the authors were unable to directly image the functional activity of SELK neurons.

      (2) In the behavioral experiments, tetanus toxin was used to block SELK neurons. Since these neurons may release multiple neurotransmitters or neuropeptides, the results do not specifically demonstrate that Leucokinin (Lk) is the critical factor, as suggested in Figure 8. To address this, I recommend using RNAi to inhibit Lk expression in SELK neurons and comparing the outcomes to wild-type controls via the PER assay.

    1. eLife Assessment

      This study presents an important finding on durotaxis in various amoeboid cells that is independent of focal adhesions. The evidence supporting the authors' claims is compelling. The work will be of interest to cell biologists and biophysicists working on rigidity sensing, the cytoskeleton, and cell migration.

    1. eLife Assessment

      This work describes a new software platform for machine-learning-based segmentation of and particle-picking in cryo-electron tomograms. The program and its corresponding online database of trained models will allow experimentalists to conveniently test different models and share their results with others. The paper provides convincing evidence that the software will be valuable to the community.

    2. Reviewer #1 (Public review):

      This paper describes "Ais", a new software tool for machine-learning based segmentation and particle picking of electron tomograms. The software can visualise tomograms as slices and allows manual annotation for the training of a provided set of various types of neural networks. New networks can be added, provided they adhere to a Python file with an (undescribed) format. Once networks have been trained on manually annotated tomograms, they can be used to segment new tomograms within the same software. The authors also set up an online repository to which users can upload their models, so they might be re-used by others with similar needs. By logically combining the results from different types of segmentations, they further improve the detection of distinct features. The authors demonstrate the usefulness of their software on various data sets. Thus, the software appears to be a valuable tool for the cryo-ET community that will lower the barriers to using a variety of machine-learning methods to help interpret tomograms.

    3. Reviewer #2 (Public review):

      Summary:

      Last et al. present Ais, a new deep learning based software package for segmentation of cryo electron tomography data sets. The distinguishing factor of this package is its orientation to the joint use of different models, rather than the implementation of a given approach: Notably, the software is supported by an online repository of segmentation models, open to contributions from the community.

      The usefulness of handling different models in one single environment is showcased with a comparative study on how different models perform on a given data set; then with an explanation on how the results of several models can be manually merged by the interactive tools inside Ais.

      The manuscript presents two applications of Ais on real data sets: the first showcases its particle picking capacities on a study previously completed by the authors; the second concerns a complex segmentation problem on two different data sets (representing different geometries, namely bacterial cilia and mitochondria in a mouse neuron), both from public databases.

      The software described in the paper is compactly documented on its website, which additionally provides links to some YouTube videos (less than an hour in total) in which the authors record and comment on major workflows.

      In short, the manuscript describes a valuable resource for the community of tomography practitioners.

      Strengths:

      Public repository of segmentation models; easiness of working with several models and comparing/merging the results.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Last and colleagues describe Ais, an open-source software package for the semi-automated segmentation of cryo-electron tomography (cryo-ET) maps. Specifically, Ais provides a graphical user interface (GUI) for the manual segmentation and annotation of specific features of interest. These manual annotations are then used as input ground-truth data for training a convolutional neural network (CNN) model, which can then be used for automatic segmentation. Ais provides the option of several CNNs so that users can compare their performance on their structures of interest in order to determine the CNN that best suits their needs. Additionally, pretrained models can be uploaded and shared to an online database.

      Algorithms are also provided to characterize "model interactions", which allow users to define heuristic rules on how the different segmentations interact. For instance, a membrane-adjacent protein can have rules where it must colocalize within a certain distance of a membrane segmentation. Such rules can help reduce false positives; as in the case above, false positives predicted away from membranes are eliminated.
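
      The kind of heuristic rule described above can be sketched as a simple distance filter. This is not the Ais implementation or its API; the function name, the coordinates, and the 5-unit threshold are hypothetical, and the sketch works on point coordinates rather than on segmentation volumes.

```python
from math import dist


def filter_by_membrane_distance(particles, membrane_points, max_distance):
    """Keep only candidate particles within max_distance of any membrane point."""
    kept = []
    for particle in particles:
        nearest = min(dist(particle, m) for m in membrane_points)
        if nearest <= max_distance:
            kept.append(particle)
    return kept


# Hypothetical coordinates (x, y, z) in arbitrary units.
membrane = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
candidates = [
    (1.0, 2.0, 0.0),    # close to the membrane: kept
    (10.0, 50.0, 0.0),  # far from the membrane: treated as a false positive
    (19.0, 0.0, 3.0),   # close to the membrane: kept
]

kept = filter_by_membrane_distance(candidates, membrane, max_distance=5.0)
print(kept)
```

      A brute-force nearest-point search is used here for clarity; for real tomograms a distance transform of the membrane segmentation or a spatial index would be a more practical choice.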

      The authors then show how Ais can be used for particle picking and subsequent subtomogram averaging and for segmentation of cellular tomograms for visual analysis. For subtomogram averaging, they used a previously published dataset and compared the averages of their automated picking with the published manual picking. Analysis of cellular tomogram segmentations was primarily visual.

      Strengths:

      CNN-based segmentation of cryo-ET data is a rapidly developing area of research, as it promises substantially faster results than manual segmentation as well as the possibility for higher accuracy. However, this field is still very much in development, and the overall performance of these approaches, even across different algorithms, still leaves much to be desired. In this context, I think Ais is an interesting package, as it aims to provide both new and experienced users with streamlined approaches for manual annotation, access to a number of CNNs, and methods to refine the outputs of CNN models against each other. I think this can be quite useful for users, particularly as these methods develop.

    5. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank the reviewers for helping us improve our article and software. The feedback that we received was very helpful and constructive, and we hope that the changes that we have made are indeed effective at making the software more accessible, the manuscript clearer, and the online documentation more insightful as well. A number of comments related to shared concerns, such as:

      • the need to describe various processing steps more clearly (e.g. particle picking, or the nature of ‘dust’ in segmentations)

      • describing the features of Ais more clearly, and explaining how it can interface with existing tools that are commonly used in cryoET

      • a degree of subjectivity in the discussion of results (e.g. about Pix2pix performing better than other networks in some cases.)

      We have now addressed these important points, with a focus on streamlining not only the workflow within Ais but also making interfacing between Ais and other tools easier. For instance, we explain more clearly which file types Ais uses and we have added the option to export .star files for use in, e.g., Relion, or meshes instead of coordinate lists. We also include information in the manuscript about how the particle picking process is implemented, and how false positives (‘dust’) can be avoided. Finally, all reviewers commented on our notion that Pix2pix can work ‘better’ despite reaching a higher loss after training. As suggested, we included a brief discussion about this idea in the supplementary information (Fig. S6) and used it to illustrate how Ais enables iteratively improving segmentation results. 

      Since receiving the reviews we have also made a number of other changes to the software that are not discussed below but that we nonetheless hope have made the software more reliable and easier to use. These include expanding the available settings, slight changes to the image processing that can help speed it up or avoid artefacts in some cases, improving the GUI-free usability of Ais, and incorporating various tools that should help make it easier to use Ais with remote data (e.g. doing annotation on an office PC, but model training on a more powerful remote PC). We have also been in contact with a number of users of the software, who reported issues or suggested various other miscellaneous improvements, and many of whom had found the software via the reviewed preprint.

      Reviewer 1 (Public Review):

      This paper describes "Ais", a new software tool for machine-learning-based segmentation and particle picking of electron tomograms. The software can visualise tomograms as slices and allows manual annotation for the training of a provided set of various types of neural networks. New networks can be added, provided they adhere to a Python file with an (undescribed) format. Once networks have been trained on manually annotated tomograms, they can be used to segment new tomograms within the same software. The authors also set up an online repository to which users can upload their models, so they might be re-used by others with similar needs. By logically combining the results from different types of segmentations, they further improve the detection of distinct features. The authors demonstrate the usefulness of their software on various data sets. Thus, the software appears to be a valuable tool for the cryo-ET community that will lower the boundaries of using a variety of machine-learning methods to help interpret tomograms. 

      We thank the reviewer for their kind feedback and for taking the time to review our article. On the basis of their comments, we have made a number of changes to the software, article, and documentation that we think have helped improve the project and render it more accessible (especially for interfacing with different tools, e.g. the suggestions to describe the file formats in more detail). We respond to all individual comments one-by-one below.

      Recommendations:

      I would consider raising the level of evidence that this program is useful to *convincing* if the authors would adequately address the suggestions for improvement below.

      (1) It would be helpful to describe the format of the Python files that are used to import networks, possibly in a supplement to the paper. 

      We have now included this information in both the online documentation and as a supplementary note (Supplementary Note 1). 

      (2) Likewise, it would be helpful to describe the format in which particle coordinates are produced. How can they be used in subsequent sub-tomogram averaging pipelines? Are segmentations saved as MRC volumes? Or could they be saved as triangulations as well? More implementation details like this would be good to have in the paper, so readers don't have to go into the code to investigate. 

      Coordinates: previously, we only exported arrays of coordinates as tab-separated .txt files, compatible with e.g. EMAN2. We now added a selection menu where users can specify whether to export either .star files or tsv .txt files, which together we think should cover most software suites for subtomogram averaging. 

      Triangulations: We have now improved the functionality for exporting triangulations. In the particle picking menu, there is now the option to output either coordinates or meshes (as .obj files). This was previously possible in the Rendering tab, but with the inclusion in the picking menu exporting triangulations can now be done for all tomograms at once rather than manually one by one.

      Edits in the text: the output formats were previously not clear in the text. We have now included this information in the introduction:

      “[…] To ensure compatibility with other popular cryoET data processing suites, Ais employs file formats that are common in the field, using .mrc files for volumes, tab-separated .txt or .star files for particle datasets, and the .obj file format for exporting 3D meshes.”
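
      As an illustration of the two coordinate formats mentioned above, the following minimal sketch writes a list of particle coordinates either as tab-separated text or as a STAR block. It is not the Ais exporter: the function names, the `data_particles` block name, the two-decimal formatting, and the choice of the RELION-style `_rlnCoordinateX/Y/Z` columns are assumptions made for the example, and the columns Ais actually writes may differ.

```python
def coords_to_tsv(coords):
    """Tab-separated text: one particle per line, three coordinate columns."""
    return "".join(f"{x:.2f}\t{y:.2f}\t{z:.2f}\n" for x, y, z in coords)


def coords_to_star(coords, block_name="particles"):
    """Minimal STAR text with RELION-style coordinate labels.

    Assumption: the block name and the set of columns are illustrative only.
    """
    header = [
        f"data_{block_name}",
        "",
        "loop_",
        "_rlnCoordinateX #1",
        "_rlnCoordinateY #2",
        "_rlnCoordinateZ #3",
    ]
    rows = [f"{x:.2f} {y:.2f} {z:.2f}" for x, y, z in coords]
    return "\n".join(header + rows) + "\n"


if __name__ == "__main__":
    example = [(10.0, 20.5, 30.0), (1.0, 2.0, 3.0)]
    print(coords_to_tsv(example))
    print(coords_to_star(example))
```

      Returning strings rather than writing files keeps the sketch easy to inspect; in practice the text would be written to a `.txt` or `.star` file next to the tomogram.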

      (3) In Table 2, pix2pix has much higher losses than alternatives, yet the text states it achieves fewer false negatives and fewer false positives. An explanation is needed as to why that is. Also, it is mentioned that a higher number of epochs may have improved the results. Then why wasn't this attempted? 

      The architecture of Pix2pix is quite different from that of the other networks included in the test. Whereas all others are trained to minimize a binary cross entropy (BCE) loss, Pix2pix uses a composite loss function that is a weighted combination of the generator loss and a discriminator penalty, neither of which employs BCE. However, to be able to compare loss values, we do compute a BCE loss value for the Pix2pix generator after every training epoch. This is the value reported in the manuscript and in the software. Although Pix2pix’ BCE loss does indeed diminish during training, the model is not actually optimized to minimize this particular value and a comparison by BCE loss is therefore not entirely fair to Pix2pix. This is pointed out (in brief) in the legend to the table: 

      “Unlike the other architectures, Pix2pix is not trained to minimize the bce loss but uses a different loss function instead. The bce loss values shown here were computed after training and may not be entirely comparable.”
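
      For reference, a BCE value of the kind used for this comparison can be computed from any model's output, whatever loss the model was trained with. The sketch below is a plain NumPy version of the standard per-pixel binary cross entropy; the function name and the clipping constant `eps` are assumptions for the example, and it is not taken from the Ais code.

```python
import numpy as np


def bce_loss(y_true, y_pred, eps=1e-7):
    """Mean binary cross entropy between annotations and predictions in [0, 1]."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions so that log(0) never occurs (eps is an assumed value).
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    per_pixel = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(per_pixel.mean())


if __name__ == "__main__":
    annotation = [1, 0, 1, 1]
    print(bce_loss(annotation, [0.9, 0.1, 0.8, 0.7]))  # confident and mostly right
    print(bce_loss(annotation, [0.5, 0.5, 0.5, 0.5]))  # uninformative prediction
```

      Because a model trained on a different objective is not pushed toward low BCE, a higher value here does not by itself imply worse segmentations, which is the point made in the legend.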

      Regarding the extra number of epochs for Pix2pix: here, we initially ran into the problem that the number of samples in the training data was low for the number of parameters in Pix2pix, leading to divergence later during training. This problem did not occur for most other models, so we decided to keep the data for the discussion around Table 1 and Figure 2 limited to that initial training dataset. After that, we increased the sample size (from 58 to 170 positive samples) and trained the model for longer. The resulting model was used in the subsequent analyses. This was previously implicit in the text but is now mentioned explicitly and in a new supplementary figure. 

      “For the antibody platform, the model that would be expected to be one of the worst based on the loss values, Pix2pix, actually generates segmentations that seem well-suited for the downstream processing tasks. It also outputs fewer false positive segmentations for sections of membranes than many other models, including the lowest-loss model UNet. Moreover, since Pix2pix is a relatively large network, it might also be improved further by increasing the number of training epochs. We thus decided to use Pix2pix for the segmentation of antibody platforms, and increased the size of the antibody platform training dataset (from 58 to 170 positive samples) to train a much improved second iteration of the network for use in the following analyses (Fig. S6).”

      (4) It is not so clear what absorb and emit mean in the text about model interactions. A few explanatory sentences would be useful here. 

      We have expanded this paragraph to include some more detail.

      “Besides these specific interactions between two models, the software also enables pitching multiple models against one another in what we call ‘model competition’. Models can be set to ‘emit’ and/or ‘absorb’ competition from other models. Here, to emit competition means that a model’s prediction value is included in a list of competing models. To absorb competition means that a model’s prediction value will be compared to all values in that list, and that this model’s prediction value for any pixel will be set to zero if any of the competing models’ prediction value is higher. On a pixel-by-pixel basis, all models that absorb competition are thus suppressed whenever their prediction value for a pixel is lower than that of any of the emitting models.”
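
      The emit/absorb rule quoted above can be written in a few lines. This is a minimal sketch, assuming each model's prediction is a NumPy array of the same shape; the function name and the dictionary-based interface are hypothetical and do not reflect how Ais stores models.

```python
import numpy as np


def apply_competition(predictions, emitters, absorbers):
    """Zero an absorbing model's pixels wherever an emitting model scores higher.

    predictions: dict mapping model name -> prediction array (same shape).
    emitters / absorbers: collections of model names (hypothetical interface).
    """
    out = {name: np.array(pred, dtype=float) for name, pred in predictions.items()}
    if not emitters:
        return out
    # Pixel-wise maximum over all emitting models, taken from the unmodified inputs.
    strongest = np.maximum.reduce(
        [np.asarray(predictions[name], dtype=float) for name in emitters]
    )
    for name in absorbers:
        # Strict comparison: a model is never suppressed by its own value or a tie.
        out[name][strongest > out[name]] = 0.0
    return out


if __name__ == "__main__":
    preds = {
        "membrane": np.array([0.9, 0.2, 0.5]),
        "carbon": np.array([0.1, 0.8, 0.5]),
    }
    print(apply_competition(preds, {"membrane", "carbon"}, {"membrane", "carbon"}))
```

      When every model both emits and absorbs, each pixel keeps only its highest-scoring model (ties are left untouched), so the competition acts like a per-pixel winner-takes-all.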

      (5) Under Figure 4, the main text states "the model interactions described above", but because multiple interactions were described it is not clear which ones they were. Better to just specify again. 

      Changed as follows:

      “The antibody platform and antibody-C1 complex models were then applied to the respective datasets, in combination with the membrane and carbon models and the model interactions described above (Fig. 4b): the membrane avoiding carbon, and the antibody platforms colocalizing with the resulting membranes”.

      (6) The next paragraph mentions a "batch particle picking process to determine lists of particle coordinates", but the algorithm for how coordinates are obtained from segmented volumes is not described. 

      We have added a paragraph to the main text to describe the picking process:

      “This picking step comprises a number of processing steps (Fig. S7). First, the segmented (.mrc) volumes are thresholded at a user-specified level. Second, a distance transform of the resulting binary volume is computed, in which every nonzero pixel in the binary volume is assigned a new value, equal to the distance of that pixel to the nearest zero-valued pixel in the mask. Third, a watershed transform is applied to the resulting volume, so that the sets of pixels closest to any local maximum in the distance transformed volume are assigned to one group. Fourth, groups that are smaller than a user-specified minimum volume are discarded. Fifth, groups are assigned a weight value, equal to the sum of the prediction value (i.e. the corresponding pixel value in the input .mrc volume) of the pixels in the group. For every group found within close proximity to another group (using a user-specified value for the minimum particle spacing), the group with the lower weight value is discarded. Finally, the centroid coordinate of the grouped pixels is considered the final particle coordinate, and the list of all coordinates is saved in a tab-separated text file.

      “As an alternative output format, segmentations can also be converted to and saved as triangulated meshes, which can then be used for, e.g., membrane-guided particle picking. After picking particles, the resulting coordinates are immediately available for inspection in the Ais 3D renderer (Fig. S8).”
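
      The picking steps quoted above (threshold, distance transform, watershed, minimum volume, weight-based spacing filter, centroid) can be sketched as follows. This is an illustration under stated assumptions rather than the Ais implementation: the function names and default parameter values are hypothetical, the watershed is a simple priority-flood variant seeded at local maxima of the distance map, the spacing rule is applied greedily from the heaviest group downward, and coordinates are returned in array-axis order.

```python
import heapq
import itertools

import numpy as np
from scipy import ndimage as ndi


def _flood(dist, markers, mask):
    """Marker-based watershed by priority flooding, highest distance first."""
    labels = markers.copy()
    offsets = [o for o in itertools.product((-1, 0, 1), repeat=dist.ndim) if any(o)]
    heap = [(-dist[idx], idx) for idx in map(tuple, np.argwhere(markers > 0))]
    heapq.heapify(heap)
    while heap:
        _, idx = heapq.heappop(heap)
        for off in offsets:
            nb = tuple(i + o for i, o in zip(idx, off))
            inside = all(0 <= n < s for n, s in zip(nb, dist.shape))
            if inside and mask[nb] and labels[nb] == 0:
                labels[nb] = labels[idx]
                heapq.heappush(heap, (-dist[nb], nb))
    return labels


def pick_particles(volume, threshold=0.5, min_volume=10, min_spacing=5.0):
    """Return particle coordinates (array-axis order), heaviest group first."""
    mask = volume > threshold                                  # 1. threshold
    if not mask.any():
        return np.empty((0, volume.ndim))
    dist = ndi.distance_transform_edt(mask)                    # 2. distance transform
    box = np.ones((3,) * volume.ndim, dtype=bool)
    peaks = mask & (dist == ndi.maximum_filter(dist, footprint=box))
    markers, _ = ndi.label(peaks, structure=box)               # seeds at local maxima
    labels = _flood(dist, markers, mask)                       # 3. watershed
    sizes = np.bincount(labels.ravel())
    weights = np.bincount(labels.ravel(), weights=volume.ravel())
    ids = [i for i in range(1, sizes.size) if sizes[i] >= min_volume]  # 4. min volume
    ids.sort(key=lambda i: weights[i], reverse=True)           # 5. heaviest first
    picked = []
    for i in ids:
        centroid = np.argwhere(labels == i).mean(axis=0)       # 6. centroid
        # Greedy spacing rule: a group too close to a heavier one is dropped.
        if all(np.linalg.norm(centroid - p) >= min_spacing for p in picked):
            picked.append(centroid)
    return np.array(picked).reshape(-1, volume.ndim)
```

      The returned array can be written out with, for example, `np.savetxt(path, coords, delimiter="\t")` to obtain a tab-separated coordinate list. The pure-Python flooding loop is written for readability and would be slow on full-size tomograms.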

      The two supplementary figures are pasted below for convenience. Fig. S7 is new, while Fig. S8 was previously Fig. S10; the reference to this figure was originally missing in the main text, but is now included.

      (7) In the Methods section, it is stated that no validation splits are used "in order to make full use of an input set". This sounds like an odd decision, given the importance of validation sets in the training of many neural networks. Then how is overfitting monitored or prevented? This sounds like a major limitation of the method. 

      In our experience, the best way of preparing a suitable model is to (iteratively) annotate a set of training images and visually inspect the result. Since the manual annotation step is the bottleneck in this process, we decided not to use a validation split in order to make full use of an annotated training dataset (i.e. a validation split of 20% would mean that 20% of the manually annotated training data is not used for training).

      We do recognize the importance of using separate data for validation, or at least offering the possibility of doing so. We have now added a parameter to the settings (and made a Settings menu item available in the top menu bar) where users can specify what fraction (0, 10, 20, or 50%) of training datasets should be set aside for validation. If the chosen value is not 0%, the software reports the validation loss as well as the size of the split during training, rather than (as was done previously) the training loss. We have, however, set the default value for the validation split to 0%, for the same reason as before. We also added a section to the online documentation about using validation splits, and edited the corresponding paragraph in the methods section:

      “The reported loss is that calculated on the training dataset itself, i.e., no validation split was applied. During regular use of the software, users can specify whether to use a validation split or not. By default, a validation split is not applied, in order to make full use of an input set of ground truth annotations. Depending on the chosen split size, the software reports either the overall training loss or the validation loss during training.”
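
      The effect of the validation-split setting can be illustrated with a short sketch. It assumes the annotated images and their labels are NumPy arrays with matching first dimensions; the function name, the random shuffling, and the `seed` parameter are assumptions for the example and are not taken from Ais.

```python
import numpy as np


def split_for_validation(images, labels, fraction=0.0, seed=0):
    """Set aside `fraction` of the annotated samples for validation.

    With fraction=0.0 (the default described above) every sample is used
    for training and the validation set is empty.
    """
    n = len(images)
    n_val = int(round(n * fraction))
    order = np.random.default_rng(seed).permutation(n)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (images[train_idx], labels[train_idx]), (images[val_idx], labels[val_idx])


if __name__ == "__main__":
    imgs = np.zeros((58, 64, 64))   # e.g. 58 annotated samples
    labs = np.zeros((58, 64, 64))
    (train_x, _), (val_x, _) = split_for_validation(imgs, labs, fraction=0.2)
    print(len(train_x), len(val_x))
```

      The trade-off discussed in the response is visible directly: any nonzero fraction removes that share of manually annotated samples from training.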

      (8) Related to this point: how is the training of the models in the software modelled? It might be helpful to add a paragraph to the paper in which this process is described, together with indicators of what to look out for when training a model, e.g. when should one stop training? 

      We have expanded the paragraph where we write about the utility of comparing different network architectures to also include a note on how Ais facilitates monitoring the output of a model during training:

      “When taking the training and processing speeds into account as well as the segmentation results, there is no overall best architecture. We therefore included multiple well-performing model architectures in the final library, in order to allow users to select from these models to find one that works well for their specific datasets. Although it is not necessary to screen different network architectures and users may simply opt to use the default (VGGNet), these results thus show that it can be useful to test different networks in order to identify one that is best. Moreover, these results also highlight the utility of preparing well-performing models by iteratively improving training datasets and re-training models in a streamlined interface. To aid in this process, the software displays the loss value of a network during training and allows for the application of models to datasets during training. Thus, users can inspect how a model’s output changes during training and decide whether to interrupt training and improve the training data or choose a different architecture.”

      (9) Figure 1 legend: define the colours of the different segmentations. 

      Done

      (10) It may be better to colour Figure 2B with the same colours as Figure 2A. 

      We tried this, but the effect is that the underlying density is much harder to see. We think the current grayscale image paired with the various segmentations underneath is better for visually identifying which density corresponds to membranes, carbon film, or antibody platforms.

      Reviewer 2 (Public Review):

      Summary: 

      Last et al. present Ais, a new deep learning-based software package for the segmentation of cryo-electron tomography data sets. The distinguishing factor of this package is its orientation to the joint use of different models, rather than the implementation of a given approach. Notably, the software is supported by an online repository of segmentation models, open to contributions from the community. 

      The usefulness of handling different models in one single environment is showcased with a comparative study on how different models perform on a given data set; then with an explanation of how the results of several models can be manually merged by the interactive tools inside Ais. 

      The manuscript presents two applications of Ais on real data sets; one is oriented to showcase its particle-picking capacities on a study previously completed by the authors; the second one refers to a complex segmentation problem on two different data sets (representing different geometries such as bacterial cilia and mitochondria in a mouse neuron), both from public databases. 

      The software described in the paper is compactly documented on its website, additionally providing links to some YouTube videos (less than an hour in total) where the authors videocapture and comment on major workflows. 

      In short, the manuscript describes a valuable resource for the community of tomography practitioners. 

      Strengths: 

      A public repository of segmentation models; easiness of working with several models and comparing/merging the results. 

      Weaknesses: 

      A certain lack of concretion when describing the overall features of the software that differentiate it from others. 

      We thank the reviewer for their kind and constructive feedback. Following the suggestion to use the Pix2pix results to illustrate the utility of Ais for analyzing results, we have added a new supplementary figure (Fig. S6) and brief discussion, showing the use of Ais in iteratively improving segmentation results. We have also expanded the online documentation and included a note in the supplementary information about how models are saved/loaded (Supplementary Note 1).

      Recommendations:

      I would like to ask the authors about some concerns about the Ais project as a whole: 

      (1) The website that accompanies the paper (aiscryoet.org), albeit functional, seems to be in its first steps. Is it planned to extend it? In particular, one of the major contributions of the paper (the maintenance of an open repository of models) could use better documentation describing the expected formats to submit models. This could even be discussed in the supplementary material of the manuscript, as this feature is possibly the most distinctive one of the paper. Engaging third-party users would require giving them an easier entry point, and the superficial mention of this aspect in the online documentation could be much more generous.

      We have added a new page to the online documentation, titled ‘Sharing models’, where we include an explanation of the structure of model files and demonstrate the upload page. We also added a note to the Supplementary Information that explains the file format for models, and how they are loaded/saved (i.e., that these are standard Keras model objects). 

      To make it easier to interface Ais with other tools, we have now also made some of the core functionality available (e.g. training models, batch segmentation) via the command line interface. Information on how to use this is included in the online documentation. All file formats are common formats used in cryoET, so that using Ais in a workflow with, e.g. AreTomo -> Ais -> Relion should now be more straightforward.

      (2) A different major line advanced by the authors to underpin the novelty of the software, is its claimed flexibility and modularity. In particular, the restrictions of other packages in terms of visualization and user interaction are mentioned. Although in the manuscript it is also mentioned that most of the functionalities in Ais are already available in major established packages, as a reader I am left confused about what exactly makes the offer of Ais different from others in terms of operation and interaction: is it just the two aspects developed in the manuscript (possibility of using different models and tools to operate model interaction)? If so, it should probably be stated; but if the authors want to pinpoint other aspects of the capacity of Ais to drive smoothly the interactions, they should be listed and described, instead of leaving it as an unspecific comment. As a potential user of Ais, I would suggest the authors add (maybe in the supplementary material) a listing of such features. Figure 1 does indeed carry the name "overview of (...) functionalities", but it is not clear to me which functionalities I can expect to be absent or differently solved on the other tools they mention.

      We have rewritten the part of the introduction where we previously listed the features as below. We think it should now be clearer for the reader to know what features to expect, as well as how Ais can interface with other software (i.e. what the inputs and outputs are). We have also edited the caption for Figure 1 to make it explicit that panels A to C represent the annotation, model preparation, and rendering steps of the Ais workflow and that the images are screenshots from the software.

      “In this report we present Ais, an open-source tool that is designed to enable any cryoET user – whether experienced with software and segmentation or a novice – to quickly and accurately segment their cryoET data in a streamlined and largely automated fashion. Ais comprises a comprehensive and accessible user interface within which all steps of segmentation can be performed, including: the annotation of tomograms and compiling datasets for the training of convolutional neural networks (CNNs), training and monitoring performance of CNNs for automated segmentation, 3D visualization of segmentations, and exporting particle coordinates or meshes for use in downstream processes. To help generate accurate segmentations, the software contains a library of various neural network architectures and implements a system of configurable interactions between different models. Overall, the software thus aims to enable a streamlined workflow where users can interactively test, improve, and employ CNNs for automated segmentation. To ensure compatibility with other popular cryoET data processing suites, Ais employs file formats that are common in the field, using .mrc files for volumes, tab-separated .txt or .star files for particle datasets, and the .obj file format for exporting 3D meshes.”

      “Figure 1 – an overview of the user interface and functionalities. The various panels represent sequential stages in the Ais processing workflow, including annotation (a), testing CNNs (b), visualizing segmentation (c). These images (a-c) are unedited screenshots of the software. a) […]”

      (3) Table 1 could have the names of the three last columns. The table has enough empty space in the other columns to accommodate this. 

      Done.

      (4) The comment about Pix2pix needing a larger number of training epochs (being a larger model than the other ones considered) is interesting. It also lends itself for the authors to illustrate the ability of their software to precisely do this: allow the users to flexibly analyze results and test hypotheses.

      Please see the response to Reviewer 1 comment #3. We agree that this is a useful example of the ability to iterate between annotation and training, and have added an explicit mention of this in the text:

      “Moreover, since Pix2pix is a relatively large network, it might also be improved further by increasing the number of training epochs. In a second iteration of annotation and training, we thus increased the size of the antibody platform training dataset (from 58 to 170 positive samples) and generated an improved Pix2pix model for use in the following analyses.”

      Reviewer 3 (Public Review):

      We appreciate the reviewer’s extensive and very helpful feedback and are glad to read that they consider Ais potentially quite useful for the users. To address the reviewer’s comments, we have made various edits to the text, figures, and documentation, that we think have helped improve the clarity of our work. We list all edits below. 

      Summary

      In this manuscript, Last and colleagues describe Ais, an open-source software package for the semi-automated segmentation of cryo-electron tomography (cryo-ET) maps. Specifically, Ais provides a graphical user interface (GUI) for the manual segmentation and annotation of specific features of interest. These manual annotations are then used as input ground-truth data for training a convolutional neural network (CNN) model, which can then be used for automatic segmentation. Ais provides the option of several CNNs so that users can compare their performance on their structures of interest in order to determine the CNN that best suits their needs. Additionally, pre-trained models can be uploaded and shared to an online database. 

      Algorithms are also provided to characterize "model interactions" which allows users to define heuristic rules on how the different segmentations interact. For instance, a membrane-adjacent protein can have rules where it must colocalize a certain distance away from a membrane segmentation. Such rules can help reduce false positives; as in the case above, false negatives predicted away from membranes are eliminated. 

      The authors then show how Ais can be used for particle picking and subsequent subtomogram averaging and for the segmentation of cellular tomograms for visual analysis. For subtomogram averaging, they used a previously published dataset and compared the averages of their automated picking with the published manual picking. Analysis of cellular tomogram segmentation was primarily visual. 

      Strengths:

      CNN-based segmentation of cryo-ET data is a rapidly developing area of research, as it promises substantially faster results than manual segmentation as well as the possibility for higher accuracy. However, this field is still very much in development and the overall performance of these approaches, even across different algorithms, still leaves much to be desired. In this context, I think Ais is an interesting package, as it aims to provide both new and experienced users with streamlined approaches for manual annotation, access to a number of CNNs, and methods to refine the outputs of CNN models against each other. I think this can be quite useful for users, particularly as these methods develop. 

      Weaknesses: 

      Whilst overall I am enthusiastic about this manuscript, I still have a number of comments: 

      (1) On page 5, paragraph 1, there is a discussion on human judgement of these results. I think a more detailed discussion is required here, as from looking at the figures, I don't know that I agree with the authors' statement that Pix2pix is better. I acknowledge that this is extremely subjective, which is the problem. I think that a manual segmentation should also be shown in a figure so that the reader has a better way to gauge the performance of the automated segmentation.

      Please see the answer to Reviewer 1’s comment #3.

      (2) On page 7, the authors mention terms such as "emit" and "absorb" but never properly define them, such that I feel like I'm guessing at their meaning. Precise definitions of these terms should be provided. 

      We have expanded this paragraph to include some more detail:

      “Besides these specific interactions between two models, the software also enables pitching multiple models against one another in what we call ‘model competition’. Models can be set to ‘emit’ and/or ‘absorb’ competition from other models. Here, to emit competition means that a model’s prediction value is included in a list of competing models. To absorb competition means that a model’s prediction value will be compared to all values in that list, and that this model’s prediction value for any pixel will be set to zero if any of the competing models’ prediction value is higher. On a pixel-by-pixel basis, all models that absorb competition are thus suppressed whenever their prediction value for a pixel is lower than that of any of the emitting models.” 

      (3) For Figure 3, it's unclear if the parent models shown (particularly the carbon model) are binary or not. The figure looks to be grey values, which would imply that it's the visualization of some prediction score. If so, how is this thresholded? This can also be made clearer in the text. 

      The figures show the grayscale output of the parent model, but this grayscale output is thresholded to produce a binary mask that is used in an interaction. We have edited the text to include a mention of thresholding at a user-specified threshold value:

      “These interactions are implemented as follows: first, a binary mask is generated by thresholding the parent model’s predictions using a user-specified threshold value. Next, the mask is dilated using a circular kernel with radius 𝑅, a parameter that we call the interaction radius. Finally, the child model’s prediction values are multiplied with this mask.”

      To avoid confusion, we have also edited the figure to show the binary masks rather than the grayscale segmentations. 
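
      The three steps of a model interaction (threshold the parent, dilate by the interaction radius, multiply with the child) can be sketched for a single 2D slice as follows. The function and parameter names are hypothetical, and the `avoid` option, which inverts the mask so that the child is suppressed near the parent, is an assumption about how an avoidance rule might be expressed; only the colocalization form is described in the quoted text.

```python
import numpy as np
from scipy import ndimage as ndi


def apply_interaction(child, parent, threshold=0.5, radius=2, avoid=False):
    """Mask a child model's 2D prediction using a dilated parent mask."""
    mask = parent > threshold                       # 1. binary parent mask
    r = int(radius)
    yy, xx = np.ogrid[-r:r + 1, -r:r + 1]
    disk = (xx ** 2 + yy ** 2) <= r ** 2            # circular kernel of radius R
    mask = ndi.binary_dilation(mask, structure=disk)  # 2. dilate by R
    if avoid:
        mask = ~mask                                # assumed form of 'avoid'
    return child * mask                             # 3. multiply child by mask


if __name__ == "__main__":
    parent = np.zeros((7, 7))
    parent[3, 3] = 1.0
    child = np.ones((7, 7))
    print(apply_interaction(child, parent, radius=2))
```

      With a single parent pixel and radius 2, the child survives only inside a small disk around that pixel, which is how a colocalization rule removes predictions far from the parent structure.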

      (4) Figure 3D was produced in ChimeraX using the hide dust function. I think some discussion on the nature of this "dust" is in order, e.g. how much is there and how large does it need to be to be considered dust? Given that these segmentations can be used for particle picking, this seems like it may be a major contributor to false positives. 

      ‘Dust’ in segmentations is essentially unavoidable; avoiding it would require a perfect model that does not produce any false positives. However, when models are sufficiently accurate, the volume of false positives is typically smaller than that of the structures that were intended to be segmented. In these cases, discarding particles based on size is a practical way of filtering the segmentation results. Since it is difficult to generalize when to consider something ‘dust’, we decided to include this additional text in the Methods section rather than in the main text:

      “… with the use of the ‘hide dust’ function (the same settings were used for each panel, different settings used for each feature).

      This ‘dust’ corresponds to small (in comparison to the segmented structures of interest) volumes of false positive segmentations, which are present in the data due to imperfections in the used models. The rate and volume of false positives can be reduced either by improving the models (typically by including more examples of the images of what would be false negatives or positives in the training data) or, if the dust particles are indeed smaller than the structures of interest, they can simply be discarded by filtering particles based on their volume, as applied here. In particle picking, a ‘minimum particle volume’ is specified – particles with a smaller volume are considered ‘dust’.”

      This text should be read in combination with the newly included text about the method of converting volumes into lists of coordinates (see Reviewer 1’s comment #6):

      “Third, a watershed transform is applied to the resulting volume, so that the sets of pixels closest to any local maximum in the distance transformed volume are assigned to one group. Fourth, groups that are smaller than a user-specified minimum volume are discarded…”

      We think it should now be clearer that (some form of) discarding ‘dust’ is a step that is typically included in the particle picking process.
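
      A volume-based dust filter of the kind described here can be sketched as a connected-component size filter on a binarized segmentation. This is an illustration only: the function name is hypothetical, and it uses SciPy's default face-connectivity labelling rather than whatever grouping ChimeraX or Ais apply internally.

```python
import numpy as np
from scipy import ndimage as ndi


def remove_dust(binary_volume, min_volume):
    """Keep only connected components with at least `min_volume` voxels."""
    labels, _ = ndi.label(binary_volume)      # face-connected components
    sizes = np.bincount(labels.ravel())       # voxel count per component
    keep = sizes >= min_volume
    keep[0] = False                           # label 0 is background
    return keep[labels]


if __name__ == "__main__":
    vol = np.zeros((6, 6, 6), dtype=bool)
    vol[1:3, 1:3, 1:3] = True                 # an 8-voxel structure
    vol[5, 5, 5] = True                       # a single-voxel speck of 'dust'
    print(remove_dust(vol, min_volume=4).sum())
```

      Choosing `min_volume` remains a judgment call, which is the difficulty noted above: the filter only works when false positives are reliably smaller than the structures of interest.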

      (5) Page 9 contains the following sentence: "After selecting these values, we then launched a batch particle picking process to determine lists of particle coordinates based on the segmented volumes." Given how important this is, I feel like this requires significant description, e.g. how are densities thresholded, how are centers determined, and what if there are overlapping segmentations? 

      Please see the response to Reviewer 1’s comment #6.

      (6) The FSC shown in Figure S6 for the auto-picked maps is concerning. First, a horizontal line at FSC = 0 should be added. It seems that starting at a frequency of ~0.045, the FSC of the autopicked map increases above zero and stays there. Since this is not present in the FSC of the manually picked averages, this suggests the automatic approach is also finding some sort of consistent features. This needs to be discussed. 

      Thank you for pointing this out. Awkwardly, this was due to a mistake made while formatting the figure. In the two separate original plots, the Y axes had slightly different ranges, but this was missed when they were combined to prepare the joint supplementary figure. As a result, the FSC values for the autopicked half maps are displayed incorrectly. The original separate plots are shown below to illustrate the discrepancy:

      Author response image 1.

      The corrected figure is Figure S9 in the manuscript. The values of 44 Å and 46 Å were not determined from the graph and remain unchanged.

      (7) Page 11 contains the statement "the segmented volumes found no immediately apparent false positive predictions of these pores". This is quite subjective and I don't know that I agree with this assessment. Unless the authors decide to quantify this through subtomogram classification, I don't think this statement is appropriate. 

      We originally included this statement and the supplementary figure because we wanted to show another example of automated picking, this time in the more crowded environment of the cell. We do agree that it requires better substantiation, but also think that the demonstration of automated picking of the antibody platforms and IgG3-C1 complexes for subtomogram averaging suffices to demonstrate Ais’ picking capabilities. Since the supplementary information includes an example of picked coordinates rendered in the Ais 3D viewer (Figure S7) that also used the pore dataset, we still include the supplementary figure (S10) but have edited the statement to read:

      “Moreover, we could identify the molecular pores within the DMV, and pick sets of particles that might be suitable for use in subtomogram averaging (see Fig. S11).”

      We have also expanded the text that accompanies the supplementary figure to emphasize that results from automated picking are likely to require further curation, e.g. by classification in subtomogram averaging, and that the selection of particles is highly dependent on the thresholds used in the conversion from volumes to lists of coordinates.

      (8) In the methods, the authors note that particle picking is explained in detail in the online documentation. Given that this is a key feature of this software, such an explanation should be in the manuscript. 

      Please see the response to Reviewer 1’s comment #6. 

      Recommendations:

      (9) The word "model" seems to be used quite ambiguously. Sometimes it seems to refer to the manual segmentations, the CNN architectures, the trained models, or the output predictions. More precision in this language would greatly improve the readability of the manuscript.

      This was indeed quite ambiguous, especially in the introduction. We have edited the text to be clearer on these differences. The word ‘model’ is now only used to refer to trained CNNs that segment a particular feature (as in ‘membrane model’ or ‘model interactions’). Where we used terms such as ‘3D models’ to describe scenes rendered in 3D, we now use ‘3D visualizations’ or similar terms. Where we previously used the term ‘models’ to refer to CNN architectures, we now use terms such as ‘neural network architectures’ or ‘architecture’. Some examples:

      … with which one can automatically segment the same or any other dataset …

      Moreover, since Pix2pix is a relatively large network, …

      … to generate a 3D visualization of ten distinct cellular …

      … with the use of the same training datasets for all network architectures …

      In Figure 1, the text in panels D and E is illegible. 

      We have edited the figure to show the text more clearly (the previous images were unedited screenshots of the website).

      (10) Prior to the section on model interactions, I was under the impression that all annotations were performed simultaneously. I think it could be clarified that models are generated per annotation type. 

      Multiple different features can be annotated (i.e. drawn by hand by the user) at the same time, but each trained CNN only segments one feature. CNNs that output segmentations for multiple features can be implemented straightforwardly, but this introduces the need to provide training data where for every grayscale image, every feature is annotated. This can make preparing the training data much more cumbersome. Reusability of the models is also hampered. We now mention the separateness of the networks explicitly in the introduction:

      “Multiple features, such as membranes, microtubules, ribosomes, and phosphate crystals, can be segmented and edited at the same time across multiple datasets (even hundreds). These annotations are then extracted and used as ground truth labels upon which to condition multiple separate neural networks, …”

      (11) On page 6, there is the text "some features are assigned a high segmentation value by multiple of the networks, leading to ambiguity in the results". Do they mean some false features? 

      To avoid ambiguity of the word ‘features’, we have edited the sentence to read:

      “… some parts of the image are assigned a high segmentation value by multiple of the networks, leading to false classifications and ambiguity in the results.”

      (12) Figures 2 and 3 would be easier to follow if they had consistent coloring. 

      We have changed the colouring in Figure 2 to match that of Figure 3 better:

      (13) For Figure 3D, I'm confused as to why the authors showed results from the tomogram in Figure 2B. It seems like the tomogram in Figure 3C would be a more obvious choice, as we would be able to see how the 2D slices look in 3D. This would also make it easier to see the effect of interactions on false negatives. Also, since the orientation of the tomogram in 2B is quite different than that shown in 3D, it's a bit difficult to relate the two.

      We chose to show this dataset because it exemplifies the effects of both model competition and model interactions better than the tomogram in Figure 3C. See Figure 3D and Author response image 2 for a comparison:

      Author response image 2.

      (14) I'm confused as to why the tomographic data shown in Figures 4D, E, and F are black on white while all other cryo-ET data is shown as white on black. 

      The images in Figure 4DEF are now inverted.

      (15) For Figure 5, there needs to be better visual cueing to emphasize which tomographic slices are related to the segmentations in Panels A and B. 

      We have edited the figure to show more clearly which grayscale image corresponds to which segmentation:

      (16) I don't understand what I should be taking away from Figures S1 and S2. There are a lot of boxes around membrane areas and I don't know what these boxes mean. 

      We have added more descriptive text to these figures. The boxes are placed by the user to select areas of the image that will be sampled when saving training datasets.

    1. eLife Assessment

      The authors report that a secreted ubiquitin ligase of Shigella, called IpaH1.4, mediates the degradation of a host defense factor, RNF213. The data are solid and represent an important contribution to our understanding of cell-autonomous immunity and bacterial pathogenesis, as they provide new mechanistic insight into how the cytosolic bacterial pathogen Shigella flexneri evades IFN-induced host immunity.

    2. Reviewer #1 (Public review):

      Shigella flexneri is a bacterial pathogen and a globally significant cause of diarrhea. Shigella pathogenesis remains poorly understood. In their manuscript, Saavedra-Sanchez et al report their discovery that a secreted E3 ligase effector of Shigella, called IpaH1.4, mediates the degradation of a host E3 ligase called RNF213. RNF213 was previously described to mediate ubiquitylation of intracellular bacteria, an initial step in their targeting to xenophagosomes. Thus, Shigella IpaH1.4 appears to be an important factor in permitting evasion of RNF213-mediated host defense.

      Strengths:

      The work is focused, convincing, well-performed, and important. The manuscript is well-written.

    3. Reviewer #2 (Public review):

      Summary:

      The authors find that the bacterial pathogen Shigella flexneri uses the T3SS effector IpaH1.4 to induce degradation of the IFNg-induced protein RNF213. They show that in the absence of IpaH1.4, cytosolic Shigella is bound by RNF213. Furthermore, RNF213 conjugates linear and lysine-linked ubiquitin to Shigella independently of LUBAC. Intriguingly, they find that Shigella lacking ipaH1.4 or mxiE, which regulates the expression of some T3SS effectors, are not killed even when ubiquitylated by RNF213 and that these mutants are still able to replicate within the cytosol, suggesting that Shigella encodes additional effectors to escape from host defenses mediated by RNF213-driven ubiquitylation.

      Strengths:

      The authors take a variety of approaches, including host and bacterial genetics, gain-of-function and loss-of-function assays, cell biology, and biochemistry. Overall, the experiments are elegantly designed, rigorous, and convincing.

      Weaknesses:

      The authors find that ipaH1.4 mutant S. flexneri no longer degrades RNF213 and recruits RNF213 to the bacterial surface. The authors should perform genetic complementation of this mutant with WT ipaH1.4 and the catalytically inactive ipaH1.4 to confirm that ipaH1.4 catalytic activity is indeed responsible for the observed phenotype.

    4. Reviewer #3 (Public review):

      Summary:

      In this study, the authors set out to investigate whether and how Shigella avoids cell-autonomous immunity initiated through M1-linked ubiquitin and the immune sensor and E3 ligase RNF213. The key findings are that the Shigella flexneri T3SS effector IpaH1.4 induces degradation of RNF213. Without IpaH1.4, the bacteria are marked with RNF213 and ubiquitin following stimulation with IFNg. Interestingly, this is not sufficient to initiate the destruction of the bacteria, leading the authors to conclude that Shigella deploys additional virulence factors to avoid this host immune response. The second key finding of this paper is the suggestion that M1 chains decorate the mxiE/ipaH Shigella mutant independent of LUBAC, which is, by and large, considered the only enzyme capable of generating M1-linked ubiquitin chains.

      Strengths:

      The data is for the most part well controlled and clearly presented with appropriate methodology. The authors convincingly demonstrate that IpaH1.4 is the effector responsible for the degradation of RNF213 via the proteasome, although the site of modification is not identified.

      Weaknesses:

      The work builds on prior work from the same laboratory that suggests that M1 ubiquitin chains can be formed independently of LUBAC (in the prior publication this related to Chlamydia inclusions). In this study, two pieces of evidence support this statement: fluorescence microscopy-based images, with accompanying quantification in Hoip and Hoil knockout cells, of M1-ub association with Shigella mutants detected using an antibody; and the use of an internally tagged Ub-K7R mutant, which cannot be incorporated into ubiquitin chains via its lysine residues. Given that clones of the M1-specific antibody are not always specific for M1 chains, and because it remains formally possible that the Int-K7R Ub can be added to the end of the chain as a chain terminator or as mono-ub, the authors should strengthen these findings relating to the claim that another E3 ligase can generate M1 chains de novo.

      The main weakness relating to the infection work is that no bacterial protein loading control is assayed in the western blots of infected cells, leaving the reader unable to determine if changes in RNF213 protein levels are the result of the absent bacterial protein (e.g. IpaH1.4) or altered infection levels.

      The importance of IFNgamma priming for RNF213 association to the mxiE or ipaH1.4 strain could have been investigated further as it is unclear if RNF213 coating is enhanced due to increased protein expression of RNF213 or another factor. This is of interest as IFNgamma priming does not seem to be needed for RNF213 to detect and coat cytosolic Salmonella.

      Overall, the findings are important for the host-pathogen field, cell-autonomous/innate immune signaling fields, and microbial pathogenesis fields. If further evidence for LUBAC-independent M1 ubiquitylation is obtained, this would represent a significant finding.

    1. eLife assessment

      This fundamental work describes an understudied bird migration pattern using data from an Arctic raptor. With an extensive dataset and comprehensive analyses, the observed pattern is convincing. This study will be of interest to researchers exploring the ecological drivers of bird migration.

    2. Reviewer #4 (Public review):

      Summary:

      This study describes an understudied migration pattern of dynamic non-breeding range using data from an Arctic raptor. Using data from GPS tags, the study describes the known pattern of fast migration during autumn and spring, and an undescribed pattern of slow migration, at a much slower pace, throughout the over-wintering season.

      Strengths:

      The study presents a comprehensive analysis of the annual cycle of an interesting and undescribed migration system. The conceptual advancement is original and the data are rich and persuasive. The Discussion part of the manuscript is well written.

      Weaknesses:

      Other sections of the manuscript need some more polish, in terms of the terminology, the language, and the logic of the presentation of the subject. The title is not good. Throughout most of the text, the authors do not use terms such as migration, over-wintering, and non-breeding range consistently, and this is very confusing. So, consistency of the text is warranted. A bigger issue is the selection of latitudes (or the actual reason for movement) during the over-wintering period. The study claims that this relates to snow cover but fails to properly demonstrate it. It is likely that the birds move because of changes in snow cover rather than because of the level of snow cover. This is a testable prediction. A possible explanation is that there is a cost for moving further south and thus the birds are reluctant to move unless they are forced to do so by high snow cover. Another, similar and testable prediction is that the birds aim at selecting latitudes where snow cover is partial and, as the winter progresses, move slowly to areas that are only partially covered by snow. A modified, non-linear, snow cover analysis using GAMM could uncover such patterns.

    3. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #4

      We sincerely appreciate the time and effort you have taken to review our manuscript. We followed your recommendations to polish the text and make it easier to understand.

      Regarding terms and terminology, we changed “non-breeding” everywhere in the text to “over-wintering.”

      Regarding the title, since the previous wording was recommended by reviewer #1, we sought a compromise: we made the changes you suggested but retained part of reviewer #1’s suggestion. The title is now “Foxtrot migration and dynamic over-wintering range of an arctic raptor”.

      Thank you for highlighting the importance of snow cover and changes in snow cover as a possible factor of over-wintering movements. We appreciate your feedback and have explored several approaches to address this issue. Specifically, we examined how both snow cover extent and changes in snow cover influenced movement distance. However, we found no effect of either factor on movement distance.

      Our data show that birds leave their sites in October and move southwest, even though snow cover is minimal at that time. They also leave their sites in November and in subsequent months, regardless of the snow cover levels. Thus, we observed no pattern of birds leaving sites when snow cover reaches a specific threshold (e.g., 75-80%). Similarly, we found no evidence of birds staying in areas with a certain snow cover extent (e.g., 30%), nor did they leave sites when snow cover increased by a specific amount (e.g., by 10 or 20%).

      It is possible that more experienced birds anticipate that October plots will become inaccessible later in the winter and, therefore, leave early without waiting for significant snow accumulation. Alternatively, other factors, such as brief heavy snowfalls, may trigger movement, even if these do not lead to sustained increases in snow cover. Multiple factors, possibly acting asynchronously, could also play a role. This complexity adds an interesting dimension to the study of ecological patterns. However, in this study, we chose to focus on describing the migration pattern itself and its impact on aspects like over-winter range determination and population dynamics. While we have prioritized this approach, we remain committed to further analyzing the data to uncover additional details about this behavior.

      In response to your suggestion, we have expanded the Methods section (Lines 241-246) and the Results section (Lines 348-349) to clarify that we tested the effects of snow cover and changes in snow cover on distance. We have also included the relevant plots in the Supplementary Materials. In the Discussion, we noted that this approach did not reveal any significant dependence and acknowledged that this issue requires further investigation (Lines 422-459).

      ---------

      The following is the authors’ response to the previous reviews.

      Reviewer #2:

      We sincerely appreciate the time and effort you have taken to review our manuscript. 

      First of all, we apologize for publishing the preprint without incorporating certain adjustments outlined in our earlier response, particularly in the Methods section. This was due to an oversight regarding the different versions of the manuscript. We have corrected this mistake. Our response to the feedback on this section (Methods), with line numbers of the changes made, is immediately below this response. In addition, we have included the units of measurement (mean and standard deviation) in both the results and figure captions for clarity.

      To focus on the main point regarding wintering strategies, we acknowledge that in the previous versions, this aspect was inadequately addressed and caused some confusion. In the revised edition, both the Introduction and the Discussion have been thoroughly reworked.

      As you suggested, we have removed the long introductory paragraph and all references to foxtrot migrations from the Introduction. As a result, the Introduction is now short and to the point. In the second paragraph, we explain why we propose the wintering strategies outlined (L74-81).

      In the Discussion, we've added a substantial new section at the beginning that discusses different wintering strategies. We have also updated Figure 4 accordingly. Previously, we erroneously suggested that Montagu's harrier and other African-Palaearctic migrants might adopt wintering strategies similar to those we describe. Upon further investigation, however, we found that almost all African-Palaearctic migrants exhibit an itinerant wintering strategy. Conversely, the strategy we describe is primarily observed in mid-latitude wintering species.

      We have shown that, unlike itinerancy, the birds in our study don't pause for 1-2 months at multiple non-breeding sites, but instead migrate significant distances, up to 1000 km, throughout the winter. Furthermore, unlike itinerancy, the sites they reach are consistently snow-free throughout the year. Following the logic of publications on Montagu's harriers (Schlaich et al. 2023), our birds do not wait for favorable conditions at the next site, as is typical of itinerancy. Moreover, this behavior is influenced by external factors such as snow cover dynamics and occurs primarily in mid-latitudes. Researchers studying a species similar to our subject, the Common buzzard, observed a similar pattern and termed it "prolonged autumn migration" rather than itinerancy. Although their transmitters stopped working in mid-winter, precluding a full observation of the annual cycle, they captured the essence of continued migration at a slower pace, distinct from itinerancy. We've detailed all of these findings in a new section.

      In addition, we acknowledge the mischaracterization of the implications of our research as ‘Conservation implications’ and have corrected this to ‘Mapping ranges and assessing population trends’, as you suggested.

      Finally, we've rewritten the Conclusion, removing overly grandiose statements and simply summarizing the main findings.

      We appreciate your time and effort in reviewing our manuscript. With your invaluable input, it has become clearer, more concise, and easier to understand.

      Dataset: unclear what is the frequency of GPS transmissions. Furthermore, information on relative tag mass for the tracked individuals should be reported.

      We have included this information in our manuscript (L 115-122). We also refer to the study in which this dataset was first used and described in detail (L 123).

      Data pre-processing: more details are needed here. What data have been removed if the bird died? The entire track of the individual? Only the data classified in the last section of the track? The section also reports on an 'iterative procedure' for annotating tracks, which is only vaguely described. A piecewise regression is mentioned, but no details are provided, not even on what is the dependent variable (I assume it should be latitude?).

      Regarding the deaths, we only removed the data when the bird was already dead. We estimated the date of death and excluded tracking data corresponding to the period after the bird's death. We have corrected the text to make this clear (L 130-131).

      Regarding the piecewise regression, we have added a detailed description on lines 136-148.

      Data analysis: several potential issues here:

      (1) Unclear why sex was not included in all mixed models. I think it should be included.

      Our dataset contains 35 females and eight males (L116). This ratio does not allow us to include sex in all models and adequately assess the influence of this factor. At the same time, because adult females disperse farther than males in some raptor species, we conducted a separate analysis of the dependence of migration distance on sex (Table S8) and found no evidence for this in our species. We describe this in the Methods (L177-181) and in the Results (L277-278).

      (2) Unclear what is the rationale of describing habitat use during migration; is it only to show that it is a largely unsuitable habitat for the species? But is a formal analysis required then? Wouldn't it be enough to simply describe this?

      Habitat use and snow cover determine the two main phases (quick and slow) of the pattern we describe. We believe that habitat analysis is appropriate in this case, and a simple description would be uninformative and not support our conclusions.

      (3) Analysis of snow cover: such a 'what if' analysis is fine but it seems to be a rather indirect assessment of the effect of snow cover on movement patterns. Can a more direct test be envisaged relating e.g. daily movement patterns to concomitant snow cover? This should be rather straightforward. The effectiveness of this method rests on among-year differences in snow cover and timing of snowfall. A further possibility would be to demonstrate habitat selection within the entire non-breeding home range of an individual in relation to snow cover. Such an analysis would imply associating presence/absence of snow to every location within the non-breeding range and testing whether the proportion of locations with snow is lower than the proportion of snow of random locations within the entire non-breeding home range (95% KDE) for every individual (e.g. by setting a 1/10 ratio presence to random locations).
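      The use-versus-availability comparison proposed in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not an analysis from the manuscript: the function `snow_use_vs_availability`, the grid-cell representation of the home range, and the toy snow map are hypothetical, and a real analysis would sample within each individual's 95% KDE and apply a formal statistical test.

```python
import random


def snow_use_vs_availability(used_snow, snow_at, home_range_cells, ratio=10, seed=0):
    """Compare the snow proportion at used locations with that at random locations.

    used_snow:        list of booleans, True if a tracked location had snow
    snow_at:          dict mapping a home-range cell to True/False (snow present)
    home_range_cells: list of cells available within the home range
    ratio:            number of random locations drawn per used location
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_random = ratio * len(used_snow)
    random_cells = [rng.choice(home_range_cells) for _ in range(n_random)]
    used_prop = sum(used_snow) / len(used_snow)
    avail_prop = sum(snow_at[c] for c in random_cells) / n_random
    return used_prop, avail_prop


# Toy data (made up): 100 cells, of which the first 60 are snow-covered,
# and five tracked locations, one of which had snow.
cells = list(range(100))
snow_map = {c: c < 60 for c in cells}
used = [False, False, True, False, False]
used_prop, avail_prop = snow_use_vs_availability(used, snow_map, cells)
```

      A used proportion clearly below the available proportion would point towards selection of snow-free locations, although this sketch says nothing about how that pattern changes over the season.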

      The proposed analysis will provide an opportunity to assess whether the Rough-legged buzzard selects areas with the lowest snow cover, but will not provide an opportunity to follow the dynamics and will therefore give a misleading overall picture. This is especially true in the spring months. In March-April, Rough-legged buzzards move northeast and are in an area that is not the most snow-free. At this time, areas to the southwest are more snow-free (this can be seen in Figure 3b). If we perform the proposed analysis, the control points for this period would be both to the north (where there is more snow) and to the south (where there is less snow) of the real locations, and the result would be that there is no difference in snow cover.

      A step-selection analysis could be used, as we did in our previous work (Curk et al 2020 Sci Rep) with the same Rough-legged buzzards (but during migration, not winter). But this would only give us a qualitative idea, not a quantitative one: that Rough-legged buzzards move away from snow (in the fall) and follow snowmelt progression (in the spring).

      At the same time, our analysis gives a complete picture of snow cover dynamics in different parts of the non-breeding range. This allows us to see that if Rough-legged buzzards remained at their fall migration endpoint without moving southwest, they would encounter 14.4% more snow cover (99.5% vs. 85.1%). Although this difference may seem small (14.4%), it holds significance for rodent-hunting birds, distinguishing between complete and patchy snow cover.

      Simultaneously, if Rough-legged buzzards immediately flew to the southwest and stayed there throughout winter, they would experience 25.7% less snow cover (57.3% vs. 31.6%). Despite a greater difference than in the first case, it doesn't compel them to adopt this strategy, as it represents the difference between various degrees of landscape openness from snow cover.

    1. Reviewer #1 (Public review):

      Summary:

      In an era of increasing antibiotic resistance, there is a pressing need for the development of novel sustainable therapies to tackle problematic pathogens. In this study, the authors hypothesize that pyoverdines - metal-chelating compounds produced by fluorescent pseudomonads - can act as antibacterials by locking away iron, thereby arresting pathogen growth. Using biochemical, growth and virulence assays on 12 opportunistic pathogen strains, the authors demonstrate that pyoverdines induce iron starvation, but this effect was highly context dependent. This same effect has been demonstrated for plant pathogens, but not for human opportunistic pathogens exposed to natural siderophores. Only those pathogens lacking (1) a matching receptor to take up pyoverdine-bound iron and/or (2) the ability to produce strong iron chelators themselves experienced strong growth arrest. This would suggest that pyoverdines might not be effective against all pathogens, thereby potentially limiting the utility of pyoverdines as global antibacterials.

      Strengths:

      The work addresses an important and timely question - can pyoverdines be used as an alternative strategy to deal with opportunistic pathogens? In general, the work is well conducted with rigorous biochemical, growth and virulence assays. The work is also clearly written, and the findings are supported by high-quality figures.

      Weaknesses:

      I do not think there are any 'weaknesses' as such. The authors have taken all suggestions on board and this has greatly improved the quality and robustness of the work.

    2. Reviewer #2 (Public review):

      In this work, Vollenweider et al. examine the effectiveness of using natural products, specifically molecules that chelate iron, to treat infectious agents. Through the purification of 320 environmental isolates, 25 potential candidates were identified based on inhibition assays and further screened. The structural information and chemical composition of these candidates were determined. Using a series of well-described and standard assays, the authors show that three compounds have some effect in reducing mortality in a simple in vivo model.

      The paper is well-structured and thorough; targeting virulence factors in this manner is an excellent approach. However, my enthusiasm is dampened by the mediocre effects of the compounds. A reduction in the hazard ratio is reported, indicating that the compounds are having an effect, but without comparison to other iron-chelating molecules or current standards of care, it is difficult to contextualize the significance of these reductions.

      I am less convinced by a claim from the abstract: "Furthermore, experimental evolution combined with whole-genome sequencing revealed reduced potentials for resistance evolution compared to an antibiotic." Perhaps this is a semantic issue, but what is meant by "potential for resistance evolution"? My understanding is that this refers to mutations or sets of mutations that would be favored under selective pressure, allowing the bacteria to more easily climb a fitness landscape peak. However, the authors present a different result: the bacteria did not grow better after selection in different conditions (except for the positive control using ciprofloxacin). They correctly suggest that there may be individuals in the populations that have developed resistance and recommend isolating 8 from each treatment for testing. However, they then use the mean value of these individuals to conclude that there is no difference from the ancestor. This seems incorrect: surely the point of using individuals is not to compare them as a group but to determine if any one has a growth rate outside the expected distribution. In short, Figure S10 does not seem to support the findings reported in line 417.
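      The distinction drawn here, between comparing a group mean and checking each individual against the ancestral distribution, can be made concrete with a small sketch. It is illustrative only: the function `flag_outlier_clones`, the two-standard-deviation cutoff, and all growth values are assumptions and not data or methods from the study.

```python
from statistics import mean, stdev


def flag_outlier_clones(ancestor_growth, clone_growth, n_sd=2.0):
    """Return indices of clones growing above the ancestor mean + n_sd * SD.

    The cutoff of 2 SD is an arbitrary, assumed threshold for illustration.
    """
    cutoff = mean(ancestor_growth) + n_sd * stdev(ancestor_growth)
    return [i for i, g in enumerate(clone_growth) if g > cutoff]


# Made-up growth values: ancestor replicates and 8 evolved clones,
# one of which (index 7) grows markedly better under treatment.
ancestor = [0.50, 0.52, 0.48, 0.51, 0.49]
clones = [0.50, 0.49, 0.51, 0.52, 0.48, 0.50, 0.51, 0.70]

flagged = flag_outlier_clones(ancestor, clones)
clone_mean = mean(clones)
```

      In this toy example the mean of the eight clones stays within the ancestral range even though one clone lies well outside it, which is the situation the comment warns a group-level comparison could miss.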

      A final consideration for the evolution experiment is the choice of a bactericidal antibiotic. It might have been more appropriate to use a bacteriostatic drug as a control. However, I feel that additional work on this topic is beyond the scope of the current paper.

      Similarly, it would be interesting to consider how evolving the isolates in iron-limited media would affect resistance levels. Currently, I think the difference in growth rate is attributed to the iron-scavenging nature of the siderophores. In future work, this could be tested, and an evolution experiment in which iron availability is measured could provide valuable insights. To clarify, I believe this work is not necessary for the current paper, but it would be an interesting avenue for future research.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In an era of increasing antibiotic resistance, there is a pressing need for the development of novel sustainable therapies to tackle problematic pathogens. In this study, the authors hypothesize that pyoverdines - metal-chelating compounds produced by fluorescent pseudomonads - can act as antibacterials by locking away iron, thereby arresting pathogen growth. Using biochemical, growth, and virulence assays on 12 opportunistic pathogens strains, the authors demonstrate that pyoverdines induce iron starvation, but this effect was highly context-dependent. This same effect has been demonstrated for plant pathogens, but not for human opportunistic pathogens exposed to natural siderophores. Only those pathogens lacking (1) a matching receptor to take up pyoverdine-bound iron and/or (2) the ability to produce strong iron chelators themselves experienced strong growth arrest. This would suggest that pyoverdines might not be effective against all pathogens, thereby potentially limiting the utility of pyoverdines as global antibacterials.

      Strengths:

      The work addresses an important and timely question - can pyoverdines be used as an alternative strategy to deal with opportunistic pathogens? In general, the work is well conducted with rigorous biochemical, growth, and virulence assays. The work is clearly written and the findings are supported by high-quality figures.

      Weaknesses:

      I do not think there are any 'weaknesses' as such. However, it is well known that siderophore production is highly plastic, typically being upregulated in response to metal limitation (as well as toxic metal stress). Did the authors quantify whether pyoverdine supplementation altered siderophore production in the focal pathogens (either through phenotypic assays / transcriptomics)? Could such a phenotypic plastic response result in an increased capacity to scavenge iron from the environment? Importantly, increased expression of siderophores has been shown to enhance pathogen virulence (e.g. Lear et al 2023: increased pyoverdine production is linked with increased virulence in Pseudomonas aeruginosa). I really appreciate the amount of work the authors have put into this study, but I would suggest expanding the discussion a bit to include a few sentences on

      (1) unintentional consequences of pyoverdine treatment (e.g. changes in gene expression and non-siderophore-related mutations (e.g. biofilm formation)) on disease dynamics/pathogen virulence:

      (2) the efficacy of siderophore treatment under more natural conditions, i.e. when the pathogens have to compete with other species in the resident community (i.e. any other effects than resistance evolution through HGT of pyoverdine receptors as mentioned).

      Response 1: We would like to thank reviewer # 1 for the positive and constructive assessment. We agree that discussing the above points is important. We have added new paragraphs in the discussion, in which we elaborate on unintentional consequences (lines 532-551) and HGT of receptors (lines 599-607).

      Reviewer #1 (Recommendations For The Authors):

      I only have minor comments/suggestions for the authors, all listed below:

      • The authors' findings show that the antibacterial activity of pyoverdine is highly context-dependent. As such, I would suggest somewhat toning down the quite general statement in the Abstract: 'Thus, pyoverdines from environmental strain could become new sustainable antibacterials against human pathogens'

      Response 2: We agree that the pyoverdine treatment is especially potent against Acinetobacter baumannii and Staphylococcus aureus, but less so against Klebsiella pneumoniae. The treatment success is pathogen-dependent, and we have thus modified the phrase in the abstract (lines 32-34). The new sentence now reads: 'Thus, pyoverdines from environmental strains have the potential to become a new class of sustainable antibacterials against specific human pathogens.' Also in other parts of the manuscript (Results and Discussion), we emphasize that the pyoverdine treatment will likely be effective against specific pathogens (e.g., those with lower-iron affinity siderophores).

      • Bacteria often produce more than one type of siderophore. Do you know whether the 320 natural isolates used in this study produce any non-pyoverdine siderophores? Previous work has shown that pyochelin production is suppressed in PAO1 under a wider range of lab conditions. Do you know whether this is the case for the natural isolates used here (and rule out a potential role of non-pyoverdines in iron starvation as observed in Figure 1).

      Response 3: This is a valid question. Our own bioinformatic and phenotypic assays reveal that a certain fraction of strains (~ 40%) can produce secondary siderophores (unpublished data). We now mention the existence of secondary siderophores on lines 97-100 and 123. However, we do not think that their contribution to the supernatant assay results is large since the expression of pyoverdine typically suppresses the expression of the secondary siderophores (Cornelis 2010 Appl Microbiol Biotechnol; Dumas et al. 2013 Proc B) under stringent iron limitation. Furthermore, secondary siderophores have lower iron-binding affinities than pyoverdine. Finally, both the semi-pure and ultra-pure pyoverdine extracts showed strong pathogen inhibition (Fig. 3), and we are thus confident that pyoverdine is responsible for the observed growth inhibition.

      • Upon first mentioning the 'mock control' in the Results section in the main text, please state what the actual treatment is.

      Response 4: Thank you for noticing this. We now explain in more detail the actual treatment conditions used on lines 103-107 and in the caption of Figure 1. We have further removed the term 'mock' as it is confusing in this context and simply refer to the 'control treatment' in the text.

      • Please mention what the different colours mean in the legend of growth recovery in Figure 1B

      Response 5: We have clarified the colour scheme in the legend of Figure 1B.

      • Please clarify whether you used 12 or 14 strains of human pathogens (the latter number is mentioned in the results section)?

      Response 6: In the methods (lines 647-650), we now clearly specify that we used 12 strains of human pathogens in the initial supernatant screen (Figure 1). For all subsequent analyses (dose-response curves and infection experiments), we included the ESKAPE pathogens K. pneumoniae and A. baumannii.

      • Please explain whether ferribactin can be used in any other way than iron chelation (e.g. can this precursor be recycled to form pyoverdine)?

      Response 7: We apologize for not having properly explained the role of ferribactin. Under natural conditions, ferribactin is not secreted. It is kept in the periplasmic space, where it matures to pyoverdine. We most likely recovered ferribactin in the supernatant because of the vigorous shaking and centrifugation involved in the pyoverdine purification protocol. We now explain this on lines 216-218. Thus, there is no ferribactin secretion and recycling.

      • Have the authors looked at whether there is a relationship between the degree of growth arrest and phylogenetic distance? Would you expect there to be one?

      Response 8: This is an interesting question. We have now constructed a phylogenetic tree to explore this relationship (new Figure S2). We found that strains with inhibitory supernatants were scattered across the phylogenetic tree (described on lines 129-135). However, we also found two branches on the tree on which strains with inhibitory supernatant effects were overrepresented. This agrees well with our previous analysis showing that closely related species can produce similar pyoverdine types, but that the same pyoverdine can also be produced by completely different species (Gu et al. 2024 eLife).

      • In the Methods section, please mention you used pyoverdine-only controls in the infection assay.

      Response 9: We now mention the use of pyoverdine-only controls in the Methods section (lines 788-790). Overall, we have improved the infection procedure section (starting on line 770). Thank you for pointing this out.

      • Did you confirm whether the addition of pyoverdine resulted in lower bacterial loads in Galleria? In other words, were the observed changes in mortality solely related to changes in bacterial density?

      Response 10: Thank you for this valid question. No, we did not test whether pyoverdine treatment reduces the bacterial load. However, we did this in the past in two studies with a similar set of pathogens (Weigert et al. 2017 Evol Appl; Schmitz et al. 2023 Proc B) and found strong correlations between G. mellonella survival and bacterial loads. We agree that it is important to understand how pyoverdine affects pathogen load in the host and we will address this point in future studies.

      • In your infection assay, were Galleria (n=10) for each treatment housed in the same environment/container? If so, can you treat these as independent observations or should you use some sort of grouping variable in your survival analysis?

      Response 11: Thank you for pointing this out. We forgot to clarify this in the Methods section and now do so on lines 777-779. All larvae were individually housed in separate wells of a 24-well plate. There was no physical contact between larvae and no opportunity for pathogen exchange. As such, we treat each individual larva as an independent observation.

      Reviewer #2 (Public Review):

      In this work, Vollenweider et al. examine the effectiveness of using natural products, specifically molecules that chelate iron, to treat infectious agents. Through the purification of 320 environmental isolates, 25 potential candidates were identified from natural products based on inhibition assays and were further screened. The structural information and chemical composition were determined.

      The paper is well-structured and thorough; targeting virulence factors in this manner is a great idea. My enthusiasm is dampened by the mediocre effects of the compounds. The lack of a dose-response curve in the survivability assays suggests a limited scope for these molecules. While it is encouraging that the best survivability occurred at the lowest toxicity level, it opens questions as to how effective such molecules can be. Either the reduction in mortality was offset by using higher concentrations, which was not observed in the compound-alone test, or there is no dose-response curve. The latter would suggest to me that the variation in survivability is not due to the addition of siderophores.

      Response 12: Thank you very much for the overall positive assessment. We understand your concern regarding the effectiveness of pyoverdines in the host. However, we wish to emphasize that hazard risks were reduced by more than 50% when treating A. baumannii and K. pneumoniae. Moreover, it was not so surprising to us that the treatment worked best at intermediate pyoverdine concentrations. We anticipated that pyoverdines could have negative effects for the host at relatively high concentrations because siderophores can interfere with host iron stocks (see discussion starting on line 552). Finally, dose-response curves do not necessarily need to be linear or sigmoid; they can also be hump-shaped. To better illustrate this aspect, we have now plotted the time to death for all the deceased larvae against the pyoverdine concentration gradient and fitted polynomial regressions (new Fig. S6). For the above two pathogens, we found hump-shaped dose-response curves in four out of the six comparisons. We present this new analysis on lines 351-362.
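      As an illustration of what a hump-shaped dose-response fit looks like in practice, the following is a minimal sketch rather than the authors' analysis. It assumes a second-degree polynomial (the response does not state the degree used), and the concentrations and times to death are hypothetical values; the helper names `fit_quadratic` and `hump_peak` are likewise made up for this example.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c via the 3x3 normal equations."""
    # Power sums of x and mixed sums of x**k * y.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def hump_peak(a, b, lo, hi):
    """Return the peak position if the parabola opens downward and
    peaks strictly inside (lo, hi); otherwise return None."""
    if a >= 0:
        return None
    peak = -b / (2 * a)
    return peak if lo < peak < hi else None


# Hypothetical data: concentration (arbitrary units) vs. time to death (h).
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
time_to_death = [20.0, 29.0, 32.0, 29.0, 20.0]

a, b, c = fit_quadratic(conc, time_to_death)
peak = hump_peak(a, b, min(conc), max(conc))
print("quadratic coefficient:", a, "peak concentration:", peak)
```

      A negative quadratic coefficient with the fitted maximum inside the tested concentration range is one simple operational definition of "hump-shaped"; other criteria are possible.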

      I would also like to see how these molecules compare to other iron-chelating molecules. Deferoxamine is a bacteria-derived siderophore that is FDA-approved. However, it is not used to treat infections. Would the authors consider comparing their candidate molecules to well-studied molecules? This also raises questions about the novelty of this work; I think the authors could rephrase the discussion to better reflect that bioprospecting for iron-chelating molecules has previously occurred and been successful.

      Response 13: Thank you for the comment. The initial version of our manuscript already featured a brief discussion on other iron-chelation therapies. We have now changed the narrative to better reflect the differences of our approach to already existing iron-chelating molecules such as deferoxamine (lines 608-632).

      Finally, I am concerned about the few mutations reported in the resistance study. Looking at the SI, it appears that very few mutations were seen. It is unclear what filtering the authors used to arrive at such a low number of mutations. Even filtering against mutations that were selected by adaptation to the media, it seems low that only a handful of clones had distinct mutations.

      Response 14: We apologise for the unclear explanations and data analysis. When reanalysing the data, we indeed detected a mistake: we originally treated all genomes as being of clonal origin, despite the fact that we sequenced entire populations for the control treatments. We have now completely re-done the mutational analysis using the breseq pipeline as newly described in the Methods (lines 861-866) and presented in the Results (lines 421-451). We have improved the filtering process and indeed found many more mutations, including the loss of mobile genetic elements. However, it is important to note that it is not uncommon to find only a few beneficial mutations, especially in cases of selective sweeps, where often only a few mutations fix.

      This paper has a lot of strengths. The workflow is logical and well-executed; the only significant weakness is the effect of the molecules and the lack of an explanation for a dose-response curve in the survivability assay, especially when compared to the data reported in Figure 3. As the authors describe in lines 214-217.

      Response 15: Thank you for this overall positive assessment. As discussed in our response 12, the effect of the molecule in the host was not weak as it decreased hazard risks by more than 50% for A. baumannii and K. pneumoniae. Moreover, we explain that the benefit of the pyoverdine treatment (in terms of treating the infection) can be offset by adverse effects on the host, especially at high pyoverdine concentrations.

      Reviewer #2 (Recommendations For The Authors):

      • Compare these compounds to well-studied iron chelating molecules.

      Response 16: We have addressed this comment in our response 13.

      • Consider adding time of death to the analysis for the survivability. While the reduction in mortality was not large, perhaps the time to death increased.

      Response 17: This is an excellent suggestion. We have now analysed the time-to-death as a function of pyoverdine concentration (new Figure S6). Time-to-death was highly variable and sample size was fairly low for A. baumannii and K. pneumoniae as many larvae survived. Nonetheless, we found hump-shaped dose-response curves in four out of six comparisons and a linear dose-response curve in one case. We now report the new analyses on lines 351-362. Finally, we would like to stress once more that the reduction in mortality was considerable (hazard risk reduction by more than 50%).

      • I would also like to see the actual growth curves of the pathogens in the SI to accompany Fig 6.

      Response 18: This is a good point. We have now included the actual growth curves of the pathogens in the Supporting Information to accompany Figure 6 (new Figures S9 and S10).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Joint Public Review:

      Summary:

      This study presents a strategy to efficiently isolate PcrV-specific BCRs from human donors with cystic fibrosis who have/had Pseudomonas aeruginosa (PA) infection. Isolation of mAbs that provide protection against PA may be key to developing a new strategy to treat PA infection, because PA has intrinsic and acquired resistance to most antibiotic drug classes. Hale et al. developed a fluorescently labeled antigen hook and isolated mAbs with anti-PA activity. Overall, the authors' conclusion is supported by solid data analysis presented in the paper. Four of five recombinantly expressed PcrV-specific mAbs exhibited anti-PA activity in a murine pneumonia challenge model that was as potent as that of the V2L2MD mAb (equivalent to gremubamab). However, the therapeutic potency of these isolated mAbs is uncertain because gremubamab has failed in Phase 2 trials. Clarification of this point would greatly benefit this paper.

      Strengths:

      (1) High efficiency of isolating antigen-specific BCRs using an antigenic hook.

      (2) The authors' conclusion is supported by data.

      Weaknesses:

      Although the authors state that the goal of this study was to generate novel protective mAbs for therapeutic use (P12; Para. 2), it is unclear whether the PcrV-specific mAbs isolated in this study have better therapeutic potential than gremubamab, which has failed in Phase 2 trials. Four of five PcrV-specific mAbs isolated in this study reduced bacterial burdens in mice as potently as, but not more potently than, the gremubamab-equivalent mAb. Clarification of this concern by revising the text or providing experimental results that show better potential than gremubamab would greatly benefit this paper.

      The authors thank the reviewer for their thoughtful positive assessment. As noted by the reviewer, the studies described here, which were performed in mice, show that our MBC-derived mAbs are as effective as V2L2MD, a mAb that is one component of the gremubamab bi-specific. However, key theoretical strengths of MBC-derived mAbs (reduced immunogenicity, full participation in effector functions) are not easily tested in mice. We have clarified and expanded our discussion of these points in our revised manuscript, particularly in the Discussion paragraph 4.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Page 8. Using improved methods that enhanced the efficiency and depth of sequencing (manuscript in preparation...). This method is not provided in detail. The authors should provide a detailed method (as a preprint on a public database or described in the method section).

      We thank the reviewers for their interest in the details of the specific methods for single cell B cell receptor sequencing. We regret that the manuscript is still in preparation. In fact, our current methods section provides much more detail about sequencing methods than is customarily supplied by authors of mAb development papers. However, we understand the frustration and will remove the citation of our manuscript in preparation from our revised manuscript.

    2. eLife Assessment

      Treatment of Pseudomonas aeruginosa (PA) infections is challenging because of intrinsic and acquired antibiotic resistance to most antibiotic drug classes. Therefore, by using donor B cells in subjects with cystic fibrosis who undergo intermittent or chronic airway PA infections, the authors aimed to isolate B-cell receptors against PA virulence factors and examined their biological activities. The data are solid and the protective antibodies identified in this study could be useful for protection against PA.

    3. Joint Public Review:

      Summary:

      This study presents a strategy to efficiently isolate PcrV-specific BCRs from human donors with cystic fibrosis who have/had Pseudomonas aeruginosa (PA) infection. Isolation of mAbs that provide protection against PA may be key to developing a new strategy to treat PA infection, because PA has intrinsic and acquired resistance to most antibiotic drug classes. Hale et al. developed a fluorescently labeled antigen hook and isolated mAbs with anti-PA activity. Overall, the authors' conclusion is supported by solid data analysis presented in the paper. Four of five recombinantly expressed PcrV-specific mAbs exhibited anti-PA activity in a murine pneumonia challenge model that was as potent as that of the V2L2MD mAb (equivalent to gremubamab). However, the therapeutic potency of these isolated mAbs is uncertain because gremubamab has failed in Phase 2 trials. Clarification of this point would greatly benefit this paper.

      Strengths:

      (1) High efficiency of isolating antigen-specific BCRs using an antigenic hook.

      (2) The authors' conclusion is supported by data.

      Weaknesses:

      Although the authors state that the goal of this study was to generate novel protective mAbs for therapeutic use (P12; Para. 2), it is unclear whether the PcrV-specific mAbs isolated in this study have better therapeutic potential than gremubamab, which has failed in Phase 2 trials. Four of five PcrV-specific mAbs isolated in this study reduced bacterial burdens in mice as potently as, but not more potently than, the gremubamab-equivalent mAb. Clarification of this concern by revising the text or providing experimental results that show better potential than gremubamab would greatly benefit this paper.

    1. eLife Assessment

      This study presents an important finding on durotaxis in various amoeboid cells that is independent of focal adhesions. The evidence supporting the authors' claims is compelling. The work will be of interest to cell biologists and biophysicists working on rigidity sensing, the cytoskeleton, and cell migration.

    2. Reviewer #1 (Public review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient.

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The evidence provided for durotaxis of non-adherent (or low-adhering) cells is strong. I am particularly impressed by the fact that amoeboid cells can durotax even when not confined. I wish to congratulate the authors for the excellent work, which will fuel discussion in the field of cell adhesion and migration.

    3. Reviewer #2 (Public review):

      Summary:

      The authors developed an imaging-based device, that provides both spatial confinement and stiffness gradient, to investigate if and how amoeboid cells, including T cells, neutrophils and Dictyostelium can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix-side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that are not dependent on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The results of this study have well-designed control experiments and are therefore convincing.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient. 

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The authors responded to all my comments and I have nothing to add. The evidence provided for durotaxis of non-adherent (or low-adhering) cells is strong. I am particularly impressed by the fact that amoeboid cells can durotax even when not confined. I wish to congratulate the authors for the excellent work, which will fuel discussion in the field of cell adhesion and migration.

      We thank the reviewer for critically evaluating our work and giving kind suggestions. We are glad that the reviewer found our work to be of potential interest to the broad scientific community.

      Reviewer #2 (Public Review):

      Summary:

      The authors developed an imaging-based device that provides both spatial confinement and stiffness gradient to investigate if and how amoeboid cells, including T cells, neutrophils, and Dictyostelium, can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix-side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that do not depend on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The results of this study have well-designed control experiments and are therefore convincing.

      Weaknesses:

      Overall this study is well performed but there are still some minor issues I recommend the authors address:

      (1) When using NMIIA/NMIIB knockdown cell lines to distinguish the role of NMIIA and NMIIB in amoeboid durotaxis, it would be better if the authors took compensatory effects into account.

      We thank the reviewer for this suggestion. We have investigated the compensation of myosin in NMIIA and NMIIB KD HL-60 cells using Western blot and added this result in our updated manuscript (Fig. S4B, C). The results showed that the level of NMIIB protein in NMIIA KD cells doubled while there was no compensatory upregulation of NMIIA in NMIIB KD cells. This is consistent with our conclusion that NMIIA rather than NMIIB is responsible for amoeboid durotaxis since in NMIIA KD cells, compensatory upregulation of NMIIB did not rescue the durotaxis-deficient phenotype. 

      (2) The expansion microscopy assay is not clearly described and some details are missed such as how the assay is performed on cells under confinement.

      We thank the reviewer for this comment. We have updated the details of the expansion microscopy assay in our revised manuscript (lines 481-485), including how the assay is performed on cells under confinement:

      Briefly, CD4+ Naïve T cells were seeded on a gradient PA gel with another upper gel providing confinement. 4% PFA was used to fix cells for 15 min at room temperature. After fixation, the upper gradient PA gel was carefully removed and the bottom gradient PA gel with seeded cells was immersed in an anchoring solution containing 1% acrylamide and 0.7% formaldehyde (Sigma, F8775) for 5 h at 37 °C.

      (3) In this study, an active gel model was employed to capture experimental observations. Previously, some active nematic models were also considered to describe cell migration, which is controlled by filament contraction. I suggest the authors provide a short discussion on the comparison between the present theory and those prior models.

      We thank the reviewer for this suggestion. Active nematic models have been employed to recapitulate many phenomena during cell migration (Nat Commun., 2018, doi: 10.1038/s41467-018-05666-8). The active nematic model describes the motion of cells using the orientation field, Q, and the velocity field, u. The director field n (with n = −n) is employed to represent the nematic state, which has head-tail symmetry. However, in our experiments, actin filaments are clearly polarized: they polymerize and flow in the direction of cell migration. Therefore, we chose the active gel model, which describes the polarized actin field during cell migration. In the Discussion, we have provided a comparison between the active gel model and the motor-clutch model. We have also added a short comparison between the present model and the active nematic model in the main text (lines 345-347):

      The active nematic model employs active extensile or contractile agents to push or pull the fluid along their elongation axis to simulate cells flowing (61). 
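      The head-tail symmetry argument in this response can be written out explicitly. The expression below is the textbook two-dimensional form of the nematic order tensor, not a formula taken from the manuscript; the scalar order parameter S and the polarization vector p are standard notation introduced here as assumptions.

```latex
% Nematic order tensor built from the director n (2D, traceless, symmetric):
Q_{ij} = S\left(n_i n_j - \tfrac{1}{2}\,\delta_{ij}\right),
\qquad
Q_{ij}(-\mathbf{n}) = Q_{ij}(\mathbf{n}).

% A polar order parameter, by contrast, changes sign under reversal:
\mathbf{p} \;\to\; -\mathbf{p} \quad \text{is a physically distinct state.}
```

      Because Q is quadratic in n, it cannot distinguish front from rear, whereas a polar field can. This is why a description based on a polarization field is the natural choice when the filaments have a defined direction of polymerization and flow.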

      (4) In the present model, actin flow contributes to cell migration while myosin distribution determines cell polarity. How does this model couple actin and myosin together?

      We thank the reviewer for this question. In our model, the polarization field is employed to couple actin and myosin together. Actin accumulates at the front while myosin diffuses in the opposite direction. Therefore, we propose that actin and myosin flow in opposite directions, which is captured in the convection terms of the actin and myosin density fields.

    5. eLife Assessment

      This study presents an important finding on durotaxis in various amoeboid cells that is independent of focal adhesions. The evidence supporting the authors' claims is compelling. The work will be of interest to cell biologists and biophysicists working on rigidity sensing, the cytoskeleton, and cell migration.

    6. Reviewer #1 (Public review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient.

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The evidence provided for durotaxis of non-adherent (or low-adhering) cells is strong. I am particularly impressed by the fact that amoeboid cells can durotax even when not confined. I wish to congratulate the authors for the excellent work, which will fuel discussion in the field of cell adhesion and migration.

    7. Reviewer #2 (Public review):

      Summary:

      The authors developed an imaging-based device, that provides both spatial confinement and stiffness gradient, to investigate if and how amoeboid cells, including T cells, neutrophils and Dictyostelium can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix-side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that are not dependent on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The results of this study have well-designed control experiments and are therefore convincing.

    8. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient. 

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The authors responded to all my comments and I have nothing to add. The evidence provided for durotaxis of non adherent (or low-adhering) cells is strong. I am particularly impressed by the fact that amoeboid cells can durotax even when not confined. I wish to congratulate the authors for the excellent work, which will fuel discussion in the field of cell adhesion and migration.

      We thank the reviewer for critically evaluating our work and giving kind suggestions. We are glad that the reviewer found our work to be of potential interest to the broad scientific community.

      Reviewer #2 (Public Review):

      Summary:

      The authors developed an imaging-based device that provides both spatial confinement and stiffness gradient to investigate if and how amoeboid cells, including T cells, neutrophils, and Dictyostelium, can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix-side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that do not depend on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The results of this study have well-designed control experiments and are therefore convincing.

      Weaknesses:

      Overall this study is well performed but there are still some minor issues I recommend the authors address:

      (1) When using NMIIA/NMIIB knockdown cell lines to distinguish the role of NMIIA and NMIIB in amoeboid durotaxis, it would be better if the authors took compensatory effects into account.

      We thank the reviewer for this suggestion. We have investigated the compensation of myosin in NMIIA and NMIIB KD HL-60 cells using Western blot and added this result in our updated manuscript (Fig. S4B, C). The results showed that the level of NMIIB protein in NMIIA KD cells doubled while there was no compensatory upregulation of NMIIA in NMIIB KD cells. This is consistent with our conclusion that NMIIA rather than NMIIB is responsible for amoeboid durotaxis since in NMIIA KD cells, compensatory upregulation of NMIIB did not rescue the durotaxis-deficient phenotype. 

      (2) The expansion microscopy assay is not clearly described and some details are missed such as how the assay is performed on cells under confinement.

      We thank the reviewer for this comment. We have updated the details of the expansion microscopy assay in lines 481-485 of our revised manuscript, including how the assay is performed on cells under confinement:

      Briefly, CD4+ Naïve T cells were seeded on a gradient PA gel with another upper gel providing confinement. 4% PFA was used to fix cells for 15 min at room temperature. After fixation, the upper gradient PA gel was carefully removed and the bottom gradient PA gel with seeded cells was immersed in an anchoring solution containing 1% acrylamide and 0.7% formaldehyde (Sigma, F8775) for 5 h at 37 °C.

      (3) In this study, an active gel model was employed to capture experimental observations. Previously, some active nematic models were also considered to describe cell migration, which is controlled by filament contraction. I suggest the authors provide a short discussion on the comparison between the present theory and those prior models.

      We thank the reviewer for this suggestion. Active nematic models have been employed to recapitulate many phenomena during cell migration (Nat Commun., 2018, doi: 10.1038/s41467-018-05666-8). The active nematic model describes the motion of cells using the orientation field, Q, and the velocity field, u. The director field n (with n = −n) is employed to represent the nematic state, which has head-tail symmetry. However, in our experiments, actin filaments are clearly polarized: they polymerize and flow in the direction of cell migration. Therefore, we chose an active gel model, which describes the polarized actin field during cell migration. In the discussion, we have provided a comparison between the active gel model and the motor-clutch model. We have also added a short comparison between the present model and the active nematic model in lines 345-347 of the main text:

      The active nematic model employs active extensile or contractile agents to push or pull the fluid along their elongation axis to simulate cells flowing (61). 

      (4) In the present model, actin flow contributes to cell migration while myosin distribution determines cell polarity. How does this model couple actin and myosin together?

      We thank the reviewer for this question. In our model, the polarization field is employed to couple actin and myosin together. It is evident that actin accumulates at the front while myosin diffuses in the opposite direction. Therefore, we propose that actin and myosin flow in opposite directions, which is captured in the convection terms of the actin and myosin density fields.

    1. eLife Assessment

      This manuscript reports important findings on the impact of maternal obesity on offspring metabolism. It presents solid evidence that maternal obesity induces genomic methylation alterations in oocytes, which can be partly transmitted to F2 in females, and that melatonin is involved in regulating the hyper-methylation of high fat diet oocytes by increasing the expression of DNMTs via the cAMP/PKA/CREB pathway. This study would be of interest to biologists in the fields of epigenetics and metabolism.

    2. Joint Public review:

      Summary

      This manuscript offers significant insights into the impact of maternal obesity on oocyte methylation and its transgenerational effects. Chao and colleagues demonstrated the potential mechanisms behind the DNA methylation changes. The major observations of the work include transgenerational DNA methylation changes in offspring of maternal obesity and metabolites such as methionine and melatonin which correlated with the epigenetic changes. Exogenous melatonin treatment could reverse the effects of obesity. The authors further hypothesized that the linkage may be mediated by the cAMP/PKA/CREB pathway to regulate the expression of DNMTs. This work involved extensive breeding and DNA methylation analysis across multiple generations, which provides solid data for future research. The work may benefit from deeper data analysis to support causal inferences and make the conclusions more concrete.

      Strengths

      The study employs comprehensive methodologies, including transgenerational breeding experiments, whole genome bisulfite sequencing, and metabolomics analysis, and provides convincing data.

      Weaknesses

      The results of this work are correlational, which may require further analysis to establish more concrete conclusions on causal relationships.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      With socioeconomic development, more and more people are obese which is an important reason for sub-fertility and infertility. Maternal obesity reduces oocyte quality which may be a reason for the high risk of metabolic diseases for offspring in adulthood. Yet the underlying mechanisms are not well elucidated. Here the authors examined the effects of maternal obesity on oocyte methylation. Hyper-methylation in oocytes was reported by the authors, and the altered methylation in oocytes may be partially transmitted to F2. The authors further explored the association between the metabolome of serum and the altered methylation in oocytes. The authors identified decreased melatonin. Melatonin is involved in regulating the hyper-methylation of high-fat diet (HFD) oocytes, via increasing the expression of DNMTs which is mediated by the cAMP/PKA/CREB pathway.

      Strengths:

      This study is interesting and should have significant implications for the understanding of the transgenerational inheritance of GDM in humans.

      Thank you for your positive comments to our manuscript.

      Weaknesses:

      The link between altered DNA methylation and offspring metabolic disorders is not well elucidated; how the altered DNA methylation in oocytes escapes reprogramming in transgenerational inheritance is also unclear.

      Thanks. These are very good questions. There is still a long way to go to fully elucidate the relationship between methylation and offspring metabolic disorders, and the mechanisms by which acquired methylation escapes reprogramming during development. We would like to explore these in the future.

      Reviewer #2 (Public Review):

      This manuscript offers significant insights into the impact of maternal obesity on oocyte methylation and its transgenerational effects. The study employs comprehensive methodologies, including transgenerational breeding experiments, whole genome bisulfite sequencing, and metabolomics analysis, to explore how high-fat diet (HFD)-induced obesity alters genomic methylation in oocytes and how these changes are inherited by subsequent generations. The findings suggest that maternal obesity induces hyper-methylation in oocytes, which is partly transmitted to F1 and F2 oocytes and livers, potentially contributing to metabolic disorders in offspring. Notably, the study identifies melatonin as a key regulator of this hyper-methylation process, mediated through the cAMP/PKA/CREB pathway.

      Strengths:

      The study employs comprehensive methodologies, including transgenerational breeding experiments, whole genome bisulfite sequencing, and metabolomics analysis, and provides convincing data.

      Thank you for your positive comments to our manuscript.

      Weaknesses:

      The description in the results section is somewhat verbose. This section (lines 126~227) utilized transgenerational breeding experiments and methylation analysis to demonstrate that maternal obesity-induced alterations in oocyte methylation (including hyper-DMRs and hypo-DMRs) can be partially transmitted to F1 and F2 oocytes and livers. The authors should consider condensing and revising this section for clarity and brevity.

      Thanks for your suggestions. We have rewritten this part in the revised manuscript.

      There is a contradiction with Reference 3, but the discrepancy is not discussed. In this study, the authors observed an increase in global methylation in oocytes from HFD mice, whereas Reference 3 indicates Stella insufficiency in oocytes from HFD mice. This Stella insufficiency should lead to decreased methylation (Reference 33). There should be a discussion of how this discrepancy can be reconciled with the authors' findings.

      Thanks for your suggestions. As reported by Reference 33, STELLA prevents hypermethylation in oocytes by sequestering UHRF1, which recruits DNMT1 into the nuclei, away from the nuclei. Han et al. reported that obesity induced by a high-fat diet reduces the STELLA level in oocytes. Together, these findings indicate that STELLA insufficiency might induce hypermethylation in oocytes, although Han et al. did not detect significant hypermethylation in obese oocytes using immunofluorescence. This contradiction may be caused by the limited sample size (n=14) used by Han et al. We have added a brief discussion in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      Maternal obesity is a health problem for both pregnant women and their offspring. Previous works including work from this group have shown significant DNA methylation changes for offspring of obese pregnancies in mice. In this manuscript, Chao et al. dissected the potential mechanisms behind the DNA methylation changes. The major observations of the work include transgenerational DNA methylation changes in offspring of maternal obesity, and metabolites such as methionine and melatonin correlated with the above epigenetic changes. Exogenous melatonin treatment could reverse the effects of obesity. The authors further hypothesized that the linkage may be mediated by the cAMP/PKA/CREB pathway to regulate the expression of DNMTs.

      Strengths:

      The transgenerational change of DNA methylation following HFD is of great interest for future research to follow. The metabolic treatment that could change the DNA methylation in oocytes is also interesting and has potential relevance to future clinical practice.

      Thank you for your positive comments to our manuscript.

      Weaknesses:

      The HFD oocytes have more 5mC signal based on staining and sequencing (Fig 1A-1F). However, the authors also identified almost equal numbers of hyper- and hypo-DMRs, which raises questions regarding where these hypo-DMRs were located and how to interpret their behaviors and functions. These questions are also critical to address in the following mechanistic dissections as the metabolic treatments may also induce bi-directional changes of DNA methylation. The authors should carefully assess these conflicts to make the conclusions solid.

      Thanks for the helpful comments and suggestions. As presented in Fig. 1F, the methylation level increases in promoter and exon regions and decreases in intron, utr3 and repeat regions. Following the suggestions, we further analyzed the distribution of DMRs and found that, compared with hyper-DMRs, hypo-DMRs were mainly distributed in utr3, intron, repeat, and tes regions (Fig. S3). This suggests that the distribution of DMRs in the genome is not random.

      The transgenerational epigenetic modifications are controversial. Even for F0 offspring under maternal obesity, there were different observations compared to this work (Hou, YJ., et al. Sci Rep, 2016). The authors should discuss the inconsistencies with previous works.

      Thanks for the suggestions. There are contradictory reports on whole-genome DNA methylation in oocytes of obese mice. In 2016, Hou YJ et al. reported, using immunofluorescence, that obesity reduces whole-genome DNA methylation in NSN GV oocytes. In 2018, Han LS et al. reported, also using immunofluorescence, that whole-genome 5mC in oocytes is not significantly influenced by obesity, but they found that obesity reduces the Stella level in oocytes. Stella localizes to the cytoplasm and nuclei of oocytes and sequesters Uhrf1 from the nuclei. Stella knockout in oocytes results in an approximately twofold increase in global methylation in MII oocytes via the recruitment of more DNMT1 into the nuclei. These findings suggest that global methylation should be increased in oocytes of obese mice, yet Han LS et al. reported similar methylation in oocytes of obese and non-obese mice. The contradiction may therefore stem from the different sample sizes used in our manuscript and in previous studies, and from the fact that Hou YJ and colleagues examined only NSN GV oocytes. Global methylation is normal in Stella+/- oocytes, which suggests that Stella insufficiency may not be the main reason for the increased methylation in oocytes of obese mice. We have added a brief discussion in the revised manuscript.

      In addition to the above inconsistencies, the DNA methylation analysis in this work was not carefully evaluated. Several previous works were evaluating the DNA methylation in mice oocytes, which showed global methylation levels of around 50% (Shirane K, et al. PLoS Genet, 2013; Wang L., et al, Cell, 2014). In Figure 1E, the overall methylation level is about 23% in control, which is significantly different from previous works. The authors should provide more details regarding the WGBS procedure, including but not limited to sequencing coverage, bisulfite conversion rate, etc.

      Thanks for the good questions. Smallwood et al. reported that CG methylation in MII oocytes is about 33.1% using single-cell genome-wide bisulfite sequencing (Smallwood et al. Nature Methods, 2014). Shirane K et al. reported that the average methylation level of GV oocytes is 37.9%. Kobayashi H et al. reported that CG methylation in GV oocytes is about 40% (Kobayashi H et al. PLoS Genet. 2012). CG methylation in fully grown oocytes is about 38.7% (Maenohara S et al. PLoS Genet. 2017). The variation in reported oocyte methylation is associated with sequencing methods, sequencing depth, and mapping rates. In the present study, whole genome bisulfite sequencing (WGBS) for small samples and the methylation analysis were performed by NovoGene. The number of reads ranged from 31613641 to 37359643, the unique mapping rate was ≥32.88%, the conversion rate was >99.44%, and the sequencing depth was 2.45 to 2.75. The relevant information is presented in Table S1. The sequencing depth might be a reason for the inconsistency. However, we further confirmed our sequencing results using bisulfite sequencing (BS), and the BS and WGBS results were similar. These findings suggest that our results are reliable.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Since the results show that melatonin may play a role in hyper-methylation, the authors need to give some basic information in the Introduction section.

      Thanks. We added more information to the Introduction section.

      (2) There are many differential metabolites identified. Besides melatonin, are other differential metabolites involved in the altered methylation in oocytes?

      This is a good question. We first filtered the differential metabolites that may be involved in methylation, and then further filtered these metabolites according to the relevant DNA methylation pathways and published papers. After that, we confirmed the concentrations of the relevant metabolites in the serum using ELISA. Certainly, we cannot completely exclude other metabolites that might be involved in regulating DNA methylation.

      (3) The altered methylation would be found in the F1 tissues. Did the authors examine the other parts besides the liver?

      Thank you. In the present study, we did not examine DNA methylation in tissues other than the liver. We agree that the altered methylation should be observed in other tissues.

      (4) Did the authors try or guess how many generations the maternal obesity-induced genomic methylation alterations can be transmitted?

      Thanks. This is a good question. Using DNA methylation-edited mice, Takahashi Y and colleagues reported that acquired DNA methylation at CpG islands can be transmitted across multiple generations (Takahashi Y et al. 2023, Cell). Similar inheritance has also been reported by other studies using different models.

      (5) The F2 is indirectly affected by maternal obesity, so the evidence is not enough to prove the transgenerational inheritance of the altered methylation.

      Thanks. We find that the altered DNA methylation in F2 tissue and oocytes is similar to that in F1 oocytes. This suggests that the altered DNA methylation in F2 oocytes should be at least partly transmitted to F3. A previous paper (Takahashi Y et al. 2023, Cell) confirms that acquired DNA methylation at CpG islands can be transmitted across several generations through paternal and maternal germ lines. Certainly, it would be better to examine this in F3 tissues.

      Reviewer #2 (Recommendations For The Authors):

      (1) Figure Font Size: The font sizes in the figures are quite inconsistent. Please make the font size uniform for similar types of text.

      Thanks for your suggestions. We re-edited the relevant figures in the revised manuscript.

      (2) Figure Clarity: Ensure that all critical information in the figures is clearly visible, such as in Figure 3C.

      Thank you. We revised this figure.

      (3) Figure 1B, C: The position of the asterisks ("**") is not centered in the corresponding columns, and the font size is too small. Please correct this and address similar issues in other figures.

      Thank you for your suggestions. We re-edited these in the revised figures.

      (4) Line 126: The current expression is confusing. It may be revised to: "Both the oocyte quality and the uterine environment can contribute to adult diseases, which may be mediated by epigenetic modifications."

      Thanks. We revised this sentence in the revised manuscript.

      (5) Missing Panel in Figure 3: Figure 3 is missing panel 3N.

      Thank you so much. We corrected it in the revised manuscript.

      (6) Figure Panel Order: Please adjust the order of the panels in the figures to follow a logical reading sequence.

      Thank you. We changed the orders in the revised manuscript.

      (7) Line 493: Correct "inthe" to "in the".

      Thank you. We revised it.

      (8) Lines 102-106: Polish the wording and expression, an example as follows: "We analyzed the differentially methylated regions (DMRs) in oocytes from both HFD and CD groups and identified 4,340 DMRs. These DMRs were defined by the criteria: number of CG sites ≥ 4 and absolute methylation difference ≥ 0.2. Among these, 2,013 were hyper-DMRs (46.38%) and 2,327 were hypo-DMRs (53.62%) (Fig. 1G). These DMRs were distributed across all chromosomes (Fig. 1H)."

      Thank you! We re-wrote these parts in the revised manuscript.
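
      As an illustration of the DMR criteria quoted in point (8), the following minimal Python sketch applies the two thresholds (number of CG sites ≥ 4, absolute methylation difference ≥ 0.2) and re-derives the reported percentages from the reported counts. The `Region` record, its field names, and the `classify_dmr` and `percent` helpers are hypothetical and are not taken from the authors' pipeline; only the thresholds and the counts come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Region:
    """Hypothetical record for one candidate region (not the authors' data model)."""
    n_cg: int        # number of CG sites in the region
    meth_hfd: float  # mean methylation level in HFD oocytes (0..1)
    meth_cd: float   # mean methylation level in CD oocytes (0..1)


def classify_dmr(region: Region, min_cg: int = 4, min_diff: float = 0.2) -> Optional[str]:
    """Return 'hyper', 'hypo', or None if the region fails the quoted criteria."""
    diff = region.meth_hfd - region.meth_cd
    if region.n_cg < min_cg or abs(diff) < min_diff:
        return None
    return "hyper" if diff > 0 else "hypo"


def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


# Counts quoted in the review text.
TOTAL_DMRS, HYPER_DMRS, HYPO_DMRS = 4340, 2013, 2327

if __name__ == "__main__":
    print(classify_dmr(Region(n_cg=5, meth_hfd=0.60, meth_cd=0.30)))
    print(percent(HYPER_DMRS, TOTAL_DMRS), percent(HYPO_DMRS, TOTAL_DMRS))
```

      Under these assumptions, the two counts sum to the stated total and the computed shares match the percentages given in the suggested wording.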

      Reviewer #3 (Recommendations For The Authors):

      The sample numbers should be annotated in the figure legend for all the bar plots using Image J. The lines in Figures 2B and 2C were without error bars. How many mice were used for these plots?

      Thanks for your suggestions. We added the sample size in the revised manuscript. We made a mistake when we prepared the pictures for figure 2B and figure 2C, which resulted in missing the error bars. We have corrected these pictures. Thanks again!

      The authors should revise the panel arrangement of the figures (Figure 2, Figure 5, etc) to make them more clear and readable.

      Thank you! We have revised these in the revised manuscript.

      The writing should be improved since there were multiple typos and unclear expressions. AI tools like Grammarly or ChatGPT may help.

      Thank you! We have re-edited the language in the revised manuscript using AI tools.

      Please recheck the immunofluorescence images for clear interpretability. For example, in Figure 5F (H89 treated), the GV is all the way at the edge of the oocyte, and the oocyte in the DIC image appears like it is partially lysed. The DIC images and the DAPI images are not clear enough.

      Thanks for your suggestions. We have re-edited these pictures in the revised manuscript.

      Another concern is that the Methods describes the immunofluorescence preparation for 5mC and 5hmC staining as a simple fixation in 4% paraformaldehyde followed by permeabilization with 0.5% Triton X-100, but there is no antigen exposure step described, a step that is normally required for visualizing these DNA modifications (e.g., 4N HCl).

      Thanks. We apologize for not describing the methods clearly. We have added more information about the methods in the revised manuscript.

      The metabolomic analysis revealed a highly significant increase in dibutylphthalate, genistein, and daidzein in the control mice. The presence of these exogenous metabolites suggests that the diets differed in many aspects, not just fat content, so it would be very difficult to interpret the results as related to a high-fat diet alone. Both daidzein and genistein are phytoestrogens and dibutylphthalate is a plasticizer, suggesting differences in the diet and/or in the materials used to collect the samples for analysis from the mice. The Methods define the high-fat diet adequately, as the formulation can be found online using the catalog number. However, the control diet is just listed as "normal diet", so one has no idea what is in it.

      Thank you for your good questions. The daidzein and genistein may come from the diets, and the dibutylphthalate may come from the materials used to collect samples. If so, these should be similar between groups. We therefore added the formulation of the normal diet in the revised manuscript. The raw materials of the normal diet include corn, bean pulp, fish meal, flour, yeast powder, plant oil, salt, vitamins, and mineral elements. Following the suggestions, we re-checked the data for these metabolites and found that their abundance was low. The results for these metabolites were also of low confidence because their ions were only mapped to ChemSpider (HMDB, KEGG, LIPID MAPS). To further confirm these results, we examined these metabolites in serum using ELISA, and the results revealed that the concentrations of genistein and dibutylphthalate were similar between groups. These results suggest that these metabolites may not be involved in the altered methylation of oocytes induced by obesity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work by Wang et al., the authors use single-molecule super-resolution microscopy together with biochemical assays to quantify the organization of Nipah virus fusion protein F (NiV-F) on cell and viral membranes. They find that these proteins form nanoscale clusters which favor membrane fusion activation, and that the physical parameters of these clusters are unaffected by protein expression level and endosomal cleavage. Furthermore, they find that the cluster organization is affected by mutations in the trimer interface on the NiV-F ectodomain and the putative oligomerization motif on the transmembrane domain, and that the clusters are stabilized by interactions among NiV-F, the AP2-complex, and the clathrin coat assembly. This work improves our understanding of the NiV fusion machinery, which may have implications also for our understanding of the function of other viruses.

      Strengths:

      The conclusions of this paper are well-supported by the presented data. This study sheds light on the activation mechanisms underlying the NiV fusion machinery.

      Weaknesses:

      The authors provide limited details of the convolutional neural network they developed in this work. Even though custom-codes are made available, a description of the network and specifications of how it was used in this work would aid the readers in assessing its performance and applicability. The same holds for the custom-written OPTICS algorithm. Furthermore, limited details are provided for the imaging setup, oxygen scavenging buffer, and analysis for the single-molecule data, which limits reproducibility in other laboratories. The claim of 10 nm resolution is not backed up by data and seems low given the imaging conditions and fluorophores used. Fourier Ring Correlation analysis would have validated this claim. If the authors refer to localization precision rather than resolution, then this should be specified and appropriate data provided to support this claim.

      We thank reviewer 1 for these suggestions. We described key steps in the imaging setup, single-molecule data reconstruction, the OPTICS algorithm for cluster identification, and the 1D CNN for classification of the OPTICS data in the Materials and Methods section. We also provided a recipe for the imaging buffer. We refer to 10 nm localization precision rather than resolution. The localization precision achieved by our SMLM system is shown in Author response image 1.

      Author response image 1.

      The localization precision of the custom-built SMLM. The image shows the distribution of localization error, in nanometers, in the x (dX), y (dY), and z (dZ) directions for blinks generated from Alexa Fluor 647 labeling NiV-F expressed on the plasma membrane of PK13 cells. The lateral precision is < 10 nm and the axial precision is < 20 nm.

      Reviewer #2 (Public Review): 

      Summary:

      In this manuscript, Wang and co-workers employ single molecule light microscopy (SMLM) to detect NiV fusion protein (NiV-F) on the surface of cells. They corroborate that these glycoproteins form microclusters (previously seen and characterized together with NiV-G and the Nipah Matrix protein by Liu and co-workers (2018), also with super-resolution light microscopy). As also seen by Liu and co-workers, the authors show that neither the level of expression of NiV-F nor endosomal cleavage alters the identity of these microclusters. Moreover, mutations in the transmembrane domain or the hexamer-of-trimer interface seem to have a mild effect on the size of the clusters that the authors quantified.

      Importantly, it has also been shown that these particles tend to cluster in Nipah VLPs.

      We thank reviewer #2 for the comments and suggestions. This paper is built on Liu et al 1 to further characterize the nanoclusters formed by NiV-F and their role in membrane fusion activation. While Liu et al. studied the NiV glycoprotein distribution at the NiV assembly sites to inform mechanisms in NiV assembly and release, Wang et al. analyzed the nanoorganization and distribution of NiV-F at the prefusion conformation, providing insights into the membrane fusion activation mechanisms.  

      Strengths:

      The authors have tried to perform SMLM in single VLPs and have shown partially the importance of NiV-F clustering.

      Weaknesses:

      The labelling strategy for the NiV-F is not sufficiently explained. The use of a FLAG tag in the extracellular domain should be validated and compared with the unlabelled WT NiV-F when expressed in functional pseudoviruses (for example HIV-1 based particles decorated with NiV-F). This experiment should also be carried out for both infection and fusion (including BlaM-Vpr as a readout for fusion). I would also suggest running a time-of-addition BlaM experiment to understand how this particular labelling strategy affects single virion fusion as compared to the WT.

      We thank reviewer #2 for this suggestion. We have made various efforts to validate the expression and function of FLAG-tagged NiV-F. NiV-F-FLAG shows cell surface expression levels comparable to those of untagged NiV-F and induces similar levels of cell-cell fusion in 293T cells 1. NiV-F-FLAG also showed levels of virus entry similar to those of untagged NiV-F when both were pseudotyped on a recombinant Vesicular Stomatitis Virus (VSV) with the VSV glycoprotein replaced by a Renilla luciferase reporter gene (VSV-ΔG-rLuc; Fig. S1D). We also performed a virus entry kinetics assay using NiV VLPs expressing NiV-M-β-lactamase (NiV-M-Bla), NiV-G-HA, and NiV-F-FLAG, NiV-F-AU1, or untagged NiV-F. The intracellular AU1 tag is located at the C-terminus of NiV-F (GenBank accession no. AY816748.1). However, we detected different levels of NiV-M-Bla in equal volumes of VLPs, suggesting that the tags in NiV-F affect the budding of the VLPs (Author response image 2A). Therefore, we performed the fusion kinetics assay using VLPs expressing the same levels of NiV-M-Bla. Among them, NiV-F-FLAG on VLPs shows the most efficient fusion between VLP and HEK293T cell membranes (Author response image 2B), significantly more efficient than that of untagged NiV-F and NiV-F-AU1. However, we cannot attribute the enhanced fusion activity to the FLAG tag, because the readout of this assay relies on both the levels of β-lactamase (introduced by NiV-M-Bla in VLPs) and the NiV-F constructs. The tags in NiV-F could affect both the budding of VLPs and the stoichiometry of F and M in individual VLPs. We did not use the HIV-based pseudovirus system because the incorporation of NiV-F into HIV pseudoviruses requires a C-terminal deletion 2,3.

      In summary, the FLAG tag does not affect cell-cell fusion 1 or virus entry when pseudotyped on the recombinant VSV-ΔG-rLuc viruses (Fig. S1D). Given that we do not observe any difference in clustering between HA- and FLAG-tagged NiV-F constructs on the PK13 cell surface (Fig. S1A-C), we conclude that the FLAG tag has minimal effect on both the fusion activity and the nanoscale distribution of NiV-F.

      Author response image 2.

      Viral entry is not affected by labeling of NiV-F. A) Western blot analysis of NiV-M-Bla in NiV-VLPs generated by HEK293T cells expressing NiV-M-Bla, NiV-G-HA, and NiV-F-FLAG, untagged NiV-F, or NiV-F-AU1. Equal volumes of VLPs were separated by denaturing 10% SDS–PAGE and probed against β-lactamase (SANTA CRUZ, sc-66062). B) NiV-VLPs expressing NiV-M-Bla, NiV-G-HA, and NiV-F-FLAG, untagged NiV-F, or NiV-F-AU1 were bound to the target HEK293T cells loaded with CCF2-AM dye at 4°C. The Blue/Green (B/G) ratio was measured at 37°C for 4 hrs at 3-min intervals. Results were normalized to the maximal B/G ratio of NiV-F-FLAG NiV VLPs. Results from one representative experiment out of three independent experiments are shown.

      It would also be very important to compare the FLAG labelling approach with recent advances in the field (for instance incorporating noncanonical amino acids (ncAAs) into NiV-F by amber stop-codon suppression, followed by click chemistry).

      We are very grateful for this comment from reviewer #2. Labeling noncanonical amino acids (ncAAs) with bioorthogonal click chemistry is indeed a more precise labeling strategy than the traditional epitope labeling approach used in this paper. We will explore the applications of ncAA labeling in single-molecule localization imaging and virus-host interactions in future projects.

      In this paper, the FLAG tag inserted in NiV-F protein seems to have minimal effect on the NiV-F-induced virus entry and cell-cell fusion 1 (Fig. S1). Although the FLAG tag labeling approach may increase the detectable size of NiV-F nanoclusters due to the use of the antibody complex, it should not affect our conclusions drawn from the relative comparisons between wt and mutant NiV-F or control and drug-treated cells. 

      The correlation between the existence of microclusters of a particular size and their functionality is missing. Only cell-cell fusion assays are shown in supplementary figures and clearly, single virus entry and fusion cannot be compared with the biophysics of cell-cell fusion. Not only is the environment completely different, but membrane curvature and the number of NiV-F molecules also vary drastically. Therefore, specific fusion assays (either single virus tracking and/or time-of-addition BlaM kinetics with functional pseudoviruses) are needed to substantiate this claim.

      We thank Reviewer 2 for the suggestion. To support the link between F clustering and virus-cell membrane fusion, we conducted pseudotyped virus entry and VLP fusion kinetics assays, as shown in revised Figure S4. The viral entry results (Fig. S4E and F) corroborate those of the cell-cell fusion assay (Fig. S4A and B) and previously published data 4. The fusion kinetics assay confirmed that real-time fusion was affected by mutations at the hexameric interface, with the hypo-fusogenic mutants L53D and V108D exhibiting reduced entry efficiency and the hyper-fusogenic mutant Q393L showing increased efficiency (Fig. S4G and H). The results are described in detail in the revised manuscript.

      Additionally, we performed a pseudotyped virus entry assay on the LI4A (Fig. S6F and G) and YA (Fig. S7F and G) mutants to verify the function of these mutants on viruses, as shown in the revised Supplemental Figures. Neither LI4A nor YA was incorporated into the VSV/NiV pseudotyped viruses, as shown by the Western blot analyses of the pseudovirions (Fig. S6F and S7F), and thus neither induced virus entry, consistent with the cell-cell fusion results (Fig. S6C, D and Fig. S7C, D). We did not perform the entry kinetics assay on these two mutants as they do not incorporate into VLPs or pseudovirions.

      The authors also claim they could not characterize the number of NiV-F particles per cluster. Another technique such as number and brightness (N&B; Digman et al., 2008) could support current SMLM data and identify the number of single molecules per cluster. Also, this technology does not require complex microscopy apparatus. I suggest they perform either confocal fluorescence fluctuation spectroscopy or TIRF-based N&B to validate the clusters and identify how many molecules are present in these clusters.

      We thank reviewer 2 for this suggestion. Determining the true copy number of NiV-F in individual clusters could verify whether the F clusters on the plasma membrane are hexamer-of-trimer assemblies. Regardless, it does not affect our conclusion that the organization of NiV-F into nanoclusters affects the membrane fusion triggering ability. Confocal fluorescence fluctuation spectroscopy (FFS) and TIRF-based analyses are accessible tools for quantifying fluorophore copy numbers and/or stoichiometry based on fluorescence fluctuation or photobleaching. However, these methods are unable to quantify the number of proteins in individual clusters because they analyze fluorophores either in the entire cell (as in wide-field epifluorescence microscopy coupled with FFS and TIRF-coupled photobleaching) 5–7 or within a large excitation volume (confocal laser scanning microscopy-coupled FFS) 8. Both of these volumes are significantly larger than a single NiV-F cluster, which has an average diameter of 24-26 nm (Fig. 1F).

      The current SMLM setup is useful for characterizing the protein distribution and organization. However, quantifying the true protein copy number within a nanocluster is challenging because of the stochasticity of fluorophore blinking and the unknown labeling stoichiometry 9–11. To address the challenge of fluorophore blinking, quantitative DNA-PAINT (qDNA-PAINT) may be used because the on-off frequency of the fluorophores is tied to the well-defined kinetic constants of DNA binding and the influx rate of the imager strands, rather than the stochasticity of fluorophore blinking. Thus, the frequency of blinks can be translated into protein counts 12. To address the challenge of unknown labeling stoichiometry, DNA origami can be used as a calibration standard 11. DNA origami supports handles at regular spacings of several to tens of nanometers, and the handles can be conjugated with a defined number of proteins of interest. The copy number of the protein of interest in the experimental group can be determined by comparing the SMLM localization distribution of the sample to that of the DNA origami calibration standard. Given the requirement of a more sophisticated SMLM setup and a high-precision calibration tool, we will explore the quantification of NiV-F copy numbers in nanoclusters in a future project.
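      The counting idea described above can be illustrated with a minimal sketch. It assumes the commonly used pseudo-first-order relation in which the mean dark time between binding events scales inversely with the number of docking sites, the association rate constant, and the imager concentration. The function names and all numerical values are hypothetical and are not taken from the study.

```python
from statistics import mean


def estimate_sites(dark_times_s, k_on_per_M_s, imager_conc_M):
    """Estimate the number of docking sites in one cluster.

    Assumes mean dark time = 1 / (n_sites * k_on * c_imager), so
    n_sites = 1 / (k_on * c_imager * mean dark time).
    """
    tau_dark = mean(dark_times_s)
    return 1.0 / (k_on_per_M_s * imager_conc_M * tau_dark)


def estimate_sites_calibrated(cluster_dark_times_s, single_site_dark_times_s):
    """Estimate the number of sites relative to a single-site calibration.

    The calibration (for example a DNA origami with one handle) absorbs
    k_on and the imager concentration, so only a ratio of mean dark
    times is needed.
    """
    return mean(single_site_dark_times_s) / mean(cluster_dark_times_s)


if __name__ == "__main__":
    # Hypothetical values: k_on = 1e6 /M/s, imager concentration = 5 nM.
    n_direct = estimate_sites([18.0, 22.0, 20.0], 1e6, 5e-9)
    n_calibrated = estimate_sites_calibrated([18.0, 22.0, 20.0], [190.0, 210.0])
    print(round(n_direct, 2), round(n_calibrated, 2))
```

      The calibrated variant reflects why a calibration standard is attractive: the kinetic constants cancel, leaving the labeling stoichiometry as the main remaining unknown.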

      Also, it is not clear how many cells the authors employ for their statistics (at least 30-50 cells should be employed, rather than considering the number of blinking events). I hope the authors are not considering only a single cell to run their stats... The differences between the mutants and the NiV-F are minor even if their statistical analyses give a difference (they should average the number and size of the clusters per cell for a total of 30-50 cells with experiments performed at least in three different cells following the same protocol). Overall, it seems that the authors have only evaluated a very low number of cells.

      We disagree with this comment from Reviewer #2. The sample size for cluster analysis in SMLM images was chosen by considering the target of the study (cells and VLPs) and the data acquisition and analysis standards in the SMLM imaging field. We also noted the sample size (# of ROI and cells) in the figure legend. 

      Below, we compare the sample sizes in our study to those in similar studies that used comparable imaging and cluster analysis methods from 2015 to 2024. The classical clustering analysis methods are categorized into global clustering (e.g. nearest neighbor analysis, Ripley’s K function, and pair correlation function) and complete clustering, such as density-based analysis (e.g. DBSCAN, Superstructure, FOCAL, ToMATo) and tessellation-based analysis (e.g. Delaunay triangulation, Voronoi tessellation). The global clustering analysis methods provide spatial statistics for global protein clustering or organization (e.g. clustering extent), while the complete clustering approaches extract information at the single-cluster level, such as the morphology and localization density of individual clusters. We used the density-based analyses, DBSCAN and OPTICS, for cluster analysis on cell plasma membranes and VLP membranes.
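      To make the density-based approach concrete, the following is a minimal, dependency-free sketch of DBSCAN applied to 2D localization coordinates. It is an illustration only, not the analysis pipeline used in the study; the toy coordinates and the `eps` and `min_pts` values are hypothetical.

```python
import math

NOISE = -1  # label for localizations that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if math.hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels = [None] * len(points)  # None means not yet visited
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep expanding
    return labels


# Toy localizations in nanometers: two tight groups and one isolated blink.
POINTS = [(0, 0), (5, 0), (0, 5), (5, 5),
          (100, 100), (105, 100), (100, 105), (105, 105),
          (50, 50)]
LABELS = dbscan(POINTS, eps=10.0, min_pts=3)
```

      In this toy case the two groups receive separate cluster labels and the isolated localization is labeled as noise. Per-cluster properties such as the number of localizations or the spatial extent can then be computed from the labels, which is the single-cluster information that distinguishes complete clustering from global statistics such as Ripley's K function.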

      Author response table 1.

      The comparison of imaging methods, analysis methods, and sample size in the current study to other studies conducted from 2015 to 2024.

      They should also compare the level of expression (with the number of molecules per cell provided by number and brightness) with the total number of clusters. 

      We thank reviewer 2 for this suggestion. We compared the level of expression with the total number of clusters for F-WT in Figure 1I in the main text.  

      The same applies to the VLP assay. I assume the authors have only taken VLPs expressing both NiV-M and NiV-F (and NiV-G). But even if this is not clearly stated I would urge the authors to show how many viruses were compared per condition (normally I would expect 300 particles per condition coming from three independent experiments). As a negative control to evaluate the cluster effect I would mix the different conditions. Clearly you have clusters with all conditions and the differences in clustering depending on each condition are minimal. Therefore you need to increase the n for all experiments.

      We thank reviewer 2 for this comment. We acquired and analyzed more images of NiV VLPs bearing F-WT, Q393L, L53D, and V108D. Results are shown in the revised Figure 4 and the number of VLPs (>300) used for analysis is specified in the figure legend. An increased number of VLP images does not affect the classification result in Figure 4C. 

      As for the suggestion on “evaluating the cluster effect at different mixed conditions”, we assume that reviewer 2 would like to see how the presence of different viral structural proteins (F, M, and G) on VLPs could affect F clustering. We showed by direct visualization that the organization of NiV envelope proteins on the VLP membrane is similar in the presence or absence of NiV-M 27, suggesting that the effect of NiV-M on F-WT clustering on VLPs is minimal. We also show comparable incorporation of NiV-F among the NiV-F hexamer-of-trimer mutants (Fig. 4A). Therefore, we did not test F clustering at different F, M, and G combinations in this paper. However, this could be an interesting question to pursue in a paper focusing on NiV VLP production.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Wang and colleagues describes single molecule localization microscopy to quantify the distribution and organization of Nipah virus F expressed on cells and on virus-like particles. Notably the crystal structure of F indicated hexameric assemblies of F trimers. The authors propose that F clustering favors membrane fusion.

      Strengths:

      The manuscript provides solid data on imaging of F clustering with the main findings of:

      -  F clusters are independent of expression levels

      -  Proteolytic cleavage does not affect F clustering

      -  Mutations that have been reported to affect the hexamer interface reduce clustering on cells and its distribution on VLPs

      -  F nanoclusters are stabilized by AP

      Weaknesses:

      The relationship between F clustering and fusion is per se interesting, but looking at F clusters on the plasma membrane does not exclude that F clustering occurs for budding. Many viral glycoproteins cluster at the plasma membrane to generate micro domains for budding. 

      This does not exclude that these clusters include hexamer assemblies or clustering requires hexamer assemblies. 

      We thank reviewer #3 for this question. We did not focus on the role of NiV-F clusters in budding in the current manuscript, although this is an interesting topic to pursue. In this manuscript, we observed that NiV VLP budding is decreased for some cluster-disrupting mutants, such as F-YA and F-LI4A. However, F-V108D showed increased budding compared to F-WT (Fig. 4A). We also observed that VLPs and VSV/NiV pseudoviruses expressing L53D have little NiV-G (Fig. 4A, Fig. S4F and S4H), although the incorporation level of L53D is comparable to that of wt F in both VLPs and pseudovirions (Fig. 4A and Fig. S4F). L53D is a hypofusogenic mutant with decreased clustering ability. Therefore, our current data do not show a clear link between F clustering and NiV VLP budding or glycoprotein incorporation.

      We reported that both NiV-F and -M form clusters at the plasma membrane, although NiV-F clusters are not enriched at the NiV-M-positive membrane domains 1. This result indicates that NiV-M is the major driving force for assembly and budding, while NiV-F is passively incorporated into the assembly sites. The central role of NiV-M in budding is also supported by a recent study showing that NiV-M induces membrane curvature by binding to PI(4,5)P2 in the inner leaflet of the plasma membrane 28. However, the expression of NiV-F alone induces the production of vesicles bearing NiV-F 29, and NiV-F recruits vesicular trafficking and actin cytoskeleton factors to VLPs either alone or in combination with NiV-G and -M, indicating a potential autonomous role in budding 30. Additionally, several electron microscopy studies show that the paramyxovirus F forms a 2D lattice interspersed above the M lattice, suggesting the participation of F in virus assembly and budding. Taken together, the evidence above suggests that NiV-F may play a role in budding, but our data cannot correlate NiV-F clustering with budding.

      Assuming that the clusters are important for entry, hexameric clusters are not unique to Nipah virus F. Similar hexameric clusters have been described for the HEF on influenza virus C particles (Halldorsson et al 2021) and env organization on Foamy virus particles (Effantin et al 2016), both with specific interactions between trimers. What is the organization of F on Nipah virus particles? If F requires to be hexameric for entry, this should be easily imaged by EM on infectious or inactivated virus particles. 

      We thank reviewer #3 for this suggestion. The hexamer-of-trimer NiV-F is observed on the VLP surface by electron tomography 4. The NiV-F hexamer-of-trimers are arranged into a soccer ball-like structure, with one trimer being part of multiple hexamer-of-trimers. The implication of NiV-F clusters in virus entry and the potential mechanism for NiV-F higher-order structure formation are discussed in the revised manuscript.

      AP stabilization of the F clusters is curious if the clusters are solely required for entry? Virus entry does not recruit the clathrin machinery. Is it possible that F clusters are endocytosed in the absence of budding? 

      We thank reviewer #3 for this question. The evidence from the current study does not exclude a role for NiV-F clustering in virus budding. NiV-F is known to be endocytosed in virus-producing cells for cleavage by Cathepsin B or L in endocytic compartments in a pH-dependent manner 31–33 in the absence of budding. However, given that both cleaved and uncleaved NiV-F have an endocytosis signal sequence in the cytoplasmic tail and are able to interact with AP-2 for endosome assembly, and that cleaved and uncleaved F may have similar clustering patterns (Fig. 2), we do not think NiV-F clustering is specifically regulated for the cleavage of NiV-F. A plausible hypothesis is that NiV-F clusters are stabilized by multiple intrinsic factors (e.g. the trimer interface) and host factors (e.g. AP-2) on the cell membrane for cell-cell fusion and virus budding. We linked the clustering to the fusion ability of NiV-F in this study, but NiV-F clustering may also be important in facilitating virus budding. Once in the viruses, the higher-order assembly of the clusters (e.g. a lattice) may form due to protein enrichment, and the cell factors may not be the major maintenance force.

      Clusters are required for budding. 

      Other points:

      Fig. 3: Some of the V108D and L53D clusters look similar in size to wt clusters. It seems that the interaction is important but not absolutely essential. Would a double mutant abrogate clustering completely?

      We thank Reviewer #3 for the suggestion. We generated a double mutant of NiV-F with L53D and V108D (NiV-F-LV) and assessed its expression and processing. Although the mutant retained processing capability, it exhibited minimal surface expression, making it unfeasible to analyze its nano-organization on the cell or viral membrane.

      Author response image 4.

      The expression and fusion activity of FLAG-tagged NiV-F and NiV-F L53D-V108D (LV). (A) Representative western blot analysis of NiV-F-WT and LV in the cell lysate of 293T cells. 293T cells were transfected with NiV-F-WT or the LV mutant. The empty vector was used as a negative control. The cell lysates were analyzed by SDS-PAGE followed by western blotting at 28 hrs post-transfection. F0 and F2 were probed with the M2 monoclonal mouse anti-FLAG antibody. GAPDH was probed with monoclonal mouse anti-GAPDH. (B) Representative images of 293T cell-cell fusion induced by NiV-G and NiV-F-WT or NiV-F-LV. 293T cells were co-transfected with plasmids coding for NiV-G and empty vector (NC) or NiV-F constructs. Cells were fixed at 18 hrs post-transfection. Arrows point to syncytia. Scale bar: 10 µm. (C) Relative cell-cell fusion levels in 293T cells in (B). Five fields per experiment were counted from three independent experiments. Data are presented as mean ± SEM. (D) The cell surface expression levels of NiV-F-WT and NiV-F-LV in 293T cells measured by flow cytometry. Mean fluorescence intensity (MFI) values were calculated by FlowJo and normalized to that of F-WT. Data are presented as mean ± SEM of three independent experiments. Statistical significance was determined by the unpaired t-test with Welch’s correction (*P<0.05, **P<0.01, ***P<0.001, ****P<0.0001). Values were compared to that of NiV-F-WT.

      Fig. 4: The distribution of F on VLPs should be confirmed by cryoEM analyses. This would also confirm the symmetry of the clusters. The manuscript by Chernomordik et al. JBC 2004 showed that influenza HA outside the direct contact zone affects fusion, which could be further elaborated in the context of F clusters and the fusion mechanism.

      We thank reviewer 3 for this suggestion. The distribution of F on VLPs was resolved by electron tomography, which showed that the NiV-F hexamer-of-trimers are arranged into a soccer ball-like structure 4. The role of influenza HA outside of the contact zone in fusion activation is an interesting phenomenon. It may relate to energy transmission within and among clusters. We will pursue this topic in a future project.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      •  Please define all used abbreviations throughout the manuscript and in the SI.

      We defined the abbreviations at their first usage. 

      •  The sentence starting with "Additionally, ..." on line 155 appears to be incomplete.

      We corrected this sentence.  

      •  The statement starting with "As reported, ..." on line 181 should be supported by a reference.

      We added a reference. 

      •  In Fig. 4C, it is unclear what the x and y axes represent.  

      Fig. 4C is a t-SNE plot for visualizing high-dimensional data in a low-dimensional space. It maintains the local data structure but does not represent exact quantitative relationships. In other words, points that are close together in Fig. 4C are also close in the high-dimensional space, meaning the OPTICS plots, which reflect the clustering patterns, are similar for two points that are positioned near each other in Fig. 4C. Therefore, the x and y axes do not represent the original, quantitative data, and thus the axis titles are meaningless.  
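      For readers unfamiliar with t-SNE, the standard formulation of the method makes the point about the axes explicit. In the notation assumed here (not taken from the manuscript), x_i are the high-dimensional feature vectors, y_i are the plotted 2D coordinates, sigma_i is a per-point bandwidth set by the perplexity, and N is the number of points.

```latex
% Pairwise similarities in the high-dimensional space
p_{j|i} = \frac{\exp\!\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\!\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}

% Pairwise similarities in the 2D embedding (Student-t kernel)
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

% Objective minimized over the embedding coordinates y_i
C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

      Because the objective depends only on pairwise distances between the embedded points, it is unchanged by rotating, reflecting, or translating the whole plot, so the individual axes carry no units or intrinsic meaning.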

      •  The reference on line 306 appears to be unformatted.

      We reformatted the reference.  

      Reviewer #2 (Recommendations For The Authors):

      The authors need to include the overall statistics for each experiment (at least 30 to 50 cells with three independent experiments are needed). 

      We highlighted the sample size (number of ROI and number of cells) used for analysis in the figure legend. The determination of the sample size is justified in Table 1 in the response letter. 

      The authors need to generate a functional pseudovirus system (for example HIVpp/NiV F) to run both infectivity and fusion experiments (including Vpr-BlaM assay).

      We tested viral entry using a VSV/NiV pseudovirus system and the viral entry kinetics using VLPs expressing NiV-M-β-lactamase. The results are presented in Fig. S1, S4, S6, and S7.  

      Reviewer #3 (Recommendations For The Authors):

      Even low resolution EM data on VLPs or viruses would strengthen the conclusions.

      We thank this reviewer for the suggestion. We cited the NiV VLP images acquired by electron tomography 4, but we currently have limited resources to perform cryoEM on NiV VLPs.  

      References.

      (1) Liu, Q., Chen, L., Aguilar, H. C. & Chou, K. C. A stochastic assembly model for Nipah virus revealed by super-resolution microscopy. Nature Communications 9, 3050 (2018).

      (2) Khetawat, D. & Broder, C. C. A Functional Henipavirus Envelope Glycoprotein Pseudotyped Lentivirus Assay System. Virology Journal 7, 312 (2010).

      (3) Palomares, K. et al. Nipah Virus Envelope-Pseudotyped Lentiviruses Efficiently Target ephrinB2-Positive Stem Cell Populations In Vitro and Bypass the Liver Sink When Administered In Vivo. J Virol 87, 2094–2108 (2013).

      (4) Xu, K. et al. Crystal Structure of the Pre-fusion Nipah Virus Fusion Glycoprotein Reveals a Novel Hexamer-of-Trimers Assembly. PLoS Pathog 11, e1005322 (2015).

      (5) Bakker, E. & Swain, P. S. Estimating numbers of intracellular molecules through analysing fluctuations in photobleaching. Sci Rep 9, 15238 (2019).

      (6) Nayak, C. R. & Rutenberg, A. D. Quantification of Fluorophore Copy Number from Intrinsic Fluctuations during Fluorescence Photobleaching. Biophys J 101, 2284–2293 (2011).

      (7) Salavessa, L. & Sauvonnet, N. Stoichiometry of Receptors at the Plasma Membrane During Their Endocytosis Using Total Internal Reflection Fluorescent (TIRF) Microscopy Live Imaging and Single-Molecule Tracking. in Exocytosis and Endocytosis: Methods and Protocols (eds. Niedergang, F., Vitale, N. & Gasman, S.) 3–17 (Springer US, New York, NY, 2021). doi:10.1007/978-1-0716-1044-2_1.

      (8) Slenders, E. et al. Confocal-based fluorescence fluctuation spectroscopy with a SPAD array detector. Light Sci Appl 10, 31 (2021).

      (9) Annibale, P., Vanni, S., Scarselli, M., Rothlisberger, U. & Radenovic, A. Identification of clustering artifacts in photoactivated localization microscopy. Nat Methods 8, 527–528 (2011).

      (10) Baumgart, F. et al. Varying label density allows artifact-free analysis of membrane-protein nanoclusters. Nat Methods 13, 661–664 (2016).

      (11) Zanacchi, F. C. et al. A DNA origami platform for quantifying protein copy number in super-resolution. Nat Methods 14, 789–792 (2017).

      (12) Jungmann, R. et al. Multiplexed 3D cellular super-resolution imaging with DNA-PAINT and Exchange-PAINT. Nature Methods 11, 313–318 (2014).

      (13) Rubin-Delanchy, P. et al. Bayesian cluster identification in single-molecule localization microscopy data. Nat Methods 12, 1072–1076 (2015).

      (14) Griffié, J. et al. 3D Bayesian cluster analysis of super-resolution data reveals LAT recruitment to the T cell synapse. Sci Rep 7, 4077 (2017).

      (15) Griffié, J. et al. Dynamic Bayesian Cluster Analysis of Live-Cell Single Molecule Localization Microscopy Datasets. Small Methods (2018). https://onlinelibrary.wiley.com/doi/full/10.1002/smtd.201800008.

      (16) Caetano, F. A. et al. MIiSR: Molecular Interactions in Super-Resolution Imaging Enables the Analysis of Protein Interactions, Dynamics and Formation of Multi-protein Structures. PLOS Computational Biology 11, e1004634 (2015).

      (17) Malkusch, S. & Heilemann, M. Extracting quantitative information from single-molecule super-resolution imaging data with LAMA – LocAlization Microscopy Analyzer. Sci Rep 6, 34486 (2016).

      (18) Zhang, Y., Lara-Tejero, M., Bewersdorf, J. & Galán, J. E. Visualization and characterization of individual type III protein secretion machines in live bacteria. Proceedings of the National Academy of Sciences 114, 6098–6103 (2017).

      (19) Tobin, S. J. et al. Single molecule localization microscopy coupled with touch preparation for the quantification of trastuzumab-bound HER2. Sci Rep 8, 15154 (2018).

      (20) Levet, F. et al. SR-Tesseler: a method to segment and quantify localization-based super-resolution microscopy data. Nature Methods 12, 1065–1071 (2015).

      (21) Peters, R., Griffié, J., Burn, G. L., Williamson, D. J. & Owen, D. M. Quantitative fibre analysis of single-molecule localization microscopy data. Sci Rep 8, 10418 (2018).

      (22) Levet, F. et al. A tessellation-based colocalization analysis approach for single-molecule localization microscopy. Nat Commun 10, (2019).

      (23) Banerjee, C. et al. ULK1 forms distinct oligomeric states and nanoscopic structures during autophagy initiation. Science Advances 9, eadh4094 (2023).

      (24) Pageon, S. V. et al. Functional role of T-cell receptor nanoclusters in signal initiation and antigen discrimination. Proceedings of the National Academy of Sciences 113, E5454–E5463 (2016).

      (25) Cresens, C. et al. Flat clathrin lattices are linked to metastatic potential in colorectal cancer. iScience 26, 107327 (2023).

      (26) Seeling, M. et al. Immunoglobulin G-dependent inhibition of inflammatory bone remodeling requires pattern recognition receptor Dectin-1. Immunity 56, 1046-1063.e7 (2023).

      (27) Liu, Q. T. et al. The nanoscale organization of Nipah virus matrix protein revealed by super-resolution microscopy. Biophysical Journal 121, 2290–2296 (2022).

      (28) Norris, M. J. et al. Measles and Nipah virus assembly: Specific lipid binding drives matrix polymerization. Science Advances 8, eabn1440 (2022).

      (29) Patch, J. R. et al. The YPLGVG sequence of the Nipah virus matrix protein is required for budding. Virol. J. 5, 137 (2008).

      (30) Johnston, G. P. et al. Nipah Virus-Like Particle Egress Is Modulated by Cytoskeletal and Vesicular Trafficking Pathways: a Validated Particle Proteomics Analysis. mSystems 4, e00194-19 (2019).

      (31) Diederich, S. et al. Activation of the Nipah Virus Fusion Protein in MDCK Cells Is Mediated by Cathepsin B within the Endosome-Recycling Compartment. J Virol 86, 3736–3745 (2012).

      (32) Diederich, S., Thiel, L. & Maisner, A. Role of endocytosis and cathepsin-mediated activation in Nipah virus entry. Virology 375, 391–400 (2008).

      (33) Pager, C. T., Craft, W. W., Patch, J. & Dutch, R. E. A mature and fusogenic form of the Nipah virus fusion protein requires proteolytic processing by cathepsin L. Virology 346, 251–257 (2006).

    2. eLife Assessment

      This valuable study advances our understanding of how Nipah virus fusion protein F (NiV-F) organizes into nanoclusters on cell and viral membranes using biochemical and super-resolution microscopy methods. The conclusions are supported by solid evidence and the revision has addressed most of the reviewers' concerns. The relationship between clustering and fusion is of high interest and an interesting hypothesis to continue investigating in future studies.

    3. Reviewer #1 (Public review):

      Summary:

      In this work by Wang et al., the authors use single-molecule super-resolution microscopy together with biochemical assays to quantify the organization of Nipah virus fusion protein F (NiV-F) on cell and viral membranes. They find that these proteins form nanoscale clusters which favors membrane fusion activation, and that the physical parameters of these clusters are unaffected by protein expression level and endosomal cleavage. Furthermore, they find that the cluster organization is affected by mutations in the trimer interface on the NiV-F ectodomain and the putative oligomerization motif on the transmembrane domain, and that the clusters are stabilized by interactions among NiV-F, the AP2-complex, and the clathrin coat assembly. This work improves our understanding of the NiV fusion machinery, which may also have implications for our understanding of the function of other viruses.

      Strengths:

      The conclusions of this paper are well-supported by the presented data. This study sheds light on the activation mechanisms underlying the NiV fusion machinery.

    4. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Wang and co-workers employ single-molecule localization microscopy (SMLM) to detect Nipah virus fusion protein (NiV-F) on the surface of cells. They corroborate that these glycoproteins form microclusters (previously seen and characterized together with NiV-G and the Nipah matrix protein by Liu and co-workers (2018), also with super-resolution light microscopy). As also seen by Liu and co-workers, the authors show that neither the level of expression of NiV-F nor endosomal cleavage alters the identity of these microclusters. Moreover, mutations in the transmembrane domain or the hexamer-of-trimer interface seem to have a mild effect on the size of the clusters that the authors quantified. Importantly, it has also been shown that these particles tend to cluster in Nipah VLPs.

      Strengths:

      The authors have tried to perform SMLM in single VLPs and have shown partially the importance of NiV-F clustering.

      Comments on the revised version:

      I am happy with the answers the authors have provided to my questions.

    5. Reviewer #3 (Public review):

      Summary:

      The manuscript by Wang and colleagues describes single molecule localization microscopy to quantify the distribution and organization of Nipah virus F expressed on cells and on virus-like particles. Notably the crystal structure of F indicated hexameric assemblies of F trimers. The authors propose that F clustering favors membrane fusion.

      Strengths:

      The manuscript provides solid data on imaging of F clustering with the main findings of:

      - F clusters are independent of expression levels
      - Proteolytic cleavage does not affect F clustering
      - Mutations that have been reported to affect the hexamer interface reduce clustering on cells and its distribution on VLPs
      - F nanoclusters are stabilized by AP

      Comments on the revised version:

      The authors addressed most of my previous concerns.

    1. eLife Assessment

      This important work presents a novel approach to infer causal relations in non-stationary time series data. To do so, the authors introduce a novel machine-learning model of Temporal Autoencoders for Causal Inference to identify and measure the direction and strength of time-varying causal interactions. The authors provide solid evidence for their claims through thorough numerical validation and comprehensive exploration of the method on both synthetic and real-world datasets. This is a timely contribution that may have theoretical and practical implications for diverse real-life applications.

    2. Reviewer #1 (Public review):

      Summary:

      The authors make a new contribution with careful computational validation/exploration of their method on synthetic and real-world datasets. Overall, I find their results significant and their presentation compelling.

      Strengths:

      The authors provide extensive computational validation of their approach to synthetic and real-world datasets of increasing complexity.

      Weaknesses:

      The authors should provide a comparison of their approach to other state-of-the-art neural network-based methods. Without this, it is difficult to tell which aspects of their approach (novel coupling metric, or network architecture) are most important for their results.

    3. Reviewer #2 (Public review):

      Summary:

      This paper introduces a new methodology for probing time-varying causal interactions in complex dynamical systems using a novel machine-learning architecture of Temporal Autoencoders for Causal Inference (TACI) combined with a novel metric (CSGI) for assessing causal interactions using surrogate data. This is a timely contribution in the field of causal inference from temporal data which has been largely restricted to stationary time series so far. However, the benchmarking of the proposed methods could be improved.

      Strength:

      The method's capacity to uncover piecewise time-varying non-linear dynamic systems is demonstrated on synthetic datasets as well as on two real-world applications on climate and brain activity data. A particular advantage of the approach is to train a single model capturing the dynamics of the whole time series, thereby allowing for time-varying interactions to be found without retraining over different time periods.

      Weaknesses:

      (1) It is not clear why the new metric Comparative Surrogate Granger Index CSGI (Eq.6) should be better than the Extended Granger Causality Index EGCI (Eq.5), which can also be used to compare the information about y(t) contained in the actual data x(t) versus in a randomized surrogate x^s(t), as implemented in the proposed metric (Eq.6).

      (2) The benchmarking of the new approach TACI against earlier metrics (ie Surrogate Linear Granger, Convergent Cross Mapping, and Transfer Entropy) should be revised:

      (a) The details of the computation should be provided to clarify how the different metrics are estimated notably between multidimensional variables [for instance to estimate Ty->x for x=(x_1,x_2,x_3) and y=(y_1,y_2,y_3)].

      (b) Reliable implementations of the different metrics should be used, as some of the reported results do not seem right. In particular, the unidirectional examples, Eq.9 (Figure 2) and Eq.12 (Figure 5), are expected to lead to vanishing transfer entropies from Y to X, ie Ty->x =0, for all values of the coupling parameter below the synchronization threshold. This can be verified by computing transfer entropies as conditional mutual information using MIIC R package, i.e. Ty->x = I(x(t);y(t-1)|x(t-1)).

      (c) Besides, some reported benchmarks focus on peculiar non-linear systems displaying somewhat "pathological" behaviors. For instance, the two Hénon maps with unidirectional coupling Eq.12 (Figure 5) lead to an equality between the two variables, i.e. y(t)=x(t) for all t, above the synchronization threshold C>0.7. This leads mathematically to zero transfer entropy upon synchronization, as I(x(t);y(

      (d) By contrast, Eq.9 (Fig.2) leads to strongly coupled, yet non-identical variables above the synchronization threshold. This strong coupling can be shown to yield non-vanishing transfer entropies in both directions, as observed in Figure 2c, and does not correspond to "incorrect prediction of non-existent interactions", as stated in the "Summary of Results on Artificial Test Systems". Clearly synchronized variables do interact and their bidirectional transfer entropies are actually consistent with a non-causal (or bidirectional) relationship. Only a vanishing transfer entropy in one direction implies a causal relation (in the opposite direction). Likewise, vanishing transfer entropies in both directions imply either independent variables or a spurious dependency between them due to an unobserved common cause L, i.e. X<--(L)-->Y. This is usually represented with a bidirected edge (X<-->Y), which is different from a bidirectional relation corresponding to two opposite unidirectional edges (ie X-->Y and X<--Y). It is therefore surprising that TACI metric vanishes in both directions upon synchronization in this case (Eq.9, Figure 2), as one would expect to learn variable y(t) more reliably using the actual data x(

      (e) In order to assess TACI performance on non-stationary time series, it might be more informative to benchmark it on datasets displaying intermittency rather than synchrony. In particular, the change of causal directions over time, presented as one of the motivations for the new approach, should be more thoroughly benchmarked in the paper. For instance, it would be nice to demonstrate the tracking of the spontaneous reversal of causal relation in a simple 'toggle switch' regulatory network between two mutually repressing genes + expression noise. This is something that causal inference methods assuming stationarity cannot do.

      (3) Concerning the real-world applications, the analysis of the electrocorticography (ECoG) data does not seem to be in strong disagreement with the general trends of the original more detailed study by Tajima et al 2015. Could the authors better delineate what are the common versus conflicting findings between the two approaches? The main difference appears to be the near loss of interaction in the anesthetized state, which might be linked to TACI's tendency to report no interaction between synchronized variables as discussed in d) above. Does the anesthetized state correspond to a global synchrony of the brain regions? This could be easily validated by a more direct analysis of synchrony.
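      The identity quoted in point (2b), Ty->x = I(x(t);y(t-1)|x(t-1)), can be illustrated with a minimal sketch. It assumes discrete-valued series and a simple plug-in (counting) estimator; it is not the MIIC R package mentioned above, nor the estimator used in the reviewed paper. The function name `transfer_entropy` and the toy system (x copies y with a one-step delay) are hypothetical.

```python
from collections import Counter
from math import log2
import random


def transfer_entropy(source, target):
    """Plug-in estimate (in bits) of T(source -> target),
    i.e. I(target(t); source(t-1) | target(t-1)), for discrete sequences."""
    # Triples (x_t, y_{t-1}, x_{t-1}) with x = target and y = source.
    triples = list(zip(target[1:], source[:-1], target[:-1]))
    n = len(triples)
    c_xyz = Counter(triples)
    c_xz = Counter((x, z) for x, _, z in triples)
    c_yz = Counter((y, z) for _, y, z in triples)
    c_z = Counter(z for _, _, z in triples)
    # I(X;Y|Z) = sum p(x,y,z) * log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ]
    return sum(
        (c / n) * log2(c * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)]))
        for (x, y, z), c in c_xyz.items()
    )


# Toy unidirectional system (hypothetical): y is random, x copies y with a delay.
random.seed(0)
y = [random.randint(0, 1) for _ in range(5000)]
x = [0] + y[:-1]

te_y_to_x = transfer_entropy(y, x)  # driver -> follower: close to 1 bit
te_x_to_y = transfer_entropy(x, y)  # follower -> driver: close to 0 bits
```

      In this toy case the transfer entropy is large only in the driving direction, which is the pattern the reviewer expects for unidirectional coupling below the synchronization threshold. Continuous-valued data would need a binned or nearest-neighbour estimator instead.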

    1. eLife Assessment

      This study describes the application of machine learning and Markov state models to characterize the binding mechanism of alpha-Synuclein to the small molecule Fasudil. The results suggest that entropic expansion can explain such binding. However, the simulations and analyses in their present form are inadequate.

    2. Reviewer #2 (Public Review):

      The manuscript by Menon et al describes a set of simulations of alpha-Synuclein (aSYN) and analyses of these and previous simulations in the presence of a small molecule.

      Comments on latest version:

      I have read the authors' response to my comments as well as to the other reviewers. Summarizing briefly, I don't think they provide substantial answer to the questions/comments by me or reviewer 3, and generally do not quantify the results/effects data. I still remain unconvinced about the analyses and conclusions. Rather than rewriting another set of comments, I think it will be more useful for all (authors and readers) simply to be able to see the entire set of reviews and responses together with the paper.

    3. Reviewer #3 (Public Review):

      In this manuscript Menon, Adhikari, and Mondal analyze explicit solvent molecular dynamics (MD) computer simulations of the intrinsically disordered protein (IDP) alpha-synuclein in the presence and absence of a small molecule ligand, Fasudil, previously demonstrated to bind alpha-synuclein by NMR spectroscopy without inducing folding into more ordered structures. In order to provide insight into the binding mechanism of Fasudil, the authors analyze an unbiased 1500us MD simulation of alpha-synuclein in the presence of Fasudil previously reported by Robustelli et al. (Journal of the American Chemical Society, 144(6), pp.2501-2510). The authors compare this simulation to a very different set of apo simulations: 23 separate 1-4us simulations of alpha-synuclein seeded from different apo conformations taken from another simulation previously reported by Robustelli et al. (PNAS, 115 (21), E4758-E4766), for a total of ~62us.

      To analyze the conformational space of alpha-synuclein - the authors employ a variational auto-encoder (VAE) to reduce the dimensionality of Ca-Ca pairwise distances to 2 dimensions, and use the latent space projection of the VAE to build Markov state Models. The authors utilize k-means clustering to cluster the sampled states of alpha-synuclein in each condition into 180 microstates on the VAE latent space. They then coarse grain these 180 microstates into a 3-macrostate model for apo alpha-synuclein and a 6-macrostate model for alpha-synuclein in the presence of fasudil using the PCCA+ coarse graining method. Few details are provided to explain the hyperparameters used for PCCA+ coarse graining and the rationale for selecting the final number of macrostates.
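      The step from clustered frames to a Markov state model can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses a plain (non-reversible) count-based estimator, a hypothetical 3-state chain in place of the 180 k-means microstates, and the names `estimate_msm`, `stationary_distribution` and `simulate` are illustrative.

```python
import numpy as np


def estimate_msm(dtrajs, n_states, lag):
    """Row-normalised transition matrix from discrete trajectories at lag `lag` (frames)."""
    counts = np.zeros((n_states, n_states))
    for traj in dtrajs:
        traj = np.asarray(traj)
        # Count i -> j transitions within each trajectory only, never across trajectories.
        np.add.at(counts, (traj[:-lag], traj[lag:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("some state has no outgoing transitions at this lag")
    return counts / row_sums


def stationary_distribution(T):
    """Left eigenvector of T for the largest eigenvalue (1), normalised to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()


def simulate(T, n_steps, start, rng):
    """Generate one discrete trajectory from a known transition matrix."""
    traj = [start]
    for _ in range(n_steps - 1):
        traj.append(int(rng.choice(len(T), p=T[traj[-1]])))
    return np.array(traj)


# Hypothetical ground truth and several short trajectories started in different states.
rng = np.random.default_rng(0)
T_true = np.array([[0.90, 0.10, 0.00],
                   [0.05, 0.90, 0.05],
                   [0.00, 0.10, 0.90]])
dtrajs = [simulate(T_true, 2000, s % 3, rng) for s in range(6)]

T_est = estimate_msm(dtrajs, n_states=3, lag=1)
pi_est = stationary_distribution(T_est)
```

      The estimate is only as good as the number of observed transitions between states, which is why the count matrix itself (before normalisation) is worth inspecting when trajectories are short.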

      The authors analyze the properties of each of the alpha-synuclein macrostates from their final MSMs - examining intramolecular contacts, secondary structure propensities, and in the case of alpha-synuclein:Fasudil holo simulations - the contact probabilities between Fasudil and alpha-synuclein residues.

      The authors utilize an additional variational autoencoder (a denoising convolutional VAE) to compare denoised contact maps of each macrostate, and project onto an additional latent space. The authors conclude that their apo and holo simulations are sampling distinct regions of the conformational space of alpha-synuclein projected on the denoising convolutional VAE latent space.

      Finally, the authors calculate water entropy and protein conformational entropy for each microstate. To facilitate water entropy calculations - the authors take a single structure from each macrostate - and run a 20ps simulation at a finer timestep (4 femtoseconds) using a previously published method (DoSPT), which computes thermodynamic properties of water from MD simulations using autocorrelation functions of water velocities. The authors report that water entropy calculated from these individual 20ps simulations is very similar.

      For each macrostate the authors compute protein conformational entropy using a previously published Maximum Information Spanning tree approach based on torsion angle distributions - and observe that the estimated protein conformational entropy is substantially more negative for the macrostates of the holo ensemble.

      The authors calculate mean first passage times from their Markov state models and report a strong correlation between the protein conformational entropy of each state and the mean first passage time from each state to the highest populated state.
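      For reference, the mean first passage time (MFPT) into a target state follows from the transition matrix by solving a small linear system: the MFPT is zero in the target, and elsewhere m_i = tau + sum_j T_ij m_j. The sketch below assumes a hypothetical 3-state matrix and a lag time of one step; `mfpt_to_target` is an illustrative name, not a function from the reviewed work.

```python
import numpy as np


def mfpt_to_target(T, target, lag_time=1.0):
    """MFPT from every state into `target`, in units of `lag_time`."""
    n = T.shape[0]
    others = [i for i in range(n) if i != target]
    # Restrict T to the non-target states and solve (I - T_AA) m_A = lag_time * 1.
    A = np.eye(len(others)) - T[np.ix_(others, others)]
    m = np.zeros(n)
    m[others] = np.linalg.solve(A, np.full(len(others), lag_time))
    return m


# Hypothetical transition matrix; state 1 is the most populated state.
T = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.20, 0.80]])
mfpt = mfpt_to_target(T, target=1)
```

      In this example states 0 and 2 can only reach state 1 directly, so their MFPTs reduce to 1/0.10 and 1/0.20 steps.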

      As the authors observe the conformational entropy estimated from macrostates of the holo alpha-synuclein:Fasudil is greater than those estimated from macrostates of apo alpha-synuclein - they suggest that the driving force of Fasudil binding is an increase in the conformational entropy of alpha-synuclein. No consideration/quantification of the enthalpy of alpha-synuclein Fasudil binding is presented.

      Strengths:

      The authors utilize MD simulations run with an appropriate force field for IDPs (a99SB-disp and a99SB-disp water (Robustelli et al., PNAS, 115 (21), E4758-E4766)) - which has previously been used to perform MD simulations of alpha-synuclein that have been validated with extensive NMR data.

      The contact probability between Fasudil and each alpha-synuclein residue observed in the previously performed 1500us MD simulation of alpha-synuclein in the presence of Fasudil (Robustelli et al., Journal of the American Chemical Society, 144(6), pp.2501-2510) was previously found to be in good agreement with experimental NMR chemical shift perturbations upon Fasudil binding - suggesting that this simulation is a reasonable choice for understanding IDP:small molecule interactions.

      Comments on the latest version:

      While the authors have provided additional information in the updated manuscript, none of the additional analyses address the fundamental flaws of the manuscript.

      The additional analyses do not convincingly demonstrate that these two extremely different simulation datasets (1500 microsecond unbiased MD for a-synuclein + fasudil, 23 separate 1-4 microsecond simulations of apo a-synuclein) are directly comparable for the purposes of building MSMs.

      The additional analyses do not demonstrate that there are sufficient conformational transitions among kinetically metastable states observed in 23 separate 1-4 microsecond simulations of apo a-synuclein to build a valid MSM, or that the latent space of the VAE is kinetically meaningful.

      If one is interested in modeling the kinetics and thermodynamics of transitions between a set of conformational states, and they run a small number of MD simulations that are too short to see conformational transitions between conformational states - any kinetics and thermodynamics modeled by an MSM will be inherently meaningless. This is likely to be the case with the apo a-synuclein dataset analyzed in this investigation.

      Simulations of 1-4 microseconds are almost certainly far too short to see a meaningful sampling of conformational transitions of a highly entangled 140-residue IDP beyond a very local relaxation of the starting structures, and the authors provide no analyses to suggest otherwise.

      Without convincingly demonstrating reasonable statistics of conformational changes from the very small apo simulation dataset analyzed here, it seems highly likely the apparent validity of the apo MSM results from learning a VAE latent space that groups structurally and kinetically distinct conformations into similar states, creating the spurious appearance of transitions between states. As such, the kinetics and thermodynamics of the resulting MSM are likely to be relatively meaningless, and comparisons with an MSM for a-synuclein in the presence of fasudil are likely to be meaningless.

      In its present form, this study provides an example of how the use of black-box machine learning methods to analyze molecular simulations can lead to obtaining misleading results (such as the appearance of a valid MSM) - when more basic analyses are omitted.

    4. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      Reviewer #2 (Public Review):

      I have read the authors' response to my comments as well as to the other reviewers. Summarizing briefly, I don't think they provide substantial answer to the questions/comments by me or reviewer 3, and generally do not quantify the results/effects data. I still remain unconvinced about the analyses and conclusions. Rather than rewriting another set of comments, I think it will be more useful for all (authors and readers) simply to be able to see the entire set of reviews and responses together with the paper.

      The authors disagree with the views of referees. The authors have provided point-wise precise responses to each of the previous comments. The authors find that the referee has not been able to engage with the responses and accompanying analysis that were provided while communicating the previous response.

      The following extensive analyses were performed by the authors for the round 2 revision to address the comments that reviewer 2 and reviewer 3 raised on the previous versions:

      (1) We calculated the distribution of multiple metrics for both the apo and holo simulations, including their secondary structure composition, and demonstrated the robustness of our findings.

      (2) We analyzed smaller 60 µs chunks from two parts of the 1.5 ms trajectory and showed how, in combination with the Markov state modeling (MSM) approach, these chunks effectively capture equilibrium properties.

      (3) We thoroughly investigated the choice of starting structures, examining parameters such as Rg, RMSD, secondary structure, and SASA, in response to Referee 3's concerns about the objectivity of our dimension reduction approach.

      (4) We conducted multiple analyses using VAMP-scores and justified the use of a Variational Autoencoder (VAE) over tICA.

      (5) We had extensively verified the choice of hyperparameters used in constructing the MSM.

      (6) To alleviate referee concerns, we had retrained a VAE with four latent dimensions and used it to build an MSM, ensuring the robustness of our approach.

      However, we find that the referee has not considered these additional analyses in response to his/her comments on the manuscript.

      Since referee 2 also draws comments from Referee 3, it is worth noting that some of the comments from Referee 2 and Referee 3 in Round 1 were mutually contradictory. In particular, Referee 3's suggestion in Round 1 to use the same initial configuration for simulations of intrinsically disordered proteins (IDPs) in both apo and ligand-bound forms contradicts the fundamental principle that IDPs should not possess structural bias. This recommendation also directly conflicts with Referee 2's request for greater diversity in starting structures. Our manuscript provided robust evidence that our initial configurations are indeed diverse, with one configuration coincidentally matching that used in the ligand-bound simulations. Despite this, we addressed both sets of concerns in our Round 2 revisions. Unfortunately, it seems that these efforts were overlooked in the subsequent round of review.

      Referee 2's suggestion in the previous round of review comments to mix both holo and apo simulation trajectories for MSM construction is conceptually wrong and indicates a lack of understanding of transition matrix building in this field. Nevertheless, we addressed these comments by performing additional analyses and demonstrating the robustness of our current MSM.

      Reviewer #3 (Public Review):

      Summary:

      While the authors have provided additional information in the updated manuscript, none of the additional analyses address the fundamental flaws of the manuscript.

      The additional analyses do not convincingly demonstrate that these two extremely different simulation datasets (1500 microsecond unbiased MD for a-synuclein + fasudil, 23 separate 1-4 microsecond simulations of apo a-synuclein) are directly comparable for the purposes of building MSMs.

      The 23 unbiased 1-4 microsecond simulations of apo αS total ~60 us.

      Author response image 1.

      Left figure: Distribution of the radius of gyration (Rg) of the 23 apo simulations (as shown in the colourbar) and the holo simulation (black). Right figure: Mean and standard deviation (as error bar) of the Rg of the 23 apo (colourbar) and holo simulations (black).

      We have plotted the distribution of the radius of gyration (Rg) for the 23 apo simulations (colour bar) and the holo simulation (black) as shown in the left figure, and also compared the means and standard deviations of the Rg values (right figure). We find that our apo simulations span the entire space of Rg spanned by the holo simulation. We have also measured the mean and standard deviations (SD) (horizontal error bar) of the apo and holo simulations. The fact that the apo simulations have mean and SDs comparable to those of the holo ensemble suggests that the majority of the apo simulations are sampling similar conformational space as those observed in the ligand-bound holo form and hence can be used for building the MSM.
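      The quantity being compared here can be computed per frame as in the sketch below. It is a minimal illustration assuming equal masses when none are given (for example a C-alpha-only representation) and uses random coordinates as a hypothetical stand-in for a trajectory; `radius_of_gyration` is an illustrative name.

```python
import numpy as np


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one conformation (coords: N x 3)."""
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))  # assumption: equal masses
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = ((coords - center) ** 2).sum(axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


# Sanity check: two equal masses at x = -1 and x = +1 have Rg = 1.
rg_pair = radius_of_gyration([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])

# Hypothetical trajectory: 200 random frames of a 140-bead chain.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 140, 3))
rg_values = np.array([radius_of_gyration(f) for f in frames])
rg_mean, rg_sd = rg_values.mean(), rg_values.std()
```

      Per-trajectory means and standard deviations of `rg_values` are the summary statistics shown in the right-hand panel described above.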

      The additional analyses do not demonstrate that there are sufficient conformational transitions among kinetically metastable states observed in 23 separate 1-4 microsecond simulations of apo a-synuclein to build a valid MSM, or that the latent space of the VAE is kinetically meaningful.      

      We have performed the Chapman-Kolmogorov test to compare observed and predicted transition probabilities over increasing lag times and found good agreement between these probabilities, thereby suggesting that transitions between states are well-sampled for both the apo (Author response image 2) and holo simulation (Figure S9).

      Author response image 2.

      The Chapman-Kolmogorov test performed for the three state Markov State Model of the αS ensemble.
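      The logic of the Chapman-Kolmogorov test can be sketched in a few lines: the transition matrix estimated at lag tau, raised to the power k, is compared with the matrix estimated directly at lag k*tau. The example below is an assumption-laden illustration with a hypothetical 3-state chain and a plain count-based estimator; `transition_matrix`, `ck_errors` and `simulate` are illustrative names, and it does not reproduce the authors' analysis.

```python
import numpy as np


def transition_matrix(dtrajs, n_states, lag):
    """Count-based transition matrix at the given lag (assumes every state is visited)."""
    counts = np.zeros((n_states, n_states))
    for traj in dtrajs:
        traj = np.asarray(traj)
        np.add.at(counts, (traj[:-lag], traj[lag:]), 1)
    return counts / counts.sum(axis=1, keepdims=True)


def ck_errors(dtrajs, n_states, lag, k_max):
    """Largest |T(lag)^k - T(k*lag)| entry for k = 1..k_max."""
    T = transition_matrix(dtrajs, n_states, lag)
    errors = []
    for k in range(1, k_max + 1):
        predicted = np.linalg.matrix_power(T, k)                    # model prediction
        estimated = transition_matrix(dtrajs, n_states, k * lag)    # direct estimate
        errors.append(float(np.abs(predicted - estimated).max()))
    return errors


def simulate(T, n_steps, start, rng):
    traj = [start]
    for _ in range(n_steps - 1):
        traj.append(int(rng.choice(len(T), p=T[traj[-1]])))
    return np.array(traj)


# Hypothetical Markovian data: the test should pass with small deviations.
rng = np.random.default_rng(1)
T_true = np.array([[0.90, 0.10, 0.00],
                   [0.05, 0.90, 0.05],
                   [0.00, 0.10, 0.90]])
dtrajs = [simulate(T_true, 5000, s % 3, rng) for s in range(5)]
errors = ck_errors(dtrajs, n_states=3, lag=1, k_max=4)
```

      Small deviations indicate that the discretised dynamics are consistent with a Markov model at the chosen lag; they do not by themselves show that the underlying trajectories contain enough independent transitions.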

      As for the latent space of the VAE, we have compared its VAMP2 score with that of tICA. The VAE has a higher VAMP2 score than tICA, thereby indicating its efficacy in capturing slower modes for both the apo and holo simulations (Fig. S7 and S8).

      If one is interested in modeling the kinetics and thermodynamics of transitions between a set of conformational states, and they run a small number of MD simulations that are too short to see conformational transitions between conformational states - any kinetics and thermodynamics modeled by an MSM will be inherently meaningless. This is likely to be the case with the apo a-synuclein dataset analyzed in this investigation.

      We disagree with the referee’s view. The referee does not seem to understand the point of building Markov state models via short-time scale trajectories. The distribution of Rg of all the 23 apo simulations spans the entire Rg space sampled by the holo simulation, thereby suggesting that multiple short simulations can sample structures of varying sizes as sampled from the 1.5 ms holo simulation (see Author response image 1).

      Simulations of 1-4 microseconds are almost certainly far too short to see a meaningful sampling of conformational transitions of a highly entangled 140-residue IDP beyond a very local relaxation of the starting structures, and the authors provide no analyses to suggest otherwise.

      Author response image 3.

      Autocorrelation of the first principal component of the backbone dihedral for the apo (colourbar) and holo (black) simulation.

      Author response image 4.

      Autocorrelation of the second principal component of the backbone dihedral for the apo (colourbar) and holo (black) simulation.

      To assess whether the 23 short simulations capture meaningful kinetics and thermodynamics, we computed the backbone dihedrals, which were then reduced to two principal components for both the 23 apo simulations and the holo simulation. We then calculated the autocorrelation time of each component for each of the apo and holo simulations; these are plotted in Author response image 3 and Author response image 4, respectively.

      The autocorrelation for the holo simulation and most of the apo simulations is similar, thereby suggesting that the apo simulations sample conformational transitions between conformational states sufficiently and are therefore able to represent the structural changes of the system similarly to the long simulation.
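      A normalised autocorrelation function of a single time series, such as the projection onto one principal component, can be computed as in the sketch below. The AR(1) series is a hypothetical stand-in for such a projection, and `autocorrelation` is an illustrative name rather than the authors' code.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Normalised autocorrelation C(k) for k = 0..max_lag, with C(0) = 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x) / len(x)
    return np.array([
        np.dot(x[:len(x) - k], x[k:]) / ((len(x) - k) * var)
        for k in range(max_lag + 1)
    ])


# Hypothetical signal: AR(1) process with coefficient 0.9, so C(k) is about 0.9**k.
rng = np.random.default_rng(0)
phi, n = 0.9, 20000
series = np.zeros(n)
for t in range(1, n):
    series[t] = phi * series[t - 1] + rng.normal()

acf = autocorrelation(series, max_lag=50)
```

      An autocorrelation time can then be read off from the decay of `acf`, for example as the lag at which it first falls below 1/e; that estimate is only meaningful when the series is much longer than the decay time.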

      Without convincingly demonstrating reasonable statistics of conformational changes from the very small apo simulation dataset analyzed here, it seems highly likely the apparent validity of the apo MSM results from learning a VAE latent space that groups structurally and kinetically distinct conformations into similar states, creating the spurious appearance of transitions between states. As such, the kinetics and thermodynamics of the resulting MSM are likely to be relatively meaningless, and comparisons with an MSM for a-synuclein in the presence of fasudil are likely to be meaningless.

      We have shown above that the short simulations are able to capture the structural changes in the long simulation. In addition, we have compared the VAMP2 scores of the VAE and tICA for the apo and holo simulations and found that the VAE is superior in capturing long-timescale dynamics for both (Fig. S7 and S8).

      In its present form, this study provides an example of how the use of black-box machine learning methods to analyze molecular simulations can lead to obtaining misleading results (such as the appearance of a valid MSM) - when more basic analyses are omitted.

      The authors disagree with the referee’s viewpoint on our manuscript. We find that the majority of the contents of the referee’s comments are cursory and lack objectivity.

      The referee’s loose reference to machine learning as a black box reflects a lack of the basic knowledge needed to comprehend the long-proven ability of artificial deep neural networks to objectively deduce an optimal set of lower-dimensional representations of the conformational subspace of a complex biomacromolecule. The referee’s views on the manuscript ignore the extensive optimization of hyper-parameters that was carried out by the authors in developing a suitable beta-variational autoencoder framework for deducing an optimal latent space representation of the complex and fuzzy conformational landscape of an IDP such as alpha-synuclein. We had thoroughly investigated the choice of starting structures, examining parameters such as Rg, RMSD, secondary structure, and SASA, in response to Referee 3's concerns about the objectivity of our dimension reduction approach. However, we find that referee 3 has ignored the analysis provided to justify our choice.

      Referee 3's advocacy for linear dimensional reduction techniques overlooks the necessity and generality of non-linear approaches, as enabled by artificial deep neural network frameworks, demonstrated in the present manuscript. Nevertheless, our manuscript includes evidence demonstrating the optimality of our current reduced dimensions through varied dimensional analyses. Our extensive analysis, based on the VAMP-2 score, supports the sufficiency of the present dimensions compared to other linear reduction methods.

      The referee’s view that developing Markov state models (MSMs) of the apo form of alpha-synuclein from multiple 1-4 microsecond simulations is misleading suggests a lack of knowledge of the fundamental purpose and motivation for using MSMs, which is to derive long-time-scale equilibrium properties from significantly shorter adaptively sampled trajectories. The referee has overlooked the extensive analysis that the authors provided to demonstrate that the Markov state models developed from short simulation trajectories of alpha-synuclein can statistically replicate the properties derived from very long trajectories.

      ---

      The following is the authors’ response to the original reviews.

      The following extensive analyses were performed to address the reviewer comments:

      (1) We have calculated the distribution of radius of gyration (Rg), end-to-end distance (Ree), solvent accessible surface area (SASA) of the apo and holo simulations and also their secondary structure composition.

      (2) We have performed a similar analysis for the smaller 60 µs chunk from two parts of the 1.5 ms trajectory.

      (3) The choice of starting structures has been thoroughly investigated in terms of Rg, RMSD, secondary structure and SASA.

      (4) We have justified the use of VAE over tICA.

      (5) We have verified the choice of hyperparameters that were used to build the MSM.

      (6) We have retrained a VAE with four latent dimensions and used it to build MSM. 

      (7) As recommended by Referee 1, we have updated the title of the manuscript by introducing the word ‘expansion’.

      The manuscript has been accordingly revised by updating it with additional analysis.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is a well-conducted study about the mechanism of binding of a small molecule (fasudil) to a disordered protein (alpha-synuclein). Since this type of interaction has puzzled researchers for the last two decades, the results presented are welcome as they offer relevant insight into the physical principles underlying this interaction.

      Strengths:

      The results show convincingly that the mechanism of entropic expansion can explain the previously reported binding of fasudil to alpha-synuclein. In this context, the analysis of the changes in the entropy of the protein and of water is highly relevant. The combined use of machine learning for dimensional reduction and of Markov State Models could become a general procedure for the analysis of other systems where a compound binds a disordered protein.

      Weaknesses:

      It would be important to underscore the computational nature of the results, since the experimental evidence that fasudil binds alpha-synuclein is not entirely clear, at least to my knowledge.

      The experimental evidence that fasudil binds α-synuclein and potentially prevents its aggregation is reported in the paper “Fasudil attenuates aggregation of α-synuclein in models of Parkinson’s disease. Tatenhorst et al. Acta Neuropathologica Communications (2016) 4:39 DOI 10.1186/s40478-016-0310-y”. In this work, solution-state 15N-1H HSQC NMR experiments were performed on α-synuclein with increasing amounts of fasudil, which led to large chemical shift perturbations of residues Y133 and Y136. Additionally, fasudil treatment had no significant effect on the single and double mutants synT-Y133A and synT-Y136A (tyrosine replaced with alanine), as evident from immunochemistry, thereby indicating that α-synuclein aggregation can be inhibited by the interaction of the C-terminal tyrosines with fasudil. These two analyses point to specific binding sites of fasudil on α-synuclein.

      In our work, we have built an MSM using the latent dimensions of a deep learning method called VAE to address how fasudil interacts with α-synuclein. An analysis of the macrostates obtained from the MSM gives insights into how fasudil interacts with α-synuclein, in terms of transition probabilities among the states, thereby predicting which states are most favorable for binding.

      Reviewer #2 (Public Review):

      The manuscript by Menon et al describes a set of simulations of alpha-Synuclein (aSYN) and analyses of these and previous simulations in the presence of a small molecule.

      While I agree with the authors that the questions addressed are interesting, I am not sure how much we learn from the present simulations and analyses. In parts, the manuscript reads more like an attempt to apply a whole range of tools rather than with a goal of answering any specific questions.

      In this manuscript, we have employed a variational Bayesian method, the VAE, which uses variational inference to approximate the distribution of the latent variables. Compared with conventional linear dimension reduction methods such as tICA (as provided in the SI), this method was found to be better (higher VAMP2 score) at capturing slow modes, and it thereby facilitates the study of long-time dynamics. A Markov State Model was built on this lower-dimensional space, which indicated the presence of three and six states for the apo and holo simulations, respectively. The exclusivity of the states was justified by determining the backbone contact maps and further mapping these states using a denoising CNN-VAE. The increase in the number of states in the presence of the small molecule was justified by calculating the entropy of the macrostates. The entropic contribution from water remained similar across all states, while for the protein in the holo ensemble, entropy was significantly modulated (either increased or decreased) compared to the apo state. In contrast, the entropy of the apo states showed much less modulation. This proves that an increase in the number of states is primarily an entropic effect caused by the small molecule. Finally, we have compared the mean first passage time (MFPT) from the other states to the most populated state, which reveals a strong correlation between transition time and the system's entropy for both the apo and holo ensembles. However, the transition times (to the most populated state) are much lower for the holo ensemble, thereby suggesting that fasudil may potentially trap the protein conformations in the intermediate states, thereby slowing down αS in exploring the large conformational space and eventually slowing down aggregation.
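
      The MFPT comparison mentioned above can be illustrated with a minimal sketch. It assumes a row-stochastic MSM transition matrix estimated at a fixed lag time; the function name `mfpt_to_target` and its arguments are hypothetical and are not taken from the manuscript or from any MSM package.

```python
import numpy as np

def mfpt_to_target(T, target, lag_time=1.0):
    """Mean first passage time from every state to `target`.

    T is a row-stochastic transition matrix estimated at `lag_time`.
    For every state i other than the target, the MFPT m_i satisfies
        m_i = lag_time + sum_{j != target} T[i, j] * m_j,
    which is a linear system in the non-target states.
    """
    T = np.asarray(T, dtype=float)
    n = T.shape[0]
    others = [i for i in range(n) if i != target]
    # (I - T restricted to non-target states) m = lag_time * 1
    A = np.eye(n - 1) - T[np.ix_(others, others)]
    m = np.zeros(n)  # the MFPT from the target to itself is defined as 0 here
    m[others] = np.linalg.solve(A, np.full(n - 1, lag_time))
    return m
```

      For a two-state matrix with a 0.1 probability of leaving state 0 per lag, the sketch returns an MFPT of 10 lag times from state 0 to state 1, which matches the expected waiting time of a geometric process.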

      There's a lot going on in this paper, and I am not sure it is useful for the authors, readers or me to spell out all of my comments in detail. But here are at least some points that I found confusing/etc

      Major concerns

      p. 5 and elsewhere:

      I lack a serious discussion of convergence and the statistics of the differences between the two sets of simulations. On p. 5 it is described how the authors ran multiple simulations of the ligand-free system for a total of 62 µs; that is about 25 times less than for the ligand system. I acknowledge that running 1.5 ms is unfeasible, but at a bare minimum the authors should discuss and analyse the consequences of the relatively small amount of sampling. Here it is important to say that while 62 µs may sound like a lot, it is probably not enough to sample the relevant properties of a 140-residue long disordered protein.

      As to referee 2’s original comment on ‘a lot going on in the manuscript’, we believe that the complexity of the project demanded extensive analysis and objective machine learning approaches, instead of routine collective variables or traditional linear dimensional reduction techniques. This is what has been accomplished in this manuscript. For someone to get the gist of the work, the last paragraph of the introduction and the first paragraph of the conclusion provide a summary of the overall findings and investigation in the manuscript. First, a VAE-based machine learning approach demonstrates the modulation of the free energy landscape of alpha-synuclein in the presence of fasudil. Next, a Markov State Model elucidates distinct binding competing states of alpha-synuclein in the presence of the small-molecule drug. Then the MSM-derived metastable states of the alpha-synuclein monomer are structurally characterized in the presence of fasudil. Next, we mapped the macrostates in the apo and bound-state ensembles using a denoising convolutional variational autoencoder, to ensure that these are mutually distinct. Next, we show that fasudil exhibits conformation-dependent interactions with individual metastable states. Finally, the investigation quantitatively brings out entropic signatures of small-molecule binding.

      We thank the reviewer for the question. For the apo simulations, we performed 1-4 μs long simulations from 23 different starting structures, which together amounted to an ensemble of ~62 μs. In the Supplementary figures, we show analyses of how the starting structures used for the apo simulations compare with the structure used to run the holo simulations, as well as a comparison of the apo and holo ensembles in terms of structural features such as Rg, Ree, solvent accessible surface area (SASA) and secondary structure properties. This is updated in the manuscript on pages 3 and 31-33 and in figures S1-S6 and S25-S30.

      Also, regarding the choice of starting structures, we chose multiple distinct conformations from a previous simulation of the alpha-synuclein monomer, reported in Robustelli et. al, PNAS, 115 (21), E4758-E4766. The Rg values of the starting structures span the entire Rg distribution of the holo ensemble, from compact through intermediate to extended states. Importantly, the Rg distributions of the apo and holo ensembles are highly comparable and overlapping, indicating that the apo simulations, although of short timescale, have sampled the phase space locally around each starting conformation and thus covered the protein phase space as in the holo simulation. Similarly, other structural properties such as SASA, Ree and secondary structure are comparable for the two ensembles. These analyses show that the local sampling across a variety of starting conformations has ensured sufficient sampling of the IDP phase space. This is updated in the manuscript on pages 33-34 and in figures S1 and S25-S30.

      p. 7:

      The authors make it sound like a bad thing that some methods are deterministic. Why is that the case? What kind of uncertainty in the data do they mean? One can certainly have deterministic methods and still deal with uncertainty. Again, this seems like a somewhat ad hoc argument for the choice of the method used.

      We appreciate the reviewer’s comment. In this work, we have used a single VAE model to map the simulation of αS in its apo state and in the presence of fasudil, into two dimensions. If we had used an autoencoder, which is a deterministic model, we would have to train two independent models; one for the apo-state and one for fasudil. It would then be questionable to compare the two dimensions obtained from two different autoencoders as the model parameters are not shared. 

      VAE gives us this flexibility by not mapping it to a single point, but to a distribution, thereby encouraging it to learn more generalizable representation. The uncertainty is not in the data; but mapping a conformation (of the fasudil simulation) to a distribution would provide a new point for a similar structure (from the apo simulation). 

      p. 8:

      The authors should make it clear (i) what the reconstruction loss and KL is calculated over and (ii) what the RMSD is calculated over.

      (i) The reconstruction loss is calculated between the reconstructed and original pairwise distances, whereas the KL loss is calculated between the approximated posterior distribution and the prior distribution (for VAE it is a standard normal distribution)

      (ii) The RMSE is the root mean square error between the original data and the reconstructed data. 

      (i) is updated on page 34 and (ii) is updated in the revised manuscript on page 8.

      p. 9/figure 1:

      The authors select a beta value that may be the minimum, but then is just below a big jump in the cross-validation error. Why does the error jump so much, and isn't it slightly dangerous to pick a value close to such a large jump?

      In this work, RMSE has been chosen as the metric to select the best VAE model. To do so, the β parameter (weighting factor for the KL loss) was varied. The selected β value is the one that gave the minimum RMSE.

      This is updated on page 8.
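
      To make the quantities in the two responses above concrete, here is a minimal NumPy sketch of a β-weighted VAE objective: a mean-squared reconstruction term plus β times the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior. The function name, argument names, and array shapes are illustrative assumptions; this is not the authors' training code.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Return (total, reconstruction, KL, RMSE) for one batch.

    x, x_recon : (n_samples, n_features) original and reconstructed inputs
                 (for example, flattened pairwise distances).
    mu, log_var: (n_samples, n_latent) mean and log-variance of the
                 approximate posterior q(z|x).
    beta       : weighting factor on the KL term (assumed hyperparameter).
    """
    x, x_recon = np.asarray(x, float), np.asarray(x_recon, float)
    mu, log_var = np.asarray(mu, float), np.asarray(log_var, float)
    # Reconstruction loss: mean squared error over all entries.
    recon = np.mean((x - x_recon) ** 2)
    # KL(q(z|x) || N(0, I)) per sample, averaged over the batch.
    kl_per_sample = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)
    kl = np.mean(kl_per_sample)
    # RMSE between original and reconstructed data.
    rmse = np.sqrt(recon)
    return recon + beta * kl, recon, kl, rmse
```

      In a β scan of the kind described, one would train one model per β value and compare the RMSE on held-out data; the sketch only evaluates the loss terms and does not perform that training.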

      p. 10:

      Why was a 2-dimensional representation used in the VAE? What evidence do the authors have that the representation is meaningful? The authors state "The free energy landscape represents a large number of spatially close local minima representative of energetically competitive conformations inherent in αS" but they do not say what they mean by "spatially close". In the original space? If so, where is the evidence.

      We thank the reviewer for the question. Even though an increase in the number of latent dimensions may make the model more accurate, this can also result in overfitting. The model can simply memorize the pattern in the data instead of generalizing them. A higher dimensional latent space is also more difficult to interpret; therefore, we chose two dimensions. 

      The reconstruction loss (which is the mean squared error between the input and the reconstructed data) is of the order of 10⁻⁴. Also, the MSM built on the latent space of VAE is able to identify states that are distinct for both apo and holo simulations, which ensures that the latent space representation is meaningful.

      We have also trained a model with 4 neurons in the latent space and built an MSM. The implied timescales indicate the presence of six states which is consistent with the model with two latent dimensions.

      This is updated in the manuscript on page 13 and figure S14-S15.

      No, not spatially close in the original space, but in the reduced two dimensional latent space.

      p. 10:

      It is not clear from the text whether the VAEs are the same for both aSYN and aSYN-Fasudil. I assume they are. Given that the Fasudil dataset is 25x larger, presumably the VAE is mostly driven by that system. Is the VAE an equally good representation of both systems?

      Yes, the same model is used for both aSYN and aSYN-Fasudil ensemble.

      The states obtained from the MSM of the aSyn ensemble are distinct when their Cα contact maps are analyzed. So we think it is a good representation for this system.

      p. 10/11:

      Do the authors have any evidence that the latent space representation preserves relevant kinetic properties? This is a key point because the entire analysis is built on this. The choice of using z1 and z2 to build the MSM seems somewhat ad hoc. What do the auto-correlation functions of z1 and z2 look like? Are they related to the dynamics of some key structural properties like Rg or transient helical structure?

      Autocorrelation of z1 and z2 of the latent space of VAE and the radius of gyration for asyn-fasudil simulation.

      Author response image 5.

      We find that z1 of VAE has a much slower decay as compared to Rg. This indicates that it is much better in capturing long-time-scale dynamics as compared to Rg.
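
      For readers who want to reproduce this kind of comparison, the following is a minimal sketch of a normalized autocorrelation function for a single time series (for example z1, z2, or Rg along a trajectory). The function name and the `max_lag` argument are assumptions made for illustration.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Normalized autocorrelation C(lag) for lag = 0..max_lag.

    C(0) equals 1 by construction; a slower decay of C with lag indicates
    that the observable carries longer-lived (slower) dynamics.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    var = np.dot(x, x) / n
    acf = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Average product of fluctuations separated by `lag` frames.
        acf[lag] = np.dot(x[: n - lag], x[lag:]) / ((n - lag) * var)
    return acf
```

      Applying the same function to each observable and plotting the curves against lag time in physical units allows a direct comparison of decay rates.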

      p. 11:

      What's the argument for not building an MSM with states shared for aSYN +- Fasudil?

      We have built two different Markov state models for the two aSYN simulations, one in the apo state and one in the presence of the ligand. Mixing the two latent spaces to build one MSM would give incorrect transition timescales among the states, as these are independent simulations.

      p. 12:

      Fig. 3b/c show quite clearly that the implied timescales are not converged at the chosen lag time (incidentally, it would have been useful with showing the timescales in physical time). The CK test is stated to be validated with "reasonable accuracy", though it is unclear what that means.

      We have mentioned the physical timescales in the main manuscript (Page no. 38), which are 36 and 32 ns for the apo and holo simulations, respectively. We used “reasonable accuracy” in the context of the Chapman-Kolmogorov test. We note that for the ligand simulations, the estimated and predicted models are in excellent agreement as compared to some of the transitions in the apo state. This good agreement implies that the model has reached Markovianity and the timescales have converged. 

      The CK test is updated in the manuscript on page 12.
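
      As background for this exchange, the sketch below shows the two standard checks under discussion for a discretized trajectory: implied timescales, t_i = -τ / ln|λ_i(τ)|, and a Chapman-Kolmogorov comparison of T(τ)^k against a transition matrix estimated directly at lag kτ. It uses a plain count-based (non-reversible) estimator, and all function names are hypothetical; MSM packages provide more careful estimators.

```python
import numpy as np

def estimate_transition_matrix(dtrajs, n_states, lag):
    """Row-normalized transition counts at the given lag (in frames)."""
    counts = np.zeros((n_states, n_states))
    for d in dtrajs:
        d = np.asarray(d)
        if len(d) > lag:
            np.add.at(counts, (d[:-lag], d[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / np.where(row_sums > 0, row_sums, 1.0)

def implied_timescales(T, lag):
    """Implied timescales (in frames) from the non-stationary eigenvalues."""
    evals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(evals[1:])

def ck_max_deviation(dtrajs, n_states, lag, k):
    """Largest absolute difference between T(lag)^k and T(k * lag)."""
    predicted = np.linalg.matrix_power(
        estimate_transition_matrix(dtrajs, n_states, lag), k
    )
    estimated = estimate_transition_matrix(dtrajs, n_states, k * lag)
    return np.max(np.abs(predicted - estimated))
```

      Implied timescales that level off with increasing lag, together with small Chapman-Kolmogorov deviations, are the usual evidence that a lag time is adequate; multiplying the timescales by the frame spacing converts them to physical time.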

      p. 12:

      In Fig. 3d, what are the authors bootstrapping over? What are the errors if the authors analyse sampling noise (e.g. bootstrap over simulation blocks)?

      For bootstrapping, we randomly deleted a part of the simulation (simulation block) and rebuilt the MSM with this reduced dataset. We repeated this 10 times and reported the average value of the population and the transition timescales over the 10 iterations.  
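
      A delete-a-block resampling of this kind can be sketched as follows. The example drops a random subset of discretized trajectories, re-estimates a simple count-based transition matrix, and collects the stationary populations over 10 repeats. The function names, the `drop_fraction` value, and the small pseudocount are illustrative assumptions and not the authors' protocol.

```python
import numpy as np

def transition_matrix(dtrajs, n_states, lag=1, pseudocount=1e-8):
    """Count-based transition matrix; the pseudocount avoids empty rows."""
    counts = np.full((n_states, n_states), pseudocount)
    for d in dtrajs:
        d = np.asarray(d)
        if len(d) > lag:
            np.add.at(counts, (d[:-lag], d[lag:]), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)

def stationary_distribution(T):
    """Left eigenvector of T with the largest eigenvalue, normalized to 1."""
    evals, evecs = np.linalg.eig(T.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()

def block_bootstrap_populations(dtrajs, n_states, n_iter=10,
                                drop_fraction=0.2, seed=0):
    """Mean and standard deviation of state populations over resamples."""
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(round(len(dtrajs) * (1.0 - drop_fraction))))
    pops = []
    for _ in range(n_iter):
        # Randomly delete whole blocks (trajectories) and rebuild the model.
        keep = rng.choice(len(dtrajs), size=n_keep, replace=False)
        T = transition_matrix([dtrajs[i] for i in keep], n_states)
        pops.append(stationary_distribution(T))
    pops = np.array(pops)
    return pops.mean(axis=0), pops.std(axis=0)
```

      Reporting the standard deviation across resamples alongside the mean gives a direct estimate of the sampling noise that the reviewer asks about.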

      p. 13:

      I appreciate that the authors build an MSM using only a subset of the fasudil simulations. Here, it would be important that this analysis includes the entire workflow so that the VAE is also rebuilt from scratch. Is that the case?

      The VAE model was trained over data points of the ligand simulation sampled at every 9 ns starting from time t=0, for the entire 1.5 ms. We did not train it for the subset of the fasudil simulation, but rather used the trained VAE model to get the latent space of the 60 μs of the fasudil simulation to build the MSM. Additionally, we have compared the distributions of Rg for this simulation block with the apo ensemble and found good agreement among them. 

      Rg distribution is updated in the manuscript on page 13 and see figure S10-S11.

      p. 18:

      I don't understand the goal of building the CVAE and DCVAE. Am I correct that the authors are building a complex ML model using only 3/6 input images? What is the goal of this analysis? As it stands, it reads a bit like simply wanting to apply some ML method to the data. Incidentally, the table in Fig. 6C is somewhat intransparent.

      We appreciate the reviewer’s valid question. The ensemble-averaged contact maps of the macrostates of aSyn in the apo state and in the presence of the ligand posed a challenge in finding contacts that are exclusive to each state. Since VAEs are excellent at finding patterns, we employed a convolutional VAE (typically used for images). However, owing to the small number of contact maps, the model overfitted, and to prevent this we added noise to the data. A visual inspection of the ensemble-averaged contact map is difficult, especially for IDPs, and this lower-dimensional space gives us a preliminary idea of how each macrostate differs from every other. The table in Fig. 6C provides scores for the denoised contact maps (SSIM and PSNR scores). An SSIM score above 0.9 and a PSNR score between 20 and 48 indicate that the reconstruction of the contact map is of good quality.
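
      For reference, the two image-quality scores quoted above can be computed as in the sketch below. PSNR follows its standard definition, 10·log10(MAX²/MSE). The SSIM shown here is a simplified single-window (global) variant that uses the usual stabilizing constants; the commonly used SSIM averages the same expression over local sliding windows, so values from this sketch are only indicative. The function names and the `data_range` default are assumptions.

```python
import numpy as np

def psnr(original, reconstructed, data_range=1.0):
    """Peak signal-to-noise ratio in dB (infinite for identical inputs)."""
    a = np.asarray(original, float)
    b = np.asarray(reconstructed, float)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range**2 / mse)

def global_ssim(x, y, data_range=1.0):
    """Structural similarity computed over the whole map as one window."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx**2 + my**2 + c1) * (vx + vy + c2)
    )
```

      Both functions take an original and a reconstructed contact map with values in the range [0, data_range].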

      p. 22:

      "Our results indicate that the interaction of fasudil with αS residues governs the structural features of the protein."

      What results indicate this?

      By building Markov State Models and comparing them across the apo and holo ensembles, we showed that the interaction of fasudil with aSyn leads to the population of more states than in the apo ensemble. In these states, we observe that fasudil interacts with aSyn in different regions, as shown by the protein-ligand contact map in figure 7. Also, the contact maps and the extent of secondary structure are distinct across the six states. The location and extent of the helix and sheet-like character in the ensemble of the six macrostates are shown in figures S16-S17. Based on these observations, we state that the interaction of the small molecule favors the population of new aSyn states that are distinct in their structural features.

      p. 23:

      The authors should add some (realistic) errors to the entropy values quoted. Fig. 8 have some error bars, though they seem unrealistically small. Also, is the water value quoted from the same force field and conditions as for the simulations?

      The error values are the standard deviations that are provided by the PDB2ENTROPY package. Yes, the water value is from the same force field, and the conditions are the same as for the simulations, as reported in the section “Entropy of water”.

      p. 23:

      Has PDB2ENTROPY been validated for use with disordered proteins?

      Yes, it has been used in the following paper studying liquid-liquid phase separation of an IDP. 

      This paper has also been cited in the manuscript (reference 66).

      “Thermodynamic forces from protein and water govern condensate formation of an intrinsically disordered protein domain” by Saumyak Mukherjee & Lars V. Schäfer, Nature Communications volume  14, Article number: 5892 (2023) https://doi.org/10.1038/s41467-023-41586-y

      p. 23/24:

      It would be useful to compare (i) the free energies of the states (from their populations), (ii) the entropies (as calculated) and (iii) the enthalpies (as calculated e.g. as the average force field energy). Do they match up?

      Our analysis stems from previous studies where enthalpy driven drug design has not led to significant advances in drug design, particularly for IDPs. In the presence of the drug/ligand, the protein may be able to explore a larger conformational space and hence an increase in the number of states accessible by the protein, which we found by building Markov State Model using the latent space of VAE. The entropy of the protein is calculated based on the torsional degrees of freedom relative to the random distribution (the protein with the most random configuration).
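
      Item (i) of the reviewer's request, free energies from state populations, follows from the Boltzmann relation ΔG_i = -k_B·T·ln(p_i / p_max). The sketch below evaluates it for a vector of macrostate populations. The temperature default of 300 K and the function name are placeholders chosen for illustration and are not values taken from the manuscript.

```python
import numpy as np

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def relative_free_energies(populations, temperature=300.0):
    """Free energy of each state relative to the most populated one.

    populations : stationary populations of the macrostates (any positive
                  weights; they are normalized internally).
    temperature : temperature in kelvin (assumed value, set to match the
                  simulation conditions when used in practice).
    Returns values in kcal/mol, with 0 for the most populated state.
    """
    p = np.asarray(populations, dtype=float)
    p = p / p.sum()
    g = -K_B_KCAL * temperature * np.log(p)
    return g - g.min()
```

      Combining these values with separately computed entropies (as T·ΔS) and average force-field energies would allow the consistency check ΔG ≈ ΔH - T·ΔS that the reviewer suggests.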

      p. 31:

      It is unclear which previous simulation the new aSYN simulations were launched from. What is the size of the box used?

      The starting conformations for the new aSYN simulations were randomly chosen from a previously reported 73 μs simulation in Robustelli et. al. (PNAS, 115 (21), E4758-E4766). 

      The box sizes for the 23 simulations have been added to the supplemental information in Table S1.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript Menon, Adhikari, and Mondal analyze explicit solvent molecular dynamics (MD) computer simulations of the intrinsically disordered protein (IDP) alpha-synuclein in the presence and absence of a small molecule ligand, Fasudil, previously demonstrated to bind alpha-synuclein by NMR spectroscopy without inducing folding into more ordered structures. In order to provide insight into the binding mechanism of Fasudil the authors analyze an unbiased 1500us MD simulation of alpha-synuclein in the presence of Fasudil previously reported by Robustelli et.al. (Journal of the American Chemical Society, 144(6), pp.2501-2510). The authors compare this simulation to a very different set of apo simulations: 23 separate 1-4us simulations of alpha-synuclein seeded from different apo conformations taken from another simulation previously reported by Robustelli et. al. (PNAS, 115 (21), E4758-E4766), for a total of ~62us.

      To analyze the conformational space of alpha-synuclein - the authors employ a variational autoencoder (VAE) to reduce the dimensionality of Ca-Ca pairwise distances to 2 dimensions, and use the latent space projection of the VAE to build Markov state Models. The authors utilize k-means clustering to cluster the sampled states of alpha-synuclein in each condition into 180 microstates on the VAE latent space. They then coarse grain these 180 microstates into a 3-macrostate model for apo alpha-synuclein and a 6-macrostate model for alpha-synuclein in the presence of fasudil using the PCCA+ coarse graining method. Few details are provided to explain the hyperparameters used for PCCA+ coarse graining and the rationale for selecting the final number of macrostates.

      The authors analyze the properties of each of the alpha-synuclein macrostates from their final MSMs - examining intramolecular contacts, secondary structure propensities, and in the case of alpha-synuclein:Fasudil holo simulations - the contact probabilities between Fasudil and alpha-synuclein residues.

      The authors utilize an additional variational autoencoder (a denoising convolutional VAE) to compare denoised contact maps of each macrostate, and project onto an additional latent space. The authors conclude that their apo and holo simulations are sampling distinct regions of the conformational space of alpha-synuclein projected on the denoising convolutional VAE latent space.

      Finally, the authors calculate water entropy and protein conformational entropy for each macrostate. To facilitate water entropy calculations - the authors take a single structure from each macrostate - and run a 20ps simulation at a finer timestep (4 femtoseconds) using a previously published method (DoSPT), which computes thermodynamic properties of water from MD simulations using autocorrelation functions of water velocities. The authors report that water entropy calculated from these individual 20ps simulations is very similar.

      For each macrostate the authors compute protein conformational entropy using a previously published Maximum Information Spanning tree approach based on torsion angle distributions - and observe that the estimated protein conformational entropy is substantially more negative for the macrostates of the holo ensemble.

      The authors calculate mean first passage times from their Markov state models and report a strong correlation between the protein conformational entropy of each state and the mean first passage time from each state to the highest populated state.

      As the authors observe that the conformational entropy estimated from the macrostates of holo alpha-synuclein:Fasudil is greater than that estimated from the macrostates of apo alpha-synuclein - they suggest that the driving force of Fasudil binding is an increase in the conformational entropy of alpha-synuclein. No consideration/quantification of the enthalpy of alpha-synuclein Fasudil binding is presented.

      Strengths:

      The authors utilize MD simulations run with an appropriate force field for IDPs (a99SB-disp and a99SB-disp water; Robustelli et. al, PNAS, 115 (21), E4758-E4766) - which has previously been used to perform MD simulations of alpha-synuclein that have been validated with extensive NMR data.

      The contact probability between Fasudil and each alpha-synuclein residue observed in the previously performed 1500us MD simulation of alpha-synuclein in the presence of Fasudil (Robustelli et. al., Journal of the American Chemical Society, 144(6), pp.2501-2510) was previously found to be in good agreement with experimental NMR chemical shift perturbations upon Fasudil binding - suggesting that this simulation is a reasonable choice for understanding IDP:small molecule interactions.

      Weaknesses:

      Major Weakness 1: Simulations of apo alpha-synuclein and holo simulations of alpha-synuclein and fasudil are not comparable.

      The most robust way to determine how the presence of Fasudil affects the conformational ensemble of alpha-synuclein is to run apo and holo simulations of the same length from the same starting structures using the same simulation parameters.

      The 23 independent 1-4 us simulations of apo alpha-synuclein and the long unbiased 1500us simulation of alpha-synuclein in the presence of fasudil are not directly comparable. The starting structures of the simulations used to build a Markov state model to describe apo alpha-synuclein were taken from a previously reported 73us MD simulation of alpha-synuclein run with the a99SB-disp force field and water model with 100mM NaCl (Robustelli et. al, PNAS, 115 (21), E4758-E4766). As the holo simulation of alpha-synuclein and Fasudil was run in 50mM NaCl, snapshots from the original apo alpha-synuclein simulation were resolvated with 50mM NaCl - and new simulations were run.

      No justification is offered for how starting structures were selected. We have no sense of the conformational variability of the starting structures selected and no sense of how these conformations compare to the alpha-synuclein conformations sampled in the holo simulation in terms of standard structural descriptors such as tertiary contacts, secondary structure, radius of gyration (Rg), solvent exposed surface area etc. (we only see a comparison of projections on an uninterpretable non-linear latent-space and average contact maps). Additionally, 1-4 us is a relatively short timescale for a simulation of a 140 residue IDP - and one is unlikely to see substantial evolution for many structural properties of interest (ie. secondary structure, radius of gyration, tertiary contacts) in simulations this short. Without any information about the conformational space sampled in the 23 apo simulations (aside from a projection on an uninterpretable latent space) - we have no way to determine if we observe transitions between distinct states in these short simulations, and therefore if it is possible to construct a meaningful MSM from these simulations.

      If the structures used for apo simulations are on average more compact or contain more tertiary contacts - then it is unsurprising that in short independent simulations they sample a smaller region of conformational space. Similarly, if the starting structures have similar dimensions - but we only observe extremely local sampling around starting structures in apo simulations in the short simulation times - it would also not be surprising that we sample a smaller amount of conformational space. By only presenting comparisons of conformational states on an uninformative VAE latent space - it is not possible for a reader to ask simple questions about how the conformational ensembles compare.

      It is noted that the authors attempt to address questions about sampling by building an MSM of single contiguous 60us portion of the holo simulation of alpha-synuclein and Fasudil - noting that:

      "the MSM built using lesser data (and same amount of data as in water) also indicated the presence of six states of alphaS in presence of fasudil, as was observed in the MSM of the full trajectory. Together, this exercise invalidates the sampling argument and suggests that the increase in the number of metastable macrostates of alphaS in fasudil solution relative to that in water is a direct outcome of the interaction of alphaS with the small molecule."

      However, the authors present no data to support this assertion - and readers have no sense of how the conformational space sampled in this portion of the trajectory compares to the conformational space sampled in the independent apo simulations or the full holo simulation. As the analyzed 60us portion of the holo trajectory may have no overlap with conformational space sampled in the independent apo simulations - it is unclear if this control provides any information. There is no quantification of the conformational entropy of the 6 states obtained from this portion of the holo trajectory or the full conformational space sampled. No information is presented to determine if we observe similar states in the shorter portion of the holo trajectory. Furthermore - as the authors provide almost no justification for the criteria used to select the final number of macrostates for any of the MSMs reported in this work - and the number of macrostates is effectively a free parameter in the PCCA+ method, arriving at an MSM with 6 macrostates does not convey any information about the conformational entropy of alpha-synuclein in the presence or absence of ligands. Indeed - the implied timescale plot for the 60us holo MSM (Figure S2) - shows that at least 10 processes are resolved in the 120 microstate model - and there is no information provided explaining/justifying how a final 6-macrostate model was determined. The authors also do not project the conformations sampled in this sub-trajectory onto the latent space of the final VAE.

      One certainly expects that an MSM built with 1/20th of the simulation data should have substantial differences from an MSM built from the full trajectory - so failing additional information and hyperparameter justification - one wonders if the emergence of a 6-state model could be the direct result of hardcoded VAE and MSM construction hyperparameter choices.

      Required Controls For Supporting the Conclusions of the Study: The authors should initiate apo and holo simulations from the same starting structures - using the same simulation software and parameters. This could be done by adding a Fasudil ligand to the apo structures - or by removing the Fasudil ligand from a subset of holo structures. This would enable them to make apples-to-apples comparisons about the effect of Fasudil on alpha-synuclein conformational space.

      Failing to add direct apples-to-apples comparisons, which would be required to truly support the study's conclusions, the authors should at least compare the conformational space sampled in the independent apo simulations and holo simulations using standard interpretable IDP order parameters (ie. Rg, end-to-end distance, secondary structure order parameters) and/or principal components from PCA or tICA obtained from the holo simulation. The authors should quantify the number of transitions observed between conformational states in their apo simulations. The authors could also perform more appropriate holo controls, without additional calculations, by taking batches of a similar number of short 1-4us segments of simulations used to compute the apo MSMs and examining how the parameters/macrostates of the holo MSMs vary with the input with random selections.

      In the case of IDPs, one should not bias the simulation by starting from identical structures, as an IDP does not have a defined structure and the starting configuration has little significance. It is the microenvironment that matters most. As for the choice of simulation software and parameters, we have used the same force field that was used in the holo simulation, at the same temperature and the same salt concentration. We have performed multiple independent simulations from structures that have varying structural signatures such as Rg, SASA and secondary structure content. In fact, the starting structures for the apo simulations covered the entire span of the Rg distribution of the holo simulation, including the starting structure of the holo simulation. The simulations are unbiased w.r.t. the starting structure. Although the fasudil simulation was run for 1.5 ms, we should also understand that it is difficult to run a millisecond range of simulation in reasonable time from a single starting structure. It is exactly for this reason that we start with different structures, so that we do not bias ourselves and sample every possible conformation. 

      We have updated the manuscript on page 33-34 and figure S1, S25-S30.

      Considering the computational expense for simulating 1.5 ms timescale of a 140-residue IDP, we generated an ensemble from multiple short runs amounting to ~60 µs. The premise of this investigation is a widely popular method, Markov State Models (MSMs) that can be used to estimate long timescale kinetics and stationary populations of metastable states built from ensembles of short simulations. We have also demonstrated that comparable to the apo data, when we build an MSM for asyn-fasudil (holo) using 60 µs simulation block, the implied timescales (ITS) plot shows identical number of metastable states as for the 1.5 ms data.  

      An intrinsically disordered protein (IDP) is not represented by a fixed structure. Therefore, it would be most appropriate to run multiple simulations starting from different initial structures and simulate the local environment around those structures, thus generating an ensemble that effectively samples the phase space. Accordingly, for initiating the apo simulations, instead of biasing the initial structure (using the starting structure used for simulations with fasudil), we randomly chose 23 different conformations from the 73 µs long simulation of α-synuclein monomer reported in Robustelli et al., PNAS, 115 (21), E4758-E4766. In response to the reviewer's request for a justification of the choice of starting structures for the apo simulations, we provide a compilation of figures below comparing standard conformational properties of the chosen initial structures for the apo simulations with the starting structure of the long holo simulation; we have also provided comparative analyses of the apo (~60 µs) and holo (1.5 ms) ensemble properties.

      Figure S1 compares the Rg of the apo and holo ensembles of ~60 μs and 1.5 ms, respectively. The distributions largely overlap, indicating that the apo ensemble is comparable to the holo ensemble in terms of the extent of compaction of the conformations. In Figure 1, we have also marked the Rg values corresponding to the starting structures used to seed the apo simulations. It is evident that the 23 starting conformations chosen represent the whole range of the Rg space that is sampled in the holo ensemble. Therefore, while the apo simulations are relatively short (1-4 μs), the local sampling of these multiple starting conformations of variable compaction (Rg) ensures that the phase space is efficiently sampled and the resulting ensemble is comparable to the holo ensemble. Furthermore, an MSM built on such an ensemble can be used to identify metastable states and the long-timescale transitions between them.

      Another property that is proportional to Rg is the end-to-end distance of the protein conformations. Figure S2 shows that the distribution of this property in the apo and holo ensembles are highly similar.

      Figure S3 depicts another fundamental structural descriptor i.e. solvent accessible surface area (SASA) that indicates the extent of folding and the exposure of the residues. The apo ensemble only shows a minimal shift in the distribution towards higher SASA values. The distributions of the two ensembles largely overlap. 

      In Figure S25, we have provided the root mean square deviation (RMSD) of the starting structures used in the apo simulations with the structure used to start the long simulation with fasudil. The RMSD values range from 1.6 to 3 nm, indicating that the starting structures used are highly variable. This is justifiable for IDPs since they are not identified by a single, fixed structure, but rather by an array of different conformations.  

      Figures S26-S28 show the fraction of the secondary structure elements i.e. helix, beta and coil in the starting structures of apo and holo simulations. All the conformations are mostly disordered in nature with the greatest extent of coil content. The helix content ranges from 3-10 % while sheet content varies from 3-15 % in the initial simulation structures. 

      Figures S4-S6 represent the residue-wise percentage of secondary structure elements (helix, beta and coil) in the apo and holo ensembles. It is evident that the extent of secondary structure is comparable in the two ensembles.

      The above analyses comparing distributions of several structural features clearly indicate that the apo simulations we performed from different starting structures have effectively sampled the phase space as the single long simulation of the holo system.
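
      As an illustration only, the kind of comparison described above can be sketched in a few lines of NumPy. This is not the analysis code used for the manuscript: the function names, the unweighted (not mass-weighted) Rg, and the histogram-overlap measure are all assumptions made for this sketch, and the coordinates are assumed to be already loaded as arrays of shape (n_frames, n_atoms, 3).

```python
import numpy as np

def radius_of_gyration(coords):
    """Unweighted radius of gyration per frame.

    coords: array of shape (n_frames, n_atoms, 3).
    Assumption: all atoms carry equal weight (no mass weighting).
    """
    centered = coords - coords.mean(axis=1, keepdims=True)
    return np.sqrt((centered ** 2).sum(axis=2).mean(axis=1))

def end_to_end_distance(coords):
    """Distance between the first and last atom in each frame."""
    return np.linalg.norm(coords[:, -1] - coords[:, 0], axis=1)

def histogram_overlap(a, b, bins=50):
    """Overlap of two normalized histograms on a shared range.

    Returns a value between 0 (disjoint) and 1 (identical histograms).
    This is a hypothetical summary statistic chosen for illustration.
    """
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    pa, _ = np.histogram(a, bins=bins, range=(lo, hi))
    pb, _ = np.histogram(b, bins=bins, range=(lo, hi))
    pa = pa / pa.sum()
    pb = pb / pb.sum()
    return np.minimum(pa, pb).sum()
```

      Applying `radius_of_gyration` to the apo and holo coordinate arrays and passing the two results to `histogram_overlap` gives a single number summarizing how similar the two Rg distributions are; the same can be done for the end-to-end distance.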

      We have discussed the above in the manuscript: Computational Methods section, Page 33-34.

      The above VAMP score analyses (Figures S7 and S8) have now been presented in the manuscript: Results and Discussion (Page 8).

      Building the MSM

      While building the MSM, we iteratively varied the hyperparameters to build a reasonable model. In this process, we explored different values of the number of clusters, maximum number of iterations, tolerance, stride, metric, seed, chunk size and initialization methods. These hyperparameters cannot be optimized using gradient descent methods, as no convergence would be guaranteed. The parameters were tuned carefully so that we get the best possible implied timescales of the system. The quality of the MSM was further validated using the Chapman-Kolmogorov (CK) test on a state-by-state basis, i.e., by considering the transitions between each pair of the metastable states. In addition, we have built the contact maps to show that the states are mutually exclusive. This is also supported by the latent space of the denoising convolutional variational autoencoders.

      We have compared the conformational space in the independent apo and holo simulations for Rg, Ree, SASA and secondary structure. As for PCA/tICA, we have computed the VAMP-2 score for tICA and found it to be lower than that of the VAE. In fact, neural networks have previously been shown to be a better dimension reduction technique than linear methods such as PCA or tICA because of their non-linearity.

      Author response image 6.

      Distribution of (a) Rg, (b) Ree and (c) SASA of the apo ensemble and a 60 μs slice of the holo simulation trajectory. (d) ITS plot of the 60 μs chunk.

      First, someone familiar with MSM should understand that the basic philosophy of MSM is not the requirement of long simulation trajectories, which would defeat the purpose of its usage. Rather as motivated by Noe and coworkers in seminal PNAS (vol. 106, page 9011, year 2009) paper, MSM plays an important role in inferring long-time scale equilibrium properties by using significantly short-length scale non-equilibrium trajectories. 

      Considering the difference in the size of the ensembles in the apo and holo simulations, we checked how the MSM built using a 60 μs slice of the data from the 1.5 ms holo simulation differs in terms of the number of metastable states identified by the model. For this, we considered the 60 μs of data spanning 966 μs to 1026 μs. First, we compared the gross structural properties of these datasets. Author response image 6a-c compares the distributions of Rg, Ree and SASA. The distributions show that the apo and holo simulations are very similar with respect to these standard properties of protein conformations.

      We built the MSM for this 60 μs data of the holo ensemble from the reduced data obtained from the same VAE model. We would like to clarify that the hyperparameters of the model are not hardcoded but rather carefully fine-tuned to obtain a good model that performs good kinetic discretization of the underlying macrostates. The implied timescale plot of this new MSM shows distinct timescales corresponding to six macrostates. This led us to conclude that the six-state model is robust despite the differences in the ensemble size. The implied timescale is shown in Author response image 6d.

      The above analyses in Author response image 6 are presented in Results and Discussion, Page 13. 

      Major Weakness 2: There is little justification of how the hyperparameters MSMs were selected. It is unclear if the results of the study depend on arbitrary hyperparameter selections such as the final number of macrostates in each model.

      It is unclear what criteria were used to determine the appropriate number of microstates and macrostates for each MSM. Most importantly - as all analyses of water entropy and conformational entropy are restricted to the final macrostates - the criteria used to select the final number of macrostates with PCCA+ are extremely important to the conclusions of the study. From examining the ITS plots in Figure 3 - it seems both MSMs show the same number of resolved processes (at least 11) - suggesting that a 10-state model could be appropriate for both systems. If one were to simply select a large number of macrostates for the 20x longer holo simulation - do these states converge to the same conformational entropy as the states seen in the short apo simulations? Is there some MSM quality metric used to determine what number of macrostates is more appropriate?

      Required Controls For Supporting the Conclusions of the Study: The authors should specify the criteria used to determine the appropriate number of microstates and macrostates for their MSMs and present controls that demonstrate that the conformational entropies calculated for their final states are not simply a function of the ratio of the number macrostates chosen to represent very disparate amounts of conformational sampling.

      The VAMP-2 score was used to determine the number of microstates. We calculated the VAMP-2 score while varying the number of microstates from 10 to 220. We find that the VAMP-2 score saturates at a higher number of microstates for both apo and holo simulations.

      The number of macrostates was determined by the gap between the lines of the implied timescales plot, followed by a CK test (shown in figure S1). Since we plotted the 10 slowest timescales, the implied timescales plot shows 10 timescales, and this is not an indicator of the number of macrostates. The macrostates are separated by distinct gaps in the timescales and do not merge as seen beyond 5 timescales in the plot. The timescales, when leveled off and distinct, indicate that the system has well-defined metastable states and the MSM is accurate in identifying the macrostates. We find this number to be three and six for the apo and holo simulations, respectively, from the corresponding implied timescales.
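
      For readers less familiar with the procedure, the relationship between a transition matrix and its implied timescales, t_i = -τ / ln|λ_i|, and the idea of reading off the number of macrostates from the largest gap, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the pipeline used in the manuscript: the function names are hypothetical, the count matrix is symmetrized in a crude way rather than estimated with a proper reversible estimator, every state is assumed to be visited, and the automatic gap criterion is a heuristic stand-in for the visual inspection plus CK test described above.

```python
import numpy as np

def transition_matrix(dtraj, n_states, lag):
    """Row-stochastic transition matrix from one discrete trajectory.

    Assumptions: every state is visited at least once, and counts are
    symmetrized as a crude way to enforce reversibility.
    """
    dtraj = np.asarray(dtraj)
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    counts = 0.5 * (counts + counts.T)
    return counts / counts.sum(axis=1, keepdims=True)

def implied_timescales(T, lag):
    """Implied timescales t_i = -lag / ln|lambda_i|, in units of frames.

    The stationary eigenvalue (largest magnitude, equal to 1) is skipped.
    """
    eigvals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(eigvals[1:])

def n_macrostates_from_gap(timescales):
    """Heuristic: k slow processes before the largest ratio gap
    correspond to k + 1 macrostates.

    timescales must be sorted from slowest to fastest.
    """
    timescales = np.asarray(timescales, dtype=float)
    ratios = timescales[:-1] / timescales[1:]
    return int(np.argmax(ratios)) + 2
```

      In practice the timescales are computed at several lag times and plotted; the model is built at a lag time where they have leveled off, and the chosen number of macrostates is then checked with a CK test.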

      The above is discussed in Computational Methods, Page 37-38.

      Major Weakness 3: The use of variational autoencoders (VAEs) obscures insights into the underlying conformational ensembles of apo and holo alpha-synuclein rather than providing new ones

      No rationale is offered for the selection of the VAE architecture or hyperparameters used to reduce the dimensionality of alpha-synuclein conformational space.

      It is not clear the VAEs employed in this study are providing any new insight into the conformational ensembles and binding mechanisms of Fasudil to alpha-synuclein, or if the underlying latent space of the VAEs are more informative or kinetically meaningful than standard linear dimensionality reduction techniques like PCA and tICA. The initial VAE is used to reduce the dimensionality of alpha-synuclein conformational ensembles to 2 degrees of freedom - but it is unclear if this projection is structurally or kinetically meaningful. It is not clear why the authors chose to use a 2-dimensional projection instead of a higher number of dimensions to build their MSMs. Can they produce a more kinetically and structurally meaningful model using a higher dimensional VAE latent space?

      Additionally - it is not clear what insights are provided by the Denoising Convolutional Variational Autoencoder. The authors appear to be noising-and-denoising the contact maps of each macrostate, and then projecting the denoised values onto a new latent space - and commenting that they are different. Does this provide additional insight that looking at the contact maps in Figures 4&5 does not? Is this more informative than examining the distribution of the Radii of gyration or the secondary structure propensities of each ensemble? It is not clear what insight this analysis adds to the manuscript.

      Suggested controls to improve the study: The authors should project interpretable IDP structural descriptors (i.e., secondary structure, radius of gyration, secondary structure content, # of intramolecular contacts, # of intermolecular contacts between alpha-synuclein and Fasudil) onto this latent space to illustrate if any of these properties are meaningfully separated by the VAE projection. The authors should compare these projections, and MSMs built from these projections, to projections and MSMs built from projections using standard linear dimensionality projection techniques like PCA and tICA.

      We have already pointed out the IDP structural parameters for the first question.

      In the case of the VAE, the latent space captures the underlying pattern of the higher dimensional data. A non-linear projection using a VAE has been shown to have a higher VAMP-2 score than linear dimension reduction methods such as tICA. The latent space of the VAE was then used to build the MSM, in order to get the macrostates and also the transition timescales among them. We can project the data onto a higher dimension, but the goal is to reduce it to lower dimensions where it will be easier to interpret. A higher number of dimensions would also risk overfitting: instead of learning the pattern, the model may simply memorize the data. The training and validation loss curves from the VAE reached the order of 10^-4, indicating good reconstruction of the original data.

      As for dimension reduction using tICA, the VAMP-2 score confirms that our VAE model performs better than tICA. This manuscript uses deep neural networks to understand the structural and kinetic process of IDP and small molecule interaction. Dimension reduction using tICA would give different reaction coordinates and MSM built using the projected data of tICA will not be one-to one comparable with that obtained from VAE.

      We had to perform noising, as we had only 9 contact maps. This led to overfitting of the CVAE model. To overcome this problem, we have introduced white noise to our data, so as to prevent the model from overfitting. The objective of the DCVAE model was to see how distinct these contact maps are based on their locations on a lower dimensional space. A visual inspection of the ensemble averaged contact map, especially for IDPs is much more difficult as compared to folded proteins. So, even before computing the Rg, Ree, SASA or secondary structure, this lower dimensional space will give us a preliminary idea of how each macrostate is different from every other.
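
      The augmentation idea described above (training on noisy copies of a small set of contact maps, with the clean maps as reconstruction targets) can be sketched as follows. This is an illustration under stated assumptions only: the function name, the number of copies, the Gaussian noise level `sigma` and the clipping to [0, 1] are hypothetical choices for this sketch and are not taken from the manuscript.

```python
import numpy as np

def noisy_copies(contact_maps, n_copies=100, sigma=0.05, seed=0):
    """Create noisy training inputs and matching clean targets.

    contact_maps: array of shape (n_maps, n_res, n_res) with values in [0, 1].
    n_copies, sigma, seed: hypothetical settings chosen for illustration.
    Returns (noisy, clean), each of shape (n_maps * n_copies, n_res, n_res).
    """
    rng = np.random.default_rng(seed)
    maps = np.asarray(contact_maps, dtype=float)
    clean = np.repeat(maps, n_copies, axis=0)
    noise = rng.normal(0.0, sigma, size=clean.shape)
    # Keep perturbed values inside the valid contact-probability range.
    noisy = np.clip(clean + noise, 0.0, 1.0)
    return noisy, clean
```

      A denoising autoencoder would then be trained to map `noisy` to `clean`, so that the nine original maps are no longer simply memorized.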

      As for the distribution of Rg, we have plotted it in Author response image 7. The residue-wise percentage secondary structure is plotted in figure S4-S6  for the holo and apo simulation respectively.

      Author response image 7.

      Distribution of radius of gyration for the three and six macrostates in the apo and holo simulation respectively.

      As for training a model with a higher number of latent dimensions, we have retrained a VAE model with four dimensions in the latent space. The loss was of the order of 10^-4. We built an MSM with the appropriate number of microstates and found the presence of six macrostates, as evident from the ITS plot shown in Figures S14 and S15.

      This data is presented in Results and Discussion, Page 13

      Major Weakness 4: The MSMs produced in this study have large discrepancies with MSMs previously produced on the same dataset by the same authors that are not discussed.

      Previously - two of the authors of this manuscript (Menon and Mondal) authored a preprint titled "Small molecule modulates α-synuclein conformation and its oligomerization via Entropy Expansion" (https://www.biorxiv.org/content/10.1101/2022.10.20.513005v1.full) that analyzed the same 1500us holo simulation of alpha-synuclein binding Fasudil. In this study - they utilized the variational approach to Markov processes (VAMP) to build an MSM using a 1D order parameter as input (the radius of gyration), first discretizing the conformational space into 300 microstates before similarly building a 6 macrostate model. From examining the contact maps and secondary structure propensities of the holo MSMs from the current study and the previous study - some of the macrostates appear similar, however there appear to be orders of magnitude differences in the timescales of conformational transitions between the two models. The timescales of conformational transitions in the previous MSM are on the order of 10s of microseconds, while the timescales of transitions in this manuscript are 100s-1000s microseconds. In the previous manuscript, a 3 state MSM is built from an apo α-synuclein obtained from a continuous 73ms unbiased MD simulation of alpha-synuclein run at a different salt concentration (100mM) and an additional 33 ms of shorter simulations. The apo MSM from the previous study similarly reports very fast timescales of transitions between apo states (on the order ~1ms) - while the timescales in the MSM reported in the current study (Figure 9) are on the order of 10s-100s of microseconds.

      These discrepancies raise further concerns that the properties of the MSMs built on these systems are extremely sensitive to the chosen projection methods and MSM modeling choices and hyperparameters, and that neither model may be an accurate description of the true underlying dynamics

      Suggestions to improve the study: The authors should discuss the discrepancies with the MSMs reported in their previous studies.

      In the previous preprint, the radius of gyration was used as the collective variable to build the MSM. In this manuscript, we have used a much more general collective variable: pairwise distances reduced using a VAE. Firstly, the collective variables used to build the model in the two works are therefore different. Secondly, for the 73 μs apo simulation in the previous manuscript, the salt concentration used was 100 mM, but in this work, we have used a salt concentration of 50 mM, the same as the salt concentration used in the holo simulations. Since the two simulation conditions differ with respect to salt concentration, the conformational space sampled in these conditions will be different, and this will be reflected in the nature/features of the metastable states and the associated transition kinetics. Thirdly, the lag time at which the MSM was built was 3.6 ns in the previous manuscript, whereas in this work we have used 32 ns. The lag times alone differ by roughly a factor of 10, so the order of the timescales has also changed. Thus, both the collective variable and the lag time at which the system reaches Markovianity differ between the two studies, and hence the timescales of transition among the macrostates are also different. Because of these differences, it would not be correct to directly compare the results obtained from the two investigations.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      To highlight the role of the entropic expansion mechanism, I would suggest modifying the title to capture this result, for example: "An Integrated Machine Learning Approach Delineates an Entropic Expansion Mechanism for the Binding of a Small Molecule to α-Synuclein".

      We have changed the title as suggested by the reviewer.

      To my knowledge the binding of fasudil to alpha-synuclein has been shown in the simulations by Robustelli et al (JACS 2022), but the experimental evidence is less clear cut. If an experimental binding affinity and the effect on alpha-synuclein aggregation have been measured, they should be reported.

      Reviewer #2 (Recommendations For The Authors):

      We thank the reviewer for the careful evaluation of our manuscript and providing comments and questions that we have attempted to address and incorporate. 

      Minor

      Abstract:

      In "which is able to statistically distinguish fuzzy ensemble", what does the word "statistically" mean in this context? Do the authors present evidence that the two ensembles are statistically different, and if so in what ways?

      We have analyzed the apo and holo ensembles of aSyn using the framework of Markov State Models, which provides the stationary populations of the states that the model identifies. For this reason, we have used ‘which is able to statistically distinguish fuzzy ensemble’ as we compare and contrast the metastable states that we resolve using MSM. The MSM provides metastable states which are identified through statistical analysis of the transitions between states (transition probability matrix). We characterize their structural features to distinguish them which gives a meaningful interpretation of the fuzzy ensemble.

      Abstract:

      What does "entropic ordering" mean?

      We thank the reviewer for pointing this out. Here, we mean that the presence of the small molecule only affects the protein backbone entropy while the entropy of water is not affected in the simulations with fasudil. We will rewrite this more clearly in the abstract. 

      The changed sentence is as follows: 

      “A thermodynamic analysis indicates that small-molecule modulates the structural repertoire of αS by tuning protein backbone entropy, however the entropy of the water remains unperturbed.”

      Abstract:

      What does "offering insights into entropic modulation" mean?

      In this investigation, we first discretized the ensemble of a small-molecule binding/interacting with a disordered aSyn into the underlying metastable states, followed by characterisation of these identified states. As small molecule interactions can affect the overall entropy of the IDP, we estimated the said effect of fasudil binding on aSyn. We find that small molecule binding effect is manifested in the protein backbone entropy and the solvent entropy is not affected. Through this work, we highlight these insights into the modulatory effect that fasudil brings about in the entropy of the system (entropic modulation).

      p. 3/4:

      When the authors write "However, a routine comparison of monomeric αS ensemble... ensemble" it is unclear whether they are referring to previous work (they only cite a paper with simulations of "apo" aSYN), and if so which. Do they mean Ref 32? Also, the word "routine" sounds odd in this context.

      We thank the reviewer for pointing this out. We compared the ensemble properties (such as the distributions of the radius of gyration, end-to-end distance, solvent accessible surface area, secondary structure properties) of the α-synuclein monomer ensemble that we generated in neat water and the ensemble of α-synuclein in the presence of the small molecule fasudil that is reported in Robustelli et al. (Journal of the American Chemical Society, 144(6), pp.2501-2510). We have now modified this sentence in the main manuscript as follows: (Page no 3)

      “However, comparison of the global and local structural features of the αS ensemble in neat water and that in the presence of fasudil [32] (see Figure S1-S6) did not indicate a significant difference that is a customary signature of the dynamic IDP ensemble.”

      p. 4:

      Regarding "Integrative approaches are therefore gaining importance in IDP studies", these kinds of integrative approaches have been used for 20 years for studies of IDPs (with increasing sophistication and success), so I think "gaining" is somewhat of a stretch.

      We thank the reviewer for this comment. We agree with the reviewer and have now changed this sentence  as follows:

      “Integrative approaches have been exploited in studying IDPs as well as small-molecule binding to IDPs.”

      p. 5:

      What does "large scale" mean in "This study showed no large-scale differences between the bound and unbound states of αS"? Do the authors mean substantially/significantly different, or differences on a large (length) scale?

      Here, we refer to the study of small molecule (fasudil) binding to α-synuclein reported in Robustelli et al. (Journal of the American Chemical Society, 144(6), pp.2501-2510). In this study, the authors report no substantial ("large scale") differences, such as in the backbone conformation distributions, between the conformational ensembles of α-synuclein in the fasudil-bound and unbound states.

      p. 6:

      The authors write "In a clear departure from the classical view of ligand binding to a folded globular protein, the visual change in αS ensemble due to the presence of small molecule is not so strikingly apparent." I don't understand this. Normally, there is very little difference between apo and holo protein structures for folded proteins, so I don't understand the "in a clear departure" part. This seems like a strawman. Of course, for folded proteins one can generally see the ligand bound, but here the authors are talking about the protein.

      In case of folded proteins, the overall tertiary structure of the protein remains mostly the same upon binding of the ligand. Structural changes are localized in nature and primarily around the binding site. However, in case of ⍺Syn, binding of fasudil is transient and not as strong as seen for folded proteins. “Clear departure” refers to the fact that for ⍺Syn, binding of fasudil is more subtle and dispersed across the ensemble of conformations rather than localized changes as in case of folded proteins.

      p. 6:

      I don't think the term "data-agnostic" makes sense since these methods are based on data and also make some assumptions about how the data can/should be used.

      We have replaced this term with “model-agnostic”.

      p. 16:

      How are contacts defined; please add to caption.

      A contact is considered if the Cα atoms of two residues are within a distance of 8 Å of each other. We have updated the caption with this information in Figures 4 and 5.  
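
      To make the contact definition concrete, an ensemble-averaged contact map under this criterion (Cα-Cα distance within 8 Å) can be computed along the following lines. This is a minimal NumPy sketch, not the code used to produce Figures 4 and 5: the function name is hypothetical, the coordinates are assumed to be Cα positions in Å with shape (n_frames, n_residues, 3), and all pairwise distances are held in memory at once, which would need chunking over frames for a long trajectory.

```python
import numpy as np

def contact_map(ca_coords, cutoff=8.0):
    """Ensemble-averaged contact probability between residue pairs.

    ca_coords: C-alpha coordinates in Angstrom, shape (n_frames, n_residues, 3).
    cutoff: contact distance in Angstrom (8 A, as stated in the response).
    Returns an (n_residues, n_residues) matrix of contact frequencies.
    """
    diff = ca_coords[:, :, None, :] - ca_coords[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return (dist <= cutoff).mean(axis=0)
```

      Each entry is the fraction of frames in which the two residues are in contact, so restricting `ca_coords` to the frames assigned to one macrostate gives that macrostate's contact map.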

      p. 20:

      What do the authors mean by "non-specific interactions" in this context?

      The interactions of fasudil are predominantly with the negatively charged residues in the C-terminal region of ⍺Syn via charge-charge and π-stacking interactions (Robustelli et.al. (Journal of the American Chemical Society, 144(6), pp.2501-2510)).

      In addition, in some metastable states that we identify, we also observe transient interactions with residues in the hydrophobic NAC region and N-terminal region. We refer to these transient interactions as “non-specific” interactions.

      p. 27:

      Are the axes of Fig. 9c/d z1 and z2?

      Yes. The axes are z1 and z2

      Smaller than minor

      Abstract:

      Rephrase "In particular, the presence of fasudil in milieu"

      We have rephrased the sentence as follows: 

      “In particular, the presence of fasudil in the solvent…”

      p. 4:

      What does the word "potentially" do in "ensemble of conformations potentially sampled"?

      Here, by potentially, we mean the various conformations that the protein can adopt, subject to the environmental conditions. 

      p. 10:

      "we trained a large array of inter-residue pairwise distances"

      The distances were not trained; please reformulate

      We have corrected this sentence as follows:  

      “We trained a VAE model using a large array of inter-residue pairwise distances.”

      p. 13:

      N/C-terminal -> terminus (or in the C-terminal region)

      We have made the changes in the manuscript at the required places. 

      p. 20:

      Precedent -> previous (?)

      We have made the change in the manuscript. 

      p. 30:

      As far as I understand, Anton does not use GPUs and does not run Desmond.

      We thank the reviewer for providing this information. We referred to the original paper of the ⍺syn-fasudil simulations (Robustelli et.al. (Journal of the American Chemical Society, 144(6), pp.2501-2510)). The authors have performed equilibration with GPU/Desmond and used Anton for production runs. We have modified this sentence as:

      “A 1500 μs long all-atom MD simulation trajectory of αS monomer in aqueous fasudil solution was simulated by D. E. Shaw Research with the Anton supercomputer that is specially purposed for running long-time-scale simulations.” on page 31

      References:

      (1) Schütte C, Fischer A, Huisinga W, Deuflhard P (1999) A direct approach to conformational dynamics based on hybrid Monte Carlo. J Comput Phys 151:146–168.

      (2) Chodera JD, Swope WC, Pitera JW, Dill KA (2006) Long-time protein folding dynamics from short-time molecular dynamics simulations. Multiscale Model Simul 5:1214–1226.

    1. eLife Assessment

      This important manuscript demonstrates that UGGT1 is involved in preventing the premature degradation of endoplasmic reticulum (ER) glycoproteins through the re-glucosylation of their N-linked glycans following release from the calnexin/calreticulin lectins. The authors include a wealth of convincing data in support of their findings, although extending these findings to other types of substrates, such as secreted proteins, could further demonstrate the global importance of this mechanism for protein trafficking through the secretory pathway. This work will be of interest to scientists interested in ER protein quality control, proteostasis, and protein trafficking.

    2. Reviewer #1 (Public review):

      Summary:

      Using UGGT1-KO cells and a number of different ERAD substrates, the authors show that UGGTs are involved in preventing the premature degradation of misfolded glycoproteins. They propose a concept by which the fate of glycoproteins can be determined by a tug-of-war between UGGTs and EDEMs.

      Strengths:

      The authors provided a wealth of data to indicate that UGGT1 competes with EDEMs, which promotes the glycoprotein degradation.

    3. Reviewer #2 (Public review):

      In this study, Ninagawa et al. shed light on UGGT's role in ER quality control of glycoproteins. By utilizing UGGT1/UGGT2 DKO cells, they demonstrate that several model misfolded glycoproteins undergo early degradation. One such substrate is ATF6alpha, whose premature degradation hampers the cell's ability to mount an ER stress response.

      This study convincingly demonstrates that many unstable misfolded glycoproteins undergo accelerated degradation without UGGTs. Also, this study provides evidence of a "tug of war" model involving UGGTs (pulling glycoproteins to being refolded) and EDEMs (pulling glycoproteins to ERAD).

      The study explores the physiological role of UGGT, particularly examining the impact of ATF6α in UGGT knockout cells' stress response. The authors further investigate the physiological consequences of accelerated ATF6α degradation, convincingly demonstrating that cells are sensitive to ER stress in the absence of UGGTs and unable to mount an adequate ER stress response.

      These findings offer significant new insights into the ERAD field, highlighting UGGT1 as a crucial component in maintaining ER protein homeostasis. This represents a major advancement in our understanding of the field.

    4. Reviewer #3 (Public review):

      This valuable manuscript demonstrates the long-held prediction that the glycosyltransferase UGGT slows degradation of endoplasmic reticulum (ER)-associated degradation substrates through a mechanism involving re-glucosylation of asparagine-linked glycans following release from the calnexin/calreticulin lectins. The evidence supporting this conclusion is solid, using genetically-deficient cell models and well established biochemical methods to monitor the degradation of trafficking-incompetent ER-associated degradation substrates, although this could be improved by better defining the importance of UGGT in the secretion of trafficking-competent substrates. This work will be of specific interest to those interested in mechanistic aspects of ER protein quality control and protein secretion.

      The authors have largely addressed my comments from the previous round of review. The only remaining comment is about defining the impact of UGGT1 in the regulation of secretion-competent proteins, which the authors indicate they will continue to pursue in subsequent work, which is fine, but remains a minor limitation of the study.

      As I mentioned in my previous review, I think that this work is interesting and addresses an important gap in experimental evidence supporting a previously asserted dogma in the field. I do think that the authors would be better suited for highlighting the limitations of the study, as discussed above. Ultimately, though, this is an important addition to the literature.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      UGGTs are involved in the prevention of premature degradation for misfolded glycoproteins, by utilizing UGGT1-KO cells and a number of different ERAD substrates. They proposed a concept by which the fate of glycoproteins can be determined by a tug-of-war between UGGTs and EDEMs. 

      Strengths: 

      The authors provided a wealth of data to indicate that UGGT1 competes with EDEMs, which promotes the glycoprotein degradation. 

      Weaknesses: 

      NA 

      We appreciate your comment.

      Reviewer #2 (Public review): 

      In this study, Ninagawa et al. shed light on UGGT's role in ER quality control of glycoproteins. By utilizing UGGT1/UGGT2 DKO, they demonstrate that several model misfolded glycoproteins undergo early degradation. One such substrate is ATF6alpha where its premature degradation hampers the cell's ability to mount an ER stress response.

      This study convincingly demonstrates that many unstable misfolded glycoproteins undergo accelerated degradation without UGGTs. Also, this study provides evidence of a "tug of war" model involving UGGTs (pulling glycoproteins to being refolded) and EDEMs (pulling glycoproteins to ERAD). 

      The study explores the physiological role of UGGT, particularly examining the impact of ATF6α in UGGT knockout cells' stress response. The authors further investigate the physiological consequences of accelerated ATF6α degradation, convincingly demonstrating that cells are sensitive to ER stress in the absence of UGGTs and unable to mount an adequate ER stress response. 

      These findings offer significant new insights into the ERAD field, highlighting UGGT1 as a crucial component in maintaining ER protein homeostasis. This represents a major advancement in our understanding of the field. 

      Thank you very much for your comment.

      Reviewer #3 (Public review): 

      This valuable manuscript demonstrates the long-held prediction that the glycosyltransferase UGGT slows degradation of endoplasmic reticulum (ER)-associated degradation substrates through a mechanism involving re-glucosylation of asparagine-linked glycans following release from the calnexin/calreticulin lectins. The evidence supporting this conclusion is solid using genetically-deficient cell models and well established biochemical methods to monitor the degradation of trafficking-incompetent ER-associated degradation substrates, although this could be improved by better defining of the importance of UGGT in the secretion of trafficking competent substrates. This work will be of specific interest to those interested in mechanistic aspects of ER protein quality control and protein secretion.

      The authors have attempted to address my comments from the previous round of review, although some issues still remain. For example, the authors indicate that it is difficult to assess how UGGT1 influences degradation of secretion competent proteins, but this is not the case. This can be easily followed using metabolic labeling experiments, where you would get both the population of protein secreted and degraded under different conditions. Thus, I still feel that addressing the impact of UGGT1 depletion on the ER quality control for secretion competent protein remains an important point that could be better addressed in this work. 

      We mainly focused on the impact of UGGT1 depletion on ERAD in this paper and intend to determine the impact of UGGT1 depletion on the ER quality control for secretion competent protein in the near future.

      Further, in the previous submission, the authors showed that UGGT2 depletion demonstrates a similar reduction of ATF6 activation to that observed for UGGT1 depletion, although UGGT2 depletion does not reduce ATF6 protein levels like what is observed upon UGGT1 depletion. In the revised manuscript, they largely remove the UGGT2 data and only highlight the UGGT1 depletion data. While they are somewhat careful in their discussion, the implication is that UGGT1 regulates ATF6 activity by controlling its stability. The fact that UGGT2 has a similar effect on activity, but not stability, indicates that these enzymes may have other roles not directly linked to ATF6 stability. It is important to include the UGGT2 data and explicitly highlight this point in the discussion. It's fine to state that figuring out this other function is outside the scope of this work, but removing it does not seem appropriate.

      We have added the data of UGGT2-KO and UGGT-DKO cells to Figure 4 and discussed appropriately.

      As I mentioned in my previous review, I think that this work is interesting and addresses an important gap in experimental evidence supporting a previously asserted dogma in the field. I do think that the authors would be better suited for highlighting the limitations of the study, as discussed above. Ultimately, though, this is an important addition to the literature. 

      We appreciate your comments. Thank you very much.

      Recommendations for the authors: 

      Reviewer #1 (Recommendations for the authors): 

      I have carefully gone through the revised manuscript and responses to the reviewers' comments; I believe that the authors did a great job on revisions, and I do think that now this manuscript has been much improved (far easier to read through). Now I have only minor comments as follows; 

      Page 9: Lines 8-9; Comparison between WT and EDEM-TKO cells indicates that ATF6alpha is still degraded via gpERAD requiring mannose trimming even in the presence of DNJ (Fig. 1D). (it would be better to indicate which figure to look) 

      We have fixed it.

      Page 10: Lines 9-11; as multiple higher molecular weight bands (representing a mixture of G3M9, G2M9, and GM9 etc.) in WT cells treated with CST -> I am NOT AT ALL convinced with this statement (on Figure 1-figure supplement 6A). How can the subtle glycan structure difference cause the ladder of the band? And if it is indeed the case (which I frankly doubt by the way), will endo-alpha-mannosidase treatment end up with a single band for CST? And PNGase F digestion can cancel all size difference between samples (control, +DNJ and +CST)?

      CD3d-DTM-HA is a small protein (~20 kDa) possessing three N-glycans. A clear increase in the level of GM9 in WT cells treated with DNJ (Figure 1-Figure supplement 5A) caused an upward band shift (Figure 1-Figure supplement 6A). Similarly, clear increases in the levels of GM9, G2M9, and G3M9 in WT cells treated with CST (Figure 1-Figure supplement 6B) produced the ladder of the band (Figure 1-Figure supplement 6A).

      Crystal violet assay (new Fig 4G; Page 33); It said that, after treating cells with drug (Tg) for 4 hours, cells were spread on 24 well plates and cultured without Tg for 5 days. If incubated that long, I wonder that any compromised viability may have been canceled by growing cells (cells become confluent no matter what?). Am I missing something? Please clarify. 

      We employed a previously published method to determine ER stress sensitivity (Yamamoto et al., Dev. Cell, 2007). Although any compromised viability may have been canceled by growing cells, as suggested, we were able to detect the difference between WT and UGGT-KO cells.

      Figure 5D; why one of the three N-glycans is missing on the last protein?? 

      We have fixed it.

    1. eLife Assessment

      This work is an important contribution to understanding the role of FGF signaling in the induction of primitive-like cells in a 2D system of human gastrulation. The authors provide compelling evidence showing that endogenous FGF ligands, acting through FGF receptors localized basolaterally, are determinant in the acquisition of a primitive streak cell fate. These observations will be of broad relevance to the FGF field.

    2. Reviewer #1 (Public review):

      Summary:

      This is an interesting study on the role of FGF signaling in the induction of primitive streak-like cells (PS-LC) in human 2D-gastruloids. The authors use a previously characterized standard culture that generates a ring of PS-LCs (TBXT+) and correlate this with pERK staining. A requirement for FGF signaling in TBXT induction is demonstrated via pharmacological inhibition of MEK and FGFR activity. A second set of culture conditions (with no exogenous FGFs) suggests that endogenous FGFs are required for pERK and TBXT induction. The authors then characterize, via scRNA-seq, various components of the FGF pathway (genes for ligands, receptors, ERK regulators, and HSPG regulation). They go on to characterize the pFGFR1, receptor isoforms, and polarized localization of this receptor. Finally, they perform FGF4 inhibition and use a cell line with a limited FGF17 inactivation (heterozygous null) and show that loss of these FGFs reduces PS-LC and derivative cell types.

      Strengths:

      (1) As the authors point out, the role of FGF signaling in gastrulation is less well understood than other signaling pathways. Hence this is a valuable contribution to that field.

      (2) The FGF4 and FGF17 loss-of-function experiments in Figure 5 are very intriguing. This is especially so given the intriguing observation that these FGFs appear to be dominating in this model of human gastrulation, in contrast to what FGFs dominate in mice, chicks, and frogs.

      (3) In general this paper is valuable as a further development of the human gastruloid system and the role of FGF signaling in the induction of PS-LCs. The wide net that the authors cast in characterizing the FGF ligand gene, receptor isoforms, and downstream components provides a foundation for future work. As the authors write near the beginning of the Discussion "Many questions remain."

      Weaknesses:

      (1) FGFs are cell survival factors in various aspects of development. The authors fail to address cell death due to loss of FGF signaling in their experiments. For example, in Figure 1E (which requires statistical analysis) and 1G (the bottom FGFRi row), there appears to be a significant amount of cell loss. Is this due to cell death? The authors should address the question of whether the role of FGF/ERK signaling is to keep the cells alive.

      (2) Regarding the sparse cells in 1G, is there a reduction in cell number only with FGFRi and not MEKi? Is this reproducible? Gattiglio et al (Development, 2023, PMID: 37530863) present data supporting a "community effect" in the FGF-induced mesoderm differentiation of mouse embryonic stem cells. Could a community effect be at play in this human system (especially given the images in the bottom row of 1G)? If the authors don't address this experimentally they should at least address the ideas in Gattiglio et al.

      (3) Do the FGF4 and FGF17 LOF experiments in Figure 5 affect cell numbers like FGFRi in Figure 1? Why examine PS-LC induction only in FGF17 heterozygous cells and not homozygous FGF17 nulls?

      (4) The idea that FGF8 plays a dominant role during gastrulation of other species but not humans is so intriguing it warrants deeper testing. The authors dismiss FGF8 because its mRNA "...levels always remained low." (line 363) as well as the data published in Zhai et al (PMID: 36517595) and Tyser et al (PMID: 34789876). But there are cases in mouse development where a gene was expressed at levels so low, that it might be dismissed, and yet LOF experiments revealed it played a role or even was required in a developmental process. The authors should consider FGF8 inhibition or inactivation to explore its potential role, despite its low levels of expression.

      (5) Redundancy is a common feature in FGF genetics. What is the effect of inhibiting FGF4 in FGF17 LOF cells?

      (6) I suggest that the authors take more caution in describing FGF gradients. For example, in one Results heading they write "Endogenous FGF4 and FGF17 gradients underly the ERK activity pattern.", implying an FGF protein gradient. However, they only present data for FGF mRNA, not protein. This issue would be clarified if they used proper nomenclature for gene, mRNA (italics), and protein (no italics) throughout the paper.

    3. Reviewer #2 (Public review):

      Summary:

      The role of FGFs in embryonic development and stem cell differentiation has remained unclear due to its complexity. In this study, the authors utilized a 2D human stem cell-based gastrulation model to investigate the functions of FGFs. They discovered that FGF-dependent ERK activity is closely linked to the emergence of primitive streak cells. Importantly, this 2D model effectively illustrates the spatial distribution of key signaling effectors and receptors by correlating these markers with cell fate markers, such as T and ISL1. Through inhibition and loss-of-function studies, they further corroborated the needs of FGF ligands. Their data shows that FGFR1 is the primary receptor, and FGF2/4/17 are the key ligands for primitive streak development, which aligns with observations in primate embryos. Additional experiments revealed that the reduction of FGF4 and FGF17 decreases ERK activity.

      Strengths:

      This study provides comprehensive data and improves our understanding of the role of FGF signaling in primate primitive streak formation. The authors provide new insights related to the spatial localization of the key components of FGF signaling and attempt to reveal the temporal dynamics of the signal propagation and cell fate decision, which has been challenging.

      Weaknesses:

      Given the solid data, the work only partially clarifies the complex picture of FGF signaling, so details remain somewhat elusive. The findings lack a strong punchline, which may limit their broader impact.

    4. Reviewer #3 (Public review):

      Jo and colleagues set out to investigate the origins and functions of localized FGF/ERK signaling for the differentiation and spatial patterning of primitive streak fates of human embryonic stem cells in a well-established micropattern system. They demonstrate that endogenous FGF signaling is required for ERK activation in a ring-domain in the micropatterns, and that this localized signaling is directly required for differentiation and spatial patterning of specific cell types. Through high-resolution microscopy and transwell assays, they show that cells receive FGF signals through basally localized receptors. Finally, the authors find that there is a requirement for exogenous FGF2 to initiate primitive streak-like differentiation, but endogenous FGFs, especially FGF4 and FGF17, fully take over at later stages.

      Even though some of the authors' findings - such as the localized expression of FGF ligands during gastrulation and the importance of FGF/ERK signaling for cell differentiation in the primitive streak - have been reported in model organisms before, this is one of the first studies to investigate the role of FGF signaling during primitive streak-like differentiation of human cells. In doing so, the paper reports a number of interesting and valuable observations, namely the basal localization of FGF receptors which mirrors that of BMP and Nodal receptors, as well as the existence of a positive feedback loop centered on FGF signaling that drives primitive-streak differentiation. The authors also perform a comparison of the role of different FGFs across species and try to assign specific functions to individual FGFs. In the absence of clean genetic loss-of-function cell lines, this part of the work remains less strong.

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      This is an interesting study on the role of FGF signaling in the induction of primitive streak-like cells (PS-LC) in human 2D-gastruloids. The authors use a previously characterized standard culture that generates a ring of PS-LCs (TBXT+) and correlate this with pERK staining. A requirement for FGF signaling in TBXT induction is demonstrated via pharmacological inhibition of MEK and FGFR activity. A second set of culture conditions (with no exogenous FGFs) suggests that endogenous FGFs are required for pERK and TBXT induction. The authors then characterize, via scRNA-seq, various components of the FGF pathway (genes for ligands, receptors, ERK regulators, and HSPG regulation). They go on to characterize the pFGFR1, receptor isoforms, and polarized localization of this receptor. Finally, they perform FGF4 inhibition and use a cell line with a limited FGF17 inactivation (heterozygous null) and show that loss of these FGFs reduces PS-LC and derivative cell types.

      Strengths:

      (1) As the authors point out, the role of FGF signaling in gastrulation is less well understood than other signaling pathways. Hence this is a valuable contribution to that field.

      (2) The FGF4 and FGF17 loss-of-function experiments in Figure 5 are very intriguing. This is especially so given the intriguing observation that these FGFs appear to be dominating in this model of human gastrulation, in contrast to what FGFs dominate in mice, chicks, and frogs.

      (3) In general this paper is valuable as a further development of the human gastruloid system and the role of FGF signaling in the induction of PS-LCs. The wide net that the authors cast in characterizing the FGF ligand gene, receptor isoforms, and downstream components provides a foundation for future work. As the authors write near the beginning of the Discussion "Many questions remain."

      We thank the reviewer for these positive comments.

      Weaknesses:

      (1) FGFs are cell survival factors in various aspects of development. The authors fail to address cell death due to loss of FGF signaling in their experiments. For example, in Figure 1E (which requires statistical analysis) and 1G (the bottom FGFRi row), there appears to be a significant amount of cell loss. Is this due to cell death? The authors should address the question of whether the role of FGF/ERK signaling is to keep the cells alive.

      Indeed, FGF also strongly affects cell number, and it is an interesting question to what extent this depends on ERK. Our manuscript focuses instead on the role of FGF/ERK signaling in cell fate patterning. However, as mentioned in our discussion, Figure 1d,e show that doxycycline-induced pERK leads to more TBXT+ cells than the control without restoring cell number, suggesting the role of FGF in controlling cell number is independent of the requirement for FGF/ERK in PS-LC differentiation. Unpublished data below showing a MEK inhibitor dose response further support this: low doses of MEKi are sufficient to inhibit differentiation without affecting cell number. To address the reviewer’s question we will include these data in the revised manuscript and perform several additional experiments to determine in more detail how cell death and proliferation depend on FGF.

      Author response image 1.

      MEK affects differentiation and cell number at different doses. a-c) control and MEKi (0.3 µM) treated colonies with similar cell number but different TBXT expression. d-f) quantification of cell number per colony (d), percentage of TBXT-positive cells per colony (e), and the distribution of pERK intensities for different doses of MEK inhibitor (f). N>6 colonies per condition. MEKi = PD0325901. Scale bar = 50 µm.

      (2) Regarding the sparse cells in 1G, is there a reduction in cell number only with FGFRi and not MEKi? Is this reproducible? Gattiglio et al (Development, 2023, PMID: 37530863) present data supporting a "community effect" in the FGF-induced mesoderm differentiation of mouse embryonic stem cells. Could a community effect be at play in this human system (especially given the images in the bottom row of 1G)? If the authors don't address this experimentally they should at least address the ideas in Gattiglio et al.

      Indeed, FGFRi reproducibly affects cell number more than MEKi, in line with the fact that pathways downstream of FGF other than MAPK/ERK (e.g. PI3K) play important roles in cell survival and growth. We think the lack of differentiation in MEKi and FGFRi in Fig.1g cannot be attributed to a loss of cells combined with a community effect. This is because, without FGFRi or MEKi, cells also differentiate to primitive streak at much lower densities than those shown, consistent with the data we show above in response to (1), which argue against a primarily indirect effect of FGF on PS-LC differentiation through cell density. In the context of directed differentiation (rather than 2D gastruloids), we will show this in a controlled manner by repeating the experiment in Fig.1g while adjusting cell seeding densities to obtain similar final cell densities in all three conditions. We will also include Gattiglio et al. in our revised discussion.

      (3) Do the FGF4 and FGF17 LOF experiments in Figure 5 affect cell numbers like FGFRi in Figure 1?

      It seems the effect on cell number is small, but we will analyze this carefully and include it in the revised manuscript. A small effect would be consistent with our unpublished data below showing a near-uniform proliferation rate. This in turn suggests that low levels of pERK in the center are sufficient to maintain proliferation there while the much higher pERK levels in the PS-LC ring (that we think depend on FGF4 and FGF17) do not significantly increase the proliferation rate (see Fig.1 in the manuscript for the pERK pattern). Thus, loss of high pERK in the PS-LC ring while maintaining low pERK throughout would not be expected to have a major impact on cell number but would impact differentiation. In contrast, loss of all FGF signaling through FGFRi does dramatically affect cell number. This is again consistent with the data provided in response to (1) showing that ERK levels can be reduced to a point where PS-LC differentiation is lost without significantly affecting cell number. We will include the data below in the revised manuscript.

      Author response image 2.

      Why examine PS-LC induction only in FGF17 heterozygous cells and not homozygous FGF17 nulls?

      We were unable to obtain homozygous FGF17 nulls; it is not clear if there is a reason for this. We will try again and otherwise attempt to corroborate our findings with further knockdown data.

      (4) The idea that FGF8 plays a dominant role during gastrulation of other species but not humans is so intriguing it warrants deeper testing. The authors dismiss FGF8 because its mRNA "...levels always remained low." (line 363) as well as the data published in Zhai et al (PMID: 36517595) and Tyser et al (PMID: 34789876). But there are cases in mouse development where a gene was expressed at levels so low, that it might be dismissed, and yet LOF experiments revealed it played a role or even was required in a developmental process. The authors should consider FGF8 inhibition or inactivation to explore its potential role, despite its low levels of expression.

      We agree with the reviewer that FGF8 is worth investigating further and we will now pursue this.

      (5) Redundancy is a common feature in FGF genetics. What is the effect of inhibiting FGF4 in FGF17 LOF cells?

      We will attempt to do the experiment the reviewer suggests.

      (6) I suggest that the authors take more caution in describing FGF gradients. For example, in one Results heading they write "Endogenous FGF4 and FGF17 gradients underly the ERK activity pattern.", implying an FGF protein gradient. However, they only present data for FGF mRNA, not protein. This issue would be clarified if they used proper nomenclature for gene, mRNA (italics), and protein (no italics) throughout the paper.

      We will edit the paper to more clearly distinguish protein and mRNA.

      Reviewer #2 (Public review):

      Summary:

      The role of FGFs in embryonic development and stem cell differentiation has remained unclear due to its complexity. In this study, the authors utilized a 2D human stem cell-based gastrulation model to investigate the functions of FGFs. They discovered that FGF-dependent ERK activity is closely linked to the emergence of primitive streak cells. Importantly, this 2D model effectively illustrates the spatial distribution of key signaling effectors and receptors by correlating these markers with cell fate markers, such as T and ISL1. Through inhibition and loss-of-function studies, they further corroborated the needs of FGF ligands. Their data shows that FGFR1 is the primary receptor, and FGF2/4/17 are the key ligands for primitive streak development, which aligns with observations in primate embryos. Additional experiments revealed that the reduction of FGF4 and FGF17 decreases ERK activity.

      Strengths:

      This study provides comprehensive data and improves our understanding of the role of FGF signaling in primate primitive streak formation. The authors provide new insights related to the spatial localization of the key components of FGF signaling and attempt to reveal the temporal dynamics of the signal propagation and cell fate decision, which has been challenging.

      Weaknesses:

      Given the solid data, the work only partially clarifies the complex picture of FGF signaling, so details remain somewhat elusive. The findings lack a strong punchline, which may limit their broader impact.

      We thank this reviewer for their valuable feedback and the compliment on the solidity of our data. The punchline of our work is that FGF4- and FGF17-dependent ERK signaling plays a key role in human PS-LC differentiation, and that these are different FGFs than those thought to drive mouse gastrulation. A second key point is that, like BMP and TGFβ signaling, FGF signaling is restricted to the basolateral sides of pluripotent stem cell colonies due to polarized receptor expression, which is crucial for understanding the response to exogenous ligands added to the cell medium. Indeed, many facets of FGF signaling remain to be investigated in the future, such as how FGF regulates and is regulated by other signals, to which we will dedicate a different manuscript.

      Reviewer #3 (Public review):

      Jo and colleagues set out to investigate the origins and functions of localized FGF/ERK signaling for the differentiation and spatial patterning of primitive streak fates of human embryonic stem cells in a well-established micropattern system. They demonstrate that endogenous FGF signaling is required for ERK activation in a ring-domain in the micropatterns, and that this localized signaling is directly required for differentiation and spatial patterning of specific cell types. Through high-resolution microscopy and transwell assays, they show that cells receive FGF signals through basally localized receptors. Finally, the authors find that there is a requirement for exogenous FGF2 to initiate primitive streak-like differentiation, but endogenous FGFs, especially FGF4 and FGF17, fully take over at later stages.

      Even though some of the authors' findings - such as the localized expression of FGF ligands during gastrulation and the importance of FGF/ERK signaling for cell differentiation in the primitive streak - have been reported in model organisms before, this is one of the first studies to investigate the role of FGF signaling during primitive streak-like differentiation of human cells. In doing so, the paper reports a number of interesting and valuable observations, namely the basal localization of FGF receptors which mirrors that of BMP and Nodal receptors, as well as the existence of a positive feedback loop centered on FGF signaling that drives primitive-streak differentiation. The authors also perform a comparison of the role of different FGFs across species and try to assign specific functions to individual FGFs. In the absence of clean genetic loss-of-function cell lines, this part of the work remains less strong.

      We thank the reviewer for emphasizing the value of our findings in a human model for gastrulation. We agree more loss-of-function experiments would provide further insight into the role of different FGFs, and we plan to provide additional data along these lines in the revised manuscript.

    1. eLife Assessment

      The study provides valuable findings regarding the identification of a new bacteriophage that uses the Pseudomonas aeruginosa exopolysaccharide Psl as a receptor, thus suggesting a novel approach to control biofilms. While much of the data presented is solid, additional work and clarifications are still required to fully support some of the main claims. This manuscript will interest those working on biofilms, specifically in Pseudomonas, on phage physiology and discovery, and on alternatives to controlling bacterial pathogens.

    2. Reviewer #1 (Public review):

      Summary:

      Walton et al. set out to isolate new phages targeting the opportunistic pathogen Pseudomonas aeruginosa. Using a double ∆fliF ∆pilA mutant strain, they were able to isolate 4 new phages, CLEW-1, -3, -6, and -10, which were unable to infect the parental PAO1F Wt strain. Further experiments showed that the 4 phages were only able to infect a ∆fliF strain, indicating a role of the MS-protein in the flagellum complex. Through further mutational analysis of the flagellum apparatus, the authors were able to identify the involvement of c-di-GMP in phage infection. Depletion of c-di-GMP levels by an inducible phosphodiesterase renders the bacteria resistant to phage infection, while elevation of c-di-GMP through the Wsp system made the cells sensitive to infection by CLEW-1. Using TnSeq, the authors were able to not only reaffirm the involvement of c-di-GMP in phage infection but also identify the exopolysaccharide PSL as a downstream target for CLEW-1. C-di-GMP is a known regulator of PSL biosynthesis. The authors show that CLEW-1 binds directly to PSL on the cell surface and that deletion of the pslC gene resulted in complete phage resistance. The authors also provide evidence that the phage-PSL interaction happens during the biofilm mode of growth and that the addition of the CLEW-1 phage specifically resulted in a significant loss of biofilm biomass. Lastly, the authors set out to test if CLEW-1 could be used to resolve a biofilm infection using a mouse keratitis model. Unfortunately, while the authors noted a reduction in bacterial load assessed by GFP fluorescence, the keratitis did not resolve under the tested parameters.

      Strengths:

      The experiments carried out in this manuscript are thoughtful and rational, and sufficient explanation is provided for why the authors chose each specific set of experiments. The data presented strongly support their conclusions, and they present compelling explanations for any deviation. The authors have not only developed a new technique for screening for phages targeting P. aeruginosa, but also highlight the importance of looking for phages during the biofilm mode of growth, as opposed to the more standard techniques involving planktonic cultures.

      Weaknesses:

      While the paper is strong, I do feel that further discussions could have gone into the decision to focus on CLEW-1 for the majority of the paper. The paper also doesn't provide any detailed information on the genetic composition of the phages. It is unclear if the phages isolated are temperate or virulent. Many temperate phages enter the lytic cycle in response to QS signalling, and while the data as it is doesn't suggest that is the case, perhaps the paper would be strengthened by further elimination of this possibility. At the very least it might be worth mentioning in the discussion section.

    3. Reviewer #2 (Public review):

      This manuscript by Walton et al. suggests that they have identified a new bacteriophage that uses the exopolysaccharide Psl from Pseudomonas aeruginosa (PA) as a receptor. As Psl is an important component in biofilms, the authors suggest that this phage (and others similarly isolated) may be able to specifically target biofilm-growing bacteria. While an interesting suggestion, the manner in which this paper is written makes it difficult to draw this conclusion. Also, some of the results do not directly follow from the data as presented and some relevant controls seem to be missing.

    4. Author response:

      Reviewer #1 (Public review): 

      Summary: 

      Walton et al. set out to isolate new phages targeting the opportunistic pathogen Pseudomonas aeruginosa. Using a double ∆fliF ∆pilA mutant strain, they were able to isolate 4 new phages, CLEW-1, -3, -6, and -10, which were unable to infect the parental PAO1F Wt strain. Further experiments showed that the 4 phages were only able to infect a ∆fliF strain, indicating a role of the MS-protein in the flagellum complex. Through further mutational analysis of the flagellum apparatus, the authors were able to identify the involvement of c-di-GMP in phage infection. Depletion of c-di-GMP levels by an inducible phosphodiesterase renders the bacteria resistant to phage infection, while elevation of c-di-GMP through the Wsp system made the cells sensitive to infection by CLEW-1. Using TnSeq, the authors were able not only to reaffirm the involvement of c-di-GMP in phage infection but also to identify the exopolysaccharide PSL as a downstream target for CLEW-1. C-di-GMP is a known regulator of PSL biosynthesis. The authors show that CLEW-1 binds directly to PSL on the cell surface and that deletion of the pslC gene resulted in complete phage resistance. The authors also provide evidence that the phage-PSL interaction happens during the biofilm mode of growth and that the addition of the CLEW-1 phage specifically resulted in a significant loss of biofilm biomass. Lastly, the authors set out to test if CLEW-1 could be used to resolve a biofilm infection using a mouse keratitis model. Unfortunately, while the authors noted a reduction in bacterial load assessed by GFP fluorescence, the keratitis did not resolve under the tested parameters.

      Strengths: 

      The experiments carried out in this manuscript are thoughtful and rational, and sufficient explanation is provided for why the authors chose each specific set of experiments. The data presented strongly support their conclusions, and they present compelling explanations for any deviation. The authors have not only developed a new technique for screening for phages targeting P. aeruginosa, but also highlight the importance of looking for phages during the biofilm mode of growth, as opposed to the more standard techniques involving planktonic cultures.

      Weaknesses: 

      While the paper is strong, I do feel that further discussions could have gone into the decision to focus on CLEW-1 for the majority of the paper. The paper also doesn't provide any detailed information on the genetic composition of the phages. It is unclear if the phages isolated are temperate or virulent. Many temperate phages enter the lytic cycle in response to QS signalling, and while the data as it is doesn't suggest that is the case, perhaps the paper would be strengthened by further elimination of this possibility. At the very least it might be worth mentioning in the discussion section. 

      Thank you for your review. We will upload the genomes of all Clew phages and Ocp-2 before resubmission. It turns out that the Clew phage are highly related, which we wanted to express with the genomic comparison in the supplementary figure (rather unsuccessfully). It therefore made sense to focus our in-depth analysis on one of the phage. We will include a supplementary figure demonstrating that all Clew-1 phage require an intact psl locus for infection, to make that logic clearer. The phage are virulent (there is apparently a bit of a debate about this with regard to Bruynogheviruses, but we have not been able to isolate lysogens). This will be explained in the revised version of the manuscript as well.

      Reviewer #2 (Public review): 

      This manuscript by Walton et al. suggests that they have identified a new bacteriophage that uses the exopolysaccharide Psl from Pseudomonas aeruginosa (PA) as a receptor. As Psl is an important component in biofilms, the authors suggest that this phage (and others similarly isolated) may be able to specifically target biofilm-growing bacteria. While an interesting suggestion, the manner in which this paper is written makes it difficult to draw this conclusion. Also, some of the results do not directly follow from the data as presented and some relevant controls seem to be missing. 

      Thank you for your review. We would argue that the combination of demonstrating Psl-dependent binding of Clew-1 to P. aeruginosa, as well as demonstration of direct binding of Clew-1 to affinity-purified Psl, indicates that the phage binds directly to Psl and uses it as a receptor. In looking at the recommendations, it appears that the remark about controls refers to not using the ∆pslC mutant alone (as opposed to the ∆fliF2 ∆pslC double mutant) as a control for some of the binding experiments. However, since the ∆fliF2 mutant is more permissive for phage infection, analyzing the effect of deleting pslC in the context of the ∆fliF2 mutant background is the more stringent test.

    1. Author response:

      We sincerely thank all the reviewers for their enthusiasm and positive feedback, which has encouraged us to delve deeper into this research. As this is the first report of POLK in the brain using a longitudinal normative aging model, our primary aim was to establish the observational and phenomenological aspects. We agree with the reviewers that more detailed molecular, biochemical, and cellular studies are essential to elucidate underlying mechanisms. However, as noted by some reviewers, these investigations, while they will raise the impact, may fall outside the scope of the current report. Indeed, many of these lines of investigation are currently ongoing. Below, we provide our provisional responses to individual reviewer comments.

      Response to Reviewer #1:

      a) Concern over POLK antibody characterization in mice:

      We knocked down POLK by siRNA in mouse cortical primary neuronal cultures (Fig S1C). In the revised version, we will provide a more detailed characterization of POLK antibodies in mouse cells.

      b) More mechanistic investigation is needed before POLK could be considered as a brain aging clock:

      We sincerely appreciate the valuable suggestion. In our ongoing work exploring the mechanisms of POLK in postmitotic neurons, preliminary findings using siPOLK indicate an upregulation of senescence markers along with a reduction in DNA repair synthesis (manuscript in preparation). We will reference this companion manuscript in the revised version and are pleased to share these data with the reviewers for their consideration.

      Response to Reviewer #2:

      a) Concern about the need for a more mechanistic understanding of the pathways regulating POLK dynamics between the nucleus and cytosol:

      We sincerely appreciate the reviewer’s enthusiasm and valuable guidance in helping us better understand the mechanism of nuclear-cytoplasmic POLK dynamics. Previously, we developed a modified aniPOND (accelerated native isolation of proteins on nascent DNA) protocol, which we termed iPoKD-MS (isolation of proteins on Pol kappa synthesized DNA followed by mass spectrometry), to capture proteins bound to nascent DNA synthesized by POLK in human cell lines (bioRxiv https://www.biorxiv.org/content/10.1101/2022.10.27.513845v3). In this dataset, we identified potential candidates that may regulate nuclear/cytoplasmic POLK dynamics. These candidates are currently undergoing validation in human cell lines, and we are preparing a manuscript on these findings. Among these, some candidates, including previously identified proteins such as exportin and importin (Temprine et al., 2020, PMID: 32345725), are being explored further as potential POLK nuclear/cytoplasmic shuttles. We are also conducting tests on these candidates in mouse cortical primary neurons to assess their role in POLK dynamics. In the revised version of the manuscript, we will include a discussion of our current understanding and outline our planned studies.

      b) Question on “… what is POLK doing in the cytosol, and what is it interacting with …”:

      Our data so far indicate that POLK accumulates in stress granules and lysosomes. We are very grateful for the reviewer’s insightful suggestions and will make every effort to incorporate them in the revised manuscript. Currently, we are characterizing POLK accumulation in the cytoplasm using additional lysosomal markers, as recommended by the reviewer. If these experiments prove challenging in mouse brain tissues, we plan to carry them out in primary neuron cultures. We hope to include these findings in the revised version. Additionally, we have optimized the POLK antibody for immunoprecipitation from nuclear and cytoplasmic fractions of mouse brain tissue. These findings, which are beyond the scope of the current study, will be reported in a separate manuscript.

      Response to Reviewer #3:

      We greatly appreciate the reviewer bringing up the context of biomolecular condensates. Our iPoKD-MS data referenced above suggest candidates from various biomolecular condensates. We are currently using subcellular fractionation to investigate the presence of POLK in these condensates; the results will be fully reported in future publications. We also appreciate the reviewer providing important literature, which will be cited, and potential biomolecular condensates will be discussed in the revised version.

    2. eLife Assessment

      Abdelmageed et al. demonstrate POLK expression in neurons and report an important observation that POLK exhibits an age-dependent change in subcellular localization, from the nucleus in young tissue to the cytoplasm in old tissue. Despite potentially exciting and novel findings, many of the authors' claims are provided with incomplete support (e.g. lack of validation of the POLK antibody, characterization of the subcellular compartment, etc).

    3. Reviewer #1 (Public review):

      Summary:

      Abdelmageed et al. investigate age-related changes in the subcellular localization of DNA polymerase kappa (POLK) in the brains of mice. Although POLK has been actively investigated for its role in translesion DNA synthesis and its involvement in other DNA repair pathways in proliferating cells, very little is known about POLK in a tissue-specific context, let alone in post-mitotic cells. The authors investigated POLK subcellular distribution in the brains of young, middle-aged, and old mice via immunoblotting of fractionated tissue extracts and immunofluorescence (IF). Immunoblotting revealed a progressive decrease in the abundance of nuclear POLK, while cytoplasmic POLK levels concomitantly increased. Similar findings were present when IF was performed on brain sections. Further, IF studies of the cingulate cortex (Cg1), the motor cortex (M1, M2), and the somatosensory (S1) cortical regions all showed an age-related decline in nuclear POLK. Nuclear speckles of POLK decrease in each region; meanwhile, the number of cytoplasmic POLK granules decreases in all four regions, while granule size increases. The authors report similar findings for REV1, another Y-family DNA polymerase.

      The authors then investigate the colocalization of POLK with other DNA damage response (DDR) proteins in either pyramidal neurons or inhibitory interneurons. At 18 months of age, DNA damage marker gH2AX demonstrated colocalization with nuclear POLK, while strong colocalization of POLK and 8-oxo-dG was present in geriatric mice. The authors find that cytoplasmic POLK granules colocalize with stress granule marker G3BP1, suggesting that the accumulated POLK ends up in the lysosome.

      Brain regions were further stained to identify POLK patterns in NeuN+ neurons, GABAergic neurons, and other non-neuronal cell types present in the cortex. Microglia associated with pyramidal neurons or inhibitory interneurons were found to have a higher abundance of cytoplasmic POLK. The authors also report that POLK localization can be regulated by neuronal activity induced by Kainic acid treatment. Lastly, the authors suggest that POLK could serve as an aging clock for brain tissue, but POLK deserves further characterization and correlation to functional changes before being considered as a biomarker.

      Strengths:

      Investigation of TLS polymerases in specific tissues and in post-mitotic cells is largely understudied. The potential changes in sub-cellular localization of POLK and potentially other TLS polymerases open up many questions about DNA repair and damage tolerance in the brain and how it can change with age.

      Weaknesses:

      The work is quite novel and interesting, and the authors do suggest some potentially interesting roles for POLK in the brain, but these are in and of themselves a bit speculative. The majority of the findings of this paper rely on a POLK antibody and its presumed specificity for POLK. However, this antibody has not been fully validated and needs further work. Further validation experiments using Polk-deficient or knocked-down cells to investigate antibody specificity for both immunoblotting and immunofluorescence should be performed. More mechanistic investigation is needed before POLK could be considered as a brain aging clock.

    4. Reviewer #2 (Public review):

      Summary:

      Abdelmageed et al. demonstrate POLK expression in nervous tissue and focus mainly on neurons. Here they describe an exciting age-dependent change in POLK subcellular localization, from the nucleus in young tissue to the cytoplasm in old tissue. They argue that the cytosolic POLK is associated with stress granules. They also investigate the cell-type specific expression of POLK, and quantitate expression changes induced by cell-autonomous (activity) and cell nonautonomous (microglia) factors.

      I think it is an interesting report but requires a few more experiments to support the findings in the latter half of the paper. Additionally, a more mechanistic understanding of the pathways regulating POLK dynamics between the nucleus and cytosol (what POLK is doing in the cytosol, and what it is interacting with) would greatly increase the impact of this report. However, additional mechanistic experiments are mostly not needed to support much of the currently presented results; again, they would simply increase the impact.

    5. Reviewer #3 (Public review):

      Summary:

      In this study, the authors show that DNA polymerase kappa (POLK) relocalizes to the cytoplasm as granules with age in mice. The reduction of nuclear POLK in old brains is congruent with an increase in DNA damage markers. The cytoplasmic granules colocalize with stress granules and endo-lysosomes. The study proposes that protein localization of POLK could be used to determine the biological age of brain tissue sections.

      Strengths:

      Very few studies focus on the POLK protein in the peripheral nervous system (PNS). The microscopy approach used here is also very relevant: it allows the authors to highlight a radical change in POLK localization (nuclear versus cytoplasmic) depending on the age of the neurons.

      The conclusions of the study are strong. Several types of neurones are compared, the colocalization with several proteins from the NHEJ and BER repair pathways is tested, and microscopy images are systematically quantified.

      Weaknesses:

      The authors do not discuss the physical nature of POLK granules. There is a large field of research dedicated to the nature and function of condensates: in particular, numerous studies have shown that some condensates, but not all, exhibit liquid-like properties (https://www.nature.com/articles/nrm.2017.7, https://pubmed.ncbi.nlm.nih.gov/33510441/, https://www.mdpi.com/2073-4425/13/10/1846). The change of physical properties of condensates is particularly important in cells undergoing stress and during aging. The authors should discuss this literature.

    1. eLife Assessment

      This valuable study by Ganesh and colleagues examined how both the value and salience of sensory information can affect economic decision-making. The results provide insights into how different sources of uncertainty found in the real world, including those related to the perception of objects and those related to values associated with objects, can together influence decision-making behavior in systematic ways. The evidence is solid but overlaps with previous studies and could be improved by clarifying novelty and experimental details and considering additional models.

    2. Reviewer #1 (Public review):

      This study examined the effects of uncertainty over states (i.e., stimuli) and uncertainty over rewards (i.e., reward probability) on human learning and decision-making in a simple reinforcement learning task. The authors proposed two hypotheses: (1) high uncertainty over states reduces the learning rate, and (2) visual salience drives decision-making. A Bayesian learner is proposed to support the first hypothesis and several regression analyses confirm this finding. Furthermore, the analysis of salience bias also supports the second hypothesis.

      Strengths:

      (1) The experiment is simple and solid.

      (2) The experimental design is clever and consistent with several well-established paradigms.

      Weaknesses:

      (1) One of my main concerns is that the first conclusion "high uncertainty over states reduces learning rate" is not new and has been shown recently in Yoo et al. (2023). In that study, a slower learning rate was found when stimuli were perceptually similar. It seems to me that the only difference here is that simple Gabor patches are used instead of e.g., green vegetable images in that study. The conclusion is exactly the same.

      (2) The second hypothesis should be more explicit. Instead of claiming "A drives B", can you show specific predictions for the direction of this influence? For example, given the same expected value, do human learners prefer to choose a high-contrast stimulus? and why?

      (3) The analyses of salience bias support the second hypothesis. However, if I understand it correctly, there is no salience parameter (i.e., absolute contrast of each stimulus) in the decision process, according to Eqs. 4, 5, and 6 in the Methods. In other words, the Bayesian learner should not exhibit a salience bias. The question then becomes: why do human learners have such a bias? What are the underlying mechanisms of the salience bias?

      (4) If high perceptual uncertainty reduces the learning rate, why does the normative agent, which takes perceptual uncertainty into account, learn faster than the categorical agent, which has no perceptual uncertainty at all? Did I miss something?

      (5) The learning algorithm is different from the standard Q-learning modeling approach. It would be better to include more explanation of why this type of learning algorithm is Bayes-optimal.

      (6) Similar to the above, Bayesian modeling here only confirms that high perceptual uncertainty reduces the learning rate in an optimal Bayesian learner. Two questions remain elusive: (a) whether human learners are close to the Bayesian learner (i.e., near optimal). It seems that (a) is unlikely given several suboptimal heuristics (e.g., confirmation bias) found in humans. Then the question is (b) how optimal learning and suboptimal heuristics are combined in the human learning process. One of the major disadvantages of this study is that no new model is proposed to fit trial-by-trial human choices. I believe that building formal process models is the key to improving this study.

      (7) The writing should be substantially improved. The main concern here is that the authors used several seemingly related but ambiguous words to represent the same concept. For example, "perceptual uncertainty" in Figures 1 & 2 indicates the contrast differences between two patches. But page 5 line 9 includes "belief-state uncertainty". Are they the same concept? Moreover, on page 18 line 17, if I understand it correctly, "perceptual uncertainty" indicates sensory noise, not contrast differences. Please carefully check all terminology and use a single, concrete term to represent each concept throughout the paper.

      (8) Similarly, is the "task state" on page 17 the same as the "perceptual state" in Figure 1&2?

      (9) The Methods section could also be improved. For example, I am not sure how Eq. 5 is derived. Also, page 18 line 16 states that "in our simulations, we manipulated...". I did not find any information about the simulation. How was the simulation performed? Did I miss something?
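
      To make the question raised in point (4) concrete, the following is a minimal Python sketch of how belief-state uncertainty can scale the effective learning rate. It is an illustration only and is not the authors' model (Eqs. 4-6 of the manuscript are not reproduced here). The function name `belief_weighted_update`, the fixed rate `alpha`, and the belief values are all assumptions introduced for this example.

```python
def belief_weighted_update(estimates, belief, reward, alpha=0.2):
    """One delta-rule step in which each state's update is scaled by its belief.

    estimates: current reward-probability estimate for each state
    belief:    probability assigned to each state (hypothetical values)
    reward:    observed outcome (1.0 or 0.0)
    alpha:     base learning rate (an assumed constant, not a fitted value)
    """
    return [q + alpha * b * (reward - q) for q, b in zip(estimates, belief)]


# Low perceptual uncertainty: the learner is almost sure it is in state 0.
low = belief_weighted_update([0.5, 0.5], belief=[0.95, 0.05], reward=1.0)

# High perceptual uncertainty: the two states are nearly indistinguishable.
high = belief_weighted_update([0.5, 0.5], belief=[0.55, 0.45], reward=1.0)

print("low uncertainty :", [round(q, 3) for q in low])
print("high uncertainty:", [round(q, 3) for q in high])
```

      Under these assumptions the estimate for state 0 moves further after a reward when the belief is sharp than when it is diffuse, which is the sense in which higher state uncertainty lowers the effective learning rate. Whether the manuscript's normative agent behaves this way is exactly what point (4) asks the authors to clarify.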

    3. Reviewer #2 (Public review):

      Summary:

      The authors addressed the question of how perceptual uncertainty and reward uncertainty jointly shape value-based decision-making. They sought to test two main hypotheses: (H1) perceptual uncertainty modulates learning rates, and (H2) perceptual salience is integrated in value computation. Through a series of analyses, including regression models and normative computational modeling, they showed that learning rates were modulated by perceptual uncertainty (reflected by differences in contrast), supporting H1, and the update was indeed biased toward high-contrast (ie, salient) stimuli, supporting H2.

      Strengths:

      This is a timely and interesting study, with a strong theory-driven focus, reflected by the sophisticated experimental design that systematically tests both perceptual and reward uncertainty. This paper is also well written, with relevant examples (bakery) that draw the analogy to explain the main research question. The main response by participants is reward probability estimation (on a slider), which goes beyond commonly used binary choices and offers richness of the data, that was eventually used in the regression analysis. This work may also open new directions to test the interaction between perceptual decision-making and value-based decision-making.

      Weaknesses:

      Despite the strengths, multiple points may need to be clarified, to make this paper stronger.

      (1) Experimental design:

      (1a) The authors stated (page 6) that "The systematic manipulation of uncertainty resulted in three experimental conditions." If this is truly systematic, wouldn't there be a low-low condition, in a factorial design fashion? Essentially, the current study has H(perceptual uncertainty)-H(reward uncertainty), L(perceptual uncertainty)-H(reward uncertainty), H(perceptual uncertainty)-L(reward uncertainty), but naturally, one would anticipate an L-L condition. It could be argued that the L-L condition may seem too easy, causing a ceiling effect, but it nonetheless provides a benchmark for baseline learning when everything is unambiguous. I am not asking the authors to run additional experiments to include all four conditions, unless they wish to. But it would be helpful to justify why an L-L condition was not included in the initial design.

      (1b) I feel there is a certain degree of imbalance regarding the levels of uncertainty. For reward uncertainty, {0.9, 0.1} is low uncertainty, and {0.7, 0.3} is high uncertainty, whereas for perceptual uncertainty, the differences in contrast between the Gabor stimuli span a much wider range. This means the design appears to be more sensitive to any effect caused by perceptual uncertainty (as there is sufficient variation) than by reward uncertainty. Again, I am not asking the authors to run additional experiments, but it would be very helpful if they could explain and justify the choice of experimental setup and specification.

      (2) Statistical Analysis:

      (2a) There is some inconsistency regarding the statistics used. For the comparisons across the three conditions, sometimes an F-test is used followed by a series of t-tests (e.g., page 6), but in other places, only pairwise t-tests were reported without an F-test (e.g., page 12). It would be helpful, for all of them, to have an F-test first, and then three t-tests. And for the F-test, I assume it was a one-way ANOVA? This information was not explicit in the Methods. Also, which multiple-comparison correction was used, if any?

      (2b) Regarding normative modeling, I am aware that this is a pure simulation without model fitting, but without model fitting it loses the close relationship between the data and the model. I wonder if model fitting can be done at all. As it stands, there is not even qualitative evidence regarding how well the model could explain the data (e.g., by adding real data to Figure 3e). In other words, since it is a normative model, it is no surprise that it works, but it is not known whether it accounts for human data. As a side note, I appreciate that certain groups of researchers tend not to run model estimation; instead, model simulations are used to qualitatively compare the model and data. This is particularly true for "normative models". But at least in the current case, I believe model estimation can be implemented, and will provide more insights.

      (2c) Relatedly, regarding specific results shown in Figure 4b - the normative agent has a near-zero effect on the fixed learning rate. I do not find these results surprising, because since the normative agent "knows" what is going to happen, and which state the agent is in, there is no need to update the prediction error in the classic Q-learning fashion. But humans, on the other hand, do NOT know the environment, hence they do not know what they are supposed to do, like the model. In essence, the model knows more than the humans in the task know. We can leave this to debate, but I believe most cognitive modelers would agree that the model should not know more than humans know. I think it would be helpful if the authors could discuss the advantages and disadvantages of using normative models in this case.

      (2d) I find the results in Figure 5 interesting. But given that the dependent variable is identical across the three correlations (i.e., absolute estimation error), I would suggest the authors put all three predictors into a single multiple regression. This way, shared variance, if any, could also be taken into account by the model.

      (2e) I feel H2 receives somewhat less attention than H1. The authors ran a series of analyses testing and supporting H1, but addressed H2 only briefly. On first reading, I wondered why the normative model was not also used to test the effect of salience, but salience is in fact included in the model (buried in the Methods). I am curious to know whether analyzing the salience-related parameter (beta_4) would also support H2.
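
      Regarding the multiple-comparison question in point (2a), one common option for three pairwise tests is the Holm-Bonferroni step-down correction. The Python sketch below is illustrative only: the p-values are hypothetical placeholders rather than results from the manuscript, and `holm_adjust` is a helper name introduced here, not a function from any statistics library.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values for the three pairwise condition comparisons.
hypothetical_p = [0.010, 0.040, 0.030]
adjusted = holm_adjust(hypothetical_p)
print([round(p, 3) for p in adjusted])
```

      Reporting adjusted values of this kind alongside the omnibus F-test would make the comparisons on pages 6 and 12 directly comparable. Any other standard correction would serve equally well, provided it is stated in the Methods.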

    1. eLife Assessment

      This important study introduces an approach to discovering antibiotic resistance determinants by leveraging diverse susceptibility profiles among related mycobacterial species, with particular relevance to high-level resistance against natural product-derived antibiotics. The research provides convincing evidence for the role of ADP-ribosylation enzymes in rifamycin resistance among mycobacteria, whilst also demonstrating that antibiotic susceptibility is not correlated with growth rate or intracellular compound concentration. Although some broader claims require additional experimental support, this work lays a significant foundation for understanding the complexity of antibiotic resistance mechanisms in mycobacteria and opens new avenues for future antimicrobial research.

    2. Reviewer #1 (Public review):

      This work shows that resistance profiles to a variety of drugs are variable between different mycobacterial species and are not correlated with growth rate or intrabacterial compound concentration (at least for linezolid, bedaquiline, and Rifampicin). Note that intrabacterial compound concentration does not distinguish between cytosolic and periplasmic/cell wall-associated drugs. The susceptibility profiles for a wide range of mycobacteria tested under the same conditions against 15 commonly used antimycobacterial drugs provide the first recorded cross-species comparison, which will be a valuable resource for the scientific community. To understand the reasons for the high Rifampicin resistance seen in many mycobacteria, the authors first confirm the absence of on-target mutations in the RpoB RRDR and then confirm that the resistant mycobacteria carry the arr gene, known to encode a Rif ribosyltransferase involved in Rif resistance in M. smegmatis. Metabolomic analyses confirm the presence of ribosylated Rif in some of the naturally resistant mycobacteria, which may not be entirely surprising but is an important confirmation. Presumably M. branderi is highly resistant despite lacking the arr homolog due to the rpoB S45N mutation. M. flavescens has an MIC similar to that of M. smegmatis, despite having both Arr-1 and Arr-X. Various Arr-1 and Arr-X proteins are expressed and characterized for catalytic activity, which shows that Arr-X is a faster enzyme, especially with respect to more hydrophobic rifamycins. M. flavescens has MIC values for Rifapentine and Rifabutin similar to those of M. smegmatis. Thus, the Arr-1 versus Arr-X comparison does not provide a complete explanation for the underlying reasons driving natural Rif resistance in mycobacteria. Downregulation of Arr-X expression in M. conceptionense confers increased sensitivity to Rifabutin, confirming its role as a rifamycin-inactivating enzyme.

      Overall, the comparison of cross-species susceptibility profiles is novel; the demonstration that MIC is not correlated with intracellular drug concentration is important but not sufficiently interrogated; the demonstration that Arr-X is also a Rif ADP-ribosyltransferase is a good confirmation; and the finding that it is more efficient than Arr-1 on hydrophobic rifamycins is interesting but maybe not entirely surprising. The manuscript seems to have two parts that are related, but the rifamycin modification aspect of the work is not strongly linked to the first part, since it interrogates the modification of one drug but not the common cause of natural resistance for other drugs.

    3. Reviewer #2 (Public review):

      Summary:

      The authors use a variety of methods to investigate the mechanisms of innate drug resistance in mycobacteria. They end up focusing on two primary determinants - drug accumulation, which correlates rather poorly with resistance for many species, and, for the rifamycins, ADP-ribosyltransferases. The latter enzymes do appear to account for a good deal of resistance, though it is difficult to extrapolate quantitatively what their relative contributions are.

      Overall, they make excellent use of biochemical methods to support their conclusions. Though they set out to draw very broad lessons, much of the focus ends up being on rifamycins. This is still a very interesting set of conclusions.

      Strengths:

      (1) A very interesting approach and set of questions.

      (2) Outstanding technical approaches to measuring intracellular drug concentrations and chemical modification of rifamycins.

      (3) Excellent characterization of variant rifamycin ADP-ribosyltransferases

      Weaknesses:

      (1) Figure 3c/d: These panels show the same experiment done twice, yet they display substantially different results in certain cases. For instance, M. smegmatis appears to show an order of magnitude lower RIF accumulation in panel d compared to M. flavescens, despite them displaying equal accumulation in panel c. The authors should provide justification for this variation, particularly as quantitative intra-species comparisons are central to the conclusions of this figure.

      (2) There are several technical concerns with Figure 3 that affect how to interpret the work. According to the methods, the authors did not appear to normalize to an internal standard, only to an external antibiotic standard (which may account for some of the technical variation alluded to above). Second, the authors used different concentrations of drug for each species to try to match the species' MICs. I appreciate the authors' thinking on this, but I think for an uptake experiment it would be more appropriate to treat with the same concentration of drug since uptake is likely saturable at higher drug concentrations. In the current setup, for the species with higher MIC, they have to be able to uptake substantially more antibiotics than the species with low MIC in order to end up with the same normalized uptake value in Figure 3d. It would be helpful to repeat this experiment with a single drug concentration in the media for all species and test whether that gives the same results seen here.

      (3) Figure 4f: This panel seems to argue against the idea that the efficacy of RIF ribosylation is what's driving drug susceptibility. M. flavescens is similarly resistant to RIF as M. smegmatis, yet M. flavescens has dramatically lower ribosylation of RIF. This is perhaps not surprising, as the authors appropriately highlight the number of different rif-modifying enzymes that have been identified that likely also contribute to drug resistance. However, I do think this means that the authors can't make the claim that the resistance they observe is caused by rifamycin modification, so those claims in the text and figure legend should be altered unless the authors can provide further evidence to support them. This experiment also has results that are inconsistent with what appears to be an identical experiment performed in Supplemental Figure 5b. The authors should provide context for why these results differ.

      (4) Fig 4f/5c: M. flavescens has both Arr-1 and Arr-X, yet it appears to not have ribosylated RIF. This result seems to undermine the authors' reliance on the enzyme assay shown in Fig 5c - in that assay, M. flavescens Arr-X is very capable of modifying rifampicin, yet that doesn't appear to translate to the in vivo setting. This is of importance because the authors use this enzyme assay to argue that Arr-X is a fundamentally more powerful RIF resistance mechanism than Arr-1 and that it has specificity for rifabutin. However, the result in Figure 4f would argue that the enzyme assay results cannot be directly translated to in vivo contexts. For the authors to claim that Arr-X is most potent at modifying rifabutin, they could test their CRISPRi knockdowns of Arr-X and Arr-1 under treatment with each of the rifamycins they use in the enzyme assay. The authors mentioned that they didn't do this because all the strains are resistant to those compounds; however, if Arr-X is important for drug resistance, it would be reasonable to expect to see sensitization of the bacteria to those compounds upon knockdown.

      (5) Figure 5d: The authors use this CRISPRi experiment to claim that Arr-X from M. conceptionense is more potent at inactivating rifabutin than Arr-1. This claim depends on there being equal degrees of knockdown of Arr-1 and Arr-X, so the authors should validate the degree of knockdown they get. This is particularly important because, to my knowledge, nobody has used this system in M. conceptionense before.

      (6) The authors' arguments about Arr-X and Arr-1 would be strengthened by showing by LC/MS that Arr-X knockdown in M. conceptionense results in more loss of ribosyl-rifabutin than knockdown of Arr-1.

    4. Reviewer #3 (Public review):

      This manuscript presents a macroevolutionary approach to the identification of novel high-level antibiotic resistance determinants that takes advantage of the natural genetic diversity within a genus (mycobacteria, in this case) by comparing antibiotic resistance profiles across related bacterial species and then using computational, molecular, and cellular approaches to identify and characterize the distinguishing mechanisms of resistance. The approach is contrasted with "microevolutionary" approaches based on comparing resistant and susceptible strains of the same species and approaches based on ecological sampling that may not include clinically relevant pathogens or related species. The potential for new discoveries with the macroevolution-inspired approach is evident in the diversity of drug susceptibility profiles revealed amongst the selected mycobacterial species and the identification and characterization of a new group of rifamycin-modifying ADP-ribosyltransferase (Arr) orthologs of previously described mycobacterial Arr enzymes. Additional findings that intra-bacterial antibiotic accumulation does not always predict potency within this genus, that M. marinum is a better proxy for M. tuberculosis drug susceptibility than the commonly used saprophyte M. smegmatis, and that susceptibility to semi-synthetic antibiotic classes is generally less variable than susceptibility to antibiotics more directly derived from natural products strengthen the claim that the macroevolutionary lens is valuable for elucidating general principles of susceptibility within a genus.

      There are some limitations to the work. The argument for the novelty of the approach could be better articulated. While the opportunities for new discoveries presented by the identification of discrepant susceptibility results between related species are evident, it is less clear how the macroevolutionary approach is further leveraged for the discovery of truly novel resistance determinants. The example of the discovery of Arr-X enzymes presented here relied upon foundational knowledge of previously characterized Arr orthologs. There is little clarity on what the pipeline for identifying more novel resistance determinants would look like. In other words, what does the macroevolutionary perspective contribute to discovery from the point of finding interspecies differences in susceptibility? Does the framework still remain distinct from other discovery frameworks and approaches? If so, how?

      While the experimentation and analyses performed appear well-designed and rigorous, there are a few instances in which broad claims are based on inferences from sample sets or data sets that are too limited to provide robust support. For example, the claim that rifampicin modification, and precisely ADP-ribosylation, is the dominant mechanism of resistance to rifampicin in mycobacteria may be a bit premature or an over-generalization, as other enzymatic modification mechanisms and other mechanisms such as helR-mediated dissociation of rifampicin-stalled RNA polymerases, efflux, etc. were not examined, nor were CRISPRi knockdown experiments conducted beyond an experiment to tease out the role of Arr-X and Arr-1 in one strain. The general claim that intra-bacterial antibiotic accumulation does not predict potency in mycobacteria may be another over-generalization based on the limited number of drugs and species studied, but perhaps the intended assertion was that antibiotic accumulation ALONE does not predict potency.

    1. eLife Assessment

      This important manuscript provides insights into the competition between Splicing Factor 1 (SF1) and Quaking (QKI) for binding at the ACUAA branch point sequence in a model intron, regulating exon inclusion. The study employs rigorous transcriptomic, proteomic, and reporter assays, with both mammalian cell culture and yeast models. Nevertheless, while the data are convincing, broadening the analysis to additional exons and narrowing the manuscript's title to better align with the experimental scope would strengthen the work.

    2. Reviewer #1 (Public review):

      In this manuscript, the authors aimed to show that SF1 and QKI compete for the intron branch point sequence ACUAA and provide evidence that QKI represses inclusion when bound to it.

      Major strengths of this manuscript include:

      (1) Identification of the ACUAA-like motif in exons regulated by QKI and SF1.

      (2) The use of the splicing reporter and mutant analysis to show that upstream and downstream ACUAAC elements in intron 10 of RAI14 are required for repressing splicing.

      (3) The use of proteomics to identify proteins in C2C12 nuclear extract that bind to the wild-type and mutant sequences.

      (4) The yeast studies showing ectopic lethality when Qki5 expression was induced, due to increased mis-splicing of transcripts that contain the ACUAA element.

      The authors conclusively show that the ACUAA sequence is bound by QKI and provide strong evidence that this leads to differences in exon inclusion and exclusion. In animal cells, and especially in humans, branchpoint sequences are degenerate but seem to be recognized by specific splicing factors. Although a subset of splicing factors shows tissue-specific expression patterns, most don't, suggesting that yet-to-be-identified mechanisms regulate splicing. This work suggests that an alternate mechanism could be related to the binding affinity of specific RNA binding factors for branchpoint sequences coupled with the level of these different splicing factors in a given cell.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Pereira de Castro and coworkers are studying potential competition between a more standard splicing factor SF1, and an alternative splicing factor called QK1. This is interesting because they bind to overlapping sequence motifs and could potentially have opposing effects on promoting the splicing reaction. To test this idea, the authors KD either SF1 or QK1 in mammalian cells and uncover several exons whose splicing regulation follows the predicted pattern of being promoted for splicing by SF1 and repressed by QK1. Importantly, these have introns enriched in SF1 and QK1 motifs. The authors then focus on one exon in particular with two tandem motifs to study the mechanism of this in greater detail and their results confirm the competition model. Mass spec analysis largely agrees with their proposal; however, it is complicated by the apparently quick transition of SF1-bound complexes to later splicing intermediates. An inspired experiment in yeast shows how QK1 competition could potentially have a detrimental impact on splicing in an orthogonal system. Overall, these results show how splicing regulation can be achieved by competition between a "core" and alternative splicing factor and provide additional insight into the complex process of branch site recognition. The manuscript is exceptionally clear and the figures and data are very logically presented. The work will be valuable to those in the splicing field who are interested in both mechanism and bioinformatics approaches to deconvolve any apparent "splicing code" being used by cells to regulate gene expression. Criticisms are minor and the most important of them stem from overemphasis on parts of the manuscript on the evolutionary angle when evolution itself wasn't analyzed per se.

      Strengths:

      (1) The main discovery of the manuscript involving evidence for SF1/QK1 competition is quite interesting and important for this field. This evidence has been missing and may change how people think about branch site recognition.

      (2) The experiments and the rationale behind them are exceptionally clearly and logically presented. This was wonderful!

      (3) The experiments are carried out to a high standard and well-designed controls are included.

      (4) The extrapolation of the result to yeast in order to show the potentially devastating consequences of the QK1 competition was very exciting and creative.

      Weaknesses:

      Overall the weaknesses are relatively minor and involve cases where clarification is necessary, some additional analysis could bolster the arguments, and suggestions for focusing the manuscript on its strengths.

      (1) The title (Ancient...evolutionary outcomes), abstract, and some parts of the discussion focus heavily on the evolutionary implications of this work. However, evolutionary analysis was not performed in these studies (e.g., when did QK1 and SF1 proteins arise and/or diverge? How does this line up with branch site motifs and evolution of U2? Any insight from recent work from Scott Roy et al?). I think this aspect either needs to be bolstered with experimental work/data or this should be tamped down in the manuscript. I suggest highlighting the idea expressed in the sentence "A nuanced implication of this model is that loss-of-function...". To me, this is better supported by the data and potentially by some analysis of mutations associated with human disease.

      (2) One paper that I didn't see cited was that by Tanackovic and Kramer (Mol Biol Cell 2005). This paper is relevant because they KD SF1 and found it nonessential for splicing in vivo. Do their results have implications for those here? How do the results of the KD compare? Could QK1 competition have influenced their findings (or does their work influence the "nuanced implication" model referenced above?)?

      (3) Can the authors please provide a citation for the statement "degeneracy is observed to a higher degree in organisms with more alternative splicing"? Does recent evolutionary analysis support this?

      (4) For the data in Figure 3, I was left wondering if NMD was confounding this analysis. Can the authors respond to this and address this concern directly?

      (5) To me, the idea that an engaged U2 snRNP was pulled down in Figure 4F would be stronger if the snRNA was detected. Was that able to be observed by northern or primer extension? Would SF1 be enriched if the U2 snRNA was degraded by RNaseH in the NE?

      (6) I'm wondering how additive the effects of QK1 and SF1 are... In Figure 2, if QK1 and SF1 are both knocked down, is the splicing of exon 11 restored to "wt" levels?

      (7) The first discussion section has two paragraphs that begin "How does competition between SF1..." and "Relatively little is known about how...". I found the discussion and speculation about localization, paraspeckles, and lncRNAs interesting but a bit detracting from the strengths of the manuscript. I would suggest shortening these two paragraphs into a single one.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, the authors were trying to establish whether competition between the RNA-binding proteins SF1 and QKI controlled splicing outcomes. These two proteins have similar binding sites and protein sequences, but SF1 lacks a dimerization motif and seems to bind a single version of the binding sequence. Importantly, these binding sequences correspond to branchpoint consensus sequences, with SF1 binding leading to productive splicing, but QKI binding leading instead to association with paraspeckle proteins. They show that in human cells SF1 generally activates exons and QKI represses, and a large group of the jointly regulated exons (43% of joint targets) are reciprocally controlled by SF1 and QKI. They focus on one of these exons RAI14 that shows this reciprocal pattern of regulation, and has 2 repeats of the binding site that make it a candidate for joint regulation, and confirm regulation within a minigene context. The authors used the assembly of proteins within nuclear extracts to explain the effect of QKI versus SF1 binding. Finally, the authors show that the expression of QKI is lethal in yeast, and causes splicing defects.

      How this fits in the field. This study is interesting and provides a conceptual advance by providing a general rule on how SF1 and QKI interact in relation to binding sites, and the relative molecular fates followed, so is very useful. Most of the analysis seems to focus on one example, although the molecular analysis and global work significantly add to the picture from the previously published paper about NUMB joint regulation by QKI and SF1 (Zong et al., cited in text as reference 50, which looked at SF1 and QKI binding in relation to a duplicated binding site/branchpoint sequence in NUMB).

      Strengths:

      The data presented are strong and clear. The ideas discussed in this paper are of wide interest, and present a simple model where two binding sites generate a potentially repressive QKI response, whereas exons that have a single upstream sequence are just regulated by SF1. The assembly of splicing complexes on RNAs derived from RAI14 in nuclear extracts, followed by mass spec gave interesting mechanistic insight into what was occurring as a result of QKI versus SF1 binding.

      Weaknesses:

      I did not think the title best summarises the take-home message and could be perhaps a bit more modest. Although the authors investigated splicing patterns in yeast and human cells, yeast do not have QKI so there is no ancient competition in that case, and the study did not really investigate physiological or evolutionary outcomes in splicing, although it provides interesting speculation on them. Also as I understood it, the important issue was less conserved branchpoints in higher eukaryotes enabling alternative splicing, rather than competition for the conserved branchpoint sequence. So despite the data being strong and properly analysed and discussed in the paper, could the authors think whether they fit best with the take-home message provided in the title? Just as a suggestion (I am sure the authors can do a better job), maybe "molecular competition between variant branchpoint sequences predict physiological and evolutionary outcomes in splicing"?

      Although the authors do provide some global data, most of the detailed analysis is of RAI14. It would have been useful to examine members of the other quadrants in Figure 1C as well for potential binding sites to give a reason why these are not co-regulated in the same way as RAI14. How many of the RAI14 quadrants had single/double sites (the motif analysis seemed to pull out just one), and could one of the non-reciprocally regulated exons be moved into a different quadrant by addition or subtraction of a binding site or changing the branchpoint (using a minigene approach, for example)?

    1. eLife Assessment

      This study presents an important finding on the role of telomeres in modulating interleukin-1 signaling and tumor immunity in TNBC. The evidence supporting these findings is solid, presented through comprehensive analyses including TNBC clinical samples, tumor-derived organoids, cancer cells, and xenografts. The work will be of broad interest to cell and medical biologists focusing on TNBC.

    2. Reviewer #2 (Public review):

      This study highlights the role of telomeres in modulating IL-1 signaling and tumor immunity. The authors demonstrate a strong correlation between telomere length and IL-1 signaling by analyzing TNBC patient samples and tumor-derived organoids. Mechanistic insights revealed non-telomeric TRF2 binding at the IL-1R1. The observed effects on NF-kB signaling and subsequent alterations in cytokine expression contribute significantly to our understanding of the complex interplay between telomeres and the tumor microenvironment. Furthermore, the study reports that the length of telomeres and IL-1R1 expression is associated with TAM enrichment. However, the manuscript lacks in-depth mechanistic insights into how telomere length affects IL-1R1 expression. Overall, this work broadens our understanding of telomere biology.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, entitled "Telomere length sensitive regulation of Interleukin Receptor 1 type 1 (IL1R1) by the shelterin protein TRF2 modulates immune signalling in the tumour microenvironment", Dr Mukherjee and colleagues aimed to clarify the extra-telomeric role of TRF2 in regulating IL1R1 expression, with a consequent impact on TAM tumor infiltration.

      Strengths:

      Upon careful evaluation of the manuscript, I conclude that the presented story is undoubtedly well conceived. At the technical level, experiments have been properly performed and the obtained results support the authors' conclusions well.

      Weaknesses:

      Unfortunately, the covered topic is not particularly novel. In detail, TRF2 capability of binding extratelomeric foci in cells with short telomeres has been well demonstrated in a previous work published by the same research group. The capability of TRF2 to regulate gene expression is well-known, the capability of TRF2 to interact with p300 has been already demonstrated and, finally, the capability of TRF2 to regulate TAMs infiltration (that is the effective novelty of the manuscript) appears as an obvious consequence of IL1R1 modulation (this is probably due to the current manuscript organization).

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This manuscript from Mukherjee et al examines potential connections between telomere length and tumor immune responses. This examination is based on the premise that telomeres and tumor immunity have each been shown to play separate, but important, roles in cancer progression and prognosis as well as prior correlative findings between telomere length and immunity. In keeping with a potential connection between telomere length and tumor immunity, the authors find that long telomere length is associated with reduced expression of the cytokine receptor IL1R1. Long telomere length is also associated with reduced TRF2 occupancy at the putative IL1R1 promoter. These observations lead the authors towards a model in which reduced telomere occupancy of TRF2 - due to telomere shortening - promotes IL1R1 transcription via recruitment of the p300 histone acetyltransferase. This model is based on earlier studies from this group (i.e. Mukherjee et al., 2019) which first proposed that telomere length can influence gene expression by enabling TRF2 binding and gene transactivation at telomere-distal sites. Further mechanistic work suggests that G-quadruplexes are important for TRF2 binding to IL1R1 promoter and that TRF2 acetylation is necessary for p300 recruitment. Complementary studies in human triple-negative breast cancer cells add potential clinical relevance but do not possess a direct connection to the proposed model. Overall, the article presents several interesting observations, but disconnection across central elements of the model and the marginal degree of the data leave open significant uncertainty regarding the conclusions.

      Strengths:

      Many of the key results are examined across multiple cell models.

      The authors propose a highly innovative model to explain their results.

      Weaknesses:

      Although the authors attempt to replicate most key results across multiple models, the results are often marginal or appear to lack statistical significance. For example, the reduction in IL1R1 protein levels observed in HT1080 cells that possess long telomeres relative to HT1080 short telomere cells appears to be modest (Supplementary Figure 1I). Associated changes in IL1R1 mRNA levels are similarly modest.

      Related to the point above, a lack of strong functional studies leaves an open question as to whether observed changes in IL1R1 expression across telomere short/long cancer cells are biologically meaningful.

      Statistical significance is described sporadically throughout the paper. Most major trends hold, but the statistical significance of the results is often unclear. For example, Figure 1A uses a statistical test to show statistically significant increases in TRF2 occupancy at the IL1R1 promoter in short telomere HT1080 relative to long telomere HT1080. However, similar experiments (i.e. Figure 2B, Figure 4A - D) lack statistical tests.

      TRF2 overexpression resulted in ~ 5-fold or more change in IL1R1 expression. Compared to this, telomere length-dependent alterations in IL1R1 expression, although about 2-fold, appear modest (~ 50% reduction in cells with long telomeres across different model systems used). Notably, this was consistent and significant across cell-based model systems and xenograft tumors (see Figure 1). Unlike TRF2 induction, telomere elongation or shortening varies within the permissible physiological limits of cells. This is likely to result in the observed variation in IL1R1 levels.

      For biological relevance, we have shown this using multiple models where telomere length was either different (patient tissue, organoids) or was altered (cell lines, xenograft models). IL1 signalling in TNBC tissue, tumor organoids, and cells/xenografts was shown to impact M2 macrophage infiltration in a telomere length-sensitive fashion. We made use of the tumor organoids to test M2 macrophage infiltration using IL1RA and small molecule-based IL1R1 inhibition.

      We have now included statistical tests in all the relevant figures and incorporated the necessary details about the tests performed in the figure legends for the clarity of readers. Additionally, all data points, p values and details of statistical tests have been included in figure-wise Excel sheets for both main and supplementary figures.

      Reviewer #1 (Recommendations For The Authors):

      There are typos throughout the manuscript. The word 'expression' is incorrectly spelled on y-axis labels throughout the manuscript (for example see Figure 1B). The word 'telomere' is incorrectly spelled in Supplementary Figure 1 legend panel A. Most errors, such as these, do not interfere with my comprehension of the manuscript. However, others made the manuscript difficult to follow. For example, I think that MDAMB231, MDAMD231, and MDAM231 are frequently used interchangeably to refer to the same cell line. This makes it very difficult to understand certain experiments.

      I often found it difficult to understand which statistical test was used for a specific experiment. I suggest changing the style in the legends to more clearly connect statistical tests with specific data points.

      We thank the reviewer for pointing out the typographical errors. We have now made relevant corrections to both figures and text.

      As stated above, we have now provided details of statistical tests performed in the figure legends for the clarity of readers. Additionally, all data points, p values and details of statistical tests have been included in figure-wise Excel sheets for both main and supplementary figures.

      Reviewer #2 (Public Review):

      This study highlights the role of telomeres in modulating IL-1 signaling and tumor immunity. The authors demonstrate a strong correlation between telomere length and IL-1 signaling by analyzing TNBC patient samples and tumor-derived organoids. Mechanistic insights revealed non-telomeric TRF2 binding at the IL-1R1. The observed effects on NF-kB signaling and subsequent alterations in cytokine expression contribute significantly to our understanding of the complex interplay between telomeres and the tumor microenvironment. Furthermore, the study reports that the length of telomeres and IL-1R1 expression is associated with TAM enrichment. However, the manuscript lacks in-depth mechanistic insights into how telomere length affects IL-1R1 expression. Overall, this work broadens our understanding of telomere biology.

      The mechanism of how telomere length affects IL1R1 expression involves sequestration and reallocation of TRF2 between telomeres and gene promoters (in this case, the IL1R1 promoter). We have previously shown this across multiple genomic sites (Mukherjee et al, 2018; reviewed in J. Biol. Chem. 2020, Trends in Genetics 2023). We have described this in the manuscript along with references citing the previous works. A scheme explaining the model was provided as Additional Supplementary Figure 1, along with a description of the mechanistic model.

      Figure 1-4 in main figures describe the molecular mechanism of telomere-dependent IL1R1 activation. This includes ChIP data for TRF2 on the IL1R1 promoter in long/short telomeres, as well as TRF2-mediated histone/p300 recruitment and IL1R1 gene expression. We further show how specific acetylation on TRF2 is crucial for TRF2-mediated IL1R1 regulation (Figure 5).

      Reviewer #2 (Recommendations For The Authors):

      The study primarily provides a snapshot of cytokine expression and telomere length at a single time point. Longitudinal studies or dynamic analyses could provide a more comprehensive understanding of the temporal relationship between telomere length and cytokine expression.

      Tumor heterogeneity is a significant problem for the various therapies. The study notes significant heterogeneity in telomere length but does not investigate the implications of this heterogeneity. Understanding the role of telomere length variation in different tumor cell populations is essential for a comprehensive interpretation of the results.

      The study only mentions a correlation between IL1R1 and relative telomere length but does not provide any potential clinical correlations with patient outcomes or survival. Addressing the clinical relevance of these molecular changes would improve the translational impact.

      The importance of IL1R1 in prognostic and clinical outcomes of TNBC has been studied by multiple groups. The overall consensus is that higher IL1R1 leads to poor prognosis, aiding both cancer progression and metastasis. Using publicly available TCGA data, we found that IL1R1 high samples had significantly lower survival in breast cancer (BRCA) datasets. The results have now been included in the manuscript as Supplementary Figure 7G.

      Addition in text:

      “We, next, used publicly available TCGA gene expression data of breast cancer samples (BRCA) (Supplementary file 4) to assess the effect of IL1R1 expression on cancer prognosis. We categorized samples based on IL1R1 expression: IL1R1 high (N=254) and IL1R1 low samples (N=709). It was seen that overall patient survival was significantly lower in IL1R1 high samples (Log-rank p value = 0.0149) (Supplementary Figure 7G). We also checked the frequency of occurrence of various breast cancer sub-types in IL1R1 high and low samples (Supplementary Figure 7H). While invasive mixed mucinous carcinoma (the most abundant sub-type) was predominantly seen in IL1R1 low samples, metaplastic breast cancer was only found within the IL1R1 high samples. Interestingly, metaplastic breast cancer has been frequently found to be ‘triple negative’, i.e., ER-, PR- and HER2- (Reddy et al., 2020).”

      However, we could not access a TNBC (or any breast cancer dataset) that has been characterized for telomere length. Unfortunately, the clinical TNBC samples that we had access to did not have any paired short-term/long-term survival datasets. We could, in principle, use TERT/TERC expression as a proxy for telomere length; however, in our experiments, we found that telomerase activity did not positively correlate with telomere length as expected (Supplementary Figure 7C, Supplementary Figure 8D). Therefore, transcriptional signature (of telomere-associated genes) may not be a reliable indicator of telomere length.

      The study lacks in-depth mechanistic insights into how telomere length affects IL1R1 expression and subsequently influences TAM infiltration. Further molecular studies or pathway analyses are necessary to elucidate the underlying mechanisms.

      The mechanism involves sequestration and reallocation of TRF2 between telomeres and gene promoters (in this case, IL1R1 promoter). We have previously shown this across multiple genomic sites (Mukherjee et al, 2018). We have appropriately discussed this in the manuscript.

      A schematic explaining the model has been provided as Additional Supplementary Figure 1.

      We have provided ChIP data for TRF2 on the IL1R1 promoter in long/short telomeres in the manuscript, as well as histone/p300 ChIP and gene expression (Figures 1-4 of the main figures deal exclusively with the molecular mechanism of telomere-dependent IL1R1 activation). We further go on to show how specific acetylation on TRF2 might be crucial for TRF2-mediated IL1R1 regulation (Figure 5). One of the key findings herein is the fact that TRF2 can directly regulate IL1R1 expression through promoter occupancy, tested in telomere-altered cell lines (HT1080, MDAMB231) and tumor xenografts (Figure 1 A, F, I for TRF2 promoter occupancy).

      Pathway analysis of HT1080 (short vs long telomere) transcriptome, shows that cytokine-cytokine receptor interaction is one of the key pathways in upregulated genes.

      While we have focused on TRF2 mediated IL1R1 regulation, it is quite possible that there are other telomere sensitive pathways/mechanisms by which IL1R1 is regulated. This has been duly acknowledged in the discussion.

      The manuscript title suggests modulation of immune signaling in the tumor microenvironment, yet the authors exclusively focus on CD206+ TAMs, limiting the scope. It is recommended to investigate other immune cell types for a more comprehensive understanding of changes in the immune tumor microenvironment.

      As stated above, we approached the manuscript from the purview of TRF2-mediated IL1R1 regulation. In our assessment of TCGA data for breast cancer, we found that CD206 (MRC1) had the highest enrichment in IL1R1 high samples among key TAM and TIL markers- now added as Figure 8A (Details in Supplementary file 5). It also had the highest correlation with IL1R1 among the tested markers. Therefore, we proceeded to check CD206+ve TAMs.

      Now the following section has been added to text:

      “We further found that the total proportion of immune cells (% of CD45 +ve cells) did not vary significantly between short and long telomere TNBC samples (Supplementary Figure 8C). However, TNBC-ST samples had a higher percentage of myeloid cells (CD11B +ve) within the CD 45 +ve immune cell population. We checked in three TNBC-ST and TNBC-LT samples each and found that the percentage of M1 macrophages (CD86 high CD 206 low) in the myeloid population was lower than that of the M2 macrophages (CD 206 high CD 86 low) and unlike the latter, did not vary significantly between the TNBC-ST and TNBC-LT samples (Supplementary Figure 8C).”

      Unfortunately, due to sample limitations we are unable to test this on a larger cohort of samples.

      A single cell transcriptome experiment may have been a good way to have a more comprehensive immune profiling. However, with our TNBC samples, isolated nuclei for downstream processing had low viability as per 10X genomics specifications.

      Does IL1R1 influence TAM recruitment or polarization within the tumor microenvironment? To assess the impact, the authors should use a marker indicative of M1-like macrophages, such as CD80 or CD86.

      To address the issue of TAM recruitment vs polarization meaningfully we need to characterize tissue resident macrophages as well as macrophages in circulation. We did not have access to patient blood.  A murine breast cancer in-vivo model might be a more appropriate model to test this, which would take considerable time for us to develop. It is something that we hope to address in a follow up study.

      Did the authors analyze other breast cancer subtypes for telomere length?

      Unfortunately, other breast cancer sub-types besides TNBC were not available to us for experimentation.

      Figure legends are very briefly written and need to be elaborated. Scale bars are also missing in images.

      Add a gating strategy for flow cytometry results in Figure 8A.

      Figure legends have been expanded for clarity. More prominent scale bars have been added for better visibility and reference. A relevant gating strategy has been added as Supplementary Figure 8B.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, entitled "Telomere length sensitive regulation of Interleukin Receptor 1 type 1 (IL1R1) by the shelterin protein TRF2 modulates immune signalling in the tumour microenvironment", Dr. Mukherjee and colleagues aim to clarify the extra-telomeric role of TRF2 in regulating IL1R1 expression, with a consequent impact on TAM tumor infiltration.

      Strengths:

      Upon careful manuscript evaluation, I feel that the presented story is undoubtedly well conceived. At the technical level, experiments have been properly performed and the obtained results support the authors' conclusions.

      Weaknesses:

      Unfortunately, the covered topic is not particularly novel. In detail, the TRF2 capability of binding extratelomeric foci in cells with short telomeres has been well demonstrated in a previous work published by the same research group. The capability of TRF2 to regulate gene expression is well-known, the capability of TRF2 to interact with p300 has been already demonstrated and, finally, the capability of TRF2 to regulate TAMs infiltration (that is the effective novelty of the manuscript) appears as an obvious consequence of IL1R1 modulation (this is probably due to the current manuscript organization).

      Here we studied the TRF2-IL1R1 regulatory axis (not reported earlier by us or others) as a case of the telomere sequestration model that we described earlier (Mukherjee et al., 2018; reviewed in J. Biol. Chem. 2020, Trends in Genetics 2023). This manuscript demonstrates the effect of the TRF2-IL1R1 regulation on telomere-sensitive tumor macrophage recruitment. To the best of our knowledge, no previous study connects telomeres of tumor cells mechanistically to the tumor immune microenvironment. Here we focused on the IL1R1 promoter and provided mechanistic evidence for acetylated-TRF2 engaging the HAT p300 for epigenetically altering the promoter. This mechanism of TRF2 mediated activation has not been previously reported. Further, the function of a specific post translational modification (acetylation of the lysine residue 293K) of TRF2 in IL1R1 regulation is described for the first time. Additional experiments showed that TRF2-acetylation mutants, when targeted to the IL1R1 promoter, significantly alter the transcriptional state of the IL1R1 promoter. To our knowledge, the function of any TRF2 residue in transcriptional activation had not been previously described. Taken together, these demonstrate novel insights into the mechanism of TRF2-mediated gene regulation, that is telomere-sensitive, and affects the tumor-immune microenvironment.

      We considered the reviewer’s suggestion to reorganize the result section. Reorganizing the manuscript to describe the TAM-related results first would, in our opinion, limit focus of the new findings and discovery [and novelty of the mechanisms (as described in above response, and in response to other comments by reviewers)] of the non-telomeric TRF2-mediated IL1R1 regulation. We have tried to bring out the novelty, implications and importance of the TAM-related observations in the discussion.

      Reviewer #3 (Recommendations For The Authors):

      Based on the comments reported above, I would encourage the author to modify the manuscript by reorganizing the text. I would suggest starting from the capability of TRF2 to modulate macrophages infiltration. Data relative to IL1R1 expression may be used to explain the mechanism through which TRF2 exerts its immune-modulatory role. This, in my view, would dramatically strengthen the presented story.

      Concerning the text, "results" should be dramatically streamlined and background information should be just limited to the "introduction" section.

      The manuscript should be carefully revisited at grammar level. A number of incomplete sentences and some typos are present within the text.

      We thank the reviewer for the appreciation of our work for its technical strengths.

      At the onset, we agree that we have explored the TRF2-IL1R1 regulatory axis. This underscores the significance of the telomere sequestration model that we had proposed earlier (Mukherjee et al., 2018). Herein, however, we significantly extend our previous work (which was more general and intended for putting forward the idea of telomere-dependent distal gene expression) by studying TRF2-mediated regulation of IL1 signalling (which was previously unreported). In addition, mechanistic details of how telomeres are connected to IL1 signaling through non-telomeric TRF2 are entirely new, not reported before by us or others.

      We have removed some text descriptions from the result section to streamline the section.

    1. eLife Assessment

      This study presents a valuable finding on how the sensorimotor control system deals with redundancy within our body, based on a novel bimanual task. The evidence supporting the authors' claims is convincing, as demonstrated over four different experiments. The work will be of interest to researchers from the motor control community and related fields, and further investigation into the interpretation of the findings could increase the generalisation of the study to a broader audience.

    2. Reviewer #1 (Public Review):

      Summary/Strengths:

      This manuscript describes a stimulating contribution to the field of human motor control. The complexity of control and learning is studied with a new task offering a myriad of possible coordination patterns. Findings are original and exemplify how baseline relationships determine learning.

      Weaknesses:

      A new task is presented: it is a thoughtful one, but because it is a new one, the manuscript section is filled with relatively new terms and acronyms that are not necessarily easy to rapidly understand.

      First, some more thoughts may be devoted to the take-home message. In the title, I am not sure manipulating a stick with both hands is a key piece of information. Also, the authors appear to insist on the term 'implicit', and I wonder if it is a big deal in this manuscript and if all the necessary evidence appears in this study that control and adaptation are exclusively implicit. As there is no clear comparison between gradual and abrupt sessions, the authors may consider removing at least from the title and abstract the words 'implicit' and 'implicitly'. Most importantly, the authors may consider modifying the last sentence of the abstract to clearly provide the most substantial theoretical advance from this study.

      It seems that a substantial finding is the 'constraint' imposed by baseline control laws on sensorimotor adaptation. This seems to echo and extend previous work of Wu, Smith et al. (Nat Neurosci, 2014): their findings, which were not necessarily always replicated, suggested that the more participants were variable in baseline, the better they adapted to a systematic perturbation. The authors may study whether residual errors are smaller or adaptation is faster for individuals with larger motor variability in baseline. Unfortunately, the authors do not present the classic time course of sensorimotor adaptation in any experiment. The adaptation is not described as typically done: the authors should thus show the changes in tip movement direction and stick-tilt angle across trials, and highlight any significant difference between baseline, early adaptation, and late adaptation, for instance. I also wonder why the authors did not include a few no-perturbation trials after the exposure phase to study after-effects in the study design: it looks like a missed opportunity here. Overall, I think that showing the time course of adaptation is necessary for the present study to provide a more comprehensive understanding of that new task, and to re-explore the role of motor variability during baseline for sensorimotor adaptation.

      The distance between hands was fixed at 15 cm with the Kinarm instead of a mechanical constraint. I wonder how much this distance varied and more importantly whether from that analysis or a force analysis, the authors could determine whether one hand led the other one in the adaptation.

      I understand the distinction between task- and end-effector irrelevant perturbation, and at the same time results show that the nervous system reacts to both types of perturbation, indicating that they both seem relevant or important. In line 32, the errors mentioned at the end of the sentence suggest that adaptation is in fact maladaptive. I think the authors may extend the Discussion on why adaptation was found in the experiments with end-effector irrelevant and especially how an internal (forward) model or a pair of internal (forward) models may be used to predict both the visual and the somatosensory consequences of the motor commands.

    3. Reviewer #2 (Public review):

      Summary:

      The authors have developed a novel bimanual task that allows them to study how the sensorimotor control system deals with redundancy within our body. Specifically, the two hands control two robot handles that control the position and orientation of a virtual stick, where the end of the stick is moved into a target. This task has infinite solutions to any movement, where the two hands influence both tip-movement direction and stick-tilt angle. When moving to different targets in the baseline phase, participants change the tilt angle of the stick in a specific pattern that produces close to minimum movement of the two hands to produce the task. In a series of experiments, the authors then apply perturbations to the stick angle and stick movement direction to examine how either tip-movement (task-relevant) or stick-angle (task-irrelevant) perturbations affect adaptation. Both types of perturbations affect adaptation, but this adaptation follows the baseline pattern of tip-movement and stick angle relation such that even task-irrelevant perturbations drive adaptation in a manner that results in task-relevant errors. Overall, the authors suggest that these baseline relations affect how we adapt to changes in our tasks. This work provides an important demonstration that underlying solutions/relations can affect the manner in which we adapt. I think one major contribution of this work will also be the task itself, which provides a very fruitful and important framework for studying more complex motor control tasks.

      Strengths:

      Overall, I find this a very interesting and well-written paper. Beyond providing a new motor task that could be influential in the field, I think it also contributes to studying a very important question - how we can solve redundancy in the sensorimotor control system, as there are many possible mechanisms or methods that could be used - each of which produces different solutions and might affect the manner in which we adapt.

      Weaknesses:

      The visual perturbations were only provided while reaching to one target, which limits the amount of exploration of the environment that the participants experience. Overall, I would find the results even more compelling if the same perturbations applied to movements to more (or all) of the targets produced similar adaptation profiles. The question is to what degree the results derive from only providing a small subset of the environment to explore.

    4. Reviewer #3 (Public review):

      Summary:

      This study investigated motor system adaptation to new environments through modifications in redundant body movements. Utilizing a novel bimanual stick-manipulation task, participants controlled a virtual stick to reach targets, focusing on how tip-movement direction perturbations affected tip movement and stick-tilt adaptation. The findings revealed a consistent strategy among participants who flexibly adjusted the tilt angle of the stick in response to errors. The adaptation patterns were influenced by physical space relationships, which guided the motor system's selection of movement patterns. This study underscores the motor system's adaptability through changes in redundant body movement patterns.

      Strengths:

      This study introduces an innovative bimanual stick manipulation task to explore motor system adaptation to novel environments through alterations in redundant body movement patterns. It also expands the use of endpoint robots in motor control studies.

      Weaknesses:

      The generalizability of the findings is limited. Future work may strengthen the present study's findings by examining whether the observed relationships hold for different stick lengths (i.e., varying hand positions along the virtual stick) or when reaching targets to the left and right of the starting position, not just at varying angles along one side. Additionally, a more comprehensive review of the existing literature on redundant systems, rather than primarily focusing on the lack of redundancy in endpoint-reaching tasks, would have strengthened this study. While the novel task expands the use of endpoint robots in motor control studies, its utility in exploring broader aspects of motor control and learning may be constrained.

    5. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank all the reviewers for their positive evaluation of our paper, as described in the Strengths section. We are also grateful for their helpful comments and suggestions, which we have addressed below. We believe that the manuscript has been significantly improved as a result of these suggestions. In addition to these changes, we also corrected some inconsistencies (statistical values in the last sentence of a Figure 5 caption) and sentences in the main text (lines 155, 452, 522) (these corrections did not affect the results).

      Fig. 5e: R=0.599, P<0.001 -> R=0.601, P=0.007

      L150: "the angle of stick tilt angle" -> "the angle of stick tilt"

      L437: "no such" -> "such"

      L522: "?" -> "."

      Reviewer #1 (Public Review):

      Summary/Strengths:

      This manuscript describes a stimulating contribution to the field of human motor control. The complexity of control and learning is studied with a new task offering a myriad of possible coordination patterns. Findings are original and exemplify how baseline relationships determine learning.

      Weaknesses:

      A new task is presented: it is a thoughtful one, but because it is a new one, the manuscript section is filled with relatively new terms and acronyms that are not necessarily easy to rapidly understand.

      First, some more thoughts may be devoted to the take-home message. In the title, I am not sure manipulating a stick with both hands is a key piece of information. Also, the authors appear to insist on the term ‘implicit’, and I wonder if it is a big deal in this manuscript and if all the necessary evidence appears in this study that control and adaptation are exclusively implicit. As there is no clear comparison between gradual and abrupt sessions, the authors may consider removing at least from the title and abstract the words ‘implicit’ and ‘implicitly’. Most importantly, the authors may consider modifying the last sentence of the abstract to clearly provide the most substantial theoretical advance from this study.

      Thank you for your positive comment on our paper. We agree with the reviewer that our paper used a lot of acronyms that might confuse the readers. As we have addressed below (in the rebuttal to the Results section), we have reduced the number of acronyms.

      Regarding the comment on the use of the word “implicit” in the title and the abstract, we believe that its use in this paper is very important and indispensable. One of our main findings was that the pattern of adaptation between the tip-movement direction and the stick-tilt angle largely followed that in the baseline condition when aiming at different target directions. This adaptation was largely implicit because participants were not aware of the presence of the perturbation as the amount of perturbation was gradually increased. This implicitness suggests that the adaptation pattern of how the movement should be corrected is embedded in the motor learning system. On the other hand, if this adaptation pattern was achieved on the basis of the explicit strategy of changing the direction of the tip-movement, the adaptation pattern that follows the baseline pattern is not at all surprising. For these reasons, we will continue to use the word "implicit".

      It seems that a substantial finding is the ‘constraint’ imposed by baseline control laws on sensorimotor adaptation. This seems to echo and extend previous work of Wu, Smith et al. (Nat Neurosci, 2014): their findings, which were not necessarily always replicated, suggested that the more participants were variable in baseline, the better they adapted to a systematic perturbation. The authors may study whether residual errors are smaller or adaptation is faster for individuals with larger motor variability in baseline. Unfortunately, the authors do not present the classic time course of sensorimotor adaptation in any experiment. The adaptation is not described as typically done: the authors should thus show the changes in tip movement direction and stick-tilt angle across trials, and highlight any significant difference between baseline, early adaptation, and late adaptation, for instance. I also wonder why the authors did not include a few no-perturbation trials after the exposure phase to study after-effects in the study design: it looks like a missed opportunity here. Overall, I think that showing the time course of adaptation is necessary for the present study to provide a more comprehensive understanding of that new task, and to re-explore the role of motor variability during baseline for sensorimotor adaptation.

      We appreciate the reviewer for raising these important issues.

      Regarding the learning curve, because the amount of perturbation was gradually increased except for Exp.1B, we were not able to obtain typical learning curves (i.e., the curve showing errors decaying exponentially with trials). However, it may still be useful to show how the movement changed with trials during adaptation. Therefore, following the reviewer's suggestion, we have added the figures of the time course of adaptation in the supplementary data (Figures S1, S2, S4, and S5).

      There are two reasons why our experiments did not include aftereffect quantification trials (i.e., probe trials). First, in the case of adaptation to a visual perturbation (e.g., visual rotation), probe trials are not necessary because the degree of adaptation can be easily quantified by the amount of compensation in the perturbation trials (however, in the case of dynamic perturbations such as force fields, the use of probe trials is necessary). Second, the inclusion of probe trials allows participants to be aware of the presence of the perturbation, which we would like to avoid.

      We also appreciate the interesting additional questions regarding the relevance of our work to the relationship between baseline motor variability and adaptation performance. As this topic, although interesting, is outside the scope of this paper, we concluded that we would not address it in the manuscript. In fact, the experiments were not ideal for quantifying motor variability in the baseline phase because participants had to aim at different targets, which could change the characteristics of motor variability. In addition, we gradually increased the size of the perturbation except for Exp.1B (see Author response image 1, upper panel), which could make it difficult to assess the speed of adaptation. Nevertheless, we think it is worth mentioning this point in this rebuttal. Specifically, we examined the correlation between baseline motor variability when aiming at the 0 deg target (tip-movement direction or stick-tilt angle) and adaptation speed in Exp 1A and Exp 1B (Author response image 1 and Author response image 2). To assess adaptation speed in Exp.1A, we quantified the slope of the tip-movement direction to a gradually increasing perturbation (Author response image 1, upper panel). The adaptation speed in Exp.1B was obtained by fitting the exponential function to the data (Author response image 2, upper panel). Although the statistical results were not completely consistent, we found that participants with greater motor variability at baseline tended to show faster adaptation, as shown in a previous study (Wu et al., Nat Neurosci, 2014).

      Author response image 1.

      Correlation between the baseline variability and learning speed (Experiment 1A). In Exp 1A, the rotation of the tip-movement direction was gradually increased by 1 degree per trial up to 30 degrees. The learning speed was quantified by calculating how quickly the direction of movement followed the perturbation (upper panel). The lower left panel shows the variability of the tip-movement direction versus learning speed, while the lower right panel shows the variability of the stick-tilt angle versus learning speed. Baseline variability was calculated as a standard deviation across trials (trials in which a target appeared in a 0-degree direction).
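The quantities described in this caption can be sketched in a few lines of pure Python. This is an illustration under stated assumptions, not the authors' code: the helper names are hypothetical, the sample (n - 1) form of the standard deviation is assumed because the response does not say which form was used, and learning speed is taken to be the least-squares slope of the movement direction regressed on the ramped perturbation.

```python
import math


def sample_sd(values):
    """Sample standard deviation (n - 1 denominator; an assumption)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Perturbation ramp as described for Exp 1A: 1 degree per trial up to 30 degrees.
perturbation_deg = list(range(1, 31))
```

For each participant, `sample_sd` would be applied to the baseline trials toward the 0-degree target and `ols_slope(perturbation_deg, direction_deg)` to the adaptation trials, where `direction_deg` is a hypothetical per-trial list of compensatory movement directions. `pearson_r` then relates the two measures across participants; under this definition a slope of 1 would mean the movement tracked the perturbation perfectly.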

      Author response image 2.

      Correlation between the baseline variability and learning speed (Experiment 1B). In Exp 1B, the rotation of the tip-movement direction was abruptly applied from the first trial (30 degrees). The learning speed was calculated as a time constant obtained by exponential curve fitting. The lower left panel shows the variability of the tip-movement direction versus learning speed, while the lower right panel shows the variability of the stick-tilt angle versus learning speed. Baseline variability was calculated as a standard deviation across trials (trials in which a target appeared in a 0-degree direction).
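A time constant of this kind could be estimated as in the sketch below. The response does not give the functional form or the fitting routine, so the model here is an assumption: the error on trial n is taken as A * exp(-n / tau) with a zero asymptote, which allows a simple linear fit on the logarithm of the error. The function name `fit_time_constant` is hypothetical.

```python
import math


def fit_time_constant(errors):
    """Estimate (A, tau) in errors[n] ~ A * exp(-n / tau) by a log-linear fit.

    Assumes strictly positive errors that decay toward zero; a nonlinear
    fit with a free asymptote would be needed if that does not hold.
    """
    if any(e <= 0 for e in errors):
        raise ValueError("log-linear fit requires strictly positive errors")
    trials = list(range(len(errors)))
    logs = [math.log(e) for e in errors]

    mean_n = sum(trials) / len(trials)
    mean_log = sum(logs) / len(logs)
    sxx = sum((n - mean_n) ** 2 for n in trials)
    sxy = sum((n - mean_n) * (v - mean_log) for n, v in zip(trials, logs))
    slope = sxy / sxx  # equals -1 / tau for a decaying exponential

    if slope >= 0:
        raise ValueError("errors do not decay; time constant is undefined")
    amplitude = math.exp(mean_log - slope * mean_n)
    return amplitude, -1.0 / slope
```

A smaller `tau` means faster adaptation, so the sign of any correlation with baseline variability is reversed relative to a slope-based speed measure.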

      The distance between hands was fixed at 15 cm with the Kinarm instead of a mechanical constraint. I wonder how much this distance varied and more importantly whether from that analysis or a force analysis, the authors could determine whether one hand led the other one in the adaptation.

      Thank you very much for this important comment. Since the distance between the two hands was maintained by the stiff virtual spring (2000 N/m), it was kept almost constant throughout the experiments as shown in Author response image 3 (the averaged distance during a movement). The distance was also maintained during reaching movements (Author response image 4).
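To make the virtual-spring constraint concrete, the sketch below computes the restoring force on each handle for a given pair of hand positions. Only the stiffness (2000 N/m) and the 15 cm separation come from the text; the undamped linear spring model, the 2-D coordinates in metres, and the function name `spring_forces` are assumptions, and the actual Kinarm implementation is not described here.

```python
import math

STIFFNESS_N_PER_M = 2000.0  # virtual spring stiffness stated in the response
REST_LENGTH_M = 0.15        # target distance between the hands (15 cm)


def spring_forces(left, right, k=STIFFNESS_N_PER_M, rest=REST_LENGTH_M):
    """Return (force_on_left, force_on_right) as 2-D vectors in newtons.

    left, right: (x, y) hand positions in metres.
    Assumes an ideal linear spring without damping (an illustration only).
    """
    dx = right[0] - left[0]
    dy = right[1] - left[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        raise ValueError("hands coincide; spring direction is undefined")
    ux, uy = dx / dist, dy / dist

    # Positive when the hands are farther apart than the rest length.
    magnitude = k * (dist - rest)
    # A stretched spring pulls each hand toward the other; a compressed one pushes apart.
    force_on_left = (magnitude * ux, magnitude * uy)
    force_on_right = (-magnitude * ux, -magnitude * uy)
    return force_on_left, force_on_right
```

Under this model a 1 cm deviation from the target separation produces a restoring force of about 20 N, and the two forces are equal and opposite, which is in line with the authors' remark that force produced at one handle is transmitted to the other.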

      We also thank the reviewer for the suggestion regarding the force analysis. As shown in Author response image 5, we did not find a role for a specific hand for motor adaptation from the handle force data. Specifically, Author response image 5 shows the force applied to each handle along and orthogonal to the stick. If one hand led the other in adaptation, we should have observed a phase shift as adaptation progressed. However, no such hand specific phase shift was observed. It should be noted, however, that it was theoretically difficult to know from the force sensors which hand produced the force first, because the force exerted by the right handle was transmitted to the left handle and vice versa due to the connection by the stiff spring. 

      Author response image 3.

      The distance between hands during the task. We show the average distance between hands for each trial. The shaded area indicates the standard deviation across participants.

      Author response image 4.

      Time course of the distance between hands during the movement. Colors indicate the trial epoch shown in the legend on the right.

      Author response image 5.

      The force profile during the movement (Exp 1A). We decomposed the force of each handle into the component along (upper panels) and orthogonal to the stick (lower panels). Changes in the force profiles in the adaptation phase are shown (left: left hand force, right: right hand force). The colors (magenta to cyan) indicate the trial epoch shown in the legend on the right.

      I understand the distinction between task- and end-effector irrelevant perturbation, and at the same time results show that the nervous system reacts to both types of perturbation, indicating that they both seem relevant or important. In line 32, the errors mentioned at the end of the sentence suggest that adaptation is in fact maladaptive. I think the authors may extend the Discussion on why adaptation was found in the experiments with end-effector irrelevant and especially how an internal (forward) model or a pair of internal (forward) models may be used to predict both the visual and the somatosensory consequences of the motor commands.

      Thank you very much for your comment. As we already described in the discussion of the original manuscript (Lines 519-538 in the revised manuscript), two potential explanations exist for the motor system’s response to the end-effector irrelevant perturbation (i.e., stick rotation). First, the motor system predicts the sensory information associated with the action and attempts to correct any discrepancies between the prediction and the actual sensory consequences, regardless of whether the error information is end-effector relevant or end-effector irrelevant. Second, given the close coupling between the tip-movement direction and stick-tilt angle, the motor system can estimate the presence of end-effector relevant error (i.e., tip-movement direction) by the presence of end-effector irrelevant error (i.e., stick-tilt angle). This estimation should lead to the change in the tip-movement direction. As the reviewer pointed out, the mismatch between visual and proprioceptive information is another possibility; we have added a description of this point to the Discussion (Lines 523-526).

      Reviewer #1 (Recommendations For The Authors):

      Minor

      Line 16: “it remains poorly understood” is quite subjective and I would suggest reformulating this statement.

      We have reformulated this statement as “This limitation prevents the study of how….”  (Line 16).

      Introduction

      Line 49: the authors may be more specific than just saying ‘this task’. In particular, they need to clarify that there is no redundancy in studies where the shoulder is fixed and all movement is limited to a plane ... which turns out to truly happen in a limited set of experimental setups (for example: Kinarm exoskeleton, but not endpoint; Kinereach system...).

      We have changed this to “such a planar arm-reaching task” (Line 49).

      Line 61: large, not infinite because of biomechanical constraints.

      We have changed “an infinite” to “a large” (Line 61) and “infinite” to “a large number of” (legend in Fig. 1f).

      Lines 67-69: consider clarifying.

      We have tried to clarify the sentence (Lines 67-69).

      Results

      TMD and STA, and TMD-STA plane, are new terms with new acronyms that are not easy to immediately understand. Consider avoiding acronyms.

      We have reduced the use of these acronyms as much as possible. 

      “visual TMD–STA plane” -> “plane representing visual movement patterns” (Lines 179-180)

      “TMD axis” -> “x-axis” (Line 181, Line 190)

      “physical TMD–STA plane” -> “plane representing physical movement patterns” (Lines 182-187)

      “physical TMD–STA plane” -> “physical plane” (Line 191, Line 201, Lines 216-217, Line 254, Line 301, Line 315, Line 422, Line 511, and captions of Figures 4-9, S3)

      “visual TMD–STA plane” -> “visual plane” (Line 193, Line 241, Line 248, Line 300, Lines 313-314, and captions of Figures 4-9, S3)

      “STA axis” -> “y-axis” (Line 241)

      Line 169: please clarify the mismatch(es) that are created when the tip-movement direction is visually rotated in the CCW direction around the starting position (tip perturbation), whereas the stick-tilt angle remains unchanged.

      Thank you for pointing this out. We have clarified that the stick-tilt angle remains identical to the tilt of both hands (Lines 171-172).

      Discussion

      I understand the physical constraint imposed between the 2 hands with the robotic device, but I am not sure I understand the physical constraint imposed by the TMD-STA relationship.

      The phrase “physical constraint” meant the constraint of the movement on the physical space. However, as the reviewer pointed out, this phrase could confuse the constraint between the two hands. Therefore, we have avoided using the phrase “physical constraint” throughout the manuscript.

      Some work looking at 3-D movements should be used for Discussion (e.g. Lacquaniti & Soechting 1982; work by d’Avella A or Jarrasse N).

      Thank you for sharing this important information. We have cited these studies in Discussion (Lines 380-382). 

      Reviewer #2 (Public Review):

      Summary:

      The authors have developed a novel bimanual task that allows them to study how the sensorimotor control system deals with redundancy within our body. Specifically, the two hands control two robot handles that control the position and orientation of a virtual stick, where the end of the stick is moved into a target. This task has infinite solutions to any movement, where the two hands influence both tip-movement direction and stick-tilt angle. When moving to different targets in the baseline phase, participants change the tilt angle of the stick in a specific pattern that produces close to the minimum movement of the two hands to produce the task. In a series of experiments, the authors then apply perturbations to the stick angle and stick movement direction to examine how either tip-movement (task-relevant) or stick-angle (task-irrelevant) perturbations affect adaptation. Both types of perturbations affect adaptation, but this adaptation follows the baseline pattern of tip-movement and stick angle relation such that even task-irrelevant perturbations drive adaptation in a manner that results in task-relevant errors. Overall, the authors suggest that these baseline relations affect how we adapt to changes in our tasks. This work provides an important demonstration that underlying solutions/relations can affect the manner in which we adapt. I think one major contribution of this work will also be the task itself, which provides a very fruitful and important framework for studying more complex motor control tasks.

      Strengths:

      Overall, I find this a very interesting and well-written paper. Beyond providing a new motor task that could be influential in the field, I think it also contributes to studying a very important question - how we can solve redundancy in the sensorimotor control system, as there are many possible mechanisms or methods that could be used - each of which produces different solutions and might affect the manner in which we adapt.

      Weaknesses:

      I would like to see further discussion of what the particular chosen solution implies in terms of optimality.

      The underlying baseline strategy used by the participants appears to match the path of minimum movement of the two hands. This suggests that participants are simultaneously optimizing accuracy and minimizing some metabolic cost or effort to solve the redundancy problem. However, once the perturbations are applied, participants still use this strategy for driving adaptation. I assume that this means that the solution that participants end up with after adaptation actually produces larger movements of the two hands than required. That is - they no longer fall onto the minimum hand movement strategy - which was used to solve the problem. Can the authors demonstrate that this is either the case or not clearly? These two possibilities produce very different implications in terms of the results.

      If my interpretation is correct, such a result (using a previously found solution that no longer is optimal) reminds me of the work of Selinger et al., 2015 (Current Biology), where participants continue to walk at a non-optimal speed after perturbations unless they get trained on multiple conditions to learn the new landscape of solutions. Perhaps the authors could discuss their work within this kind of interpretation. Do the authors predict that this relation would change with extensive practice either within the current conditions or with further exploration of the new task landscape? For example, if more than one target was used in the adaptation phase of the experiment?

      On the other hand, if the adaptation follows the solution of minimum hand movement and therefore potentially effort, this provides a completely different interpretation.

      Overall, I would find the results even more compelling if the same perturbations applied to movements to all of the targets and produced similar adaptation profiles. The question is to what degree the results derive from only providing a small subset of the environment to explore.

      Thank you very much for pointing out this significant issue. As the reviewer correctly interprets, the physical movement patterns deviated from the baseline relationship as exemplified in Exp.2. However, this deviation is not surprising for the following reason. Under the perturbation that creates the dissociation between the hands and the stick, the motor system cannot simultaneously return both the visual stick motion and physical hands motion to the original motions: When the motor system tries to return the visual stick motion to the original visual motion, then the physical hands motion inevitably deviates from the original physical hands motion, and vice versa.  

      Our interpretation of this result is that the motor system corrects the movement to reduce the visual dissociation of the visual stick motion from the baseline motion (i.e., sensory prediction error), but this movement correction is biased by the baseline physical hands motion. In other words, the motor system attempts to balance the minimization of sensory prediction error and the minimization of motor cost. Thus, our results do not indicate that the final adaptation pattern is non-optimal, but rather reflect the attempts for optimization.

      In the revised manuscript, we have added the description of this interpretation (Lines 515-517).

      Reviewer #2 (Recommendations For The Authors):

      The authors have suggested that the only study (line 472) that has also examined an end-effector irrelevant perturbation is the bimanual study of Omrani et al., 2013, which only examined reflex activity rather than adaptation. To clarify this issue - exactly what is considered end-effector irrelevant perturbations - I was wondering about the bimanual perturbations in Dimitriou et al., 2012 (J Neurophysiol) and the simultaneous equal perturbations in Franklin et al., 2016 (J Neurosci), as well as other recent papers studying task-irrelevant disturbances which aren’t discussed. I would consider these both to also be end-effector irrelevant perturbations, although again they only used these to study reflex activity and not adaptation as in the current paper. Regardless, further explanation of exactly what is the difference between task-irrelevant and end-effector irrelevant would be useful to clarify the exact difference between the current manuscript and previous work.

      Thank you for your helpful comments. We have included as references the study by Dimitriou et al. (Line 490) and Franklin et al. (Lines 486-487), which use an end-effector irrelevant perturbation and the task-irrelevant perturbation condition, respectively. We have also added further explanation of what is the difference between task-irrelevant and end-effector irrelevant (Lines 344-352). 

      Line 575: I assume that you mean peak movement speed

      We have added “peak” (Line 597).

      Reviewer #3 (Public Review):

      Summary:

      This study explored how the motor system adapts to new environments by modifying redundant body movements. Using a novel bimanual stick manipulation task, participants manipulated a virtual stick to reach targets, focusing on how tip-movement direction perturbations affected both tip movement and stick-tilt adaptation. The findings indicated a consistent strategy among participants who flexibly adjusted the tilt angle of the stick in response to errors. The adaptation patterns are influenced by physical space relationships, guiding the motor system’s choice of movement patterns. Overall, this study highlights the adaptability of the motor system through changes in redundant body movement patterns.

      Strengths:

      This paper introduces a novel bimanual stick manipulation task to investigate how the motor system adapts to novel environments by altering the movement patterns of our redundant body.

      Weaknesses:

      The generalizability of the findings is quite limited. It would have been interesting to see if the same relationships were held for different stick lengths (i.e., the hands positioned at different start locations along the virtual stick) or when reaching targets to the left and right of a start position, not just at varying angles along one side. Alternatively, this study would have benefited from a more thorough investigation of the existing literature on redundant systems instead of primarily focusing on the lack of redundancy in endpoint-reaching tasks. Although the novel task expands the use of endpoint robots in motor control studies, the utility of this task for exploring motor control and learning may be limited.

      Thank you very much for the important comment. Given that there are many parameters (e.g., stick length, locations of hands, target position etc), one may wonder how the findings obtained from only one combination can be generalized to other configurations. In the revised manuscript, we have explicitly described this point (Lines 356-359). 

      Thus, the generalizability needs to be investigated in future studies, but we believe that the main results also apply to other configurations. Regarding the baseline stick movement pattern, the control with tilting the stick was observed regardless of the stick-tip positions (Author response image 6). Regarding the finding that the adapted stick movement patterns follow the baseline movement patterns, we confirmed the same results even when the other targets were used as the target for the adaptation (Author response image 7). 

      Author response image 6.

      Stick-tip manipulation patterns when the length of the stick varied. Top: 10 naïve participants moved the stick with different lengths. A target appeared on one of five directions represented by a color of each tip position. Regardless of the length of the stick and laterality, a similar relationship between tip-movement direction and stick-tilt angle was observed. (middle: at peak velocity, bottom: at movement offset).

      Author response image 7.

      Patterns of adaptation when using the other targets. In the baseline phase, 40 naïve participants moved a stick tip to a peripheral target (24 directions). They showed a stereotypical relationship between the tip-movement direction and the stick-tilt angle (a bold gray curve). In the adaptation phase, participants were divided into four groups, each with a different target training direction (lower left, lower right, upper right, or upper left), and visual rotation was gradually imposed on the tip-movement direction. Irrespective of the target direction, the adaptation pattern of the tip-movement and stick-tilt followed the baseline relationship.

      We also thank you for your comment about studying the existing redundant systems. We can understand the reviewer's concern about the usefulness of our task, but we believe that we have proposed a novel framework for motor adaptation in the redundant system. Future studies will be able to clarify how the knowledge gained from our task can be generally applied to understand the control and learning of the redundant system.

      Reviewer #3 (Recommendations For The Authors):

      Line 49: replace “uniquely” with primarily. A number of features of the task setup could affect the joint angles, from if/how the arm is supported, whether the wrist is fixed, alignment of the target in relation to the midline of the participant, duration of the task, and whether fatigue is an issue, etc. Your statement relates to fixed limb lengths of a participant, rather than standard reaching tasks as a whole. Not to mention the degree of inter- and intra-subject variability that does exist in point-to-point reaching tasks.

      Thank you for your helpful point. We have replaced “uniquely” with “primarily”. (Line 49).

      Line 72: the cursor is not an end-effector - it represents the end-effector.

      We have changed the expression to “the perturbation to the cursor representing the position of the end-effector” (Line 72).

      Lines 73 – 78: it would benefit the authors to consider the role of intersegmental dynamics.

      Thank you for your suggestion. We are not sure if we understand this suggestion correctly, but we interpret it to mean that the end-effector perturbation can be implemented by using a perturbation that considers the intersegmental dynamics. However, the implementation is not so straightforward, and the panels in Figure 1j,k are only conceptual for the end-effector irrelevant perturbation. Therefore, we have not described the contribution of intersegmental dynamics here.

      Lines 90 – 92: “cannot” should be “did not”, as the studies being referenced are already completed. This statement should be further unpacked to explain what they did do, and how that does not meet the requirement of redundancy in movement patterns.

      We have changed “cannot” to “did not” (Line 91). We have also added the description of what the previous studies had demonstrated (Line 88-90).

      Figure text could be enlarged for easier viewing.

      We have enlarged texts in all figures. 

      Lines 41 - 47: Interesting selection of supporting references. For the introduction of a novel environment, I would recommend adding the support of Shadmehr and Mussa-Ivaldi 1994.

      Thank you for your suggestion. We have added Shadmehr and Mussa-Ivaldi 1994 as a reference (Line 45).

      Line 49: “this task” is vague - the above references relate to a number of different tasks. For example, the authors could replace it with a reaching task involving an end-point robot.

      Thank you very much for your suggestion. As per the suggestion by Reviewer #1, we have changed this to “such a planar arm-reaching task” (Line 49).

      Line 60: “hypothetical limb with three joints” - in Figure 1a, the human subject, holding the handle of a robotic manipulandum does have flexibility around the wrist.

      Previous studies using planar arm-reaching tasks have constrained the wrist joint (e.g., Flash & Hogan, 1985; Gordon et al., 1994; Nozaki et al., 2006). We tried to emphasize this point as “participants manipulate a visual cursor with their hands primarily by moving their shoulder and elbow joints” (Line 42). In the revised manuscript, we have also emphasized this point in the legend of Figure 1a.

      Lines 93-108: this paragraph could be cleaned up more clearly stating that while the use of task-irrelevant perturbations has been used in the domain of reaching tasks, the focus of these tasks has not been specifically to address “In our task, we aim to exploit this feature by doing”

      Thank you very much for your helpful comments. To make this paragraph clear, we have modified some sentences (Line 100-104).

      Line 109: “coordinates to adapt” is redundant.

      We have changed this to “adapts” (Line 110).

      Lines 109-112: these sentences could be combined to have better flow.

      Thank you very much for your valuable suggestion. We have combined these two sentences for better flow (Line 110-112).

      Line 113-114: consider rewording - “This is a redundant task because ...” to something like “Redundancy in the task is achieved by acknowledging that ....“.

      We have changed the expression according to the reviewer’s suggestion (Line 114).

      Line 118: Consider changing “changes” to “makes use of”.

      We have changed the expression (Line 119).

      Lines 346 - 348: grammar and clarity - “This redundant motor task enables the investigation of adaptation patterns in the redundant system following the introduction of perturbations that are either end-effector relevant, end-effector irrelevant, or both.“.

      Thank you very much again for your helpful suggestion of English expression. We have adopted the sentence you suggested (Line 354-356).

    1. eLife Assessment

      The contributions of ipsilateral cortical pathways to motor control are not yet fully understood. Here, the authors present important insights into their role in locomotion following unilateral spinal cord injury. Their data provide convincing evidence in rats that stimulation of ipsilateral motor cortex improves the injured side's ability to support weight and leads to improved locomotion, a result that may inspire new treatments for spinal or cerebral injuries.

    2. Reviewer #2 (Public review):

      Summary:

      The authors' long-term goals are to understand the utility of precisely phased cortex stimulation regimes on recovery of function after spinal cord injury (SCI). In prior work the authors explored effects of contralesion cortex stimulation. Here, they explore ipsilesion cortex stimulation in which the ipsilesion corticospinal fibers that cross at the pyramidal decussation are spared. The authors explore the effects of such stimulation in intact rats and rats with a hemisection lesion at thoracic level ipsilateral to the stimulated cortex. The appropriately phased microstimulation enhances contralateral flexion and ipsilateral extension, presumably through lumbar spinal cord crossed extension interneuron systems. This microstimulation improves weight bearing in the ipsilesion hindlimb soon after injury, before any normal recovery of function would be seen. The contralateral homologous cortex can be lesioned in intact rats without impacting the microstimulation effect on flexion and extension during gait. In two rats ipsilateral flexion responses are noted, but these are not clearly demonstrated to be independent of the contralateral homologous cortex remaining intact.

      Strengths:

      This paper adds to prior data on cortical microstimulation by the authors' laboratory in interesting ways. First, the strong effects of the spared crossed fibers from ipsi-lesional cortex in parts of the ipsi-lesion leg's step cycle and weight support function are solidly demonstrated. This raises the interesting possibility that stimulating contra-lesion cortex as reported previously may execute some of its effects through callosal coordination with the ipsi-lesion cortex tested here. This is also now discussed by the authors and may represent a significant aspect of these data. The authors demonstrate solidly that ablation of the contra-lesional cortex does not impede the effects reported here. I believe this has not been shown for the contra-lesional cortex microstimulation effects reported earlier, but I may be wrong.

      Effects and neuroprosthetic control of these effects are explored well in the ipsi-lesion cortex tests here.

      Weaknesses:

      Some data is based on only a few rats. For example (N=2) for ipsilateral flexion effects of microstimulation. N=3 for homologous cortex ablation, and only ipsi extension is tested it seems. However, these data clearly point the way and replication is likely.

      Likely Impacts:

      This data adds in significant ways to prior work by the authors, and an understanding of how phased stimulation in cortical neuroprosthetics may aid in recovery of function after SCI, especially if a few ambiguities in writing and interpretation are fully resolved.

    3. Reviewer #3 (Public review):

      Summary:

      This article aims to investigate the impact of neuroprosthesis (intracortical microstimulation) implanted unilaterally on the lesion side in the context of locomotor recovery following thoracic spinal hemisection.

      Strength:

      The study reveals that stimulating the left motor cortex, on the same side as the lesion, not only activates the expected right (contralateral) muscle activity but also influences unexpected muscle activity on the left (ipsilateral) side. These muscle activities resulted in a substantial enhancement in lift during the swing phase of the contralateral limb and improved trunk-limb support for the ipsilateral limb. They used different experimental and stimulation conditions to show the ipsilateral limb control evoked by the stimulation. This outcome holds significance, shedding light on the engagement of the contralateral-projecting corticospinal tract (CST) in activating not only a contralateral but also an ipsilateral spinal network.

      The experimental design and findings align with the investigation of the stimulation effect of contralateral projecting CSTs. They carefully examined the recovery of ipsilateral limb control with motor maps. And they also tested the effective sites of cortical stimulation. The study successfully demonstrates the impact of electrical stimulation on the contralateral projecting neurons on ipsilateral limb control during locomotion, as well as identifying important stimulation spots for such effect. These results contribute to our understanding of how these neurons influence bilateral spinal circuitry. The study's findings contribute valuable insights to the broader neuroscience and rehabilitation communities.

      Weakness:

      The term "ipsilateral" lacks a clear definition in some cases, potentially causing confusion for the reader. Readers can potentially link ipsilateral cortical network to ipsilateral-projecting CSTs, which are less likely to play a role in ipsilateral limb control in this study since this tract is disrupted by the thoracic hemisection.

      Specific comments:

      Abstract: Line 1-4: Consider refining the initial sentences of the abstract to reduce ambiguity around the term 'ipsilateral lesion' and its potential conflation with ipsilateral projecting cortical neurons.

      The abstract begins with 'Control of voluntary limb movement is predominantly attributed to the contralateral motor cortex.' This is followed by, 'However, increasing evidence suggests the involvement of ipsilateral cortical networks in this process, especially in motor tasks requiring bilateral coordination, such as locomotion.'

      The phrase 'ipsilateral cortical networks' remains somewhat unclear. Readers may mistakenly interpret it as referring to the ipsilateral projecting corticospinal tract (CST), which is not the focus of this study.

      Shifting the focus away from 'ipsilateral cortical control' and instead highlighting ipsilateral limb control following a spinal hemisection would improve clarity. This adjustment would also align the title and abstract more closely with the study's primary focus.

      Introduction:

      It is suggested to revise the introduction to more closely align with the study's experimental design and outcomes, placing emphasis on the stimulation effects observed in contralateral projecting tracts rather than implying a primary focus on ipsilateral projecting CST neurons.

      Line 30-32: "Nevertheless, the function of the ipsilateral motor cortex is unclear and its role in the recovery of motor control after injury remains controversial." This still gives the impression that ipsilateral projecting CST is the topic of the research here. Also, some of the cited references discuss ipsilateral projecting CSTs.

      Line 34-36: "While the most prominent feature of motor cortex pathways is their contralateral organization, unilateral or bilateral movements are well represented in the ipsilateral hemisphere." This sentence is unclear to me. It would be helpful to specify what 'ipsilateral hemisphere' refers to: ipsilateral to what? Clarifying whether it's ipsilateral to the lesion or another reference point would make the statement more precise.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This manuscript reveals important insights into the role of ipsilateral descending pathways in locomotion, especially following unilateral spinal cord injury. The study provides solid evidence that this method improves the injured side's ability to support weight, and as such the findings may lead to new treatments for stroke, spinal cord injuries, or unilateral cerebral injuries. However, the methods and results need to be better detailed, and some of the statistical analysis enhanced.

      Thank you for your assessment. We incorporated various text improvements in the final version of the manuscript to address the weaknesses you have pointed out. The specific improvements are outlined below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript provides potentially important new information about ipsilateral cortical impact on locomotion. A number of issues need to be addressed.

      Strengths:

      The primary appeal and contribution of this manuscript are that it provides a range of different measures of ipsilateral cortical impact on locomotion in the setting of impaired contralateral control. While the pathways and mechanisms underlying these various measures are not fully defined and their functional impacts remain uncertain, they comprise a rich body of results that can inform and guide future efforts to understand cortical control of locomotion and to develop more effective rehabilitation protocols.

      Weaknesses:

      (1) The authors state that they used a cortical stimulation location that produced the largest ankle flexion response (lines 102-104). Did other stimulation locations always produce similar, but smaller responses (aside from the two rats that showed ipsilateral neuromodulation)? Was there any site-specific difference in response to stimulation location?

      We derived motor maps in each rat, akin to the representation depicted in Fig 6. In each rat, alternative cortical sites did, indeed, produce distal or proximal contralateral leg flexion responses. Distal responses were more likely to be evoked in the rostral portion of the array, similarly to proximal responses early after injury. This distribution in responses across different cortical sites is reported in this study (Fig. 6) and is consistent with our prior work. The Results section has been revised to provide additional clarification of the passage you indicated and context for the data presented in Figure 6:

      On page 4, we have clarified: “Stimulation through these channels produced a strong whole-leg flexion movement, with an evident distal component. From visual inspection, all responding electrodes in the array produced contralateral leg flexion, although with different strength of contraction for a fixed stimulation intensity (100μA). Moreover, some sites did not present a distal movement component, failing in eliciting ankle flexion and resulting in a generally weaker proximal flexion.”

      On page 12, we have further noted: “By visually inspecting the responses elicited by stimulation delivered through each of the array electrodes, we categorized movements as proximal or distal. This classification was based on whether the ankle participated in the evoked response or if the movement was restricted to the proximal hindlimb. Each leg was scored independently.”

      (2) Figure 2: There does not appear to be a strong relationship between the percentage of spared tissue and the ladder score. For example, the animal with the mild injury (based on its ladder score) in the lower left corner of Figure 2A has less than 50% spared tissue, which is less spared tissue than in any animal other than the two severe injuries with the most tissue loss. Is it possible that the ladder test does not capture the deficits produced by this spinal cord injury? Have the authors looked for a region of the spinal cord that correlates better with the deficits that the ladder test produces? The extent of damage to the region at the base of the dorsal column containing the corticospinal tract would be an appropriate target area to quantify and compare with functional measures.

      In Fig. S6 of our 2021 publication "Bonizzato and Martinez, Science Translational Medicine", we investigated the predictive value of tissue sparing in specific sub-regions of the spinal cord for ladder performance. Among others, we examined the correlation between the accuracy of left leg ladder performance in the acute state and the preservation of the corticospinal tract (CST). Our results indicated that dorsal CST sparing serves as a mild predictor for ladder deficits, confirming the results obtained in this study.

      (3) Lines 219-221: The authors state that "phase-coherent stimulation reinstated the function of this muscle, leading to increased burst duration (90±18% of the deficit, p=0.004, t-test, Fig. 4B) and total activation (56±13% of the deficit, p=0.014, t-test, Fig. 3B)". This way of expressing the data is unclear. For example, the previous sentence states that after SCI, burst duration decreased by 72%. Does this mean that the burst duration after stimulation was 90% higher than the -72% level seen with SCI alone, i.e., 90% + -72% = +18%? Or does it mean that the stimulation recovered 90% of the portion of the burst duration that had been lost after SCI, i.e., -72% * (100%-90%)= -7%? The data in Figure 4 suggests the latter. It would be clearer to express both these SCI alone and SCI plus stimulation results in the text as a percent of the pre-SCI results, as done in Figure 4.

      Your assessment is correct; we intended to report that the stimulation recovered 90% of the portion of the burst duration that had been lost after SCI. This point has been clarified (see page 9):

      “…leading to increased burst duration (recovered 90±18% of the lost burst duration, p=0.004, t-test, Fig. 4B) and total activation (recovered 56±13% of the total activation, p=0.014, t-test, Fig. 3B)”
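
      As a rough illustration of the two readings contrasted in the comment above, the following Python sketch works through the arithmetic using the 72% loss and 90% figures quoted there. It is not part of the authors' analysis; the function names are hypothetical, and all values are expressed as percentage change relative to the pre-SCI burst duration.

```python
def additive_reading(loss_pct, reported_pct):
    """Reading 1: the reported figure is added to the post-SCI level.

    Returns the change relative to pre-SCI, in percent.
    """
    return reported_pct - loss_pct


def deficit_recovered_reading(loss_pct, reported_pct):
    """Reading 2: the reported figure is the share of the lost amount
    that stimulation recovered.

    Returns the residual change relative to pre-SCI, in percent.
    """
    return -loss_pct * (100.0 - reported_pct) / 100.0


loss = 72.0      # burst duration lost after SCI (percent of pre-SCI)
reported = 90.0  # figure reported for phase-coherent stimulation

print(f"additive reading: {additive_reading(loss, reported):+.1f}%")
print(f"deficit-recovered reading: {deficit_recovered_reading(loss, reported):+.1f}%")
```

      Under the second reading, which the authors confirm is the intended one, the residual deficit of about 7% corresponds to a burst duration of roughly 93% of the pre-SCI value.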

      (4) Lines 227-229: The authors claim that the phase-dependent stimulation effects in SCI rats are immediate, but they don't say how long it takes for these effects to be expressed. Are these effects evident in the response to the first stimulus train, or does it take seconds or minutes for the effects to be expressed? After the initial expression of these effects, are there any gradual changes in the responses over time, e.g., habituation or potentiation?

      The effects are immediately expressed at the very first occurrence of stimulation. We never tested a rat completely naïve to stimuli, as each treadmill session involves prior cortical mapping to identify a suitable active site for involvement in locomotor experiments. Yet, as demonstrated in Supplementary Video 1 accompanying our 2021 publication on contralateral effects of cortical stimulation, "Bonizzato and Martinez, Science Translational Medicine," the impact of phase-dependent cortical stimulation on movement modulation is instantaneous and ceases promptly upon discontinuation of the stimulation. We did not quantify potential gradual changes in responsiveness over time, but we cannot exclude that for long stimulation sessions (e.g., 30 min or more), stimulus amplitude may need to be slightly increased over time to compensate for habituation.

      (5) Awake motor maps (lines 250-277): The analysis of the motor maps appears to be based on measurements of the percentage of channels in which a response can be detected. This analytic approach seems incomplete in that it only assesses the spatial aspect of the cortical drive to the musculature. One channel could have a just-above-threshold response, while another could have a large response; in either case, the two channels would be treated as the same positive result. An additional analysis that takes response intensity into account would add further insight into the data, and might even correlate with the measures of functional recovery. Also, a single stimulation intensity was used; the results may have been different at different stimulus intensities.

      We confirm that maps of cortical stimulation responsiveness may vary at different stimulus amplitudes. To establish an objective metric of excitability, we identified 100µA as a reliable stimulation amplitude across rats and used this value to build the ipsilateral motor representation results in Figure 6. This choice allows direct comparison with Figure 6 of our 2021 article, related to contralateral motor representation. The comparison reveals a lack of correlation with functional recovery metrics in the ipsilateral case, in contrast to the successful correlation achieved in the contralateral case.

      Regarding the incorporation of stimulation amplitudes into the analysis, as detailed in the Method section (lines 770-771), we systematically tested various stimulation amplitudes to determine the minimal threshold required for eliciting a muscle twitch, identified as the threshold value. This process was conducted for each electrode site.

      Upon reviewing these data, we considered the possibility of presenting an additional assessment of ipsilateral cortical motor representation based on stimulation thresholds. However, the representation depicted in the figure did not differ significantly from the data presented in Figure 6A. Furthermore, this representation introduced an additional weakness, as it was unclear how to represent the absence of a response in the threshold scale. We chose to arbitrarily designate it as zero on the inverse logarithmic scale, where, for reference, 100 µA is positioned at 0.2 and 50 µA at 0.5.

      In conclusion, we believe that the conclusions drawn from this analysis align substantially with those in the text. The addition of the threshold analysis, in our assessment, would not contribute significantly to improving the manuscript.

      Author response image 1.

      Threshold analysis

      Author response image 2.

      Occurrence probability analysis, for comparison.

      (6) Lines 858-860: The authors state that "All tests were one-sided because all hypotheses were strictly defined in the direction of motor improvement." By using the one-sided test, the authors are using a lower standard for assessing statistical significance than the overwhelming majority of studies in this field use. More importantly, ipsilateral stimulation of particular kinds or particular sites might conceivably impair function, and that is ignored if the analysis is confined to detecting improvement. Thus, a two-sided analysis or comparable method should be used. This appropriate change would not greatly modify the authors' current conclusions about improvements.

      Our original hypothesis, drawn from previous studies involving cortical stimulation in rats and cats, as well as other neurostimulation research for movement restoration, posited a favorable impact of neurostimulation on movement. Consistent with this hypothesis, we designed our experiments with a focus on enhancing movement, emphasizing a strict direction of improvement.

      It's important to note that a one-sided test is the appropriate match for a one-sided hypothesis, and it is not a lower standard in statistics. Each experiment we conducted was constructed around a strictly one-sided hypothesis: the inclusion of an extensor-inducing stimulus would enhance extension, and the inclusion of a flexion-inducing stimulus would enhance flexion. This rationale guided our choice of the appropriate statistical test.

      We acknowledge your concern regarding the potential for ipsilateral stimulation to have negative effects on locomotion, which might not be captured when designing experiments based on one-sided hypotheses. That is, when hypothesizing that an extensor stimulus would enhance extension (a one-sided hypothesis) in a functional task, and finding an opposite result (inhibition), statistical rigor would dictate that we cannot present that result as significant. This concern is valid, and we explicitly mentioned our design choice in the Methods section, Quantification and statistical analyses:

      “All tests were one-sided, as our hypotheses were strictly defined to predict motor improvement. Specifically, we hypothesized that delivering an extension-inducing stimulus would enhance leg extension, and delivering a flexion-inducing stimulus would enhance leg flexion. Consequently, any potentially statistically significant result in the opposite direction (e.g., inhibition) would not be considered. However, no such occurrences were observed.”

      As a final note, even if such opposite observations were made, they could serve as the basis for triggering an ad-hoc follow-up study.
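
      The trade-off discussed above can be sketched numerically. The example below is an illustration only: it swaps the t-test used in the manuscript for an exact sign-flip permutation test so that it runs with the standard library alone, and the paired differences are hypothetical integers, not data from the study. It shows that an effect in the predicted direction yields a one-sided p-value half the two-sided one, while an effect of equal size in the opposite direction is never significant one-sided but still is two-sided.

```python
from itertools import product


def sign_flip_p_values(diffs):
    """Exact sign-flip test on paired differences (stimulation minus control).

    Returns (one_sided_p, two_sided_p); the one-sided alternative is
    'mean difference > 0', i.e. improvement in the predicted direction.
    """
    observed = sum(diffs)
    n_greater = 0   # permutations at least as large as observed
    n_extreme = 0   # permutations at least as extreme in either direction
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        stat = sum(s * d for s, d in zip(signs, diffs))
        total += 1
        if stat >= observed:
            n_greater += 1
        if abs(stat) >= abs(observed):
            n_extreme += 1
    return n_greater / total, n_extreme / total


# Hypothetical paired differences: every animal improves.
improvement = [3, 5, 2, 4, 6, 1]
# Same magnitudes, opposite direction (e.g. inhibition).
impairment = [-d for d in improvement]

print(sign_flip_p_values(improvement))  # one-sided p is half the two-sided p
print(sign_flip_p_values(impairment))   # one-sided p is 1.0; two-sided p stays small
```

      This is why the revised Methods text states explicitly that a significant result in the opposite direction would not have been considered under the one-sided design.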

      Reviewer #1 also provided several detailed suggestions in the section “Recommendations for the authors”. We estimated that each of them was beneficial for the correctness or for the readability of the text, and thus all were incorporated into the final version.

      Reviewer #2 (Public Review):

      Summary:

      The authors' long-term goals are to understand the utility of precisely phased cortex stimulation regimes on recovery of function after spinal cord injury (SCI). In prior work, the authors explored the effects of contralesion cortex stimulation. Here, they explore ipsilesion cortex stimulation in which the corticospinal fibers that cross at the pyramidal decussation are spared. The authors explore the effects of such stimulation in intact rats and rats with a hemisection lesion at the thoracic level ipsilateral to the stimulated cortex. The appropriately phased microstimulation enhances contralateral flexion and ipsilateral extension, presumably through lumbar spinal cord crossed-extension interneuron systems. This microstimulation improves weight bearing in the ipsilesion hindlimb soon after injury, before any normal recovery of function would be seen. The contralateral homologous cortex can be lesioned in intact rats without impacting the microstimulation effect on flexion and extension during gait. In two rats ipsilateral flexion responses are noted, but these are not clearly demonstrated to be independent of the contralateral homologous cortex remaining intact.

      Strengths:

      This paper adds to prior data on cortical microstimulation by the laboratory in interesting ways. First, the strong effects of the spared crossed fibers from the ipsi-lesional cortex in parts of the ipsi-lesion leg's step cycle and weight support function are solidly demonstrated. This raises the interesting possibility that stimulating the contra-lesion cortex as reported previously may execute some of its effects through callosal coordination with the ipsi-lesion cortex tested here. This is not fully discussed by the authors but may represent a significant aspect of these data. The authors demonstrate solidly that ablation of the contra-lesional cortex does not impede the effects reported here. I believe this has not been shown for the contra-lesional cortex microstimulation effects reported earlier, but I may be wrong. Effects and neuroprosthetic control of these effects are explored well in the ipsi-lesion cortex tests here.

      In the revised version of the manuscript, we incorporated various text improvements to address the points you have highlighted in your review. Additionally, we have integrated the suggested discussion topic on callosal coordination related to contralateral cortical stimulation. The discussion section now incorporates:

      “Since bi-cortical interactions in sculpting descending commands are known (Brus-Ramer et al., 2009), and in light of the changes we report in ipsilesional motor cortex excitability, the role of the ipsilateral cortex in mediating or supporting functional descending commands from the contralateral cortex, particularly the immediate increase in flexion of the affected hindlimb and long-term recovery of functional control (Bonizzato & Martinez, 2021), could be further explored.”

      The localization of the specific channels closest to the interhemispheric fissure (Fig. 7D) may suggest the involvement of transcallosal interactions in mediating the transmission of the cortical command generated in the ipsilateral motor cortex (Brus-Ramer, Carmel, & Martin, 2009). “While ablation experiments (Fig. 8) refute this hypothesis for ipsilateral extension control, they do not conclusively determine whether a different efferent pathway is involved in ipsilateral flexion control in this specific case."

      Weaknesses:

      Some data is based on very few rats. For example (N=2) for ipsilateral flexion effects of microstimulation. N=3 for homologous cortex ablation, and only ipsi extension is tested it seems. There is no explicit demonstration that the ipsilateral flexion effects in only 2 rats reported can survive the contra-lateral cortex ablation.

      We agree with this assessment. The ipsilateral flexion representation is reported here as a rare but consistent phenomenon, which we believe we have robustly described with the Figure 7 experiments. We underlined in the text that the ablation experiment was not conclusive regarding the unilateral-cortical nature of ipsilateral flexion effects, by replacing the sentence with the following:

      “While ablation experiments (Fig. 8) refute this hypothesis for ipsilateral extension control, they do not conclusively determine whether a different efferent pathway is involved in ipsilateral flexion control in this specific case."

      Some improvements in clarity and precision of descriptions are needed, as well as fuller definitions of terms and algorithms.

      Likely Impacts: This data adds in significant ways to prior work by the authors, and an understanding of how phased stimulation in cortical neuroprosthetics may aid in recovery of function after SCI, especially if a few ambiguities in writing and interpretation are fully resolved.

      The manuscript text has been revised in its final version, and we sought to eliminate all ambiguity in writing and data interpretation.

      In the section “Recommendations for the authors” Reviewer #2 also suggested to better define multiple terms throughout the manuscript. A clarification was added for each.

      The Reviewer pointed out that we might have overlooked a correlation between locomotor recovery and motor map increase in Figure 6. We re-approached this evaluation and found that the reviewer is correct. We were led to think that there was no correlation by “horizontally” looking at whether motor map size across rats would predict locomotor scores (as it did in the case of contralateral cortex mapping, Bonizzato and Martinez, 2021). However, we have now found a strong correlation between changes that happen over time for each rat and locomotor recovery, a result that was only hinted at, with no appropriate quantification, in the previous version of the manuscript. We have now reformulated the results of Figure 6 on page 12 to include this result, and we would like to thank the reviewer for having noticed this opportunity.

      Finally, we have expanded the discussion to include the following points:

      The possibility that hemi-cortex coordination of contralesional microstimulation inputs may explain the Sci Transl Med results for contralesional cortex ICMS, which warrants further investigation.

      The recognition that the ablation experiments do not provide conclusive evidence regarding ipsilateral flexion control and whether an alternative efferent pathway might be involved in this specific case.

      Reviewer #3 (Public Review):

      Summary:

      This article aims to investigate the impact of neuroprosthesis (intracortical microstimulation) implanted unilaterally on the lesion side in the context of locomotor recovery following unilateral thoracic spinal cord injury.

      Strength:

      The study reveals that stimulating the left motor cortex, on the same side as the lesion, not only activates the expected right (contralateral) muscle activity but also influences unexpected muscle activity on the left (ipsilateral) side. These muscle activities resulted in a substantial enhancement in lift during the swing phase of the contralateral limb and improved trunk-limb support for the ipsilateral limb. They used different experimental and stimulation conditions to show the ipsilateral limb control evoked by the stimulation. This outcome holds significance, shedding light on the engagement of the "contralateral projecting" corticospinal tract in activating not only the contralateral but also the ipsilateral spinal network.

      The experimental design and findings align with the investigation of the stimulation effect of contralateral projecting corticospinal tracts. They carefully examined the recovery of ipsilateral limb control with motor maps. They also tested the effective sites of cortical stimulation. The study successfully demonstrates the impact of electrical stimulation on the contralateral projecting neurons on ipsilateral limb control during locomotion, as well as identifying important stimulation spots for such an effect. These results contribute to our understanding of how these neurons influence bilateral spinal circuitry. The study's findings contribute valuable insights to the broader neuroscience and rehabilitation communities.

      Thank you for your assessment of this manuscript. The final version of the manuscript incorporates your suggestions for improving term clarity, and we enhanced the discussion on the mechanisms of spinal network engagement, as outlined below.

      Weakness:

      The term "ipsilateral" lacks a clear definition in the title, abstract, introduction, and discussion, potentially causing confusion for the reader.

      [and later] However, in my opinion, readers can easily link the ipsilateral cortical network to the ipsilateral-projecting corticospinal tract, which is less likely to play a role in ipsilateral limb control in this study since this tract is disrupted by the thoracic spinal injury.

      In order to mitigate the risk of having readers linking the effects of ipsilateral cortical stimulation with ipsilateral-projecting corticospinal tract, we specified:

      In the abstract, we specify that our goal was: “to investigate the functional role of the ipsilateral motor cortex in rat movement through spared contralesional pathways.”

      In the introduction: “In most cases, this lesion also disrupts all spinal tracts descending on the same side as the cortex under investigation at the thoracic level, meaning that the transmission of cortical commands to the ipsilesional hindlimb must depend on crossed descending tracts (Fig. S1).”

      The unexpected ipsilateral (left) muscle activity is most likely due to the left corticospinal neurons recruiting not only the right spinal network but also the left spinal network. This is probably due to the joint efforts of the neuroprosthesis and activation of spinal motor networks which work bilaterally at the spinal level.

      We agree with your assessment and the discussion section now emphasizes the effects of supraspinal drive onto spinal circuits.

      In the section “Recommendations for the authors” Reviewer #3 suggested to provide an early reminder to the reader that the focus is on exploring the control of the ipsilateral limb through the corticospinal tract of the same side, projecting contralaterally. We did so in the abstract and introduction, as presented above.

      The reviewer also suggested that the discussion could be shorter. While we recognize it covers diverse subjects that may appeal to different readers, we believe omitting some sections could limit its overall scope. The manuscript underwent three revisions and a thorough dialogue with reviewers from diverse backgrounds, and we are hesitant to undo some of these improvements.

      Moreover, the section falls short of fully exploring the involvement of contralateral projecting corticospinal neurons in spinal networks for diverse motor behaviors. It could potentially delve into aspects like the potential impact of corticospinal inputs on gating the cross-extensor reflex loop and elucidating the mechanisms underlying the recruitment of the ipsilateral spinal network for generating ipsilateral limb movements. Is it a direct control on motor neurons or via existing spinal circuits?

      The discussion section now includes the potential spinal circuits through which corticospinal neurons may affect motor control and reflexes.

      Reviewer #3 also provided several detailed suggestions in the sub-section “Minor points”. We estimated that all of them were beneficial for the correctness or for the readability of the text, and thus they were incorporated into the final version. Some of the questions raised were answered directly in the text (defining “% of chronic map” and rephrasing the original Line 479). We answer the two remaining questions below:

      Fig. 3C I wonder what is the average latency between stimulation onset and onset of right ankle flexor activity. Is the latency fixed, or variable (which probably indicates that the Cortical activation signal is integrated with spinal CPG activity.)

      ICMS trains, unfortunately, do not allow for precise dissection of transmission timing. Single pulses at 100 µA are insufficient to generate motoneuron responses and require multiple pulses to build up cortical transmission. Alstermark et al. (Journal of Neurophysiology, 2004) used two to four stimuli with higher amplitudes to investigate forelimb transmission timing. In our 2021 Science Translational Medicine paper, we employed single pulses at 1 mA to establish transmission delays from the contralateral cortex to the ankle flexor. However, the circuits recruited at 1 mA are not directly comparable to those activated by shorter trains.

      In this study, we used cortical trains of approximately 14 pulses, typical of ICMS protocols. Each pulse could potentially be the first to generate a response volley in the ankle flexor, with delays measured at 30 to 60 ms from ICMS train onset. While we believe that cortical commands are necessarily integrated with spinal CPG activity—as indicated in Figures 1B and 3D, where timing is crucial and descending commands can be gated out if delivered off-phase—the variability in latency that we recorded could be attributed to any of the following factors: cortical activation build-up, integration within reticular relay networks, or CPG integration.

      Fig. 4A. Why is the activity of the contralateral ankle flexor later under the intact condition than under the stimulation condition?

      We timed the stimulation to coincide with the contralateral leg lift and did not adjust its onset relative to spontaneous walking in SCI rats. Although stimulation could induce leg lift, as shown in Fig. 4A, SCI rats exhibited a slightly earlier and stronger activation of the right (contralateral) ankle flexor muscle even during spontaneous walking. This phenomenon is attributed to the deficits observed on the left side. The stronger right leg bears the body weight, as illustrated in Fig. 3, and thus, during body advancement, the right leg is engaged sooner and more rapidly (with a shorter swing phase) to provide support (right foot forward).

    1. eLife Assessment

      This study provides convincing evidence for functional subpopulations of β-cells responsible for Ca2+ signal initiation and maintenance using novel three-dimensional light sheet microscopy imaging and analysis of pancreatic islets. The findings are important as they help decode the mechanistic underpinnings of islet calcium oscillations and the resulting pulsatile insulin secretion. The work will be of general interest to cell biologists and of particular interest to islet biologists.

    2. Reviewer #1 - Public Review

      Summary:

      Jin, Briggs, and colleagues use light sheet imaging to reconstruct the islet three-dimensional Ca2+ network. The authors find that early/late responding (leader) cells are dynamic over time, and located at the islet periphery. By contrast, highly connected or hub cells are stable and located toward the islet center. Suggesting that the two subpopulations are differentially regulated by fuel input, glucokinase activation only influences leader cell phenotype, whereas hubs remain stable.

      Strengths:

      The studies are novel in providing the first three-dimensional snapshot of the beta cell functional network, as well as determining the localization of some of the different subpopulations identified to date. The studies also provide some consensus as to the origin, stability, and role of such subpopulations in islet function.

      Weaknesses:

      Experiments with metabolic enzyme activators do not take into account the influence of cell viability on the observed Ca2+ network data. Limitations of the imaging approach used need to be recognized and evaluated/discussed.

    3. Reviewer #2 - Public Review

      The manuscript by Erli Jin, Jennifer Briggs et al. utilizes light sheet microscopy to image islet beta cell calcium oscillations in 3D and determine where beta cell populations are located that begin and coordinate glucose-stimulated calcium oscillations. The light sheet technique allowed clear 3D mapping of beta cell calcium responses to glucose, glucokinase activation, and pyruvate kinase activation. The manuscript finds that synchronized beta-cells are found at the islet center, that leader beta cells showing the first calcium responses are located on the islet periphery, that glucokinase activation helped maintain beta cells that lead calcium responses, and that pyruvate kinase activation primarily increases islet calcium oscillation frequency. The study is well-designed, contains a significant amount of high-quality data, and the conclusions are largely supported by the results.

      It has recently been shown that beta cells within islets containing intact vasculature (such as those in a pancreatic slice) show different calcium responses compared to isolated islets (such as that shown in PMID: 35559734). It would be important to include some discussion about the potential in vitro artifacts in calcium that arise following islet isolation (this could be included in the discussion about the limitations of the study).

    4. Reviewer #3 - Public Review

      Summary:

      Jin, Briggs et al. made use of light-sheet 3D imaging and data analysis to assess the collective network activity in isolated mouse islets. The major advantage of using whole islet imaging, despite compromising on the speed of acquisition, is that it provides a complete description of the network, while 2D networks are only an approximation of the islet network. In static-incubation conditions, excluding the effects of perfusion, they assessed two subpopulations of beta cells and their spatial consistency and metabolic dependence.

      Strengths:

      The authors confirmed that coordinated Ca2+ oscillations are important for glycemic control. In addition, they definitively disproved the role of individual privileged cells, which were suggested to lead or coordinate Ca²⁺ oscillations. They provided evidence for differential regional stability, confirming the previously described stochastic nature of the beta cells that act as strongly connected hubs as well as beta cells in initiating regions (doi.org/10.1103/PhysRevLett.127.168101).

      The fact that islet cores contain beta cells that are more active and more coordinated has also been readily observed in high-frequency 2D recordings (e.g. DOI: 10.2337/db22-0952), suggesting that the high-speed capture of fast activity can partially compensate for incomplete topological information.

      They also found an increased metabolic sensitivity of mantle regions of an islet with a subpopulation of beta cells with a high probability of leading the islet activity which can be entrained by fuel input. They discuss a potential role of alpha/delta cell interaction, however relative lack of beta cells in the islet border region could also be a factor contributing to less connectivity and higher excitability.

      The Methods section contains a useful series of direct instructions on how to approach fast 3D imaging with currently available hardware and software.

      The Discussion is clear and includes most of the issues regarding the interpretation of the presented results.

      Some issues concerning inconsistencies between data presented and statements made as well as statistical analysis need to be addressed.

      Taken together it is a strong technical paper to demonstrate the stochasticity regarding the functions subpopulations of beta cells in the islets may have and how less well-resolved approaches (both missing spatial resolution as well as missing temporal resolution) led us to jump to unjustified conclusions regarding the fixed roles of individual beta cells within an islet.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      In Ryu et al., the authors use a cortical mouse astrocyte culture system to address the functional contribution of astrocytes to circadian rhythms in the brain. The authors' starting point is transcriptional output from serum-shocked culture, comparative informatics with existing tools and existing datasets. After fairly routine pathway analyses, they focus on the calcium homeostasis machinery and one gene, Herp, in particular. They argue that Herp is rhythmic at both mRNA and protein levels in astrocytes. They then use a calcium reporter targeted to the ER, mitochondria, or cytosol and show that Herp modulates calcium signaling as a function of circadian time. They argue that this occurs through the regulation of inositol receptors. They claim that the signaling pathway is clock-controlled by a limited examination of Bmal1 knockout astrocytes. Finally, they switch to calcium-mediated phosphorylation of the gap junction protein Connexin 43 but do not directly connect HERP-mediated circadian signaling to these observations. While these experiments address very important questions related to the critical role of astrocytes in regulating circadian signaling, the mechanistic arguments for HERP function, its role in circadian signaling through inositol receptors, the connection to gap junctions, and ultimately, the functional relevance of these findings is only partially substantiated by experimental evidence. 

      Strengths: 

      - The paper provides useful datasets of astrocyte gene expression in circadian time. 

      - Identifies HERP as a rhythmic output of the circadian clock. 

      - Demonstrates the circadian-specific sensitivity of ATP -> calcium signaling. 

      - Identifies possible rhythms in both Connexin 43 phosphorylation and rhythmic movement of calcium between cells. 

      Weaknesses: 

      - It is not immediately clear why the authors chose to focus on Ca2+ homeostasis or Herp from their initial screens as neither were the "most rhythmic" pathways in their primary analyses. 

      We appreciate the reviewer’s comment. We chose to focus on Ca2+ homeostasis processes because intracellular Ca2+ signaling plays a crucial role in numerous astrocyte functions and is notably associated with the sleep/wake status of animals, which is our primary interest (Bojarskaite et al., 2020; Ingiosi et al., 2020; Blum et al., 2021; Szabó et al., 2017). Among the genes involved in calcium ion homeostasis, Herp exhibited the most robust rhythmicity (supplementary table 1). The rationale for our focus on Ca2+ homeostasis and Herp is explained in the results section (lines 143-150). We hope this provides a clear justification for our focus.

      - It would have been interesting (and potentially important) to know whether various methods of cellular synchronization would also render HERP rhythmic (e.g., temperature, forskolin, etc). If Herp is indeed relatively astrocyte-specific and rhythmic, it should be easy to assess its rhythmicity in vivo. 

      We thank the reviewer for this insightful comment. In response, we examined HERP expression in cultured astrocytes synchronized using either Dexamethasone or Forskolin treatment. We found that Herp exhibited rhythmic expression at both the mRNA and protein levels under these conditions. These results have been added to Figure S3 and are explained in the manuscript (lines 173-175).

      Additionally, we measured HERP levels in the prefrontal cortex of mice at CT58 and CT70 and found no rhythmicity, as shown in Author response image 1. Given that Herp is expressed in various brain cell types, including microglia, endothelial cells, neurons, oligodendrocytes, and astrocytes, with the highest expression in microglia (Cahoy et al., 2008), we reason that the potential rhythmic expression of HERP in astrocytes might be masked by its continuous expression in other cell types. Nonetheless, to assess HERP rhythmicity specifically in astrocytes in vivo, we attempted immunostaining using several anti-HERP antibodies, but none were successful. Consequently, we were unable to determine whether HERP exhibits rhythmic expression in astrocytes in vivo.

      Author response image 1.

      HERP levels were constant at CT58 and CT70. (A, B) Mice were entrained under 12h:12h LD cycle and maintained in constant dark. Prefrontal cortices were harvested at indicated time and processed for Western blot analysis. Representative image shows three independent samples. (B) Quantification of HERP levels normalized to VINCULIN. Values in graphs are mean ± SEM (*p < 0.05, **p < 0.005, ***p < 0.0005, and ****p < 0.00005; t-test)

      - The authors show that Herp suppression reduces ATP-mediated suppression of calcium whereas it initially increases Ca2+ in the cytosol and mitochondria and then suppresses it. The dynamics of the mitochondrial and cytosolic responses are not discussed in any detail and it is unclear what their direct relationship is to Herp-mediated ER signaling. What is the explanation for Herp (which is thought to be ER-specific) to calcium signaling in other organelles? 

      Our examination of cytosolic and mitochondrial Ca2+ responses was aimed at corroborating HERP’s effect on the ER Ca2+ response. Upon ATP stimulation, Ca2+ is released from the ER via IP3 receptors (IP3Rs) and subsequently transmitted to other organelles, including mitochondria (Carreras-Sureda et al., 2018; Giorgi et al., 2018). Ca2+ is directly transferred to the cytosol by IP3Rs located on the ER membrane, and to the mitochondria through a complex formed by IP3R and the voltage-dependent anion channel (VDAC) on the mitochondria (Giorgi et al., 2018). Consistent with previous reports, we observed an increase in cytosolic and mitochondrial Ca2+ levels accompanied by a decrease in ER Ca2+ levels following ATP treatment (See Fig. 3B, E, H, control siRNA). The ATP-stimulated ER Ca2+ release was enhanced by Herp knockdown. We reasoned that if Ca2+ release was enhanced, then cytosolic and mitochondrial Ca2+ uptake would also be enhanced. The results were consistent with our hypothesis (See Fig. 3B, E, H, Herp siRNA). These observations are described in the Results section (lines 202-208) and in the Discussion (lines 333-348). We hope this explanation clarifies the relationship between the Herp-mediated ER Ca2+ response and the Ca2+ response in other organelles. Thank you for your consideration.

      - What is the functional significance of promoting ATP-mediated suppression of calcium in ER? 

      In astrocytes, intracellular Ca2+ plays a crucial role in regulating several processes. In this study, among the various downstream effects of intracellular Ca2+, we examined gap junction channel (GJC) conductance, which affects astrocytic communication. As discussed in the manuscript (lines 357-381), circadian variation in HERP results in rhythmic Cx43 (S368) phosphorylation, which is linked with GJC conductance. We propose that during the subjective night phase, heightened ATP-induced ER Ca2+ release reduces GJC conductance, uncoupling astrocytes from the syncytium and making them better equipped for localized responses. During the subjective day phase, by contrast, increased GJC conductance may allow astrocytes to control a larger area for synchronous neuronal activity, which is a key feature of sleep.

      - The authors then nicely show that the effect of ATP is dependent on intrinsic circadian timing but do not explain why these effects are antiphase in cytosol or mitochondria.

      Moreover, the ∆F/F for calcium in mitochondria and cytosol both rise, cross the abscissa, and then diminish - strongly suggesting a biphasic signaling event. Therefore, one wonders whether measuring the area under the curve is the most functionally relevant measurement of the change. 

      We appreciate the reviewer’s insightful comments. As explained in our previous response, Ca2+ released from the ER is transferred to the cytosol and mitochondria. This transfer explains why the fluorescence intensities of the cytosolic and mitochondrial Ca2+ indicators show responses that are anti-phasic to those of the ER indicator.

      We agree that cytosolic and mitochondrial Ca2+ responses may be biphasic. The decrease below the abscissa in mitochondria and cytosol likely reflects Ca2+ extrusion from these compartments. However, our primary focus was on the initial uptake of Ca2+ following ER Ca2+ release. Thus, when calculating the area under the curve (AUC), we measured the area between the ∆F/F trace and y = 0 (the x-axis) for both mitochondria and cytosol. We reason that measuring the area under the curve (above the abscissa) fits our objective.

      While addressing your concerns, we noticed errors in the Y-axis labels of Fig. 3C, 4D, and 5C. For the ER Ca2+ dynamics, we measured the area above the curve. These mistakes have now been corrected.
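
      The two area measurements described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: it assumes a ∆F/F trace sampled at known time points, and the function name `signed_areas` and the example trace are hypothetical. It applies the trapezoidal rule separately to the part of the trace above y = 0 (the quantity described for cytosol and mitochondria) and the part below y = 0 (the "area above the curve" described for the ER), splitting any segment that crosses the abscissa.

```python
def signed_areas(t, y):
    """Return (area_above_zero, area_below_zero) for a sampled trace y(t).

    Both areas are returned as non-negative numbers. A segment that crosses
    y = 0 is split at the linearly interpolated crossing point.
    """
    above = 0.0
    below = 0.0
    for i in range(len(t) - 1):
        dt = t[i + 1] - t[i]
        y0, y1 = y[i], y[i + 1]
        if y0 >= 0 and y1 >= 0:
            above += 0.5 * (y0 + y1) * dt        # trapezoid above the axis
        elif y0 <= 0 and y1 <= 0:
            below += -0.5 * (y0 + y1) * dt       # trapezoid below the axis
        else:
            # Sign change: the segment forms two triangles meeting at y = 0.
            frac = y0 / (y0 - y1)                # fraction of dt before the crossing
            first = 0.5 * abs(y0) * frac * dt
            second = 0.5 * abs(y1) * (1.0 - frac) * dt
            if y0 > 0:
                above += first
                below += second
            else:
                below += first
                above += second
    return above, below


# Hypothetical biphasic trace: rises, crosses the abscissa, then recovers.
time_s = [0.0, 1.0, 2.0, 3.0, 4.0]
delta_f = [0.0, 2.0, 1.0, -1.0, 0.0]
auc_above, auc_below = signed_areas(time_s, delta_f)
```

      Reporting only `auc_above` matches the stated focus on initial uptake; `auc_below` is what a net (signed) integral would subtract from it.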

      - Why are mitochondrial and cytosolic calcium not also demonstrated for Bmal1 KO astrocytes? 

      In two sets of experiments (Fig. 3 and Fig. 4), we demonstrated that the increase in cytosolic and mitochondrial Ca2+ aligns with ER Ca2+ release. Since there were no circadian time differences in ER Ca2+ release in the Bmal1 KO cultures, we concluded that it was unnecessary to measure Ca2+ levels in the mitochondria and cytosol. Additionally, our primary focus is on the ER Ca2+ response rather than the Ca2+ dynamics in subcellular organelles. We hope this clarifies our rationale and maintains the focus of our study.

      - The authors claim that Herp acts by regulating the degradation of ITPRs but this hypothesis - rather central to the mechanisms proposed in this study - is not experimentally substantiated. 

      We appreciate the reviewer’s insightful comments regarding the role of HERP in the degradation of IP3Rs. In the original manuscript, we demonstrated that treating cells with Herp siRNA leads to an increase in the levels of ITPR1 and ITPR2, suggesting that HERP might be involved in the regulation of IP3Rs stability. This observation is consistent with previous studies, which showed that Herp siRNA treatment increases ITPR levels in HeLa and cardiac cells (Paredes et al., 2016; Torrealba et al., 2017). Torrealba et al. also showed that HERP regulates the polyubiquitination of IP3Rs. Based on our results and previous reports, we hypothesized that HERP similarly regulates ITPR degradation in cultured astrocytes.

      However, as the reviewer rightly pointed out, further evidence is needed to confirm that HERP specifically regulates ITPR degradation. To address this, we conducted new experiments examining the effect of XesC, an inhibitor of IP3Rs, on ER Ca2+ release. XesC treatment reduced ER Ca2+ release and abolished the enhancement of ER Ca2+ release by Herp KD. These results demonstrate that HERP influences the ER Ca2+ response through IP3Rs. These new findings have been added to Fig. 3N – 3P and explained in the Results section (lines 217-221).

      We believe these additional experiments and clarifications strengthen our hypothesis that HERP regulates IP3R degradation, thereby modulating ER Ca2+ responses.

      - There is no clear demonstration of the functional relevance of the circadian rhythms of ATP-mediated calcium signaling.

      As mentioned in the previous response, we examined Cx43 phosphorylation linked with GJC conductance in the context of ATP-mediated Ca2+ signaling. Our results demonstrated circadian variations in Cx43 Ser368 phosphorylation leading to variations in gap junction channel (GJC) conductance (Fig. 6C – F and Fig. 7D - I). We have discussed the significance of this circadian rhythm in ATP-driven ER Ca2+ signaling for astrocytic function during sleep/wake states in the manuscript (lines 357 – 382) as follows.

      “ATP-stimulated Cx43 (S368) phosphorylation is higher at 30hr (subjective night phase) than at 42hr (subjective day phase) (Fig. 6C and 6D), a finding further supported by in vivo experiments showing higher pCx43(S368) levels in the prefrontal cortex during the subjective night than during the day (Fig. 6E and 6F). What are the implications of this day/night variation in Cx43 (S368) phosphorylation? We reasoned that the circadian variation in Cx43 phosphorylation could significantly impact astrocyte functionality within the syncytium. Indeed, our cultured astrocytes exhibited circadian phase-dependent variation in gap junctional communication (Fig. 7D – 7F). Astrocytes influence synaptic activity through the release of gliotransmitters such as glutamate, GABA, D-serine, and ATP, triggered by increases in intracellular Ca2+ in response to the activity of adjacent neurons and astrocytes (Verkhratsky & Nedergaard, 2018). Importantly, this increase in Ca2+ spreads to adjacent astrocytes through GJCs (Fujii et al., 2017), influencing a large area of the neuronal network. Considering that Cx43 Ser368 phosphorylation occurs to uncouple specific pathways in the astrocytic syncytium to focus local responses (Enkvist & McCarthy, 1992), our findings suggest that astrocytes are better equipped for localized responses when presented with a stimulus during the active phase in mice. Conversely, during the rest period, characterized by more synchronous neuronal activity across broad brain areas (Vyazovskiy et al., 2009), higher GJC conductance might allow astrocytes to exert control over a larger area. In support of this idea, a recent study showed that synchronized astrocytic Ca2+ activity advances the slow wave activity (SWA) of the brain, a key feature of non-REM sleep (Szabó et al., 2017). Blocking GJC was found to reduce SWA, further supporting this interpretation. However, conflicting findings have also been reported. For instance, Ingiosi et al. (2020) found that astrocytic synchrony was higher during wakefulness than sleep in the mouse frontal cortex. Whether these differing results in astrocyte synchrony during resting and active periods are attributable to differences in experimental context (e.g., brain regions, sleep-inducing condition) remains unclear. Indeed, astrocyte Ca2+ dynamics during wakefulness/sleep vary according to brain regions (Tsunematsu et al., 2021). While the extent of astrocyte synchrony might differ depending on brain region and/or stimulus, our results suggest that the baseline state of astrocyte synchrony, which is affected by GJC conductance, varies with the day/night cycle.”

      Reviewer #2 (Public Review): 

      Summary: 

      The article entitled "Circadian regulation of endoplasmic reticulum calcium response in mouse cultured astrocytes" submitted by Ryu and colleagues describes the circadian control of astrocytic intracellular calcium levels in vitro. 

      Strengths: 

      The authors used a variety of technical approaches that are appropriate 

      We appreciate the reviewer’s acknowledgement of the strengths of our manuscript.

      Weaknesses: 

      Statistical analysis is poor and could lead to a misinterpretation of the data 

      Thank you for the comment. We have carefully reviewed our statistical analyses and applied appropriate methods where necessary. Please see below for the specific revisions and improvements made.

      For Fig. 2D-E, we initially used a t-test. However, after adding more replicates and conducting a normality test, we found that the data did not follow a normal distribution. Therefore, we switched to the Mann-Whitney U test. In Fig. 5D-E, we originally used a repeated measures two-way ANOVA, but we have now changed it to a standard two-way ANOVA. For Fig. 7C and I, the normality test also indicated a non-normal distribution, and we consequently replaced the t-test with the Mann-Whitney U test. For other analyses not specifically mentioned, normality tests confirmed a normal distribution, allowing us to use t-tests or ANOVA as appropriate.
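
      For readers unfamiliar with the Mann-Whitney U test mentioned above, the following is a minimal pure-Python sketch of its exact two-sided form. It is not the software used for the manuscript; the function names and the replicate values are hypothetical, and full enumeration is only practical for the small group sizes typical of such experiments.

```python
from itertools import combinations


def u_statistic(a, b):
    """Count pairs (x in a, y in b) with x > y; ties count as one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def mann_whitney_exact(a, b):
    """Return (U, two-sided exact p) by enumerating every relabelling of the pooled data."""
    pooled = list(a) + list(b)
    n = len(a)
    center = n * len(b) / 2.0                  # expected U under the null hypothesis
    u_obs = u_statistic(a, b)
    observed = abs(u_obs - center)
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        grp_a = [pooled[i] for i in chosen]
        grp_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count relabellings at least as far from the centre as the observed split.
        if abs(u_statistic(grp_a, grp_b) - center) >= observed - 1e-12:
            extreme += 1
    return u_obs, extreme / total


# Hypothetical normalized band intensities, four biological replicates per group.
control = [1.0, 2.0, 3.0, 4.0]
knockdown = [5.0, 6.0, 7.0, 8.0]
u_value, p_value = mann_whitney_exact(control, knockdown)
```

      With four replicates per group and complete separation, only 2 of the 70 possible relabellings are as extreme as the observed one, so the smallest attainable two-sided exact p-value is 2/70 (about 0.029).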

      Several conceptual issues have been identified. 

      We have addressed the reviewer’s concerns. Please see our detailed point-by-point responses below.

      Overinterpretation of the data should be avoided. This is a mechanistic paper done completely in vitro, all references to the in vivo situation are speculative and should be avoided. 

      We appreciate the reviewer’s insightful comment. Following the reviewer’s suggestion, we have removed the interpretations of GO pathways in an in vivo context.

      Reviewer #3 (Public Review): 

      Astrocyte biology is an active area of research and this study is timely and adds to a growing body of literature in the field. The RNA-seq, Herp expression, and Ca2+ release data across wild-type, Bmal1 knockout, and Herp knockdown cellular models are robust and lend considerable support to the study's conclusions, highlighting their importance. Despite these strengths, the manuscript presents a gap in elucidating the dynamics of HERP and the involvement of ITPR1/2 in modulating Ca2+ release patterns and their circadian variations, which remains insufficiently supported and characterized. While the Connexin data underscore the importance of rhythmic Ca2+ release triggered by ATP, the relationship here appears correlational and the role of HERP and ITPR in Cx function remains to be characterized. Moreover, enhancing the manuscript's clarity and readability could significantly benefit the presentation and comprehension of the findings. 

      We appreciate the reviewer’s acknowledgement of the strengths of our manuscript. Regarding the identified gaps, we have conducted several new experiments to clearly demonstrate the HERP-ITPR-Cx phosphorylation axis. Please see our detailed point-by-point responses below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      - While HERP appears to be a clock-controlled gene and its protein levels appear to demonstrate rhythmicity as well, the data quality of the western blotting in Bmal1 knockout raises some concern about the accuracy of HERP protein quantification. 

      We understand the reviewer’s concern regarding the proximity of the HERP band to a nonspecific band in the Western blotting for the Bmal1 knockout. However, we took great care to ensure the accuracy of our HERP band quantification. We selected only the specific HERP band and excluded the nonspecific band. Therefore, we are confident in the accuracy of our HERP protein measurements.

      - If HERP is rhythmic and ITPRs are not, if their model is correct, might we expect HERP suppression to result in 'unmasking' an ITPR rhythm? 

      Our model suggests that both HERP and ITPRs are rhythmic, with HERP regulating the degradation of ITPR proteins and driving their rhythms. Consistent with this, we observed day/night variations in ITPR2 levels (Fig. 4N and 4O). Therefore, we concluded that circadian variations in HERP are sufficient to drive ITPR2 rhythms. We have explained this in detail in the Results section (lines 236-241) and the Discussion section (lines 324-332).

      - The authors make a rather abrupt switch to examining gap junctions and connexin 43 phosphorylation. While the data demonstrating that the phosphorylation of S368 may indeed be rhythmic - the authors do not connect these data to the rest of the manuscript by showing a connection to HERP-mediated calcium signaling, limiting the coherence of the narrative. 

      We thank the reviewer for the insightful comments. To address the reviewer's concern regarding the connection between Herp and the phosphorylation of Cx43 at S368, we conducted new experiments to test whether KD of Herp abolishes the rhythms of Cx43 phosphorylation at S368. We found that the phosphorylation of Cx43 at S368 is significantly enhanced at 30hrs post sync compared with 42hrs post sync in control siRNA-treated astrocytes, consistent with our previous results (Fig. 6C & 6D). In contrast, this circadian phase-dependent difference in phosphorylation was abolished in Herp siRNA-treated astrocytes. These results clearly indicate that circadian variations in Cx43 phosphorylation are driven by HERP. These new results are now included in Fig. 6G and 6H and explained in the Results section (lines 276-281).

      - Comment on data presentation: the authors repeatedly present histograms with attached lines between data points - from my understanding of the experiments, this is inappropriate unless these were repeated measures from the same cells. Otherwise, the lines connecting one data point to another between different conditions (e.g., Ctrl or Herp knockdown) are arbitrary and possibly misleading (i.e., Figure 3K, 3M, 4L, 6D). 

      We thank the reviewer for this comment. We have removed the lines connecting data points in the relevant figures (Fig. 3K, 3M, 4N, and 6D).

      Reviewer #2 (Recommendations For The Authors): 

      Most of the suggestions of this reviewer are related to the conceptual interpretation and presentation of the data and to the statistical analysis 

      In Figure 1 the authors analyzed the rhythmic transcriptome of cortical astrocytes synchronized with a serum shock in two different ways. The authors need to discuss what is the difference between the two methods used to detect rhythmic transcripts and make sense of them. 

      Following the reviewer’s suggestion, we have provided a more detailed explanation of MetaCycle and BioCycle, as well as the rationale for using both packages in our analysis, as follows: “Various methods have been used to identify periodicity in time-series data, such as Lomb-Scargle (Glynn et al., 2006), JTK_CYCLE (Hughes et al., 2010) and ARSER (Yang & Su, 2010), each with distinct advantages and limitations. MetaCycle integrates these three methods, facilitating the evaluation of periodicity in time-series data without requiring the selection of an optimal algorithm (Wu et al., 2016). Additionally, BioCycle has been developed using a deep neural network trained with extensive synthetic and biological time series datasets (Agostinelli et al., 2016). Because MetaCycle and BioCycle identify periodic signals based on different algorithms, we applied both packages to identify periodicity in our time-series transcriptome data. BioCycle and MetaCycle analyses detected 321 and 311 periodic transcripts, respectively (FDR corrected, q-value < 0.05) (Fig. 1B). Among these, 220 (53.4%) were detected by both methods, but many transcripts did not overlap. MetaCycle is known for its inability to detect asymmetric waveforms (Mei et al., 2020). In our analysis, genes with increasing waveforms like Adora1 and Mybph were identified as rhythmic only by BioCycle, while Plat and Il34 were identified as rhythmic only by MetaCycle (Fig. S1C). Despite these discrepancies, the clear circadian rhythmic expression profiles of these genes led us to conclude that using the union of the two lists compensates for the limitations of each algorithm.”

      Please refer to lines 105-117 in the Results section.
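
      The overlap figures quoted above can be checked with a few lines of arithmetic. The counts (321, 311, 220) come from the quoted passage; reading 53.4% as the shared fraction of the union of both lists is an inference from the arithmetic, not something the passage states, and the function name is hypothetical.

```python
def overlap_summary(n_first, n_second, n_shared):
    """Summarize the overlap between two hit lists from their sizes."""
    union = n_first + n_second - n_shared      # inclusion-exclusion
    return {
        "union": union,
        "only_first": n_first - n_shared,
        "only_second": n_second - n_shared,
        "shared_pct_of_union": round(100.0 * n_shared / union, 1),
    }


# Counts quoted in the response: BioCycle 321, MetaCycle 311, shared 220.
summary = overlap_summary(321, 311, 220)
```

      Under this reading, the union contains 412 transcripts, of which 101 are BioCycle-only and 91 are MetaCycle-only, and 220 of 412 rounds to 53.4%.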

      The reasoning for comparing CT0 with the phase of the clock 8 hs after SS needs to be explained. Circadian time (CT) conceptually refers to the clock phase in the absence of entrainment cues in vivo, the direct transformation of "time after synchronization" in vitro to CT is misleading. 

      We thank the reviewer for the insightful comments. Initially, we believed that transforming time after serum shock (TASS) to CT, despite the data being in vitro, might provide a more intuitive and physiologically relevant interpretation of our results. However, we agree that this approach might be misleading. Following the reviewer’s suggestion, we have revised our terminology by changing “CT” to “Time post sync (hr)”. Nonetheless, for the circular peak phase map in Fig. 1F, we set 8hrs post sync to ZT0, based on the phase comparison in Fig. 1D, to allow a physiologically relevant interpretation. We hope these revisions clarify our approach.

      Moreover, also by definition a CT cannot be defined in terms of "dark" or "light". Figure 6M needs to be changed. 

      Following the reviewer’s suggestion, we removed the labels CT22 and CT34. Instead, we have labeled the respective periods as “30hr post sync” and “42hr post sync”.

      In Figure 1D, the authors present a gene ontology analysis that is certainly interesting, however, it should not be overinterpreted when trying to explain processes that take place only in vivo (e.g. wound repair). 

      Thank you for the insightful comment. Following the reviewer’s feedback, we have removed the paragraph interpreting the cell migration process in relation to wound repair and have focused instead on Ca2+ ion homeostasis.

      In Figure 2A the relative expression of clock genes and Herp is again misleading by a white/grey shading indicating subjective night and subjective day when the system under study is a cell culture. 

      We understand the reviewer’s concern that a cell culture system is not equivalent to light/dark entrainment conditions. However, we apply time-synchronizing stimuli to recapitulate in vivo entrainment. In addition, by comparing our data with CircaDB, we defined 8hrs post sync as corresponding to ZT0, thus aligning it with the beginning of the day. We have retained the shading to facilitate interpretation of our data in relation to the in vivo situation. However, in response to the reviewer’s concern, we have revised the shading from white/grey to light grey/dark grey. We hope this adjustment addresses the reviewer’s concern; if the reviewer still believes it is inappropriate, please let us know and we will gladly update it.

      In the Figure 2A legend, it is indicated that rhythmicity is assessed using MetaCycle with mean values obtained from n=2. The authors need to make clear whether this n=2 mean: 2 biological replicates or 2 technical replicates. This difference is relevant because it would make the analysis statistically valid or invalid, respectively. 

      Thank you for your feedback. n=2 refers to 2 biological replicates. Therefore, the analysis is statistically valid.

      In Figures 2C and D the authors applied a T-test, a parametric statistical test for one-to-one comparison that requires normality distribution of the data to be tested first. To test normality, the authors need at least 4 biological replicates. The suggestion of this reviewer is that these experiments have to be repeated and proper statistics applied. 

      Thank you for your feedback. In response to the reviewer's suggestion, we conducted additional experiments to increase the number of biological replicates to 4. After testing the normality of the data, we applied a t-test for Figure 2C and a Mann-Whitney test for Figures 2D and 2E. These tests confirmed statistically significant differences between groups.

      Further evidence of Bmal1-dependent control of HERP circadian expression authors could check the presence of E-Box elements in the Herp promoter. 

      We thank the reviewer for the insightful comment. In the Discussion section of the original manuscript, we mentioned the absence of a canonical E-Box upstream of the Herp gene. However, following the reviewer’s suggestion and considering the potential role of non-canonical E-Boxes, we conducted an additional analysis. This analysis identified several non-canonical E-Boxes within the 6 kb upstream region of the Herp gene (Table S2). Notably, we found that one non-canonical E-Box, “CACGTT,” which is known to regulate circadian expression (Yoo et al., 2005), is close to the transcription start site (chr8:94386194-94386543). Moreover, this element is evolutionarily conserved across various mammals, including humans, rats, mice, dogs, and opossums (see Author response image 2). Therefore, we reasoned that these non-canonical E-Boxes might drive the CLOCK/BMAL1-dependent expression of Herp. We have updated the Discussion to reflect these findings in lines 315-319.

      Author response image 2.
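
      A motif scan of the kind described above can be sketched as follows. This is an illustrative stand-in, not the authors' pipeline: the function names are hypothetical and the sequence is a made-up toy string, not the Herp upstream region. The sketch reports every occurrence of a motif on either strand by also searching for its reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (0-based start, strand) hits of motif on both strands."""
    seq = seq.upper()
    motif = motif.upper()
    patterns = [(motif, "+")]
    rc = reverse_complement(motif)
    if rc != motif:                    # a palindromic motif needs only one scan
        patterns.append((rc, "-"))
    hits = []
    for pattern, strand in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)   # allow overlapping hits
    return sorted(hits)


# Toy sequence (hypothetical) with one forward and one reverse-strand site.
toy_upstream = "GGCACGTTAAAACGTGCC"
noncanonical_hits = find_motif(toy_upstream, "CACGTT")   # motif named in the response
canonical_hits = find_motif(toy_upstream, "CACGTG")      # canonical E-box, palindromic
```

      Positions are offsets within the supplied string; converting them to genomic coordinates would require the start coordinate and strand of the extracted region, which are not assumed here.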

      The calcium experiments shown in Figures 3A-I, could be more convincing if the authors showed that the different Ca2+ sensors are compartment-specific by showing co-localization with a subcellular marker. In the pictures shown it is not even possible to recognize the cell dimensions. 

      Following the reviewer’s suggestion, we performed co-staining experiments with organelle-specific Ca2+ indicators and organelle markers. First, astrocytes were co-transfected with G-CEPIA1er, an ER-specific Ca2+ indicator, and ER-targeted DsRed2 (with the Calreticulin signal sequence). Live imaging analysis showed that the fluorescence intensities of G-CEPIA1er and DsRed2-ER-5 overlapped significantly in co-transfected cells. Second, astrocytes were transfected with Mito-R-GECO1, and Mitotracker, a cell-permeable mitochondrial dye, was applied. The fluorescence intensities of Mito-R-GECO1 and Mitotracker also overlapped significantly. These new data are included in Figure S4 and explained in the Results section (lines 194-195).

      Data analysis in Figure 3K and M is misleading. According to the explanations of the results, each of the experiments to assess ITPR1 or 2 is run independently. Then it is not clear why the relative levels obtained with control or Herp siRNA are plotted as pairs. Same comment as above for Figure 4L and Figure 6D. 

      We thank the reviewer for the insightful comments. Reviewer #1 raised a similar issue. Following the reviewers’ suggestions, we have removed the lines connecting the data points in Fig. 3K, 3M, 4L, and 6D.

      In Figure 5E the authors need to explain why they consider that repeated measures 2-way ANOVA is the right statistical test to apply. According to the explained experimental design, cells transfected, synchronized, and then harvested independently at the indicated time after synchronization. 

      Thank you for the reviewer’s insightful comment. Upon reviewing the statistical methods as suggested, we have revised our approach. Instead of using repeated measures 2-way ANOVA, we have now applied a standard 2-way ANOVA, which is more appropriate given the experimental procedures were independent, as the reviewer pointed out.

      The English language needs to be revised throughout the text. 

      We have thoroughly revised the English language throughout the text.

      Reviewer #3 (Recommendations For The Authors): 

      (1) Figure 3. Clarify the physiological importance of 100 µM ATP. Would the Herp rhythm warrant Ca2+ release rhythms under basal conditions? In 3J-K, the relatively weak effect of Herp knockdown on ITPR1/2 levels, albeit statistically significant, may not be physiologically significant. This calls into question the claimed Herp-ITPR axis that underlies the Ca2+ release phenotype. Further, the correlation certainly exists but further characterization of Herp KD cells would be required to address the mechanism. 

      As previously reported, a broad range of ATP concentrations can induce Ca2+ activity in astrocytes (Neary et al., 1988). Originally, we conducted an ATP dose-response analysis to observe ER Ca2+ release in our primary astrocyte culture. Our results show that ER Ca2+ release begins at 50 µM ATP and plateaus at 500 µM. Please refer to Author response image 3. We selected 100 µM ATP for our experiments because it induces a medium level of ER Ca2+ response. Importantly, although measuring ATP concentrations at the synapse in vivo is challenging (Tan et al., 2017), estimates suggest synaptic ATP concentrations range from 5-500 µM (Pankratov et al., 2006). Thus, 100 µM ATP is a physiologically relevant concentration that can affect nearby cells, including astrocytes, in the nervous system.

      Author response image 3.

      Cultured astrocytes were transfected with G-CEPIA1er ER and at 48hrs post transfection, cultured astrocytes were treated with various concentrations of ATP and Ca2+ imaging analysis was performed. (A) ΔF/F0 values over time following ATP application. (B) Area above curve values. Values in graphs are mean ± SEM (*p < 0.05, **p < 0.005, ***p < 0.0005, and ****p < 0.00005; one-way ANOVA).

      Regarding the comment on Ca2+ release rhythms under basal conditions, we interpret this as referring to Ca2+ release in the absence of a stimulus. We typically observe Ca2+ release only upon stimulation, such as ATP treatment. However, we acknowledge that the modest effects of HERP knockdown on ITPR1/2 levels could call into question the role of the HERP-ITPR axis in ER Ca2+ release.

      To address this, we analyzed whether the increases in ER Ca2+ release induced by Herp KD were mediated through ITPRs by treating cells with Xestospongin C (XesC), an IP3R inhibitor. XesC treatment reduced ATP-induced ER Ca2+ release and eliminated the differences in ER Ca2+ release between control and Herp KD astrocytes (Fig. 3N – 3P). These results clearly indicate that the HERP-ITPR axis plays a critical role in controlling ER Ca2+ release. These new experiments have been included in Fig. 3 and explained in the Results section (lines 217-221).

      Furthermore, following the reviewer’s suggestion, we examined whether HERP rhythms underlie the rhythms of the ER Ca2+ response by analyzing the ER Ca2+ response in Herp KD astrocytes at two different times following synchronization. In control astrocytes, ATP-induced ER Ca2+ responses varied depending on time, whereas these time-dependent variations were abolished in Herp KD astrocytes. These new experiments have been included in Fig. 4K – 4M and explained in the Results section (lines 232-235).

      Collectively, these results indicate that HERP rhythms lead to time-dependent differences in ER Ca2+ response through ITPRs.

      (2) Figure 4K-L. As data suggested the involvement of ITPR1 and ITPR2 (circadian effect), a reasonable next step is to determine their involvement, but the study did not pursue the hypothesis. 

      Thank you for your insightful comment. Our results indeed suggest that rhythms in ITPR2 levels may drive the time-dependent variations in ATP-induced ER Ca2+ release following synchronization. The newly conducted experiments demonstrated that treatment with the ITPR inhibitor XesC suppressed ATP-induced ER Ca2+ release under both control and Herp siRNA treatment conditions (Fig. 3). These findings further support the idea that rhythms of ITPR levels, specifically ITPR2, underlie the circadian variations in ER Ca2+ release. While examining the effect of ITPR2 siRNA would directly prove the involvement of ITPR2, we have decided to pursue this experiment in future studies.

      (3) Figure 5A-C. Data from WT cells should be included side by side with Bmal1-/- cells for comparison which is expected to be consistent with the HERP levels as in 5D-E. Again, the role of ITPR2 is suggested but not demonstrated. 

      Following the reviewer's suggestion, we conducted additional experiments including both WT and Bmal1-/- cultured astrocytes side by side. The results were consistent with our previous findings: WT astrocytes showed rhythms of ER Ca2+ release while Bmal1-/- astrocytes did not. We have updated Figure 5A – 5C and the corresponding Results section (lines 242-245) accordingly.

      Regarding the second comment, as mentioned in our previous response, we plan to examine the role of ITPR2 in further studies.

      (4) Figure 6. The Connexin data seems an addon and is correlative with the Ca2+ release. The role of Herp and Itpr in Connexin function is not addressed. Figure 6E-F was not called out in the results section. Suggest providing additional data to support the role of the HERP-ITPR axis in regulating Ca2+ release and Connexin activity. 

      We agree that additional data are needed to support the role of HERP in regulating Cx43 phosphorylation. Therefore, we have conducted further experiments to determine whether rhythms of Cx43 phosphorylation are regulated by HERP. In the control astrocytes, ATP treatment induced time-dependent variations in Cx43 phosphorylation. However, these rhythms were abolished in Herp KD astrocytes. These results indicate that rhythms in HERP levels contribute to the time-dependent variations in Cx43 phosphorylation. These new experiments have been included in Fig. 6G and 6H and explained in the Results section (lines 276-281).

      Regarding the second comment, we have corrected our oversight by properly referencing Figures 6E-F in the Results section. Please refer to lines 357-359 for clarification.

      (5) Discussion. This section should focus on noteworthy points to discuss, not repeating the results. 

      Based on the reviewer's valuable suggestions, we have revised the Discussion section to minimize repetition of the results. Thank you for your guidance.

      (6) The manuscript exhibits numerous grammatical and textual inaccuracies that necessitate careful revision by the authors. My observations here are confined to the title and the abstract alone. I recommend altering the title from "mouse cultured astrocytes" to "cultured mouse astrocytes" for clarity and grammatical correctness. The abstract, meanwhile, needs enhancements both in terms of its content and language. It should incorporate the results of the partitioning among the ER, cytoplasm, and mitochondria, and provide clear definitions for some of the critical terms used. It's worth noting that the abstract's second sentence contains a grammatical error. 

      Thank you for the reviewer’s valuable feedback. We have carefully revised the title, abstract, and main text to address the grammatical and textual issues. The title has been changed to “cultured mouse astrocytes”. Additionally, the abstract now includes results related to cytoplasmic Ca2+ dynamics and has been revised in several places. We appreciate your insights and have worked to enhance the content and language accordingly.

      References

      Agostinelli, F., Ceglia, N., Shahbaba, B., Sassone-Corsi, P., & Baldi, P. (2016). What time is it? Deep learning approaches for circadian rhythms. Bioinformatics, 32(12), i8-i17. https://doi.org/10.1093/bioinformatics/btw243

      Cahoy, J. D., Emery, B., Kaushal, A., Foo, L. C., Zamanian, J. L., Christopherson, K. S., Xing, Y., Lubischer, J. L., Krieg, P. A., Krupenko, S. A., Thompson, W. J., & Barres, B. A. (2008). A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function. J Neurosci, 28(1), 264-278. https://doi.org/10.1523/JNEUROSCI.4178-07.2008

      Carreras-Sureda, A., Pihán, P., & Hetz, C. (2018). Calcium signaling at the endoplasmic reticulum: fine-tuning stress responses. Cell Calcium, 70, 24-31. https://doi.org/10.1016/j.ceca.2017.08.004

      Enkvist, M. O., & McCarthy, K. D. (1992). Activation of protein kinase C blocks astroglial gap junction communication and inhibits the spread of calcium waves. J Neurochem, 59(2), 519-526. https://doi.org/10.1111/j.1471-4159.1992.tb09401.x

      Fujii, Y., Maekawa, S., & Morita, M. (2017). Astrocyte calcium waves propagate proximally by gap junction and distally by extracellular diffusion of ATP released from volume-regulated anion channels. Scientific Reports, 7(1), 13115. https://doi.org/10.1038/s41598-017-13243-0

      Giorgi, C., Marchi, S., & Pinton, P. (2018). The machineries, regulation and cellular functions of mitochondrial calcium. Nature Reviews Molecular Cell Biology, 19(11), 713-730. https://doi.org/10.1038/s41580-018-0052-8

      Glynn, E. F., Chen, J., & Mushegian, A. R. (2006). Detecting periodic patterns in unevenly spaced gene expression time series using Lomb-Scargle periodograms. Bioinformatics, 22(3), 310-316. https://doi.org/10.1093/bioinformatics/bti789

      Hughes, M. E., Hogenesch, J. B., & Kornacker, K. (2010). JTK_CYCLE: an efficient nonparametric algorithm for detecting rhythmic components in genome-scale data sets. J Biol Rhythms, 25(5), 372-380. https://doi.org/10.1177/0748730410379711

      Ingiosi, A. M., Hayworth, C. R., Harvey, D. O., Singletary, K. G., Rempe, M. J., Wisor, J. P., & Frank, M. G. (2020). A Role for Astroglial Calcium in Mammalian Sleep and Sleep Regulation. Curr Biol, 30(22), 4373-4383.e4377. https://doi.org/10.1016/j.cub.2020.08.052

      Mei, W., Jiang, Z., Chen, Y., Chen, L., Sancar, A., & Jiang, Y. (2020). Genome-wide circadian rhythm detection methods: systematic evaluations and practical guidelines. Briefings in Bioinformatics, 22(3). https://doi.org/10.1093/bib/bbaa135

      Neary, J. T., van Breemen, C., Forster, E., Norenberg, L. O., & Norenberg, M. D. (1988). ATP stimulates calcium influx in primary astrocyte cultures. Biochem Biophys Res Commun, 157(3), 1410-1416. https://doi.org/10.1016/s0006-291x(88)81032-5

      Pankratov, Y., Lalo, U., Verkhratsky, A., & North, R. A. (2006). Vesicular release of ATP at central synapses. Pflugers Arch, 452(5), 589-597. https://doi.org/10.1007/s00424-006-0061-x

      Paredes, F., Parra, V., Torrealba, N., Navarro-Marquez, M., Gatica, D., Bravo-Sagua, R., Troncoso, R., Pennanen, C., Quiroga, C., Chiong, M., Caesar, C., Taylor, W. R., Molgó, J., San Martin, A., Jaimovich, E., & Lavandero, S. (2016). HERPUD1 protects against oxidative stress-induced apoptosis through downregulation of the inositol 1,4,5-trisphosphate receptor. Free Radic Biol Med, 90, 206-218. https://doi.org/10.1016/j.freeradbiomed.2015.11.024

      Szabó, Z., Héja, L., Szalay, G., Kékesi, O., Füredi, A., Szebényi, K., Dobolyi, Á., Orbán, T. I., Kolacsek, O., Tompa, T., Miskolczy, Z., Biczók, L., Rózsa, B., Sarkadi, B., & Kardos, J. (2017). Extensive astrocyte synchronization advances neuronal coupling in slow wave activity in vivo. Scientific Reports, 7(1), 6018. https://doi.org/10.1038/s41598-017-06073-7

      Tan, Z., Liu, Y., Xi, W., Lou, H. F., Zhu, L., Guo, Z., Mei, L., & Duan, S. (2017). Glia-derived ATP inversely regulates excitability of pyramidal and CCK-positive neurons. Nat Commun, 8, 13772. https://doi.org/10.1038/ncomms13772

      Torrealba, N., Navarro-Marquez, M., Garrido, V., Pedrozo, Z., Romero, D., Eura, Y., Villalobos, E., Roa, J. C., Chiong, M., Kokame, K., & Lavandero, S. (2017). Herpud1 negatively regulates pathological cardiac hypertrophy by inducing IP3 receptor degradation. Sci Rep, 7(1), 13402. https://doi.org/10.1038/s41598-017-13797-z

      Tsunematsu, T., Sakata, S., Sanagi, T., Tanaka, K. F., & Matsui, K. (2021). Region-specific and state-dependent astrocyte Ca2+ dynamics during the sleep-wake cycle in mice. The Journal of Neuroscience, JN-RM-2912-2920. https://doi.org/10.1523/jneurosci.2912-20.2021

      Verkhratsky, A., & Nedergaard, M. (2018). Physiology of Astroglia. Physiol Rev, 98(1), 239-389. https://doi.org/10.1152/physrev.00042.2016

      Vyazovskiy, V. V., Olcese, U., Lazimy, Y. M., Faraguna, U., Esser, S. K., Williams, J. C., Cirelli, C., & Tononi, G. (2009). Cortical firing and sleep homeostasis. Neuron, 63(6), 865-878. https://doi.org/10.1016/j.neuron.2009.08.024

      Wu, G., Anafi, R. C., Hughes, M. E., Kornacker, K., & Hogenesch, J. B. (2016). MetaCycle: an integrated R package to evaluate periodicity in large scale data. Bioinformatics, 32(21), 3351-3353. https://doi.org/10.1093/bioinformatics/btw405

      Yang, R., & Su, Z. (2010). Analyzing circadian expression data by harmonic regression based on autoregressive spectral estimation. Bioinformatics, 26(12), i168-174. https://doi.org/10.1093/bioinformatics/btq189

      Yoo, S. H., Ko, C. H., Lowrey, P. L., Buhr, E. D., Song, E. J., Chang, S., Yoo, O. J., Yamazaki, S., Lee, C., & Takahashi, J. S. (2005). A noncanonical E-box enhancer drives mouse Period2 circadian oscillations in vivo. Proc Natl Acad Sci U S A, 102(7), 2608-2613. https://doi.org/10.1073/pnas.0409763102

    2. eLife Assessment

      This work describes circadian regulation of the expression of HERP, a regulator of endoplasmic reticulum calcium, in primary astrocytic cultures. This work is important because it highlights the potential importance of circadian rhythms in astrocytes, even though making a direct comparison between these rhythms in vitro and in vivo remains challenging. The technical approaches used in this work (RNA-seq, siRNA, Ca2+ imaging) provide solid support for the data interpretation.

    3. Reviewer #2 (Public review):

      Summary:

      The article by Ryu and colleagues describes the circadian control of astrocytic intracellular calcium levels in vitro.

      Strengths:

      The authors used a variety of appropriate technical approaches and have considerably improved the manuscript with additional experiments and more solid data analysis compared with the first version.

      Weaknesses:

      Some conceptual issues are still present. This is a mechanistic paper done completely in vitro; all references to the in vivo situation are speculative and should be avoided unless the authors are citing in vivo work.

    4. Reviewer #3 (Public review):

      This study provides significant insights into how the circadian clock influences astrocytic Ca2+ homeostasis. Astrocyte biology is an active area of research and this study is timely and adds to a growing body of literature in the field. This research highlights the potential importance of circadian rhythms in astrocytes, offering a new perspective on their role in central nervous system regulation.

    1. eLife Assessment

      The manuscript by Hills et al. presents data that support multiple conclusions regarding the gene expression patterns of cells, especially chemosensory neurons. The evidence is largely solid, with transcriptomic analysis combined with and validated by spatially resolved expression in tissue sections, but is incomplete in other ways, with some claims not fully supported. This large-scale single-cell transcriptomics dataset is an important resource, alongside a thorough exploration of the molecular features of the different cell types within the mouse vomeronasal organ, including expression of chemosensory receptors.

    2. Reviewer #1 (Public review):

      Summary:

      The authors comprehensively present data from single cell RNA sequencing and spatial transcriptomics experiments of the juvenile male and female mouse vomeronasal organ, with a particular emphasis on the neuronal populations found in this sensory tissue. The use of these two methods effectively maps the locations of relevant cell types in the vomeronasal organ at a level of depth beyond what is currently known. Targeted analysis of the neurons in the vomeronasal organ produced several important findings, notably the common co-expression of multiple vomeronasal type 1 receptors (V1Rs), vomeronasal type 2 receptors (V2Rs), and both V1R+V2Rs by individual neurons, as well as the presence of a small but noteworthy population of neurons expressing olfactory receptors (ORs) and associated signal transduction molecules. Additionally, the authors identify transcriptional patterns associated with neuronal development/maturation, producing lists of genes that can be used and/or further investigated by the field. Finally, the authors report the presence of coordinated combinatorial expression of transcription factors and axon guidance molecules associated with multiple neuronal types, providing the framework for future studies aimed at understanding how these patterns relate to the complex glomerular organization in the accessory olfactory bulb. Several of these conclusions have been reached by previous studies, partially limiting the overall impact of the current work. However, when combined, these results provide important insights into the cellular diversity in the vomeronasal organ that are likely to support multiple future studies of the vomeronasal system.

      Strengths:

      The comprehensive analysis of the data provides a wealth of information for future research into vomeronasal organ function. The targeted analysis of neuronal gene transcription demonstrates the co-expression of multiple receptors by individual neurons, and confirms the presence of a population of OR-expressing neurons in the vomeronasal organ. Although many of these findings have been noted by others, the depth of analysis here validates and extends prior findings in an effective manner. The use of spatial transcriptomics to identify the locations of specific cell types is especially useful and produces a template for the field's continued research into the various cell types present in this complex sensory tissue. Overall, the manuscript's biggest strength is found in the richness of the data presented, which will not only support future work in the broader field of vomeronasal system function but also provide insights to others studying complex sensory tissues.

      Weaknesses:

      The inherent weaknesses of single cell RNA sequencing studies based on the 10x Genomics platforms (need to dissociate tissues, limited depth of sequencing, etc.) are acknowledged. However, the authors document their extensive attempts to avoid making false positive conclusions through the use of software tools designed for this purpose. Because of its complexity, there are some portions of the manuscript where the data are difficult to interpret as presented, but this is a relatively minor weakness. The data resulting from the use of the Resolve Biosciences spatial transcriptomics platform are somewhat difficult to interpret because the methods are proprietary and presented in an opaque manner. That said, the resulting data provide useful links between transcriptional identities and cellular locations, which is not possible without the use of such tools.

    3. Reviewer #2 (Public review):

      In their paper entitled "Molecular, Cellular, and Developmental Organization of the Mouse Vomeronasal Organ at Single Cell Resolution", Hills Jr. et al. perform single-cell transcriptomic profiling and analyze tissue distribution of a large number of transcripts in the mouse vomeronasal organ (VNO). The use of these complementary tools provides a robust approach to investigating many aspects of vomeronasal sensory neuron (VSN) biology based on transcriptomics. Harnessing the power of these techniques, the authors present the discovery of previously unidentified sensory neuron types in the mouse VNO. Furthermore, they report co-expression of chemosensory receptors from different clades on individual neurons, including the co-expression of VRs and ORs. Finally, they evaluate the correlation between transcription factor expression and putative surface axon guidance molecules during the development of different neuronal lineages. Based on this correlation analysis, the authors further propose a putative cascade of events that could give rise to different neuronal lineages and morphological organization.

      We appreciate the authors' efforts to add context and citations that relate to recent single cell RNA sequencing studies in the VNO as well as to studies on vomeronasal receptors co-expression and V1R/V2R lineage determination. We also appreciate the new details on the marker genes used for cell annotation as well as clarifications about the differences between juvenile versus adult or male versus female samples.

      A remaining concern is that two major claims/interpretations - i.e., identification of canonical OSNs and of a novel sVSN type in the mouse VNO - either require experimental substantiation or should be toned down. In their response, Hills Jr. et al. acknowledge that their "paper is primarily intended as a resource paper to provide access to a large-scale single-cell RNA-sequenced dataset and discoveries based on the transcriptomic data that can support and inspire ongoing and future experiments in the field." The authors also write that given "the limited number of genes that we can probe using Molecular Cartography, the number of genes associated with sVSNs may be present in the non-sensory epithelium. This could lead to the identification of cells that may or may not be identical to the sVSNs in the non-neuronal epithelium. Indeed, further studies will need to be conducted to determine the specificity of these cells." Moreover, Hills Jr. et al. acknowledge that as "any transcriptomic study will only be correlative, additional studies will be needed to unequivocally determine the mechanistic link between the transcription factors with receptor choice. Our model provides a basis for these studies." We agree with all these points. Importantly, in the revised manuscript, the authors do not acknowledge that their primary intention is to present "a resource paper to provide access to a large-scale single-cell RNA-sequenced dataset", nor do they acknowledge any of the other caveats/limitations mentioned above. We believe that the authors should not only mention these aspects in their response to the reviews, but should also make these intentions/caveats/limitations very clear in the manuscript text.

    4. Reviewer #3 (Public review):

      This study presents a detailed examination of the molecular and cellular organization of the mouse VNO, unveiling new cell types, receptor co-expression patterns, lineage specification regulation, and potential associations between transcription factors, guidance molecules, and receptor types crucial for vomeronasal circuitry wiring specificity. The study identifies a novel type of VSN, molecularly different from classic VSNs, which may serve as an accessory to other VSNs by secreting olfactory binding proteins and mucins in response to VNO activation. The authors also describe a previously undetected co-expression of multiple VRs in individual VSNs, offering an interesting perspective on the ongoing discussion of whether receptor choice in VSNs is stochastic or deterministic. Finally, the study correlates the expression of axon guidance molecules with individual VRs, providing a putative molecular mechanism that specifies VSN axon projections and their connection with postsynaptic cells in the accessory olfactory bulb.

      The conclusions of this paper are well supported by data, but some aspects of data analysis and acquisition need to be clarified and extended.

      (1) The authors claim that they have identified two new classes of sensory neurons, one being a class of canonical olfactory sensory neurons (OSNs) within the VNO. This classification as canonical OSNs is based on expression data of neurons lacking the V1R or V2R markers but instead expressing ORs and signal transduction molecules, such as Gnal and Cnga2. Since OR-expressing neurons in the VNO have been previously described in many studies, it remains unclear to me why these OR-expressing cells are considered here a "new class of OSNs." Moreover, morphological features, including the presence of cilia, and functional data demonstrating the recognition of chemosignals by these neurons, are still lacking to classify these cells as OSNs akin to those present in the MOE. While these cells do express canonical markers of OSNs, they also appear to express other VSN-typical markers, such as Gnao1 and Gnai2 (Fig 2B), which are less commonly expressed by OSNs in the MOE. Therefore, it would be more precise to characterize this population as atypical VSNs that express ORs, rather than canonical OSNs.

      (2) The second new class of sensory neurons identified corresponds to a group of VSNs expressing prototypical VSN markers (including V1Rs, V2Rs, and ORs), but exhibiting lower ribosomal gene expression. Clustering analysis reveals that this cell group is relatively isolated from V1R- and V2R-expressing clusters, particularly those comprising immature VSNs. The question then arises: where do these cells originate? Considering their fewer overall genes and lower total counts compared to mature VSNs, I wonder if these cells might represent regular VSNs in a later developmental stage, i.e., senescent VSNs. While the secretory cell hypothesis is compelling and supported by solid data, it could also align with a late developmental stage scenario. Further data supporting or excluding these hypotheses would aid in understanding the nature of this new cell cluster, with a comparison between juvenile and adult subjects appearing particularly relevant in this context.

      (3) The authors' decision not to segregate the samples according to sex is understandable, especially considering previous bulk transcriptomic and functional studies supporting this approach. However, many of the highly expressed VR genes identified have been implicated in detecting sex-specific pheromones and triggering dimorphic behavior. It would be intriguing to investigate whether this lack of sex differences in VR expression persists at the single-cell level. Regardless of the outcome, understanding the presence or absence of major dimorphic changes would hold broad interest in the chemosensory field, offering insights into the regulation of dimorphic pheromone-induced behavior. Additionally, it could provide further support for proposed mechanisms of VR receptor choice in VSNs.

      (4) The expression analysis of VRs and ORs seems to have been restricted to the cell clusters associated with the neuronal lineage. Are VRs/ORs expressed in other cell types, e.g., sustentacular cells, HBCs, or other cells?

      Review update:

      I believe the novel discovery of two classes of sensory neurons within the VNO, canonical olfactory sensory neurons (OSNs) and secretory vomeronasal sensory neurons (sVSNs), should be interpreted with caution. Firstly, these cell types are relatively rare, constituting less than 2% of total cells and only 2-6% of the neuronal population (according to Fig. S3). While the OSNs exhibit gene expression profiles consistent with canonical olfactory signal transduction and cilia-related gene ontology, key aspects such as their cell morphology (including the presence of cilia) and functional evidence for chemosignal detection have yet to be demonstrated. The neuronal lineage of sVSNs remains unclear to me. It is uncertain what developmental trajectories these cells follow: do they arise as a specialized subtype of V1R or V2R lineages, or do they have an independent lineage determination, similar to OSNs? At what stage does the commitment to the sVSN lineage begin: during the INP stage or the immature sensory neuron stage? A pseudotime inference analysis of sVSNs could help clarify these questions.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      …several previous studies have identified co-expression of vomeronasal receptors by vomeronasal sensory neurons, and the expression of non-vomeronasal receptors, and this was not adequately addressed in the manuscript as presented.

      We’ve added context and citations to the Introduction and Results sections relating to recent studies on the co-expression of vomeronasal receptors and the expression of non-vomeronasal receptors in VSNs.

      The data resulting from the use of the Resolve Biosciences spatial transcriptomics platform are somewhat difficult to interpret, and the methods are somewhat opaque.

      The Molecular Cartography platform relies on multiplex imaging of fluorescent probes that bind specifically to individual gene transcripts to determine their spatial location. Unfortunately, the detailed protocols remain proprietary at Resolve Biosciences and were not disclosed. We have clarified this in the revised manuscript. Our role in the acquisition and processing of data for this experiment is included in the current Methods section. Additional analyses of the Molecular Cartography data have been added to the supplemental materials (see response to Reviewer #2, below) to help clarify interpretation of the results.

      Reviewer #2:

      …the authors present a biased report of previously published work, largely including only those results that do not overlap with their own findings, but ignoring results that would question the novelty of the data presented here.

      We had no intention of misleading the readers. In fact, we have discussed discrepancies between our results and those of other studies. However, we inadvertently left out a critical publication in preparing the manuscript. We have added context and citations relating to recent studies that use single cell RNA sequencing in the vomeronasal organ, studies relating to the co-expression of vomeronasal receptors, and studies discussing V1R/V2R lineage determination. In the Discussion, we also compare our model with a previous model of genetic determination of VNO neuronal fate.

      Did the authors perform any cell selectivity, or any directed dissection, to obtain mainly neuronal cells? Previous studies reported a greater proportion of non-neuronal cells. For example, while Katreddi and co-workers (ref 89) found that the most populated clusters are identified as basal cells, macrophages, pericytes, and vascular smooth muscle, Hills Jr. et al. in this work did not report such types of cells. Did the authors check for the expression of marker genes listed in Ref 89 for such cell types?

      For VNO dissections, we removed bones and blood vessels from VNO tissue and only kept the sensory epithelium. This procedure removed vascular smooth muscle cells, pericytes, and other non-neuronal cell types, which explains differences in cell proportions between our study and previous studies. We used a DAPI/Draq5 assay to sort live/nucleated cells for sequencing and no specific markers were used for cell selection. All cells in the experiment were successfully annotated using the cell-type markers shown in Fig. 1B, save for cells from the sVSN cluster, which were novel, and required further analysis to characterize.

      The authors should report the marker genes used for cell annotation.

      Marker genes used for cell annotation are shown in Figure 1B. A full list of all marker genes used in the cell annotation process has been added to the Methods section.

      The authors reported no differences between juvenile and adult samples, and between male and female samples. It is not clear how they evaluate statistically significant differences, which statistical test was used, or what parameters were evaluated.

      The claims made about male/female mice and P14/P56 mice pertain directly to the distribution of clusters and cells in UMAP space, as seen in Figure 1C and D. We have performed differential gene expression analysis for male/female and P14/P56 comparisons using the FindMarkers function from the Seurat R package. Although we found significant differential expression between male and female animals, and between P14 and P56 animals, the genes in these lists do not appear to influence neuronal lineage and cell-type specification, nor are they related to cell adhesion molecules, which are the main focuses of this study. Nevertheless, we have added these results to the supplemental materials.

      ‘Based on our transcriptomic analysis, we conclude that neurogenic activity is restricted to the marginal zone.’ This conclusion is quite a strong statement, given that this study was not directed to carefully study neurogenesis distribution, and when neurogenesis in the basal zone has been proposed by other works, as stated by the authors.

      We used fourteen slides of whole VNO sections in our Molecular Cartography analysis to quantify the number of GBCs, INPs, and iVSNs predicted in the marginal zone, the intermediate zone, and the main/medial zone. We performed a Wilcoxon signed-rank test to determine whether GBCs, INPs, and iVSNs are significantly more abundant in the marginal zone than in the main/medial zone. The results are included in the new Figure S3 and justify our claim that neurogenesis is restricted to the marginal zone (MZ). This claim is also supported by the 2021 study by Katreddi & Forni.
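
      To make the paired comparison described above concrete, the following is a minimal sketch of a one-sided exact Wilcoxon signed-rank test in plain Python. It is an illustration only, not the authors' analysis code: the per-slide counts, the variable names, and the choice of a one-sided exact test are all assumptions made for the example.

```python
from itertools import groupby


def wilcoxon_signed_rank_greater(x, y):
    """One-sided exact Wilcoxon signed-rank test of H1: x tends to exceed y.

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences receive average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)

    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    pos = 0
    for _, group in groupby(order, key=lambda i: abs(diffs[i])):
        idx = list(group)
        avg = pos + (len(idx) + 1) / 2  # mean of ranks pos+1 .. pos+len(idx)
        for i in idx:
            ranks[i] = avg
        pos += len(idx)

    # Test statistic: sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Exact null distribution: each rank enters the sum with probability 1/2.
    # Ranks are doubled so that averaged (half-integer) ranks stay integral.
    counts = {0: 1}
    for r in (int(round(2 * r)) for r in ranks):
        nxt = {}
        for s, c in counts.items():
            nxt[s] = nxt.get(s, 0) + c
            nxt[s + r] = nxt.get(s + r, 0) + c
        counts = nxt

    target = int(round(2 * w_plus))
    tail = sum(c for s, c in counts.items() if s >= target)
    return w_plus, tail / 2 ** n


# Hypothetical counts of predicted progenitor cells per slide (14 slides).
marginal_zone = [12, 9, 15, 11, 8, 14, 10, 13, 9, 12, 16, 7, 11, 10]
medial_zone = [3, 4, 2, 5, 1, 6, 2, 3, 4, 2, 5, 3, 1, 4]

w_plus, p_value = wilcoxon_signed_rank_greater(marginal_zone, medial_zone)
print(f"W+ = {w_plus}, one-sided exact p = {p_value:.2e}")
```

      The test is paired because each slide contributes one count per zone; with only fourteen slides, an exact null distribution avoids relying on a normal approximation.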

      The authors report at least two new types of sensory neurons in the mouse VNO, a finding of huge importance that could have a substantial impact on the field of sensory physiology. However, the evidence for such new cell types is based solely on this transcriptomic dataset and, as such, is quite weak, since many crucial morphological and physiological aspects would be missing to clearly identify them as novel cell types. As stated before, many control and confirmatory experiments, and a careful evaluation of the results presented in this work must be performed to confirm such a novel and interesting discovery. The reported "novel classes of sensory neurons" in this work could represent previously undescribed types of sensory neurons, but also previously reported cells (see below) or simply possible single-cell sequencing artefacts.

      The reviewer is correct that detailed morphological and physiological studies are needed to further understand these cells. This is an opinion we share. Our paper is primarily intended as a resource paper to provide access to a large-scale single-cell RNA-sequenced dataset and discoveries based on the transcriptomic data that can support and inspire ongoing and future experiments in the field. Nonetheless, we are confident that neither of the novel cell clusters is the result of sequencing artefacts. We performed a robust quality-control protocol, including count correction for ambient RNA with the R package SoupX, multiplet detection and removal with the Python module Scrublet, and a strict 5% mitochondrial gene expression cut-off. Furthermore, the cell clusters in question show no signs of being the result of sequencing artefacts, as they are physically connected in a reasonable orientation to the rest of the neuronal lineage in modular clusters in 2D and 3D UMAP space. The OSN and sVSN cell clusters each show distinct and self-consistent gene expression (new Figure S4H). Gene ontology (GO) analysis reveals significant GO term enrichment for both the sVSN (Fig. 2G) and mOSN clusters when compared to mature V1R and V2R VSNs, indicating functional differences. We have performed pseudotime analysis of sVSNs, as well as differential gene expression and gene ontology analyses of mOSNs. The results are shown in the new Figure S6.
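
      As an illustration of the mitochondrial cut-off step mentioned above, the sketch below filters cells by the fraction of counts assigned to mitochondrial genes. It is not the authors' pipeline: the cell identifiers and counts are hypothetical, the inclusive threshold is an assumption, and the SoupX and Scrublet steps are deliberately not reproduced here.

```python
def mito_fraction(counts):
    """Fraction of a cell's total counts that come from mitochondrial genes.

    Assumes mouse gene symbols, where mitochondrial genes carry the "mt-" prefix.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    mito = sum(n for gene, n in counts.items() if gene.lower().startswith("mt-"))
    return mito / total


def filter_cells(cells, max_mito_fraction=0.05):
    """Keep cells whose mitochondrial fraction is at or below the threshold.

    Whether the boundary value itself is kept is an assumption of this sketch.
    """
    return {
        cell_id: counts
        for cell_id, counts in cells.items()
        if mito_fraction(counts) <= max_mito_fraction
    }


# Hypothetical per-cell count tables (gene symbol -> count).
cells = {
    "cell_a": {"Omp": 40, "Gnai2": 57, "mt-Co1": 3},   # 3% mitochondrial
    "cell_b": {"Omp": 10, "mt-Co1": 20, "mt-Nd1": 10},  # 75% mitochondrial
}

kept = filter_cells(cells)
print(sorted(kept))
```

      A high mitochondrial fraction is commonly treated as a sign of a damaged or dying cell, which is why such cells are excluded before clustering.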

      The authors report the co-expression of V2R and Gnai2 transcripts based on sequencing data. That could dramatically change classical classifications of basal and apical VSNs. However, did the authors find support for this co-expression in spatial molecular imaging experiments?

      Genes with extremely high expression levels overwhelm signals from other genes, and therefore had to be removed from the experiment. This is a limitation of the Molecular Cartography platform. Unfortunately, Gnai2 was determined to be one of these genes and was not evaluated for this purpose.

      Canonical OSNs: The authors report a cluster of cells expressing neuronal markers and ORs and call them canonical OSN. However, VSNs expressing ORs have already been reported in a detailed study showing their morphology and location inside the sensory epithelium (References 82, 83). Such cells are not canonical OSNs since they do not show ciliary processes, they express TRPC2 channels and do not express Golf. Are the "canonical OSNs" reported in this study and the OR-expressing VSNs (ref 82, 83) different? Which parameters, other than Gnal and Cnga2 expression, support the authors' bold claim that these are "canonical OSNs"? What is the morphology of these neurons? In addition, the mapping of these "canonical OSNs" shown in Figure 2D paints a picture of the negligible expression/role of these cells (see their prediction confidence).

      We observe OR expression in VSNs in our data; these cells cluster with VSNs. The putative mOSN cluster exhibits its own trajectory, distinct from the VSN clusters. These cells express Gnal (Golf), which is not expressed in VSNs expressing ORs, nor in any other cell type in the data. After performing differential gene expression analysis on the putative mOSN cluster, comparing it with V1R and V2R VSNs independently, GO analysis returned ‘cilium’ as the top significantly enriched GO cellular component term. This new piece of data is presented in the updated Figure S6. Because we were limited to a list of 100 genes in the Molecular Cartography probe panel, we prioritized the detection of canonical VNO cell types, vomeronasal receptor co-expression, and the putative sVSNs, and were not able to include a robust analysis of the putative OSNs.

      Secretory VSN: The authors report another novel type of sensory neurons in the VNO and call them "secretory VSNs". Here, the authors performed an analysis of differentially expressed genes for neuronal cells (dataset 2) and found several differentially expressed genes in the sVSN cluster. However, it would be interesting to perform a gene expression analysis using the whole dataset including neuronal and non-neuronal cells. Could the authors find any marker gene that unequivocally identifies this new cell type?

      We did not find unequivocal marker genes for sVSNs. We did perform differential analysis of the sVSN cluster against the whole VNO data and against the neuronal subset, as well as against specific cell types. We could not find a single gene that was perfectly exclusive to sVSNs. We therefore used a combinatorial marker-gene approach to predict sVSNs in the Molecular Cartography data. This required a larger subset of our 100-gene panel to be dedicated to genes for detecting sVSNs.
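
      The general idea of a combinatorial marker rule can be sketched as follows. This is a hypothetical illustration only: the response does not specify the actual rule, so the function, its thresholds, and the placeholder gene names (`MarkerA` and so on) are all assumptions rather than the authors' method.

```python
def matches_marker_combination(detected, required, excluded=frozenset(), min_required=None):
    """Return True if a cell's detected genes satisfy a combinatorial rule.

    A cell matches when at least `min_required` of the `required` markers are
    detected and none of the `excluded` markers are. By default, all required
    markers must be present.
    """
    detected = set(detected)
    required = set(required)
    if min_required is None:
        min_required = len(required)
    hits = len(required & detected)
    return hits >= min_required and not (set(excluded) & detected)


# Placeholder gene names; no single marker is exclusive, but the combination is.
cell_genes = {"MarkerA", "MarkerB", "MarkerC"}
rule = {"MarkerA", "MarkerB", "MarkerD"}

print(matches_marker_combination(cell_genes, rule, min_required=2))
print(matches_marker_combination(cell_genes, rule, excluded={"MarkerC"}, min_required=2))
```

      Because each marker in such a rule may also be expressed elsewhere, the specificity of the prediction depends on how many markers are required together, which is consistent with the need to dedicate a larger share of the probe panel to this cell type.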

      When the authors evaluated the distribution of sVSN using the Molecular Cartography technique, they found expression of sVSN in both sensory and non-sensory epithelia. How do the authors explain such unexpected expression of sensory neurons in the non-sensory epithelium?

      In our scRNA-Seq experiment, blood vessels were removed, limiting the power to distinguish between certain cell types. Because of the limited number of genes that we can probe using Molecular Cartography, the number of genes associated with sVSNs may be present in the non-sensory epithelium. This could lead to the identification of cells that may or may not be identical to the sVSNs in the non-neuronal epithelium. Indeed, further studies will need to be conducted to determine the specificity of these cells.

      The low total genes count and low total reads count, combined with an "expression of marker genes for several cell types" could indicate low-quality beads (contamination) that were not excluded with the initial parameter setting. It looks like cells in this cluster express a bit of everything V1R, V2R, OR, secretory proteins.

      We are confident that the putative sVSN cell cluster is not the result of low-quality cells. We performed a robust quality-control protocol, including count correction for ambient RNA with the R package, SoupX, multiplet cell detection and removal with the Python module, Scrublet, and a strict 5% mitochondrial gene expression cut-off. Furthermore, the cell clusters in question show no signs of being the result of sequencing artefacts, as they are connected in a reasonable orientation to the rest of the neuronal lineage in modular clusters in 2D and 3D UMAP space. The OSN and sVSN cell clusters each show distinct and self-consistent expressions of genes (Fig. S1H). Gene ontology (GO) analysis reveals significant GO term enrichment for both the sVSN (Fig. 2G) and mOSN clusters when compared to mature V1R and V2R VSNs, indicating functional differences. Moreover, while some genes were expressed at a lower level when compared to the canonical VSNs, others were expressed at higher levels, precluding the cause of discrepancy as resulting from an overall loss of gene counts.
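      The mitochondrial cut-off mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the cell names, counts, and helper functions are hypothetical, and it assumes that mitochondrial genes carry the mouse "mt-" prefix and that cells above 5% are discarded. Ambient RNA correction (SoupX) and doublet removal (Scrublet) are not reproduced here.

```python
# Hypothetical per-cell UMI counts (gene name -> count); illustrative values only.
cells = {
    "cell_a": {"mt-Co1": 2, "mt-Nd1": 1, "Omp": 60, "Gnal": 37},
    "cell_b": {"mt-Co1": 12, "mt-Nd1": 8, "Omp": 50, "Gnal": 30},
}

MITO_PREFIX = "mt-"        # assumption: mouse mitochondrial gene naming
MAX_MITO_FRACTION = 0.05   # the 5% cut-off described in the response


def mito_fraction(counts):
    """Return the fraction of a cell's counts that come from mitochondrial genes."""
    total = sum(counts.values())
    if total == 0:
        # An empty cell carries no usable signal; report 1.0 so it is filtered out.
        return 1.0
    mito = sum(n for gene, n in counts.items() if gene.startswith(MITO_PREFIX))
    return mito / total


def passes_mito_filter(counts):
    """Keep a cell only if its mitochondrial fraction is at or below the cut-off."""
    return mito_fraction(counts) <= MAX_MITO_FRACTION


kept = [name for name, counts in cells.items() if passes_mito_filter(counts)]
print(kept)
```

      In this toy input, cell_a has 3 of 100 counts from mitochondrial genes and is kept, whereas cell_b has 20 of 100 and is removed.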

      The authors wrote ‘...the transcriptomic landscape that specifies the lineages is not known...’. This statement is not completely true, or at least misleading. There are still many undiscovered aspects of the transcriptomics landscape and lineage determination in VSNs. However, authors cannot ignore previously reported data showing the landscape of neuronal lineages in VSNs (Refs 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). Expression of most of the transcription factors reported by this study (Ascl1, Sox2, Neurog1, Neurod1...) were already reported, and for some of them, their role was investigated, during early developmental stages of VSNs (Refs 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). In summary, the authors should fully include the findings from previous works (Refs 88, 89, 90, 91 and doi.org/10.7554/eLife.77259), clearly state what has been already reported, what is contradictory and what is new when compared with the results from this work.

      This is a difference in opinion about the terminology. Transcriptomic landscape in our paper refers to the genome-wide expression by individual cells, not just individual genes. The reviewer is correct that many of the genetic specifiers have been identified, which we cited and discussed. We consider these studies as providing a “genetic” underpinning, rather than the “transcriptomic landscape” in lineage progression. To avoid confusion, we have revised the statement to “… the transcriptional program that specifies the lineages is not known.” 

      …the co-expression of specific V2Rs with specific transcription factors does not imply a direct implication in receptor selection. Directed experiments to evaluate the VR expression dependent on a specific transcription factor must be performed.

      The reviewer is correct, and we did not claim that the co-expression of specific transcription factors indicates a direct relationship with receptor selection. We agree that further directed experiments are required to investigate this question.

      This study reports that transcription factors, such as Pou2f1, Atf5, Egr1, or c-Fos could be associated with receptor choice in VSNs. However, no further evidence is shown to support this interaction. Based on these purely correlative data, it is rather bold to propose cascade model(s) of lineage consolidation.

      The reviewer is correct. As any transcriptomic study will only be correlative, additional studies will be needed to unequivocally determine the mechanistic link between the transcription factors with receptor choice. Our model provides a basis for these studies.

      The authors use spatial molecular imaging to evaluate the co-expression of many chemosensory receptors in single VNO cells. […] However, it is difficult to evaluate and interpret the results due to the lack of cell borders in spatial molecular imaging. The inclusion of cell border delimitation in the reported images (membrane-stained or computer-based) could be tremendously beneficial for the interpretation of the results.

      The most common practice for cell segmentation of spatial transcriptomics data is to determine cell borders based on nuclear staining with expansion. We have tested multiple algorithms based on recent studies, but each has its own caveat.

      It is surprising that the authors reported a new cell type expressing OR, however, they did not report the expression of ORs in Molecular Cartography technique. Did the authors evaluate the expression of OR using the cartography technique?

      We were limited to a 100-gene probe panel and only included one OR. The expression was not high enough for us to substantiate any claims.

      Reviewer #3:

      (1) The authors claim that they have identified two new classes of sensory neurons, one being a class of canonical olfactory sensory neurons (OSNs) within the VNO. This classification as canonical OSNs is based on expression data of neurons lacking the V1R or V2R markers but instead expressing ORs and signal transduction molecules, such as Gnal and Cnga2. Since OR-expressing neurons in the VNO have been previously described in many studies, it remains unclear to me why these OR-expressing cells are considered here a "new class of OSNs." Moreover, morphological features, including the presence of cilia, and functional data demonstrating the recognition of chemosignals by these neurons, are still lacking to classify these cells as OSNs akin to those present in the MOE. While these cells do express canonical markers of OSNs, they also appear to express other VSN-typical markers, such as Gnao1 and Gnai2 (Figure 2B), which are less commonly expressed by OSNs in the MOE. Therefore, it would be more precise to characterize this population as atypical VSNs that express ORs, rather than canonical OSNs.

      We observe OR expression in VSNs in our data; these cells cluster with VSNs. The putative mOSN cluster exhibits its own trajectory, distinct from VSN clusters. These cells express Gnal (Golf), which is not expressed in VSNs expressing ORs, nor in any other cell-type in the data. We have performed differential gene expression analysis on the putative mOSN cluster to compare with V1R and V2R VSNs. GO analysis returned the top significantly enriched GO terms, including many related to “cilium”, further supporting that these are OSNs. Because we were limited to a list of 100 genes in Molecular Cartography probe panels, we prioritized the detection of canonical VNO cell-types, vomeronasal receptor co-expression, and the putative sVSNs, and were not able to include a robust analysis of the putative OSNs. With regard to Gnai2 and Go expression, we have examined our data from the OSNs dissociated from the olfactory epithelium and detected substantial expression of both. This new analysis provides additional support for our claim. We now present differentially expressed genes and GO term analysis of the mOSN class in the updated Figure S6.

      (2) The second new class of sensory neurons identified corresponds to a group of VSNs expressing prototypical VSN markers (including V1Rs, V2Rs, and ORs), but exhibiting lower ribosomal gene expression. Clustering analysis reveals that this cell group is relatively isolated from V1R- and V2R-expressing clusters, particularly those comprising immature VSNs. The question then arises: where do these cells originate? Considering their fewer overall genes and lower total counts compared to mature VSNs, I wonder if these cells might represent regular VSNs in a later developmental stage, i.e., senescent VSNs. While the secretory cell hypothesis is compelling and supported by solid data, it could also align with a late developmental stage scenario. Further data supporting or excluding these hypotheses would aid in understanding the nature of this new cell cluster, with a comparison between juvenile and adult subjects appearing particularly relevant in this context.

      We wholeheartedly agree with this assessment. Our initial thought was that these were senescent VSNs, but the trajectory analysis did not support this scenario, leading us to propose that these are putative secretory cells. Our analysis also shows that overall, 46% of the putative sVSNs were from the P14 sample and 54% from P56. These cells comprise roughly 6.4% of all P14 cells and 8.5% of P56 cells. In comparison, 28.4% of all cells are mature V1R VSNs at P14, but the percentage rises to 46.7% at P56. The significant presence of sVSNs at P14, and the disproportionate increase when compared with mature VSNs, indicate that these are unlikely to be late developmental stage or senescent cells, although we cannot exclude these possibilities.

      We have included the sVSNs in a trajectory inference analysis and found that the pseudotime values of the sVSNs are within the range of those cells within the V1R and V2R lineages, indicating a similar maturity (Fig. S6).
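      The comparison of proportions in this response can be made explicit with a short calculation. The sketch below uses only the percentages quoted above; the variable and function names are illustrative, and the fold change is simply one possible way to read "disproportionate increase".

```python
# Percentages of all cells at each age, as quoted in the response above.
svsn_pct = {"P14": 6.4, "P56": 8.5}    # putative sVSNs
v1r_pct = {"P14": 28.4, "P56": 46.7}   # mature V1R VSNs


def fold_change(pct):
    """Return the P56 share divided by the P14 share."""
    return pct["P56"] / pct["P14"]


svsn_fc = fold_change(svsn_pct)
v1r_fc = fold_change(v1r_pct)
print(f"sVSN: {svsn_fc:.2f}x, mature V1R: {v1r_fc:.2f}x")
# prints "sVSN: 1.33x, mature V1R: 1.64x"
```

      On these numbers, the share of sVSNs grows by about a third between P14 and P56, whereas the share of mature V1R VSNs grows by roughly two thirds.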

      (3) The authors' decision not to segregate the samples according to sex is understandable, especially considering previous bulk transcriptomic and functional studies supporting this approach. However, many of the highly expressed VR genes identified have been implicated in detecting sex-specific pheromones and triggering dimorphic behavior. It would be intriguing to investigate whether this lack of sex differences in VR expression persists at the single-cell level. Regardless of the outcome, understanding the presence or absence of major dimorphic changes would hold broad interest in the chemosensory field, offering insights into the regulation of dimorphic pheromone-induced behavior. Additionally, it could provide further support for proposed mechanisms of VR receptor choice in VSNs. 

      The reviewer raised a good point. We did not observe differences between male and female mice, or between P14 and P56 mice, in the distribution of clusters and cells in UMAP space. However, our differential expression analysis has revealed significantly differentially expressed genes in both comparisons. Results from these analyses are presented in the new Figures S1 and S2.

      (4) The expression analysis of VRs and ORs seems to have been restricted to the cell clusters associated with the neuronal lineage. Are VRs/ORs expressed in other cell types, i.e. sustentacular, HBC, or other cells?

      Sparsely expressed low counts of VR and OR genes were observed in non-neuronal cell-types. When their expression as a percentage of cell-level gene counts is considered, however, the expression is negligible when compared to the neurons. The observed expression may be explained by stochastic base-level expression, or it may be the result of remnant ambient RNA that passed filtering.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:  

      Reviewer #1 (Public Review): 

      Summary: 

      The fungal cell wall is a very important structure for the physiology of a fungus but also for the interaction of pathogenic fungi with the host. Although a lot of knowledge on the fungal cell wall has been gained, there is a lack of understanding of the meaning of ß-1,6-glucan in the cell wall. In the current manuscript, the authors studied in particular this carbohydrate in the important human-pathogenic fungus Candida albicans. The authors provide a comprehensive characterization of cell wall constituents under different environmental and physiological conditions, in particular of ß-1,6-glucan. Also, β-1,6-glucan biosynthesis was found to be likely a compensatory reaction when mannan elongation was defective. The absence of β-1,6-glucan resulted in a significantly sick growth phenotype and complete cell wall reorganization. The manuscript contains a detailed analysis of the genetic and biochemical basis of ß-1,6-glucan biosynthesis which is apparently in many aspects similar to yeast. Finally, the authors provide some initial studies on the immune modulatory effects of ß-1,6-glucan. 

      Strengths: 

      The findings are very well documented, and the data are clear and obtained by sophisticated biochemical methods. It is impressive that the authors successfully optimized methods for the analyses and quantification of ß-1,6-glucan under different environmental conditions and in different mutant strains. 

      Weaknesses: 

      However, although already very interesting, at this stage there are some loose ends that need to be combined to strengthen the manuscript. For example, the immunological studies are rather preliminary and need at least some substantiation. Also, at this stage, the manuscript in some places remains a bit too descriptive and needs the elucidation of potential causalities.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors provide the first (to my knowledge) detailed characterization of cell wall b-1,6 glucan in the pathogen Candida albicans. The approaches range from biochemistry to genetics to immunology. The study provides fundamental information and will be a resource of exceptional value to the field going forward. Highlights include the construction of a mutant that lacks all b-1,6 glucan and the characterization of its cell wall composition and structure. Figure 5a is a feast for the eyes, showing that b-1,6 glucan is vital for the outer fibrillar layer of the cell wall. Also much appreciated was the summary figure, Figure 7, which presents the main findings in digestible form.

      Strengths: 

      The work is highly significant for the fungal pathogen field especially, and more broadly for anyone studying fungi, antifungal drugs, or antifungal immune responses.

      The manuscript is very readable, which is important because most readers will be cell wall nonspecialists.

      The authors construct a key quadruple mutant, which is not trivial even with CRISPR methods, and validate it with a complemented strain. This aspect of the study sets the bar high. The authors develop new and transferable methods for b-1,6 glucan analysis. 

      Weaknesses: 

      The one "famous" cell type that would have been interesting to include is the opaque cell. This could be included in a future paper.

      Reviewer #3 (Public Review): 

      Summary: 

      The cell wall of human fungal pathogens, such as Candida albicans, is crucial for structural support and modulating the host immune response. Although extensively studied in yeasts and molds, the structural composition has largely focused on the structural glucan β-1,3-glucan and the surface exposed mannans, while the fibrillar component β-1,6-glucan, a significant component of the cell wall, has been largely overlooked. This comprehensive biochemical and immunological study by a highly experienced cell wall group provides a strong case for the importance of β-1,6-glucan contributing critically to cell wall integrity, filamentous growth, and cell wall stability resulting from defects in mannan elongation. Additionally, β-1,6-glucan responds to environmental stimuli and stresses, playing a key role in wall remodeling and immune response modulation, making it a potential critical factor for host-pathogen interactions.

      Strengths: 

      Overall, this study is well-designed and executed. It provides the first comprehensive assessment of β-1,6-glucan as a dynamic, albeit underappreciated, molecule. The role of β-1,6-glucan genetics and biochemistry has been explored in molds like Aspergillus fumigatus, but this work shines an important light on its role in Candida albicans. This is important work that is of value to Medical Mycology, since β-1,6-glucan plays more than just a structural role in the wall. It may serve as a PAMP and a potential modulator of host-pathogen interactions. In keeping with this important role, the rigor of the manuscript would benefit from a more physiological evaluation, ex vivo and preferably in vivo, that assesses immune stimulation by β-1,6-glucan within the cell wall and not just as a purified component. This is a critical outcome measure for this study and gets squarely at its importance for host-pathogen interactions, especially in response to environmental stimuli and drug exposure.

      Response to reviewers (Public reviews):

      We thank all the three reviewers for their opinion on our work on Candida albicans β-1,6-glucan, which highlights the importance of this cell wall component in the biology of fungi. Here are our responses to their comments for public reviews:

      (1) Indeed, the data presented for the immunological studies are preliminary. The reviewers have acknowledged that our analysis, which provides insights into the biosynthetic pathways involved, is comprehensive in dealing with the organization and dynamics of the β-1,6-glucan polymer in relation to other cell wall components and environmental conditions (temperature, stress, nutrient availability, etc.). However, we anticipated that there would be immediate curiosity as to what the immunological contribution of β-1,6-glucan is, and we therefore felt we needed to initiate these studies and include them. We therefore performed immunological studies to assess whether β-1,6-glucan acts as a pathogen-associated molecular pattern (PAMP), and if so, what its immunostimulatory potential is. Our data clearly suggest that β-1,6-glucan is a PAMP, and consequently lead to several questions: (a) what are the host immune receptors involved in the recognition of this polysaccharide, and thereby the downstream signaling pathways, (b) how is β-1,6-glucan differentially recognized by the host when C. albicans switches from a commensal to an opportunistic pathogen, and (c) how does the host environment impact the exposure of this polysaccharide on the fungal surface. We believe addressing these questions is beyond the scope of the present manuscript and aim to present new data in a future manuscript. Nonetheless, in the revised manuscript, we suggest approaches that we can take to identify the receptor that could be involved in the recognition of β-1,6-glucan. Moreover, we have modified the discussion, presenting it based on the data rather than being descriptive.

      (2) It will be interesting to assess the organization of β-1,6-glucan and other cell wall components in the opaque cells. It is documented that the opaque cells are induced at acidic pH and in the presence of N-acetylglucosamine and CO2. Our data shows that pH has an impact on β-1,6-glucan, which suggests that there will be differential organization of this polysaccharide in the cell wall of opaque cells. As suggested by the reviewer, we will include analysis of opaque cells (and other C. albicans cell types) in future studies. 

      With the exception of these major new avenues for this research, our revision can address each of the comments provided by the reviewers.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      Although the study is very interesting, there are some loose ends that need to be combined to strengthen the manuscript. For example, the immunological studies are rather preliminary and need at least some substantiation. Also, at this stage, the manuscript in some places remains a bit too descriptive and needs the elucidation of potential causalities.

      Specifically: 

      (1) As you showed, defects in chitin content led to a decrease in the cross-linking of β-glucans in the inner wall that corresponded to the effect of nikkomycin-treated C. albicans phenotype; conversely, an increase in chitin content led to more cross-linking of β-glucans as observed in the FKS1 mutant or in the presence of caspofungin. What is the mechanistic reason for these observations? 

      On one hand, yeast cell wall chitin occurs in three forms: free, and covalently linked to β-1,3-glucan or β-1,6-glucan; crosslinked β-glucan-chitin forms a core fibrillar structure resistant to alkali. A decrease in the chitin content therefore affects β-glucan-chitin crosslinking, thereby making β-glucan alkali-soluble. On the other hand, a decrease in the β-glucan content, as in the FKS1 mutant or upon caspofungin treatment, results in increased cell wall chitin and β-glucan-chitin contents. A decrease in β-1,3-glucan biosynthesis is associated with upregulation of CRH1, which is involved in β-glucan-chitin crosslinking; this explains the increased β-glucan-chitin content in the FKS1 mutant or upon caspofungin treatment. We have included this discussion in the revised manuscript (p14, lines 2-10).

      (2) The β-1,6-glucan biosynthesis is stimulated via a compensatory pathway when there is a defect in O- and N-linked cell wall mannan biosynthesis. Why? causality? Hypothesis?  

      Two phenomena were observed related to β-1,6-glucan and mannan biosynthesis: 1) a defect in the elongation of N-mannan led to an increase in the β-1,6-glucan content; 2) a defect in O-mannan elongation resulted in a reduced size of β-1,6-glucan chains but increased their branching. These observations suggest a global rescue program for cell wall damage that could occur due to a defect in one of the cell wall components. We have discussed this in the revised manuscript (p14, last paragraph, p15 first paragraph). Moreover, β-1,3-glucan and chitin are synthesized by their respective membrane-bound synthases, and a defect in the synthesis of one is compensated by the other. In line with this, although it needs to be validated for β-1,6-glucan, the biosynthesis of mannan and β-1,6-glucan seems to initiate intracellularly. Therefore, one possibility is that defective mannan biosynthesis is compensated by β-1,6-glucan biosynthesis, but this needs to be validated experimentally.

      (3) You showed that the removal of β-1,6-glucan by periodate oxidation (AI-OxP) led to a significant decrease in the IL-8, IL-6, IL-1β, TNF-α, C5a, and IL-10 released, suggesting that their stimulation was in part β-1,6-glucan dependent. What is the consequence of the stimulation, e.g. better phagocytosis, etc.? This needs some more experiments, otherwise the data is purely descriptive, as the conclusion. Also, what do you want to show with the activation of the complement system? Is ß-1,6-glucan detected by complement receptors? I think this is really a loose end. I think it is necessary to provide more data on this observation, which I think lacks control with serum lacking complement, this should then be moved to the main manuscript. 

      In this study, our aim was to assess whether β-1,6-glucan acts as a pathogen-associated molecular pattern (PAMP) of C. albicans, and if so, what its immunostimulatory capacity/potential is. Our data confirm that β-1,6-glucan indeed acts as a PAMP, and that its removal significantly reduces the immunostimulatory capacity of the fibrillar core structure of the C. albicans cell wall. In addition, data provided in the revised manuscript (see updated Figure S14, discussion p13 lines 16-21) indicate that human serum factors significantly enhance the immunostimulatory capacity of β-1,6-glucan and that β-1,6-glucan interacts with the complement component C3b. However, addressing the role of β-1,6-glucan in phagocytosis using the β-1,6-glucan deletion mutant will not be possible, as the cell wall of this mutant is modified and β-1,6-glucan is not the only cell wall component interacting with C3b. An alternative is to coat β-1,6-glucan on beads and use them to study phagocytosis and identify immune receptors; however, these experiments are beyond the scope of our present study.

      (4) Also, you suggested that β-1,6-glucan and β-1,3-glucan stimulate innate immune cells in distinct ways. Please provide more data on this interesting suggestion. You can block the dectin-1 receptor for example or use dectin-1 deficient macrophages from mice. The part on the immune stimulation needs to be optimized. 

      Stimulation of immune cells by pustulan (insoluble linear β-1,6-glucan) via a dectin-1-independent pathway has been described previously (PMIDs: 18005717, 16371356), as discussed in the manuscript. Our preliminary data indicate that dectin-1 blocking on immune cells (using anti-dectin-1 antibodies) has no effect on the immunostimulatory potential of β-1,6-glucan, unlike AI and AI-OxP, which showed significantly reduced cytokine secretion by the immune cells upon dectin-1 blocking. Deciphering β-1,6-glucan recognition and its immunomodulatory pathways is underway and will be the subject of a future manuscript.

      (5) β-1,6-glucan and mannan productions are coupled. What is the hypothesis? Is it due to the necessity of mannan residues in ß-1,6-glucan biosynthesis enzymes from the ER? Can that be experimentally proven? 

      β-1,6-glucan and mannan synthesis appear to be coupled in two ways. First, as mentioned above (Response 2), defects in mannan elongation led to an alteration of β-1,6-glucan production. Second, defects in early steps of N-glycosylation led to a strong reduction of β-1,6-glucan size and its cell wall content. However, we do not believe that the synthesis of N-glycan is required for the synthesis of an acceptor essential to β-1,6-glucan synthesis. A defect in N-mannan elongation led to a global cell wall remodeling, as described above. Kre5, Rot2 and Cwh41 are part of the calnexin cycle involved in the control of N-glycoprotein folding in the ER, suggesting that some proteins directly involved in β-1,6-glucan synthesis require folding quality control to be active. We modified our discussion accordingly, highlighting these points (p14, last paragraph, p15 second paragraph).

      (6) 'As PHR1 and PHR2 genes are strongly regulated by external pH, the compensatory differences described may be explained by pH-dependent regulation of β-1,6-glucan synthesis.' Please check. Also, could the pH regulation form the basis of e.g. differences you found for ß-1,6-glucan under different environmental conditions, i.e., growth on different carbon sources leads to different external pH values, as shown for many fungi?  

      We agree that environmental pH depends on the carbon source and that pH varies over the growth curve. To test the effect of pH, we buffered the medium with 100 mM MOPS or MES. Fig. 2 and S1 clearly show that pH has an effect on cell wall composition and polymer exposure, as previously described (PMID: 28542528). Here, we show that pH has an impact on β-1,6-glucan size as well as its branching. However, in buffered medium, addition of an organic acid (such as acetate, propionate, butyrate or lactate) had an impact on cell wall composition, showing that pH is not the only factor affecting cell wall composition. Regarding the phr1Δ/Δ and phr2Δ/Δ mutants, we believe that the difference in cell wall composition observed between the mutants is mainly due to pH-dependent regulation, which we indicated in the discussion (p14, end of first paragraph).

      Minor: 

      (1) In Figure 7B: dynamism should be replaced by dynamic and in term is rather in terms.  

      Modified as suggested.

      (2) Replace molecular size with molecular mass when you give daltons. 

      Molecular size has been replaced by molecular weight, when presented as daltons.

      (3) Page 7: for explanation, please add that nikkomycin is a chitin biosynthesis inhibitor.   

      As suggested, we now explain that nikkomycin is a chitin biosynthesis inhibitor.

      Reviewer #2 (Recommendations For The Authors):

      (1) I wondered if the increased chitin content of hyphae might reflect growth on the precursor GlcNAc. Have you tested hyphae that are induced in other ways? (2) Related to point 1, did you look at the relative abundance of yeast vs hyphae in the preparation? I wonder if yeast contamination might have reduced the extent of the composition changes observed. 

      We used GlcNAc as the hyphae inducer because: 1) in the presence of GlcNAc, hyphae are produced without any yeast contamination; in this condition, we observed an increase in the chitin content in hyphae, as previously described (PMID: 16423067); 2) we excluded the use of serum, another condition inducing hyphal formation, as we could not control serum factors that may impact cell wall composition. We now indicate in the methods section that hyphae induced by GlcNAc were not contaminated by yeast (p17, line 3). 

      (3) I recommend rephrasing the first sentence of the Figure 2 legend: "Cells were grown in liquid SD medium at 37°C at exponential phase under different growth conditions." The conditions varied extensively - stationary is not exponential; biofilm is probably not exponential. Also, the "D" in "SD" stands for dextrose, and the carbon source varied a good deal. Perhaps you could say: "Cells were grown in liquid synthetic medium at 37°C under different growth conditions, as specified in Methods." 

      Sentences have been rephrased.  

      (4) Figure 7b has a typo: "dependant" for "dependent".

      The typo has been corrected.

      Reviewer #3 (Recommendations For The Authors):

      To explore the biochemical composition of the cell wall, the authors fractionated the wall component into three categories based on polymer properties and reticulations: sodium-dodecyl-sulphate-β-mercaptoethanol (SDS-β-ME) extract, alkali-insoluble (AI), and alkali-soluble (AS) fractions, and they developed several independent methods to distinguish between β-1,3-glucans and β-1,6-glucans. The composition and surface exposure of fungal cell wall polymers is known to depend on environmental growth conditions. It was shown that the cell wall of C. albicans hyphae increased chitin content (10% vs. 3%) and decreased β-1,6-glucan (18% vs. 23%) and mannan (13% vs. 20%) compared to the yeast form, and the reduced β-1,6-glucan content was associated with a smaller β-1,6-glucan size (43 vs. 58 kDa), suggesting that both the content and structure of β-1,6-glucan are regulated during growth and cellular morphogenesis. Similar behavior was observed when exposing cells to acid and neutral medium pH. The most significant cell wall alteration occurred in a lactate-containing medium, which led to a sharp reduction in structural core polysaccharides: chitin (-43%), β-1,3-glucan (-48%), and β-1,6-glucan (-72%). This reduction aligns with the previously observed decreases in inner cell wall layer thickness. As expected, the authors found that modulating chitin content genetically (chs3Δ/Δ knockout mutant) led to an increase of both β-1,3-glucan and β-1,6-glucan. An increase in chitin content following genetic alteration of FKS genes impacting glucan synthase or after exposure to the echinocandin caspofungin led to enhanced cross-linking of β-glucans. A slight increase in the β-1,3-glucan branching was also observed in the mnt1/mnt2Δ/Δ double mutant, suggesting that β-1,6-glucan and mannan synthesis may be coupled.

      - This effect is not that pronounced, and the relationship appears somewhat overstated and may reflect an indirect interaction. The authors should address accordingly. 

      We agree that this sentence was overstated. To make it clearer and less pronounced, we divided it into two sentences with more measured statements (p8, line 34).

      The genetics of β-1,6-glucan biosynthesis appear complex and a figure describing putative roles for specific genes would be beneficial. For example, KRE6 is a glucosyl hydrolase required for β-1,6-glucan biosynthesis.

      - It would be valuable to better understand the overall biosynthetic process. Please elaborate more in a figure. 

      Although proteins/enzymatic activities directly involved in the β-1,6-glucan biosynthesis have not yet been identified, as suggested by this reviewer, we included a schematic representation of this process based on our hypothesis (Figure S15, and p15 lines 17-22 in revised manuscript), indicating the possible involvement of Kre6p.  

      The deletion of KRE6 homologs, essential for β-1,6-glucan biosynthesis, resulted in the absence of β-1,6-glucan production, and significant structural alterations of the cell wall. This result nicely confirms the important role of β-1,6-glucan in regulating cell wall homeostasis. The absence of β-1,6-glucan was associated with increased (mutant vs. WT) chitin content (9.5% vs. 2.5%) and highly branched β-1,3-glucan (48% vs. 20%). TEM ultrastructure studies nicely showed the change in cell wall overall architecture. From a drug discovery perspective, since the blockade of β-1,6-glucan did not block growth, it may have more value as a potential virulence target. This would be valuable but needs to be assessed in animal model challenge competition experiments.

      - The authors may want to elaborate more. 

      We agree and changed “antifungal target” to “potential virulence target”.

      It is well known that β-1,3-glucan, mannan, and chitin function serve as PAMPs, which induce immune responses. The role of β-1,6-glucan as a PAMP is not well understood, and the authors provide evidence that different cell wall extracted fractions with enriched constituents induce immune responses invoking cytokines, chemokines, and acute phase proteins, as well as the complement system. While this data clearly shows that β-1,6-glucan is immunologically active and potentially important for host-pathogen interactions, the analysis is preliminary and falls short of making this case. 

      - This is a critical point in getting at the potential host signaling of β-1,6-glucan contained in the cell wall or shed by the cell (is this known?)

      - This analysis would be bolstered significantly by examining stimulation relative to other cell wall components, and most importantly, whole cell modulation of β-1,6-glucan exposure for immune presentation, and not just unnatural concentrated extracts. This can be readily accomplished with the various mutants in hand, as well as after exposure to various antifungal agents (echinocandins and nikkomycins) (see Hohl et al. 2008 JID). Additional validation would benefit from animal model studies to examine in vivo immune modulation.

      We agree with the reviewer. However, the main focus of our present work was to study the organization and dynamics of C. albicans cell wall β-1,6-glucan, and to explore its possible role as a pathogen-associated molecular pattern (PAMP). Our study indicates that, indeed, β-1,6-glucan acts as a PAMP with immunostimulatory potential. As pointed out by this reviewer, and similar to β-1,3-glucans, the exposure of β-1,6-glucan is probably a key point in the immune response. However, this investigation is beyond the scope of this study; it is underway and will be presented in our future work.

      - The Discussion would also benefit from an analysis of β-1,6-glucan in Aspergillus fumigatus, which was largely elucidated by the same primary authors. 

      To our knowledge, β-1,6-glucan has never been identified, either by chemical analysis (PMID: 10869365; PMID: 36836270) or solid-state NMR (PMID: 34732740), in the cell wall of A. fumigatus, although a homolog of KRE6 is present in A. fumigatus but with unknown function.

    2. eLife Assessment

      The paper will be of broad interest to fungal biologists and fungal immunologists seeking to understand the biosynthesis of the fungal cell wall, in particular of β-1,6-glucan synthesis and the importance of this so far understudied constituent of the cell wall for cell wall integrity and immune response. The study is of fundamental significance and adds structural clarity to the genetic and biochemical basis of this difficult-to-analyze carbohydrate. It opens the potential for understanding its role in immune recognition and potentially as a drug target. Overall, the data is compelling, properly controlled and analyzed.

    3. Reviewer #1 (Public review):

      Summary:

      The fungal cell wall is a very important structure for the physiology of a fungus but also for the interaction of pathogenic fungi with the host. Although a lot of knowledge on the fungal cell wall has been gained, there is a lack of understanding of the meaning of β-1,6-glucan in the cell wall. In the current manuscript, the authors studied in particular this carbohydrate in the important human-pathogenic fungus Candida albicans. The authors provide a comprehensive characterization of cell wall constituents under different environmental and physiological conditions, in particular of β-1,6-glucan. Also, β-1,6-glucan biosynthesis was found to be likely a compensatory reaction when mannan elongation was defective. The absence of β-1,6-glucan resulted in a significantly sick growth phenotype and complete cell wall reorganization. The manuscript contains a detailed analysis of the genetic and biochemical basis of β-1,6-glucan biosynthesis which is apparently in many aspects similar to yeast. Finally, the authors provide some initial studies on immune modulatory effects of β-1,6-glucan.

    4. Reviewer #2 (Public review):

      Summary:

      The authors provide the first (to my knowledge) detailed characterization of cell wall b-1,6 glucan in the pathogen Candida albicans. The approaches range from biochemistry to genetics to immunology. The study provides fundamental information and will be a resource of exceptional value to the field going forward. Highlights include the construction of a mutant that lacks all b-1,6 glucan and the characterization of its cell wall composition and structure. Figure 5a is a feast for the eyes, showing that b-1,6 glucan is vital for the outer fibrillar layer of the cell wall. Also much appreciated was the summary figure, Figure 7, that presents the main findings in digestible form.

      Strengths:

      The work is highly significant for the fungal pathogen field especially, and more broadly for anyone studying fungi, antifungal drugs, or antifungal immune responses.

      The manuscript is very readable, which is important because most readers will be cell wall nonspecialists.

      The authors construct a key quadruple mutant, which is not trivial even with CRISPR methods, and validate it with a complemented strain. This aspect of the study sets the bar high.

      The authors develop new and transferable methods for b-1,6 glucan analysis.

      Weaknesses:

      The one "famous" cell type that would have been interesting to include is the opaque cell. Please include it in the next paper!

    5. Reviewer #3 (Public review):

      Summary:

      The cell wall of human fungal pathogens, such as Candida albicans, is crucial for structural support and modulating the host immune response. Although extensively studied in yeasts and molds, the structural composition has largely focused on the structural glucan β-1,3-glucan and the surface exposed mannans, while the fibrillar component β-1,6-glucan, a significant component of the cell wall, has been largely overlooked. This comprehensive biochemical and immunological study by a highly experienced cell wall group provides a strong case for the importance of β-1,6-glucan contributing critically to cell wall integrity, filamentous growth, and cell wall stability resulting from defects in mannan elongation. Additionally, β-1,6-glucan responds to environmental stimuli and stresses, playing a key role in wall remodeling and immune response modulation, making it a potential critical factor for host-pathogen interactions.

      Strengths:

      Overall, this study is well designed and executed. It provides the first comprehensive assessment of β-1,6-glucan as a dynamic, albeit underappreciated, molecule. The role of β-1,6-glucan genetics and biochemistry has been explored in molds like Aspergillus fumigatus, but this work shines important light on its role in Candida albicans. This is important work that is of value to Medical Mycology, since β-1,6-glucan plays more than just a structural role in the wall. It may serve as a PAMP and a potential modulator of host-pathogen interactions.

      Weaknesses:

      In keeping with an important role in immune recognition, it was suggested that the manuscript's rigor would benefit from a more physiological evaluation, ex vivo and preferably in vivo, of how β-1,6-glucan stimulates the immune system within the cell wall and not just as a purified component. This is a critical outcome measure for this study and gets squarely at its importance for host-pathogen interactions, especially in response to environmental stimuli and drug exposure. The authors addressed this issue contextually and indicate that it will require a more detailed immunologic evaluation, which is not in keeping with the intent of this foundational study.

    1. eLife Assessment

      This valuable study uses fluorescence lifetime imaging and steady-state and time-resolved transition metal ion FRET to characterize conformational transitions in the isolated cyclic nucleotide binding domain of a bacterial CNG channel. The data are compelling and support the authors' conclusions. The results advance the understanding of allosteric mechanisms in CNBD channels and have theoretical and practical implications for other studies of protein allostery. A limitation is that only the cytosolic fragments of the channel were studied.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1: 

      Limitations are that only the cytosolic fragments of the channel were studied, and the current manuscript does not do a good job of placing the results in the context of what is already known about CNBDs from other methods that yield similar information.

      In the revision, we have now added a paragraph in the discussion that addresses why the cytosolic fragment was used and a paragraph putting our results into the context of previous work on CNBD channels where possible. 

      (1) Why do the authors not apply their approach to the full-length channel? A discussion of any limitations that make this difficult would be worthwhile.

      Full-length ion channel protein expression is more challenging, and it was important to start with a simpler system. This is now stated in the discussion.

      (2) …nonetheless a comparison of the conformational heterogeneity and energetics obtained from these different approaches would help to place this work in a larger context.

      We have now added a paragraph in the discussion putting our work in a larger context and addressing the challenges of comparing our results to previous studies. 

      (3) Page 5 - 3:1 unlabeled:labeled subunits in mix => 42% of molecules have 3:1 stoichiometry as desired and 21% of molecules have 2:2 stoichiometry!!! (binomial distribution p=0.25, n=4). So 1/3 of molecules with labels have two labeled subunits. This does not seem like it is at all avoiding the problem of intersubunit FRET…

      From the experimental perspective, the 3:1 molar ratio stated is certainly a low estimate of the actual subunit ratios given our FSEC data in Figure 2D and the higher expression of the WT protein compared to labeled protein. Furthermore, even without the addition of any WT protein, the calculated contribution of intersubunit FRET is negligible given that the FRET efficiency is heavily dominated by the closest donor-acceptor distances (Figure 4). 
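
      The reviewer's stoichiometry figures in point (3) can be checked with a short calculation. The sketch below assumes, as the reviewer does, that labeled and unlabeled subunits assemble randomly into tetramers (binomial with p = 0.25, n = 4). The function name is illustrative only, and the authors note that the true labeled fraction is probably lower than the nominal 3:1 mix.

```python
from math import comb


def labeled_subunit_distribution(p_labeled=0.25, n_subunits=4):
    """Probability of k labeled subunits per tetramer, for k = 0..n_subunits.

    Assumes random (binomial) assembly, which is the reviewer's premise
    and not a measured property of the sample.
    """
    return [
        comb(n_subunits, k) * p_labeled**k * (1 - p_labeled) ** (n_subunits - k)
        for k in range(n_subunits + 1)
    ]


dist = labeled_subunit_distribution()

# Fraction of all tetramers carrying exactly one or exactly two labels.
p_one_label = dist[1]   # 0.421875, the reviewer's "42%"
p_two_labels = dist[2]  # 0.2109375, the reviewer's "21%"

# Among tetramers with at least one label, the share with exactly two.
p_any_label = 1.0 - dist[0]
share_two_among_labeled = p_two_labels / p_any_label  # about 0.31

print(f"1 label: {p_one_label:.3f}, 2 labels: {p_two_labels:.3f}")
print(f"share with 2 labels among labeled tetramers: {share_two_among_labeled:.3f}")
```

      Under these assumptions about 31% of labeled tetramers carry two labels, which matches the reviewer's "1/3". Lowering `p_labeled` shows how quickly that share falls if the wild-type subunit is in larger excess.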

      (4) Figure 2E - Some monomers appear to still be present in the collected fraction. The authors should discuss any effect this might have on their results.

      We now describe in the text that, at the low concentrations (~10 nM) used for mass photometry, a second small peak of ~30 kDa was observed, which is below the analytical range for this method. This would not affect our results since all tmFRET experiments used higher protein concentrations to ensure tetramerization.

      (5) page 4 - "Time-resolved tmFRET, therefore, resolves the structure and relative abundance of multiple conformational states in a protein sample." - structure is not resolved, only a single distance.

      We have reworded this sentence.  

      Reviewer #2:

      Regarding cyclic nucleotide-binding domain (CNBD)-containing ion channels, I disagree with the authors when they state that "the precise allosteric mechanism governing channel activation upon ligand binding, particularly the energetic changes within domains, remains poorly understood". On the contrary, I would say that the literature on this subject is rather vast and based on a significantly large variety of methodologies…

      Despite this vast literature on the energetics of CNBD channels there is no consensus about the energetics and coupling of domains that underlies the allosteric mechanism in any CNBD channel. We have added a separate paragraph in the discussion to clarify our meaning.

      In light of the above, I suggest the authors better clarify the contribution/novelty that the present work provides to the state-of-the-art methodology employed (steady-state and time-resolved tmFRET) and of CNBD-containing ion channels…

      …In light of the above, what is the contribution/novelty that the present work provides to the SthK biophysics?

      This work is the first use of the time-resolved tmFRET method to obtain intrinsic ΔG (of an apo conformation) and ΔΔG values for different ligands. It is also the first application of this approach to SthK or, indeed, to any protein other than MBP. This is mentioned in the introduction.

      …On the basis of the above-cited work (Evans et al., PNAS, 2020) the authors should clarify why they have decided to work on the isolated C-linker/CNBD fragment and not on the full-length protein…

      We chose to start on the C-terminal fragment to provide a technically more tractable system for validating our approach using time-resolved tmFRET before moving to the more challenging full-length membrane protein. This is now addressed in a new paragraph in the discussion. 

      What is the advantage of using the C-linker/CNBD fragment of a bacterial protein and not one of HCN channels, as already successfully employed by the authors (see above citations)?

      We have chosen to perform these studies in SthK rather than a mammalian CNBD channel as SthK presents a useful model system that allows us to later express full-length channels in bacteria. In addition, the efficiency of noncanonical amino acid incorporation is much higher in bacteria than in mammalian cells.

      Reviewer #3: 

      While the use of a truncated construct of SthK is justified, it also comes with certain limitations…

      We agree that the truncated channel comes with limitations, but we still think that there is relevant energetic information from studies of the isolated CNBD. This is now addressed in the discussion. 

      I recommend the authors carefully assess their statements on allostery. …The authors also should consider discussing the discrepancies between their truncated construct and full-length channels in more detail.

      We added a paragraph in the introduction that now puts the conformational change of the CNBD in the context of the allosteric mechanism of the full-length channel. We also added a paragraph discussing in more detail the relationship between the energetics of the C-terminal fragment and the full-length channel.  

      Regarding the in silico predictions, it is unclear to me why the authors chose the closed state of SthK Y26F and the 'open' state of the isolated C-linker CNBD construct…

      The active cAMP-bound structure (4d7t) was a high-resolution X-ray crystallography structure chosen as the only model with a fully resolved C-helix. The resting state structure (7rsh) was selected as the only resting-state structure to resolve the acceptor residue studied here (V417).

      Previously it has been shown that SthK (and CNG) goes through multiple states during gating. This may be discussed in more detail, especially when it comes to the simplified four-state model…

      As stated above, we added paragraphs to the introduction and discussion placing the conformational change of the CNBD in the context of the full-length channel.  

      It would be interesting to see how the conformational distribution of the C-helix position integrates with available structural data on SthK. In general, putting the results more into the context of what is known for SthK and CNG channels, could increase the impact.

      We now discuss the relationship between existing structures and energetics in the introduction.  

      This may be semantics, but when working with a truncated construct that is missing the transmembrane domains using 'open' and 'closed' state is questionable. I recommend the authors consider a different nomenclature.

      We refer to the conformational states of the CNBD as ‘resting’ and ‘active’ and use ‘closed’ and ‘open’ only for the conformational states of the pore.

    3. Reviewer #1 (Public review):

      Summary:

      The authors use fluorescence lifetime imaging (FLIM) and tmFRET to resolve resting vs. active conformational heterogeneity and free energy differences driven by cGMP and cAMP in a tetrameric arrangement of CNBDs from a prokaryotic CNG channel.

      Strengths:

      The data are excellent and provide detailed measures of the probability to adopt resting vs. activated conformations with and without bound ligands.

      Weaknesses:

      A limitation is that only the cytosolic fragments of the channel were studied.

    4. Reviewer #2 (Public review):

      The authors investigated the conformational dynamics and energetics of the SthK C-linker/CNBD fragment using both steady-state and time-resolved transition metal ion Förster resonance energy transfer (tmFRET) experiments. To do so, they engineered donor-acceptor pairs at specific sites of the CNBD (C-helix and β-roll) by incorporating a fluorescent noncanonical amino acid donor and metal ion acceptors. In particular, the authors employed two cysteine-reactive metal chelators (TETAC and phenM). This allowed them to coordinate three transition metals (Cu2+, Fe2+, and Ru2+) to measure both short (10-20 Å, Cu2+) and long distances (25-50 Å, Fe2+ and Ru2+). By measuring tmFRET with fluorescence lifetimes, the authors determined intramolecular distance distributions in the absence and presence of the full agonist cAMP or the partial agonist cGMP. The probability distributions between conformational states without and with ligands were used to calculate the changes in free energy (ΔG) and differences in free energy change (ΔΔG) in the context of a simple four-state model.
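
      The step from state probabilities to free energies can be sketched as follows. This is a minimal illustration that assumes a two-state (resting/active) equilibrium for each ligand condition and the sign convention ΔG = -RT ln(p_active / p_resting). The occupancies are hypothetical placeholders, not values from the manuscript, and the function names are illustrative only.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)


def delta_g(p_active, temperature_k=298.15):
    """Free energy of the resting -> active transition, in kcal/mol.

    Assumes only two states, so p_resting = 1 - p_active.
    Negative values mean the active state is favored.
    """
    if not 0.0 < p_active < 1.0:
        raise ValueError("p_active must lie strictly between 0 and 1")
    equilibrium_constant = p_active / (1.0 - p_active)
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(equilibrium_constant)


def delta_delta_g(p_active_apo, p_active_ligand, temperature_k=298.15):
    """Change in the resting -> active free energy caused by ligand binding."""
    return delta_g(p_active_ligand, temperature_k) - delta_g(p_active_apo, temperature_k)


# Hypothetical active-state occupancies, for illustration only.
p_apo, p_full_agonist, p_partial_agonist = 0.10, 0.90, 0.50

print(f"dG apo:              {delta_g(p_apo):+.2f} kcal/mol")
print(f"ddG full agonist:    {delta_delta_g(p_apo, p_full_agonist):+.2f} kcal/mol")
print(f"ddG partial agonist: {delta_delta_g(p_apo, p_partial_agonist):+.2f} kcal/mol")
```

      In this picture a partial agonist is simply a ligand whose ΔΔG is smaller in magnitude than that of the full agonist, so it shifts less of the population into the active state.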

      Overall, the work is conducted in a rigorous manner, and it is well-written.

      In terms of methodology, this work provides further support to steady-state and time-resolved tmFRET approaches previously developed by the authors of the present work to probe conformational rearrangements by using a fluorescent noncanonical amino acid donor (Anap) and transition metal ion acceptor (Zagotta et al., eLife 2021; Gordon et al., Biophysical Journal 2024; Zagotta et al., Biophysical Journal 2024).

      For what concerns cyclic nucleotide-binding domain (CNBD)-containing ion channels, the literature on this subject is vast and the authors of the present work have significantly contributed to the understanding of the allosteric mechanism governing the ligand-induced activation of CNBD-containing channels, including a detailed description of the energetic changes induced by ligand binding. Particularly relevant are their works based on DEER spectroscopy. In DeBerg et al., JBC 2016, the authors described, in atomic detail, the conformational changes induced by different cyclic nucleotides on the HCN CNBD fragment and derived energetics associated with ligand binding to the CNBD (ΔΔG). In Collauto et al., Phys Chem Chem Phys. 2017, they further detailed the ligand-CNBD conformational changes by combining DEER spectroscopy with microfluidic rapid freeze quench to resolve these processes and obtain both equilibrium constants and reaction rates, thus demonstrating that DEER can quantitatively resolve both the thermodynamics and the kinetics of ligand binding and the associated conformational changes.

      In the revised manuscript the authors better framed their work in light of the literature by highlighting novelty and limitations, in particular the decision to work with the isolated C-linker/CNBD fragment and not with the full-length protein.

    5. Reviewer #3 (Public review):

      Summary:

      The manuscript by Eggan et al provides insights into conformational transitions in the cyclic nucleotide binding domain of a cyclic nucleotide-gated (CNG) channel. The authors use transition metal FRET (tmFRET) which has been pioneered by this lab and previously led to detailed insights into ion channel conformational changes. Here, the authors not only use steady-state measurements but also time-resolved, fluorescence lifetime measurements to gain detailed insights into conformational transitions within a protein construct that contains the cytosolic C-linker and cyclic nucleotide binding domain (CNBD) of a bacterial CNG channel. The use of time-resolved tmFRET is a clear advancement of this technique and a strength of this manuscript.

      In summary, the present work introduces time-resolved tmFRET as a novel tool to study conformational distributions in proteins. This is a clear technological advance. The limitations of the truncated construct used in this study and how they relate to the energetics in full-length CNG channels are discussed. It will be interesting to see in the future how results compare to similar measurements on full-length channels, for example, reconstituted into nanodiscs.

      Strengths:

      The results capture known differences in promoting the open state between different ligands (cAMP and cGMP) and are consistent across three donor-acceptor FRET pairs. The calculated distance distributions are further in agreement with predicted values based on available structures. The finding that the C-helix is conformationally more mobile in the closed state as compared to the open state quantitatively increases our understanding of conformational changes in these channels.

      Weaknesses:

      The results describe movements of the C-helix in CNBDs, but detailed energetics as calculated in this study, need to be limited to the truncated protein construct. This is a weakness that cannot be overcome easily as it will require future experiments using the full-length channel.

      The data only describe movements of the C-helix. Upon ligand binding, the C-helix moves upwards to coordinate the ligand. Thus, the results are ligand-induced conformational changes (as the title states). Allosteric regulation usually involves remote locations in the protein, which is applicable only in a limited fashion here.

    1. eLife Assessment

      This valuable work presents the latest version of CTFFIND, which is the most popular software for determination of the contrast transfer function (CTF) in cryo-electron microscopy. CTFFIND5 estimates and considers acquisition geometry and sample thickness, which leads to improved CTF determination. The paper describes compelling evidence that CTFFIND5 finds better CTF parameters than previous methods, in particular for tilted samples (e.g. for cryo-electron tomography) or where thickness is an issue (e.g. cellular samples, or electron microscopy at low voltages).

    2. Reviewer #1 (Public review):

      This work presents CTFFIND5, a new version of the software for determination of the Contrast Transfer Function (CTF) that models the distortions introduced by the microscope in cryoEM images. CTFFIND5 can take acquisition geometry and sample thickness into consideration to improve CTF estimation.

      To estimate tilt (tilt angle and tilt axis), the input image is split into tiles and correlation coefficients are computed between their power spectra and a local CTF model that includes the defocus variation according to a tilted plane. As a final step, by applying a rescaling factor to the power spectra of the tiles, an average tilt-corrected power spectrum is obtained and used for diagnostic purposes and to estimate the goodness of fit. This global procedure and the rescaling factor resemble those used in Bsoft, Warp, etc, with determination of the tilt parameters being a feature specific to CTFFIND5 (and formerly CTFTILT). The performance of the algorithm is evaluated with tilted 2D crystals and tilt-series, demonstrating accurate tilt estimation in general.
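
      The "defocus variation according to a tilted plane" can be illustrated with a small geometric sketch. It assumes a flat specimen, a tilt axis passing through the image center, and an axis angle measured counter-clockwise from the x axis. These conventions and the function name are assumptions made for illustration and are not claimed to match those used in CTFFIND5.

```python
import math


def local_defocus(x, y, center_defocus, axis_angle_deg, tilt_angle_deg):
    """Defocus at tile position (x, y) on a flat, tilted specimen.

    x, y           : tile center relative to the image center (Angstrom)
    center_defocus : defocus on the tilt axis (Angstrom)
    axis_angle_deg : direction of the tilt axis, counter-clockwise from x
                     (assumed convention)
    tilt_angle_deg : specimen tilt about that axis
    """
    phi = math.radians(axis_angle_deg)
    theta = math.radians(tilt_angle_deg)
    # Signed perpendicular distance of the tile from the tilt axis.
    distance_from_axis = -x * math.sin(phi) + y * math.cos(phi)
    # Height change along the beam translates directly into a defocus change.
    return center_defocus + distance_from_axis * math.tan(theta)


# Hypothetical example: tilt axis along x, 30 degree tilt, 1.5 um defocus.
on_axis = local_defocus(2000.0, 0.0, 15000.0, 0.0, 30.0)
off_axis = local_defocus(0.0, 2000.0, 15000.0, 0.0, 30.0)
print(f"on axis: {on_axis:.1f} A, 2000 A off axis: {off_axis:.1f} A")
```

      Tiles along the tilt axis share one defocus, and the defocus changes linearly with distance from the axis. A fit of tilt axis and angle looks for the plane whose predicted per-tile CTFs best match the tile power spectra.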

      CTFFIND5 represents the first CTF determination tool that considers the thickness-related modulation envelope of the CTF first described by McMullan et al. (2015) and experimentally confirmed by Tichelaar et al. (2020). To this end, CTFFIND5 uses a new CTF model that takes the sample thickness into account. CTFFIND5 thus provides more accurate CTF estimation and, furthermore, gives an estimation of the sample thickness, which may be a valuable resource to judge the potential for high resolution. To evaluate the accuracy of thickness estimation in CTFFIND5, the authors use the Lambert-Beer law on energy-filtered data and also tomographic data, thus demonstrating that the estimates are reasonable for images with exposure around 30 e/Å². While consideration of sample thickness in CTF determination sounds ideally suited for cryoET, practical application under the standard acquisition protocols in cryoET (exposure of 3-5 e/Å² per image) is still limited. In this regard, the authors are precise in the conclusions and clearly identify the areas where thickness-aware CTF determination will be valuable at present: in situ single particle analysis and in vitro single particle cryoEM of large specimens (e.g. viral particles).

      In conclusion, the manuscript introduces novel methods inside CTFFIND5 that improve CTF estimation, namely acquisition geometry and sample thickness. The evaluation demonstrates the performance of the new tool, with fairly accurate estimates of tilt axis, tilt angle and sample thickness and improved CTF estimation. The manuscript critically defines the current range of application of the new methods in cryoEM.

    3. Reviewer #2 (Public review):

      This paper describes the latest version of the most popular program for CTF estimation for cryo-EM images: CTFFIND5. New features in CTFFIND5 are the estimation of tilt geometry, including for samples, like FIB-milled lamellae, that are pre-tilted along a different axis than the tilt axis of the tomographic experiment, plus the estimation of sample thickness from the expanded CTF model described by McMullan et al (2015). The results convincingly show the added value of the program for thicker and tilted images, such as are common in modern cryo-ET experiments. The program will therefore have a considerable impact on the field.

      Comments on revised version:

      My comments have been addressed adequately.

    4. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their detailed comments. Several comments revolved around potential improvements in the 3D reconstructions that are obtained in later steps of the image processing pipelines for single-particle cryoEM and cryo-electron tomography. We have not investigated how our improvements in CTFFIND5 affect these downstream results and can therefore not make specific and quantitative statements in this regard. However, CTFFIND5 provided additional information about the sample that users will find useful (thickness, tilt) for selecting the data they would like to include in later processing, and how to process them. Furthermore, when the sample tilt of a thin specimen is known, local defocus estimates (e.g., per-particle defocus estimates) will be more accurate compared to estimates that ignore tilt information. In the following, we provide point-by-point responses to the reviewers’ comments.

      Reviewer #1 (Public Review):

      This work presents CTFFIND5, a new version of the software for determination of the Contrast Transfer Function (CTF) that models the distortions introduced by the microscope in cryoEM images. CTFFIND5 can take acquisition geometry and sample thickness into consideration to improve CTF estimation.

      To estimate tilt (tilt angle and tilt axis), the input image is split into tiles and correlation coefficients are computed between their power spectra and a local CTF model that includes the defocus variation according to a tilted plane. As a final step, by applying a rescaling factor to the power spectra of the tiles, an average tilt-corrected power spectrum is obtained and used for diagnostic purposes and to estimate the goodness of fit. This global procedure and the rescaling factor resemble those used in Bsoft, Warp, etc, with determination of the tilt parameters being a feature specific to CTFFIND5 (and formerly CTFTILT). The performance of the algorithm is evaluated with tilted 2D crystals and tilt-series, demonstrating accurate tilt estimation in some cases and some limitations in others. Further analysis of CTF determination with tilt-series, particularly showing whether there is accurate or stable estimation at high tilts, might be helpful to show the robustness of CTFFIND5 in cryoET.

      CTFFIND5 represents the first CTF determination tool that considers the thickness-related modulation envelope of the CTF first described by McMullan et al. (2015) and experimentally confirmed by Tichelaar et al. (2020). To this end, CTFFIND5 uses a new CTF model that takes the sample thickness into account. CTFFIND5 thus provides more accurate CTF estimation and, furthermore, gives an estimation of the sample thickness, which may be a valuable resource to judge the potential for high resolution. To evaluate the accuracy of thickness estimation in CTFFIND5, the authors use the Lambert-Beer law on energy-filtered data and also tomographic data, thus demonstrating that the estimates are reasonable for images with exposure around 30 e/Å². While consideration of sample thickness in CTF determination sounds ideally suited for cryoET, practical application under the standard acquisition protocols in cryoET (exposure of 3-5 e/Å² per image) is still limited. In this regard, the authors are honest in the conclusions and clearly identify the areas where thickness-aware CTF determination will be valuable at present: e.g. in situ single particle analysis and in vitro single particle cryoEM of purified samples at low voltages.

      In conclusion, the manuscript introduces novel methods inside CTFFIND5 that improve CTF estimation, namely acquisition geometry and sample thickness. The evaluation demonstrates the performance of the new tool, with fairly accurate estimates of tilt axis, tilt angle and sample thickness and improved CTF estimation. The manuscript critically defines the current range of application of the new methods in cryoEM.

      Reviewer #2 (Public Review):

      Summary:

      This paper describes the latest version of the most popular program for CTF estimation for cryo-EM images: CTFFIND5. New features in CTFFIND5 are the estimation of tilt geometry, including for samples, like FIB-milled lamellae, that are pre-tilted along a different axis than the tilt axis of the tomographic experiment, plus the estimation of sample thickness from the expanded CTF model described by McMullan et al (2015). The results convincingly show the added value of the program for thicker and tilted images, such as are common in modern cryo-ET experiments. The program will therefore have a considerable impact on the field.

      I have only minor suggestions for improvement below:

      Abstract: "[CTF estimation] has been one of the key aspects of the resolution revolution"-> This is a bit over the top. Not much changed in the actual algorithms for CTF estimation during the resolution revolution.

      We have removed this statement in the abstract.

      L34: "These parameters" -> Cs is typically given, only defocus (and if relevant phase shift) are estimated.

      We have modified the introduction to reflect this. Page 3, L30-35

      L110-116: The text is ambiguous: are rotations defined clockwise or counter-clockwise? It would be good to explicitly state what subsequent rotations, in which directions and around which axes this transformation matrix (and the input/output angles in CTFFIND5) correspond to.

      Thank you for pointing this out. We have revised the Methods section, Page 4 L57-61,  to explicitly define the convention for the tilt axis and tilt angle. We have also modified Fig. 1b to illustrate our convention for the tilt axis.

      L129-130: As a suggestion: it would be relatively easy, and possibly beneficial to the user, to implement a high-resolution limit that varies with the accumulated dose on the sample. One example of this exists in the tomography pipeline of RELION-5.

      We appreciate the suggestion. However, since CTFFIND5 currently has no concept of a tilt-series and treats every micrograph independently, this would not be trivial to implement. As detailed below, CTFFIND5 in its current form is not targeted toward tomography processing, but its features might be useful for its use in pipelines for tomography processing, such as RELION-5. We made this more explicit in the conclusion section. Page 16 L390-399

      Substituting Eq (7) into Eq (6) yields ksi=pi, which cannot be true. If t is the sample thickness, then how can this be a function of the frequency g of the first node of the CTF function? The former is a feature of the sample, the latter is a parameter of the optical system. This needs correction.

      We have rewritten the text describing equations 7 and 6 to avoid this confusion (Page 7, L146-153). The reviewer is right that inserting Eq. 7 into Eq. 6 yields ksi=pi, as in fact Eq. 7 is derived from Eq. 6 by substituting ksi=pi, since this describes the condition for the first node. Also, in this context, nodes in the CTF function refer to the places where the term sinc(ksi) becomes zero and therefore the CTF is apparently "flat". The frequency at which this occurs is sample-thickness dependent. As explained below, the previous version of our manuscript did not point out the difference between the first zero and first node in the power spectrum. We have amended Fig. 3a to make this difference clearer.
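
      The node condition discussed in this reply can be sketched numerically. The snippet below is a minimal illustration, not the CTFFIND5 implementation: it assumes the sinc argument has the form ksi = pi * lambda * g^2 * t (the form obtained when a defocus-dependent CTF is averaged over a slab of thickness t), so that the first node, ksi = pi, gives t = 1 / (lambda * g^2). The function names are hypothetical, and the 2000 Å thickness is borrowed from the sample thickness quoted elsewhere in the responses.

```python
import math

def electron_wavelength(voltage_kv):
    """Relativistic electron wavelength in Angstrom for a voltage in kV."""
    volts = voltage_kv * 1e3
    return 12.2643 / math.sqrt(volts + 0.97845e-6 * volts ** 2)

def xi(wavelength, g, thickness):
    """Argument of the sinc envelope (ASSUMED form: pi * lambda * g^2 * t)."""
    return math.pi * wavelength * g ** 2 * thickness

def sinc(x):
    """Unnormalised sinc, sin(x)/x, with the removable singularity at 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def first_node_frequency(wavelength, thickness):
    """Spatial frequency (1/Angstrom) where xi = pi, i.e. sinc(xi) = 0."""
    return 1.0 / math.sqrt(wavelength * thickness)

def thickness_from_first_node(wavelength, g_node):
    """Invert the node condition: t = 1 / (lambda * g^2)."""
    return 1.0 / (wavelength * g_node ** 2)

wavelength = electron_wavelength(300.0)             # 300 kV, about 0.0197 Angstrom
g_node = first_node_frequency(wavelength, 2000.0)   # 2000 Angstrom thick sample
print(f"first node at {1.0 / g_node:.1f} Angstrom resolution")
```

      Under these assumptions the first node of a 2000 Å sample at 300 kV falls at roughly 6.3 Å, which illustrates why the node frequency is a property of both the sample (t) and the optics (lambda): the thickness is read off from where the envelope vanishes.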

      Reviewer #3 (Public Review):

      In this manuscript, the authors detail improvements in the core CTFFIND (CTFFIND5 as implemented in cisTEM) algorithm that better estimates CTF parameters from tilted micrographs and those that exhibit signal attenuation due to ice thickness. These improvements typically yield more accurate CTF values that better represent the data. Although some of the improvements result in slower calculations per micrograph, these can be easily overcome through parallelization.

      There are some concerns outlined below that would benefit from further evaluation by the authors.

      For the examples shown in Figure 3b, given the small differences in estimated defocus1 and 2, what type of improvements would be expected in the reconstructed tomograms? Do such improvements in estimates manifest in better tilt-series reconstruction?

      As explained in our preface, we do not believe that these differences would manifest in any improvements during tilt-series reconstruction or create any meaningful differences, even when tomograms are reconstructed with CTF correction. They might become meaningful during subtomogram averaging, but subtomograms are usually corrected using per-particle CTF estimation, similar to single-particle processing. We have included a new paragraph in the discussion to describe potential benefits of CTFFIND5 for cryo-tomography, Page 16 L390-399.

      Similarly, the data shown in Figure 3C shows minimal improvements in the CTF resolution estimate (e.g., 4.3 versus 4.2 Å), but exhibited several hundred Å difference in defocus values. How do such differences impact downstream processing? Is such a difference overcome by per-particle (local) CTF refinements (like the authors mention in the discussion, see below)?

      The difference in the defocus estimate (~600 Å) is substantially smaller than the thickness of the sample (2000 Å). Hence both estimates may be valid, depending on which particles inside the sample are considered. Particles with larger defocus errors could certainly be corrected by per-particle CTF refinement as long as the search range is chosen to be large enough. The main benefit of using CTFFIND5 is information for the user regarding the sample thickness to set the defocus search range appropriately.

      At which point does the thickness of the specimen preclude the ice thickness modulation to be included for "accurate" estimate? 500Å? 1000Å? 2000Å? Based on the data shown in Figure 3B, as high as 969 Å thick specimens benefit moderately (4.6 versus 3.4 Å fit estimate), but perhaps not significantly, from the ice thickness estimation. Considering the increased computational time for ice thickness estimation, such an estimate of when to incorporate for single-particle workflows would be beneficial.

      As explained in our preface, the main benefit for single-particle workflows will be sample tilt estimation. This will provide more accurate per-particle defocus estimates, compared to estimates that do not take the tilt into account. For single-particle samples, the ice thickness in holes is probably more efficiently monitored using the Beer-Lambert law.
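
      For readers unfamiliar with the Beer-Lambert approach mentioned here, a minimal sketch follows. It assumes the common form t = L * ln(I_ref / I_sample), where I_ref is a reference intensity (for example an empty hole, or the unfiltered image when an energy filter is used) and L is an apparent mean free path that must be calibrated for each microscope and imaging condition. The function name and the numbers in the usage lines are placeholders, not values from the manuscript.

```python
import math

def beer_lambert_thickness(intensity_reference, intensity_sample, mean_free_path):
    """Estimate thickness as L * ln(I_ref / I_sample).

    The result is in the same units as mean_free_path, which is an
    instrument-specific calibration constant (ASSUMED to be known).
    """
    if intensity_reference <= 0 or intensity_sample <= 0:
        raise ValueError("intensities must be positive")
    return mean_free_path * math.log(intensity_reference / intensity_sample)

# Placeholder numbers for illustration only.
thickness = beer_lambert_thickness(
    intensity_reference=100.0,   # hypothetical mean counts without sample
    intensity_sample=80.0,       # hypothetical mean counts through the ice
    mean_free_path=3000.0,       # hypothetical calibration, in Angstrom
)
print(f"estimated thickness: {thickness:.0f} Angstrom")
```

      Because this estimate needs only mean intensities, it is far cheaper than fitting a thickness-aware CTF model, which is consistent with the authors' remark that it is the more efficient monitor for ice in holes.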

      It would seem that this statement could be evaluated herein: "the analysis of images of purified samples recorded at lower acceleration voltages, e.g., 100 keV (McMullan et al., 2023), may also benefit since thickness-dependent CTF modulations will appear at lower resolution with longer electron wavelengths". There are numerous examples of 300kV, 200kV, and 100kV EMPIAR datasets to be compared and recommendations would be welcomed.

      Publicly available datasets recorded at 100kV and 200kV were collected in very thin ice, making it difficult to demonstrate the stated benefits. We have removed this statement.

      Although logical, this statement is not supported by the data presented in this manuscript: "The improvements of CTFFIND5 will provide better starting values for this refinement, yielding better overall CTF estimation and recovery of high-resolution information during 3D reconstruction."

      We have revised this statement and now explain that the sample tilt information will provide more accurate per-particle defocus estimates, compared to estimates that do not take the tilt into account, Page 17, L400-409. We did not investigate how this will affect downstream processing results.

      Moreso, the lack of single-particle data evaluation does present a concern. Naively, these improvements would benefit all cryoEM data, regardless of modality.

      We agree with the reviewer that all cryoEM modalities should benefit from more accurate defocus value estimates and have amended our concluding statement. However, how improved defocus values will benefit downstream processing results will depend on the processing pipeline, which includes various points of user input and data-dependent choices. We have therefore limited our analysis to the outputs of CTFFIND5.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) CTFFIND5 in cryo-ET

      (1.1) CTFFIND4 is prone to unreliable CTF estimates at high tilts in cryoET, a situation that can be identified by high variability or 'unstable' estimates as a function of the tilt angle. Prof. Mastronarde recently illustrated this situation in his article JSB 216:108057, 2024 (Fig. 7). Therefore, the authors could add results to show whether the improvements to tilt estimation introduced in CTFFIND5 overcome this problem. So, in addition to the estimation of tilt angle and tilt axis in Figure 2, the estimated defocus could also be shown.

      We have worked with Prof. Mastronarde to help him use CTFFIND as a tool in his cryoET processing pipeline. Mastronarde chose CTFFIND because it contains algorithms and architecture that he could optimize for his purposes. CTFFIND5 is currently lacking the concept of a tilt series and can therefore not take advantage of the additional information that comes with tilt series. Our own applications for CTFFIND5 currently do not include tomography, and our results presented in Fig. 2 were obtained for validation of the tilt estimation feature. We did not attempt to duplicate Mastronarde’s optimization for reliable tilt series processing.

      Figure 2b of this manuscript already suggests that CTFFIND5 may exhibit some variability of defocus estimates at high tilts (in view of the variability of tilt axis angle). A strategy used in IMOD and TOMOCTF is to consider the tiles of a group of consecutive images (typically 35; especially at high tilts) to add more signal to the average spectrum, thus providing more reliable estimates (illustrated in Mastronarde's article JSB 216:108057, 2024, Fig. 8). Do the authors think that CTFFIND5 might include a strategy like this for cryoET tilt-series?

      We currently do not have plans to develop CTFFIND5 as a tool for tomography as there are already other excellent tools available, some of them based on CTFFIND’s basic algorithms (see previous comment).

      (1.2) In cryoET, the CTF is often determined on the aligned tilt-series, with the tilt axis typically running along the Y axis. Does CTFFIND5 have the option to skip estimation of the tilt geometry (tilt angle and/or axis) and, instead, take the tilt geometry directly from the alignment and/or from the microscope? This would significantly speed up determination of the CTF (in 1-2 seconds per image, according to Table 2) while still taking advantage of all power spectra in tilted images (as described in their tilt estimation algorithm) for improved CTF estimation. This strategy would be similar to what is done in Bsoft and IMOD.

      This is an excellent idea and we may implement this in an updated version. The current version is primarily meant for lamellae and single-particle samples where we usually have a single tilt in an unknown direction. For these cases, the suggested feature will have less benefit. 

      Thus, I suggest that the authors should also include results comparing CTF estimation in aligned tilt-series with CTFFIND4 and with CTFFIND5 (with no tilt estimation but indeed taking the tilt information from the alignment or the microscope into account). The results would show that CTFFIND5 is more robust than CTFFIND4, especially at high tilts.

      Thank you for this suggestion. We are now showing a comparison of defocus estimates from CTFFIND4 and CTFFIND5 in Fig. 2. Indeed, in one case CTFFIND5 seems to report more robust defocus values at high tilt.

      (1.3) The newer improvements in CTFFIND5 seem to be especially tailored to cryoET. The cryoET community will be highly attracted by these improvements. However, the current standard acquisition protocols (exposure of 3-5 e/Å² per image, tilts up to 60 degrees, etc) limit their full exploitation, particularly the thickness-aware CTF determination. I believe that adding a paragraph exclusively focused on cryoET and describing the potential benefits from CTFFIND5 and their limitations could enrich the Conclusion section. In this paragraph, the authors could highlight the great benefits from the tilt-aware CTF estimation. They could also discuss the current standard acquisition protocols (e.g. exposure 3-5 e/Å² per image, nominal defocus 3-5 microns, cellular thickness from 150 nm up to 200-300 nm that, at a tilt of 60 degrees, become 300 nm up to 400-600 nm) and their implications for the potential benefit from the improvements available in CTFFIND5.

      This reviewer is clearly excited about the potential application of CTFFIND5 in cryoET. We are sorry that we are currently not developing CTFFIND5 in this direction.
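
      The tilt-dependent thickness figures quoted in point 1.3 follow from simple slab geometry: the path length through a flat sample of thickness t tilted by an angle theta is t / cos(theta). The short sketch below reproduces that arithmetic; the function name is hypothetical and the flat-slab geometry is an idealisation.

```python
import math

def apparent_thickness(thickness_nm, tilt_deg):
    """Path length of the beam through a flat slab tilted by tilt_deg degrees."""
    return thickness_nm / math.cos(math.radians(tilt_deg))

# Thickness values taken from the reviewer's comment, evaluated at 60 degrees.
for t in (150.0, 200.0, 300.0):
    print(f"{t:.0f} nm at 60 degrees -> {apparent_thickness(t, 60.0):.0f} nm")
```

      At 60 degrees the path length doubles, which is how 150 nm becomes 300 nm and 200-300 nm becomes 400-600 nm.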

      (1.4) Apologies for insisting on cryoET in the previous points. I am just trying to suggest ideas to make CTFFIND5 even more helpful in cryoET. You can consider them now, or for a future version of the software, or just ignore them.

      Thanks for your suggestions. Since there is clearly demand for tools to process tomographic tilt series, we will keep these suggestions in mind for the future development of CTFFIND.

      (2) Tilt estimation

      (2.1) Page 4. Tiles for the initial steps in tilt estimation are of size 128x128. At which point are tiles of larger size (e.g. 512x512) used? Please define.

      Thank you for pointing out this lack of clarity. For the tilt estimation, we used a tile size of 128 x 128, which has been hard-coded in our program, as mentioned in line 68 on page 4. For generating the final power spectrum, we usually use a size of 512 x 512. This tile size can be defined by the user when running the program. We have now clarified this on Page 4, L74-76.

      (2.2) Page 6 and/or page 11: evaluation of tilt estimation with tilt-series.

      Please indicate the acquisition details of the tilt-series used for the evaluation, especially the exposure per image. This information is neither available in this manuscript nor in Elferich et al., 2022.

      Please, add these acquisition details similarly to page 9 in this manuscript (evaluation of sample thickness estimation using tomography): pixel size, exposure per image and total exposure, number of images, tilt range and interval

      The same tilt-series were used to verify tilt-estimation and sample thickness. We have revised the Methods section to make this clear on Page 5, L98-105 and Page 10, L202.

      (2.3) Page 10. Section Results. Subsection Tilt estimation.

      The authors use "defocus correction" to refer to their method for scaling the power spectra. "Defocus correction" might perhaps be a misleading term. In contrast, in page 4 the authors use the term "tilt correction". Please, revise and make it consistent throughout the manuscript.

      We agree and now use “tilt correction” throughout the manuscript.

      (2.4) Legend of Figure 2.

      Please add what the red dashed curve represents. Also, please note there might be an error in the estimated stage tilt axis angle: the legend states "171.8" where in the main text it is "178.2" (apparently, the latter is the correct one).

      Thank you for pointing this out. We have modified the legend and changed the number in the legend to 178.2°.

      (3) Thickness estimation

      (3.1) Line 141, page 7. The sentence reads: "The modulation of the CTF due to sample thickness t is described by the function E (current Equation 6), "  I believe that the modulation envelope of the CTF due to sample thickness is not really E (current Equation 6), but the function sinc(E). Please, revise.

      We have revised the manuscript as advised, Page 7, L148.

      (3.2) Line 148, page 7. The sentence reads "an estimate of the frequency g of the first node of the CTF_t function "

      The concept of 'node' was introduced by Tichelaar et al. (2020). The authors should not assume that this concept is familiar to the readership. So, it is suggested that the authors should introduce this concept in this section. For instance, just after Equation 6 they could add a sentence like this: "This sinc modulation envelope increasingly attenuates the amplitude of the Thon rings with increasing spatial frequencies in an oscillatory fashion, with locations where the amplitude is zero known as nodes (Tichelaar et al., 2020)."

      Thank you for this suggestion. We have revised the manuscript accordingly (Page 7, L151-156) and also marked the position of the first node in Fig. 3a.

      (3.3) Line 154, page 8: A citation is lacking: "(corrected for astigmatism, as described in )". Perhaps the authors refer to the EPA (EquiPhase Averaging) method introduced by Zhang, JSB 193:1-12, 2016, 10.1016/j.jsb.2015.11.003.

      Thanks for spotting this omission. We have added the appropriate reference.

      (3.4) Figure 3.

      (3.4.1) Perhaps, the EPA (EquiPhase Averaging) method is used to reduce the 2D CTF to 1D curves, as represented in Figure 3b and 3c. Please, mention this in the legend of the figure or in the main text referring to Figure 3. The same might apply to Figure 1c.

      Thanks for spotting this omission. We have clarified that this is indeed an EPA in the figure legends.

      (3.4.2) Please indicate what the colored curves represent in 3b and 3c: the fitted CTF model (dashed red) and the EPA or astigmatism-corrected radial average of the power spectrum (solid black)?

      Thanks for spotting this omission. We have added descriptions of the colored lines in these plots (red = modeled CTF, blue = goodness of fit).

      (3.4.3) Please note that the power spectrum (solid black curves in Figure 3b and 3c) does not look the same in the top and bottom panels: Without thickness estimation (top panels), the power spectrum is in the range [0,1] in Y, as expected. However, with thickness estimation (bottom panels), the power spectrum seems to have undergone a frequency-dependent transformation (a rescaling or something that makes the power spectrum oscillate around 0.5 in Y). This transformation of the power spectrum resembles the thickness-induced sinc modulation of the CTF and seems to be appropriate to better fit the new thickness-aware CTF_t model in CTFFIND5 to the (transformed) power spectrum. However, this transformation of the power spectrum is not mentioned in the manuscript at all. Instead, according to the main text (page 8), the fitting method is based on the cross-correlation between the new CTF model and the power spectrum, so I was expecting to see the same power spectrum black curve in the top and bottom panels. Please clarify.

      Indeed, CTFFIND5 displays the power spectrum differently after thickness estimation. We have revised the methods to explain this (page 8, L178-181). The reviewer is also correct that the 1D line plots of the Thon ring patterns in Fig. 3b and 3c are not identical. These 1D plots are generated from the 2D plots according to the fitted CTF, which is needed to follow the astigmatic rings and avoid blurring of the oscillations in the radial average. This means that different CTF fits will also result in somewhat different 1D plots. However, these differences only affect the 1D EPA plots shown to the user. The actual fitting is performed against the same 2D spectra.

      (3.4.4) Line 319, Page 14. "A linear fit revealed .." It would be good to add a line with the linear fit in Figure 5.

      Agreed. The revised Fig. 5 now shows a line for the linear fit.

      (3.5) New CTF Model

      It is not clear from the text whether the new CTF_t model is used at all times in CTFFIND5 or only when the user requests thickness estimation. Relatedly, if the user requests both tilt estimation and thickness estimation, how is the CTF estimation process carried out in CTFFIND5? Are tilt and thickness estimated at the same time, or one after the other (i.e. first the tilt is estimated, followed by thickness estimation)? Please clarify.

      The new CTF_t model is only used when the user requests thickness estimation. When both tilt-estimation and thickness estimation are requested, the tilt is estimated first and the corrected power spectrum is then fitted using the CTF_t model. We have revised the Methods section to explain this better, Page 8, L158-159.

      (4) Pages 14-15. Section "CTF estimation and correction assists "

      This section just shows that correcting a highly underfocused image for the CTF with phase flipping or a Wiener filter reduces the CTF-induced fringes. I do not really understand the inclusion of this section in the manuscript. There is no contribution related to CTFFIND5.

      The ability to apply a CTF correction to the input image according to Tegunov & Cramer is a new feature of apply_ctf, a program included with cisTEM. We think that this section fits into the theme of CTFFIND5 because the correction adds valuable information about the samples, such as FIB-milled lamellae.

      If the authors prefer to keep this section, then please take the following points into account:

      (4.1) Figure 6b: This is the only time that the term "EPA" (EquiPhase Averaging, I guess) is used in the manuscript. Please, spell it out somewhere in the manuscript, define what it means and add a proper citation, if convenient. This point is related to point 3.3 above.

      We have added the appropriate reference and defined EPA in the methods section as indicated in the reply to point 3.3.

      (4.2) Figure 6d. The contrast of this image is poor. Please, increase the contrast (to be similar to Figure 6c) so that the details can be better discerned. The image also shows a grainy texture, likely artefacts from the Wiener filter due to excessive amplification. Maybe the 'strength parameter' S of the deconvolution Wiener filter (Tegunov & Cramer, 2019) should be tuned down or the 'fall-off parameter' F tuned up to try to attenuate these artefacts.

      Agreed. The revised figure shows panel d with increased contrast with the custom fall-off parameter set to 1.3 and the custom strength parameter set to 0.7.

      (5) CTFFIND5 runtimes

      Table 2 shows that estimation of tilt increases the runtime up to 39 s in an image of 4070x2892 and to 208 s in one of 2880x2046. There is a significant difference between these two cases (39 s vs. 208 s) and the first image is much larger than the second. Why does CTFFIND5 on the smaller image take so long compared to the larger image?

      During tilt estimation, the images are binned to a pixel size of 5 Å. This causes micrograph 1 to be substantially smaller (in pixels) than micrographs 2 and 3, resulting in the faster runtime.
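
      The runtime explanation above can be made concrete with a small calculation. The sketch below assumes that binning simply rescales each image dimension by the ratio of the original to the target pixel size (how CTFFIND5 resamples internally is not described in this response). The pixel sizes in the usage lines are hypothetical, chosen only to show how an image with more pixels can end up smaller after binning to 5 Å per pixel.

```python
def binned_shape(shape_px, pixel_size, target_pixel_size=5.0):
    """Image dimensions after binning to target_pixel_size (Angstrom/pixel).

    ASSUMPTION: each dimension is divided by target/original pixel size and
    rounded; images already coarser than the target are left unchanged.
    """
    factor = target_pixel_size / pixel_size
    if factor <= 1.0:
        return tuple(shape_px)
    return tuple(max(1, round(n / factor)) for n in shape_px)

# Image dimensions from Table 2 as quoted by the reviewer;
# the pixel sizes (1.0 and 2.5 Angstrom) are HYPOTHETICAL.
print(binned_shape((4070, 2892), pixel_size=1.0))
print(binned_shape((2880, 2046), pixel_size=2.5))
```

      With these illustrative pixel sizes the first image shrinks to 814 x 578 pixels while the second remains 1440 x 1023, so the nominally larger micrograph is the cheaper one to process during tilt estimation.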

      (6) Conclusions

      (6.1) In the Conclusion section, the authors could elaborate a bit the insights about the sample quality provided by CTFFIND5. This is stated in the title of the manuscript, but it was hardly mentioned in the manuscript.

      We have revised the conclusion to make this clearer (Page 16, L389-396). CTFFIND5 helps in estimating sample quality since (1) the sample thickness is an important determinant in the amount of high-resolution signal in a micrograph and (2) the estimated fit-resolution reflects more accurately the amount of signal present in a micrograph after tilt and sample thickness have been taken into account.

      (6.2) The authors nicely identify and describe the applications where thickness-aware CTF determination will be valuable: in situ single particle analysis and in vitro single particle cryoEM of purified samples at low voltages. Perhaps, CTFFIND5 will also be of great interest for single particle cryoEM of thick specimens (e.g. capsid of large viruses with diameter in the range 120-200 nm such as PBCV-1 or HSV-1).

      Agreed. We have added this case to our Conclusions. (Fig. 3d)

      (7) Typographical errors:

      line 161, page 8. "1.5 time" should be "1.5 times"

      lines 185-191. All exposures are given in 'electrons/Angstrom', not in 'electrons/square Angstrom'

      line 206, page 10. With "slides" the authors seem to mean "slices"

      line 338, page 14: "describeD by Tegunov"

      line 349, page 15. "power spectra"

      lines 366 and 368, page 15: Note that Square Angstrom is written as "A2". Put "2" with superscript.

      Thank you for pointing out these errors. They have been corrected.

      (8) References:

      Reference: Lucas et al., eLife 10 e68946. Year is lacking. Add year: 2021.

      Reference: Yan et al. 2015 cited in line 169, page 8, does not appear in Bibliography. The authors may mean: Yan et al. 2015 JSB 192:287-296, 2015  

      It would be good to cite Bsoft, as it has a procedure similar to tilt-corrected CTF estimation: Heymann, Protein Science, 2021,  

      Thank you for carefully checking the cited references. We have revised the manuscript as suggested.

      Reviewer #2 (Recommendations For The Authors):

      I have only minor suggestions for improvement below:

      L218: "these option"

      Corrected

      L243: "chevron-shape" -> V-shape would be more accessible language for non-native speakers.

      Changed

      L281: "Based on these results we conclude that CTFFIND5 will provide more accurate CTF parameters" -> Given that the maximum resolutions of the fits by the old model and the new model are nearly the same, how big would the actual advantage of the new model be for subsequent sub-tomogram averaging?

      Please see our response above to Reviewer #3 (Public Review).

      L376: The correct reference for RELION per-particle CTF estimation is Zivanov et al, (2018) [https://elifesciences.org/articles/42166]. Also, the cryoSPARC paper referenced does not describe per-particle CTF estimation and should thus be removed from this context.

      Thanks for pointing out these mistakes, which we have now corrected. We have chosen to keep the citation for CryoSPARC to reference the general software, but have added Zivanov et al. 2020 as suggested by the CryoSPARC website.

      Reviewer #3 (Recommendations For The Authors):

      Minor:

      Figure 1A legend - authors mention boxes but only 1 box is shown.

      Thank you for pointing this out. For visual clarity we decided to only show one box. We have corrected the legend.

      Figure 1B - it would be nice if the boxes that contributed to the power spectra were mapped on Figure 1A

      The shown power spectra are not actual data. Instead, we show power spectra with exaggerated defocus differences for visual clarity. We have revised the figure legends to make this clear. 

      The Y-axis legends in Figure 2 are not aligned vertically

      Corrected

      Figure 3A - CTFFIND4 is missing an "I"

      Corrected

      Figure 3 - Y-axis legends are not aligned vertically

      Corrected

      Page 16, line 376, Relion should be RELION

      We have revised the manuscript as advised.

      Typo in equation 5, sinc versus sin?

      “sinc” is correct here, since this is a thickness-dependent modulation of the CTF.

      Lambert-Beer's, Lambert-Beer are used variably but curious if Beer-Lambert should be used.

      We have revised the manuscript as advised.

    1. eLife Assessment

      This computational study integrates detailed electrophysiology and mechanical contraction predictions, which are often modeled separately. The findings of this important work are that abnormal ECGs that are associated with higher risk of sudden cardiac death are predicted to have almost no relationship with left ventricular ejection fraction, which is conventionally used as a risk factor for arrhythmia. The conclusions are based on compelling evidence for the need of incorporating additional risk factors for assessing post-myocardial infarction patients.

    2. Reviewer #1 (Public review):

      Summary:

      In this study from Zhou, Wang, and colleagues, the authors utilize biventricular electromechanical simulations to illustrate how different degrees of ionic remodeling can contribute to different ECG morphologies that are observed in either acute or chronic post-myocardial infarction (MI) patients. Interestingly, the simulations show that abnormal ECG phenotypes - associated with higher risk of sudden cardiac death - are predicted to have almost no correspondence with left ventricular ejection fraction, which is conventionally used as a risk factor for arrhythmia.

      Strengths:

      The numerical simulations are state-of-the-art, integrating detailed electrophysiology and mechanical contraction predictions, which are often modeled separately. The population of ventricular simulations provide mechanistic interpretation, down to the level of single cell ionic current remodeling, for different types of ECG morphologies observed in post-MI patients. Collectively, these results demonstrate compelling and significant evidence for the need of incorporating additional risk factors for assessing post-MI patients.

      The authors have addressed all of my previous concerns in this updated version.

    3. Reviewer #2 (Public review):

      Summary:

      The authors constructed multi-scale modeling and simulation methods to investigate the electrical and mechanical properties under acute and chronic myocardial infarction (MI). They simulated three acute MI conditions and two chronic MI conditions. They showed that these conditions gave rise to distinct ECG characteristics that have been seen in clinical settings. They showed that the post-MI remodeling reduced ejection fraction up to 10% due to weaker calcium current or SR calcium uptake, but the reduction of ejection fraction is not sensitive to remodeling of the repolarization heterogeneities.

      Strengths:

      The major strength of this study is the construction of the computer modeling that simulates both electrical behavior and mechanical behavior for post-MI remodeling. The links of different heterogeneities due to MI remodeling to different ECG characteristics provide some useful information for understanding the complex clinical problems.

      Weaknesses:

      The rationale (e.g., physiological or medical bases) for choosing the 3 acute MI and 2 chronic MI settings is not clear. Although the authors presented a large amount of simulation data, in particular in the supplemental materials, it is not clearly stated what novel findings or mechanistic insights this study gained beyond the current understanding of the problem.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study by Zhou, Wang, and colleagues, the authors utilize biventricular electromechanical simulations to illustrate how different degrees of ionic remodeling can contribute to different ECG morphologies that are observed in either acute or chronic post-myocardial infarction (MI) patients. Interestingly, the simulations show that abnormal ECG phenotypes - associated with a higher risk of sudden cardiac death - are predicted to have almost no correspondence with left ventricular ejection fraction, which is conventionally used as a risk factor for arrhythmia.

      Strengths:

      The numerical simulations are state-of-the-art, integrating detailed electrophysiology and mechanical contraction predictions, which are often modeled separately. The simulation provides mechanistic interpretation, down to the level of single-cell ionic current remodeling, for different types of ECG morphologies observed in post-MI patients. Collectively, these results demonstrate compelling and significant evidence for the need to incorporate additional risk factors for assessing post-MI patients.

      Weaknesses:

      The study is rigorous and well-performed. However, some aspects of the methodology could be clearer, and the authors could also address some aspects of the robustness of the results. Specifically, does variability in ionic currents inherent in different patients, or the location/size of the infarct and surrounding remodeled tissue impact the presentation of these ECG morphologies?

      We thank the reviewer for their considered evaluation. In response to the reviewer’s comments regarding variability in ionic currents, we have added simulations using a population of n=17 models with variability in ionic conductances in the baseline ToR-ORd model to the paper, to show the effect of such variation on the post-MI ECG presentation in acute and chronic conditions. This is now described in the Methods [lines 140, 158-161, 242-244, 245-246, 261-263], and shown in the Methods Figure 1A and 1B. The ECG results using this population of models are shown in Figure 2C and described in [lines 333-335], and the pressure-volume results using the population of models are shown in Figure 5A and 5B and described in [lines 417-418, 442-444, 448-450]. The population of models showed the same patterns in both the ECG and LVEF as the baseline model; this is discussed in [lines 563-564, 688-690].

      Regarding the effect of scar location and size on the ECG, we refer the reader and reviewer to a related paper where this is explored in depth using a formal sensitivity analysis and deep learning inference (https://pubmed.ncbi.nlm.nih.gov/38373128/). This is better able to do justice to this question rather than overloading this paper with additional investigations. We include a reference to this paper in the discussion section [lines 694-695].

      Reviewer #2 (Public Review):

      Summary:

      The authors constructed multi-scale modeling and simulation methods to investigate the electrical and mechanical properties of acute and chronic myocardial infarction (MI). They simulated three acute MI conditions and two chronic MI conditions. They showed that these conditions gave rise to distinct ECG characteristics that have been seen in clinical settings. They showed that the post-MI remodeling reduced ejection fraction by up to 10% due to weaker calcium current or SR calcium uptake, but that the reduction of ejection fraction is not sensitive to remodeling of the repolarization heterogeneities.

      Strengths:

      The major strength of this study is the construction of computer modeling that simulates both electrical behavior and mechanical behavior for post-MI remodeling. The links of different heterogeneities due to MI remodeling to different ECG characteristics provide some useful information for understanding complex clinical problems.

      Weaknesses:

      The rationale (e.g., physiological or medical bases) for choosing the 3 acute MI and 2 chronic MI settings is not clear. Although the authors presented a huge number of simulation data, in particular in the supplemental materials, it is not clearly stated what novel findings or mechanistic insights this study gained beyond the current understanding of the problem.

      We thank the reviewer for their careful evaluations of our work. The justification for selecting the 3 acute MI and 2 chronic MI states is based on clinical and experimental reports, as summarised in the Methods section [lines 245-247, 252-256, 264-266].  We have also highlighted the key novelty and significance of the study in the Discussion [lines 579-582].

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) This was clarified very late in the Discussion, but for most of the paper, I was unclear if heart geometry was the same for all simulations. Presumably, this includes the size and location of the infarct, BZ, and RZ. It would be helpful to clarify this in the Methods.

      This has been clarified in the first paragraph of the Methods section [lines 142-145].

      (2) On lines 224-226, the Methods refers to implementing several population members from the ToR-ORd model (in addition to the baseline) into the biventricular EM simulations. Is this in reference to the simulations shown in Figures 6 and 7, or different simulations? Please clarify.

      We now randomly select 17 of the 245 cell models in the population to be embedded in ventricular simulations, to produce a ventricular population of models. This allows us to explore the effect that physiological variability in the baseline ionic conductances has on the phenotypic representation of ionic remodellings in the ECG and LVEF. An explanation of this can be found in the Methods section [lines 241-244].
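
      As an illustration only, the random selection described above can be sketched as a draw without replacement from the 245-member population. The function name, the use of integer indices to stand for cell models, and the fixed seed are assumptions made for this sketch; they are not taken from the authors' pipeline.

```python
import random


def select_population_members(n_population=245, n_selected=17, seed=0):
    """Draw n_selected distinct model indices from a population of size n_population.

    Hypothetical helper: indices stand in for cell models, and the seed is an
    assumption added here so that the draw is reproducible.
    """
    rng = random.Random(seed)
    # sample() draws without replacement, so no model is selected twice
    return sorted(rng.sample(range(n_population), n_selected))


selected = select_population_members()
print(len(selected), "models selected:", selected)
```

      Each selected index would then identify one cell model to embed in a ventricular simulation; fixing the seed keeps the chosen subset the same between runs.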

      For Figures 6 and 7, we selected two arrhythmic cell models from the n=245 population of cell models to be embedded into two ventricular simulations to demonstrate the arrhythmic potential of the cellular model at ventricular scale. This has been clarified in Methods [lines 269-271].

      Additionally, for the cases where a population member is used, are all regions of the ventricles "scaled" in the same manner, or were only the properties of the particular region drawn from the population modified relative to baseline (e.g., mid-myocardial cells in Figure 6)?

      The cells were embedded according to transmural heterogeneity in the remote zone for Figures 6 and 7. This has been clarified in the Methods [lines 271-273].

      (3) Interestingly, the study finds that the ionic remodeling in different peri-infarct regions to be most critical in the ECG phenotype, which at least strongly suggests that inherent intra-patient variability in ion channel expression could also be critical.

      This is related to the comment on the use of population members. If the authors utilized one of the ventricular myocyte population members as the 'reference' (instead of the baseline ToR-ORd parameters) and applied the same types of remodeling as in Figures 3 and 4, would they expect the same ECG morphologies?

      We have now performed this test and selected 17 cell models from the population to create a ventricular population of models. On top of this ventricular population, we have applied the remodellings, and showed that the simulated ECG morphologies were mostly consistent across these 20 members (Figure 2C).

      (4) Related, do the authors expect that the location and/or size of the infarct and peri-infarct regions would impact the different ECG morphologies?

      Regarding the effect of scar location and size on the ECG, we refer the reader and reviewer to a related paper where this is explored in depth using a formal sensitivity analysis and deep learning inference (https://pubmed.ncbi.nlm.nih.gov/38373128/). We feel this is better able to do justice to this question rather than overloading this paper with additional investigations. We include a reference to this paper in the discussion section [lines 694-695].

      Reviewer #2 (Recommendations For The Authors):

      (1) Although the authors listed the parameters and cited the papers for the origins of the parameter changes in SM4 and table S4, it should be summarized in the methods section what are the major changes or differences for the 5 conditions. Furthermore, it should be stated what is the rationale for choosing these conditions. Are these choices based on clinical classifications or experimental conditions?

      The major differences between the 5 conditions have now been summarised in the Methods [lines 252-256, 264-266]. These remodellings have been collated from a range of experimental measurements in both human and animal data, which are summarised in Table S4. This has been clarified in Methods [lines 245-247].

      (2) Figure 3C and Figure 4C do not add any additional information beyond the conductance changes listed in Table 4, and I'd suggest removing them from the figures. On the other hand, it took me some time to look at Table 4 to figure out the corresponding changes. As commented above, the remodeling changes should be summarized in the main text to help reading.

      Figure 3C and 4C provide a visual explanation of the ionic remodellings in these conditions to echo the added descriptions in the text [lines 252-256, 264-266]. For this reason, we have elected to keep those figures in the manuscript.

      (3) The authors presented a large amount of data in Supplemental Materials, some may be unnecessary and some are difficult to follow. For example; 1) There is a lot of data in Table S6, there is a simple mention in the main text and Table S6 legend. A summary of the data is needed for the readers to understand the properties of the different conditions, instead of letting the readers figure them out from the table. The same should be done for other tables and figures. There are some format issues for the tables, which mess up some of the numbers and text. 2) The data shown in Figures S25-29 provide almost no new information beyond the well-known effects of ionic currents on EAD genesis, i.e., EADs are promoted by inward currents and suppressed by outward currents. The data for alternans (Figures S18-22) are a little more complex than the cases for EADs, I think that they can be simplified.

      Thanks for the suggestions. We have now extracted the key information from Tables S6-S9 and summarized it in the captions. We have also fixed the layout of the tables in this revision. The supplementary sections on alternans and EADs are simplified, with the key parameters related to these proarrhythmic phenomena summarized in tables instead of showing all boxplots of parameter distributions (Tables S10 and S11).

      (4) The authors showed two mechanisms of alternans: EAD-driven and Ca-driven alternans in chronic MI. There are several distinct mechanisms of alternans including EAD-induced alternans (see the recent review by Qu and Weiss, Circ Res 132, 127(2023)). Theoretically, calcium alternans can also induce EAD alternans under proper conditions, can you rule out that the EAD alternans are not due to Ca alternans? The results in Fig.7D may say the opposite. There are some chicken-or-egg issues here.

      In Figure 7D, we showed that the epicardial cell type (blue trace) had stable EADs at fast pacing with no calcium alternans, while both the endocardial (red trace) and mid-myocardial (green trace) cell types failed to fully repolarise in every other beat. To explore whether the EAD alternans are driven by calcium alternans, we tested the effects of switching off the alternans-related remodelling, and the APs turned out to be normal. On the other hand, when we turned off the EAD-related remodelling, neither EADs nor alternans occurred. Therefore, the results show that the two types of ionic current remodelling are both necessary for the generation of EAD alternans (lines 656-659 in the discussion and SM9).

      (5) As for the formation of ectopic beats, they can be caused by EADs but they can also be caused by a repolarization gradient; the two are not the same and differ in different AP models (Liu et al, CircAE 12, e007571 (2019), Zhang et al, Biophy J 120, 352(2021)). It is not clear here whether the primary cause is the repolarization gradient or EADs. At the tissue level, EADs tend to be suppressed by the repolarization gradient, so there is a Goldilocks balance between the EAD amplitude and the repolarization gradient for an ectopic beat to form.

      When isolated cells that showed EAD were embedded in ventricular tissue, we saw ectopic wave propagation. This was because the EADs in the RZ generated conduction block, which enabled a large repolarisation gradient to form between the BZ and RZ, thereby leading to ectopy. This has been clarified in the Results [lines 507-510].

      Additionally, we have clarified the presence of the EADs in the ventricular simulations by labelling where this occurs in the green, purple, and yellow traces in Figure 7C. This was easily missed before due to the stretched proportions of the traces in the x-axis, which is necessary to show clearly the repolarisation gradients that drive ectopy.

      (6) The authors showed many population simulations. I guess that they are all in single cells. If the population simulations were done in the whole heart, it should be stated how many models were simulated. If only one of the population models was selected for the whole heart for each case, it should clarify the rationale for choosing one of the many models. If populations of cells were modeled in the whole heart, clarify how the models were distributed in the heart.

      We now randomly select 17 of the 245 cell models in the population to be embedded in ventricular simulations, to produce a ventricular population of models. This allows us to explore the effect that physiological variability in the baseline ionic conductances has on the phenotypic representation of ionic remodellings in the ECG and LVEF. An explanation of this can be found in the Methods section [lines 241-244]. Whenever the cell models are embedded in the relevant zones, they are uniformly distributed according to the transmural heterogeneity [lines 271-273].  

      (7) QRS intervals in the simulations are much wider than the real recordings from patients (Figure 2 and Table S8). At least, a QRS of 120 ms for normal control is too wide and probably not normal.

      We have manually measured QRS duration and updated the delineation method to calculate the other biomarkers. The new values now lie within normal ranges and have been updated in SM Table S7 and S8 and in Figure 2, and the new delineation method has been included in SM2.

    1. eLife Assessment

      This valuable study provides solid support for the participation of the BMP-binding domain of MuSK, a tyrosine kinase mostly known for its role at the neuromuscular junction, in the maintenance and activation of muscle stem cells (SCs). These mononucleated cells, located between the muscle fiber basal lamina and its plasma membrane, are normally quiescent, but following muscle damage, become activated, proliferate, and mediate muscle regeneration. These cells are known to respond to a variety of signaling pathways, but this study makes the case for BMP acting via binding to MuSK in maintaining the quiescent state.

    2. Reviewer #1 (Public review):

      Summary:

      Madigan et al. assembled an interesting study investigating the role of the MuSK-BMP signaling pathway in maintaining adult mouse muscle stem cell (MuSC) quiescence and muscle function before and after trauma. Using a full body and MuSC-specific genetic knockout system, they demonstrate that MuSK is expressed on MuSCs and that eliminating the BMP binding domain from the MuSK gene (i.e., MuSK-IgG KO) in mice at homeostasis leads to reduced PAX7+ cells, increased myonuclear number, and increased myofiber size, which may be due to a deficit in maintaining quiescence. Additionally, after BaCl2 injury, MuSK-IgG KO mice display accelerated repair after 7 days post-injury (dpi) in males only. Finally, RNA profiling using nCounter technology showed that MuSK-IgG KO MuSCs express genes that may be associated with the activated state.

      Strengths:

      Overall, the biology regulating MuSC quiescence is still relatively unexplored, and thus, this work provides a new mechanism controlling this process. The experiments discussed in the paper are technically sound with great complementary mouse models (full body versus tissue-specific mouse KO) used to validate their hypothesis. Additionally, the paper is well written with all the necessary information in the legends, methods, and figures being reported.

      Weaknesses:

      While the data largely support the authors' conclusions, I do have a few points to consider when reading this paper.

      (1) For Figure 1, while I appreciate the authors confirming MuSK RNA and protein in MuSCs, I do think they should (a) quantify the RNA using qPCR and (b) determine the percentage of MuSCs expressing MuSK protein in their single fiber system in multiple biological replicates. This information will help us understand if MuSK is expressed in 1/10 or 10/10 PAX7-expressing MuSCs. Also, it will help place their phenotypes into the right context, especially when considering how much of the PAX7-pool is expressing MuSK from the beginning.

      (2) Throughout the paper the argument is made that MuSK-IgG KO (full body and MuSC-specific KOs) are more activated and/or break quiescence more readily, but there is no attempt to test directly. Therefore, the authors should consider measuring the activation dynamics (i.e., break from quiescence) of MuSCs directly (EdU assays or live-cell imaging) in culture and/or in muscle in vivo (EdU assays) using their various genetic mouse models.

      (3) For Figure 2, given that mice are considered adults by 3 months, it is really surprising how just two months later they are starting to see a phenotype (i.e., reduced PAX7-cells, increased number of myonuclei, and increased myofiber size), which correlates with getting older. Given that aged MuSCs have activation defects (i.e., stuck somewhere in the quiescence cycle), a pending question is whether their phenotype gets stronger in aged mice, like 18-24 months. If yes, the argument that this pathway should be used in a therapeutic sense would be strengthened.

      (4) For Figure 4, the same question as in point (2), the increase in fiber sizes by 7dpi in MuSK-IgG KO males is minimal (going from ~23 to 27 by eye) and there is no difference at a later time point when compared to WT mice. However, if older mice (18-24 months old), which are known to have repair deficits, are used, will the regenerative phenotype in MuSK-IgG KO mice be more substantial and longer lasting?

      (5) For Figure 6, this gene set is not glaringly obvious as being markers of MuSC activation (i.e., no MyoD), so it's hard for the readers to know if this gene set is truly an activation signature. Also, the Shcherbina et al. data presented as a column with * being up or down (i.e. differentially expressed) is not helpful, since you don't know whether those mRNAs in that dataset are going up with the activation process. Addressing this point as well as my point (1) will further strengthen the authors' conclusions about the MuSK-IgG KO MuSCs not being able to maintain quiescence as effectively.

    3. Reviewer #2 (Public review):

      Summary:

      The work by Madigan et al. provides evidence that the signaling of BMPs via the Ig3 domain of MuSK plays a role during muscle postnatal development and regeneration, ultimately resulting in enhanced contractile force generation in the absence of the MuSK Ig3 domain. They demonstrate that MuSK is expressed in satellite cells initially post-isolation of muscle single fibers both in WT and whole-body deletion of the BMP binding domain of MuSK (ΔIg3-MuSK). In mice, ΔIg3-MuSK results in increased muscle fiber size, a reduction in Pax7+ cells, and increased muscle contractile force in 5-month-old, but not 3-month-old, mice. These data are complemented by a model in which the kinetics of regeneration appear to be accelerated at early time points. Of note, the authors demonstrate muscle tibialis anterior (TA) weights and fiber feret are increased in a Pax7CreERT2;MuSK-Ig3loxp/loxp model in which satellite cells specifically lack the MuSK BMP binding domain. Finally, using Nanostring transcriptional profiling, the authors identified a short list of genes that differ between the WT and ΔIg3-MuSK SCs. These data provide the field with new evidence of signaling pathways that regulate satellite cell activation/quiescence in the context of skeletal muscle development and regeneration.

      On the whole, the findings in this paper are well supported, however additional validation of key satellite cell markers and data analysis need to be conducted given the current claims.

      (1) The Pax7CreERT2;MuSK-Ig3loxp/loxp model is the appropriate model to conduct studies to assess satellite cell involvement in MuSK/BMP regulation. Validation of changes to muscle force production is currently absent using this model, as is quantification of Pax7+ tdT+ cells in 5-month muscle. Given that MuSK is also expressed on mature myofibers at NMJs, these data would further inform the conclusions proposed in the paper.

      (2) All Pax7 quantification in the paper would benefit from high magnification images including staining for laminin demonstrating the cells are under the basal lamina.

      (3) The nanostring dataset could be further analyzed and clarified. In Figure 6b, it is not initially apparent what genes are upregulated or downregulated in young and aged SCs and how this compares with your data. Pathway analysis geared toward genes involved in the TGFb superfamily would be informative.

      (4) Characterizing MuSK expression on perfusion-fixed EDL fibers would be more conclusive to determine if MuSK is expressed in quiescent SCs. Additional characterization using MyoD, MyoG, and Fos staining of SCs on EDL fibers would help inform on their state of activation/quiescence.

      (5) Finally, the treatment of fibers in the presence or absence of recombinant BMP proteins would inform the claims of the paper.

    4. Reviewer #3 (Public review):

      Summary:

      Understanding the molecular regulation of muscle stem cell quiescence. The authors evaluated the role of the MuSK-BMP pathway in regulating adult SC quiescence by the deletion of the BMP-binding MuSK Ig3 domain ('ΔIg3-MuSK').

      Strengths:

      A novel mouse model to interrogate muscle stem cell molecular regulators. The authors have developed a nice mouse model to interrogate the role of MuSK signaling in muscle stem cells and myofibers and have unique tools to do this.

      Weaknesses:

      Only minor technical questions remain and there is a need for additional data to support the conclusions.

      (1) The authors claim that dIg3-MuSK satellite cells break quiescence and start fusing, based on the reduction of Pax7+ and increase of nuclei/fiber (Fig 2-3), and maybe the gene expression (Fig6). However, direct evidence is needed to support these findings such as quantifying quiescent (Pax7+Ki67-) or activated (Pax7+Ki67+) satellite cells (and maybe proliferating progenitors Pax7-Ki67+) in the dIg3-MuSK muscle.

      (2) If the MuSK-BMP pathway is required to maintain satellite cell quiescence, it is not clear how, by the end of the regeneration (29dpi), Pax7+ numbers are comparable to the WT (Fig4d). I would expect to have fewer Pax7+ cells, as in uninjured muscle. Can the authors evaluate this in more detail?

      (2) Figure 4 claims that regeneration is accelerated, but to claim this at a minimum they need to look at MYH3+ fibers, in addition to fiber size.

      (3) The Pax7 specific dIg3-MuSK (Fig5) is very exciting. However, it will be important to quantify the Pax7+ number. Could the authors check the reduction of Pax7+ in this model since it would confirm the importance of MuSK in quiescence?

      (3) Rescue of the BMP pathway in the model would be further supportive of the authors' findings.

      (4) Is the stem cell pool maintained long term in the deleted dIg3-MuSK SCs? Or would they be lost with extended treatment since they are reduced at the 5-month experiments? This is an important point and should be considered/discussed relevant to thinking about these data therapeutically.

      (5) Without the Pax7-specific targeting, when you target dIg3-MuSK in the entire muscle, what happens to the neuromuscular nuclei?

      (6) Why were differences seen in males and not females? Is XIST downregulation occurring in both sexes? Could the authors explain these findings in more detail?

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      Madigan et al. assembled an interesting study investigating the role of the MuSK-BMP signaling pathway in maintaining adult mouse muscle stem cell (MuSC) quiescence and muscle function before and after trauma. Using a full body and MuSC-specific genetic knockout system, they demonstrate that MuSK is expressed on MuSCs and that eliminating the BMP binding domain from the MuSK gene (i.e., MuSK-IgG KO) in mice at homeostasis leads to reduced PAX7+ cells, increased myonuclear number, and increased myofiber size, which may be due to a deficit in maintaining quiescence. Additionally, after BaCl2 injury, MuSK-IgG KO mice display accelerated repair after 7 days post-injury (dpi) in males only. Finally, RNA profiling using nCounter technology showed that MuSK-IgG KO MuSCs express genes that may be associated with the activated state.

      Strengths:

      Overall, the biology regulating MuSC quiescence is still relatively unexplored, and thus, this work provides a new mechanism controlling this process. The experiments discussed in the paper are technically sound with great complementary mouse models (full body versus tissue-specific mouse KO) used to validate their hypothesis. Additionally, the paper is well written with all the necessary information in the legends, methods, and figures being reported.

      Weaknesses:

      While the data largely support the authors' conclusions, I do have a few points to consider when reading this paper.

      (1) For Figure 1, while I appreciate the authors confirming MuSK RNA and protein in MuSCs, I do think they should (a) quantify the RNA using qPCR and (b) determine the percentage of MuSCs expressing MuSK protein in their single fiber system in multiple biological replicates. This information will help us understand if MuSK is expressed in 1/10 or 10/10 PAX7-expressing MuSCs. Also, it will help place their phenotypes into the right context, especially when considering how much of the PAX7-pool is expressing MuSK from the beginning.

      The quantification is a reasonable point; however, we don’t believe that this information is necessary for supporting the interpretation of the findings.

      We agree that determining the proportion of SCs that express MuSK is useful information and we will address this question in the Revision.

      (2) Throughout the paper the argument is made that MuSK-IgG KO (full body and MuSC-specific KOs) are more activated and/or break quiescence more readily, but there is no attempt to test directly. Therefore, the authors should consider measuring the activation dynamics (i.e., break from quiescence) of MuSCs directly (EdU assays or live-cell imaging) in culture and/or in muscle in vivo (EdU assays) using their various genetic mouse models.

      We agree that this point is of interest and we plan to address it in future studies.

      (3) For Figure 2, given that mice are considered adults by 3 months, it is really surprising how just two months later they are starting to see a phenotype (i.e., reduced PAX7-cells, increased number of myonuclei, and increased myofiber size), which correlates with getting older. Given that aged MuSCs have activation defects (i.e., stuck somewhere in the quiescence cycle), a pending question is whether their phenotype gets stronger in aged mice, like 18-24 months. If yes, the argument that this pathway should be used in a therapeutic sense would be strengthened.

      We agree that the potential role of the MuSK-BMP pathway in aged SCs is of import and could shed new light on SC dynamics in this context. However, we note that the activation observed between 3-5 months results in improved muscle quality (increased myofiber size and grip strength), which is opposite of what is observed with aging. We agree that activating the MuSK-BMP pathway in aged animals has the potential to activate SCs, promote muscle growth and counter sarcopenia.  Pharmacological and genetic approaches to test that question are underway, but given the time frame they are beyond the scope of the current manuscript.

      (4) For Figure 4, the same question as in point (2), the increase in fiber sizes by 7dpi in MuSK-IgG KO males is minimal (going from ~23 to 27 by eye) and there is no difference at a later time point when compared to WT mice. However, if older mice (18-24 months old), which are known to have repair deficits, are used, will the regenerative phenotype in MuSK-IgG KO mice be more substantial and longer lasting?

      Again, an interesting point that will be addressed in future studies. 

      (5) For Figure 6, this gene set is not glaringly obvious as being markers of MuSC activation (i.e., no MyoD), so it's hard for the readers to know if this gene set is truly an activation signature. Also, the Shcherbina et al. data presented as a column with * being up or down (i.e. differentially expressed) is not helpful, since you don't know whether those mRNAs in that dataset are going up with the activation process. Addressing this point as well as my point (1) will further strengthen the authors' conclusions about the MuSK-IgG KO MuSCs not being able to maintain quiescence as effectively.

      We agree that this Figure should include more information and be formatted in a way that more readily conveys the point. We will provide these changes in the Revision.

      Reviewer #2 (Public review):

      Summary:

      The work by Madigan et al. provides evidence that the signaling of BMPs via the Ig3 domain of MuSK plays a role during muscle postnatal development and regeneration, ultimately resulting in enhanced contractile force generation in the absence of the MuSK Ig3 domain. They demonstrate that MuSK is expressed in satellite cells initially post-isolation of muscle single fibers both in WT and whole-body deletion of the BMP binding domain of MuSK (ΔIg3-MuSK). In developing mice, ΔIg3-MuSK results in increased muscle fiber size, a reduction in Pax7+ cells, and increased muscle contractile force in 5-month-old, but not 3-month-old, mice. These data are complemented by a model in which the kinetics of regeneration appear to be accelerated at early time points. Of note, the authors demonstrate muscle tibialis anterior (TA) weights and fiber feret are increased during development in a Pax7CreERT2;MuSK-Ig3loxp/loxp model in which satellite cells specifically lack the MuSK BMP binding domain. Finally, using Nanostring transcriptional profiling, the authors identified a short list of genes that differ between the WT and ΔIg3-MuSK SCs. These data provide the field with new evidence of signaling pathways that regulate satellite cell activation/quiescence in the context of skeletal muscle development and regeneration.

      On the whole, the findings in this paper are well supported, however additional validation of key satellite cell markers and data analysis need to be conducted given the current claims.

      (1) The Pax7CreERT2;MuSK-Ig3loxp/loxp model is the appropriate model to conduct studies to assess satellite cell involvement in MuSK/BMP regulation. Validation of changes to muscle force production is currently absent using this model, as is quantification of Pax7+ tdT+ cells in 5-month muscle. Given that MuSK is also expressed on mature myofibers at NMJs, these data would further inform the conclusions proposed in the paper.

      As reported in the manuscript, we observed increased myofiber size, length and TA weight in the conditional mutants at five months of age. We did not assess grip strength in those experiments. 

      We demonstrated highly efficient MuSK Ig3-domain recombination by PCR analysis of FACS-sorted SCs from these conditional mutants (Supplemental Fig. S3). However, while we checked for Pax7+ tdT+ cells in 5-month SCs, we did not quantify this finding.

      (2) All Pax7 quantification in the paper would benefit from high magnification images including staining for laminin demonstrating the cells are under the basal lamina.

      The point is reasonable. We observed that these Pax7+ cells were under the basal lamina, but we did not acquire images at higher magnification.

      (3) The nanostring dataset could be further analyzed and clarified. In Figure 6b, it is not initially apparent what genes are upregulated or downregulated in young and aged SCs and how this compares with your data. Pathway analysis geared toward genes involved in the TGFb superfamily would be informative.

      We agree that further analysis and information regarding the data in this Figure is warranted and we will include it in the Revision.

      (4) Characterizing MuSK expression on perfusion-fixed EDL fibers would be more conclusive to determine if MuSK is expressed in quiescent SCs. Additional characterization using MyoD, MyoG, and Fos staining of SCs on EDL fibers would help inform on their state of activation/quiescence.

      These are all valid points that we intend to address in future experiments.

      (5) Finally, the treatment of fibers in the presence or absence of recombinant BMP proteins would inform the claims of the paper.

      As reported in Jaime et al. (2024), we have extensively characterized the differences in BMP response in both cultured WT and ΔIg3-MuSK myofibers and myoblasts at the level of signaling (pSMAD 1/5/8 nuclear localization and phosphorylation) and gene expression (qRT-PCR).

      Reviewer #3 (Public review):

      Summary:

      Understanding the molecular regulation of muscle stem cell quiescence. The authors evaluated the role of the MuSK-BMP pathway in regulating adult SC quiescence by the deletion of the BMP-binding MuSK Ig3 domain ('ΔIg3-MuSK').

      Strengths:

      A novel mouse model to interrogate muscle stem cell molecular regulators. The authors have developed a nice mouse model to interrogate the role of MuSK signaling in muscle stem cells and myofibers and have unique tools to do this.

      Weaknesses:

      Only minor technical questions remain and there is a need for additional data to support the conclusions.

      (1) The authors claim that dIg3-MuSK satellite cells break quiescence and start fusing, based on the reduction of Pax7+ and increase of nuclei/fiber (Fig 2-3), and maybe the gene expression (Fig6). However, direct evidence is needed to support these findings such as quantifying quiescent (Pax7+Ki67-) or activated (Pax7+Ki67+) satellite cells (and maybe proliferating progenitors Pax7-Ki67+) in the dIg3-MuSK muscle.

      We believe that the data presented strongly support the conclusion that the SCs break quiescence, activate, and fuse into myofibers in uninjured muscle. As noted above, the mechanistic studies suggested are of interest and we will address them in future work.

      (2) If the MuSK-BMP pathway is required to maintain satellite cell quiescence, it is not clear how Pax7+ numbers are comparable to the WT by the end of regeneration (29 dpi) (Fig 4d). I would expect fewer Pax7+ cells, as in uninjured muscle. Can the authors evaluate this in more detail?

      The reviewer makes an important point. Our current interpretation of the findings is that quiescence is broken in SCs in uninjured muscle, but that ‘stemness’ is preserved, allowing for efficient muscle regeneration and restoration of the SC pool. Whether such properties reflect SC heterogeneity (as suggested in the comments of the other reviewers) and/or different states along a continuum is of particular interest and will be the focus of future studies. 

      (2) Figure 4 claims that regeneration is accelerated, but to claim this at a minimum they need to look at MYH3+ fibers, in addition to fiber size.

      We did not examine MYH3+ fibers in this study. However, we did observe an increase in Pax7+ cells at 5dpi (male and female) as well as larger myofiber size (Feret diameter) at 7dpi in the male animals. In addition, the panels in Figure 4 b,c (H&E and laminin, respectively) showing accelerated differentiation were selected to be representative of the experimental group.

      (3) The Pax7 specific dIg3-MuSK (Fig5) is very exciting. However, it will be important to quantify the Pax7+ number. Could the authors check the reduction of Pax7+ in this model since it would confirm the importance of MuSK in quiescence?

      In Figure 5c, we assessed the number of Pax7+ cells in the conditional mutant during the course of regeneration (at 3, 5, 7, 14, 22 and 29 dpi). As discussed above, these results confirmed the findings of the constitutive mutant (reduction of Pax7+ cells in uninjured 5-month-old muscle) as well as showing the increased number at 5dpi and return to WT levels at 29 dpi.

      (3) Rescue of the BMP pathway in the model would be further supportive of the authors' findings.

      This point is valid. In a parallel study examining the role of the MuSK-BMP pathway at the NMJ, we have observed that BMP+/- (hypomorphs) recapitulate key phenotypes observed in ΔIg3-MuSK NMJs (Fish et al., bioRxiv, 2023). This point will be included in the Revision.

      (4) Is the stem cell pool maintained long term in the deleted dIg3-MuSK SCs? Or would they be lost with extended treatment since they are reduced at the 5-month experiments? This is an important point and should be considered/discussed relevant to thinking about these data therapeutically.

      We agree that this is an important point for future studies. 

      (5) Without the Pax7-specific targeting, when you target dIg3-MuSK in the entire muscle, what happens to the neuromuscular nuclei?

      A manuscript describing the phenotype of the NMJ in ΔIg3-MuSK constitutive mice is in bioRxiv (Fish et al., 2024) and is in Revision at another journal. We anticipate discussing the findings in the Revised version of the current manuscript.

      (6) Why were differences seen in males and not females? Is XIST downregulation occurring in both sexes? Could the authors explain these findings in more detail?

      The male and female difference in myofiber size is of interest. The nanostring experiments, which showed the XIST reduction, were only performed in male mice.

    1. eLife Assessment

      This valuable study reveals extensive binding of eukaryotic translation initiation factor 3 (eIF3) to the 3' untranslated regions (UTRs) of efficiently translated mRNAs in human pluripotent stem cell-derived neuronal progenitor cells. The authors provide solid evidence to support their conclusions, although this study may be enhanced by addressing potential biases of techniques employed to study eIF3:mRNA binding and providing additional mechanistic detail. This work will be of significant interest to researchers exploring post-transcriptional regulation of gene expression, including cellular, molecular, and developmental biologists, as well as biochemists.

    2. Reviewer #1 (Public review):

      Summary:

      The authors perform irCLIP of neuronal progenitor cells to profile eIF3-RNA interactions upon short-term neuronal differentiation. The data shows that eIF3 mostly interacts with 3'-UTRs - specifically, the poly-A signal. There appears to be a general correlation between eIF3 binding to 3'-UTRs and ribosome occupancy, which might suggest that eIF3 binding promotes protein synthesis, possibly through inducing mRNA closed-loop formation.

      Strengths:

      The study provides a wealth of new data on eIF3-mRNA interactions and points to the potential new concept that eIF3-mRNA interactions are polyadenylation-dependent and correlate with ribosome occupancy.

      Weaknesses:

      (1) A main limitation is the correlative nature of the study. Whereas the evidence that eIF3 interacts with 3'-UTRs is solid, the biological role of the interactions remains entirely unknown. Similarly, the claim that eIF3 interactions with 3'-UTR termini require polyadenylation but are independent of poly(A) binding proteins lacks support as it solely relies on the absence of observable eIF3 binding to poly-A (-) histone mRNAs and a seeming failure to detect PABP binding to eIF3 by co-immunoprecipitation and Western blotting. In contrast, LC-MS data in Supplementary File 1 show ready co-purification of eIF3 with PABP.

      (2) Another question concerns the relevance of the cellular model studied. irCLIP is performed on neuronal progenitor cells subjected to neuronal induction for 2 hours. This short-term induction leads to a very modest - perhaps 10% - and very transient 1-hour-long increase in translation, although this is not carefully quantified. The cellular phenotype also does not appear to change and calling the cells treated with differentiation media for 2 hours "differentiated NPCs" seems a bit misleading. Perhaps unsurprisingly, the minor "burst" of translation coincides with minor effects on eIF3-mRNA interactions most of which seem to be driven by mRNA levels. Based on the ~15-fold increase in ID2 mRNA coinciding with a ~5-fold increase in ribosome occupancy (RPF), ID2 TE actually goes down upon neuronal induction.

      (3) The overlap in eIF3-mRNA interactions identified here and in the authors' previous reports is minimal. Some of the discrepancies may be related to the not well-justified approach for filtering data prior to assessing overlap. Still, the fundamentally different binding patterns - eIF3 mostly interacting with 5'-UTRs in the authors' previous report and other studies versus the strong preference for 3'-UTRs shown here - are striking. In the Discussion, it is speculated that the different methods used - PAR-CLIP versus irCLIP - lead to these fundamental differences. Unfortunately, this is not supported by any data, even though it would be very important for the translation field to learn whether different CLIP methodologies assess very different aspects of eIF3-mRNA interactions.
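
      The translational-efficiency arithmetic in weakness (2) above can be checked with a minimal sketch. It assumes the conventional definition TE = RPF / mRNA, so that the fold change in TE is the RPF fold change divided by the mRNA fold change. The function name and the use of the approximate fold changes quoted by the reviewer (~5-fold RPF, ~15-fold mRNA for ID2) are illustrative choices, not values taken from the authors' analysis pipeline.

```python
def te_fold_change(rpf_fold: float, mrna_fold: float) -> float:
    """Fold change in translational efficiency, assuming TE = RPF / mRNA."""
    if mrna_fold <= 0:
        raise ValueError("mRNA fold change must be positive")
    return rpf_fold / mrna_fold


# Approximate fold changes for ID2 quoted in the review (illustrative only).
id2_te_fold = te_fold_change(rpf_fold=5.0, mrna_fold=15.0)

# A value below 1 means TE decreased even though RPF increased.
print(f"ID2 TE fold change: {id2_te_fold:.2f}")
```

      Under these assumptions the ratio is below 1, which is the sense in which the reviewer states that ID2 TE goes down upon neuronal induction.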

    3. Reviewer #2 (Public review):

      Summary:

      The paper documents the role of eIF3 in translational control during neural progenitor cell (NPC) differentiation. eIF3 predominantly binds to the 3' UTR termini of mRNAs during NPC differentiation, adjacent to the poly(A) tails, and is associated with efficiently translated mRNAs, indicating a role for eIF3 in promoting translation.

      Strengths:

      The manuscript is strong in addressing molecular mechanisms by using a combination of next-generation sequencing and crosslinking techniques, thus providing a comprehensive dataset that supports the authors' claims. The manuscript is methodologically sound, with clear experimental designs.

      Weaknesses:

      (1) The study could benefit from further exploration into the molecular mechanisms by which eIF3 interacts with 3' UTR termini. While the correlation between eIF3 binding and high translation levels is established, the functionality of these interactions needs validation. The authors should consider including experiments that test whether eIF3 binding sites are necessary for increased translation efficiency using reporter constructs.

      (2) The authors mention that the eIF3 3' UTR termini crosslinking pattern observed in their study was not reported in previous PAR-CLIP studies performed in HEK293T cells (Lee et al., 2015) and Jurkat cells (De Silva et al., 2021). They attribute this difference to the different UV wavelengths used in Quick-irCLIP (254 nm) and PAR-CLIP (365 nm with 4-thiouridine). While the explanation is plausible, it remains a caveat that different UV crosslinking methods may capture different eIF3 modules or binding sites, depending on the chemical propensities of the amino acid-nucleotide crosslinks at each wavelength. Without addressing this caveat in more detail, the authors cannot generalize their findings, and thus, the title of the paper, which suggests a broad role for eIF3, may be misleading. Previous studies have pointed to an enrichment of eIF3 binding at the 5' UTRs, and the divergence in results between studies needs to be more explicitly acknowledged.

      (3) While the manuscript concludes that eIF3's interaction with 3' UTR termini is independent of poly(A)-binding proteins, transient or indirect interactions should be tested using assays such as PLA (Proximity Ligation Assay), which could provide more insights.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript by Mestre-Fos and colleagues, authors have analyzed the involvement of eIF3 binding to mRNA during differentiation of neural progenitor cells (NPC). The authors bring a lot of interesting observations leading to a novel function for eIF3 at the 3'UTR.

      During the translational burst that occurs during NPC differentiation, analysis of eIF3-associated mRNA by Quick-irCLIP reveals the unexpected binding of this initiation factor at the 3'UTR of most mRNA. Further analysis of alternative polyadenylation by APAseq highlights the close proximity of the eIF3-crosslinking position and the poly(A) tail. Furthermore, this interaction is not detected in Poly(A)-less transcripts. Using Riboseq, the authors then attempted to correlate eIF3 binding with the translation efficacy of mRNA, which would suggest a common mechanism of translational control in these cells. These observations indicate that eIF3-binding at the 3'UTR of mRNA, near the poly(A) tail, may participate in the closed-loop model of mRNA translation, bridging 5' and 3', and allowing ribosome recycling. However, authors failed to detect interactions of eIF3 with either PABP or Paip1 or 40S subunit proteins, which is quite unexpected.

      Strength:

      The well-written manuscript presents an attractive concept regarding the mechanism of eIF3 function at the 3'UTR. Most mRNA in NPC seems to have eIF3 binding at the 3'UTR and only a few at the 5'end where it's commonly thought to bind. In a previous study from the Cate lab, eIF3 was reported to bind to a small region of the 3'UTR of the TCRA and TCRB mRNA, which was responsible for their specific translational stimulation, during T cell activation. Surprisingly in this study, the eIF3 association with mRNA occurs near polyadenylation signals in NPC, independently of cell differentiation status. This compelling evidence suggests a general mechanism of translation control by eIF3 in NPC. This observation brings back the old concept of mRNA circularization with new arguments, independent of PABP and eIF4G interaction. Finally, the discussion adequately describes the potential technical limitations of the present study compared to previous ones by the same group, due to the use of Quick-irCLIP as opposed to the PAR-CLIP/thiouridine.

      Weaknesses:

      (1) These data were obtained from an unusual cell type, limiting the generalizability of the model.

      (2) This study lacks a clear explanation for the increased translation associated with NPC differentiation, as eIF3 binding is observed in both differentiated and undifferentiated NPC. For example, I find a kind of inconsistency between changes in Riboseq density (Figure 3B) and changes in protein synthesis (Figure 1D). Thus, the title overstates a modest correlation between eIF3 binding and important changes in protein synthesis.

      (3) This is illustrated by the candidate selection that supports this demonstration. Looking at Figure 3B, ID2, and SNAT2 mRNA are not part of the High TE transcripts (in red). In contrast, the increase in mRNA abundance could explain a proportionally increased association with eIF3 as well as with ribosomes. The example of increased protein abundance of these best candidates is overall weak and uncertain.

      (4) Despite several attempts (chemical and UV cross-linking) to identify eIF3 partners in NPC such as PABP, PAIP1, or proteins from the 40S, the authors could not provide any evidence for such a mechanism consistent with the closed-loop model. Overall, this rather descriptive study lacks mechanistic insight (eIF3 binding partners).

      (5) Finally, the authors suspect a potential impact of technical improvement provided by Quick-irCLIP, that could have been addressed rather than discussed.

    5. Author response:

      eLife Assessment

      This valuable study reveals extensive binding of eukaryotic translation initiation factor 3 (eIF3) to the 3' untranslated regions (UTRs) of efficiently translated mRNAs in human pluripotent stem cell-derived neuronal progenitor cells. The authors provide solid evidence to support their conclusions, although this study may be enhanced by addressing potential biases of techniques employed to study eIF3:mRNA binding and providing additional mechanistic detail. This work will be of significant interest to researchers exploring post-transcriptional regulation of gene expression, including cellular, molecular, and developmental biologists, as well as biochemists.

      We thank the reviewers for their positive views of the results we present, along with the constructive feedback regarding the strengths and weaknesses of our manuscript, with which we generally agree. We acknowledge our results will require a deeper exploration of the molecular mechanisms behind eIF3 interactions with 3'-UTR termini and experiments to identify the molecular partners involved. Additionally, given that NPC differentiation toward mature neurons is a process that takes around 3 weeks, we recognize the importance of examining eIF3-mRNA interactions in NPCs that have undergone differentiation over longer periods than the 2-hr time point selected in this study. Finally, considering the molecular complexity of the 13-subunit human eIF3, we agree that a direct comparison between Quick-irCLIP and PAR-CLIP will be highly beneficial and will determine whether different UV crosslinking wavelengths report on different eIF3 molecular interactions. Additional comments are given below to the identified weaknesses.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors perform irCLIP of neuronal progenitor cells to profile eIF3-RNA interactions upon short-term neuronal differentiation. The data shows that eIF3 mostly interacts with 3'-UTRs - specifically, the poly-A signal. There appears to be a general correlation between eIF3 binding to 3'-UTRs and ribosome occupancy, which might suggest that eIF3 binding promotes protein synthesis, possibly through inducing mRNA closed-loop formation.

      Strengths:

      The study provides a wealth of new data on eIF3-mRNA interactions and points to the potential new concept that eIF3-mRNA interactions are polyadenylation-dependent and correlate with ribosome occupancy.

      Weaknesses:

      (1) A main limitation is the correlative nature of the study. Whereas the evidence that eIF3 interacts with 3'-UTRs is solid, the biological role of the interactions remains entirely unknown. Similarly, the claim that eIF3 interactions with 3'-UTR termini require polyadenylation but are independent of poly(A) binding proteins lacks support as it solely relies on the absence of observable eIF3 binding to poly-A (-) histone mRNAs and a seeming failure to detect PABP binding to eIF3 by co-immunoprecipitation and Western blotting. In contrast, LC-MS data in Supplementary File 1 show ready co-purification of eIF3 with PABP.

      We agree the molecular mechanisms underlying the crosslinking between eIF3 and the end of mRNA 3’-UTRs remain to be determined. We also agree that the lack of interaction seen between eIF3 and PABP in Westerns, even from HEK293T cells, is a puzzle. The low sequence coverage in the LC-MS data gave us pause about making a strong statement that these represent direct eIF3 interactions, given the similar background levels of some ribosomal proteins.

      (2) Another question concerns the relevance of the cellular model studied. irCLIP is performed on neuronal progenitor cells subjected to neuronal induction for 2 hours. This short-term induction leads to a very modest - perhaps 10% - and very transient 1-hour-long increase in translation, although this is not carefully quantified. The cellular phenotype also does not appear to change and calling the cells treated with differentiation media for 2 hours "differentiated NPCs" seems a bit misleading. Perhaps unsurprisingly, the minor "burst" of translation coincides with minor effects on eIF3-mRNA interactions most of which seem to be driven by mRNA levels. Based on the ~15-fold increase in ID2 mRNA coinciding with a ~5-fold increase in ribosome occupancy (RPF), ID2 TE actually goes down upon neuronal induction.

      We agree that it will be interesting to look at eIF3-mRNA interactions at longer time points after induction of NPC differentiation. However, the pattern of eIF3 crosslinking to the end of 3’-UTRs occurs in both time points reported here, which is likely to be the more general finding in what we present.

      (3) The overlap in eIF3-mRNA interactions identified here and in the authors' previous reports is minimal. Some of the discrepancies may be related to the not well-justified approach for filtering data prior to assessing overlap. Still, the fundamentally different binding patterns - eIF3 mostly interacting with 5'-UTRs in the authors' previous report and other studies versus the strong preference for 3'-UTRs shown here - are striking. In the Discussion, it is speculated that the different methods used - PAR-CLIP versus irCLIP - lead to these fundamental differences. Unfortunately, this is not supported by any data, even though it would be very important for the translation field to learn whether different CLIP methodologies assess very different aspects of eIF3-mRNA interactions.

      We agree the more interesting aspect of what we observe is the difference in location of eIF3 crosslinking, i.e. the end of 3’-UTRs rather than 5’-UTRs or the pan-mRNA pattern we observed in T cells. The reviewer is right that it will be important in the future to compare PAR-CLIP and Quick-irCLIP side-by-side to begin to unravel the differences we observe with the two approaches.

      Reviewer #2 (Public review):

      Summary:

      The paper documents the role of eIF3 in translational control during neural progenitor cell (NPC) differentiation. eIF3 predominantly binds to the 3' UTR termini of mRNAs during NPC differentiation, adjacent to the poly(A) tails, and is associated with efficiently translated mRNAs, indicating a role for eIF3 in promoting translation.

      Strengths:

      The manuscript is strong in addressing molecular mechanisms by using a combination of next-generation sequencing and crosslinking techniques, thus providing a comprehensive dataset that supports the authors' claims. The manuscript is methodologically sound, with clear experimental designs.

      Weaknesses:

      (1) The study could benefit from further exploration into the molecular mechanisms by which eIF3 interacts with 3' UTR termini. While the correlation between eIF3 binding and high translation levels is established, the functionality of these interactions needs validation. The authors should consider including experiments that test whether eIF3 binding sites are necessary for increased translation efficiency using reporter constructs.

      We agree with the reviewer that the molecular mechanism by which eIF3 interacts with the 3’-UTR termini remains unclear, along with its biological significance, i.e. how it contributes to translation levels. We think it could be useful to try reporters in, perhaps, HEK293T cells in the future to probe the mechanism in more detail.

      (2) The authors mention that the eIF3 3' UTR termini crosslinking pattern observed in their study was not reported in previous PAR-CLIP studies performed in HEK293T cells (Lee et al., 2015) and Jurkat cells (De Silva et al., 2021). They attribute this difference to the different UV wavelengths used in Quick-irCLIP (254 nm) and PAR-CLIP (365 nm with 4-thiouridine). While the explanation is plausible, it remains a caveat that different UV crosslinking methods may capture different eIF3 modules or binding sites, depending on the chemical propensities of the amino acid-nucleotide crosslinks at each wavelength. Without addressing this caveat in more detail, the authors cannot generalize their findings, and thus, the title of the paper, which suggests a broad role for eIF3, may be misleading. Previous studies have pointed to an enrichment of eIF3 binding at the 5' UTRs, and the divergence in results between studies needs to be more explicitly acknowledged.

      We agree with the reviewer that the two methods of crosslinking will require a more detailed head-to-head comparison in the future. However, we do think the title is justified by the fact that we see crosslinking to the termini of 3’-UTRs across thousands of transcripts in each condition. Furthermore, the 3’-UTR crosslinking is enriched on mRNAs with higher ribosome protected fragment counts (RPF) in differentiated cells, Figure 3F.

      (3) While the manuscript concludes that eIF3's interaction with 3' UTR termini is independent of poly(A)-binding proteins, transient or indirect interactions should be tested using assays such as PLA (Proximity Ligation Assay), which could provide more insights.

      This is a good idea, but would require a substantial effort better suited to a future publication. We think our observations are interesting enough to the field to stimulate future experimentation that we may or may not be most capable of doing in our lab.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript by Mestre-Fos and colleagues, authors have analyzed the involvement of eIF3 binding to mRNA during differentiation of neural progenitor cells (NPC). The authors bring a lot of interesting observations leading to a novel function for eIF3 at the 3'UTR.

      During the translational burst that occurs during NPC differentiation, analysis of eIF3-associated mRNA by Quick-irCLIP reveals the unexpected binding of this initiation factor at the 3'UTR of most mRNA. Further analysis of alternative polyadenylation by APAseq highlights the close proximity of the eIF3-crosslinking position and the poly(A) tail. Furthermore, this interaction is not detected in Poly(A)-less transcripts. Using Riboseq, the authors then attempted to correlate eIF3 binding with the translation efficacy of mRNA, which would suggest a common mechanism of translational control in these cells. These observations indicate that eIF3-binding at the 3'UTR of mRNA, near the poly(A) tail, may participate in the closed-loop model of mRNA translation, bridging 5' and 3', and allowing ribosome recycling. However, authors failed to detect interactions of eIF3 with either PABP or Paip1 or 40S subunit proteins, which is quite unexpected.

      Strength:

      The well-written manuscript presents an attractive concept regarding the mechanism of eIF3 function at the 3'UTR. Most mRNA in NPC seems to have eIF3 binding at the 3'UTR and only a few at the 5'end where it's commonly thought to bind. In a previous study from the Cate lab, eIF3 was reported to bind to a small region of the 3'UTR of the TCRA and TCRB mRNA, which was responsible for their specific translational stimulation, during T cell activation. Surprisingly in this study, the eIF3 association with mRNA occurs near polyadenylation signals in NPC, independently of cell differentiation status. This compelling evidence suggests a general mechanism of translation control by eIF3 in NPC. This observation brings back the old concept of mRNA circularization with new arguments, independent of PABP and eIF4G interaction. Finally, the discussion adequately describes the potential technical limitations of the present study compared to previous ones by the same group, due to the use of Quick-irCLIP as opposed to the PAR-CLIP/thiouridine.

      Weaknesses:

      (1) These data were obtained from an unusual cell type, limiting the generalizability of the model.

      We agree that unraveling the mechanism employed by eIF3 at the mRNA 3’-UTR termini might be better studied in a stable cell line rather than in primary cells.

      (2) This study lacks a clear explanation for the increased translation associated with NPC differentiation, as eIF3 binding is observed in both differentiated and undifferentiated NPC. For example, I find a kind of inconsistency between changes in Riboseq density (Figure 3B) and changes in protein synthesis (Figure 1D). Thus, the title overstates a modest correlation between eIF3 binding and important changes in protein synthesis.

      We thank the reviewer for this question. Riboseq data and RNASeq data are not on absolute scales when comparing across cell conditions. They are normalized internally, so increases in, for example, RPF in Figure 3B are relative to the bulk RPF in a given condition. By contrast, the changes in protein synthesis measured in Figure 1D are closer to an absolute measure of protein synthesis.

      (3) This is illustrated by the candidate selection that supports this demonstration. Looking at Figure 3B, ID2, and SNAT2 mRNA are not part of the High TE transcripts (in red). In contrast, the increase in mRNA abundance could explain a proportionally increased association with eIF3 as well as with ribosomes. The example of increased protein abundance of these best candidates is overall weak and uncertain.

      We agree that using TE as the criterion for defining increased eIF3 association would not be correct. By “highly translated” we only mean to convey the extent of protein synthesis, i.e. increases in ribosome protected fragments (RPF), rather than the translational efficiency.

      (4) Despite several attempts (chemical and UV cross-linking) to identify eIF3 partners in NPC such as PABP, PAIP1, or proteins from the 40S, the authors could not provide any evidence for such a mechanism consistent with the closed-loop model. Overall, this rather descriptive study lacks mechanistic insight (eIF3 binding partners).

      We agree that it will be important to identify the molecular mechanism used by eIF3 to engage the termini of mRNA 3’-UTRs. Nevertheless, the identification of eIF3 crosslinking to that location in mRNAs is new, and we think it will stimulate new experiments in the field.

      (5) Finally, the authors suspect a potential impact of technical improvement provided by Quick-irCLIP, that could have been addressed rather than discussed.

      We agree a side-by-side comparison of eIF3 crosslinks captured by PAR-CLIP versus Quick-irCLIP will be an important experiment to do. However, NPCs or other primary cells may not be the best system for the comparison. We think using an established cell line might be more informative, to control for effects such as 4-thiouridine toxicity.
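
      The normalization point made in the response to Reviewer #3, weakness (2), can be illustrated with a small sketch. It assumes a simple counts-per-million style of within-condition normalization, which is a stand-in and not necessarily the normalization used in the manuscript. The gene names and counts are hypothetical, and the uniform 10% increase is chosen only to echo the modest global change mentioned in the reviews.

```python
def to_cpm(counts: dict[str, float]) -> dict[str, float]:
    """Scale counts to counts-per-million within a single condition."""
    total = sum(counts.values())
    return {gene: 1e6 * c / total for gene, c in counts.items()}


# Hypothetical RPF counts, invented for illustration only.
before = {"geneA": 100.0, "geneB": 300.0, "geneC": 600.0}

# Assume translation of every transcript rises uniformly by 10%.
after = {gene: 1.1 * c for gene, c in before.items()}

cpm_before = to_cpm(before)
cpm_after = to_cpm(after)

# The absolute totals differ, but the internally normalized values do not,
# so a global change in protein synthesis is invisible after normalization.
for gene in before:
    print(gene, round(cpm_before[gene]), round(cpm_after[gene]))
```

      In this toy case the normalized values are identical in both conditions even though total counts rose, which is why internally normalized Riboseq values and an absolute measure of protein synthesis need not move together.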

    1. eLife Assessment

      The authors use single molecule imaging and in vivo loop-capture genomic approaches to investigate estrogen mediated enhancer-target gene activation in human cancer cells. These potentially important results suggest that ER-alpha can, in a temporal delay, activate a non-target gene TFF3, which is in proximity to the main target gene TFF1, even though the estrogen responsive enhancer does not loop with the TFF3 promoter. To explain these results, the authors invoke a transcriptional condensate model. The reviewers were split on the strength and interpretation of the evidence presented, which is considered incomplete at this stage. We encourage a revision which buttresses the findings with additional control experiments and careful consideration of alternative explanations and mathematical models. Further, the depth of the discussion on existing literature could be improved. This work will be of interest to those studying transcriptional gene regulation and hormone-aggravated cancers.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Bohra et al. describes the indirect effects of ligand-dependent gene activation on neighboring non-target genes. The authors utilized single-molecule RNA-FISH (targeting both mature and intronic regions), 4C-seq, and enhancer deletions to demonstrate that the non-enhancer-targeted gene TFF3, located in the same TAD as the target gene TFF1, alters its expression when TFF1 expression declines at the end of the estrogen signaling peak. Since the enhancer does not loop with TFF3, the authors conclude that mechanisms other than estrogen receptor or enhancer-driven induction are responsible for TFF3 expression. Moreover, ERα intensity correlations show that both high and low levels of ERα are unfavorable for TFF1 expression. The ERα level correlations are further supported by overexpression of GFP-ERα. The authors conclude that the transcriptional machinery used by TFF1 for its acute activation can negatively impact TFF3 at the peak of signaling, but once the condensate dissolves, TFF3 benefits from it for its low expression.

      Strengths:

      The findings are indeed intriguing. The authors have maintained appropriate experimental controls, and their conclusions are well-supported by the data.

      Weaknesses:

      There are some major and minor concerns related to approach, data presentation and discussion, but I think they can be fixed with more effort.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript by Bohra et al., the authors use the well-established estrogen response in MCF7 cells to interrogate the role of genome architecture, enhancers, and estrogen receptor concentration in transcriptional regulation. They propose there is competition between the genes TFF1 and TFF3 which is mediated by transcriptional condensates. This reviewer does not find these claims persuasive as presented. Moreover, the results are not placed in the context of current knowledge.

      Strengths:

      High levels of ERalpha expression seem to diminish the transcriptional response. Thus, the results in Fig. 4 offer potential insight into ER-mediated transcription. Yet this observation is not pursued in great depth, for example with mutagenesis of ERalpha. Moreover, this phenomenon - which falls under the general description of non-monotonic dose response - is treated at great depth in the literature (e.g. PMID: 22419778). For example, the result the authors describe in Fig. 4 has been reported and in fact mathematically modeled in PMID 23134774. One possible avenue for improving this paper would be to dig into this result at the single-cell level using deletion mutants of ERalpha or by perturbing co-activators.

      Weaknesses:

      There are concerns with the smRNA FISH experiments. It is highly unusual to see so much intronic signal away from the site of transcription (Fig. 2) (PMID: 27932455, 30554876) which suggests to me the authors are carrying out incorrect thresholding or have a substantial amount of labeling background. The Cote paper cited in the manuscript is likewise inconsistent with their findings and is cited in a misleading manner: they see splicing within a very small region away from the site of transcription.

      One substantial way to improve the manuscript is to take a careful look at previous single cell analysis of the estrogen response, which in some cases has been done on the exact same genes (PMID: 29476006, 35081348, 30554876, 31930333). In some of these cases, the authors reach different conclusions than those presented in the present manuscript. Likewise, there have been more than a few studies which characterized these enhancers (the first one I know of is: PMID 18728018). Also, Oh et al. 2021 (cited in the manuscript) did show an interaction between TFF1e and TFF3, which seems to contradict the conclusion from Fig. 3. In summary, the results of this paper are not in dialog with the field, which is a major shortcoming.

      In the opinion of this reviewer, there are few - if any - experiments to interrogate the existence of LLPS for diffraction limited spots such as those associated with transcription. This difficulty is a general problem with the field and not specific to the present manuscript. For example, transient binding will also appear as a dynamic 'spot' in the nucleus, independently of any higher order interactions. As for Fig. 5, I don't think treating cells with 1,6 hexanediol is any longer considered a credible experiment. For example, there are profound effects on chromatin independent of changes in LLPS (PMID: 33536240).

      Summary:

      In conclusion, I suggest that the authors look at alternative explanations and analyses -- many of which are experimentally and mathematically rigorous and pre-date the condensate model -- to explain their data.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This work sets out to elucidate mechanistic intricacies in inflammatory responses in pneumonia in the context of the aging process (Terc deficiency - telomerase functionality).

      Strengths:

      A conceptually very interesting approach that is by all means worth pursuing. Overall, an appropriate approach to the posited aim.

      We want to thank the reviewer for taking the time to review our manuscript and for providing positive feedback regarding our research question.  

      Weaknesses:

      The work is heavily underpowered and may have statistical deficits. This precludes it in its current state from drawing unequivocal conclusions.

      Thank you for this essential and valuable comment. We fully accept that the small sample size of the Tercko/ko mice is a major limitation of our study and transparently discuss this in our manuscript. However, because of the strong burden of disease, animal welfare regulations permitted only a reduced number of mice. Consequently, only three non-infected and five infected mice were available to us. Due to ethical considerations related to animal welfare and sustainability, as well as compliance with German animal welfare regulations, it is not possible to obtain additional Tercko/ko mice to increase the dataset.

      The animal studies are an important aspect of our study; however, our hypothesis was also investigated at multiple levels, including in an in vitro co-culture model (Figure 5), to ensure comprehensive analysis. Thus, we clearly demonstrated that S. aureus pneumonia in Tercko/ko mice leads to a more severe phenotype, orchestrated by the dysregulation of both innate and adaptive immune response.

      Reviewer #2 (Public Review):

      Summary:

      The authors demonstrate heightened susceptibility of Terc-KO mice to S. aureus-induced pneumonia, perform gene expression analysis from the infected lungs, find an elevated inflammatory (NLRP3) signature in some Terc-KO but not control mice, and some reduction in T cell signatures. Based on that, they conclude that dysregulated inflammation and T-cell dysfunction play a major role in these phenomena.

      Strengths:

      The strengths of the work include a problem not previously addressed (the role of the Terc component of the telomerase complex) in certain aspects of resistance to bacterial infection and innate (and maybe adaptive) immune function.

      We would like to thank the reviewer for the positive feedback regarding our aim to investigate the impact of Terc deletion on the pulmonary immune response to S. aureus.  

      Weaknesses:

      The weaknesses outweigh the strengths, predominantly because the conclusions are plagued by flaws in experimental design, by lack of rigorous controls, and by incomplete and inadequate approaches to testing immune function. These weaknesses are as follows:

      (1) Terc-KO mice are a genomic knockout model, and therefore the authors need to carefully consider the impact of this KO on a wide range of tissues. This, however, is not the case. There are no attempts to perform cell transfers or use irradiation chimera or crosses that would be informative.

      We thank the reviewer for bringing up this important point. The aim of our study, however, was to investigate the impact of Terc deletion in the lung and on the response to bacterial pneumonia, rather than to provide a comprehensive characterization of the Tercko/ko model itself. This characterization of different tissues and cell types has already been conducted by previous studies. These include studies that characterize the general phenotype of the model (Herrera et al., 1999; Lee et al., 1998; Rudolph et al., 1999) as well as investigations that shed light on the impact of Terc deletion on specific cell types such as microglia (Khan et al., 2015) or T cells (Matthe et al., 2022). The impact of Terc deletion on T cells is also discussed in our manuscript in lines 89 to 105. Furthermore, a section about the general phenotype of the Terc deletion model is included in the introduction in lines 126 to 138. Thus, we discussed the relevant literature regarding Tercko/ko mice in our manuscript and attempted to provide a more in-depth characterization of the lung by investigating the inflammatory response to infection as well as changes in gene expression (Figure 2-4).

      (2) Throughout the manuscript the authors invoke the role of telomere shortening in aging, and according to them, their Terc-KO mice should be one potential model for aging. Yet the authors consistently describe major differences between young Terc-KO and naturally aging old mice, with no discussion of the implications. This further confuses the biological significance of this work as presented.

      Thank you for mentioning this relevant point. We want to apologize for the confusion regarding this matter. While Tercko/ko mice are a well-established model for premature aging, the accelerated aging phenotype becomes more apparent with increasing generations (G) and is therefore predominantly seen in later-generation (G5 and G6) or aged Tercko/ko mice (Lee et al., 1998; Wong et al., 2008).

      Since the aim of this study was to analyze the impact of Terc deletion on the lung and its immune response to bacterial infections, rather than the impact of telomere shortening and telomerase dysfunction, young G3 Tercko/ko mice (8 weeks) were used. This is also mentioned in lines 131-134. In this study, Tercko/ko mice were used not as a model of aging, but rather as a model specifically for Terc deletion. The old WT mice function as a control cohort to observe possible common but also deviating effects between aging and Terc deletion. In our sequencing data, we observe that uninfected young WT mice are very similar to uninfected Tercko/ko mice. Other studies have also reported this lack of major differences between uninfected WT and Tercko/ko mice in G3 knockout mice (Kang et al., 2018). Conversely, uninfected old WT and Tercko/ko mice exhibited great differences, for instance, regarding the numbers of differentially expressed genes (Supplemental Figure 1H). Thus, differences between naturally aged mice and young G3 Tercko/ko mice are not surprising. To clarify this aspect, we reconstructed the paragraph discussing the Tercko/ko mice (lines 126-134). Additionally, we added a paragraph explaining the purpose of the naturally aged mice to lines 134 to 138:

      “As control cohort age-matched young WT mice were utilized. To investigate whether Terc deletion, beyond critical telomere shortening, impacts the pulmonary immune response, we used young Tercko/ko mice. Additionally, naturally aged mice (2 years old) were infected to explore the potential link to a fully developed aging phenotype.”

      (3) Related to #2, group design for comparisons lacks a clear rationale. The authors stipulate that Terc-KO will mimic natural aging, but in fact, the only significant differences seen between groups in susceptibility to S. aureus are, contrary to the authors' expectation, between young Terc-KO and naturally old mice (Figures 1A and B, no difference between young Terc-KO and young wt); or there are no significant differences at all between groups (Figures 1C, D).

      We thank the reviewer for this essential comment. As mentioned above, the Tercko/ko mice in this study were not selected to model natural aging. To model telomerase dysfunction and accelerated aging, later-generation or aged Tercko/ko mice would have been more suitable.

      The lack of statistical significance in some figures is likely due to the heterogeneity of disease phenotype of S. aureus infection in mice, which is a limitation of our study that we discuss in our discussion section in lines 576-582. The phenotype of S. aureus infection can vary greatly within a mouse population, highlighting the limitations of mice as a model for S. aureus infections. To account for this heterogeneity we divided the infected Tercko/ko mice cohort into different degrees of severity based on the clinical score and the presence of bacteria in organs other than the lung (mice with systemic infection). 

      Despite the heterogeneity, especially within the Tercko/ko mice cohort, the differences between the knockout mice and both young and old WT mice were striking. Including the fatal infections, 80% of the Tercko/ko mice had a severe course of disease, while none of the WT mice displayed a severe course (Figure 1A, B and Supplemental Figure 1A, B). This hints towards a clear role of Terc in the response to S. aureus infection in mice. Thus, while the differences are not significant in some figures, strong trends towards a more severe phenotype of S. aureus infection in the Tercko/ko mice regarding bacterial load, score and inflammatory response could be observed in our study.

      Another example of inadequate group design is when the authors begin dividing their Terc-KO groups by clinical score into animals with or without "systemic infection" (the condition where a bacterium spreads uncontrollably across the many organs and via blood, which should be properly called sepsis), and then compare this sepsis group to other groups (Supplementary Figures 1G; Figure 2; lines 374-376 and 389-391). This gives them significant differences in several figures, but because they did not clearly indicate where they applied this stratification in the figure legends, the data are somewhat confusing. Most importantly, methodologically it is highly inappropriate to compare one mouse with sepsis to another one without. If Terc-KO mice with sepsis are a comparator group, then their controls have to be wild-type mice with sepsis, who are dealing with the same high bacterial load across the body and are presumably forced to deploy the same set of immune defenses.

      We sincerely appreciate the significant time and effort you have invested in reviewing our manuscript. However, with all due respect, we must point out that the definition of sepsis you have referenced is considered outdated. According to the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3), sepsis is defined as "a life-threatening organ dysfunction caused by a dysregulated host response to infection" (Singer et al., 2016, JAMA). Given this fundamental misunderstanding of our findings, we find the comment regarding the inadequacy of our groups to be both dismissive and lacking in scientific merit. We would like to emphasize that the group size used in our study is consistent with accepted standards in infection research. We strongly reject any insinuations of inadequacy that have been repeatedly mentioned throughout the review.

      In order to provide a nuanced investigation of disease severity in Tercko/ko mice, we added the term “systemic infection” to the figures whenever the mice were divided into groups of mice with and without systemic infection. This is the case for Figure 2A and Supplemental Figure 1C-E. The division into mice with and without systemic infection is also mentioned in the figure legend of Figure 2A in lines 932 to 935 and for Supplemental Figure 1 in lines 1052-1053. We agree that Supplemental Figure 1G is somewhat confusing as the mice with systemic infection are highlighted in this graph but not included as a separate group within our sequencing analysis. We added a sentence to the figure legend clarifying this (lines 1042-1044):

      “Nevertheless, the infected Tercko/ko mice were considered one group for the expression analysis and not split into separate groups for the subsequent analysis.”

      Additionally, we revised the section regarding this grouping in different degrees of severity in our Material and Methods section to clarify that this division was only performed for specific analysis (line 191):

      “…for the indicated analysis.”

      Furthermore, the mice which were classified as systemically infected mice were not septic mice, as mentioned above. Those mice were classified by us as systemically infected based on their clinical score and the presence of bacteria in other organs than the lung as stated in the lines 188-191 and 377-381. Bacteremia is a symptom of very severe cases of hospital-acquired pneumonia with a very high mortality (De la Calle et al., 2016).

      Therefore, the systemically infected mice or rather mice with bacteremia display an especially severe pneumonia phenotype, which is distinct from sepsis. The presence of this symptom in our Tercko/ko mice further highlights the clinical relevance of our study. This aspect was added to the manuscript in the lines 568-570.

      “The detection of bacteria in extra pulmonary organs is of particular interest, as bacteremia is a symptom of severe pneumonia and is associated with high mortality (De la Calle et al., 2016).”

      (4) The authors conclude that dysregulated inflammation and T-cell dysfunction play a major role in S. aureus susceptibility. This may or may not be an important observation, because many KO mice are abnormal for a variety of reasons, and until such reasons are mechanistically dissected, the physiological importance of the observation will remain unclear.

      Two points are important here. First, there is no natural counterpart to a Terc-KO, which is a complete loss of a key non-enzymatic component of the telomerase complex starting in utero. 

      Second, the authors truly did not examine the key basic features of their model, including the features of basic and induced inflammatory and immune responses. This analysis could be done either using model antigens in adjuvants, defined innate immune stimuli (e.g. TLR, RLR, or NLR agonists), or microbial challenge. The only data provided along these lines are the baseline frequencies of total T cells in the spleen of the three groups of mice examined (not statistically significant, Figure 4B). We do not know if the composition of naïve to memory T cell subsets may have been different, and more importantly, we have no data to evaluate whether recruitment of the immune response (including T cells) to the lung upon microbial challenge is similar or different. So, what are the numbers and percentages of T cells and alveolar macrophages in the lung following S. aureus challenge and are they even comparable or are there issues in mobilizing the T cell response to the site of infection? If, for example, Terc-KO mice do not mobilize enough T cells to the lung during infection, that would explain the paucity in many T-cell-associated genes in their transcriptomic set that the authors report. That in turn may not mean dysfunction of T cells but potentially a whole different set of defects in coordinating the response in Terc-KO mice.

      We thank the reviewer for highlighting these important aspects. Regarding the first point, indeed there is no naturally occurring deletion of Terc in humans. However, studies reported reduced expression of Terc and Tert in the tissues of aged mice and rats (Tarry-Adkins et al., 2021; Zhang et al., 2018). Terc itself has been found to have several important immunomodulatory functions such as the activation of the NFκB or PI3-kinase pathway (Liu et al., 2019; Wu et al., 2022). As those aforementioned pathways are relevant for the immune response to S. aureus infections, the authors were interested in exploring the impact of Terc deletion on the pulmonary immune response. The potential immunomodulatory functions of Terc are discussed in lines 106-121. To further clarify our rationale we added a sentence to the introduction in lines 121-125.

      “Interestingly, downregulation of Terc and Tert expression in tissues of aged mice and rats has been found (Tarry-Adkins, Aiken, Dearden, Fernandez-Twinn, & Ozanne, 2021; Zhang et al., 2018). Therefore, as a potential immunomodulatory factor reduced Terc expression could be connected to age-related pathologies.”

      Regarding the second point, as we focused on the effect of Terc deletion in the lung and its role in S. aureus infection, we investigated inflammatory and immune response parameters relevant to this setting. For instance, inflammation parameters in the lungs of all three mice cohorts were measured to investigate differences in the inflammatory response in the non-infected and infected mice (Figure 2A). Those measurements showed no baseline difference in key inflammatory parameters between young WT and Tercko/ko mice, which is consistent with previous findings (Kang et al., 2018). The inflammatory response to infection with S. aureus in the Tercko/ko mice cohort differed significantly from the other cohorts (Figure 2A), hinting towards a dysregulated inflammatory response due to Terc deletion. Furthermore, we investigated the frequencies of general immune cell populations such as dendritic cells, macrophages, and B cells in the spleen of all three mice cohorts to gain a baseline understanding of these populations. In our manuscript, only total T cell frequencies were included due to their relevance for our data regarding T cells (Figure 4B). These data showed no difference in the total number of T cells in the spleen across the three mice cohorts. For a more detailed insight into our analysis, we added the frequencies of the other immune cell populations analyzed in the spleen as Supplemental Figure 3B-F. Additionally, a figure legend for the graphs was added to lines 1075-1094.

      Therefore, while we did not analyze baseline frequencies of specific populations of T cells, we analyzed and characterized the inflammatory and immune response of our model in a way relevant to our research question. 

      The differences observed in T cell marker and TCR gene expression were also partly present between the uninfected and infected Tercko/ko mice, such as the complete absence of CD247 expression in infected Tercko/ko mice, whereas it is expressed in uninfected mice of this cohort (Figure 4A, C and D). Thus, this effect cannot be solely attributed to an inadequate mobilization of T cells to the lung after infectious challenge. However, we agree that a more detailed insight into recruited immune cells to the lung or frequencies of different T cell populations could contribute to a better understanding of the proposed mechanism and would be an interesting experiment to conduct in further studies. We accept this as a limitation of our study and included it in our discussion section in lines 719-723:

      “As total CD4+ T cells were analyzed in this study, it would be useful to investigate specific T cell populations such as memory and effector T cells to elucidate the potential mechanism leading to T cell dysfunctionality in further detail. Additionally, analysis of differences in immune cell recruitment to the lungs between young WT and Tercko/ko mice would be relevant.”

      (5) Related to that, immunological analysis is also inadequate. First, the authors pull signatures from the total lung tissue, which is both imprecise and potentially skewed by differences, not in gene expression but in types of cells present and/or their abundance, a feature known to be affected by aging and perhaps by Terc deficiency during infection. Second, to draw any conclusions about immune responses, the authors would have to track antigen-specific T cells, which is possible for a wide range of microbial pathogens using peptide-MHC multimers. This would allow highly precise analysis of the phenomena the authors are trying to draw conclusions about. Moreover, it would allow them to confirm their gene expression data in populations of physiological interest.

      We thank the reviewer for highlighting this important and relevant point. In our study, we aimed to investigate the role of Terc expression in modulating inflammation and the immune response to S. aureus infection in the lung. To address this, we examined the overall impact of age, genotype, and infection on lung inflammation and gene expression. Therefore, sequencing of total lung tissue was essential for addressing the research question posed. Our findings demonstrate that Tercko/ko mice exhibit a more severe phenotype following S. aureus infection, characterized by an increased bacterial load and heightened lung inflammation (Figures 1 and 2). Furthermore, our data suggest that Terc plays a role in regulating inflammation through activation of the NLRP3 inflammasome, along with the dysregulation of several T cell marker genes (Figures 2, 4, and 5). However, this study lacks a detailed analysis of distinct T cell populations, including antigen-specific T cells, as noted earlier. Investigating these aspects in future studies would be valuable to validate and expand upon our findings. We have incorporated these suggestions into the discussion section (lines 719-723):

      “As total CD4+ T cells were analyzed in this study, it would be useful to investigate specific T cell populations such as memory and effector T cells to elucidate the potential mechanism leading to T cell dysfunctionality in further detail. Additionally, analysis of differences in immune cell recruitment to the lungs between young WT and Tercko/ko mice would be relevant.”

      Nevertheless, our study provides first evidence of a potential connection between T cell functionality and Terc expression.  

      Third, the authors co-incubate AM and T cells with S. aureus. There is no information here about the phenotype of T cells used. Were they naïve, and how many S. aureus-specific T cells did they contain? Or were they a mix of different cell types, which we know will change with aging (fewer naïve and many more memory cells of different flavors), and maybe even with a Terc-KO? Naïve T cells do not interact with AM; only effector and memory cells would be able to do so, once they have been primed by contact with dendritic cells bringing antigen into the lymphoid tissues, so it is unclear what the authors are modeling here. Mature primed effector T cells would go to the lung and would interact with AM, but it is almost certain that the authors did not generate these cells for their experiment (or at least nothing like that was described in the methods or the text).

      Thank you for bringing up this important question. For the co-cultivation experiment of T cells and alveolar macrophages, total CD4+ T cells of both young WT and Tercko/ko were used. We did not select for a specific population of T cells. Our sequencing data indicated the complete downregulation of CD247 expression, which is an important part of the T cell receptor, in the lungs of infected Tercko/ko mice (Figure 4A, C and D). Given that this factor is downregulated under chronic inflammatory conditions, we investigated the impact of the inflammatory response in alveolar macrophages on the expression of various T cell-derived cytokines, as well as CD247 expression (Figure 5D, E) (Dexiu et al., 2022). This aspect is also highlighted in the discussion in lines 622-636. Therefore, a co-cultivation model of T cells and alveolar macrophages was established and confronted with heat-killed S. aureus to elicit an inflammatory response of the macrophages. To emphasize this purpose, we have revised our statement about the model setup in lines 516-518 of the manuscript: 

      “An overactive inflammatory response could be a potential explanation for the dysregulated TCR signaling.”

      The authors hope this will clarify the intent behind the model setup.

      (6) Overall, the authors began to address the role of Terc in bacterial susceptibility, but to what extent that specifically involves inflammation and macrophages, T cell immunity, or aging remains unclear at present.

      We thank the reviewer for the helpful and relevant comments. The authors accept the limitations of the presented study, such as the reduced number of Tercko/ko mice and the limitations of murine models for S. aureus infection itself, and discuss those in the discussion section in lines 558-560, 576-582, 688-690 and 719-725. However, we hope that our responses have provided sufficient evidence to convince the reviewer that our data support a clear role for Terc expression in regulating the immune response to bacterial infections, particularly with respect to inflammation and its potential connection to T cell functionality.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The good element first:

      I read this paper with genuine interest and applaud the authors for investigating the posited question. I consider it by all means scientifically relevant in the context of physiological/pathophysiological aging and reaction to a disease (here pneumonia). The Terc deletion model looks very appropriate for the question and the methodology is very advanced/in-depth. The data flow/selection of endpoints and assays is very logical to me. Moreover, I like the breakdown of pneumonia into varying levels of severity.

      We thank the reviewer for their time and effort taken to revise our manuscript. Additionally, we are grateful to receive your positive feedback regarding our study design and research question.

      The weaknesses:

      (1) I cannot help but notice that the study is heavily underpowered. As such, it is inadmissible. The key reason is that it is the first of its kind and seminal findings must be strongly propped by the evidence. It is apparent to me that the data scatter presented in the figures tends to be abnormally distributed (e.g. obvious bimodal distribution in some groups). Therefore, the presented comparisons (even if stat. sign) can be heavily misleading in terms of: i) the true magnitude of the observed effects and ii) possible type 2 error in some cases of p value >0.05. Solution: repeat the study to ensure reasonable power/reliability. This will also make it stronger as it will immediately demonstrate its reproducibility (or lack of it).

      Thank you for bringing up this extremely relevant point. We acknowledge the small sample size of Tercko/ko mice as a major limitation of our study and transparently discuss it in our discussion section in lines 558-560. However, due to the strict German animal welfare regulations, it is not possible to obtain more Tercko/ko mice, as mentioned above. Furthermore, since fatal infections occurred in the Tercko/ko mice cohort, we had a reduced number of mice available.

      However, the differences between the Tercko/ko and WT mice were striking. Including the fatal infections, 80% of the Tercko/ko mice had a severe course of disease, while none of the WT mice displayed a severe course. This hints towards a clear role of Terc in the response to S. aureus infection in mice.

      (2) In the stat analysis section of M&Ms, the authors feature only 1 sentence. This cannot be. A detailed stats workup needs to be included there. This is very much related to the above weakness; e.g. it is impossible to test for normality (to choose an appropriate post-hoc test) with n=3. Back to square one: study underpowered.

      We thank the reviewer for highlighting this important aspect. We carefully revised the method section in lines 357-360 to include all relevant information: 

      “Data are presented as mean ± SD, or as median with interquartile range for violin and box plots, with up to four levels of statistical significance indicated. P-values were calculated using Kruskal-Wallis test. Individual replicates are represented as single data points.”
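      The exchange above concerns how much a rank-based test can show with very small groups. As a minimal sketch, and not the authors' analysis code, the Kruskal-Wallis H statistic and an exact permutation p-value for two groups can be computed as below. The group sizes (three and five) follow the numbers of non-infected and infected Tercko/ko mice stated in the response; the measurement values, the variable names and the function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum * rank_sum / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


def exact_p_two_groups(a, b):
    """Exact permutation p-value: share of relabelings with H >= observed."""
    pooled = list(a) + list(b)
    observed = kruskal_h([a, b])
    hits = total = 0
    for chosen in combinations(range(len(pooled)), len(a)):
        chosen_set = set(chosen)
        g1 = [pooled[i] for i in chosen]
        g2 = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        if kruskal_h([g1, g2]) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical, fully separated readouts (not data from the study).
uninfected = [1.0, 1.2, 1.5]
infected = [2.1, 2.4, 3.0, 3.3, 4.0]
print(kruskal_h([uninfected, infected]))
print(exact_p_two_groups(uninfected, infected))
```

      Under these assumptions, even complete separation of a group of three from a group of five yields an exact p-value of 2/56 (about 0.036), because only 56 relabelings exist. This is the smallest value such a design can produce, which puts the reviewer's concern about power in concrete terms.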

      (3) Pneumonia severity. While I noted that as a strength, I also note it as weakness here. It looks to me like the authors stopped halfway with this. I totally support testing a biological effect(s) such as the one investigated here across a spectrum of a given disease severity. The authors mention that they had various severity phenotypes produced in their model but this is not visible in the data figs. I strongly suggest including that as well; i.e., to study the posited question in the severe and mild pneumonia phenotype. This is a very smart path and previous preclinical research clearly demonstrated that this severe/mild distinction is very relevant in the context of the observed responses (their presence/absence, longevity, dynamics, etc). I realize this is challenging, thus, I would probably use this approach in the Terc k/o model as sort of a calibrator to see whether the exacerbation observed in the current setup (severe?) will be also present in a mild pneumonia phenotype. S. aureus can be effectively titrated to produce pneumonia of varying severity.

      We thank the reviewer for bringing up this relevant point. 

      In our study, we observed heterogeneity within the infected Tercko/ko cohort. Therefore, as pointed out by the reviewer, we assigned different degrees of severity to those groups based on clinical scores, the fatal outcome of the disease (fatal subgroup), and the presence of bacteria in organs other than the lungs (systemic infection subgroup), as stated in our materials and methods section in lines 188-191 (Supplemental Figure 1A and B). Moreover, we highlighted this difference in a number of our figures. For example, when categorizing the mice into groups with and without systemic infection, we noticed that the mice with systemic infection demonstrated a higher bacterial load, significant body weight loss, and increased lung weight (see Supplemental Figure 1C-E). Interestingly, the two mice with systemic infection clustered separately from the other mice, indicating that the mice with systemic infection are transcriptomically distinct from the other mouse cohorts (Supplemental Figure 1G). Additionally, the inflammatory response was exclusively elevated in the lungs of mice with systemic infection (Figure 2C). Thus, we included this distinction in several figures and attempted to study the differences between those subgroups but also their similarities. For instance, we could observe that some changes in the transcriptome were present in all three infected Tercko/ko mice, such as the complete absence of CD247 expression at 24 hpi (Figure 4D). This distinction therefore provided a more detailed insight into the underlying mechanisms of disease severity in Tercko/ko mice and is lacking in other studies. We agree with the reviewer that a study investigating mild and severe pneumonia phenotypes would be clinically relevant. However, as noted above, due to ethical considerations related to animal welfare and sustainability, as well as compliance with German animal welfare regulations, it is not possible to obtain additional Tercko/ko mice to carry out the proposed experiment.

      (4) Please read ARRIVE guidelines and note the relevant info in M&Ms as ARRIVE guidelines point out.

      Thank you for emphasizing this crucial aspect. We revised our materials and methods section according to the ARRIVE guidelines (lines 179-206).

      “Tercko/ko mice aged 8 weeks, were used for infection studies (n = 8; non-infected = 3; infected = 5). Female young WT (age 8 weeks) and old WT (age 24 months) C57Bl/6 mice (both n = 10; non-infected = 5; infected = 5) were purchased from Janvier Labs (Le Genest-Saint-Isle, France). All infected mouse cohorts were compared to their respective non-infected controls, as well as to the infected groups from other cohorts. Additionally, comparisons were made between the non-infected cohorts across all groups.

      All mice were anesthetized with 2% isoflurane before intranasal infection with S. aureus USA300 (1x10^8 CFU/20µl) per mouse. After 24 hours, the mice were weighed and scored as previously described (Hornung et al., 2023). Infected Tercko/ko mice were grouped into different degrees of severity based on their clinical score, fatal outcome of the disease (fatal) and the presence of bacteria in organs other than the lung (systemic infection) for the indicated analysis. Mice with fatal infections were excluded from subsequent analyses, with only their final scores being reported. The mice were sacrificed via injection of an overdose of xylazine/ketamine and bleeding of axillary artery after 24 hpi. BAL was collected by instillation and subsequent retrieval of PBS into the lungs. Serum and organs were collected. Bacterial load in the BAL, kidney and liver was determined by plating of serially diluted sample as described above. For this organs were previously homogenized in the appropriate volume of PBS. Gene expression was analyzed in the right superior lung lobe. Lobes were therefore homogenized in the appropriate amount of TriZol LS reagent (Thermo Fisher Scientific, Waltham, MA, US) prior to RNA extraction. The left lung lobe was embedded into Tissue Tek O.C.T. (science services, Munich, Germany) and stored at -80°C until further processing for histological analysis. Cytokine measurements were performed using the right inferior lung lobe. Lobes were previously homogenized in the appropriate volume of PBS. Remaining organs were stored at -80°C until further usage. Mouse studies were conducted without the use of randomization or blinding.”

      (5) There are also some other descriptive deficits but they are of a much smaller caliber so I do not list them.

      We thank the reviewer for their valuable and insightful suggestions for improving our manuscript. We hope that our responses and the corresponding revisions address these suggestions satisfactorily.

      Concluding: the investigative idea is great/interesting and the methodological flow is adequate but the low power makes this study of low reliability in its current form. I strongly urge the authors to walk the extra mile with this work to make it comprehensive and reliable. Best of luck!

      Reviewer #2 (Recommendations For The Authors):

      (1) Many legends are uninformative and do not contain critical information about the experiments. For example, Figure 2A with cytokine measurements (in lung homogenates?) is likely showing data from an ELISA or Luminex test, but there is no mention of that in the legend. It stands next to Figure 2B, which is a gene expression map, again, likely from the lung (prepared how, normalized how, etc?) lacking even the most basic information. Further, Figure 2D has no information on the meaning/effect size of gene ratios on the x-axis. Figures 3 and 4 are presumably the subsets of their transcriptome data set (whole lung, harvested on d ?? post-infection), but that is just a guess on my part. Even in the main text, the timing and the controls for the transcriptomic study are not stated (ln. 398 and onwards). The authors really need to revise the figure legends and provide all the details that an average reader would need to be able to interpret the data.

      We thank the reviewer for bringing up this important point. The figure legends of all figures including supplemental figures were revised to ensure they include all relevant data necessary for accurate interpretation of the graphs. Additionally, we clarified the sequenced samples in lines 427-429:

      “We performed mRNA sequencing of the murine lung tissue of infected and non-infected mice at 24 hpi to elucidate potential differentially expressed genes that contribute to the more severe illness of Tercko/ko mice.”

      (2) Telomere shortening affects differentially different cells and its role in aging is nuanced - different in mesenchymal cells with no telomerase induction, in non-replicating cells, and in hematopoietic cells that can readily induce telomerase. The authors should be mindful of that in setting up their introduction and discussion.

      Thank you for mentioning this essential aspect. We revised our introduction and discussion to reflect the nuanced role of telomere shortening in different tissues (lines 83-92 and 690-695):

      “Telomerase activity is restricted to specific tissues and cell types, largely dependent on the expression of Tert. While Tert is highly expressed in stem cells, progenitor cells, and germline cells, its expression is minimal in most differentiated cells (Chakravarti, LaBella, & DePinho, 2021). Consequently, the impact of telomerase dysfunction on tissues varies according to their self-renewal rate. (Chakravarti et al., 2021). One important aspect of telomere dysfunction is the impact of telomere shortening on the immune system as well as the hematopoietic system. Tissues or organ systems that are highly replicative, such as the skin or the hematopoietic system, are affected first by telomere shortening (Chakravarti et al., 2021).”

      “It is important to note that telomere shortening has a significant impact on the immune system. Although young Tercko/ko mice were used in this study, telomere shortening is still likely to be a contributing factor. Therefore, further experiments investigating the role of T cell senescence in this model should therefore be conducted.”

      (3) Syntax and formulations need to be improved and made more scientifically precise in several spots. Specifically, in 62-63, the authors say that the aged immune system "is also discussed to be more irritable", please change to reflect the common notion that the reaction to infection is dysregulated; in many cases inflammation itself is initially blunted, misdirected, and of different type (e.g. for viruses, the key IFN-I responses are not increased but decreased). In lines 114-117, presumably, the two sentences were supposed to be connected by a comma, although some editing for clarity is probably needed regardless. Line 252, please change "unspecific" to "non-specific". Line 264, please capitalize German.

      We thank the reviewer for bringing these important points to our attention. We revised our introduction regarding the aged immune response in lines 61-69:

      “Age-related dysregulation of the immune response is also characterized by inflammaging, defined as the presence of elevated levels of pro-inflammatory cytokines in the absence of an obvious inflammatory trigger (Franceschi et al., 2000; Mogilenko, Shchukina, & Artyomov, 2022). Additionally, immune cells, such as macrophages, exhibit an activated state that alters their response to infection (Canan et al., 2014). In contrast, the immune response of macrophages to infectious challenges has been shown to be initially impaired in aged mice (Boe, Boule, & Kovacs, 2017). Thus aging is a relevant factor impacting the pulmonary immune response.”

      Sentences were edited to provide more clarity in lines 131-134:

      “Although G3 Tercko/ko mice with shortened telomeres were used in this study, they were infected at a young age (8 weeks). This approach allowed for the investigation of Terc deletion effects rather than telomere dysfunction.”

      “Unspecific” was changed to “non-specific” in line 282, and “German” was capitalized in lines 293 and 558.

      We appreciate and thank you for your time spent processing this manuscript and look forward to your response.

      References

      De la Calle, C., Morata, L., Cobos-Trigueros, N., Martinez, J. A., Cardozo, C., Mensa, J., & Soriano, A. (2016). Staphylococcus aureus bacteremic pneumonia. European Journal of Clinical Microbiology & Infectious Diseases, 35(3), 497-502. https://doi.org/10.1007/s10096-015-2566-8  

      Dexiu, C., Xianying, L., Yingchun, H., & Jiafu, L. (2022). Advances in CD247. Scand J Immunol, 96(1), e13170. https://doi.org/10.1111/sji.13170  

      Herrera, E., Samper, E., Martín-Caballero, J., Flores, J. M., Lee, H. W., & Blasco, M. A. (1999). Disease states associated with telomerase deficiency appear earlier in mice with short telomeres. EMBO J, 18(11), 2950-2960. https://doi.org/10.1093/emboj/18.11.2950

      Hornung, F., Schulz, L., Köse-Vogel, N., Häder, A., Grießhammer, J., Wittschieber, D., Autsch, A., Ehrhardt, C., Mall, G., Löffler, B., & Deinhardt-Emmer, S. (2023). Thoracic adipose tissue contributes to severe virus infection of the lung. International Journal of Obesity, 47(11), 1088-1099. https://doi.org/10.1038/s41366-023-01362-w

      Kang, Y., Zhang, H., Zhao, Y., Wang, Y., Wang, W., He, Y., Zhang, W., Zhang, W., Zhu, X., Zhou, Y., Zhang, L., Ju, Z., & Shi, L. (2018). Telomere Dysfunction Disturbs Macrophage Mitochondrial Metabolism and the NLRP3 Inflammasome through the PGC-1α/TNFAIP3 Axis. Cell Reports, 22(13), 3493-3506. https://doi.org/10.1016/j.celrep.2018.02.071

      Khan, A. M., Babcock, A. A., Saeed, H., Myhre, C. L., Kassem, M., & Finsen, B. (2015). Telomere dysfunction reduces microglial numbers without fully inducing an aging phenotype. Neurobiology of Aging, 36(6), 2164-2175. https://doi.org/10.1016/j.neurobiolaging.2015.03.008

      Lee, H.-W., Blasco, M. A., Gottlieb, G. J., Horner, J. W., Greider, C. W., & DePinho, R. A. (1998). Essential role of mouse telomerase in highly proliferative organs. Nature, 392(6676), 569-574. https://doi.org/10.1038/33345  

      Liu, H., Yang, Y., Ge, Y., Liu, J., & Zhao, Y. (2019). TERC promotes cellular inflammatory response independent of telomerase. Nucleic Acids Research, 47(15), 8084-8095. https://doi.org/10.1093/nar/gkz584  

      Matthe, D. M., Thoma, O. M., Sperka, T., Neurath, M. F., & Waldner, M. J. (2022). Telomerase deficiency reflects age-associated changes in CD4+ T cells. Immun Ageing, 19(1), 16. https://doi.org/10.1186/s12979-022-00273-0  

      Rudolph, K. L., Chang, S., Lee, H. W., Blasco, M., Gottlieb, G. J., Greider, C., & DePinho, R. A. (1999). Longevity, stress response, and cancer in aging telomerase-deficient mice. Cell, 96(5), 701-712. https://doi.org/10.1016/s0092-8674(00)80580-2  

      Tarry-Adkins, J. L., Aiken, C. E., Dearden, L., Fernandez-Twinn, D. S., & Ozanne, S. (2021). Exploring Telomere Dynamics in Aging Male Rat Tissues: Can Tissue-Specific Differences Contribute to Age-Associated Pathologies? Gerontology, 67(2), 233-242. https://doi.org/10.1159/000511608  

      Wong, L. S. M., Oeseburg, H., de Boer, R. A., van Gilst, W. H., van Veldhuisen, D. J., & van der Harst, P. (2008). Telomere biology in cardiovascular disease: the TERC−/− mouse as a model for heart failure and ageing. Cardiovascular Research, 81(2), 244-252. https://doi.org/10.1093/cvr/cvn337  

      Wu, S., Ge, Y., Lin, K., Liu, Q., Zhou, H., Hu, Q., Zhao, Y., He, W., & Ju, Z. (2022). Telomerase RNA TERC and the PI3K-AKT pathway form a positive feedback loop to regulate cell proliferation independent of telomerase activity. Nucleic Acids Res, 50(7), 3764-3776. https://doi.org/10.1093/nar/gkac179  

      Zhang, M. W., Zhao, P., Yung, W. H., Sheng, Y., Ke, Y., & Qian, Z. M. (2018). Tissue iron is negatively correlated with TERC or TERT mRNA expression: A heterochronic parabiosis study in mice. Aging (Albany NY), 10(12), 3834-3850. https://doi.org/10.18632/aging.101676

    2. eLife Assessment

      In this manuscript, the authors sought to elucidate mechanistic intricacies of inflammatory responses, with emphasis on T cell dysfunction, to S. aureus-induced pneumonia in the context of the aging process using Terc-deficient mice. Conceptually, the study is very interesting with a set of useful findings. Although some experimental approaches are appropriate, the work as shown in the revised manuscript remains significantly underpowered, and the absence of rigorous controls makes this study incomplete in support of its claims.

    3. Reviewer #1 (Public review):

      Summary:

      This work sets out to elucidate mechanistic intricacies in inflammatory responses in pneumonia in the context of the aging process (Terc deficiency - telomerase functionality).

      Strengths:

      Very interesting, conceptually speaking, approach that is by all means worth pursuing. An overall proper approach to the posited aim.

      Weaknesses:

      The work is heavily underpowered and may have statistical deficits. In its current state, this precludes drawing unequivocal conclusions.

      I remain at my initial position regarding the weaknesses.

    4. Reviewer #2 (Public review):

      Summary

      The authors demonstrate heightened susceptibility of Terc-KO mice to S. aureus-induced pneumonia, perform gene expression analysis from the infected lungs, find an elevated inflammatory (NLRP3) signature in some Terc-KO but not control mice, and some reduction in T cell signatures. Based on that, they conclude that dysregulated inflammation and T cell dysfunction play a major role in these phenomena.

      The strengths of the work did not change, and include a problem not previously addressed (the role of Terc component of the telomerase complex) in certain aspects of resistance to bacterial infection and innate (and maybe adaptive) immune function.

      The weaknesses of this revised version still outweigh the strengths, because the authors did not substantially or experimentally answer the main criticism points, and have rather tried to argue away that which cannot be argued away. In summary, the most germane conclusions of this study remain plagued by flaws in experimental design, by lack of rigorous controls and by incomplete and inadequate approaches to testing of immune function.

      I will devote the rest of the comments to the revised manuscript and its success or lack thereof in responding to prior criticisms. Prior criticisms are again listed below in italics, to provide context for the attempts of the investigators to respond.

      (1) Reviewer 1 has justifiably criticized the exceptionally low power of the study, with 5 control and 3 experimental animals. The responding author has replied that the animal welfare laws preclude them from doing more experiments. That is unfortunate, and I sympathize with the authors. Nonetheless, in the absence of robust corroboration the rigor of the study remains severely compromised and the work is reduced to what I have pointed out above - a preliminary and inconclusive study that is in need of deeper and more serious mechanistic investigation.

      (2) Terc-KO mice are a genomic knockout model, and therefore the authors need to carefully consider the impact of this KO on a wide range of tissues. This, however, is not the case. There are no attempts to perform cell transfers, use irradiation chimera or crosses that would be informative.

      In response to this criticism, the authors have quoted a whole bunch of papers characterizing different aspects of biology of these same mice. The most important paper in that regard would be the one by Matthe et al. on CD4 cells from these same mice. That study was limited and simply diagnosed in situ the changes in T cell pool, but did not decipher whether and to what extent such defects are cell-intrinsic or a byproduct of similarly altered microenvironments. Most importantly, none of that answers the original critique question of which cell types are truly the culprits in the Terc deletion phenotype presented here. As I indicated, one has to perform cell transfers, bone marrow irradiation chimera, additional genetic crosses and combinations thereof to substantiate whether the defects are ascribable to the lung tissue itself, the infiltrating myeloid cells, including macrophages, the T cells or a combination thereof. The authors provided none of this.

      (3) Throughout the manuscript the authors invoke the role of telomere shortening in aging, and according to them their Terc-KO mice should be one potential model for aging. Yet the authors consistently describe major differences between young Terc-KO and naturally aging old mice, with no discussion of the implications. This further confuses the biological significance of this work as presented.

      (4) Related to #2, group design for comparisons lacks a clear rationale. The authors stipulate that Terc-KO will mimic natural aging, but in fact, the only significant differences seen between groups in susceptibility to S. aureus are, contrary to the authors' expectation, between young Terc-KO and naturally old mice (Fig. 1A and B, no difference between young Terc-KO and young wt); or there are no significant differences at all between groups (Fig. 1, C, D,). I have also raised the issue of non-physiological nature of a germline Terc-KO, that does not mimic any known physiological or pathological state.

      The authors provided a non-response to this criticism. They argue in their response under (2) of their rebuttal that they included old mice as controls not for aging, because their experimental Terc-deletion mice were G3 and do not exhibit as much of a progeroid phenotype as G5 or G6 mice. But they still say in the revised formulation that these mice were infected "to explore the potential link to a fully developed aging phenotype". They just never conclude that no such link is substantiated by the vast majority of their data. Moreover, they come back to state in their response (4) that because the literature reported ".... reduction of Terc and Tert in tissues of old mice and rats. Therefore, as a potential immunomodulatory factor reduced Terc expression could be connected to age-related pathologies." So either they have used old mice here to compare aging phenotypes, and found that Terc-KO mice diverge massively from aging phenotypes, in which case they have to state so, or they are not using them as age comparators (in which case I am not sure what their purpose is).

      (5) (originally part of criticism #4) I have criticized the inadequate group design when the authors begin dividing their Terc-KO groups by clinical score into animals with or without "systemic infection" (the condition where a bacterium spreads uncontrollably across the many organs and via blood, which should be properly called sepsis), and then compare this sepsis group to other groups (Suppl Fig. 1G; Fig. 2; lines 374-376 and 389-391). .... Most importantly, methodologically it is highly inappropriate to compare one mouse with sepsis to another one without. If Terc-KO mice with sepsis are a comparator group, then their controls have to be wild type mice with sepsis, who are dealing with the same high bacterial load across the body and are presumably forced to deploy the same set of immune defenses.

      The authors responded by making me aware of the 2016 JAMA definition of sepsis that invokes "a life-threatening organ dysfunction caused by a dysregulated host response to infection". I appreciate the correction, and note that in a human setting and globally, such a definition may make sense. The authors stated that bacteremia and not sepsis should be used as a criterion. I agree, and per my original criticism, believe it will be appropriate to compare bacteremic wt and KO mice.

      (6) I am shortening my prior critique to make it more to the point that was not addressed: The authors conclude that dysregulated inflammation and T cell dysfunction play a major role in S. aureus susceptibility. This may or may not be an important observation, because many KO mice are abnormal for a variety of reasons, and until such reasons are mechanistically dissected, the physiological importance of the observation will remain unclear. ....., the authors truly did not examine the key basic features of their model, including the features of basic and induced inflammatory and immune response. This analysis could be done either using model antigens in adjuvants, defined innate immune stimuli (e.g. TLR, RLR or NLR agonists), or microbial challenge. The only data provided along these lines are the baseline frequencies of total T cells in the spleen of the three groups of mice examined (not statistically significant, Fig. 4B). We do not know if the composition of naïve to memory T cell subsets may have been different, and more importantly, we have no data to evaluate whether recruitment of the immune response (including T cells) to the lung upon microbial challenge is similar or different. So, what are the numbers and percentages of T cells and alveolar macrophages in the lung following S. aureus challenge, and are they even comparable or are there issues in mobilizing the T cell response to the site of infection? If, for example, Terc-KO mice do not mobilize enough T cells to the lung during infection, that would explain paucity in many T cell-associated genes in their transcriptomic set that the authors report. That in turn may not mean dysfunction of T cells but potentially a whole different set of defects in coordinating the response in Terc-KO mice.

      The authors did not respond to this criticism other than to provide more frequencies of different subsets. The key here are the NUMBERS of cells present at the peak of challenge, or better yet the kinetics of cell accumulation (again numbers), as well as transfer experiments to establish where the defect actually lies (mobilization, activation, proliferation, etc.).

      (7) Related to that, immunological analysis is also inadequate. First, the authors pull signatures from the total lung tissue, which is both imprecise and potentially skewed by differences not in gene expression but in types of cells present and/or their abundance, a feature known to be affected by aging and perhaps by Terc deficiency during infection. Second, to draw any conclusions about immune responses, the authors would have to track antigen-specific T cells, which is possible for a wide range of microbial pathogens using peptide-MHC multimers. This would allow highly precise analysis of phenomena the authors are trying to conclude about. Moreover, it would allow them to confirm their gene expression data in populations of physiological interest.

      The authors agreed that this would be of interest but did nothing to provide it. They provided a sentence in the discussion stating that this (as well as many other experiments needed to interpret the results) would be of interest.

      (8) Overall, the authors have begun to address the role of Terc in bacterial susceptibility, but to what extent that specifically involves inflammation and macrophages, T cell immunity or aging remains unclear at present.

      My conclusion from the prior review remains unchanged in the face of the revision that did not answer most of the previous criticism. The study as it stands is inconclusive and highly preliminary, with lack of clearly defined mechanistic underpinnings.

    1. eLife Assessment

      In this useful study, the authors present convincing evidence linking the enzyme D-alanine-D-alanine ligase (Ddl), crucial for cell wall fortification, to organic acid exposure in Staphylococcus aureus. While it's established that organic acids impede bacterial growth, the researchers reveal a novel coping mechanism where S. aureus maintains elevated levels of D-alanine, the substrate for Ddl, to counteract this inhibition. This discovery illuminates a bacterial strategy for organic acid tolerance, offering new insights for microbiologists and potentially informing future antimicrobial approaches.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript entitled "Staphylococcus aureus counters organic acid anion-mediated inhibition of peptidoglycan cross-linking through robust alanine racemase activity" by Panda, S et al. reports an extensive biochemical analysis of the result from a Tn screen that identified alr1 as being required for acetic acid tolerance. In the end, they demonstrate that reduced D-Ala pools in the ∆alr1 mutant lead to a drastic reduction in D-Ala-D-Ala dipeptide. They show that this is due to the ability of organic acid anions to limit the D-Ala-D-Ala ligase enzyme Ddl. They demonstrate that:

      (1) Acetate exposure in the ∆alr1 results in reduced D-Ala-D-Ala dipeptide, but not the monomers.

      (2) Acetate can bind to purified Ddl in vitro.

      (3) This binding results in reduced enzyme activity.

      (4) Other organic acid anions such as lactate, propionate, and itaconate can also inhibit Ddl.

      The experiments are clearly described and logically laid out.

      Comments on revised version:

      Given that multiple reviewers noted that determining intracellular acetate levels would strengthen the impact of this manuscript, I still think the comment listed below should be dealt with. Radioactivity is not necessary for this. There are enzymatic kits that will allow for the accurate determination of acetate from a lysate of a known number of cells. This can be used to determine intracellular acetate levels.

      (1) It is kind of tricky, but it is possible to measure intracellular acetate. That might be of interest to know where in the Ddl inhibition curve the cells actually are.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, using Staphylococcus aureus as a model organism, Panda et al. aim to understand how organic acids inhibit bacterial growth. Through careful characterization and interdisciplinary collaboration, the authors present valuable evidence that acetic acid specifically inhibits the activity of the Ddl enzyme, which converts two D-alanine amino acids into the D-ala-D-ala dipeptide that is then used to generate the stem pentapeptide of peptidoglycan (PG) precursors in the cytoplasm. Thus, a high concentration of acetic acid weakens the cell wall by limiting PG-crosslinking (which requires the D-ala portion). However, S. aureus maintains a high intracellular D-ala concentration to circumvent acetate-mediated growth inhibition.

      Strengths:

      The authors utilized a well-established transposon mutant library to screen for mutants that struggle to grow in the presence of acetic acid. This screen allowed the authors to identify that a strain lacking intact alr1, which encodes alanine racemase (converts L-ala to D-ala), is unable to grow well in the presence of acetic acid. This phenotype is rescued by the addition of external D-ala. Next, the authors rule out the contribution of other pathways that could lead to the production of D-ala in the cell. Finally, by analyzing D-ala and D-ala-D-ala concentrations, as well as the accumulation of muropeptide intermediates in different mutants, the authors pinpoint Ddl as the specific target of acetic acid. In fact, synthetic overexpression of ddl alone overcomes the toxic effects of acetic acid. Using genetics, biochemistry, and structural biology, the authors show that Ddl activity is specifically inhibited by acetic acid and likely by other biologically relevant organic acids. Interestingly, this mechanism is different from what has been reported for other organisms such as Escherichia coli (where methionine synthesis is affected). It remains to be seen if this mechanism is conserved in other organisms that are more closely related to S. aureus, such as Clostridioides difficile and Enterococcus faecalis.

      Weaknesses:

      None noted. With new data the authors have satisfactorily addressed all the concerns of the previous version.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer #1:

      (1) Which allele is alr1, the one upstream of mazEF or the one in the lysine biosynthetic operon?

      Alr1 is encoded by SAUSA300_2027, the gene upstream of mazEF. We have now incorporated this information in the manuscript (Line# 127).

      (2) Figure 3B. Where does the C3N2 species come from in the WT and why is it absent in the mutants? It is about 25% of the total dipeptide pool.

      In Figure 3B, C3N2 species results from the combination of C3N1 (from Alr1) and C0N1 (from Dat). The reason this species is completely absent in either of the two mutants is because it requires one D-Ala from both Alr1 and Dat proteins to generate C3N2 D-Ala-D-Ala.
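
      The combinatorial argument above can be sketched in a few lines. This is a simplified illustration, not the authors' analysis: it assumes that each D-Ala carries exactly the label stated in the response (C3N1 when made by Alr1, C0N1 when made by Dat), that the labels of the two residues simply add in the dipeptide, and that no other D-Ala source contributes. The names D_ALA_SOURCES and dipeptide_species are hypothetical.

```python
from itertools import combinations_with_replacement

# Label carried by D-Ala from each source, as (13C count, 15N count).
# Values follow the response above; everything else is an assumption.
D_ALA_SOURCES = {"Alr1": (3, 1), "Dat": (0, 1)}


def dipeptide_species(active_sources):
    """Return the set of D-Ala-D-Ala isotopologue labels that can form
    when only the given D-Ala sources are active."""
    species = set()
    for first, second in combinations_with_replacement(sorted(active_sources), 2):
        carbons = D_ALA_SOURCES[first][0] + D_ALA_SOURCES[second][0]
        nitrogens = D_ALA_SOURCES[first][1] + D_ALA_SOURCES[second][1]
        species.add(f"C{carbons}N{nitrogens}")
    return species


if __name__ == "__main__":
    print("WT:        ", sorted(dipeptide_species({"Alr1", "Dat"})))
    print("alr1 mutant:", sorted(dipeptide_species({"Dat"})))
    print("dat mutant: ", sorted(dipeptide_species({"Alr1"})))
```

      Under these assumptions the mixed C3N2 species can only appear when both enzymes supply D-Ala, which matches its absence from either single mutant.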

      (3) Figure 3D could perhaps be omitted. I understand that the authors attained statistical significance in the fitness defect, but biologically this difference is very minor. One would have to look at the isotopomer distribution in the Dat overexpressing strain to make sure that increased flux actually occurred since there are other means of affecting activity (e.g. allosteric modulators).

      Thank you for the suggestion. We agree with the reviewer that the fitness defect observed after increased dat expression is relatively minor and have moved this figure to the supplementary section as Figure 3-figure supplement 1.

      Although we attempted to amplify the fitness defect of dat expression by cloning dat on to a multicopy vector, we couldn't maintain its stable expression in S. aureus. This instability may be due to the depletion of D-Ala when dat is overexpressed. As a result, we switched to expressing dat from a single additional copy integrated into the SaPI locus, which was sufficient to cause the expected fitness defect, albeit a minor one.

      (4) In Figure 4A, why is the complete subunit UDP-NAM-AEKAA increasing in each strain upon acetate challenge if there was such a stark reduction in D-Ala-D-Ala, particularly in the ∆alr1 mutant? For that matter, why are the levels of UDP-NAM-AEKAA in the ∆alr1 mutant identical to that of WT with/out acetate?

      Thank you for raising this important point. We have addressed this in line# 299-302 and 451-455 of the revised manuscript. In short, we believe that the inhibition of Ddl by acetate significantly increases the intracellular pool of the tripeptide UDP-NAM-AEK, which then outcompetes the substrate (pentapeptide; UDP-NAM-AEKAA) of MraY. As a result, the intracellular concentration of the pentapeptide increases since it is no longer efficiently consumed by MraY. This explanation is also supported by a kinetic study conducted in Ref (1), where the competition between UDP-NAM-AEKAA and UDP-NAM-AEK as substrates for MraY is demonstrated.

      (5) Figure 4B. Is there no significant difference between ddl and murF transcripts between WT and ∆alr1 under acetate stress? This comparison was not labeled if the tests were done.

      Thank you for suggesting this comparison. The ddl and murF transcript levels differed significantly between WT and the alr1 mutant under acetate stress. We have added this comparison to Figure 4B.

      (6) Although tricky, it is possible to measure intracellular acetate. It might be of interest to know where in the Ddl inhibition curve the cells actually are.

      Thank you for the suggestion. We agree this would have been an excellent addition to the manuscript. However, accurately measuring intracellular acetate would require the use of radiolabeled acetate (2), and we currently lack the expertise to do this experiment. Nevertheless, since our study clearly shows that acetate-mediated growth impairment is due to Ddl inhibition, and the IC50 of acetate for Ddl is around 400 mM, we predict that the intracellular concentration must be close to or above this IC50 to produce the growth phenotypes we report.

      Reviewer #2:

      Although the authors have conclusively shown that Ddl is the target of acetic acid, it appears that the acetic acid concentration used in the experiments may not truly reflect the concentration range S. aureus would experience in its environment. Moreover, Ddl is only significantly inhibited at a very high acetate concentration (>400 mM). Thus, additional experiments showing growth phenotypes at lower organic acid concentrations may be beneficial.

      Thank you for the suggestion. In response to the reviewer, we have measured growth at various acetate concentrations and demonstrate a concentration-dependent effect (Figure 1C).

      We use 20 mM acetic acid in our study. In the gut, where S. aureus colonizes, acetate levels can reach up to 100 mM, so we believe our concentrations are physiologically relevant. When S. aureus encounters 20 mM acetate, the intracellular concentration can rise to 600 mM if the transmembrane pH gradient is 1.5 units, which is well above the ~400 mM IC50 we report for Ddl.

      Another aspect not adequately discussed is the presence of D-ala in the gut environment, which may be protective against acetate toxicity based on the model provided.

      Thank you for pointing this out. We agree that D-Ala from the gut microbiota could protect against acetate toxicity, and we’ve included this in the discussion. However, our study clearly indicates that S. aureus itself maintains high intracellular D-Ala levels through Alr1 activity, which is sufficient to counter acetate anion intoxication.

      Recommendation for the authors:

      Reviewer #2:

      Major Comments:

      (1) In Line 85, authors indicate S. aureus may encounter a high concentration of ~100 mM acetic acid (extracellular?). Could the authors cite more (and recent) references indicating S. aureus encounters >100 mM acetic acid in the environment?

      To the best of our knowledge, no studies have specifically examined whether S. aureus encounters high millimolar concentrations of acetate in the gut. Line 85 was surmised from multiple studies: recent findings that S. aureus colonizes the gut (3, 4) and that the gut environment has high acetate levels (~100 mM) (5). In response to the reviewer's request, more recent references supporting high acetate concentrations in the gut (6, 7) have been added in Line# 86.

      (2) In Line 117, it is mentioned that S. aureus when grown in vitro at 20 mM acetic acid can accumulate ~600 mM acetic acid in the cytoplasm.

      a. Does the intracellular concentration go up proportionally if grown in 100 mM acetic acid? Given the IC50 of acetic acid-mediated inhibition of Ddl is ~400 mM, I wonder how physiologically relevant this finding presented here is.

      Thank you for the opportunity to explain this further. If S. aureus encounters a concentration of 100 mM acetate and its transmembrane pH gradient (pHin-pHout) is held at 1.5, the intracellular concentration of acetate could theoretically increase up to 3 M based on Ref (8). However, previous studies have shown that bacteria can lower the magnitude of transmembrane pH gradient by decreasing their intracellular pH to limit accumulation of anions within cells (9, 10).

      Although our study shows that the IC50 of Ddl inhibition by acetate is relatively high (~400 mM), we believe it’s still relevant because just 20 mM of environmental acetate at a pH of 6.0 can raise the intracellular concentration of acetate to over 600 mM, which is well above the IC50 we report for Ddl. Moreover, since S. aureus may encounter high concentrations of acetate during gut colonization, we believe our findings are physiologically relevant.
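
      The arithmetic behind these estimates can be sketched as follows. This is a minimal illustration, not the calculation used in the manuscript or in Ref (8). It assumes that only undissociated acetic acid crosses the membrane and equilibrates on both sides, that the pKa of acetic acid is 4.76, and that an external pH of 6.0 is paired with an internal pH of 7.5 (the 1.5-unit gradient mentioned above). The function name is hypothetical.

```python
PKA_ACETIC_ACID = 4.76  # assumed textbook value for acetic acid


def intracellular_acetate_mM(total_out_mM, ph_out, ph_in, pka=PKA_ACETIC_ACID):
    """Estimate total intracellular acetate (acid + anion) at equilibrium, in mM.

    Assumption: undissociated acid (HA) diffuses freely, so [HA] is equal on
    both sides of the membrane; the anion is trapped according to the local pH.
    """
    # Henderson-Hasselbalch outside: total = [HA] * (1 + 10^(pH_out - pKa))
    undissociated = total_out_mM / (1 + 10 ** (ph_out - pka))
    # Same [HA] inside; anion inside = [HA] * 10^(pH_in - pKa)
    return undissociated * (1 + 10 ** (ph_in - pka))


at_20_mM = intracellular_acetate_mM(20, ph_out=6.0, ph_in=7.5)
at_100_mM = intracellular_acetate_mM(100, ph_out=6.0, ph_in=7.5)
fold_accumulation = at_20_mM / 20
```

      Under these assumptions the model gives roughly 30-fold accumulation, that is, approximately 600 mM inside for 20 mM outside and approximately 3 M for 100 mM outside, in line with the figures quoted above. A smaller pH gradient lowers the estimate accordingly.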

      b. Could the authors show concentration-dependent growth inhibition in alr::tn by titrating a range of acetic acid concentrations (for example 0, 0.5, 1, 5, 10, 20 mM)? Measuring intracellular acetate concentration may be beneficial as well.

      Thank you for this question. We now provide data to support that acetate-mediated inhibition of the alr1 mutant is concentration-dependent (see Figure 1C).

      c. It appears that there may be excess D-ala in the gut environment (PMIDs: 30559391; 35816159), which could counter the high acetate based on the model presented here. Could the authors clarify and/or include this information in the manuscript?

      This is an excellent point, and we have now included it in the discussion (Line# 470-475). It is indeed possible that D-Ala produced by the gut microbiome may further enhance S. aureus resistance to organic acid anions, in addition to the inherent contribution of Alr1 activity.

      (3) The following is not needed; however, it would be interesting if the authors could show that S. aureus cells grown in the presence of acetate are highly sensitive to cycloserine (which targets Alr and Ddl) compared to cells grown in the absence of acetate.

      Thank you for the suggestion. We are currently studying D-cycloserine (DCS) resistance in S. aureus. Although we provide the data below for clarification, it is not included in the current manuscript as it is part of a separate study.

      As the reviewer speculated, S. aureus is more susceptible to DCS when grown in the presence of acetate (see figure below). Normally, complete growth inhibition occurs at 32 µg/ml of DCS. However, with 20 mM acetic acid present, complete inhibition is achieved at just 8 µg/ml of DCS. Furthermore, the growth inhibition is completely rescued when externally supplemented with 5 mM D-Ala. We believe that DCS works synergistically with acetate to inhibit Ddl activity, and we are conducting additional studies to explore this further.

      Minor Comments:

      (1) Many commas are missing.

      Missing commas are now incorporated.

      (2) Line 77: disassociate --> dissociate

      Corrected.

      (3) Line 103: that --> which

      Corrected.

      (4) Lines 199-203: authors could have used gfp/luciferase reporter to test their hypotheses.

      Thank you for the suggestion. Initially, we created GFP translational fusions for all the mutants mentioned in Line# 199-203. However, the fluorescence intensity was too low to test the hypothesis, as these were single-copy fusions inserted at the SaPI site of the S. aureus genome. Because of this limitation, we took advantage of the essentiality of D-Ala-D-Ala in S. aureus to report on various mutants instead of a fluorescent reporter. In hindsight, a LacZ reporter assay might have been equally effective.

      (5) Line 339: It would be beneficial to introduce that Ddl has two independent ATP and D-ala binding sites.

      We have now added that information (Line# 338-339).

      (6) Is ddl an essential gene? If so, explicitly mention that.

      Yes, ddl is an essential gene and we have now incorporated this information in Line 103.

      (7) Line 354: shows a difference in density?

      “Difference density” is a technical crystallographic term commonly used to connote density observed for ligands in X-ray crystal structures. In this case, it simply refers to the observed density that corresponds to the two acetate ions bound within the Ddl active site.

      (8) Line 498: "Thus." Typo, change period to comma.

      We have corrected this as suggested (Line 496).

      (9) Figure 1 legend says "was screen" instead of screened.

      This is now corrected.

      (10) Figure 1- Figure Supplement 1B: including data for alr2::tn dat::tn may ensure no redundancy (Lines 171-172). It is currently missing.

      Thank you for the suggestion. We now include both the alr2dat double mutant and the alr1alr2dat triple mutant in Figure 1 - Figure Supplement 1B. In addition, we show that the alr1alr2dat mutant is rescued by the addition of D-Ala in Figure 1 - Figure Supplement 1C. The mutant information is also added to Table S5.

      (11) Figure 7: pentaglycine coming off of NAM is misleading. Remove untethered pentaglycine bridges.

      We thank you for pointing this out. We have modified the figure in the manuscript as suggested by the reviewer.

      (12) Are alr1/ddl cells (with limited 4-3 PG crosslink) less sensitive to vancomycin?

      On the contrary, the alr1 mutant is slightly more sensitive to vancomycin compared to the wild-type strain (see Figure below). We believe this happens because the alr1 mutant incorporates less D-Ala-D-Ala into the peptidoglycan, reducing the number of targets for vancomycin. As a result, vancomycin may be able to saturate the available D-Ala-D-Ala targets on the cell wall at a lower concentration in the alr1 mutant than in the wild type strain, leading to increased sensitivity. We haven’t included this data in the manuscript as it is part of a separate study.

      (13) Based on the structural studies, could the authors mutate the residues of Ddl involved in acetic acid binding, thereby making it resistant to acetic acid stress?

      The residues that the acetate anion interacts with are located within the ATP-binding and D-Ala-binding sites of Ddl. Since these residues are essential for Ddl function, we are unable to mutate them.

      (14) Microscopy to show the cell morphologies of wild-type and mutants exposed to acetic acid (and with D-ala supplementation) could be potentially interesting.

      Thank you for the suggestion. We did perform microscopy, expecting changes in cell shape or size, but the results were unremarkable and not included in the manuscript.

      References:

      (1) Hammes WP & Neuhaus FC (1974) On the specificity of phospho-N-acetylmuramyl-pentapeptide translocase. The peptide subunit of uridine diphosphate-N-actylmuramyl-pentapeptide. J Biol Chem 249(10):3140-3150.

      (2) Roe AJ, McLaggan D, Davidson I, O'Byrne C, & Booth IR (1998) Perturbation of anion balance during inhibition of growth of Escherichia coli by weak acids. J Bacteriol 180(4):767-772.

      (3) Acton DS, Plat-Sinnige MJ, van Wamel W, de Groot N, & van Belkum A (2009) Intestinal carriage of Staphylococcus aureus: how does its frequency compare with that of nasal carriage and what is its clinical impact? Eur J Clin Microbiol Infect Dis 28(2):115-127.

      (4) Piewngam P, et al. (2023) Probiotic for pathogen-specific Staphylococcus aureus decolonisation in Thailand: a phase 2, double-blind, randomised, placebo-controlled trial. Lancet Microbe 4(2):e75-e83.

      (5) Cummings JH, Pomare EW, Branch WJ, Naylor CP, & Macfarlane GT (1987) Short chain fatty acids in human large intestine, portal, hepatic and venous blood. Gut 28(10):1221-1227.

      (6) Correa-Oliveira R, Fachi JL, Vieira A, Sato FT, & Vinolo MA (2016) Regulation of immune cell function by short-chain fatty acids. Clin Transl Immunology 5(4):e73.

      (7) Hosmer J, McEwan AG, & Kappler U (2024) Bacterial acetate metabolism and its influence on human epithelia. Emerg Top Life Sci 8(1):1-13.

      (8) Carpenter CE & Broadbent JR (2009) External concentration of organic acid anions and pH: key independent variables for studying how organic acids inhibit growth of bacteria in mildly acidic foods. J Food Sci 74(1):R12-15.

      (9) Russell JB (1992) Another explanation for the toxicity of fermentation acids at low pH: anion accumulation versus uncoupling. Journal of Applied Bacteriology 73(5):363-370.

      (10) Russell JB & Diez-Gonzalez F (1998) The effects of fermentation acids on bacterial growth. Adv Microb Physiol 39:205-234.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The sample size of the in-house dataset used for training the model was relatively small (34 patients), which might limit the generalizability of the findings.

      (2) The authors did not perform functional experiments to directly validate the roles of the identified key genes in radiotherapy sensitivity, relying instead on associations with immune features and signaling pathways.

      (3) The study did not discuss the potential limitations of using machine learning algorithms, such as the risk of overfitting and the need for larger, diverse datasets for more robust model development and validation.

      (1) We are currently expanding the dataset by incorporating additional patient samples to enhance the model's robustness and generalizability. Furthermore, we applied statistical techniques, including cross-validation, during model development to mitigate the impact of the small sample size on our results. This limitation is now addressed in the discussion section of our manuscript.

      (2) Given the current resource limitations, our study predominantly employed bioinformatics analyses. We acknowledge the critical importance of experimental validation and are actively pursuing additional funding and collaborative opportunities to facilitate future experimental studies. Concurrently, we have enhanced the discussion section to comprehensively address the limitations of our approach and emphasize the necessity for future experimental validation.

      (3) We appreciate the reviewers' insightful comments regarding the potential limitations of machine learning algorithms, particularly the risk of overfitting. In response, we have incorporated a comprehensive discussion of these concerns, detailing the measures implemented to mitigate such risks, including the application of regularization techniques and the adoption of more rigorous cross-validation methodologies. We further acknowledge the necessity for larger and more diverse datasets to enhance model validity and generalizability, a concern we intend to address in our future research endeavors. The revised manuscript includes an expanded discussion on these critical points.
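
      For readers unfamiliar with the cross-validation strategy mentioned here, a minimal sketch of stratified k-fold splitting for a cohort of 34 patients is shown below. It does not reproduce the authors' pipeline, which is not described in this response. The function name, the choice of five folds, and the 20/14 class split are hypothetical and chosen only for illustration.

```python
import random


def stratified_k_folds(labels, k, seed=0):
    """Return k lists of test indices with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class
        # stopped, so that fold sizes stay balanced.
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


# Hypothetical cohort: 34 patients; the 20/14 class split is invented.
labels = ["sensitive"] * 20 + ["resistant"] * 14
folds = stratified_k_folds(labels, k=5)

splits = []
for test_indices in folds:
    held_out = set(test_indices)
    train_indices = [i for i in range(len(labels)) if i not in held_out]
    # A model would be fitted on train_indices and scored on test_indices here.
    splits.append((train_indices, test_indices))
```

      Each patient is held out exactly once, so every performance estimate comes from samples the model did not see during fitting. With so few patients, each test fold contains only six or seven samples, which is one reason estimates from small cohorts remain noisy even when cross-validated.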

      Here is the limitation section in the revised Manuscript:

      “This study primarily focuses on specific subtypes of nasopharyngeal carcinoma (NPC), potentially limiting its direct generalizability to other NPC subtypes or related head and neck malignancies. Furthermore, the limited sample size of our dataset may impact the model's generalizability and extrapolation capabilities. To mitigate the potential limitations associated with the small sample size, we employed advanced statistical methodologies, including cross-validation, to enhance the robustness and reliability of our findings. Nevertheless, we acknowledge the necessity for larger datasets and are actively collaborating with other research institutions to expand our sample size, thereby enhancing the robustness and broader applicability of our findings. Additionally, while our study utilizes bioinformatics approaches to identify and analyze key genes, we recognize that the absence of direct experimental functional validation represents a significant limitation. To address this limitation, we are actively pursuing additional funding and establishing collaborations with specialized laboratories to conduct crucial functional validation experiments, which will further elucidate the specific roles of these genes in radiotherapy response. Moreover, we acknowledge the potential risk of overfitting inherent in the application of machine learning algorithms to biomedical data analysis. To mitigate this risk, we implemented regularization techniques during model development and adopted a rigorous cross-validation strategy for model validation. These methodological approaches aim to ensure that our models maintain robust predictive performance on unseen data. Notwithstanding these limitations, our study offers novel insights into the molecular mechanisms underlying radiotherapy sensitivity in NPC and indicates promising avenues for future investigation. 
      Future research endeavors will prioritize expanding the dataset, conducting comprehensive experimental validation, and refining our predictive model to enhance its accuracy and clinical applicability.”

      Reviewer #2 (Public Review):

      (1) The study focuses on a specific type of nasopharyngeal carcinoma (NPC) and may not be generalizable to other subtypes or related head and neck cancers. The applicability of NPC-RSS to a broader range of patients and tumor types remains to be determined.

      (2) The study does not account for potential differences in radiotherapy protocols, doses, and techniques between the training and validation cohorts, which could influence the performance of the predictive model. Standardization of treatment parameters would be important for future validation studies.

      (3) The binary classification of patients into radiotherapy-sensitive and resistant groups may oversimplify the complex spectrum of treatment responses. A more granular stratification system that captures intermediate responses could provide more nuanced predictions and better guide personalized treatment decisions.

      (4) The study does not address the potential impact of other relevant factors, such as tumor stage, histological subtype, and concurrent chemotherapy, on the predictive performance of NPC-RSS. Incorporating these clinical variables into the model could enhance its accuracy and clinical utility.

      (1) We appreciate the reviewers' interest in the applicability of our study. This study specifically focuses on a particular subtype of nasopharyngeal carcinoma (NPC), which may limit its direct generalizability to other NPC subtypes or related head and neck malignancies. We have incorporated a detailed discussion of this limitation in the Discussion section and intend to investigate the applicability of NPC-RSS across a broader spectrum of tumor types and subtypes in subsequent studies.

      (2) We acknowledge the reviewers' emphasis on the significance of potential variations in radiotherapy regimens, doses, and techniques. In the current study, we did not sufficiently account for these factors, potentially impacting the model's generalizability and accuracy. We aim to improve data consistency and strengthen model validation by standardizing treatment parameters in future investigations.

      (3) We concur with the reviewers' assessment that binary categorization may oversimplify the intricate nature of treatment responses. Indeed, radiotherapy responses likely exist on a continuous spectrum. Consequently, we intend to develop more refined stratification systems to capture intermediate responses, thereby enhancing the accuracy of treatment outcome predictions and facilitating personalized treatment decisions.

      (4) We appreciate the reviewers' recommendation to incorporate clinical variables, including tumor stage, histological subtype, and concurrent chemotherapy, into the model. We acknowledge that these factors are crucial for enhancing the accuracy and clinical applicability of predictive models. We are presently compiling these additional data and intend to integrate these variables into subsequent model iterations.

      Reviewer #1 (Recommendations For The Authors):

      (1) The manuscript would benefit from a more comprehensive comparison of the NPC-RSS with existing prognostic models or biomarkers for nasopharyngeal carcinoma. This would help highlight the unique value and potential superiority of the NPC-RSS in predicting radiotherapy sensitivity.

      (2) The authors should consider expanding their discussion on the potential molecular mechanisms underlying the association between the key NPC-RSS genes and radiotherapy response. They could explore whether these genes have been previously implicated in radiotherapy resistance in other cancer types and discuss the potential functional roles of these genes in the context of nasopharyngeal carcinoma.

      (1) We appreciate your thorough review and valuable suggestions concerning our study. In response to the suggestion of comparing the Nasopharyngeal Carcinoma Radiotherapy Sensitivity Score (NPC-RSS) with existing prognostic models or biomarkers, we have carefully considered this proposal and determined that such a comparison is beyond the scope of our current study. The primary focus of our research is on the development and internal validation of the NPC-RSS model's accuracy and reliability. At present, we do not have access to the necessary external data to conduct a valid comparison, and the integration of such data extends beyond the parameters of this study. We intend to incorporate this comparative analysis in future studies to further validate the efficacy and explore the clinical application potential of the NPC-RSS model. We appreciate your understanding and continued support for our research endeavors.

      (2) In the revised manuscript, we have incorporated a comprehensive review of the functions of these key genes in various cancer types and explored their potential mechanisms of action in nasopharyngeal carcinoma (NPC). Through the citation of pertinent studies, we have elucidated the impact of these genes on radiotherapy sensitivity and resistance. Furthermore, we have proposed future research directions to elucidate the specific roles of these genes in the radiotherapy response of NPC.

      The following are new additions to the revised draft:

      “Previous studies have demonstrated that SMARCA2 significantly influences the radiotherapy response in non-small cell lung cancer (NSCLC). Depletion of SMARCA2 has been shown to enhance radiosensitivity, suggesting its potential as a therapeutic target for radiosensitization [30478150]. Additionally, the DMC1 gene has been incorporated into the radiosensitivity index (RSI) to evaluate radiotherapy sensitivity and prognosis, particularly in endometrial cancers. This inclusion provides valuable insights into the DNA damage repair process [38628740]. Studies on CD9 in glioblastoma multiforme (GBM) have revealed that post-radiotherapy increases in CD9 and CD81 levels in extracellular vesicles (EVs) are strongly correlated with the cytotoxic response to treatment. This finding suggests the potential of CD9 as a novel biomarker for monitoring radiotherapy efficacy [36203458]. In contrast, the association of PSG4 and KNG1 with radiotherapy resistance remains unexplored in the current literature.

      Future research should focus on analyzing the expression patterns of SMARCA2 in NPC patients and its correlation with radiotherapy efficacy using clinical samples. This analysis could elucidate its potential as a target for radiosensitization therapy. Investigating the correlation between DMC1 expression levels and radiotherapy sensitivity in NPC could potentially aid in predicting treatment efficacy and optimizing therapeutic regimens. Furthermore, analysis of extracellular vesicles, particularly those containing CD9, in post-radiotherapy NPC patients could assess their feasibility as biomarkers for monitoring treatment response. These proposed studies would not only contribute to a deeper understanding of the mechanisms underlying the role of these genes in NPC radiotherapy but could also potentially lead to the development of novel strategies for enhancing radiotherapy efficacy.”

      Minor Recommendations:

      (1) It is recommended that the author share the code for the article on Github or a similar open source platform.

      (2) The manuscript would benefit from a thorough review of the punctuation and sentence structure to improve readability and clarity.

      (1) You suggest sharing the code utilized in this study on GitHub or a comparable open-source platform to enhance the transparency and reproducibility of the research. I fully recognize the significance of this suggestion. However, due to the sensitivity of the data involved and the existing intellectual property agreement with my research team, we are unable to make the code publicly available at this time. We are actively seeking a method to safeguard the intellectual property of the project while also planning to share our tools and methodologies in the future. At this stage, we are open to collaborating with other researchers under appropriate frameworks and conditions to validate and replicate our findings by providing essential code execution snippets or assisting with data analysis.

      (2) Your suggestions are vital for enhancing the quality of the manuscript. I will perform a comprehensive linguistic and structural review of the manuscript to ensure that statements flow coherently and punctuation is employed correctly. We also intend to engage a professional scientific and technical writing editor to ensure that the manuscript adheres to the high standards required for academic publishing.

      Reviewer #2 (Recommendations For The Authors):

      (1) The manuscript would benefit from a more in-depth discussion of the potential clinical implications of the NPC-RSS. The authors should elaborate on how this score could be integrated into clinical decision-making and patient management.

      (2) The authors should consider including a section discussing the limitations of their study and potential areas for future research. This could include the need for prospective validation of the NPC-RSS in larger patient cohorts and the exploration of additional biological mechanisms.

      (1) We concur that a more comprehensive discussion regarding the application of the NPC-RSS in clinical decision-making would significantly enhance the practical value of this study. In the revised draft, we will include a section that elaborates on the integration of the NPC-RSS scoring system into daily clinical practice, detailing how it can assist physicians in developing individualized treatment plans and optimize patient management by predicting treatment responses.

      The following are new additions to the revised draft:

      “The incorporation of the NPC-RSS scoring system into clinical decision-making and patient management involves several key steps: first, establishing genetic testing as a standard component of nasopharyngeal cancer diagnosis and ensuring that physicians have prompt access to scoring results to guide treatment planning. Second, physicians should utilize the scoring results to tailor individualized treatment plans and engage in multidisciplinary discussions to optimize decision-making. Concurrently, physicians should elucidate the clinical significance of the scores and effectively communicate with patients to facilitate shared decision-making. Furthermore, continuous monitoring of the relationship between scoring and treatment outcomes, optimizing the scoring model based on empirical data, and ensuring the integration of technological platforms along with regulatory compliance are essential for safeguarding the effective operation of the scoring system and the protection of patient information.”

      (2) In light of the reviewers' valuable suggestions, we acknowledge the significance of prospective validation of the NPC-RSS scoring system in a broader patient population and the necessity for thorough exploration of the underlying biological mechanisms. Accordingly, we are incorporating a new section in the revised manuscript that elaborates on the limitations of the current study and outlines potential directions for future research. This encompasses plans to increase the sample size for validation and further investigations into the biological basis of the scoring system to enhance its predictive validity and clinical applicability. We believe that these additions will significantly enrich the depth and breadth of the study, thereby serving the scientific community and clinical practice more effectively.

      Minor Recommendations:

      (1) The authors should ensure that all abbreviations are defined at their first mention in the text.

      (2) The figure legends should be more descriptive and self-explanatory, allowing readers to understand the main findings without referring back to the main text.

      (1) You pointed out the need to define all acronyms at the first mention in the text and suggested that a comprehensive list of acronyms be included in the revised draft. We fully concur and have included a comprehensive list of acronyms in the revised text. Additionally, to enhance clarity, we have included the full name and definition of each acronym alongside its first occurrence in the text. This will assist readers in comprehending the study without the need to repeatedly refer to the glossary.

      (2) You recommended enhancing the descriptive quality of the figure legends to enable readers to discern the key findings from the figures without consulting the text. We have redesigned and refined all charts and legends to ensure they provide adequate information and are more descriptive. Each legend now outlines the experimental conditions, the variables employed, and the primary conclusions, ensuring that the charts themselves sufficiently convey the key findings of the study.

    2. eLife Assessment

      The authors have developed a robust machine learning approach to predict radiosensitivity in patients with NPC based on a defined gene signature. Some key aspects of this signature have been validated in vitro using relevant cell lines, which strengthens the conclusions of this important and convincing study. The publication will be of interest to clinicians working on this indication as well as a broader readership of scientists working on radiation biology and those with a bioinformatics/machine learning background.

    3. Reviewer #1 (Public review):

      Summary:

      In this study, the authors developed a novel radiotherapy sensitivity score (NPC-RSS) for nasopharyngeal carcinoma patients using machine learning algorithms. They identified 18 key genes associated with radiosensitivity and demonstrated that NPC-RSS could effectively predict radiotherapy response in both public and in-house datasets. Furthermore, they found that the key genes of NPC-RSS were closely related to immune characteristics, the expression of radiosensitivity-related genes, and signaling pathways involved in disease progression. The authors validated the consistency of expression of two key genes, SMARCA2 and CD9, with NPC-RSS in their own cell lines. They also showed that the radiosensitive group, classified by NPC-RSS, exhibited a more enriched and activated state of immune infiltration compared to the radioresistant group.

      Strengths:

      (1) The study employed a comprehensive approach by integrating multiple machine learning algorithms to develop a robust predictive model for radiotherapy sensitivity in nasopharyngeal carcinoma patients.

      (2) The predictive performance of NPC-RSS was validated using both public and in-house datasets, demonstrating its potential clinical applicability.

      (3) The authors conducted extensive analyses to investigate the biological mechanisms underlying the association between NPC-RSS and radiotherapy response, including immune characteristics, radiosensitivity-related gene expression, and relevant signaling pathways.

      (4) The consistency of key gene expression with NPC-RSS was validated in the authors' own cell lines, providing additional experimental evidence.

      Weaknesses:

      (1) The sample size of the in-house dataset used for training the model was relatively small (34 patients), which might limit the generalizability of the findings.

      (2) The authors did not perform functional experiments to directly validate the roles of the identified key genes in radiotherapy sensitivity, relying instead on associations with immune features and signaling pathways.

      (3) The study did not discuss the potential limitations of using machine learning algorithms, such as the risk of overfitting and the need for larger, diverse datasets for more robust model development and validation.

    4. Reviewer #2 (Public review):

      Summary:

      This article utilizes machine learning methods and transcriptomic data from nasopharyngeal carcinoma (NPC) patients to construct a biomarker called NPC-RSS that can predict the radiosensitivity of NPC patients. The authors further explore the biological mechanisms underlying the relationship between NPC-RSS and radiotherapy response in NPC patients. The main objective of this study is to guide the selection of radiotherapy strategies for NPC patients, thereby improving their clinical outcomes and prognosis.

      Strengths:

      (1) The combination of multiple machine learning algorithms and cross-validation was used to select the best predictive model for radiotherapy sensitivity from 71 differentially expressed genes, enhancing the robustness and reliability of the predictions.

      (2) Functional enrichment analysis revealed close associations between NPC-RSS key genes and immune characteristics, expression of radiotherapy sensitivity-related genes, and signaling pathways related to disease progression, providing a biological basis for NPC-RSS in predicting radiotherapy sensitivity.

      (3) Grouping NPC samples according to NPC-RSS showed that the radiotherapy-sensitive group exhibited a more enriched and activated state of immune infiltration compared to the radioresistant group. In single-cell samples, NPC-RSS was higher in the radiotherapy-sensitive group, with immune cells playing a dominant role. These results clarify the mechanism of NPC-RSS in predicting radiotherapy sensitivity from an immunological perspective.

      (4) The study used public datasets and in-house cohort data for validation, confirming the good predictive performance of NPC-RSS and increasing the credibility of the results.

      Limitations:

      (1) The study focuses on a specific type of nasopharyngeal carcinoma (NPC) and may not be generalizable to other subtypes or related head and neck cancers. The applicability of NPC-RSS to a broader range of patients and tumor types remains to be determined.

      (2) The study does not account for potential differences in radiotherapy protocols, doses, and techniques between the training and validation cohorts, which could influence the performance of the predictive model. Standardization of treatment parameters would be important for future validation studies.

      (3) The binary classification of patients into radiotherapy-sensitive and resistant groups may oversimplify the complex spectrum of treatment responses. A more granular stratification system that captures intermediate responses could provide more nuanced predictions and better guide personalized treatment decisions.

      (4) The study does not address the potential impact of other relevant factors, such as tumor stage, histological subtype, and concurrent chemotherapy, on the predictive performance of NPC-RSS. Incorporating these clinical variables into the model could enhance its accuracy and clinical utility.

    1. eLife Assessment

      This study represents a potentially useful tool for extracting quantitative data from intravital microscopy directed at in vivo cancer models. In general, this is an area of interest as accessible non-proprietary tools are needed and some evidence of the tool's utility is provided. However, the work in its current form is incomplete as it is heavily reliant on proprietary software to segment, track, and correct the data. In addition, there are significant reservations regarding the methods used to produce statistics in the software, limiting its applicability and the potential advance over other approaches.

    2. Reviewer #1 (Public review):

      Summary:

      Intravital microscopy (IVM) is a powerful tool that facilitates live imaging of individual cells over time in vivo in their native 3D tissue environment. Extracting and analysing multi-parametric data from IVM images however is challenging, particularly for researchers with limited programming and image analysis skills. In this work, Rios-Jimenez and Zomer et al have developed a 'zero-code' accessible computational framework (BEHAV3D-Tumour Profiler) designed to facilitate unbiased analysis of IVM data to investigate tumour cell dynamics (via the tool's central 'heterogeneity module') and their interactions with the tumour microenvironment (via the 'large-scale phenotyping' and 'small-scale phenotyping' modules). It is designed as an open-source modular Jupyter Notebook with a user-friendly graphical user interface and can be implemented with Google Colab, facilitating efficient, cloud-based computational analysis at no cost.

      To demonstrate the utility of BEHAV3D-TP, they apply the pipeline to timelapse IVM imaging datasets to investigate the in vivo migratory behaviour of fluorescently labelled DMG cells in tumour-bearing mice. Using the tool's 'heterogeneity module' they were able to identify distinct single-cell behavioural patterns (based on multiple parameters such as directionality, speed, displacement, and distance from tumour edge), which were used to group cells into distinct categories (e.g. retreating, invasive, static, erratic). They next applied the framework's 'large-scale phenotyping' and 'small-scale phenotyping' modules to investigate whether the tumour microenvironment (TME) may influence the distinct migratory behaviours identified. To achieve this, they combine TME visualisation in vivo during IVM (using fluorescent probes to label distinct TME components) or ex vivo after IVM (by large-scale imaging of harvested, immunostained tumours) to correlate different tumour behavioural patterns with the composition of the TME. They conclude that this tool has helped reveal links between TME composition (e.g. degree of vascularisation, presence of tumour-associated macrophages) and the invasiveness and directionality of tumour cells, which would have been challenging to identify when analysing single kinetic parameters in isolation.

      A key limitation of the pipeline is that it does not overcome the main challenges and bottlenecks associated with processing and extracting quantitative cellular data from timelapse and longitudinal intravital images. This includes correcting breathing-induced movement artifacts, automated registration of longitudinal images taken over days/weeks, and accurate, automated segmentation and tracking of individual cells over time. Indeed, there are currently no standardised computational methods available for IVM data processing and analysis, with most laboratories relying on custom-built solutions or manual methods. This isn't made explicit in the manuscript early on (described below), and the researchers rely on expensive software packages such as IMARIS for image processing and data extraction to feed the required parameters into their pipeline. This limitation unfortunately reduces the likely impact of BEHAV3D-TP on the IVM field.

      Nonetheless, this computational framework appears to represent a useful and comparatively user-friendly tool to analyse dynamic multi-parametric data to help identify patterns in cell migratory behaviours, and to assess whether these behaviours might be influenced by neighbouring cells and structures in their microenvironment. When combined with other methods, it therefore has the potential to be a valuable addition to a researcher's IVM analysis 'tool-box'.

      Strengths:

      (1) The figures are clearly presented, and the manuscript is easy to follow.

      (2) The pipeline appears to be intuitive and user-friendly for researchers with limited computational expertise. A detailed step-by-step video is also included to support its uptake.

      (3) The different computational modules have been tested using a relevant dataset.

      (4) All code is open source, and the pipeline can be implemented with Google Colab.

      (5) The tool combines multiple dynamic parameters extracted from time-lapse IVM images to identify single-cell behavioural patterns and to cluster cells into distinct groups sharing similar behaviours, and provides avenues to map these onto in vivo or ex vivo imaging data of the tumour microenvironment.

      Weaknesses:

      (1) As highlighted above, the tool does not facilitate the extraction of quantitative kinetic cellular parameters (e.g. speed, directionality, persistence, and displacement) from intravital images. Indeed, to use the tool researchers must first extract dynamic cellular parameters from their IVM datasets, requiring access to expensive software (e.g. IMARIS as used here) and/or above-average computational expertise to develop and use custom-made open-source solutions. This limitation is not made explicit or discussed in the text.

      (2) The number of cells (e.g. per behavioural cluster) and the number of independent mice represented in each result figure are not included in the figure legends and are difficult to ascertain from the methods.

      (3) The data used to test the pipeline in this manuscript is currently not available, making it difficult to assess its usability. It would be important to include this for researchers to use as a 'training dataset'.

      (4) Precisely how the BEHAV3D-TP large-scale phenotyping module can map large-scale spatial phenotyping data generated using LSR-3D imaging data and Cytomap to 3D intravital imaging movies is unclear. Further details in the text and methods would be beneficial to aid understanding.

      (5) The analysis provides only preliminary evidence in support of the authors' conclusions on DMG cell migratory behaviours and their relationship with components of the tumour microenvironment. Conclusions should therefore be tempered in the absence of additional experiments and controls.

    3. Reviewer #2 (Public review):

      Summary:

      The authors produce a new tool, BEHAV3D, to analyse tracking data and to integrate these analyses with large and small-scale architectural features of the tissue. This is similar to several other published methods to analyse spatiotemporal data; however, the connection to tissue features is a nice addition, as is the lack of requirement for coding. The tool is then used to analyse tracking data of tumour cells in diffuse midline glioma. They suggest that 7 clusters exist within these tracks and that they differ spatially. They ultimately suggest that these behaviours occur in distinct spatial areas as determined by CytoMAP.

      Strengths:

      (1) The tool appears relatively user-friendly and is open source. The combination with CytoMAP represents a nice option for researchers.

      (2) The identification of associations between cell track phenotype and spatial features is exciting and the diffuse midline glioma data nicely demonstrates how this could be used.

      Weaknesses:

      (1) The strength of democratizing this kind of analysis is undercut by the reliance upon Imaris for segmentation, so it would be nice if this was changed to an open-source option for track generation.

      (2) The main issue is with the interpretation of the biological data in Figure 3 where ANOVA was used to analyse the proportional distribution of different clusters. Firstly the n is not listed so it is unclear if this represents an n of 3 where each mouse is an individual or whether each track is being treated as a test unit. If the latter this is seriously flawed as these tracks can't be treated as independent. Also, a more appropriate test would be something like a Chi-squared test or Fisher's exact test. Also, no error bars are included on the stacked bar graphs making interpretation impossible. Ultimately this is severely flawed and also appears to show very small differences which may be statistically different but may not represent biologically important findings. This would need further study.

      (3) Figure 4 has similar statistical issues in that the n is not listed and, again, it is unclear whether they are treating each cell track as independent which, again, would be inappropriate. The best practice for this type of data would be the use of super plots as outlined in Lord et al. (2020) J Cell Biol - SuperPlots: Communicating reproducibility and variability in cell biology.

      (4) The main issue that this raises is that the large-scale phenotyping module and the heterogeneity module appear designed to produce these statistical analyses that are used in these figures and, if they are based on the assumption that each track is independent, then this will produce inappropriate analyses as a default.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Rios-Jimenez et al. developed a computational tool, BEHAV3D Tumor Profiler, to analyze intravital imaging data and extract distinctive tumor cell migratory phenotypes based on the quantified 3D image data.

      Weaknesses:

      (1) The most challenging task of analyzing 3D time-lapse imaging data is to accurately segment and track the individual cells in 3D over a long time duration. BEHAV3D Tumor Profiler did not provide any new advancement in this regard, and instead relies on commercial software, Imaris, for this critical step. Imaris is known to have a very high error rate when used for analyzing 3D time-lapse data. In the Methods section, the authors themselves stated that "Tumor cell tracks were manually corrected to ensure accurate tracking". Based on our own experience of using Imaris, such manual correction is tedious and often required for every time step of the movie. Therefore, Imaris is not a satisfactory tool for analyzing 3D time-lapse data. Moreover, Imaris is expensive and many research labs probably can't afford to buy it. The fact that BEHAV3D Tumor Profiler critically depends on the faulty ImarisTrack module makes it unclear whether the BEHAV3D tool or the results are reliable.

      (2) The authors developed a "Heterogeneity module" to extract distinctive tumor migratory phenotypes from the cell tracks quantified by Imaris. The cell tracks of the individual tumor cells are all quite short, indicating relatively low motility of the tumor cells. It's unclear whether such short migratory tracks are sufficient to warrant the PCA analysis to identify the 7 distinctive migratory phenotypes shown in Figure 2d. It's also unclear whether these 7 migratory phenotypes correspond to unique functional phenotypes.

      (3) Using only motility to classify tumor cell behaviours in the tumor microenvironment (TME) is probably not sufficient to capture the tumor cell difference. There are also other non-tumor cell types in the TME. If the authors aim to develop a computational tool that can elucidate tumor cell behaviors in the TME, they should consider other tumor cell features, e.g., morphology, proliferation state, and tumor cell interaction with other cell types, e.g., fibroblasts and distinct immune cells.

      (4) The authors have already published two papers on BEHAV3D [Alieva M et al. Nat Protoc. 2024 Jul;19(7): 2052-2084; Dekkers JF, et al. Nat Biotechnol. 2023 Jan;41(1):60-69]. Although the previous two papers used BEHAV3D to analyze T cells, the basic pipeline and computational steps are similar, in particular regarding cell segmentation and tracking. The addition of a "Heterogeneity module" based on PCA analysis does not make a significant advancement in terms of image analysis and quantification.

    5. Author response:

      We want to thank the reviewers for their positive and constructive comments on the manuscript. We have already addressed some of their concerns and are planning the following revisions to both BEHAV3D-TP and the corresponding manuscript to address the reviewers’ comments. Below, we provide a response to the most significant comments, followed by a detailed, point-by-point response:

      (1) We acknowledge the reviewer's suggestion to incorporate open-source segmentation and tracking functionalities, increasing its accessibility to a wider user base; however, these additions fall outside the primary scope of our current work and represent a substantial undertaking in their own right. This topic has been comprehensively explored in other studies (e.g. https://doi.org/10.4049/jimmunol.2100811 ; https://doi.org/10.7554/eLife.60547 ; https://doi.org/10.1016/j.media.2022.102358 ; https://doi.org/10.1038/s41592-024-02295-6), which we will cite in our revised manuscript as indicated in our responses to the reviewers’ comments. Instead, the goal of our manuscript is to provide an analytical framework for processing data generated by existing segmentation and tracking pipelines. In our analyses, we used data processed with Imaris, a commercial software that, despite its limitations, is widely used by the intravital microscopy community due to its user-friendly platform for 3D image visualization and analysis. Nevertheless, to enhance compatibility with tracking data from various pipelines, we have modified our tool to accept data formats, such as those generated by open-source Fiji plugins like TrackMate (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input ). These updates are available in our GitHub repository, and we will describe this feature in the revised manuscript to emphasize compatibility with segmented and tracked data from diverse open-source platforms.

      (2) We appreciate the reviewer’s suggestion to incorporate additional features into our analytical pipeline. In response, we have already updated the GitHub repository to allow users to input and select which features (dynamic, morphological, or spatial) they wish to include in the analysis (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#feature-selection ). In the revised manuscript, we will highlight this new functionality and provide examples using alternative datasets to demonstrate the application of these features.

      (3) We appreciate the constructive feedback of reviewers #1 and #2 regarding the statistical analysis and interpretation of the data presented in Figures 3 and 4. We understand the importance of clarity and rigor in data analysis and presentation, and we are committed to addressing the concerns raised in the revised version of the manuscript.

      (4) We appreciate Reviewer #1's suggestion regarding the inclusion of demo data, as we believe it would greatly enhance the usability of our pipeline. We acknowledge that this was an oversight on our part. To address this, we have now added demo data to our GitHub repository (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler/tree/BEHAV3D_TP-v2.0/demo_datasets). In the upcoming revised manuscript, we will also make sure to reference this addition. Additionally, we will provide both original and processed IVM movie samples to support users in navigating the complete pipeline effectively.

      (5) Finally, we agree with the reviewers and will make some small changes to the manuscript based on their feedback.

      Below we provide a point-by-point response to the reviewers’ comments, along with proposed revisions.

      Reviewer #1:

      Comment: A key limitation of the pipeline is that it does not overcome the main challenges and bottlenecks associated with processing and extracting quantitative cellular data from timelapse and longitudinal intravital images. This includes correcting breathing-induced movement artifacts, automated registration of longitudinal images taken over days/weeks, and accurate, automated segmentation and tracking of individual cells over time. Indeed, there are currently no standardised computational methods available for IVM data processing and analysis, with most laboratories relying on custom-built solutions or manual methods. This isn't made explicit in the manuscript early on (described below), and the researchers rely on expensive software packages such as IMARIS for image processing and data extraction to feed the required parameters into their pipeline. This limitation unfortunately reduces the likely impact of BEHAV3D-TP on the IVM field.

      As highlighted above, the tool does not facilitate the extraction of quantitative kinetic cellular parameters (e.g. speed, directionality, persistence, and displacement) from intravital images. Indeed, to use the tool researchers must first extract dynamic cellular parameters from their IVM datasets, requiring access to expensive software (e.g. IMARIS as used here) and/or above-average computational expertise to develop and use custom-made open-source solutions. This limitation is not made explicit or discussed in the text.

      As mentioned previously, we agree with the reviewer that image processing steps, such as segmentation, tracking, and motion correction, present significant challenges in intravital microscopy (IVM) data processing. While these aspects are being addressed by other researchers, our publication centers on the analysis of acquired data rather than on the image processing itself. Our motivation, as outlined in the manuscript, arises from our own experience: despite the substantial effort invested in image processing, researchers often rely on simplistic analytical approaches, such as averaging single parameters and comparing them across conditions. These approaches tend to overlook potential tumor heterogeneity.

      Our work aimed to develop an analytical tool that provides a comprehensive framework for extracting more insights from processed IVM data, with a focus on two key aspects: capturing the heterogeneity of tumor behavior and examining the spatial distribution of these behaviors within the tumor microenvironment. In the revised manuscript, we will clarify the scope of our study, emphasizing its limitations as an analytical tool rather than an image-processing solution. Additionally, we will provide references to relevant literature on available (open-source) software options for image processing (e.g. Pizzagalli et al., J Immunol (2022); Joseph et al., eLife (2020); Molina-Moreno et al., Medical Image Analysis (2022); Hidalgo-Cenalmor et al., Nat Methods (2024); Ershov et al., Nat Methods (2022)).

      Regarding the reviewer’s comment on our use of Imaris, we acknowledge that Imaris is a costly commercial software. However, based on our experience, it is widely used by the intravital microscopy community due to its user-friendly interface for 3D image visualization and analysis. Despite its limitations in accuracy and the fact that it is not open-source, we believe that including data processed with Imaris will be valuable to the IVM community.

      However, to improve compatibility with data from other segmentation and tracking pipelines, we have already updated our tool to support formats generated by open-source Fiji plugins like TrackMate. These updates are available in our GitHub repository, and we will describe this functionality in detail in the revised manuscript to ensure compatibility with segmented and tracked data from various open-source platforms.

      Comment: The number of cells (e.g. per behavioural cluster) and the number of independent mice represented in each result figure are not included in the figure legends and are difficult to ascertain from the methods.

      We appreciate the reviewer's constructive feedback regarding the clarity of the number and type of replicates used in our analyses. In the revised manuscript, we will include detailed information in the figure legends regarding the number of cells (e.g., per behavioral cluster) and the number of independent mice represented in each result figure to ensure transparency.

      Comment: The data used to test the pipeline in this manuscript is currently not available, making it difficult to assess its usability. It would be important to include this for researchers to use as a 'training dataset'.

      As stated above we acknowledge that this was an oversight on our part and thank the reviewer for pointing this out. To address this, we have now added demo data to our GitHub repository (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler/tree/BEHAV3D_TP-v2.0/demo_datasets). In the upcoming revised manuscript, we will also make sure to reference this addition. Additionally, we intend to provide both original and processed IVM movie samples to support users in navigating the complete pipeline effectively.

      Comment: Precisely how the BEHAV3D-TP large-scale phenotyping module can map large-scale spatial phenotyping data generated using LSR-3D imaging data and Cytomap to 3D intravital imaging movies is unclear. Further details in the text and methods would be beneficial to aid understanding.

      We appreciate the reviewer’s comment and will provide additional details in the text and methods of the revised manuscript to clarify how the BEHAV3D-TP module maps LSR-3D and Cytomap data to 3D intravital imaging movies.

      Comment: The analysis provides only preliminary evidence in support of the authors' conclusions on DMG cell migratory behaviours and their relationship with components of the tumour microenvironment. Conclusions should therefore be tempered in the absence of additional experiments and controls.

      We appreciate the reviewer’s comment and acknowledge that our conclusions should be tempered due to the preliminary nature of our evidence. To be able to directly analyze the impact of the brain tumor microenvironment on cancer cell behavior, we will include a new set of analyses in the revised manuscript. Specifically, we will utilize BEHAV3D-TP to analyze existing IVM data from adult gliomas with and without macrophage depletion (Alieva et al, Scientific Reports, 2017; https://doi.org/10.1038/s41598-017-07660-4 ) to evaluate the differences in heterogeneous cell populations under these conditions. Since this analysis pertains to a different tumor type, we will revise our conclusions accordingly and emphasize the necessity for additional experiments and controls to further validate our findings on DMG cell migratory behaviors and their relationship with the tumor microenvironment.

      Reviewer #2:

      Comment: The strength of democratizing this kind of analysis is undercut by the reliance upon Imaris for segmentation, so it would be nice if this was changed to an open-source option for track generation.

      As noted in our previous response to Reviewer #1, we would like to point out that although Imaris is a commercial software, it is widely used in the intravital microscopy (IVM) community due to its user-friendly interface. One of its key advantages, which we also utilized, is semi-automated data tracking that allows for manual corrections in 3D—a process that can be more challenging in other open-source software with less effective data visualization.

      However, we recognize that enhancing our pipeline's compatibility with open-source options is important. To this end, we have already updated our tool to support data formats generated by open-source Fiji plugins like TrackMate, improving compatibility with various segmentation and tracking pipelines (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input ). We will describe these updates in the revised manuscript to clarify our study's scope and the available image processing options.

      Comment: The main issue is with the interpretation of the biological data in Figure 3 where ANOVA was used to analyse the proportional distribution of different clusters. Firstly the n is not listed so it is unclear if this represents an n of 3 where each mouse is an individual or whether each track is being treated as a test unit. If the latter this is seriously flawed as these tracks can't be treated as independent. Also, a more appropriate test would be something like a Chi-squared test or Fisher's exact test. Also, no error bars are included on the stacked bar graphs making interpretation impossible. Ultimately this is severely flawed and also appears to show very small differences which may be statistically different but may not represent biologically important findings. This would need further study.

      We appreciate the reviewer’s insightful comments regarding the interpretation of the biological data in Figure 3. To clarify, each mouse serves as an independent unit in this analysis. We believe that ANOVA is the appropriate test for comparing the proportions of different behavioral signatures across the tumor microenvironment (TME) regions identified by large-scale phenotyping. However, we acknowledge that using a stacked bar plot may have been misleading. While a Chi-squared test could show differences in the distribution of behavioral signatures, it would not indicate which specific signatures are responsible for those differences. Therefore, in the revised manuscript, we will retain the ANOVA analysis but will represent the proportions using a bar chart that clearly illustrates multiple conditions for each behavioral cluster. We also appreciate the reviewer’s concern regarding the transparency of our data. In the revised manuscript, we will include the number of replicates for all figures to enhance clarity and understanding.
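      The kind of test discussed here, a one-way ANOVA on per-mouse proportions, can be illustrated with a minimal sketch. It is not taken from BEHAV3D-TP: the function name `one_way_anova_f`, the three-region layout, and the numbers are assumptions invented for illustration, and the sketch treats the groups as independent, which may differ from the ANOVA variant the authors actually used.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values.

    Each inner list holds one value per mouse (hypothetical layout), so the
    degrees of freedom reflect the number of mice, not the number of tracks.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group means versus the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: each value versus its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Invented per-mouse proportions of one behavioural cluster in three TME regions
# (three mice per region).
proportions_by_region = [
    [0.10, 0.20, 0.30],
    [0.20, 0.30, 0.40],
    [0.30, 0.40, 0.50],
]
f_stat = one_way_anova_f(proportions_by_region)
```

      Because each list entry is one mouse, the denominator degrees of freedom here are 9 - 3 = 6; pooling individual tracks instead would inflate them and overstate significance, which is the concern the reviewer raises.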

      Comment: Figure 4 has similar statistical issues in that the n is not listed and, again, it is unclear whether they are treating each cell track as independent which, again, would be inappropriate. The best practice for this type of data would be the use of super plots as outlined in Lord et al. (2020) J Cell Biol - SuperPlots: Communicating reproducibility and variability in cell biology.

      We appreciate the reviewer’s comments and suggestions regarding Figure 4. In the revised manuscript, we will clarify the number of replicates used and our approach to treating cell tracks as independent units. We will implement SuperPlots where appropriate to enhance the communication of reproducibility and variability in our data.

      Comment: The main issue that this raises is that the large-scale phenotyping module and the heterogeneity module appear designed to produce these statistical analyses that are used in these figures and, if they are based on the assumption that each track is independent, then this will produce inappropriate analyses as a default.

      We appreciate the reviewer’s comment, though we find ourselves unsure about the specific concern being raised. To clarify, each mouse is treated as an independent unit in our analyses. For each large-scale phenotyping region, we measure the proportion of tumor cells displaying a specific behavioral phenotype independently for each mouse. These proportions are then used for statistical analysis. We hope this explanation provides clarity, and we will adjust the manuscript to better convey this methodology.
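      The per-mouse aggregation described above can be sketched as follows. This is a minimal illustration, not the BEHAV3D-TP code: the record layout (mouse, region, cluster), the function name `per_mouse_proportions`, and the data are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def per_mouse_proportions(tracks):
    """Return {(mouse, region): {cluster: proportion}} from track-level records.

    `tracks` is an iterable of (mouse, region, cluster) tuples, one per cell
    track. This layout is hypothetical and chosen only for this sketch.
    """
    counts = defaultdict(Counter)
    for mouse, region, cluster in tracks:
        counts[(mouse, region)][cluster] += 1
    proportions = {}
    for key, cluster_counts in counts.items():
        total = sum(cluster_counts.values())
        proportions[key] = {c: n / total for c, n in cluster_counts.items()}
    return proportions

# Invented example: two mice, one region, two behavioural clusters.
tracks = [
    ("m1", "TME_A", "invasive"), ("m1", "TME_A", "static"),
    ("m1", "TME_A", "static"), ("m1", "TME_A", "static"),
    ("m2", "TME_A", "invasive"), ("m2", "TME_A", "static"),
]
props = per_mouse_proportions(tracks)
```

      Each mouse then contributes a single value per region and cluster, so the n in any downstream test is the number of mice rather than the number of tracks.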

      Reviewer #3:

      Comment: The most challenging task of analyzing 3D time-lapse imaging data is to accurately segment and track the individual cells in 3D over a long time duration. BEHAV3D Tumor Profiler did not provide any new advancement in this regard, and instead relies on commercial software, Imaris, for this critical step. Imaris is known to have a very high error rate when used for analyzing 3D time-lapse data. In the Methods section, the authors themselves stated that "Tumor cell tracks were manually corrected to ensure accurate tracking". Based on our own experience of using Imaris, such manual correction is tedious and often required for every time step of the movie. Therefore, Imaris is not a satisfactory tool for analyzing 3D time-lapse data. Moreover, Imaris is expensive and many research labs probably can't afford to buy it. The fact that BEHAV3D Tumor Profiler critically depends on the faulty ImarisTrack module makes it unclear whether the BEHAV3D tool or the results are reliable.

      If the authors want to "democratize the analysis of heterogeneous cancer cell behaviors", they should perform image segmentation and tracking using open-source codes (e.g., Cellpose, StarDist & 3DCellTracker) and not rely on the expensive and inaccurate ImarisTrack Module for the image analysis step of BEHAV3D.

      We appreciate the reviewer’s comments on the challenges of segmenting and tracking individual cells in 3D time-lapse imaging data. As mentioned previously, our primary focus is to develop an analytical tool for comprehensive data analysis rather than developing tools for image processing. To enhance accessibility, we have updated our tool to support data formats from open-source Fiji plugins, such as TrackMate, which will benefit users without access to commercial software (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input).

      While we recognize the limitations of Imaris, it remains widely used in the intravital microscopy community due to its user-friendly interface for 3D visualization and semi-automated segmentation capabilities. Since no perfect tracking method currently exists, we utilized Imaris for its ability to allow manual corrections of faulty tracks, ensuring the reliability of our results. This approach was the best available option when we began our analysis, allowing us to obtain accurate results efficiently.

      In the revised manuscript, we will clarify our methodology and provide information on both Imaris and alternative processing options to strengthen the reliability of our findings.

      Comment: The authors developed a "Heterogeneity module" to extract distinctive tumor migratory phenotypes from the cell tracks quantified by Imaris. The cell tracks of the individual tumor cells are all quite short, indicating relatively low motility of the tumor cells. It's unclear whether such short migratory tracks are sufficient to warrant the PCA analysis to identify the 7 distinctive migratory phenotypes shown in Figure 2d. It's also unclear whether these 7 migratory phenotypes correspond to unique functional phenotypes.

      For the 7 distinctive motility clusters, the authors should provide a more detailed analysis of the differences between them. It's unclear whether the difference in retreating, slow retreating, erratic, static, slow, slow invading, and invading correspond to functional difference of the tumor cells.

      While some tumor cells exhibit limited motility, indicated by short tracks, others demonstrate significant migratory capabilities. This variability in tumor cell behavior is a central focus of our analysis, and our tool is specifically designed to identify and distinguish these differences. Our PCA analysis effectively captures this variability, as illustrated in Figure 2 d-f. It differentiates between cells exhibiting varying degrees of migratory behavior, including both highly migratory and less migratory phenotypes, as well as their directionality relative to the tumor core and the persistence of their movements. Thus, we believe that our approach provides valuable insights into the distinct migratory phenotypes within the tumor microenvironment. We will clarify these aspects further in the revised manuscript to enhance the reader's understanding of our findings.
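
      As a rough illustration of the kind of per-track descriptors mentioned above (degree of migration, persistence, and direction relative to the tumor core), the sketch below derives three features from a list of 3D positions. The function name, the feature definitions, and the sign convention are assumptions made for this example and are not taken from BEHAV3D Tumor Profiler.

```python
import math


def track_features(points, core, dt=1.0):
    """Summarize one cell track.

    points: (x, y, z) positions at consecutive time steps (at least two).
    core:   (x, y, z) reference position of the tumor core (hypothetical input).
    dt:     time between consecutive positions.
    """
    steps = [math.dist(a, b) for a, b in zip(points, points[1:])]
    path_length = sum(steps)
    net_displacement = math.dist(points[0], points[-1])
    return {
        # Average distance covered per unit time.
        "mean_speed": path_length / (dt * len(steps)),
        # 1.0 for a straight path, near 0 for back-and-forth movement.
        "persistence": net_displacement / path_length if path_length else 0.0,
        # Positive: ended farther from the core than it started; negative: closer.
        "radial_change": math.dist(points[-1], core) - math.dist(points[0], core),
    }


core = (0.0, 0.0, 0.0)
outward = track_features([(1, 0, 0), (2, 0, 0), (3, 0, 0)], core)
oscillating = track_features([(1, 0, 0), (2, 0, 0), (1, 0, 0)], core)
print(outward)
print(oscillating)
```

      Both example tracks have the same mean speed, yet they differ in persistence and radial change, which is why speed alone would not separate an invading-like track from an erratic one.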

      While our current manuscript does not provide explicit evidence linking each motility cluster to functional differences among the tumor cells, it is important to note that the state of the field supports the idea that cell dynamics can predict cell states and phenotypes. Research conducted by ourselves (Dekkers, Alieva et al., Nat Biotech, 2023) and others, such as Crainiciuc et al. (Nature, 2022) and Freckmann et al. (Nat Comm, 2022), has shown that variations in cell motility patterns are indicative of underlying functional characteristics. For instance, cell morphodynamic features have been shown to reflect differences in cell types, T cell targeting states, tumor metastatic potential, and drug resistance states. In the revised manuscript, we will reference relevant studies to underscore the biological significance of these behaviors. By doing so, we hope to clarify the potential implications of our findings and strengthen the overall narrative of our research.

      Comment: Using only motility to classify tumor cell behaviours in the tumor microenvironment (TME) is probably not sufficient to capture the tumor cell difference. There are also other non-tumor cell types in the TME. If the authors aim to develop a computational tool that can elucidate tumor cell behaviors in the TME, they should consider other tumor cell features, e.g., morphology, proliferation state, and tumor cell interaction with other cell types, e.g., fibroblasts and distinct immune cells.

      The authors should expand the scale of tumor behavior features to classify the tumor phenotype clusters, e.g., to include tumor morphology, proliferation state, and tumor cell interaction with other TME cell types.

      We believe that using dynamic features alone is sufficient to capture differences in tumor behavior, as demonstrated by our results in Figure 2. However, we appreciate the reviewer’s suggestion to consider additional features, such as cell morphology and interactions with other cell types, to fine-tune our analyses. To this end, we have adapted our pipeline to be compatible with various features present in the data (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler/tree/BEHAV3D_TP-v2.0?tab=readme-ov-file#feature-selection). We will emphasize this in the revised manuscript. However, we would like to point out that not all features may provide informative insights and that a wide range of features can instead introduce biologically irrelevant noise, making interpretation more challenging. For instance, in 3D microscopy, the z-axis resolution is typically lower, which can lead to artifacts like elongation in that direction. Adding morphological features that capture this may skew the analysis. Therefore, we believe that incorporating additional features should be approached with caution. We will clarify these considerations in the revised manuscript to better guide users in utilizing our computational tool effectively. We will also reference the use of unbiased feature selection techniques, such as bootstrapping methods, to identify biologically relevant features based on the conditions provided (D.G. Aragones et al, Computers in Biology and Medicine (2024)).

      Comment: The authors have already published two papers on BEHAV3D [Alieva M et al. Nat Protoc. 2024 Jul;19(7): 2052-2084; Dekkers JF, et al. Nat Biotechnol. 2023 Jan;41(1):60-69]. Although the previous two papers used BEHAV3D to analyze T cells, the basic pipeline and computational steps are similar, in particular regarding cell segmentation and tracking. The addition of a "Heterogeneity module" based on PCA analysis does not make a significant advancement in terms of image analysis and quantification.

      We want to emphasize that we have no intention of duplicating our previous publications. In this manuscript, we have consistently cited our foundational papers, where BEHAV3D was first developed for T cell migratory analysis in in vitro settings. In the introduction, we clearly state that our earlier work inspired us to adopt a similar approach for analyzing cell behavior in intravital microscopy (IVM) data, addressing the specific needs and complexities of analyzing tumor cell behaviors in the tumor microenvironment.

      Importantly, our new work provides several key advancements: 1) a pipeline specifically adapted for intravital microscopy (IVM) data; 2) integration of spatial characteristics from both large-scale and small-scale phenotyping; and 3) a zero-code approach designed to empower researchers without coding skills to effectively utilize the tool. We believe that these enhancements represent meaningful progress in the analysis of cell behaviors within the tumor microenvironment which will be valuable for the IVM community. We will ensure that these points are clearly articulated in the revised manuscript.

    1. eLife Assessment

      The study identifies the adhesion G-protein-coupled receptor A3 (ADGRA3) as a potential target for activating adaptive thermogenesis in white and brown adipose tissue, providing valuable information for scientists in the field of adipose tissue biology and metabolism. Although the authors have addressed some concerns raised by reviewers, the interpretations remain somewhat limited, and the work is deemed incomplete. The evidence supporting ADGRA3's role in thermogenesis is insufficient, necessitating more rigorous experiments to validate the receptor's relevance in adipose tissue. Additionally, the lack of experiments using primary cultures, despite feedback from multiple reviewers, highlights significant shortcomings.

    2. Reviewer #1 (Public review):

      Summary:

      This article identifies ADGRA3 as a candidate GPCR for mediating beige fat development. The authors use human expression data from the Human Protein Atlas and GTEx databases and combine this with experiments performed in mice and a murine cell line. They refer to a GPCR bioactivity screening tool, PRESTO-Salsa, with which it was found that Hesperetin activates ADGRA3. From their experiments, the authors conclude that Hesperetin activates ADGRA3, inducing a Gs-PKA-CREB axis resulting in adipose thermogenesis.

      Strengths:

      The authors analyze human data from public databases and perform functional studies in mouse models. They identify a new GPCR with a role in thermogenic activation of adipocytes.

      Considerations:

      Selection of ADGRA3 as a candidate GPCR relevant for mediating beiging in humans:

      The authors identify GPCRs that are expressed more highly in murine iBAT compared to iWAT in response to cold and assess which of these GPCRs are expressed in human subcutaneous or visceral adipocytes. Although this strategy will identify GPCRs that are expressed at higher levels in brown fat compared to beige and thus possibly more active in thermogenic function, the relevance in choosing GPCRs that also are expressed in unstimulated human white adipocytes should be considered. Thermogenic activity is not normally present in human white adipocytes. It would have strengthened the GPCR selection if the authors instead had assessed the intersection with human brown adipocytes that were activated with norepinephrine.

      Strategy to investigate the role of ADGRA3 in WAT beiging:

      Having identified ADGRA3 as their candidate receptor, the authors investigated the receptor in mouse models, the murine inguinal adipocyte cell line 3T3 and in human subcutaneous adipose progenitors (HAdsc) differentiated in vitro. Calling the human cells "beige" is a stretch as these cells are derived from a white adipose depot. The authors do observe regulation in UCP1 and abundance of mitochondria following modification of ADGRA3 in the cells. However, in future studies, it should be considered if the receptor rather plays a role in differentiation per se, and perhaps not specifically in thermogenic differentiation/activity.

      According to the Human Protein Atlas and GTEx databases, ADGRA3 is not only expressed in adipocytes, but also in other tissues and cell types. The authors address this by measuring the expression in a panel of these tissues, demonstrating a knockdown not only in the adipose tissue, but also in the liver and less pronounced in the muscle (Figure S2). It should thus be emphasized that the decreased TG levels in serum and liver in the mice might in fact depend on Adgra3 overexpression in the liver. Even though this might not have been the purpose of the experiment, it is important to highlight this as it could serve as hypothesis building for future studies of the function of this receptor.

    3. Reviewer #2 (Public review):

      Based on bioinformatics and expression analysis using mouse and human samples, the authors claim that the adhesion G-protein coupled receptor ADGRA3 may be a valuable target for increasing thermogenic activity and metabolic health. Genetic approaches to deplete ADGRA3 expression in vitro resulted in reduced expression of thermogenic genes including Ucp1, reduced basal respiration and metabolic activity as reflected by reduced glucose uptake and triglyceride accumulation. In line, nanoparticle delivery of shAdgra3 constructs is associated with increased body weight, reduced thermogenic gene expression in white and brown adipose tissue (WAT, BAT), and impaired glucose and insulin tolerance. On the other hand, ADGRA3 overexpression is associated with an improved metabolic profile in vitro and in vivo, which can be explained by increasing the activity of the well-established Gs-PKA-CREB axis. Notably, a computational screen suggested that ADGRA3 is activated by hesperetin. This metabolite is a derivative of the major citrus flavonoid hesperidin and has been described to promote metabolic health. Using appropriate in vitro and in vivo studies, the authors show that hesperetin supplementation is associated with increased thermogenesis, UCP1 levels in WAT and BAT, and improved glucose tolerance, an effect that was attenuated in the absence of ADGRA3 expression.

      Comments on revised version:

      In my opinion, the critical points I raised were not adequately addressed, neither in the revision nor in the response to the reviewer. Therefore, my initial assessment has not changed: the main claims are only partially supported by the data presented.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript by Zhao et al. explored the function of adhesion G protein-coupled receptor A3 (ADGRA3) in thermogenic fat biology.

      Strengths:

      Through both in vivo and in vitro studies, the authors found that gain of function of ADGRA3 leads to browning of white fat and ameliorates insulin resistance.

      Comments on revised version:

      The revised manuscript by Zhao et al. has limited improvement. The authors refused to perform revised experiments using primary cultures even though two reviewers pointed out the same weakness (3T3-L1 adipocytes are unsuitable). Using infrared thermography to measure body temperature is also problematic.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This article identifies ADGRA3 as a candidate GPCR for mediating beige fat development. The authors use human expression data from the Human Protein Atlas and GTEx databases and combine this with experiments performed in mice and a murine cell line. They refer to a GPCR bioactivity screening tool, PRESTO-Salsa, with which it was found that Hesperetin activates ADGRA3. From their experiments, the authors conclude that Hesperetin activates ADGRA3, inducing a Gs-PKA-CREB axis resulting in adipose thermogenesis.

      Strengths:

      The authors analyze human data from public databases and perform functional studies in mouse models. They identify a new GPCR with a role in the thermogenic activation of adipocytes.

      Weaknesses:

      (1) Selection of ADGRA3 as a candidate GPCR relevant for mediating beiging in humans:

      The authors identify genes upregulated in iBAT compared to iWAT in response to cold, and among these differentially expressed genes, they identify highly expressed GPCRs in human white adipocytes (visceral or subcutaneous). Finally, among these genes, they select a GPCR not previously studied in the literature.

      If the authors are interested in beiging, why do they not focus on genes upregulated in iWAT (the depot where beiging is described to occur in mice), comparing thermoneutral to cold-induced genes? I would expect that genes induced in iWAT in response to cold would be extremely relevant targets for beiging. With their strategy, the authors exclude receptors that are induced in the tissue where beiging is actually described to occur.

      Furthermore, the authors are comparing genes upregulated in cold in BAT (but not WAT) to highly expressed genes in human white adipocytes during thermoneutrality. Overall, the authors fail to discuss the logic behind their strategy and the obvious limitations of it.

      Thanks for your valuable advice. In this study, we focus on genes that exhibited higher expression in BAT compared to iWAT under cold stimulation conditions, as these genes might play a role in adipose thermogenesis. Regarding the genes that are upregulated in iWAT following cold stimulation, we have identified other intriguing targets among them in a separate ongoing study, but they fall outside the scope of this work. Moreover, rather than making a comparison, we intersected the 27 GPCR-coding genes that were more highly expressed in BAT than in iWAT with genes that were highly expressed in human adipocytes (Figure 1C).

      With your suggestions, we realized that the description of the screening strategy in the manuscript was not clear enough, so we made the following supplement:

      “…dataset obtained from the Gene Expression Omnibus (GEO) database. Additionally, we utilized the human subcutaneous adipocytes dataset (Figure 1C, red) and human visceral adipocytes dataset (Figure 1C, purple) from the human protein atlas database to obtain genes that are highly expressed in human white adipocytes. The GSE118849 dataset comprises samples of brown adipose tissue (BAT) and inguinal white adipose tissue (iWAT) obtained from mice subjected to a 72-hour cold exposure at a temperature of 4℃.

      A total of 1134 differentially expressed genes (DEGs) that exhibited up-regulation in BAT compared to iWAT under cold stimulation were identified in the analysis, which might play a role in adipose thermogenesis. These DEGs were further screened to identify highly…”
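
      The intersection step described in this screening strategy can be illustrated with plain set operations. This is only a sketch under stated assumptions: the gene names are placeholders rather than real results, matching mouse and human symbols by upper-casing is a crude stand-in for proper ortholog mapping, and combining the subcutaneous and visceral lists as a union follows the reviewer's wording ("visceral or subcutaneous") rather than a confirmed detail of Figure 1C.

```python
def screen_candidates(mouse_bat_up_degs, gpcr_genes, human_subcutaneous, human_visceral):
    """Return GPCR genes up in mouse BAT that are also highly expressed in human white adipocytes."""
    # Crude ortholog matching: compare symbols case-insensitively (assumption).
    mouse_gpcrs = {g.upper() for g in mouse_bat_up_degs} & {g.upper() for g in gpcr_genes}
    # Highly expressed in either human depot (assumed union of the two lists).
    human_high = {g.upper() for g in human_subcutaneous} | {g.upper() for g in human_visceral}
    return sorted(mouse_gpcrs & human_high)


# Placeholder inputs; real lists would come from GSE118849 and the Human Protein Atlas.
mouse_bat_up_degs = {"GeneA", "GeneB", "GeneC", "GeneD"}
gpcr_genes = {"GeneA", "GeneB", "GeneE"}
human_subcutaneous = {"GENEA", "GENEF"}
human_visceral = {"GENEC"}

candidates = screen_candidates(mouse_bat_up_degs, gpcr_genes, human_subcutaneous, human_visceral)
print(candidates)
```

      In this toy input, GeneB is dropped because it is not highly expressed in either human depot, and GeneC is dropped because it is not a GPCR.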

      (2) Relevance of ADGRA3 and comparison to established literature:

      There has been a lot of literature and discussion about which receptor should be targeted in humans to recruit thermogenic fat. The current article unfortunately does not discuss this literature nor explain how it relates to their findings. For example, O'Mara et al (PMID: 31961826) demonstrated that chronic stimulation with the B3 adrenergic agonist, Mirabegron, resulted in the recruitment of thermogenic fat and improvement in insulin sensitivity and cholesterol. Later, Blondin et al (PMID: 32755608), highlighted the B2 adrenergic receptor as the main activation path of thermogenic fat in humans. There is also a recent report on an agonist activating B2 and B3 simultaneously (PMID: 38796310). Thus, to bring the literature forward, it would be beneficial if the current manuscript compared their identified activation path with the activation of these already established receptors and discussed their findings in relation to previous studies.

      Thank you for your suggestion. We have included a supplementary discussion on the relevant human adipose thermogenic receptors in the discussion section, as presented below:

      “The induction of beige fat has been investigated as a potentially effective therapeutic approach in combating obesity [23]. A clinical trial revealed that treatment with the chronic β3-AR agonist mirabegron leads to an increase in human brown fat, HDL cholesterol, and insulin sensitivity [24]. Subsequently, Blondin et al discovered that oral administration of mirabegron only elicits an increase in BAT thermogenesis when administered at the maximal allowable dose, indicating that human brown adipocyte thermogenesis is primarily driven by β2-adrenoceptor (β2-AR) stimulation [11]. Consistent with this finding, we found much higher levels of ADRB2 expression in human white adipose tissue than ADRB3 (Figure S1E). Furthermore, a recent study has demonstrated that simultaneous activation of β2-AR and β3-AR enhances whole-body metabolism through beneficial effects on skeletal muscle and BAT [25].”

      In Figures 1d and e, the authors show the expression of ADGRA3 in comparison to the expression of ADRB3. In human brown adipocytes, ADRB2 has been shown to be the main receptor through which adrenergic activation occurs (PMID: 32755608), thus authors should show the relative expression of this gene as well.

      We wholeheartedly endorse the proposal to augment the ADRB2 expression data in Figures 1D and E. However, it is regrettable to note that the pertinent databases (PRJNA66167 and PRJEB4337) are deficient in ADRB2 expression information. Fortunately, the GTEx database houses the ADRB2 expression data. Consequently, we have integrated these crucial data into Figure S1E.

      (3) Strategy to investigate the role of ADGRA3 in WAT beiging:

      Having identified ADGRA3 as their candidate receptor, the authors proceed with investigations of this receptor in mouse models and the murine inguinal adipocyte cell line 3T3.

      First of all, in Figure 1D, the authors show a substantially lower expression of ADGRA3 compared to ADRB3. It could thus be argued that a mouse would not be the best model system for studying this receptor. It would be interesting to see data from experiments in human adipocytes.

      Thanks for your helpful advice. We induced human adipose-derived mesenchymal stem cells (hADSCs) into adipocytes to evaluate the effect of ADGRA3 on human adipocytes (Figure 8).

      Moreover, if the authors are interested in inducing beiging, why do they show expression in iBAT and not iWAT?

      Maybe the description of this article wasn't clear enough, but we did show the expression and effects of ADGRA3 in iWAT and BAT (Author response image 1, Figure 3F-J and Figure 4F-J).

      Author response image 1.

      The authors perform in vivo experiments using intraperitoneal injections of shRNA or overexpression CMV-driven vectors and report effects on body temperature and glucose metabolism. It is here important to note that ADGRA3 is not uniquely expressed in adipocytes. A major advantage of databases like the Human Protein Atlas and GTEx is that they give an overview of the gene expression across tissues and cell types. When looking up ADGRA3 in these databases, it is expressed in subcutaneous and visceral adipocytes. However, other cell types and tissues demonstrate an even higher expression. In the Human Protein Atlas, the enhanced cell types are astrocytes and hepatocytes. In the GTEx database, the tissues with the highest expression are Brain, Liver, and Thyroid.

      With this information in mind, IP injections for modification of ADGRA3 receptor expression could be expected to affect any of these tissues and cells.

      The manuscript reports changes in body temperature. However, temperature is regulated by the brain and also affected by thyroid activity. Did the authors measure the levels of circulating thyroid hormones? Gene expression changes in the brain? The authors report that Adgra3 overexpression decreased the TG level in serum and liver. The liver could be the primary targeted organ here, and the adipose effects might be secondary. The data would be easier to interpret if authors reported the effects on the liver, thyroid, and brain, and the gene expression across tissues should be discussed in the article.

      Thank you for your valuable advice. We supplemented the results of the effect of local BAT injection of Adgra3 OE on thermogenic genes (Figures S5G-H), the levels of circulating thyroid hormones (Figures S2H, S4F and S5B) and the effects of Adgra3 overexpression/knockdown on Adgra3 expression levels (Figures S2A-B and S4B-C) in multiple tissues as well as discussed in the article, as follows:

      “Given the consideration that the non-targeted nanoparticle approach utilized in this study for modulating Adgra3 expression levels in vivo alters Adgra3 expression in tissues beyond adipose tissue (Figures S2A-B and S4B-C), notably the liver and skeletal muscle, the construction of Adgra3 adipose tissue-specific knockout/overexpression mouse models is imperative for a more nuanced understanding of the precise mechanisms underlying the influence of ADGRA3 on adipose thermogenesis. We will employ more sophisticated models in subsequent studies to further elucidate the effects of ADGRA3 on adipose thermogenesis and metabolic homeostasis. Nevertheless, our findings underlie a potential therapeutic feature of…”

      Finally, the identification of Hesperetin using the PRESTO-Salsa tool, and how specific the effect of Hesperetin is on ADGRA3, is currently unclear. This should be better discussed, and authors should consider measuring the established effects of Hesperetin in their model systems, including apoptosis.

      Thanks for your suggestion. We have further discussed the relevant content and added it in the discussion section as follows:

      “The influence of hesperetin on ADGRA3 has not previously been reported. In this study, we identified hesperetin as a potential agonist of ADGRA3 using the PRESTO-Salsa tool and confirmed its agonist effect on ADGRA3 through a series of experiments. This study focuses on the regulatory effect of hesperetin on adipose thermogenesis and explores whether this effect is dependent upon ADGRA3. As such, we refrained from conducting further investigations into other potential effects of hesperetin, including its potential antioxidant role and its role in apoptosis.”

      Reviewer #2 (Public Review):

      Based on bioinformatics and expression analysis using mouse and human samples, the authors claim that the adhesion G-protein coupled receptor ADGRA3 may be a valuable target for increasing thermogenic activity and metabolic health. Genetic approaches to deplete ADGRA3 expression in vitro resulted in reduced expression of thermogenic genes including Ucp1, reduced basal respiration, and metabolic activity as reflected by reduced glucose uptake and triglyceride accumulation. In line, nanoparticle delivery of shAdgra3 constructs is associated with increased body weight, reduced thermogenic gene expression in white and brown adipose tissue (WAT, BAT), and impaired glucose and insulin tolerance. On the other hand, ADGRA3 overexpression is associated with an improved metabolic profile in vitro and in vivo, which can be explained by increasing the activity of the well-established Gs-PKA-CREB axis. Notably, a computational screen suggested that ADGRA3 is activated by hesperetin. This metabolite is a derivative of the major citrus flavonoid hesperidin and has been described to promote metabolic health. Using appropriate in vitro and in vivo studies, the authors show that hesperetin supplementation is associated with increased thermogenesis, UCP1 levels in WAT and BAT, and improved glucose tolerance, an effect that was attenuated in the absence of ADGRA3 expression.

      Overall, the data suggest that ADGRA3 is a constitutively active Gs-coupled receptor that improves metabolism by activating adaptive thermogenesis in WAT and BAT. The conclusions of the paper are partly supported by the data, but some experimental approaches need further clarification.

      (1) The in vivo approaches to modulate Adgra3 expression in mice are carried out using non-targeted nanoparticle-based approaches. The authors do not provide details of the composition of the nanomaterials, but it is highly likely that other metabolically active organs such as the liver are targeted. This is critical because Adgra3 is expressed in many organs, including the liver, adrenal glands, and gastrointestinal system. Therefore, many of the observed metabolic effects could be indirect, for example by modulating bile acids or corticosterone levels. Consistent with this, after digestion in the gastrointestinal tract, hesperetin is rapidly metabolized in intestinal and liver cells. Thus, hesperetin levels in the systemic circulation are likely to be insufficient to activate Adgra3 in thermogenic adipocytes/precursors. Overall, the authors need to repeat the key metabolic experiments in adipose-specific Adgra3 knockout/overexpression models to validate the reliability of the in vivo results. In addition, to validate the relevance of hesperetin supplementation for adaptive thermogenesis in BAT and WAT in vivo, the levels of hesperetin present in the systemic circulation should be quantified.

      Thank you for your valuable advice. Unfortunately, we could not perform quantitative determination of hesperetin concentration in the systemic circulation because we had used the serum of hesperetin-treated mice for the quantitative determination of serum insulin, fT4 and TG. According to your other suggestions, we supplemented the results of the effect of local BAT injection of Adgra3 OE on thermogenic genes (Figures S5G-H), the levels of circulating thyroid hormones (Figures S2H, S4F and S5B) and the effects of Adgra3 overexpression/knockdown on Adgra3 expression levels (Figures S2A-B and S4B-C) in multiple tissues as well as discussed in the article, as follows:

      “Given the consideration that the non-targeted nanoparticle approach utilized in this study for modulating Adgra3 expression levels in vivo alters Adgra3 expression in tissues beyond adipose tissue (Figures S2A-B and S4B-C), notably the liver and skeletal muscle, the construction of Adgra3 adipose tissue-specific knockout/overexpression mouse models is imperative for a more nuanced understanding of the precise mechanisms underlying the influence of ADGRA3 on adipose thermogenesis. We will employ more sophisticated models in subsequent studies to further elucidate the effects of ADGRA3 on adipose thermogenesis and metabolic homeostasis. Nevertheless, our findings underlie a potential therapeutic feature of…”

      (2) Standard measurements for energy balance are not presented. Quantitative data on energy expenditure, e.g. by indirect calorimetry, and food intake are missing and need to be included to validate the authors' claims.

      We are in full agreement with your proposal. Regrettably, owing to the constraints of our experimental facilities, we are presently unable to obtain quantitative data on the energy expenditure of the animals. However, we believe that the present results partially support the idea that ADGRA3 promotes energy metabolism. The effects of ADGRA3 on food intake are shown in Figure S2C and Figure S5A.

      (3) The thermographic images used to determine the BAT temperature are not very convincing. The distance and angle between the thermal camera and the BAT have a significant effect on the determination of the temperature, which is not taken into account, at least in the images presented.

      Thank you very much for pointing out the gap in our method description. Following the methods of previous publications (Xia, Bo et al. PLoS biology. 2020. doi:10.1371/journal.pbio.3000688; Warner, Amy et al. PNAS. 2013. doi:10.1073/pnas.1310300110), all representative infrared images of mice within the same batch were captured using a thermal imaging camera (FLIR ONE PRO) at the same distance, perpendicular to the plane on which the mice were located. We have supplemented this description in the Materials and Methods section, as shown below:

      “2.20. Infrared Thermography.

      BAT temperature was measured at room temperature by infrared thermography according to previous publications [22, 23]. The same batch of representative infrared images of mice were all captured using a thermal imaging camera (FLIR ONE PRO), measured at the same distance perpendicular to the plane on which the mice were located. To quantify interscapular region temperature, the average surface temperature from a region of the interscapular BAT was taken with FLIR Tools software.”

      (4) The 3T3-L1 cell line is not an adequate cell culture model to study thermogenic adipocyte differentiation. To validate their results, the key experiments showing that ADGRA3 expression modulates thermogenic marker expression in a hesperetin-dependent manner need to be performed in a reliable model, e.g. primary murine adipocytes.

      Inducing the 3T3-L1 cell line into white adipocytes is indeed not suitable for studying thermogenic adipocyte differentiation. However, following previous studies (Wei, Gang et al. Cell metabolism. 2021. doi:10.1016/j.cmet.2021.08.012) and (Bae IS, Kim SH. Int J Mol Sci. 2019. doi:10.3390/ijms20246128), the 3T3-L1 cell line was differentiated into beige-like adipocytes in this study, and many studies consider this method suitable for studying the thermogenic effect of adipocytes in vitro. In addition, we have provided a more detailed description of the induction of beige-like adipocytes from 3T3-L1 cells in the Materials and Methods section, and we induced human adipose-derived stem cells (hADSC) into adipocytes to evaluate the effect of ADGRA3 on human adipocytes (Figure 8).

      “…supplemented with 10% FBS. Confluent 3T3-L1 pre-adipocytes were induced into mature beige-like adipocytes with 0.5 mM isobutyl methylxanthine (IBMX), 1 μM dexamethasone, 5 μg/ml insulin, 1 nM 3, 3', 5-Triiodo-L-thyronine (T3), 125 μM indomethacin and 1 μM rosiglitazone in high-glucose DMEM containing 10% FBS for 2 days, then treated with high-glucose DMEM containing 5 μg/ml insulin, 1 nM T3, 1 μM rosiglitazone and 10% FBS for 6 days and cultured with high-glucose DMEM containing 10% FBS for 2 days. hADSCs were seeded on plates coated with 0.1% gelatin, then cultured and grown to confluence in human mesenchymal stem cells (hMSCs) specialized culture medium (ZQ-1320). Confluent hADSCs were induced into mature human adipocytes with adipogenic induction medium (PCM-I-004) according to the manufacturer’s instructions.”

      (5) The experimental setup only allows the measurement of basal cellular respiration. More advanced approaches are needed to define the contribution of ADGRA3 versus classical adrenergic receptors to UCP1-dependent thermogenesis.

      Thanks for your suggestion. The maximum oxygen consumption rate of the cells was also measured (Figures 2G and 2N) by adding FCCP, an uncoupler of oxidative phosphorylation (OXPHOS) in mitochondria.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Zhao et al. explored the function of adhesion G protein-coupled receptor A3 (ADGRA3) in thermogenic fat biology.

      Strengths:

      Through both in vivo and in vitro studies, the authors found that the gain function of ADGRA3 leads to browning of white fat and ameliorates insulin resistance.

      Weaknesses:

      There are several lines of weak methodologies such as using 3T3-L1 adipocytes and intraperitoneal (i.p.) injection of virus. Moreover, as the authors stated that ADGRA3 is constitutively active, how could the authors then identify a chemical ligand?

      (1) Primary cultured cells should be used to perform gain and loss function analysis of ADGRA3, instead of using 3T3-L1. It is impossible to detect Ucp1 expression in 3T3-L1 cells.

      Detecting UCP1 expression is indeed difficult when the 3T3-L1 cell line is induced into white adipocytes. However, following previous studies (Wei, Gang et al. Cell metabolism. 2021. doi:10.1016/j.cmet.2021.08.012) and (Bae IS, Kim SH. Int J Mol Sci. 2019. doi:10.3390/ijms20246128), the 3T3-L1 cell line was differentiated into beige-like adipocytes in this study, and many studies consider this method suitable for studying the thermogenic effect of adipocytes in vitro. In addition, we have provided a more detailed description of the induction of beige-like adipocytes from 3T3-L1 cells in the Materials and Methods section, and we induced human adipose-derived stem cells (hADSC) into adipocytes to evaluate the effect of ADGRA3 on human adipocytes (Figure 8).

      “…supplemented with 10% FBS. Confluent 3T3-L1 pre-adipocytes were induced into mature beige-like adipocytes with 0.5 mM isobutyl methylxanthine (IBMX), 1 μM dexamethasone, 5 μg/ml insulin, 1 nM 3, 3', 5-Triiodo-L-thyronine (T3), 125 μM indomethacin and 1 μM rosiglitazone in high-glucose DMEM containing 10% FBS for 2 days, then treated with high-glucose DMEM containing 5 μg/ml insulin, 1 nM T3, 1 μM rosiglitazone and 10% FBS for 6 days and cultured with high-glucose DMEM containing 10% FBS for 2 days. hADSCs were seeded on plates coated with 0.1% gelatin, then cultured and grown to confluence in human mesenchymal stem cells (hMSCs) specialized culture medium (ZQ-1320). Confluent hADSCs were induced into mature human adipocytes with adipogenic induction medium (PCM-I-004) according to the manufacturer’s instructions.”

      (2) For virus treatment, the authors should consider performing local tissue injection, rather than IP injection. If it is IP injection, have the authors checked other tissues to validate whether the phenotype is fat-specific?

      Thank you for your valuable advice. We supplemented the results of the effect of local BAT injection of Adgra3 OE on thermogenic genes (Figures S5G-H) and the effects of Adgra3 overexpression/knockdown on Adgra3 expression levels (Figures S2A-B and S4B-C) in other tissues.

      (3) The authors should clarify how constitutively active GPCR needs further ligands.

      Thank you for your suggestion. In fact, we identified hesperetin only as a potential agonist of ADGRA3, not as a ligand. The results also indicate that overexpression of ADGRA3 without additional hesperetin is sufficient to activate downstream PKA signaling pathways through constitutive activity (Figure 5). Recently, Chen et al. identified oleoylethanolamide (OEA) as a potential endogenous agonist of GPR3, which is also a constitutively active GPCR. Overall, the high constitutive activity of constitutively active GPCRs arises from the combined effects of stimulation by endogenous agonists and their basal coupling with Gs.

      As for why we screened for and identified potential agonists of ADGRA3, we hope to find a route to its clinical application that is more convenient than gene overexpression, as described in the article:

      “Considering the difficulty of overexpressing ADGRA3 in clinical application, hesperetin was screened as a potential agonist of ADGRA3 by PRESTO-Salsa database (Figure 6A). The…”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comments

      The title appears to be overstated as no clinical trials were performed and experiments were not even performed in human brown adipocytes.

      Thank you for your critical suggestion. We have added experimental results from human adipocytes (Figure 8) and revised the title to “Constitutively active receptor ADGRA3 signaling induces adipose thermogenesis”.

      Please specify n-number and what are replicates or independent experiments. Please also state if any outliers were excluded and why.

      Thanks for your valuable suggestion. We have added a description of the n-number to the Figure legends section, and the number of independent experiments and the exclusion criteria for outliers to the Materials and Methods section, as follows:

      “…of tissue samples. Cohorts of ≥4 mice per genotype or treatment were assembled for all in vivo studies. All in vivo studies were repeated 2-3 independent times. All procedures related to…”

      “…μM H-89) was added to 3T3-L1 mature beige-like adipocytes for 48 hours. All in vitro studies were repeated 2-3 independent times.”

      “All data are presented as mean ± SEM. In this study, outliers that met the three-sigma rule were excluded from analysis, with the exception of those presented in Figure S1E. Given the possibility that the outliers in Figure S1E represent extreme expressions of the inherent variability within the population sample, we have chosen to retain these specific outliers for further analysis. Student’s t-test was used to compare two groups. One-way analysis of…”
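The three-sigma exclusion described in the quoted Methods text can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `exclude_three_sigma`, the `n_sigma` parameter, and the use of the sample standard deviation are assumptions made for the example.

```python
from statistics import mean, stdev


def exclude_three_sigma(values, n_sigma=3.0):
    """Split values into (kept, excluded) using a mean +/- n_sigma * SD cutoff.

    Hypothetical helper: the SD here is the sample standard deviation
    (n - 1 denominator), which is an assumption of this sketch.
    """
    values = list(values)
    if len(values) < 3:
        # Too few points to estimate spread meaningfully; keep everything.
        return values, []
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        # All values identical: nothing can be an outlier.
        return values, []
    kept = [v for v in values if abs(v - mu) <= n_sigma * sd]
    excluded = [v for v in values if abs(v - mu) > n_sigma * sd]
    return kept, excluded


if __name__ == "__main__":
    # Made-up readings: 15 clustered values plus one extreme value.
    readings = [10.0, 10.1, 9.9] * 5 + [30.0]
    kept, excluded = exclude_three_sigma(readings)
    print(len(kept), excluded)
```

With the sample standard deviation, a single value can lie at most (n − 1)/√n standard deviations from the group mean, so a cutoff of this kind cannot flag any value in a group of fewer than 11 observations. Whether the rule is applied per group or to pooled data therefore affects what it can exclude.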

      Authors use Infrared Thermography to measure body temperature. Depending on the distance between the mouse and the camera, the mouse needs to be at the same spot.

      Thank you very much for pointing out the lack of our method description. According to the methods of literatures (Xia, Bo et al. PLoS biology. 2020. doi:10.1371/journal.pbio.3000688) and (Warner, Amy et al. PNAS. 2013. doi:10.1073/pnas.1310300110), the same batch of representative infrared images of mice were all captured using a thermal imaging camera (FLIR ONE PRO), measured at the same distance perpendicular to the plane on which the mice were located. We have supplemented this description in the Materials and Methods section, as shown below:

      “2.20. Infrared Thermography.

      BAT temperature was measured at room temperature by infrared thermography according to previous publications [22, 23]. The same batch of representative infrared images of mice were all captured using a thermal imaging camera (FLIR ONE PRO), measured at the same distance perpendicular to the plane on which the mice were located. To quantify interscapular region temperature, the average surface temperature from a region of the interscapular BAT was taken with FLIR Tools software.”

      Please discuss the limitations of the experiments and discuss the relevant literature.

      Thanks for your recommendations. We discussed the limitations of the experiments and the relevant literature in the discussion section, as follows:

      “The induction of beige fat has been investigated as a potentially effective therapeutic approach in combating obesity [23]. A clinical trial revealed that treatment with the chronic β3-AR agonist mirabegron leads to an increase in human brown fat, HDL cholesterol, and insulin sensitivity [24]. Subsequently, Blondin et al discovered that oral administration of mirabegron only elicits an increase in BAT thermogenesis when administered at the maximal allowable dose, indicating that human brown adipocyte thermogenesis is primarily driven by β2-adrenoceptor (β2-AR) stimulation [11]. Consistent with this finding, we found much higher levels of ADRB2 expression in human white adipose tissue than ADRB3 (Figure S1E). Furthermore, a recent study has demonstrated that simultaneous activation of β2-AR and β3-AR enhances whole-body metabolism through beneficial effects on skeletal muscle and BAT [25].”

      “Given the consideration that the non-targeted nanoparticle approach utilized in this study for modulating Adgra3 expression levels in vivo alter Adgra3 expression in tissues beyond adipose tissue (Figures S2A-B and S4B-C), notably the liver and skeletal muscle, the construction of Adgra3 adipose tissue-specific knockout/overexpression mouse models is imperative for a more nuanced understanding of the precise mechanisms underlying the influence of on adipose thermogenesis. We will employ more sophisticated models in subsequent studies to further elucidate the effects of ADGRA3 on adipose thermogenesis and metabolic homeostasis. Nevertheless, our findings underlie a potential therapeutic feature of…”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer 1 (Public Review):

      Summary

      The mammalian Shieldin complex consisting of REV7 (aka MAD2L2, MAD2B) and SHLD1-3 affects pathway usage in DSB repair favoring non-homologous end-joining (NHEJ) at the expense of homologous recombination (HR) by blocking resection and/or priming fill-in DNA synthesis to maintain or generate near blunt ends suitable for NHEJ. While the budding yeast Saccharomyces cerevisiae does not have homologs to SHLD1-3, it does have Rev7, which was identified to function in conjunction with Rev3 in the translesion DNA polymerase zeta. Testing the hypothesis that Rev7 also affects DSB resection in budding yeast, the work identified a direct interaction between Rev7 and the Rad50-Mre11-Xrs2 complex by two-hybrid and direct protein interaction experiments. Deletion analysis identified that the 42 amino acid C-terminal region was necessary and sufficient for the 2-hybrid interaction. Direct biochemical analysis of the 42 aa peptide was not possible. Rev7 deficient cells were found to be sensitive to HU only in synergy with G2 tetraplex forming DNA. Importantly, the 42 aa peptide alone suppressed this phenotype. Biochemical analysis with full-length Rev7 and a C-terminal truncation lacking the 42 aa region shows G4-specific DNA binding that is abolished in the C-terminal truncation and with a substrate containing mutations to prevent G4 formation. Rev7 lacks nuclease activity but inhibits the dsDNA exonuclease activity of Mre11. The C-terminal truncation protein lacking the 42 aa region also showed some inhibition suggesting the involvement of additional binding sites besides the 42 aa region. Also, the Mre11 ssDNA endonuclease activity is inhibited by Rev7 but not the degradation of linear ssDNA. Rev7 does not affect ATP binding by Rad50 but inhibits in a concentration-dependent manner the Rad50 ATPase activity. The C-terminal truncation protein lacking the 42 aa region also showed some inhibition but significantly less than the full-length protein.

      Using an established plasmid-based NHEJ assay, the authors provide strong evidence that Rev7 affects NHEJ, showing a four-fold reduction in this assay. The mutations in the other Pol zeta subunits, Rev3 and Rev1, show a significantly smaller effect (~25% reduction). A strain expressing only the Rev7 C-terminal 42 aa peptide showed no NHEJ defect, while the truncation protein lacking this region exhibited a smaller defect than the deletion of REV7. The conclusion that Rev7 supports NHEJ mainly through the 42 aa region was validated using a chromosomal NHEJ assay. The effect on HR was assessed using a plasmid:chromosome system containing G4 forming DNA. The rev7 deletion strain showed an increase in HR in this system in the presence and absence of HU. Cells expressing the 42 aa peptide were indistinguishable from the wild type as were cells expressing the Rev7 truncation lacking the 42 aa region. The authors conclude that Rev7 suppresses HR, but the context appears to be system-specific and the conclusion that Rev7 abolished HR repair of DSBs is unwarranted and overly broad.

      Strength

      This is a well-written manuscript with many well-executed experiments that suggest that Rev7 inhibits MRX-mediated resection to favor NHEJ during DSB repair. This finding is novel and provides insight into the potential mechanism of how the human Shieldin complex might antagonize resection.

      We thank Reviewer 1 for their comprehensive summary of our work. The Reviewers' recognition that our manuscript is “well-written” with “many well-executed experiments” and our findings are “novel” is greatly appreciated.

      Weaknesses

      The nuclease experiments were conducted using manganese as a divalent cation, and it is unclear whether there is an effect with the more physiological magnesium cation. Additional controls for the ATPase and nuclease experiments to eliminate non-specific effects would be helpful. Evidence for an effect on resection in cells is lacking. The major conclusion about the role of Rev7 in regulating the choice between HR and NHEJ is not justified, as only a highly specialized assay is used that does not warrant the broad conclusion drawn. Specifically, the results that the Rev7 C terminal truncation lacking the 42 aa region still suppresses HR is unexpected and unexplained. The effect of Rev7 on G4 metabolism is underdeveloped and distracts from the main results that Rev7 modulated MRX activity. The authors should consider removing this part and develop a more complete story on this later.

      We have addressed each point identified as “Weaknesses” by the reviewer, as described below:

      The nuclease experiments were conducted using manganese as a divalent cation, and it is unclear whether there is an effect with the more physiological magnesium cation.

      We acknowledge the Reviewer’s concern and apologize for not having been clear in our first submission. Several studies have demonstrated that Mre11 exhibits all three DNase activities, namely single-stranded endonuclease, double-stranded exonuclease and DNA hairpin opening, only in the presence of Mn²⁺ and not with other divalent cations such as magnesium or calcium (Paull and Gellert, Mol. Cell 1998; 2000; Usui et al., Cell 1998; Ghosal and Muniyappa, JMB, 2007; Arora et al., Mol Cell Biol. 2017). For this reason, Mn²⁺ was used as a cofactor for the Mre11 nuclease assays. We have clarified this in the revised manuscript. As a side note, Mg²⁺ serves as a cofactor for Rad50’s ATPase activity.

      Additional controls for the ATPase and nuclease experiments to eliminate non-specific effects would be helpful.

      We thank the Reviewer for raising this important point, as it led us to evaluate and confirm the specificity of Rev7 and exclude potential non-specific effects. To this end, we have performed additional experiments, which showed that (a) the S. cerevisiae Dmc1 ATPase activity was not affected by Rev7, in contrast to its inhibitory effect on Rad50, and (b) Rev7 had no discernible impact on the endonucleolytic activity of S. cerevisiae Sae2, whereas it inhibits the DNase activities of Mre11. Thus, the lack of inhibitory effects on the ATPase activity of Dmc1 and the nuclease activity of Sae2 confirms the specificity of Rev7 for the Mre11 and Rad50 subunits. We have included these new data in Figure 6H and 6J and in Figure 5—figure supplement 1, respectively, in the revised manuscript.

      Evidence for an effect on resection in cells is lacking. The major conclusion about the role of Rev7 in regulating the choice between HR and NHEJ is not justified, as only a highly specialized assay is used that does not warrant the broad conclusion drawn.

      We agree with the Reviewer that in vivo evidence demonstrating the inhibitory effect of REV7 on DNA end resection was lacking in the first submission. Reviewers 2 and 3 have also raised this point. We have now measured the rate of DNA end resection using a qPCR-based assay (Mimitou and Symington, EMBO J. 2010; Gnugge et al., Mol. Cell 2023). The results revealed that deletion of REV7 enhanced the rate of DNA end resection at a DSB site inflicted by HO endonuclease (Figure 9—figure supplement 3), providing direct evidence that loss of REV7 contributes to increased DNA end resection at DSBs.
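For readers unfamiliar with qPCR-based resection assays, the calculation can be sketched as below. The response does not state the exact formula used, so this is an assumption: it follows one commonly used formulation, in which resected (single-stranded) DNA resists restriction digestion and the resected fraction is 2 / ((1 + 2^ΔCt) × f), with ΔCt the Ct of the digested sample minus the Ct of the mock-digested sample and f the fraction of cells cut by HO. The function and parameter names are hypothetical.

```python
def fraction_resected(ct_digested, ct_mock, fraction_cut):
    """Estimate the fraction of broken molecules resected past a restriction site.

    Assumed formulation (not taken from the response itself):
        resected = 2 / ((1 + 2 ** delta_ct) * fraction_cut)
    where delta_ct = ct_digested - ct_mock.
    """
    if not 0.0 < fraction_cut <= 1.0:
        raise ValueError("fraction_cut must be in the interval (0, 1]")
    delta_ct = ct_digested - ct_mock
    return 2.0 / ((1.0 + 2.0 ** delta_ct) * fraction_cut)


if __name__ == "__main__":
    # Made-up Ct values for illustration only.
    print(fraction_resected(ct_digested=26.0, ct_mock=25.0, fraction_cut=1.0))
```

Under this formulation, a ΔCt of zero with complete HO cutting corresponds to full resection at the site, and a large ΔCt corresponds to little or no resection.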

      Specifically, the results that the Rev7 C-terminal truncation lacking the 42 aa region still suppresses HR is unexpected and unexplained.

      This is a fair point, and we thank the reviewer for raising it. Although an interaction of Rev7-C1 was not apparent in the yeast two-hybrid assays, it surprisingly suppressed HR partially (Figure 9). In line with this, biochemical assays showed that it exerts a partial inhibitory effect on the Mre11 nuclease (Figure 5) and Rad50 ATPase (Figure 6) activities compared with full-length Rev7. Consistent with the in vitro data, the AF2 models revealed that, in addition to the C-terminal 42-aa region, residues in the N-terminal region of Rev7 also interact with the Mre11 and Rad50 subunits (Figure 2—figure supplement 2).

      The effect of Rev7 on G4 metabolism is underdeveloped and distracts from the main results that Rev7 modulated MRX activity. The authors should consider removing this part and develop a more complete story on this later.

      We agree with the reviewer’s comment that the effect of Rev7 on G4 DNA metabolism “is underdeveloped and distracts” from the central theme of the present paper, and with the suggestion that we develop this part into a complete story later. This point has also been raised by Reviewers 2 and 3. Therefore, the corresponding figures and associated text were removed from the revised version of the manuscript.

      Reviewer 2 (Public Review):

      In this study, Badugu et al investigate the Rev7 roles in regulating the Mre11-Rad50-Xrs2 complex and in the metabolism of G4 structures. The authors also try to make a conclusion that REV7 can regulate the DSB repair choice between homologous recombination and non-homologous end joining.

      The major observations of this study are:

      (1) Rev7 interacts with the individual components of the MRX complex in a two-hybrid assay and in a protein-protein interaction assay (microscale thermophoresis) in vitro.

      (2) Modeling using AlphaFold-Multimer also indicated that Rev7 can interact with Mre11 and Rad50.

      (3) Using a two-hybrid assay, a 42-amino-acid C-terminal domain in Rev7 responsible for the interaction with MRX was identified.

      (4) Rev7 inhibits Mre11 nuclease and Rad50 ATPase activities in vitro.

      (5) Rev7 promotes NHEJ in a plasmid cutting/religation assay.

      (6) Rev7 inhibits recombination between chromosomal ura3-1 allele and plasmid ura3 allele containing G4 structure.

      (7) Using an assay developed in V. Zakian's lab, it was found that rev7 mutants grow poorly when both G4 is present in the genome and yeast are treated with HU.

      (8) In vitro, purified Rev7 binds to G4-containing substrates.

      In general, a lot of experiments have been conducted, but the major conclusion about the role of Rev7 in regulating the choice between HR and NHEJ is not justified.

      We thank Reviewer 2 for their comprehensive assessment of our manuscript and their insightful comments. However, we believe that the data (Figures 7-9) in our manuscript, together with the new data (Figure 9—figure supplements 2 and 3) in the revised manuscript, clearly demonstrate that Rev7 regulates the choice between HR and NHEJ.

      (1) Two stories that do not overlap (regulation of MRX by Rev7 and Rev7's role in G4 metabolism) are brought under one umbrella in this work. There is no connection unless the authors demonstrate that Rev7 inhibits the cleavage of G4 structures by the MRX complex.

      We agree with the reviewer’s point that the themes of the regulation of MRX subunit functions by Rev7 and its role in G4 DNA metabolism do not overlap. This concern has also been expressed by Reviewers 1 and 3. Following their suggestion, we have deleted all figures and text describing the role of Rev7 in G4 DNA metabolism from the revised manuscript.

      (2) The authors cannot conclude based on the recombination assay between G4-containing 2-micron plasmid and chromosomal ura3-1 that Rev7 "completely abolishes DSB-induced HR". First of all, there is no evidence that DSBs are formed at G4. Why is there no induction of recombination when cells are treated with HU? Second, as the authors showed, Rev7 binds to G4, therefore it is not clear if the observed effects are the result of Rev7 interaction with G4 or its impact on HR. The established HO-based assays where the speed of resection can be monitored (e.g., Mimitou and Symington, 2010) have to be used to justify the conclusion that Rev7 inhibits MRX nuclease activity in vivo.

      We thank the Reviewer for the insightful comments and for drawing our attention to the inference "completely abolishes DSB-induced HR". We have rephrased the conclusion and replaced it with “REV7 gene product plays an anti-recombinogenic role during HR”. The reviewer then refers to the lack of “evidence that DSBs are formed at G4”. Unfortunately, our attempts to identify a DSB at the G4 DNA site in the 2-micron plasmid did not provide a clear answer to this question. This might be related to the existence of myriad DNases in the cell and to technical issues associated with the isolation of low-abundance, linearized 2-micron plasmid molecules. For these reasons, we cannot provide any data on DSBs at the G4 site in the 2-micron plasmid.

      The reviewer then correctly asks, “Why is there no induction of recombination when cells are treated with HU?” These findings are consistent with previous studies, which have shown that Mre11-deficient cells are sensitive to HU, resulting in cell death (Tittel-Elmer et al., EMBO J. 28, 1142-1156, 2009; Hamilton and Maizels, PLoS One, 5, e15387, 2010). However, a novel finding of our study is that ura3-1 rev7Δ cells and ura3-1 cells expressing the Rev7 42-amino-acid peptide (to a limited extent) produce Ura3+ papillae. We have included this information in the Results section and adjusted the text to make this point clear to the reader.

      In the same paragraph, the Reviewer expresses a concern about the interaction of Rev7 with G4 DNA substrates and its impact on HR. As discussed above, in response to your comment (1) and similar comments from Reviewers 1 and 3, we have deleted all figures and text describing the role of Rev7 in G4 DNA metabolism from the revised manuscript. The reviewer specifically refers to a study by Mimitou and Symington, 2010, in which the speed of DNA end resection at the HO endonuclease-inflicted DSB was quantified. We have carried out the suggested experiment and the results are presented in Figure 9—figure supplement 3.

      Reviewer 3 (Public Review):

      Summary:

      REV7 facilitates the recruitment of Shieldin complex and thereby inhibits end resection and controls DSB repair choice in metazoan cells. Puzzlingly, Shieldin is absent in many organisms and it is unknown if and how Rev7 regulates DSB repair in these cells. The authors surmised that yeast Rev7 physically interacts with Mre11/Rad50/Xrs2 (MRX), the short-range resection nuclease complex, and tested this premise using yeast two-hybrid (Y2H) and microscale thermophoresis (MST). The results convincingly showed that the individual subunits of MRX interact robustly with Rev7. AlphaFold Multimer modelling followed by Y2H confirmed that the carboxy-terminal 42 amino acid is essential for interaction with MR and G4 DNA binding by REV7. The mutant rev7 lacking the binding interface (Rev7-C1) to MR shows moderate inhibition to the nuclease and the ATPase activity of Mre11/Rad50 in biochemical assays. Deletion of REV7 also causes a mild reduction in NHEJ using both plasmid and chromosome-based assays and increases mitotic recombination between chromosomal ura3-01 and the plasmid ura3 allele interrupted by G4. The authors concluded that Rev7 facilitates NHEJ and antagonizes HR even in budding yeast, but it achieves this by blocking Mre11 nuclease and Rad50 ATPase.

      Weaknesses

      There are many strengths to the studies and the broad types of well-established assays were used to deduce the conclusion. Nevertheless, I have several concerns about the validity of experimental settings due to the lack of several key controls essential to interpret the experimental results. The manuscript also needs a few additional functional assays to reach the accurate conclusions as proposed.

      We are happy that the Reviewer has found “many strengths” in our manuscript and further noted that “results convincingly showed that the individual subunits of MRX interact robustly with Rev7”. We greatly appreciate the Reviewer for these encouraging words, and for specific suggestions that helped us to improve the manuscript. As suggested, we have performed additional experiments including key controls and the data is presented in the revised manuscript.

      (1) AlphaFold model predicts that Mre11-Rev7 and Rad50-Rev7 binding interfaces overlap and Rev7 might bind only to Mre11 or Rad50 at a time. Interestingly, however, Rev7 appears dimerized (Figure 1). Since the MR complex also forms with 2M and 2R in the complex, it should still be possible if REV7 can interact with both M and R in the MR complex. The author should perform MST using MR complex instead of individual MR components. The authors should also analyze if Rev7-C1 is indeed deficient in interaction with MR individually and with complex using MST assay.

      Thank you for the valuable suggestion. As requested, MST titration experiments have been performed to examine the affinity of purified GFP-tagged Rev7-C1 for Mre11, Rad50 and the MR complex. The results revealed that, compared with full-length Rev7, Rev7-C1 binds to the Mre11 and Rad50 subunits with about 3- and 8.8-fold reduced affinity, respectively, and to the MR complex with ~5.6-fold reduced affinity. The data are shown in Figure 1—figure supplement 4A-C.

      (2) The nuclease and the ATPase assays require additional controls. Does Rev7 inhibit the other nuclease or ATPase non-specifically? Are these outcomes due to the non-specific or promiscuous activity of Rev7? In Figure 6, the effect of REV7 on the ATP binding of Rad50 could be hard to assess because the maximum Rad50 level (1 mM) was used in the experiments. The author should use the suboptimal level of Rad50 to check if REV7 still does not influence ATP binding by Rad50.

      We thank the Reviewer for these valuable comments (Reviewer 1 has raised similar issues). We performed additional control experiments, and the results indicate that (a) the ATPase activity of S. cerevisiae Dmc1 was not affected by Rev7 and (b) Rev7 does not inhibit the endonucleolytic activity of S. cerevisiae Sae2. The results are depicted in Figure 6H and 6J and Figure 5—figure supplement 1A-D, respectively.

      As suggested by the Reviewer, using suboptimal levels of Rad50 (0.2 mM), we carried out experiments to test the effect of varying concentrations of Rev7 on the ability of Rad50 to bind ATP and catalyse its hydrolysis. The results showed that Rev7 had no discernible effect on the ability of Rad50 to bind ATP, even at concentrations 30 times higher than the concentration of Rad50 (Figure 6B and 6D). However, Rev7 suppresses the ATPase activity of Rad50, but not that of Dmc1, in a concentration-dependent manner (Figure 6G and 6J).

      (3) The moderate deficiency in NHEJ using plasmid-based assay in REV7 deleted cells can be attributed to aberrant cell cycle or mating type in rev7 deleted cells. The authors should demonstrate that rev7 deleted cells retain largely normal cell cycle patterns and the mating type phenotypes. The author should also analyze the breakpoints in plasmid-based NHEJ assays in all mutants, especially from rev7 and rev7-C1 cells.

      We appreciate the Reviewer's critical and insightful comment. We monitored cell-cycle progression of both wild-type and rev7Δ cells over time using FACS. The results revealed that the cell cycle profiles and mating type phenotypes of rev7Δ cells were similar to those of wild-type cells. The data are presented in Figure 7—figure supplement 1. This indicates that rev7Δ cells do not have an aberrant cell cycle or mating type defects compared with wild-type cells.

      Although the second point raised by the Reviewer is intriguing, its relevance to the current study is unclear. In our view, identifying breakpoints using plasmid-based NHEJ assays in all the mutants would require a significant amount of time, and the insight that we might gain is unlikely to add to the central theme of this paper. Moreover, we know that Rev7 has no DNA cleavage/nicking activity.

      (4) It is puzzling why the authors did not analyze end resection defects in rev7 deleted cells after a DSB. The author should employ the widely used resection assay after a HO break in rev3, rev7, and mre11 rev7 cells as described previously.

      Thank you for the suggestion. Reviewer 1 has also raised this point. As suggested, we have analysed end resection in the rev7Δ cells at an HO-inflicted DSB site using a qPCR assay (Mimitou and Symington, EMBO J. 2010; Gnugge et al., Mol. Cell 2023). The results revealed that deletion of REV7 led to an enhancement in the rate of DNA end resection at a DSB inflicted by HO endonuclease (Figure 9—figure supplement 3).

      (5) Is it possible that Rev7 also contributes to NHEJ as the part of TLS polymerase complex? Although NHEJ largely depends on Pol4, the authors should not rule out that the observed NHEJ defect in rev7 cells is due at least partially to its TLS defect. In fact, both rev3 or rev1 cells are partially defective in NHEJ (Figure 7). Rev7-C1 is less deficient in NHEJ than REV7 deletion. These results predict that rev7-C1, rev3 should be as defective as the rev7 deletion. Additionally, the authors should examine if Rev7-C1 might be deficient in TLS. In this regard, does rev7-C1 reduce TLS and TLS-dependent mutagenesis? Is it dominant? The authors should also check if Rev3 or Rev1 are stable in Rev7 deleted or rev7-C1 cells by immunoblot assays.

      We agree with the possibility that Rev7 may play a role in translesion DNA synthesis and TLS-dependent mutagenesis. Accordingly, Rev7-C1 might be deficient in TLS. While we do not rule out such scenarios, we respectfully suggest that this is outside the scope of the current manuscript. This manuscript focuses on the role of Rev7 in NHEJ and HR pathways, not on translesion DNA synthesis. Nevertheless, we recognise the importance of this line of investigation, and we will certainly consider this suggestion in our future work. Thank you.

      (6) Due to the G4 DNA and G4 binding activity of REV7, it is not clear which class of events the authors are measuring in plasmid-chromosome recombination assay in Figure 9. Do they measure G4 instability or the integrity of recombination or both in rev7 deleted cells? Instead, the effect of rev7 deletion or rev7-C1 on recombination should be measured directly by more standard mitotic recombination assays like mating type switch or his3 repeat recombination.

      We thank the Reviewer for highlighting this important point and would like to take the opportunity to clarify the rationale behind the plasmid-chromosome recombination assay, as previously described (Paeschke et al., Cell 145, 678, 2011). In this assay, we are measuring the rate of Ura+ papillae formation arising from integration of the targeting plasmid into the genome at the ura3-1 locus of wild-type and rev7Δ cells. Analysis of PCR-generated DNA fragments indicates that the pFAT10-G4 plasmid integrates at the ura3-1 genomic locus of rev7Δ cells, but not in the wild-type cells (Figure 9-figure supplement 2). Further, we also measured the stability of G4 DNA and the results indicate that it is stable in rev7Δ cells.

      Recommendations for the authors:

      Reviewer 1 (Recommendations for the authors):

      (1) Title: The word 'choice' implies a regulator. Is that the model here? Alternatively, is it pathway properties that define the preference of usage?

      This is an excellent suggestion. In the revised submission, we rephrased the title as “Saccharomyces cerevisiae Rev7 promotes non-homologous end-joining by inhibiting Mre11 nuclease and Rad50 ATPase activities and Homologous recombination.”

      (2) Line 83, Introduction: Titia De Lange proposed an alternative/complementary model for Shieldin and REV7 to support fill-in by DNA polymerases including Pol alpha. This should be discussed.

      We thank the reviewer for pointing out that we have not discussed the work from Titia De Lange’s research group. We have now added new sentences to the Introduction to describe the alternative model involving Polα-primase fill-in synthesis (p3.2.7).

      (3) Line 131: The paragraph title needs to change. 2-hybrid assays cannot establish direct interaction especially when analyzing yeast proteins by yeast 2-hybrid. I agree that direct interaction is established by other means later.

      Per the Reviewer’s suggestion, we have deleted the word “directly” from the title of the paragraph.

      (4) Figure 1 D-F: The purity of the Rev7-GFP fusion is shown in Figure S1, and the purity of the Rad50, Mre11, and Xrs2 subunits as assessed by PAGE should be shown as well.

      Following this suggestion, we have included images of Coomassie blue-stained SDS-polyacrylamide gels (Figure 1-figure supplement 1), which show the purity and size of GFP tagged Rev7, Rad50, Mre11, Xrs2, Rev1, Sae2 and Dmc1 proteins.

      (5) Please check the Kd values. In the graph in D, the differences between Rad50, Mre11, and Xrs2 look much larger than the values in F suggest.

      This is a fair point and we thank the reviewer for highlighting it. The differences between the binding profiles of Rad50, Mre11, and Xrs2 with Rev7, as shown in the previous version of the manuscript, were not obvious because the binding curves were cluttered. Therefore, the binding profiles of each interacting pair of proteins were plotted separately to highlight the differences (Figure 1—figure supplement 3). Further, we rigorously analysed the dataset to ascertain the binding affinities and found that the Kd values obtained were in good agreement with the values shown in Figure 1D.

      (6) Figure 1S3: Please label the bands.

      In the revised manuscript, the protein bands in Figure1-figure (previously Figure 1S3) are identified with their names.

      (7) Line 195: Change Figure 1 to Figure 1S4.

      We have introduced the correction in the revised manuscript.

      (8) Line 202: The minimal interaction domain of 42 aa is only described in the next paragraph. The description anticipates a result about the 42 aa fragment that has not been shown to this point. Please reorder results or descriptions to make this coherent.

      We have implemented the change, as per the Reviewer’s suggestion.

      (9) Figure 2: The two-hybrid analysis in Figures 1 and 2 also identifies Rev7 self-interaction, which is not discussed. This serves as another control against the artifact of the truncation proteins and should be discussed.

      We have now discussed the significance of Rev7 self-interaction in the Y2H experiments wherever relevant in the text.

      (10) Is the 42 aa fragment sufficient to elicit a two-hybrid signal?

      We thank the reviewer for this insightful comment. To test this premise, we expressed the terminal 42 amino acid sequence of Rev7 using bait pGBKT7 vector. The results revealed that the 42 residue fragment of ScRev7 alone is sufficient for a two-hybrid interaction with the MRX subunits (Figure 2-figure supplement 1).

      (11) Line 289: Why are the EMSA conditions described as physiological? As per Material and Methods, the reaction mixtures contain 20 mM Tris-HCl (pH 7.5), 0.1 mM DTT, 0.2 mg/ml BSA, and 5% glycerol, which is far from physiological.

      As suggested by all three reviewers, the data showing the interaction of Rev7 and its truncation derivative Rev7-C1 with G4 DNA has been deleted in the revised version of the manuscript.

      (12) Figure 4C: The figure needs to increase in size. The plotting symbols are not all visible, and it is undefined what the black squares represent.

      Following the reviewer's suggestion, Figure 4C has been omitted in the revised version of the manuscript.

      (13) Figure 5: The MRX nuclease assays were conducted in the presence of Manganese. Has the more physiological divalent cation magnesium been tested?

      This has been addressed in response to the query of Reviewer 1 (Public Review). As noted above, Mre11 exhibits DNase activities only in the presence of Mn²⁺.

      (14) In Figure 5D, lane 2: What is the concentration of Rev7?

      We appreciate the reviewer for catching this. The concentration of ScRev7 used for the reaction shown in Figure 5D, lane 2 was 2 μM, as specified in the Figure legend.

      (15) Figure 6 legend: Lane 1620 "same as in lane "Is there a "1" missing?

      We thank the reviewer for pointing out the typographical error, which has been corrected in the revised manuscript.

      (16) Figure 9: Rev7-C1 lacks the 42 aa peptide that is postulated to mediate anti-resection but shows normal HR here. This seems unexpected based on the premise that the 42 aa fragment supports end-joining. Rev7 seems to suppress HR independent of the function of the 42 aa peptide.

      This has been addressed in response to the query posed by Reviewer 1 in the Public Review. We do see that the Rev7-C1 lacking the 42 aa peptide suppresses HR, but the suppression was only partial as compared with the wild type. This is consistent with biochemical assays suggesting that Rev7-C1 exerts partial inhibition on the Mre11 nuclease (Figure 5) and Rad50 ATPase (Figure 6) activities. Further, the AF2 models indicate that, in addition to the C-terminal 42-aa region, other regions of Rev7 also interact with the Mre11 and Rad50 subunits (Figure 2—figure supplement 2), consistent with biochemical and genetic data.

      (17) Line 478: The conclusion that "these findings are consistent with the idea that REV7 completely abolishes DSB-induced HR in S. cerevisiae." is overly broad as the assay

      We agree with the reviewer's assessment. Accordingly, we have rephrased the sentence to soften the claim.

      Line 483ff: Based on the comments on Figure 9, the introductory sentences of the discussion do not seem to be supported by the data, as Rev7 appears to regulate HR independent of the 42 aa peptide.

      Please refer to the response to comment #16 above.

      (18) Line 536: Similarly to above 17, the conclusion about the effect of the 42 aa peptide on HR appears unwarranted.

      We have revised the statement to moderate the previously exaggerated claims.

      (19) In all figures, please list in the legend, which exact strains have been used referring to Table S5.

      We have now included mentions of the strains in the figure legend wherever applicable.

      (20) Line 351: linear.

      It is corrected in the revised manuscript.

      Reviewer 2 (Recommendations For The Authors):

      (1) It is very strange and unusual that Rev7 independently binds to all three subunits of the MRX complex, raising a question of how specific these interactions are. At least, there should be a negative control in their Y2H assay and protein-protein interaction assay in vitro showing that Rev7 does not bind to some other proteins. For example, Sae2 and Rev7 interactions can be tested.

      The reviewer is right that it is important to validate the specificity of Y2H interactions as well as in vitro enzyme assays. These findings are shown in Figure 6 and Figure 5-figure supplement 1. As suggested by the Reviewer, we included SAE2 in Y2H and MST assays, and Dmc1 and Sae2 in the in vitro enzyme assays. Our results clearly showed that Sae2 neither interacts with MRX subunits in Y2H assays (Figure 1A-C) nor inhibits Sae2’s nuclease and Dmc1’s ATPase activities in vitro (Figure 6 and Figure 5-figure supplement 1).

      (2) It is surprising that in the Discussion the authors speculate that Rev7 might recruit Mus81 nuclease for cleavage, completely ignoring their own publication on the cleavage of G4 by MRX.

      We agree with the reviewer, and we have added a discussion of MRX (mentioned above by the reviewer) in the revised version.

      (3) How does the AlphaFold-Multimer modeling predict the interaction between Rev7 and MRX as a complex? Are the same regions of MRX accessible for the interaction with Rev7 in this case? Similarly, how are the activities of the MRX complex and phosphorylated Sae2 (see P. Cejka's work) affected by Rev7?

      Thank you for pointing this out. In this study, we investigated the interaction between Rev7 and Mre11, and between Rev7 and Rad50 subunits, using the AF2 algorithm. However, the three-dimensional structure of the S. cerevisiae MRX-Rev7 complex could not be constructed due to the size limits imposed by the AF2 algorithm. Therefore, we are unable to comment on whether the same regions of MRX subunits in the complex are accessible for the interaction with Rev7. That said, the AF2 algorithm has recently been used for structural modelling of the S. cerevisiae Mre11 (1–533)-Rad50 (1–260 + 1,057–1,312) complex (Nicolas et al., Mol. Cell 84, 2223, 2024). As such, there are no AF2 structural models that cover the whole length of Mre11-Rad50 proteins.

      Regarding the second point raised by the Reviewer, our results suggest that Rev7 does interact with Sae2 in Y2H assays. However, whether phosphorylated Sae2 could potentially affect the interaction between MRX subunits and Rev7 warrants further studies.

      Minor points:

      (1) Figure 1. The labeling of the strains in A and B is genes and in C is proteins.

      The reviewer is correct. We have now corrected the error in Figures 1 and 2.

      (2) Abstract. Carefully check English grammar.

      We thank the Reviewer for spotting this, which has been corrected in the revised manuscript.

      (3) Line 322 "Further, it has been demonstrated that Mre11 cleaves non-B DNA structures such as DNA hairpins, cruciforms and intra- and inter-molecular G-quadruplex structures)." It has not been shown that Mre11 cuts cruciform structures.

      We thank the referee for spotting this error. Mre11 does not cleave cruciform DNA structures. This error is corrected in the revised manuscript.

      (4) Page 14. Lines 452-455. What does "selective and non-selective media" mean? Is it without and with HU treatment?

      Thanks very much for the comment. In our manuscript, selective medium is composed of SC/-Leu with HU and non-selective medium is without HU. We have clarified this point in the revised version.

      (5) Page 15. Line 472 "To assess whether increased frequency of HR is due to the instability of G-quadruplex DNA in rev7Δ cells, we examined the length of G4 DNA inserts in the plasmids carrying sequences during HR assay". It is not clear what "during HR assay" means. Did you examine the presence of G4 in Ura+ recombinants? If not, this analysis is meaningful.

      The reviewer is correct. We measured the presence of G4 DNA insert in Ura+ recombinants. The text has been appropriately edited to reflect these necessary modifications.

      (6) What is the nature of the ura3-1 allele? Can it revert to URA3 in rev7 mutants?

      The ura3-1 allele (glycine-to-glutamate substitution) reverts to Ura3+ at a low rate of ~2.5 × 10⁻⁹ in both orientations (Johnson et al., Mol. Cell 59, 163, 2015).

      (7) From the way that the recombination process is depicted it seems that the authors believe that the plasmid should integrate into the chromosome. In reality, in most cases it should be a gene conversion where the G4 sequence (if it indeed induces DSBs) should be replaced by the wild-type segment from ura3-1; integration is not required since it is a 2-micron plasmid.

      We apologize for not having made this clearer. The recombination assay with targeting plasmids containing G4 DNA-forming sequences was performed as previously described (Paeschke et al., Cell 145, 678, 2011). In this assay, Ura+ recombinants arise from the integration of the targeting plasmid bearing the ura3G4 allele (with a G4 DNA-forming insert) into the genome at the ura3-1 locus. As shown in Author response image 1B, this is confirmed by PCR amplification of the insert in the genomic DNA of wild-type and rev7Δ cells.

      Reviewer 3 (Recommendations For The Authors):

      (1) All Y2H experiments were performed with REV7 fusion to pGBKT7 and MRX to pGADT7. It will be helpful to test if pGAD-Rev7 also interacts with pGBK-Mre11 or Rad50 by Y2H.

      Following the reviewers' suggestions, we performed Y2H experiments in wild-type PJ69-4a cells co-transformed with the pGBKT7 vector expressing MRX subunits and the pGADT7 vector expressing Rev7. The results indicated that Rev7 interacts with Mre11, Rad50 or Xrs2 subunits, indicating that interactions are vector-independent.

      Author response image 1.

      Yeast two-hybrid analysis suggests interaction between Rev7 and MRX subunits. PJ69-4A cells were co-transformed with bait vector expressing Rev7 or the Mre11, Rad50 or Xrs2 subunits and prey vector expressing Rev7 protein. Equal numbers of cells were spotted onto –Trp – Leu and –Trp – Leu –His dropout plates containing 3-AT and images were obtained following 48 h of incubation at 30°C. The data is representative of three independent experiments.

      (2) G4 studies are under-developed and do not add much or even negatively to the manuscript. The author might consider revising the manuscript to improve their integration with better rationales or logic. Alternatively, the authors should consider removing the G4 part for another paper.

      This concern was also raised by Reviewers 1 and 2. Following the suggestions of all reviewers, figures and text related to G4 DNA studies have been deleted in the revised manuscript.

    2. eLife Assessment

      This manuscript reports important data providing evidence that a 42 amino acid region of Rev7 is necessary and sufficient for interaction with the Rad50-Mre11-Xrs2 complex in budding yeast. The authors conclude that Rev7 inhibits the Rad50 ATPase and the Mre11 nuclease with the exception of ssDNA exonuclease activity. The convincing data largely support the conclusions, although the effect of Rev7 on homologous recombination is less well documented and the observed effect on resection is moderate. Specifically, the result that the Rev7 C-terminal truncation lacking the 42 amino acid region still suppresses homologous recombination is unexpected and unexplained.

    3. Reviewer #1 (Public review):

      Summary:

      The mammalian Shieldin complex consisting of REV7 (aka MAD2L2, MAD2B) and SHLD1-3 affects pathway usage in DSB repair, favoring non-homologous end-joining (NHEJ) at the expense of homologous recombination (HR) by blocking resection and/or priming fill-in DNA synthesis to maintain or generate near blunt ends suitable for NHEJ. While the budding yeast Saccharomyces cerevisiae does not have homologs to SHLD1-3, it does have Rev7, which was identified to function in conjunction with Rev3 in the translesion DNA polymerase zeta. Testing the hypothesis that Rev7 also affects DSB resection in budding yeast, the work identified a direct interaction between Rev7 and the Rad50-Mre11-Xrs2 complex by two-hybrid and direct protein interaction experiments. Deletion analysis identified that the 42 amino acid C-terminal region was necessary and sufficient for the 2-hybrid interaction. Direct biochemical analysis of the 42 aa peptide was not possible. Rev7 deficient cells were found to be sensitive to HU only in synergy with G2 tetraplex forming DNA. Importantly, the 42 aa peptide alone suppressed this phenotype. Biochemical analysis with full-length Rev7 and a C-terminal truncation lacking the 42 aa region shows G4-specific DNA binding that is abolished in the C-terminal truncation and with a substrate containing mutations to prevent G4 formation. Rev7 lacks nuclease activity but inhibits the dsDNA exonuclease activity of Mre11. The C-terminal truncation protein lacking the 42 aa region also showed some inhibition, suggesting the involvement of additional binding sites besides the 42 aa region. Also, the Mre11 ssDNA endonuclease activity is inhibited by Rev7 but not the degradation of linear ssDNA. Rev7 does not affect ATP binding by Rad50 but inhibits the Rad50 ATPase activity in a concentration-dependent manner. The C-terminal truncation protein lacking the 42 aa region also showed some inhibition but significantly less than the full-length protein.
      Using an established plasmid-based NHEJ assay, the authors provide strong evidence that Rev7 affects NHEJ, showing a four-fold reduction in this assay. The mutations in the other Pol zeta subunits, Rev3 and Rev1, show a significantly smaller effect (~25% reduction). A strain expressing only the Rev7 C-terminal 42 aa peptide showed no NHEJ defect, while the truncation protein lacking this region exhibited a smaller defect than the deletion of REV7. The conclusion that Rev7 supports NHEJ mainly through the 42 aa region was validated using a chromosomal NHEJ assay. The effect on HR was assessed using a plasmid:chromosome system containing G4 forming DNA. The rev7 deletion strain showed an increase in HR in this system in the presence and absence of HU. Cells expressing the 42 aa peptide were indistinguishable from wild type as were cells expressing the Rev7 truncation lacking the 42 aa region. The authors conclude that Rev7 suppresses HR, but the context appears to be system-specific and the conclusion that Rev7 abolished HR repair of DSBs is unwarranted and overly broad.

      Strength:

      This is a well-written manuscript with well-executed experiments which suggest that Rev7 inhibits MRX-mediated resection to favor NHEJ during DSB repair. This finding is novel and provides insight into the potential mechanism of how the human Shieldin complex might antagonize resection.

      Weaknesses:

      The nuclease experiments were conducted using manganese as a divalent cation, and it is unclear whether there is an effect with the more physiological magnesium cation. The data largely support the conclusions, although the effect of Rev7 on HR is less well documented, as only a highly specialized assay is used that does not warrant the broad conclusion drawn. Specifically, the result that the Rev7 C-terminal truncation lacking the 42 aa region still suppresses HR is unexpected and unexplained.

      In this revision the authors addressed most of my concerns by text revisions and addition of new data.

      The new two-hybrid data showing that the 42 amino acid segment interacts with MRN are valuable. However, it may not be clear to which subunit the 42 aa segment binds, as in the yeast 2H system the chromosomally encoded subunits are present. Or were the 2H experiments conducted in an MRN deletion background? This could be acknowledged.

      The material and methods section was updated to indicate use of 5 mM MnCl2 and 5 mM MgCl2 in the exonuclease assay but not the endonuclease assay. Please check if this is correct. Why the difference between both assays? There is a concern that the absence of ATP and Mg affects the endonuclease assay.

      The addition of Dmc1 as a specificity control for the ATPase inhibition is nice and shows a specific effect. The use of Sae2 associated nuclease activity as a specificity control for the nuclease inhibition is problematic. There has been considerable debate about the Sae2 associated nuclease activity, which seems to have been solved by the Cejka lab showing that Sae2 is a cofactor of MRN without intrinsic nuclease activity (e.g. https://pubmed.ncbi.nlm.nih.gov/25231868/). Or do the authors want to suggest that Sae2 has intrinsic nuclease activity? The control may still be useful mentioning that the nuclease is associated but not intrinsic and citing the relevant papers.

    4. Reviewer #2 (Public review):

      In this study, Badugu et al investigate the Rev7 roles in regulating the Mre11-Rad50-Xrs2 complex and in metabolism of G4 structures. The authors also try to make a conclusion that REV7 can regulate the DSB repair choice between homologous recombination and non-homologous end joining.

      The major observations of this study are:

      (1) Rev7 interacts with the individual components of the MRX complex in a two-hybrid assay and in a protein-protein interaction assay (microscale thermophoresis) in vitro.

      (2) Modeling using AlphaFold-Multimer also indicated that Rev7 can interact with Mre11 and Rad50.

      (3) Using a two-hybrid assay, a 42 aa C-terminal domain in Rev7 responsible for the interaction with MRX was identified.

      (4) Rev7 inhibits Mre11 nuclease and Rad50 ATPase activities in vitro.

      (5) Rev7 promotes NHEJ in plasmid cutting/religation assay.

      (6) Rev7 inhibits recombination between chromosomal ura3-1 allele and plasmid ura3 allele containing G4 structure.

      (7) Using an assay developed in V. Zakian's lab, it was found that rev7 mutants grow poorly when both G4 is present in the genome and yeast are treated with HU.

      (8) In vitro, purified Rev7 binds to G4-containing substrates.

      In general, a lot of experiments have been conducted, but the major conclusion about the role of Rev7 in regulating the choice between HR and NHEJ is not justified.

      (1) Two stories that do not overlap (regulation of MRX by Rev7 and Rev7 role in G4 metabolism) are brought under one umbrella in this work. There is no connection unless the authors demonstrate that Rev7 inhibits the cleavage of G4 structures by the MRX complex.

      (2) The authors cannot conclude based on the recombination assay between G4-containing 2-micron plasmid and chromosomal ura3-1 that Rev7 "completely abolishes DSB-induced HR". First of all, there is no evidence that DSBs are formed at G4. Why is there no induction of recombination when cells are treated with HU? Second, as the authors showed, Rev7 binds to G4, therefore it is not clear if the observed effects are the result of Rev7 interaction with G4 or impact on HR. The established HO-based assays where the speed of resection can be monitored (e.g., Mimitou and Symington, 2010) have to be used to justify the conclusion that Rev7 inhibits MRX nuclease activity in vivo.

      Comments on the revised version:

      I am satisfied with the revision. Specifically, i) the elimination of the G4 part and ii) the implementation of the HO-endonuclease resection assay described in Mimitou and Symington, 2010 significantly improved the clarity of the work and strengthened the conclusion about the Rev7 interference with DNA resection.

    5. Reviewer #3 (Public review):

      Summary:

      REV7 facilitates the recruitment of the Shieldin complex and thereby inhibits end resection and controls DSB repair choice in metazoan cells. Puzzlingly, Shieldin is absent in many organisms, and it is unknown if and how Rev7 regulates DSB repair in these cells. The authors surmised that yeast Rev7 physically interacts with Mre11/Rad50/Xrs2 (MRX), the short-range resection nuclease complex, and tested this premise using yeast two-hybrid (Y2H) and microscale thermophoresis (MST). The results convincingly showed that the individual subunits of MRX interact robustly with Rev7. AlphaFold Multimer modelling followed by Y2H confirmed that the carboxy-terminal 42 amino acids are essential for interaction with MR and G4 DNA binding by REV7. The mutant rev7 lacking the binding interface (Rev7-C1) to MR shows moderate inhibition of the nuclease and the ATPase activity of Mre11/Rad50 in biochemical assays. Deletion of REV7 also causes a mild reduction in NHEJ using both plasmid and chromosome-based assays and increases mitotic recombination between chromosomal ura3-1 and the plasmid ura3 allele interrupted by G4. The revision also showed that rev7 deleted cells exhibit a mild hyper-resection phenotype at 0.7 and 3 kb from the DSB using qPCR assays. The authors concluded that Rev7 facilitates NHEJ and antagonises HR even in budding yeast, but it achieves this by blocking Mre11 nuclease and Rad50 ATPase.

      Weaknesses:

      There are several strengths to the studies, and a broad range of well-established assays was used to reach the conclusions. Nevertheless, there are notable discrepancies in the mutant phenotypes that were used to test the functionality of the Rev7-MRX interaction in repair outcomes, raising concerns about the validity of the proposed model. The manuscript also needs a few additional functional assays to support the conclusions as proposed. The revision responded to several comments raised by the reviewers, but the responses are inadequate to address the key concerns and did not offer sufficient and compelling experimental support for the main premise that Rev7-Mre11/Rad50/Xrs2 interactions regulate MRX activities in cells and thereby modulate DSB repair choice in budding yeast.

      (1) The AlphaFold model predicts that the Mre11-Rev7 and Rad50-Rev7 binding interfaces overlap and that Rev7 might bind only to Mre11 or Rad50 at a time. Interestingly, however, Rev7 appears dimerized (Fig.1). Since the MR complex also forms with 2M and 2R in the complex, it should still be possible for REV7 to interact with both M and R in the MR complex. The author should perform MST using the MR complex instead of individual MR components. The authors should also analyze if Rev7-C1 is indeed deficient in interaction with MR individually and with the complex using the MST assay.

      (2) The nuclease and the ATPase assays require additional controls. Does Rev7 inhibit the other nuclease or ATPase non-specifically? Are these outcomes due to the non-specific or promiscuous activity of Rev7? In fig.6, the effect of REV7 on the ATP binding of Rad50 could be hard to assess because the maximum Rad50 level (1 uM) was used in the experiments. The author should use the suboptimal level of Rad50 to check if REV7 still does not influence ATP binding by Rad50.

      (3) The moderate deficiency in NHEJ using plasmid based assay in REV7 deleted cells can be attributed to aberrant cell cycle or mating type in rev7 deleted cells. The authors should demonstrate that rev7 deleted cells retain largely normal cell cycle pattern and the mating type phenotypes. The author should also analyze the breakpoints in plasmid based NHEJ assays in all mutants especially from rev7 and rev7-C1 cells.

      (4) It is puzzling why the authors did not analyze end resection defects in rev7 deleted cells after a DSB. The author should employ the widely used resection assay after a HO break in rev3, rev7 and mre11 rev7 cells as described previously.

      (5) Is it possible that Rev7 also contributes to NHEJ as part of the TLS polymerase complex? Although NHEJ largely depends on Pol4, the authors should not rule out the possibility that the observed NHEJ defect in rev7 cells is due at least partially to its well-known TLS defect and not entirely to its role in MRX activity regulation as the authors proposed. In fact, rev3 or rev1 cells are partially defective in NHEJ (Fig. 7). Rev7-C1 is less deficient in NHEJ than REV7 deletion. These results predict that rev7-C1 rev3 could be more deficient than rev3 or rev7-C1, and such results might indicate that Rev7 contributes to NHEJ in two ways: one by interacting with (and modulating) MRX and the other as part of the Rev3-Rev7 complex. Additionally, the authors should examine if Rev7-C1 might be deficient in TLS. In this regard, does rev7-C1 reduce TLS and TLS dependent mutagenesis? Is it dominant? The authors should also check if Rev3/Rev1 complexes are stable in Rev7 deleted or rev7-C1 cells by immunoblot assays.

      (6) Due to the G4 DNA and G4 binding activity of REV7, it is not clear which class of events the authors are measuring in the plasmid-chromosome recombination assay in Fig.9. Do they measure G4 instability or the integrity of recombination or both in rev7 deleted cells? Instead, the effect of rev7 deletion or rev7-C1 on recombination should be measured directly by more standard mitotic recombination assays like mating type switch or his3 repeat recombination. The revision did not address these concerns, which still makes the interpretation of the provided recombination results difficult.

    1. eLife Assessment

      The authors established a useful syndetome differentiation protocol from human induced pluripotent stem cells, guided by single-cell transcriptomic analysis. Their findings could significantly impact the field, particularly for patients needing tendon cell therapy. However, the evidence presented is currently incomplete, as the authors did not yet test the applicability of their protocol across multiple human induced pluripotent stem cell lines.

    2. Reviewer #1 (Public review):

      Papalamprou et al. established a methodology to differentiate iPSCs to the syndetome stage and validated it by marker gene expression and scRNA-seq analysis. They further found that inhibition of WNT signaling enhanced the homogeneity of the cell population after identifying a group of branching-off cells that overexpressed WNT. Their results will be helpful in developing cell therapy systems for tendon injuries. However, there are several issues to improve the manuscript:

      IPA analysis was performed after scRNA-seq. Although it is knowledge-based software with convenient graphic utilities, it is questionable whether an unbiased genome-level analysis was performed. Therefore, it is not convincing if WNT is the only and best signal for the branching-off marker. Perhaps independent approaches, such as GO, pathway, or module analyses, should be performed to validate the findings.

      According to the method section, two iPSC lines were used for the study. However, throughout the manuscript, it is not clearly described which line was used for which experiment. Did they show similar efficiency in differentiation and in responses to WNTi? It is also worrisome if using only two lines is the norm in the stem cell field. Please provide a rationale for using only two lines, which will restrict the observation of individual-specific differential responses throughout the study.

      How similar are syndetome cells with or without WNTi? It would be interesting to check if there are major DEGs that differentiate these two groups of cells.

      Please discuss the improvement of the current study compared to previous ones (e.g., PMID 36203346, 35083031, 35372337).

    3. Reviewer #2 (Public review):

      Summary:

      Dr. Sheyn and colleagues report the step-wise induction of syndetome-like cells from human induced pluripotent stem cells (iPSCs), following a previously published protocol which they adjusted. The progression of the cells through each stage, i.e. presomitic mesoderm (PSM), somitic mesoderm (SM), sclerotome (SCL), and syndetome (SYN), is characterized using FACS, RT-qPCR and immunofluorescence staining (IF). The authors also performed single-cell RNA sequencing (scRNAseq) analysis of their step-wise induced cells and identified signaling pathways which are potentially involved in, and possibly necessary for, syndetome induction. They then optimized their protocol by simultaneous inhibition of the BMP and Wnt signaling pathways, which led to an increase in syndetome induction while inhibiting off-target differentiation into neural lineages.

      Strengths:

      The authors conducted scRNAseq analysis of each step of their protocol from iPSCs to syndetome-like cells and employed pathway analysis to uncover further insights into somitic mesoderm (SM) and syndetome (SYN) differentiation. They found that BMP inhibition, in conjunction with the inhibition of WNT signaling, plays a role in driving syndetome differentiation. Analyzing their scRNAseq results, they could improve the syndetome induction efficiency of their protocol from 47.6% to 67%-78% while off-target differentiation into neural lineages could be reduced.

      Weaknesses:

      The authors demonstrated the efficiency of syndetome induction solely by scRNA-seq data analysis before and after pathway inhibition, without using e.g. FACS analysis or immunofluorescence (IF)-staining based assessment. A functional assessment and validation of the induced cells is also completely missing.

    4. Reviewer #3 (Public review):

      Papalamprou et al sought to fine tune existing tenogenic differentiation protocols to develop a robust multi-step differentiation protocol to induce tendon cells from human GMP-ready iPSCs. In so doing, they found that while existing protocols are capable of driving cells towards a syndetome-like fate, the resultant cultures contain highly heterogeneous cell populations with sub-optimal cell survival. Through single cell transcriptomic analysis they identify WNT signaling as a potential driver of an off-target neural population and show that inhibition of WNT signaling at the later 2 stages of differentiation can be used to promote higher efficiency of generation of syndetome-like cells.

      This paper includes a useful paradigm for identifying transcriptional modulators of cell fate during differentiation and a clear example where transcriptional data can be used to guide the chemical modulation of a differentiation protocol to improve cell output. The paper's conclusions are mostly well supported by the data, but the image analysis and discussion need to be improved to strengthen the impact.

      The data outlining the differences between the differentiation outcomes of the two tested iPSC lines are intriguing, but the authors fail to comment on potential differences between the two iPSC lines that could result in drastically different cell outputs from the same differentiation protocol. This is a critically important point, as the majority of the SCX+ cells generated from the 007i cells using their WNTi protocol were found in the FC subpopulation that failed to form from the 83i line under the same protocol. From the analysis of only these two cell lines in vitro, it is difficult to assess whether this WNTi protocol can be broadly used across multiple cell lines to generate tenogenic cells. The authors failed to update the text of the manuscript to reflect the potential differences between the two cell lines and the general applicability of their protocol, but rather just included the proposed explanation in the response to reviewer comments. These critical differences in the response to their protocol and their implications for the applications of this proof-of-concept study should be included in the main text.

      The authors make claims about changes in protein expression but fail to quantify either fluorescence intensity or percent cell expression from their immunofluorescence analyses to substantiate these claims. The authors state in their response to reviewers that immunofluorescence is qualitative but continue to make quantitative statements such as upregulated or downregulated in both the text and legend describing these images. The authors should either perform the quantification of the IFs, use Western blots for protein quantification of their cell cultures, use Flow Cytometry to count cell numbers, or remove these quantitative words from the description of the images. The image quality and staining specificity continue to be a limitation of this study. These claims are not fully supported by the data as presented as it is unclear whether there is increased expression of tendon markers at the protein level or more cells surviving the protocol.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Comment 1: IPA analysis was performed after scRNA-seq. Although it is knowledge-based software with convenient graphic utilities, it is questionable whether an unbiased genome-level analysis was performed. Therefore, it is not convincing if WNT is the only and best signal for the branching-off marker. Perhaps independent approaches, such as GO, pathway, or module analyses, should be performed to validate the finding.

      Thanks for your comment. We agree with the reviewer that IPA is a knowledge-based and a hypothesis-driven method. Our hypothesis was that WNT/BMP pathways, among others, are heavily involved in the development of mesenchymal tissues in general and differentiation of tendons specifically. Therefore, we have looked at differentially expressed genes between clusters from a broad array of pathways featured in IPA that could point us towards molecular function that could make a difference. We further corroborated this hypothesis by using WNT inhibitors in subsequent experiments. To address this point, we have supplemented the discussion section with the following remark:

      “This study is not without limitations. The IPA network analysis is a knowledge-based and hypothesis driven platform. We have specifically targeted known pathways to be involved in syndetome differentiation. However, WNT signaling stood out with very specific affinity to the off-target populations and we have verified our findings with experiments proving this hypothesis.”

      Per the reviewer’s suggestion, we also performed a non-biased GO analysis (Supp. Fig. 6). Multiple pathways were detected in the three clusters of interest (Supp. Fig. 6A-C), including integrin-related and TGFβ-related pathways. However, in these three clusters of interest, WNT signaling was also detected as a prominent pathway. Therefore, we could conclude that it plays a pivotal role in the differentiation process. This hypothesis was later corroborated with WNT inhibitor experiments.

      Comment 2: According to the method section, two iPSC lines were used for the study. However, throughout the manuscript, it is not clearly described which line was used for which experiment. Did they show similar efficiency in differentiation and in responses to WNTi? It is also worrisome if using only two lines is the norm in the stem cell field. Please provide a rationale for using only two lines, which will restrict the observation of individual-specific differential responses throughout the study.

      Thanks for your comment. This proof-of-concept study is the first investigation that compares data from an in vitro tenogenic induction protocol that has been tested in more than one human iPSC line. We agree that line-specific phenomena are difficult to interpret and reproduce. Therefore, it is critical to provide data supporting that the findings can be reproduced in more than one line. Some early studies used one line as proof of concept; however, we now realize the need to show that the protocol works in at least one additional line.

      Here we used the GMP-ready iPSC line CS0007iCTR-n5 for all optimization experiments. This newer low passage feeder-free line was generated from PBMCs and was designated as GMP-ready in the manuscript because it has been derived and cultured using cGMP xeno-free components (mTESR plus medium and rhLaminin-521 matrix substrate instead of Matrigel). We then wanted to confirm the application of the optimized protocol using the reference control line CS83iCTR-22n1 which has already been more widely used by our group [1-5] and others [6]. This line has been derived from fibroblasts and has been grown and expanded using Matrigel™ and mTESR1, followed by mTESR plus media.

      The question of the number of lines needed is stage-dependent. In our opinion, at the proof-of-concept level, two lines, one of which has been generated in GMP-like conditions, are sufficient. Confirmation with multiple lines becomes more pertinent as we move towards scale-up/manufacturing, where considerations regarding robustness and consistency are raised. However, at this stage, it is crucial to understand the developmental processes that are involved in cell differentiation to ensure a more robust protocol can be modified and adapted later. In future studies, as we move towards clinical translation, it is warranted that the approach presented in this work will be further optimized and subsequently evaluated using at least 3 different cell lines that have been generated from various sources.

      Comment 3: How similar are syndetome cells with or without WNTi? It would be interesting to check if there are major DEGs that differentiate these two groups of cells.

      Thanks for your comment. Single cell RNAseq analysis revealed that treatment with WNTi upregulated tenogenic markers. In SYNWNTi, the expression levels of stage-specific markers COL1A1, COL3A1, SCX, MKX, DCN, BGN, FN1, and TNMD were higher compared to the untreated SYN group, as shown in Figure 5C. Density plots depicted an increase in the number of cells expressing COL1A1, COL3A1, SCX and TNMD in SYNWNTi compared to the SYN group, as illustrated in Figure 5D. Trajectory analysis of the WNTi-treated group revealed the absence of the bifurcations observed in the untreated group (Fig. 5E). Therefore, it can be concluded that syndetome cells with and without WNTi are different.
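
      The two kinds of comparison described in this response (how many cells express a marker, and how strongly it is expressed on average) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the pseudocount, and the toy counts below are all hypothetical and chosen only for illustration, and a real DEG analysis would add normalization and a statistical test.

```python
import math


def fraction_expressing(counts, threshold=0):
    """Fraction of cells whose count for one gene exceeds `threshold`."""
    if not counts:
        raise ValueError("at least one cell is required")
    return sum(1 for c in counts if c > threshold) / len(counts)


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of mean expression (treated vs control).

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a gene is undetected in one group.
    """
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


# Hypothetical per-cell counts for a single marker gene; NOT data from the study.
toy_syn = [0, 0, 1, 0, 3, 0, 0, 2]
toy_syn_wnti = [2, 0, 4, 1, 3, 0, 5, 1]

frac_syn = fraction_expressing(toy_syn)
frac_wnti = fraction_expressing(toy_syn_wnti)
lfc = log2_fold_change(toy_syn_wnti, toy_syn)
```

      Reporting both quantities separately helps distinguish "more cells express the marker" from "expressing cells express more of it", which is the distinction the reviewers raise for the protein-level data.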

      Comment 4: Please discuss the improvement of the current study compared to previous ones (e.g., PMID 36203346 [Papalamprou et al], 35083031 [Tsutsumi et al], 35372337 [Yoshimoto et al]).

      Thanks for your comment. In Papalamprou et al (2023) [3], we differentiated iPSCs to mesenchymal stromal-like cells (iMSCs), which were then cultured in a 2D dynamic bioreactor for 7 days. In that study, we examined the impact of simultaneous overexpression of the tendon transcription factor Scleraxis (SCX) using a lentiviral vector and mechanical stimulation on the process of tenogenic differentiation. Following 7 days of uniaxial cyclic loading, we observed notable modifications in the morphology and cytoskeleton organization of iPSC-derived MSCs (iMSCs) overexpressing SCX. Additionally, there was an increase in extracellular matrix (ECM) deposition and alignment, along with upregulation of early and late tendon markers. This proof-of-concept study showed that iPSC-derived MSCs could be a viable cell candidate for cell therapy applications and that mechanical stimulation contributes to the differentiation of iMSCs towards the tenogenic lineage.

      Similarly, Tsutsumi et al [7] overexpressed the tendon transcription factor Mohawk (MKX) stably in iPSC-derived MSCs using lentiviral vectors. These cells were then used to seed collagen hydrogels which were mechanically stimulated in a cyclic stretch 3D culture bioreactor for 15 days to create artificial tendon-like tissues, which the authors termed “bio-tendons”. Bio-tendons were then decellularized to remove cellular remnants from the xenogeneic human iPSC-derived cells and were subsequently transplanted in an in vivo Achilles tendon rupture mouse model. The authors reported improved histological and biomechanical properties in the Mkx-bio-tendon mice vs. the GFP-bio-tendon controls, providing another proof-of-concept study in favor of the utilization of iPSC-derived MSCs for tendon cell therapies, while also addressing the immunogenicity of cells of allogeneic/xenogeneic origin. Therefore, the above two studies used tendon transcription factor overexpression and mechanical loading either in 2D or 3D to differentiate MSCs towards the tendon/ligament lineage.

      Yoshimoto et al [8] optimized a stepwise iPSC to tenocyte induction protocol using a SCX-GFP transgenic mouse iPSC line, by monitoring GFP expression over time. The group performed scRNA-seq to characterize the induction of mesodermal progenitors towards the tenogenic lineage and to shed light on their developmental trajectory. That study unveiled that Retinoic Acid (RA) signaling activation enhanced chondrogenic differentiation, which was in contrast to the study of Kaji et al (2021), which also used a SCX-GFP mouse iPSC line. Kaji et al inhibited TGF and BMP signaling during the process of mesodermal induction and reported that RA signaling eliminated SCX induction entirely and promoted a switch to neural fate. Yoshimoto et al suggested that variations in mesodermal cell identity could be due to the different methods used for mesodermal differentiation. In contrast to the Kaji et al study, Yoshimoto et al opted to stimulate WNT and block the Hedgehog pathway during mesoderm induction. Loh et al (2016) identified the branchpoint from the primitive streak to either the paraxial mesoderm (PSM) or the lateral plate mesoderm (LPM) as the result of two mutually exclusive signaling conditions. Specifically, they reported that induction of PSM was achieved through BMP suppression and WNT stimulation, while the specification of lateral mesoderm was accomplished by BMP stimulation and WNT suppression, all with concurrent TGFβ suppression/FGF stimulation. Lastly, a similar approach towards PSM induction from primitive streak (TGF off/BMP off/WNT on/FGF on) has been used by many subsequent studies, including Matsuda et al (2020) [9], Wu et al (2021) [10] and Nakajima et al (2021) [11]. The diversity of the above-mentioned approaches points to the plasticity of mesodermal progenitors and the need for additional studies to better understand mesodermal specification and subsequent induction towards sclerotome and syndetome.

      In the current study we optimized a stepwise differentiation protocol using xeno-free cGMP-ready media and two different cell lines, one of which was cGMP-ready. We used scRNA-seq to characterize the differentiation, which led us to identify off-target cells that were closer to a neural phenotype. We performed pathway analyses and hypothesized that WNT signaling activity might have contributed to the emergence of the off-target cells. To test this, we used a WNT inhibitor (targeting PORCN) to block WNT activity at the SCL stage and at the SYN stage. We found that blockade of WNT signaling at the end of the SM stage and during SCL and SYN induction resulted in a more homogeneous population, while eliminating the neural-like cell cluster. This is the first study that utilized scRNA-seq to shed light on the developmental trajectory of stepwise tendon differentiation of human iPSCs, and it provides a proof-of-concept for the generation of a more homogeneous syndetome population. Further studies are needed to fine-tune both the process and the final product, as well as to elucidate the functionality of iPSC-derived syndetome cells in vitro and in vivo.

      Reviewer 2:

      General concerns: The authors demonstrated the efficiency of syndetome induction solely by scRNA-seq data analysis before and after pathway inhibition, without using e.g. FACS analysis or immunofluorescence (IF)-staining based assessment. A functional assessment and validation of the induced cells is also completely missing.

      We appreciate and agree with the reviewer’s critique regarding further analyses of differentiated iPSC-derived syndetome-like cells, including functional assessment of the differentiated cells. Immunofluorescence was used at all timepoints of induction for phenotype confirmation (Fig. 2,4). Flow cytometry for DLL1 was utilized to benchmark efficient differentiation to PSM (Loh et al [12], Nakajima et al [11]). Specifically, DLL1 expression was assessed with flow cytometry after 4 days of induction, and was used to optimize the initial iPSC aggregate seeding density, a parameter which has previously been found to be crucial for in vitro differentiation protocols (Loh et al [12]). Unfortunately, this parameter is usually not reported, although it could be critical for replicating a protocol across different lines.

      The function of tendon progenitors is usually reported as the response to mechanical cues and the ability to regenerate tendon injuries. In future studies we intend to assess the functionality of the generated syndetome and tendon progenitors, both through their response to in vitro biomechanical stimulation, as previously reported for iMSCSCX+ cells [3, 13], and in vivo in a critical tendon defect, similarly to what has been previously reported [2].

      Comment 1: Notably, in Figure 1D, certain PSM markers (TBXT, MSGN1, WNT3A) show higher expression on day 3. If the authors initiate SM induction on day 3 instead of day 4, could this potentially enhance the efficiency of syndetome-like cell induction?

      Thanks for your comment. In the current work, we initially optimized differentiation to PSM via expression of DLL1, whose gene expression peaked at d4. We found that this was influenced by the initial iPSC aggregate seeding density. We wanted to generate a homogeneous DLL1+ population which we assessed via gene expression, flow cytometry, IF and scRNA-seq (Fig. 1D, 2C, 3C and Suppl. Fig.1). Given the fact that different lines might display a diverse developmental timeline, we also confirmed reproducibility of the protocol with a second cell line. We appreciate the reviewer’s suggestion to investigate additional protocol iterations, such as the proposed one at the PSM stage, as we move towards a better understanding of key developmental events during in vitro induction.

      Comment 2:  In the third paragraph of the result section the authors note, "Interestingly, SCX, a prominent tenogenic transcription factor, was significantly downregulated at the SCL stage compared to iPSC, but upregulated during the differentiation from SCL to SYN." Despite this increase, the expression level of SCX in SYN remains lower than that in iPSCs in Fig.1G and Fig.3C. Can the authors provide an explanation for this? Can the authors provide IF data using iPSCs and compare it with in vitro-induced SYN cells? Can the authors provide e.g. additional scRNA-seq data which could support this statement?

      Thank you for your comment. In Fig. 1G, SCX expression in SYN was upregulated compared to SCL; however, it was shown to be similar to iPSCs. This suggests a baseline stochastic expression of SCX possibly stemming from spontaneous differentiation of iPSCs in culture (Fig. 3C). Previous research has shown that tenogenic marker gene expression tends to decrease during postnatal tendon maturation (Yin et al., 2016b [14]; Grinstein et al., 2019 [15]). Yoshimoto et al (2022) utilized a transgenic mouse iPSC-SCX-GFP line to track SCX expression. It was shown that SCX expression peaked after 7d of tenogenic induction and then decreased at day 14, which marked the end of tenogenic induction. The authors postulated that this pattern of gene expression could either indicate further maturation of tenocytes at subsequent time points, or that the number of non-tenogenic cells increased from T7 to T14.

      In the present work, we showed SCX gene expression upregulation in SYN compared to SCL, as well as significant upregulation of TNMD, EGR1, COL1A1 and COL3A1 (Fig.1G). Supp. Fig.8 has been added to show feature plots of SCX and TNMD expression from SCL, SYN and SYNWNTi. The significant upregulation of later markers of tenogenic differentiation suggests that the 21 days of tenogenic induction might have matured the cells. Since gene expression analysis only conveys a snapshot of the transcriptional profile of a cell population, it is likely that we might have missed the peak of SCX upregulation (Supp. Fig. 5). Following treatment with the WNT inhibitor, the SYNWNTi group displayed increased SCX expression (% cells expressing SCX) compared to SYN, which might also be due to a more homogeneous population of syndetome-like cells following treatment with WNTi. In the SYNWNTi group, TNMD was shown to be expressed in the SYN cluster, whereas SCX was mostly found in the cluster that was labelled as fibrocartilage (FC) cluster based on the expression of COL2A1/SOX9/FN1/BGN/COL1A1 markers. Due to the fact that SCX+/SOX9+ progenitor cells are able to give rise to both tendon and cartilage (Sugimoto 2013) [16], it could be postulated that this cluster contains tendon progenitors. Interestingly, the FC cluster was not observed in the second iPSC line that we tested, which resulted in a more homogeneous induction to syndetome (78.5% vs. 66.9% SYN cells, Supp. Table 1 & Supp. Fig.3). This slight discrepancy between the two lines, and more specifically the presence of the FC cluster only in the 007i line, warrants further investigation. Taken together, these data indicate that the tenogenic induction duration could likely be shortened. Further work to assess the time course of SCX expression over the entire tenogenic induction could be used to further optimize the in vitro induction. For instance, a human edited iPSCSCX-GFP+ line could be generated and used to track SCX expression during the entire induction.

      Comment 3: In the fourth paragraph of the result section the authors state, "SM markers (MEOX1, PAX3) and SCL markers (PAX1, PAX9, NKX3.2, SOX9) were upregulated in a stepwise manner." However, the data for MEOX1 and NKX3.2 seems to be missing from Figure 3B-C. The authors should provide this data and/or additional support for their claim.

      Thanks for your comment. Feature plots for MEOX1 and NKX3.2 have been added to the Supplemental information (Supp. Fig. 9).

      Comment 4: In Figures 2B and 2E, the background of the red channel seems extremely high. Are there better images available, particularly for MEOX1? Given the expected high expression of MEOX1 in SM cells, the authors should observe a strong signal in the nucleus of the stained somitic mesoderm-like cells, but that is not the case in the shown figure. The authors should provide separate channel images instead of merged ones for clarity. The antibody which the authors used might not be specific. Can the authors provide images using an antibody which has been shown to work previously e.g. antibody by ATLAS (Cat#: HPA045214)?

      As requested by the reviewer, we have provided separate channels for those images in the Supplement (Supp. Fig. 7). The images show relatively high expression of these markers in SM cells.

      Comment 5: In Fig. 2C and Supplementary Fig. 1, the authors present data from immunofluorescence (IF) staining and FACS analysis using a DLL1 antibody. While FACS analysis indicates an efficiency of 96.2% for DLL1+ cells, this was not clearly observed in their IF data. How can the authors explain this discrepancy? Could the authors quantify their IF data and compare it with the corresponding FACS data?

      Thanks for your comment. We performed flow cytometric analysis of DLL1 expression to optimize cell seeding density using the 007i line. In the present study, we used IF only in a qualitative manner, that is to confirm protein expression of selected markers. It could be noted that the use of poly-lysine coated coverslips, which are needed for IF, might have slightly altered the density of the cells on the coverslip vs. the plate. Lastly, it cannot be ruled out that the different substrate could have influenced their phenotype differentially through matrix interactions and signaling. On the other hand, flow cytometry by nature is a quantitative and single cell approach, whereas IF staining is qualitative. Therefore, for the purpose of this proof-of-concept work, we tend to trust the quantitative data from the flow cytometry results more than semi-quantitative confirmation achieved through IF staining using coverslips. 

      Comment 6: In Fig. 2G, PAX9 is expected to be expressed in the nucleus, but the shown IF staining does not appear to be localized to the nucleus. Could the authors provide improved or alternative images to clarify this? The authors should use antibodies shown to work with high specificity as already reported by other groups.

      Thanks for your comment. Indeed, the staining seems to be mostly cytoplasmic. We used antibodies that were previously reported [3] and repeated the staining; however, the same results were obtained. We can speculate that this transcription factor has an additional role in the iPSC-derived cells and might be trafficked to the cytoplasm. Unfortunately, we have no evidence for this phenomenon.

      Comment 7: Why did the authors choose to display day 10 data for SYN induction in Fig. 4A? Could they provide information about the endpoint of their culture at day 21?

      Thank you for your comment. In Fig. 1G we provided gene expression analyses results for several selected early and later tendon markers for the endpoint of our culture, that is day 21. Following scRNA-seq at each stage of the differentiation (iPSC at d0, PSM at d4, SM at d8, SCL at d11 and the endpoint day 32 for SYN), we performed DEG analysis using the IPA platform. We identified activation of genes associated with the WNT signaling pathway in the off-target clusters. We hypothesized that WNT pathway inhibition might block the formation of unwanted fates and induce a more homogeneous differentiation outcome. We thus tested a WNT inhibitor and compared the inhibitor-treated group with a non-treated group. We then assessed selected neural markers during the course of the inhibitor application. In Fig. 4A we presented gene expression of key selected markers at day 21 using qPCR, which was approximately in the middle of the syndetome induction. Since we observed that the inhibitor downregulated the selected neural markers, we then applied the inhibitor until the endpoint of the initial induction and proceeded to analyze the results using scRNA-seq (Fig. 5). Lastly, it should be acknowledged that this was a proof-of-concept study, and additional optimizations are needed regarding the application of the inhibitor (timing, duration, concentration, etc).

      Comment 8: In Supplementary Fig. 5, the authors depicted the expression level of SCX, a SYN marker, which peaked at day 14 and then decreased. By day 21, it reached a level comparable to that of iPSCs. Given this observation, could the authors provide a characterization of the cells at day 21 during SYN induction using IF? What was the rationale behind selecting 21 days for SYN induction? The authors also need to show 'n numbers'; how many times were the experiments repeated independently (independent experiments)?

      Thanks for your comment. During the optimization process, we initially used RT-qPCR to track gene expression of selected tenogenic markers using the 007i line. We found that after 21 days of tenogenic induction there was upregulation of the few established tendon markers, that is COL1A1, COL3A1, EGR1 and quite importantly, the more definitive later tendon marker, TNMD. Thus, we decided to proceed with this protocol prior to testing other compounds including the WNT inhibitor WNT-C59. However, as has been discussed in the manuscript, this extended tenogenic induction resulted in cell attrition without the application of the WNT inhibitor. This phenomenon was ameliorated following WNT inhibition. Thus, it could be postulated that the protocol could be further optimized by shortening tenogenic induction to less than 21 days.

      The experiments that were conducted to optimize the differentiation process were repeated independently at least n=3 times using qPCR and IF using two lines, that is the 007i and the 83i line as described in the manuscript. The scRNAseq analysis represents a population of cells from in vitro differentiation that originated from the same donor line, therefore it was performed on n=1 sample at each stage. However, the effects of inhibitor application (sample SYNWNTi) were also confirmed using a second cell line (83i), thus a total of n=2 independent samples were analyzed.  

      Comment 9: Overall the shown immunofluorescence (IF) data does not appear convincing. Could the authors please provide clearer images, including separate channel images, a bright field image, and magnified views of each staining?

      Thanks for your comment. The separate-channel images were added to the supplemental data (Supp. Fig. 7). We agree with the reviewer regarding the limitations of IF staining, especially with the added confounding factor of using poly-lysine coated coverslips. We would like to point out that in the current work IF staining is not the main finding or the primary outcome measure, and that it is only used to further support the differentiation by providing a qualitative assessment of protein presence and localization. In this paper we describe our position regarding the limitations of IF and the need for more high-throughput, unbiased approaches to quantification when using IF staining. For instance, spatial transcriptomics combined with mass cytometry or flow cytometry could be used for a more unbiased approach. Thus, in the present manuscript we based our conclusions on the quantitative gene expression, single cell sequencing and flow cytometry data.

      Comment 10: As stated by the authors in the manuscript, another research group performed FACS analysis to assess the efficiency of syndetome induction using SCX antibody, and/or quantification of immunofluorescence (IF) with SCX, MKX, COL1A1, or COL2A1 antibodies. Could the authors conduct a comparative analysis of syndetome induction efficiency both before and after protocol optimization, utilizing FACS analysis in conjunction with an SCX reporter line or antibody staining, e.g. quantifying induction efficiency via immunofluorescence (IF) staining with syndetome-specific marker genes?

      Thank you for your comment. As discussed in a previous comment, we agree with the reviewer that the generation of a human iPSC-SCX-GFP line would shed light on SCX expression over the entire course of induction. In the current work we used IF as qualitative confirmation of specific marker expression and we showed the presence of SCX, MKX, COL1 and COL3 in SYNWNTi as well as the absence of neuronal markers. As we also pointed out in the present manuscript, IF can only be considered a semi-quantitative assessment, burdened with several technical limitations as well as operator bias and lower sensitivity and accuracy compared to flow cytometry or scRNA-seq, unless performed in a more unbiased manner. To further clarify this point: first, using poly-lysine coated coverslips for IF staining results in a different substrate environment compared to the Geltrex-coated plates that were used for the induction. Additionally, we noticed that cells grew overconfluent at the edges of the coverslips. This is an important point since, as we have observed in this work, seeding density is critical for the reproducibility of the protocol. It could further be postulated that a different cell substrate stiffness might also have an effect on this process. In our opinion, in this context IF should rather be used qualitatively and a combination of flow cytometry with scRNAseq should be utilized to draw quantitative conclusions such as induction efficiencies of a certain cell type. Since we also observed inconsistencies with the SCX antibodies we tested, the generation of edited human iPSC lines (such as SCX-GFP, MKX-GFP and TNMD-GFP) would be the preferred approach to further explore the efficiency of differentiation.

      Comment 11: To enhance the paper's significance, the authors should conduct functional validation experiments and proper assessment of their induced syndetome-like cells. They could perform e.g. xeno-transplantation experiments with syndetome cells into SCID-mice or injury models. They could also assess whether the in vitro induced cells could be applied for in vitro tendon/ligament formation.

      Thanks for your comment. For the purpose of this proof-of-concept in vitro study, our primary goal was to initially evaluate a stepwise tenogenic induction protocol using GMP-ready cell lines and chemically defined media. We then used the analytical power of scRNA-seq to characterize and optimize the protocol, focusing on one developmental stage that is not well understood, that of syndetome specification from sclerotome. We hypothesized that by fine-tuning the WNT pathway we would be able to generate a more homogeneous syndetome cell population. We fully agree with the reviewer that the warranted next steps should be to conduct several functional validation experiments, such as in vitro 2D/3D tendon/ligament formation and in vivo transplantation in allogeneic or xenogeneic injury models.

      Comment 12: The authors should also compare their scRNA-seq data with actual human embryo data sets, something which could be done given the recent increase in available human embryo scRNA-seq data sets.

      This is a great idea and intriguing study. Unfortunately, not all data sets are available at the moment and specifically embryonic and MSK scRNA-seq data is very scarce, although growing. We have no access to data sets from human tendon development, and thus will have to leave this comparison for future studies.

      Reviewer 3:

      Comment 1: The data outlining the differences between the differentiation outcome of the two tested iPSCs is intriguing, but the authors fail to comment on potential differences between the two iPSC lines that could result in drastically different cell outputs from the same differentiation protocol. This is a critically important point, as the majority of the SCX+ cells generated from the 007i cells using their WNTi protocol were found in the FC subpopulation that failed to form from the 83i line under the same protocol. From the analysis of only these 2 cell lines in vitro, it is difficult to assess whether this WNTi protocol can be broadly used to generate tenogenic cells.

      Thanks for your comment. This proof-of-concept study is the first investigation that compares data of an in vitro tenogenic induction protocol that has been tested in more than one cell line. Using unsupervised clustering we identified 11 clusters, which were classified into 6 cell subpopulations. The only observed difference between the two lines was a small subset that was labeled as fibrocartilage (FC), which displayed expression of both tenogenic and chondrogenic markers. This subpopulation was observed in the 007i line but not in the 83i line at the end of the SYN induction. Importantly, DEG analysis also showed that it was enriched for SCX. It has been shown that SCX+/SOX9+ progenitors are a distinct multipotent cell group, responsible for the development of SCX−/SOX9+ chondrocytes and SCX+/SOX9− tenocytes/ligamentocytes (Sugimoto et al., 2013) (16). As noted in a previous comment (Comment 2 from Reviewer 1), we might have missed SCX upregulation during the 21-day syndetome induction. This can be further supported by the Fig. 5E trajectory analysis, which shows that this subpopulation (FC) precedes the SYN cell subpopulation. The fact that this subpopulation was present in one line but not the other might indicate that the 83i line resulted in a more mature tendon population. Therefore, we would rather posit that in the case of the 83i line, it might not be that the FC subpopulation failed to form, but rather that it was missed in our scRNAseq endpoint analysis, which showed that a more homogeneous SYN population was formed (8.7 % in 007i vs. 0.26 % in 83i, Supp. Table 1 & Supp. Fig. 3B). Future studies are warranted to characterize the SYN induction timeline as it pertains to SCX expression followed up by maturation from tenogenic progenitor to tenocytes.

      Comment 2: The authors make claims to changes in protein expression but fail to quantify either fluorescence intensity or percent cell expression from their immunofluorescence analyses to substantiate these claims. These claims are not fully supported by the data as presented as it is unclear whether there is increased expression of tendon markers at the protein level or more cells surviving the protocol. Additionally, in images where 3 channels are merged, it would be helpful to show individual channels where genes are shown in similar spectra (ie. Fig 2I SCX/MKX). Furthermore, the current layout and labelling scheme of Figure 4 makes it very difficult to compare conditions between SYN and SYNWNTi protocols.

      Thanks for your comment. Protein expression at each stage was verified with immunofluorescence cytochemistry whereby cells were cultured onto poly-lysine coated coverslips, which were then fixed, stained and imaged (Fig. 2). However, prior to WNT inhibitor application, we noticed gradual cell attrition in the cultures at the end of differentiation (Fig. 1B, 2I). The images show qualitative differences with and without the WNT inhibitor. This could be attributed to the heterogeneity of the cell population at SCL stage, which was confirmed by scRNA-seq (Fig. 3A). As has been discussed previously (Reviewer 2 comments 5 & 9), in the current paper we did not provide any IF quantitative analysis because of the qualitative nature of the staining technique. In future work, other high-resolution modalities such as single cell proteomics, flow cytometry or mass cytometry will be considered in order to perform a more unbiased quantitative single cell analysis across different stages and samples. Furthermore, we have added single channel images in the supplemental information.

      Comment 3: Individual data points should also be presented for all qPCR experiments (ie. Fig 4A). Biological replicate information is missing from several experiments, particularly the immunofluorescence data, and it is unclear whether the qPCR data was generated from technical or biological replicates.

      Thanks for your comment. We have added additional information regarding replicates in each figure legend. We have also changed Fig. 4A.

      (1) Glaeser JD, Bao X, Kaneda G, et al. iPSC-neural crest derived cells embedded in 3D printable bio-ink promote cranial bone defect repair. Sci Rep. Nov 4 2022;12(1):18701. https://www.ncbi.nlm.nih.gov/pubmed/36333414

      (2) Kaneda G, Chan JL, Castaneda CM, et al. iPSC-derived tenocytes seeded on microgrooved 3D printed scaffolds for Achilles tendon regeneration. J Orthop Res. Oct 2023;41(10):2205-2220. https://www.ncbi.nlm.nih.gov/pubmed/36961351

      (3) Papalamprou A, Yu V, Chen A, et al. Directing iPSC differentiation into iTenocytes using combined scleraxis overexpression and cyclic loading. J Orthop Res. Jun 2023;41(6):1148-1161. https://www.ncbi.nlm.nih.gov/pubmed/36203346

      (4) Sheyn D, Ben-David S, Tawackoli W, et al. Human iPSCs can be differentiated into notochordal cells that reduce intervertebral disc degeneration in a porcine model. Theranostics. 2019;9(25):7506-7524. https://www.ncbi.nlm.nih.gov/pubmed/31695783

      (5) Später T, Kaneda G, Chavez M, et al. Retention of Human iPSC-Derived or Primary Cells Following Xenotransplantation into Rat Immune-Privileged Sites. Bioengineering. 2023;10(9):1049. https://www.mdpi.com/2306-5354/10/9/1049

      (6) Sareen D, O'Rourke JG, Meera P, et al. Targeting RNA foci in iPSC-derived motor neurons from ALS patients with a C9ORF72 repeat expansion. Sci Transl Med. Oct 23 2013;5(208):208ra149. https://www.ncbi.nlm.nih.gov/pubmed/24154603

      (7) Tsutsumi H, Kurimoto R, Nakamichi R, et al. Generation of a tendon-like tissue from human iPS cells. J Tissue Eng. Jan-Dec 2022;13:20417314221074018. https://www.ncbi.nlm.nih.gov/pubmed/35083031

      (8) Yoshimoto Y, Uezumi A, Ikemoto-Uezumi M, et al. Tenogenic Induction From Induced Pluripotent Stem Cells Unveils the Trajectory Towards Tenocyte Differentiation. Front Cell Dev Biol. 2022;10:780038. https://www.ncbi.nlm.nih.gov/pubmed/35372337

      (9) Matsuda M, Yamanaka Y, Uemura M, et al. Recapitulating the human segmentation clock with pluripotent stem cells. Nature. Apr 2020;580(7801):124-129. https://www.ncbi.nlm.nih.gov/pubmed/32238941

      (10) Wu CL, Dicks A, Steward N, et al. Single cell transcriptomic analysis of human pluripotent stem cell chondrogenesis. Nat Commun. Jan 13 2021;12(1):362. https://www.ncbi.nlm.nih.gov/pubmed/33441552

      (11) Nakajima T, Nakahata A, Yamada N, et al. Grafting of iPS cell-derived tenocytes promotes motor function recovery after Achilles tendon rupture. Nat Commun. Aug 18 2021;12(1):5012. https://www.ncbi.nlm.nih.gov/pubmed/34408142

      (12) Loh KM, Chen A, Koh PW, et al. Mapping the Pairwise Choices Leading from Pluripotency to Human Bone, Heart, and Other Mesoderm Cell Types. Cell. Jul 14 2016;166(2):451-467. https://www.ncbi.nlm.nih.gov/pubmed/27419872

      (13) Yu V, Papalamprou A, Sheyn D. Generation of Induced Pluripotent Stem Cell-Derived iTenocytes via Combined Scleraxis Overexpression and 2D Uniaxial Tension. JoVE. 2024/03/01 2024(205):e65837. https://app.jove.com/65837

      (14) Yin Z, Hu JJ, Yang L, et al. Single-cell analysis reveals a nestin(+) tendon stem/progenitor cell population with strong tenogenic potentiality. Sci Adv. Nov 2016;2(11):e1600874. https://www.ncbi.nlm.nih.gov/pubmed/28138519

      (15) Grinstein M, Dingwall HL, O'Connor LD, Zou K, Capellini TD, Galloway JL. A distinct transition from cell growth to physiological homeostasis in the tendon. Elife. Sep 19 2019;8. https://www.ncbi.nlm.nih.gov/pubmed/31535975

      (16) Sugimoto Y, Takimoto A, Akiyama H, et al. Scx+/Sox9+ progenitors contribute to the establishment of the junction between cartilage and tendon/ligament. Development. Jun 2013;140(11):2280-2288. https://www.ncbi.nlm.nih.gov/pubmed/23615282

    1. eLife Assessment

      The study describes a link between beta-amyloid monomers, regulation of microglial activity and assembly of neocortex during development. It brings valuable findings that have theoretical and practical implications in the field of neuronal migration, neuronal ectopia and type II lissencephaly. Unfortunately, the evidence is incomplete and the manuscript would benefit from additional experiments to clarify the relationship between Ric8a and APP and bolster the findings.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aim to elucidate the mechanisms that regulate the immune response under physiological conditions during cortical development. To this end, they used a wide range of mutant mice to analyse the consequences of immune activation for the formation of cortical ectopia.

      Strengths:

      The authors demonstrated that Abeta monomers are anti-inflammatory and inhibit microglial activation. This is a novel result that demonstrates the physiological role of APP in cortical development.

      The current manuscript has been slightly improved by additional experiments and editing of the text (many of the suggestions of the reviewers have not been included). However, the evidence supporting the conclusions of the study is still very weak and inconsistent.

      Remaining weaknesses:

      -There is no evidence that microglia express Emx1. The paper they referred to (Zhang et al., 2014) was performed in adult mice, so it is not comparable. Moreover, many other papers report that Emx1 is not expressed in microglia. Line 175: a change in cytokine expression is not strong evidence that Emx1 is expressed in microglia. Fig. S8: It is not clear whether the staining was performed on neuronal primary culture or cortical sections. It is also unclear why there is a partial reduction of Ric8a mRNA levels in Emx1-Ric8a cKO and not a complete deletion.

      -NestinCre and Emx1Cre mouse models target the same types of cells in the developing cortex (cortical progenitors, glutamatergic neurons and astrocytes), but with a one-day difference in expression (Emx1 E9.5 and Nestin E10.5). In fact, previous studies using the same approach (Nestin-Ric8a cKO) found ectopias in the cortex, which is more in line with the results of Emx1-Ric8a cKO shown in the current study. There is no evidence to assume that ric8a deficiency in neural cell lineages is not responsible for basement membrane degradation and ectopia formation in ric8a mutants.

      -Additional experiments should be performed to demonstrate that ectopia formation in Emx1-ric8a cKO mutant mice is due to an increase in immune stimulation and not a cell-autonomous effect. Using double cx3cr1-cre and nestin-cre ric8a mutant mice is not an argument to say that elevated immune activation of ric8a deficient microglia during cortical development is responsible for ectopia formation (line 2012-2013)

      -The similarities between Ric8a cKO and APP cKO mice are not enough evidence to claim that APP and Ric8a are involved in the same anti-inflammatory pathway in microglia.

      -Gel zymography is not the same as Western blot. For the quantification of the relative amount of protein, authors should use western blot and not immunofluorescence intensity as shown in Fig. 5g, h. For western blot, you also load the same amount of protein but you have to normalize your samples with a control protein.

      -The graph of BrdU cell distribution in the mutant mice (Fig. S1 F) shows that there are more BrdU cells in bins 5-7 and less in bin 9, indicating an impaired migration of upper cortical neurons in the mutant mice. The authors claimed there are no differences in migration in the result section but the figure showed significant differences. Panels E, F in Fig S2 show the density of Cux1 and Ctip2 cells per area indicating no changes in the generation of upper and lower cortical neurons, but no information about the migration as authors claimed (lines 117-118). (what is the field for Ctip2 counting?). These experiments cannot rule out the possibility of cell-autonomous effect of Ric8a deletion in glutamatergic neurons or radial glial cells.

    3. Reviewer #2 (Public review):

      Kwon et al. used several conditional KO mice for the deletion of ric8a or app in different cell types. Some of them exhibited pial basement membrane breaches leading to neuronal ectopia in the neocortex.

      I am glad to see that the authors performed some of the requested controls.

      However, a huge problem with this manuscript which has been highlighted in the reviewer's comments but not corrected by the authors, is the claim that "A novel monomeric amyloid beta-activated signaling pathway regulates brain development". They do not have any proof that Abeta is the activating signal in vivo. Whatever they showed in vitro should be confirmed in vivo to make such a strong claim. The authors even recognized it in their responses to reviewers: "we currently do not have evidence that in the developing cortex Abeta monomers play a role in inhibiting microglia". Therefore, their title is misleading, not supported by the data, and must be changed to reflect accurately the results. Maybe something like "Involvement of microglia in the formation of cortical ectopia".

      The abstract is also misleading and must be changed. The abstract is mostly about Abeta, pretending that this is the key part of their findings while they only provide a few in vitro experiments but nothing in vivo.

      This is such a bad way to summarize their data. Most of their in vivo data is about Ric8a, then a smaller in vivo part about APP and nothing about Abeta in vivo. But the title "novel monomeric amyloid beta-activated signaling pathway regulates brain development via inhibition of microglia" only mentions Abeta. And the Abstract 90% focuses on Abeta.

      The first half of the introduction is about Abeta. Why would they focus their paper on Abeta while they basically have only one figure with in vitro data? This is so deceptive.

      It seems that these authors do not fully understand the importance of having their claims supported by solid data.

      (1) The authors did not show in vivo data supporting that Abeta monomers are the key players here.

      (2) The authors did not show in vivo data supporting the cytokine secretion data provided in vitro in a model system. They claim that it is not technically feasible to extract the extracellular (secreted) fractions of cytokines from an embryonic brain without causing cell lysis and the release of the intracellular pool. But how about RT-qPCR? After all, they showed that the pathway affects the transcription of several cytokines in microglia in vitro.

      (3) The authors did not provide a control experiment to show that the insult induced by LPS injection does not induce the phenotype in the ric8a-foxg1-cre mice.

      (4) They did not agree to verify the monomer state of their Abeta monomer preparation, even after addition to the culture medium. Abeta has a strong tendency to polymerize. However, because the authors added the requested result with Abeta polymers, which gave a different outcome, it is OK with me if they don't do it.

      (5) The app-cx3cr1-cre +LPS animals show ectopia only in subsets of mutants and in most cases only in one of the hemispheres. Experiments examining potential changes in MMP9 are therefore difficult and were not done.

      I don't mind the inability to perform all the suggestions from the reviewers but it is then necessary to tone down or remove the claims that are not supported by the data.

      This kind of issue appears several times later in the text too:

      (1) At the end of the introduction "we found that APP and Ric8a form a pathway in microglia that is specifically activated by the monomeric form of Abeta and that this pathway normally inhibits the transcriptional and post-transcriptional expression of immune cytokines by microglia". Data from Abeta and cytokines are only in vitro, so it has to be specified.

      (2) Line 282: "Thus, these results indicate that monomeric Abeta possesses a previously unreported anti-inflammatory activity against microglia that strongly inhibits microglial inflammatory activation". Specify in vitro!

      (3) Line 322: "We have shown that heightened microglial activation due to mutation in the Abeta monomer-activated APP/Ric8a pathway results in basement membrane degradation and ectopia during cortical development." This is an overstatement. They did not show that Abeta monomers activate the pathway in vivo.

      (4) Line 332: "Thus, these results indicate that excessive inflammatory activation of microglia is responsible for ectopia formation in ric8a mutants." This is incorrect. Inhibition of Akt or stat3 does much more than just being pro-inflammatory. This could directly affect migration. The data only show that Akt and/or Stat3 might be involved.

      (5) Line 355: "these results indicate this Abeta monomer-regulated anti-inflammatory pathway normally promotes cortical development through suppressing microglial activation and MMP induction.". Another overstatement. There is no proof that Abeta is involved in vivo.

      (6) Line 362: "In this article, we have identified a novel microglial anti-inflammatory pathway activated by monomeric Abeta that inhibits microglial cytokine expression and plays essential roles in the normal development of the cerebral cortex". Another overstatement. There is no proof that Abeta is involved in vivo.

      (7) Line 365: "this pathway is mediated by APP and the heterotrimeric G protein GEF and molecular chaperone Ric8a in microglia and its activation leads to..." They should mention that its activation was in vitro.

      (8) Line 387: "In this study, we have shown that immune over-activation of microglia deficient in a monomeric Ab-regulated pathway results in excessive cortical matrix proteinase activation, leading basement membrane degradation and neuronal ectopia." Another overstatement. There is no support to claim that Abeta is involved in vivo. The immune overactivation was not shown in vivo but only in vitro in a model system that does not even reflect correctly what is happening in vivo due to chronic immune stimulation during in vitro culture.

      (9) Line 396: "we have also shown that the anti-inflammatory regulation of microglia in corticogenesis depends on a pathway composed of APP and the heterotrimeric G protein regulator Ric8a." Overstatement. They only showed the anti-inflammatory regulation in vitro and not during corticogenesis.

      It is just a matter of rewriting the title, abstract and text in an honest way, in order to make sure that every claim is supported by the data and in some cases acknowledge the weakness of the provided data and describe the multiple interpretations that could be drawn from them.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors aim to elucidate the mechanisms that regulate the immune response under physiological conditions during cortical development. To this end, they used a wide range of mutant mice to analyse the consequences of immune activation for the formation of cortical ectopia.

      Strengths:

      The authors demonstrated that Abeta monomers are anti-inflammatory and inhibit microglial activation. This is a novel result that demonstrates the physiological role of APP in cortical development.

      Weaknesses:

      -On the other hand, cortical ectopia has been already described in mouse models in which the amyloid signalling has been disrupted (Herms et al., 2004; Guenette et al., 2006), making the current study less novel.

      We agree these previous studies have implicated amyloid precursor protein in cortical ectopia. However, since these studies use whole-body knockouts, they have not implicated the functional roles of specific cell types.  Nor have they identified the specific mechanisms underlying the formation of this unique class of cortical ectopia. In contrast, our studies show that the disruption of a novel Abeta-regulated signaling pathway in microglia is the primary cause of ectopia formation in this class of ectopia mutants. This is the first time that microglia have been specifically implicated in the development of cortical ectopia. We further show that elevated MMP activity and resulting cortical basement membrane degradation is the underlying mechanism leading to ectopia formation.  This is also the first time that MMP activity and basement membrane degradation (instead of maintenance) have been implicated in cortical ectopia development. As such, our results have provided novel insights into the diverse mechanisms underlying cortical ectopia formation in developmental brain disorders.

      One of the molecules analysed is Ric8a, a GTPase activator involved in neuronal development. Authors used the conditional mutant mice Emx1-Ric8a to delete Ric8a from early progenitors and glutamatergic neurons in the pallium. Emx1-Ric8a mutant mice present cortical ectopias and authors attributed this malformation to the increase in inflammatory response due to Ric8a deletion in microglia. Several discordances do not fit this interpretation:

      - The role of Ric8a in cortical development and function has been already described in several papers, but none of them has been cited in the current manuscript (Kask et al., 2015, 2018; Ruisu et al., 2013; Tonissoo et al., 2006).

      We have included references to the published works on ric8a in cortical development in the revision.

      - Ectopia formation in the cortex has been already described in Nestin-Ric8a cKO mice (Kask et al., 2015). In the current manuscript, authors analyzed the same mutant mice (Nestin-Ric8a), but they did not detect any ectopia. Authors should discuss this discordance.

      The expression pattern of nestin-cre is known to vary depending on factors including transgene insertion site, genetic background, and sex. Early studies show, for example, that the nestin gene promoter drives cre expression in many non-neural tissues in another transgenic line in the FVB/N genetic background (Dubois et al., Genesis. 2006 Aug;44(8):355-60. doi: 10.1002/dvg.20226). The specific nestin-cre line used in Kask et al., 2015 has also been shown to be active in brain microglia and to lead to increased microglial pro-inflammatory activity upon breeding to a conditional allele of a cholesterol transporter gene (Karasinska et al., Neurobiol Dis. 2013 Jun;54:445-55; Karasinska et al., J Neurosci. 2009 Mar 18;29(11):3579-3589; Takampri et al., Brain Res. 2009 May 13;1270:10-8). These factors may in part underlie the apparent discrepancy. We have now incorporated this discussion into the revision.

      - Authors claim that microglia express Emx1, and therefore, Ric8a is deleted in microglia cells. However, the arguments for this assumption are very weak and the evidence suggests that this is not the case. This is an important point considering that authors want to emphasise the role of Ric8a in microglia activation, and therefore, additional experiments should demonstrate that Ric8a is deleted in microglia in Emx1-Ric8a mutant mice.

      We have observed altered mRNA expression of several genes in purified microglia cultured from the emx1-cre mutants (Supplemental Fig. 8), which indicates that ric8a is deleted from microglia and suggests a role of microglial ric8a deficiency in ectopia formation.  This interpretation is further strengthened by the observation that deletion of ric8a from microglia using a microglia-specific cx3cr1-cre results in similar ectopia (Fig. 2). We also have other data supporting this interpretation, including data showing induction of the expression of a cre reporter in brain microglia by emx1-cre and loss of ric8a gene expression in microglia cells isolated from emx1-cre mutants. These data have now been incorporated into the text and in revised Supplemental Fig. 8 (new panels c-c” & d).

      Reviewer #2 (Public Review):

      Kwon et al. used several conditional KO mice for the deletion of ric8a or app in different cell types. Some of them exhibited pial basement membrane breaches leading to neuronal ectopia in the neocortex.

      They first investigated ric8a, a Guanine Nucleotide Exchange Factor for Heterotrimeric G Proteins. They observed the above-mentioned phenotype when ric8a is deleted from microglia and neural cells (ric8a-emx1-cre or dual deletion with cre combination cx3cr1 (in microglia) and nestin (in neural cells)) but not in microglia alone or neural cells alone (whether it is in CR cells (ric8a-Wnt3a-cre), post-mitotic neurons (nex-cre or dlx5/6-cre), or in progenitors and their progeny (nestin-cre or foxg1-cre)). They also show that ric8a KO mutant microglia cells stimulated in vitro by LPS exhibit an increased TNFa, IL6 and IL1b secretion compared to controls (Fig 2). They therefore injected LPS in vivo and observed the neuronal ectopia phenotype in the ric8a-cx3cr1-cre (microglial deletion) cortices at P0 (Fig 2). They suggest that ric8a KO in neuronal cells mimics immune stimulation (but we have no clue how ric8a KO in neural cells would induce immune stimulation).

      We agree we do not currently know the precise mechanisms by which mutant microglia are activated in the mutant brain.  However, this does not affect the conclusion that deficiency in the Abeta monomer-regulated APP/Ric8a pathway in microglia is the primary cause of cortical ectopia in these mutants, since we have shown that genetic disruption of this pathway in microglia alone, by targeting different pathway components with cell type specific cre in several different approaches, results in similar cortical ectopia phenotypes in every case.  Regarding the source of the immunogens, there are several possibilities which we plan to investigate in future studies. For example, the clearance of apoptotic cells and associated cellular debris is an important physiological process and deficits in this process have been linked to inflammatory diseases throughout life (Doran et al., Nat Rev Immunol. 2020 Apr;20(4):254-267; Boada-Romero et al., Nat Rev Mol Cell Biol. 2020 Jul;21(7):398-414.).  In the embryonic cortex, studies have shown that extensive cell death takes place starting as early as E12 (Blaschke et al., Development. 1996 Apr;122(4):1165-74; Blaschke et al., J Comp Neurol. 1998 Jun 22;396(1):39-50).  Studies have also shown that radial glia and neuronal progenitors play critical roles in the clearance of apoptotic cells and associated cellular debris in the brain (Lu et al., Nat Cell Biol. 2011 Jul 31;13(9):1076-83; Ginisty et al., Stem Cells. 2015 Feb;33(2):515-25; Amaya et al., J Comp Neurol. 2015 Feb 1;523(2):183-96). Moreover, Ric8a-dependent heterotrimeric G proteins have been found to specifically promote the phagocytic activity of both professional and non-professional phagocytic cells (Billings et al., Sci Signal. 2016 Feb 2;9(413):ra14; Preissler et al., Glia. 2015 Feb;63(2):206-15; Pan et al. Dev Cell. 2016 Feb 22;36(4):428-39; Flak et al. J Clin Invest. 2020 Jan 2;130(1):359-373; Zhang et al., Nat Commun. 2023 Sep 14;14(1):5706).  
Thus, it is probable that the failure of mutant radial glia to promptly clear apoptotic cells and debris may play a role in triggering mutant microglial activation in ric8a-emx1-cre mutants. We have now included these possibilities in the text of the revised manuscript. The precise mechanisms remain to be determined in future studies, but this does not affect the conclusion of the current study.

      The authors then turned their attention on APP. They observed neuronal ectopia into the marginal zone when APP is deleted in microglia (app-cxcr3-cre) + intraperitoneal LPS injection (they did not show it, but we have to assume there would not be a phenotype without the injection of LPS) (Fig 3). (The phenotype is similar but not identical to ric8a-cx3cr1-cre + LPS. They suggest that the reason is because they had to inject 3 times less LPS due to enhanced immune sensitivity in this genetic background but it is only a hypothesis). After in vitro stimulation by LPS, app mutant microglia show a reduced secretion of TNFa and IL6 but not IL1b (this is the opposite to ric8a-cx3cr1-cre microglia cells) while peritoneal macrophages in culture show increased secretion of TNFa, IL1, IL6 and IL23 (fig 3 and Suppl. Fig 9).

      We have data showing that app-cx3cr1-cre mutants without LPS injection do not show ectopia, which has now been included in the revised supplemental Fig. 9 (new panels c-d).  The reason we employ LPS injection is, in the first place, that we do not see a phenotype without the injection. We agree, and have also stated in the text, that the phenotype of the app mutants is not as severe as that of the ric8a mutant.  Besides the low LPS dosage used, we also suggest that other app family members may compensate, since the ectopia in the app family gene mutants reported previously was only observed in app/aplp1/2 triple knockouts, not even in any of the double knockouts (Herms et al., 2004). We have further clarified this point in the text. These possibilities are also not mutually exclusive. Nonetheless, the results clearly show that microglia specific app mutation causes cortical ectopia upon embryonic immune stimulation. They thus implicate a specific role of microglial APP in cortical ectopia formation.

      The different response of ric8a and app mutant microglia to LPS results from in vitro culturing of microglia. We have shown that, when acutely isolated macrophages are used, these mutants show changes in the same direction (both increased cytokine secretion) (Fig. 4).  This demonstrates that, without culturing, app mutant microglial lineage cells indeed behave in the same way as ric8a mutant cells.

      The microglia used for analysis in in vitro assays in this study have all been cultured for two weeks before assay. They have thus been under chronic stimulation, exposed to dead cells and debris in the culture dish throughout this period.  Previous studies have shown that, depending on the degree of perturbation to the inflammation-regulating pathways, such exposures can differentially affect microglial cytokine expression, sometimes in the opposite direction from that expected.  For example, under chronic immune stimulation, while the trem2+/- microglia, which are heterozygous mutant for the anti-inflammatory Trem2, show elevated pro-inflammatory cytokine expression (as is expected), trem2-/- (null) microglia under the same conditions not only do not show increases but, for some pro-inflammatory cytokines, actually show decreases in expression (Sayed et al., Proc Natl Acad Sci U S A. 2018 Oct 2;115(40):10172-10177).  In several systems, Ric8a-dependent heterotrimeric G proteins have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  Indeed, the APP cytoplasmic domain is known to also bind to and signal through several other proteins including FE65, Mena, and TIP60 (Cao & Sudhof, Science 2001. 293:115-120).  It is likely that in microglia Ric8a-dependent heterotrimeric G proteins may also mediate only a subset of the signaling downstream of APP.  As such, app knockout in microglia may have more severe effects on microglial anti-inflammatory regulation than ric8a knockout.  
As a result, upon chronic immune activation, app knockout may lead to a microglial phenotype similar to the trem2 null mutation phenotype as discussed above, while ric8a knockout leads to a phenotype similar to the trem2+/- phenotype. This may explain the subdued TNF and IL6 secretion by cultured app (but not ric8a) mutant microglia.

      Amyloid beta (Ab) being one of the molecules binding to APP, the authors showed that Ab40 monomers (they did not test Ab40 oligomers) partially inhibit cytokines (TNFa, IL6, IL1b, MCP-1, IL23a, IL10) secretion in vitro by microglia stimulated by LPS but does not affect secretion by microglia from app-cx3cr1-cre (tested for TNFa, IL6, IL1b, IL23a, IL10) (Fig 4, Suppl fig 10) (but still does it in aplp2-cx3cr1-cre) and does not affect secretion by ric8a-cx3cr1-cre microglia (tested for TNFa and IL6 but still suppress IL1b) (Therefore here is another difference between app and ric8a KO microglia).

      We have tested the effects of Abeta40 oligomers, which induce instead of suppressing microglial cytokine secretion, and have included the data (new panel j in supplemental Fig. 10).  As mentioned above, in several systems, Ric8a-dependent heterotrimeric G proteins have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  We assume that this is likely also true in microglia and that Ric8a-dependent heterotrimeric G proteins may mediate a subset and only a subset of the signaling downstream of APP.  This may explain the difference in the effects of app and ric8a knockout mutation in abolishing the anti-inflammatory effects of Abeta monomers on IL-1b vs TNF/IL-6.  This difference also suggests that TNF/IL-6 and IL-1b secretion must be regulated by different mechanisms in microglia. Indeed, it is well established in immunology that the secretion of IL1b, but not of TNF or IL6, is regulated by inflammasome-dependent mechanisms (see, for example, Broz & Dixit. Nat Rev Immunol. 2016 Jul;16(7):407-20. doi: 10.1038/nri.2016.58).

      The authors injected inhibitors of Akt or Stat3 in the ric8a-emx1-cre cortex and found it suppressed neuronal ectopia (Fig 5, Suppl fig 11). It is not clear whether it suppresses immune stimulation from neuronal cells or immune reaction from microglia cells.

      We agree that, at present, the pharmacological approaches we have taken are not able to distinguish these possibilities.  However, no matter which is the case, our results still implicate a role of excessive microglial activation in the formation of cortical ectopia and support the conclusion of the study.  Thus, while worthy of further investigation, this question does not impact the conclusion of the current study. Furthermore, as mentioned, we plan to determine the mechanisms of how ric8a mutation in neural cells induces immune activation in future studies. These results will likely enable us to more specifically address this question.

      Finally, the authors examined the activities of MMP2 and MMP9 in the developing cortex using gelatin gel zymography. The activity and protein levels of MMP9 but not MMP2 in the ric8a-emx1-cre cortex were claimed to be significantly increased (Fig 5, Suppl fig 12). Unfortunately, they did not show it in the app-cx3cr1-cre +LPS mouse. They make a connection between ric8a deletion and MMP9 but unfortunately do not make the connection between app deletion and MMP9, which is at the center of the pathway claimed to be important here. Then they injected BB94, a broad-spectrum inhibitor of MMPs, or an inhibitor specific for MMP9 and 13. They both significantly suppress the number and the size of the ectopia in ric8a mutants (Fig 5).

      For all the gelatin gel zymography analysis, we quantify protein concentrations in the cortical lysates using the Bio-Rad Bradford assay kit and load the same amounts of proteins per lane. The results across lanes are all directly comparable. From the quantification, our results clearly show that MMP9 activity levels are increased in the mutants (we have now included whole gel images and quantification in a new supplemental Figure 13).  The similar levels of MMP2 in all lanes also provide an internal control further supporting the observation of a specific change in MMP9.  For this analysis, we focus on the ric8a-emx1-cre mutants since the app-cx3cr1-cre +LPS animals show ectopia in only a subset of mutants and in most cases only in one of the hemispheres.  Experiments examining potential changes in MMP9 are therefore unlikely to yield meaningful results.  On the other hand, we have clearly shown that the administration of different classes of MMP inhibitors significantly suppresses ectopia in ric8a-emx1-cre mutants. This strongly implicates a functional contribution of MMPs.

      After reading the manuscript, I still do not know how ric8a in neural cells is involved in the immune inhibition. Is it through the control of Ab monomers? In addition, the authors did not show in vivo data supporting that Ab monomers are the key players here. As the authors said, this is not the only APP interactor. Finally, I still do not know how ric8a is linked to APP in microglia in the model.

      As detailed above, there are several possibilities including potential deficits in the clearance of apoptotic cells and associated debris that may trigger microglial activation in ric8a-emx1-cre mutants. We will investigate these possibilities in future studies.  We have now incorporated these possibilities in the revised text.  As for the role of Abeta monomers, we have indicated that we currently do not have evidence that in the developing cortex Abeta monomers play a role in inhibiting microglia.  We have also indicated in the manuscript that our conclusion is that a microglial signaling pathway that is activated by Abeta monomers in vitro regulates normal brain development in vivo, not that Abeta monomers themselves regulate brain development.  Regarding the link between Ric8a and APP, the reviewer has missed several major lines of supporting evidence. For example, we have shown that Abeta monomers activate a pathway in microglia that inhibits the secretion of several proinflammatory cytokines including TNF, IL-6, IL-10, and IL-23 (Figure 4 and Supplemental Figures 8-10).  This inhibition is abolished when either the app or the ric8a gene is deleted from microglia.  This clearly indicates that app and ric8a act in the same genetic pathway (the pathway activated by Abeta monomers) in microglia. We also show that this Abeta monomer-activated pathway inhibits the transcription of several cytokines in microglia.  This inhibition is also abolished when either the app or the ric8a gene is deleted from microglia.  This reinforces the conclusion that app and ric8a act in the same pathway in microglia.  Furthermore, cell type specific deletion of app or ric8a from microglia in vivo also results in similar phenotypes of cortical ectopia. Together, these results strongly support the conclusion that app and ric8a act in the same pathway that is activated by Abeta monomers in vitro in microglia. 
This conclusion is also consistent with published findings that Ric8a dependent heterotrimeric G proteins bind to APP and mediate subsets of APP signaling across different species (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).

      While several of the findings presented in this manuscript are of potential interest, there are a number of shortcomings. Here are some suggestions that could improve the manuscript and help substantiate the conclusions:

      (1) As the title suggests it, the focus is on Ab and APP functions in microglia. However, the analysis is more focused on ric8a. The connection between ric8a and APP in this study is not investigated, besides the fact that their deletion induces somewhat similar but not identical phenotypes. Showing a similar phenotype is not enough to conclude that they are working on the same pathway. The authors should find a way to make that connection between ric8a and app in the cells investigated here.

      As discussed above, the reviewer misses several major lines of evidence showing that APP and Ric8a act in the same pathway in microglia.  Besides the similarity of the ectopia phenotypes, for example, we have shown that Abeta monomers activate a pathway in microglia that inhibits the secretion of several proinflammatory cytokines including TNF, IL-6, IL-10, and IL-23 (Figure 4 and Supplemental Figures 8-11).  These inhibitory effects are abolished when either the app or the ric8a gene is deleted from microglia.  This clearly indicates that app and ric8a act in the same genetic pathway, a pathway that is activated by Abeta monomers in vitro, in microglia. We also show that this Abeta monomer-activated pathway inhibits the transcription of several cytokine genes in microglia.  These effects are again abolished when either the app or the ric8a gene is deleted from microglia.  This further reinforces the conclusion that app and ric8a act in the same pathway in microglia.  In addition, we show that the same results hold in macrophages.  Thus, these results strongly support the conclusion that app and ric8a act in the same genetic pathway in microglia. This conclusion is also consistent with published findings that Ric8a dependent heterotrimeric G proteins biochemically bind to APP and mediate subsets of APP signaling across different species (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).

      (2) This would help to show the appearance of breaches in the pial basement membrane leading to neuronal ectopia; to investigate laminin debris, cell identity, Wnt pathway for app-cxcr3-cre + LPS injection as you did for ric8a-emx1-cre.

      We have now provided further data on pial basement membrane breaches in the app-cx3cr1-cre + LPS animals (new panels e-f” in supplemental Fig 9).  We have not observed any changes in cell identity or Wnt pathway activity in ric8a-emx1-cre mutants.  It is thus of limited value to examine potential changes in these areas in the app-cx3cr1-cre + LPS animals.

      (3) As a control, this would help to show that app-cxcr3-cre without the LPS injection does not display the phenotype.

      We have the data on app-cx3cr1-cre mutants without LPS injection, which show no ectopia.  We have now included the data in the revised supplemental Fig. 9 (new panels c-d).

      (4) This would help to show the activity and protein levels of MMP9 and MMP2 and perform the rescue experiments with the inhibitors in the app-cx3cr1-cre cortex +LPS.

      As discussed above, we focus the analysis on the ric8a-emx1-cre mutants since app-cx3cr1-cre +LPS animals show ectopia in only a subset of mutants and in most cases only in one of the hemispheres.  Determining potential changes in MMP9 levels and the effects of MMP inhibitors in these animals is therefore not likely to yield meaningful data.  On the other hand, we have shown that MMP9 levels are increased and that administration of different classes of MMP inhibitors eliminates cortical ectopia in ric8a-emx1-cre mutants.  We have also shown a similar break in the basement membrane in app-cx3cr1-cre +LPS animals (new panels e-f” in supplemental Fig 9). These results together strongly implicate a role played by MMPs.

      (5) Is MMP9 secreted by microglia cells or neural cells?

      Our in situ hybridization data show MMP9 is most highly expressed in a sparse microglia-like cell population in the embryonic cortex, suggesting that microglia may be a major source of MMP9. We have incorporated these data in a new supplemental Fig. 12 (panel a). The precise identity of these cells, however, requires further validation.

      (6) The in vitro evidence indicates that one of the multiple APP interactors, ie Ab40 monomers, is less effective in suppressing the expression of some cytokines by microglia cells mutants for ric8a (TNFa and IL6 but still suppress IL1b) or APP (TNFa, IL6, IL1b, IL23a, IL10) when compared to WT. But there are other interactors for APP. In order to support the claim, it seems crucial to have in vivo data to show that Ab40 monomers are the molecules involved in preventing the breach in the pial basement membrane.

      As addressed in detail above, we have indicated that our conclusion is that a microglial signaling pathway that is activated by Abeta monomers in vitro regulates normal brain development in vivo, not that Abeta monomers themselves regulate brain development in vivo.  We currently do not have evidence that the Abeta monomers play a role in inhibiting microglia during cortical development.  There are candidate ligands for the pathway in the developing cortex, the functional study of which, however, is a major undertaking beyond the scope of the current study.

      (7) In order to claim that this is specific to Ab40 monomers and not oligomers, it is necessary to show that the Ab40 oligomers do not have the same effect in vitro and in vivo. Also, an assay should be done to show that your Ab preparations are pure monomers or oligomers.

      We have tested the effects of Abeta40 oligomers, which induce instead of suppressing microglial cytokine secretion, and have included these data in revision in a new panel j in supplemental Fig. 10. The protocols we use in preparing the monomers and oligomers are standard protocols employed in the field of Alzheimer’s disease research. They have been repeatedly optimized and validated over the past decades.  

      (8) Most of the cytokine secretion assays used microglia cells in culture. Two results draw my attention. Ric8a deletion increases TNFa and IL6 secretion after LPS stimulation in vitro on microglia cells while app deletion decreases their secretion. Then later, papers show that the decrease in IL1b induced by Ab on microglia cells is prevented by APP deletion but not ric8a deletion. Those two pieces of data suggest that ric8a and APP might not be in the same pathway. In addition, the phenotype from app-cxcr3-cre + LPS injection and ric8a-cxcr3-cre + LPS injection are not exactly the same. It could be due to the level of LPS as the author suggests or it might not be. More experiments are needed to prove they are in the same pathway.

      As discussed above, the reviewer misses several major lines of evidence, which strongly support the conclusion that APP and Ric8a act in the same pathway activated by Abeta monomers in microglia (see detailed discussion in point 1 above).  The differential response of TNFa/IL-6 of app and ric8a mutant microglia likely results from chronic immune stimulation during in vitro culturing, which is known to alter microglial cytokine response (see detailed discussion in point 9 below). We have demonstrated that this is indeed the case by showing that, without culturing, acutely isolated app and ric8a mutant macrophages both display elevated TNFa/IL-6 secretion (Figure 4). 

      Regarding the different regulation of TNF/IL-6 vs IL-1b by APP and Ric8a, as discussed above, in several systems, Ric8a-dependent heterotrimeric G proteins (which are degraded in ric8a mutant cortices, see new supplemental Fig. 9) have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  This is likely also the case in microglia, and Ric8a-dependent heterotrimeric G proteins may mediate only a subset of the anti-inflammatory signaling activated by APP.  As such, app mutation may abolish all the inhibitory effects of Abeta monomers (both those on TNF/IL-6 and those on IL-1b), but ric8a mutation may abolish only a subset (those on TNF/IL-6 but not those on IL-1b).  This also suggests that the secretion of TNF/IL-6 and IL-1b must be regulated by different mechanisms in microglia.  Indeed, it is well established in immunology that the secretion of IL1b, but not that of TNF or IL6, is regulated by inflammasome-dependent mechanisms (see, for example, Broz & Dixit. Nat Rev Immunol. 2016 Jul;16(7):407-20. doi: 10.1038/nri.2016.58).

      (9) How do the authors reconcile the reduced TNFa and IL6 secretion upon stimulation of app mutant microglia with the model where app is attenuating immune response in vivo? Line 213 says that microglia exhibit attenuated immune response following chronic stimulation but I don't know if 3 hours of LPS in vitro is a chronic stimulation.

      The reviewer has misunderstood.  The microglia used in this study have all been cultured in vitro for approximately two weeks before assay. They have thus been under chronic stimulation, exposed to dead cells and debris in the culture dish.  Depending on the degree of perturbation to the inflammation-regulating pathways, such exposures are known to change microglial cytokine expression, sometimes in the opposite direction from that expected.  For example, under chronic immune stimulation, while the trem2+/- microglia, which are heterozygous mutant for the anti-inflammatory Trem2, show elevated pro-inflammatory cytokine expression, trem2-/- (null) microglia under the same conditions not only do not show increases but, for some pro-inflammatory cytokines, actually show decreases in expression (Sayed et al., Proc Natl Acad Sci U S A. 2018 Oct 2;115(40):10172-10177).  As mentioned, in several systems, Ric8a-dependent heterotrimeric G proteins have also been shown to bind to APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9). Thus, it is likely that in microglia, Ric8a-dependent heterotrimeric G proteins also mediate only a subset of the anti-inflammatory signaling activated by APP.  As such, app knockout in microglia may have more severe effects than ric8a knockout on microglial immune activation, resembling the relationship between trem2 null vs heterozygous mutation discussed above. It is therefore predicted that chronic immune stimulation such as in vitro culturing will result in attenuated pro-inflammatory cytokine expression in app mutant microglia but elevated cytokine expression in ric8a mutant microglia. 
This may explain why TNF and IL6 secretion by cultured app mutant microglia is subdued, but acutely isolated app mutant macrophages instead show increased cytokine secretion. The latter may be more representative of the response of app mutant microglia in the absence of chronic stimulation.

      (10) Line 119: In their model, the authors suggest that there is a breach in pial basement membrane but that the phenotype is different from the retraction of the radial fibers due to reduced adhesion. So, could the author discuss to what substrate the radial fibers are attached to, in their model where the pial surface is destroyed?

      Radial glial endfeet normally bind to the basement membrane via cell surface receptors including the integrin and the dystroglycan protein complexes. We observe free radial glial endfeet at the breach sites, apparently without attachment to any basement membrane.  However, we cannot exclude the possibility that there may be residual, broken-off basement membrane components bound to the endfeet that are not detected by the methodology employed. 

      (11) The authors should show that the increased cytokine secretion observed in vitro is also happening in vivo in ric8a-emx1-cre compared to WT mice and compared to ric8a-nestin-cre mice. Or when app is deleted in microglia (app-cxcr3-cre) + LPS injection compared to WT mice +LPS.

      Unfortunately, this is not technically feasible since it is not possible to extract the extracellular (secreted) fractions of cytokines from an embryonic brain without causing cell lysis and the release of the intracellular pool.  This, however, does not affect our conclusion that the Abeta monomer-regulated microglia pathway plays a key role in regulating normal brain development, since its genetic disruption, by different approaches, clearly results in brain malformation.

      (12) The authors injected inhibitors of Akt or Stat3 in the ric8a-emx1-cre cortex and found that it suppressed neuronal ectopia (Fig 5, Suppl fig 11). Does it suppress immune stimulation from neuronal cells or immune reaction from microglia cells?

      As discussed above, we agree that, at present, the pharmacological approaches we have taken are not able to distinguish these two possibilities.  However, whichever is true, it does not affect our conclusion.  Also, we plan to determine the mechanisms of how ric8a mutation in neural cells induces immune activation in future studies. These results will likely enable us to adopt specific approaches to address this question.

      (13) Fig 5 and Supplementary fig 12: Please show a tubulin loading control in Fig 5i as you did in suppl fig 12 d (gel zymography). Please provide a gel zymography showing side by side Control, mutant and mutant +DM/S3I treatment. The same request for the MMP9 staining. Please provide statistics for control vs mutant for suppl fig 12c and d.

      We have now included whole gel zymography images with four control and four mutant individual samples as well as quantification in a new supplemental Fig.13 (panels b-c). This clearly shows increases in MMP9, while the MMP2 levels appear similar between controls and mutants. For all of the experiments of gelatin gel zymography, we quantify protein concentrations in the cortical lysates using the Bio-Rad Bradford assay kit and load the same amounts of proteins per lane. The results across lanes are thus all comparable.  The MMP9 staining images for the controls and mutants have also all been taken with the same parameters on the microscope and can be directly compared.  The statistics have now been provided as suggested.

      (14) Please provide the name and the source of the MMP9/13 inhibitor used in this study.

      This inhibitor is MMP-9/MMP-13 inhibitor I (CAS 204140-01-2), from Santa Cruz Biotechnology. This information has been included in revision.

      (15) The results show that deletion of ric8a in microglia and neural cells induced pia membrane breaches but no phenotype is apparent in ric8a deletion in microglia or neural cells alone. Then, the results showed that intraperitoneal injection of LPS induced the phenotype in ric8a-cxcr3-cre mutants. It would be beneficial as a control supporting the model to show that the insult induced by LPS injection does not induce the phenotype in the ric8a-foxg1-cre mice.

      We agree it may potentially be useful to show that LPS injection does not induce ectopia in ric8a-foxg1-cre mice.  Unfortunately, since the ric8a-foxg1-cre mutation shows no phenotype, we are no longer in possession of this line.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      - The information in the abstract and the introduction is only related to app. So, it is very abrupt how authors start the manuscript studying the role of Ric8a, with no information at all about this protein and why the authors want to investigate this role in microglial activation. Later in the manuscript, the authors tried to link Ric8a with app to study the role of app in the inflammatory response and ectopia formation. This link is quite weak as well.

      In the last paragraph of the Introduction, we explain the use of the ric8a mutant and how it leads to discovery of the Abeta monomer-regulated pathway. We have now improved the writing in revision to make these points, especially the link between APP and Ric8a-regulated G proteins, clearer.  In the Results section, we have also improved the writing on the potential link of Ric8a to APP by highlighting, among others, the fact that ric8a and app pathway mutants are among a unique group of a few mouse mutants (ric8a, app/aplp1/2, and apbb1/2) that show cortical ectopia exclusively in the lateral cortex, while all other cortical ectopia mutants also show severe ectopia at the cortical midline.  This suggests that similar mechanisms may underlie the ectopia formation in this small group of mutants.

      -In order to validate the mouse model, double immunofluorescence or immunofluorescence+in situ hybridization should be performed to show that microglia express ric8a and that is eliminated in the Emx1-Ric8a mutant mice.

      As mentioned above, we have additional lines of evidence showing that ric8a is deleted from microglia in emx1-cre mutants. This includes data showing induction of the expression of a cre reporter in brain microglia by emx1-cre and loss of ric8a mRNA expression in microglia cells isolated from emx1-cre mutants.  These data have now been included in revised supplemental Fig. 8.

      -In Supplemental Fig. 6, the authors claimed that cell proliferation is normal in Ric8a mutant mice without doing any quantification. They also quantified the angle of mitotic division of progenitors in the ventricular zone, but there are no images for the spindle orientation quantification, and no description of how they did it. In addition, this data is contrary to what has already been published in conditional Ric8a mutant mice (Kask et al., 2015). The Vimentin staining should be improved.

      We have provided quantification of cell proliferation (phospho-histone 3 staining at the ventricular surface) in revised supplemental Fig. 6g, which shows no significant differences in the number of positive cells. We have also provided details on the definition of the angle of cleavage plane orientation in revised supplemental Fig. 6h and in the Methods section.  We are not sure why the results are different from the other study. We were indeed anticipating deficits in mitotic spindle orientation and spent major efforts in the analysis of this potential deficit.  However, based on the data, we could not draw the conclusion.     

      -Analysis of the MMP9 expression should be done by western blot and not by immunofluorescence. In fact, the MMP9 expression shown in Figure 5g,h, does not correspond with RNA expression shown in gene expression atlas like genepaint or the allen atlas, doubting the specificity of the antibody. The expression of Mmp9 is quite low or absent in the cortex at E13.5-E14.5, making this protein very unlikely to be responsible for laminin degradation during development.

      We have performed gelatin gel zymography on MMP2/9, which shows increased MMP9 activity levels in the mutant cortex. This is similar to Western blot analysis (all lanes are loaded with the same amounts of cortical lysates).  We have now included whole gel zymography images with four control and four mutant individual samples as well as quantification in a new supplemental Fig.13 (panels b-c).  The immunofluorescence staining of MMP9, a different type of analysis, was designed as a complementary approach, the results of which also support the interpretation of increases in MMP9 protein.  Regarding MMP9 RNA expression, please also note that MMP9 is secreted, and the protein expression pattern is expected to be different from that of RNA. We have performed wholemount in situ using dissected E13.5 mouse forebrains.  Our data (in new supplemental Fig.13a) show that MMP9 mRNA is strongly expressed in a sparse population of cells, many of which appear to align along blood vessels. We suspect these are microglial lineage cells populating the embryonic cortex at this stage (see, for example, Squarzoni et al., Cell Rep. 2014 Sep 11;8(5):1271-9. doi: 10.1016/j.celrep.2014.07.042.).  Our control in situ using a Tnc5 probe also shows that the MMP9 signal is not a result of nonspecific probe binding.  Since the MMP9 expressing cells are very sparse even in the wholemount specimens, while most database RNA in situ expression data are obtained using thin sections, we suspect this may be why the signal has been missed in the databases.  As for functional contributions, we agree that we cannot rule out roles played by other MMPs.  However, based on the ectopia suppression data, our results clearly indicate a critical contribution by MMP9/13.

      For MMP9 activity, authors should show the whole membrane with a minimum of three control and three mutant individual samples and with the quantification.

      - The graphs should be improved, including individual values and titles of the Y axes.

      We have included whole membrane zymography images with four control and four mutant individual samples as well as quantification in a new supplemental Fig.13b-c.  The graphs have also been improved as suggested.

    1. Author response:

      The following is the authors’ response to the current reviews.

      We are grateful to the reviewers for their positive assessment of the revised version of the article.

      Please find below our answers to the last, minor comments of the reviewers.

      We thank the reviewer for this important comment. In our live imaging experiments, we actually tracked the dorsal and ventral borders of the omp:yfp positive clusters in control and sly mutant embryos. These measurements showed that the omp:yfp positive clusters are more elongated along the DV axis in mutants as compared with control siblings, as seen on fixed samples (data not shown), suggesting that this difference in tissue shape is not due to fixation.

      Reviewer #4 (Public review):

      Summary:

      In this elegant study XX and colleagues use a combination of fixed tissue analyses and live imaging to characterise the role of Laminin in olfactory placode development and neuronal pathfinding in the zebrafish embryo. They describe Laminin dynamics in the developing olfactory placode and adjacent brain structures and identify potential roles for Laminin in facilitating neuronal pathfinding from the olfactory placode to the brain. To test whether Laminin is required for olfactory placode neuronal pathfinding they analyse olfactory system development in a well-established laminin-gamma-1 mutant, in which the laminin-rich basement membrane is disrupted. They show that while the OP still coalesces in the absence of Laminin, Laminin is required to contain OP cells during forebrain flexure during development and maintain separation of the OP and adjacent brain region. They further demonstrate that Laminin is required for growth of OP neurons from the OP-brain interface towards the olfactory bulb. The authors also present data describing that while the Laminin mutant has partial defects in neural crest cell migration towards the developing OP, these NCC defects are unlikely to be the cause of the neuronal pathfinding defects upon loss of Laminin. Altogether the study is extremely well carried out, with careful analysis of high-quality data. Their findings are likely to be of interest to those working on olfactory system development, or with an interest in extracellular matrix in organ morphogenesis, cell migration, and axonal pathfinding.

      Strengths:

      The authors describe for the first time Laminin dynamics during the early development of the olfactory placode and olfactory axon extension. They use an appropriate model to perturb the system (lamc1 zebrafish mutant), and demonstrate novel requirements for Laminin in pathfinding of OP neurons towards the olfactory bulb.

      The study utilises careful and impressive live imaging to draw most of its conclusions, really drawing upon the strengths of the zebrafish model to investigate the role of laminin in OP pathfinding. This imaging is combined with deep learning methodology to characterise and describe phenotypes in their Laminin-perturbed models, along with detailed quantifications of cell behaviours, together providing a relatively complete picture of the impact of loss of Laminin on OP development.

      Weaknesses:

      Some of the statistical tests are performed on experiments where n=2 for each condition (for example the measurements in Figure S2) - in places the data is non-significant, but clear trends are observed, and one wonders whether some experiments are under-powered.

      We initially planned the electron microscopy experiments in order to analyse 3 embryos per genotype per stage. However, because of technical issues we could not perform the measurements in all the cases, explaining why we have n = 2 in some of the graphs. The trends were quite clear, so we chose to keep these data in the article. We believe they nicely complement the immunostaining data assessing basement membrane integrity in control and mutant embryos.


      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors describe the dynamic distribution of laminin in the olfactory system and forebrain. Using immunohistochemistry and transgenic lines, they found that the olfactory system and adjacent brain tissues are enveloped by BMs from the earliest stages of olfactory system assembly. They also found that laminin deposits follow the axonal trajectory of axons. They performed a functional analysis of the sly mutant to analyse the function of laminin γ1 in the development of the zebrafish olfactory system. Their study revealed that laminin enables the shape and position of placodes to be maintained late in the face of major morphogenetic movements in the brain, and its absence promotes the local entry of sensory axons into the brain and their navigation towards the olfactory bulb. 

      Strengths: 

      - They showed that in the sly mutants, no BM staining of laminin and Nidogen could be detected around the OP and the brain. The authors then elegantly used electron microscopy to analyse the ultrastructure of the border between the OP and the brain in control and sly mutant conditions. 

      - To analyse the role of laminin γ1-dependent BMs in OP coalescence, the authors used the cluster size of Tg(neurog1:GFP)+ OP cells at 22 hpf as a marker. They found that the mediolateral dimension increased specifically in the mutants. However, proliferation did not seem to be affected, although apoptosis appeared to increase slightly at a later stage. This increase could therefore be due to a dispersal of cells in the OP. To test this hypothesis, the authors then analysed the cell trajectories and extracted 3D mean square displacements (MSD), a measure of the volume explored by a cell in a given period of time. Their conclusion indicates that although brain cell movements are increased in the absence of BM during coalescence phases, overall OP cell movements occur within normal parameters and allow OPs to condense into compact neuronal clusters in sly mutants. The authors also analysed the dimensions of the clusters composed of OMP+ neurons. Their results show an increase in cluster size along the dorso-ventral axis. These results were to be expected since, compared with BM, early neurog1+ neurons should compact along the medio-lateral axis, and those that are OMP+ essentially along the dorso-ventral axis. In addition to the DV elongation of OP tissue, the authors show the existence of isolated and ectopic (misplaced) YFP+ cells in sly mutants. 

      - To understand the origin of these phenotypes, the authors analysed the dynamic behaviour of brain cells and OPs during forebrain flexion. The authors then quantitatively measured brain versus OPs in the sly mutant and found that the OP-brain boundary was poorly defined in the sly mutant compared with the control. Once again, the methods (cell tracks, brain size, and proliferation/apoptosis, and the shape of the brain/OP boundary) are elegant but the results were expected. 

      - They then analysed the dynamic behaviour of the axon using live imaging. Thus, olfactory axon migration is drastically impaired in sly mutants, demonstrating that Laminin γ1-dependent BMs are essential for the growth and navigation of axons from the OP to the olfactory bulb. 

      - The authors therefore performed a quantitative analysis of the loss of function of Laminin γ1. They propose that the BM of the OP prevents its deformation in response to mechanical forces generated by morphogenetic movements of the neighbouring brain. 
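
      For readers unfamiliar with the 3D mean square displacement (MSD) mentioned in the points above, the quantity can be illustrated with a minimal sketch. This is not the authors' analysis pipeline: the function name, the time-averaging over all start frames, and the toy trajectory are assumptions made purely for illustration.

```python
def mean_square_displacement(track, max_lag=None):
    """Time-averaged 3D MSD for a single cell track.

    track: sequence of (x, y, z) positions sampled at regular time steps.
    Returns a list where element k-1 is the MSD at a lag of k frames.
    (Hypothetical helper, not taken from the manuscript.)
    """
    n_frames = len(track)
    if max_lag is None:
        max_lag = n_frames - 1
    msd = []
    for lag in range(1, max_lag + 1):
        # Squared displacement between every pair of frames separated by `lag`
        squared = [
            sum((b - a) ** 2 for a, b in zip(track[i], track[i + lag]))
            for i in range(n_frames - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


# Toy trajectory (assumed data): constant velocity (1, 2, 2) per frame
toy_track = [(1.0 * t, 2.0 * t, 2.0 * t) for t in range(5)]
toy_msd = mean_square_displacement(toy_track)
```

      For this purely directed toy track the MSD grows with the square of the lag, whereas a randomly moving cell would show roughly linear growth; comparing such curves between control and mutant tracks is one common way to ask whether cells explore more space.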

      Weaknesses: 

      - The authors did not analyse neurog1 + axonal migration at the level of the single cell and instead made a global analysis. An analysis at the cell level would strengthen their hypotheses.  

      - Rescue experiments by locally inducing Laminin expression would have strengthened the paper. 

      - The paper lacks clarity between the two neuronal populations described (early EONs and late OSNs).  

      - The authors quantitatively measured brain versus OPs in the sly mutant and found that the OP-brain boundary was poorly defined in the sly mutant compared with the control. Once again, the methods (cell tracks, brain size, proliferation/apoptosis, and the shape of the brain/OP boundary) are elegant but the results were expected. 

      - A missing point in the paper is the effect of Laminin γ1 on the migration of cranial NCCs that interact with OP cells. The authors could have analysed the dynamic distribution of neural crest cells in the sly mutant. 

      We thank the reviewer for the overall positive assessment of our work, and we carefully responded to all her/his insightful comments below. Live imaging experiments to (1) visualise exit and entry point formation with only a few axons labelled, (2) characterise the behaviour of single neurog1:GFP-positive neurons/axons during OP coalescence and to (3) analyse the migration of cranial NCC are now included in the revised manuscript to address the reviewer’s questions, and reinforce our initial conclusions.

      Reviewer #2 (Public Review): 

      Summary: 

      This manuscript addresses the role of the extracellular matrix in olfactory development. Despite the importance of these extracellular structures, the specific roles and activities of matrix molecules are still poorly understood. Here, the authors combine live imaging and genetics to examine the role of laminin gamma 1 in multiple steps of olfactory development. The work comprises a descriptive but carefully executed, quantitative assessment of the olfactory phenotypes resulting from loss of laminin gamma. Overall, this is a constructive advance in our understanding of extracellular matrix contributions to olfactory development, with a well-written Discussion with relevance to many other systems. 

      Strengths: 

      The strengths of the manuscript are in the approaches: the authors have combined live imaging, careful quantitative analyses, and molecular genetics. The work presented takes advantage of many zebrafish tools including mutants and transgenics to directly visualize the laminin extracellular matrix in living embryos during the developmental process. 

      Weaknesses: 

      The weaknesses are primarily in the presentation of some of the imaging data. In certain cases, it was not straightforward to evaluate the authors' interpretations and conclusions based on the single confocal sections included in the manuscript. For example, it was difficult to assess the authors' interpretation of when and how laminin openings arise around the olfactory placode and brain during olfactory axon guidance. 

      We thank the reviewer for the overall positive assessment of our work, and we carefully responded to all her/his insightful comments below. To address these comments, live imaging data to visualise exit and entry point formation with a sparse labelling of axons, and z-stacks showing how exit and entry points are organised in 3D, have been added to the revised manuscript.

      Reviewer #3 (Public Review): 

      This is a beautifully presented paper combining live imaging and analysis of mutant phenotypes to elucidate the role of laminin γ1-dependent basement membranes in the development of the zebrafish olfactory placode. The work is clearly illustrated and carefully quantified throughout. There are some very interesting observations based on the analysis of wild-type, laminin γ1, and foxd3 mutant embryos. The authors demonstrate the importance of a Laminin γ1-dependent basement membrane in olfactory placode morphogenesis, and in establishing and maintaining both boundaries and neuronal connections between the brain and the olfactory system. There are some very interesting observations, including the identification of different mechanisms for axons to cross basement membranes, either by taking advantage of incompletely formed membranes at early stages, or by actively perforating the membrane at later ones. 

      This is a valuable and important study but remains quite descriptive. In some cases, hypotheses for mechanisms are stated but are not tested further. For example, the authors propose that olfactory axons must actively disrupt a basement membrane to enter the brain and suggest alternative putative mechanisms for this, but these are not tested experimentally. In addition, the authors propose that the basement membrane of the olfactory placode acts to resist mechanical forces generated by the morphogenetic movement of the developing brain, and thus to prevent passive deformation of the placode, but this is not tested anywhere, for example by preventing or altering the brain movements in the laminin γ1 mutant. 

      We thank the reviewer for the overall positive assessment of our work and for suggesting interesting experiments to attempt in the future, and we carefully responded to all her/his constructive comments below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      In general, it would be easier to draw conclusions and compare data if the authors used similar stages throughout the article. 

      Throughout the article we tried to focus on a series of stages that cover both the coalescence of the OP (up to 24 hpf) and later stages of olfactory system development spanning the brain flexure process (28, 32, 36 hpf). However, for technical reasons it was not always possible to stick to these precise stages in some of our experiments. Also, in Fig. 1E-J, we selected from the movies images illustrating specific cell or axonal behaviours, and thus the corresponding stages could not exactly match the stage series used in Fig. 1A-D and elsewhere in the article. Nevertheless, this stage heterogeneity does not affect our main conclusions.

      It would be useful to schematise the olfactory placode and the brain in an insert to clearly visualise the system in each figure. 

      We hope that the schematic which was initially presented in Fig. 1K already helps the reader to understand how the system is organised. Although we have not added more schematic views to represent the system in each figure (we think this would make the figures overcrowded), we have added additional legends to point to the OP and the brain in the pictures in order to clarify the localisation of each tissue.

      In the Summary, the authors refer to the integrity of the basement membrane. I don't think there is any attempt to affect basement membrane integrity in the article. It would be important to do so to look at the effect on CNS-PNS separation and axonal elongation. 

      In the Summary, we use the term « integrity of the basement membrane » to mention that we have analysed this integrity in the sly mutant. Given the results of our immunostainings against three main components of the basement membrane (Laminin, Collagen IV and Nidogen), as well as our EM observations, we see the sly mutant as a condition in which the integrity of the basement membrane is strongly affected.

      Rescue experiments by locally inducing Laminin expression would have strengthened the paper. 

      We have attempted to rescue the sly mutant phenotypes by introducing the mutation in the transgenic TgBAC(lamC1:lamC1-sfGFP) background, in which Laminin γ1 tagged with sfGFP is expressed under the control of its own regulatory sequences (Yamaguchi et al., 2022). To do so, we crossed sly+/-;Tg(omp:yfp) fish with sly+/-; Tg(lamC1:LamC1-sfGFP) fish. Surprisingly, while a rescue of the global embryo morphology was observed, no clear rescue of the olfactory system defects could be detected at 36 hpf. This could be due to the fact that the expression level of LamC1-sfGFP obtained with one copy of the transgene is not sufficient to rescue the olfactory system phenotypes, or that the sfGFP tag specifically affects the function of the Laminin γ1 chain during the development of the olfactory system, making it unable to rescue the defects. Given the results of our first attempts, we decided not to continue in this direction.

      (1) Developing OP & brain are surrounded by laminin-containing BM (already described by Torres-Paz & Whitlock in 2014). 

      "we first noticed the appearance of a continuous Laminin-rich BM surrounding the brain from 14-18 hpf, while around the OP, only discrete Laminin spots were detected at this stage (Fig. 1A, A'). " 

      Around 8ss for Torres-Paz & Whitlock (before 14 hpf). Can you modify the text, or show an 8ss stage embryo? As far as I know, the authors do not show images at 14 hpf. Please correct this sentence or show a 14 hpf picture. 

      The reviewer is right, we do not show any 14 hpf stage in the images and thus have removed this stage in the text and replaced it by 17 hpf.

      In Figure 1A, the labelling of laminin 111 does not appear to be homogeneous along the brain. Is this true? 

      At this stage the brain’s BM revealed by the Laminin immunostaining appears fairly continuous (while the OP’s one is clearly dotty and less defined), but indeed very tiny/local interruptions of the signal can be seen along the structure as detected by the reviewer. We thus modified the text to mention these tiny interruptions.

      How is the Laminin antibody used by the authors specific to laminin 111?  

      We thank the reviewer for raising this important point. The immunogen used to produce this rabbit polyclonal antibody is the Laminin protein isolated from the basement membrane of a mouse Engelbreth Holm-Swarm sarcoma (EHS). It is thus likely to recognise several Laminin isoforms and not only Laminin 111. We thus replaced Laminin 111 by Laminin when mentioning this antibody in the text and Figures.

      Please schematise in Figure 1K the stages you have tested and shown here in the article i.e. stages 18 - 22 - 28 -36 hpf using immunohistochemistry and 17-26-27-29-33 and 38 hpf using transgenics for laminin 111 and LamC1 respectively.  

      As suggested by the reviewer, we changed the stages in the schematics for stages we have presented in Figure 1 (analysed either with immunostaining or in live imaging experiments). We chose to represent 17 - 22 - 26 - 33 hpf (and thus adapted some of the schematics for them to match these stages).  

      Please specify in the Figure 1 legend for panels A to D whether this is a 3D projection or a z-section.

      We indicated in the Figure 1 legend that all these images are single z-sections (as well as for panels E-J).

      Furthermore, the schematisation in Fig. 1K does not reflect what the authors show: at 22 hpf laminin 111 labelling appears to be present only near the brain, and no labelling lateral to the olfactory placode and anteriorly and posteriorly. Thus, the schematisation in Figure 1K needs to be modified to reflect what the authors show.

      We agree with the reviewer that the Laminin staining at this stage is observed around the medial region of the OP, but not more laterally. We modified the schematic view accordingly in Figure 1K. Anterior and posterior sides of the OP are not represented in this schematic because we chose to represent a frontal view rather than a dorsal view.

      The authors suggest that "the laminin-rich BM of OP assembles between 18 and 22 hpf, during the late phase of OP coalescence". However, their data indicate that this BM assembles around 28 hpf (Figure 1C). Can they clarify this point?

      What we meant with this sentence is that we clearly see two distinct BMs from 22 hpf. However, as noticed by the reviewer, the OP’s BM is only present around the medial/basal regions of the OP and does not surround the whole OP tissue at this stage. We modified the text to clarify this point (in particular by mentioning that the OP’s BM starts to assemble between 18 and 22 hpf), and replaced the image shown in Figure 1B, B’ with a more representative picture (the previous z-section was taken in very dorsal regions of the OP).

      It would be useful to disrupt these cells that have a cytoplasmic expression of Laminin-sfGFP, to analyse their contribution to BM and OP coalescence.

      Indeed it will be interesting in the future to test specifically the role of the cells expressing cytoplasmic Laminin-sfGFP around and within the OP, as proposed by the reviewer. Laser ablation of these cells could be attempted, but due to their very superficial localisation, close to the skin, we believe these ablations (with the protocol/set-up we currently use in the lab) would impair the skin integrity, preventing us from drawing conclusions. We consider that the optimisation of this experiment is out of the scope of the present work.

      Tg(-2.0ompb:gapYFP)rw032 marks ciliated olfactory sensory neurons (OSNs) (Sato et al., 2005). The authors should mention this. 

      Please see our detailed response to the next point below.

      Points to be clarified: 

      -Tg(-2.0ompb:gapYFP)rw032 marks ciliated olfactory sensory neurons (OSNs) (Sato et al., 2005). The authors should mention this here. Moreover, the authors refer to "OP neurons" throughout the article. In the development of the olfactory organ, two types of neurons have been described in the literature: early EONs (12hpf-26hpf) and later OSNs. Each could have a specific role in the establishment and maintenance of the BM described by the authors. The authors need to clarify this point as, in Figure 1 for example, they use a marker for Tg(neurog1:GFP) EONs and a marker for ciliated OSNs without distinction. The distinction between EONs and OSNs comes a little late in the text and should be placed higher up. 

      As mentioned by the reviewer, according to the initial view of neurogenesis in the OP, OP neurons are born in two waves. A transient population of unipolar, dendrite-less pioneer neurons would differentiate first, in the ventro-medial region of the OP and elongate their axons dorsally out of the placode, along the brain wall. These pioneer axons would then be used as a scaffold by later born OSNs located in the dorso-lateral rosette to outgrow their axons towards the olfactory bulb (Whitlock and Westerfield, 1998). 

      Another study further characterised OP neurogenesis and showed that the first neurons to differentiate in the OP (the early olfactory neurons or EONs) express the Tg(neurog1:GFP) transgene (Madelaine et al., 2011). As mentioned by the authors in the discussion of this article, neurog1:GFP+ neurons appear much more numerous than the previously described pioneer neurons, and may thus include pioneers but also other neuronal subtypes.

      We would like here to share additional, unpublished observations from our lab that further suggest that the situation is more complex than the pioneer/OSN and EON/OSN nomenclatures. First, in many of our live imaging experiments, we can clearly visualise some neurog1:GFP+ unipolar neurons, initially located in a medial position in the OP, which intercalate and contribute to the dorsolateral rosette (where OSNs are proposed to be located) at the end of OP coalescence, from 22-24 hpf. Second, in fixed tissues, we observed that most neurog1:GFP+ neurons located in the rosette at 32 hpf co-express the Tg(omp:meRFP) transgene (Sato et al., 2005). These observations suggest that at least a subpopulation of neurog1:GFP+ neurons could incorporate in the dorsolateral rosette and become ciliated OSNs during development. We can share these results with the reviewer upon request. Further studies are thus needed to clarify and describe the neuronal subpopulations and lineage relationships in the OP, but this detailed investigation is out of the scope and focus of the present study. 

      An additional complication comes from the fact that, as shown and acknowledged by the authors in Miyasaka et al., 2005, the Tg(omp:meYFP) line (6kb promoter) labels ciliated OSNs in the rosette but also some unipolar, ventral neurons (around 10 neurons at 1 dpf, Miyasaka et al. 2005, Figure 3A, white arrowheads). This was also observed using the 2 kb promoter Tg(omp:meYFP) line (see for instance Miyasaka et al., 2007) and in our study, we can indeed detect these ventro-medial neurons labelled in the Tg(omp:meYFP) line (2 kb promoter), see for instance Figure 1C’, D’ or Movie 6. It is unclear whether these unipolar omp:meYFP-positive cells are pioneer neurons or EONs expressing the omp:meYFP transgene, or OSN progenitors that would be located basally/ventrally in the OP at these stages.

      For all these reasons, we decided to present in the text the current view of neurogenesis in the OP but instead of attributing a definitive identity to the neurons we visualise with the transgenic lines, we prefer to mention them in the manuscript (and in the rest of the response to the reviewers) as neurons expressing neurog1:GFP or omp:meYFP transgenes (or cells/axons/neurons expressing RFP in the Tg(cldnb:Gal4; UAS:RFP) background).

      What we also changed in the text to be more clear on this point:

      - we moved higher up in the text, as suggested by reviewer 1, the description of the current model of neurogenesis in the OP,

      - we mentioned that neurog1:GFP+ neurons are more numerous than the initially described pioneer neurons, as discussed in Madelaine et al., 2011,

      - we wrote more clearly that the Tg(omp:meYFP) line labels ciliated OSNs but also a subset of unipolar, ventral neurons (Miyasaka et al., 2005), and pointed to these ventral neurons in Figure 1C’, D’,

      - in the initial presentation of the current view of OP neurogenesis we renamed neurog1:GFP+ into EONs to be coherent with Madelaine et al., 2011.

      - To visualise pioneer axons, the authors should use an EON marker such as neurog1 because, to my knowledge, OMP only marks OSN axons and not pioneer axons.  

      To visualise neurog1:GFP+ axons during OP coalescence, we performed live imaging upon injection of the neurog1:GFP plasmid (Blader et al., 2003) in the Tg(cldnb:Gal4; UAS:RFP) background (n = 4 mutants and n = 4 controls from 2 independent experiments). We observed some GFP+ placodal neurons exhibiting retrograde axon extension in both controls and sly mutants. In such experiments it is very difficult to quantify and compare the number of neurons/axons showing specific behaviours between different experimental conditions/genetic backgrounds. Indeed, due to the cytoplasmic localisation of GFP, the axons can only be seen in neurons expressing high levels of GFP, and due to the injection the number of such neurons varies a lot between embryos, even in a given condition. Nevertheless, our qualitative observations reinforce the idea that the basement membrane is not absolutely required for mediolateral movements and retrograde axon extension of neurog1:GFP+ neurons in the OP. We added examples of images extracted from these new live imaging experiments in the revised Fig. S5A, B.

      - The authors should analyse the presence of laminin in the OP and forebrain in conjunction with neural crest cell dynamics (using a Sox10 transgenic line for example) to refine their entry and exit point hypotheses. 

      As described in the answer to the next point, we performed new experiments in which we visualised NCC migration in the Tg(neurog1:GFP) background, which allowed us to analyse the localisation of NCC at the forebrain/OP boundary, in ventral and dorsal positions, both in sly mutant embryos and control siblings.

      - A dynamic analysis of the distribution of neural crest cells in the sly mutant over time and during OP coalescence would be important. 

      The dynamics of zebrafish cranial NCC migration in the vicinity of the OP has been previously analysed using sox10 reporter lines (Harden et al., 2012, Torres-Paz and Whitlock, 2014, Bryan et al., 2020). To address the point raised by the reviewer, we performed live imaging from 16 to 32 hpf on sly mutants and control siblings carrying the Tg(neurog1:GFP) and Tg(UAS:RFP) transgenes and injected with a sox10(7.2):KalTA4 plasmid (Almeida et al., 2015). This allows the mosaic labelling of cells that express or have expressed sox10 during their development which, in the head region at these stages, represents mostly NCC and their derivatives. 3 independent experiments were carried out (n = 4 mutant embryos in which 8 placodes could be analysed; n = 6 control siblings in which 10 placodes could be analysed). A new movie (Movie 9) has been added to the revised article to show representative examples of control and mutant embryos.

      From these new data, we could make the following observations:

      - As expected from previous studies (Harden et al., 2012, Torres-Paz and Whitlock, 2014, Bryan et al., 2020), in control embryos a lot of NCC had already migrated to reach the vicinity of the OP when the movies begin at 16 hpf, and were then seen invading mainly the interface between the eye and the OP (10/10 placodes). Surprisingly, in sly mutants, a lot of motile NCC had also reached the OP region at 16 hpf in all the analysed placodes (8/8), and populated the eye/OP interface in 7/8 placodes (10/10 in controls). Counting NCC or tracking individual NCC during the whole duration of the movies was unfortunately too difficult to achieve in these movies, because of the low level of mosaicism (a high number of cells were labelled) and of the high speed of NCC movements (as compared with the 10 min delta t we chose for the movies). 

      - In some of the control placodes we could detect a few NCC that populated the forebrain/OP interface, either ventrally, close to the exit point of the axons (4/10 placodes), or more dorsally (8/10 placodes). By contrast, in sly mutants, NCC were observed in the dorsal region of the brain/OP boundary in only 2/8 placodes, and in the ventral brain/OP frontier in only 2/8 placodes as well. Interestingly, in these last 2 samples, NCC that had initially populated the ventral region of the brain/OP interface were then expelled from the boundary at later stages.

      We reported these observations in a new Table that is presented in revised Fig. S6B. In addition, instances of NCC migrating at the eye/OP or forebrain/OP interfaces are indicated with arrowheads on Movie 9. Previous Figure S6 was split into two parts presenting NCC defects in sly mutants (revised Figure S6) and in foxd3 mutants (revised Figure S7).

      Altogether, these new data suggest that the first postero-anterior phase of NCC migration towards the OP, as well as their migration in between the eye and OP tissues, is not fully perturbed in sly mutants. The subset of NCC that populate the OP/forebrain seem to be more specifically affected, as these NCC show defects in their migration to the interface or the maintenance of their position at the interface. Since the crestin marker labels mostly NCC at the OP/forebrain interface at 32 hpf (revised Fig. S6A), this could explain why the crestin ISH signal is almost lost in sly mutants at this stage.

      (2) Laminin distribution suggests a role in olfactory axon development 

      "Laminin 111 immunostaining revealed local disruptions in the membrane enveloping the OP and brain, precisely where YFP+ axons exit the OP (exit point) and enter the brain (entry point) (Fig. 1C-D')." Can the authors quantify this situation? It would be important to analyse this behaviour on the scale of a neuron and thus axonal migration to strengthen the hypotheses. 

      As suggested by the reviewer, to better visualise individual axons at the exit and entry point, we used mosaic red labelling of OP axons. To achieve this sparse labelling, we took advantage of the mosaic expression of a red fluorescent membrane protein observed in the Tg(cldnb:Gal4; UAS:lyn-TagRFP) background. The unpublished Tg(UAS:lyn-TagRFP) line was kindly provided by Marion Rosello and Shahad Albadri from the lab of Filippo Del Bene. We crossed the Tg(cldnb:Gal4; UAS:lyn-TagRFP) line with the TgBAC(lamC1:lamC1-sfGFP) reporter and performed live imaging on 2 embryos/4 placodes, in a frontal view. A new movie (Movie 3 in the revised article) shows examples of exit and entry point formation in this context. This allowed us to visualise the formation of the exit and entry points in more samples (6 embryos and 12 placodes in total when we pool the two strategies for labelling OP axons) and through the visualisation of a small number of axons, and to reinforce our initial conclusions. 

      (3) The integrity of BMs around the brain and the OP is affected in the sly mutant 

      Why do the authors analyse the distribution of collagen IV and Nidogen and not proteoglycans and heparan sulphate? 

      We attempted to label more ECM components such as proteoglycans and heparan sulfate, but whole-mount immunostainings did not work in our hands.

      A dynamic analysis of the distribution of neural crest cells in the sly mutant over time and during OP coalescence would be important. 

      See our detailed response to this point above.  

      (4) Role of Laminin γ1-dependent BMs in OP coalescence 

      The authors use the size of the Tg(neurog1:GFP)+ OP cell cluster at 22 hpf as a marker.  The authors should count the number of cells in the OP at the indicated time using a nuclear dye to check that in the sly mutant the number of cells is the same over time. Two time points as analysed in Figure S2 may not be sufficient to quantify proliferation which at these stages should be almost zero according to Whitlock & Westerfield and Madelaine et al.

      Counting the neurog1:GFP+ cell numbers in our existing data was unfortunately impossible, due to the poor quality of the DAPI staining. We are nevertheless confident that the number of cells within neurog1:GFP+ clusters is fairly similar between controls and sly mutants at 22 hpf, since the OP dimensions are the same for AP and DV dimensions, and only slightly different for the ML dimension. In addition, we analysed proliferation and apoptosis within the neurog1:GFP+ cluster at 16 and 21 hpf and observed no difference between controls and mutants.

      (5) Role of Laminin γ1-dependent BMs during the forebrain flexure 

      In Figure 4F at 32hpf, the presence of 77% ectopic OMP+ cells medially should result in an increase in dimensions along the M-L? This is not the case in the article. The authors should clarify this point. 

      As we explained in the Material and Methods, ectopic fluorescent cells (cells that are physically separated from the main cluster) were not taken into account for the measurement of the OP dimensions. This is now also mentioned in the legends of the Figures (4 and S3) showing the quantifications of OP dimensions.

      Cell distribution also seems to be affected within the OMP+ cluster at 36hpf, with fewer cells laterally and more medially. The authors should analyse the distribution of OMP+ cells in the clusters in sly mutants and controls to understand whether the modification corresponds to the absence of BM function. 

      On the pictures shown in Figure 4F,G, we agree that omp:meYFP+ cells appear to be more medially distributed in the mutant, however this is not the case in other sections or samples, and is rather specific to the z-section chosen for the Figure. We found that the ML dimension is unchanged in mutants as compared with controls, except for the 28 hpf stage where it is smaller, but this appears to be a transient phenomenon, since no change is detected at earlier or later stages (Figure 4A-D and Figure S3A-L). The difference we observe at 28 hpf is now mentioned in the revised manuscript.

      The conclusions of Figures 4 and S3 would rather be that laminin allows OMP+ cells to be oriented along the medio-lateral axis whereas it would control their position along the dorsoventral axis. The authors should modify the text. It would be useful to map the distribution of OMP+ cells along the dorsoventral and mediolateral axes. The same applies to Neurog1+ cells. An analysis of skin cell movements, for example, would be useful to determine whether the effects are specific.  

      We are confident that the measurements of OP dimensions in AP, DV and ML are sufficient to describe the OP shape defects observed in the sly mutants. Analysing cell distribution along the 3 axes as well as skin cell movements will be interesting to perform in the future but we consider these quantifications as being out of the scope of the present work.

      (6) Laminin γ1-dependent BMs are required to define a robust boundary between the OP and the brain 

      The authors must weigh this conclusion "Laminin γ1-dependent BMs serve to establish a straight boundary between the brain and OP, preventing local mixing and late convergence of the two OPs towards each other during flexion movement." Indeed, they don't really show any local mixing between the brain and OP cells. They would need to quantify in their images (Figure 5A-A' and Figure S4 A-A') the percentage of cells co-labelled by HuC and Tg(cldnb:GFP). 

      We agree with the reviewer and thus replaced « reveal » by « suggest » in the conclusion of this section. 

      (7) Role of Laminin γ1-dependent BMs in olfactory axon development 

      An analysis of the retrograde extension movement in the axons of OMP+ ectopic neurons in the sly1 mutant condition would be useful to validate that the loss of laminin function does not play a role in this event. 

      Indeed, even though we can visualise instances of retrograde extension occurring normally in sly mutants, we cannot rule out that this process is affected in a subset of OP neurons, for instance in ectopic cells, which often show no axon or a misoriented axon. We added a sentence to mention this in the revised manuscript.

      Minor comments and typos: 

      Please check and mention the D-V/L-M or A-P/L-M orientation of the images in all figures. 

      This has been checked.

      Legend Figure 1: "distalmost" is missing a space "distal most". 

      We checked and this word can be written without a space.

      Figure 1 panel C: check the orientation (I am not sure that Dorsal is up). 

      We double-checked and confirm that dorsal is up in this panel.

      Movie 1 Legend: "aroung "the OP should be around the OP. 

      Thanks to the reviewer for noticing the typo, we corrected it.

      Reviewer #2 (Recommendations For The Authors):

      The comments below are relatively minor and mostly raise questions regarding images and their presentation in the manuscript. 

      • Figure 1, visualization of exit and entry points: It is a bit difficult to visualize the axon exit and entry points in these images, and in particular, to understand how the exit and entry points in C and D correspond to what is seen in F, F', H, and H'. There appears to be one resolvable break in the staining in C and D, whereas there are two distinct breaks in F-H'. Are these single optical sections? Is it possible to visualize these via 3-dimensional rendering? 

      All the images presented in Figure 1 are single z-sections, which is now indicated in the Figure legend. As noticed by the reviewer, Laminin immunostainings on fixed embryos at 28 and 36 hpf suggested that the exit and entry points are facing each other, as shown in Figure 1C-D’. However, in our live imaging experiments we always observed that the exit point is slightly more ventral than the entry point (of about 10 to 20 µm). This discrepancy could be due to the fixation that precedes the immunostaining procedure, which could modify slightly the size and shape of cells/tissues. We added a sentence on this point in the text. In addition, we added new movies of the LamC1-sfGFP reporter with sparse red axonal labelling (Movie 3, see response to reviewer 1), as well as z-stacks presenting the organisation of exit and entry points in 3D (Movie 4), which should help to better illustrate the mechanisms of exit and entry point formation.

      • Movie 2, p. 6, "small interruptions of the BM were already present near the axon tips, along the ventro-medial wall of the OP." This is a bit difficult to assess since the movie seems to show at least one other small interruption in the BM in addition to the exit point, in particular, one slightly dorsal to the exit point. Was this seen in other samples, or in different optical sections? 

      Indeed the exit and entry points often appear as regions with several, small BM interruptions, rather than single holes in the BM. We now show in revised Movie 4 the two z-stacks (the merge and the single channel for green fluorescence) corresponding to the last time points of the movies showing exit and entry point formation in Movie 2, where several BM interruptions can be seen for both the exit and entry points. We had already mentioned this observation in the legend of Movie 2, and we added a sentence on this point in the main text of the revised manuscript. This is also represented for both exit and entry points in the new schematics in revised Fig. 1K and its legend. 

      • Movie 2, p. 6, "The opening of the entry point through the brain BM was concomitant with the arrival of the RFP+ axons, suggesting that the axons degrade or displace BM components to enter the brain." Similar to the questions regarding the exit point, it was a bit difficult to evaluate this statement. There appears to be a broader region of BM discontinuity more dorsal to the arrowhead in Movie 2. A single-channel movie of just the laminin fluorescence might help to convey the extent of the discontinuity. As with above, was this seen in other samples, or in different optical sections?  

      See our response to the previous comment.

      • Figure 1H, I, "the distal tip of the RFP+ axons migrated in close proximity with the brain's BM." This is again a bit difficult to see, and quite different than what is seen in Figure 4A, in which the axons do not seem close to the BM in this section. Is it possible to visualize this via 3-dimensional rendering? 

      In fixed embryos or in live imaging experiments, we observed that, once entered in the brain, the distal tips (the growth cones) of the axons are located close to the BM of the brain. However, this is not the case of the axon shafts which, as development proceeds, are located further away from the BM. This can clearly be seen at 36 hpf in Figure 1D’ and Figure 4A, as spotted by the reviewer. We modified the text to clarify this point.

      • Figure 2J, J', p. 7, the gap between the OP and brain cells of sly mutants "was most often devoid of electron-dense material." It is difficult to see this loss of electron-dense material in 2J'. The thickness of the space is quantified well and is clearly smaller, but the change in electron-dense material is more difficult to see.  

      We looked at Figure 2 again and it seems clear to us that there is electron-dense material between the plasma membranes in controls, which is practically not seen (rare spots) in the mutants. We added a sentence mentioning that we rarely see electron-dense spots in sly mutants.

      • Figure 5E-F': There are concerns about evaluating the shape of a tissue based on nuclear position. Is there a way to co-stain for cell boundaries (maybe actin?), and then quantify distortion of the dlx+ cell population using the cell boundaries, rather than nuclear staining? 

      We agree with the reviewer that it is not ideal to evaluate the shape of the OP/brain boundary based on a nuclear staining. As explained in the text, we could not use the Tg(eltC:GFP) or Tg(cldnb:Gal4; UAS:RFP) reporter lines for this analysis, due to ectopic or mosaic expression. However we are confident that the segmentation of the Dlx3b immunostaining reflects the organisation of the cells at the OP/brain tissue boundary: in other data sets in which we performed Dlx3b staining with membrane labelling independently of the present study and in the wild type context, we clearly see that cell membranes are juxtaposed to the Dlx3b nuclear staining (in other words, the cytoplasm volume of OP cells is very small). 

      • Figure S5E: It would be helpful to see representative images for each of the categories (Proper axon bundle; Ventral projections; Medial projections) or a schematic to understand how the phenotypes were assessed. 

      To address this point we added a schematic view to illustrate the phenotypes assessed in each column of the table in revised Figure S5G.

      • Figure 6, p. 12, "Laminin gamma 1-dependent BMs are essential for growth and navigation of the axons...": What fraction of the tracked axons managed to exit the OP? Given the quantitative analyses in Figure 6, one might interpret this to mean that laminin gamma 1 is not essential for axon growth (speed and persistence are largely unchanged), but rather, primarily for navigation. 

      As noticed by the reviewer, the speed and persistence of axonal growth cones are largely unchanged in the sly mutants (except for the reduced persistence in the 200-400 min window, and an increased speed in the 800-1000 min window), showing that the growth cones are still motile. However, as shown by the tracks, they tend to wander around within the OP, close to the cell bodies, which results in the end in a perturbed growth of the axons. The navigation issues are rather revealed by the analysis of fixed Tg(omp:meYFP) embryos presented in the table of Figure S5G. We modified the text to separate more clearly the conclusions of the two types of experiments (fixed, transgenic embryos versus live, mosaically labelled embryos).

      Reviewer #3 (Recommendations For The Authors):

      Testing the hypotheses mentioned in the public review will be interesting experiments for a follow-up study, but are not essential revisions for this manuscript. 

      I have only a few minor suggestions for revisions: 

      P8 subheading 'Role of Laminin γ1-dependent BMs in OP coalescence' - since no major role was demonstrated here, this heading should be reworded.  

      We agree with the reviewer and replaced the previous title by « OP coalescence still occurs in the sly mutant ».

      P11, line 3 - the authors conclude that the forebrain is smaller 'due to' the inward convergence of the OPs. I do not think it is possible to assign causation to this when the mutant disrupts Laminin γ1 systemically - it is equally possible that the OPs move inward due to a failure of the brain to form in the normal shape. Thus, the wording should be changed here. (In the Discussion on p15, the authors mention the 'apparent distortion' of the brain, and say that it is 'possibly due' to the inward migration of the placodes', but again this could be toned down.) 

      We agree with the reviewer’s comment and changed the wording of our conclusions in the Results section.

      P11 and Fig. S5 - The table and text seem to be saying opposite things here. The text on p11 (3rd paragraph) indicates that the normal exit point is ventral and that this is disrupted in the mutant, with axons exiting dorsally. However, in the table, at each time point there is a higher % of axons exiting ventrally in the mutant. Please clarify. The table does not provide a % value for axons exiting dorsally - it might help to add a column to show this value. 

      We are grateful to the reviewer for pointing this out, and we apologize for the lack of clarity in the first version of the manuscript. We have modified the text and Figure S5 in order to clarify the different points raised by the reviewer in this comment. The Table in Fig. S5G does not represent the % of axons showing defects, but the % of embryos showing the phenotypes. In addition, an embryo is counted in the ventral or medial projection category if it shows at least one ventral or medial projection (even if it shows a proper bundle). This is now clearly indicated in the title of the columns in the table itself and in the legend. The embryos in which the axons exit dorsally in sly mutants are actually those counted in the left column of the Table (they exit dorsally and form a bundle), as shown by the new schematics added below the table. We also added this information in the title of the left column, and mention in the legend the pictures in which this dorsal exit can be observed in the article (Figures 4B and S3E’). Having more sly mutant embryos with axons exiting dorsally is thus compatible with more embryos showing at least one ventral projection.

      Fig. S6, shows the lack of neural crest cells between the olfactory placode and the brain in both laminin γ1 mutants (without a basement membrane) and foxd3 mutants (which retain the membrane). Comparison of the two mutants here is a neat experiment and the result is striking, demonstrating that it is the basement membrane, and not the neural crest, that is required for correct morphology of the olfactory placode. I think this figure should be presented as a main figure, rather than supplementary.  

      Our new live imaging characterisation of NCC migration in sly mutants and control siblings (Movie 9) revealed that at 32 hpf, in the vicinity of the OP, NCC (or their derivatives) are much more numerous than the subset of NCC showing crestin expression by in situ hybridisation (compare the end of our control movie – 32 hpf, with crestin ISH shown in Figure S6A for instance). 

      Thus, the extent of the NCC migration defects should be analysed in more detail in the foxd3 mutant in the future (using live imaging or other NCC markers), and for this reason we chose to keep this dataset in the supplementary Figures.

      One of the first topics covered in the Discussion section is the potential role of Collagen. I was surprised to see the description on P15 'the dramatic disorganization of the Collagen IV pattern observed by immunofluorescence in the sly mutant', as I hadn't picked this up from the Results section of the paper. I went back to the relevant figure (Fig. 2) and description on p7, which does not give the same impression: 'in sly mutants, Collagen IV immunoreactivity was not totally abolished'. This suggested to me that there was only minor (not dramatic) disorganisation of the Collagen IV. This needs clarification.  

      The linear, BM-like Collagen IV staining was lost in sly mutants, but not the fibrous staining which remained in the form of discrete patches surrounding the OP. We modified the text in the Results section as well as in the Figure 2 legend to clarify our observations made on embryos immunostained for Collagen IV.

      Typos etc 

      P5 - '(ii) above of the neuronal rosette' - delete the word 'of'. 

      P5 two lines below this - ensheathed. 

      P10 - '3 distinct AP levels' (delete s from distincts). 

      P10 - distortion (not distorsion). 

      P12 - 'From 14 hpf, they' should read 'From 14 hpf, neural crest cells'. 

      P15, line 1 - 'is a consequence of' rather than 'is consecutive of'? 

      P22 'When the data were not normal,' should read 'When the data were not normally distributed,'. 

      We thank the reviewer for noticing these typos and have corrected them.

      General 

      Please number lines in future manuscripts for ease of reference. 

      This has been done.

    2. eLife Assessment

      This important study describes the function of Laminin γ1-dependent basement membranes in development of the olfactory placode, including morphogenesis of the placode, boundary formation, and olfactory axonal pathfinding. The study uses elegant live imaging approaches and extensive quantitative analyses, combined with detailed mutant analyses to provide a compelling description of the role of Laminin in olfactory placode development. In addition to the contributions this study makes to understanding olfactory placode development, it will also be of broader interest to individuals studying extracellular matrix regulation of tissue morphogenesis, and neural development including neuronal pathfinding.

    3. Reviewer #1 (Public review):

      The authors describe the dynamic distribution of laminin γ1 in the olfactory system and forebrain. Using immunohistochemistry and transgenic lines, they found that the olfactory system and adjacent brain tissues are enveloped by basement membranes (BMs) from the earliest stages of olfactory system assembly. They also found that laminin deposits follow the trajectory of the axons. They performed a functional analysis of the sly mutant to analyse the function of laminin γ1 in the development of the zebrafish olfactory system. Their study revealed that laminin enables the shape and position of olfactory placodes to be maintained late in the face of major morphogenetic movements in the brain, and its presence promotes the local entry of sensory axons into the brain and their navigation towards the olfactory bulb.

      They showed that in the laminin γ1 mutants no BM staining of laminin could be detected around the OP and the brain. The authors then elegantly used electron microscopy to analyse the ultrastructure of the border between the OP and the brain. The authors performed a quantitative analysis of the loss of function of Laminin γ1 (sly mutants). Olfactory axon migration is drastically impaired in sly mutants, demonstrating that Laminin γ1-dependent BMs are essential for the growth and navigation of axons from the OP to the olfactory bulb. They propose that the BM of the OP prevents its deformation in response to mechanical forces generated by morphogenetic movements of the neighbouring brain. Although the results are expected, the experiments carried out and the results are robust and elegant.

    4. Reviewer #2 (Public review):

      Summary:

      This manuscript addresses the role of extracellular matrix in olfactory development. Despite the importance of these extracellular structures, the specific roles and activities of matrix molecules are still poorly understood. Here, the authors combine live imaging and genetics to examine the role of the laminin gamma 1 in multiple steps of olfactory development. The work comprises a descriptive but carefully executed, quantitative assessment of the olfactory phenotypes resulting from loss of laminin gamma 1. Overall, this is a constructive advance in our understanding of extracellular matrix contributions to olfactory development, with a well-written Discussion with relevance to many other systems.

      Strengths:

      The strengths of the manuscript are in the approaches: the authors have combined live imaging, careful quantitative analyses, and molecular genetics. The work presented takes advantage of many zebrafish tools including mutants and transgenics to directly visualize the laminin extracellular matrix in living embryos during the developmental process.

      Weaknesses:

      Weaknesses in the first round of critique were addressed in the revision. A minor caveat remains regarding the interpretation of differences in tissue size and shape in fixed samples (comparing mutants and controls); the fixation process can alter these properties and may do so differently between genotypes.

    5. Reviewer #4 (Public review):

      Summary:

      In this elegant study XX and colleagues use a combination of fixed tissue analyses and live imaging to characterise the role of Laminin in olfactory placode development and neuronal pathfinding in the zebrafish embryo. They describe Laminin dynamics in the developing olfactory placode and adjacent brain structures and identify potential roles for Laminin in facilitating neuronal pathfinding from the olfactory placode to the brain. To test whether Laminin is required for olfactory placode neuronal pathfinding they analyse olfactory system development in a well-established laminin-gamma-1 mutant, in which the laminin-rich basement membrane is disrupted. They show that while the OP still coalesces in the absence of Laminin, Laminin is required to contain OP cells during forebrain flexure during development and maintain separation of the OP and adjacent brain region. They further demonstrate that Laminin is required for growth of OP neurons from the OP-brain interface towards the olfactory bulb. The authors also present data describing that while the Laminin mutant has partial defects in neural crest cell migration towards the developing OP, these NCC defects are unlikely to be the cause of the neuronal pathfinding defects upon loss of Laminin. Altogether the study is extremely well carried out, with careful analysis of high-quality data. Their findings are likely to be of interest to those working on olfactory system development, or with an interest in extracellular matrix in organ morphogenesis, cell migration, and axonal pathfinding.

      Strengths:

      The authors describe for the first time Laminin dynamics during the early development of the olfactory placode and olfactory axon extension. They use an appropriate model to perturb the system (lamc1 zebrafish mutant), and demonstrate novel requirements for Laminin in pathfinding of OP neurons towards the olfactory bulb. The study utilises careful and impressive live imaging to draw most of its conclusions, really drawing upon the strengths of the zebrafish model to investigate the role of laminin in OP pathfinding. This imaging is combined with deep learning methodology to characterise and describe phenotypes in their Laminin-perturbed models, along with detailed quantifications of cell behaviours, together providing a relatively complete picture of the impact of loss of Laminin on OP development.

      Weaknesses:

      Some of the statistical tests are performed on experiments where n=2 for each condition (for example the measurements in Figure S2) - in places the data is non-significant, but clear trends are observed, and one wonders whether some experiments are under-powered.

    1. eLife Assessment

      This important study suggests that the composition of the extracellular matrix in a mouse model of liver fibrosis changes depending on the cause of liver fibrosis. The data could be used as a foundation for future antifibrotic therapies. The strength of evidence is solid with respect to the use of animal models and proteomic analysis. The study provides a helpful inventory of proteins up or down-regulated, but functional analyses are limited and translational data are lacking.

    2. Reviewer #1 (Public review):

      Summary:

      Jirouskova and colleagues in their study have carried out an in-depth proteomic characterization of the dynamics of the liver fibrotic response and the resulting resolution in two distinct models of liver injury: CCl4-induced model of hepatotoxicity and pericentral/bridging liver fibrosis and the DDC feeding model of obstructive cholestasis and periportal fibrosis. They focussed on both the insoluble extracellular matrix (ECM) components as well as the soluble secreted factors produced by hepatic stellate cells (HSCs) and/or portal fibroblasts (PFs). They identified compartment- and time-resolved proteomic signatures in the two models with disease-specific factors or matrisomes. Their study also identified phenotypic differences between the models such as that while the CCl4-induced model induced profound hepatotoxicity followed by resolution, the DDC model induced more lasting liver damage and proteomic changes that resembled advanced human liver fibrosis favouring hepatocarcinogenesis.

      Overall, this comprehensive and very well-conducted study is rigorous and well-planned. The conclusions are supported by compelling studies and analyses. One caveat is the lack of mechanistic experiments to prove causality, but this can be carried out in follow-up studies.

      Strengths:

      (1) A major strength of the study is that the experiments are rigorous and very well conducted. For instance, the authors utilized two models of liver fibrosis to study different aspects of the pathology - hepatotoxicity vs cholestasis. In addition, 4 time points for each model were investigated - 2 for fibrosis development and 2 for fibrosis resolution. They have taken 3 components for proteomic analyses - total lysates, insoluble ECM components as well as the soluble secreted factors. Thus, the authors provide a comprehensive overview of the fibrosis and resolution process in these models.

      (2) Another great strength of the study is that the methodology utilized was able to dissect unique pathways relevant to each model as well as common targets. For example, the authors identified known pathways such as mTOR signalling to be differentially regulated in the CCl4 vs DDC model. mTOR signalling was increased in the DDC model which is associated with hyperproliferation. Thus showing that the approach taken is specific enough to distinguish between the two similar (both induce fibrosis) but distinct mechanisms (hepatotoxicity vs cholestasis) is a strong point of the study.

      Weaknesses:

      (1) The authors themselves propose in their Introduction that the "ECM-associated changes are increasingly perceived as causative, rather than consequential"; however, they have not conducted mechanistic (gain of function/loss of function) studies either in vitro or in vivo from any of their identified targets to truly prove causality. This remains one of the limitations of this study. Thus, future studies should investigate this point in detail. For instance, it would have been intriguing to dissect if knocking out specific genes involved in one specific model or genes common to both would yield distinct phenotypic outcomes.

      (2) The majority of the conclusions are derived primarily from the proteomic analyses. Although well conducted, it would strengthen the study to corroborate some of the major findings by other means such as IHC/IF with the corresponding quantifications and not only representative images.

    3. Reviewer #2 (Public review):

      Summary:

      The authors suggest that ECM abundance and composition change depending on the aetiology of liver fibrosis. To understand this they have investigated the proteome in two models of animal fibrosis and resolution. They suggest their findings could provide a foundation for future anti-fibrotic therapies.

      Strengths:

      The animal models used are widely studied models of liver fibrosis from both parenchymal and biliary damage aspects. Both would allow analysis of resolution. The CCl4 model in particular fully reverts to a 'healthy' liver following cessation of the insult. I am less clear whether/how quickly the ductal plugs clear in DDC models and thus this may not provide the response they are looking for in terms of reversibility. I believe there have been several extensive studies using a transcriptomics approach in assessing genes and cells involved in the CCl4 model of resolution. There have been even more multiomic analyses of general fibrosis progression in many of the mouse models of fibrosis. However, the proteomic approach they have used is robust and they have made some attempts to integrate with cell-type specific signatures from previously published data.

      Although there is minimal data, hepatocyte elasticity is a very interesting part of their study. Additional data and focussed attention on the mechanisms underpinning this would be very insightful.

      Weaknesses:

      As it currently stands, the data, whilst extensive, is primarily focussed on the proteomic data which is fairly descriptive and I am not clear on the additional insight gained in their approach that is not already detailed from the extensive transcriptomic studies. The manuscript overall would benefit from some mechanistic functional insight to provide new additional modes of action relevant to fibrosis progression. Whilst there is some human data presented it is a minimal analysis without quantification that would imply relevance to disease state.

      Although studying disease progression in animals is fundamental to understanding the full physiological response of fibrotic disease, without more human insight it is difficult for the analysis to support their suggestion that the identified targets will be of use in treating human disease.

      Some of the terminology is incorrect while discussing these models of injury used and care should be taken. For example - both models are toxin-induced and I do not think these data have any support that the DDC model has a higher carcinogenic risk. An investigation into the tumour-induced risk would require significant additional models. These types of statements are incorrect and not supported by this study.

    1. eLife Assessment

      This study provides valuable insights into the evolutionary histories and cellular infection responses of two Salmonella Dublin genotypes. While the evidence is compelling, a more phylogenetically diverse bacterial collection would enhance the findings. This research is relevant to scientists studying Salmonella and gastroenteritis-related pathogens.

    2. Reviewer #1 (Public review):

      The manuscript consists of two separate but interlinked investigations: genomic epidemiology and virulence assessment of Salmonella Dublin. ST10 dominates the epidemiological landscape of S. Dublin, while ST74 was uncommonly isolated. Detailed genomic epidemiology of ST10 unfolded the evolutionary history of this common genotype, highlighting clonal expansions linked to each distinct geography. Notably, North American ST10 was associated with more antimicrobial resistance compared to others. The authors also performed long-read sequencing on a subset of isolates (ST10 and ST74) and uncovered a novel recombinant virulence plasmid in ST10 (IncX1/IncFII/IncN). Separately, the authors performed cell invasion and cytotoxicity assays on the two S. Dublin genotypes, showing differential responses between the two STs. ST74 replicates better intracellularly in macrophages compared to ST10, but both STs induced comparable cytotoxicity levels. Comparative genomic analyses between the two genotypes showed certain genetic content unique to each genotype, but no further analyses were conducted to investigate which genetic factors were likely associated with the observed differences. The study provides a comprehensive and novel understanding of the evolution and adaptation of two S. Dublin genotypes, which can inform public health measures.

      The methodology included in both approaches was sound and written in sufficient detail, and data analysis was performed with rigour. Source data were fully presented and accessible to readers. Certain aspects of the manuscript could be clarified and extended to improve the manuscript.

      (1) For epidemiology purposes, it is not clear which human diseases were associated with the genomes included in this manuscript. This is important since S. Dublin can cause invasive bloodstream infections in humans. While such information may be unavailable for public sequences, this should be detailed for the 53 isolates sequenced for this study, especially for isolates selected to perform experiments in vitro.

      (2) The major AMR plasmid described in S. Dublin was the IncC plasmid associated with clonal expansion in North America. While this plasmid is not found in the Australian isolates sequenced in this study, the reviewer finds that it is still important to include its characterization, since it carries blaCMY-2 and was stably inherited in ST10 clade 5. If the plasmid structure is already published, the authors should include the accession number in the Main Results.

      (3) The reviewer is concerned that the multiple annotations missing in (a) plasmid structures in Supplementary Figures 5 & 6, and (b) genetic content unique to ST10 and ST74 were due to insufficient annotation by Prokka. I would recommend the authors use another annotation tool, such as Bakta (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8743544/) for plasmid annotation, and reconstruction of the pangenome described in Supplementary Figure 10. Since the recombinant virulence plasmid in ST10 is a novel one, I would recommend putting Supplementary Figure 5 as a main figure, with better annotations to show the virulence region, plasmid maintenance/replication, and possible conjugation cluster.

      (4) The authors are lauded for the use of multiple strains of ST10 and ST74 in the in vitro experiment. While results for ST74 were more consistent, readouts from ST10 were more heterogeneous (Figures 5, 6). This is interesting as the tested ST10 were mostly clade 1, so ST10 was, as expected, of lower genetic diversity compared to tested ST74 (partly shown in Figure 1D). Could the authors confirm this by constructing an SNP table separately for tested ST10 and ST74? Additionally, the tested ST10 did not represent the phylogenetic diversity of the global epidemiology, and this limitation should be reflected in the Discussion.

      (5) The comparative genomics between ST10 and ST74 can be further improved to allow more interpretation of the experiments. Why were only SPI-1, 2, 6, and 19 included in the search for virulome, and what about other SPIs? ST74 lacks SPI-19 and has truncated SPI-6, so what would explain the larger genome size of ST74? Have the authors screened for other SPIs using more well-annotated databases or references (S. Typhi CT18 or S. Typhimurium ST313)? The mismatch between in silico prediction of invasiveness and phenotypes also warrants a brief discussion, perhaps linked to the larger ST74 genome size (as intracellular lifestyle is usually linked with genome degradation).

      (6) On the epidemiology scale, ST10 is more successful, perhaps due to its ongoing adaptation to replication inside GI epithelial cells, favouring shedding. ST74 may tend to cause more invasive disease and less transmission via fecal shedding. The presence of T6SS in ST10 also can benefit its competition with other gut commensals, overcoming gut colonization resistance. The reviewer thinks that these details should be more clearly rephrased in the Discussion, as the results highly suggested different adaptations of two genotypes of the same serovar, leading to different epidemiological success.

    3. Reviewer #2 (Public review):

      This is a comprehensive analysis of Salmonella Dublin genomes that offers insights into the global spread of this pathogen and region-specific traits that are important to understanding its evolution. The phenotyping of isolates of ST10 and ST74 also offers insights into the variability that can be seen in S. Dublin, which is also seen in other Salmonella serovars, and reminds the field that it is important to look beyond lab-adapted strains to truly understand these pathogens. This is a valuable contribution to the field. The only limitation, which the authors also acknowledge, is the bias towards S. Dublin genomes from high-income settings. However, there is no selection bias; this is simply a consequence of publicly available sequences.

    1. eLife Assessment

      Following up on their previous work, the authors investigated whether cell-to-cell transmission of HIV-1 activates the CARD8 inflammasome in macrophages. This is important given that inflammasome activation in myeloid cells triggers proinflammatory cytokine release. The data are solid and support the idea that CARD8 is activated by the viral protease and promotes inflammation. However, time-course analyses in primary T cells and macrophages and further information on the specific inflammasome involved would further increase the significance of the study.

    2. Joint Public Review:

      Following up on their previous work, the authors investigated whether cell-to-cell transmission of HIV-1 activates the CARD8 inflammasome in macrophages, an important question given that inflammasome activation in myeloid cells triggers proinflammatory cytokine release. The data support the idea that CARD8 is activated by the viral protease and promotes inflammation. However, time-course analyses in primary T cells and macrophages and further information on the specific inflammasome involved would further increase the significance of the study.

      Strengths:

      The manuscript is well-written and the data is of good quality. The evidence that CARD8 senses the HIV-1 protease in the context of cell-to-cell transmission is important since cell-to-cell transmission is thought to play a key role in viral spread in vivo, and inflammation is a major driver of disease progression. Clean knockout experiments in primary macrophages are a notable strength and the results clearly support the role of CARD8 in protease-dependent sensing of viral spread and the induction of IL-1β release and cell death. The finding that HIV-1 strains that are resistant to protease inhibitors differ in CARD8 activation and IL-1β production is interesting and underscores the potential clinical relevance of these results.

      Weaknesses:

      One weakness is that the authors used T cell lines, which might not faithfully reflect the efficiency of HIV-1 production and cell-cell transfer by primary T cells. To assess whether CARD8 is also activated by protease from incoming viral particles, earlier time points should be analyzed. Finally, while the authors exclude a role of NLRP3 in IL-1β release and the death of macrophages, it would be interesting to know whether the effect is still Gasdermin D dependent.

    3. Author response:

      Thank you for the positive and constructive feedback on our manuscript. We appreciate you highlighting the importance of our work advancing our understanding of the molecular etiology of acquired immunodeficiency syndrome (AIDS). To extend and further substantiate the observation that the CARD8 inflammasome is activated in response to viral protease during HIV-1 cell-to-cell transmission, we are in the process of completing additional experiments that are responsive to reviewer feedback, including:

      • Primary CD4+ T cell to monocyte-derived macrophage (MDM) transmission:  We have now repeated the cell-to-cell experiments with HIV-1 transfer from primary CD4+ T cells to primary monocyte-derived macrophages, and our findings are consistent with CARD8-dependent IL-1β release from HIV-1-infected macrophages in this more physiologic context. We are in the process of repeating these experiments with additional donors and will add these results to the revised manuscript.

      • Heterogeneity amongst blood donors: We have now repeated the cell-to-cell transfer and CARD8 knockout in MDMs with additional donors. While we continue to observe heterogeneity amongst donors, the key observation that CARD8 is required for inflammasome responses to HIV-1 infection is consistent. We note that some donors, including the one individual reported in the first submission, have markedly diminished CARD8 activity (to both HIV-1 and VbP).

      • Time course experiments: We did conduct a time course experiment when initially establishing these assays. We have now repeated these experiments with additional timepoints and in the presence or absence of the RT inhibitor nevirapine. The results of these experiments will be included in the revised manuscript.

      • The role of Gasdermin D: We are mostly interested in the release of IL-1β from the infected macrophages due to its potential contribution to myeloid-driven inflammation in PLWH. To date, there is no evidence that any pore-forming protein other than GSDMD can initiate IL-1β release (and pyroptosis) downstream of CARD8. Nonetheless, we will attempt this experiment with the Gasdermin D inhibitor, disulfiram.

      We believe these and other experiments will further support the importance of the CARD8 inflammasome in myeloid-driven inflammation in PLWH and look forward to submitting the revision.

    1. eLife Assessment

      This valuable study investigates prey capture by archer fish, showing that even though the visuomotor behavior unfolds very rapidly (within 40-70 ms), it is not hardwired; it can adapt to different simulated physics and different prey shapes. Although there was agreement that the model system, experimental design, and main hypothesis are certainly interesting, opinions were divided on whether the evidence supporting the central claims is incomplete. A more rigorous definition and assessment of "reflex speed", more detailed evidence of stimulus control, and a more detailed analysis of individual subjects could potentially increase confidence in the main conclusions.

    2. Reviewer #1 (Public review):

      Summary:

      The authors test whether the archerfish can modulate the fast response to a falling target. By manipulating the trajectory of the target, they claim that the fish can modulate the fast response. While it is clear from the result that the fish can modulate the fast response, the experimental support for the argument that the fish can do it for a reflex-like behavior is inadequate.

      Strengths:

      Overall, the question that the authors raised in the manuscript is interesting.

      Weaknesses:

      (1) The argument that the fish can modulate reflex-like behavior relies on the claim that the archerfish makes the decision in 40 ms. There is little support for the 40 ms reaction time. The reaction time for the same behavior in Schlegel 2008, is 60-70 ms, and in Tsvilling 2012 about 75 ms, if we take the half height of the maximum as the estimated reaction time in both cases. If we take the peak (or average) of the distribution as an estimation of reaction time, the reaction time is even longer. This number is critical for the analysis the authors perform since if the reaction time is longer, maybe this is not a reflex as claimed. In addition, mentioning the 40 ms in the abstract is overselling the result. The title is also not supported by the results.

      (2) A critical technical issue of the stimulus delivery is not clear. The frame rate is 120 FPS and the target horizontal speed can be up to 1.775 m/s. This produces a target jumping on the screen 15 mm in each frame. This is not a continuous motion. Thus, the similarity between the natural system where the target experiences ballistic trajectory and the experiment here is not clear. Ideally, another type of stimulus delivery system is needed for a project of this kind that requires fast-moving targets (e.g. Reiser, J. Neurosci.Meth. 2008). In addition, the screen is rectangular and not circular, so in some directions, the target vanishes earlier than others. It must produce a bias in the fish response but there is no analysis of this type.

      (3) The results here rely on the ability to measure the error of response in the case of a virtual experiment. It is not clear how this is done since the virtual target does not fall. How do the authors validate that the fish indeed perceives the virtual target as the falling target? Since the deflection is at a later stage of the virtual trajectory, it is not clear what is the actual physics that governs the world of the experiment. Overall, the experimental setup is not well designed.
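      The per-frame displacement quoted in point (2) follows directly from the stated frame rate and target speed. A minimal sketch of that arithmetic is given below; the function name is hypothetical and only the two numbers (120 FPS, 1.775 m/s) come from the review.

```python
def step_per_frame_mm(speed_m_per_s: float, frame_rate_hz: float) -> float:
    """Distance in millimetres that a target moves between two consecutive frames."""
    if frame_rate_hz <= 0:
        raise ValueError("frame rate must be positive")
    # distance per frame = speed * frame duration; convert metres to millimetres
    return speed_m_per_s / frame_rate_hz * 1000.0


# Values quoted in the review: 1.775 m/s at 120 frames per second
print(round(step_per_frame_mm(1.775, 120.0), 1))  # 14.8
```

      The result, about 14.8 mm, is consistent with the roughly 15 mm per frame stated above.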

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript studies prey capture by archer fish, which observe the initial values of motion of aerial prey they made fall by spitting on them, and then rapidly turn to reach the ballistic landing point on the water surface. The question raised by the article is whether this incredibly fast decision-making process is hardwired and thus unmodifiable or can be adjusted by experience to follow a new rule, namely that the landing point is deflected from a certain amount of the expected ballistic landing point. The results show that the fish learn the new rule and use it afterward in a variety of novel situations that include height, side, and speed of the prey, and which preserve the speed of the fish's decision. Moreover, a remarkable finding presented in this work is the fact that fish that have learned to use the new rule can relearn to use the ballistic landing point for an object based on its shape (a triangle) while keeping simultaneously the 'deflected rule' for an object differing in shape (a disc); in other words, fish can master simultaneously two decision-making rules based on the different shape of objects.

      Strengths:

      The manuscript relies on a sophisticated and clever experimental design that allows changing the apparent landing point of a virtual prey using a virtual reality system. Several robust controls are provided to demonstrate the reliability and usefulness of the experimental setup.

      Overall, I very much like the idea conveyed by the authors that even stimuli triggering apparently hardwired responses can be relearned in order to be associated with a different response, thus showing the impressive flexibility of circuits that are sometimes considered mediating pure reflexive responses. This is the case - as an additional example - of the main component of the Nasanov pheromone of bees (geraniol), which triggers immediate reflexive attraction and appetitive responses, and which can, nevertheless, be learned by bees in association with an electric shock so that bees end up exhibiting avoidance and the aversive response of sting extension to this odorant (1), which is a fully unnatural situation, and which shows that associative aversive learning is strong enough to override preprogrammed responding, thus reflecting an impressive behavioral flexibility.

      Weaknesses:

      As a general remark, there is some information that I missed and that is mandatory in the analysis of behavioral changes.

      Firstly, the variability in the performances displayed. The authors mentioned that the results reported come from 6 fish (which is a low sample size). How were the individual performances in terms of consistency? Were all fish equally good in adjusting/learning the new rule? How did errors vary according to individual identity? It seems to me that this kind of information should be available as the authors reported that individual fish could be recognized and tracked (see lines 620-635) and is essential for appreciating the flexibility of the system under study.

      Secondly, the speed of the learning process is not properly explained. Admittedly, fish learn in an impressive way the new rule and even two rules simultaneously; yet, how long did they need to achieve this? In the article, Figure 2 mentions that at least 6 training stages (each defined as a block of 60 evaluated turn decisions, which actually shows that the standard term 'Training Block' would be more appropriate) were required for the fish to learn the 'deflected rule'. While this means 360 trials (turning starts), I was left with the question of how long this process lasted. How many hours, days, and weeks were needed for the fish to learn? And as mentioned above, were all fish equally fast in learning? I would appreciate explaining this very important point because learning dynamics is relevant to understanding the flexibility of the system.

      Reference:

      (1) Roussel, E., Padie, S. & Giurfa, M. Aversive learning overcomes appetitive innate responding in honeybees. Anim Cogn 15, 135-141, doi:10.1007/s10071-011-0426-1 (2012).

    4. Author response:

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      The authors test whether the archerfish can modulate the fast response to a falling target.

      We have not tested whether archerfish can 'modulate the fast response'. We quantitatively test specific hypotheses on the rules used by the fish. For this the accuracy of the decisions is analyzed with respect to specific points that can be calculated precisely in each experiment. The ill-defined term 'modulate' does in no way capture what is done here. This assessment might explain the question, raised by the reviewer, of 'what is the difference of this study and Reinel, 2016' (i.e. Reinel and Schuster, 2016). In that study, all objects were strictly falling ballistically, and latency and accuracy of the turn decisions were determined when the initial motion was not only horizontal but had an additional vertical component of speed. The question of that study was whether the need to account for an additional variable (vertical speed) in the decision would affect its latency or accuracy. The study showed that also then archerfish rapidly turn to the later impact point. It also showed that accuracy and latency (defined in this study exactly as in the present study) were not changed by the added degree of freedom. This is a completely different question and by its very nature does not leave the realm of ballistics.

      By manipulating the trajectory of the target, they claim that the fish can modulate the fast response.

      While it is clear from the result that the fish can modulate the fast response, the experimental support for the argument that the fish can do it for a reflex-like behavior is inadequate. 

      This is disturbing: The manuscript is full of data that directly report response latency (a parameter that's critical in all experiments) and there are even graphical displays of the distribution of latency (Figs. 2, 5). How fast the responses are can also already be seen in the first video. Most importantly, the nature of the 40 ms limit was discovered and reported by our group in 2008 (Schlegel and Schuster, 2008, Fig. 4). For easy reference, we attach Schlegel and Schuster, 2008 with the relevant passages marked in yellow. But later studies also using high speed video (i.e. typically 500 fps) and simultaneously evaluating accuracy and kinematics (in the same ways as used here!) to address various questions repeatedly report and even graphically represent minimum latencies of 40 ms, e.g. Krupczynski and Schuster, 2013 (e.g. Fig. 2); Reinel and Schuster, 2014; Reinel and Schuster, 2016; Reinel and Schuster, 2018a, b (e.g. see Fig. 7 in the first part) and report how latency is increased as urgency is decreased (if the fish are too close or time of falling is increased), as temperature is decreased or as viewing conditions and their homogeneity across the tank change. Moreover, even a field study is available (Rischawy, Blum and Schuster, 2015) that shows why the speed is needed. This is because of massive competition with at least some of the competitor fish also being able to turn to the later impact point. So, speed is an absolute necessity if competitors are around. Interestingly, when the fish are isolated, latency goes up and eventually the fish will no longer respond with C-starts (Schlegel and Schuster, 2008).

      Another aspect: considering the introduction it would not even have mattered if not 40 ms but instead 150 ms were the time needed for an accurate start (which is not the case). That would still be faster than an Olympic sprinter responds to a gun shot. Moreover, please also note that we carefully talk of reflex-speed not of a reflex-behavior (which is as easy to verify as any other of the false statements made).

      Strengths: 

      Overall, the question that the authors raised in the manuscript is interesting. 

      Given the statement of no difference between the present study and Reinel and Schuster, 2016, it is not clear what this assessment refers to.

      Weaknesses: 

      (1) The argument that the fish can modulate reflex-like behavior relies on the claim that the archerfish makes the decision in 40 ms. There is little support for the 40 ms reaction time.

      The 'little support' is a paper in Science in which this important aspect is directly analyzed (Fig. 4 of that paper) and that has been praised by folks like Yadin Dudai (e.g. in Faculty 1000). The support is also data on latency as presented in the present paper. Furthermore, additional publications are available on the reaction time (see above).

      The reaction time for the same behavior in Schlegel 2008, is 60-70 ms, and in Tsvilling 2012 about 75 ms, if we take the half height of the maximum as the estimated reaction time in both cases. If we take the peak (or average) of the distribution as an estimation of reaction time, the reaction time is even longer. This number is critical for the analysis the authors perform since if the reaction time is longer, maybe this is not a reflex as claimed.

      See above.

      In addition, mentioning the 40 ms in the abstract is overselling the result.

      See above.

      Just for completeness: Considering a very interesting point raised by reviewer 2 we add an additional panel to further emphasize the exciting point that accuracy and latency are unrelated in the start decisions. That point was already made in Fig.4 of the paper in Science but can be directly addressed.  

      The title is also not supported by the results. 

      No: the title is clearly supported by the results that are reported in the paper.

      (2) A critical technical issue of the stimulus delivery is not clear.

      The stimulus delivery is described in detail. Most importantly we emphasize (even mentioning frame rate) that all VR setups require experimental confirmation that they work for the species and for the behavior at hand. Ideally, they should elicit the same behavior (in all aspects) as a real stimulus does that the VR approach intends to mimic. Whether VR works in a given animal and for the behavior at hand in that animal cannot be known or postulated a priori. It must be shown in direct critical experiments. Such experiments and the need to perform them are described in detail in Figure 2 and in the text that is associated with that figure.

      The frame rate is 120 FPS and the target horizontal speed can be up to 1.775 m/s. This produces a target jumping on the screen 15 mm in each frame. This is not a continuous motion. Thus, the similarity between the natural system where the target experiences ballistic trajectory and the experiment here is not clear. Ideally, another type of stimulus delivery system is needed for a project of this kind that requires fast-moving targets (e.g. Reiser, J. Neurosci.Meth. 2008).

      See above. It is quite funny that one of the authors of the present study had been involved in developing a setup with a complete panorama of 6000 LEDs (Strauss, Schuster and Götz, 1997; and appropriately cited in Reiser) that has been the basis for Reiser. This panorama was also used to successfully implement VR in freely walking Drosophila (Schuster et al., Curr. Biol., 2002). However, an LED based approach was abandoned because of insufficient spatial resolution (that, in archerfish, is very different from that of Drosophila).

      But the crucial point really is this: Just looking at Figure 2 shows that our approach could not have worked better in any way - it provided the input needed to cause turn decisions that are in all aspects just as those with real objects. Achieving this was not at all trivial and required enormous effort and many failed attempts. But it allows addressing our questions for the first time after 20 years of studying these interesting decisions.

      In addition, the screen is rectangular and not circular, so in some directions, the target vanishes earlier than others. It must produce a bias in the fish response but there is no analysis of this type. 

      Why 'must' it produce a bias? Is it not conceivable that you can only use a circular part of the screen? Briefly, and as could have been checked by quickly looking into the methods section, this is what we did. But still, why would it have mattered in our strictly randomized design? It could have mattered only in a completely silly way of running the experiments in which exclusively long trajectories are shown in one condition and exclusively short ones in another.

      (3) The results here rely on the ability to measure the error of response in the case of a virtual experiment. It is not clear how this is done since the virtual target does not fall.

      Well, of course it does not fall!!! That is the whole point that enables the study, and this is explained in connection with the glass plate experiment of Fig. 1 and quite some text is devoted to say that this is the starting point for the present analysis. The ballistic impact point is calculated (just as explained in our very first paper on the start decisions, Rossel, Corlija and Schuster, 2002) from the initial speed and height of the target, using simple high-school physics and the justification for that is also in that paper. This has been done already more than 20 years ago. How else could that paper have arrived at the conclusion that the fish turned to the virtual impact point even though nothing is falling? We also describe this for the readers of the present study, illustrate how accuracy is determined in Figures, in all videos and in an additional Supplementary Figure. Consulting the paper reveals that orientation of the fish is determined immediately at the end of stage 2 of its C-start and the error directly reports how close continuing in that direction would lead the fish to the (real or virtual) impact point. This measure has also been used since the first paper in 2002 in our lab and it is very useful because it provides an invariant measure that allows pooling all the different conditions (orientation and position of responding fish as well as direction, speed and height of target).
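      The "simple high-school physics" referred to above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' analysis code: the initial motion is taken to be purely horizontal, air resistance is neglected, g is set to 9.81 m/s^2, and the function and parameter names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def ballistic_landing_point(x0, y0, vx, vy, height, g=G):
    """Return ((x, y), t_fall) for a target that starts at (x0, y0) above the
    water with horizontal velocity (vx, vy) in m/s and falls freely from
    `height` metres. Assumes no initial vertical speed and no air resistance."""
    if height < 0 or g <= 0:
        raise ValueError("height must be non-negative and g positive")
    # free fall from rest in the vertical direction: height = g * t^2 / 2
    t_fall = math.sqrt(2.0 * height / g)
    # horizontal motion is uniform during the fall
    return (x0 + vx * t_fall, y0 + vy * t_fall), t_fall


# Hypothetical example: target 0.3 m above the water, moving at 1.5 m/s along x
point, t_fall = ballistic_landing_point(0.0, 0.0, 1.5, 0.0, 0.3)
print(point, t_fall)
```

      With a point computed in this way, the aim of a fish can be compared against a landing point even when, as in the virtual experiments, nothing actually falls.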

      How do the authors validate that the fish indeed perceives the virtual target as the falling target?

      See above. The fish produce C-starts (whose kinematics are analyzed and reported in Figures), whose latency is measured (from onset of target motion to onset of C-start) and whose accuracy in aligning them to the calculated virtual impact point is measured (see above). Additionally, the errors are also analyzed to other points of interest, for instance landmarks, the ballistic landing point in the re-trained fish or points calculated on the basis of specific hypotheses in the generalization experiments.

      Since the deflection is at a later stage of the virtual trajectory, it is not clear what is the actual physics that governs the world of the experiment.

      As explained in the text what we need is substituting the ballistic connection with another fixed relation between initial target motion and the landing point. This other relation needs to produce a large error in the aims when they remain based on the ballistic virtual landing point. It is directly shown in the key experiments that the fish need not see the deflection but can respond appropriately to the initial motion after training (Figs. 3, 5 and corresponding paragraphs in the text as well as additional movies). Please also note that after training the decision is based on the initial movement. This is shown in the interspersed experiments in which nothing other than the initial (pre-deflection) movement was shown.

      Overall, the experimental setup is not well designed. 

      It is obviously designed well enough to mimic the natural situation in every aspect needed (see Fig. 2) and well enough to answer the questions we have asked.

      Reviewer #2 (Public review): 

      Summary: 

      This manuscript studies prey capture by archer fish, which observe the initial values of motion of aerial prey they made fall by spitting on them, and then rapidly turn to reach the ballistic landing point on the water surface. The question raised by the article is whether this incredibly fast decision-making process is hardwired and thus unmodifiable or can be adjusted by experience to follow a new rule, namely that the landing point is deflected by a certain amount from the expected ballistic landing point. The results show that the fish learn the new rule and use it afterward in a variety of novel situations that include height, side, and speed of the prey, and which preserve the speed of the fish's decision. Moreover, a remarkable finding presented in this work is the fact that fish that have learned to use the new rule can relearn to use the ballistic landing point for an object based on its shape (a triangle) while simultaneously keeping the 'deflected rule' for an object differing in shape (a disc); in other words, fish can simultaneously master two decision-making rules based on the different shapes of objects.

      Strengths: 

      The manuscript relies on a sophisticated and clever experimental design that allows changing the apparent landing point of a virtual prey using a virtual reality system. Several robust controls are provided to demonstrate the reliability and usefulness of the experimental setup. 

      Overall, I very much like the idea conveyed by the authors that even stimuli triggering apparently hardwired responses can be relearned in order to be associated with a different response, thus showing the impressive flexibility of circuits that are sometimes considered mediating pure reflexive responses.

      Thank you so much for this precise assessment of what we have shown!

      This is the case - as an additional example - of the main component of the Nasanov pheromone of bees (geraniol), which triggers immediate reflexive attraction and appetitive responses, and which can, nevertheless, be learned by bees in association with an electric shock so that bees end up exhibiting avoidance and the aversive response of sting extension to this odorant (1), which is a fully unnatural situation, and which shows that associative aversive learning is strong enough to override preprogrammed responding, thus reflecting an impressive behavioral flexibility. 

      That's very interesting, thanks.

      Weaknesses: 

      As a general remark, there is some information that I missed and that is mandatory in the analysis of behavioral changes. 

      Firstly, the variability in the performances displayed. The authors mentioned that the results reported come from 6 fish (which is a low sample size). How were the individual performances in terms of consistency? Were all fish equally good in adjusting/learning the new rule? How did errors vary according to individual identity? It seems to me that this kind of information should be available as the authors reported that individual fish could be recognized and tracked (see lines 620-635) and is essential for appreciating the flexibility of the system under study. 

      Secondly, the speed of the learning process is not properly explained. Admittedly, fish learn in an impressive way the new rule and even two rules simultaneously; yet, how long did they need to achieve this? In the article, Figure 2 mentions that at least 6 training stages (each defined as a block of 60 evaluated turn decisions, which actually shows that the standard term 'Training Block' would be more appropriate) were required for the fish to learn the 'deflected rule'. While this means 360 trials (turning starts), I was left with the question of how long this process lasted. How many hours, days, and weeks were needed for the fish to learn? And as mentioned above, were all fish equally fast in learning? I would appreciate explaining this very important point because learning dynamics is relevant to understanding the flexibility of the system. 

      First, it is very important to keep the question in mind that we wanted to clarify: Does the system have the potential to re-tune the decisions to other non-ballistic relations between the input variables and the output? This would have been established if one fish was found capable of doing that. However, we do have sufficient evidence to say that all six fish learned the new law and that at least one (actually four) individual was capable of simultaneously handling the two laws. We will explain this much better (hopefully) in our revised version. We also have to stress that not all archerfish might actually be able to do this and that not all archerfish might learn in the same way, at the same speed, or using the same strategies. These questions are extremely interesting and we therefore definitely will include all evidence that we have. If some individuals are better than others in quickly adjusting, then even observational learning could become a part of the story. However, we needed to make and document the first steps. Understanding these is essential and apparently is difficult enough.

      Reference: 

      (1) Roussel, E., Padie, S. & Giurfa, M. Aversive learning overcomes appetitive innate responding in honeybees. Anim Cogn 15, 135-141, doi:10.1007/s10071-011-0426-1 (2012). 

      Thanks for this reference!

    1. eLife Assessment

      This study provides evidence that cerebellar projections to the thalamus are required for learning and execution of motor skills in the accelerating rotarod task. This important study adds to a growing body of literature on the interactions between the cerebellum, motor cortex, and basal ganglia during motor learning. The data presentation is generally sound, especially the main observations, with some limitations in describing the statistical methods and a lack of support for two segregated cerebello-thalamic pathways, which is incomplete in supporting the overall claim.

    2. Reviewer #1 (Public review):

      This is an interesting manuscript tackling the issue of whether subcircuits of the cerebellum are differentially involved in processes of motor performance, learning, or learning consolidation. The authors focus on cerebellar outputs to the ventrolateral thalamus (VL) and to the centrolateral thalamus (CL), since these thalamic nuclei project to the motor cortex and striatum respectively, and thus might be expected to participate in diverse components of motor control and learning. In mice challenged with an accelerating rotarod, the investigators reduce cerebellar output either broadly, or in projection-specific populations, with CNO targeting DREADD-expressing neurons. They first establish that there are not major control deficits with the treatment regime, finding no differences in basic locomotor behavior, grid test, and fixed-speed rotarod. This is interpreted to allow them to differentiate control from learning, and their inter-relationships. These manipulations are coupled with chronic electrophysiological recordings targeted to the cerebellar nuclei (CN) to control for the efficacy of the CNO manipulation. I found the manuscript intriguing, offering much food for thought, and am confident that it will influence further work on motor learning consolidation. The issue of motor consolidation supported by the cerebellum is timely and interesting, and the claims are novel. There are some limitations to the data presentation and claims, highlighted below, which, if amended, would improve the manuscript.

      (1) Statistical analyses: There is too little information provided about how the Deming regressions, mean points, slopes, and intercepts were compared across conditions. This is important since in the heart of the study when the effects of inactivating CL- vs VL- projecting neurons are being compared to control performance, these statistical methods become paramount. Details of these comparisons and their assumptions should be added to the Methods section. As it stands I barely see information about these tests, and only in the figure legends. I would also like the authors to describe whether there is a criterion for significance in a given correlation to be then compared to another. If I have a weak correlation for a regression model that is non-significant, I would not want to 'compare' that regression to another one since it is already a weak model. The authors should comment on the inclusion criteria for using statistics on regression models.

      (2) The introduction makes the claim that the cerebellar feedback to the forebrain and cortex are functionally segregated. I interpreted this to mean that the cerebellar output neurons are known to project to either VL or CL exclusively (i.e. they do not collateralize). I was unaware of this knowledge and could find no support for the claim in the references provided (Proville 2014; Hintzen 2018; Bostan 2013). Either I am confused as to the authors' meaning or the claim is inaccurate. This point is broader however than some confusion about citation. The study assumes that the CN-CL population and CN-VL population are distinct cells, but to my knowledge, this has not been established. It is difficult to make sense of the data if they are entirely the same populations, unless projection topography differs, but in any event, it is critical to clarify this point: are these different cell types from the nuclei?; how has that been rigorously established?; is there overlap? No overlap? Etc. Results should be interpreted in light of the level of this knowledge of the anatomy in the mouse or rat.

      (3) It is commendable that the authors perform electrophysiology to validate DREADD/CNO. So many investigators don't bother and I really appreciate these data. Would the authors please show the 'wash' in Figure 1a, so that we can see the recovery of the spiking hash after CNO is cleared from the system? This would provide confidence that the signal is not disappearing for reasons of electrode instability or tissue damage/ other.

      (4) I don't think that the "Learning" and "Maintenance" terminology is very helpful and in fact may sow confusion. I would recommend that the authors use a day range " Days 1-3 vs 4-7" or similar, to refer to these epochs. The terminology chosen begs for careful validation, definitions, etc, and seems like it is unlikely uniform across all animals, thus it seems more appropriate to just report it straight, defining the epochs by day. Such original terminology could still be used in the Discussion, with appropriate caveats.

      (5) Minor, but, on the top of page 14 in the Results, the text states, "Suggesting the presence of a 'critical period' in the consolidation of the task". I think this is a non-standard use of 'critical period' and should be removed. If kept, the authors must define what they mean specifically and provide sufficient additional analyses to support the idea. As it stands, the point will sow confusion.

    3. Reviewer #2 (Public review):

      Summary:

      This study examines the contribution of cerebello-thalamic pathways to motor skill learning and consolidation in an accelerating rotarod task. The authors use chemogenetic silencing to manipulate the activity of cerebellar nuclei neurons projecting to two thalamic subregions that target the motor cortex and striatum. By silencing these pathways during different phases of task acquisition (during the task vs after the task), the authors report valuable findings of the involvement of these cerebellar pathways in learning and consolidation.

      Strengths:

      The experiments are well-executed. The authors perform multiple controls and careful analysis to solidly rule out any gross motor deficits caused by their cerebellar nuclei manipulation. The finding that cerebellar projections to the thalamus are required for learning and execution of the accelerating rotarod task adds to a growing body of literature on the interactions between the cerebellum, motor cortex, and basal ganglia during motor learning. The finding that silencing the cerebellar nuclei after a task impairs the consolidation of the learned skill is interesting.

      Weaknesses:

      While the controls for a lack of gross motor deficit are solid, the data seem to show some motor execution deficit when cerebellar nuclei are silenced during task performance. This deficit could potentially impact learning when cerebellar nuclei are silenced during task acquisition. Separately, I find the support for two separate cerebello-thalamic pathways incomplete. The data presented do not clearly show the two pathways are anatomically parallel. The difference in behavioral deficits caused by manipulating these pathways also appears subtle.

    4. Reviewer #3 (Public review):

      Summary:

      Varani et al present important findings regarding the role of distinct cerebellothalamic connections in motor learning and performance. Their key findings are that:

      (1) cerebellothalamic connections are important for learning motor skills

      (2) cerebellar efferents specifically to the central lateral (CL) thalamus are important for short-term learning

      (3) cerebellar efferents specifically to the ventral anterior lateral (VAL) complex are important for offline consolidation of learned skills, and

      (4) that once a skill is acquired, cerebellothalamic connections become important for online task performance.

      The authors went to great lengths to separate effects on motor performance from learning, for the most part successfully. While one could argue about some of the specifics, there is little doubt that the CN-CL and CN-VAL pathways play distinct roles in motor learning and performance. An important next step will be to dissect the downstream mechanisms by which these cerebellothalamic pathways mediate motor learning and adaptation.

      Strengths:

      (1) The dissociation between online learning through CN-CL and offline consolidation through CN-VAL is convincing.

      (2) The ability to tease learning apart from performance using their titrated chemogenetic approach is impressive. In particular, their use of multiple motor assays to demonstrate preserved motor function and balance is an important control.

      (3) The evidence supporting the main claims is convincing, with multiple replications of the findings and appropriate controls.

      Weaknesses:

      (1) Despite the care the authors took to demonstrate that their chemogenetic approach does not impair online performance, there is a trend towards impaired rotarod performance at higher speeds in Supplementary Figure 4f, suggesting that there could be subtle changes in motor performance below the level of detection of their assays.

      (2) There is likely some overlap between CN neurons projecting to VAL and CL, somewhat limiting the specificity of their conclusions.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      This is an interesting manuscript tackling the issue of whether subcircuits of the cerebellum are differentially involved in processes of motor performance, learning, or learning consolidation. The authors focus on cerebellar outputs to the ventrolateral thalamus (VL) and to the centrolateral thalamus (CL), since these thalamic nuclei project to the motor cortex and striatum respectively, and thus might be expected to participate in diverse components of motor control and learning. In mice challenged with an accelerating rotarod, the investigators reduce cerebellar output either broadly, or in projection-specific populations, with CNO targeting DREADD-expressing neurons. They first establish that there are not major control deficits with the treatment regime, finding no differences in basic locomotor behavior, grid test, and fixed-speed rotarod. This is interpreted to allow them to differentiate control from learning, and their inter-relationships. These manipulations are coupled with chronic electrophysiological recordings targeted to the cerebellar nuclei (CN) to control for the efficacy of the CNO manipulation. I found the manuscript intriguing, offering much food for thought, and am confident that it will influence further work on motor learning consolidation. The issue of motor consolidation supported by the cerebellum is timely and interesting, and the claims are novel. There are some limitations to the data presentation and claims, highlighted below, which, if amended, would improve the manuscript.

      We thank the reviewer for the positive comments and insightful critiques.

      (1.1) Statistical analyses: There is too little information provided about how the Deming regressions, mean points, slopes, and intercepts were compared across conditions. This is important since in the heart of the study when the effects of inactivating CL- vs VL- projecting neurons are being compared to control performance, these statistical methods become paramount. Details of these comparisons and their assumptions should be added to the Methods section. As it stands I barely see information about these tests, and only in the figure legends. I would also like the authors to describe whether there is a criterion for significance in a given correlation to be then compared to another. If I have a weak correlation for a regression model that is non-significant, I would not want to 'compare' that regression to another one since it is already a weak model. The authors should comment on the inclusion criteria for using statistics on regression models.

      Currently the Methods indeed explain that groups are compared by testing differences of distributions of residuals of treatment and control groups around the Deming regression of the control groups: “To test if treatments altered the relationship between initial performance vs learning or daily vs overnight learning, we compared the distribution of signed distance to the control Deming regression line between groups.” But this shall indeed be explained in more detail.

      The performance on a given day depends on a cumulative process, so that the average measure of performance is not fully informative about what is learned or what is changed by a treatment (this is further explained in the text p9-10). The challenge is to deal with the multivariate relationships where initial performance, daily learning, and consolidated learning are interdependent. While in control groups these quantities show linear relationships, this is far less the case in treatment groups; this may indeed be due to the variability of the effect of the treatment (efficacy of viral injections), which adds to the intrinsic variability in the absence of treatment.

      To test whether treatments shift these relationships, we examine the extent to which treatment points in the bivariate comparisons (initial performance × daily learning, daily learning × consolidated learning) are evenly distributed around the control-group regression line. We take a significant difference in the distribution of residuals between the control and treatment groups as an indication that the process represented in the comparison is disrupted by the treatment: e.g. if the residuals of the treatment group are lower than those of the control group in the initial performance × daily learning comparison, it indicates that learning is slower (or larger). If the residuals of the treatment group are lower than those of the control group in the daily learning × consolidated learning comparison, it indicates that consolidation is lower. This shall be clarified in a revised version.
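
      As an illustration of the procedure described above, the following minimal sketch fits a Deming regression to control-group points and computes the signed distance of every point to that line. It is not the authors' code: it assumes equal error variances on both axes (`delta = 1`, i.e. orthogonal regression, for which the perpendicular distance is the natural residual), the data values are invented for the example, and the two-sample test used to compare the resulting distributions is left open because the response does not name it.

```python
import math


def deming_fit(xs, ys, delta=1.0):
    """Deming regression line y = intercept + slope * x.

    delta is the assumed ratio of the y-error variance to the x-error
    variance; delta = 1 gives orthogonal regression.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    if sxy == 0:
        raise ValueError("zero covariance: slope is undefined")
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


def signed_distances(xs, ys, slope, intercept):
    """Signed perpendicular distance of each point to the line
    (positive above the line, negative below)."""
    norm = math.sqrt(1.0 + slope ** 2)
    return [(y - (intercept + slope * x)) / norm for x, y in zip(xs, ys)]


if __name__ == "__main__":
    # Hypothetical values: x = initial performance, y = daily learning.
    ctrl_x, ctrl_y = [60, 80, 100, 120, 140], [55, 48, 41, 30, 22]
    trt_x, trt_y = [65, 85, 95, 125], [35, 30, 22, 12]

    slope, intercept = deming_fit(ctrl_x, ctrl_y)
    d_ctrl = signed_distances(ctrl_x, ctrl_y, slope, intercept)
    d_trt = signed_distances(trt_x, trt_y, slope, intercept)
    print("control distances:  ", [round(d, 2) for d in d_ctrl])
    print("treatment distances:", [round(d, 2) for d in d_trt])
```

      The two lists of signed distances would then be passed to a two-sample comparison; treatment distances that sit systematically below the control distances correspond to the downward shift discussed in the response.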

      (1.2a) The introduction makes the claim that the cerebellar feedback to the forebrain and cortex are functionally segregated. I interpreted this to mean that the cerebellar output neurons are known to project to either VL or CL exclusively (i.e. they do not collateralize). I was unaware of this knowledge and could find no support for the claim in the references provided (Proville 2014; Hintzen 2018; Bostan 2013). Either I am confused as to the authors' meaning or the claim is inaccurate. This point is broader however than some confusion about citation.

      The references are not cited in the context of collaterals: “They [basal ganglia and cerebellum] send projections back to the cortex via anatomically and functionally segregated channels, which are relayed by predominantly non-overlapping thalamic regions (Bostan, Dum et al. 2013, Proville, Spolidoro et al. 2014, Hintzen, Pelzer et al. 2018).” Indeed, the thalamic compartments targeted by the basal ganglia and cerebellum are distinct, and in Proville et al. 2014 we showed some functional segregation of the cerebello-cortical projections (whisker vs orofacial ascending projections). We do not claim that there is a full segregation of the two pathways; there is indeed some known degree of collateralization (see below).

      (1.2b) The study assumes that the CN-CL population and CN-VL population are distinct cells, but to my knowledge, this has not been established. It is difficult to make sense of the data if they are entirely the same populations, unless projection topography differs, but in any event, it is critical to clarify this point: are these different cell types from the nuclei?; how has that been rigorously established?; is there overlap? No overlap? Etc. Results should be interpreted in light of the level of this knowledge of the anatomy in the mouse or rat.

      The study does not assume that CL-projecting and VAL-projecting neurons are entirely separate populations (it is in fact known that they overlap), but states that inhibition of neurons following retrograde infections from the CL and VAL does not produce identical results.

      There is indeed a paragraph devoted to the discussion of this point (middle paragraph p20). “Interestingly, both Dentate and Interposed nuclei contain some neurons with collaterals in both VAL and CL thalamic structures (Aumann and Horne 1996, Sakayori, Kato et al. 2019), suggesting that the effect on learning could be mediated by a combined action on the learning process in the striatum (via the CL thalamus) and in the cortex (via the VAL thalamus). However, consistent with (Sakayori, Kato et al. 2019), we found that the manipulations of cerebellar neurons retrogradely targeted either from the CL or from the VAL produced different effects in the task. This indicates that either the distinct functional roles of VAL-projecting of CL-projecting neurons reported in our study is carried by a subset of pathway-specific neurons without collaterals, or that our retrograde infections in VAL and CL preferentially targeted different cerebello-thalamic populations even if these populations had axon terminals in both thalamic regions.” In other words, we know from the literature that there is a degree of collateralization (CN neurons projecting to both VAL and CL, see refs cited above), but as the reviewer says, it does not seem logically possible that the exact same population would have different effects, which are very distinct during the first learning days. The only possible explanation is that the CN-CL and CN-VAL retrograde infections recruit somewhat different populations of neurons. This could be due to differences in the density of collaterals in CL and VAL of neurons with collaterals in both regions, or to the presence of CL-projecting neurons without collaterals in VAL and VAL-projecting neurons without collaterals in CL, in addition to the (established) population of neurons with collaterals in both regions. The lesional approach to CN-thalamus neurons in Sakayori et al. 2019 also revealed separate effects for CL and VL injections, consistent with the differential recruitment of CN populations by retrograde infections.

      This should be improved in a revised version of the manuscript.

      (1.3) It is commendable that the authors perform electrophysiology to validate DREADD/CNO. So many investigators don't bother and I really appreciate these data. Would the authors please show the 'wash' in Figure 1a, so that we can see the recovery of the spiking hash after CNO is cleared from the system? This would provide confidence that the signal is not disappearing for reasons of electrode instability or tissue damage/ other.

      We do not have the wash data on the same day, but there is no significant change in the baseline firing rate across recording days.

      (1.4) I don't think that the "Learning" and "Maintenance" terminology is very helpful and in fact may sow confusion. I would recommend that the authors use a day range " Days 1-3 vs 4-7" or similar, to refer to these epochs. The terminology chosen begs for careful validation, definitions, etc, and seems like it is unlikely uniform across all animals, thus it seems more appropriate to just report it straight, defining the epochs by day. Such original terminology could still be used in the Discussion, with appropriate caveats.

      This shall indeed be corrected in a revised version.

      (1.5) Minor, but, on the top of page 14 in the Results, the text states, "Suggesting the presence of a 'critical period' in the consolidation of the task". I think this is a non-standard use of 'critical period' and should be removed. If kept, the authors must define what they mean specifically and provide sufficient additional analyses to support the idea. As it stands, the point will sow confusion.

      This shall indeed be corrected in a revised version.

      Reviewer #2 (Public review):

      Summary:

      This study examines the contribution of cerebello-thalamic pathways to motor skill learning and consolidation in an accelerating rotarod task. The authors use chemogenetic silencing to manipulate the activity of cerebellar nuclei neurons projecting to two thalamic subregions that target the motor cortex and striatum. By silencing these pathways during different phases of task acquisition (during the task vs after the task), the authors report valuable findings of the involvement of these cerebellar pathways in learning and consolidation.

      Strengths:

      The experiments are well-executed. The authors perform multiple controls and careful analysis to solidly rule out any gross motor deficits caused by their cerebellar nuclei manipulation. The finding that cerebellar projections to the thalamus are required for learning and execution of the accelerating rotarod task adds to a growing body of literature on the interactions between the cerebellum, motor cortex, and basal ganglia during motor learning. The finding that silencing the cerebellar nuclei after a task impairs the consolidation of the learned skill is interesting.

      We thank the reviewer for the positive comments and insightful critiques below.

      Weaknesses:

      (2.1) While the controls for a lack of gross motor deficit are solid, the data seem to show some motor execution deficit when cerebellar nuclei are silenced during task performance. This deficit could potentially impact learning when cerebellar nuclei are silenced during task acquisition.

      One of our key controls is the test of the treatment on the fixed-speed rotarod, which provides the closest conditions to those found in the accelerating rotarod (the main difference between the protocols being the slow, steady acceleration of rod rotation [+0.12 rpm per s] in the accelerating version).

      In the CN experiments, we found clear deficits in learning and consolidation while there was no effect on the fixed-speed rotarod (performance of the DREADD-CNO group is even slightly better than that of some control groups), consistent with a separation of the effects on learning/consolidation from those on locomotion on a rotarod. However, small but measurable deficits are found at the highest speed in the fixed-speed rotarod in the CN-VAL group; there was no significant effect in the CN-CL group, while the CN-CL group actually shows lower performance from the second day of learning; we believe this supports our claim that CN-CL inhibition impacted the learning process more than motor coordination. In contrast, the CN-VAL group only showed significantly lower performance on day 4 of the accelerating rotarod, consistent with intact learning abilities. Of note, under CNO, CN-VAL mice could stay for more than a minute and a half at 20 rpm, while on average they fell from the accelerating rotarod as soon as it reached a speed of ~19 rpm (130 s).
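
      The correspondence between latency to fall and rod speed used in this argument follows from the linear ramp. A minimal sketch is given below; the acceleration of 0.12 rpm per s is taken from the response, whereas the starting speed of 4 rpm is an assumption made for illustration only (it is not stated here), and the function names are hypothetical.

```python
ACCEL_RPM_PER_S = 0.12  # stated in the response
START_RPM = 4.0         # ASSUMPTION: starting speed, not given in the response


def rod_speed_rpm(latency_s, start_rpm=START_RPM, accel=ACCEL_RPM_PER_S):
    """Rod speed (rpm) reached after latency_s seconds of linear acceleration."""
    if latency_s < 0:
        raise ValueError("latency must be non-negative")
    return start_rpm + accel * latency_s


def latency_for_speed_s(target_rpm, start_rpm=START_RPM, accel=ACCEL_RPM_PER_S):
    """Time (s) at which the accelerating rod reaches target_rpm."""
    if target_rpm < start_rpm:
        raise ValueError("target speed is below the starting speed")
    return (target_rpm - start_rpm) / accel


if __name__ == "__main__":
    for latency in (100, 130):
        print(f"fall at {latency} s -> rod speed {rod_speed_rpm(latency):.1f} rpm")
    print(f"20 rpm reached after {latency_for_speed_s(20.0):.0f} s")
```

      With these assumed parameters, a fall at 130 s corresponds to a rod speed a little below 20 rpm, which is the comparison the response draws against the time mice can stay on the rod at a fixed 20 rpm.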

      The text currently states “The inhibition of CN-VAL neurons during the task also yielded lower levels of performance in the Maintenance stage,[[NB: day 5-7]] suggesting that these neurons contribute also to learning and retrieval of motor skills, although the mild defect in fixed speed rotarod could indicate the presence of a locomotor deficit, only visible at high speed.” Following the reviewers’ comment, we shall however revise the sentence above in the revised version of the MS to say that we cannot fully disambiguate the execution / learning-retrieval effect at high speed for these mice.

      (2.2a) Separately, I find the support for two separate cerebello-thalamic pathways incomplete. The data presented do not clearly show the two pathways are anatomically parallel.

      As explained above (point 1.2a), it is already known that these pathways overlap to some degree (discussion p 20), yet targeting them affects behavior differentially, consistent with separate contributions. A similar finding was obtained with a lesional (irreversible) approach in Sakayori et al. 2019.

      (2.2b) The difference in behavioral deficits caused by manipulating these pathways also appears subtle.

      While we agree that after 3-4 days of learning the difference in performance between the groups becomes elusive, we respectfully disagree with the reviewer that these differences are negligible in the early stages: the impact of inhibition on "learning rate" (i.e. amount of learning for a given daily initial performance) and on consolidation (i.e. overnight retention of the daily gain in performance) exhibits different profiles for the two groups (fig 3h vs 3k).

      Reviewer #3 (Public review)

      Summary:

      Varani et al present important findings regarding the role of distinct cerebellothalamic connections in motor learning and performance. Their key findings are that:

      (1) cerebellothalamic connections are important for learning motor skills

      (2) cerebellar efferents specifically to the central lateral (CL) thalamus are important for short-term learning

      (3) cerebellar efferents specifically to the ventral anterior lateral (VAL) complex are important for offline consolidation of learned skills, and

      (4) that once a skill is acquired, cerebellothalamic connections become important for online task performance.

      The authors went to great lengths to separate effects on motor performance from learning, for the most part successfully. While one could argue about some of the specifics, there is little doubt that the CN-CL and CN-VAL pathways play distinct roles in motor learning and performance. An important next step will be to dissect the downstream mechanisms by which these cerebellothalamic pathways mediate motor learning and adaptation.

      Strengths:

      (1) The dissociation between online learning through CN-CL and offline consolidation through CN-VAL is convincing.

      (2) The ability to tease learning apart from performance using their titrated chemogenetic approach is impressive. In particular, their use of multiple motor assays to demonstrate preserved motor function and balance is an important control.

      (3) The evidence supporting the main claims is convincing, with multiple replications of the findings and appropriate controls.

      We thank the reviewer for the positive comments and the insightful critiques below.

      Weaknesses:

      (3.1) Despite the care the authors took to demonstrate that their chemogenetic approach does not impair online performance, there is a trend towards impaired rotarod performance at higher speeds in Supplementary Figure 4f, suggesting that there could be subtle changes in motor performance below the level of detection of their assays.

      This is also discussed in point 2.1 above. In our view, the fixed-speed rotarod is a control very close to the accelerating rotarod condition, with very similar requirements between the two tasks (yet it is unfortunately rarely tested in accelerating rotarod studies). We do not exclude the presence of motor deficits, but our main argument is that these do not suffice to explain the differences observed in the accelerating rotarod. No detectable deficit was found in the CN group, while very clear deficits in learning/consolidation were observed. A mild deficit is significant only in the CN-VAL group, whereas the deficit in the fixed-speed rotarod is not significant for the CN-CL group, which shows the strongest deficit in the accelerating rotarod during the first days: e.g. on day 2, the CN-CL group is already below the control group, with latencies to fall of ~100 s (corresponding to an immediate fall at ~15 rpm), while the fixed-speed rotarod performances at 15 rpm of the control and CNO-treated groups show an ability to stay more than 1 min at this speed. The text will be revised to clarify this point.

      (3.2) There is likely some overlap between CN neurons projecting to VAL and CL, somewhat limiting the specificity of their conclusions.

      There is indeed published evidence for some degree of anatomical overlap, but also for some differential contribution of CN-VAL and CN-CL to the task. Our answer to this point is developed in points 1.2a and 2.2a above. Although this point was raised in the Discussion (p. 20), the text will be improved in a revised version of the manuscript to clarify our statement.

    1. eLife Assessment

      This important study advances our understanding of the way neurons in the auditory cortex of mice respond to unpredictable sounds. Through the use of state-of-the-art recording methods, compelling evidence is provided that responses to local and global violations in sound sequences are prediction errors and not simply the consequence of stimulus-specific adaptation. Although the cell-type-specific results are intriguing, further work is needed to establish their reliability.